PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
A) Input parameters
Genome: 2187.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic): (not given)
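The E-value cutoff listed in the input parameters can be read against the virulence-factor hits reported further down. The following is a minimal sketch, not PredictBias code: it assumes a hit is reported when its E-value is at or below the cutoff, and the name `passes_cutoff` is hypothetical. The sample values are copied from hits listed in this report.

```python
# Illustrative only: assumes a hit is kept when its E-value is <= the cutoff.
E_VALUE_CUTOFF = 0.05  # "E-value (RPSBlast)" from the input parameters


def passes_cutoff(evalue: float, cutoff: float = E_VALUE_CUTOFF) -> bool:
    """Return True when an E-value is small enough to count as a hit."""
    return evalue <= cutoff


# (locus tag, signature, E-value) copied from hits listed in this report
hits = [
    ("SC0012", "SHAPEPROTEIN", 3e-39),
    ("SC0047", "CARBMTKINASE", 0.018),
    ("SC0049", "RTXTOXIND", 0.014),
]

# Keep the locus tags whose hit satisfies the assumed cutoff rule.
kept = [locus for locus, _, evalue in hits if passes_cutoff(evalue)]
```

Under this assumption every hit shown in the report passes, since the largest E-value listed is 0.046.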
 
C) Potential GIs and PAIs in NC_006905
S.No  Start   End     Bias  Virulence  Insertion elements  Prediction
1     SC0008  SC0030  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
SC0008         2       22    0.205956  molybdenum cofactor biosynthesis protein MogA
SC0009         0       17   -2.232069  hypothetical protein
SC0010         1       16   -3.252292  hypothetical protein
SC0011         0       16   -3.688091  hypothetical protein
SC0012         0       17   -4.262927  molecular chaperone DnaK
SC0013        -2       16   -2.635881  molecular chaperone DnaJ
SC0014        -1       19   -3.797808  LysR family transcriptional regulator
SC0015        -1       17   -1.975324  hypothetical protein
SC0016         0       17   -1.359886  hypothetical protein
SC0017         0       16   -1.560542  hypothetical protein
SC0018         0       15   -1.210379  hydroxymethyltransferase
SC0019         2       19   -2.658463  hypothetical protein
SC0020         3       18   -1.333103  fimbrial subunit
SC0021         2       19   -0.780123  fimbrial chaperone
SC0022         1       20   -1.257361  fimbrial subunit
SC0023         0       25   -2.293188  fimbrial subunit
SC0024         0       26   -2.385072  fimbrial subunit
SC0025        -1       26   -4.096660  fimbrial chaperone
SC0026         1       30   -7.957793  hypothetical protein
SC0027         1       30   -9.102774  hypothetical protein
SC0028        -1       25   -7.166385  hypothetical protein
SC0029         0       18   -4.206436  arylsulfatase
SC0030        -1       16   -3.802027  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0012    SHAPEPROTEIN  141    3e-39    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 141 bits (357), Expect = 3e-39
Identities = 84/388 (21%), Positives = 152/388 (39%), Gaps = 86/388 (22%)

Query: 5 IGIDLGTTNSCVAIMDGTQARVLENAEGDRTTPSIIAYTQDGET------LVGQPAKRQA 58
+ IDLGT N+ + + Q VL PS++A QD VG AK+
Sbjct: 13 LSIDLGTANTLIYVKG--QGIVLNE-------PSVVAIRQDRAGSPKSVAAVGHDAKQML 63

Query: 59 VTNPQNTLFAIKRLIGRRFQDEEVQRDVSIMPYKIIGADNGDAWLDVKGQKMAPPQISAE 118
P N + AI+ +K +A ++ +
Sbjct: 64 GRTPGN-IAAIR---------------------------------PMKDGVIADFFVTEK 89

Query: 119 VLKK-MKKTAEDYLGEPVTEAVITVPAYFNDAQRQATKDAGRIAGLEVKRIINEPTAAAL 177
+L+ +K+ + P ++ VP +R+A +++ + AG +I EP AAA+
Sbjct: 90 MLQHFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAI 149

Query: 178 AYGL--DKEVGNRTIAVYDLGGGTFDISIIEIDEVDGEKTFEVLATNGDTHLGGEDFDTR 235
GL + G+ V D+GGGT ++++I ++ V + +GG+ FD
Sbjct: 150 GAGLPVSEATGS---MVVDIGGGTTEVAVISLNGV---------VYSSSVRIGGDRFDEA 197

Query: 236 LINYLVDEFKKDQGIDLRNDPLAMQRLKEAAEKAKIELSSA----QQTDVNLPYITADAT 291
+INY+ + G + AE+ K E+ SA + ++ +
Sbjct: 198 IINYVRRNYGSLIG-------------EATAERIKHEIGSAYPGDEVREIEVRGRNLAEG 244

Query: 292 GPKHMNIKVTRAKLESLVEDLVNRSIEPLKVALQD-AGLSVSDIND--VILVGGQTRMPM 348
P+ + + LE+L E + + + VAL+ SDI++ ++L GG +
Sbjct: 245 VPRGFTLN-SNEILEALQEP-LTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRN 302

Query: 349 VQKKVAEFFGKEPRKDVNPDEAVAIGAA 376
+ + + E G +P VA G
Sbjct: 303 LDRLLMEETGIPVVVAEDPLTCVARGGG 330
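The island summary rows in this report pair three Y/N flags (Bias, Virulence, Insertion elements) with a prediction. The following is a minimal sketch of the pattern those rows follow. It is inferred only from the ten summary rows in this report, not taken from PredictBias itself, and the function name `label_island` is hypothetical.

```python
def label_island(bias: bool, virulence: bool, insertion: bool) -> str:
    """Label a region the way the summary rows in this report are labelled.

    Inferred pattern (an assumption, not the tool's code): every listed
    region has Bias = Y, a virulence hit gives the PAI label, and the
    insertion-element flag does not change the label.
    """
    if not bias:
        return "not covered by the rows in this report"
    if virulence:
        return "Pathogenicity Island (biased-composition)"
    return "Genomic Island"


PAI = "Pathogenicity Island (biased-composition)"
GI = "Genomic Island"

# (S.No, Bias, Virulence, Insertion elements, Prediction) as listed in the report
rows = [
    (1, True, True, False, PAI), (2, True, True, False, PAI),
    (3, True, True, False, PAI), (4, True, False, False, GI),
    (5, True, True, False, PAI), (6, True, False, True, GI),
    (7, True, True, True, PAI), (8, True, True, True, PAI),
    (9, True, True, False, PAI), (10, True, True, True, PAI),
]

# Collect any summary row whose listed prediction the sketch fails to reproduce.
mismatches = [n for n, b, v, i, want in rows if label_island(b, v, i) != want]
```

The sketch reproduces all ten rows, but it says nothing about how the tool decides the Bias flag itself.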


2     SC0045  SC0059  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
SC0045        -1       24    3.757527  ribonucleoside hydrolase RihC
SC0046        -1       22    2.325934  transcription regulator sensor for citrate
SC0047        -1       21    2.688238  transcription regulator, histidine kinase for
SC0048         0       25    5.025135  oxalacetate decarboxylase subunit beta
SC0049        -2       18    3.204733  oxaloacetate decarboxylase
SC0050        -2       12    0.360680  oxaloacetate decarboxylase subunit gamma
SC0051        -2       11    0.392080  citrate-sodium symport
SC0052        -1        9    1.708001  citrate lyase synthetase
SC0053        -1       11    2.317132  citrate lyase subunit gamma
SC0054        -1       12    1.598320  citrate lyase subunit beta
SC0055        -1       11    1.836424  citrate lyase subunit alpha/citrate-ACP
SC0056        -1       22    3.626576  hypothetical protein
SC0057         0       22    3.111471  modifier of citrate lyase
SC0058        -1       25    3.312627  dihydrodipicolinate reductase
SC0059         0       27    3.254856  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC0046    HTHFIS  69     1e-15    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 68.7 bits (168), Expect = 1e-15
Identities = 28/141 (19%), Positives = 48/141 (34%), Gaps = 2/141 (1%)

Query: 1 MDSITTLIVEDEPMLAEILVDTIKIFPQFSIVGIADKLESAKKQIRLYQPQLILLDNFLP 60
M T L+ +D+ + +L + V I + + I L++ D +P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSR--AGYDVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 DGKGIDLIRHTISTNYTGRIIFITADNHMDTISDALRMGVFDYLIKPVHYQRLQHTLERF 120
D DL+ ++ ++A N T A G +DYL KP L + R
Sbjct: 59 DENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118

Query: 121 TRYRSSLRSSEQANQTHVDAL 141
S + + L
Sbjct: 119 LAEPKRRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0047    CARBMTKINASE  30     0.018    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 30.2 bits (68), Expect = 0.018
Identities = 19/81 (23%), Positives = 30/81 (37%), Gaps = 13/81 (16%)

Query: 104 DATYITVGNEKGQRLYHVNPDEIGKYMEGGDSDDALYNAKSYVSVRKGSLGSSLRGKSPI 163
+ + G EK Q L V +E+ KY E G + GS+G +
Sbjct: 238 NGAALYYGTEKEQWLREVKVEELRKYYEEG-------------HFKAGSMGPKVLAAIRF 284

Query: 164 QDSTGKVIGIVSVGYTLEQLE 184
+ G+ I + +E LE
Sbjct: 285 IEWGGERAIIAHLEKAVEALE 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC0049    RTXTOXIND  31     0.014    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.0 bits (70), Expect = 0.014
Identities = 17/67 (25%), Positives = 29/67 (43%), Gaps = 7/67 (10%)

Query: 508 ASSAPVQAAAPA-------GAGTPVTAPLAGNIWKVIATEGQTVAEGDVLLILEAMKMET 560
+ V+ A A G + + ++I EG++V +GDVLL L A+ E
Sbjct: 75 SVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA 134

Query: 561 EIRAAQA 567
+ Q+
Sbjct: 135 DTLKTQS 141



Score = 29.4 bits (66), Expect = 0.046
Identities = 15/56 (26%), Positives = 22/56 (39%), Gaps = 10/56 (17%)

Query: 535 KVIATEGQTVAEGDVLLILEAMKMETEIRAAQAGTVRGIAVKSGDAVSVGDTLMTL 590
V G+ G EI+ + V+ I VK G++V GD L+ L
Sbjct: 82 IVATANGKLTHSGRSK----------EIKPIENSIVKEIIVKEGESVRKGDVLLKL 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0052    LPSBIOSNTHSS  38     1e-05    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 38.3 bits (89), Expect = 1e-05
Identities = 21/102 (20%), Positives = 43/102 (42%), Gaps = 4/102 (3%)

Query: 158 NPFTLGHRYLVEQAAAACDWLHLFVVKEDAS--FFSYTDRWALIEQGIAGIDNVTLHSGS 215
+P T GH ++E+ D +++ V++ FS +R I + IA + N + S
Sbjct: 10 DPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLPNAQVDSFE 69

Query: 216 AYMISRATFPGYFLKEKGV--VDDCHCQIDLQLFREHLAPAL 255
++ A +G+ + D ++ + + LA L
Sbjct: 70 GLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDL 111


3     SC0096  SC0101  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
SC0096         0       16    3.818371  L-arabinose isomerase
SC0097         1       16    3.838581  hypothetical protein
SC0098         0       17    3.770269  DNA-binding transcriptional regulator AraC
SC0099         1       17    4.185378  DedA family membrane protein
SC0100         3       14    3.517854  thiamine transporter ATP-binding subunit
SC0101         3       15    3.099582  thiamine transporter membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0101    PF06580  30     0.024    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.8 bits (67), Expect = 0.024
Identities = 15/79 (18%), Positives = 27/79 (34%), Gaps = 3/79 (3%)

Query: 4 RRQPLIPGWLIPGLCAAALMITVSLAAFLALWLNAPSGAWSTIWRDSYLWHVVRFSFWQA 63
R GWL + L + + +W A + W + +++
Sbjct: 60 RSFIKRQGWLKLNMGQIILRVLPACVVIGMVWFVANTSIWRLL---AFINTKPVAFTLPL 116

Query: 64 FLSAVLSVVPAVFLARALY 82
LS + +VV F+ LY
Sbjct: 117 ALSIIFNVVVVTFMWSLLY 135


4     SC0118  SC0123  Y     N          N                   Genomic Island
LocusTag  DNBias  CDNBias     %GCBias  Product
SC0118        -1       12    3.035298  cell division protein FtsL
SC0119        -1       13    3.004703  division specific transpeptidase,
SC0120        -1       14    4.063495  UDP-N-acetylmuramoylalanyl-D-glutamate--2,
SC0121        -1       15    3.558389  UDP-N-acetylmuramoyl-tripeptide--D-alanyl-D-
SC0122        -2       16    3.287090  phospho-N-acetylmuramoyl-pentapeptide-
SC0123        -2       15    3.176418  UDP-N-acetylmuramoyl-L-alanyl-D-glutamate
5     SC0180  SC0193  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
SC0180        -1       18    3.076959  aspartate alpha-decarboxylase
SC0181        -1       19    3.072339  pantoate--beta-alanine ligase
SC0182        -2       15    2.386843  3-methyl-2-oxobutanoate
SC0183        -1       15    3.058557  2-amino-4-hydroxy-6-
SC0184         0       10    4.155977  poly(A) polymerase
SC0185        -2       13    4.162772  glutamyl-Q tRNA(Asp) synthetase
SC0186        -1       13    2.507117  RNA polymerase-binding transcription factor
SC0187        -1       13    2.890945  sugar fermentation stimulation protein A
SC0188        -2       15    3.599115  2'-5' RNA ligase
SC0189        -2       15    4.013361  ATP-dependent RNA helicase HrpB
SC0190        -3       16    3.066012  penicillin-binding protein 1b
SC0191        -2       12    2.596454  ferrichrome outer membrane transporter
SC0192        -2       16    3.816202  iron-hydroxamate transporter ATP-binding
SC0193        -1       15    3.527893  iron-hydroxamate transporter substrate-binding
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0193    FERRIBNDNGPP  499    0.0      Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 499 bits (1286), Expect = 0.0
Identities = 246/296 (83%), Positives = 266/296 (89%)

Query: 1 MRDLYPLTRRRLLTAMALSPLLWQMNTAQAAAIDPRRIVALEWLPVELLLALGITPYGVA 60
M L ++RRRLLTAMALSPLLWQMNTA AAAIDP RIVALEWLPVELLLALGI PYGVA
Sbjct: 1 MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA 60

Query: 61 DVPNYKLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEKLARIAPGH 120
D NY+LWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPE LARIAPG
Sbjct: 61 DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR 120

Query: 121 GFDFSDGKKPLAVARRSLVELAQTLNLEAAAEKHLAQYDRFIASQKPHFIRRGGRPLLMT 180
GF+FSDGK+PLA+AR+SL E+A LNL++AAE HLAQY+ FI S KP F++RG RPLL+T
Sbjct: 121 GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLT 180

Query: 181 TLIDPRHMLVLGPNCLFQEVLDEYGIVNAWQGETNFWGSTAVSIDRLAMYKEADVICFDH 240
TLIDPRHMLV GPN LFQE+LDEYGI NAWQGETNFWGSTAVSIDRLA YK+ DV+CFDH
Sbjct: 181 TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH 240

Query: 241 GNSTDMNALMATPLWQAMPFVRAGRFHRVPAVWFYGATLSTMHFVRILDNVLGGKA 296
NS DM+ALMATPLWQAMPFVRAGRF RVPAVWFYGATLS MHFVR+LDN +GGKA
Sbjct: 241 DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVLDNAIGGKA 296
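The percentages printed on the `Identities` / `Positives` / `Gaps` lines of the alignments in this report appear to be truncated rather than rounded: 266/296 is about 89.9%, yet the line above prints 89%. The following is a minimal sketch that parses such a line and recomputes each percentage with integer division. The helper name `check_stats` is hypothetical, and the truncation rule is an observation from this report, not a statement about the aligner's source code.

```python
import re

# Matches e.g. "Identities = 246/296 (83%)"; the Gaps field may be absent.
STAT = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def check_stats(line: str) -> dict:
    """Map each statistic to (count, length, printed %, truncated %)."""
    out = {}
    for name, num, den, pct in STAT.findall(line):
        n, d = int(num), int(den)
        # Integer division drops the fractional part instead of rounding.
        out[name] = (n, d, int(pct), 100 * n // d)
    return out


# Statistics line copied from the FERRIBNDNGPP alignment above
stats = check_stats("Identities = 246/296 (83%), Positives = 266/296 (89%)")

# Statistics line copied from the SHAPEPROTEIN alignment, which includes Gaps
stats_gapped = check_stats(
    "Identities = 84/388 (21%), Positives = 152/388 (39%), Gaps = 86/388 (22%)"
)
```

For both lines the recomputed value matches the printed one, which is consistent with truncation.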


6     SC0258  SC0263  Y     N          Y                   Genomic Island
LocusTag  DNBias  CDNBias     %GCBias  Product
SC0258         0       17    4.103015  hypothetical protein
SC0259         1       19    4.789319  ribonuclease H
SC0260         1       20    5.724814  DNA polymerase III subunit epsilon
SC0261         3       20    5.601565  *hypothetical protein
SC0262         2       21    4.718054  hypothetical protein
SC0263         1       21    3.579869  hypothetical protein
7     SC0273  SC0305  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
SC0273         0       23    3.700443  outer membrane lipoprotein
SC0274         1       24    4.692095  hypothetical protein
SC0275         2       23    4.124786  hypothetical protein
SC0276         2       22    2.981121  inner membrane protein
SC0277         3       24    4.054342  shiga-like toxin A subunit
SC0278         3       25    4.868660  inner membrane protein
SC0279         5       32    4.243790  hypothetical protein
SC0280         4       30    2.532089  hypothetical protein
SC0281         4       30    2.864767  hypothetical protein
SC0282         3       28    1.933448  Rhs-family protein
SC0283         3       31    0.668516  hypothetical protein
SC0284         3       31    0.737516  Rhs-family protein
SC0285         6       41  -14.905082  hypothetical protein
SC0286         4       38  -12.135756  hypothetical protein
SC0287         4       43  -13.282752  hypothetical protein
SC0288         6       37   -8.751967  hypothetical protein
SC0289         8       30   -2.224865  hypothetical protein
SC0290         9       32   -1.785862  hypothetical protein
SC0291         9       30   -1.772738  integrase core subunit
SC0292         9       32   -3.366461  outer membrane protein
SC0293         9       30   -2.946216  fimbriae assembly chaperone
SC0294         9       28   -2.456510  fimbriae usher
SC0295         8       33   -5.572892  fimbriae subunit
SC0296        10       33   -6.468713  xylanase/chitin deacetylase
SC0297         9       38   -9.730441  transcriptional regulator
SC0298         9       35  -10.367722  hypothetical protein
SC0299         9       35  -10.459592  hypothetical protein
SC0300         8       38  -11.342946  fimbrial protein
SC0301         8       36  -12.160097  fimbrial subunit
SC0304         2       32  -10.053900  transcriptional regulator
SC0305        -1       15   -3.009685  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SC0275    OMPADOMAIN  70     3e-15    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 69.6 bits (170), Expect = 3e-15
Identities = 32/98 (32%), Positives = 44/98 (44%), Gaps = 13/98 (13%)

Query: 344 INKAAREIARVGGAVTVTGHTDSQPIHSAEFPSNLVLSEKRAAEVAALLTSGGVPAGRVH 403
+ + G+V V G+TD I S + N LSE+RA V L S G+PA ++
Sbjct: 241 LYSQLSNLDPKDGSVVVLGYTDR--IGSDAY--NQGLSERRAQSVVDYLISKGIPADKIS 296

Query: 404 IVGKGDTVPVADN---------GSKAGRAKNRRVEILV 432
G G++ PV N A +RRVEI V
Sbjct: 297 ARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC0287    HOKGEFTOXIC  27     0.004    Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 27.5 bits (61), Expect = 0.004
Identities = 11/24 (45%), Positives = 18/24 (75%)

Query: 9 SLLFMVLIVLFVILFFTWLGRENI 32
SL++ VLIV +L FT+L R+++
Sbjct: 7 SLVWCVLIVCLTLLIFTYLTRKSL 30


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0294    PF00577  831    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 831 bits (2148), Expect = 0.0
Identities = 311/872 (35%), Positives = 455/872 (52%), Gaps = 52/872 (5%)

Query: 4 KQPALLLFIAGVVHCANA-------HAYTFDASML-GDAAKGVDMSLFNQG-VQQPGTYR 54
K F+ V CA A F+ L D D+S F G PGTYR
Sbjct: 20 KHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYR 79

Query: 55 VDVMVNGKRIDTRDVVFKLEKDGQGTPFLAPCLTVSQLSRYGVKTEDYPQLWKAAKTPDE 114
VD+ +N + TRDV F QG + PCLT +QL+ G+ T + A D
Sbjct: 80 VDIYLNNGYMATRDVTFNTGDSEQG---IVPCLTRAQLASMGLNTASVSGMNLLAD--DA 134

Query: 115 CADLT-AIPQAKAVLDINNQQLQLSIPQLALRTKFKGIAPEDLWDDGIPAFLMNYSARTM 173
C LT I A A LD+ Q+L L+IPQ + + +G P +LWD GI A L+NY+
Sbjct: 135 CVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGN 194

Query: 174 QTDYKMDMGRRDNSSWVQLQPGINIGAWRVRNATSWQR-----SSQLSGKWQAAYTYAER 228
+G + +++ LQ G+NIGAWR+R+ T+W SS KWQ T+ ER
Sbjct: 195 SVQN--RIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLER 252

Query: 229 GLYSLKSRLTLGQKTSQGEIFDSVPFTGVMLASDDNMVPYSERQFAPVVRGIARTQARVE 288
+ L+SRLTLG +QG+IFD + F G LASDDNM+P S+R FAPV+ GIAR A+V
Sbjct: 253 DIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVT 312

Query: 289 VKQNGYTIYNTTVAPGPFALRDLSVTDSSGDLHVTVWEADGSTQMFVVPYQTPAIALHQG 348
+KQNGY IYN+TV PGPF + D+ +SGDL VT+ EADGSTQ+F VPY + + +G
Sbjct: 313 IKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREG 372

Query: 349 YLKYSLLAGRYRSSDSATDKAQIAQATLMYGLPWNLTAYGGIQSATHYQAASLGLGVSLG 408
+ +YS+ AG YRS ++ +K + Q+TL++GLP T YGG Q A Y+A + G+G ++G
Sbjct: 373 HTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMG 432

Query: 409 RWGSLSVDGSDTHSQRQGEAVQQGASWRLRYSNQLTATGTNFSLTRWQYASQGYNTLSDV 468
G+LSVD + +S ++ G S R Y+ L +GTN L ++Y++ GY +D
Sbjct: 433 ALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADT 492

Query: 469 LDSYRHDGNRL-------------WSWRENLQPSSRTTLMLSQSWGRHLGNLSLTGSRTD 515
S + N + + L ++Q GR L L+GS
Sbjct: 493 TYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQT 551

Query: 516 WRNRPGHDDSYGLSWGTSIGGGSLSLNWNQNRTLWRNGAHRKENITSLWFSMSLSRWTGN 575
+ D+ + T+ + +L+++ + W+ G ++ + +L ++ S W +
Sbjct: 552 YWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKG---RDQMLALNVNIPFSHWLRS 608

Query: 576 -------NVSASWQMTSPSHGGQTQQVGVNGEAFSQ-QLDWEVRQSYRADAPPGGGNNSA 627
+ SAS+ M+ +G T GV G L + V+ Y G+
Sbjct: 609 DSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGY 668

Query: 628 LHLAWNGGYGLLGGDYSYSRAMRQMGVNIAGGIVIHHHGVTLGQPLQGSVALVEAPGASG 687
L + GGYG YS+S ++Q+ ++GG++ H +GVTLGQPL +V LV+APGA
Sbjct: 669 ATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKD 728

Query: 688 VPVGGWPGVKTDFRGDTTVGNLSVYQENTVSLDPSRLPDDAEVTQTDVRVVPTEGAVVEA 747
V GV+TD+RG + + Y+EN V+LD + L D+ ++ VVPT GA+V A
Sbjct: 729 AKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRA 788

Query: 748 KFHTRIGARALMTLKREDGSAIPFGAQVTVNGQDGSADLVDTDSQVYLTGLADKGELTVK 807
+F R+G + LMTL + +PFGA VT + S+ +V + QVYL+G+ G++ VK
Sbjct: 789 EFKARVGIKLLMTLTH-NNKPLPFGAMVT-SESSQSSGIVADNGQVYLSGMPLAGKVQVK 846

Query: 808 WGA---QQCRVNYHLPAHKGIAGLYQMSGLCR 836
WG C NY LP L Q+S CR
Sbjct: 847 WGEEENAHCVANYQLPPESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0295    PF05775  91     3e-26    Enterobacteria AfaD invasin protein
		>PF05775#Enterobacteria AfaD invasin protein

Length = 142

Score = 91.1 bits (226), Expect = 3e-26
Identities = 38/132 (28%), Positives = 66/132 (50%), Gaps = 2/132 (1%)

Query: 14 SVSLLVTVSSLMPIANAAEKLQTTLRVGAYFRAGHVPDGMVLAQGWVTYHGSHSGFRVWS 73
S+SL + LM + + ++ TL Y + DG+ LA G + +HSGFRVW
Sbjct: 4 SISLTLCGILLMLMGSFSQAADITLMNHKYM-GNLLHDGVKLATGRIICQDTHSGFRVWI 62

Query: 74 DEQKAGNTPTVLLLSGQQDPRHHIQVRLEGEGWQPDTVNGRGAILRTAADNAS-FSVVVD 132
+ ++ G ++ + P+H++++R+ G GW G + T ++AS F + VD
Sbjct: 63 NARQEGGGAGKYIVQSTEGPQHNLRIRISGNGWSSFVEKGIQGVFNTIKEDASIFYIEVD 122

Query: 133 GNQEVPADTWTL 144
GNQ+V +
Sbjct: 123 GNQQVQPGKYLF 134


8     SC0322  SC0395  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
SC0322         2       19   -1.835257  gamma-glutamyl kinase
SC0323         4       25   -2.984988  gamma-glutamyl phosphate reductase
SC0324         6       35   -4.917760  *hypothetical protein
SC0325         5       34   -5.190827  EaC protein
SC0326         5       32   -4.594934  Eaa protein
SC0327         8       36   -4.921698  hypothetical protein
SC0328         7       34   -5.019206  hypothetical protein
SC0329         7       35   -5.349621  hypothetical protein
SC0330         6       36   -5.360594  Eae protein
SC0331         7       34   -6.059917  endodeoxyribonuclease
SC0332         7       33   -5.458542  ssDNA-binding protein controls activity of
SC0333         6       36   -5.870801  hypothetical protein
SC0334         6       42   -6.244999  host septation inhibitor
SC0335         6       37   -6.648153  C3-like protein
SC0336         7       35   -4.465426  hypothetical protein
SC0337         7       37   -4.503404  hypothetical protein
SC0338         7       38   -4.570760  antitermination protein
SC0339         7       37   -4.378458  regulatory protein Cro (antirepressor)
SC0340         7       36   -4.055430  replication protein O
SC0341         7       37   -4.025343  hypothetical protein
SC0342         4       40   -5.393815  phosphoadenosine phosphosulfate reductase family
SC0343         5       36   -4.076119  hypothetical protein
SC0344         6       38   -3.824694  endodeoxyribonuclease RUS
SC0345         8       37   -5.470436  protein Niprotein NZ
SC0346         8       38   -6.116431  gp23-like protein
SC0347         7       36   -4.218815  **holin-like protein
SC0349        10       36   -3.403672  lysin
SC0350        10       37   -3.033494  Rz lysis protein
SC0352        10       35   -2.967523  hypothetical protein
SC0353         9       34   -2.499570  hypothetical protein
SC0354         9       32   -2.326738  gp2-like protein
SC0355         9       32   -2.943601  packaging glycoprotein
SC0356         8       31   -3.272730  scaffolding protein
SC0357         7       31   -3.354839  coat protein
SC0358         6       34   -3.155213  DNA stabilization protein
SC0359         6       35   -2.743349  DNA stabilization protein
SC0360         7       40   -3.813760  packaged DNA stabilization protein gp26
SC0361         5       38   -4.855216  hypothetical protein
SC0362         6       42   -6.759002  DNA transfer protein gp7
SC0363         4       42   -7.357472  DNA transfer protein gp20
SC0364         4       42   -8.701478  TPA: injection protein
SC0366         5       40   -9.463912  hypothetical protein
SC0367         5       38   -8.786269  hypothetical protein
SC0368         3       39   -8.862538  hypothetical protein
SC0369         1       30   -5.494125  bactoprenol glucosyl transferase
SC0370         1       29   -4.741855  bactoprenol-linked glucose translocase
SC0371         2       27   -4.013277  IS3 transposase
SC0372         1       28   -4.432858  hypothetical protein
SC0373         2       28   -4.274157  permease
SC0374         2       26   -3.685694  isopropylmalate isomerase large subunit
SC0375         0       23   -3.518292  3-isopropylmalate isomerase subunit
SC0376         0       17   -2.125106  fumarylacetoacetate (FAA) hydrolase
SC0377         2       18   -2.604374  hydrolase or acyltransferase
SC0378         2       19   -2.995927  LysR family transcriptional regulator
SC0379         3       22   -4.508127  outer membrane protein
SC0380         3       22   -4.534333  fimbriae usher protein
SC0381         3       22   -5.239189  fimbriae usher protein
SC0382         3       33   -9.613071  fimbriae chaperone
SC0383         3       35  -10.349133  fimbriae major subunit
SC0384         2       36  -11.128757  inner membrane protein
SC0385         2       36  -10.396447  hypothetical protein
SC0386         2       35  -10.201055  diguanylate cyclase/phosphodiesterase
SC0387         1       38  -12.387344  response regulator
SC0388         1       26   -4.887477  inner membrane protein
SC0389        -1       19   -0.554142  hypothetical protein
SC0390        -1       17    1.058891  response regulator
SC0391        -1       17    3.569028  inner membrane protein
SC0392        -2       14    3.883423  outer membrane lipoprotein
SC0393         0       12    1.197215  outer membrane efflux protein-like protein
SC0394         0       12   -1.719848  cation efflux pump
SC0395         0       13   -3.325613  transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0322    CARBMTKINASE  36     1e-04    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 36.3 bits (84), Expect = 1e-04
Identities = 32/126 (25%), Positives = 48/126 (38%), Gaps = 12/126 (9%)

Query: 130 VPVINENDAVATAEIKVGDNDNLSALAAILAGADKLLLLTDQQGLFTADPRSNPQAELIK 189
VPVI E+ + E V D D A AD ++LTD G + + ++
Sbjct: 197 VPVILEDGEIKGVE-AVIDKDLAGEKLAEEVNADIFMILTDVNGAALY--YGTEKEQWLR 253

Query: 190 DVYGVDDALRSIAGDSVSGLGTGGMSTKLQAA-DVACRAGIDTIIASGSKPGVIGDVMEG 248
+V V++ + G M K+ AA G IIA K + +EG
Sbjct: 254 EV-KVEELRKYYEEG---HFKAGSMGPKVLAAIRFIEWGGERAIIAHLEK---AVEALEG 306

Query: 249 ISVGTR 254
GT+
Sbjct: 307 -KTGTQ 311



Score = 30.2 bits (68), Expect = 0.011
Identities = 16/76 (21%), Positives = 33/76 (43%), Gaps = 13/76 (17%)

Query: 4 SQTLVVKLGTSVLTGGSRRLNRAHIVELVRQCAQ----LHAAGHRIVIVTSG-------- 51
+ +V+ LG + L ++ + +++ VR+ A+ + A G+ +VI
Sbjct: 2 GKRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLL 61

Query: 52 -AIAAGREHLGYPELP 66
+ AG+ G P P
Sbjct: 62 LHMDAGQATYGIPAQP 77


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0336    PF05272  25     0.025    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 25.0 bits (54), Expect = 0.025
Identities = 7/33 (21%), Positives = 14/33 (42%)

Query: 24 DIEEDLGISDDEWDSYSEGDKDEIMKDVAWERM 56
D+ + LG + EG + + + WE +
Sbjct: 802 DLVQALGADPGKSSPMLEGQVRDWLNENGWEYL 834


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0337    ICENUCLEATIN  27     0.014    Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 26.6 bits (58), Expect = 0.014
Identities = 30/92 (32%), Positives = 40/92 (43%), Gaps = 13/92 (14%)

Query: 3 GNRSAATNTGDCS--AADVS----GSQSVAAAFGIEGKARASEGGAI------VLCYRDE 50
G+RS T DC A D S G S+ A G K S G + VL +R
Sbjct: 1163 GDRSKLTAGNDCILMAGDRSKLTAGINSILTA-GCRSKLIGSNGSTLTAGENSVLIFRCW 1221

Query: 51 DGELIHIRASKVGENGIMPNTWYQLNEDGEFV 82
DG+ +K G+ GI + YQ++ED V
Sbjct: 1222 DGKRYTNVVAKTGKGGIEADMPYQMDEDNNIV 1253


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0341    PF05272  53     3e-09    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 53.2 bits (127), Expect = 3e-09
Identities = 30/82 (36%), Positives = 48/82 (58%), Gaps = 2/82 (2%)

Query: 4 SELSDLLWAQVDRVAPHLLPNGKKEGHEWVAGNVNGDKGNSLKVNLSGKKKWADFAEGDG 63
+ L+D L + + P LP G GHE+ G++ G KG+S KVN++ KW DF+ G+
Sbjct: 12 TSLADALLTRAKDLLPEWLPGGVLVGHEYECGSLAGGKGDSCKVNVT-TGKWCDFSTGES 70

Query: 64 G-DMLDLWMACRGINLHQAMQE 84
G D+LDL+ G+ + +A +
Sbjct: 71 GRDLLDLYAEIHGLKVSKAAAQ 92


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0360    ACRIFLAVINRP  28     0.031    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.3 bits (63), Expect = 0.031
Identities = 15/110 (13%), Positives = 40/110 (36%), Gaps = 8/110 (7%)

Query: 40 TDVSSIAEKAN-QAGGGAYDAQVRNDEQDVILDEHEKRITKTEEDISGIKVKLLEIENDV 98
DV + + N Q G Q + + K E+ + +++ +
Sbjct: 201 VDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNPEEFGKVTLRV-----NS 255

Query: 99 NGLKIKVQDIDGKVSDIIVDYVSLSRTGTQTLASSINVSGSYFVNGTKVV 148
+G ++++D+ +V +Y ++R + A+ + + + N
Sbjct: 256 DGSVVRLKDV-ARVELGGENYNVIARINGKP-AAGLGIKLATGANALDTA 303


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0373    TCRTETA  51     6e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 50.6 bits (121), Expect = 6e-09
Identities = 57/265 (21%), Positives = 91/265 (34%), Gaps = 32/265 (12%)

Query: 103 FSGTLSDRFGRKPIIFYSLLAGGILTLLCATASSWPMLVVYRALLGIAVSGITAAVTVYI 162
G LSDRFGR+P++ SL + + ATA +L + R + GI +G T AV
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGI--TGATGAVAGAY 119

Query: 163 SEEVSPA---------LAGIVTGYFIFGNSLGSMSGRVFATLMMEHVSIDTIFFIFGGVL 213
+++ ++ + G LG + G F L
Sbjct: 120 IADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAP----------FFAAAAL 169

Query: 214 IAMALAVKLFL---PTSRQFVPTPSLQLGAVLKGGLEHFKNIRVSLCFVIGFI--LFGSF 268
+ FL + P L + + V+ + FI L G
Sbjct: 170 NGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTV-VAALMAVFFIMQLVGQV 228

Query: 269 TSIFNFLAFYLHRPPYELSYTWIGLIPVSFSLT--FFLAPYAARVALNIGSMNALSMLII 326
+ ++ F R + T IG+ +F + A VA +G AL + +I
Sbjct: 229 PAAL-WVIFGEDR--FHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMI 285

Query: 327 CMMVGAFLTLIAPSLWVFISGIVLL 351
G L A W+ +VLL
Sbjct: 286 ADGTGYILLAFATRGWMAFPIMVLL 310


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0381    PF00577  762    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 762 bits (1968), Expect = 0.0
Identities = 261/880 (29%), Positives = 413/880 (46%), Gaps = 63/880 (7%)

Query: 4 TINLNRKS-LALLIAIVCSGSAQG----EEYYFDPALLQGATYGQ-NIARFNE-QQTPSG 56
I +R + + + + C+ +AQ E YF+P L +++RF Q+ P G
Sbjct: 17 HIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPG 76

Query: 57 DYLADVYVNGTLVTSSTNIRFNAVKEGQQTEPCLPLSVMKAAQIKSLPATDAA----TEC 112
Y D+Y+N + + ++ FN Q PCL + + + + + + C
Sbjct: 77 TYRVDIYLNNGYMAT-RDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDAC 135

Query: 113 RPLREWVPHAGWQFDSATLRLLLTIPMTELTHKPRGYISPSEWDSGALALFLRHNTNWTH 172
PL + A Q D RL LTIP ++++ RGYI P WD G A L +N +
Sbjct: 136 VPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNS 195

Query: 173 TENTDSHYRYQYLWSGLNMGVNLGLWQVRHQSNLRYANSNQS-GSAWRYNSVRTWVQRPV 231
+N Y + L G+N+G W++R + Y +S+ S GS ++ + TW++R +
Sbjct: 196 VQN-RIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDI 254

Query: 232 ASINSILSLGDSYTDSSLFGSLSFNGAKLVTDERMRPQGKRGYAPEVRGVAASSAHVVVK 291
+ S L+LGD YT +F ++F GA+L +D+ M P +RG+AP + G+A +A V +K
Sbjct: 255 IPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIK 314

Query: 292 QLGKVIYETNVPPGPFYIDDLYNTRYQGDLEVEVIEASGKTSRFTVPYSSVPDSVRPGNW 351
Q G IY + VPPGPF I+D+Y GDL+V + EA G T FTVPYSSVP R G+
Sbjct: 315 QNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHT 374

Query: 352 HYSLAFGRVRQYY--DIENRFFEGTFQHGVNNTITLNLGSRIAQRYQAWLAGGVWATGM- 408
YS+ G R + RFF+ T HG+ T+ G+++A RY+A+ G G
Sbjct: 375 RYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGAL 434

Query: 409 GAFGLNATWSNARAEHNERQQGWRAELSYSKTFT-TGTNLVLAAYRYSTNGFRDLQDVLG 467
GA ++ T +N+ + + G Y+K+ +GTN+ L YRYST+G+ + D
Sbjct: 435 GALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTY 494

Query: 468 VRREAKTGI-------------DYYSDTLHQRNRLSATVSQPLGRLGTLNLSASTADYYN 514
R DYY+ ++R +L TV+Q LGR TL LS S Y+
Sbjct: 495 SRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWG 554

Query: 515 NQSRITQLQMGYSNQWRNISYGVNIARQRTTWDYDRFYHGVNEPLDVSSRQKYTETTMSF 574
+ Q Q G + + +I++ ++ + + W + ++
Sbjct: 555 TSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKG------------------RDQMLAL 596

Query: 575 NVSIPLDWGENRTSVA------MNYNQSSQSRSST---VSMTGSSGENSDLSWSVYGGYE 625
NV+IP S + +Y+ S + G+ E+++LS+SV GY
Sbjct: 597 NVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYA 656

Query: 626 RYRNSNSDSSAPTTFGGNLQQNTRFGALRANYDQGDNYRQEGLGASGTLVLHPGGLTAGP 685
+ NS S+ L +G Y D+ +Q G SG ++ H G+T G
Sbjct: 657 GGGDGNSGSTG----YATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQ 712

Query: 686 YTSDTFALIHADGAQGAIVQNGQGAVVDRFGYAILPSLSPYRVNNVTLDTRKMRSDAELT 745
+DT L+ A GA+ A V+N G D GYA+LP + YR N V LDT + + +L
Sbjct: 713 PLNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLD 772

Query: 746 GGSQQIVPYAGAIARVNFATISGKAVLISVKMPDGGIPPMGADVFNGEGTNIGMVGQSGQ 805
+VP GAI R F G +L+++ + P GA V + + G+V +GQ
Sbjct: 773 NAVANVVPTRGAIVRAEFKARVGIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQ 831

Query: 806 IYARIAHPSGSLLVRWGTGANQRCRVAYQLDLHTKEPFLY 845
+Y +G + V+WG N C YQL +++ L
Sbjct: 832 VYLSGMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLT 871


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0389    ENTEROVIROMP  134    7e-43    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 134 bits (339), Expect = 7e-43
Identities = 59/183 (32%), Positives = 88/183 (48%), Gaps = 21/183 (11%)

Query: 1 MKRRSSFLVFLGLLLASPLALANDQHTVSFGYAQTHLSSLKNSDSKDLRGFNFKYRYEFN 60
MK+ + +L + TV+ GYAQ+ N + GFN KYRYE +
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAATSTVTGGYAQSDAQGQMN----KMGGFNLKYRYEED 56

Query: 61 ET-WGMLGSFTATRNEMENYTWKEGKLHKNGSDSVDYGSLMFGPTYRFNDYVSLYGNAGI 119
+ G++GSFT T + K Y + GP YR ND+ S+YG G+
Sbjct: 57 NSPLGVIGSFTYTEKSRTASSGDYNK--------NQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 ATMKF--------NKHSKEDSFAYGAGVIFNPVKSISIDASWEASRFFAVDTNTFGVSVG 171
KF + + F+YGAG+ FNP++++++D S+E SR +VD T+ VG
Sbjct: 109 GYGKFQTTEYPTYKHDTSDYGFSYGAGLQFNPMENVALDFSYEQSRIRSVDVGTWIAGVG 168

Query: 172 YRF 174
YRF
Sbjct: 169 YRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits      Score  E-value  Comments
SC0393    cdtoxinb  31     0.010    Cytolethal distending toxin B signature.
		>cdtoxinb#Cytolethal distending toxin B signature.

Length = 269

Score = 30.7 bits (69), Expect = 0.010
Identities = 21/70 (30%), Positives = 27/70 (38%), Gaps = 4/70 (5%)

Query: 75 AIALRNNRDLRKAGLNVEAARALYRIQRAEMLPTLGIATAMDAGRTPADLSVTDEPEINR 134
AIA+RNN A VE +R R + L D R PADL + + R
Sbjct: 155 AIAMRNN----DAPALVEEVYNFFRDSRDPVHQALNWMILGDFNREPADLEMNLTVPVRR 210

Query: 135 RYEMAGATTA 144
E+ A
Sbjct: 211 ASEIISPAAA 220


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC0394    RTXTOXIND  48     4e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 48.3 bits (115), Expect = 4e-08
Identities = 18/112 (16%), Positives = 37/112 (33%), Gaps = 7/112 (6%)

Query: 74 ELRSRVGGTLDAVSVPEGRLVSRGQLLFQIDPRPFEVALDTAVAQLRQAEVLARQAQADF 133
E++ + + V EG V +G +L ++ E + L QA + + Q
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 134 DRIQR-------LVASGAVSRKNADDVTATRNARQAQMQSAKAAVAAARLEL 178
I+ L + ++V + + Q + + L L
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNL 209



Score = 34.4 bits (79), Expect = 7e-04
Identities = 20/106 (18%), Positives = 37/106 (34%), Gaps = 13/106 (12%)

Query: 112 LDTAVAQLRQAEVLARQAQADFDRIQRLVASGAVSRKNADDVTATRNARQAQMQSAKAAV 171
L +QL Q E A+ ++ + +L + ++ + +
Sbjct: 268 LRVYKSQLEQIESEILSAKEEYQLVTQLFKN---------EILDKLRQTTDNIGLLTLEL 318

Query: 172 AAARLELSWTRITAPIAGRVDRILVTRGNLVSGGVAGNATLLTTIV 217
A + I AP++ +V ++ V GGV A L IV
Sbjct: 319 AKNEERQQASVIRAPVSVKVQQLKVHT----EGGVVTTAETLMVIV 360


9     SC0431  SC0436  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
SC0431         2       14    3.016330  hypothetical protein
SC0432         2       15    3.563801  hypothetical protein
SC0433         1       15    3.526505  recombination associated protein
SC0434         1       15    3.372853  fructokinase
SC0435         2       16    3.027585  MFS transport protein AraJ
SC0436         1       15    3.198212  exonuclease SbcC
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0435    TCRTETA  52     2e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 52.1 bits (125), Expect = 2e-09
Identities = 70/356 (19%), Positives = 122/356 (34%), Gaps = 35/356 (9%)

Query: 5 IFSLALGTFGLGMAEFSIMGVLTELARDVGITIPAAGH---MISFYAFGVVLGAPVMALF 61
+ ++AL G+G+ IM VL L RD+ + H +++ YA APV+
Sbjct: 11 LSTVALDAVGIGL----IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 62 SSRFSLKHILLFLVTLCVMGNAIFTFSSSYLMLAVGRLVSGFPHGAFFGVGAIVLSKIIR 121
S RF + +LL + + AI + +L +GR+V+G GA + I
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIAD--IT 124

Query: 122 PGKVTAAVAGMVSGMTVANLVGIPVGTYLSQEFSWRYTFLLIAVFNIAVLTAIFFWVPDI 181
G A G +S +V PV L FS F A N F +P+
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 182 RDKAQGSLREQ----------FHFLRSPAPWLI--FAATMFGNAGVFAWFSYIKPFMMYI 229
+ LR + + A + F + G W +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG------E 238

Query: 230 SGFSETSMTFIMMLVGLGM---VLGNLLSGKLSGRYTPLRIAVVTDLVIVLSLMALFFFS 286
F + T + L G+ + +++G ++ R R ++ ++ +
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLG---MIADGTGYILLA 295

Query: 287 GYKTASLTFAFICCAGLFALSAPLQILLLQNAKGGELLGAAGGQIAF--NLGSAIG 340
+ F + + P +L E G G +A +L S +G
Sbjct: 296 FATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVG 351


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC0436    RTXTOXIND  49     7e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 49.1 bits (117), Expect = 7e-08
Identities = 32/198 (16%), Positives = 71/198 (35%), Gaps = 13/198 (6%)

Query: 373 TQQSHDRAQLSQWQQQLLSDTRQRDALPPLTLDLTPQALAEARALHTRQRPLRHRLAALQ 432
TQ S +A+L Q + Q+LS + + + LP L L P + R L ++
Sbjct: 139 TQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSL------IK 192

Query: 433 GQILPKQKRQAQLQAAIARHHQEQAQYTQRLADKRLSYKTKAQELADVRTICEQ----EA 488
Q Q ++ Q + + + E+ R+ + + L D ++ + +
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKH 252

Query: 489 RIKDLESQRAHLQS--GQPCPLCGSTTHPAIAAYQALELSANQTRRDALEKEVKTLAEEG 546
+ + E++ + ++A + +L + + L+K +T
Sbjct: 253 AVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNI- 311

Query: 547 AALRGQLDALTQQLQRDE 564
L +L ++ Q
Sbjct: 312 GLLTLELAKNEERQQASV 329


10  SC0466  SC0489  Y   Y   Y   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC0466    0       22       3.041936    thiamine biosynthesis protein ThiI
SC0467    2       20       4.114175    2-aminoethylphosphonate ABC transporter
SC0468    0       20       4.548170    2-aminoethylphosphonate ABC transporter
SC0469    0       18       4.491876    2-aminoethylphosphonate ABC transporter ATPase
SC0470    0       18       3.410758    2-aminoethylphosphonate ABC transporter
SC0471    0       18       2.863218    2-aminoethylphosphonate transport, repressor
SC0472    -1      15       2.356307    2-aminoethylphosphonate--pyruvate transaminase
SC0473    -2      14       2.041379    phosphonoacetaldehyde hydrolase
SC0474    -2      14       -2.257513   DJ-1 family protein
SC0475    -1      17       -3.096462   2-dehydropantoate 2-reductase
SC0476    0       15       -2.926485   nucleotide-binding protein
SC0477    1       15       -2.970613   MFS family transporter
SC0478    0       17       -3.891532   IS903 transposase
SC0479    -1      15       -2.648643   hypothetical protein
SC0480    2       18       -0.955024   hypothetical protein
SC0481    1       23       0.704159    protoheme IX farnesyltransferase
SC0482    0       22       0.561869    cytochrome o ubiquinol oxidase subunit IV
SC0483    -1      20       0.482584    cytochrome o ubiquinol oxidase subunit III
SC0484    0       20       0.159843    cytochrome o ubiquinol oxidase subunit I
SC0485    1       17       -0.367722   cytochrome o ubiquinol oxidase subunit II
SC0486    3       21       -0.035287   muropeptide transporter
SC0487    4       24       -0.677490   hypothetical protein
SC0488    4       26       -0.746065   transcriptional regulator BolA
SC0489    3       24       -0.645268   trigger factor
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0469    PF05272  29     0.041    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.9 bits (64), Expect = 0.041
Identities = 7/21 (33%), Positives = 12/21 (57%)

Query: 46 VLALIGPSGSGKTTVLRAVAG 66
+ L G G GK+T++ + G
Sbjct: 598 SVVLEGTGGIGKSTLINTLVG 618


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC0470    MALTOSEBP  28     0.047    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 28.2 bits (62), Expect = 0.047
Identities = 33/118 (27%), Positives = 52/118 (44%), Gaps = 14/118 (11%)

Query: 118 PLVKNYLSFIYNSKLLKTAPASWQDL--LDAKFKNKLQYSTPGQAADGMAVMLQAFH-SF 174
P+ LS IYN LL P +W+++ LD + K K G++A + F
Sbjct: 133 PIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAK------GKSALMFNLQEPYFTWPL 186

Query: 175 GSKDAGFAYL---GKLQANNVGPSASTGK--LTALVNKGEIYVANGDLQMNLAQMERN 227
+ D G+A+ GK +VG + K LT LV+ + N D ++A+ N
Sbjct: 187 IAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAEAAFN 244


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0477    TCRTETA  84     1e-19    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 83.7 bits (207), Expect = 1e-19
Identities = 81/373 (21%), Positives = 151/373 (40%), Gaps = 25/373 (6%)

Query: 2 LGMFMVLPVLTTY--GMALQGASEALIGIAIGIYGLAQAIFQIPFGLLSDRIGRKPLIVG 59
+G+ +++PVL + A GI + +Y L Q G LSDR GR+P+++
Sbjct: 19 VGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLV 78

Query: 60 GLAVFVAGSVIAALSHSIWGIILGRALQG-SGAIAAAVMALLSDLTREQNRTKAMAFIGV 118
LA I A + +W + +GR + G +GA A A ++D+T R + F+
Sbjct: 79 SLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSA 138

Query: 119 SFGITFAIAMVLGPIVTHSLGLNALFWMIAALATLGILLTIWVVPNSTNHVLNRESGMVK 178
FG VLG ++ +A F+ AAL L L +++P S
Sbjct: 139 CFGFGMVAGPVLGGLMG-GFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREAL 197

Query: 179 GSFSKVLAEPRLLKLNFGIMCLHILLMSTFVA-LPGQLADAGFPAAEHWKVYLATMVIAF 237
+ + G+ + L+ F+ L GQ+ A + + + I
Sbjct: 198 NPLAS-------FRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGI 250

Query: 238 A--------AVVPFIIYAEVKRRMKQVFLFCVGLI--VVAEIVLWGAGQHFWELVIGVQL 287
+ ++ +I V R+ + +G+I I+L A + + I V L
Sbjct: 251 SLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLL 310

Query: 288 FFLAFNL--MEALLPSLISKESPAGYKGTAMGVYSTSQFLGVALGGSLGGWIDGTFDGQT 345
+ ++A+L + +E +G+ + S + +G L ++ T++G
Sbjct: 311 ASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTWNG-W 369

Query: 346 VFLAGAVLAMVWL 358
++AGA L ++ L
Sbjct: 370 AWIAGAALYLLCL 382


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0486    TCRTETB  41     9e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 41.0 bits (96), Expect = 9e-06
Identities = 40/190 (21%), Positives = 76/190 (40%), Gaps = 15/190 (7%)

Query: 223 RNNAWLI-LLLIVLYKLGDAFAMSLTTTFLIRGVGFDAGEVGVVNKTLGLLATIVGALYG 281
R+N LI L ++ + + + ++++ + VN L +I A+YG
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 282 GILMQRLSLFRALLIFGILQGASNAGYWLLSITDKNMFSMGAAVFFENLCGGMGTAAFVA 341
L +L + R LL I+ + ++ + FS+ + G G AAF A
Sbjct: 71 K-LSDQLGIKRLLLFGIIINCFGS----VIGFVGHSFFSL---LIMARFIQGAGAAAFPA 122

Query: 342 LLM----TLCNKSFSATQFALLSALSAVGRVYVGPVAGWFVEAH-GWPTFYLFSVVAAVP 396
L+M K F L+ ++ A+G VGP G + + W L ++ +
Sbjct: 123 LVMVVVARYIPKENRGKAFGLIGSIVAMG-EGVGPAIGGMIAHYIHWSYLLLIPMITIIT 181

Query: 397 GLLLLLVCRQ 406
L+ + ++
Sbjct: 182 VPFLMKLLKK 191


11  SC0555  SC0571  Y   Y   N   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC0555    0       19       -3.447345   DNA-binding transcriptional activator AllS
SC0556    1       20       -4.929511   ureidoglycolate hydrolase
SC0557    1       17       -4.402235   DNA-binding transcriptional repressor AllR
SC0558    2       18       -4.635121   hydroxypyruvate isomerase
SC0559    2       16       -2.981359   tartronic semialdehyde reductase
SC0560    2       15       -3.190614   permease
SC0561    2       14       -1.859420   allantoin permease
SC0562    1       14       -0.812775   allantoinase
SC0563    1       15       0.456415    purine permease YbbY
SC0564    0       13       1.379109    glycerate kinase
SC0565    -1      14       1.248471    hypothetical protein
SC0566    -1      13       2.292127    allantoate amidohydrolase
SC0567    -1      11       3.200213    ureidoglycolate dehydrogenase
SC0568    0       10       4.564727    membrane protein FdrA
SC0569    1       12       4.438953    hypothetical protein
SC0570    0       12       3.775135    hypothetical protein
SC0571    0       15       3.177814    carbamate kinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0560    TCRTETA  50     7e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 50.2 bits (120), Expect = 7e-09
Identities = 51/326 (15%), Positives = 112/326 (34%), Gaps = 24/326 (7%)

Query: 62 AYCSMQIPC----GILVDKFGQKIMLMAGFTLFIIGTLCIAKANGLAMIYTGSLMAGGGC 117
Y MQ C G L D+FG++ +L+ + +A A L ++Y G ++AG
Sbjct: 51 LYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITG 110

Query: 118 ASFFSSAYSLSSANVPQARRA----LANAIINSGSAIGMGIGLIGSSILVKNMSMAWQNV 173
A+ + A + + RA +A G G +G +
Sbjct: 111 AT-GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPH--------A 161

Query: 174 LYIVAAILVIMLCVFTLVIRGKAKSDSAQAEKQTQTVTEDEKRAPLFSGLLCSVYFLYFC 233
+ AA L + + + ++ + ++ R ++ ++ ++F
Sbjct: 162 PFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFI 221

Query: 234 TCYGYYLIVTWLPSYLQTERGFDGGAIGLASALVAVVG-VPGALFFSHLSDKFR-NSKVK 291
+ + + +D IG++ A ++ + A+ ++ + +
Sbjct: 222 MQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALM 281

Query: 292 VILGLEIVAAAMLAFTVLSPNTTMLMVSLTLYGLLGKMAVDPILISFVSEQASAKSLGRA 351
+ + + +LAF +MV L G+ P L + +S Q + G+
Sbjct: 282 LGMIADGTGYILLAFATRGWMAFPIMVLLASGGI-----GMPALQAMLSRQVDEERQGQL 336

Query: 352 FSLFNFFGMSSAVVAPTLTGFISDVT 377
+++V P L I +
Sbjct: 337 QGSLAALTSLTSIVGPLLFTAIYAAS 362


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC0562    UREASE  46     2e-07    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 46.3 bits (110), Expect = 2e-07
Identities = 36/163 (22%), Positives = 56/163 (34%), Gaps = 32/163 (19%)

Query: 4 DLIIKNGTVILENEARVIDIAVQGGKIAAIGEN------------LEEAKNVLDATGLIV 51
D +I N ++ DI ++ G+IAAIG+ + V+ G IV
Sbjct: 69 DTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIV 128

Query: 52 SPGMVDAHTHISEPGRTHWEGYETGTRAAAKGGITTMIEMPLNQLPATVDRET------- 104
+ G +D+H H P + A G+T M+ PA T
Sbjct: 129 TAGGMDSHIHFICPQQIE---------EALMSGLTCMLGGGTG--PAHGTLATTCTPGPW 177

Query: 105 -IELKFDAAKGKLTIDAAQLGGLVSYNLDRLHELDEVGVVGFK 146
I +AA ++ A G + L E+ G K
Sbjct: 178 HIARMIEAADA-FPMNLAFAGKGNASLPGALVEMVLGGATSLK 219


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0571    CARBMTKINASE  361    e-128    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 361 bits (929), Expect = e-128
Identities = 127/310 (40%), Positives = 177/310 (57%), Gaps = 16/310 (5%)

Query: 2 KTLVVALGGNALLQRGEALTAENQYRNIADAVPALARL-ARSYRLAIVHGNGPQVGLLAL 60
K +V+ALGGNAL QRG+ + E N+ +A + AR Y + I HGNGPQVG L L
Sbjct: 3 KRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLLL 62

Query: 61 QNLAWKA---VEPYPLDVLVAESQGMIGYMLAQRLALEPDM----PPVTTVLTRIKVSAD 113
A +A + P+DV A SQG IGYM+ Q L E V T++T+ V +
Sbjct: 63 HMDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVDKN 122

Query: 114 DPAFLEPEKFIGPVYSPEEQMALEATYGWHMKRD-GKYLRRVVASPAPRQIIESAAIELL 172
DPAF P K +GP Y E L GW +K D G+ RRVV SP P+ +E+ I+ L
Sbjct: 123 DPAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKL 182

Query: 173 LKEGHVVICSGGGGVPVAGEG---EGVEAVIDKDLAAALLAEQIAADGLIILTDADAVYE 229
++ G +VI SGGGGVPV E +GVEAVIDKDLA LAE++ AD +ILTD +
Sbjct: 183 VERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGAAL 242

Query: 230 HWGTPQQRAIRQASPDELAPFAKAD----GAMGPKVTAVSGYVKRCGKPAWIGALSRIDD 285
++GT +++ +R+ +EL + + G+MGPKV A +++ G+ A I L + +
Sbjct: 243 YYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAIIAHLEKAVE 302

Query: 286 TLAGRAGTCI 295
L G+ GT +
Sbjct: 303 ALEGKTGTQV 312


12  SC0580  SC0598  Y   Y   Y   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC0580    2       14       1.301989    hypothetical protein
SC0581    2       13       1.288180    bifunctional 5,10-methylene-tetrahydrofolate
SC0582    3       14       -0.405628   major type 1 subunit fimbrin (pilin)
SC0583    2       16       -1.177223   fimbrial protein
SC0585    2       16       -2.417371   outer membrane usher protein
SC0586    1       22       -6.310308   minor fimbrial subunit
SC0587    1       24       -7.816540   fimbrial protein
SC0588    1       27       -9.174232   transcriptional regulator FimZ
SC0589    4       37       -12.894249  regulatory protein
SC0590    4       42       -14.394251  fimbrial protein
SC0591    2       42       -13.465829  hypothetical protein
SC0592    2       41       -13.043435  *glycosyl translocase
SC0593    2       41       -13.144632  glycosyltransferase
SC0594    3       42       -13.630022  hypothetical protein
SC0595    -1      27       -6.847007   hypothetical protein
SC0596    0       16       -1.784229   glycosyltransferase
SC0597    1       17       -0.556999   glycosyl translocase
SC0598    2       17       -0.282024   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0585    PF00577  834    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 834 bits (2157), Expect = 0.0
Identities = 404/861 (46%), Positives = 559/861 (64%), Gaps = 21/861 (2%)

Query: 18 LSSVALSVLVALCPLTSRGESYFNPAFLSADTASVADLSRFEKGNHQPPGIYRVDIWRND 77
+ ++ A S E YFNP FL+ D +VADLSRFE G PPG YRVDI+ N+
Sbjct: 27 FVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNN 86

Query: 78 EFVATQDIRFEAGAVGTGDKSGGLMPCFTPEWIKRLGVNTAAFPVSDKGVDTTCIHLPEK 137
++AT+D+ F G D G++PC T + +G+NTA+ + D C+ L
Sbjct: 87 GYMATRDVTFNTG-----DSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSM 141

Query: 138 IPGAEVAFDFASMRLNISLPQASLLNSARGYIPPEEWDEGIPAALINYSFTGSR-----G 192
I A D RLN+++PQA + N ARGYIPPE WD GI A L+NY+F+G+ G
Sbjct: 142 IHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIG 201

Query: 193 TDSDSYFLSLLSGLNYGPWRLRNNGAWNYSKGDG--YHSQRWNNIGTWVQRAIIPLKSEL 250
+S +L+L SGLN G WRLR+N W+Y+ D +W +I TW++R IIPL+S L
Sbjct: 202 GNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRL 261

Query: 251 VMGDSNTGNDVFDSVGFRGARLYSSDNMYPDSLQGYAPTVRGIARTAAKLTIRQNGYVIY 310
+GD T D+FD + FRGA+L S DNM PDS +G+AP + GIAR A++TI+QNGY IY
Sbjct: 262 TLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIY 321

Query: 311 QSYVSPGAFAITDLNPTSSSGDLEVTVDEKDGSQQRYTVPYSTVPLLQREGRVKYDLVAG 370
S V PG F I D+ +SGDL+VT+ E DGS Q +TVPYS+VPLLQREG +Y + AG
Sbjct: 322 NSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAG 381

Query: 371 DFRSGNSQQSSPFFFQGTVIAGLPAGLTAYGGTQLADRYRAVVVGAGRNLGDWGAVSVDV 430
++RSGN+QQ P FFQ T++ GLPAG T YGGTQLADRYRA G G+N+G GA+SVD+
Sbjct: 382 EYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDM 441

Query: 431 THARSQLADDSTHQGQSLRFLYAKSLNNYGTNFQLLGYRYSTRGFYTLDDVAYRSMEGYD 490
T A S L DDS H GQS+RFLY KSLN GTN QL+GYRYST G++ D Y M GY+
Sbjct: 442 TQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYN 501

Query: 491 YEYDSDGRRHKVPVAQSYHNLRYSKKGRFQVNISQNLGDYGSLYLSGSQQNYWNTADTNT 550
E DG P Y+NL Y+K+G+ Q+ ++Q LG +LYLSGS Q YW T++ +
Sbjct: 502 IE-TQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDE 560

Query: 551 WYQLGYASGWQGISYSLSWSWNESVGISGADRILAFNMSVPFSVLTGRRYARDTLLDRTY 610
+Q G + ++ I+++LS+S ++ G D++LA N+++PF D+ +
Sbjct: 561 QFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPF----SHWLRSDSKSQWRH 616

Query: 611 ATFNANRNRNRDGDNSWQTGVGGTLLEGRNLSYSVTQGRS----STNSYSGSASASWQAT 666
A+ + + + + +G + GV GTLLE NLSYSV G + + +G A+ +++
Sbjct: 617 ASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGG 676

Query: 667 YGTLGVGYNYDRDQHDYNWQLSGGVVGHADGITFSQPLGDTNVLIKAPGAKGVRIENQTG 726
YG +GY++ D + +SGGV+ HA+G+T QPL DT VL+KAPGAK ++ENQTG
Sbjct: 677 YGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTG 736

Query: 727 VKTDWRGYAVMPYATVYRYNRVALDTNTMDNHTDVENNVSSVVPTEGALVRAAFDTRIGV 786
V+TDWRGYAV+PYAT YR NRVALDTNT+ ++ D++N V++VVPT GA+VRA F R+G+
Sbjct: 737 VRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGI 796

Query: 787 RAIITARLGGRPLPFGAIVRETASGITSMVGDDGQIYLSGLPLKGELFIQWGEGKNARCI 846
+ ++T +PLPFGA+V +S + +V D+GQ+YLSG+PL G++ ++WGE +NA C+
Sbjct: 797 KLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCV 856

Query: 847 APYALAEDSLKQAITIASATC 867
A Y L +S +Q +T SA C
Sbjct: 857 ANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC0588    HTHFIS  69     9e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 69.1 bits (169), Expect = 9e-16
Identities = 29/122 (23%), Positives = 59/122 (48%), Gaps = 2/122 (1%)

Query: 32 MKPASVIIMDEHPIVRMSIEVLLGKNSNIQVILKTDDSRTAIKYLRTYPVDLVILDIELP 91
M A++++ D+ +R + L + V T ++ T +++ DLV+ D+ +P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRA-GYDV-RITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 92 GTDGFTLLKRIKSIQEHTRILFLSSKSEAFYAGRAIRAGANGFVSKRKDLNDIYNAVKMI 151
+ F LL RIK + +L +S+++ A +A GA ++ K DL ++ +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118

Query: 152 LS 153
L+
Sbjct: 119 LA 120


13  SC0614  SC0648  Y   Y   N   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC0614    -2      11       3.641632    carboxylate-amine ligase
SC0615    -1      12       3.011251    phosphopantetheinyltransferase component of
SC0616    -1      12       3.088663    outer membrane receptor FepA
SC0617    1       15       4.142285    enterobactin/ferric enterobactin esterase
SC0618    2       16       4.327218    hypothetical protein
SC0619    1       15       4.563914    enterobactin synthase subunit F
SC0620    0       13       3.915674    ferric enterobactin transport protein FepE
SC0621    1       13       5.667263    iron-enterobactin transporter ATP-binding
SC0622    0       11       5.559112    iron-enterobactin transporter permease
SC0623    -1      12       5.141291    iron-enterobactin transporter membrane protein
SC0624    -2      12       4.688205    enterobactin exporter EntS
SC0625    -2      14       4.675343    iron-enterobactin transporter periplasmic
SC0626    -2      16       4.403479    isochorismate synthase
SC0627    -1      16       4.110490    enterobactin synthase subunit E
SC0628    0       17       4.189724    2,3-dihydro-2,3-dihydroxybenzoate synthetase
SC0629    0       15       3.605166    2,3-dihydroxybenzoate-2,3-dehydrogenase
SC0630    -1      14       2.811460    hypothetical protein
SC0631    -1      10       1.186443    carbon starvation protein
SC0632    -1      15       -2.528433   hypothetical protein
SC0633    -1      16       -2.603103   hypothetical protein
SC0634    -1      17       -4.051076   aminotransferase
SC0635    -1      14       -3.402359   hypothetical protein
SC0636    -2      15       -4.202093   bifunctional 3'-phosphoadenosine
SC0637    -3      12       -2.144968   LysR family transcriptional regulator
SC0638    -3      13       -0.181525   disulfide isomerase/thiol-disulfide oxidase
SC0639    -3      13       0.387967    alkyl hydroperoxide reductase
SC0640    -3      12       0.507689    alkyl hydroperoxide reductase
SC0641    -1      12       0.360116    hypothetical protein
SC0642    -1      12       1.428265    oxidoreductase
SC0643    1       13       2.374870    hydrogenase
SC0644    0       13       3.311300    hydrogenase
SC0645    0       17       4.109983    universal stress protein UspA and related
SC0646    -2      22       4.869877    nucleoside diphosphate kinase regulator
SC0647    -2      22       4.803105    DASS family citrate:succinate transporter
SC0648    -1      21       4.604858    triphosphoribosyl-dephospho-CoA synthase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0615    ENTSNTHTASED  369    e-133    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 369 bits (948), Expect = e-133
Identities = 231/234 (98%), Positives = 233/234 (99%)

Query: 1 MLTSHFPLPFAGHRLHIVDFDASSFHEHDLLWLPHHDRLRSAGRKRKAEHLAGRIAAVHA 60
MLTSHFPLPFAGHRLHIVDFDASSF EHDLLWLPHHDRLRSAGRKRKAEHLAGRIAAVHA
Sbjct: 1 MLTSHFPLPFAGHRLHIVDFDASSFREHDLLWLPHHDRLRSAGRKRKAEHLAGRIAAVHA 60

Query: 61 LREMGVRTVPGIGDKRQPLWPDGLFGSISHCATTALAVISRQRIGIDIEKIMSQHTATEL 120
LRE+GVRTVPG+GDKRQPLWPDGLFGSISHCATTALAVISRQRIGIDIEKIMSQHTATEL
Sbjct: 61 LREVGVRTVPGMGDKRQPLWPDGLFGSISHCATTALAVISRQRIGIDIEKIMSQHTATEL 120

Query: 121 APSIIDSDERQILQASLLPFPLALTLAFSAKESVYKAFSDRVTLPGFNSAKVTSLTATHI 180
APSIIDSDERQILQASLLPFPLALTLAFSAKESVYKAFSDRVTLPGFNSAKVTSLTATHI
Sbjct: 121 APSIIDSDERQILQASLLPFPLALTLAFSAKESVYKAFSDRVTLPGFNSAKVTSLTATHI 180

Query: 181 SLHLLPAFAATMAERTVRTEWFQRDNSVITLVSAITRVPHDRSAPASILSAIPR 234
SLHLLPAFAATMAERTVRTEWFQRDNSVITLVSAITRVPHDRSAPASILSAIPR
Sbjct: 181 SLHLLPAFAATMAERTVRTEWFQRDNSVITLVSAITRVPHDRSAPASILSAIPR 234


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0625    FERRIBNDNGPP  60     2e-12    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 60.0 bits (145), Expect = 2e-12
Identities = 47/210 (22%), Positives = 82/210 (39%), Gaps = 21/210 (10%)

Query: 105 EPNAETVAAQMPDLILISATGGDSALALYDQLSAIAPTLVINYDDKS-----WQSLLTQL 159
EPN E + P ++ SA G S + L+ IAP N+ D + LT++
Sbjct: 86 EPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPLAMARKSLTEM 141

Query: 160 GEITGQEKQAAARIAEFEAQLTTVKQRIALPPQPVSALVYTPAAHSANLWTPESAQGKLL 219
++ + A +A++E + ++K R L ++ P S ++L
Sbjct: 142 ADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEIL 201

Query: 220 TQLGFTLATLPRGLQTSKSQGKRHDIIQLGGENLAAGLNGESLFLFAGDNKDVAALYANP 279
+ G A + + + + LAA + + L ++KD+ AL A P
Sbjct: 202 DEYGIPNAW--------QGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATP 253

Query: 280 LLAHLPAVQNKRVHALGTETFRLDYYSATL 309
L +P V+ R + F Y ATL
Sbjct: 254 LWQAMPFVRAGRFQRVPAVWF----YGATL 279


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0628    ISCHRISMTASE  424    e-153    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 424 bits (1090), Expect = e-153
Identities = 148/299 (49%), Positives = 192/299 (64%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQSYALPTALDIPTNKVNWAFEPERAALLIHDMQDYFVSFWGRNCPMMDQVIANI 60
MAIP +Q Y +PTA D+P NKV+W +P RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRQYCKEHHIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVEALTPDEADTV 120
L+ C + IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P++ D V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKDIGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSREEHLMALNYVAGRSGRVVMTESLL------PTPVPASKAE-----------LRALIL 223
FS E+H MAL Y AGR VMT+SLL P V + A +R I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDETDEPLD-DENLIDYGLDSVRMMGLAARWRKVHGDIDFVMLAKNPTIDAWWALLS 281
LL ET E + E+L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W LL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0629    DHBDHDRGNASE  338    e-120    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 338 bits (867), Expect = e-120
Identities = 104/257 (40%), Positives = 148/257 (57%), Gaps = 20/257 (7%)

Query: 9 KTVWVTGAGKGIGYATALAFVDAGARVIGFDRE---------------FTQESYPFATEV 53
K ++TGA +GIG A A GA + D E++P
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP----- 63

Query: 54 MDVADAGQVAQVCQRVLQKTPRLDVLVNAAGILRMGATDALSVDDWQQTFAVNVGGAFNL 113
DV D+ + ++ R+ ++ +D+LVN AG+LR G +LS ++W+ TF+VN G FN
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 114 FSQTMAQFRRQQGGAIVTVASDAAHTPRIGMSAYGASKAALKSLALTVGLELAGCGVRCN 173
++ G+IVTV S+ A PR M+AY +SKAA +GLELA +RCN
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 174 VVSPGSTDTDMQRTLWVSEDAEQQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDLA 233
+VSPGST+TDMQ +LW E+ +Q I+G E FK GIPL K+A+P +IA+ +LFL S A
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQA 243

Query: 234 SHITLQDIVVDGGSTLG 250
HIT+ ++ VDGG+TLG
Sbjct: 244 GHITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC0640    STREPTOPAIN  31     0.011    Streptopain (C10) cysteine protease family signature.
		>STREPTOPAIN#Streptopain (C10) cysteine protease family signature.

Length = 398

Score = 31.2 bits (70), Expect = 0.011
Identities = 17/73 (23%), Positives = 33/73 (45%), Gaps = 1/73 (1%)

Query: 2 LDTNMKTQLRAYLEKLTKPVELIATLDDS-AKSAEIKELLAEIAELSDKVTFKEDNTLPV 60
D N K + +++E + ++ LD + A +AEIK+ + + S + + + N +
Sbjct: 109 FDANGKENIASFMESYVEQIKENKKLDTTYAGTAEIKQPVVKSLLDSKGIHYNQGNPYNL 168

Query: 61 RKPSFLITNPGSQ 73
P PG Q
Sbjct: 169 LTPVIEKVKPGEQ 181


14  SC0720  SC0766  Y   Y   Y   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC0720    -1      15       3.647771    putrescine transporter
SC0721    -1      14       4.106038    ornithine decarboxylase
SC0722    0       17       5.021562    DNA-binding transcriptional activator KdpE
SC0723    0       16       4.948326    sensor protein KdpD
SC0724    0       15       3.630784    potassium-transporting ATPase subunit C
SC0725    0       15       3.271198    potassium-transporting ATPase subunit B
SC0726    0       14       2.940838    potassium-transporting ATPase subunit A
SC0727    -1      15       2.969260    potassium-transporting ATPase subunit F
SC0728    -1      14       3.443422    deoxyribodipyrimidine photolyase
SC0729    -2      12       2.822805    POT family transport protein
SC0730    -1      15       3.851654    hydrolase-oxidase
SC0731    -2      12       3.064958    hypothetical protein
SC0732    -1      13       1.903340    hypothetical protein
SC0733    0       17       2.402198    LamB/YcsF family protein
SC0734    0       17       1.391802    endonuclease VIII
SC0735    1       21       2.006096    hypothetical protein
SC0736    2       25       2.012531    type II citrate synthase
SC0737    1       25       2.578984    succinate dehydrogenase cytochrome b556 large
SC0738    1       28       3.002434    succinate dehydrogenase flavoprotein subunit
SC0739    1       28       1.974847    succinate dehydrogenase iron-sulfur subunit
SC0740    2       31       2.307852    2-oxoglutarate dehydrogenase E1
SC0741    1       32       2.056071    dihydrolipoamide succinyltransferase
SC0742    1       31       1.683944    succinyl-CoA synthetase subunit beta
SC0743    1       28       0.850727    succinyl-CoA synthetase subunit alpha
SC0744    1       22       0.084744    cytochrome d terminal oxidase, polypeptide
SC0745    0       19       -0.314296   cytochrome d terminal oxidase polypeptide
SC0746    6       18       0.877202    hypothetical protein
SC0747    2       16       1.418410    hypothetical protein
SC0748    2       20       1.413395    acyl-CoA thioester hydrolase
SC0749    2       19       1.417262    colicin uptake protein TolQ
SC0750    2       17       1.879123    colicin uptake protein TolR
SC0751    1       16       1.547369    cell envelope integrity inner membrane protein
SC0752    -1      14       0.767602    translocation protein TolB
SC0753    -1      13       -0.091350   peptidoglycan-associated outer membrane
SC0754    -1      9        -0.122675   tol-pal system protein YbgF
SC0755    -2      12       -1.166857   *****quinolinate synthetase
SC0756    -1      18       -4.104402   NMN family, nucleoside/purine/pyrimidine
SC0757    -1      27       -6.935719   zinc transporter ZitB
SC0758    1       40       -10.576628  homeobox protein
SC0759    1       39       -10.314648  phospho-2-dehydro-3-deoxyheptonate aldolase
SC0760    2       26       -5.541675   fumarate hydratase class I anaerobic
SC0761    2       17       -1.722651   fumarate hydratase
SC0762    2       17       0.709187    LysR family transcriptional regulator
SC0763    1       18       3.576402    cation transporter
SC0764    0       28       7.568540    oxaloacetate decarboxylase subunit gamma
SC0765    0       25       6.513140    oxaloacetate decarboxylase
SC0766    -1      19       4.586240    oxalacetate decarboxylase subunit beta
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC0722    HTHFIS  91     1e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.4 bits (227), Expect = 1e-23
Identities = 37/123 (30%), Positives = 58/123 (47%), Gaps = 1/123 (0%)

Query: 2 TNVLIVEDEQAIRRFLRAALEGDGLRVYEAETLQRGLLEAATRKPDLIILDLGLPDGDGI 61
+L+ +D+ AIR L AL G V A DL++ D+ +PD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 DFIRDLRQWSA-IPVIVLSARSEESDKIAALDAGADDYLSKPFGIGELQARLRVALRRHA 120
D + +++ +PV+V+SA++ I A + GA DYL KPF + EL + AL
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 121 ASP 123
P
Sbjct: 124 RRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SC0733    V8PROTEASE  30     0.010    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 30.0 bits (67), Expect = 0.010
Identities = 15/87 (17%), Positives = 26/87 (29%), Gaps = 8/87 (9%)

Query: 35 LTLVSSANIACGFHAGDAQTMLT---CVREALKNGVAIGAHPSFPDRDNFG----RTAMV 87
+ + IA G G T+LT V + A+ A PS ++DN+ +
Sbjct: 95 VEAPTGTFIASGVVVGK-DTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQI 153

Query: 88 LPPETVYAQTLYQIGALGAIVQAQGGV 114
+ + V
Sbjct: 154 TKYSGEGDLAIVKFSPNEQNKHIGEVV 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC0739    TCRTETOQM  31     0.004    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 31.0 bits (70), Expect = 0.004
Identities = 12/41 (29%), Positives = 23/41 (56%), Gaps = 1/41 (2%)

Query: 15 VDNAPRMQDYTLEGEEGRDM-MLLDALIQLKEKDPSLSFRR 54
++N + T+E + + MLLDAL+++ + DP L +
Sbjct: 339 IENPLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYV 379


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC0751    IGASERPTASE  61     5e-12    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 60.8 bits (147), Expect = 5e-12
Identities = 28/194 (14%), Positives = 59/194 (30%), Gaps = 5/194 (2%)

Query: 64 YNRQQDQQASARRAEEERKKLQQQQAEELQQKQAAEQERLKQLEKERLAAQEQQKQAEEA 123
YN + +++ Q E R+ + A + E
Sbjct: 981 YNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETV 1040

Query: 124 AKLAQQQQQAEEAAKAAADAKKKAEAEAAKAAADAKKKAEAEAVKAAADAKKKAEAEAAK 183
A+ ++Q+ + E + A E AK A K+ +A + +
Sbjct: 1041 AENSKQESKTVEKNEQDATETTAQNREVAKEA-----KSNVKANTQTNEVAQSGSETKET 1095

Query: 184 AAADAKKKAEAEAAKAAAEAKKKAEAEAAKAAAEAKKKADAEAAKAAAEAKKKADAAAAK 243
+ K+ A E + A +K + + + K+ +E + AE ++ D
Sbjct: 1096 QTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNI 1155

Query: 244 AAADAKKKAAAAEG 257
++ A
Sbjct: 1156 KEPQSQTNTTADTE 1169



Score = 59.3 bits (143), Expect = 1e-11
Identities = 24/177 (13%), Positives = 58/177 (32%), Gaps = 2/177 (1%)

Query: 61 VQQYNRQQDQQASARRAEEERKKLQQQQAEELQQKQAAEQERLKQLEKERLAAQEQQKQA 120
+ N Q S EE ++ + +E ++ + ++ +
Sbjct: 997 ITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQ 1056

Query: 121 EEAAKLAQQQQQAEEAAKAAADAKKKAEAEAAKAAADAKKKAEAEAVKAAADAKKKAEAE 180
+ AQ ++ A+EA + E + + + E + A + ++KA+ E
Sbjct: 1057 DATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE-TATVEKEEKAKVE 1115

Query: 181 AAKAAADAKKKAEAEAAKAAAEA-KKKAEAEAAKAAAEAKKKADAEAAKAAAEAKKK 236
K K ++ + +E + +AE K+ ++ A +
Sbjct: 1116 TEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPA 1172



Score = 52.0 bits (124), Expect = 2e-09
Identities = 27/221 (12%), Positives = 75/221 (33%), Gaps = 20/221 (9%)

Query: 66 RQQDQQASARRAEEERKKLQQQQAEE--LQQKQAAEQER------LKQLEKERLAAQEQQ 117
+ A + E + + +Q A E Q ++ A++ + + E + ++ ++
Sbjct: 1035 ETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKE 1094

Query: 118 KQ-----------AEEAAKLAQQQQQAEEAAKAAADAKKKAEAEAAKAAADAKKKAEAEA 166
Q EE AK+ ++ Q E + + K+ ++E + A+ ++ +
Sbjct: 1095 TQTTETKETATVEKEEKAKVETEKTQ-EVPKVTSQVSPKQEQSETVQPQAEPARENDPTV 1153

Query: 167 VKAAADAKKKAEAEAAKAAADAKKKAEAEAAKAAAEAKKKAEAEAAKAAAEAKKKADAEA 226
++ A+ + A + E ++ + E + A + +
Sbjct: 1154 NIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNS 1213

Query: 227 AKAAAEAKKKADAAAAKAAADAKKKAAAAEGVDDLLGDLSS 267
+ + + + ++ + L DL+S
Sbjct: 1214 ESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTS 1254



Score = 48.9 bits (116), Expect = 3e-08
Identities = 23/191 (12%), Positives = 57/191 (29%)

Query: 65 NRQQDQQASARRAEEERKKLQQQQAEELQQKQAAEQERLKQLEKERLAAQEQQKQAEEAA 124
NR+ ++A + + Q E ++ Q E + +EKE A E +K E
Sbjct: 1065 NREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPK 1124

Query: 125 KLAQQQQQAEEAAKAAADAKKKAEAEAAKAAADAKKKAEAEAVKAAADAKKKAEAEAAKA 184
+Q + E++ A+ E + + + + A + + E
Sbjct: 1125 VTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVT 1184

Query: 185 AADAKKKAEAEAAKAAAEAKKKAEAEAAKAAAEAKKKADAEAAKAAAEAKKKADAAAAKA 244
+ + + ++ K + ++ + A ++
Sbjct: 1185 ESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDR 1244

Query: 245 AADAKKKAAAA 255
+ A +
Sbjct: 1245 STVALCDLTST 1255



Score = 42.4 bits (99), Expect = 3e-06
Identities = 23/198 (11%), Positives = 55/198 (27%), Gaps = 1/198 (0%)

Query: 59 AVVQQYNRQQDQQASARRAEEERKKLQQQQAEELQQKQAAEQERLKQLEKERLAAQEQQK 118
A Q Q + E K+ + EE + + + + + ++ + QEQ +
Sbjct: 1078 ANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSE 1137

Query: 119 QAEEAAKLAQQQQQAEEAAKAAADAKKKAEAEAAKAAADAKKKAEAEAVKAAADAKKKAE 178
+ A+ A++ + + A+ E + + E
Sbjct: 1138 TVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVE 1197

Query: 179 AEAAKAAADAKKKAEAEAAKAAAEAKKKA-EAEAAKAAAEAKKKADAEAAKAAAEAKKKA 237
A + +E++ +++ + D
Sbjct: 1198 NPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNT 1257

Query: 238 DAAAAKAAADAKKKAAAA 255
+A + A A A+ A
Sbjct: 1258 NAVLSDARAKAQFVALNV 1275


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SC0753    OMPADOMAIN  115    2e-33    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 115 bits (289), Expect = 2e-33
Identities = 36/119 (30%), Positives = 55/119 (46%), Gaps = 4/119 (3%)

Query: 56 EEQARLQMQQLQQNNIVYFDLDKYDIRSDFAAMLDAHANFLRSN--PSYKVTVEGHADER 113
+Q + + V F+ +K ++ + A LD + L + V V G+ D
Sbjct: 205 APAPEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRI 264

Query: 114 GTPEYNISLGERRANAVKMYLQGKGVSADQISIVSYGKEKPAVLGHDEAAYAKNRRAVL 172
G+ YN L ERRA +V YL KG+ AD+IS G+ P V G+ K R A++
Sbjct: 265 GSDAYNQGLSERRAQSVVDYLISKGIPADKISARGMGESNP-VTGN-TCDNVKQRAALI 321


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0754    RTXTOXIND     29     0.023    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 29.0 bits (65), Expect = 0.023
Identities = 2/39 (5%), Positives = 19/39 (48%)

Query: 56 LTQLQQQLSDNQSDIDSLRGQIQENQYQLNQVMERQKQI 94
+ + + + + +++ + Q+++ + ++ E + +
Sbjct: 254 VLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLV 292


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0765    RTXTOXIND     31     0.014    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 31.0 bits (70), Expect = 0.014
Identities = 17/67 (25%), Positives = 29/67 (43%), Gaps = 7/67 (10%)

Query: 508 ASSAPVQAAAPA-------GAGTPVTAPLAGNIWKVIATEGQTVAEGDVLLILEAMKMET 560
+ V+ A A G + + ++I EG++V +GDVLL L A+ E
Sbjct: 75 SVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA 134

Query: 561 EIRAAQA 567
+ Q+
Sbjct: 135 DTLKTQS 141



Score = 29.4 bits (66), Expect = 0.046
Identities = 15/56 (26%), Positives = 22/56 (39%), Gaps = 10/56 (17%)

Query: 535 KVIATEGQTVAEGDVLLILEAMKMETEIRAAQAGTVRGIAVKSGDAVSVGDTLMTL 590
V G+ G EI+ + V+ I VK G++V GD L+ L
Sbjct: 82 IVATANGKLTHSGRSK----------EIKPIENSIVKEIIVKEGESVRKGDVLLKL 127


15  SC0780  SC0792  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC0780    0       15       3.593317    molybdate ABC transporter permease
SC0781    0       14       3.938323    molybdate transporter ATP-binding protein
SC0782    0       14       4.069431    phosphotransferase
SC0783    0       17       4.983652    6-phosphogluconolactonase
SC0784    -1      16       5.766273    pectinesterase
SC0785    0       16       6.012267    imidazolonepropionase
SC0786    0       16       5.934977    formimidoylglutamase
SC0787    -1      16       5.074851    histidine utilization repressor
SC0788    0       15       5.606865    urocanate hydratase
SC0789    0       15       5.576993    histidine ammonia-lyase
SC0790    0       14       5.233260    kinase inhibitor protein
SC0791    -2      15       4.353507    adenosylmethionine-8-amino-7-oxononanoate
SC0792    -2      15       3.609667    biotin synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0785    PRTACTNFAMLY  30     0.014    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 30.4 bits (68), Expect = 0.014
Identities = 17/55 (30%), Positives = 24/55 (43%), Gaps = 5/55 (9%)

Query: 230 VLQTAKALGIPVKGHVEQLSLLGGAQLVSRYQGLSADHIEYLDEAGVAAMRDGGT 284
VL+ +P G +S+LG ++L L HI AGVAAM+
Sbjct: 202 VLRDTNVTAVPASGAPAAVSVLGASELT-----LDGGHITGGRAAGVAAMQGAVV 251


16  SC0838  SC0863  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC0838    -1      11       3.796515    pyruvate formate lyase
SC0839    0       13       3.370938    pyruvate formate lyase activating enzyme
SC0840    0       14       3.404780    molybdopterin biosynthesis protein MoeB
SC0841    1       15       3.207060    molybdopterin biosynthesis protein MoeA
SC0842    0       14       2.533916    L-asparaginase
SC0843    -1      14       -0.087295   glutathione transporter ATP-binding protein
SC0844    -1      14       -3.139395   ABC transporter substrate-binding protein
SC0845    -1      18       -4.076304   ABC transporter substrate-binding protein
SC0846    1       25       -6.381011   ABC transporter inner membrane component
SC0847    5       34       -8.570472   ribosomal protein S12 methylthiotransferase
SC0848    7       44       -10.574659  electron transfer flavoprotein subunit beta
SC0849    7       42       -10.041498  electron transfer flavoprotein alpha subunit
SC0850    5       34       -8.357877   hypothetical protein
SC0851    4       31       -7.522454   acyl-CoA dehydrogenase
SC0852    2       23       -5.405242   dehydrogenase
SC0853    0       15       -3.332140   transposase of Tn10
SC0854    -2      11       -1.385109   LysR family transcriptional regulator
SC0855    -1      10       0.599005    dehydrogenase
SC0856    -1      12       0.228928    glutathione S-transferase
SC0857    -1      11       0.996905    D-alanyl-D-alanine carboxypeptidase
SC0858    0       12       0.800718    DNA-binding transcriptional repressor DeoR
SC0859    1       12       1.027898    undecaprenyl pyrophosphate phosphatase
SC0860    1       12       0.845098    multidrug translocase
SC0861    2       14       0.160709    hypothetical protein
SC0862    2       12       0.391487    hypothetical protein
SC0863    2       15       -0.147385   paral regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0857    BLACTAMASEA   47     5e-08    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 46.7 bits (111), Expect = 5e-08
Identities = 49/207 (23%), Positives = 78/207 (37%), Gaps = 25/207 (12%)

Query: 1 MTQYASSLRSLAAGSVLLFLFASPVKAEEQTIAPPGVDAR-AWILMDYASGKVLAEGNAD 59
M + SL A ++ L + ASP E+ ++ + R I MD ASG+ L AD
Sbjct: 1 MRYIRLCIISLLA-TLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRAD 59

Query: 60 EKLDPASLTKIMTSYVVGQALKAGKIKLTDMVTVGKDAWATGNPALRGSSVMFLKPGDQV 119
E+ S K++ V + AG +L + + +P V D +
Sbjct: 60 ERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSP------VSEKHLADGM 113

Query: 120 SVADLNKGIIIQSGNDACIALADYVAGSQESFIGLMNAYAKRLGLTNTT---FQTVHGLD 176
+V +L I S N A L V G + A+ +++G T ++T
Sbjct: 114 TVGELCAAAITMSDNSAANLLLATVGGPAG-----LTAFLRQIGDNVTRLDRWETELNEA 168

Query: 177 APGQF---STARDMA------LLGKAL 194
PG +T MA L + L
Sbjct: 169 LPGDARDTTTPASMAATLRKLLTSQRL 195


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0860    TCRTETB       42     3e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 42.2 bits (99), Expect = 3e-06
Identities = 65/356 (18%), Positives = 126/356 (35%), Gaps = 51/356 (14%)

Query: 48 QAGLDWVPTSMTAYLAGGMFLQWLLGPLSDRIGRRPVMLAGVVWFIVTCLATLLAKNIEQ 107
A +WV T+ + G + G LSD++G + ++L G++ + + +
Sbjct: 48 PASTNWVNTAFMLTFSIGTAV---YGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFS 104

Query: 108 FT-FLRFLQGISLCFIGAVGYAAIQESFEEAVCIKITALMANVALISPLLGPLVGAAWVH 166
RF+QG A+ + + K L+ ++ + +GP +G H
Sbjct: 105 LLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAH 164

Query: 167 VLPWEGMFILFAALAAIAFFGLQRAMPETATRRGE------------------------- 201
+ W +L + I L + + + +G
Sbjct: 165 YIHW-SYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSI 223

Query: 202 ------TLSFKALGRDYRLV---------IKNRRFVAGALALGFVSLPLLAWIAQSPIII 246
LSF + R V KN F+ G L G + + +++ P ++
Sbjct: 224 SFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMM 283

Query: 247 ISGEQLSSYEYG-LLQVPVFGALIAGNLVLARLTSRRTVRSLIVMGGWPIVAGLIITAAA 305
QLS+ E G ++ P ++I + L RR ++ +G + + +
Sbjct: 284 KDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTAS-- 341

Query: 306 TVVSSHAYLWMTAGLSVYAFGIGLANAGLVRLTLFSSDMSKGTVSAAMGMLQMLIF 361
+ +MT + V+ G GL+ V T+ SS + + A M +L F
Sbjct: 342 -FLLETTSWFMTIII-VFVLG-GLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSF 394


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0862    TCRTETB       33     0.002    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 33.3 bits (76), Expect = 0.002
Identities = 33/150 (22%), Positives = 65/150 (43%), Gaps = 6/150 (4%)

Query: 218 LLIGVVVLAMAFAEGSANDWL-PLLMVDGHGFSP-TSGSLIYAGFTLGMTVGRFTGGWFI 275
+IGV+ + F + + P +M D H S GS+I T+ + + + GG +
Sbjct: 258 FMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILV 317

Query: 276 DRYSRVTVVR-ASALM--GALGIGLIIFVDSDWVA-GVSVILWGLGASLGFPLTISAASD 331
DR + V+ + L ++ S ++ + +L GL + TI ++S
Sbjct: 318 DRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSL 377

Query: 332 TGPDAPTRVSVVATTGYLAFLVGPPLLGYL 361
+A +S++ T +L+ G ++G L
Sbjct: 378 KQQEAGAGMSLLNFTSFLSEGTGIAIVGGL 407


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0863    HTHTETR       47     6e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 47.3 bits (112), Expect = 6e-09
Identities = 17/80 (21%), Positives = 33/80 (41%)

Query: 7 RRANDPKRREKIIQATLEAVKTYGVHAVTHRKIAAIAQVPLGSMTYYFAGMDALLSEAFT 66
+ + R+ I+ L GV + + +IA A V G++ ++F L SE +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 67 LFTENMSRQYQDFFAQVTDA 86
L N+ ++ A+
Sbjct: 65 LSESNIGELELEYQAKFPGD 84


17  SC0874  SC0888  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC0874    2       16       -3.296550   hypothetical protein
SC0875    1       15       -3.145709   23S rRNA methyluridine methyltransferase
SC0876    0       14       -3.746588   PTS system ascorbate-specific transporter
SC0877    -2      12       -2.378841   inner membrane protein
SC0878    -3      12       -1.219413   sulfatase
SC0879    -2      13       -0.559159   arginine ABC transporter ATP-binding protein
SC0880    -1      13       0.369384    arginine transporter permease subunit ArtM
SC0881    -1      14       1.547978    arginine transporter permease subunit ArtQ
SC0882    0       14       1.484506    arginine ABC transporter ATP-binding protein
SC0883    1       16       3.025135    arginine transporter ATP-binding subunit
SC0884    2       17       4.049105    lipoprotein
SC0885    2       16       4.017387    hypothetical protein
SC0886    1       15       3.989708    aminidase
SC0887    -2      19       3.767693    nucleoside-diphosphate-sugar epimerase
SC0888    -2      17       3.363677    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0883    PF05272       30     0.007    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.007
Identities = 16/50 (32%), Positives = 22/50 (44%), Gaps = 1/50 (2%)

Query: 31 LVLLGPSGAGKSSLLRVLNLLEMPRSGTLTIAGNHFDFTKTPSDKAIREL 80
+VL G G GKS+L+ L L+ S T G D + + EL
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDF-FSDTHFDIGTGKDSYEQIAGIVAYEL 647


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0887    NUCEPIMERASE  64     2e-13    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 63.6 bits (155), Expect = 2e-13
Identities = 68/370 (18%), Positives = 122/370 (32%), Gaps = 71/370 (19%)

Query: 1 MKVLVTGATSGLGRNAVEFLRNKGISVRA---------TGRNEAMGKLLEKMGAEFVHAD 51
MK LVTGA +G + + L G V +A +LL + G +F D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 52 LTELVSSQAKVMLAGIDTLWHCS-------SFTSPWGTQQAFDLANVRATRRLGEWAVAW 104
L + + ++ S +P A+ +N+ + E
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPH----AYADSNLTGFLNILEGCRHN 116

Query: 105 GVRNFIHISSPSLYFDYHHHRDIKEDFRPHRFANEFARSKAAGEEVINLLAQANPQT--- 161
+++ ++ SS S+Y + D + +A +K A E L+A
Sbjct: 117 KIQHLLYASSSSVYGL-NRKMPFSTDDSVDHPVSLYAATKKANE----LMAHTYSHLYGL 171

Query: 162 RFTVLRPQSLFGPHDK--VFIPRLAHMMHHYGSVLLPHGGSALVDMTYYENAIHAMWLAS 219
T LR +++GP + + + + M S+ + + G D TY ++ A+
Sbjct: 172 PATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQ 231

Query: 220 QPGCDHLPS--------------GRAYNITNGENRTLRSIVQKLIDELTIDCRIRSVPYP 265
R YNI N L +Q L D L I+ + +P
Sbjct: 232 DVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQ 291

Query: 266 MLDMIARSMERFGKKSAKEPPLTHYGVSKLNFDFTLNTTRAQDELGYQPIITLDEGIERT 325
D+ T +T + +G+ P T+ +G++
Sbjct: 292 PGDV----------------LETS-----------ADTKALYEVIGFTPETTVKDGVKNF 324

Query: 326 AAWLRDHGNL 335
W RD +
Sbjct: 325 VNWYRDFYKV 334


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0888    NUCEPIMERASE  55     2e-10    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 55.2 bits (133), Expect = 2e-10
Identities = 30/125 (24%), Positives = 49/125 (39%), Gaps = 17/125 (13%)

Query: 4 RILVLGASGYIGQHLVFALSQQGHQVRA---------AARRIERLEKQRLANVSCHKVDL 54
+ LV GA+G+IG H+ L + GHQV + + RLE HK+DL
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 55 HWPENLPALLRD--IDTVYYLVH------GMGEGGDFIAHERQAALNVRDALRQTPVKQL 106
E + L + V+ H + + LN+ + R ++ L
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 107 IFLSS 111
++ SS
Sbjct: 122 LYASS 126


18  SC0909  SC0935  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC0909    -1      11       3.750217    leucyl/phenylalanyl-tRNA--protein transferase
SC0910    -2      11       3.299563    cysteine/glutathione ABC transporter
SC0911    -1      10       3.188691    cysteine/glutathione ABC transporter
SC0912    -1      11       2.842932    thioredoxin reductase
SC0913    -1      11       2.324234    leucine-responsive transcriptional regulator
SC0914    -1      12       2.723073    DNA translocase FtsK
SC0915    -2      13       1.518135    outer-membrane lipoprotein carrier protein
SC0916    -1      14       1.841082    recombination factor protein RarA
SC0917    0       15       0.874255    seryl-tRNA synthetase
SC0918    2       12       -0.560620   anaerobic dimethyl sulfoxide reductase subunit
SC0919    4       15       -1.581388   anaerobic dimethyl sulfoxide reductase subunit
SC0920    5       14       -2.625664   anaerobic dimethyl sulfoxide reductase subunit
SC0921    5       25       -7.657720   hypothetical protein
SC0922    2       18       -5.652193   MFS family transporter protein
SC0923    3       21       -6.418931   amino acid APC transporter
SC0924    1       21       -5.062472   pyruvate formate lyase-activating enzyme 1
SC0925    1       22       -4.750534   hypothetical protein
SC0926    1       20       -3.983890   secreted protein SopD-like protein
SC0927    1       20       -1.381180   pyruvate formate lyase I
SC0928    -2      10       -1.357957   formate transporter
SC0929    -1      11       -0.279055   hypothetical protein
SC0930    1       22       -0.622214   hypothetical protein
SC0931    1       25       -0.611504   phosphoserine aminotransferase
SC0932    1       26       -0.916870   3-phosphoshikimate 1-carboxyvinyltransferase
SC0933    3       30       -1.699592   Zn-dependent protease with chaperone function
SC0934    5       34       -1.216981   cytidylate kinase
SC0935    6       40       -1.547013   30S ribosomal protein S1
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0914    IGASERPTASE   53     1e-08    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 52.8 bits (126), Expect = 1e-08
Identities = 45/279 (16%), Positives = 89/279 (31%), Gaps = 42/279 (15%)

Query: 602 PQLPRPNRVR-----VPTRRELASYGIKLPSQRIAE-------EKAREAERNQYETGAQL 649
+ PN ++ VP+ E + + P A E E + + +T +
Sbjct: 995 TNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKN 1054

Query: 650 TDEEIDAMHQDELARQFAQSQQHRYGETYQHDTQQAEDDDTAAEAELARQFAASQQQRYS 709
+ + Q R+ A+ + + +TQ E + +E + + +
Sbjct: 1055 EQDATETTAQ---NREVAKEAK----SNVKANTQTNEVAQSGSETKETQTTETKETATVE 1107

Query: 710 GEQPAGAQPFSLDDLDFSPMKVLVDEGPHEPLFTPGVMPESTPVQQPVAPQPQYQQPQQP 769
E+ A KV ++ P T V P+ +Q QPQ +P +
Sbjct: 1108 KEEKA---------------KVETEKTQEVPKVTSQVSPKQ---EQSETVQPQ-AEPARE 1148

Query: 770 VAPQPQYQQPQQPVASQPQYQQPQQPVAPQ-PQYQQPQQPVAPQPQYQQPQQPVAPQPQY 828
P ++PQ + +QP + + Q V + P P
Sbjct: 1149 NDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSV--VENPENTTPAT 1206

Query: 829 QQPQQPVALQPQYQ-QPQQPVAPQPQYQQPQQPTAPQDS 866
QP + + + ++ V P +P ++ S
Sbjct: 1207 TQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRS 1245



Score = 43.9 bits (103), Expect = 5e-06
Identities = 31/175 (17%), Positives = 55/175 (31%), Gaps = 17/175 (9%)

Query: 405 QPQEAQSAPWQQPVPVASAPQYAATPATAAEYDSLAPQETQPQWQAPDAEQHWQPEPTHQ 464
P+ +Q PQ A PA + + D EQ + ++
Sbjct: 1122 VPKVTSQVSPKQEQSETVQPQ--AEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNV 1179

Query: 465 PEPVYQPEPIAAEPSHMPPPVIEQPVATEPEPNTEETRPARPPLYYFEEVEEKRAREREQ 524
+PV + + S + P P T+P N+E + + + + R
Sbjct: 1180 EQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESS----------NKPKNRHRRSVRS 1229

Query: 525 LAAWYQPIPEPVKENVPVKSTVSVAPSIPPVEAVAAAASLDAGIKSGALAAGAAA 579
+ EP + +STV++ A + A + AL G A
Sbjct: 1230 VPH----NVEPATTSSNDRSTVALCDLT-STNTNAVLSDARAKAQFVALNVGKAV 1279



Score = 37.7 bits (87), Expect = 4e-04
Identities = 22/188 (11%), Positives = 47/188 (25%), Gaps = 29/188 (15%)

Query: 748 PESTPVQQPVAPQPQYQ-------QPQQPVAPQPQYQQPQQPVASQPQYQQPQQPVAPQP 800
+ PV P P Q+ + Q + A + + +
Sbjct: 1020 VDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKAN 1079

Query: 801 QYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVALQPQYQQPQ----------QPVAP 850
+ + Q + ++ + V + + P+ Q
Sbjct: 1080 TQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETV 1139

Query: 851 QPQYQ------------QPQQPTAPQDSLIHPLLMRNGDSRPLQRPTTPLPSLDLLTPPP 898
QPQ + +PQ T P + + +T + + + + P
Sbjct: 1140 QPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENP 1199

Query: 899 SEVEPVDT 906
P T
Sbjct: 1200 ENTTPATT 1207



Score = 35.4 bits (81), Expect = 0.002
Identities = 20/173 (11%), Positives = 46/173 (26%), Gaps = 10/173 (5%)

Query: 754 QQPVAPQPQYQQPQQPVAPQPQYQQPQQPVASQPQYQ---QPQQPVAPQPQYQQPQQPVA 810
Q Q ++ + + VA Q + ++ + V
Sbjct: 1056 QDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVE 1115

Query: 811 PQPQYQQPQQPVAPQPQYQQPQQPVALQPQYQQPQQPV----APQPQYQQPQQPTAPQDS 866
+ + P+ P+ +Q + +QPQ + ++ +PQ Q Q +
Sbjct: 1116 TEKTQEVPKVTSQVSPKQEQSE---TVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPA 1172

Query: 867 LIHPLLMRNGDSRPLQRPTTPLPSLDLLTPPPSEVEPVDTFALEQMARLVEAR 919
+ + T + P+ +P + R
Sbjct: 1173 KETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRR 1225


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0922    TCRTETB       34     8e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.1 bits (78), Expect = 8e-04
Identities = 39/158 (24%), Positives = 62/158 (39%), Gaps = 6/158 (3%)

Query: 8 VMLLLCGLLLLT-LAIAVLNTLVPLWLAQANLPTWQVGMVSSSYFTGNLVGTLFTGYLIK 66
+++ LC L + L VLN +P N P V++++ +GT G L
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 67 RIGFNRSYYLASLIFAAGCVGLGVMVGFWSWMSW-RFIAGIGCAMIWVVVESALMCSGTS 125
++G R +I G V V F+S + RFI G G A +V +
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 126 HNRGRLLAAYMMVYYMGTFLGQLLVSKVSGELLHVLPW 163
NRG+ + MG +G + G + H + W
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPA----IGGMIAHYIHW 168


19  SC0956  SC1008  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC0956    0       17       -4.621118   outer membrane protein 1a (IA;b;f), porin
SC0957    -1      17       -3.939028   asparaginyl-tRNA synthetase
SC0958    1       22       -5.527959   leucine response regulator
SC0959    2       23       -4.822609   diaminopropionate ammonia-lyase
SC0960    3       22       -4.128850   hypothetical protein
SC0961    3       22       -4.422012   Lrp family transcriptional regulator
SC0962    4       24       -2.715214   nicotinate phosphoribosyltransferase
SC0963    6       29       -3.871975   Gifsy-2 prophage integrase
SC0965    5       29       -3.266428   Gifsy-2 prophage RecT
SC0966    6       36       -9.070807   exodeoxyribonuclease VIII
SC0967    5       42       -11.904026  Gifsy-1 prophage protein
SC0968    5       42       -10.519182  Gifsy-1 prophage protein
SC0969    6       43       -9.316819   Gifsy-1 prophage protein
SC0970    7       39       -8.690330   hypothetical protein
SC0971    5       38       -7.060948   hypothetical protein
SC0972    4       34       -4.337786   hypothetical protein
SC0973    4       32       -3.166894   regulator
SC0974    3       28       -2.374098   Gifsy-1 prophage cI
SC0975    2       28       -2.410050   replication protein O of prophage CP-933X
SC0976    1       23       -0.113754   replication protein O of prophage CP-933X
SC0977    0       21       -0.931994   hypothetical protein
SC0978    0       21       -1.166754   Gifsy-1 prophage DinI
SC0979    2       25       -2.729139   Gifsy-2 prophage protein
SC0980    2       26       -2.923887   hypothetical protein
SC0981    3       26       -2.519671   Gifsy-1 prophage protein
SC0982    4       29       -3.603243   Gifsy-1 prophage protein
SC0983    5       29       -3.617022   Gifsy-1 prophage RegQ
SC0984    6       31       -3.263535   Gifsy-2 prophage GtgA
SC0985    3       25       -0.352307   Gifsy-1 prophage protein
SC0986    4       26       -0.532998   Gifsy-2 prophage lysozyme
SC0987    4       26       -0.735490   Gifsy-2 prophage RzpD
SC0988    3       27       1.035530    Gifsy-2 prophage protein
SC0990    3       28       1.199377    hypothetical protein
SC0991    2       27       0.434878    Gifsy-2 prophage VmtV
SC0992    4       28       0.009849    Gifsy-2 prophage minor tail protein
SC0993    4       27       0.287339    Gifsy-2 prophage minor tail protein
SC0994    4       25       0.508263    Gifsy-2 prophage minor tail protein
SC0995    7       24       -1.132513   Gifsy-2 prophage minor tail protein
SC0996    6       22       -0.987828   Gifsy-2 prophage attachment and invasion
SC0997    4       25       2.192742    Gifsy-2 prophage superoxide dismutase(Cu-Zn)
SC0998    3       24       2.125620    Gifsy-2 prophage minor tail protein
SC0999    3       28       -0.086865   Gifsy-2 prophage tail assembly protein
SC1000    2       28       -0.169309   Gifsy-2 prophage tail assembly protein
SC1001    1       28       -0.715367   hypothetical protein
SC1002    2       29       -1.027063   Gifsy-2 prophage tail fiber protein
SC1003    5       33       -9.043967   Gifsy-2 prophage tail fiber assembly
SC1004    6       32       -9.845870   Gifsy-2 prophage type III secreted protein
SC1005    -1      13       -3.431432   ISSfl3 orfC,D
SC1006    -1      15       -4.092240   hypothetical protein
SC1007    -2      16       -4.257649   hypothetical protein
SC1008    -3      15       -3.008206   Gifsy-2 prophage protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0956    ECOLIPORIN    474    e-170    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 474 bits (1221), Expect = e-170
Identities = 214/389 (55%), Positives = 264/389 (67%), Gaps = 31/389 (7%)

Query: 2 MKRKILAAVIPALLAAATANANAAEIYNKDGNKLDLYGKAVGRHVWTTTGDSKNADQTYA 61
MKRK+LA VIPALLAA A+A AEIYNKDGNKLDLYGK G H + + SK+ DQTY
Sbjct: 1 MKRKVLALVIPALLAAGAAHA--AEIYNKDGNKLDLYGKVDGLH-YFSDDSSKDGDQTYM 57

Query: 62 QIGFKGETQINTDLTGFGQWEYRTKADRAEGEQQNSNLVRLAFAGLKYAEVGSIDYGRNY 121
++GFKGETQIN LTG+GQWEY +A+ EGE NS RLAFAGLK+ + GS DYGRNY
Sbjct: 58 RVGFKGETQINDQLTGYGQWEYNVQANTTEGEGANS-WTRLAFAGLKFGDYGSFDYGRNY 116

Query: 122 GIVYDVESYTDMAPYFSGETWGGAYTDNYMTSRAGGLLTYRNSDFFGLVDGLSFGIQYQG 181
G++YDVE +TDM P F G+++ Y DNYMT RA G+ TYRN+DFFGLVDGL+F +QYQG
Sbjct: 117 GVLYDVEGWTDMLPEFGGDSY--TYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQG 174

Query: 182 KNQDNHS---------------INSQNGDGVGYTMAYEFD-GFGVTAAYSNSKRTNDQQD 225
KN+ + I NGDG G + Y+ GF AAY+ S RTN+Q +
Sbjct: 175 KNESQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVN 234

Query: 226 RDG---NGDRAESWAVGAKYDANNVYLAAVYAETRNMSIVENTVTD-TVEMANKTQNLEV 281
G GD+A++W G KYDANN+YLA +Y+ETRNM+ T +ANKTQN EV
Sbjct: 235 AGGTIAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEV 294

Query: 282 VAQYQFDFGLRPAISYVQSKGKQLNGAD---GSADLAKYIQAGATYYFNKNMNVWVDYRF 338
AQYQFDFGLRPA+S++ SKGK L + DL KY GATYYFNKN + +VDY+
Sbjct: 295 TAQYQFDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKI 354

Query: 339 NLLDEND--YSSSYVGTDDQAAVGITYQF 365
NLLD++D Y + + TDD A+G+ YQF
Sbjct: 355 NLLDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0976    ACRIFLAVINRP  29     0.023    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.7 bits (64), Expect = 0.023
Identities = 20/108 (18%), Positives = 35/108 (32%), Gaps = 12/108 (11%)

Query: 32 NSDAERLVDALFMQLKQI-------FPAATQTNLRSDADERVAKQQWIAAFSENGIRTRK 84
+ AE ++ M+L +I F L + + + R
Sbjct: 640 ENSAEAVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARN 699

Query: 85 QLSAGMQKARSSQSPFWPS-----PGQFISWCREGSGALGVSVDDIMD 127
QL + +S P+ + +E + ALGVS+ DI
Sbjct: 700 QLLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQ 747


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0994    GPOSANCHOR    35     0.002    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 34.7 bits (79), Expect = 0.002
Identities = 43/298 (14%), Positives = 92/298 (30%), Gaps = 29/298 (9%)

Query: 477 LDEKIATLQEKIARARKTPWTVSSSQTEYDQQQLNELQEQKRQKDLLDAKAQAERNYQKT 536
+ + TL+ K + + E ++ N ++ ++ L KA + +
Sbjct: 62 FEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEAR 121

Query: 537 QKRRNEQNAALNRDNETESLRHQREVARITAMQYADAAVRNAALERENERHKKAMARQKE 596
+ + T + + A A A ALE A+ K
Sbjct: 122 KADLEKALEGAMNF-STADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKT 180

Query: 597 KPKAYYNDEAGRLLLQYSQQQAQTEGLIAAAKLSTTEKMTEAHKQLLSFQQRIADLSGKK 656
EA + L+ + + A +AK+ T E A
Sbjct: 181 LEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARK-----------AD 229

Query: 657 LTADEQSVLAHKDEIALALQKLDISQQDLQHQNAFNELKKKTLTLTSQLADEESRVRQQH 716
L + + + ++ L+ + L+ + A E + S + + +
Sbjct: 230 LEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAE 289

Query: 717 ALALATMGMGDQQRGRYEEHLKIQQHYQEQLEQLKRDSKAKGTYGSDEYRQAEQELQA 774
AL + ++ E ++ + L+RD A R+A+++L+A
Sbjct: 290 KAAL------EAEKADLEHQSQVLN---ANRQSLRRDLDAS--------REAKKQLEA 330



Score = 33.9 bits (77), Expect = 0.004
Identities = 47/248 (18%), Positives = 76/248 (30%), Gaps = 19/248 (7%)

Query: 479 EKIATLQEKIARARKTPWTVSSSQTEYDQQQLNELQEQKRQKDLLDAKAQAERNYQKTQK 538
E + KT ++ + L+ AK + +
Sbjct: 165 EGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALA 224

Query: 539 RRNEQNAALNRDNETESLRHQREVARITAMQYADAAVRNAALERENERHKKAMA------ 592
R S ++ + A + A R A LE+ E
Sbjct: 225 ARKADLEKALEGAMNFSTADSAKIKTLEA-EKAALEARQAELEKALEGAMNFSTADSAKI 283

Query: 593 RQKEKPKAYYNDEAGRLLLQYSQQQAQTEGLIAAAKLSTTEKM-TEAHKQLLSFQQRIAD 651
+ E KA E L Q A + L S K EA Q L Q +I++
Sbjct: 284 KTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISE 343

Query: 652 LSGKKLTADEQSVLAHKDEIALALQKL-------DISQQDLQHQ-NAFNELKKKTLTLTS 703
S + L D + K ++ QKL + S+Q L+ +A E KK+ +
Sbjct: 344 ASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQ---VEK 400

Query: 704 QLADEESR 711
L + S+
Sbjct: 401 ALEEANSK 408


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0996    ENTEROVIROMP  159    1e-52    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein

signature.
Length = 171

Score = 159 bits (404), Expect = 1e-52
Identities = 51/156 (32%), Positives = 83/156 (53%), Gaps = 8/156 (5%)

Query: 1 MKKIV-VAVLVGLALGSIGVANAAGYKNTVSIGYAYTDLSGWLSGNANGANIKYNWEDLD 59
MKKI ++ L + + G + AA +TV+ GYA +D G ++ G N+KY +E+ +
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAA--TSTVTGGYAQSDAQGQMN-KMGGFNLKYRYEEDN 57

Query: 60 SGFGAMGSVTYTSADVNNYGYKVGDADYTSLLVGPSYRFNDYLNAYVMIGAANGHIKDN- 118
S G +GS TYT Y + GP+YR ND+ + Y ++G G +
Sbjct: 58 SPLGVIGSFTYTEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGVGYGKFQTTE 117

Query: 119 ---WGNSDNKTAFAYGAGIQLNPVENIAVNASYEHT 151
+ + + F+YGAG+Q NP+EN+A++ SYE +
Sbjct: 118 YPTYKHDTSDYGFSYGAGLQFNPMENVALDFSYEQS 153


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1002    CHANLCOLICIN  34     0.004    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 33.5 bits (76), Expect = 0.004
Identities = 58/240 (24%), Positives = 86/240 (35%), Gaps = 18/240 (7%)

Query: 123 NATAAGQASEQAQTSAGQASES-----ATAAVNAAGAAEASATQAASSAASAESSAGTA- 176
N T G S G SES ATA + A + A QAA + A+AE+ A
Sbjct: 26 NGTPDGSGSGGGGGKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKA 85

Query: 177 -----TTKAGEASASAASADTARTAAAASAAAAKTSEANADASRTA---AGDSAAAAAAS 228
T + + A + +RT +A A A + A+ R A + A A +
Sbjct: 86 NRDALTQRLKDIVNEALRHNASRTPSATELAHANNAAMQAEDERLRLAKAEEKARKEAEA 145

Query: 229 ATAAQTSAERAGASETAAKTSETQAASSAGDAGASATAAAASEKAAAASAAAAKTSETNA 288
A A AE+ E + +ET+ +A AA SE+A A A K S +
Sbjct: 146 AEKAFQEAEQR-RKEIEREKAETERQLKLAEA-EEKRLAALSEEAKAVEIAQKKLSAAQS 203

Query: 289 ATSASTAAASATAASSSASEASTHAAASDTSASLA--AQSSTAAGAAATRAEDAAKRAED 346
+ S+S + A + AQ+S + + RA D
Sbjct: 204 EVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRAND 263


20  SC1034  SC1052  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC1034    2       23       -5.012064   AraC family transcriptional regulator
SC1035    3       29       -7.946031   acylphosphatase
SC1036    3       32       -9.039542   sulfur transfer protein TusE
SC1037    2       31       -8.452337   hypothetical protein
SC1039    1       33       -8.665174   *pathogenicity island encoded protein: SPI3
SC1040    0       30       -6.983122   pathogenicity island encoded protein: SPI3
SC1041    1       25       -5.222645   inner membrane protein
SC1042    1       24       -4.372406   hypothetical protein
SC1043    1       24       -3.533998   outer protein
SC1044    0       17       -1.888643   pathogenicity island encoded protein: SPI3
SC1045    -1      15       -1.254620   copper resistance; histidine kinase
SC1046    -2      12       -0.009923   transcriptional regulatory protein YedW
SC1047    -2      15       1.937844    hypothetical protein
SC1048    -2      17       2.292602    4-hydroxyphenylacetate catabolism
SC1049    -2      20       2.247274    4-hydroxyphenylacetate catabolism
SC1050    -1      23       3.604170    4-hydroxyphenylacetate catabolism
SC1051    -1      24       4.526657    4-hydroxyphenylacetate catabolism
SC1052    0       24       3.764382    4-hydroxyphenylacetate catabolism
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1042    PF07824       165    1e-56    Type III secretion chaperone
		>PF07824#Type III secretion chaperone

Length = 120

Score = 165 bits (419), Expect = 1e-56
Identities = 33/114 (28%), Positives = 63/114 (55%), Gaps = 1/114 (0%)

Query: 1 MESLLNRLYDALGLDAPE-DEPLLIIDDGIQVYFNESDHTLEMCCPFMPLPDDILTLQHF 59
ME L + + ALG+ + + D+ +++DD + +Y + ++ + CPF LP++I L +
Sbjct: 1 MEDLADVICRALGIPSIDTDDQAIMLDDDVLIYIEKEGDSINLLCPFCALPENINDLIYA 60

Query: 60 LRLNYTSAVTIGADADNTALVALYRLPQTSTEEEALTGFELFISNVKQLKEHYA 113
L LNY+ + + D + +L+A L + E+ E +IS V+ LK+ +A
Sbjct: 61 LSLNYSEKICLATDDEGGSLIARLDLTGINEFEDIYVNTEYYISRVRWLKDEFA 114


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1043    TYPE3OMBPROT  665    0.0      Type III secretion system outer membrane B protein ...
		>TYPE3OMBPROT#Type III secretion system outer membrane B protein

family signature.
Length = 538

Score = 665 bits (1717), Expect = 0.0
Identities = 186/396 (46%), Positives = 254/396 (64%), Gaps = 5/396 (1%)

Query: 166 LNNQPWQTIKNTLTHNGHHYTNTQLPAAEMKIGAKDIFPSAYEGKGVCSWDTKNIHHANN 225
LNN+ W + ++H+G +Y PA+ MKIG K+IF Y GKG+C T+ H N
Sbjct: 146 LNNKNWGPVNKNISHHGKNYGFQLTPASHMKIGNKNIFVKEYNGKGICCASTRESDHIAN 205

Query: 226 LWMSTVSVHEDGKDKTLFCGIRHGVLSPYH-EKDPLLRHVGAENKAKEVLTAALFSKPEL 284
+W+S V V ++GK+ +F GIRHGV+S Y +K+ R V A NKA+E+++AAL+S+PEL
Sbjct: 206 MWLSKV-VDDEGKE--IFSGIRHGVISAYGLKKNSSERAVAARNKAEELVSAALYSRPEL 262

Query: 285 LNKALAGEAVSLKLVSVGLLTASNIFGKEGTMVEDQMRAWQSL-TQPGKMIHLKIRNKDG 343
L++AL+G+ V LK+VS LLT +++ G E +M++DQ+ A + L ++ G+ L IRN DG
Sbjct: 263 LSQALSGKTVDLKIVSTSLLTPTSLTGGEESMLKDQVNALKGLNSKRGEPTKLLIRNSDG 322

Query: 344 DLQTVKIKPDVAAFNVGVNELALKLGFGLKASDSYNAEALHQLLGNDLRPEARPGGWVGE 403
L+ V + V FN GVNELALK+G G + D N E++ LLG++ GGW E
Sbjct: 323 LLKEVSVNLKVVTFNFGVNELALKMGLGWRNVDKLNDESICSLLGDNFLKNGVIGGWAAE 382

Query: 404 WLAQYPDNYEVVNTLARQIKDIWKNNQHHKDGGEPYKLAQRLAMLAHEIDAVPAWNCKSG 463
+ + P V LA QIK+I D GEPYKL+QR+ +LA+ I AVP WNCKSG
Sbjct: 383 AIEKNPPCKNDVIYLANQIKEIINKKLQKNDNGEPYKLSQRMTLLAYTIGAVPCWNCKSG 442

Query: 464 KDRTGMMDSEIKREIISLHQTHMLSAPGSLPDSGGQKIFQKVLLNSGNLEIQKQNTGGAG 523
KDRTGM D+EIKREII H+T S S S +++F +L+NSGN+EIQ+ NTG G
Sbjct: 443 KDRTGMQDAEIKREIIRKHETGQFSQLNSKLSSEEKRLFSTILMNSGNMEIQEMNTGVPG 502

Query: 524 NKVMKNLSPEVLNLSYQKRVGDENIWQSVKGISSLI 559
NKVMK L L LSY +R+GD IW VKG SS +
Sbjct: 503 NKVMKKLPLSSLELSYSERIGDSKIWNMVKGYSSFV 538


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1045    PF06580       34     0.001    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.7 bits (77), Expect = 0.001
Identities = 18/102 (17%), Positives = 38/102 (37%), Gaps = 15/102 (14%)

Query: 348 ILLQRVLSNLLTNAIRYSDENAVIRIESAYDDNVAEIRVANPGSHPADADKLFRRFWRGD 407
+L+Q ++ N + + I + I ++ D+ + V N GS K
Sbjct: 258 MLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE-------- 309

Query: 408 NARHTAGFGLGLSLVNA-IALLHGGSASYRYADEHNIFSVRL 448
G GL V + +L+G A + +++ + +
Sbjct: 310 ------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1046    HTHFIS        79     2e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.5 bits (196), Expect = 2e-19
Identities = 29/117 (24%), Positives = 55/117 (47%), Gaps = 1/117 (0%)

Query: 2 KILLIEDNQKTIEWVRQGLTEAGYVVDYACDGRDGLHLALQEHYSLIILDIMLPGLDGWQ 61
IL+ +D+ + Q L+ AGY V + L++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLRALRTAHQS-PAICLTARDSVEDRVKGLEAGANDYLVKPFSFAELLARVRAQLRQ 117
+L ++ A P + ++A+++ +K E GA DYL KPF EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


21  SC1068  SC1091  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC1068    4       20       -0.547118   hypothetical protein
SC1069    1       13       3.884100    trp-repressor binding protein
SC1070    2       14       3.591270    hypothetical protein
SC1071    0       14       3.667146    hypothetical protein
SC1072    0       12       3.006787    TetR/AcrR family transcriptional regulator
SC1073    -1      10       2.365157    hypothetical protein
SC1074    -2      9        1.935943    trifunctional transcriptional regulator/proline
SC1075    1       22       -4.201876   hypothetical protein
SC1076    1       24       -5.516856   SSS family major sodium/proline symporter
SC1077    1       31       -7.510048   hypothetical protein
SC1078    2       30       -6.657045   transcriptional regulator
SC1079    2       27       -6.486756   N-acetylmannosamine-6-phosphate 2-epimerase
SC1080    0       24       -6.431104   N-acetylneuraminic acid mutarotase
SC1081    0       19       -4.112439   outer membrane protein
SC1082    -1      16       -2.183198   dehydrogenase-like protein
SC1083    0       15       -1.680271   *oxidoreductase
SC1084    -2      17       -3.412131   hydrolase
SC1085    -2      22       -5.529830   hypothetical protein
SC1086    -1      26       -7.136585   hypothetical protein
SC1087    0       27       -7.586236   transcriptional regulator in curly
SC1088    0       33       -8.755757   curli assembly protein CsgF
SC1089    0       31       -8.307335   curli assembly protein CsgE
SC1090    -1      26       -4.699325   DNA-binding transcriptional regulator CsgD
SC1091    -1      21       -4.061685   curlin minor subunit
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1072  HTHTETR  62  4e-14  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 62.3 bits (151), Expect = 4e-14
Identities = 30/158 (18%), Positives = 58/158 (36%), Gaps = 8/158 (5%)

Query: 20 RQLILTAALAVFSQYGIHGARLEQVAERAGVSKTNLLYYYPSKEALYVAVMRQILDVWLA 79
RQ IL AL +FSQ G+ L ++A+ AGV++ + +++ K L+ +
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGE 72

Query: 80 PLKAFRAEF--SPLEAIKEYIRLKLEVSRDYPQASRLF-CMEMLAGAPLLMEELTGDLKA 136
++A+F PL ++E + LE + + L + M + +
Sbjct: 73 LELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRN 132

Query: 137 LIDEKSALIAGWVHSG-----KLAPVSPHHLIFMIWAA 169
L E I + A + ++
Sbjct: 133 LCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGY 170


22  SC1120  SC1134  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC1120    1       12       3.005552    flagellar basal body P-ring biosynthesis protein
SC1121    2       14       2.791053    flagellar basal-body rod protein FlgB
SC1122    2       14       2.834745    flagellar basal body rod protein FlgC
SC1123    1       15       3.236557    flagellar basal body rod modification protein
SC1124    -2      16       3.485327    flagellar hook protein FlgE
SC1125    -1      15       2.425705    flagellar basal body rod protein FlgF
SC1126    -1      15       1.513856    flagellar basal body rod protein FlgG
SC1127    1       15       2.726684    flagellar basal body L-ring protein
SC1128    1       15       2.576784    flagellar basal body P-ring biosynthesis protein
SC1129    2       15       2.230478    flagellar rod assembly protein/muramidase FlgJ
SC1130    3       13       1.291430    flagellar hook-associated protein FlgK
SC1131    3       14       1.785360    flagellar hook-associated protein FlgL
SC1132    4       13       2.035656    ribonuclease E
SC1133    2       15       -1.574850   hypothetical protein
SC1134    2       15       -1.087319   23S rRNA pseudouridylate synthase C
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1123  SYCECHAPRONE  29  0.010  Gram-negative bacterial type III secretion SycE cha...
		>SYCECHAPRONE#Gram-negative bacterial type III secretion SycE chaperone signature.

Length = 130

Score = 28.9 bits (64), Expect = 0.010
Identities = 16/34 (47%), Positives = 20/34 (58%), Gaps = 2/34 (5%)

Query: 44 LKNQDPTNPLQNNELTTQLAQISTVSGIEKLNTT 77
L N+ P N L NN L TQL + V G E+L T+
Sbjct: 89 LWNRQPLNSLDNNSLYTQLEML--VQGAERLQTS 120


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1124  FLGHOOKAP1  41  7e-06  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.1 bits (96), Expect = 7e-06
Identities = 17/48 (35%), Positives = 29/48 (60%)

Query: 356 LTNGALEASNVDLSKELVNMIVAQRNYQSNAQTIKTQDQILNTLVNLR 403
L+N S V+L +E N+ Q+ Y +NAQ ++T + I + L+N+R
Sbjct: 499 LSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 37.6 bits (87), Expect = 9e-05
Identities = 22/60 (36%), Positives = 31/60 (51%), Gaps = 4/60 (6%)

Query: 2 SFSQAVSGLNAAATNLDVIGNNIANSATYGFKSGTASFAD----MFAGSKVGLGVKVAGI 57
+ A+SGLNAA L+ NNI++ G+ T A + AG VG GV V+G+
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1126  FLGHOOKAP1  44  4e-07  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 4e-07
Identities = 18/81 (22%), Positives = 36/81 (44%), Gaps = 14/81 (17%)

Query: 3 SSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQTTL 62
S + A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTL 47

Query: 63 PSGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 48 GAGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 21/41 (51%)

Query: 220 ETSNVNVAEELVNMIQVQRAYEINSKAVSTTDQMLQKLTQL 260
S VN+ EE N+ + Q+ Y N++ + T + + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1127  FLGLRINGFLGH  355  e-128  Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 355 bits (911), Expect = e-128
Identities = 211/232 (90%), Positives = 223/232 (96%)

Query: 4 MQKYALHAYPVMALMVATLTGCAWIPAKPLVQGATTAQPIPGPVPVANGSIFQSAQPINY 63
MQK A H Y + +L+V +LTGCAWIP+ PLVQGAT+AQP+PGP PVANGSIFQSAQPINY
Sbjct: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60

Query: 64 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTSFGFDTVPRYLQGLFGNS 123
GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKT+FGFDTVPRYLQGLFGN+
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120

Query: 124 RADMEASGGNSFNGKGGANASNTFSGTLTVTVDQVLANGNLHVVGEKQIAINQGTEFIRF 183
RAD+EASGGN+FNGKGGANASNTFSGTLTVTVDQVL NGNLHVVGEKQIAINQGTEFIRF
Sbjct: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180

Query: 184 SGVVNPRTISGSNSVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 235
SGVVNPRTISGSN+VPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM
Sbjct: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1128  FLGPRINGFLGI  430  e-153  Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 430 bits (1106), Expect = e-153
Identities = 153/362 (42%), Positives = 215/362 (59%), Gaps = 9/362 (2%)

Query: 7 LAGIVLALVATLAHAERIRDLTSVQGVRENSLIGYGLVVGLDGTGDQTTQTPFTTQTLNN 66
A L+ A RI+D+ S+Q R+N LIGYGLVVGL GTGD +PFT Q++
Sbjct: 14 SALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMRA 73

Query: 67 MLSQLGITVPTGTNMQLKNVAAVMVTASYPPFARQGQTIDVVVSSMGNAKSLRGGTLLMT 126
ML LGIT G + KN+AAVMVTA+ PPFA G +DV VSS+G+A SLRGG L+MT
Sbjct: 74 MLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIMT 132

Query: 127 PLKGVDSQVYALAQGNILVGGVGASAGGSSVQVNQLNGGRITNGAIIERELPTQFGAGNT 186
L G D Q+YA+AQG ++V G A +++ R+ NGAIIERELP++F
Sbjct: 133 SLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSVN 192

Query: 187 INLQLNDEDFTMAQQITDAINRAR----GYGSATALDARTVQVRVPSGNSSQVRFLADIQ 242
+ LQL + DF+ A ++ D +N G A D++ + V+ P + R +A+I+
Sbjct: 193 LVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRV-ADLTRLMAEIE 251

Query: 243 NMEVNVTPQDAKVVINSRTGSVVMNREVTLDSCAVAQGNLSVTVNRQLNVNQPNTPFGGG 302
N+ V T AKVVIN RTG++V+ +V + AV+ G L+V V V QP PF G
Sbjct: 252 NLTVE-TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSRG 309

Query: 303 QTVVTPQTQIDLRQSGGSLQSVRSSANLNSVVRALNALGATPMDLMSILQSMQSAGCLRA 362
QT V PQT I Q G + ++ +L ++V LN++G +++ILQ ++SAG L+A
Sbjct: 310 QTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQA 368

Query: 363 KL 364
+L
Sbjct: 369 EL 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1129  FLGFLGJ  499  0.0  Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 499 bits (1285), Expect = 0.0
Identities = 263/316 (83%), Positives = 289/316 (91%), Gaps = 3/316 (0%)

Query: 1 MIGDGKLLASAAWDAQSLNELKAKAGQDPAANIRPVARQVEGMFVQMMLKSMREALPKDG 60
MI D KLLASAAWDAQSLNELKAKAG+DPAANIRPVARQVEGMFVQMMLKSMR+ALPKDG
Sbjct: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60

Query: 61 LFSSDQTRLYTSMYDQQIAQQMTAGKGLGLADMMVKQMTGGQTMPADDAPQVPLKFSLET 120
LFSS+ TRLYTSMYDQQIAQQMTAGKGLGLA+MMVKQMT Q +P + P P+KF LET
Sbjct: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120

Query: 121 VNSYQNQALTQLVRKAIPKTPDSSDAPLSGDSKDFLARLSLPARLASEQSGVPHHLILAQ 180
V YQNQAL+QLV+KA+P+ D S L GDSK FLA+LSLPA+LAS+QSGVPHHLILAQ
Sbjct: 121 VVRYQNQALSQLVQKAVPRNYDDS---LPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQ 177

Query: 181 AALESGWGQRQILRENGEPSYNVFGVKATASWKGPVTEITTTEYENGEAKKVKAKFRVYS 240
AALESGWGQRQI RENGEPSYN+FGVKA+ +WKGPVTEITTTEYENGEAKKVKAKFRVYS
Sbjct: 178 AALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYS 237

Query: 241 SYLEALSDYVALLTRNPRYAAVTTAATAEQGAVALQNAGYATDPNYARKLTSMIQQLKAM 300
SYLEALSDYV LLTRNPRYAAVTTAA+AEQGA ALQ+AGYATDP+YARKLT+MIQQ+K++
Sbjct: 238 SYLEALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSI 297

Query: 301 SEKVSKTYSANLDNLF 316
S+KVSKTYS N+DNLF
Sbjct: 298 SDKVSKTYSMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1130  FLGHOOKAP1  664  0.0  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 664 bits (1714), Expect = 0.0
Identities = 438/553 (79%), Positives = 487/553 (88%), Gaps = 8/553 (1%)

Query: 2 SSLINHAMSGLNAAQAALNTVSNNINNYNVAGYTRQTTILAQANSTLGAGGWIGNGVYVS 61
SSLIN+AMSGLNAAQAALNT SNNI++YNVAGYTRQTTI+AQANSTLGAGGW+GNGVYVS
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 62 GVQREYDAFITNQLRGAQNQSSGLTTRYEQMSKIDNLLADKSSSLSGSLQSFFTSLQTLV 121
GVQREYDAFITNQLR AQ QSSGLT RYEQMSKIDN+L+ +SSL+ +Q FFTSLQTLV
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 122 SNAEDPAARQALIGKAEGLVNQFKTTDQYLRDQDKQVNIAIGSSVAQINNYAKQIANLND 181
SNAEDPAARQALIGK+EGLVNQFKTTDQYLRDQDKQVNIAIG+SV QINNYAKQIA+LND
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 182 QISRMTGVGAGASPNDLLDQRDQLVSELNKIVGVEVSVQDGGTYNLTMANGYTLVQGSTA 241
QISR+TGVGAGASPN+LLDQRDQLVSELN+IVGVEVSVQDGGTYN+TMANGY+LVQGSTA
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 242 RQLAAVPSSADPTRTTVAYVDEAAGNIEIPEKLLNTGSLGGLLTFRSQDLDQTRNTLGQL 301
RQLAAVPSSADP+RTTVAYVD AGNIEIPEKLLNTGSLGG+LTFRSQDLDQTRNTLGQL
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 302 ALAFADAFNAQHTKGYDADGNKGKDFFSIGSPVVYSNSNNADKTVSLTAKVVDSTKVQAT 361
ALAFA+AFN QH G+DA+G+ G+DFF+IG P V N+ N V++ A V D++ V AT
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGD-VAIGATVTDASAVLAT 359

Query: 362 DYKIVFDGTDWQVTRTADNTTFTATKDADGKLEIDGLKVTVGTGAQKNDSFLLKPVSNAI 421
DYKI FD WQVTR A NTTFT T DA+GK+ DGL++T NDSF LKPVS+AI
Sbjct: 360 DYKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAI 419

Query: 422 VDMNVKVTNEAEIAMASESKLDPDVDTGDSDNRNGQALLDLQ-NSNVVGGNKTFNDAYAT 480
V+M+V +T+EA+IAMASE D GDSDNRNGQALLDLQ NS VGG K+FNDAYA+
Sbjct: 420 VNMDVLITDEAKIAMASEE------DAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYAS 473

Query: 481 LVSDVGNKTSTLKTSSTTQANVVKQLYKQQQSVSGVNLDEEYGNLQRYQQYYLANAQVLQ 540
LVSD+GNKT+TLKTSS TQ NVV QL QQQS+SGVNLDEEYGNLQR+QQYYLANAQVLQ
Sbjct: 474 LVSDIGNKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQ 533

Query: 541 TANALFDALLNIR 553
TANA+FDAL+NIR
Sbjct: 534 TANAIFDALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1131  FLAGELLIN  41  4e-06  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 41.2 bits (96), Expect = 4e-06
Identities = 30/138 (21%), Positives = 59/138 (42%)

Query: 1 MRISTQMMYEQNMSGITNSQAEWMKLGEQMSTGKRVTNPSDDPIAASQAVVLSQAQAQNS 60
I+T + + + SQ+ E++S+G R+ + DD + A + +
Sbjct: 2 QVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLT 61

Query: 61 QYALARTFATQKVSLEESVLSQVTTAIQTAQEKIVYAGNGTLSDDDRASLATDLQGIRDQ 120
Q + E L+++ +Q +E V A NGT SD D S+ ++Q ++
Sbjct: 62 QASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEE 121

Query: 121 LMNLANSTDGNGRYIFAG 138
+ ++N T NG + +
Sbjct: 122 IDRVSNQTQFNGVKVLSQ 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1132  IGASERPTASE  55  2e-09  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 55.1 bits (132), Expect = 2e-09
Identities = 44/263 (16%), Positives = 84/263 (31%), Gaps = 34/263 (12%)

Query: 513 PSEEEYAERKRPEQPALATFAMPDVPPAPTPVEPAVSVATAKKDNVAAAQPAQPGLFSRF 572
P E+ + DVP P+ + A+ D PA
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSN-----NEEIARVDEAPVPPPAPA------ 1031

Query: 573 LNALKQLFSGEETKAVETAAPKAEEKAERQQDRRKPRQNNRRDRNERRDTRDNR----AG 628
+ E +K K E+ A QN + + + + N
Sbjct: 1032 TPSETTETVAENSKQESKTVEKNEQDATE-----TTAQNREVAKEAKSNVKANTQTNEVA 1086

Query: 629 RDGGESRDDNRRNRRQTQQQNAEAR---DTRQQETAEKVKTGDEQQQTPRRERSRRRNDD 685
+ G E+++ ++T E + +T + + KV + Q +P++E+S
Sbjct: 1087 QSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTS----QVSPKQEQSETVQPQ 1142

Query: 686 KRQAQQEVKALNREEQPVQETEQEERVQQVQPRRKQRQLNQKVRFTNSAVVETVDTPVVV 745
A++ +N +E Q + QP ++ N + T S V T ++ V
Sbjct: 1143 AEPARENDPTVNIKEPQSQTNTTAD---TEQPAKETSS-NVEQPVTESTTVNTGNSVVEN 1198

Query: 746 DEPRPVENVEQPVPAPRTELAKV 768
E + P +E +
Sbjct: 1199 PENTTPATTQ---PTVNSESSNK 1218



Score = 39.3 bits (91), Expect = 1e-04
Identities = 48/331 (14%), Positives = 81/331 (24%), Gaps = 45/331 (13%)

Query: 630 DGGESRDDNRRNRRQTQQQNAEARDTRQQETAEKVKTGDEQQQTPRRERSRRRNDDKRQA 689
D G + R + N E Q + T + Q S +
Sbjct: 963 DLGAWKYKLRNVNGRYDLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDE 1022

Query: 690 QQEVKALNREEQPVQETEQEERVQQVQPRRKQRQLNQKVRFTNSAVVETVDTPVVVDEPR 749
ET E Q+ + K Q + N V + V +
Sbjct: 1023 APVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKE-AKSNVKANTQ 1081

Query: 750 PVENVEQPVPAPRTELAKVDLPVVADIAPEQDDSVEPRDNTGMPRRSRRSPRHLRVSGQR 809
E + T+ + A + E+ VE +++ P+ +
Sbjct: 1082 TNEVAQSGSETKETQTTETKET--ATVEKEEKAKVETE-------KTQEVPKVTSQVSPK 1132

Query: 810 RRRYRDERYPTQSPMPLTVACASPEMASGKVWIRYPIVRPQETQVVDEQREADLALPQPV 869
+ + + + P V +E Q AD P
Sbjct: 1133 QEQSETVQPQAEPARE-----------------NDPTVNIKEPQS-QTNTTADTEQPAKE 1174

Query: 870 VAEPQVTAATVALEPQASVQAVENVVVEPQTVAEPQTPEVVEVETTHPEVIAAPVDEQPQ 929
Q V + + PE TT P V + ++
Sbjct: 1175 T-------------SSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKN 1221

Query: 930 LIAESDTPVAQEVIADAEPVAETADASITVA 960
S V V EP +++ TVA
Sbjct: 1222 RHRRSVRSVPHNV----EPATTSSNDRSTVA 1248


23  SC1186  SC1319  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC1186    1       23       -4.335976   MutT-like protein
SC1187    1       25       -4.916912   hypothetical protein
SC1188    2       25       -4.679132   ribosomal large subunit pseudouridine synthase
SC1189    3       26       -4.799108   isocitrate dehydrogenase
SC1190    6       32       -5.629867   prophage lambda integrase (Int(lambda))
SC1191    7       32       -6.426965   transposase of Tn10
SC1192    5       26       -4.777463   hypothetical protein
SC1193    4       27       -4.345696   hypothetical protein
SC1194    6       28       -4.593978   hypothetical protein
SC1195    4       27       -4.116640   hypothetical protein
SC1196    3       27       -3.885193   regulatory protein
SC1197    3       25       -2.825559   hypothetical protein
SC1198    3       24       -2.280555   antirepressor
SC1201    3       23       -1.955356   replication protein
SC1202    1       22       -2.711254   DNA adenine methylase
SC1203    3       25       -3.414757   crossover junction endodeoxyribonuclease
SC1204    3       26       -4.028254   hypothetical protein
SC1205    4       30       -4.490688   hypothetical protein
SC1206    4       34       -5.877761   hypothetical protein
SC1207    3       33       -3.944142   hypothetical protein
SC1208    3       30       -1.996795   hypothetical protein
SC1209    3       28       -1.517215   prophage membrane protein
SC1210    2       22       -1.524641   prophage membrane protein
SC1211    2       21       -1.152916   Fels-1 prophage chitinase
SC1212    2       22       0.171925    protein gp55
SC1213    3       20       0.989330    hypothetical protein
SC1214    3       20       1.490263    Gifsy-1 prophage DNA packaging protein
SC1215    3       20       1.531621    Gifsy-1 prophage terminase large chaing gp2
SC1216    4       23       2.981551    Gifsy-1 prophage head to tail joining protein
SC1217    5       24       2.939782    Gifsy-1 prophage head-tail preconnector gp4
SC1218    3       23       2.218386    Gifsy-1 prophage head-tail preconnector gp5
SC1219    4       25       1.225670    Gifsy-1 prophage head protein gpshp
SC1220    4       25       0.443089    Gifsy-1 prophage head protein gp7
SC1221    4       25       0.858468    Gifsy-1 prophage DNA packaging protein
SC1222    5       25       0.680526    Gifsy-1 prophage minor capsid protein FII
SC1223    5       24       0.405637    Gifsy-1 prophage GlpA
SC1224    5       24       1.466628    Gifsy-1 prophagei VmtZ
SC1225    6       23       1.374485    Gifsy-1 prophage VmtU
SC1226    4       23       1.480166    Gifsy-1 prophage VmtV
SC1227    4       23       1.562469    Gifsy-1 prophage minor tail protein
SC1228    3       22       2.116545    Gifsy-1 prophage VmtT
SC1229    4       22       2.836811    Gifsy-1 prophage VmtH
SC1230    6       23       2.909943    Gifsy-1 prophage VmtM
SC1231    4       22       2.827832    Gifsy-1 prophage VmtL
SC1232    4       22       2.613693    Gifsy-1 prophage VtaK
SC1233    4       21       1.731082    Gifsy-1 prophage VtiI
SC1234    3       22       1.229212    Gifsy-1 prophage VhsJ
SC1235    1       27       -3.519217   hypothetical protein
SC1236    0       30       -5.279660   side tail fiber protein
SC1240    2       35       -10.023257  hypothetical protein
SC1241    2       34       -10.276147  SopB protein
SC1242    3       39       -11.985196  hypothetical protein
SC1243    4       43       -13.062043  periplasmic murein peptide-binding protein
SC1244    5       44       -13.943819  hypothetical protein
SC1245    5       42       -14.826597  hypothetical protein
SC1246    4       38       -14.160775  hypothetical protein
SC1247    3       38       -13.718692  isocitrate dehydrogenase
SC1248    4       39       -13.382098  hypothetical protein
SC1249    1       34       -12.305923  hypothetical protein
SC1250    2       38       -12.359064  hypothetical protein
SC1251    4       34       -10.464684  envelope lipoprotein
SC1252    6       37       -9.429575   macrophage survival gene; reduced mouse
SC1253    7       38       -10.101656  envelope protein
SC1254    7       39       -10.555907  cold shock protein-like protein
SC1255    6       36       -10.360594  PhoP regulated protein
SC1256    6       31       -9.529690   PhoP regulated protein: reduced macrophage
SC1257    4       26       -6.341748   *lysozyme inhibitor
SC1260    1       25       -5.691772   lipoprotein
SC1261    1       30       -5.946714   hypothetical protein
SC1262    0       19       -3.116405   molecular chaperone
SC1263    0       18       -2.065071   inner membrane protein
SC1264    1       17       -1.222609   hypothetical protein
SC1265    3       17       -0.435588   inner membrane protein
SC1266    1       17       0.649746    outer membrane lipoprotein
SC1267    0       15       0.377150    ABC transporter substrate-binding protein
SC1268    1       17       0.104159    ABC transporter
SC1269    1       18       -0.837290   ABC transporter
SC1270    0       17       -1.173583   ABC transporter ATPase
SC1271    -1      21       -4.094430   ABC transporter ATPase
SC1272    0       23       -5.680527   hypothetical protein
SC1273    2       26       -6.797310   hypothetical protein
SC1274    0       28       -7.217362   *hypothetical protein
SC1275    0       27       -7.122714   aminoglycoside resistance protein
SC1276    -2      28       -8.598271   response regulator
SC1277    -1      25       -6.600616   transcriptional regulator
SC1278    -1      26       -5.665750   hypothetical protein
SC1279    0       27       -5.549992   chorismate mutase
SC1280    2       25       -5.567924   leucine export protein LeuE
SC1281    2       28       -4.264865   hypothetical protein
SC1282    3       29       -3.444163   hypothetical protein
SC1283    1       29       -2.948344   hypothetical protein
SC1284    2       21       0.882560    hypothetical protein
SC1285    2       20       2.139669    hypothetical protein
SC1286    1       20       3.019520    hemolysin
SC1287    1       20       2.454147    hypothetical protein
SC1288    0       19       1.797966    hypothetical protein
SC1289    0       16       2.135579    MFS family transporter
SC1290    -2      18       -3.154542   AraC family transcriptional regulator
SC1291    -1      15       -3.632574   hypothetical protein
SC1292    0       11       -3.934658   hypothetical protein
SC1293    0       11       -3.590184   inner membrane protein
SC1294    -1      10       -3.377262   hypothetical protein
SC1295    -1      11       -3.118877   methyl-accepting chemotaxis protein
SC1296    -2      8        -1.733296   hypothetical protein
SC1297    -2      10       -1.842261   hypothetical protein
SC1298    -1      16       -1.537228   scaffolding protein for murein-synthesizing
SC1299    0       17       -1.460409   arylsulfatase
SC1300    0       21       -1.032578   aldehyde reductase
SC1301    3       28       -1.942814   aldose 1-epimerase
SC1302    1       27       -1.747694   hypothetical protein
SC1303    0       20       -1.300046   glyceraldehyde-3-phosphate dehydrogenase
SC1304    0       14       -1.530140   methionine sulfoxide reductase B
SC1305    1       14       -1.801598   hypothetical protein
SC1306    1       14       -2.497886   hypothetical protein
SC1307    0       11       -2.259370   hypothetical protein
SC1308    0       14       -3.724865   metabolite transport protein
SC1309    0       19       -5.123176   hypothetical protein
SC1310    -1      18       -4.475090   hypothetical protein
SC1311    0       18       -3.921819   sugar kinase ydjH
SC1312    0       13       -1.754718   oxidoreductase ydjG
SC1313    0       12       -1.429993   transcriptional regulator YdjF
SC1314    -1      10       0.614215    metabolite transport protein
SC1315    -1      11       2.707102    nicotinamidase/pyrazinamidase
SC1316    -1      12       2.951494    asparaginase
SC1317    0       14       3.455926    protease 4
SC1318    0       12       3.778681    hypothetical protein
SC1319    0       12       3.212794    selenophosphate synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1212  PYOCINKILLER  32  8e-04  Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 32.5 bits (73), Expect = 8e-04
Identities = 28/110 (25%), Positives = 43/110 (39%), Gaps = 10/110 (9%)

Query: 53 LTDATAALQQEVTERAKEKRRQHAADEERKRADEELAKIQADADAAERARGGLQQQLAAV 112
T+A ++LQ + AA + A A+ QA A+A +A +QQ A
Sbjct: 193 FTEAISSLQIRMN-------TLTAAKASIEAAAANKAREQAAAEAKRKAEEQARQQAAIR 245

Query: 113 Q-RQLAGSETGRLSALAAASQ--AKAETGILLAQLLGEADDLAGKFAKEA 159
A G + A AA A+ LAQ + +A + G+ A
Sbjct: 246 AANTYAMPANGSVVATAAGRGLIQVAQGAASLAQAISDAIAVLGRVLASA 295


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1229  GPOSANCHOR  37  4e-04  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 37.0 bits (85), Expect = 4e-04
Identities = 40/227 (17%), Positives = 72/227 (31%), Gaps = 25/227 (11%)

Query: 516 TEYDQQQLNELQEQKRQKDLLDAKAQAERNYQETQKRRNEQNAALNRDNETESLRHQREV 575
+ L+ AK + + R S ++
Sbjct: 189 EARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKI 248

Query: 576 ARITAMQYADAAVRNAALERENERHKKALSQQAKKPKTYHNDEARRLLLQYSQQQAQTEG 635
+ A + A R A LE+ E + EA + L+ + + +
Sbjct: 249 KTLEA-EKAALEARQAELEKALEGAMNFSTA---DSAKIKTLEAEKAALEAEKADLEHQS 304

Query: 636 QIAAAKLSTTE----------KMTEAHKQLLSFQQRIADLSGKKLTADEQSVLAHKDEIA 685
Q+ A + K EA Q L Q +I++ S + L D + K ++
Sbjct: 305 QVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLE 364

Query: 686 LALQKL-------DISQQDLQHQ-NALNELKKKTLTLTSQLADEESR 724
QKL + S+Q L+ +A E KK+ + L + S+
Sbjct: 365 AEHQKLEEQNKISEASRQSLRRDLDASREAKKQ---VEKALEEANSK 408


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1236  CHANLCOLICIN  36  8e-04  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 35.8 bits (82), Expect = 8e-04
Identities = 58/240 (24%), Positives = 86/240 (35%), Gaps = 18/240 (7%)

Query: 123 NATAAGQASEQAQTSAGQASES-----ATAAVNAAGAAEASATQAASSAASAESSAGTA- 176
N T G S G SES ATA + A + A QAA + A+AE+ A
Sbjct: 26 NGTPDGSGSGGGGGKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKA 85

Query: 177 -----TTKAGEASASAASADTARTAAAASAAAAKTSEANADASRTA---AGDSAAAAAAS 228
T + + A + +RT +A A A + A+ R A + A A +
Sbjct: 86 NRDALTQRLKDIVNEALRHNASRTPSATELAHANNAAMQAEDERLRLAKAEEKARKEAEA 145

Query: 229 ATAAQTSAERAGASETAAKTSETQAASSAGDAGASATAAAASEKAAAASAAAAKTSETNA 288
A A AE+ E + +ET+ +A AA SE+A A A K S +
Sbjct: 146 AEKAFQEAEQR-RKEIEREKAETERQLKLAEA-EEKRLAALSEEAKAVEIAQKKLSAAQS 203

Query: 289 ATSASTAAASATAASSSASEASTHAAASDTSASLA--AQSSTAAGAAATRAEDAAKRAED 346
+ S+S + A + AQ+S + + RA D
Sbjct: 204 EVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRAND 263



Score = 30.8 bits (69), Expect = 0.029
Identities = 47/261 (18%), Positives = 84/261 (32%), Gaps = 7/261 (2%)

Query: 97 DVRPEALRSFEAMVEEVARQASEASRNATAAGQASEQAQTSAGQASESATAAVNAAGAAE 156
D+ EALR + A A+ A A + + +A + A AA A AE
Sbjct: 96 DIVNEALRHNASRTPSATELA-HANNAAMQAEDERLRLAKAEEKARKEAEAAEKAFQEAE 154

Query: 157 ASATQAASSAASAESSAGTATTKAGEASASAASADTARTAAAASAAAAKTSEANADASRT 216
+ A E A E AA ++ A+ A A K S A ++ +
Sbjct: 155 QRRKEIEREKAETERQLKLAEA---EEKRLAALSEEAK---AVEIAQKKLSAAQSEVVKM 208

Query: 217 AAGDSAAAAAASATAAQTSAERAGASETAAKTSETQAASSAGDAGASATAAAASEKAAAA 276
+ S++ AE + + ++ A D + A++
Sbjct: 209 DGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNR 268

Query: 277 SAAAAKTSETNAATSASTAAASATAASSSASEASTHAAASDTSASLAAQSSTAAGAAATR 336
A A TA+ + + + + S + + A A
Sbjct: 269 PFFEATRRRVGAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARVHE 328

Query: 337 AEDAAKRAEDIADVISLEDAS 357
AE+ K+A++ ++DA
Sbjct: 329 AEENLKKAQNNLLNSQIKDAV 349


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1256  ENTEROVIROMP  197  2e-67  Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 197 bits (503), Expect = 2e-67
Identities = 67/187 (35%), Positives = 94/187 (50%), Gaps = 18/187 (9%)

Query: 1 MKNIILSTLVITTSVLVVNVAQADTNAFSVGYAQSKVQDFKN-IRGVNVKYRYE-DDSPV 58
MK I + + + A T+ + GYAQS Q N + G N+KYRYE D+SP+
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAATSTVTGGYAQSDAQGQMNKMGGFNLKYRYEEDNSPL 60

Query: 59 SFISSLSYLYGDRQASGSVEPEGIHYHDKFEVKYGSLMVGPAYRLSDNFSLYALAGVGTV 118
I S +Y R AS D + +Y + GPAYR++D S+Y + GVG
Sbjct: 61 GVIGSFTYTEKSRTASSG---------DYNKNQYYGITAGPAYRINDWASIYGVVGVGYG 111

Query: 119 KATFKEHSTQDGDSFSNKISSRKTGFAWGAGVQMNPLENIVVDVGYEGSNISSTKINGFN 178
K E+ T D+ GF++GAG+Q NP+EN+ +D YE S I S + +
Sbjct: 112 KFQTTEYPTYKHDT-------SDYGFSYGAGLQFNPMENVALDFSYEQSRIRSVDVGTWI 164

Query: 179 VGVGYRF 185
GVGYRF
Sbjct: 165 AGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1285  HTHTETR  28  0.002  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 28.4 bits (63), Expect = 0.002
Identities = 8/37 (21%), Positives = 17/37 (45%), Gaps = 5/37 (13%)

Query: 4 LSWIIFGLIAGILAKWIMPG-----KDGGGFFMTIIL 35
+ I+ G I+G++ W+ K ++ I+L
Sbjct: 163 AAIIMRGYISGLMENWLFAPQSFDLKKEARDYVAILL 199


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1291  PRTACTNFAMLY  28  0.012  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 28.5 bits (63), Expect = 0.012
Identities = 17/59 (28%), Positives = 25/59 (42%)

Query: 49 QGLTVGIIILTIGVMAPIASGTLPPSTLIHSFVNWKSLVAIAVGVFVSWLGGRGITLMG 107
Q + L IG + + LPPS ++ N ++ A VS LG +TL G
Sbjct: 174 QRSAIVDGGLHIGALQSLQPEDLPPSRVVLRDTNVTAVPASGAPAAVSVLGASELTLDG 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1307  ACETATEKNASE  29  0.030  Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 29.0 bits (65), Expect = 0.030
Identities = 16/53 (30%), Positives = 25/53 (47%), Gaps = 7/53 (13%)

Query: 234 VVARCQEICGK--DNLGLVIECSGANIALKQAIDMLRPNGEVVRVGMGFKPLD 284
V R EI K ++L ++ C N + A+ NG+ + MGF PL+
Sbjct: 186 VSQRAAEILNKPIESLKIIT-CHLGNGSSIAAVK----NGKSIDTSMGFTPLE 233


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1308  TCRTETB  31  0.010  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 31.0 bits (70), Expect = 0.010
Identities = 22/88 (25%), Positives = 34/88 (38%), Gaps = 5/88 (5%)

Query: 71 MFLGALVGGIIGDKTGRRNAFILYEAIHIASMVVGAFSPNMMF-LIACRFVMGVGLGALL 129
+G V G + D+ G + + I+ V+G + LI RF+ G G A
Sbjct: 62 FSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFP 121

Query: 130 VTLFAGFTEYMPGRNR----GTWSSRVS 153
+ Y+P NR G S V+
Sbjct: 122 ALVMVVVARYIPKENRGKAFGLIGSIVA 149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1314  TCRTETB  39  3e-05  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 39.5 bits (92), Expect = 3e-05
Identities = 24/118 (20%), Positives = 47/118 (39%), Gaps = 1/118 (0%)

Query: 65 ALMLGYFIGSLTGGFIGDYLGRRKAFRINLLLVGISATAAAFVPNMY-WLIFFRCLMGTG 123
A ML + IG+ G + D LG ++ +++ + + + LI R + G G
Sbjct: 57 AFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAG 116

Query: 124 MGALIMVGYASFTEFIPPVVRGKWSARLSFVGNWSPMLSAGIGVVVIAFLSWRMMFLL 181
A + +IP RGK + + + IG ++ ++ W + L+
Sbjct: 117 AAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLI 174


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1315  ISCHRISMTASE  32  0.001  Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 32.3 bits (73), Expect = 0.001
Identities = 38/178 (21%), Positives = 58/178 (32%), Gaps = 32/178 (17%)

Query: 3 NRALLLV-DLQNDFCAGGALAVAEGDSTIDIANALIDWCQPRQIPVLASQDWHPAQHGSF 61
NRA+LL+ D+QN F + L + C IPV+
Sbjct: 29 NRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCVQLGIPVV------------- 75

Query: 62 ASQHQAEPYSQGELD-GLPQTLW-PDHCVQHTDVAALHPLLNQHAIDACIYKGENPLIDS 119
+ A+P SQ D L W P + + L + D + K
Sbjct: 76 ---YTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDD-DLVLTKWR------ 125

Query: 120 YSAFFDNEHRQKTTLDTWLREHDVTELIVMGLATDYCVKFTVLDALELGYAVNVITDG 177
YSAF +T L +R+ +LI+ G+ T +A + D
Sbjct: 126 YSAFK------RTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDA 177


24  SC1372  SC1379  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC1372    0       18       -3.315044   electron transfer flavoprotein subunit YdiR
SC1373    -1      19       -4.153555   electron transfer flavoprotein YdiQ
SC1374    0       20       -5.278395   AraC family transcriptional regulator
SC1375    0       21       -4.690711   acyl-CoA dehydrogenase
SC1376    0       23       -5.297833   acetyl-CoA:acetoacetyl-CoA transferase subunit
SC1377    0       23       -5.374142   3-dehydroquinate dehydratase
SC1378    1       25       -5.713838   quinate/shikimate dehydrogenase
SC1379    2       21       -3.972920   MFS family transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1379  TCRTETB  34  8e-04  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.5 bits (79), Expect = 8e-04
Identities = 32/166 (19%), Positives = 71/166 (42%), Gaps = 7/166 (4%)

Query: 23 FLHGMSVITLAQNMTSLAQKFSTDSAGIAYLISGIGLGRLVSILFFGVLSDKFGRRAIIL 82
F ++ + L ++ +A F+ A ++ + L + +G LSD+ G + ++L
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 83 LGAVLYML----FFFGIPASPNLMIAFILAVCVGVANSALDTGGYPALMECFPKASGSAV 138
G ++ F G L++A + A AL + G A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARY--IPKENRGKAF 141

Query: 139 ILVKAMVSFGQMIYPLIVSALLVNHIWYGYAVVIPGILFVLITLML 184
L+ ++V+ G+ + P I ++ ++I + Y ++IP I + + ++
Sbjct: 142 GLIGSIVAMGEGVGPAI-GGMIAHYIHWSYLLLIPMITIITVPFLM 186


25  SC1397  SC1442  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC1397    -1      19       -3.235134   methyl-accepting chemotaxis protein
SC1398    -2      17       -3.985159   murein lipoprotein
SC1399    -1      14       -1.099044   pyruvate kinase
SC1400    0       16       -0.769804   amino acid permease
SC1401    0       17       -0.086844   hydrolase or acyltransferase
SC1402    0       13       0.902397    hypothetical protein
SC1403    -1      11       1.983468    DeoR family transcriptional regulator
SC1404    -1      11       3.288161    tetrathionate reductase complex subunit A
SC1405    0       14       2.479334    tetrathionate reductase complex subunit C
SC1406    0       15       2.356896    tetrathionate reductase complex subunit B
SC1407    1       18       0.180584    tetrathionate reductase complex: sensory
SC1408    0       29       -5.952442   tetrathionate reductase complex: response
SC1409    2       31       -7.211316   hypothetical protein
SC1410    2       34       -8.198387   inner membrane protein
SC1411    4       39       -9.862522   MerR family transcriptional regulator
SC1412    5       43       -11.036208  secretion system transcriptonal activator
SC1413    5       43       -10.840142  secretion system regulator:sensor component
SC1414    7       43       -10.662204  secretion system apparatus protein SsaB
SC1415    5       41       -9.755618   secretion system apparatus protein SsaC
SC1416    4       38       -8.478236   secretion system apparatus protein SsaD
SC1417    4       35       -7.357546   secretion system effector protein SsaE
SC1418    4       36       -7.196828   secretion system effector protein SseA
SC1419    3       32       -6.633956   secretion system effector protein SseB
SC1420    3       36       -5.729682   secretion system chaperone protein SscA
SC1421    4       39       -5.949224   secretion system effector protein SseC
SC1422    6       40       -6.180840   secretion system effector protein SseD
SC1423    7       41       -6.884248   secretion system effector SseE
SC1424    7       41       -7.110685   secretion system chaperone protein SscB
SC1425    6       43       -8.373380   secretion system effector protein SseF
SC1426    5       43       -9.161918   secretion system effector protein SseG
SC1427    3       40       -10.434502  secretion system apparatus protein SsaG
SC1428    1       42       -9.186349   secretion system apparatus protein SsaH
SC1429    2       42       -9.362299   secretion system apparatus protein SsaI
SC1430    2       36       -8.237814   secretion system apparatus protein SsaJ
SC1431    0       35       -6.514560   hypothetical protein
SC1432    0       34       -6.760219   secretion system apparatus protein SsaK
SC1433    1       33       -6.428258   secretion system apparatus protein SsaL
SC1434    2       35       -6.401635   secretion system apparatus protein SsaM
SC1435    2       35       -6.516618   secretion system apparatus protein SsaV
SC1436    3       40       -6.743523   type III secretion system ATPase
SC1437    5       44       -10.153704  secretion system apparatus protein SsaO
SC1438    6       45       -11.082356  secretion system apparatus protein SsaP
SC1439    3       34       -8.178759   type III secretion system protein
SC1440    1       28       -7.501595   type III secretion system protein
SC1441    0       23       -6.554404   secretion system apparatus protein SsaS
SC1442    0       19       -4.754656   secretion system apparatus protein SsaT
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1397  VACJLIPOPROT  27  0.008  VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 26.8 bits (59), Expect = 0.008
Identities = 14/29 (48%), Positives = 19/29 (65%)

Query: 6 QLILGAVVLGSTLLAGCSSNAKIDQLSSD 34
+L L A+ LG+TLL GC+S+ Q SD
Sbjct: 2 KLRLSALALGTTLLVGCASSGTDQQGRSD 30


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1398  VACJLIPOPROT  27  0.006  VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 27.2 bits (60), Expect = 0.006
Identities = 17/45 (37%), Positives = 26/45 (57%), Gaps = 1/45 (2%)

Query: 5 KLVLGAVILGSTLLAGCSSNAKIDQLSSD-VQTLNAKVDQLSNDV 48
KL L A+ LG+TLL GC+S+ Q SD ++ N + + +V
Sbjct: 2 KLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNV 46


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1408  HTHFIS  84  2e-21  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.5 bits (209), Expect = 2e-21
Identities = 31/127 (24%), Positives = 56/127 (44%)

Query: 2 ATIHLLDDDTAVTNACAFLLESLGYDVKCWTQGADFLAQASLYQAGVVLLDMRMPVLDGQ 61
ATI + DDD A+ L GYDV+ + A + +V+ D+ MP +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 GVHDALRQCGSTLAVVFLTGHGDVPMAVEQMKRGAVDFLQKPVSVKPLQAALERALTVSS 121
+ +++ L V+ ++ A++ ++GA D+L KP + L + RAL
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 122 AAVARRE 128
++ E
Sbjct: 124 RRPSKLE 130


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1412  HTHFIS  66  7e-15  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 66.0 bits (161), Expect = 7e-15
Identities = 28/119 (23%), Positives = 50/119 (42%), Gaps = 2/119 (1%)

Query: 1 MKEYKILLVDDHEIIINGIMNALLPWPHFKIVEHVKNGLEVYNACCAYEPDILILDLSLP 60
M IL+ DD I + AL + V N ++ A + D+++ D+ +P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGY--DVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 GINGLDIIPQLHQRWPAMNILVYTAYQQEYMTIKTLAAGANGYVLKSSSQQVLLAALQT 119
N D++P++ + P + +LV +A IK GA Y+ K L+ +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1413  HTHFIS  69  3e-14  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 69.1 bits (169), Expect = 3e-14
Identities = 31/156 (19%), Positives = 58/156 (37%), Gaps = 13/156 (8%)

Query: 691 ILLVDDADINRDIISKMLVSLGQHVTIAASSNEALTLSQQQRFDLVLIDIRMPEIDGIEC 750
IL+ DD R ++++ L G V I +++ DLV+ D+ MP+ + +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 751 VQLWHDEPNNLDPDCMFVALSASVATEDIHRCKKNGIHHYITKPVTLATLARYISIAAEY 810
+ PD + +SA + + G + Y+ KP L L
Sbjct: 66 LP----RIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIG-------- 113

Query: 811 QLLRNIELQEQDPSRCSALLAT-DDMVINSKIFQSL 845
+ R + ++ PS+ +V S Q +
Sbjct: 114 IIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEI 149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1415  TYPE3OMGPROT  581  0.0  Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 581 bits (1498), Expect = 0.0
Identities = 158/500 (31%), Positives = 259/500 (51%), Gaps = 15/500 (3%)

Query: 11 LLFILNTVKSDELSWKGNDFTLYARQMPLAEVLHLLSENYDTAITISPLITATFSGKIPP 70
LL + + + EL W + A+ L ++L NYD + +S I SG+
Sbjct: 17 LLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDFGANYDATVVVSDKINDKVSGQFEH 76

Query: 71 GPPVDILNNLAAQYDLLTWFDGSMLYVYPASLLKHQVITFNILSTGRFIHYLRSQNILSS 130
P D L ++A+ Y+L+ ++DG++LY++ S + ++I L+ I
Sbjct: 77 DNPQDFLQHIASLYNLVWYYDGNVLYIFKNSEVASRLIRLQESEAAELKQALQRSGIWE- 135

Query: 131 PGCEVKEITGTRAVEVSGVPSCLTRISQLASVLDNALIKR--KDSAVSVSIYTLKYATAM 188
P + R V VSG P L + Q A+ L+ R K A+++ I+ LKYA+A
Sbjct: 136 PRFGWRPDASNRLVYVSGPPRYLELVEQTAAALEQQTQIRSEKTGALAIEIFPLKYASAS 195

Query: 189 DTQYQYRDQSVVVPGVVSVL-REMSKTSVPASSTTN-----GSPATQALPMFAADPRQNA 242
D YRD V PGV ++L R +S ++ + N + A ADP NA
Sbjct: 196 DRTIHYRDDEVAAPGVATILQRVLSDATIQQVTVDNQRIPQAATRASAQARVEADPSLNA 255

Query: 243 VIVRDYAANMAGYRKLITELDQRQQMIEISVKIIDVNAGDINQLGIDWGTAVSLGG---- 298
+IVRD M Y++LI LD+ IE+++ I+D+NA + +LG+DW + G
Sbjct: 256 IIVRDSPERMPMYQRLIHALDKPSARIEVALSIVDINADQLTELGVDWRVGIRTGNNHQV 315

Query: 299 --KKIAFNTGLNDGGASGFSTVISDTSNFMVRLNALEKSSQAYVLSQPSVVTLNNIQAVL 356
K + + GA G + R+N LE A V+S+P+++T N QAV+
Sbjct: 316 VIKTTGDQSNIASNGALGSLVDARGLDYLLARVNLLENEGSAQVVSRPTLLTQENAQAVI 375

Query: 357 DKNITFYTKLQGEKVAKLESITTGSLLRVTPRLLNDNGTQKIMLNLNIQDGQQSDTQSET 416
D + T+Y K+ G++VA+L+ IT G++LR+TPR+L +I LNL+I+DG Q S
Sbjct: 376 DHSETYYVKVTGKEVAELKGITYGTMLRMTPRVLTQGDKSEISLNLHIEDGNQKPNSSGI 435

Query: 417 DPLPEVQNSEIASQATLLAGQSLLLGGFKQGKQIHSQNKIPLLGDIPVVGHLFRNDTTQV 476
+ +P + + + + A + GQSL++GG + + + +K+PLLGDIP +G LFR +
Sbjct: 436 EGIPTISRTVVDTVARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELT 495

Query: 477 HSVIRLFLIKASVVNNGISH 496
+RLF+I+ +++ GI+H
Sbjct: 496 RRTVRLFIIEPRIIDEGIAH 515


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1420  SYCDCHAPRONE  88  9e-25  Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 87.7 bits (217), Expect = 9e-25
Identities = 40/154 (25%), Positives = 67/154 (43%), Gaps = 7/154 (4%)

Query: 6 TLQQAHDTMRFFRRGGSLRMLL---DDDVTQPLNTLYRYATQLMEVKEFAGAARLFQLLT 62
T + F + GG++ ML D L LY A + ++ A ++FQ L
Sbjct: 8 TQEYQLAMESFLKGGGTIAMLNEISSDT----LEQLYSLAFNQYQSGKYEDAHKVFQALC 63

Query: 63 IYDAWSFDYWFRLGECCQAQKHWGEAIYAYGRAAQIKIDAPQAPWAAAECYLACDNVCYA 122
+ D + ++ LG C QA + AI++Y A + I P+ P+ AAEC L + A
Sbjct: 64 VLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEA 123

Query: 123 IKALKAVVRICGEVSEHQILRLRAEKMLQQLSDR 156
L + + +E + L R ML+ + +
Sbjct: 124 ESGLFLAQELIADKTEFKELSTRVSSMLEAIKLK 157


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1422  PF05844  28  0.018  YopD protein
		>PF05844#YopD protein

Length = 295

Score = 28.4 bits (63), Expect = 0.018
Identities = 29/149 (19%), Positives = 49/149 (32%), Gaps = 19/149 (12%)

Query: 9 VLPAPSL-LTPSSSQAPSGEGMGTESMLLLFDDIWTKLMELAKKLRDIMRSYNVVKQRLG 67
L AP L P +A E + +LL+ I K EL RD + Q+
Sbjct: 50 ELNAPRQVLDPVRMEAAGSELDSSVELLLILFRIAQKARELGVLQRDNENQAIIHAQK-- 107

Query: 68 WELQVNVLQTQMKTIDEAFRASMITAGGAMLSGVLTIGLGAVGGETGLIAGQAVGHTAGG 127
+DE + + A+++GV + VG L G+A+
Sbjct: 108 ------------AQVDEMRSGATLMIAMAVIAGVGALASAVVGSLGALKNGKAISQEK-- 153

Query: 128 VMGLGAGVAQRQSDQNKAIADLKQNGAQS 156
L + R + + L + +
Sbjct: 154 --TLQKNIDGRNELIDAKMQALGKTSDED 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1424  SYCDCHAPRONE  79  1e-21  Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 79.2 bits (195), Expect = 1e-21
Identities = 26/127 (20%), Positives = 49/127 (38%)

Query: 16 LKQLLSVDPETVYASGYASWQEGDYSRAVIDFSWLVMAQPWSWRAHIALAGTWMMLKEYT 75
L ++ S E +Y+ + +Q G Y A F L + + R + L + +Y
Sbjct: 28 LNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYD 87

Query: 76 TAINFYGHALMLDASHPEPVYQTGVCLKMMGEPGLAREAFQTAIKMSYADASWSEIRQNA 135
AI+ Y + ++D P + CL GE A A ++ + E+
Sbjct: 88 LAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFLAQELIADKTEFKELSTRV 147

Query: 136 QIMVDTL 142
M++ +
Sbjct: 148 SSMLEAI 154


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1430  FLGMRINGFLIF  53  2e-10  Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 53.4 bits (128), Expect = 2e-10
Identities = 29/183 (15%), Positives = 69/183 (37%), Gaps = 15/183 (8%)

Query: 23 LYRSLPEDEANQMLALLMQHHIDAEKKQEEDGVTLRVEQSQFINAVELLRLNGYPHRQFT 82
L+ +L + + ++A L Q +I + V + L G P +
Sbjct: 53 LFSNLSDQDGGAIVAQLTQMNIPYRFA--NGSGAIEVPADKVHELRLRLAQQGLP-KGGA 109

Query: 83 TADKMFPANQLVVSPQEEQQKINFLK--EQRIEGMLSQMEGVINAKVTIALPTYDEGS-- 138
++ + +S EQ +N+ + E + + + V +A+V +A+P + S
Sbjct: 110 VGFELLDQEKFGISQFSEQ--VNYQRALEGELARTIETLGPVKSARVHLAMP---KPSLF 164

Query: 139 --NASPSSVAVFIKYSPQVNMEAFRVK-IKDLIEMSIPGLQYSKISILMQPAEFRMVPDV 195
S +V + P ++ ++ + L+ ++ GL ++++ Q +
Sbjct: 165 VREQKSPSASVTVTLEPGRALDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNT 224

Query: 196 PAR 198
R
Sbjct: 225 SGR 227


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1439  FLGMOTORFLIN  51  3e-10  Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 51.1 bits (122), Expect = 3e-10
Identities = 21/67 (31%), Positives = 38/67 (56%)

Query: 247 LEQIPQQVLFEIGRASLEIGQLRQLKTGDVLPVGGCFAPEVTIRVNDRIIGQGELIACGN 306
+ IP ++ E+GR + I +L +L G V+ + G + I +N +I QGE++ +
Sbjct: 57 IMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVAD 116

Query: 307 EFMVRIT 313
++ VRIT
Sbjct: 117 KYGVRIT 123


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1440  TYPE3IMPPROT  231  9e-80  Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 231 bits (592), Expect = 9e-80
Identities = 79/215 (36%), Positives = 130/215 (60%), Gaps = 8/215 (3%)

Query: 8 LQLIGILFLLSILPLIIVMGTSFLKLAVVFSILRNALGIQQVPPNIALYGLALVLSLFIM 67
+ LI +L ++LP II GT F+K ++VF ++RNALG+QQ+P N+ L G+AL+LS+F+M
Sbjct: 5 ISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLSMFVM 64

Query: 68 GPTLLAVKERWHPVQVAGAPFWT-SEWDSKALAPYRQFLQKNSEEKEANYFRNLIKRTWP 126
P + + V + S+ + L YR +L K S+ + +F N +
Sbjct: 65 WPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQLKRQY 124

Query: 127 ED-------IKRKIKPDSLLILIPAFTVSQLTQAFRIGLLIYLPFLAIDLLISNILLAMG 179
+ K +I+ S+ L+PA+ +S++ AF+IG +YLPF+ +DL++S++LLA+G
Sbjct: 125 GEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVLLALG 184

Query: 180 MMMVSPMTISLPFKLLIFLLAGGWDLTLAQLVQSF 214
MMM+SP+TIS P KL++F+ GW L L+ +
Sbjct: 185 MMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQY 219


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1441  TYPE3IMQPROT  72  9e-21  Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 72.5 bits (178), Expect = 9e-21
Identities = 30/85 (35%), Positives = 50/85 (58%)

Query: 4 SELTQFVTQLLWIVLFTSMPVVLVASVVGVIVSLVQALTQIQDQTLQFMIKLLAIAITLM 63
+L + L++VL S +VA+++G++V L Q +TQ+Q+QTL F IKLL + + L
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 VSYPWLSGILLNYTRQIMLRIGEHG 88
+ W +LL+Y RQ++ G
Sbjct: 62 LLSGWYGEVLLSYGRQVIFLALAKG 86


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1442  TYPE3IMRPROT  163  7e-52  Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 163 bits (415), Expect = 7e-52
Identities = 55/229 (24%), Positives = 101/229 (44%), Gaps = 5/229 (2%)

Query: 8 WLIALAVAFIRPLSLSLLLPLLKSGSLGAALLRNGVLMSLTFPILPIIYQQKIMMHIGKD 67
WL +R L+L P+L S+ ++ G+ M +TF I P + + +
Sbjct: 12 WLNLYFWPLLRVLALISTAPILSERSVPK-RVKLGLAMMITFAIAPSLPANDVPVF---S 67

Query: 68 YSWLGLVTGEVIIGFLIGFCAAVPFWAVDMAGFLLDTLRGATMGTIFNSTIEAETSLFGL 127
+ L L +++IG +GF F AV AG ++ G + T + +
Sbjct: 68 FFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLAR 127

Query: 128 LFSQFLCVIFFISGGMEFILNILYESYQYLPPGRTLLFDQQFLKYIQAEWRTLYQLCISF 187
+ ++F G +++++L +++ LP G L FL +A ++ +
Sbjct: 128 IMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSL-IFLNGLML 186

Query: 188 SLPAIICMVLADLALGLLNRSAQQLNVFFLSMPLKSILVLLTLLISFPY 236
+LP I ++ +LALGLLNR A QL++F + PL + + + P
Sbjct: 187 ALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPL 235


26  SC1535  SC1546  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
SC1535         2       16   -1.439616  O-acetylserine/cysteine export protein
SC1536         0       13   -0.159864  hypothetical protein
SC1537         0       14    0.280744  DNA-binding transcriptional activator MarA
SC1538         0       14    0.881415  DNA-binding transcriptional repressor MarR
SC1539         0       15   -0.114119  multiple drug resistance protein MarC
SC1540         0       15   -0.574622  sugar efflux transporter
SC1541        -1       18   -1.105736  succinate semialdehyde dehydrogenase
SC1542         1       21   -4.360396  glutaminase
SC1543         1       25   -5.731988  hypothetical protein
SC1544         1       26   -4.987201  inner membrane protein
SC1545         1       25   -3.910132  outer membrane protein
SC1546         0       22   -3.477773  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1540  TCRTETB  57  4e-11  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 57.2 bits (138), Expect = 4e-11
Identities = 44/192 (22%), Positives = 85/192 (44%), Gaps = 8/192 (4%)

Query: 36 LSDIAESFHMQTAQVGIMLTIYAWVVAVMSLPFMLLTSQMERRKLLICLFVLFIASHVLS 95
L DIA F+ A + T + ++ + + L+ Q+ ++LL+ ++ V+
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 96 FLAWN-FTVLVISRIGIAFAHAIFWSITASLAIRLAPAGKRAQALSLIATGTALAMVLGL 154
F+ + F++L+++R A F ++ + R P R +A LI + A+ +G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 155 PIGRVVGQYFGWRTTFFAIGMGALITLLCLIKLLPKLPSEHSGSLKSLPLLFRRPALMSL 214
IG ++ Y W + I M +IT+ L+KLL K + LMS+
Sbjct: 157 AIGGMIAHYIHW-SYLLLIPMITIITVPFLMKLLKK------EVRIKGHFDIKGIILMSV 209

Query: 215 YVLTVVVVTAHY 226
++ ++ T Y
Sbjct: 210 GIVFFMLFTTSY 221


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1542  BLACTAMASEA  31  0.008  Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 30.5 bits (69), Expect = 0.008
Identities = 12/51 (23%), Positives = 21/51 (41%), Gaps = 1/51 (1%)

Query: 22 GQGKVADYIPALASVEGSKLGI-AICTVDGQHYQAGDAHERFSIQSISKVL 71
+ + I S ++G+ + G+ A A ERF + S KV+
Sbjct: 21 ASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADERFPMMSTFKVV 71


27  SC1620  SC1630  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias     %GCBias  Product
SC1620        -1       22   -3.885965  LysR family transcriptional regulator
SC1621        -1       22   -4.871924  methyl-accepting chemotaxis protein III, ribose
SC1622         2       30   -7.745783  alcohol dehydrogenase
SC1623         4       41  -11.109837  hypothetical protein
SC1624         4       42  -11.383861  dipicolinate reductase
SC1626         4       37  -11.092360  translocated effector: regulated by SPI-2
SC1627         2       34   -8.961003  inner membrane protein
SC1628         2       34   -9.154189  periplasmic binding protein
SC1629         2       32   -8.513754  amino acid ABC transporter permease component
SC1630        -1       24   -5.719196  polar amino acid ABC transporter ATPase

28  SC1654  SC1665  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias     %GCBias  Product
SC1654        -1       13   -3.629627  Smr domain-containing protein
SC1655        -1       14   -3.802832  O-6-alkylguanine-DNA:cysteine-protein
SC1656        -2       14   -3.678380  fumarate/nitrate reduction transcriptional
SC1657        -2       17   -4.401041  universal stress protein UspE
SC1658        -1       23   -6.846933  hypothetical protein
SC1659         0       24   -6.313846  hypothetical protein
SC1660         1       30   -7.547632  transcriptional regulator
SC1661         3       35   -8.975352  hypothetical protein
SC1662         5       38  -10.500851  hypothetical protein
SC1663         4       35   -8.464892  hypothetical protein
SC1664         2       29   -5.655421  serine/threonine protein kinase
SC1665         0       26   -5.849174  AraC family transcriptional regulator

29  SC1679  SC1690  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
SC1679         3       15   -1.104982  thiosulfate:cyanide sulfurtransferase
SC1680         2       13    0.230568  peripheral inner membrane phage-shock protein
SC1681         1       13    0.046930  DNA-binding transcriptional activator PspC
SC1682         0       13    0.900181  phage shock protein B
SC1683        -1       12    0.613813  phage shock protein PspA
SC1684        -2       13    0.878430  phage shock protein operon transcriptional
SC1685        -2       13    0.161457  peptide ABC transporter substrate-binding
SC1686        -1       17   -3.146454  peptide ABC transporter
SC1687         0       20   -4.486726  peptide ABC transporter
SC1688         0       25   -6.342701  peptide ABC transporter ATP-binding protein
SC1689         0       24   -6.279218  peptide ABC transporter ATP-binding protein
SC1690        -1       29   -5.752955  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1682  MPTASEINHBTR  26  0.015  Metalloprotease inhibitor signature.
		>MPTASEINHBTR#Metalloprotease inhibitor signature.

Length = 122

Score = 25.7 bits (56), Expect = 0.015
Identities = 6/43 (13%), Positives = 14/43 (32%)

Query: 30 AGRGELSQSEQQRLLQLTDDAQRMRERIQALEDILDAEHPNWR 72
AG+ + + + A + + E L + +W
Sbjct: 37 AGQLGIEATGSGVCAGPAEQANALAGDVACAEQWLGDKPVSWS 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1683  RTXTOXIND  28  0.028  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 28.3 bits (63), Expect = 0.028
Identities = 19/104 (18%), Positives = 43/104 (41%), Gaps = 5/104 (4%)

Query: 40 LVEVRSNSARALAEKKQLSRRIEQATTQQTEWQEKAELA-LRKDKDDLARAALIEKQKLT 98
+ + R + +L K+ +++ + + EL + + + L K++
Sbjct: 232 VEKSRLDDFSSLLHKQAIAK-HAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQ 290

Query: 99 DLIATLEQEVTLVDDTLARMKKEIGELENKLSETRARQQALMLR 142
+ + E+ D L + IG L +L++ RQQA ++R
Sbjct: 291 LVTQLFKNEIL---DKLRQTTDNIGLLTLELAKNEERQQASVIR 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1684  HTHFIS  344  e-118  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 344 bits (883), Expect = e-118
Identities = 124/341 (36%), Positives = 176/341 (51%), Gaps = 22/341 (6%)

Query: 6 DNLLGEANRFLEVLEQVSRLAPLDKPVLIIGERGTGKELIANRLHYLSSRWQGPLISLNC 65
L+G + E+ ++RL D ++I GE GTGKEL+A LH R GP +++N
Sbjct: 137 MPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINM 196

Query: 66 AALNENLLDSELFGHEAGAFTGAQKRHPGRFERADGGTLFLDELATAPMLVQEKLLRVIE 125
AA+ +L++SELFGHE GAFTGAQ R GRFE+A+GGTLFLDE+ PM Q +LLRV++
Sbjct: 197 AAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQ 256

Query: 126 YGELERVGGSQPLQVNVRLVCATNADLPAMVKEGTFRADLLDRLAFDVVQLPPLRERQSD 185
GE VGG P++ +VR+V ATN DL + +G FR DL RL ++LPPLR+R D
Sbjct: 257 QGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAED 316

Query: 186 IMLMAEHFAIQMCRELRLPLFPGFTDRAKETLLHYAWPGNVRELKNVVERSVYRHGSSE- 244
I + HF Q +E F A E + + WPGNVREL+N+V R +
Sbjct: 317 IPDLVRHFVQQAEKEGLDVK--RFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVI 374

Query: 245 -------HPLDEIVIDPFQRHPAEPPAPALPAA------------SATPDLPLKLREFQL 285
EI P ++ A + ++ A
Sbjct: 375 TREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLA 434

Query: 286 QQEKALLQRSLQQAKFNQKRAADLLALTYHQFRALLKKHQL 326
+ E L+ +L + NQ +AADLL L + R +++ +
Sbjct: 435 EMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1688  HTHFIS  31  0.007  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.0 bits (70), Expect = 0.007
Identities = 9/16 (56%), Positives = 14/16 (87%)

Query: 38 LVGESGSGKSLIAKAI 53
+ GESG+GK L+A+A+
Sbjct: 165 ITGESGTGKELVARAL 180


30  SC1714  SC1725  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
SC1714        -1       13    3.054644  short chain dehydrogenase
SC1715        -2       13    3.260695  cob(I)yrinic acid a,c-diamide
SC1716        -2       12    3.234465  23S rRNA pseudouridylate synthase B
SC1717        -1       13    3.143811  hypothetical protein
SC1718        -1       13    3.456665  hypothetical protein
SC1719        -1       14    3.400023  anthranilate synthase component I
SC1720         1       13    1.763616  bifunctional glutamine
SC1721         2       16   -0.348273  bifunctional indole-3-glycerol phosphate
SC1722         2       17   -0.819367  tryptophan synthase subunit beta
SC1723         0       20   -2.017136  tryptophan synthase subunit alpha
SC1724        -1       21   -3.529645  hypothetical protein
SC1725        -1       18   -3.279242  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1714  DHBDHDRGNASE  94  3e-25  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 94.3 bits (234), Expect = 3e-25
Identities = 60/242 (24%), Positives = 103/242 (42%), Gaps = 22/242 (9%)

Query: 10 LQNRIILVTGASDGIGREAALTYARYGATVILLGRNEEKLRRVAQHIADEQHVQPQWFTL 69
++ +I +TGA+ GIG A T A GA + + N EKL +V + E F
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEA-FPA 64

Query: 70 DLLTCTAEECRQVADRIAAHYPRLDGILHNAGLLGEIGPMSEQDPQIWQDVMQVNVNATF 129
D+ + ++ RI +D +++ AG+L G + + W+ VN F
Sbjct: 65 DV--RDSAAIDEITARIEREMGPIDILVNVAGVL-RPGLIHSLSDEEWEATFSVNSTGVF 121

Query: 130 MLTQALLPLLLKSDAGSLVFTSSSVGRQGRANWGAYATSKFATEGMMQVLADEYQNRPLR 189
++++ ++ +GS+V S+ R + AYA+SK A + L E +R
Sbjct: 122 NASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIR 181

Query: 190 VNCINPGGTRTSMRASAFPTEDPQ------------------KLKTPADIMPLYLWLMGD 231
N ++PG T T M+ S + E+ KL P+DI L+L+
Sbjct: 182 CNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSG 241

Query: 232 DS 233
+
Sbjct: 242 QA 243


31  SC1768  SC1785  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
SC1768        -1       16    3.203706  transcriptional regulator
SC1769        -1       15    3.201689  N5-glutamine S-adenosyl-L-methionine-dependent
SC1770        -1       14    2.929217  peptide chain release factor 1
SC1771        -2       14    2.394883  glutamyl-tRNA reductase
SC1772        -2       12    1.652702  molecular chaperone LolB
SC1773        -2       15    1.095453  4-diphosphocytidyl-2-C-methyl-D-erythritol
SC1774        -2       16   -0.476956  ribose-phosphate pyrophosphokinase
SC1775        -2       16   -2.344012  sulfate transporter YchM
SC1776         1       27   -8.797541  hypothetical protein
SC1777         2       21   -6.222314  peptidyl-tRNA hydrolase
SC1778         3       20   -5.534788  GTP-dependent nucleic acid-binding protein EngD
SC1779         4       22   -5.440931  hypothetical protein
SC1780         3       19   -3.350007  hypothetical protein
SC1781         2       18   -0.632722  hypothetical protein
SC1782         3       20    4.031913  hydrogenase-1 small subunit
SC1783         3       17    3.539326  hydrogenase 1 b-type cytochrome subunit
SC1784         0       13    3.341400  hydrogenase 1 maturation protease
SC1785         0       12    3.178757  hydrogenase-1 operon protein HyaE
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1770  RTXTOXIND  32  0.004  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.004
Identities = 16/112 (14%), Positives = 39/112 (34%), Gaps = 12/112 (10%)

Query: 9 LEALHERHEEVQALLGDAGIIADQDRFRALSREYAQLS-DVSRCFTDWQQVQDDIETAQM 67
R ++ +LL I + +Y + ++ + +Q++ +I +A+
Sbjct: 230 SRVEKSRLDDFSSLLHKQAI--AKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKE 287

Query: 68 MLD--DPEMREMAQEELREAKEKSEQLEQQLQVLLLPKDPDDERNAFLEVRA 117
+ ++LR+ + L +L +ER +RA
Sbjct: 288 EYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKN-------EERQQASVIRA 332


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1775  RTXTOXINA  31  0.016  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 31.1 bits (70), Expect = 0.016
Identities = 23/81 (28%), Positives = 37/81 (45%), Gaps = 16/81 (19%)

Query: 288 LGAIESLLCAV----VL---DGMTGTKHKANSELIGQGLGNM---VAPFF------GGIT 331
L + +L A+ +L D T TK A EL + LGN+ ++ + G++
Sbjct: 242 LDTVSGILSAISASFILSNADADTRTKAAAGVELTTKVLGNVGKGISQYIIAQRAAQGLS 301

Query: 332 ATAAIARSAANVRAGATSPVS 352
+AA A A+ A SP+S
Sbjct: 302 TSAAAAGLIASAVTLAISPLS 322


32  SC1847  SC1874  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
SC1847         0       33   -6.595916  hypothetical protein
SC1848         1       34   -9.043925  hypothetical protein
SC1849         1       34   -8.220962  serine/threonine protein phosphatase 1
SC1850         3       39  -10.968323  inner membrane protein
SC1851         5       38  -10.700511  hypothetical protein
SC1852         5       35   -9.925459  invasion-associated type III-secreted protein
SC1853         4       34   -9.362252  hypothetical protein
SC1854         5       33   -6.347810  hypothetical protein
SC1855         4       35   -7.364992  acetyltransferase
SC1856         3       31   -4.215390  hypothetical protein
SC1857         1       29   -2.118225  hypothetical protein
SC1858         2       36   -5.637437  hypothetical protein
SC1859         1       34   -6.013622  transposase
SC1860         4       37   -7.205236  hypothetical protein
SC1861         5       40   -7.925720  error-prone repair: component of DNA polymerase
SC1862         5       28   -6.439639  fimbriae usher protein
SC1863         6       28   -6.570218  PhoPQ-activated integral membrane protein
SC1864         6       24   -4.979839  inner membrane protein
SC1865         5       28   -7.105608  inner membrane protein
SC1866         4       25   -6.239049  inner membrane lipoprotein
SC1867         4       24   -5.780618  hydrolase
SC1868         5       31   -6.670591  hypothetical protein
SC1869         4       33   -7.621388  DNA-binding protein
SC1872         4       34   -8.645219  hypothetical protein
SC1873         4       25   -3.202790  hypothetical protein
SC1874         2       22   -2.269699  Gifsy-1 prophage VtaP
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1852  SOPEPROTEIN  401  e-146  Salmonella type III secretion SopE effector protein ...
		>SOPEPROTEIN#Salmonella type III secretion SopE effector protein signature.

Length = 239

Score = 401 bits (1032), Expect = e-146
Identities = 163/237 (68%), Positives = 193/237 (81%)

Query: 2 TNITLSTQHYRIHRSDVEPVKEKTTEKDIFAKSITAVRNSFISLSTSLSDRFSLHQQTDI 61
T ITLS Q++RI + + +KEK+TEK+ AKSI AV+N FI L + LS+RF H+ T+
Sbjct: 1 TKITLSPQNFRIQKQETTLLKEKSTEKNSLAKSILAVKNHFIELRSKLSERFISHKNTES 60

Query: 62 PTTHFHRGSASEGRAVLTSKTVKDFMLQKLNSLDIKGNASKDPAYARQTCEAILSAVYSN 121
THFHRGSASEGRAVLT+K VKDFMLQ LN +DI+G+ASKDPAYA QT EAILSAVYS
Sbjct: 61 SATHFHRGSASEGRAVLTNKVVKDFMLQTLNDIDIRGSASKDPAYASQTREAILSAVYSK 120

Query: 122 NKDHCCKLLISKGVSITPFLKEIGEAAQNAGLPGEIKNGVFTPGGAGANPFVVPLIAAAS 181
NKD CC LLISKG++I PFL+EIGEAA+NAGLPG KN VFTP GAGANPF+ PLI++A+
Sbjct: 121 NKDQCCNLLISKGINIAPFLQEIGEAAKNAGLPGTTKNDVFTPSGAGANPFITPLISSAN 180

Query: 182 IKYPHMFINHNQQVSFKAYAEKIVMKEVTPLFNKGTMPTPQQFQLTIENIANKHLQN 238
KYP MFIN +QQ SFK YAEKI+M EV PLFN+ MPTPQQFQL +ENIANK++QN
Sbjct: 181 SKYPRMFINQHQQASFKIYAEKIIMTEVAPLFNECAMPTPQQFQLILENIANKYIQN 237


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1855  SACTRNSFRASE  28  0.011  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 28.0 bits (62), Expect = 0.011
Identities = 13/60 (21%), Positives = 27/60 (45%), Gaps = 2/60 (3%)

Query: 60 WLCIDYLWVSESARSNGLGSKLMEMAEKEGLRKGCVHGLVDTFSFQ--ALPFYEKQGYIL 117
+ I+ + V++ R G+G+ L+ A + +++T A FY K +I+
Sbjct: 89 YALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFII 148


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1868  ACRIFLAVINRP  34  0.001  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 34.0 bits (78), Expect = 0.001
Identities = 25/83 (30%), Positives = 40/83 (48%), Gaps = 10/83 (12%)

Query: 123 QLPFAWPLSVILMLTALAALY--YHLPALLLFIVPLWLT-ALLASVRLNQYVNIRFLLVW 179
Q P +S +++ LAALY + +P ++ +VPL + LLA+ NQ ++ F++
Sbjct: 871 QAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGL 930

Query: 180 LTL------TAILIYGRFILQRW 196
LT AILI F
Sbjct: 931 LTTIGLSAKNAILIVE-FAKDLM 952


33  SC1950  SC1962  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
SC1950        -1       17   -3.034606  response regulator
SC1951         0       20   -2.315774  hypothetical protein
SC1952        -1       18    0.062923  hypothetical protein
SC1953        -1       17    0.297973  hypothetical protein
SC1954        -1       15    0.049949  DNA-binding transcriptional activator SdiA
SC1955        -2       15    1.054414  amino-acid ABC transporter ATP-binding protein
SC1956        -2       13   -0.210354  ABC-type amino acid transporter permease
SC1957         1       13   -1.462212  D-cysteine desulfhydrase
SC1958         2       15   -3.114090  cystine transporter subunit
SC1959         2       15   -3.492245  flagella biosynthesis protein FliZ
SC1960         1       16   -3.480994  flagellar biosynthesis sigma factor
SC1961         1       17   -4.192607  flagellin methylation protein
SC1962         1       16   -3.799578  flagellin
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1950  HTHFIS  75  4e-18  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.3 bits (185), Expect = 4e-18
Identities = 24/113 (21%), Positives = 46/113 (40%), Gaps = 2/113 (1%)

Query: 4 VLLVDDHELVRAGIRRILEDIKGIKVVGEACCGEDAVKWCRTNAVDVVLMDMNMPGIGGL 63
+L+ DD +R + + L G V + +W D+V+ D+ MP
Sbjct: 6 ILVADDDAAIRTVLNQALSR-AGYDVRITSN-AATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 64 EATRKIARSTADIKVIMLTVHTENPLPAKVMQAGAAGYLSKGAAPQEVVSAIR 116
+ +I ++ D+ V++++ K + GA YL K E++ I
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1962  FLAGELLIN  267  8e-86  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 267 bits (684), Expect = 8e-86
Identities = 257/510 (50%), Positives = 300/510 (58%), Gaps = 13/510 (2%)

Query: 2 AQVINTNSLSLLTQNNLNKSQSALGTAIERLSSGLRINSAKDDAAGQAIANRFTANIKGL 61
AQVINTNSLSLLTQNNLNKSQS+L +AIERLSSGLRINSAKDDAAGQAIANRFT+NIKGL
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 TQASRNANDGISIAQTTEGALNEINNNLQRVRELAVQSANSTNSQSDLDSIQAEITQRLN 121
TQASRNANDGISIAQTTEGALNEINNNLQRVREL+VQ+ N TNS SDL SIQ EI QRL
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 EIDRVSGQTQFNGVKVLAQDNTLTIQVGANDGETIDIDLKQINSQTLGLDTLNVQKKYDV 181
EIDRVS QTQFNGVKVL+QDN + IQVGANDGETI IDL++I+ ++LGLD NV +
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEA 180

Query: 182 SDTAVAASYSDSKQNIAVPDKTAITAKIGAATSGGAGIKADISFKDGKYYATVSGYDDAA 241
+ + + S SG K Y +
Sbjct: 181 TVGDLKS--SFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTT 238

Query: 242 DTDKNGTYEVTVAADTGAVTFATTPTVVDLPTDAKAVSKVQQNDTEIAATNAKAALKAAG 301
D +N T A + K + K G
Sbjct: 239 DDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGV-TFTIDTKTGNDGNG 297

Query: 302 VADAEADTATLVKMSYTDNNGKVIDGGFAFKTSGGYYAASV-------DKSGAASLKVTS 354
+ + G ++S Y + V DK+ S K++
Sbjct: 298 KVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSD 357

Query: 355 YVD---ATTGTEKTAANKLGGADGKTEVVTIDGKTYNASKAAGHNFKAQPELAEAAATTT 411
++ T A+ + VT+ GKT K A E A AA +T
Sbjct: 358 LEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKST 417

Query: 412 ENPLQKIDAALAQVDALRSDLGAVQNRFNSAITNLGNTVNNLSSARSRIEDSDYATEVSN 471
NPL ID+AL++VDA+RS LGA+QNRF+SAITNLGNTV NL+SARSRIED+DYATEVSN
Sbjct: 418 ANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSN 477

Query: 472 MSRAQILQQAGTSVLAQANQVPQNVLSLLR 501
MS+AQILQQAGTSVLAQANQVPQNVLSLLR
Sbjct: 478 MSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


34  SC1972  SC1985  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
SC1972        -1       14    3.926198  flagellar hook-basal body protein FliE
SC1973         0       13    3.595015  hypothetical protein
SC1974         0       13    4.905976  flagellar MS-ring protein
SC1975         0       15    4.930755  flagellar motor switch protein G
SC1976        -1       16    4.657759  flagellar assembly protein H
SC1977        -1       15    3.949932  flagellum-specific ATP synthase
SC1978         0       15    3.383233  flagellar biosynthesis chaperone
SC1979        -1       17    3.420480  flagellar hook-length control protein
SC1980        -1       15    1.190045  flagellar basal body protein FliL
SC1981         0       13    0.525135  flagellar motor switch protein FliM
SC1982         1       14   -2.246300  flagellar motor switch protein FliN
SC1983         1       14   -2.558198  flagellar biosynthesis protein FliO
SC1984         0       15   -4.066970  flagellar biosynthesis protein FliP
SC1985        -1       16   -3.429783  flagellar biosynthesis protein FliQ
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1972  FLGHOOKFLIE  110  2e-35  Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 110 bits (277), Expect = 2e-35
Identities = 90/103 (87%), Positives = 95/103 (92%)

Query: 2 AAIQGIEGVISQLQATAMAASGQETHSQSTVSFAGQLHAALDRISDRQTAARVQAEKFTL 61
+AIQGIEGVISQLQATAM+A QE+ Q T+SFAGQLHAALDRISD QTAAR QAEKFTL
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 60

Query: 62 GEPGIALNDVMADMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 104
GEPG+ALNDVM DMQKASVSMQMGIQVRNKLVAAYQEVMSMQV
Sbjct: 61 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103
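
Note on reading the summary lines: each alignment in this report is preceded by a "Score = ..., Expect = ..." line and an "Identities = ..., Positives = ..." line, and the percentages are simply the count divided by the alignment length. The Python sketch below shows one way to extract those numbers. It is not part of PredictBias or BLAST; the function name parse_hit_summary is hypothetical, and reading a bare "e-118" as 1e-118 is an assumption about the report's notation.

```python
import re

# Hypothetical helper, not part of PredictBias or BLAST: it parses the
# summary lines that precede each alignment in this report.
SUMMARY = re.compile(
    r"Score\s*=\s*(?P<bits>[\d.]+)\s*bits\s*\((?P<raw>\d+)\),\s*"
    r"Expect\s*=\s*(?P<evalue>\S+)"
)
COUNTS = re.compile(
    r"(?P<name>Identities|Positives|Gaps)\s*=\s*(?P<num>\d+)/(?P<den>\d+)"
)


def parse_hit_summary(text):
    """Return bit score, raw score, E-value and match counts as a dict."""
    m = SUMMARY.search(text)
    if m is None:
        raise ValueError("no Score/Expect line found")
    evalue = m.group("evalue")
    # Assumption: a bare exponent such as "e-118" stands for 1e-118.
    if evalue.startswith("e"):
        evalue = "1" + evalue
    result = {
        "bits": float(m.group("bits")),
        "raw_score": int(m.group("raw")),
        "evalue": float(evalue),
    }
    for c in COUNTS.finditer(text):
        num, den = int(c.group("num")), int(c.group("den"))
        # Store (matches, alignment length, unrounded percentage).
        result[c.group("name").lower()] = (num, den, 100.0 * num / den)
    return result


example = """Score = 110 bits (277), Expect = 2e-35
Identities = 90/103 (87%), Positives = 95/103 (92%)"""
hit = parse_hit_summary(example)
print(hit)
```

The report prints rounded percentages, so the unrounded values returned here can differ from the printed ones in the decimal places.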


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1974  FLGMRINGFLIF  784  0.0  Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 784 bits (2026), Expect = 0.0
Identities = 557/559 (99%), Positives = 558/559 (99%)

Query: 2 SATASTATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQ 61
SATASTATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQ
Sbjct: 1 SATASTATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQ 60

Query: 62 DGGAIVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKF 121
DGGAIVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKF
Sbjct: 61 DGGAIVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKF 120

Query: 122 GISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLE 181
GISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLE
Sbjct: 121 GISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLE 180

Query: 182 PGRALDEGQISAVVHLVSSAVAGLPLGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDV 241
PGRALDEGQISAVVHLVSSAVAGLP GNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDV
Sbjct: 181 PGRALDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDV 240

Query: 242 ESRIQRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNIS 301
ESRIQRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNIS
Sbjct: 241 ESRIQRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNIS 300

Query: 302 EQVGAGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRNTQRN 361
EQVGAGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPR+TQRN
Sbjct: 301 EQVGAGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRN 360

Query: 362 ETSNYEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMG 421
ETSNYEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMG
Sbjct: 361 ETSNYEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMG 420

Query: 422 FSDKRGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVR 481
FSDKRGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVR
Sbjct: 421 FSDKRGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVR 480

Query: 482 PQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSD 541
PQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSD
Sbjct: 481 PQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSD 540

Query: 542 NDPRVVALVIRQWMSNDHE 560
NDPRVVALVIRQWMSNDHE
Sbjct: 541 NDPRVVALVIRQWMSNDHE 559


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1975  FLGMOTORFLIG  339  e-118  Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 339 bits (870), Expect = e-118
Identities = 114/329 (34%), Positives = 196/329 (59%), Gaps = 2/329 (0%)

Query: 1 MSNLSGTDKSVILLMTIGEDRAAEVFKHLSTREVQALSTAMANVRQISNKQLTDVLSEFE 60
+S L+G K+ ILL++IG + +++VFK+LS E+++L+ +A + I+++ +VL EF+
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 61 QEAEQFAALNINANEYLRSVLVKALGEERASSLLEDILETRDTTSGIETLNFMEPQSAAD 120
+ + +Y R +L K+LG ++A ++ + L + + E + +P + +
Sbjct: 72 ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPANILN 130

Query: 121 LIRDEHPQIIATILVHLKRSQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLN 180
I+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 131 FIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLE 190

Query: 181 GLLDGQ-NLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENL 239
L + + GGV EIIN+ + E+ +I ++ E D ELA++I +MF+FE++
Sbjct: 191 KKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDI 250

Query: 240 VDVDDRSIQRLLQEVDSESLLIALKGAEPPLREKFLRNMSQRAADILRDDLANRGPVRLS 299
V +DDRSIQR+L+E+D + L ALK + P++EK +NMS+RAA +L++D+ GP R
Sbjct: 251 VLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRK 310

Query: 300 QVENEQKAILLIVRRLAETGEMVIGSGED 328
VE Q+ I+ ++R+L E GE+VI G +
Sbjct: 311 DVEESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1976  FLGFLIH  368  e-133  Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 368 bits (945), Expect = e-133
Identities = 193/235 (82%), Positives = 209/235 (88%), Gaps = 7/235 (2%)

Query: 1 MSNELPWQVWTPDDLAPPPETFVPVEADNVTLTEDTPEPELTAEQQLEQELAQLKIQAHE 60
MS+ LPW+ WTPDDLAPP FVP+ T+ E+ AE LEQ+LAQL++QAHE
Sbjct: 1 MSDNLPWKTWTPDDLAPPQAEFVPIVEPEETIIEE-------AEPSLEQQLAQLQMQAHE 53

Query: 61 QGYNAGLAEGRQKGHAQGYQEGLAQGLEQGQAQAQTQQAPIHARMQQLVSEFQNTLDALD 120
QGY AG+AEGRQ+GH QGYQEGLAQGLEQG A+A++QQAPIHARMQQLVSEFQ TLDALD
Sbjct: 54 QGYQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALD 113

Query: 121 SVIASRLMQMALEAARQVIGQTPAVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRV 180
SVIASRLMQMALEAARQVIGQTP VDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRV
Sbjct: 114 SVIASRLMQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRV 173

Query: 181 EEMLGATLSLHGWRLRGDPTLHHGGCKVSADEGDLDASVATRWQELCRLAAPGVL 235
++MLGATLSLHGWRLRGDPTLH GGCKVSADEGDLDASVATRWQELCRLAAPGV+
Sbjct: 174 DDMLGATLSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1978  FLGFLIJ  206  4e-72  Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 206 bits (526), Expect = 4e-72
Identities = 130/147 (88%), Positives = 138/147 (93%)

Query: 1 MAQHGALETLKDLAEKEVDDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRSNLNTDMGNG 60
MA+HGAL TLKDLAEKEV+DAARLLGEMRRGCQQAEEQLKMLIDYQNEYR+NLN+DM G
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 IASNRWINYQQFIQTLEKAIEQHRLQLTQWTQKVDLALKSWREKKQRLQAWQTLQDRQTA 120
I SNRWINYQQFIQTLEKAI QHR QL QWTQKVD+AL SWREKKQRLQAWQTLQ+RQ+
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 AALLAENRMDQKKMDEFAQRAAMRKPE 147
AALLAENR+DQKKMDEFAQRAAMRKPE
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1979  FLGHOOKFLIK  408  e-144  Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 408 bits (1049), Expect = e-144
Identities = 191/409 (46%), Positives = 231/409 (56%), Gaps = 38/409 (9%)

Query: 1 MITLPQLITTDTDMTAGLTSGKTTGSAEDFLALLAGALGADGAQGKDARITLADLQAAGG 60
MI L LIT D D T L GK + +A+DFLALL+ AL + K A L
Sbjct: 1 MIRLAPLITADVDTTT-LPGGKASDAAQDFLALLSEALAGETTTDKAAPQLL-------- 51

Query: 61 KLSKELLTQHGEPGQAVKLADLLAQKAN---ATDETLTDLTQAQHLLSTLTPSLKTSALA 117
++ + T GEP + ++D AQ+AN DET + Q + LT + + A
Sbjct: 52 -VATDKPTTKGEPLISDIVSD--AQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAA 108

Query: 118 ALSKTAQHDEKTPALSDEDLASLSALFAMLPGQPVATPVAGETPAENHIALPSLLRGDMP 177
K DEK L+++ ASLSALFAMLPG V D P
Sbjct: 109 VADKNTTKDEKADDLNEDVTASLSALFAMLPGFDNTPKVT-----------------DAP 151

Query: 178 SAPQEETHTLSFSEHEKGKTEASLARASDDRATGPSLTPLVVAAAATSAKVEVDSPSAPV 237
S F++ T L A D A G PL A +K EV S +PV
Sbjct: 152 STVLPTEKPTLFTK----LTSEQLTTAQPDDAPGTPAQPLTPLVAEAQSKAEVISTPSPV 207

Query: 238 THGAAMPTLSSATAQPLPVASAPELSAPLGSHEWQQTFSQQVMLFTRQGQQSAQLRLHPE 297
T AA P ++ QPLP +AP LSAPLGSHEWQQ+ SQ + LFTRQGQQSA+LRLHP+
Sbjct: 208 T-AAASPLITPHQTQPLPTVAAPVLSAPLGSHEWQQSLSQHISLFTRQGQQSAELRLHPQ 266

Query: 298 ELGQVHISLKLDDNQAQLQMVSPHSHVRAALEAALPMLRTQLAESGIQLGQSSISSESFA 357
+LG+V ISLK+DDNQAQ+QMVSPH HVRAALEAALP+LRTQLAESGIQLGQS+IS ESF+
Sbjct: 267 DLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAALPVLRTQLAESGIQLGQSNISGESFS 326

Query: 358 GQQQ-SSSQQQSSRAQHTDAFGAEDDIALAAPASLQAAARGNGAVDIFA 405
GQQQ +S QQQS R + + EDD L P SLQ GN VDIFA
Sbjct: 327 GQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVSLQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1981  FLGMOTORFLIM  384  e-136  Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 384 bits (987), Expect = e-136
Identities = 86/324 (26%), Positives = 148/324 (45%), Gaps = 10/324 (3%)

Query: 5 ILSQAEIDALLNGDS--DTKDEPTPGIASDSDIRPYDPNTQRRVVRERLQALEIINERFA 62
+LSQ EID LL S D E I+ I YD + +E+++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 63 RQFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSPSL 122
R L LR + V ++ Y EF R++P P+ L +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 123 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 182
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 183 EMQVKFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 240
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 241 HEDQNWRDNLVRQVQHSELELVANFADIPLRLSQILKLKPGDVLPIEKP---DRIIAHVD 297
+ L ++ ++++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 298 GVPVLTSQYGTVNGQYALRVEHLI 321
Q G V + A ++ I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1982  FLGMOTORFLIN  209  2e-73  Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 209 bits (534), Expect = 2e-73
Identities = 136/137 (99%), Positives = 136/137 (99%)

Query: 1 MSDMNNPSDENTGALDDLWADALNEQKATTNKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60
MSDMNNPSDENTGALDDLWADALNEQKATT KSAADAVFQQLGGGDVSGAMQDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1984  FLGBIOSNFLIP  330  e-117  Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP

signature.
Length = 245

Score = 330 bits (847), Expect = e-117
Identities = 225/245 (91%), Positives = 233/245 (95%)

Query: 1 MRRLLFLSLAGLWLFSPAAAAQLPGLISQPLAGGGQSWSLSVQTLVFITSLTFLPAILLM 60
MRRLL ++ LWL +P A AQLPG+ SQPL GGGQSWSL VQTLVFITSLTF+PAILLM
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60

Query: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEQK 120
MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSE+K
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120

Query: 121 ISMQEALDKGAQPLRAFMLRQTREADLALFARLANSGPLQGPEAVPMRILLPAYVTSELK 180
ISMQEAL+KGAQPLR FMLRQTREADL LFARLAN+GPLQGPEAVPMRILLPAYVTSELK
Sbjct: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180

Query: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240
TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA
Sbjct: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240

Query: 241 QSFYS 245
QSFYS
Sbjct: 241 QSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1985  TYPE3IMQPROT  67  1e-18  Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein

family signature.
Length = 86

Score = 67.5 bits (165), Expect = 1e-18
Identities = 23/78 (29%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALITGLIISILQAATQINEMTLSFIPKIVAVFIAII 63
+ ++ G +A+ + L L+ +VA I GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 VAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


35  SC1997  SC2011  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
SC1997         2       21   -2.004058  hypothetical protein
SC1998         1       19   -2.580166  porin
SC1999         1       25   -2.703538  cold-shock protein
SC2000         2       25   -1.305055  DNA polymerase V subunit UmuC
SC2001         1       25   -3.871296  DNA polymerase V subunit UmuD
SC2002         2       25   -2.107275  hypothetical protein
SC2003         2       24   -2.455634  *hypothetical protein
SC2004         1       28   -4.179063  *hypothetical protein
SC2005         1       23   -7.013424  P4-type integrase
SC2006         0       22   -6.679964  hypothetical protein
SC2008         0       21   -6.101577  *branched chain amino acid transport protein
SC2009         1       21   -7.101677  hypothetical protein
SC2010         1       21   -6.370201  hypothetical protein
SC2011        -1       17   -3.175628  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC1998  ECOLIPORIN  566  0.0  E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 566 bits (1459), Expect = 0.0
Identities = 270/397 (68%), Positives = 312/397 (78%), Gaps = 14/397 (3%)

Query: 1 MNRKVLALLVPALLVAGAANAAEIYNKNGNKLDLYGKVDGLRYFSDNAGDDGDQSYARIG 60
M RKVLAL++PALL AGAA+AAEIYNK+GNKLDLYGKVDGL YFSD++ DGDQ+Y R+G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 FKGETQINDMLTGYGQWEYNIKVNTTEGEGANSWTRLGFAGLKFGEYGSFDYGRNYGVIY 120
FKGETQIND LTGYGQWEYN++ NTTEGEGANSWTRL FAGLKFG+YGSFDYGRNYGV+Y
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYGVLY 120

Query: 121 DIEAWTDALPEFGGDTYTQTDVYMLGRTNGVATYRNTDFFGLVEGLNFALQYQGNNEDPG 180
D+E WTD LPEFGGD+YT D YM GR NGVATYRNTDFFGLV+GLNFALQYQG NE
Sbjct: 121 DVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQS 180

Query: 181 AGEGTANGSDANSGSRKLARENGDGFGMSASYDFDFGLSLGAAYSSSDRTDNQVARGYGD 240
A + ++ N+G + +NGDGFG+S +YD G S GAAY++SDRT+ QV G
Sbjct: 181 ADDVNIGTNNRNNG-DDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAG--- 236

Query: 241 GMNERNNYTGGETAEAWTVGAKYDAYNVYLAAMYAETRNMTYYGGGNGEGNGGIANKTQN 300
GG+ A+AWT G KYDA N+YLA MY+ETRNMT YG + +GG+ANKTQN
Sbjct: 237 -----GTIAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQN 291

Query: 301 FEVVAQYQFDFGLRPSIAYLQSKGKDLGGQEVHRGNWHYTDKDLVKYVDVGMTYYFNKNM 360
FEV AQYQFDFGLRP++++L SKGKDL N + DKDLVKY DVG TYYFNKN
Sbjct: 292 FEVTAQYQFDFGLRPAVSFLMSKGKDLT-----YNNVNGDDKDLVKYADVGATYYFNKNF 346

Query: 361 STYVDYKINLLDEDDDFYASNGIATDDIVGVGLVYQF 397
STYVDYKINLLD+DD FY GI+TDDIV +G+VYQF
Sbjct: 347 STYVDYKINLLDDDDPFYKDAGISTDDIVALGMVYQF 383


36  SC2024  SC2061  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias     %GCBias  Product
SC2024         2       18    2.687265  nicotinate-nucleotide--dimethylbenzimidazole
SC2025         2       21    1.553215  cobalamin synthase
SC2026         2       21    1.219579  adenosylcobinamide kinase
SC2027         1       19    1.610251  cobyric acid synthase
SC2028        -1       17    1.347171  synthesis of vitamin B12 adenosyl cobalamide
SC2029        -1       16    2.603356  synthesis of vitamin B12 adenosyl cobalamide
SC2030        -1       15    3.004602  cobalt transport protein CbiN
SC2031        -2       14    3.190269  cobalt transport protein CbiM
SC2032        -2       13    3.538569  cobalt-precorrin-2 C(20)-methyltransferase
SC2033        -2       13    3.361327  synthesis of vitamin B12 adenosyl cobalamide
SC2034        -3       14    3.713145  cobalt-precorrin-6x reductase
SC2035        -2       14    2.972636  precorrin-3B C(17)-methyltransferase
SC2036        -2       17    2.831944  cobalamin biosynthesis protein CbiG
SC2037        -1       16    3.637880  synthesis of vitamin B12 adenosyl cobalamide
SC2038         0       15    3.343651  cobalt-precorrin-6Y C(15)-methyltransferase
SC2039         0       14    1.649935  cobalt-precorrin-6Y C(5)-methyltransferase
SC2040         0       15    0.163861  cobalt-precorrin-6A synthase
SC2041         1       17   -0.262533  cobalt-precorrin-8X methylmutase
SC2042         0       13    0.463585  cobalamin biosynthesis protein
SC2043        -3       14    0.471633  cobyrinic acid a,c-diamide synthase
SC2044        -2       16    0.298840  propanediol utilization transcriptional
SC2045        -1       20    2.048149  propanediol utilization propanediol diffusion
SC2046         1       29    4.827957  propanediol utilization polyhedral bodies
SC2047         1       30    5.061532  propanediol utilization polyhedral bodies
SC2048         1       30    4.704485  propanediol utilization dehydratase large
SC2049        -1       24    4.866922  propanediol utilization dehydratase, medium
SC2050         0       22    5.364691  propanediol utilization dehydratase small
SC2051         1       23    5.895096  propanediol utilization diol dehydratase
SC2052         3       22    5.556497  propanediol utilization diol dehydratase
SC2053         3       24    6.188569  propanediol utilization polyhedral bodies
SC2054         3       23    6.192384  propanediol utilization polyhedral bodies
SC2055         3       25    6.815785  hypothetical protein
SC2056         2       25    7.203985  hypothetical protein
SC2057         2       26    6.764188  propanediol utilization polyhedral bodies
SC2058         2       27    6.636791  hypothetical protein
SC2059         1       26    6.094693  propanediol utilization CoA-dependent
SC2060        -1       20    4.463272  propanediol utilization propanol dehydrogenase
SC2061        -1       16    3.285019  propanediol utilization polyhedral bodies
37  SC2087  SC2111  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
SC2087        -1       19   -7.728541  imidazole glycerol phosphate synthase subunit
SC2088         0       14   -5.780327  bifunctional phosphoribosyl-AMP
SC2089         0       21   -7.746242  regulator of length of O-antigen component of
SC2090         1       28  -10.511916  UDP-glucose/GDP-mannose dehydrogenase
SC2091         2       32  -12.207916  6-phosphogluconate dehydrogenase
SC2092         4       43  -15.645500  hypothetical protein
SC2093         4       42  -15.294315  phosphomannomutase
SC2094         5       52  -18.117696  phosphomannomutase
SC2095         3       40  -15.535976  hypothetical protein
SC2096         2       28  -11.001653  hypothetical protein
SC2097         1       20   -7.354091  hypothetical protein
SC2098         0       14   -4.092921  O-antigen polymerase
SC2099        -1       12   -0.565631  UTP-glucose-1-phosphate uridylyltransferase
SC2100         0       16    1.320590  colanic acid biosynthesis protein
SC2101        -1       22    2.880141  glycosyl transferase family protein
SC2102        -1       22    3.115914  pyruvyl transferase
SC2103        -1       26    3.937334  colanic acid exporter
SC2104        -1       29    5.483123  UDP-glucose lipid carrier transferase
SC2105        -1       33    6.335741  phosphomannomutase
SC2106        -1       25    4.849132  mannose-1-phosphate guanylyltransferase
SC2107         0       20    2.887556  glycosyl transferase family protein
SC2108         1       17   -0.694839  glycosyl transferase in colanic acid
SC2109         1       17   -0.588767  GDP fucose synthetase
SC2110         1       16   -1.616186  GDP-D-mannose dehydratase
SC2111         3       15   -2.918964  colanic acid biosynthesis acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC2089  IGASERPTASE  32  0.005  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 31.6 bits (71), Expect = 0.005
Identities = 17/84 (20%), Positives = 36/84 (42%), Gaps = 1/84 (1%)

Query: 156 STTAEGAQRRLAEYIQQVDEEVAKELEVDLKDNITLQTKTLQESLETQEVVAQEQKDLRI 215
+T R +A+ + + + EV + T +T+T E+ ET V +E+ +
Sbjct: 1058 ATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT-TETKETATVEKEEKAKVET 1116

Query: 216 KQIEEALRYADEAKITQPQIQQTQ 239
++ +E + + Q Q + Q
Sbjct: 1117 EKTQEVPKVTSQVSPKQEQSETVQ 1140


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC2109  NUCEPIMERASE  88  4e-22  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 88.3 bits (219), Expect = 4e-22
Identities = 64/344 (18%), Positives = 128/344 (37%), Gaps = 47/344 (13%)

Query: 5 RIFVAGHRGMVGSAIVRQLAQRG-------------DVEL------VLRTRD----ELDL 41
+ V G G +G + ++L + G DV L +L ++DL
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 42 LDGRAVQAFFAGAGIDQVYLAAAKVGGIVANNTYPADFIYENMMIESNIIHAAHLHNVNK 101
D + FA ++V+++ + + + P + N+ NI+ + +
Sbjct: 62 ADREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 102 LLFLGSSCIYPKLARQPMAESELLQGTLEPTNEPYAIAKIAGIKLCESYNRQYGRDYRSM 161
LL+ SS +Y + P + P + YA K A + +Y+ YG +
Sbjct: 121 LLYASSSSVYGLNRKMPFSTD---DSVDHPVS-LYAATKKANELMAHTYSHLYGLPATGL 176

Query: 162 MPTNLYGPHDNFHPDNSHVIPALLRRFHEAAQSHAPEVVVWGSGTPMREFLHVDDMAAAS 221
+YGP PD AL + + + +V + G R+F ++DD+A A
Sbjct: 177 RFFTVYGPWGR--PDM-----ALFKFTKAMLEGKSIDV--YNYGKMKRDFTYIDDIAEAI 227

Query: 222 IHVMELA----REVWQENTAPMLSH-----INVGTGVDCTIRELAQTIAKVVGYQGRVVF 272
I + ++ + E P S N+G + + Q + +G + +
Sbjct: 228 IRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNM 287

Query: 273 DAAKPDGTPRKLLDVTRLHQ-LGWYHEISLEAGLAGTYQWFLEN 315
+P D L++ +G+ E +++ G+ W+ +
Sbjct: 288 LPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDF 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC2110  NUCEPIMERASE  107  2e-28  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 107 bits (268), Expect = 2e-28
Identities = 81/361 (22%), Positives = 127/361 (35%), Gaps = 58/361 (16%)

Query: 6 LITGVTGQDGSYLAEFLLEKGYEVHGIKRRASSFNTERVDHIYQDPH--------SCNPK 57
L+TG G G ++++ LLE G++V GI + N Y D P
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGI----DNLND------YYDVSLKQARLELLAQPG 53

Query: 58 FHLHYGDLTDASNLTRILQEVQPDEVYNLGAMSHVAVSFESPEYTADVDAMGTLRLLEAI 117
F H DL D +T + + V+ V S E+P AD + G L +LE
Sbjct: 54 FQFHKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGC 113

Query: 118 RFLGLEKKTRFYQASTSELYGLVQEIPQKETTPF-YPRSPYAVAKLYAYWITVNYRESYG 176
R ++ AS+S +YGL +++P +P S YA K + Y YG
Sbjct: 114 RHNKIQ---HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYG 170

Query: 177 IYACNGILFNHESPRRGETFVTRKITRAIANIAQGLESCLYLGNMDSLRDWGHAKDYVRM 236
+ A F P K T+A+ G +Y RD+ + D
Sbjct: 171 LPATGLRFFTVYGPWGRPDMALFKFTKAMLE---GKSIDVY-NYGKMKRDFTYIDD---- 222

Query: 237 QWMMLQQEQPEDFVIATGVQYSVRQFVELAAAQLGIKLRFEGEGINEKGIVVSVTGHDAP 296
IA + +R + A + + V G+ +P
Sbjct: 223 --------------IAEAI---IRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSP 265

Query: 297 GVKPGDVIVAV--------DPRY--FRPAEVETLLGDPSKAHEKLGWKPEITLSEMVSEM 346
V+ D I A+ +P +V D +E +G+ PE T+ + V
Sbjct: 266 -VELMDYIQALEDALGIEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNF 324

Query: 347 V 347
V
Sbjct: 325 V 325


38  SC2125  SC2165  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
SC2125        -1       16    4.134442  hypothetical protein
SC2126        -1       16    4.061846  PAS/PAC domain/diguanylate
SC2127        -1       17    4.970146  3-methyladenine DNA glycosylase
SC2128        -1       17    4.487950  chaperone
SC2129        -1       14    3.318759  multidrug efflux system subunit MdtA
SC2130         0       14    3.111102  multidrug efflux system subunit MdtB
SC2131         0       13    1.785958  multidrug efflux system subunit MdtC
SC2132         0       14    0.852763  signal transduction histidine-protein kinase
SC2133         1       15   -2.525396  DNA-binding transcriptional regulator BaeR
SC2134         1       16   -3.162709  hypothetical protein
SC2135         1       17   -4.255471  inner membrane protein
SC2136         0       22   -6.607481  inner membrane protein
SC2137         3       44  -14.129314  hypothetical protein
SC2138         6       51  -16.403673  hypothetical protein
SC2139         7       52  -15.175050  hypothetical protein
SC2140        11       52  -13.666632  hypothetical protein
SC2141        10       55  -13.889267  hypothetical protein
SC2142        10       53  -14.329837  hypothetical protein
SC2143         9       54  -13.953462  hypothetical protein
SC2144        11       59  -16.404468  hypothetical protein
SC2145         8       50  -13.588003  hypothetical protein
SC2146         1       27   -8.855686  hypothetical protein
SC2147         1       24   -8.683125  hypothetical protein
SC2148         0       24   -6.751086  hypothetical protein
SC2149        -1       21   -5.065661  hypothetical protein
SC2150        -1       17   -2.798198  hypothetical protein
SC2151         0       16   -1.803121  protease
SC2152        -2       16   -2.675414  hypothetical protein
SC2153        -1       15    0.757625  inner membrane protein
SC2154        -1       14    2.261677  hypothetical protein
SC2155        -1       13    2.083902  lipid kinase
SC2156        -1       14    2.781295  fructose-bisphosphate aldolase
SC2157         0       14    4.066243  MFS family transporter
SC2158         2       17    4.914168  glycohydrolase
SC2159         2       19    1.057767  sugar kinase
SC2160         2       23   -3.411584  GntR family transcriptional regulator
SC2161         2       25   -4.919918  phosphomethylpyrimidine kinase
SC2162         1       30   -6.982287  hydroxyethylthiazole kinase
SC2163         3       35   -9.108858  hypothetical protein
SC2164         1       27   -7.128331  hypothetical protein
SC2165        -1       12   -3.849865  outer membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC2128  SHAPEPROTEIN  49  2e-08  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein

signature.
Length = 347

Score = 49.0 bits (117), Expect = 2e-08
Identities = 31/129 (24%), Positives = 56/129 (43%), Gaps = 20/129 (15%)

Query: 132 TMMVHIRHTAHSQ-LPEAITQAVIGRPINFQGLGGDDANRQAQGILERAAKRAGFQDVVF 190
M+ H HS + ++ P+ R+A + +A+ AG ++V
Sbjct: 89 KMLQHFIKQVHSNSFMRPSPRVLVCVPVGA-----TQVERRA---IRESAQGAGAREVFL 140

Query: 191 QYEPVAAGLDYEATLREEKRVLVVDIGGGTTDCSMLLMGPQWRQRADRENSLLGHSGCRV 250
EP+AA + + E +VVDIGGGTT+ +++ + ++ S R+
Sbjct: 141 IEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLN-----------GVVYSSSVRI 189

Query: 251 GGNDLDIAL 259
GG+ D A+
Sbjct: 190 GGDRFDEAI 198



Score = 35.9 bits (83), Expect = 2e-04
Identities = 25/81 (30%), Positives = 39/81 (48%), Gaps = 12/81 (14%)

Query: 377 ALDQPLARILEQVRLALDSAQEKPDV--------IYLTGGSARSPLIKKALSEQLPGIPV 428
AL +PL I+ V +AL+ Q P++ + LTGG A + + L E+ GIPV
Sbjct: 259 ALQEPLTGIVSAVMVALE--QCPPELASDISERGMVLTGGGALLRNLDRLLMEET-GIPV 315

Query: 429 AGGDD-FGSVTAGLARWAEVV 448
+D V G + E++
Sbjct: 316 VVAEDPLTCVARGGGKALEMI 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC2129  RTXTOXIND  42  3e-06  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 42.1 bits (99), Expect = 3e-06
Identities = 36/172 (20%), Positives = 71/172 (41%), Gaps = 10/172 (5%)

Query: 107 KVALAQAQGQLAKDNATLANARRDLARYQQ---LAKTNLVSRQELDAQQAL--VNETQGT 161
K A+ + + + + L + L + + AK +L + L + +T
Sbjct: 251 KHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDN 310

Query: 162 IKADEANVASAQLQLDWSRITAPVSGRV-GLKQVDVGNQISSSDTAGIVVITQTHPIDLI 220
I +A + + S I APVS +V LK G +++++T +V++ + +++
Sbjct: 311 IGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL-MVIVPEDDTLEVT 369

Query: 221 FTLPESDIATVVQAQKAGKTLVVEAWDRTNSHKL-SEGVLLSLDNQIDPTTG 271
+ DI + Q A + VEA+ T L + ++LD D G
Sbjct: 370 ALVQNKDIGFINVGQNA--IIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLG 419



Score = 40.6 bits (95), Expect = 9e-06
Identities = 20/122 (16%), Positives = 46/122 (37%), Gaps = 13/122 (10%)

Query: 63 GTVTAA-NTVTVRSRVDGQLIALHFQEGQQVNAGDLLAQIDPSQFKVALAQAQGQLAKDN 121
G +T + + ++ + + + +EG+ V GD+L ++ + + Q
Sbjct: 88 GKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQ------- 140

Query: 122 ATLANARRDLARYQQLAKTNLVSRQELDAQQALVNETQGTIKADEANVASAQLQLDWSRI 181
++L AR + RYQ L+++ EL+ L + + L +
Sbjct: 141 SSLLQARLEQTRYQILSRS-----IELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQF 195

Query: 182 TA 183
+
Sbjct: 196 ST 197


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC2130  ACRIFLAVINRP  886  0.0  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 886 bits (2290), Expect = 0.0
Identities = 291/1036 (28%), Positives = 503/1036 (48%), Gaps = 29/1036 (2%)

Query: 13 SRLFILRPVATTLLMAAILLAGIIGYRFLPVAALPEVDYPTIQVVTLYPGASPDVMTSAV 72
+ FI RP+ +L +++AG + LPVA P + P + V YPGA + V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 73 TAPLERQFGQMSGLKQMSSQS-SGGASVVTLQFQLTLPLDVAEQEVQAAINAATNLLPSD 131
T +E+ + L MSS S S G+ +TL FQ D+A+ +VQ + AT LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 132 LPNPPIYSKVNPADPPIMTLAVTSNAMPMTQVE--DMVETRVAQKISQVSGVGLVTLAGG 189
+ I S + +M S+ TQ + D V + V +S+++GVG V L G
Sbjct: 122 VQQQGI-SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 190 QRPAVRVKLNAQAVAALGLTSETVRTAITGANVNSAKGSLDGP------ERAVTLSANDQ 243
Q A+R+ L+A + LT V + N A G L G + ++ A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 244 MQSADEYRRLII-AYQNGAPVRLGDVATVEQGAENSWLGAWANQAPAIVMNVQRQPGANI 302
++ +E+ ++ + +G+ VRL DVA VE G EN + A N PA + ++ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 303 IATADSIRQMLPQLTESLPKSVKVTVLSDRTTNIRASVRDTQFELMLAIALVVMIIYLFL 362
+ TA +I+ L +L P+ +KV D T ++ S+ + L AI LV +++YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 363 RNIPATIIPGVAVPLSLIGTFAVMVFLDFSINNLTLMALTIATGFVVDDAIVVIENISRY 422
+N+ AT+IP +AVP+ L+GTFA++ +SIN LT+ + +A G +VDDAIVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 423 I-EKGEKPLAAALKGAGEIGFTIISLTFSLIAVLIPLLFMGDIVGRLFREFAVTLAVAIL 481
+ E P A K +I ++ + L AV IP+ F G G ++R+F++T+ A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 482 ISAVVSLTLTPMMCARML---SQQSLRKQNRFSRACERMFDRVIASYGRGLAKVLNHPWL 538
+S +V+L LTP +CA +L S + + F FD + Y + K+L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 539 TLSVAFATLLLSVMLWIVIPKGFFPVQDNGIIQGTLQAPQSSSYASMAQRQRQVAERILQ 598
L + + V+L++ +P F P +D G+ +Q P ++ + QV + L+
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 599 DPA--VQSLTTFVGVDGANSTLNSTRLQINLKPLDARDDR---VQQVISRLQTAVATIPG 653
+ V+S+ T G + N+ ++LKP + R+ + VI R + + I
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 654 VALYLQPTQDLTIDTQVSRTQYQFSLQ---ATTLDALSHWVPKL-QNALQSLPQLSEVSS 709
++ P I + T + F L DAL+ +L A Q L V
Sbjct: 660 G--FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 710 DWQDRGLAAWVNVDRDSASRLGISMADVDNALYNAFGQRLISTIYTQANQYRVVLEHNTA 769
+ + + VD++ A LG+S++D++ + A G ++ + ++ ++ +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 770 STPGLAALETIRLTSRDGGTVPLSAIARIEQRFAPLSINHLDQFPITTFSFNVPEGYSLD 829
++ + + S +G VP SA + + + P G S
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 830 DAVQAILDTEKTLALPADITTQFQGSTLAFQAALGSTVWLIVAAVVAMYIVLGVLYESFI 889
DA+ + + LPA I + G + + + L+ + V +++ L LYES+
Sbjct: 838 DAMALMENLAS--KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 890 HPITILSTLPTAGVGALLALIIAGSELDIIAIIGIILLIGIVKKNAIMMIDFALAAEREQ 949
P++++ +P VG LLA + + D+ ++G++ IG+ KNAI++++FA ++
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 950 GMSPRDAIFQACLLRFRPILMTTLAALLGALPLMLSTGVGAELRRPLGIAMVGGLLVSQV 1009
G +A A +R RPILMT+LA +LG LPL +S G G+ + +GI ++GG++ + +
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1010 LTLFTTPVIYLLFDRL 1025
L +F PV +++ R
Sbjct: 1016 LAIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC2131  ACRIFLAVINRP  880  0.0  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 880 bits (2275), Expect = 0.0
Identities = 282/1035 (27%), Positives = 503/1035 (48%), Gaps = 36/1035 (3%)

Query: 6 LFIYRPVATILIAAAITLCGILGFRLLPVAPLPQVDFPVIMVSASLPGASPETMASSVAT 65
FI RP+ ++A + + G L LPVA P + P + VSA+ PGA +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERSLGRIAGVNEMTSSS-SLGSTRIILEFNFDRDINGAARDVQAAINAAQSLLPGGMP 124
+E+++ I + M+S+S S GS I L F D + A VQ + A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKANPSDAPIMILTLTSES--WSQGKLYDFASTQLAQTIAQIDGVGDVDVGGSSL 182
+ S + +M+ S++ +Q + D+ ++ + T+++++GVGDV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVGLNPQALFNQGVSLDEVREAIDSANVRRPQGAIEDSV------HRWQIQTNDELK 236
A+R+ L+ L ++ +V + N + G + + I K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAAEYQPLIIHYN-NGAAVRLGDVASVTDSVQDVRNAGMTNAKPAILLMIRKLPEANIIQ 295
E+ + + N +G+ VRL DVA V ++ N KPA L I+ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDGIRAKLPELRAMIPAAIDLQIAQDRSPTIRASLQEVEETLAISVALVIMVVFLFLRS 355
T I+AKL EL+ P + + D +P ++ S+ EV +TL ++ LV +V++LFL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATLIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHL- 414
RATLIP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGMKPLQAALQGTREVGFTVISMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIGIS 474
E + P +A + ++ ++ +++ L AVF+P+ GG G + R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LVVSLTLTPMMCGWMLKSSKPRTQPRKRGVG----RLLVALQQGYGTSLKWVLNHTRLVG 530
++V+L LTP +C +LK K G Y S+ +L T
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 VVFLGTVALNIWLYIAIPKTFFPEQDTGVLMGGIQADQSISFQ----AMRGKLQDFMKII 586
+++ VA + L++ +P +F PE+D GV + IQ + + + ++K
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 587 RD-DPAVNNVTGFT-GGSRVNSGMMFITLKPRGER---KETAQQIIDRLRVKLAKEPGAR 641
+ +V V GF+ G N+GM F++LKP ER + +A+ +I R +++L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 642 LFLMAVQDIRVGGRQANASYQYTLLSDSLAALREWEPKIRKALSAL-----PQLADVNSD 696
+ + I G ++ L D + + R L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 697 QQDNGAEMNLIYDRDTMSRLGIDVQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRY 756
++ A+ L D++ LG+ + N ++ A G ++ K+ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 757 SQDISALEKMFVINRDGKAIPLSYFAQWRPANAPLSVNHQGLSAASTIAFNLPTGTSLSQ 816
++K++V + +G+ +P S F + + I GTS
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 817 ATEAINRTMTQLGVPPTVRGSFSGTAQVFQQTMNSQLILIVAAIATVYIVLGILYESYVH 876
A + ++L P + ++G + + + N L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 877 PLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRSGG 936
P++++ +P VG LLA LFN + ++G++ IG+ KNAI++V+FA + G
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 937 LTPAQAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQLL 996
+A A +R RPI+MT+LA + G LPL +S G GS + +GI ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 997 TLYTTPVVYLFFDRL 1011
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2132    BCTERIALGSPF  31     0.010    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 31.0 bits (70), Expect = 0.010
Identities = 20/66 (30%), Positives = 26/66 (39%), Gaps = 14/66 (21%)

Query: 187 RGLLAPVKRLVEGTHRLAAGDFTTRVTPTSADEL-----------GKLAQDFNQLASTLE 235
L+A V+ V H LA + P S + L G L N+LA E
Sbjct: 104 SQLMAAVRSKVMEGHSLAD---AMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTE 160

Query: 236 KNQQMR 241
+ QQMR
Sbjct: 161 QRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2133    HTHFIS        75     1e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 74.9 bits (184), Expect = 1e-17
Identities = 28/140 (20%), Positives = 65/140 (46%), Gaps = 2/140 (1%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLINHGDKLLPYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + ++ L ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDIPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVARVKTIL-RRC 128
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 129 KPQRELQQQDAESPLMIDES 148
+ +L+ + ++ S
Sbjct: 124 RRPSKLEDDSQDGMPLVGRS 143


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2152    PF05932       86     8e-25    Tir chaperone protein (CesT)
		>PF05932#Tir chaperone protein (CesT)

Length = 127

Score = 86.0 bits (213), Expect = 8e-25
Identities = 30/118 (25%), Positives = 46/118 (38%), Gaps = 3/118 (2%)

Query: 6 DRLLRQFSLKLNTDSIVFDENRLCSFIIDNRYRI-LLTSTNSEYIMIYGFCGRPPDNNNL 64
LL FS L +VFD++ C+ IIDN + + L E +++ G P +
Sbjct: 7 KTLLDDFSRSLEMQPLVFDDHGTCNMIIDNTFALTLSCDYARERLLLIGLLE--PHKDIP 64

Query: 65 AFEFLNANLWFAENNGPHLCYDNNSQSLLLALNFSLNESSVEKLECEIEVVIRSMENL 122
L L N GP L D S + + SV L+ E+ ++ M
Sbjct: 65 QQCLLAGALNPLLNAGPGLGLDEKSGLYHAYQSIPREKLSVPTLKREMAGLLEWMRGW 122


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2157    TCRTETA       37     1e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 37.1 bits (86), Expect = 1e-04
Identities = 33/153 (21%), Positives = 52/153 (33%), Gaps = 20/153 (13%)

Query: 253 FSEIFFMLALPFFTKRFGIKKVLLLGLITAAIRYGFFVYGGAETYFTYALLFLGILLHGV 312
+ L + RFG + VLL+ L AA+ Y +L++G ++ G+
Sbjct: 54 LMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAP-----FLWVLYIGRIVAGI 108

Query: 313 SYDFYYVTAYIYVDKKAPVHMRTAAQGLITLCCQGFGSLLGYRLGGVMMEKMFAYPQPVN 372
+ V D R G ++ C GFG + G LGG+M P
Sbjct: 109 TGATGAVAGAYIADI-TDGDERARHFGFMS-ACFGFGMVAGPVLGGLMGGFSPHAP---- 162

Query: 373 GLTFNWAGMWTFGAVMIAVIALLFMIFFRESDK 405
+ A + + L ES K
Sbjct: 163 ---------FFAAAALNGLNFLTGCFLLPESHK 186



Score = 32.5 bits (74), Expect = 0.003
Identities = 55/286 (19%), Positives = 93/286 (32%), Gaps = 17/286 (5%)

Query: 29 LNKSGFSAGEIGWSYACTAIAAILSPILVGSVTDRFFSAQKVLAVLMFAGAVLMYFAAQQ 88
L S G A A+ ++G+++DRF ++ + ++ AGA + Y
Sbjct: 35 LVHSNDVTAHYGILLALYALMQFACAPVLGALSDRF--GRRPVLLVSLAGAAVDYAI--- 89

Query: 89 TTFAGFFPLLLAYSLTYMPTIALTNSIAFANVPDVERDFPRIRVMGTIG-WIASGLACGF 147
A F +L + T A T ++A A + D+ R R G + G+ G
Sbjct: 90 MATAPFLWVLYIGRIVAGITGA-TGAVAGAYIADITDGDERARHFGFMSACFGFGMVAG- 147

Query: 148 LPQMLGY-NDISPTNTPLLITAASSALLGVFAFCLPDTPPKSTGKMDIKVMLGLDALVLL 206
P + G SP + P AA + L + L K + + L A
Sbjct: 148 -PVLGGLMGGFSP-HAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRW 205

Query: 207 RDKN------FLVFFFCSFLFAMPLAFYYIFANGYLTEVGMKNATGWMTLGQFSEIFFML 260
VFF + +P A + IF G + +
Sbjct: 206 ARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAM 265

Query: 261 ALPFFTKRFGIKKVLLLGLITAAIRYGFFVYGGAETYFTYALLFLG 306
R G ++ L+LG+I Y + ++ L
Sbjct: 266 ITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLA 311


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2163    TYPE3OMGPROT  27     0.019    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 26.8 bits (59), Expect = 0.019
Identities = 10/43 (23%), Positives = 17/43 (39%), Gaps = 7/43 (16%)

Query: 3 SKLLPCALLLATSFAWAAPA-------TTGIDQYELKSFIADF 38
++L LLL +S++WA L+ + DF
Sbjct: 10 KRVLTGTLLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDF 52


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2165    PF00577       681    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 681 bits (1759), Expect = 0.0
Identities = 246/839 (29%), Positives = 389/839 (46%), Gaps = 26/839 (3%)

Query: 2 LRMTPIASLVLLTLFTWQTQAIATETFDTHFMVGGMRDQKITNFHLDENKPIPGQYELDI 61
L + V + A F+ F+ + + + + PG Y +DI
Sbjct: 23 LAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDI 82

Query: 62 YVNNQWRGKYDIIVADEPGST----CISTELLKNIGVISDGLQPQ---GATDCIALKDVV 114
Y+NN + D+ C++ L ++G+ + + C+ L ++
Sbjct: 83 YLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMI 142

Query: 115 RSGGYTFNIGVFRLDLSVPQAYVNEVEAGYVLPENWDRGINAFYTSYYASQYYSDYKNSG 174
++G RL+L++PQA+++ GY+ PE WD GINA +Y S + G
Sbjct: 143 HDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGG 202

Query: 175 SSESTYVRFNSGFNLLGWQAHADTTFNKTD-----GSSGEWKSNTLYLERGIAELLGTLR 229
+S Y+ SG N+ W+ +TT++ GS +W+ +LER I L L
Sbjct: 203 NSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLT 262

Query: 230 AGDQYTSSEIFDSVRFTGVRLFRDMQMLPNSKQNFTPLVQGIAQTNALVTIEQNGFVVYQ 289
GD YT +IFD + F G +L D MLP+S++ F P++ GIA+ A VTI+QNG+ +Y
Sbjct: 263 LGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYN 322

Query: 290 KEVPPGPFSIADLQLAGGGADLDVTVREADGSINTWLVPYASVPNMLQPGVSKYDFSAGR 349
VPPGPF+I D+ AG DL VT++EADGS + VPY+SVP + + G ++Y +AG
Sbjct: 323 STVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGE 382

Query: 350 SHIEGADNQAD-FTQISYQYGLNNLLTLYGGTMLSNHYNAFTLGTGWNT-RIGAISLDAT 407
A + F Q + +GL T+YGGT L++ Y AF G G N +GA+S+D T
Sbjct: 383 YRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMT 442

Query: 408 RSHSKQDNGDVFDGQSYQIAYNKYLTQTLTRFGLAAYRYSSQDYRTFNDHVWANNKNNYR 467
+++S + DGQS + YNK L ++ T L YRYS+ Y F D ++
Sbjct: 443 QANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNI 502

Query: 468 RDKNDVYDI----ADYYQNDFGRKNTFSANVSQSLPEGWGAVSLSALWRDYWGRSGTSKD 523
++ V + DYY + ++ V+Q L + LS + YWG S +
Sbjct: 503 ETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQTYWGTSNVDEQ 561

Query: 524 YQISYSNTFQKINYTLSASQTYDE-DHNEDKRFNLFISIPFD--WGDGITTPRRHLNVSN 580
+Q + F+ IN+TLS S T + D+ L ++IPF + RH + S
Sbjct: 562 FQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASY 621

Query: 581 STTFDDDGFTSNNIGLTGTAGSRDQFNYGVNVSH---QRHDSETTAGTNLTWNTPVATLN 637
S + D +G +N G+ GT + +Y V + +S +T L + N
Sbjct: 622 SMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNAN 681

Query: 638 GSYSQSSNYTQTGGSISGGVVAWSGGLNLSSRLSDTFAIMQAPGLEGAYVNGQKYRTTNK 697
YS S + Q +SGGV+A + G+ L L+DT +++APG + A V Q T+
Sbjct: 682 IGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDW 741

Query: 698 KGTVVYDNLTPYRENHLMLDVSQSSSEAELRGNRKVAAPYRGAVVLVNFDTDQRKPWFIK 757
+G V T YREN + LD + + +L P RGA+V F +
Sbjct: 742 RGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMT 801

Query: 758 AQRPDGSPLIFGYDVVDHHGHNVGIVGQGSQLFIRTNDIPPEVSVPVDKEQGLSCSITF 816
+ PL FG V + GIV Q+++ + +V V +E+ C +
Sbjct: 802 L-THNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANY 859


39  SC2175  SC2200  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC2175    -1      16       3.087272   sensor/kinase in regulatory system
SC2176    -2      17       4.131906   MerR family transcriptional regulator
SC2177    -2      16       3.480674   hypothetical protein
SC2178    -2      14       2.912111   ABC-type proline/glycine betaine transport
SC2179    -1      13       2.322371   proline/glycine betaine ABC transporter ATPase
SC2180    -2      14       2.041407   ABC-type proline/glycine betaine transport
SC2181    0       14       1.536020   ABC transporter substrate-binding protein
SC2182    0       14       1.454650   beta-D-glucoside glucohydrolase, periplasmic
SC2183    1       15       2.219357   D-lactate dehydrogenase
SC2184    3       16       1.847599   D-alanyl-D-alanine endopeptidase
SC2185    2       16       1.385064   hypothetical protein
SC2186    3       18       1.904286   DedA family membrane protein
SC2187    2       19       2.501303   acetoin dehydrogenase
SC2188    1       18       2.702519   multidrug resistance outer membrane protein
SC2189    1       19       1.792075   hypothetical protein
SC2190    0       17       2.988027   hypothetical protein
SC2191    0       15       4.117316   tRNA-dihydrouridine synthase C
SC2192    -2      14       3.693213   salicylate hydroxylase
SC2193    -2      14       2.376550   glutathione S-transferase
SC2194    -1      14       2.350150   glutathione S-transferase
SC2195    1       16       2.618803   1,2-dioxygenase
SC2196    2       16       1.896509   sugar transporter
SC2197    2       18       -0.261339  LysR family transcriptional regulator
SC2198    3       20       -1.321104  hypothetical protein
SC2199    3       17       -0.961976  hypothetical protein
SC2200    3       15       -1.492421  cytidine deaminase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2175    PF06580       220    5e-69    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 220 bits (561), Expect = 5e-69
Identities = 60/216 (27%), Positives = 116/216 (53%), Gaps = 3/216 (1%)

Query: 328 LGEGIAQLLSAQILAGQYERQKALLTQSEIKLLHAQVNPHFLFNALNTIKAVIRRDSEQA 387
L G + + + ++ ++++ L AQ+NPHF+FNALN I+A+I D +A
Sbjct: 134 LYFGWHFFKNYKQAEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKA 193

Query: 388 SQLVQYLSTFFRKNLKR-PSEIVTLADEIEHVNAYLQIEKARFQSRLQVQLDVPSTLSRQ 446
+++ LS R +L+ + V+LADE+ V++YLQ+ +F+ RLQ + + +
Sbjct: 194 REMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDV 253

Query: 447 KLPAFTLQPIVENAIKHGTSQLLDTGNVAIRARREGQHLMLDIEDNAGLYQPSAG-SSGL 505
++P +Q +VEN IKHG +QL G + ++ ++ + L++E+ L + S+G
Sbjct: 254 QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGT 313

Query: 506 GMSLVDKRLREHFGDDYGISVACEPDCFTRITLRLP 541
G+ V +RL+ +G + I ++ + + +P
Sbjct: 314 GLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2184    BLACTAMASEA   37     5e-05    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 37.5 bits (87), Expect = 5e-05
Identities = 38/180 (21%), Positives = 68/180 (37%), Gaps = 6/180 (3%)

Query: 11 LALMLAVPFAPQAVAKTAATTAASQPEIASGSAMI-VDLNTNKVIYSNHPDLVRPIASIT 69
++L+ +P A A + S+ +++ MI +DL + + + + D P+ S
Sbjct: 9 ISLLATLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADERFPMMSTF 68

Query: 70 KLMTAMVVLDARLPLDEILKVDISQTPEMKGVYSRV---RLNSEISRKNMLLLALMSSEN 126
K++ VL DE L+ I + YS V L ++ + A+ S+N
Sbjct: 69 KVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCAAAITMSDN 128

Query: 127 RAAASLAHYY--PGGYNAFIKAMNAKAKALGMTHTRFVEPTGLSIHNVSTARDLTKLLIA 184
AA L P G AF++ + L T E + +T + L
Sbjct: 129 SAANLLLATVGGPAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPASMAATLRK 188


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2186    BCTERIALGSPF  27     0.031    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 27.5 bits (61), Expect = 0.031
Identities = 8/39 (20%), Positives = 16/39 (41%), Gaps = 1/39 (2%)

Query: 152 WLHDLDQHLRH-GVWLILAIVLVVGVRWWLKRRGKAEAR 189
L + +R G W++LA++ + R+ K
Sbjct: 215 VLMGMSDAVRTFGPWMLLALLAGFMAFRVMLRQEKRRVS 253


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2187    DHBDHDRGNASE  112    4e-32    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 112 bits (281), Expect = 4e-32
Identities = 68/253 (26%), Positives = 116/253 (45%), Gaps = 12/253 (4%)

Query: 3 KVAIVTASDSGIGKACALLLAQNGFDIGITWHSDERGAQETAKKAAQFGVRAETIHLDLS 62
K+A +T + GIG+A A LA G I ++ E+ + + A+ AE D+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE-ARHAEAFPADVR 67

Query: 63 QLPEGAQAIEYLIQRLGRVDVLVNNAGAMTKSAFIDMPFTQWRQIFTVDVDGAFLCAQIA 122
+ + + +G +D+LVN AG + + +W F+V+ G F ++
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 123 ARHMIKQGEGGRIINITSVHEHTPLPQASAYTAAKHALGGLTKSMALELIEHHILVNAVA 182
+++M+ + G I+ + S P +AY ++K A TK + LEL E++I N V+
Sbjct: 128 SKYMMDR-RSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 183 PGAIATPM-------NDMDDSDIEPGSEP---SIPIARPGSTHEIASLVAWLCSEGASYT 232
PG+ T M + + I+ E IP+ + +IA V +L S A +
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 233 TGQSLIVDGGFML 245
T +L VDGG L
Sbjct: 247 TMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2196    TCRTETB       53     1e-09    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 53.3 bits (128), Expect = 1e-09
Identities = 66/402 (16%), Positives = 142/402 (35%), Gaps = 19/402 (4%)

Query: 22 RVIICCFLVVMLDGFDTAAIGFIAPDIRTHWQLSASELAPLFGAGLLGLTAGALLCGPLA 81
+++I ++ + + PDI + + + A +L + G + G L+
Sbjct: 14 QILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLS 73

Query: 82 DRFGRKRVIELCVALFGALSLLSAFS-PDIETLVLLRFLTGLGLGGAMPNTIT-MTSEYL 139
D+ G KR++ + + S++ L++ RF+ G G A P + + + Y+
Sbjct: 74 DQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAG-AAAFPALVMVVVARYI 132

Query: 140 PARRRGALVTLMFCGFTLGSATGGIVSAQLVPLIGWHGILALGGILPLMLFFGLLFALPE 199
P RG L+ +G G + + I W +L + I + + F L+ L +
Sbjct: 133 PKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPF-LMKLLKK 191

Query: 200 SPRWQVRRQLPQAV---------VARTVSAITGERYLDTQFFLHETAAIAKGSI----RQ 246
R + + + + T S + FL I K +
Sbjct: 192 EVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPG 251

Query: 247 LFAGRQLVITLMLWVVFFMSLLIIYLLSSWMPTLLNHRGIDLQQASWVTAAFQVGGTLGA 306
L +I ++ + F ++ + +M ++ + + G
Sbjct: 252 LGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFG- 310

Query: 307 LLLGVLMDRLNPFRVLAVSYALGAVCIVMIGLSENG-LWLMALAIFGTGIGISGSQVGLN 365
+ G+L+DR P VL + +V + W M + I G+S ++ ++
Sbjct: 311 YIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVIS 370

Query: 366 ALTATLYPTQSRATGVSWSNAIGRCGAIVGSLSGGMMMALNF 407
+ ++ Q G+S N G G ++++
Sbjct: 371 TIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPL 412



Score = 41.8 bits (98), Expect = 4e-06
Identities = 40/169 (23%), Positives = 73/169 (43%), Gaps = 1/169 (0%)

Query: 251 RQLVITLMLWVVFFMSLLIIYLLSSWMPTLLNHRGIDLQQASWVTAAFQVGGTLGALLLG 310
R I + L ++ F S+L +L+ +P + N +WV AF + ++G + G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 311 VLMDRLNPFRVLAVSYALGAVCIVMIGLSENGLWLMALAIFGTGIGISGSQVGLNALTAT 370
L D+L R+L + V+ + + L+ +A F G G + + + A
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVAR 130

Query: 371 LYPTQSRATGVSWSNAIGRCGAIVGSLSGGMMM-ALNFSFDTLFFVIAI 418
P ++R +I G VG GGM+ +++S+ L +I I
Sbjct: 131 YIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITI 179


40  SC2245  SC2266  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC2245    0       17       -3.351128  alkaline phosphatase
SC2246    -2      24       -4.085420  *bacteriophage tail fiber assembly protein
SC2247    0       20       1.881274   tail fiber protein of phage
SC2248    2       24       3.294200   hypothetical protein
SC2249    2       33       7.443001   virulence protein MsgA-like protein
SC2250    2       39       8.882415   outer membrane protein
SC2251    3       47       11.739238  transcriptional regulator NarP
SC2252    5       53       13.401593  heme lyase subunit, cytochrome c-type
SC2253    6       54       13.616094  heme lyase disulfide oxidoreductase, cytochrome
SC2254    7       56       15.033200  cytochrome c-type biogenesis protein
SC2255    3       47       12.237345  cytochrome c-type biogenesis protein CcmE
SC2256    2       42       11.193245  heme exporter protein C, cytochrome c-type
SC2257    1       35       8.654568   heme ABC exporter
SC2258    1       31       7.770773   heme ABC exporter
SC2259    0       22       5.003647   cytochrome c biogenesis protein CcmA
SC2260    0       16       3.205861   cytochrome c-type protein NapC
SC2261    1       16       3.592082   citrate reductase cytochrome c-type subunit
SC2262    0       15       3.109808   quinol dehydrogenase membrane component
SC2263    -1      16       3.654175   quinol dehydrogenase periplasmic component
SC2264    -1      15       3.913830   nitrate reductase catalytic subunit
SC2265    -1      16       3.920623   periplasmic nitrate reductase
SC2266    -2      16       3.103829   ecotin
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2251    HTHFIS        67     3e-15    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 67.2 bits (164), Expect = 3e-15
Identities = 23/114 (20%), Positives = 48/114 (42%), Gaps = 2/114 (1%)

Query: 9 VLIVDDHPLMRRGIRQLLELDPAFHVVAEAGDGASAIDLANRIEPDLILLDLNMKGLSGL 68
+L+ DD +R + Q L A + V + A+ + DL++ D+ M +
Sbjct: 6 ILVADDDAAIRTVLNQALSR--AGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 69 DTLNALRRDGVTAQIIILTVSDSASDIYALIDAGADGYLLKDSDPEVLLEAIRK 122
D L +++ +++++ ++ + GA YL K D L+ I +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2255    PF04335       29     0.006    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 29.0 bits (65), Expect = 0.006
Identities = 10/30 (33%), Positives = 12/30 (40%)

Query: 1 MNLRRKNRLWVVCAVLAGLALTTALVLYAL 30
R K WVV V LA + + AL
Sbjct: 27 AAERSKKLAWVVAGVAGALATAGVVAVAAL 56


41  SC2301  SC2328  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC2301    -1      14       4.089155   hypothetical protein
SC2302    0       12       4.018378   4-amino-4-deoxy-L-arabinose transferase
SC2303    1       14       5.355144   inner membrane protein
SC2304    1       13       5.637900   hypothetical protein
SC2305    0       11       5.531462   O-succinylbenzoic acid--CoA ligase
SC2306    -1      11       4.916259   O-succinylbenzoate synthase
SC2307    -1      13       4.230292   naphthoate synthase
SC2308    0       13       4.036776   acyl-CoA thioester hydrolase
SC2309    -1      12       2.867089   2-succinyl-5-enolpyruvyl-6-hydroxy-3-
SC2310    -1      12       0.850751   menaquinone-specific isochorismate synthase
SC2311    -1      12       1.565958   hypothetical protein
SC2312    -1      13       2.004441   hypothetical protein
SC2313    0       18       2.635426   ribonuclease Z
SC2314    0       20       2.616994   chemotaxis signal transduction protein
SC2315    1       23       3.214225   von Willebrand factor A
SC2316    3       30       4.037212   NADH dehydrogenase subunit N
SC2317    2       30       3.026222   NADH dehydrogenase subunit M
SC2318    1       28       3.950861   NADH dehydrogenase subunit L
SC2319    -1      29       4.015445   NADH dehydrogenase subunit K
SC2320    -1      29       4.055520   NADH dehydrogenase subunit J
SC2321    0       28       3.981315   NADH dehydrogenase subunit I
SC2322    -1      27       3.948289   NADH dehydrogenase subunit H
SC2323    -1      26       4.045918   NADH dehydrogenase subunit G
SC2324    -2      23       2.711488   NADH dehydrogenase I subunit F
SC2325    -2      16       1.294344   NADH dehydrogenase subunit E
SC2326    -2      15       -0.096880  bifunctional NADH:ubiquinone oxidoreductase
SC2327    -2      17       -2.502643  NADH dehydrogenase subunit B
SC2328    -1      14       -3.239001  NADH dehydrogenase subunit A
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2312    AUTOINDCRSYN  30     0.002    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 30.2 bits (68), Expect = 0.002
Identities = 11/74 (14%), Positives = 27/74 (36%), Gaps = 12/74 (16%)

Query: 1 MIDWQDLHHSELTVPQLYALLKLRCAVFV--------VEQRCPYLDVDGDDLVGDNRHIL 52
M++ D++H+ L+ + L LR F + D + + ++
Sbjct: 1 MLEIFDVNHTLLSETKSGELFTLRKETFKDRLNWAVQCTDGMEFDQYDNN----NTTYLF 56

Query: 53 GWHQDELVAYARIL 66
G + ++ R +
Sbjct: 57 GIKDNTVICSLRFI 70


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2314    HTHFIS        47     5e-08    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 47.1 bits (112), Expect = 5e-08
Identities = 32/135 (23%), Positives = 56/135 (41%), Gaps = 16/135 (11%)

Query: 185 PGAVAIVAEDSKVARAMLEKGLNAMGIPHQMHVTGKDAWERIQQLAQEAEAEGKPISEKI 244
GA +VA+D R +L + L+ G ++ W I +
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIA-------------AGDG 48

Query: 245 ALVLTDLEMPEMDGFTLTRKIKTDERLKKIPVVIHSSLSGSANEDHIRKVKADGYVAK-F 303
LV+TD+ MP+ + F L +IK + +PV++ S+ + + A Y+ K F
Sbjct: 49 DLVVTDVVMPDENAFDLLPRIK--KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106

Query: 304 EINELSSVIQEMLER 318
++ EL +I L
Sbjct: 107 DLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2323    SECA          32     0.011    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 32.2 bits (73), Expect = 0.011
Identities = 46/189 (24%), Positives = 70/189 (37%), Gaps = 36/189 (19%)

Query: 474 VDGIDSDLQNKIDVIVQALAGAKKPLIISGTNAGSSEVIQAAANVAKALKGRGADVGITM 533
VD +DS L ID A+ PLIISG SSE+ + + L + + T
Sbjct: 208 VDEVDSIL---ID-------EARTPLIISGPAEDSSEMYKRVNKIIPHLIRQEKEDSETF 257

Query: 534 IA----------RSVNSMGLGM-------MGGGSLDDALGELETGNADAVVVLENDLHRH 576
R VN G+ + G +D+ N + + L H
Sbjct: 258 QGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYSPANIMLMHHVTAALRAH 317

Query: 577 ASATRVNAALAKAPLVMVVDHQRTAIMENAHLV--LSAASFAESDGTVINNEGRA----- 629
A TR + K V++VD M+ L A A+ +G I NE +
Sbjct: 318 ALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAK-EGVQIQNENQTLASIT 376

Query: 630 -QRFFQVYD 637
Q +F++Y+
Sbjct: 377 FQNYFRLYE 385


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2327    FLGBIOSNFLIP  28     0.019    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 28.3 bits (63), Expect = 0.019
Identities = 18/56 (32%), Positives = 25/56 (44%), Gaps = 3/56 (5%)

Query: 68 MVTSFT---AVHDVARFGAEVLRASPRQADLMVVAGTCFTKMAPVIQRLYDQMLEP 120
M+TSFT V + R A P Q L + F M+PVI ++Y +P
Sbjct: 60 MMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQP 115


42  SC2445  SC2462  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC2445    2       14       1.287508   hypothetical protein
SC2446    1       13       0.069054   hypothetical protein
SC2447    2       16       0.200634   hypothetical protein
SC2448    0       12       0.092916   acetyltransferase
SC2449    0       12       1.100648   N-acetylmuramoyl-L-alanine amidase
SC2450    -2      16       2.621557   coproporphyrinogen III oxidase
SC2451    -1      20       2.649702   inner membrane protein
SC2452    0       21       4.809456   hypothetical protein
SC2453    0       22       5.211154   transcriptional regulator EutR
SC2454    1       25       6.197377   carboxysome structural protein, ethanolamine
SC2455    2       25       6.134780   ethanolamine ammonia-lyase, heavy chain
SC2456    4       24       6.893501   hypothetical protein
SC2457    3       22       8.071873   transport protein in ethanolamine utilization
SC2458    2       21       7.232229   heatshock protein (Hsp70)
SC2459    2       23       6.579057   aldehyde oxidoreductase in ethanolamine
SC2460    1       21       5.521287   detox protein in ethanolamine utilization
SC2461    1       20       4.908247   detox protein in ethanolamine utilization
SC2462    0       16       3.976561   phosphotransacetylase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2448    SACTRNSFRASE  33     2e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 33.0 bits (75), Expect = 2e-04
Identities = 16/102 (15%), Positives = 39/102 (38%), Gaps = 4/102 (3%)

Query: 24 LRPWNDPEMDIERKVNHDVSLFLVAEVSGEVVG--TVMGGYDGHRGSAYYLGVHPEFRGR 81
+ + D +MD+ + FL + +G + ++G + V ++R +
Sbjct: 47 FKQYEDDDMDVSYVEEEGKAAFL-YYLENNCIGRIKIRSNWNG-YALIEDIAVAKDYRKK 104

Query: 82 GIANALLNRLEKKLIARGCPKIQIMVRDDNDVVLGMYERLGY 123
G+ ALL++ + + + +D N Y + +
Sbjct: 105 GVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2458    SHAPEPROTEIN  49     9e-09    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 48.6 bits (116), Expect = 9e-09
Identities = 32/116 (27%), Positives = 51/116 (43%), Gaps = 9/116 (7%)

Query: 64 VRDGIVWDFFGAVTLVRRHLDTLEQQLGCRFT-HAATSFPPGTDP---RISINVLESAGL 119
++DG++ DFF +++ + + R + P G R + AG
Sbjct: 76 MKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGA 135

Query: 120 EVSHVLDEPTAVA---DLLALDNAG--VVDIGGGTTGIAIVKQGKVTYSADEATGG 170
+++EP A A L + G VVDIGGGTT +A++ V YS+ GG
Sbjct: 136 REVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGG 191


43  SC2475  SC2486  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC2475    -2      11       3.813942   oxidoreductase Fe-S binding subunit
SC2476    -3      10       3.432691   aminoglycoside/multidrug efflux system
SC2477    -2      10       3.655695   hypothetical protein
SC2478    -2      10       4.114508   succinyl-diaminopimelate desuccinylase
SC2479    -1      13       4.145583   hypothetical protein
SC2480    -1      11       3.826417   hypothetical protein
SC2481    0       12       0.522930   hypothetical protein
SC2482    -1      13       1.160006   phosphoribosylaminoimidazole-succinocarboxamide
SC2483    0       12       1.945621   lipoprotein
SC2484    1       16       0.239045   dihydrodipicolinate synthase
SC2485    1       18       0.588917   glycine cleavage system transcriptional
SC2486    2       14       1.703858   thioredoxin-dependent thiol peroxidase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2476    ACRIFLAVINRP  1274   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1274 bits (3298), Expect = 0.0
Identities = 647/1032 (62%), Positives = 797/1032 (77%), Gaps = 2/1032 (0%)

Query: 1 MANFFIDRPIFAWVLAILLCLTGALAIFSLPVEQYPDLAPPNVRITANYPGASAQTLENT 60
MANFFI RPIFAWVLAI+L + GALAI LPV QYP +APP V ++ANYPGA AQT+++T
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMTGLDNLMYMSSQSSGTGQATITLSFIAGTAPDEAVQQVQNQLQSAMRKLPQ 120
VTQVIEQNM G+DNLMYMSS S G TITL+F +GT PD A QVQN+LQ A LPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 AVQDQGVTVRKTGDTNILTIAFVSTDGSMDKQDIADYVASNIQDPLSRVNGVGDIDAYGS 180
VQ QG++V K+ + ++ FVS + + DI+DYVASN++D LSR+NGVGD+ +G+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYSMRIWLDPAKLNSFQMTTKDVTDAIESQNAQIAVGQLGGTPSVDKQALNATINAQSLL 240
QY+MRIWLD LN +++T DV + ++ QN QIA GQLGGTP++ Q LNA+I AQ+
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 QTPQQFRDITLCVNQDGSEVKLGDVATVELGAEKYDYLSRFNGNPASGLGVKLASGANEM 300
+ P++F +TL VN DGS V+L DVA VELG E Y+ ++R NG PA+GLG+KLA+GAN +
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 ATAKLVLDRLNELAQYFPHGLEYKIAYETTSFVKASIIDVVKTLLEAIALVFLVMYLFLQ 360
TAK + +L EL +FP G++ Y+TT FV+ SI +VVKTL EAI LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLMGTFSVLYAFGYSINTLTMFAMVLAIGLLVDDAIVVVENVERIM 420
N RATLIPTIAVPVVL+GTF++L AFGYSINTLTMF MVLAIGLLVDDAIVVVENVER+M
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 SEEGLTPREATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGTTGAIYRQFSITIVSAMVL 480
E+ L P+EAT KSM QIQGALVGIAMVLSAVF+PMAFFGG+TGAIYRQFSITIVSAM L
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVAMILTPALCATLLKPLHKGEQHGQRGFFGWFNRTFNRNAERYEKGVAKILHRSLRW 540
SVLVA+ILTPALCATLLKP+ + GFFGWFN TF+ + Y V KIL + R+
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 541 ILIYVLLLGGMVFLFLRLPTSFLPQEDRGMFTTSIQLPSGSTQQQTLKVVEKVENYYFTH 600
+LIY L++ GMV LFLRLP+SFLP+ED+G+F T IQLP+G+TQ++T KV+++V +YY +
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 601 EKNNIMSVFSTVGSGPGGNGQNVARMFVRLKDWDARDPTTGSSFAIIERATKAFNQIKEA 660
EK N+ SVF+ G G QN FV LK W+ R+ S+ A+I RA +I++
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 661 RVFASSPPAISGLGSSAGFDMELQDHAGAGHDALMAARDQLIELAGKN-SSLTRVRHNGL 719
V + PAI LG++ GFD EL D AG GHDAL AR+QL+ +A ++ +SL VR NGL
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 720 DDSPQLQIDIDQRKAQALGVSIDDINDTLQTAWGSSYVNDFMDRGRVKKVYVQAAAKYRM 779
+D+ Q ++++DQ KAQALGVS+ DIN T+ TA G +YVNDF+DRGRVKK+YVQA AK+RM
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 780 LPDDINLWYVRNKDGGMVPFSAFATSRWETGSPRLERYNGYSAVEIVGEAAPGVSTGTAM 839
LP+D++ YVR+ +G MVPFSAF TS W GSPRLERYNG ++EI GEAAPG S+G AM
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 840 DVMESLVHQLPGGFGLEWTAMSYQERLSGAQAPALYAISLLVVFLCLAALYESWSVPFSV 899
+ME+L +LP G G +WT MSYQERLSG QAPAL AIS +VVFLCLAALYESWS+P SV
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 900 MLVVPLGVIGALLATWMRGLENDVYFQVGLLTVIGLSAKNAILIVEFANE-MNQKGHALL 958
MLVVPLG++G LLA + +NDVYF VGLLT IGLSAKNAILIVEFA + M ++G ++
Sbjct: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960

Query: 959 DATLYASRQRPRPILMTSLAFIFGVLPMATSTGAGSGSQHAVGTGVTGGMISATILAIFF 1018
+ATL A R R RPILMTSLAFI GVLP+A S GAGSG+Q+AVG GV GGM+SAT+LAIFF
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 1019 VPLFFVLIRRRF 1030
VP+FFV+IRR F
Sbjct: 1021 VPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2480    AUTOINDCRSYN  32     0.005    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 32.1 bits (73), Expect = 0.005
Identities = 24/122 (19%), Positives = 50/122 (40%), Gaps = 13/122 (10%)

Query: 459 SRIAVHPARQREGIGQQLIVCACMQAAQCDYLSVSFGYT-------PELWRFWQRCGFVL 511
SR V +R ++ +G + + + + + +Y S GY + +R G+
Sbjct: 100 SRFFVDKSRAKDILGNEYPISSMLFLSMINY-SKDKGYDGIYTIVSHPMLTILKRSGWG- 157

Query: 512 VRMGNHREASSGCYTAMALLPLSDAG-KRLAQQEHRRLRRDADILTQWNGEAIPLAALDE 570
+R+ + + LP+ D + LA++ +R ++ L QW + + A
Sbjct: 158 IRVVEQGLSEKEERVYLVFLPVDDENQEALARRINRSGTFMSNELKQW---PLRVPAAIA 214

Query: 571 QA 572
QA
Sbjct: 215 QA 216


44  SC2497  SC2511  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC2497    0       11       -3.418317  phosphoribosylglycinamide formyltransferase
SC2498    0       11       -3.973486  polyphosphate kinase
SC2499    2       18       -4.738037  exopolyphosphatase
SC2500    2       25       -7.117599  diguanylate cyclase
SC2501    7       35       -5.578292  hypothetical protein
SC2502    8       34       -4.689189  inner membrane protein
SC2503    6       23       -2.213585  hypothetical protein
SC2504    4       20       -0.981717  hypothetical protein
SC2505    3       16       0.674157   IS3-like transposase
SC2506    0       13       3.207099   hypothetical protein
SC2507    1       18       4.561928   phage integrase family site specific
SC2508    1       18       4.453890   GMP synthase
SC2509    1       17       3.941225   inosine 5'-monophosphate dehydrogenase
SC2510    1       18       3.947162   exodeoxyribonuclease VII large subunit
SC2511    1       18       3.652458   outer membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2506    SECA          53     4e-09    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 53.0 bits (127), Expect = 4e-09
Identities = 18/26 (69%), Positives = 21/26 (80%)

Query: 767 RKSKKIGRNVPCPCGSGMKYKRCHGR 792
+K+GRN PCPCGSG KYK+CHGR
Sbjct: 874 TGERKVGRNDPCPCGSGKKYKQCHGR 899


45  SC2555  SC2638  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC2555    -2      13       3.764382    POT family di-/tripeptide transport protein
SC2556    -2      13       3.663690    nitrogen regulatory protein P-II 1
SC2557    -2      12       3.129268    transcriptional regulator of two-component
SC2558    -3      12       3.247685    hypothetical protein
SC2559    -3      13       3.041899    sensory kinase in regulatory system
SC2560    -3      13       3.270640    phosphoribosylformylglycinamidine synthase
SC2561    -1      13       1.309951    hypothetical protein
SC2562    0       14       1.845365    transglycosylase
SC2563    0       19       1.659034    tRNA-specific adenosine deaminase
SC2564    1       18       -0.287707   hypothetical protein
SC2565    1       16       -1.931551   phosphotransferase system IIB components
SC2566    0       17       -3.403224   N-acetylmuramic acid 6-phosphate etherase
SC2567    0       17       -4.250676   DNA-binding transcriptional regulator
SC2568    0       14       -4.243887   2-dehydropantoate 2-reductase
SC2569    -1      13       -3.165721   permease
SC2570    -1      12       -1.624255   LysR family transcriptional regulator
SC2571    -1      16       0.604865    ferredoxin
SC2572    -1      16       0.866146    4'-phosphopantetheinyl transferase
SC2573    -1      15       0.210509    pyridoxine 5'-phosphate synthase
SC2574    0       14       -4.782427   DNA repair protein RecO
SC2575    0       16       -6.072395   GTP-binding protein Era
SC2576    1       21       -6.865122   ribonuclease III
SC2577    1       24       -7.150649   signal peptidase I
SC2578    2       27       -8.245751   GTP-binding protein LepA
SC2579    4       39       -9.269433   hypothetical protein
SC2580    2       31       -4.827882   hypothetical protein
SC2581    3       29       -3.090038   Fels-2 prophage Tfa
SC2582    2       28       -3.051455   Fels-2 prophage Pin
SC2584    3       27       -3.259419   bacteriophage protein
SC2585    2       27       -2.792815   bacteriophage protein
SC2586    1       31       -4.075716   bacteriophage protein
SC2587    3       30       -4.852928   bacteriophage protein
SC2588    5       30       -6.711808   hypothetical protein
SC2589    8       30       -6.615918   bacteriophage protein
SC2590    8       29       -5.655785   hypothetical protein
SC2591    4       23       -3.472017   regulatory protein
SC2592    4       20       -2.611498   hypothetical protein
SC2593    3       19       -1.984170   bacteriophage protein
SC2594    3       19       -0.497592   bacteriophage protein
SC2595    2       20       -0.221603   bacteriophage protein
SC2596    2       19       0.492307    bacteriophage protein
SC2597    3       22       1.054414    bacteriophage protein
SC2598    3       24       1.038038    bacteriophage protein
SC2599    2       23       0.941802    bacteriophage protein
SC2600    4       23       0.368691    bacteriophage protein
SC2601    3       26       0.538576    bacteriophage protein
SC2602    4       23       0.607844    bacteriophage protein
SC2603    5       24       0.729311    bacteriophage protein
SC2604    4       23       1.373846    bacteriophage protein
SC2605    5       23       1.504547    bacteriophage protein
SC2606    4       23       1.399220    bacteriophage protein
SC2607    5       21       0.741023    hypothetical protein
SC2608    4       24       0.561284    bacteriophage protein
SC2609    5       24       0.422194    hypothetical protein
SC2610    5       26       -0.753861   hypothetical protein
SC2611    7       33       -4.181961   hypothetical protein
SC2612    7       32       -3.690222   hypothetical protein
SC2613    8       32       -3.744449   hypothetical protein
SC2614    6       29       -4.979251   Gifsy-2 prophage lysozyme
SC2615    2       24       -3.573818   Gifsy-1 prophage protein
SC2616    3       25       -3.607607   hypothetical protein
SC2617    0       22       -1.289381   Gifsy-2 prophage protein
SC2618    1       24       -2.626178   Gifsy-2 prophage protein
SC2619    0       23       -2.724309   Gifsy-2 prophage molecular chaperone
SC2620    1       19       -0.916462   Gifsy-1 prophage NinG
SC2621    3       23       -2.898068   hypothetical protein
SC2622    2       25       -3.568235   Gifsy-2 prophage protein
SC2623    4       29       -4.061416   bacteriophage protein
SC2624    7       29       -3.384202   Gifsy-1 prophage DinI
SC2625    6       32       -2.006802   bacteriophage protein
SC2626    6       30       -1.608983   hypothetical protein
SC2627    7       30       -3.329468   hypothetical protein
SC2628    8       33       -2.666022   Ead protein
SC2629    8       32       -3.475021   hypothetical protein
SC2630    7       32       -3.011820   DNA-binding protein
SC2631    8       32       -3.644152   replication protein
SC2632    8       36       -6.209031   Gifsy-1 prophage PrpO
SC2633    5       25       -2.147034   Gifsy-1 prophage cI protein
SC2634    5       25       -2.094088   DNA-binding protein
SC2635    4       25       -1.624457   Gifsy-1 prophage protein
SC2636    4       25       -2.073925   Gifsy-1 prophage protein
SC2637    3       23       -1.345291   Gifsy-1 prophage protein
SC2638    3       22       -0.202804   Gifsy-1 prophage RecE
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC2555    TCRTETA  32     0.007    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 31.7 bits (72), Expect = 0.007
Identities = 71/362 (19%), Positives = 119/362 (32%), Gaps = 52/362 (14%)

Query: 44 SHAINLFSAYA-SLVYVTPILGGWLADRLLGNRVAVITGALLMTLGHVVLGLESDSTLSL 102
+H L + YA P+LG +DR G R ++ + + ++ L +
Sbjct: 43 AHYGILLALYALMQFACAPVLGAL-SDRF-GRRPVLLVSLAGAAVDYAIMATAP--FLWV 98

Query: 103 YAALAIIICGYGLFKSNISCLLGELYAPDDNRRDGGFSLLYAAGNIGSIAAPIACGLAAQ 162
I+ G + + ++ D+ R GF + A G +A P+ GL
Sbjct: 99 LYIGRIVAGITGATGAVAGAYIADITDGDERARHFGF--MSACFGFGMVAGPVLGGLMGG 156

Query: 163 WYGWHVGFALAGVGMFIGLLIFLSGHRHFQQTRGVNRPALRAVKFALPT-WGWLVLMLCI 221
+ H F A + L FL+G ++ R LR + W M +
Sbjct: 157 -FSPHAPFFAAA---ALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVV 212

Query: 222 APVFFTLLLENNWSGYVLAIVCVFAAQLI----ARIMVKFPEHRRALWQIVLLMITGTLF 277
A + F QL+ A + V F E R W + I+ F
Sbjct: 213 AALMAVF----------------FIMQLVGQVPAALWVIFGEDRFH-WDATTIGISLAAF 255

Query: 278 WVLAQQGGSSISLFIDRFVNRHWLNMTVPTALFQSVNAIAVMAAGVVLAWLSSPKESARS 337
+L S I V IA ++LA+ +
Sbjct: 256 GIL----HSLAQAMITGPVAARLGERRALMLGM-----IADGTGYILLAFAT-------- 298

Query: 338 VLRVWLKFAVGLVLMGGGFMLLALNAHQARLDGQASMGMMIAGLALMGFAELFIDPVAMA 397
R W+ F + ++L GG + AL A +R + G + LA + + P+
Sbjct: 299 --RGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFT 356

Query: 398 QI 399
I
Sbjct: 357 AI 358


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC2557    HTHFIS  481    e-170    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 481 bits (1240), Expect = e-170
Identities = 166/480 (34%), Positives = 247/480 (51%), Gaps = 42/480 (8%)

Query: 7 AHLLLVDDDPGLLKLLGMRLTSEGYSVVTAESGQEGLRVLHREKVDLVISDLRMDEMDGM 66
A +L+ DDD + +L L+ GY V + R + DLV++D+ M + +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 67 QLFTEIQKVQPGMPVIILTAHGSIPDAVAATQKGVFSFLTKPIDRDALYKAIDEALE--- 123
L I+K +P +PV++++A + A+ A++KG + +L KP D L I AL
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 124 --QSAPATDDSWRNAIVTRSPLMLRLLEQARMVAQSDVSVLINGQSGTGKEIFAQAIHNA 181
S D +V RS M + + Q+D++++I G+SGTGKE+ A+A+H+
Sbjct: 124 RRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDY 183

Query: 182 SPRSNKPFVAINCGALPEQLLESELFGHARGAFTGAVSNREGLFQAAEGGTLFLDEIGDM 241
R N PFVAIN A+P L+ESELFGH +GAFTGA + G F+ AEGGTLFLDEIGDM
Sbjct: 184 GKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDM 243

Query: 242 PAPLQVKLLRVLQERKVRPLGSNRDIDIDVRIISATHRDLPKAMARGEFREDLYYRLNVV 301
P Q +LLRVLQ+ + +G I DVRI++AT++DL +++ +G FREDLYYRLNVV
Sbjct: 244 PMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVV 303

Query: 302 SLKIPALAERTEDIPLLANHLLRQSAQRHKPFVRAFSTDAMKRLMTASWPGNVRQLVNVI 361
L++P L +R EDIP L H ++Q+ + V+ F +A++ + WPGNVR+L N++
Sbjct: 304 PLRLPPLRDRAEDIPDLVRHFVQQAEKEGLD-VKRFDQEALELMKAHPWPGNVRELENLV 362

Query: 362 EQCVALTSSPVISDALVEQALEGENTALPT------------------------------ 391
+ AL VI+ ++E L E P
Sbjct: 363 RRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDA 422

Query: 392 ------FAEARNQFELNYLRKLLQITKGNVTHAARMAGRNRTEFYKLLSRHELDANDFKE 445
+ + E + L T+GN AA + G NR K + +
Sbjct: 423 LPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVSVYRSSR 482


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC2559    PF06580  33     0.002    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.3 bits (76), Expect = 0.002
Identities = 20/127 (15%), Positives = 45/127 (35%), Gaps = 26/127 (20%)

Query: 354 DVDLEAERCIAEPMLLMSVLDNLYSNAVHYG----AESGNICIRSRSQGSTVYIDVVNSG 409
++ PML+ + L N + +G + G I ++ TV ++V N+G
Sbjct: 245 QINPAIMDVQVPPMLVQT----LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTG 300

Query: 410 EPIPQTEREMIFEPFFQGSHQRKGAVKGSGLGLSIARDCIRRMQGEIQLVDDNAQEVCFR 469
+ +E +G GL R+ ++ + G + + ++
Sbjct: 301 SLALKNTKE------------------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN 342

Query: 470 ISLPLPA 476
+ +P
Sbjct: 343 AMVLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC2569    TCRTETB  37     1e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 36.8 bits (85), Expect = 1e-04
Identities = 34/177 (19%), Positives = 70/177 (39%), Gaps = 3/177 (1%)

Query: 213 FWLLFMILALGVFSGMVISSSSAQIGMTQYGLLSGAL-VVSLVSIFNSIGRLFWGGLTDK 271
WL + V + MV++ S I + V + + SIG +G L+D+
Sbjct: 17 IWLCILSF-FSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 272 LGGYNTLVIVYLFTCLCMLLLFFFNGNTSVFYFSALGVGFAYAGILVIFPGLTSQNFGMR 331
LG L+ + C ++ F + S+ + G A + + ++
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 332 NQGLNYGFMYFGFAVGAVIAPYVTSAIAKYTGSYNTVFILTTVLLLIGVVLTLITKK 388
N+G +G + A+G + P + IA Y ++ + ++ + ++ L + KK
Sbjct: 136 NRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIH-WSYLLLIPMITIITVPFLMKLLKK 191


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC2575    TCRTETOQM  31     0.006    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 31.0 bits (70), Expect = 0.006
Identities = 30/121 (24%), Positives = 50/121 (41%), Gaps = 15/121 (12%)

Query: 7 YCGFIAIVGRPNVGKSTLLNKLLGQKISITSRKAQTTRHRIVGIHTEGPYQAIYVDTPGL 66
G I +G + G + N LL ++ IT + T+ E I +DTPG
Sbjct: 26 NSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITS------FQWENTKVNI-IDTPG- 77

Query: 67 HMEEKRAINRLMNKAASSSIGDVELVIFVVEGTRWTPDDEMVLNKLRDGKAPVILAVNKV 126
HM+ + R + S + L+I +G + ++ + LR P I +NK+
Sbjct: 78 HMDFLAEVYRSL-----SVLDGAILLISAKDGVQ--AQTRILFHALRKMGIPTIFFINKI 130

Query: 127 D 127
D
Sbjct: 131 D 131


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC2578    TCRTETOQM  153    8e-42    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 153 bits (388), Expect = 8e-42
Identities = 95/458 (20%), Positives = 181/458 (39%), Gaps = 91/458 (19%)

Query: 3 NIRNFSIIAHIDHGKSTLSDRIIQICGG---LSDREMEAQVLDSMDLERERGITIKAQSV 59
I N ++AH+D GK+TL++ ++ G L + D+ LER+RGITI+
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 60 TLDFKASDGETYQLNFIDTPGHVDFSYEVSRSLAACEGALLVVDAGQGVEAQTLANCYTA 119
+ + E ++N IDTPGH+DF EV RSL+ +GA+L++ A GV+AQT +
Sbjct: 62 SFQW-----ENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHAL 116

Query: 120 MEMDLEVVPVLNKIDLPAADPERVAEEIED-----------------IVGIDATDAVRCS 162
+M + + +NKID D V ++I++ + + T++ +
Sbjct: 117 RKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWD 176

Query: 163 AKTGVGVTDVLERLVRDIPP---------------------------------------- 182
G D+LE+ +
Sbjct: 177 T-VIEGNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVIT 235

Query: 183 -----PQGDPDGPLQALIIDSWFDNYLGVVSLVRIKNGTMRKGDKIKVMSTGQTYNADRL 237
L + + ++ +R+ +G + D +++ +
Sbjct: 236 NKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI---KIT 292

Query: 238 GIFTPKQVDRTELKCGEVGWLVCAIKDIL--GAPVGDTLTSARNPAEKALPGFKKVKPQV 295
++T + ++ G +V + L + +GDT P + + + P +
Sbjct: 293 EMYTSINGELCKIDKAYSGEIVILQNEFLKLNSVLGDTK---LLPQRERI---ENPLPLL 346

Query: 296 YAGLFPVSSDDYESFRDALGKLSLNDASL-FYEPESSSALGFGFRCGFLGLLHMEIIQER 354
+ P E DAL ++S +D L +Y ++ + FLG + ME+
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPLLRYYVDSATHEIIL----SFLGKVQMEVTCAL 402

Query: 355 LEREYDLDLITTAPTVVYEVET---TAKETIYVDSPSK 389
L+ +Y +++ PTV+Y +E A+ TI+++ P
Sbjct: 403 LQEKYHVEIEIKEPTVIY-MERPLKKAEYTIHIEVPPN 439



Score = 35.2 bits (81), Expect = 7e-04
Identities = 18/81 (22%), Positives = 30/81 (37%), Gaps = 1/81 (1%)

Query: 398 ELREPIAECHMLLPQAYLGNVITLCIEKRGVQTNMVYHGNQVALTYEIPMAEVVLDFFDR 457
EL EP + PQ YL T + + N+V L+ EIP + ++
Sbjct: 534 ELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARC-IQEYRSD 592

Query: 458 LKSTSRGYASLDYNFKRFQAS 478
L + G + K + +
Sbjct: 593 LTFFTNGRSVCLTELKGYHVT 613


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC2581    SECA  27     0.041    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 26.8 bits (59), Expect = 0.041
Identities = 15/49 (30%), Positives = 20/49 (40%), Gaps = 15/49 (30%)

Query: 51 HRKGGPVLV-----EHREYTHEELI----------AQAEARKAELLAEA 84
KG PVLV E E EL A+ A +A ++A+A
Sbjct: 446 TAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQA 494


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2617    BINARYTOXINB  27     0.024    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 27.3 bits (60), Expect = 0.024
Identities = 13/49 (26%), Positives = 18/49 (36%)

Query: 83 TRTHQSNCNTRSQTHSSSTSKTRSSSVGFSVGGPVGASIGLIKQMESMS 131
QS NT SQT + S + + S + V G S+S
Sbjct: 302 KNEDQSTQNTDSQTRTISKNTSTSRTHTSEVHGNAEVHASFFDIGGSVS 350


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2628    ANTHRAXTOXNA  25     0.033    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 25.5 bits (55), Expect = 0.033
Identities = 7/24 (29%), Positives = 16/24 (66%)

Query: 24 DKFREAEKHIAELEAKLETADRLQ 47
+KF+++ ++ + E ET D++Q
Sbjct: 57 EKFKDSINNLVKTEFTNETLDKIQ 80


46  SC2695  SC2727  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC2695    3       27       -5.772361   HlyD family secretion protein
SC2696    7       38       -8.638078   hypothetical protein
SC2697    5       38       -9.039916   transposase insN
SC2698    4       38       -9.139634   hypothetical protein
SC2699    2       33       -7.388664   reverse transcriptase-like protein
SC2700    -2      17       -0.191985   hypothetical protein
SC2701    -1      16       3.105460    flagellar synthesis: repressor of fliC
SC2702    -1      16       3.947638    site-specific recombinase
SC2703    0       13       3.186217    hypothetical protein
SC2704    0       13       3.135690    glycosyl transferase family protein
SC2705    0       14       1.722885    ABC transporter
SC2706    2       18       -1.277670   enterochelin esterase
SC2707    4       23       -4.608677   hypothetical protein
SC2708    4       27       -6.711578   outer membrane receptor FepA
SC2709    2       34       -8.083165   inner membrane protein
SC2710    1       24       -5.269143   pentapeptide repeat-containing protein
SC2711    0       18       -3.598214   hypothetical protein
SC2712    -1      16       -3.264833   virulence protein VirK
SC2713    -3      12       -1.067304   transcription activator
SC2714    -3      12       1.456304    nickel transporter
SC2715    -3      13       2.979374    tricarboxylic transport: regulatory protein
SC2716    -2      16       2.422870    tricarboxylic transport: regulatory protein
SC2717    -2      20       3.413473    hypothetical protein
SC2718    -1      21       3.946029    tricarboxylic transport
SC2719    0       21       4.120012    hypothetical protein
SC2720    0       18       3.082126    hypothetical protein
SC2721    1       17       2.958730    hypothetical protein
SC2722    3       15       2.463391    hydroxyglutarate oxidase
SC2723    4       14       1.193139    succinate-semialdehyde dehydrogenase I
SC2724    4       14       -0.158386   4-aminobutyrate aminotransferase
SC2725    3       19       -1.319290   gamma-aminobutyrate transporter
SC2726    1       23       -1.111575   DNA-binding transcriptional regulator CsiR
SC2727    3       23       -3.744924   LysM domain/BON superfamily protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC2695    RTXTOXIND  243    2e-78    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 243 bits (623), Expect = 2e-78
Identities = 97/443 (21%), Positives = 181/443 (40%), Gaps = 61/443 (13%)

Query: 6 HDAAMDDPDIQRERAFSGAGRIVLICSLLFLILGIWAWFGRLDEVSTGNGKVIPSSREQV 65
H ++ P +R R ++ ++ I + G+++ V+T NGK+ S R +
Sbjct: 45 HLELIETPVSRRPRLV---AYFIMGFLVIAFI---LSVLGQVEIVATANGKLTHSGRSKE 98

Query: 66 LQSLDGGILAQLTVREGDRVQANQIVARLDPTRLASNVGESAAKYRASLASSARLTA--- 122
++ ++ I+ ++ V+EG+ V+ ++ +L ++ ++ + + R
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSR 158

Query: 123 --EVNDLPL----AFPAELNGWPDLIAAETRLYKSR-----------RAQLSDTEAELRD 165
E+N LP P N + + T L K + L AE
Sbjct: 159 SIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLT 218

Query: 166 ALASVNK----------ELAITQRLEKSGAASHVEVLRLQRQKSDLG------------- 202
LA +N+ L L A + VL + + +
Sbjct: 219 VLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQI 278

Query: 203 --------LKITDLRSQYYVQAREALSKANAEVDMLSAILKGREDSVTRLTVRSPVRGIV 254
+ + + + + L + + +L+ L E+ +R+PV V
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKV 338

Query: 255 KNIQVTTIGGVIPPNGEMMEIVPVDDRLLIETRLSPRDIAFIHPGQRALVKITAYDYAIY 314
+ ++V T GGV+ +M IVP DD L + + +DI FI+ GQ A++K+ A+ Y Y
Sbjct: 339 QQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRY 398

Query: 315 GGLDGVVETISPDTIQDKVKPEIFYYRVFIRTHQDYLQNKSGRRFSIVPGMIATVDIKTG 374
G L G V+ I+ D I+D+ + V I ++ L + + GM T +IKTG
Sbjct: 399 GYLVGKVKNINLDAIEDQRLGL--VFNVIISIEENCLSTG-NKNIPLSSGMAVTAEIKTG 455

Query: 375 EKTIVDYLIKPF-NRAKEALRER 396
++++ YL+ P E+LRER
Sbjct: 456 MRSVISYLLSPLEESVTESLRER 478


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC2705    PF05272  34     0.004    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 34.3 bits (78), Expect = 0.004
Identities = 46/217 (21%), Positives = 66/217 (30%), Gaps = 49/217 (22%)

Query: 992 PPG----TVVAVVGRSGAGKSTLIKLLAGLYSPGSGQIRVGER-----------LIDAAS 1036
PG V + G G GKSTLI L GL +G + +
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIVAYELSE 649

Query: 1037 LSDYRRQTGLVTQDVALFSGDIAENI-RYPRPNSSDTEVESAARRAGLFETV---QHL-- 1090
++ +RR D + RY V+ R+ ++ T Q+L
Sbjct: 650 MTAFRR------ADAEAVKAFFSSRKDRYRGA--YGRYVQDHPRQVVIWCTTNKRQYLFD 701

Query: 1091 PLGFRT--PVNNGG----TDLSAGQRQLIALA--------RAHLA--QAHILLLDEATAR 1134
G R PV G L + QL A A R + I E R
Sbjct: 702 ITGNRRFWPVLVPGRANLVWLQKFRGQLFAEALHLYLAGERYFPSPEDEEIYFRPEQELR 761

Query: 1135 -IDRSAEERLMTSLTRVTHTEKRIALIVAHRLTTARR 1170
++ + RL LTR A A + +
Sbjct: 762 LVETGVQGRLWALLTREG---APAAEGAAQKGYSVNT 795


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC2715    PF06580  34     0.001    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.7 bits (77), Expect = 0.001
Identities = 23/101 (22%), Positives = 38/101 (37%), Gaps = 21/101 (20%)

Query: 387 LLDNALKY----TPEQGIVTARLERDGDAVTLVVEDSGPGIDDEHIHLALQPFHRLDNVG 442
L++N +K+ P+ G + + +D VTL VE++G L N
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLA--------------LKNTK 308

Query: 443 NVAGAGIGLALVND-IARLHRTHPHFSRSEALGGLYVRIRF 482
G GL V + + L+ T SE G + +
Sbjct: 309 E--STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC2716    HTHFIS  97     2e-25    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 96.8 bits (241), Expect = 2e-25
Identities = 35/122 (28%), Positives = 61/122 (50%), Gaps = 1/122 (0%)

Query: 2 RLLLAEDNRELAHWLEKALVQNGFAVDCVFDGLAADHLLHSEMYALAVLDINMPGMDGLE 61
+L+A+D+ + L +AL + G+ V + + + L V D+ MP + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VVQRLRKRGQTLPVLLLTARSAVADRVKGLNVGADDYLPKPFELEE-LDARLRALLRRSA 120
++ R++K LPVL+++A++ +K GA DYLPKPF+L E + RAL
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 121 GQ 122

Sbjct: 125 RP 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC2727    INTIMIN  27     0.030    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 27.3 bits (60), Expect = 0.030
Identities = 20/69 (28%), Positives = 33/69 (47%), Gaps = 6/69 (8%)

Query: 82 SVDDQVKTTTPAAESQFYTVKSGDTLSAISKQVYGNANLYNKIFEANKPMLKSPE---KI 138
D ++ T FYT+K+G+T++ +SK N + I+ NK + S K
Sbjct: 48 GSDSKLLTHNSYQNRLFYTLKTGETVADLSKSQDINLST---IWSLNKHLYSSESEMMKA 104

Query: 139 YPGQVLRIP 147
PGQ + +P
Sbjct: 105 EPGQQIILP 113


47  SC2767  SC2847  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC2767    -1      16       3.101322    PTS system glucitol/sorbitol-specific
SC2768    -1      15       2.250713    sorbitol-6-phosphate dehydrogenase
SC2769    -1      14       2.503039    DNA-binding transcriptional activator GutM
SC2770    0       13       3.751591    DNA-binding transcriptional repressor SrlR
SC2771    0       13       4.413493    D-arabinose 5-phosphate isomerase
SC2772    0       13       4.048099    anaerobic nitric oxide reductase transcriptional
SC2773    1       12       3.324367    anaerobic nitric oxide reductase
SC2774    1       13       4.159778    nitric oxide reductase
SC2775    1       13       4.693946    hydrogenase maturation protein
SC2776    0       19       2.988153    electron transport protein HydN
SC2777    -1      24       3.278424    hypothetical protein
SC2778    0       29       4.305692    hydrogenase 3 maturation protease
SC2779    1       29       5.391915    HycE processing protein
SC2780    1       27       5.406648    hydrogenase activity
SC2781    1       26       4.704578    formate hydrogenlyase complex iron-sulfur
SC2782    1       23       4.309149    formate hydrogenlyase subunit 5
SC2783    2       18       4.419394    hydrogenase 3, membrane subunit (part of FHL
SC2784    2       17       4.015572    formate hydrogenlyase subunit 3
SC2785    1       17       2.538122    hydrogenase-3, iron-sulfur subunit (part of FHL
SC2786    1       15       3.192900    formate hydrogenlyase regulatory protein HycA
SC2787    0       13       2.902838    hydrogenase nickel incorporation protein
SC2788    0       14       3.259483    hydrogenase nickel incorporation protein HypB
SC2789    0       14       2.672046    hydrogenase assembly chaperone
SC2790    -1      14       2.502351    hydrogenase expression/formation protein
SC2791    0       15       1.989060    hydrogenase expression/formation protein
SC2792    -1      14       0.626430    formate hydrogen-lyase transcriptional
SC2793    -2      13       -1.126425   hypothetical protein
SC2794    -2      16       -3.892317   iron transporter: fur regulated
SC2795    -2      20       -5.071916   iron transporter: fur regulated
SC2796    -1      25       -6.723082   iron transporter: fur regulated
SC2797    1       34       -8.217732   iron transporter: fur regulated
SC2798    2       39       -9.298723   transcriptional regulator
SC2799    3       40       -9.253234   AraC family transcriptional regulator
SC2800    3       38       -7.015001   hypothetical protein
SC2801    3       34       -6.116558   flagellar biosynthesis/type III secretory
SC2802    3       34       -7.184895   inner membrane protein
SC2803    3       30       -8.365052   cell invasion protein
SC2804    3       29       -8.862725   cell invasion protein
SC2805    2       30       -9.012127   cell invasion protein
SC2806    2       30       -9.373578   cell invasion protein
SC2807    1       32       -10.773276  regulatory protein
SC2808    2       32       -10.641855  invasion protein regulator
SC2809    2       31       -8.568528   cell invasion protein
SC2810    2       30       -7.801258   protein tyrosine phosphate
SC2811    0       27       -6.978967   virulence associated chaperone
SC2812    0       24       -5.100243   hypothetical protein
SC2813    0       24       -5.101051   acyl carrier protein
SC2814    0       22       -5.460558   cell invasion protein
SC2815    1       20       -5.293561   cell invasion protein
SC2816    1       21       -5.195046   cell invasion protein
SC2817    1       22       -5.873024   cell invasion protein
SC2818    -1      26       -6.539191   surface presentation of antigens; secretory
SC2819    -1      25       -5.681451   surface presentation of antigens protein SpaS
SC2820    -1      26       -5.044157   surface presentation of antigens; secretory
SC2821    -2      22       -3.583639   surface presentation of antigens; secretory
SC2822    -2      23       -3.920187   surface presentation of antigens protein SpaP
SC2823    -1      22       -4.358818   surface presentation of antigens protein SpaO
SC2824    -1      23       -5.544941   surface presentation of antigens; secretory
SC2825    -1      24       -6.090562   surface presentation of antigens; secretory
SC2826    -2      23       -5.938995   ATP synthase SpaL
SC2827    -1      27       -7.700174   surface presentation of antigens; secretory
SC2828    -1      25       -7.442681   invasion protein
SC2829    -1      28       -7.204965   invasion protein
SC2830    0       25       -6.573654   invasion protein; outer membrane
SC2831    2       28       -7.505567   invasion protein
SC2832    2       27       -7.231834   invasion protein
SC2833    0       22       -4.863042   ABC-type transport system
SC2834    1       23       -5.176817   acetyltransferase
SC2835    1       26       -6.032089   hypothetical protein
SC2836    -3      13       -1.240256   serine/threonine-specific protein phosphatase 2
SC2837    -2      12       0.528889    hypothetical protein
SC2838    -3      12       2.389542    hypothetical protein
SC2839    -2      12       2.661445    hypothetical protein
SC2840    -2      13       3.084930    hypothetical protein
SC2841    -2      13       3.945251    DNA mismatch repair protein MutS
SC2842    -2      13       4.581961    hypothetical protein
SC2843    -2      14       5.046495    permease
SC2844    -1      13       4.473951    nucleoside-diphosphate-sugar epimerase
SC2845    -1      14       4.432793    hypothetical protein
SC2846    -1      14       4.509340    aldolase
SC2847    0       13       4.053440    tRNA synthase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2768    DHBDHDRGNASE  82     8e-21    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 82.4 bits (203), Expect = 8e-21
Identities = 66/257 (25%), Positives = 120/257 (46%), Gaps = 7/257 (2%)

Query: 3 QVAVVIGGGQTLGAFLCRGLAEEGYRVAVVDIQSDKAANVAQEINADFGEGMAYGFGADA 62
++A + G Q +G + R LA +G +A VD +K V + A+ A F AD
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE--ARHAEAFPADV 66

Query: 63 TSEQSVLALSRGVDEIFGRVNLLVYSAGIAKAAFISDFQLGDFDRSLQVNLVGYFLCARE 122
++ ++ ++ G +++LV AG+ + I +++ + VN G F +R
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 123 FSRLMIRDGIQGRIIQINSKSGKVGSKHNSGYSAAKFGGVGLTQSLALDLAEYGITVHSL 182
S+ M D G I+ + S V + Y+++K V T+ L L+LAEY I + +
Sbjct: 127 VSKYM-MDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 183 MLGNLLKSPMFQSL-LPQYATKLGIKPDEVEQYYIDKVPLKRGCDYQDVLNMLLFYASPK 241
G+ ++ M SL + + IK +E + +PLK+ D+ + +LF S +
Sbjct: 186 SPGS-TETDMQWSLWADENGAEQVIKGS-LETFKTG-IPLKKLAKPSDIADAVLFLVSGQ 242

Query: 242 ASYCTGQSINVTGGQVM 258
A + T ++ V GG +
Sbjct: 243 AGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2770    ARGREPRESSOR  27     0.044    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 27.1 bits (60), Expect = 0.044
Identities = 10/45 (22%), Positives = 18/45 (40%), Gaps = 5/45 (11%)

Query: 1 MKPRQRQAAILEHLQKQGKCSVEEL-----AQYFDTTGTTIRKDL 40
M QR I E + + +EL ++ T T+ +D+
Sbjct: 1 MNKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDI 45


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC2772    HTHFIS  352    e-119    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 352 bits (905), Expect = e-119
Identities = 114/306 (37%), Positives = 167/306 (54%), Gaps = 21/306 (6%)

Query: 187 MIGLSPAMTQLKKEIEIVAGSDLNVLIGGETGTGKELVAKAIHQGSPRAVNPLVYLNCAA 246
++G S AM ++ + + + +DL ++I GE+GTGKELVA+A+H R P V +N AA
Sbjct: 139 LVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAA 198

Query: 247 LPESVAESELFGHVKGAFTGAISNRSGKFEMADNGTLFLDEIGELSLALQAKLLRVLQYG 306
+P + ESELFGH KGAFTGA + +G+FE A+ GTLFLDEIG++ + Q +LLRVLQ G
Sbjct: 199 IPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQG 258

Query: 307 DIQRVGDDRSLRVDVRVLAATNRDLREEVLAGRFRADLFHRLSVFPLFVPPLRERGDDVV 366
+ VG +R DVR++AATN+DL++ + G FR DL++RL+V PL +PPLR+R +D+
Sbjct: 259 EYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIP 318

Query: 367 LLAGYFCEQCRLRLGLSRVVLSPGARRHLLNYGWPGNVRELEHAIHRAVVLARATRAGDE 426
L +F +Q + GL A + + WPGNVRELE+ + R L E
Sbjct: 319 DLVRHFVQQAE-KEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITRE 377

Query: 427 VVL-----EEQHFALS---------------EDVLPAPSAESFLALPACRNLRESTENFQ 466
++ E + E+ + A ALP +
Sbjct: 378 IIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEME 437

Query: 467 REMIRQ 472
+I
Sbjct: 438 YPLILA 443


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC2789    TYPE4SSCAGA  27     0.011    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 27.0 bits (59), Expect = 0.011
Identities = 19/75 (25%), Positives = 37/75 (49%), Gaps = 8/75 (10%)

Query: 12 IDGNQAKVD--VCGIQRDVDLTLVGSCDENGQPRLGQWVLVHVGFAMSVINEAEARDTLD 69
I GNQ + D G+ D L ++NG+P G W+ + + F + ++ ++ D +
Sbjct: 171 IIGNQIRTDQKFMGV-FDESLKERQEAEKNGEPTGGDWLDIFLSF---IFDKKQSSDVKE 226

Query: 70 ALQN--MFDVEPDVG 82
A+ + V+PD+
Sbjct: 227 AINQEPVPHVQPDIA 241


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC2792    HTHFIS  385    e-129    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 385 bits (991), Expect = e-129
Identities = 143/373 (38%), Positives = 207/373 (55%), Gaps = 39/373 (10%)

Query: 387 YQEIHRLKERLVDENLALTEQLNNVDSEFGEIIGRSEAMYNVLKQVEMVAQSDSTVLILG 446
E+ + R + E +L + + ++GRS AM + + + + Q+D T++I G
Sbjct: 108 LTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITG 167

Query: 447 ETGTGKELIARAIHNLSGRSGRRMVKMNCAAMPAGLLESDLFGHERGAFTGASAQRIGRF 506
E+GTGKEL+ARA+H+ R V +N AA+P L+ES+LFGHE+GAFTGA + GRF
Sbjct: 168 ESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRF 227

Query: 507 ELADKSSLFLDEVGDMPLELQPKLLRVLQEQEFERLGSNKLIQTDVRLIAATNRDLKKMV 566
E A+ +LFLDE+GDMP++ Q +LLRVLQ+ E+ +G I++DVR++AATN+DLK+ +
Sbjct: 228 EQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSI 287

Query: 567 ADREFRNDLYYRLNVFPIQLPPLRERPEDIPLLVKAFTFKIARRMGRNIDSIPAETLRTL 626
FR DLYYRLNV P++LPPLR+R EDIP LV+ F + A + G ++ E L +
Sbjct: 288 NQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFV-QQAEKEGLDVKRFDQEALELM 346

Query: 627 SSMEWPGNVRELENVVERAVLLTRGNVLQLS-LPDITAVTPDTSPVATESAKEG------ 679
+ WPGNVRELEN+V R L +V+ + + SP+ +A+ G
Sbjct: 347 KAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQ 406

Query: 680 ----------------------------EDEYQLIIRVLKETNGVVAGPKGAAQRLGLKR 711
E EY LI+ L T G AA LGL R
Sbjct: 407 AVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIK---AADLLGLNR 463

Query: 712 TTLLSRMKRLGID 724
TL +++ LG+
Sbjct: 464 NTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits      Score  E-value  Comments
SC2794    adhesinb  322    e-112    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 322 bits (827), Expect = e-112
Identities = 90/309 (29%), Positives = 164/309 (53%), Gaps = 14/309 (4%)

Query: 4 LHRLKTLLLAGIVAILAL-------SPAYAKEKFKVITTFTVIADMAKNVAGDAAEVSSI 56
+ + + L+L + + S K V+ T ++IAD+ KN+AGD + SI
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKNIAGDKINLHSI 60

Query: 57 TKPGAEIHEYQPTPGDIKRAQGAQLILANGLNLER----WFARFYQHLSGVPE---VVVS 109
G + HEY+P P D+K+ A LI NG+NLE WF + ++ VS
Sbjct: 61 VPVGQDPHEYEPLPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENAKKKENKDYYAVS 120

Query: 110 TGVKPMGITEGPYNGKPNPHAWMSAENALIYVDNIRDALVKYDPDNAQIYKQNAERYKAK 169
GV + + GK +PHAW++ EN +IY NI L + DP N + Y++N + Y K
Sbjct: 121 EGVDVIYLEGQSEKGKEDPHAWLNLENGIIYAQNIAKRLSEKDPANKETYEKNLKAYVEK 180

Query: 170 IRQMADPLRAELEKIPADQRWLVTSEGAFSYLARDNDMKELYLWPINADQQGTPKQVRKV 229
+ + + + IP +++ +VTSEG F Y ++ ++ Y+W IN +++GTP Q++ +
Sbjct: 181 LSALDKEAKEKFNNIPGEKKMIVTSEGCFKYFSKAYNVPSAYIWEINTEEEGTPDQIKTL 240

Query: 230 IDTIKKHHIPAIFSESTVSDKPARQVARESGAHYGGVLYVDSLSAADGPVPTYLDLLRVT 289
++ ++K +P++F ES+V D+P + V++++ ++ DS++ +Y +++
Sbjct: 241 VEKLRKTKVPSLFVESSVDDRPMKTVSKDTNIPIYAKIFTDSVAEKGEEGDSYYSMMKYN 300

Query: 290 TETIVNGIN 298
E I G++
Sbjct: 301 LEKIAEGLS 309


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC2799    BORPETOXINA  31     0.007    Bordetella pertussis toxin A subunit signature.
		>BORPETOXINA#Bordetella pertussis toxin A subunit signature.

Length = 269

Score = 30.5 bits (68), Expect = 0.007
Identities = 16/57 (28%), Positives = 30/57 (52%), Gaps = 8/57 (14%)

Query: 201 IISDLTRKWSQAEVAGKLFMSVSSLKRKLAAEEVSFSKIYLDARMNQAIKLLRMGAG 257
++ LT + Q + F+S SS +R ++++YL+ RM +A++ R G G
Sbjct: 66 VLDHLTGRSCQVGSSNSAFVSTSSSRR--------YTEVYLEHRMQEAVEAERAGRG 114


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2803    FLGMRINGFLIF  43     7e-07    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 42.6 bits (100), Expect = 7e-07
Identities = 33/167 (19%), Positives = 63/167 (37%), Gaps = 10/167 (5%)

Query: 23 LLKGLDQEQANEVIAVLQMHNIEANKIDSGKLGYSITVAEPDFTAAVYWIKTYQLPPRPR 82
L L + ++A L NI + +G +I V + LP
Sbjct: 53 LFSNLSDQDGGAIVAQLTQMNI-PYRFANG--SGAIEVPADKVHELRLRLAQQGLPKGGA 109

Query: 83 VEIAQMFPADSLVSSPRAEKARLYSAIEQRLEQSLQTMEGVLSARVHISYDIDA---GEN 139
V + + S +E+ A+E L ++++T+ V SARVH++ + E
Sbjct: 110 VGFE-LLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQ 168

Query: 140 GRPPKPVHLSALAVYERGSPLAHQISDIKRFLKNSFADVDYDNISVV 186
P V ++ QIS + + ++ A + N+++V
Sbjct: 169 KSPSASVTVTLEPGRALDEG---QISAVVHLVSSAVAGLPPGNVTLV 212


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC2807    PF07212  28     0.045    Hyaluronoglucosaminidase
		>PF07212#Hyaluronoglucosaminidase

Length = 336

Score = 28.1 bits (62), Expect = 0.045
Identities = 12/39 (30%), Positives = 21/39 (53%)

Query: 234 MSTSTLKRKLAEEGTSFSDIYLSARMNQAAKLLRIGNHN 272
+S +K++ +GT+ IY+++ KLLRI N
Sbjct: 241 LSIDIVKKQKGGKGTAAQGIYINSTSGTTGKLLRIRNLG 279


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2810    BACYPHPHTASE  304    e-100    Salmonella/Yersinia modular tyrosine phosphatase si...
		>BACYPHPHTASE#Salmonella/Yersinia modular tyrosine phosphatase signature.

Length = 468

Score = 304 bits (778), Expect = e-100
Identities = 67/212 (31%), Positives = 102/212 (48%), Gaps = 17/212 (8%)

Query: 340 GKPVALAGSYPKNTPDALEAHMKMLLEKECSCLVVLTSEDQMQAKQ--LPPYFRGSYTFG 397
G +A YP LE+H +ML E L VL S ++ ++ +P YFR S T+G
Sbjct: 252 GNTRTIACQYP--LQSQLESHFRMLAENRTPVLAVLASSSEIANQRFGMPDYFRQSGTYG 309

Query: 398 EVHTNSQKVSSASQGEAI--DQYNMQL-SCGEKRYTIPVLHVKNWPDHQPLPS--TDQLE 452
+ S+ G+ I D Y + + G+K ++PV+HV NWPD + S T L
Sbjct: 310 SITVESKMTQQVGLGDGIMADMYTLTIREAGQKTISVPVVHVGNWPDQTAVSSEVTKALA 369

Query: 453 YLADRVKNSNQNGAPGRSSS-----DKHLPMIHCLGGVGRTGTMAAALVLKDNPHSNL-- 505
L D+ + +N + SS K P+IHC GVGRT + A+ + D+ +S L
Sbjct: 370 SLVDQTAETKRNMYESKGSSAVGDDSKLRPVIHCRAGVGRTAQLIGAMCMNDSRNSQLSV 429

Query: 506 EQVRADFRDSRNNRMLEDASQF-VQLKAMQAQ 536
E + + R RN M++ Q V +K + Q
Sbjct: 430 EDMVSQMRVQRNGIMVQKDEQLDVLIKLAEGQ 461


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC2811    PF05932  33     7e-05    Tir chaperone protein (CesT)
		>PF05932#Tir chaperone protein (CesT)

Length = 127

Score = 33.2 bits (76), Expect = 7e-05
Identities = 16/111 (14%), Positives = 39/111 (35%), Gaps = 7/111 (6%)

Query: 4 PLTFDDNNQCLLLLDSDIFTSIEAK--DDIWLLNGMIIPLSPVCGDSIWRQIMVINGELA 61
PL FDD+ C +++D+ ++ + LL G++ P D + ++
Sbjct: 21 PLVFDDHGTCNMIIDNTFALTLSCDYARERLLLIGLLEPH----KDIPQQCLLAGALNPL 76

Query: 62 ANNEGTLAYIDAAETLLFIHAI-TDLTNTYHIISQLESFVNQQEALKNILQ 111
N L + + +I + + + ++ + + Q
Sbjct: 77 LNAGPGLGLDEKSGLYHAYQSIPREKLSVPTLKREMAGLLEWMRGWREASQ 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC2816    BACINVASINC  515    0.0      Salmonella/Shigella invasin protein C signature.
		>BACINVASINC#Salmonella/Shigella invasin protein C signature.

Length = 409

Score = 515 bits (1327), Expect = 0.0
Identities = 407/409 (99%), Positives = 408/409 (99%)

Query: 1 MLISNVGINPAAYLNNHSVENSSQTASQSVSAKDILNSIGISSSKVSDLGLSPTLSAPAP 60
MLISNVGINPAAYLNNHSVENSSQTASQSVSAKDILNSIGISSSKVSDLGLSPTLSAPAP
Sbjct: 1 MLISNVGINPAAYLNNHSVENSSQTASQSVSAKDILNSIGISSSKVSDLGLSPTLSAPAP 60

Query: 61 GVLTQTPGTITSFLKASIQNTDMNQDLNALANNVTTKANEVVQTQLREQQAEVGKFFDIS 120
GVLTQTPGTITSFLKASIQNTDMNQDLNALANNVTTKANEVVQTQLREQQAEVGKFFDIS
Sbjct: 61 GVLTQTPGTITSFLKASIQNTDMNQDLNALANNVTTKANEVVQTQLREQQAEVGKFFDIS 120

Query: 121 GMSSSAVALLAAANTLMLTLNQADSKLSGKLSLVSFDAAKTTASSMMREGMNALSGSISQ 180
GMSSSAVALLAAANTLMLTLNQADSKLSGKLSLVSFDAAKTTASSMMREGMNALSGSISQ
Sbjct: 121 GMSSSAVALLAAANTLMLTLNQADSKLSGKLSLVSFDAAKTTASSMMREGMNALSGSISQ 180

Query: 181 SALQLGITGVGAKLEYKGLQNERGALKHNAAKIDKLTTESHSIKNVLNGQNSVKLGAEGV 240
SALQLGITGVGAKLEYKGLQNERGALKHNAAKIDKLTTESHSIKNVLNGQNSVKLGAEGV
Sbjct: 181 SALQLGITGVGAKLEYKGLQNERGALKHNAAKIDKLTTESHSIKNVLNGQNSVKLGAEGV 240

Query: 241 DSLKSLNMKKTGTDATKNLNDATLKSNAGTSATESLGIKDSNKQISPEHQAILSKRLESV 300
DSLKSLNMKKTGTDATKNLNDATLKSNAGTSATESLGIK+SNKQISPEHQAILSKRLESV
Sbjct: 241 DSLKSLNMKKTGTDATKNLNDATLKSNAGTSATESLGIKNSNKQISPEHQAILSKRLESV 300

Query: 301 ESDIRLEQNTMDMTRIDARKMQMTGDLIMKNSVTVGGIAGASGQYAATQERSEQQISQVN 360
ESDIRLEQNTMDMTRIDARKMQMTGDLIMKNSVTVGGIAGAS QYAATQERSEQQISQVN
Sbjct: 301 ESDIRLEQNTMDMTRIDARKMQMTGDLIMKNSVTVGGIAGASRQYAATQERSEQQISQVN 360

Query: 361 NRVASTASDEARESSRKSTSLIQEMLKTMESINQSKASALAAIAGNIRA 409
NRVASTASDEARESSRKSTSLIQEMLKTMESINQSKASALAAIAGNIRA
Sbjct: 361 NRVASTASDEARESSRKSTSLIQEMLKTMESINQSKASALAAIAGNIRA 409


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC2817    BACINVASINB  842    0.0      Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 842 bits (2175), Expect = 0.0
Identities = 592/593 (99%), Positives = 592/593 (99%)

Query: 1 MVNDASSISRSGYTQNPRLAEAAFEGVRKNTDFLKAADKAFKDVVATKAGDLKAGTKSGE 60
MVNDASSISRSGYTQNPRLAEAAFEGVRKNTDFLKAADKAFKDVVATKAGDLKAGTKSGE
Sbjct: 1 MVNDASSISRSGYTQNPRLAEAAFEGVRKNTDFLKAADKAFKDVVATKAGDLKAGTKSGE 60

Query: 61 SAINTVGLKPPTDAAREKLSSEGQLTLLLGKLMTLLGDVSLSQLESRLAVWQAMIESQKE 120
SAINTVGLKPPTDAAREKLSSEGQLTLLLGKLMTLLGDVSLSQLESRLAVWQAMIESQKE
Sbjct: 61 SAINTVGLKPPTDAAREKLSSEGQLTLLLGKLMTLLGDVSLSQLESRLAVWQAMIESQKE 120

Query: 121 MGIQVSKEFQTALGEAQEATDLYEASIKKTDTAKSVYDAATKKLTQAQNKLQSLDPADPG 180
MGIQVSKEFQTALGEAQEATDLYEASIKKTDTAKSVYDAATKKLTQAQNKLQSLDPADPG
Sbjct: 121 MGIQVSKEFQTALGEAQEATDLYEASIKKTDTAKSVYDAATKKLTQAQNKLQSLDPADPG 180

Query: 181 YAQTEAAVEQAGKEATEAKEALDKATDATVKAGTDAKAKAEKADNILTKFQGTANAASQN 240
YAQ EAAVEQAGKEATEAKEALDKATDATVKAGTDAKAKAEKADNILTKFQGTANAASQN
Sbjct: 181 YAQAEAAVEQAGKEATEAKEALDKATDATVKAGTDAKAKAEKADNILTKFQGTANAASQN 240

Query: 241 QVSQGEQDNLSNVARLTMLMAMFIEIVGKNTEESLQNDLALFNALQEGRQAEMEKKSAEF 300
QVSQGEQDNLSNVARLTMLMAMFIEIVGKNTEESLQNDLALFNALQEGRQAEMEKKSAEF
Sbjct: 241 QVSQGEQDNLSNVARLTMLMAMFIEIVGKNTEESLQNDLALFNALQEGRQAEMEKKSAEF 300

Query: 301 QEETRKAEETNRIMGCIGKVLGALLTIVSVVAAVFTGGASLALAAVGLAVMVADEIVKAA 360
QEETRKAEETNRIMGCIGKVLGALLTIVSVVAAVFTGGASLALAAVGLAVMVADEIVKAA
Sbjct: 301 QEETRKAEETNRIMGCIGKVLGALLTIVSVVAAVFTGGASLALAAVGLAVMVADEIVKAA 360

Query: 361 TGVSFIQQALNPIMEHVLKPLMELIGKAITKALEGLGVDKKTAEMAGSIVGAIVAAIAMV 420
TGVSFIQQALNPIMEHVLKPLMELIGKAITKALEGLGVDKKTAEMAGSIVGAIVAAIAMV
Sbjct: 361 TGVSFIQQALNPIMEHVLKPLMELIGKAITKALEGLGVDKKTAEMAGSIVGAIVAAIAMV 420

Query: 421 AVIVVVAVVGKGAAAKLGNALSKMMGETIKKLVPNVLKQLAQNGSKLFTQGMQRITSGLG 480
AVIVVVAVVGKGAAAKLGNALSKMMGETIKKLVPNVLKQLAQNGSKLFTQGMQRITSGLG
Sbjct: 421 AVIVVVAVVGKGAAAKLGNALSKMMGETIKKLVPNVLKQLAQNGSKLFTQGMQRITSGLG 480

Query: 481 NVGSKMGLQTNALSKELVGNTLNKVALGMEVTNTAAQSAGGVAEGVFIKNASEALADFML 540
NVGSKMGLQTNALSKELVGNTLNKVALGMEVTNTAAQSAGGVAEGVFIKNASEALADFML
Sbjct: 481 NVGSKMGLQTNALSKELVGNTLNKVALGMEVTNTAAQSAGGVAEGVFIKNASEALADFML 540

Query: 541 ARFAMDQIQQWLKQSVEIFGENQKVTAELQKAMSSAVQQNADASRFILRQSRA 593
ARFAMDQIQQWLKQSVEIFGENQKVTAELQKAMSSAVQQNADASRFILRQSRA
Sbjct: 541 ARFAMDQIQQWLKQSVEIFGENQKVTAELQKAMSSAVQQNADASRFILRQSRA 593


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2818    SYCDCHAPRONE  128    2e-40    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 128 bits (322), Expect = 2e-40
Identities = 39/160 (24%), Positives = 72/160 (45%), Gaps = 4/160 (2%)

Query: 4 QNNVSEERVAEMIWDAVSEGATLKDVHGIPQDMMDGLYAHAYEFYNQGRLDEAETFFRFL 63
Q + + + G T+ ++ I D ++ LY+ A+ Y G+ ++A F+ L
Sbjct: 3 QETTDTQEYQLAMESFLKGGGTIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQAL 62

Query: 64 CIYDFYNPDYTMGLAAVCQLKKQFQKACDLYAVAFTLLKNDYRPVFFTGQCQLLMRKAAK 123
C+ D Y+ + +GL A Q Q+ A Y+ + + R F +C L + A+
Sbjct: 63 CVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAE 122

Query: 124 ARQCF----ELVNERTEDESLRAKALVYLEALKTAETEQH 159
A EL+ ++TE + L + LEA+K + +H
Sbjct: 123 AESGLFLAQELIADKTEFKELSTRVSSMLEAIKLKKEMEH 162


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2819    TYPE3IMSPROT  341    e-118    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 341 bits (876), Expect = e-118
Identities = 119/360 (33%), Positives = 205/360 (56%), Gaps = 19/360 (5%)

Query: 1 MSSNKTEKPTKKRLEDSAKKGQSFKSKDLIIACLTLGGIAYLVSYGSFN-EFMGIIKIII 59
MS KTE+PT K++ D+ KKGQ KSK+++ L + A L+ + E + +I
Sbjct: 1 MSGEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIP 60

Query: 60 ADNFDQSMADYSLAVFGIGLKYLIPFMLLCL---VCSALPAL----LQAGFVLATEALKP 112
+QS +S A+ + L+ F LC +AL A+ +Q GF+++ EA+KP
Sbjct: 61 ---AEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKP 117

Query: 113 NLSALNPVEGAKKLFSMRTVKDTVKTLLYLSSFVVAAIICWKKYKVEIFSQLNGNVVDIA 172
++ +NP+EGAK++FS++++ + +K++L + V+ +I+ W K + + L I
Sbjct: 118 DIKKINPIEGAKRIFSIKSLVEFLKSILKV---VLLSILIWIIIKGNLVTLLQLPTCGIE 174

Query: 173 VIWRELLLALVLTCLACA---LIVLLLDAVAEYFRTMKDMKMDKEEVKREMKEQEGNPEV 229
I L L + C +++ + D EY++ +K++KM K+E+KRE KE EG+PE+
Sbjct: 175 CITPLLGQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEI 234

Query: 230 KSKRREVHMEILSEQVKSDIENSRLIVANPTHITIGIYFKPELMPIPMISVYETNQRALA 289
KSKRR+ H EI S ++ +++ S ++VANPTHI IGI +K P+P+++ T+ +
Sbjct: 235 KSKRRQFHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQT 294

Query: 290 VRAYAEKVGVPVIVDIKLARSLFKTHRRYDLVSLEEIDEVLRLLVWLE--EVENAGKDVI 347
VR AE+ GVP++ I LAR+L+ + E+I+ +L WLE +E +++
Sbjct: 295 VRKIAEEEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQNIEKQHSEML 354


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2820    TYPE3IMRPROT  184    5e-60    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 184 bits (470), Expect = 5e-60
Identities = 48/237 (20%), Positives = 103/237 (43%), Gaps = 4/237 (1%)

Query: 12 LVASAALGFARVAPIFFFLPFLNSGVLSGAPRNAIIILVALGVWPHALNEVPPFLSVAMI 71
+ RV + P L+ + + + +++ + P P S +
Sbjct: 12 WLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDVPVFSFFAL 71

Query: 72 PLVLQEAAVGVMLGCLLSWPFWVMHALGCIIDNQRGATLSSSIDPANGIDTSEMANFLNM 131
L +Q+ +G+ LG + + F + G II Q G + ++ +DPA+ ++ +A ++M
Sbjct: 72 WLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDM 131

Query: 132 SAAVVHLQNGGLVTMVDVLTKSYQLCDPMNEC--TPSLPPLLTFINQVAQNALVLASPVV 189
A ++ L G + ++ +L ++ E + + L + + N L+LA P++
Sbjct: 132 LALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFLNGLMLALPLI 191

Query: 190 LVLLLSEVFLGLLSRFAPQMNAFAISLTVKSGIAVLIMLLYFS--PVLPDNVLRLSF 244
+LL + LGLL+R APQ++ F I + + + +M +++ F
Sbjct: 192 TLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCEHLFSEIF 248


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2821    TYPE3IMQPROT  89     4e-27    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 88.7 bits (220), Expect = 4e-27
Identities = 86/86 (100%), Positives = 86/86 (100%)

Query: 1 MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL 60
MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL
Sbjct: 1 MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL 60

Query: 61 FLLSGWYGEVLLSYGRQVIFLALAKG 86
FLLSGWYGEVLLSYGRQVIFLALAKG
Sbjct: 61 FLLSGWYGEVLLSYGRQVIFLALAKG 86


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2822    TYPE3IMPPROT  303    e-107    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 303 bits (777), Expect = e-107
Identities = 224/224 (100%), Positives = 224/224 (100%)

Query: 1 MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS 60
MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS
Sbjct: 1 MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS 60

Query: 61 MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL 120
MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL
Sbjct: 61 MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL 120

Query: 121 KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL 180
KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL
Sbjct: 121 KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL 180

Query: 181 LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYMDIAT 224
LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYMDIAT
Sbjct: 181 LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYMDIAT 224


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2823    TYPE3OMOPROT  538    0.0      Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 538 bits (1387), Expect = 0.0
Identities = 300/303 (99%), Positives = 302/303 (99%)

Query: 1 MSLRVRQIDRREWLLAQTATECQRHGQEATLEYPTRQGMWVRLSDAEKRWSAWIQPGDWL 60
MSLRVRQIDRREWLLAQTATECQRHG+EATLEYPTRQGMWVRLSDAEKRWSAWI+PGDWL
Sbjct: 1 MSLRVRQIDRREWLLAQTATECQRHGREATLEYPTRQGMWVRLSDAEKRWSAWIKPGDWL 60

Query: 61 EHVSPALAGAAVSAGAEHLVVPWLAATERPFELPVPHLSCRRLCVENPVPGSALPEGKLL 120
EHVSPALAGAAVSAGAEHLVVPWLAATERPFELPVPHLSCRRLCVENPVPGSALPEGKLL
Sbjct: 61 EHVSPALAGAAVSAGAEHLVVPWLAATERPFELPVPHLSCRRLCVENPVPGSALPEGKLL 120

Query: 121 HIMSDRGGLWFEHLPELPAVGGGRPKMLRWPLRFVIGSSDTQCSLLGRIGIGDVLLIRTS 180
HIMSDRGGLWFEHLPELPAVGGGRPKMLRWPLRFVIGSSDTQ SLLGRIGIGDVLLIRTS
Sbjct: 121 HIMSDRGGLWFEHLPELPAVGGGRPKMLRWPLRFVIGSSDTQRSLLGRIGIGDVLLIRTS 180

Query: 181 RAEVYCYAKKLGHFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLEFVLYR 240
RAEVYCYAKKLGHFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLEFVLYR
Sbjct: 181 RAEVYCYAKKLGHFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLEFVLYR 240

Query: 241 KNVTLAELEAMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESG 300
KNVTLAELEAMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESG
Sbjct: 241 KNVTLAELEAMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESG 300

Query: 301 NGE 303
NGE
Sbjct: 301 NGE 303


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2824    SSPANPROTEIN  600    0.0      Salmonella invasion protein InvJ signature.
		>SSPANPROTEIN#Salmonella invasion protein InvJ signature.

Length = 336

Score = 600 bits (1548), Expect = 0.0
Identities = 331/336 (98%), Positives = 334/336 (99%)

Query: 1 MGDVSAVSSSGNILLPQQDEVGGLSEALKKAVEKHKTEYSGDKKDRDYGDAFVMHKETAL 60
MGDVSAVSSSGNILLPQQDEVGGLSEALKKAVEKHKTEYSGDKKDRDYGDAFVMHKETAL
Sbjct: 1 MGDVSAVSSSGNILLPQQDEVGGLSEALKKAVEKHKTEYSGDKKDRDYGDAFVMHKETAL 60

Query: 61 PVLLAAWRHGAPAKSEHHNGNVSGLHHNGKGDLRIAEKLLKVTAEKSVGLISAEAKVDKS 120
P+LLAAWRHGAPAKSEHHNGNVSGLHHNGK +LRIAEKLLKVTAEKSVGLISAEAKVDKS
Sbjct: 61 PLLLAAWRHGAPAKSEHHNGNVSGLHHNGKSELRIAEKLLKVTAEKSVGLISAEAKVDKS 120

Query: 121 AALLSSKNRPLESVSGKKLSADLKAVESVSEVADNATGISDDNIKALPGDNKAIAGEGVR 180
AALLSSKNRPLESVSGKKLSADLKAVESVSEV DNATGISDDNIKALPGDNKAIAGEGVR
Sbjct: 121 AALLSSKNRPLESVSGKKLSADLKAVESVSEVTDNATGISDDNIKALPGDNKAIAGEGVR 180

Query: 181 KEGAPLARDVAPARMAAANTGKPDDKDHKKVKDVSQLPLQPTTIADLSQLTGGDEKMPLA 240
KEGAPLARDVAPARMAAANTGKP+DKDHKKVKDVSQLPLQPTTIADLSQLTGGDEKMPLA
Sbjct: 181 KEGAPLARDVAPARMAAANTGKPEDKDHKKVKDVSQLPLQPTTIADLSQLTGGDEKMPLA 240

Query: 241 AQSKPMMTIFPTADGVKGEDSSLTYRFQRWGNDYSVNIQARQAGEFSLIPSNTQVEHRLH 300
AQSKPMMTIFPTADGVKGEDSSLTYRFQRWGNDYSVNIQARQAGEFSLIPSNTQVEHRLH
Sbjct: 241 AQSKPMMTIFPTADGVKGEDSSLTYRFQRWGNDYSVNIQARQAGEFSLIPSNTQVEHRLH 300

Query: 301 DQWQNGNPQRWHLTRDDQQNPQQQQHRQQSGEEDDA 336
DQWQNGNPQRWHLTRDDQQNPQQQQHRQQSGEEDDA
Sbjct: 301 DQWQNGNPQRWHLTRDDQQNPQQQQHRQQSGEEDDA 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2825    SSPAMPROTEIN  169    3e-57    Salmonella surface presentation of antigen gene typ...
		>SSPAMPROTEIN#Salmonella surface presentation of antigen gene type M signature.

Length = 147

Score = 169 bits (429), Expect = 3e-57
Identities = 141/147 (95%), Positives = 143/147 (97%)

Query: 1 MHSLTRIKVLQRRCTVFHSQCESILLRYQDEDRGLQAEEEAILEQIAGLKLLLDTLRAEN 60
MHSLTRIKVLQRRCTVFHSQCESILLRYQDEDR LQ EEEAI+EQIAGLKLLLDTLRAEN
Sbjct: 1 MHSLTRIKVLQRRCTVFHSQCESILLRYQDEDRRLQVEEEAIVEQIAGLKLLLDTLRAEN 60

Query: 61 RQLSREEIYTLLRKQSIVRRQIKDLELQIIQIQEKRSELEKKREEFQKKSKYWLRKEGNY 120
RQLSREEIY LLRKQSIVRRQIKDLELQIIQIQEKRSELEKKREEFQ+KSKYWLRKEGNY
Sbjct: 61 RQLSREEIYALLRKQSIVRRQIKDLELQIIQIQEKRSELEKKREEFQEKSKYWLRKEGNY 120

Query: 121 QRWIIRQKRFYIQREIQQEEAESEEII 147
QRWIIRQKR YIQREIQQEEAESEEII
Sbjct: 121 QRWIIRQKRLYIQREIQQEEAESEEII 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2827    SSPAKPROTEIN  205    9e-72    Invasion protein B family signature.
		>SSPAKPROTEIN#Invasion protein B family signature.

Length = 133

Score = 205 bits (522), Expect = 9e-72
Identities = 43/133 (32%), Positives = 75/133 (56%)

Query: 1 MQHLDIAELVRSALEVSGCDPSLIGGIDSHSTIVLDLFALPSICISVKEDDVWIWAQLGA 60
M ++++ +LVR +L GC PS+I +DSHS I + L ++P+I I++ + V +WA A
Sbjct: 1 MSNINLVQLVRDSLFTIGCPPSIITDLDSHSAITISLDSMPAINIALVNEQVMLWANFDA 60

Query: 61 DSMVVLQQRAYEILMTIMEGCHFARGGQLLLGEQNGELTLKALVHPDFLSDGEKFSTALN 120
S V LQ AY IL ++ ++ + L + L L+ ++ D++ DG F+ L+
Sbjct: 61 PSDVKLQSSAYNILNLMLMNFSYSINELVELHRSDEYLQLRVVIKDDYVHDGIVFAEILH 120

Query: 121 GFYNYLEVFSRSL 133
FY +E+ + L
Sbjct: 121 EFYQRMEILNGVL 133


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC2829    INVEPROTEIN  604    0.0      Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 604 bits (1558), Expect = 0.0
Identities = 371/372 (99%), Positives = 371/372 (99%)

Query: 1 MIPGSTSGISFSRILSRQTSHQDATQHTDAQQAEIQQAAEDSSPGAEVQKFVQSTDEMSA 60
MIPGSTSGISFSRILSRQ SHQDATQHTDAQQAEIQQAAEDSSPGAEVQKFVQSTDEMSA
Sbjct: 1 MIPGSTSGISFSRILSRQASHQDATQHTDAQQAEIQQAAEDSSPGAEVQKFVQSTDEMSA 60

Query: 61 ALAQFRNRRDYEKKSSNLSNSFERVLEDEALPKAKQILKLISVHGGALEDFLRQARSLFP 120
ALAQFRNRRDYEKKSSNLSNSFERVLEDEALPKAKQILKLISVHGGALEDFLRQARSLFP
Sbjct: 61 ALAQFRNRRDYEKKSSNLSNSFERVLEDEALPKAKQILKLISVHGGALEDFLRQARSLFP 120

Query: 121 DPSDLVLVLRELLRRKDLEEIVRKKLESLLKHVEEQTDPKTLKAGINCALKARLFGKTLS 180
DPSDLVLVLRELLRRKDLEEIVRKKLESLLKHVEEQTDPKTLKAGINCALKARLFGKTLS
Sbjct: 121 DPSDLVLVLRELLRRKDLEEIVRKKLESLLKHVEEQTDPKTLKAGINCALKARLFGKTLS 180

Query: 181 LKPGLLRASYRQFIQSESHEVEIYSDWIASYGYQRRLVVLDFIEGSLLTDIDANDASCSR 240
LKPGLLRASYRQFIQSESHEVEIYSDWIASYGYQRRLVVLDFIEGSLLTDIDANDASCSR
Sbjct: 181 LKPGLLRASYRQFIQSESHEVEIYSDWIASYGYQRRLVVLDFIEGSLLTDIDANDASCSR 240

Query: 241 LEFGQLLRRLTQLKMLRSADLLFVSTLLSYSFTKAFNAEESSWLLLMLSLLQQPHEVDSL 300
LEFGQLLRRLTQLKMLRSADLLFVSTLLSYSFTKAFNAEESSWLLLMLSLLQQPHEVDSL
Sbjct: 241 LEFGQLLRRLTQLKMLRSADLLFVSTLLSYSFTKAFNAEESSWLLLMLSLLQQPHEVDSL 300

Query: 301 LADIIGLNALLLSHKEHASFLQIFYQVCKAIPSSLFYEEYWQEELLMALRSMTDIAYKHE 360
LADIIGLNALLLSHKEHASFLQIFYQVCKAIPSSLFYEEYWQEELLMALRSMTDIAYKHE
Sbjct: 301 LADIIGLNALLLSHKEHASFLQIFYQVCKAIPSSLFYEEYWQEELLMALRSMTDIAYKHE 360

Query: 361 MAEQRRTIEKLS 372
MAEQRRTIEKLS
Sbjct: 361 MAEQRRTIEKLS 372


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2830    TYPE3OMGPROT  576    0.0      Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 576 bits (1485), Expect = 0.0
Identities = 169/540 (31%), Positives = 271/540 (50%), Gaps = 57/540 (10%)

Query: 4 HILLARVLACAALVLVTPGYSSE----KIPVTGSGFVAKDDSLRTFFDAMALQLKEPVIV 59
H RVL L+L + ++ E IP +VAK +SLR V+V
Sbjct: 6 HSFFKRVLTGTLLLLSSYSWAQELDWLPIPYV---YVAKGESLRDLLTDFGANYDATVVV 62

Query: 60 SKMAARKKITGNFEFHDPNALLEKLSLQLGLIWYFDGQAIYIYDASEMRNAVVSLRNVSL 119
S K++G FE +P L+ ++ L+WY+DG +YI+ SE+ + ++ L+
Sbjct: 63 SD-KINDKVSGQFEHDNPQDFLQHIASLYNLVWYYDGNVLYIFKNSEVASRLIRLQESEA 121

Query: 120 NEFNNFLKRSGLYNKNYPLRGDNRKGTFYVSGPPVYVDMVVNAATMMDKQND--GIELGR 177
E L+RSG++ + R D YVSGPP Y+++V A +++Q + G
Sbjct: 122 AELKQALQRSGIWEPRFGWRPDASNRLVYVSGPPRYLELVEQTAAALEQQTQIRSEKTGA 181

Query: 178 QKIGVMRLNNTFVGDRTYNLRDQKMVIPGIATAIERLLQGEEQPLGNIVSSEPPAMPAFS 237
I + L DRT + RD ++ PG+AT ++R+L + + P
Sbjct: 182 LAIEIFPLKYASASDRTIHYRDDEVAAPGVATILQRVLSDATIQQVTVDNQRIP------ 235

Query: 238 ANGEKGKAANYAGGMSLQEALKQNAAAGNIKIVAYPDTNSLLVKGTAEQVHFIEMLVKAL 297
Q A + +A A ++ A P N+++V+ + E++ + L+ AL
Sbjct: 236 -----------------QAATRASAQA---RVEADPSLNAIIVRDSPERMPMYQRLIHAL 275

Query: 298 DVAKRHVELSLWIVDLNKSDLERLGTSWSGSI-----------TIGDKLGVSLNQSSIST 346
D +E++L IVD+N L LG W I T GD+ ++ N + S
Sbjct: 276 DKPSARIEVALSIVDINADQLTELGVDWRVGIRTGNNHQVVIKTTGDQSNIASNGALGSL 335

Query: 347 LDG---SRFIAAVNALEEKKQATVVSRPVLLTQENVPAIFDNNRTFYTKLIGERNVALEH 403
+D +A VN LE + A VVSRP LLTQEN A+ D++ T+Y K+ G+ L+
Sbjct: 336 VDARGLDYLLARVNLLENEGSAQVVSRPTLLTQENAQAVIDHSETYYVKVTGKEVAELKG 395

Query: 404 VTYGTMIRVLPRFSADG---QIEMSLDIEDGNDKTPQSDTTTSVDALPEVGRTLISTIAR 460
+TYGTM+R+ PR G +I ++L IEDGN Q ++ ++ +P + RT++ T+AR
Sbjct: 396 ITYGTMLRMTPRVLTQGDKSEISLNLHIEDGN----QKPNSSGIEGIPTISRTVVDTVAR 451

Query: 461 VPHGKSLLVGGYTRDANTDTVQSIPFLGKLPLIGSLFRYSSKNKSNVVRVFMIEPKEIVD 520
V HG+SL++GG RD + + +P LG +P IG+LFR S+ VR+F+IEP+ I +
Sbjct: 452 VGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELTRRTVRLFIIEPRIIDE 511


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2844    NUCEPIMERASE  84     2e-20    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 83.7 bits (207), Expect = 2e-20
Identities = 55/217 (25%), Positives = 92/217 (42%), Gaps = 31/217 (14%)

Query: 1 MQIIITGGGGFLGQKLASALLNSSL------AFNELLLVDLKMPARLS--DSPRLRCLEA 52
M+ ++TG GF+G ++ LL + N+ V LK ARL P + +
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQ-ARLELLAQPGFQFHKI 59

Query: 53 DLT-QPGVLESVITANTSVVYHLAA-------IVSSHAEDDFDLGWKVNLDLTRQLLEAC 104
DL + G+ + + + V+ + + HA D NL +LE C
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYAD------SNLTGFLNILEGC 113

Query: 105 RRQPQKIRFVFSSSLAVYGG--TLPECVTDTTALTPRSSYGAQKAACELLVNDYTRKGYV 162
R + +++SS +VYG +P D+ P S Y A K A EL+ + Y+ +
Sbjct: 114 RHNKIQ-HLLYASSSSVYGLNRKMPFSTDDSVD-HPVSLYAATKKANELMAHTYSHLYGL 171

Query: 163 DGLALRLPTICVRPGKPNRAASSFVSAIIREPLQGET 199
LR T+ G+P+ A F A+ L+G++
Sbjct: 172 PATGLRFFTVYGPWGRPDMALFKFTKAM----LEGKS 204


48  SC2871  SC2890  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC2871     1      16       -3.034931  hypothetical protein
SC2872    -1      12       -3.038645  hypothetical protein
SC2873    -1      10       -1.929724  hypothetical protein
SC2874    -1      11       -0.223207  transposase
SC2875    -1      13       -0.007890  hypothetical protein
SC2876    -1      10        0.484055  hypothetical protein
SC2877     0      14        2.359239  phosphoadenosine phosphosulfate reductase
SC2878     0      13        1.708607  sulfite reductase subunit beta
SC2879     0      13        0.536525  sulfite reductase subunit alpha
SC2880     2      23       -2.459143  synthase
SC2881     3      29       -2.163691  beta-lactamase family hydrolase
SC2882     3      29       -1.769757  hypothetical protein
SC2883     2      26       -1.527687  integrase
SC2884     1      27       -1.573738  hypothetical protein
SC2885     0      22       -0.159090  hypothetical protein
SC2886     0      21        0.211032  phosphopyruvate hydratase
SC2887    -1      12        0.573321  CTP synthetase
SC2888    -1      12        0.687737  nucleoside triphosphate pyrophosphohydrolase
SC2889     1      13        0.746766  fimbrial subunit
SC2890     2      13       -1.225546  outer membrane usher protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2873    BONTOXILYSIN  29     0.023    Bontoxilysin signature.
		>BONTOXILYSIN#Bontoxilysin signature.

Length = 1196

Score = 29.5 bits (66), Expect = 0.023
Identities = 13/51 (25%), Positives = 24/51 (47%), Gaps = 3/51 (5%)

Query: 298 YEPINGTDQLNVAVKRITSLHKNMNKVYGQRTDTASFDVMNQQGSMEDVLD 348
Y + TD +N++ + ++ + +KVY Q+ D V+ E LD
Sbjct: 1077 YLSLKNTDGINISSVKFKLINIDESKVYVQKWDECIICVL---DGTEKYLD 1124


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC2878    PF07675  30     0.023    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 30.4 bits (68), Expect = 0.023
Identities = 19/81 (23%), Positives = 34/81 (41%), Gaps = 10/81 (12%)

Query: 217 KTTVVIP---PQND--IDLHANDMNFVAIAENGKLVGFNLLVGGGLSIEHGNK-----KT 266
T +P PQN + A+ ++VAI+++G L G + G++ + K
Sbjct: 249 TNTYTLPASLPQNQASYSIQASAGSYVAISKDGVLYGTGVANASGVATVNMTKQITENGN 308

Query: 267 YARTASEFGYLPLEHTLAVAE 287
Y + YLP+ + E
Sbjct: 309 YDVVITRSNYLPVIKQIQAGE 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2886    ANTHRAXTOXNA  29     0.036    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 29.3 bits (65), Expect = 0.036
Identities = 31/132 (23%), Positives = 51/132 (38%), Gaps = 9/132 (6%)

Query: 211 GYAPNLGSNAEALAVIAEAVKAAGYELGKDITLAMDCAASEFYKDGKYVLA-----GEGN 265
P L N + A+ +E K YE+GK I+L + + ++ + +
Sbjct: 147 RETPKLIINIKDYAINSEQSKEVYYEIGKGISLDIISKDKSLDPEFLNLIKSLSDDSDSS 206

Query: 266 KAFTSEEFTHFLEELTKQYPIVSIEDGLDESDW---DGFAYQTKVLG-DKIQLVGDDLFV 321
S++F LE K I I++ L E F+Y ++L D+F
Sbjct: 207 DLLFSQKFKEKLELNNKSIDINFIKENLTEFQHAFSLAFSYYFAPDHRTVLELYAPDMFE 266

Query: 322 TNTKILKEGIEK 333
K+ K G EK
Sbjct: 267 YMNKLEKGGFEK 278


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC2890    PF00577  703    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 703 bits (1816), Expect = 0.0
Identities = 217/875 (24%), Positives = 375/875 (42%), Gaps = 73/875 (8%)

Query: 12 PIACGVGMLLSVSPYSASGKDIEFNTDFLDVKNRDNVNIAQFSRKGFILPGVYLLQIKIN 71
+AC + +P S++ ++ FN FL + ++++F + PG Y + I +N
Sbjct: 31 FVACA---FAAQAPLSSA--ELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLN 85

Query: 72 GQTLPQEFPVNWVIPEHDPQGSEVCAEPELVTQLGIKPELAEKLVWITHGERQCLAPDSL 131
+ V + QG C + +G+ + + + C+ S+
Sbjct: 86 NGYMATR-DVTFN-TGDSEQGIVPCLTRAQLASMGLNTASVSGMNLL--ADDACVPLTSM 141

Query: 132 -KGMDFQADLGHSTLLVNLPQAYMEYSDVDWDPPARWDNGIPGIILDYNINNQLRHDQES 190
Q D+G L + +PQA+M + PP WD GI +L+YN + ++
Sbjct: 142 IHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIG 201

Query: 191 GSEEQSISGNGTLGANLGAWRLRADWQASYDHRDDDENTSTLHDQSWSRYYAYRALPTLG 250
G+ N G N+GAWRLR + SY+ D + + R + L
Sbjct: 202 GNS-HYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKN--KWQHINTWLERDIIPLR 258

Query: 251 AKLTLGESYLQSDVFDSFNYIGASVVSDDQMLPPKLRGYAPEIVGIARSNAKVKVSWQGR 310
++LTLG+ Y Q D+FD N+ GA + SDD MLP RG+AP I GIAR A+V + G
Sbjct: 259 SRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGY 318

Query: 311 VLYETQVPAGPFRIQDLNQ-SVSGTLHVTVEEQNGQTQEFDVNTASVPFLTRPGMVRYKM 369
+Y + VP GPF I D+ SG L VT++E +G TQ F V +SVP L R G RY +
Sbjct: 319 DIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSI 378

Query: 370 ALGRPQDWDHHPITGTFASAEASWGVTNGWSLYGGAIGESNYQAVALGSGKDLGVVGAVA 429
G + + F + G+ GW++YGG Y+A G GK++G +GA++
Sbjct: 379 TAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALS 438

Query: 430 VDITHSIAHMPQDDGFDGETLQGNSYRISYSRDFDEIDSRLTFAGYRFSEKNFMSMSDYL 489
VD+T + + +P D G S R Y++ +E + + GYR+S + + +D
Sbjct: 439 VDMTQANSTLPDD-----SQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTT 493

Query: 490 DAKT--YHHLNA-----------------GHEKERYTVTYNQNFREQGMSAYFSYSRSTF 530
++ Y+ +++ + +T Q + Y S S T+
Sbjct: 494 YSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTS-TLYLSGSHQTY 552

Query: 531 WDSPDQS-NYNLSLSWYFDLGSIKNLSASLNGYRSEYNGDKDDGVYISLSVPWG------ 583
W + + + L+ F+ + + S + ++ + +D + +++++P+
Sbjct: 553 WGTSNVDEQFQAGLNTAFEDIN---WTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSD 609

Query: 584 ------NDSISYNGT-FNGSQHRNQLGYSGH--SQNGDNWQLHVG-----QDEQGAQADG 629
+ S SY+ + + N G G N ++ + G G+
Sbjct: 610 SKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYA 669

Query: 630 YYSHQGALTDIDLSADYEEGSYRSLGMSLRGGMTLTTQGGALHRGSLAGSTRLLVDTDGI 689
+++G + ++ + + + L + GG+ G L + T +LV G
Sbjct: 670 TLNYRGGYGNANIGYSHSDD-IKQLYYGVSGGVLAHANGVTLGQPL--NDTVVLVKAPGA 726

Query: 690 ADVPVSGNDSPTSTNIFGKAVIADVGSYSRSLARIDLNKLPEKAEATKSVVQITLTEGAI 749
D V N + T+ G AV+ Y + +D N L + + +V + T GAI
Sbjct: 727 KDAKV-ENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAI 785

Query: 750 GYRHFDVVSGEKMMAVFRLADGDFPPFGAEVKNERQQQLGLVANDGNAWLAGVKAGETLK 809
F G K++ + PFGA V +E Q G+VA++G +L+G+ ++
Sbjct: 786 VRAEFKARVGIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQ 844

Query: 810 VFW--DGAAQCEA--SLPPTFTPELLANALLLPCK 840
V W + A C A LPP +LL L C+
Sbjct: 845 VKWGEEENAHCVANYQLPPESQQQLL-TQLSAECR 878


49  SC2960  SC2972  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC2960     1      23       -3.961773  inner membrane protein
SC2961     3      27       -5.873340  transporter
SC2962     6      29       -3.194115  hypothetical protein
SC2963     8      29        1.885759  hypothetical protein
SC2964     8      26        1.045565  hypothetical protein
SC2965     7      24        0.209424  outer membrane protein
SC2966     8      24        0.325381  hypothetical protein
SC2967     7      25        1.332500  fimbrial chaperone protein
SC2968     6      24        0.436600  outer membrane usher protein
SC2969     5      30       -6.634747  fimbrial-like protein
SC2970     6      31       -7.055093  hypothetical protein
SC2971     5      28       -4.998792  hypothetical protein
SC2972     1      24       -3.489415  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2962    ISCHRISMTASE  28     0.003    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 28.4 bits (63), Expect = 0.003
Identities = 16/68 (23%), Positives = 31/68 (45%), Gaps = 2/68 (2%)

Query: 20 GQVIALKKMLDEPHECAAVLQQIAAIRGAVNGLMREVIKGHLTEHIVHQSDEVRREEDLD 79
+ +LD+ A +Q+ +A G N E I+ + E + +++ +EDL
Sbjct: 198 AFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIAELLQETPEDITDQEDL- 256

Query: 80 VILKVLDS 87
+ + LDS
Sbjct: 257 -LDRGLDS 263


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC2968    PF00577  626    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 626 bits (1617), Expect = 0.0
Identities = 230/856 (26%), Positives = 380/856 (44%), Gaps = 66/856 (7%)

Query: 19 SQATEFNASLLDSGNLSNVDLTAFSREGYVAPGNYILDIWLNDQPVREQYPVRVVPVAGL 78
S FN L + DL+ F + PG Y +DI+LN+ + + V
Sbjct: 44 SAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYMATRD-VTFNTGDSE 102

Query: 79 DAAVICVTTDMVAMLGLKDKIIHGLKPVTGIPDGQCLELRSA--DSQVRYSAENQRLTFI 136
V C+T +A +GL + + + D C+ L S D+ + QRL
Sbjct: 103 QGIVPCLTRAQLASMGLNTA---SVSGMNLLADDACVPLTSMIHDATAQLDVGQQRLNLT 159

Query: 137 IPQAWMRYQDPDWVPPSRWSDGVTAGLLDYSLMVNRYMPQQGETSTSYSLYGTAGFNLGA 196
IPQA+M + ++PP W G+ AGLL+Y+ N + G S L +G N+GA
Sbjct: 160 IPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGA 219

Query: 197 WRLRSDYQYSRFDS-GQGASQSDFYLPQTYLFRALPALRSKLTLGQTYLSSAIFDSFRFA 255
WRLR + +S S S++ + T+L R + LRS+LTLG Y IFD F
Sbjct: 220 WRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFR 279

Query: 256 GLTLASDERMLPPSLQGYAPKISGIANSNAQVTVSQNGRILYQTRVSPGPFELPDLSQ-N 314
G LASD+ MLP S +G+AP I GIA AQVT+ QNG +Y + V PGPF + D+
Sbjct: 280 GAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAG 339

Query: 315 ISGNLDVSVRESDGSVRTWQVNTASVPFMARQGQVRYKVAAGRPLYGGTHNNSTVSPDFL 374
SG+L V+++E+DGS + + V +SVP + R+G RY + AG G N P F
Sbjct: 340 NSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSG---NAQQEKPRFF 396

Query: 375 LGEATWGAFNNTSLYGGLIASTGDYQSAALGIGQNMGLLGALSADVTRSDARLPHGQKQS 434
G ++YGG + Y++ GIG+NMG LGALS D+T++++ LP +
Sbjct: 397 QSTLLHGLPAGWTIYGGTQLA-DRYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHD 455

Query: 435 GYSYRINYAKTFDKTGSTLAFVGYRFSDRHFLSMPEYLQRRTTDGGD------------- 481
G S R Y K+ +++G+ + VGYR+S + + + R
Sbjct: 456 GQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKF 515

Query: 482 ------AWHEKQSYTVTYSQSVPVLNMSAALSVSRLNYWNAQ-SNNNYMLSLNKVFSLGD 534
A++++ +T +Q + + LS S YW + + LN F
Sbjct: 516 TDYYNLAYNKRGKLQLTVTQQLG-RTSTLYLSGSHQTYWGTSNVDEQFQAGLNTAF---- 570

Query: 535 LQGLPASVSFARNQYTGG-GSQNQVYATISIPWGDSR-----------QVSYSVQKDNRG 582
+ + ++S++ + G + ++IP+ SYS+ D G
Sbjct: 571 -EDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNG 629

Query: 583 GLQQTVNYSD--FHNPDTTWNISAGHNRYDTGSN-SSFSGSVQSRLPWGQAAADATLQPG 639
+ + + ++++ G+ G++ S+ ++ R +G A +
Sbjct: 630 RMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDD 689

Query: 640 QYRSLGLSWYGSVTATAHGAAFSQSMAGNEPRMMIDTGDVAGVPVNGNSGV-TNRFGVGV 698
+ L G V A A+G Q + N+ +++ V +GV T+ G V
Sbjct: 690 -IKQLYYGVSGGVLAHANGVTLGQPL--NDTVVLVKAPGAKDAKVENQTGVRTDWRGYAV 746

Query: 699 VSAGSSYRRSDISVDVAALPEDVDVSSSVISQVLTEGAVGYRQIDANQGEQVLGHIRLAD 758
+ + YR + +++D L ++VD+ ++V + V T GA+ + A G ++L + +
Sbjct: 747 LPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMTLT-HN 805

Query: 759 GASPPFGALVVSGKTGRTAGMVGDGGLAYLTGLSGEDRRTLNVSW--DGRVQCRLTLPET 816
PFGA+V S +++G+V D G YL+G+ + V W + C
Sbjct: 806 NKPLPFGAMVTSES-SQSSGIVADNGQVYLSGM--PLAGKVQVKWGEEENAHCVANYQLP 862

Query: 817 VTLSRGPL---LLPCR 829
+ L CR
Sbjct: 863 PESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2971    ENTEROVIROMP  96     1e-27    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 96.1 bits (239), Expect = 1e-27
Identities = 53/183 (28%), Positives = 77/183 (42%), Gaps = 17/183 (9%)

Query: 1 MNKMLLAGSAGIVLLSAAASPVWADDNASTFSLGYAQSH-TNHAGTLRGVRLANNYEMSP 59
M K+ SA +L+ A A ST + GYAQS + G L YE
Sbjct: 1 MKKIACL-SALAAVLAFTAGTSVAA--TSTVTGGYAQSDAQGQMNKMGGFNLKYRYEEDN 57

Query: 60 D-WGLTTSFAWLNGSQRYSDESSNGRVTTRYYSLLAGPSWKINNQLSLYSQVGPVLLHQR 118
G+ SF + S SS +YY + AGP+++IN+ S+Y VG +
Sbjct: 58 SPLGVIGSFTYTEKS---RTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGVGYGKFQ 114

Query: 119 DH---GINESDSKVGYGYSAGVAYTPVSNVAITLGYEGADFDATHNSGSLNSNGFNLGVG 175
S G+ Y AG+ + P+ NVA+ YE + S++ + GVG
Sbjct: 115 TTEYPTYKHDTSDYGFSYGAGLQFNPMENVALDFSYEQS------RIRSVDVGTWIAGVG 168

Query: 176 YRF 178
YRF
Sbjct: 169 YRF 171


50  SC2988  SC2998  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias     Product
SC2988    -1      13       -3.069929   hypothetical protein
SC2989    -1      13       -0.240938   global regulator
SC2990    -1      14       -0.695919   hypothetical protein
SC2991    -1      15       -0.060127   hypothetical protein
SC2992    -2      14       0.894273    6-phospho-beta-glucosidase
SC2993    -1      12       2.479130    outer membrane protein
SC2994    0       17       3.836441    glycine dehydrogenase
SC2995    -1      16       3.724640    glycine cleavage system protein H
SC2996    -1      14       3.324320    glycine cleavage system aminomethyltransferase
SC2997    0       14       3.668678    hypothetical protein
SC2998    -1      14       3.185391    2-octaprenyl-6-methoxyphenyl hydroxylase
51  SC3054  SC3071  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC3054    -2      11       -3.209216   major facilitator superfamily nucleoside
SC3055    -1      19       -3.542803   ornithine decarboxylase
SC3056    0       32       -5.163835   hypothetical protein
SC3057    2       31       -5.546200   *lactoylglutathione lyase
SC3058    1       28       -4.463136   acetyl-CoA hydrolase
SC3059    1       27       -4.189193   monoamine oxidase
SC3060    2       27       -4.472955   LysR family transcriptional regulator
SC3061    2       24       -3.881755   LysR family transcriptional regulator
SC3062    3       25       -4.284117   arylsulfatase
SC3063    1       23       -3.974330   arylsulfatase regulator
SC3064    1       24       -3.631369   response regulator
SC3065    1       21       -2.894567   hypothetical protein
SC3066    0       20       -2.510450   amino acid transporter
SC3067    0       22       -4.001448   hypothetical protein
SC3068    0       22       -4.450045   oxidoreductase
SC3069    2       27       -5.695289   NAD-dependent aldehyde dehydrogenase
SC3070    1       19       -5.488239   DNA replication/recombination/repair protein
SC3071    0       15       -4.550166   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC3062    IGASERPTASE  30     0.030    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.030
Identities = 21/108 (19%), Positives = 34/108 (31%), Gaps = 5/108 (4%)

Query: 231 DEWISRFKSQYEQGYANVYRQRIARLKKLGFLRDDIPLPGLELDKEWQAMTPEQQKYTAK 290
+ + R Q Y N+ L+K R ++P E ++ W M +
Sbjct: 580 NPYAFRRIKDGGQLYLNLENYTYYALRKGASTRSELPKNSGESNENWLYMGKTSDEAKRN 639

Query: 291 VM-----QVYAAMIANMDAQIGTVIETLKKTGRDKNTILVFLSDNGVN 333
VM + + G L T + K+ FL G N
Sbjct: 640 VMNHINNERMNGFNGYFGEEEGKNNGNLNVTFKGKSEQNRFLLTGGTN 687


52  SC3134  SC3151  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC3134    -1      16       -3.291532   zinc transporter ZupT
SC3135    1       16       -4.381328   hypothetical protein
SC3136    0       15       -3.296484   arylsulfate sulfotransferase
SC3137    1       20       -2.367416   thiol-disulfide isomerase and thioredoxin
SC3138    1       20       -1.478541   disulfide oxidoreductase
SC3139    2       23       -1.872526   hypothetical protein
SC3140    0       16       -1.414374   integrase
SC3141    3       14       -0.326864   3,4-dihydroxy-2-butanone 4-phosphate synthase
SC3142    4       12       0.717219    hypothetical protein
SC3143    1       11       2.133566    glycogen synthesis protein GlgS
SC3144    1       10       2.222775    hypothetical protein
SC3145    0       9        2.142367    inner membrane protein
SC3146    0       10       2.780400    hypothetical protein
SC3147    -1      13       3.044803    bifunctional heptose 7-phosphate kinase/heptose
SC3148    -1      13       2.483660    bifunctional glutamine-synthetase
SC3149    -2      14       1.725730    hypothetical protein
SC3150    -2      15       2.480551    signal transduction protein
SC3151    0       16       3.048543    multifunctional tRNA nucleotidyl
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC3146    IGASERPTASE  50     2e-08    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 49.7 bits (118), Expect = 2e-08
Identities = 40/238 (16%), Positives = 76/238 (31%), Gaps = 8/238 (3%)

Query: 197 PNNAFDAEGLTKLTQETERRRRERNEVEQDVEVAVREKNRDALERKLEIEQQEAFMTLEQ 256
N A+ + + E R A + E E +QE+ +
Sbjct: 999 TPNNIQAD-VPSVPSNNEEIARVDEAPVPPPAPATPSETT---ETVAENSKQESKTVEKN 1054

Query: 257 EQQVKTRTAEQNAKIAAFEAERHREAE-QTRILAERQIQETEIEREQAVRSRKVEAEREV 315
EQ TA+ + A EA+ + +A QT +A+ + E + + + VE E +
Sbjct: 1055 EQDATETTAQN--REVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKA 1112

Query: 316 RIKEIEQQQVTEIANQTKSIAIAAKSEQQSQAEARANDALADAVRAQ-QNVETTRQTAEA 374
+++ + Q+V ++ +Q +++ Q AR ND + Q Q T A
Sbjct: 1113 KVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPA 1172

Query: 375 DRAKQVALIAAAQDAETKAVELTVRAKAEKEAAELQAAAIIELAEATRKKGLAEAEAQ 432
+ V A Q E + + + +
Sbjct: 1173 KETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSV 1230


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3147    LPSBIOSNTHSS  29     0.027    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein

signature.
Length = 166

Score = 29.0 bits (65), Expect = 0.027
Identities = 10/37 (27%), Positives = 20/37 (54%)

Query: 347 GVFDILHAGHVSYLANARKLGDRLIVAVNSDASTKRL 383
G FD + GH+ + +L D++ VAV + + + +
Sbjct: 7 GSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPM 43


53  SC3284  SC3296  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC3284    -2      17       3.330191    ATPase
SC3285    -1      24       5.461639    cytochrome d ubiquinol oxidase subunit III
SC3286    -1      25       5.847840    serine endoprotease
SC3287    -1      26       5.639642    serine endoprotease
SC3288    0       25       5.364257    inner membrane protein
SC3289    0       25       4.347951    oxalacetate decarboxylase subunit beta
SC3290    -1      19       1.933865    oxaloacetate decarboxylase
SC3291    -1      14       -2.542737   oxaloacetate decarboxylase subunit gamma
SC3292    -1      14       -2.400366   L(+)-tartrate dehydratase subunit beta
SC3293    -1      15       -1.921045   tartrate dehydratase subunit alpha
SC3294    0       17       -2.053443   cation transporter
SC3295    3       17       -2.344420   GntR family transcriptional regulator
SC3296    3       17       -2.662277   GntR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SC3286    V8PROTEASE  70     3e-15    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 69.7 bits (170), Expect = 3e-15
Identities = 34/187 (18%), Positives = 65/187 (34%), Gaps = 32/187 (17%)

Query: 90 GLGSGVIIDAAKGYVLTNNHVINQAQKISIQL------------NDGREFDAKLIGGDDQ 137
+ SGV++ K +LTN HV++ L +G ++ +
Sbjct: 102 FIASGVVV--GKDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGE 159

Query: 138 SDIALLQIQN-------PSKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISALG 190
D+A+++ + ++++ + +V G P ++ +
Sbjct: 160 GDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKP-------VATMW 212

Query: 191 RSGLNLEGLEN-FIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSIGIGFAIPSN 249
S + L+ +Q D S GNSG + N E+IGI+ I N
Sbjct: 213 ESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHW---GGVPNEFNGAVFINEN 269

Query: 250 MAQTLAQ 256
+ L Q
Sbjct: 270 VRNFLKQ 276


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SC3287    V8PROTEASE  53     4e-10    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 53.5 bits (128), Expect = 4e-10
Identities = 31/160 (19%), Positives = 59/160 (36%), Gaps = 26/160 (16%)

Query: 77 RTLGSGVIMDQRGYIITNKHVINDADQIIVALQ------------DGRVFEALLVGSDSL 124
+ SGV++ + ++TNKHV++ AL+ +G +
Sbjct: 101 TFIASGVVVG-KDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGE 159

Query: 125 TDLAVLKI-------NATGGLPTIPINTKRTPHIGDVVLAIGNPYNLGQTITQGIISATG 177
DLA++K + + ++ + + G P + T + G
Sbjct: 160 GDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGD-KPVATMW--ESKG 216

Query: 178 RIGLNPTGRQNFLQTDASINHGNSGGALVNSLGELMGINT 217
+I + +Q D S GNSG + N E++GI+
Sbjct: 217 KI---TYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHW 253


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC3290    RTXTOXIND  31     0.014    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 31.0 bits (70), Expect = 0.014
Identities = 17/67 (25%), Positives = 29/67 (43%), Gaps = 7/67 (10%)

Query: 508 ASSAPVQAAAPA-------GAGTPVTAPLAGNIWKVIATEGQTVAEGDVLLILEAMKMET 560
+ V+ A A G + + ++I EG++V +GDVLL L A+ E
Sbjct: 75 SVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA 134

Query: 561 EIRAAQA 567
+ Q+
Sbjct: 135 DTLKTQS 141



Score = 29.4 bits (66), Expect = 0.046
Identities = 15/56 (26%), Positives = 22/56 (39%), Gaps = 10/56 (17%)

Query: 535 KVIATEGQTVAEGDVLLILEAMKMETEIRAAQAGTVRGIAVKSGDAVSVGDTLMTL 590
V G+ G EI+ + V+ I VK G++V GD L+ L
Sbjct: 82 IVATANGKLTHSGRSK----------EIKPIENSIVKEIIVKEGESVRKGDVLLKL 127


54  SC3321  SC3329  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC3321    -1      13       -3.880224   50S ribosomal protein L11 methyltransferase
SC3322    -1      19       -6.949612   tRNA-dihydrouridine synthase B
SC3323    -1      18       -5.576257   Fis family transcriptional regulator
SC3324    -1      18       -5.751871   methyltransferase
SC3325    -1      22       -5.184998   hypothetical protein
SC3326    -1      22       -4.934267   DNA-binding transcriptional regulator EnvR
SC3327    1       21       -3.870324   hypothetical protein
SC3328    1       19       -2.898068   multidrug transporter acriflavin resistance
SC3329    2       24       -3.738404   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3323    DNABINDNGFIS  157    3e-54    DNA-binding protein FIS signature.
		>DNABINDNGFIS#DNA-binding protein FIS signature.

Length = 98

Score = 157 bits (399), Expect = 3e-54
Identities = 98/98 (100%), Positives = 98/98 (100%)

Query: 1 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ 60
MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ
Sbjct: 1 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ 60

Query: 61 PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98
PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN
Sbjct: 61 PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3326    HTHTETR  129    2e-39    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 129 bits (325), Expect = 2e-39
Identities = 83/216 (38%), Positives = 130/216 (60%), Gaps = 3/216 (1%)

Query: 1 MAKKTKADALKTRQHLIETAIAQFALRGVANTTLNDIADAADVTRGAIYWHFENKTQLFN 60
MA+KTK +A +TRQH+++ A+ F+ +GV++T+L +IA AA VTRGAIYWHF++K+ LF+
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EVW-LQQPPLRELIQDRLTGCWNDNPLQDLREKFIAALQYIAAVPRQQALMQILYHKCEF 119
E+W L + + EL + +PL LRE I L+ R++ LM+I++HKCEF
Sbjct: 61 EIWELSESNIGELELEYQAKF-PGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 120 HNGM-ISEQAIREKIGFHHQSLLEVLQRCMDKKLISGSLDLDVILIILHGSFSGIVKNWL 178
M + +QA R + + + L+ C++ K++ L II+ G SG+++NWL
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWL 179

Query: 179 MNPTSYDLYKQAPALVDNLLKMLSPDGSVRQLMPNE 214
P S+DL K+A V LL+M ++R NE
Sbjct: 180 FAPQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3328    ACRIFLAVINRP  1386   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1386 bits (3590), Expect = 0.0
Identities = 914/1032 (88%), Positives = 972/1032 (94%)

Query: 1 MANFFIRRPIFAWVLAIILMMAGALAIMQLPVAQYPTIAPPAVSISATYPGADAQTVQDT 60
MANFFIRRPIFAWVLAIILMMAGALAI+QLPVAQYPTIAPPAVS+SA YPGADAQTVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120
VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGISVEKSSSSFLMVAGFVSDNPNTTQDDISDYVASNIKDSISRLNGVGDVQLFGA 180
EVQQQGISVEKSSSS+LMVAGFVSDNP TTQDDISDYVASN+KD++SRLNGVGDVQLFGA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWLDANLLNKYQLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRL 240
QYAMRIWLDA+LLNKY+LTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 KDPEEFGKVTLRVNTDGSVVHLKDVARIELGGENYNVVARINGKPASGLGIKLATGANAL 300
K+PEEFGKVTLRVN+DGSVV LKDVAR+ELGGENYNV+ARINGKPA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTATAIKAKLAELQPFFPQGMKVVYPYDTTPFVKISIHEVVKTLFEAIILVFLVMYLFLQ 360
DTA AIKAKLAELQPFFPQGMKV+YPYDTTPFV++SIHEVVKTLFEAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NIRATLIPTIAVPVVLLGTFAVLAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N+RATLIPTIAVPVVLLGTFA+LAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 MEDNLSPREATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTRAIYRQFSITIVSAMAL 480
MED L P+EATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGST AIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATLLKPVSAEHHEKKSGFFGWFNTRFDHSVNHYTNSVSGIVRNTGRY 540
SVLVALILTPALCATLLKPVSAEHHE K GFFGWFNT FDHSVNHYTNSV I+ +TGRY
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 541 LIIYLLIVVGMAVLFLRLPTSFLPEEDQGVFLTMIQLPSGATQERTQKVLDQVTHYYLNN 600
L+IY LIV GM VLFLRLP+SFLPEEDQGVFLTMIQLP+GATQERTQKVLDQVT YYL N
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 601 EKANVESVFTVNGFSFSGQGQNSGMAFVSLKPWEERNGEENSVEAVIARATRAFSQIRDG 660
EKANVESVFTVNGFSFSGQ QN+GMAFVSLKPWEERNG+ENS EAVI RA +IRDG
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 661 LVFPFNMPAIVELGTATGFDFELIDQGGLGHDALTKARNQLLGMVAKHPDLLVRVRPNGL 720
V PFNMPAIVELGTATGFDFELIDQ GLGHDALT+ARNQLLGM A+HP LV VRPNGL
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 721 EDTPQFKLDVDQEKAQALGISLSDINETISAALGGYYVNDFIDRGRVKKVYVQADAQFRM 780
EDT QFKL+VDQEKAQALG+SLSDIN+TIS ALGG YVNDFIDRGRVKK+YVQADA+FRM
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 781 LPGDINNLYVRSANGEMVPFSTFSSARWIYGSPRLERYNGMPSMELLGEAAPGRSTGEAM 840
LP D++ LYVRSANGEMVPFS F+++ W+YGSPRLERYNG+PSME+ GEAAPG S+G+AM
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 841 SLMENLASQLPNGIGYDWTGMSYQERLSGNQEPALYAISLIVVFLCLAALYESWSIPFSV 900
+LMENLAS+LP GIGYDWTGMSYQERLSGNQ PAL AIS +VVFLCLAALYESWSIP SV
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 901 MLVVPLGVVGALLAASLRGLNNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMEKEGRGLI 960
MLVVPLG+VG LLAA+L NDVYF VGLLTTIGLSAKNAILIVEFAKDLMEKEG+G++
Sbjct: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960

Query: 961 EATLEASRMRLRPILMTSLAFILGVMPLVISRGAGSGAQNAVGTGVMGGMLTATLLAIFF 1020
EATL A RMRLRPILMTSLAFILGV+PL IS GAGSGAQNAVG GVMGGM++ATLLAIFF
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 1021 VPVFFVVVKRRF 1032
VPVFFVV++R F
Sbjct: 1021 VPVFFVVIRRCF 1032


55  SC3417  SC3422  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC3417    1       12       3.108066    hypothetical protein
SC3418    1       13       3.149073    3-dehydroquinate synthase
SC3419    2       16       3.739739    shikimate kinase I
SC3420    0       13       4.042674    porin
SC3421    0       15       3.655695    hypothetical protein
SC3422    0       14       3.062560    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC3417    IGASERPTASE  42     5e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 42.0 bits (98), Expect = 5e-06
Identities = 30/176 (17%), Positives = 56/176 (31%), Gaps = 5/176 (2%)

Query: 146 ANATQPAPGATSAEQTAGNTSQDISLPPISSTPTQGQSPVVADGQQRVEVQGDLNNALTQ 205
N Q + + + +PP + + VA+ ++ + N
Sbjct: 1000 PNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDAT 1059

Query: 206 NPEQMNNVAVN---STLPTEPATVAPVRNGSTTRQAAVSEPTERHTTRPERKQAVIKPKK 262
N S + T ++GS T++ +E E T E K V K
Sbjct: 1060 ETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKT 1119

Query: 263 PQTTAKTTTAEPKKPVAPVKRTEPAAPAATPKATTTTTAPQATASAAPVQTAKPAQ 318
+ T+ PK+ + + +P A A T + + T +PA+
Sbjct: 1120 QEVPKVTSQVSPKQEQS--ETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAK 1173


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3419    CARBMTKINASE  32     6e-04    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 32.5 bits (74), Expect = 6e-04
Identities = 27/91 (29%), Positives = 40/91 (43%), Gaps = 18/91 (19%)

Query: 32 FYDSDQEIEKRTGADVGWVFDVEGEDGFRN----------REEKVINELTEKQGIVLATG 81
FYD + KR + GW+ + G+R E + I +L E+ IV+A+G
Sbjct: 136 FYDEETA--KRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKLVERGVIVIASG 193

Query: 82 GGSVKSRETRNRLSARGVVVYLETTIEKQLA 112
GG V + +GV E I+K LA
Sbjct: 194 GGGVPVILEDGEI--KGV----EAVIDKDLA 218


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3420    TYPE3OMGPROT  269    2e-86    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein

family signature.
Length = 607

Score = 269 bits (688), Expect = 2e-86
Identities = 82/301 (27%), Positives = 134/301 (44%), Gaps = 18/301 (5%)

Query: 117 LENRSINLQYADAGELAKAGEKLLSAKGTIMVDKRTNRLLLRDNRAALAELEKWVSQMDL 176
L + +I D + +A SA+ + D N +++RD+ + ++ + +D
Sbjct: 219 LSDATIQQVTVDNQRIPQAAT-RASAQARVEADPSLNAIIVRDSPERMPMYQRLIHALDK 277

Query: 177 PVAQVELAAHIVTINEKSLRELGVKWTLADATQAGAVGDVATLSSDLSVAAATSRVGFNI 236
P A++E+A IV IN L ELGV W + T + T ++A+ G
Sbjct: 278 PSARIEVALSIVDINADQLTELGVDWRVGIRTGNNHQVVIKTTGDQSNIASN----GALG 333

Query: 237 GRINGRLLDL---ELSALEQKQQLDIIASPRLLASHLQPASIKQGSEIPYQVSSGESGAT 293
++ R LD ++ LE + +++ P LL A I SE Y +G+ A
Sbjct: 334 SLVDARGLDYLLARVNLLENEGSAQVVSRPTLLTQENAQAVIDH-SETYYVKVTGKEVA- 391

Query: 294 SVEFKEAVLG--MEVTPTVLQKG---RIRLKLHISQNVPGQVLQQADGEVLAIDKQEIET 348
E K G + +TP VL +G I L LHI +G + I + ++T
Sbjct: 392 --ELKGITYGTMLRMTPRVLTQGDKSEISLNLHIEDGNQKPNSSGIEG-IPTISRTVVDT 448

Query: 349 QVEVKSGETLALGGIFSRKNKSGSDSVPLLGDIPWLGQLFRHDGKEDERRELVVFITPRL 408
V G++L +GGI+ + VPLLGDIP++G LFR + R + I PR+
Sbjct: 449 VARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELTRRTVRLFIIEPRI 508

Query: 409 V 409
+
Sbjct: 509 I 509



Score = 29.5 bits (66), Expect = 0.032
Identities = 18/98 (18%), Positives = 34/98 (34%), Gaps = 4/98 (4%)

Query: 1 MKRWIAIILIALMPAAQAG----KAAKVTLVVDDVPVVQVLQALAEQERQNLVVSPDVSG 56
KR + L+ L + A V + +L +VVS ++
Sbjct: 9 FKRVLTGTLLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDFGANYDATVVVSDKIND 68

Query: 57 TLSLHLTDVPWKQALQTVVNSAGLVLRQEGNILHVHSQ 94
+S + LQ + + LV +GN+L++
Sbjct: 69 KVSGQFEHDNPQDFLQHIASLYNLVWYYDGNVLYIFKN 106


56  SC3433  SC3451  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC3433    0       23       3.509006    osmolarity sensor protein
SC3434    0       24       3.369063    osmolarity response regulator
SC3435    -1      21       2.863696    transcription elongation factor GreB
SC3436    -2      18       3.296235    RNase R
SC3437    -2      15       2.673466    ferrous iron transport protein A
SC3438    -2      14       2.663903    ferrous iron transport protein B
SC3439    -2      10       2.352913    hypothetical protein
SC3440    -2      12       3.145365    hypothetical protein
SC3441    -3      15       3.302433    carboxylesterase BioH
SC3442    -2      13       2.966197    gluconate periplasmic binding protein
SC3443    -2      13       2.437961    DNA uptake protein
SC3444    -2      14       3.206132    GntP family, high-affinity gluconate permease in
SC3445    -2      14       3.114015    4-alpha-glucanotransferase
SC3446    -1      17       3.345955    maltodextrin phosphorylase
SC3447    1       17       3.244571    transcriptional regulator MalT
SC3448    1       19       4.581672    hypothetical protein
SC3449    2       20       4.799231    RNA 3'-terminal-phosphate cyclase
SC3450    1       18       3.604123    hypothetical protein
SC3451    1       18       3.879590    ribonucleoprotein related-protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3433    PF06580  32     0.006    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.8 bits (72), Expect = 0.006
Identities = 27/188 (14%), Positives = 71/188 (37%), Gaps = 45/188 (23%)

Query: 270 INKDIEECNAIIEQFIDYLR------TGQEMPM--EMADLNSVL-------GEVIAAESG 314
I +D + ++ + +R +++ + E+ ++S L + +
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQ---- 241

Query: 315 YEREINTALQAGSIQVKMHPLSIKRAVANMVVNA--ARYGNGWIKVSSGTESHRAWFQVE 372
+E +IN A+ V++ P+ ++ V N + + G I + ++ +VE
Sbjct: 242 FENQINPAIM----DVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVE 297

Query: 373 DDGPGIKLEQRKHLFQPFVRGDSARSTSGTGLGLAIV-QRIIDNH--NGMLEIGTSERGG 429
+ G ++ TG GL V +R+ + +++ + ++G
Sbjct: 298 NTGSLALKNTKE----------------STGTGLQNVRERLQMLYGTEAQIKL-SEKQGK 340

Query: 430 LSIRAWLP 437
++ +P
Sbjct: 341 VNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC3434    HTHFIS  99     6e-26    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 98.8 bits (246), Expect = 6e-26
Identities = 39/136 (28%), Positives = 72/136 (52%), Gaps = 3/136 (2%)

Query: 11 KILVVDDDMRLRALLERYLTEQGFQVRSVANAEQMDRLLTRESFHLMVLDLMLPGEDGLS 70
ILV DDD +R +L + L+ G+ VR +NA + R + L+V D+++P E+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 71 ICRRLRSQSNPMPIIMVTAKGEEVDRIVGLEIGADDYIPKPFNPRELLARIRAVL---RR 127
+ R++ +P+++++A+ + I E GA DY+PKPF+ EL+ I L +R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 128 QANELPGAPSQEEAVI 143
+ ++L ++
Sbjct: 125 RPSKLEDDSQDGMPLV 140


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC3438    TCRTETOQM  42     9e-06    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 41.8 bits (98), Expect = 9e-06
Identities = 42/142 (29%), Positives = 66/142 (46%), Gaps = 30/142 (21%)

Query: 1 MKKLTIGLIGNPNSGKTTLFNQL---TGARQRVGNW-AGVTV------ERKEG---QFAT 47
MK + IG++ + ++GKTTL L +GA +G+ G T ER+ G Q
Sbjct: 1 MKIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGI 60

Query: 48 T-----DHQVTLVDLPGTYSLTTISSQTSLDEQIACHYILSGDADLLINVVDASNLE-RN 101
T + +V ++D PG + SL +L G A LLI+ D + R
Sbjct: 61 TSFQWENTKVNIIDTPG-HMDFLAEVYRSLS-------VLDG-AILLISAKDGVQAQTRI 111

Query: 102 LYLTLQLLELGIPCIVALNMLD 123
L+ L+ ++GIP I +N +D
Sbjct: 112 LFHALR--KMGIPTIFFINKID 131


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3443    PF06580  28     0.038    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 27.5 bits (61), Expect = 0.038
Identities = 12/26 (46%), Positives = 16/26 (61%), Gaps = 5/26 (19%)

Query: 96 AKMRKVADDAPLMERVEYALQSQINP 121
KM +A +A LM AL++QINP
Sbjct: 152 WKMASMAQEAQLM-----ALKAQINP 172


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC3447    RTXTOXIND  30     0.044    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 30.2 bits (68), Expect = 0.044
Identities = 18/115 (15%), Positives = 42/115 (36%), Gaps = 6/115 (5%)

Query: 534 QSEIQFAQGFLQAAWETQERAFQLIKEQHLEQLPMHEFLVRIRAQLL------WAWARLD 587
+++ Q L A Q R L + L +LP + Q + + +
Sbjct: 133 EADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIK 192

Query: 588 EAEASARSGIAVLSTFQPQQQLQCLTLLVQCSLARGDLDNARSQLNRLENLLGNG 642
E ++ ++ +++ + LT+L + + +S+L+ +LL
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQ 247


57  SC3487  SC3515  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC3487    -1      22       3.787758    hypothetical protein
SC3488    -1      24       4.101014    hypothetical protein
SC3489    -2      22       3.332869    leucine/isoleucine/valine transporter
SC3490    -2      17       2.298531    leucine/isoleucine/valine transporter
SC3491    -3      16       1.831315    leucine/isoleucine/valine transporter permease
SC3492    -1      15       0.808177    branched-chain amino acid transporter permease
SC3493    -1      16       1.333439    hypothetical protein
SC3494    -2      16       1.530631    hypothetical protein
SC3495    1       16       2.139863    hypothetical protein
SC3496    1       17       2.539677    Leu/Ile/Val/Thr-binding protein
SC3497    2       17       2.310076    RNA polymerase factor sigma-32
SC3498    2       16       2.338322    cell division protein FtsX
SC3499    2       15       2.119131    cell division protein FtsE
SC3500    2       14       3.985025    cell division protein FtsY
SC3501    1       15       4.035601    16S rRNA m(2)G966-methyltransferase
SC3502    1       16       3.960652    hypothetical protein
SC3503    2       14       3.657304    hypothetical protein
SC3504    1       15       3.593317    hypothetical protein
SC3505    1       15       3.917992    zinc/cadmium/mercury/lead-transporting ATPase
SC3506    0       14       1.531731    methyl-accepting transmembrane citrate/phenol
SC3507    2       16       1.697535    sulfur transfer protein SirA
SC3508    1       15       1.709710    hypothetical protein
SC3509    0       15       2.169660    hypothetical protein
SC3510    -2      14       3.220884    major facilitator superfamily transporter
SC3511    -2      14       3.408308    PerM family permease
SC3512    -2      15       4.089079    holo-(acyl carrier protein) synthase 2
SC3513    -2      14       3.343466    nickel responsive regulator
SC3514    -2      14       3.173602    ABC transporter ATP-binding protein
SC3515    -2      13       3.161690    multidrug ABC transporter permease/ATPase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC3500    IGASERPTASE  30     0.024    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.024
Identities = 15/114 (13%), Positives = 34/114 (29%), Gaps = 2/114 (1%)

Query: 17 DKEQKQEQTEEQQIVEEQRPVEPPVETAADVDAQTPAHSKAETEAFAEEVVDVTEKVQES 76
+++ K E + Q++ + V P E + V Q + + +E T ++
Sbjct: 1109 EEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADT 1168

Query: 77 EKP-QPVEPEPAAAIETAAPQIAVEREELPLPEEVKDEAISPEEWQAEAETVEV 129
E+P + + + PE P + +
Sbjct: 1169 EQPAKETSSNVEQPVTESTTVNTGNSVV-ENPENTTPATTQPTVNSESSNKPKN 1221


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SC3503    SHIGARICIN  27     0.027    Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 26.7 bits (59), Expect = 0.027
Identities = 6/29 (20%), Positives = 16/29 (55%)

Query: 7 FFIIIIALIVVAASFRFVQQRREKAANEA 35
+++I AA ++F++Q+ K ++
Sbjct: 173 ALMVLIQSTSEAARYKFIEQQIGKRVDKT 201


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3505    ACRIFLAVINRP  30     0.039    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 30.2 bits (68), Expect = 0.039
Identities = 17/78 (21%), Positives = 34/78 (43%), Gaps = 3/78 (3%)

Query: 336 AEERRAPIERFIDRFSRIYTPVIMVIALLVTLIPPLMFDGGWQEWIYKGLTLLLIGCPCA 395
E++ P E S+I ++ + +L + P+ F GG IY+ ++ ++ A
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVS---A 477

Query: 396 LVISTPAAITSGLAAAAR 413
+ +S A+ A A
Sbjct: 478 MALSVLVALILTPALCAT 495


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3507    PF01206  101    2e-32    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 101 bits (254), Expect = 2e-32
Identities = 28/72 (38%), Positives = 42/72 (58%)

Query: 9 DHTLDALGLRCPEPVMMVRKTVRNMQTGETLLIIADDPATTRDIPGFCTFMEHDLLAQET 68
D +LDA GL CP P++ +KT+ M GE L ++A DP + +D F H+LL Q+
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 69 EGLPYRYLLRKA 80
E Y + L++A
Sbjct: 65 EDGTYHFRLKRA 76


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3508    PF04183  28     0.035    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 27.9 bits (62), Expect = 0.035
Identities = 17/91 (18%), Positives = 28/91 (30%), Gaps = 14/91 (15%)

Query: 121 LGQILDVHVFNRLRQNRRWWLAPTASTLFGNISDTLAFFFIAFWRSPDAFMAEHWMEIAL 180
LG I + L+ + +TL + + AE W+
Sbjct: 347 LGVIWRENPCRWLKPDES---PVLMATLMECDENNQPL--AGAYIDRSGLDAETWLT--- 398

Query: 181 VDYCFKVLISIIFFLPMYGVLL-----NMLL 206
V++ + L YGV L N+ L
Sbjct: 399 -QLFRVVVVPLYHLLCRYGVALIAHGQNITL 428


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3510    TCRTETA  48     3e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 48.3 bits (115), Expect = 3e-08
Identities = 75/403 (18%), Positives = 137/403 (33%), Gaps = 42/403 (10%)

Query: 13 LRLNLRIVSIVMFNFASYLTIGLPLAVLPGYVHD--AMGFSAFWAGLIISLQYFATLLSR 70
++ N ++ I+ + IGL + VLPG + D G++++L
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 71 PHAGRYADVLGPKKIVVFGLCGCFLSGFGYLLADIASAWPMISLLLLGLGRVILGI-GQS 129
P G +D G + +++ L G + Y + A L +L +GR++ GI G +
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAG---AAVDYAIMATAPF-----LWVLYIGRIVAGITGAT 112

Query: 130 FAGTGSTLWGVGVVGSLHIGRVISWNGIVTYGAMAMGAPLGVLCYAWGGLQGLALTVMGV 189
A G+ + + R + M G LG L G
Sbjct: 113 GAVAGAYIADITDGDER--ARHFGFMSACFGFGMVAGPVLGGLM----GGFSPHAPFFAA 166

Query: 190 ALLAILLAL----------PRPSVKANKGKPLPFRAVLGRVWLYGMALALA-----SAGF 234
A L L L + P + + +A +A
Sbjct: 167 AALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVG 226

Query: 235 GVIATFITLFYDAK-GWDGAAFALTLFSVAFVGT---RLLFPNGINRLGGLNVAMICFGV 290
V A +F + + WD ++L + + + ++ RLG M+
Sbjct: 227 QVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIA 286

Query: 291 EIIGLLLVGTAAMPWMAKIGVLLTGMGFSLVFPALGVVAVKAVPPQNQGAALATYTVFMD 350
+ G +L+ A WMA ++L + PAL + + V + QG +
Sbjct: 287 DGTGYILLAFATRGWMAFPIMVLLA-SGGIGMPALQAMLSRQVDEERQGQLQGSLAALTS 345

Query: 351 MSLGVTGPLAGLVMTWAGVPV----IYLAAAGLVAMALLLTWR 389
++ + GPL + A + ++A A L + L R
Sbjct: 346 LT-SIVGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3512    ENTSNTHTASED  34     2e-04    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 34.2 bits (78), Expect = 2e-04
Identities = 30/116 (25%), Positives = 54/116 (46%), Gaps = 9/116 (7%)

Query: 30 RRASWLAGRVLLSRALSPL---PEMVYGEQGKPAFSAGTPLWFNLSHSGDTIALLLSDEG 86
R+A LAGR+ AL + G++ +P + G L+ ++SH T ++S +
Sbjct: 46 RKAEHLAGRIAAVHALREVGVRTVPGMGDKRQPLWPDG--LFGSISHCATTALAVISRQ- 102

Query: 87 EVGCDIEVIRPRDNWRSLANAVFSLGEHAEMEAERPERQLADFWRI-WTRKEAIVK 141
+G DIE I + LA ++ E ++A LA + ++ KE++ K
Sbjct: 103 RIGIDIEKIMSQHTATELAPSIIDSDERQILQASLLPFPLAL--TLAFSAKESVYK 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3514    ABC2TRNSPORT  48     2e-08    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein

signature.
Length = 262

Score = 48.0 bits (114), Expect = 2e-08
Identities = 43/171 (25%), Positives = 73/171 (42%), Gaps = 7/171 (4%)

Query: 200 REREHGTVEHLLVMPVTPFEIMMAKV-WSMGLVVLVVSGLSLMLMVKGVLGVPIEGSIPL 258
R T E +L + +I++ ++ W+ L +G + +V LG + L
Sbjct: 93 RMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAG---IGVVAAALGY-TQWLSLL 148

Query: 259 FMLGV-ALSLFATTSIGIFMGTIARSMPQLGLLMILVLLPLQMLSGGSTPRESMPQAVQD 317
+ L V AL+ A S+G+ + +A S LV+ P+ LSG P + +P Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 318 IMLTMPTTHFVSLAQAILYRGAGLSIVWPQFLTLLAIGGVFFL-IALLRFR 367
+P +H + L + I+ + + + I FFL ALLR R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRRR 259


58SC3524SC3532Y        Y        NPathogenicity Island (biased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
SC3524-115-3.461583phosphatase
SC3525020-6.697479hypothetical protein
SC3526230-9.038818glutathione reductase
SC3527337-10.453781alpha-ketoglutarate permease
SC3528031-9.069772alpha-ketoglutarate permease
SC3529029-8.467832hypothetical protein
SC3530121-6.164043protein rpiR
SC3531118-4.517667ribokinase
SC3532213-1.951028L-asparaginase
ORFs having significant similarity with Known Virulence factors
LocusTagHitsScoreE-valueComments
SC3527TCRTETB348e-04 Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.1 bits (78), Expect = 8e-04
Identities = 65/374 (17%), Positives = 127/374 (33%), Gaps = 57/374 (15%)

Query: 2 SVAQASYLITAYGITVTLAAWVTGVLVQTLGPRKVMFCGLVAFIIGS-IGFIGIGLKNMD 60
A +++ TA+ +T ++ V G L LG ++++ G++ GS IGF+G +
Sbjct: 47 PPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVG----HSF 102

Query: 61 LVWMLPFYAIRGIGYPLFAYSFLIWINYSTPVARRSTAVGWFWFTFSLGLSVIGPFFSSI 120
++ I+G G F ++ + P R A G ++G +GP +
Sbjct: 103 FSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEG-VGPAIGGM 161

Query: 121 ALPVLGEIHVLWVGLLFVLIGSILGIWVNRDVVPASEIHP-------------------- 160
+ W LL + + +I+ + ++
Sbjct: 162 IAHYIH-----WSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFML 216

Query: 161 --------------FSAGELLKGITILQRPIIAIGL-------VVKSVNGIAQYGLATFL 199
S +K I + P + GL + GI +A F+
Sbjct: 217 FTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFV 276

Query: 200 PL--YLISYGYSKTEWLHMWSSVFLVAIFANLFFGFFGDKFGWRKTIMWVGGFGYAVVLL 257
+ Y++ + + + S + + + FG+ G R+ ++V G + +
Sbjct: 277 SMVPYMMKDVHQLSTAE-IGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSV 335

Query: 258 LVWAVPQLLGHNFYVMAF-VLCLCGVTMAGYVPLSALFPM-LAPDSKGAAMSVLNLGAGL 315
LL + M ++ + G +S + L GA MS+LN + L
Sbjct: 336 SFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFL 395

Query: 316 GAFIAPAITALFYS 329
AI S
Sbjct: 396 SEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC3529  VACCYTOTOXIN  28  0.049  Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 28.5 bits (63), Expect = 0.049
Identities = 23/71 (32%), Positives = 33/71 (46%), Gaps = 9/71 (12%)

Query: 7 INSFRSRPELFNADNKMDTC-----DLITRMGNVDGLTHIELNYPDHF---IGQDKKIIK 58
I F+ R L+N +N+MD C D I G G +N P+++ G+ K I
Sbjct: 755 IEQFKERLALYNNNNRMDICVVRNTDDIKACGTAIG-NQSMVNNPENYKYLEGKAWKNIG 813

Query: 59 QCITDNGLKVS 69
T NG K+S
Sbjct: 814 ISKTANGSKIS 824


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC3532  SUBTILISIN  28  0.043  Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 28.3 bits (63), Expect = 0.043
Identities = 14/49 (28%), Positives = 19/49 (38%), Gaps = 3/49 (6%)

Query: 254 YDAAIAHHADGIIYAGTGAGSVSVRSDAGIKKAEKAGIIVVRASRTGNG 302
AI D II G +KKA + I+V+ A+ GN
Sbjct: 133 IYYAIEQKVD-IISMSLGGPEDVPELHEAVKKAVASQILVMCAA--GNE 178
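
Each hit report above carries its bit score and E-value on a "Score = ..., Expect = ..." line, and the input parameters at the top of the report list an RPSBlast E-value threshold of 0.05. A minimal Python sketch of reading those lines and checking them against that cutoff is given here. The function names, the regular expression, and the use of <= for the comparison are illustrative assumptions, not part of PredictBias itself.

```python
import re

# Matches lines such as "Score = 28.3 bits (63), Expect = 0.043".
HIT_LINE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*([\d.eE+-]+)"
)

# Three Score/Expect lines copied from the hit reports in this section.
SAMPLE = """\
Score = 34.1 bits (78), Expect = 8e-04
Score = 28.5 bits (63), Expect = 0.049
Score = 28.3 bits (63), Expect = 0.043
"""


def parse_hits(report_text):
    """Return (bit_score, raw_score, e_value) for every Score/Expect line."""
    hits = []
    for match in HIT_LINE.finditer(report_text):
        bits, raw, expect = match.groups()
        hits.append((float(bits), int(raw), float(expect)))
    return hits


def passes_cutoff(e_value, cutoff=0.05):
    """Assumed rule: keep a hit when its E-value does not exceed the cutoff."""
    return e_value <= cutoff


if __name__ == "__main__":
    for bits, raw, e_value in parse_hits(SAMPLE):
        print(bits, raw, e_value, passes_cutoff(e_value))
```

The cutoff is a keyword argument so the same check can be rerun with a stricter threshold without touching the parser.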


59  SC3545  SC3550  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
SC3545  -1  14  3.051411  diguanylate phosphodiesterase
SC3546  -1  14  3.272666  ketodeoxygluconokinase
SC3547  -2  13  3.369713  Zn-dependent peptidase
SC3548  -2  14  3.833889  C4-dicarboxylate transporter DctA
SC3549  -1  13  4.476846  phosphodiesterase
SC3550  0  14  5.465676  cellulose synthase subunit BcsC
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC3545  SALSPVBPROT  29  0.016  Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 29.3 bits (65), Expect = 0.016
Identities = 44/160 (27%), Positives = 63/160 (39%), Gaps = 30/160 (18%)

Query: 93 DFFTRHHLLASVNVDGPTLIAMRRQPDILAAMERLPWLRFELV----EHIRLPKDSSFAS 148
DF+ H +++ G T A R D AA WL E V EHI ++
Sbjct: 157 DFWLLHDSNGILHLLGKT--AAARLSDPQAASHTAQWLVEESVTPAGEHI------YYSY 208

Query: 149 MCEFGPLWLDDFGTGMANFSA---LSEVRYDYIKVALELFVMLRQSAEGRNLFTLLLQLM 205
+ E G + + SA LS+V+Y A +L++ + + LFTL+
Sbjct: 209 LAENGDNVDLNGNEAGRDRSAMRYLSKVQYGNATPAADLYLWTSATPAVQWLFTLVFDYG 268

Query: 206 NRYCRGVIVEGVETLEEWRDVQRSPAFAAQGYFLSRPVPL 245
R GV D Q PAF AQ +L+R P
Sbjct: 269 ER--------GV-------DPQVPPAFTAQNSWLARQDPF 293


60  SC3567  SC3618  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
SC3567  2  15  -4.291445  *phosphoethanolamine transferase
SC3568  4  18  -4.511127  long polar fimbrial minor protein
SC3569  3  17  -4.303614  long polar fimbrial operon protein
SC3570  2  17  -3.175989  long polar fimbrial chaperone
SC3571  0  11  1.409489  long polar fimbria
SC3572  0  11  2.924203  hypothetical protein
SC3573  0  11  3.038293  hypothetical protein
SC3574  -1  12  3.154385  3-methyladenine DNA glycosylase
SC3575  0  13  3.063224  hypothetical protein
SC3576  0  13  3.108276  biotin sulfoxide reductase
SC3577  0  14  1.063375  outer membrane lipoprotein
SC3578  -1  15  -0.415159  2-hydroxyacid dehydrogenase
SC3579  2  20  -2.397691  hypothetical protein
SC3580  6  27  -3.951165  transcriptional regulator
SC3581  7  30  -3.856323  cold-shock protein
SC3582  8  31  -5.037975  hypothetical protein
SC3583  9  36  -6.161373  hypothetical protein
SC3584  8  37  -7.387775  integrase
SC3585  5  22  -3.044513  acetyltransferase
SC3586  4  19  -1.969757  hypothetical protein
SC3587  4  19  -1.924904  acetyltransferase
SC3588  1  17  -1.658515  transposase insK for insertion sequence e
SC3589  2  16  -1.222812  transposase of Tn10
SC3590  -1  16  1.553460  glycyl-tRNA synthetase subunit beta
SC3591  -2  13  0.396696  glycyl-tRNA synthetase subunit alpha
SC3592  0  11  -0.141086  outer membrane lipoprotein
SC3593  0  11  0.004098  inner membrane protein
SC3594  0  11  1.528506  hypothetical protein
SC3595  1  11  2.092055  xylulokinase
SC3596  1  13  2.134811  xylose isomerase
SC3597  0  16  3.068450  xylose operon regulatory protein
SC3598  -1  15  3.660780  hypothetical protein
SC3599  -2  16  3.802498  alpha-amylase
SC3600  -2  16  3.528095  L-xylulose kinase, cryptic
SC3601  -1  13  3.354082  3-keto-L-gulonate-6-phosphate decarboxylase
SC3602  -1  13  3.552141  L-xylulose 5-phosphate 3-epimerase
SC3603  -1  13  3.313434  AraC family transcriptional regulator
SC3604  -1  13  3.148761  aldehyde dehydrogenase
SC3605  -1  15  3.141098  transcriptional regulator
SC3606  -2  16  2.875613  selenocysteinyl-tRNA-specific translation
SC3607  -3  19  2.359021  selenocysteine synthase
SC3608  -2  22  0.854245  glutathione S-transferase
SC3609  -2  18  -1.591674  PTS family, mannitol-specific enzyme IIABC
SC3610  1  19  -1.708477  mannitol-1-phosphate 5-dehydrogenase
SC3611  1  18  -0.911289  mannitol repressor protein
SC3612  2  19  -0.574777  hypothetical protein
SC3613  1  18  0.470763  hypothetical protein
SC3614  0  18  0.718144  inner membrane lipoprotein
SC3615  1  16  2.005349  inner membrane protein
SC3616  -2  14  3.492741  L-lactate permease
SC3617  -2  14  3.162800  DNA-binding transcriptional repressor LldR
SC3618  -1  14  3.285552  L-lactate dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC3575  SACTRNSFRASE  34  8e-05  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 34.1 bits (78), Expect = 8e-05
Identities = 20/52 (38%), Positives = 26/52 (50%), Gaps = 5/52 (9%)

Query: 76 VAPDALRHGIGKALL----EYVQQR-FPLLSLEVYQKNQSAVNFYHALGFRI 122
VA D + G+G ALL E+ ++ F L LE N SA +FY F I
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFII 148


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC3577  OMPADOMAIN  116  1e-33  OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 116 bits (293), Expect = 1e-33
Identities = 43/124 (34%), Positives = 64/124 (51%), Gaps = 11/124 (8%)

Query: 108 LNMPNNVTFDSSSATLKPAGANTLTGVAMVLKEY--PKTAVNVVGYTDSTGSHDLNMRLS 165
+ ++V F+ + ATLKP G L + L +V V+GYTD GS N LS
Sbjct: 215 FTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLS 274

Query: 166 QQRADSVASSLITQGVDASRIRTSGMGPANPIASNSTAEGK---------AQNRRVEITL 216
++RA SV LI++G+ A +I GMG +NP+ N+ K A +RRVEI +
Sbjct: 275 ERRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334

Query: 217 SPLQ 220
++
Sbjct: 335 KGIK 338


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC3587  SACTRNSFRASE  44  4e-08  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 43.8 bits (103), Expect = 4e-08
Identities = 26/121 (21%), Positives = 50/121 (41%), Gaps = 9/121 (7%)

Query: 18 FFSSVHTIASHYYTREQIDAWAPADIDLERWANHIKELQPFVVELDGEIAGYADFQPN-- 75
F + V T +++ + D+D+ K F+ L+ G + N
Sbjct: 30 FENGVWTYTEERFSKPYFKQYEDDDMDVSYVEEEGKAA--FLYYLENNCIGRIKIRSNWN 87

Query: 76 --GYIDHFFVSGTYSRQGVGILLMNCIHEEARQRGISEL---TSNVSKAAEVFFLRHGFH 130
I+ V+ Y ++GVG L++ E A++ L T +++ +A F+ +H F
Sbjct: 88 GYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFI 147

Query: 131 I 131
I
Sbjct: 148 I 148


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC3598  FLGFLGJ  42  1e-06  Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 42.0 bits (98), Expect = 1e-06
Identities = 30/105 (28%), Positives = 47/105 (44%), Gaps = 17/105 (16%)

Query: 134 TRRIPWNTLLERVDIIPTSMVATMAAAESGWGTSKLARSN----NNLFGMKCT---KGRC 186
+ L + +P ++ AA ESGWG ++ R N NLFG+K + KG
Sbjct: 154 AQLSLPAQLASQQSGVPHHLILAQAALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPV 213

Query: 187 T---------NTPGKVKG-YSQFASVEESVSAYVANLNTHPAYSS 221
T KVK + ++S E++S YV L +P Y++
Sbjct: 214 TEITTTEYENGEAKKVKAKFRVYSSYLEALSDYVGLLTRNPRYAA 258


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC3600  TYPE4SSCAGA  27  0.009  Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 26.6 bits (58), Expect = 0.009
Identities = 11/29 (37%), Positives = 18/29 (62%)

Query: 7 FGAALAARVGTGVYRDFREAQRDLQHPVR 35
F A+A TG Y + ++AQ+DL+ +R
Sbjct: 591 FNKAVADAKNTGNYDEVKKAQKDLEKSLR 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC3606  TCRTETOQM  53  2e-09  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 52.9 bits (127), Expect = 2e-09
Identities = 35/106 (33%), Positives = 53/106 (50%), Gaps = 16/106 (15%)

Query: 3 IATAGHVDHGKTTLLQAI---TGV------------NADRLPEEKKRGMTIDLGYAYWPQ 47
I HVD GKTTL +++ +G D E++RG+TI G +
Sbjct: 6 IGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQW 65

Query: 48 PDGRVLGFIDVPGHEKFLSNMLAGVGGIDHALLVVACDDGVMAQTR 93
+ +V ID PGH FL+ + + +D A+L+++ DGV AQTR
Sbjct: 66 ENTKV-NIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTR 110


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC3615  PF03895  72  1e-17  Serum resistance protein DsrA.
		>PF03895#Serum resistance protein DsrA.

Length = 79

Score = 72.2 bits (177), Expect = 1e-17
Identities = 18/80 (22%), Positives = 38/80 (47%), Gaps = 2/80 (2%)

Query: 1368 VENKMSGGIASAMAMAGLPQAYAPGANMTSIAGGTFNGESAVAIGV-SMVSESGGWVYKL 1426
+ ++ G+A+ A++ L Q G S A G + ++A+AIGV S +++ +
Sbjct: 1 LSKELQTGLANQSALSMLVQPNGVGKTSVSAAVGGYRDKTALAIGVGSRITDRFTAKAGV 60

Query: 1427 QGTSNSQGDYSAAIGAGFQW 1446
+ + G S G+++
Sbjct: 61 AFNTYN-GGMSYGASVGYEF 79


61  SC3632  SC3643  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
SC3632  1  11  -3.394304  2-amino-3-ketobutyrate CoA ligase
SC3633  2  17  -6.197838  ADP-L-glycero-D-manno-heptose-6-epimerase
SC3634  3  23  -8.360276  ADP-heptose--LPS heptosyltransferase
SC3635  4  35  -12.539347  ADP-heptose--LPS heptosyltransferase
SC3636  6  42  -15.260488  hypothetical protein
SC3637  4  44  -15.447087  hexose transferase, lipopolysaccharide core
SC3638  5  45  -16.549948  lipopolysaccharide core biosynthesis protein
SC3639  3  41  -15.156213  lipopolysaccharide core biosynthesis protein
SC3640  2  38  -12.403884  UDP-D-glucose:(galactosyl)lipopolysaccharide
SC3641  0  32  -9.914080  UDP-D-galactose:(glucosyl)lipopolysaccharide-
SC3642  -2  23  -6.321893  UDP-D-galactose:(glucosyl)lipopolysaccharide-1,
SC3643  -3  18  -4.107336  inner membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC3633  NUCEPIMERASE  100  2e-26  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 100 bits (250), Expect = 2e-26
Identities = 75/348 (21%), Positives = 124/348 (35%), Gaps = 67/348 (19%)

Query: 2 IIVTGGAGFIGSNIVKALNDKGITDILVVDNLKD--------------GTKFVNLVDLNI 47
+VTG AGFIG ++ K L + G ++ +DNL D +++
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 48 ADYMDKEDFLIQIMSGEELGDIEAIFHEGACSSTTEWDGKYMMDNNYQYSK-------EL 100
AD + + + G E +F + +Y ++N + Y+ +
Sbjct: 62 ADR----EGMTDLF---ASGHFERVFISPHRLAV-----RYSLENPHAYADSNLTGFLNI 109

Query: 101 LHYCLERGIP-FLYASSAATYGGRTSD-FIESREYEKPLNVYGYSKFLFDEYVRQILPEA 158
L C I LYASS++ YG F + P+++Y +K +
Sbjct: 110 LEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLY 169

Query: 159 NSQIVGFRYFNVYGPREGHKGSMASVAFHLNTQLNNGESPKLFEGSENFKRDFVYVGDVA 218
G R+F VYGP + MA F + G+S ++ KRDF Y+ D+A
Sbjct: 170 GLPATGLRFFTVYGPWG--RPDMA--LFKFTKAMLEGKSIDVY-NYGKMKRDFTYIDDIA 224

Query: 219 AVNL------------WFLESGKSG-------IFNLGTGRAESFQAVADATLAY-HKKGS 258
+ W +E+G ++N+G A +
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAK 284

Query: 259 IEYIPFPDKLKGRYQAFTQADLTNLRNA-GYDKPFKTVAEGVTEYMAW 305
+P G T AD L G+ P TV +GV ++ W
Sbjct: 285 KNMLPLQ---PGDVL-ETSADTKALYEVIGF-TPETTVKDGVKNFVNW 327


62  SC3667  SC3678  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
SC3667  -2  10  3.419109  tRNA guanosine-2'-O-methyltransferase
SC3668  -1  14  3.453307  ATP-dependent DNA helicase RecG
SC3669  -1  15  2.883426  hypothetical protein
SC3670  0  15  2.596731  GltS family glutamate transport protein
SC3671  0  15  2.241481  NCS2 family, purine/xanthine transport protein
SC3672  -1  13  0.275657  hypothetical protein
SC3673  2  13  -0.405723  alpha-xylosidase
SC3674  4  14  -2.369525  transporter
SC3675  4  20  -3.998577  transposase
SC3676  4  20  -4.420943  *transposase
SC3677  4  19  -3.882994  hypothetical protein
SC3678  5  16  -2.440766  autotransported protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC3668  SECA  41  2e-05  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 41.0 bits (96), Expect = 2e-05
Identities = 27/79 (34%), Positives = 38/79 (48%), Gaps = 7/79 (8%)

Query: 291 MRLVQGDV-----GSGKTLVAALAA-LRAIAHGKQVALMAPTELLAEQHANNFRNWFEPL 344
M L + + G GKTL A L A L A+ GK V ++ + LA++ A N R FE L
Sbjct: 92 MVLNERCIAEMRTGEGKTLTATLPAYLNALT-GKGVHVVTVNDYLAQRDAENNRPLFEFL 150

Query: 345 GVEVGWLAGKQKGKARQAQ 363
G+ VG A++
Sbjct: 151 GLTVGINLPGMPAPAKREA 169


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC3678  PERTACTIN  119  1e-29  Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 119 bits (300), Expect = 1e-29
Identities = 162/715 (22%), Positives = 278/715 (38%), Gaps = 94/715 (13%)

Query: 235 TGDSSEGLRTGQSGSLIRLGDDATIETSGASSTGIYAASSSRTELGNNATITVNGASAHA 294
TG + G+ G+++ L ATI A + G + +
Sbjct: 236 TGGRAAGV-AAMDGAIVHL-QRATIRRGDAPAGGAVPGGAVPGGAVPGG-FGPLLDGWYG 292

Query: 295 VYATNATVNLGENATISVNSASKAASYSKAPAGLYALSRGAINLAGGAAITMAGDNSSES 354
V +++TV+L A V + A+ +S G+++ G I G
Sbjct: 293 VDVSDSTVDL---AQSIVEAPQLGAAIRAGRGARVTVSGGSLSAPHGNVIETGGGARRFP 349

Query: 355 YAISTETGGIVDGS--SGGRFVIDGDIRAAGATAASGTLPQ--------------QNSTI 398
S + + G+ G + T A G Q + +
Sbjct: 350 PPASPLSITLQAGARAQGRALLYRVLPEPVKLTLAGGAQGQGDIVATELPPIPGASSGPL 409

Query: 399 KLNMTDNSRWDGASYITSATAGTGVISVQMSDATWNMTSSSTLTDLTLNSGATINFSH-- 456
+ + +RW GA+ V S+ + +ATW MT +S + L L S +++F
Sbjct: 410 DVALASQARWTGATRA--------VDSLSIDNATWVMTDNSNVGALRLASDGSVDFQQPA 461

Query: 457 EDGEPWQTLTINEDYVGNGGKLVFNTVLNDDDSETDRLQVLGNTSGNTFVAVNNIGGAGA 516
E G ++ L ++ G+G +F + D +D+L V+ + SG + V N G A
Sbjct: 462 EAGR-FKVLMVDT-LAGSG---LFRMNVFADLGLSDKLVVMRDASGQHRLWVRNSGSEPA 516

Query: 517 QTIEGIEIVNVAGNSNGTFEKASR---IVAGAYDYNVVQKGKNWYLTSYIEPDEPIIPDP 573
+ + +V S TF A++ + G Y Y + G + S + P P P
Sbjct: 517 -SGNTMLLVQTPRGSAATFTLANKDGKVDIGTYRYRLAANGNGQW--SLVGAKAPPAPKP 573

Query: 574 VDPVIPDPVDPDPVDPVIPDPVIPDPVDPDPVDPEPVDPVIPDPVIPDIGQSDTPPITEH 633
P P P P P P P P P +P P P ++ + +
Sbjct: 574 APQPGPQPGPQPPQPPQPPQP----PQPPQPPQRQPEAPAPQPPAGRELSAAANAAVNTG 629

Query: 634 QFRPEVGSYLANNYAANTLFMTRLHDRLGETQYTDMLTGEKKVTSLWMRNVGAHTRFNDG 693
+ A + A L RLGE + G W R + ++
Sbjct: 630 GVGLASTLWYAESNA--------LSKRLGELRLNPDAGG------AWGRGFAQRQQLDNR 675

Query: 694 SGQLKTRINSYV--LQLGGDLAQWSTDGLDRWHIGAMAGYANSQNRTQSSVSDYHSRGQV 751
+G+ R + V +LG D A + G RWH+G +AGY + D G
Sbjct: 676 AGR---RFDQKVAGFELGADHA-VAVAG-GRWHLGGLAGYTRGD---RGFTGD--GGGHT 725

Query: 752 TGYSVGLYGTWYANNIDRSGAYVDTWMLYNWFDN--KDMGQDQAA--EKYKSKGITASVE 807
VG Y T+ AN+ G Y+D + + +N K G D A KY++ G+ S+E
Sbjct: 726 DSVHVGGYATYIANS----GFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGVSLE 781

Query: 808 AGYSFRLGESAHQSYWLQPKAQVVWMGVQADDNREANGTLVKDDTAGNLLTRMGVKAYIN 867
AG F ++L+P+A++ V R ANG V+D+ ++L R+G++
Sbjct: 782 AGRRFAH----ADGWFLEPQAELAVFRVGGGAYRAANGLRVRDEGGSSVLGRLGLEV--- 834

Query: 868 GHNAIDNDKSREFQPFVEANWIHNTQPA-SVKMNDVS--SDMRGTKNIGELKVGI 919
I+ R+ QP+++A+ + A +V+ N ++ +++RGT+ EL +G+
Sbjct: 835 -GKRIELAGGRQVQPYIKASVLQEFDGAGTVRTNGIAHRTELRGTR--AELGLGM 886


63  SC3694  SC3703  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
SC3694  -1  20  -3.851646  inner membrane protein
SC3695  0  22  -4.831291  glycosyl hydrolase family protein
SC3696  3  35  -8.757928  hypothetical protein
SC3697  5  37  -7.131888  helix-turn-helix protein
SC3698  5  35  -7.074048  hypothetical protein
SC3699  5  33  -7.312504  hypothetical protein
SC3700  5  34  -7.415281  phosphotransferase system, HPr-related protein
SC3701  6  35  -7.840102  fructose-1,6-bisphosphate aldolase
SC3702  6  31  -6.749506  sugar (pentulose and hexulose) kinase
SC3703  3  17  -5.154466  PTS system galactitol-specific enzyme IIC
64  SC3731  SC3736  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
SC3731  1  33  7.579874  heat shock chaperone IbpB
SC3732  2  40  9.917409  heat shock protein IbpA
SC3733  1  30  7.452463  hypothetical protein
SC3734  2  26  6.625322  hypothetical protein
SC3735  1  24  5.630278  heme lyase disulfide oxidoreductase, cytocyhrome
SC3736  0  20  4.610091  cytochrome c-type biogenesis protein
65  SC3773  SC3782  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
SC3773  2  21  1.427996  dipeptide/oligopeptide/nickel ABC transporter
SC3774  2  33  1.215081  glucosamine--fructose-6-phosphate
SC3775  3  31  1.093094  bifunctional N-acetylglucosamine-1-phosphate
SC3776  5  37  1.186336  ATP synthase F0F1 subunit epsilon
SC3777  5  39  1.085822  ATP synthase F0F1 subunit beta
SC3778  5  32  0.323378  ATP synthase F0F1 subunit gamma
SC3779  6  33  0.213890  ATP synthase F0F1 subunit alpha
SC3780  5  24  -1.832754  ATP synthase F0F1 subunit delta
SC3781  4  20  -2.066420  ATP synthase F0F1 subunit B
SC3782  2  18  -0.282252  ATP synthase F0F1 subunit C
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC3781  PYOCINKILLER  27  0.043  Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 26.7 bits (58), Expect = 0.043
Identities = 15/42 (35%), Positives = 21/42 (50%)

Query: 70 AEAQVIIEQANKRRAQILDEAKTEAEQERTKIVAQAQAEIEA 111
A+A + ANK R Q EAK +AE++ + A A A
Sbjct: 210 AKASIEAAAANKAREQAAAEAKRKAEEQARQQAAIRAANTYA 251


66  SC3960  SC3965  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
SC3960  2  18  -1.218780  hypothetical protein
SC3961  2  17  -1.449321  mannose-6-phosphate isomerase
SC3962  2  18  -1.846882  autoinducer-2 (AI-2) kinase
SC3963  4  20  -3.435110  transcriptional repressor
SC3964  4  22  -3.274498  sugar ABC transporter membrane subunit
SC3965  1  20  -3.193359  sugar ABC transporter membrane subunit
67  SC4063  SC4076  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
SC4063  -2  21  3.408434  isocitrate lyase
SC4064  -2  21  3.372357  bifunctional isocitrate dehydrogenase
SC4065  -1  21  3.387627  hypothetical protein
SC4066  0  20  2.430030  IclR family transcriptional regulator
SC4067  1  21  2.012768  B12-dependent methionine synthase
SC4069  0  14  -0.727769  peptidase E
SC4070  0  14  -1.884659  hypothetical protein
SC4071  -1  18  -4.454721  hypothetical protein
SC4072  -1  20  -5.166323  23S rRNA pseudouridine synthase F
SC4073  0  25  -4.758607  hypothetical protein
SC4074  1  27  -6.071895  Na+-dependent transporter
SC4075  1  28  -5.620226  hypothetical protein
SC4076  1  26  -5.982011  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC4063  BINARYTOXINB  32  0.008  Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 31.6 bits (71), Expect = 0.008
Identities = 14/58 (24%), Positives = 23/58 (39%)

Query: 289 ETSTPDLELARRFADAIHAKYPGKLLAYNCSPSFNWQKNLDDKTIASFQQQLSDMGYK 346
ET+ PD+ L A P L Y + N D +T + + QL+++
Sbjct: 544 ETTKPDMTLKEALKIAFGFNEPNGNLQYQGKDITEFDFNFDQQTSQNIKNQLAELNAT 601


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC4067  BCTERIALGSPD  34  0.005  Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 33.7 bits (77), Expect = 0.005
Identities = 16/71 (22%), Positives = 31/71 (43%), Gaps = 13/71 (18%)

Query: 343 SGLEPLNIGDDSLFVNVGERTN---VTGSA----KFKRLIKEEKYSEALDVARQQVEGGA 395
+P+ D ++ + +TN VT + +R+I + LD+ R QV A
Sbjct: 298 QAAKPVAALDKNIIIKAHGQTNALIVTAAPDVMNDLERVIAQ------LDIRRPQVLVEA 351

Query: 396 QIIDINMDEGM 406
I ++ +G+
Sbjct: 352 IIAEVQDADGL 362


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC4071  TRNSINTIMINR  29  0.012  Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 28.9 bits (64), Expect = 0.012
Identities = 14/54 (25%), Positives = 31/54 (57%), Gaps = 2/54 (3%)

Query: 13 AGLVTSKKMAKVQRTAKKSRVQAREAREAVEENKKAQLERDKQLSEQQKQAVLA 66
+G + + ++ + AK++ AR+ +AVE N +AQ + Q + +Q++ L+
Sbjct: 310 SGELKDDIVEQIAQQAKEAGEVARQ--QAVESNAQAQQRYEDQHARRQEELQLS 361


68  SC4132  SC4147  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
SC4132  -1  16  -4.408198  outer membrane lipoprotein
SC4133  -1  18  -6.881625  excinuclease ABC subunit A
SC4134  5  42  -15.414481  single-stranded DNA-binding protein
SC4135  4  26  -7.611075  hypothetical protein
SC4136  3  27  -8.358173  hypothetical protein
SC4137  4  26  -7.937927  methyl-accepting chemotaxis protein
SC4138  3  25  -7.273708  ABC transporter outer membrane protein
SC4139  4  24  -6.644127  membrane permease, cation efflux pump
SC4140  4  23  -5.763773  inner membrane protein
SC4141  0  25  -8.121492  bacteriocin/lantibiotic ABC transporter
SC4142  -2  14  -0.733725  hypothetical protein
SC4143  -2  13  0.150973  diguanylate cyclase/phosphodiesterase
SC4144  -2  14  2.367851  DNA-binding transcriptional regulator SoxS
SC4145  -3  13  2.525729  redox-sensing transcriptional activator SoxR
SC4146  -3  14  2.968117  glutathione S-transferase
SC4147  -2  16  3.054135  xanthine/uracil permease family protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC4139  RTXTOXIND  266  8e-87  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 266 bits (682), Expect = 8e-87
Identities = 87/425 (20%), Positives = 175/425 (41%), Gaps = 25/425 (5%)

Query: 9 LMMIIISLTILIIILTYFIEINSVVHGQGVITTKDNAQLISLSKGGTIQDIYVAEGDTVK 68
+ I+ ++ IL+ ++ V G +T ++ I + +++I V EG++V+
Sbjct: 60 VAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVR 119

Query: 69 KGELLAKVVNLDLQKEYQRYRTQKGYLDKDVNEI-------SFILDKENESGLITLDGTR 121
KG++L K+ L E +TQ L + + S L+K E L +
Sbjct: 120 KGDVLLKLT--ALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQ 177

Query: 122 SLSNKEVKANIELVHSQIRA-------KELKKTSLDSEISGLQEKLSSKEKELALLAEEI 174
++S +EV L+ Q KEL +E + +++ E + +
Sbjct: 178 NVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRL 237

Query: 175 NILSPLVKKGISPYTNFLNKKQAYIKVKSEINDIESSITLKKDDIELVVNDIEALNNELR 234
+ S L+ K L ++ Y++ +E+ +S + + +I + + + +
Sbjct: 238 DDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFK 297

Query: 235 LSLSKIISKNLQELEVVNSTLKVIEKQINEEDIYSPVDGVIYKINKSATTHGGVIQAADL 294
+ + + + ++ L E++ I +PV + ++ T GGV+ A+
Sbjct: 298 NEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQL--KVHTEGGVVTTAET 355

Query: 295 LFEIKPKVRTMLADVKILPKYRDQIYVDEAVKLDVQSIIQPKIKSYNATIDNISPDSYEE 354
L I P+ T+ + K I V + + V++ + + NI+ D+ E+
Sbjct: 356 LMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIED 415

Query: 355 NTGGTIQRYYKVIIAFDVNE----DDLRWLKPGMTVDASVITGKHSIMEYLLSPLMKGVD 410
G + VII+ + N + L GM V A + TG S++ YLLSPL + V
Sbjct: 416 QRLGL---VFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRSVISYLLSPLEESVT 472

Query: 411 KAFSE 415
++ E
Sbjct: 473 ESLRE 477


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC4140  GPOSANCHOR  49  3e-07  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 49.3 bits (117), Expect = 3e-07
Identities = 49/190 (25%), Positives = 76/190 (40%), Gaps = 30/190 (15%)

Query: 96 DSAQVEKKGNGKRRNKKEEEELKKQLDDAENAKK--EADKAK-EEAEKAKEAAEKALNEA 152
A+ + + + L++ LD + AKK EA+ K EE K EA+ ++L
Sbjct: 293 LEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRD 352

Query: 153 FEVQNSSK-QIEEMLQNFLADNVAKDNLAQQSDASQQNTQA---KATQASKQNDAEKVLP 208
+ +K Q+E Q N + S+AS+Q+ + + +A KQ +
Sbjct: 353 LDASREAKKQLEAEHQKLEEQN-------KISEASRQSLRRDLDASREAKKQVEKALEEA 405

Query: 209 QPI-------NKNTSTGK--SNSSKNEEN-KLDAESVKEPLKVTLALAAES----NSGSK 254
NK K + K E KL+AE+ + LK LA AE +G
Sbjct: 406 NSKLAALEKLNKELEESKKLTEKEKAELQAKLEAEA--KALKEKLAKQAEELAKLRAGKA 463

Query: 255 DDSITNFTKP 264
DS T KP
Sbjct: 464 SDSQTPDAKP 473



Score = 48.1 bits (114), Expect = 9e-07
Identities = 35/136 (25%), Positives = 63/136 (46%), Gaps = 4/136 (2%)

Query: 98 AQVEKKGNGKRRNKKEEEELKKQLDDAENAKKEADKAKEEAEKAKEAAEKALNEAFEVQN 157
A+ +K + ++ + L++ LD + AKK+ +KA EEA A EK E E +
Sbjct: 365 AEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEESKK 424

Query: 158 SSKQIEEMLQNFL-ADNVA-KDNLAQQSD--ASQQNTQAKATQASKQNDAEKVLPQPINK 213
+++ + LQ L A+ A K+ LA+Q++ A + +A +Q K +P
Sbjct: 425 LTEKEKAELQAKLEAEAKALKEKLAKQAEELAKLRAGKASDSQTPDAKPGNKAVPGKGQA 484

Query: 214 NTSTGKSNSSKNEENK 229
+ K N +K +
Sbjct: 485 PQAGTKPNQNKAPMKE 500



Score = 43.1 bits (101), Expect = 3e-05
Identities = 17/115 (14%), Positives = 42/115 (36%), Gaps = 19/115 (16%)

Query: 101 EKKGNGKRRNKKEEEELKKQLDDAENAKKEAD-------KAKEEAEKAKEAAEKALNEAF 153
++ ++ + + + E + + + A ++L
Sbjct: 260 ARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQ-SQVLNANRQSLRRDL 318

Query: 154 EVQNSSK-QIEEMLQNFLADNVAKDNLAQQSDASQQNTQAK---ATQASKQNDAE 204
+ +K Q+E Q + + N + S+AS+Q+ + + +A KQ +AE
Sbjct: 319 DASREAKKQLEAEHQ-----KLEEQN--KISEASRQSLRRDLDASREAKKQLEAE 366


69  SC4186  SC4197  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
SC4186  2  22  -6.691089  hypothetical protein
SC4187  3  23  -8.233437  hypothetical protein
SC4188  5  33  -11.031619  hypothetical protein
SC4189  4  38  -14.117253  hypothetical protein
SC4190  5  46  -17.186017  LuxR family transcriptional regulator
SC4191  4  42  -15.747388  DNA-binding domain-containing protein
SC4192  5  40  -12.764835  hypothetical protein
SC4193  4  40  -12.523111  transposase insF
SC4194  6  45  -14.391532  hypothetical protein
SC4195  4  40  -10.918559  hypothetical protein
SC4196  1  19  -1.478015  hypothetical protein
SC4197  2  19  -1.230329  non-specific acid phosphatase
70  SC4230  SC4241  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
SC4230  -1  11  3.199480  arginine-binding periplasmic protein
SC4231  -1  13  3.631459  ***Fe-S protein
SC4232  -1  14  3.782887  hypothetical protein
SC4233  -3  15  2.589465  ATPase
SC4234  -2  13  2.970522  N-acetylmuramoyl-L-alanine amidase
SC4235  -1  16  2.350702  DNA mismatch repair protein
SC4236  1  18  1.231570  tRNA delta(2)-isopentenylpyrophosphate
SC4237  4  23  1.091278  RNA-binding protein Hfq
SC4238  3  21  0.953230  GTPase HflX
SC4239  3  20  1.354885  FtsH protease regulator HflK
SC4240  4  19  1.226684  FtsH protease regulator HflC
SC4241  2  16  0.184846  inner membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC4234  PF03544  31  0.007  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 31.1 bits (70), Expect = 0.007
Identities = 16/65 (24%), Positives = 26/65 (40%), Gaps = 7/65 (10%)

Query: 130 PPPPPPPVVAKRVESAPRPTEPARNPFKSSDDRLTGVTSSNTVTRPAARASAGAGDKVVI 189
P P P P K+VE R +P + S + + RP + + A K V
Sbjct: 99 PKPKPKPKPVKKVEQPKRDVKPVESRPASPFE-------NTAPARPTSSTATAATSKPVT 151

Query: 190 AIDAG 194
++ +G
Sbjct: 152 SVASG 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC4235  ALARACEMASE  30  0.028  Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 30.1 bits (68), Expect = 0.028
Identities = 26/161 (16%), Positives = 57/161 (35%), Gaps = 18/161 (11%)

Query: 31 VENSLDAGATRVDIDIER---GGAKLIR-IRDNGCGIKKEELALALARHATSKIASLDDL 86
++ SLD A + ++ I R A++ ++ N G E + A+ + +L++
Sbjct: 5 IQASLDLQALKQNLSIVRQAATHARVWSVVKANAYGHGIERIWSAIGATDGFALLNLEEA 64

Query: 87 EAIISLGFRGEAL----------ASISSVSRLTLTSRTAEQAEAWQAYAEGRDMDVTVK- 135
+ G++G L I RLT + Q +A Q +D+ +K
Sbjct: 65 ITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAPLDIYLKV 124

Query: 136 -PAAHPVGTTLEVLDLFYNTPARRKFMRTEK--TEFNHIDE 173
+ +G + + + + + F +
Sbjct: 125 NSGMNRLGFQPDRVLTVWQQLRAMANVGEMTLMSHFAEAEH 165


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC4238  SECA  33  0.002  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 33.3 bits (76), Expect = 0.002
Identities = 26/144 (18%), Positives = 55/144 (38%), Gaps = 6/144 (4%)

Query: 282 HVVDAADVRVQENIEAVNTVLEEIDAHEIPTLMVMNKIDMLDDFEPRIDRDEENK-PIRV 340
++D +DV N + IDA+ P + ++ + + R+ D + PI
Sbjct: 665 ELLDVSDVSETINSIREDVFKATIDAYIPPQSL--EEMWDIPGLQERLKNDFDLDLPIAE 722

Query: 341 WLSAQSGVGIPQLFQALTERLSGEVAQHTLRLPPQEGRLRSRFYQLQAIEKEWMEEDGSV 400
WL + + L + + + + + + R + LQ ++ W E ++
Sbjct: 723 WLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAAM 782

Query: 401 SLQVRMPIVDWRRLCKQEPALIEY 424
+R I R +++P EY
Sbjct: 783 D-YLRQGIH-LRGYAQKDP-KQEY 803


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC4240  PYOCINKILLER  29  0.030  Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 29.0 bits (64), Expect = 0.030
Identities = 18/65 (27%), Positives = 30/65 (46%), Gaps = 3/65 (4%)

Query: 225 NRMRAEREAVARRHRSQGQEEAEKLRAAADYEVTK---TLAEAERQGRIMRGEGDAEAAK 281
N+ R + A A+R + + +RAA Y + +A A +G I +G A A+
Sbjct: 220 NKAREQAAAEAKRKAEEQARQQAAIRAANTYAMPANGSVVATAAGRGLIQVAQGAASLAQ 279

Query: 282 LFADA 286
+DA
Sbjct: 280 AISDA 284


71  SC4325  SC4387  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
SC4325  -2  16  -5.114318  hypothetical protein
SC4326  -2  17  -1.228207  hydroxylase for synthesis of
SC4327  -1  20  -1.373013  hypothetical protein
SC4328  -1  20  -0.077113  inner membrane protein
SC4329  -1  22  1.054822  acetyltransferase
SC4330  -1  22  1.338215  inner membrane protein
SC4331  0  28  3.323758  valyl-tRNA synthetase
SC4332  -2  16  2.089834  DNA polymerase III subunit chi
SC4333  -3  12  1.471886  leucyl aminopeptidase
SC4334  -2  10  0.446527  hypothetical protein
SC4335  -2  8  -0.184640  permease
SC4336  -1  10  -1.222812  permease
SC4337  -1  13  -1.330729  L-idonate regulator
SC4338  0  18  -3.596415  GntP family, L-idonate transport protein
SC4339  -1  21  -5.355191  gluconate 5-dehydrogenase
SC4340  0  26  -7.854495  L-idonate 5-dehydrogenase
SC4341  1  32  -11.273937  D-gluconate kinase
SC4342  2  36  -12.426702  alcohol dehydrogenase
SC4343  8  50  -18.123576  *hypothetical protein
SC4344  5  45  -15.818288  hypothetical protein
SC4345  5  46  -16.617411  replication protein
SC4346  7  51  -17.664006  hypothetical protein
SC4347  7  55  -17.702150  hypothetical protein
SC4348  7  57  -18.245441  hypothetical protein
SC4349  6  52  -16.282486  hypothetical protein
SC4350  6  55  -16.801024  hypothetical protein
SC4351  4  44  -12.438850  hypothetical protein
SC4352  3  38  -10.337392  hypothetical protein
SC4353  1  25  -6.225843  hypothetical protein
SC4354  0  23  -5.802803  hypothetical protein
SC4355  0  26  -6.383906  hypothetical protein
SC4356  0  22  -4.170336  SAM-dependent methyltransferase
SC4357  1  26  -5.501288  hypothetical protein
SC4358  1  23  -4.551901  hypothetical protein
SC4359  2  22  -4.954188  hypothetical protein
SC4360  -1  16  -0.641597  hypothetical protein
SC4361  -1  14  -0.267119  hypothetical protein
SC4362  -2  14  1.356892  DNA-binding transcriptional repressor UxuR
SC4363  -1  16  1.498482  tryptophanyl-tRNA synthetase
SC4364  -1  14  1.867728  hypothetical protein
SC4365  0  12  2.810961  aspartate racemase
SC4366  -1  14  2.548095  DNA-binding transcriptional regulator
SC4367  -1  13  3.451792  isoaspartyl dipeptidase
SC4368  0  14  2.160022  hypothetical protein
SC4369  -2  14  1.515196  hypothetical protein
SC4370  -2  15  1.226702  hypothetical protein
SC4371  -2  16  0.480053  hypothetical protein
SC4372  0  20  0.120002  MFS family transporter
SC4373  2  29  -7.152461  hypothetical protein
SC4374  4  32  -8.204289  integrase
SC4375  4  26  -6.616825  hypothetical protein
SC4376  3  23  -4.045707  hypothetical protein
SC4377  1  19  -2.491808  hypothetical protein
SC4378  0  18  -2.732254  inner membrane protein
SC4379  2  25  2.996965  endoribonuclease SymE
SC4380  1  22  2.838568  amino acid transporter LysE
SC4381  1  24  2.624694  hypothetical protein
SC4382  0  22  1.529465  GTP-binding protein YjiA
SC4383  -2  20  0.755379  hypothetical protein
SC4384  -3  17  0.495295  carbon starvation protein
SC4385  -2  16  -2.221935  methyl-accepting chemotaxis protein I, serine
SC4386  1  20  -3.524578  hypothetical protein
SC4387  0  16  -3.192461  PTS permease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC4329  SACTRNSFRASE  35  6e-05  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 34.9 bits (80), Expect = 6e-05
Identities = 22/106 (20%), Positives = 38/106 (35%), Gaps = 7/106 (6%)

Query: 43 KGYTVADPNLDELYQVYSQPGAAYWVVEQNGCVVGGGGVAPLSCSEPDICELQKMYFLPV 102
K Y D ++ V + AA+ +N C+ G + + ++ +
Sbjct: 48 KQYEDDD---MDVSYVEEEGKAAFLYYLENNCI----GRIKIRSNWNGYALIEDIAVAKD 100

Query: 103 ISGQGLAKKLALMALEHAREQGFKRCYLETTAFLREAIALYERLGF 148
+G+ L A+E A+E F LET A Y + F
Sbjct: 101 YRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC4339  DHBDHDRGNASE  137  8e-42  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 137 bits (347), Expect = 8e-42
Identities = 84/253 (33%), Positives = 131/253 (51%), Gaps = 8/253 (3%)

Query: 10 KNILITGAAQGIGYLLATGLGRYGARIIVNDITPERAETAVTKLQQEGIKAIAAPFNVTH 69
K ITGAAQGIG +A L GA I D PE+ E V+ L+ E A A P +V
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 70 KQDIEAAVGHIEKDIGAIDVLINNAGIQRRHPFTEFPEQEWNDVIAVNQTAVFLVSQAVT 129
I+ IE+++G ID+L+N AG+ R ++EW +VN T VF S++V+
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 130 RRMVARQAGKVINICSMQSELGRDTITPYAASKGAVKMLTRGMCVELARHNIQVNGIAPG 189
+ M+ R++G ++ + S + + R ++ YA+SK A M T+ + +ELA +NI+ N ++PG
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 190 YFKTEMTKALVEDE--------AFTSWLCKRTPAARWGDPQELIGAAVFLSSKASDFVNG 241
+T+M +L DE P + P ++ A +FL S + +
Sbjct: 189 STETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHITM 248

Query: 242 HLLFVDGGMLVAV 254
H L VDGG + V
Sbjct: 249 HNLCVDGGATLGV 261


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC4354  TETREPRESSOR  29  0.015  Tetracycline repressor protein signature.
		>TETREPRESSOR#Tetracycline repressor protein signature.

Length = 218

Score = 29.1 bits (65), Expect = 0.015
Identities = 20/56 (35%), Positives = 28/56 (50%), Gaps = 5/56 (8%)

Query: 120 FALVTKKRKLIDALAQQILEAHFPTSIQEDIADEMGFDIRTSLRQRDPKFRQAVLR 175
+ V KR L+DALA +IL H S+ G ++ LR FR+A+LR
Sbjct: 42 YWHVKNKRALLDALAVEILARHHDYSLPAA-----GESWQSFLRNNAMSFRRALLR 92


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC4367  UREASE  37  8e-05  Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 37.4 bits (87), Expect = 8e-05
Identities = 32/129 (24%), Positives = 49/129 (37%), Gaps = 33/129 (25%)

Query: 26 CDVLLANGKIIAVGADIPSDIVPDCT--------VINLSGRMLCPGFIDQHVHLIGG--- 74
D+ L +G+I A+G D+ P T VI G+++ G +D H+H I
Sbjct: 86 ADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIVTAGGMDSHIHFICPQQI 145

Query: 75 ------------GGEAGP------TTRTP-EVSLSRLTEA--GITTVVGLLGTDSVSRHP 113
GG GP TT TP ++R+ EA + G + S P
Sbjct: 146 EEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIARMIEAADAFPMNLAFAGKGNAS-LP 204

Query: 114 ASLLAKTRA 122
+L+
Sbjct: 205 GALVEMVLG 213


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC4370  TCRTETA  40  1e-05  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.8 bits (93), Expect = 1e-05
Identities = 83/396 (20%), Positives = 141/396 (35%), Gaps = 31/396 (7%)

Query: 9 PRHPIFTALFGMMVLTLGMGVGRFLYTPMLPVMLAEKQLTFNQLSWIASANYAGYLAGSL 68
P P+ L + + +G+G L P+LP +L + + N ++ A Y
Sbjct: 3 PNRPLIVILSTVALDAVGIG----LIMPVLPGLLRDLVHS-NDVTAHYGILLALYALMQF 57

Query: 69 LFSFGLFHLPSRL--RPMLLASAVATGILILSMAIFTQPAVVMLVRFLAGVASAGMMIFG 126
+ L L R RP+LL S + MA V+ + R +AG+ A + G
Sbjct: 58 ACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAG 117

Query: 127 SMI-----VLHHTRHPFVIAALFSGVGAGIALGNEYVIGGLHYALSAHSLWLGAGALAGI 181
+ I RH ++A F G G+ G V+GGL S H+ + A AL G+
Sbjct: 118 AYIADITDGDERARHFGFMSACF---GFGMVAGP--VLGGLMGGFSPHAPFFAAAALNGL 172

Query: 182 LLLIVAMLIPPRAHALPPAPLARIENQPMPWWQLA-LLYGFAGFGYIIVATYLPLMAKSA 240
L L+P +H PL R P+ ++ A + A + L +A
Sbjct: 173 NFLTGCFLLPE-SHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAA 231

Query: 241 GSPLLTAHL--WSLVGLAIIPGCFGWLWA----------AKHWGVLPCLTANLLIQSACV 288
+ W + I FG L + A G L ++
Sbjct: 232 LWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGY 291

Query: 289 LLSLASDSLLLLILSSIGFGATFMGTTSLVMPLARQLSAPGNINLLGLVTLTYGIGQILG 348
+L + + + + +G +L L+RQ+ L G + + I+G
Sbjct: 292 ILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVG 351

Query: 349 PLAASLSGNGASAIINATLCGAAALFFAALISAAQQ 384
PL + + N A A + + A ++
Sbjct: 352 PLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC4372  TCRTETB  48  5e-08  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 47.6 bits (113), Expect = 5e-08
Identities = 35/141 (24%), Positives = 64/141 (45%), Gaps = 5/141 (3%)

Query: 58 SLYLAGGMALQWLLGPLSDRIGRRPVLIAGALIFTLACAATLLTTSMTQFLV-ARFVQGT 116
L + G A+ G LSD++G + +L+ G +I + S L+ ARF+QG
Sbjct: 59 MLTFSIGTAV---YGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGA 115

Query: 117 SICFIATVGYVTVQEAFGQTKAIKLMAIITSIVLVAPVIGPLSGAALMHFVHWKVLFGII 176
+ V V + K +I SIV + +GP G + H++HW L +I
Sbjct: 116 GAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL-LI 174

Query: 177 AVMGLLALCGLLLAMPETVQR 197
++ ++ + L+ + + V+
Sbjct: 175 PMITIITVPFLMKLLKKEVRI 195


72  SC0040  SC0052  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias      Product
SC0040    -2      12       0.929115     isoleucyl-tRNA synthetase
SC0041    -3      10       -0.307403    lipoprotein signal peptidase
SC0042    -2      11       -1.802620    FKBP-type peptidylprolyl isomerase
SC0043    -1      12       0.456527     4-hydroxy-3-methylbut-2-enyl diphosphate
SC0044    -1      17       2.392517     nitrite reductase
SC0045    -1      24       3.757527     ribonucleoside hydrolase RihC
SC0046    -1      22       2.325934     transcription regulator sensor for citrate
SC0047    -1      21       2.688238     transcription regulator, histidine kinase for
SC0048    0       25       5.025135     oxalacetate decarboxylase subunit beta
SC0049    -2      18       3.204733     oxaloacetate decarboxylase
SC0050    -2      12       0.360680     oxaloacetate decarboxylase subunit gamma
SC0051    -2      11       0.392080     citrate-sodium symport
SC0052    -1      9        1.708001     citrate lyase synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0040  LIPPROTEIN48  31  0.020  Mycoplasma P48 major surface lipoprotein signature.
		>LIPPROTEIN48#Mycoplasma P48 major surface lipoprotein signature.

Length = 428

Score = 31.1 bits (70), Expect = 0.020
Identities = 13/61 (21%), Positives = 28/61 (45%), Gaps = 5/61 (8%)

Query: 785 ADEIWGYLPGEREKYVFTGEWYDGLFGLEENEEFNDAFWDDVRYIK---DQVNKELENQK 841
AD+ W + ++EK++ E + EE + N+ + ++ K + K + + K
Sbjct: 344 ADKKWSHFGTQKEKWIGVAE--NHFSNTEEQAKINNKIKEAIKMFKELPEDFVKYINSDK 401

Query: 842 A 842
A
Sbjct: 402 A 402


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0042  INFPOTNTIATR  29  0.007  Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 28.8 bits (64), Expect = 0.007
Identities = 12/32 (37%), Positives = 19/32 (59%)

Query: 8 NSAILVHFTLKLDDGSTAESTRNNGKPALFRL 39
+ + V +T L DG+ +ST GKPA F++
Sbjct: 144 SDTVTVEYTGTLIDGTVFDSTEKAGKPATFQV 175


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0046  HTHFIS  69  1e-15  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 68.7 bits (168), Expect = 1e-15
Identities = 28/141 (19%), Positives = 48/141 (34%), Gaps = 2/141 (1%)

Query: 1 MDSITTLIVEDEPMLAEILVDTIKIFPQFSIVGIADKLESAKKQIRLYQPQLILLDNFLP 60
M T L+ +D+ + +L + V I + + I L++ D +P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSR--AGYDVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 DGKGIDLIRHTISTNYTGRIIFITADNHMDTISDALRMGVFDYLIKPVHYQRLQHTLERF 120
D DL+ ++ ++A N T A G +DYL KP L + R
Sbjct: 59 DENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118

Query: 121 TRYRSSLRSSEQANQTHVDAL 141
S + + L
Sbjct: 119 LAEPKRRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0047  CARBMTKINASE  30  0.018  Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 30.2 bits (68), Expect = 0.018
Identities = 19/81 (23%), Positives = 30/81 (37%), Gaps = 13/81 (16%)

Query: 104 DATYITVGNEKGQRLYHVNPDEIGKYMEGGDSDDALYNAKSYVSVRKGSLGSSLRGKSPI 163
+ + G EK Q L V +E+ KY E G + GS+G +
Sbjct: 238 NGAALYYGTEKEQWLREVKVEELRKYYEEG-------------HFKAGSMGPKVLAAIRF 284

Query: 164 QDSTGKVIGIVSVGYTLEQLE 184
+ G+ I + +E LE
Sbjct: 285 IEWGGERAIIAHLEKAVEALE 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0049  RTXTOXIND  31  0.014  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 31.0 bits (70), Expect = 0.014
Identities = 17/67 (25%), Positives = 29/67 (43%), Gaps = 7/67 (10%)

Query: 508 ASSAPVQAAAPA-------GAGTPVTAPLAGNIWKVIATEGQTVAEGDVLLILEAMKMET 560
+ V+ A A G + + ++I EG++V +GDVLL L A+ E
Sbjct: 75 SVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA 134

Query: 561 EIRAAQA 567
+ Q+
Sbjct: 135 DTLKTQS 141



Score = 29.4 bits (66), Expect = 0.046
Identities = 15/56 (26%), Positives = 22/56 (39%), Gaps = 10/56 (17%)

Query: 535 KVIATEGQTVAEGDVLLILEAMKMETEIRAAQAGTVRGIAVKSGDAVSVGDTLMTL 590
V G+ G EI+ + V+ I VK G++V GD L+ L
Sbjct: 82 IVATANGKLTHSGRSK----------EIKPIENSIVKEIIVKEGESVRKGDVLLKL 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0052  LPSBIOSNTHSS  38  1e-05  Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein

signature.
Length = 166

Score = 38.3 bits (89), Expect = 1e-05
Identities = 21/102 (20%), Positives = 43/102 (42%), Gaps = 4/102 (3%)

Query: 158 NPFTLGHRYLVEQAAAACDWLHLFVVKEDAS--FFSYTDRWALIEQGIAGIDNVTLHSGS 215
+P T GH ++E+ D +++ V++ FS +R I + IA + N + S
Sbjct: 10 DPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLPNAQVDSFE 69

Query: 216 AYMISRATFPGYFLKEKGV--VDDCHCQIDLQLFREHLAPAL 255
++ A +G+ + D ++ + + LA L
Sbjct: 70 GLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDL 111


73  SC0193  SC0200  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias      Product
SC0193    -1      15       3.527893     iron-hydroxamate transporter substrate-binding
SC0194    -1      14       2.629836     iron-hydroxamate transporter permease subunit
SC0195    -1      14       0.479850     fimbrial subunit
SC0196    0       13       0.098084     fimbrial outer membrane usher
SC0197    -1      11       0.626532     periplasmic fimbrial chaperone
SC0198    0       12       1.469415     minor fimbrial subunit
SC0199    0       14       0.868525     minor fimbrial subunit
SC0200    1       14       1.839917     minor fimbrial subuni
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0193  FERRIBNDNGPP  499  0.0  Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 499 bits (1286), Expect = 0.0
Identities = 246/296 (83%), Positives = 266/296 (89%)

Query: 1 MRDLYPLTRRRLLTAMALSPLLWQMNTAQAAAIDPRRIVALEWLPVELLLALGITPYGVA 60
M L ++RRRLLTAMALSPLLWQMNTA AAAIDP RIVALEWLPVELLLALGI PYGVA
Sbjct: 1 MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA 60

Query: 61 DVPNYKLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEKLARIAPGH 120
D NY+LWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPE LARIAPG
Sbjct: 61 DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR 120

Query: 121 GFDFSDGKKPLAVARRSLVELAQTLNLEAAAEKHLAQYDRFIASQKPHFIRRGGRPLLMT 180
GF+FSDGK+PLA+AR+SL E+A LNL++AAE HLAQY+ FI S KP F++RG RPLL+T
Sbjct: 121 GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLT 180

Query: 181 TLIDPRHMLVLGPNCLFQEVLDEYGIVNAWQGETNFWGSTAVSIDRLAMYKEADVICFDH 240
TLIDPRHMLV GPN LFQE+LDEYGI NAWQGETNFWGSTAVSIDRLA YK+ DV+CFDH
Sbjct: 181 TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH 240

Query: 241 GNSTDMNALMATPLWQAMPFVRAGRFHRVPAVWFYGATLSTMHFVRILDNVLGGKA 296
NS DM+ALMATPLWQAMPFVRAGRF RVPAVWFYGATLS MHFVR+LDN +GGKA
Sbjct: 241 DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVLDNAIGGKA 296


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0196  PF00577  698  0.0  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 698 bits (1802), Expect = 0.0
Identities = 241/885 (27%), Positives = 385/885 (43%), Gaps = 67/885 (7%)

Query: 3 HYKKFRLSTLAAVVGIVLAVGPENSYAEAPIQFNTRFLDVKDDASLDLSRFSRKGYIMPG 62
H +K RL+ + + A + + A + FN RFL A DLSRF + PG
Sbjct: 17 HIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPG 76

Query: 63 SYHLQVLVNQSQIAQDNIITYSVDNNDPDNTYPCLSPELVSLLGLKPEIADKMIWINAGQ 122
+Y + + +N +A ++ + D+ PCL+ ++ +GL M +
Sbjct: 77 TYRVDIYLNNGYMATRDVTFNTGDSEQ--GIVPCLTRAQLASMGLNTASVSGMNLLADDA 134

Query: 123 CLQPDQL-EGMETQTDLSQSTLTVIIPQAYLEYSDEEWDPPSRWDEGIPGVLFDYNVNSQ 181
C+ + Q D+ Q L + IPQA++ + PP WD GI L +YN +
Sbjct: 135 CVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGN 194

Query: 182 WRHAEHDDGDEYDISGNGTVGANLGAWRLRADWQANYRHENDSEDKDNFGSSSEQNWDWN 241
Y N G N+GAWRLR + +Y + S S S+ W
Sbjct: 195 SVQNRIGGNSHY-AYLNLQSGLNIGAWRLRDNTTWSYNSSDSS-------SGSKNKWQHI 246

Query: 242 RYYAWRAIPQLRAQLTLGEGSLESDIFDGFNYVGGSLITDDQMLPPNLRGYAPDISGVAR 301
+ R I LR++LTLG+G + DIFDG N+ G L +DD MLP + RG+AP I G+AR
Sbjct: 247 NTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIAR 306

Query: 302 TNAKVTVTQRGRVIYESQVPAGPFRIQDINET-VSGDLHVKIEEQSGQVQEYDVSTASIP 360
A+VT+ Q G IY S VP GPF I DI SGDL V I+E G Q + V +S+P
Sbjct: 307 GTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVP 366

Query: 361 FLTRPGQVRYKLAAGRPQDWDHNMEGGFFTSAEASWGIANGWSLYGGAIGEQDYQALALG 420
L R G RY + AG + + E F + G+ GW++YGG Y+A G
Sbjct: 367 LLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFG 426

Query: 421 LGRDLALLGAFSVDVTHSRATLPEGSAYGDGTIQGNSFRASYAKDFDDIDSRLTFAGYRF 480
+G+++ LGA SVD+T + +TLP D G S R Y K ++ + + GYR+
Sbjct: 427 IGKNMGALGALSVDMTQANSTLP-----DDSQHDGQSVRFLYNKSLNESGTNIQLVGYRY 481

Query: 481 SEENYMTMDEFIDTHNDDNDR-----------------QRTGHDKEMYTLTYSQNFSAIN 523
S Y + + + + + + LT +Q
Sbjct: 482 STSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-T 540

Query: 524 VNAYINYTHRTYWNQPNQD-SYNLTLSHYFDVGEVRGISLSVNGFRNEYDNERDDGVYVS 582
Y++ +H+TYW N D + L+ F+ +LS + +N + RD + ++
Sbjct: 541 STLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINW---TLSYSLTKNAWQKGRDQMLALN 597

Query: 583 LSIPWGN-----------NRTLSYNGSFSDDNN-SNQVGYYERI--DDRNNYQINAGRAD 628
++IP+ + + + SY+ S + +N G Y + D+ +Y + G A
Sbjct: 598 VNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAG 657

Query: 629 -----NGATLDGYYRHQASYADIDVSANYQEGDYTSGGLNIQGGATLTAKGGALHRTSVN 683
+G+T ++ Y + ++ ++ D + GG A G L +
Sbjct: 658 GGDGNSGSTGYATLNYRGGYGNANIGYSHS-DDIKQLYYGVSGGVLAHANGVTLGQPL-- 714

Query: 684 GGSRLMVDVGDEANVPISGYSTPVYTNAFGKAVIVDVNDYYRNLVKIDITQLPEDAEATL 743
+ ++V + + + V T+ G AV+ +Y N V +D L ++ +
Sbjct: 715 NDTVVLVKAPGAKDAKVENQTG-VRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDN 773

Query: 744 SIAQATLTEGAIGYRRMEVLSGKKAMASIRLRDGGTPPFGAEVYNSRQQQLGIVGEDGSV 803
++A T GAI + G K + ++ + PFGA V + Q GIV ++G V
Sbjct: 774 AVANVVPTRGAIVRAEFKARVGIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQV 832

Query: 804 YLIGINPGERLQVTW--EGKTQCEA--ALPDPLPGDLFSGLLLPC 844
YL G+ ++QV W E C A LP L + L C
Sbjct: 833 YLSGMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0198  FIMBRIALPAPE  31  0.001  Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein

signature.
Length = 173

Score = 31.2 bits (70), Expect = 0.001
Identities = 39/140 (27%), Positives = 58/140 (41%), Gaps = 17/140 (12%)

Query: 43 PPCTVTGGEVEFGNV-LTTKVDGVNYRQAVGYRLSCNGRVSDYLKLQIQGNAVTINGESV 101
P CTV EV +G++ + V ++ ++C + +K+ I N T N V
Sbjct: 37 PACTVQNAEVNWGDIEIQNLVQSGGNQKDFTVDMNCPYSLGT-MKVTITSNGQTGNSILV 95

Query: 102 LQTDV---DGLGIRLQTATDGALVSPGNTQWLSFQYS----GGSGPA-----IEAIPVKD 149
T DGL I L + + + GN L Q + G+ PA + K
Sbjct: 96 PNTSTASGDGLLIYLYNSNNSGI---GNAVTLGSQVTPGKITGTAPARKITLYAKLGYKG 152

Query: 150 NGVTLTGGAFNAGATLVVDY 169
N +L G F+A ATLV Y
Sbjct: 153 NMQSLQAGTFSATATLVASY 172


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0199  FIMBRIALPAPF  41  3e-07  Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein

signature.
Length = 167

Score = 41.2 bits (96), Expect = 3e-07
Identities = 42/171 (24%), Positives = 76/171 (44%), Gaps = 19/171 (11%)

Query: 1 MKKMALMV-LISSSFAAQSAENLKFHGTLISPPNCTISHNQTIEVKFGNMLISKIDGTRY 59
M +++L + L+ +S A + + G + PP CTI++ Q I V FGN+ +D +R
Sbjct: 1 MIRLSLFISLLLTSVAVLADVQINIRGNVYIPP-CTINNGQNIVVDFGNINPEHVDNSRG 59

Query: 60 AQNVPYEITCDSAVRDDTMTMTLTLSGSVTDFNQ-AAINTSVAGLGIELRQN---DQPFT 115
I+C + ++ + ++G+ Q + T++ GI L Q P T
Sbjct: 60 EVTKNISISCPY----KSGSLWIKVTGNTMGVGQNNVLATNITHFGIALYQGKGMSTPLT 115

Query: 116 LGS------TITV---NEQSAPVLKAIPVKKSGASLTEGDFDATATLQVDY 157
LG+ +T +S ++P + L GDF TA++ + Y
Sbjct: 116 LGNGSGNGYRVTAGLDTARSTFTFTSVPFRNGSGILNGGDFRTTASMSMIY 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0200  FIMBRIALPAPE  33  3e-04  Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein

signature.
Length = 173

Score = 33.1 bits (75), Expect = 3e-04
Identities = 46/161 (28%), Positives = 72/161 (44%), Gaps = 18/161 (11%)

Query: 23 VSAADNLHFSGSLVASPCTLTMQGADIAEVDFSSLDASDFIPGGQSARKPLVFELTDCDS 82
V AADNL F G L+ CT+ AEV++ ++ + + G +K ++ C
Sbjct: 22 VHAADNLTFKGKLIIPACTVQN-----AEVNWGDIEIQNLVQSG-GNQKDFTVDMN-CPY 74

Query: 83 ALSNGVQVIFTGTEATGMRGILAIDSYSGASGIGIGIETLSGVPVGINNES--GAVFT-- 138
+L ++V T TG IL ++ S ASG G+ I + GI N G+ T
Sbjct: 75 SLGT-MKVTITSNGQTG-NSILVPNT-STASGDGLLIYLYNSNNSGIGNAVTLGSQVTPG 131

Query: 139 LVTGKN---TLSLNAWV-QRLPGEDLIPGRFSASALATFEY 175
+TG ++L A + + + L G FSA+A Y
Sbjct: 132 KITGTAPARKITLYAKLGYKGNMQSLQAGTFSATATLVASY 172


74  SC0408  SC0415  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias      Product
SC0408    1       17       3.639351     prp operon regulator
SC0409    0       15       0.264009     2-methylisocitrate lyase
SC0410    0       14       -0.945568    methylcitrate synthase
SC0411    0       12       -1.298791    2-methylcitrate dehydratase
SC0412    0       11       -2.076425    propionyl-CoA synthetase
SC0413    2       15       -3.369356    delta-aminolevulinic acid dehydratase
SC0414    3       16       -3.958445    flagellar protein
SC0415    0       15       -1.505202    DNA-binding transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0408  HTHFIS  340  e-114  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 340 bits (874), Expect = e-114
Identities = 120/379 (31%), Positives = 190/379 (50%), Gaps = 51/379 (13%)

Query: 191 DALDMTRLTRRQRVDYSSG--KGLQTRYELGDIRGQSPQMEQLRQTITLYARSRAAVLIQ 248
D ++ + R + K + + G+S M+++ + + ++ ++I
Sbjct: 107 DLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMIT 166

Query: 249 GETGTGKELAAQAIHQTFFHRQPHRQNKPSPPFVAVNCGAITESLLEAELFGYEEGAFTG 308
GE+GTGKEL A+A+H R+N P FVA+N AI L+E+ELFG+E+GAFTG
Sbjct: 167 GESGTGKELVARALHD-----YGKRRNGP---FVAINMAAIPRDLIESELFGHEKGAFTG 218

Query: 309 SRRGGRAGLFEIAHGGTLLLDEIGEMPLPLQTRLLRVLEEKAVTRVGGHQPIPVDVRVIS 368
++ G FE A GGTL LDEIG+MP+ QTRLLRVL++ T VGG PI DVR+++
Sbjct: 219 AQTR-STGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVA 277

Query: 369 ATHCDLDREIMQGRFRPDLFYRLSILRLTLPPLRERQADILPLAESFLKQSLAAMEIPFT 428
AT+ DL + I QG FR DL+YRL+++ L LPPLR+R DI L F++Q +
Sbjct: 278 ATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQ-AEKEGLD-- 334

Query: 429 ESIRHGLTQCQPLLLAWRWPGNIRELRNMMERLALFLS---------------------- 466
++ + L+ A WPGN+REL N++ RL
Sbjct: 335 --VKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPI 392

Query: 467 ----VDPAPTLDRQFMRQLLPELMVNTAELTPST---------VDANALQDVLARFKGDK 513
Q + + + + + + P + ++ + L +G++
Sbjct: 393 EKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQ 452

Query: 514 SAAARYLGISRTTLWRRLK 532
AA LG++R TL ++++
Sbjct: 453 IKAADLLGLNRNTLRKKIR 471


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0413  BINARYTOXINB  32  0.003  Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 32.3 bits (73), Expect = 0.003
Identities = 19/69 (27%), Positives = 29/69 (42%)

Query: 254 DVLREIRERTELPLGAYQVSGEYAMIKFAAMAGAIDEEKVVLESLGSIKRAGADLIFSYF 313
+ E+ + +L L QV G A F +D E L I+ A +IF+
Sbjct: 466 NQFLELEKTKQLRLDTDQVYGNIATYNFENGRVRVDTGSNWSEVLPQIQETTARIIFNGK 525

Query: 314 ALDLAEKNI 322
L+L E+ I
Sbjct: 526 DLNLVERRI 534


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0414  PRTACTNFAMLY  122  1e-30  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 122 bits (308), Expect = 1e-30
Identities = 109/492 (22%), Positives = 184/492 (37%), Gaps = 78/492 (15%)

Query: 538 TLNADLVNDRTWDTTQANYGYGVVAMNSDGHL-----------------TINGNGDINNG 580
+++ +++ TW N G + + SDG + T+ G+G
Sbjct: 429 AVDSLSIDNATW-VMTDNSNVGALRLASDGSVDFQQPAEAGRFKVLTVNTLAGSGLFRMN 487

Query: 581 DEADASSTTDNVVA---ATGNYKVRIDNATGAGSVADYKGNELIYVNDINTDATFSAAN- 636
AD +D +V A+G +++ + N+ GS L+ + + ATF+ AN
Sbjct: 488 VFAD-LGLSDKLVVMQDASGQHRLWVRNS---GSEPASANTLLLVQTPLGSAATFTLANK 543

Query: 637 --KADLGAYTYQAKQEGNTV------------------------------------VLEQ 658
K D+G Y Y+ GN
Sbjct: 544 DGKVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAG 603

Query: 659 MELTDYANMALSIP--SANTNIWNLEQDTVGTRLTNARHGLADNGGAWVSYFGGNFNGDN 716
EL+ AN A++ + +W E + + RL R D GGAW F DN
Sbjct: 604 RELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRL-NPDAGGAWGRGFAQRQQLDN 662

Query: 717 GTIN-YDQDVNGIMVGVDTKVDGNNAKWIVGAAAGFAKGDLS---DRTGQVDQDSQSAYI 772
+DQ V G +G D V +W +G AG+ +GD D G D S ++
Sbjct: 663 RAGRRFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRGFTGDGGGHTD----SVHV 718

Query: 773 YSSARFANN--IFVDGNLSYSHFNNDLSANMSDGTYVDGNTSSDAWGFGLKLGYDLKLGD 830
A + + ++D L S ND SDG V G + G L+ G D
Sbjct: 719 GGYATYIADSGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGASLEAGRRFTHAD 778

Query: 831 AGYVTPYGSVSGLFQSGDDYQLSNDMKVDGQSYDSMRYELGVDAGYTFTYSEDQALTPYF 890
++ P ++ G Y+ +N ++V + S+ LG++ G + + + PY
Sbjct: 779 GWFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEVGKRIELAGGRQVQPYI 838

Query: 891 KLAYVYD-DSNNDADVNGDSIDNGVEGSAVRVGLGTQFSFTKNFSAYTDANYLGGGDVDQ 949
K + + + D NG + + G+ +GLG + + S Y Y G +
Sbjct: 839 KASVLQEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRGHSLYASYEYSKGPKLAM 898

Query: 950 DWSANVGVKYTW 961
W+ + G +Y+W
Sbjct: 899 PWTFHAGYRYSW 910


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0415  PF06291  30  0.002  Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 30.0 bits (67), Expect = 0.002
Identities = 21/68 (30%), Positives = 30/68 (44%), Gaps = 11/68 (16%)

Query: 28 VNDKEIICSPDESNTHTFVILEGVVSLVRGDKVLIGIVQAPFIFGLADGVAKKEAQYKLI 87
V +K +P E+ TH F VS + K V A I G A+ V K E Q +
Sbjct: 29 VGNKPTAVTPKETITHHFF-----VSGIGQKKT----VDAAKICGGAENVVKTETQQTFV 79

Query: 88 AESGCIGY 95
+G +G+
Sbjct: 80 --NGLLGF 85


75  SC0435  SC0439  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias      Product
SC0435    2       16       3.027585     MFS transport protein AraJ
SC0436    1       15       3.198212     exonuclease SbcC
SC0437    -2      13       1.739498     exonuclease SbcD
SC0438    -2      14       2.325677     transcriptional regulator PhoB
SC0439    -2      13       2.193516     phosphate regulon sensor protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0435  TCRTETA  52  2e-09  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 52.1 bits (125), Expect = 2e-09
Identities = 70/356 (19%), Positives = 122/356 (34%), Gaps = 35/356 (9%)

Query: 5 IFSLALGTFGLGMAEFSIMGVLTELARDVGITIPAAGH---MISFYAFGVVLGAPVMALF 61
+ ++AL G+G+ IM VL L RD+ + H +++ YA APV+
Sbjct: 11 LSTVALDAVGIGL----IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 62 SSRFSLKHILLFLVTLCVMGNAIFTFSSSYLMLAVGRLVSGFPHGAFFGVGAIVLSKIIR 121
S RF + +LL + + AI + +L +GR+V+G GA + I
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIAD--IT 124

Query: 122 PGKVTAAVAGMVSGMTVANLVGIPVGTYLSQEFSWRYTFLLIAVFNIAVLTAIFFWVPDI 181
G A G +S +V PV L FS F A N F +P+
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 182 RDKAQGSLREQ----------FHFLRSPAPWLI--FAATMFGNAGVFAWFSYIKPFMMYI 229
+ LR + + A + F + G W +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG------E 238

Query: 230 SGFSETSMTFIMMLVGLGM---VLGNLLSGKLSGRYTPLRIAVVTDLVIVLSLMALFFFS 286
F + T + L G+ + +++G ++ R R ++ ++ +
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLG---MIADGTGYILLA 295

Query: 287 GYKTASLTFAFICCAGLFALSAPLQILLLQNAKGGELLGAAGGQIAF--NLGSAIG 340
+ F + + P +L E G G +A +L S +G
Sbjct: 296 FATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVG 351


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0436  RTXTOXIND  49  7e-08  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 49.1 bits (117), Expect = 7e-08
Identities = 32/198 (16%), Positives = 71/198 (35%), Gaps = 13/198 (6%)

Query: 373 TQQSHDRAQLSQWQQQLLSDTRQRDALPPLTLDLTPQALAEARALHTRQRPLRHRLAALQ 432
TQ S +A+L Q + Q+LS + + + LP L L P + R L ++
Sbjct: 139 TQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSL------IK 192

Query: 433 GQILPKQKRQAQLQAAIARHHQEQAQYTQRLADKRLSYKTKAQELADVRTICEQ----EA 488
Q Q ++ Q + + + E+ R+ + + L D ++ + +
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKH 252

Query: 489 RIKDLESQRAHLQS--GQPCPLCGSTTHPAIAAYQALELSANQTRRDALEKEVKTLAEEG 546
+ + E++ + ++A + +L + + L+K +T
Sbjct: 253 AVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNI- 311

Query: 547 AALRGQLDALTQQLQRDE 564
L +L ++ Q
Sbjct: 312 GLLTLELAKNEERQQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0438  HTHFIS  98  7e-26  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 98.0 bits (244), Expect = 7e-26
Identities = 34/149 (22%), Positives = 63/149 (42%), Gaps = 9/149 (6%)

Query: 4 RILVVEDEAPIREMVCFVLEQNGFQPVEAEDYDSAVNKLNEPWPDLILLDWMLPGGSGLQ 63
ILV +D+A IR ++ L + G+ + + + DL++ D ++P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 FIKHLKREAMTRDIPVVMLTARGEEEDRVRGLETGADDYITKPFSPKELVARIKAVMRRI 123
+ +K+ D+PV++++A+ ++ E GA DY+ KPF EL+ I +
Sbjct: 65 LLPRIKKARP--DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA-- 120

Query: 124 SPMAVEEVIEMQGLSLDPGSHRVMTGDSP 152
E L D + G S
Sbjct: 121 -----EPKRRPSKLEDDSQDGMPLVGRSA 144


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0439  PF06580  37  1e-04  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 1e-04
Identities = 17/123 (13%), Positives = 38/123 (30%), Gaps = 28/123 (22%)

Query: 300 TFTFEVDDSLSVLGNEEQLRSAISNLVYNAVNH----TPAGTHITVSWRRVAHGAEFCIQ 355
F +++ ++ + + + LV N + H P G I + + ++
Sbjct: 241 QFENQINPAIM---DVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVE 297

Query: 356 DNGPGIAAEHIPRLTERFYRVDKARSRQTGGSGLGLAIVKHALNH---HESRLEIDSSPG 412
+ G +G GL V+ L E+++++ G
Sbjct: 298 NTGSLAL------------------KNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQG 339

Query: 413 KGT 415
K
Sbjct: 340 KVN 342


76  SC0449  SC0454  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias      Product
SC0449    0       16       -0.415011    preprotein translocase subunit SecD
SC0450    -1      13       -1.270132    preprotein translocase subunit SecF
SC0451    -2      15       -2.274182    hypothetical protein
SC0452    -3      13       0.235174     regulatory protein
SC0453    -2      13       0.484563     hypothetical protein
SC0454    -1      14       0.669925     nucleoside channel phage T6/colicin K receptor
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0449  SECFTRNLCASE  69  6e-15  Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 69.5 bits (170), Expect = 6e-15
Identities = 35/165 (21%), Positives = 79/165 (47%), Gaps = 4/165 (2%)

Query: 433 IQIVEERTIGPTLGMQNIKQGLEACLAGLVVSILFMIF-FYKKFGLIATSALVANLVLIV 491
++I ++GP + + + + + LA VV + ++ F +F L A ALV +++L V
Sbjct: 135 LKITSFESVGPKVSGELVWTAVWSLLAATVVIMFYIWVRFEWQFALGAVVALVHDVLLTV 194

Query: 492 GIMSLLPGATLSMPGIAGIVLTLAVAVDANVLINERIKEEL--SNGRTVQQAINEGYAGA 549
G+ ++L + +A ++ +++ V++ +R++E L ++ +N
Sbjct: 195 GLFAVL-QLKFDLTTVAALLTITGYSINDTVVVFDRLRENLIKYKTMPLRDVMNLSVNET 253

Query: 550 FSSIFDANITTLIKVIILYAVGTGAIKGFAITTGIGVATSMFTAI 594
S +TTL+ ++ + G I+GF GV T ++++
Sbjct: 254 LSRTVMTGMTTLLALVPMLIWGGDVIRGFVFAMVWGVFTGTYSSV 298


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0450  SECFTRNLCASE  352  e-124  Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 352 bits (904), Expect = e-124
Identities = 104/309 (33%), Positives = 176/309 (56%), Gaps = 12/309 (3%)

Query: 17 YDFMRWDFWAFGISGLLLIAAIVIMGVRGFNWGLDFTGGTVIEITLEKPAEMDVMREALQ 76
+DF RW + FG + +++IA++++ V G N+G+DF GGT I ++ V R AL+
Sbjct: 14 FDFFRWQWATFGAAIVMMIASVILPLVIGLNFGIDFKGGTTIRTESTTAIDVGVYRAALE 73

Query: 77 KAGYEEPQLQNFGS------SHDIMVRMPPTEGETGGQVLGSKVVTIINE------ATNQ 124
+ + H M+R+ E G + G++ ++N+ A +
Sbjct: 74 PLELGDVIISEVRDPSFREDQHVAMIRIQMQEDGQGAEGQGAQGQELVNKVETALTAVDP 133

Query: 125 NAAVKRIEFVGPSVGADLAQTGAMALLVALISILVYVGFRFEWRLAAGVVIALAHDVIIT 184
+ E VGP V +L T +LL A + I+ Y+ RFEW+ A G V+AL HDV++T
Sbjct: 134 ALKITSFESVGPKVSGELVWTAVWSLLAATVVIMFYIWVRFEWQFALGAVVALVHDVLLT 193

Query: 185 LGILSLFHIEIDLTIVASLMSVIGYSLNDSIVVSDRIRENFRKIRRGTPYEIFNVSLTQT 244
+G+ ++ ++ DLT VA+L+++ GYS+ND++VV DR+REN K + ++ N+S+ +T
Sbjct: 194 VGLFAVLQLKFDLTTVAALLTITGYSINDTVVVFDRLRENLIKYKTMPLRDVMNLSVNET 253

Query: 245 LHRTLITSGTTLVVILMLYLFGGPVLEGFSLTMLIGVSIGTASSIYVASALALKLGMKRE 304
L RT++T TTL+ ++ + ++GG V+ GF M+ GV GT SS+YVA + L +G+ R
Sbjct: 254 LSRTVMTGMTTLLALVPMLIWGGDVIRGFVFAMVWGVFTGTYSSVYVAKNIVLFIGLDRN 313

Query: 305 HMLQQKVEK 313
+ +K
Sbjct: 314 KEKKDPSDK 322


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0452  ARGREPRESSOR  33  4e-04  Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 32.9 bits (75), Expect = 4e-04
Identities = 14/56 (25%), Positives = 24/56 (42%), Gaps = 5/56 (8%)

Query: 3 RRADRLFQIVQILRGRRLTT-----AALLAQRLAVSERTIYRDIRDLSLSGVPVEG 53
+ R +I +I+ + T L V++ T+ RDI++L L VP
Sbjct: 2 NKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIKELHLVKVPTNN 57


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0454  CHANNELTSX  499  0.0  Nucleoside-specific channel-forming protein Tsx signa...
		>CHANNELTSX#Nucleoside-specific channel-forming protein Tsx

signature.
Length = 294

Score = 499 bits (1286), Expect = 0.0
Identities = 240/295 (81%), Positives = 254/295 (86%), Gaps = 9/295 (3%)

Query: 36 MKKTLLAVSAALALTSSFTANAAENDQPQYLSDWWHQSVNVVGSYHTRFSPKLNNDVYLE 95
MKKTLLA A +AL+++F A AAEND+PQYLSDWWHQSVNVVGSYHTRF P++ ND YLE
Sbjct: 1 MKKTLLAAGAVVALSTTFAAGAAENDKPQYLSDWWHQSVNVVGSYHTRFGPQIRNDTYLE 60

Query: 96 YEAFAKKDWFDFYGYIDIPKTFDWGNGNDKGIWSDGSPLFMEIEPRFSIDKLTGADLSFG 155
YEAFAKKDWFDFYGYID P F GN KGIW+ GSPLFMEIEPRFSIDKLT DLSFG
Sbjct: 61 YEAFAKKDWFDFYGYIDAPVFFG-GNSTAKGIWNKGSPLFMEIEPRFSIDKLTNTDLSFG 119

Query: 156 PFKEWYFANNYIYDMGDNKASRQSTWYMGLGTDIDTGLPMGLSLNVYAKYQWQNYGASNE 215
PFKEWYFANNYIYDMG N + QSTWYMGLGTDIDTGLPM LSLNVYAKYQWQNYGASNE
Sbjct: 120 PFKEWYFANNYIYDMGRNDSQEQSTWYMGLGTDIDTGLPMSLSLNVYAKYQWQNYGASNE 179

Query: 216 NEWDGYRFKVKYFVPITDLWGGKLSYIGFTNFDWGSDLGDDP--------NRTSNSIASS 267
NEWDGYRFKVKYFVP+TDLWGG LSYIGFTNFDWGSDLGDD RTSNSIASS
Sbjct: 180 NEWDGYRFKVKYFVPLTDLWGGSLSYIGFTNFDWGSDLGDDNFYDLNGKHARTSNSIASS 239

Query: 268 HILALNYDHWHYSVVARYFHNGGQWQNGAKLNWGDGDFSAKSTGWGGYLVVGYNF 322
HILALNY HWHYS+VARYFHNGGQW + AKLN+GDG FS +STGWGGY VVGYNF
Sbjct: 240 HILALNYAHWHYSIVARYFHNGGQWADDAKLNFGDGPFSVRSTGWGGYFVVGYNF 294


77  SC0517  SC0525  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC0517    -1      10       0.522830   acridine efflux system protein
SC0518    0       12       0.108468   acridine efflux pump
SC0519    1       14       -0.164673  potassium efflux protein KefA
SC0520    2       15       2.768019   transposase
SC0521    4       15       3.423867   hypothetical protein
SC0522    2       16       4.572877   primosomal replication protein N''
SC0523    1       17       2.727516   hypothetical protein
SC0524    2       18       2.822778   adenine phosphoribosyltransferase
SC0525    2       18       2.740235   DNA polymerase III subunits gamma and tau
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0517    ACRIFLAVINRP  1367   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1367 bits (3539), Expect = 0.0
Identities = 809/1033 (78%), Positives = 918/1033 (88%), Gaps = 1/1033 (0%)

Query: 1 MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISATYPGADAKTVQDT 60
M NFFI RPIFAWV+AII+M+AG LAIL+LPVAQYPTIAPPAV++SA YPGADA+TVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQ 120
VTQVIEQNMNGIDNLMYMSS SDS G+V ITLTF+SGTD DIAQVQVQNKLQLA PLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVSVEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDPISRTSGVGDVQLFGS 180
EVQQQG+SVEKSSSS+LMV G ++ + TQ+DISDYVA+N+KD +SR +GVGDVQLFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWMNPTELTKYQLTPVDVINAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQTRL 240
QYAMRIW++ L KY+LTPVDVIN +K QN Q+AAGQLGGTP + GQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 TSTDEFGKILLKVNQDGSQVRLRDVAKIELGGENYDVIAKFNGQPASGLGIKLATGANAL 300
+ +EFGK+ L+VN DGS VRL+DVA++ELGGENY+VIA+ NG+PA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTATAIRAELKKMEPFFPPGMKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFLQ 360
DTA AI+A+L +++PFFP GMK++YPYDTTPFV++SIHEVVKTL EAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTFA+LAAFG+SINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 TEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480
E+ LPPKEAT KSM QIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATMLKPVAKGDHGEGKKGFFGWFNRLFDKSTHHYTDSVGNILRSTGR 540
SVLVALILTPALCAT+LKPV+ H E K GFFGWFN FD S +HYT+SVG IL STGR
Sbjct: 481 SVLVALILTPALCATLLKPVSAE-HHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 541 YLLLYLIIVVGMAYLFVRLPSSFLPDEDQGVFLTMVQLPAGATQERTQKVLDEVTDYHLN 600
YLL+Y +IV GM LF+RLPSSFLP+EDQGVFLTM+QLPAGATQERTQKVLD+VTDY+L
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 601 KEKANVESVFAVNGFGFAGRGQNTGIAFVSLKDWADRPGEKNKVEAITQRATAAFSQIKD 660
EKANVESVF VNGF F+G+ QN G+AFVSLK W +R G++N EA+ RA +I+D
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 661 AMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLFGEVAKYPDLLVGVRPNG 720
V FN+PAIVELGTATGFDFELIDQAGLGH+ LTQARNQL G A++P LV VRPNG
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 721 LEDTPQFKIDIDQEKAQALGVSISDINTTLGAAWGGSYVNDFIDRGRVKKVYVMSEAKYR 780
LEDT QFK+++DQEKAQALGVS+SDIN T+ A GG+YVNDFIDRGRVKK+YV ++AK+R
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFR 779

Query: 781 MLPDDINDWYVRGSDGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILGQAAPGKSTGEA 840
MLP+D++ YVR ++G+MVPFSAF++S W YGSPRLERYNGLPSMEI G+AAPG S+G+A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 841 MAMMEELASKLPSGIGYDWTGMSYQERLSGNQAPALYAISLIVVFLCLAALYESWSIPFS 900
MA+ME LASKLP+GIGYDWTGMSYQERLSGNQAPAL AIS +VVFLCLAALYESWSIP S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 901 VMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGL 960
VMLVVPLG++G LLAAT NDVYF VGLLTTIGLSAKNAILIVEFAKDLM+KEGKG+
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 961 VEATLEAVRMRLRPILMTSLAFMLGVMPLVISSGAGSGAQNAVGTGVLGGMVTATVLAIF 1020
VEATL AVRMRLRPILMTSLAF+LGV+PL IS+GAGSGAQNAVG GV+GGMV+AT+LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1021 FVPVFFVVVRRRF 1033
FVPVFFVV+RR F
Sbjct: 1020 FVPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC0518    RTXTOXIND  43     1e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 42.9 bits (101), Expect = 1e-06
Identities = 32/211 (15%), Positives = 75/211 (35%), Gaps = 17/211 (8%)

Query: 64 TYQATYDSAKGDLAKAQAAANIAELTVKRYQKLLGTQYISKQEYDQALADAQQATAAVVA 123
+ Y A +L + + ++ + Q +++ ++ L +Q T +
Sbjct: 256 EQENKYVEAVNELR--VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 124 AKAAVETARINLAYTKVTSPISGRIGKSSV-TEGALVQNGQASALATVQQLDPIYVDVTQ 182
+ + + +P+S ++ + V TEG +V + + V + D + V
Sbjct: 314 LTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAET-LMVIVPEDDTLEVTALV 372

Query: 183 SSNDFLRLKQEL-------ANGSLKQENGKAKVDLVTSDGIKFPQSGTLEFSDVTVDQTT 235
+ D + A + KV + D I+ + G + +++++
Sbjct: 373 QNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENC 432

Query: 236 GSITLRAIFPNPDHTLLPGMFVRARLQEGTK 266
S + I L GM V A ++ G +
Sbjct: 433 LSTGNKNIP------LSSGMAVTAEIKTGMR 457



Score = 31.0 bits (70), Expect = 0.007
Identities = 24/133 (18%), Positives = 45/133 (33%), Gaps = 10/133 (7%)

Query: 13 PLQITTELPGR-TIAYRIAEVRPQVSGIILKRNFV-EGSDIEAGVSLYQIDP-------A 63
++I G+ T + R E++P + I+ K V EG + G L ++
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIV-KEIIVKEGESVRKGDVLLKLTALGAEADTL 137

Query: 64 TYQATYDSAKGDLAKAQAAANIAELTVKRYQKLLGTQYISKQEYDQALADAQQATAAVVA 123
Q++ A+ + + Q + EL KL Y ++ L
Sbjct: 138 KTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFST 197

Query: 124 AKAAVETARINLA 136
+ +NL
Sbjct: 198 WQNQKYQKELNLD 210


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0519    CHANLCOLICIN  36     7e-04    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 36.2 bits (83), Expect = 7e-04
Identities = 41/219 (18%), Positives = 81/219 (36%), Gaps = 18/219 (8%)

Query: 92 RQKVAQAPEKMRQ-ATAALNALSDVDNDDEMRKTLSALSLRQLELRVA--QVLDDLQNSQ 148
R ++A+A EK R+ A AA A + + + + A + RQL+L A + L L
Sbjct: 129 RLRLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETERQLKLAEAEEKRLAALSEEA 188

Query: 149 NDLAAYNSQLVSLQTQPERVQNAMYTASQQI-------QQIRNRLDGNNVGEAALRPSQQ 201
+ +L + Q++ ++ + T + ++ L G A +
Sbjct: 189 KAVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYK 248

Query: 202 VLLQAQQALLNAQID--------QQRKSLEGNTVLQDTLQKQRDYVTANSNRLEHQLQLL 253
L + + L D + + G +++ QKQ NR+ + +
Sbjct: 249 ELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASETRINRINADITQI 308

Query: 254 QEAVNSKRLTLTEKTAQEAISPDETARIQANPLVKQELD 292
Q+A++ A+ + + + Q N L Q D
Sbjct: 309 QKAISQVSNNRNAGIARVHEAEENLKKAQNNLLNSQIKD 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0520    FLGFLIH  31     0.005    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 30.9 bits (69), Expect = 0.005
Identities = 28/63 (44%), Positives = 37/63 (58%), Gaps = 6/63 (9%)

Query: 222 AEPGALIRQLAQGAPQYKEQLMT--IAEWLEE---KGRTEGLQKGLEQGLAQGREAEARA 276
AEP +L +QLAQ Q EQ IAE ++ +G EGL +GLEQGLA+ + +A
Sbjct: 36 AEP-SLEQQLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPI 94

Query: 277 IAR 279
AR
Sbjct: 95 HAR 97



Score = 29.0 bits (64), Expect = 0.023
Identities = 19/67 (28%), Positives = 31/67 (46%), Gaps = 8/67 (11%)

Query: 233 QGAPQYKEQLMTIAEWLEEKGRTEGLQKGLEQGLAQGREAEARAIARKMLANGLEPGLIA 292
+ P ++QL + E+G G+ +G +QG QG + + LA GLE GL
Sbjct: 35 EAEPSLEQQLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQ--------EGLAQGLEQGLAE 86

Query: 293 SVTGITP 299
+ + P
Sbjct: 87 AKSQQAP 93


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC0525    IGASERPTASE  45     8e-07    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 45.1 bits (106), Expect = 8e-07
Identities = 57/278 (20%), Positives = 92/278 (33%), Gaps = 48/278 (17%)

Query: 366 PEPETPRQSFAPVAPTAVMTPP--QVQQPSAP-----------APQTSPAPLPASTSQVL 412
PE E Q V T + TP Q PS P AP PAP S +
Sbjct: 983 PEVEKRNQ---TVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTET 1039

Query: 413 AARNQLQRAQGVTKTKK--SEPAAASRARPVNHSALERLASVSERVQARPAPSALETAPV 470
A N Q ++ V K ++ +E A + R + + + E A
Sbjct: 1040 VAENSKQESKTVEKNEQDATETTAQN-----------REVAKEAKSNVKANTQTNEVAQS 1088

Query: 471 KKEAYRWKATTPVVQTKEVVATPKALKKALEHEKTPELAAKLAAEAIERDPWAAQVSQLS 530
E T +TKE K K +E EKT E K+ ++ + + V +
Sbjct: 1089 GSET----KETQTTETKETATVEKEEKAKVETEKTQE-VPKVTSQVSPKQEQSETVQPQA 1143

Query: 531 LPKLVEQVALNAWKEQNGNAVCLHLRSTQRHLNSSGAQQKLAQALSDLTGTTV-ELTIVE 589
P +N + Q+ N++ ++ A+ S V E T V
Sbjct: 1144 EPARENDPTVNIKEPQSQT-------------NTTADTEQPAKETSSNVEQPVTESTTVN 1190

Query: 590 DDNPAVRTPLEWRQAIYEEKLAQARESIIADNNIQTLR 627
N V P A + + + + + +++R
Sbjct: 1191 TGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVR 1228


78  SC0812  SC0817  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
SC0812    -2      16       2.297395  ABC transporter membrane protein
SC0813    -1      15       2.454976  ABC transporter membrane protein
SC0814    -1      14       2.833437  multidrug ABC transporter ATPase
SC0815    -1      13       2.437970  hypothetical protein
SC0816    0       12       1.968372  DNA-binding transcriptional regulator
SC0817    -1      10       1.990277  ATP-dependent RNA helicase RhlE
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC0812    ABC2TRNSPORT  45     1e-07    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein

signature.
Length = 262

Score = 45.3 bits (107), Expect = 1e-07
Identities = 35/139 (25%), Positives = 60/139 (43%), Gaps = 5/139 (3%)

Query: 197 AREREQGTLDQLLVSPLTTWQIFVGKAVPALIVATFQATIVLAIGIWAYQIPFAGSLALF 256
R Q T + +L + L I +G+ A A IG+ A + + L+L
Sbjct: 92 GRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGA---GIGVVAAALGYTQWLSLL 148

Query: 257 YFTMVI--YGLSLVGFGLLISSLCATQQQAFIGVFVFMMPAILLSGYVSPVENMPVWLQN 314
Y VI GL+ G+++++L + + + P + LSG V PV+ +P+ Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 315 LTWINPIRHFTDITKQIYL 333
P+ H D+ + I L
Sbjct: 209 AARFLPLSHSIDLIRPIML 227


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0814    PF05272  32     0.009    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.009
Identities = 22/89 (24%), Positives = 29/89 (32%), Gaps = 21/89 (23%)

Query: 294 PRFEDAFIDLLGGAGTSESPLGSILHTVEGTAGETVIEAQELTKKFGDFAATDHVNFVVQ 353
PR E + +LG P + Q + K HV V++
Sbjct: 548 PRLEKWLVHVLGKTPDDYKP-------------RRLRYLQLVGKYI----LMGHVARVME 590

Query: 354 RGEIFG----LLGPNGAGKSTTFKMMCGL 378
G F L G G GKST + GL
Sbjct: 591 PGCKFDYSVVLEGTGGIGKSTLINTLVGL 619



Score = 29.7 bits (66), Expect = 0.044
Identities = 11/23 (47%), Positives = 13/23 (56%)

Query: 34 YVTGLVGPDGAGKTTLMRMLAGL 56
Y L G G GK+TL+ L GL
Sbjct: 597 YSVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC0815    RTXTOXIND  61     2e-12    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 60.6 bits (147), Expect = 2e-12
Identities = 44/262 (16%), Positives = 97/262 (37%), Gaps = 27/262 (10%)

Query: 79 YENALMQAKAGVSVAQAQYDLMLAGYRDEEIAQAAAAVRQAQAAYDYAQNFYNRQQGLWK 138
++N Q + + +A+ +LA E R + + + L +
Sbjct: 198 WQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQ 257

Query: 139 SRTISA--NDLENARSSRDQAQATLKSAQDKLSQYRTGNREQDI----AQAKASLEQAKA 192
N+L +S +Q ++ + SA+++ T + +I Q ++
Sbjct: 258 ENKYVEAVNELRVYKSQLEQIESEILSAKEEYQL-VTQLFKNEILDKLRQTTDNIGLLTL 316

Query: 193 QLAQAQLDLQDTTLIAPANGTLLTRAV-EPGSMLNAGSTVLTLSLT-RPVWVRAYVDERN 250
+LA+ + Q + + AP + + V G ++ T++ + + V A V ++
Sbjct: 317 ELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKD 376

Query: 251 LSQTQPGRDILLYTDGRPDKPYH---GKIGFVSPTAEFTPKTVETPDLRTDLVYRLRIIV 307
+ G++ ++ + P Y GK+ ++ A D R LV+ + I +
Sbjct: 377 IGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA--------IEDQRLGLVFNVIISI 428

Query: 308 T-------DADDALRQGMPVTV 322
+ + L GM VT
Sbjct: 429 EENCLSTGNKNIPLSSGMAVTA 450


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0816    HTHTETR  66     2e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 65.8 bits (160), Expect = 2e-15
Identities = 32/224 (14%), Positives = 72/224 (32%), Gaps = 25/224 (11%)

Query: 6 TTTKGEQAKSQLIAAALAQFGEYGLHATT-RDIAALAGQNIAAITYYFGSKEDLYLACAQ 64
T + ++ + ++ AL F + G+ +T+ +IA AG AI ++F K DL+ +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 65 WIADFLGEKFRPHAEKAERLFSQPAPD-RDAIRELILLACKNMIMLLTQEDTVNLSKFIS 123
+GE E + P R+ + ++ L E + +F+
Sbjct: 65 LSESNIGELEL---EYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV- 120

Query: 124 REQLSPTSAYQLVHEQVIDPLHTHLTRLVAA---YTGCDANDTRMILHTHALLGEVLAFR 180
E A + + + D + L + A +I+ + ++
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIM--RGYISGLM--- 175

Query: 181 LGKETILLRTGWPQFDEEKAELIYQTVTCHIDLILHGLTQRSLD 224
W + + ++ ++L
Sbjct: 176 ---------ENWLFAPQSFDLK--KEARDYVAILLEMYLLCPTL 208


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC0817    SECA  30     0.023    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.023
Identities = 20/67 (29%), Positives = 34/67 (50%), Gaps = 4/67 (5%)

Query: 246 QQVLVFTRTKHGANHLAEQLNKDGIRSAAIHG-NKSQGARTRALADFKSGDIRVLVATDI 304
Q VLV T + + ++ +L K GI+ ++ + A A A + + V +AT++
Sbjct: 450 QPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQAGYPAA---VTIATNM 506

Query: 305 AARGLDI 311
A RG DI
Sbjct: 507 AGRGTDI 513


79  SC0857  SC0864  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC0857    -1      11       0.996905   D-alanyl-D-alanine carboxypeptidase
SC0858    0       12       0.800718   DNA-binding transcriptional repressor DeoR
SC0859    1       12       1.027898   undecaprenyl pyrophosphate phosphatase
SC0860    1       12       0.845098   multidrug translocase
SC0861    2       14       0.160709   hypothetical protein
SC0862    2       12       0.391487   hypothetical protein
SC0863    2       15       -0.147385  paral regulator
SC0864    1       15       -0.311637  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC0857    BLACTAMASEA  47     5e-08    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 46.7 bits (111), Expect = 5e-08
Identities = 49/207 (23%), Positives = 78/207 (37%), Gaps = 25/207 (12%)

Query: 1 MTQYASSLRSLAAGSVLLFLFASPVKAEEQTIAPPGVDAR-AWILMDYASGKVLAEGNAD 59
M + SL A ++ L + ASP E+ ++ + R I MD ASG+ L AD
Sbjct: 1 MRYIRLCIISLLA-TLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRAD 59

Query: 60 EKLDPASLTKIMTSYVVGQALKAGKIKLTDMVTVGKDAWATGNPALRGSSVMFLKPGDQV 119
E+ S K++ V + AG +L + + +P V D +
Sbjct: 60 ERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSP------VSEKHLADGM 113

Query: 120 SVADLNKGIIIQSGNDACIALADYVAGSQESFIGLMNAYAKRLGLTNTT---FQTVHGLD 176
+V +L I S N A L V G + A+ +++G T ++T
Sbjct: 114 TVGELCAAAITMSDNSAANLLLATVGGPAG-----LTAFLRQIGDNVTRLDRWETELNEA 168

Query: 177 APGQF---STARDMA------LLGKAL 194
PG +T MA L + L
Sbjct: 169 LPGDARDTTTPASMAATLRKLLTSQRL 195


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0860    TCRTETB  42     3e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 42.2 bits (99), Expect = 3e-06
Identities = 65/356 (18%), Positives = 126/356 (35%), Gaps = 51/356 (14%)

Query: 48 QAGLDWVPTSMTAYLAGGMFLQWLLGPLSDRIGRRPVMLAGVVWFIVTCLATLLAKNIEQ 107
A +WV T+ + G + G LSD++G + ++L G++ + + +
Sbjct: 48 PASTNWVNTAFMLTFSIGTAV---YGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFS 104

Query: 108 FT-FLRFLQGISLCFIGAVGYAAIQESFEEAVCIKITALMANVALISPLLGPLVGAAWVH 166
RF+QG A+ + + K L+ ++ + +GP +G H
Sbjct: 105 LLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAH 164

Query: 167 VLPWEGMFILFAALAAIAFFGLQRAMPETATRRGE------------------------- 201
+ W +L + I L + + + +G
Sbjct: 165 YIHW-SYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSI 223

Query: 202 ------TLSFKALGRDYRLV---------IKNRRFVAGALALGFVSLPLLAWIAQSPIII 246
LSF + R V KN F+ G L G + + +++ P ++
Sbjct: 224 SFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMM 283

Query: 247 ISGEQLSSYEYG-LLQVPVFGALIAGNLVLARLTSRRTVRSLIVMGGWPIVAGLIITAAA 305
QLS+ E G ++ P ++I + L RR ++ +G + + +
Sbjct: 284 KDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTAS-- 341

Query: 306 TVVSSHAYLWMTAGLSVYAFGIGLANAGLVRLTLFSSDMSKGTVSAAMGMLQMLIF 361
+ +MT + V+ G GL+ V T+ SS + + A M +L F
Sbjct: 342 -FLLETTSWFMTIII-VFVLG-GLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSF 394


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0862    TCRTETB  33     0.002    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 33.3 bits (76), Expect = 0.002
Identities = 33/150 (22%), Positives = 65/150 (43%), Gaps = 6/150 (4%)

Query: 218 LLIGVVVLAMAFAEGSANDWL-PLLMVDGHGFSP-TSGSLIYAGFTLGMTVGRFTGGWFI 275
+IGV+ + F + + P +M D H S GS+I T+ + + + GG +
Sbjct: 258 FMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILV 317

Query: 276 DRYSRVTVVR-ASALM--GALGIGLIIFVDSDWVA-GVSVILWGLGASLGFPLTISAASD 331
DR + V+ + L ++ S ++ + +L GL + TI ++S
Sbjct: 318 DRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSL 377

Query: 332 TGPDAPTRVSVVATTGYLAFLVGPPLLGYL 361
+A +S++ T +L+ G ++G L
Sbjct: 378 KQQEAGAGMSLLNFTSFLSEGTGIAIVGGL 407


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0863    HTHTETR  47     6e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 47.3 bits (112), Expect = 6e-09
Identities = 17/80 (21%), Positives = 33/80 (41%)

Query: 7 RRANDPKRREKIIQATLEAVKTYGVHAVTHRKIAAIAQVPLGSMTYYFAGMDALLSEAFT 66
+ + R+ I+ L GV + + +IA A V G++ ++F L SE +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 67 LFTENMSRQYQDFFAQVTDA 86
L N+ ++ A+
Sbjct: 65 LSESNIGELELEYQAKFPGD 84


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC0864    TCRTETA  31     0.010    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 31.3 bits (71), Expect = 0.010
Identities = 21/106 (19%), Positives = 34/106 (32%), Gaps = 6/106 (5%)

Query: 394 LMIGMITFQFSNFSFGIGNAAGLLFAGIML-GFLRANHPTFG-YIPQ--GALNMVKEFGL 449
L++ + +L+ G ++ G A G YI + FG
Sbjct: 76 LLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGF 135

Query: 450 MVFMAGVGLSAGSGINNGLGAVGGQM--LIAGLVVSLVPVVICFLF 493
M G G+ AG + +G A + L + CFL
Sbjct: 136 MSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLL 181


80  SC1042  SC1046  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC1042    1       24       -4.372406  hypothetical protein
SC1043    1       24       -3.533998  outer protein
SC1044    0       17       -1.888643  pathogenicity island encoded protein: SPI3
SC1045    -1      15       -1.254620  copper resistance; histidine kinase
SC1046    -2      12       -0.009923  transcriptional regulatory protein YedW
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC1042    PF07824  165    1e-56    Type III secretion chaperone
		>PF07824#Type III secretion chaperone

Length = 120

Score = 165 bits (419), Expect = 1e-56
Identities = 33/114 (28%), Positives = 63/114 (55%), Gaps = 1/114 (0%)

Query: 1 MESLLNRLYDALGLDAPE-DEPLLIIDDGIQVYFNESDHTLEMCCPFMPLPDDILTLQHF 59
ME L + + ALG+ + + D+ +++DD + +Y + ++ + CPF LP++I L +
Sbjct: 1 MEDLADVICRALGIPSIDTDDQAIMLDDDVLIYIEKEGDSINLLCPFCALPENINDLIYA 60

Query: 60 LRLNYTSAVTIGADADNTALVALYRLPQTSTEEEALTGFELFISNVKQLKEHYA 113
L LNY+ + + D + +L+A L + E+ E +IS V+ LK+ +A
Sbjct: 61 LSLNYSEKICLATDDEGGSLIARLDLTGINEFEDIYVNTEYYISRVRWLKDEFA 114


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1043    TYPE3OMBPROT  665    0.0      Type III secretion system outer membrane B protein ...
		>TYPE3OMBPROT#Type III secretion system outer membrane B protein

family signature.
Length = 538

Score = 665 bits (1717), Expect = 0.0
Identities = 186/396 (46%), Positives = 254/396 (64%), Gaps = 5/396 (1%)

Query: 166 LNNQPWQTIKNTLTHNGHHYTNTQLPAAEMKIGAKDIFPSAYEGKGVCSWDTKNIHHANN 225
LNN+ W + ++H+G +Y PA+ MKIG K+IF Y GKG+C T+ H N
Sbjct: 146 LNNKNWGPVNKNISHHGKNYGFQLTPASHMKIGNKNIFVKEYNGKGICCASTRESDHIAN 205

Query: 226 LWMSTVSVHEDGKDKTLFCGIRHGVLSPYH-EKDPLLRHVGAENKAKEVLTAALFSKPEL 284
+W+S V V ++GK+ +F GIRHGV+S Y +K+ R V A NKA+E+++AAL+S+PEL
Sbjct: 206 MWLSKV-VDDEGKE--IFSGIRHGVISAYGLKKNSSERAVAARNKAEELVSAALYSRPEL 262

Query: 285 LNKALAGEAVSLKLVSVGLLTASNIFGKEGTMVEDQMRAWQSL-TQPGKMIHLKIRNKDG 343
L++AL+G+ V LK+VS LLT +++ G E +M++DQ+ A + L ++ G+ L IRN DG
Sbjct: 263 LSQALSGKTVDLKIVSTSLLTPTSLTGGEESMLKDQVNALKGLNSKRGEPTKLLIRNSDG 322

Query: 344 DLQTVKIKPDVAAFNVGVNELALKLGFGLKASDSYNAEALHQLLGNDLRPEARPGGWVGE 403
L+ V + V FN GVNELALK+G G + D N E++ LLG++ GGW E
Sbjct: 323 LLKEVSVNLKVVTFNFGVNELALKMGLGWRNVDKLNDESICSLLGDNFLKNGVIGGWAAE 382

Query: 404 WLAQYPDNYEVVNTLARQIKDIWKNNQHHKDGGEPYKLAQRLAMLAHEIDAVPAWNCKSG 463
+ + P V LA QIK+I D GEPYKL+QR+ +LA+ I AVP WNCKSG
Sbjct: 383 AIEKNPPCKNDVIYLANQIKEIINKKLQKNDNGEPYKLSQRMTLLAYTIGAVPCWNCKSG 442

Query: 464 KDRTGMMDSEIKREIISLHQTHMLSAPGSLPDSGGQKIFQKVLLNSGNLEIQKQNTGGAG 523
KDRTGM D+EIKREII H+T S S S +++F +L+NSGN+EIQ+ NTG G
Sbjct: 443 KDRTGMQDAEIKREIIRKHETGQFSQLNSKLSSEEKRLFSTILMNSGNMEIQEMNTGVPG 502

Query: 524 NKVMKNLSPEVLNLSYQKRVGDENIWQSVKGISSLI 559
NKVMK L L LSY +R+GD IW VKG SS +
Sbjct: 503 NKVMKKLPLSSLELSYSERIGDSKIWNMVKGYSSFV 538


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC1045    PF06580  34     0.001    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.7 bits (77), Expect = 0.001
Identities = 18/102 (17%), Positives = 38/102 (37%), Gaps = 15/102 (14%)

Query: 348 ILLQRVLSNLLTNAIRYSDENAVIRIESAYDDNVAEIRVANPGSHPADADKLFRRFWRGD 407
+L+Q ++ N + + I + I ++ D+ + V N GS K
Sbjct: 258 MLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE-------- 309

Query: 408 NARHTAGFGLGLSLVNA-IALLHGGSASYRYADEHNIFSVRL 448
G GL V + +L+G A + +++ + +
Sbjct: 310 ------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC1046    HTHFIS  79     2e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.5 bits (196), Expect = 2e-19
Identities = 29/117 (24%), Positives = 55/117 (47%), Gaps = 1/117 (0%)

Query: 2 KILLIEDNQKTIEWVRQGLTEAGYVVDYACDGRDGLHLALQEHYSLIILDIMLPGLDGWQ 61
IL+ +D+ + Q L+ AGY V + L++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLRALRTAHQS-PAICLTARDSVEDRVKGLEAGANDYLVKPFSFAELLARVRAQLRQ 117
+L ++ A P + ++A+++ +K E GA DYL KPF EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


81  SC1123  SC1132  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
SC1123    1       15       3.236557  flagellar basal body rod modification protein
SC1124    -2      16       3.485327  flagellar hook protein FlgE
SC1125    -1      15       2.425705  flagellar basal body rod protein FlgF
SC1126    -1      15       1.513856  flagellar basal body rod protein FlgG
SC1127    1       15       2.726684  flagellar basal body L-ring protein
SC1128    1       15       2.576784  flagellar basal body P-ring biosynthesis protein
SC1129    2       15       2.230478  flagellar rod assembly protein/muramidase FlgJ
SC1130    3       13       1.291430  flagellar hook-associated protein FlgK
SC1131    3       14       1.785360  flagellar hook-associated protein FlgL
SC1132    4       13       2.035656  ribonuclease E
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1123    SYCECHAPRONE  29     0.010    Gram-negative bacterial type III secretion SycE cha...
		>SYCECHAPRONE#Gram-negative bacterial type III secretion SycE

chaperone signature.
Length = 130

Score = 28.9 bits (64), Expect = 0.010
Identities = 16/34 (47%), Positives = 20/34 (58%), Gaps = 2/34 (5%)

Query: 44 LKNQDPTNPLQNNELTTQLAQISTVSGIEKLNTT 77
L N+ P N L NN L TQL + V G E+L T+
Sbjct: 89 LWNRQPLNSLDNNSLYTQLEML--VQGAERLQTS 120


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SC1124    FLGHOOKAP1  41     7e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.1 bits (96), Expect = 7e-06
Identities = 17/48 (35%), Positives = 29/48 (60%)

Query: 356 LTNGALEASNVDLSKELVNMIVAQRNYQSNAQTIKTQDQILNTLVNLR 403
L+N S V+L +E N+ Q+ Y +NAQ ++T + I + L+N+R
Sbjct: 499 LSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 37.6 bits (87), Expect = 9e-05
Identities = 22/60 (36%), Positives = 31/60 (51%), Gaps = 4/60 (6%)

Query: 2 SFSQAVSGLNAAATNLDVIGNNIANSATYGFKSGTASFAD----MFAGSKVGLGVKVAGI 57
+ A+SGLNAA L+ NNI++ G+ T A + AG VG GV V+G+
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SC1126    FLGHOOKAP1  44     4e-07    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 4e-07
Identities = 18/81 (22%), Positives = 36/81 (44%), Gaps = 14/81 (17%)

Query: 3 SSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQTTL 62
S + A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTL 47

Query: 63 PSGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 48 GAGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 21/41 (51%)

Query: 220 ETSNVNVAEELVNMIQVQRAYEINSKAVSTTDQMLQKLTQL 260
S VN+ EE N+ + Q+ Y N++ + T + + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1127    FLGLRINGFLGH  355    e-128    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 355 bits (911), Expect = e-128
Identities = 211/232 (90%), Positives = 223/232 (96%)

Query: 4 MQKYALHAYPVMALMVATLTGCAWIPAKPLVQGATTAQPIPGPVPVANGSIFQSAQPINY 63
MQK A H Y + +L+V +LTGCAWIP+ PLVQGAT+AQP+PGP PVANGSIFQSAQPINY
Sbjct: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60

Query: 64 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTSFGFDTVPRYLQGLFGNS 123
GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKT+FGFDTVPRYLQGLFGN+
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120

Query: 124 RADMEASGGNSFNGKGGANASNTFSGTLTVTVDQVLANGNLHVVGEKQIAINQGTEFIRF 183
RAD+EASGGN+FNGKGGANASNTFSGTLTVTVDQVL NGNLHVVGEKQIAINQGTEFIRF
Sbjct: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180

Query: 184 SGVVNPRTISGSNSVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 235
SGVVNPRTISGSN+VPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM
Sbjct: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1128    FLGPRINGFLGI  430    e-153    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 430 bits (1106), Expect = e-153
Identities = 153/362 (42%), Positives = 215/362 (59%), Gaps = 9/362 (2%)

Query: 7 LAGIVLALVATLAHAERIRDLTSVQGVRENSLIGYGLVVGLDGTGDQTTQTPFTTQTLNN 66
A L+ A RI+D+ S+Q R+N LIGYGLVVGL GTGD +PFT Q++
Sbjct: 14 SALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMRA 73

Query: 67 MLSQLGITVPTGTNMQLKNVAAVMVTASYPPFARQGQTIDVVVSSMGNAKSLRGGTLLMT 126
ML LGIT G + KN+AAVMVTA+ PPFA G +DV VSS+G+A SLRGG L+MT
Sbjct: 74 MLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIMT 132

Query: 127 PLKGVDSQVYALAQGNILVGGVGASAGGSSVQVNQLNGGRITNGAIIERELPTQFGAGNT 186
L G D Q+YA+AQG ++V G A +++ R+ NGAIIERELP++F
Sbjct: 133 SLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSVN 192

Query: 187 INLQLNDEDFTMAQQITDAINRAR----GYGSATALDARTVQVRVPSGNSSQVRFLADIQ 242
+ LQL + DF+ A ++ D +N G A D++ + V+ P + R +A+I+
Sbjct: 193 LVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRV-ADLTRLMAEIE 251

Query: 243 NMEVNVTPQDAKVVINSRTGSVVMNREVTLDSCAVAQGNLSVTVNRQLNVNQPNTPFGGG 302
N+ V T AKVVIN RTG++V+ +V + AV+ G L+V V V QP PF G
Sbjct: 252 NLTVE-TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSRG 309

Query: 303 QTVVTPQTQIDLRQSGGSLQSVRSSANLNSVVRALNALGATPMDLMSILQSMQSAGCLRA 362
QT V PQT I Q G + ++ +L ++V LN++G +++ILQ ++SAG L+A
Sbjct: 310 QTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQA 368

Query: 363 KL 364
+L
Sbjct: 369 EL 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC1129    FLGFLGJ  499    0.0      Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 499 bits (1285), Expect = 0.0
Identities = 263/316 (83%), Positives = 289/316 (91%), Gaps = 3/316 (0%)

Query: 1 MIGDGKLLASAAWDAQSLNELKAKAGQDPAANIRPVARQVEGMFVQMMLKSMREALPKDG 60
MI D KLLASAAWDAQSLNELKAKAG+DPAANIRPVARQVEGMFVQMMLKSMR+ALPKDG
Sbjct: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60

Query: 61 LFSSDQTRLYTSMYDQQIAQQMTAGKGLGLADMMVKQMTGGQTMPADDAPQVPLKFSLET 120
LFSS+ TRLYTSMYDQQIAQQMTAGKGLGLA+MMVKQMT Q +P + P P+KF LET
Sbjct: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120

Query: 121 VNSYQNQALTQLVRKAIPKTPDSSDAPLSGDSKDFLARLSLPARLASEQSGVPHHLILAQ 180
V YQNQAL+QLV+KA+P+ D S L GDSK FLA+LSLPA+LAS+QSGVPHHLILAQ
Sbjct: 121 VVRYQNQALSQLVQKAVPRNYDDS---LPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQ 177

Query: 181 AALESGWGQRQILRENGEPSYNVFGVKATASWKGPVTEITTTEYENGEAKKVKAKFRVYS 240
AALESGWGQRQI RENGEPSYN+FGVKA+ +WKGPVTEITTTEYENGEAKKVKAKFRVYS
Sbjct: 178 AALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYS 237

Query: 241 SYLEALSDYVALLTRNPRYAAVTTAATAEQGAVALQNAGYATDPNYARKLTSMIQQLKAM 300
SYLEALSDYV LLTRNPRYAAVTTAA+AEQGA ALQ+AGYATDP+YARKLT+MIQQ+K++
Sbjct: 238 SYLEALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSI 297

Query: 301 SEKVSKTYSANLDNLF 316
S+KVSKTYS N+DNLF
Sbjct: 298 SDKVSKTYSMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SC1130    FLGHOOKAP1  664    0.0      Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 664 bits (1714), Expect = 0.0
Identities = 438/553 (79%), Positives = 487/553 (88%), Gaps = 8/553 (1%)

Query: 2 SSLINHAMSGLNAAQAALNTVSNNINNYNVAGYTRQTTILAQANSTLGAGGWIGNGVYVS 61
SSLIN+AMSGLNAAQAALNT SNNI++YNVAGYTRQTTI+AQANSTLGAGGW+GNGVYVS
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 62 GVQREYDAFITNQLRGAQNQSSGLTTRYEQMSKIDNLLADKSSSLSGSLQSFFTSLQTLV 121
GVQREYDAFITNQLR AQ QSSGLT RYEQMSKIDN+L+ +SSL+ +Q FFTSLQTLV
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 122 SNAEDPAARQALIGKAEGLVNQFKTTDQYLRDQDKQVNIAIGSSVAQINNYAKQIANLND 181
SNAEDPAARQALIGK+EGLVNQFKTTDQYLRDQDKQVNIAIG+SV QINNYAKQIA+LND
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 182 QISRMTGVGAGASPNDLLDQRDQLVSELNKIVGVEVSVQDGGTYNLTMANGYTLVQGSTA 241
QISR+TGVGAGASPN+LLDQRDQLVSELN+IVGVEVSVQDGGTYN+TMANGY+LVQGSTA
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 242 RQLAAVPSSADPTRTTVAYVDEAAGNIEIPEKLLNTGSLGGLLTFRSQDLDQTRNTLGQL 301
RQLAAVPSSADP+RTTVAYVD AGNIEIPEKLLNTGSLGG+LTFRSQDLDQTRNTLGQL
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 302 ALAFADAFNAQHTKGYDADGNKGKDFFSIGSPVVYSNSNNADKTVSLTAKVVDSTKVQAT 361
ALAFA+AFN QH G+DA+G+ G+DFF+IG P V N+ N V++ A V D++ V AT
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGD-VAIGATVTDASAVLAT 359

Query: 362 DYKIVFDGTDWQVTRTADNTTFTATKDADGKLEIDGLKVTVGTGAQKNDSFLLKPVSNAI 421
DYKI FD WQVTR A NTTFT T DA+GK+ DGL++T NDSF LKPVS+AI
Sbjct: 360 DYKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAI 419

Query: 422 VDMNVKVTNEAEIAMASESKLDPDVDTGDSDNRNGQALLDLQ-NSNVVGGNKTFNDAYAT 480
V+M+V +T+EA+IAMASE D GDSDNRNGQALLDLQ NS VGG K+FNDAYA+
Sbjct: 420 VNMDVLITDEAKIAMASEE------DAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYAS 473

Query: 481 LVSDVGNKTSTLKTSSTTQANVVKQLYKQQQSVSGVNLDEEYGNLQRYQQYYLANAQVLQ 540
LVSD+GNKT+TLKTSS TQ NVV QL QQQS+SGVNLDEEYGNLQR+QQYYLANAQVLQ
Sbjct: 474 LVSDIGNKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQ 533

Query: 541 TANALFDALLNIR 553
TANA+FDAL+NIR
Sbjct: 534 TANAIFDALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC1131    FLAGELLIN  41     4e-06    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 41.2 bits (96), Expect = 4e-06
Identities = 30/138 (21%), Positives = 59/138 (42%)

Query: 1 MRISTQMMYEQNMSGITNSQAEWMKLGEQMSTGKRVTNPSDDPIAASQAVVLSQAQAQNS 60
I+T + + + SQ+ E++S+G R+ + DD + A + +
Sbjct: 2 QVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLT 61

Query: 61 QYALARTFATQKVSLEESVLSQVTTAIQTAQEKIVYAGNGTLSDDDRASLATDLQGIRDQ 120
Q + E L+++ +Q +E V A NGT SD D S+ ++Q ++
Sbjct: 62 QASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEE 121

Query: 121 LMNLANSTDGNGRYIFAG 138
+ ++N T NG + +
Sbjct: 122 IDRVSNQTQFNGVKVLSQ 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC1132    IGASERPTASE  55     2e-09    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 55.1 bits (132), Expect = 2e-09
Identities = 44/263 (16%), Positives = 84/263 (31%), Gaps = 34/263 (12%)

Query: 513 PSEEEYAERKRPEQPALATFAMPDVPPAPTPVEPAVSVATAKKDNVAAAQPAQPGLFSRF 572
P E+ + DVP P+ + A+ D PA
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSN-----NEEIARVDEAPVPPPAPA------ 1031

Query: 573 LNALKQLFSGEETKAVETAAPKAEEKAERQQDRRKPRQNNRRDRNERRDTRDNR----AG 628
+ E +K K E+ A QN + + + + N
Sbjct: 1032 TPSETTETVAENSKQESKTVEKNEQDATE-----TTAQNREVAKEAKSNVKANTQTNEVA 1086

Query: 629 RDGGESRDDNRRNRRQTQQQNAEAR---DTRQQETAEKVKTGDEQQQTPRRERSRRRNDD 685
+ G E+++ ++T E + +T + + KV + Q +P++E+S
Sbjct: 1087 QSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTS----QVSPKQEQSETVQPQ 1142

Query: 686 KRQAQQEVKALNREEQPVQETEQEERVQQVQPRRKQRQLNQKVRFTNSAVVETVDTPVVV 745
A++ +N +E Q + QP ++ N + T S V T ++ V
Sbjct: 1143 AEPARENDPTVNIKEPQSQTNTTAD---TEQPAKETSS-NVEQPVTESTTVNTGNSVVEN 1198

Query: 746 DEPRPVENVEQPVPAPRTELAKV 768
E + P +E +
Sbjct: 1199 PENTTPATTQ---PTVNSESSNK 1218



Score = 39.3 bits (91), Expect = 1e-04
Identities = 48/331 (14%), Positives = 81/331 (24%), Gaps = 45/331 (13%)

Query: 630 DGGESRDDNRRNRRQTQQQNAEARDTRQQETAEKVKTGDEQQQTPRRERSRRRNDDKRQA 689
D G + R + N E Q + T + Q S +
Sbjct: 963 DLGAWKYKLRNVNGRYDLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDE 1022

Query: 690 QQEVKALNREEQPVQETEQEERVQQVQPRRKQRQLNQKVRFTNSAVVETVDTPVVVDEPR 749
ET E Q+ + K Q + N V + V +
Sbjct: 1023 APVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKE-AKSNVKANTQ 1081

Query: 750 PVENVEQPVPAPRTELAKVDLPVVADIAPEQDDSVEPRDNTGMPRRSRRSPRHLRVSGQR 809
E + T+ + A + E+ VE +++ P+ +
Sbjct: 1082 TNEVAQSGSETKETQTTETKET--ATVEKEEKAKVETE-------KTQEVPKVTSQVSPK 1132

Query: 810 RRRYRDERYPTQSPMPLTVACASPEMASGKVWIRYPIVRPQETQVVDEQREADLALPQPV 869
+ + + + P V +E Q AD P
Sbjct: 1133 QEQSETVQPQAEPARE-----------------NDPTVNIKEPQS-QTNTTADTEQPAKE 1174

Query: 870 VAEPQVTAATVALEPQASVQAVENVVVEPQTVAEPQTPEVVEVETTHPEVIAAPVDEQPQ 929
Q V + + PE TT P V + ++
Sbjct: 1175 T-------------SSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKN 1221

Query: 930 LIAESDTPVAQEVIADAEPVAETADASITVA 960
S V V EP +++ TVA
Sbjct: 1222 RHRRSVRSVPHNV----EPATTSSNDRSTVA 1248


82  SC1408  SC1415  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SC1408    0       29       -5.952442   tetrathionate reductase complex: response
SC1409    2       31       -7.211316   hypothetical protein
SC1410    2       34       -8.198387   inner membrane protein
SC1411    4       39       -9.862522   MerR family transcriptional regulator
SC1412    5       43       -11.036208  secretion system transcriptional activator
SC1413    5       43       -10.840142  secretion system regulator:sensor component
SC1414    7       43       -10.662204  secretion system apparatus protein SsaB
SC1415    5       41       -9.755618   secretion system apparatus protein SsaC
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC1408    HTHFIS  84     2e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.5 bits (209), Expect = 2e-21
Identities = 31/127 (24%), Positives = 56/127 (44%)

Query: 2 ATIHLLDDDTAVTNACAFLLESLGYDVKCWTQGADFLAQASLYQAGVVLLDMRMPVLDGQ 61
ATI + DDD A+ L GYDV+ + A + +V+ D+ MP +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 GVHDALRQCGSTLAVVFLTGHGDVPMAVEQMKRGAVDFLQKPVSVKPLQAALERALTVSS 121
+ +++ L V+ ++ A++ ++GA D+L KP + L + RAL
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 122 AAVARRE 128
++ E
Sbjct: 124 RRPSKLE 130


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC1412    HTHFIS  66     7e-15    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 66.0 bits (161), Expect = 7e-15
Identities = 28/119 (23%), Positives = 50/119 (42%), Gaps = 2/119 (1%)

Query: 1 MKEYKILLVDDHEIIINGIMNALLPWPHFKIVEHVKNGLEVYNACCAYEPDILILDLSLP 60
M IL+ DD I + AL + V N ++ A + D+++ D+ +P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGY--DVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 GINGLDIIPQLHQRWPAMNILVYTAYQQEYMTIKTLAAGANGYVLKSSSQQVLLAALQT 119
N D++P++ + P + +LV +A IK GA Y+ K L+ +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC1413    HTHFIS  69     3e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 69.1 bits (169), Expect = 3e-14
Identities = 31/156 (19%), Positives = 58/156 (37%), Gaps = 13/156 (8%)

Query: 691 ILLVDDADINRDIISKMLVSLGQHVTIAASSNEALTLSQQQRFDLVLIDIRMPEIDGIEC 750
IL+ DD R ++++ L G V I +++ DLV+ D+ MP+ + +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 751 VQLWHDEPNNLDPDCMFVALSASVATEDIHRCKKNGIHHYITKPVTLATLARYISIAAEY 810
+ PD + +SA + + G + Y+ KP L L
Sbjct: 66 LP----RIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIG-------- 113

Query: 811 QLLRNIELQEQDPSRCSALLAT-DDMVINSKIFQSL 845
+ R + ++ PS+ +V S Q +
Sbjct: 114 IIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEI 149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1415    TYPE3OMGPROT  581    0.0      Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 581 bits (1498), Expect = 0.0
Identities = 158/500 (31%), Positives = 259/500 (51%), Gaps = 15/500 (3%)

Query: 11 LLFILNTVKSDELSWKGNDFTLYARQMPLAEVLHLLSENYDTAITISPLITATFSGKIPP 70
LL + + + EL W + A+ L ++L NYD + +S I SG+
Sbjct: 17 LLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDFGANYDATVVVSDKINDKVSGQFEH 76

Query: 71 GPPVDILNNLAAQYDLLTWFDGSMLYVYPASLLKHQVITFNILSTGRFIHYLRSQNILSS 130
P D L ++A+ Y+L+ ++DG++LY++ S + ++I L+ I
Sbjct: 77 DNPQDFLQHIASLYNLVWYYDGNVLYIFKNSEVASRLIRLQESEAAELKQALQRSGIWE- 135

Query: 131 PGCEVKEITGTRAVEVSGVPSCLTRISQLASVLDNALIKR--KDSAVSVSIYTLKYATAM 188
P + R V VSG P L + Q A+ L+ R K A+++ I+ LKYA+A
Sbjct: 136 PRFGWRPDASNRLVYVSGPPRYLELVEQTAAALEQQTQIRSEKTGALAIEIFPLKYASAS 195

Query: 189 DTQYQYRDQSVVVPGVVSVL-REMSKTSVPASSTTN-----GSPATQALPMFAADPRQNA 242
D YRD V PGV ++L R +S ++ + N + A ADP NA
Sbjct: 196 DRTIHYRDDEVAAPGVATILQRVLSDATIQQVTVDNQRIPQAATRASAQARVEADPSLNA 255

Query: 243 VIVRDYAANMAGYRKLITELDQRQQMIEISVKIIDVNAGDINQLGIDWGTAVSLGG---- 298
+IVRD M Y++LI LD+ IE+++ I+D+NA + +LG+DW + G
Sbjct: 256 IIVRDSPERMPMYQRLIHALDKPSARIEVALSIVDINADQLTELGVDWRVGIRTGNNHQV 315

Query: 299 --KKIAFNTGLNDGGASGFSTVISDTSNFMVRLNALEKSSQAYVLSQPSVVTLNNIQAVL 356
K + + GA G + R+N LE A V+S+P+++T N QAV+
Sbjct: 316 VIKTTGDQSNIASNGALGSLVDARGLDYLLARVNLLENEGSAQVVSRPTLLTQENAQAVI 375

Query: 357 DKNITFYTKLQGEKVAKLESITTGSLLRVTPRLLNDNGTQKIMLNLNIQDGQQSDTQSET 416
D + T+Y K+ G++VA+L+ IT G++LR+TPR+L +I LNL+I+DG Q S
Sbjct: 376 DHSETYYVKVTGKEVAELKGITYGTMLRMTPRVLTQGDKSEISLNLHIEDGNQKPNSSGI 435

Query: 417 DPLPEVQNSEIASQATLLAGQSLLLGGFKQGKQIHSQNKIPLLGDIPVVGHLFRNDTTQV 476
+ +P + + + + A + GQSL++GG + + + +K+PLLGDIP +G LFR +
Sbjct: 436 EGIPTISRTVVDTVARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELT 495

Query: 477 HSVIRLFLIKASVVNNGISH 496
+RLF+I+ +++ GI+H
Sbjct: 496 RRTVRLFIIEPRIIDEGIAH 515


83  SC1439  SC1447  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC1439    3       34       -8.178759  type III secretion system protein
SC1440    1       28       -7.501595  type III secretion system protein
SC1441    0       23       -6.554404  secretion system apparatus protein SsaS
SC1442    0       19       -4.754656  secretion system apparatus protein SsaT
SC1443    -2      15       -2.352635  secretion system apparatus protein SsaU
SC1444    -2      10       0.124266   **multidrug efflux protein
SC1445    -2      10       -0.137108  riboflavin synthase subunit alpha
SC1446    -1      11       -0.190610  cyclopropane-fatty-acyl-phospholipid synthase
SC1447    0       12       0.694327   inner membrane transport protein YdhC
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1439    FLGMOTORFLIN  51     3e-10    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 51.1 bits (122), Expect = 3e-10
Identities = 21/67 (31%), Positives = 38/67 (56%)

Query: 247 LEQIPQQVLFEIGRASLEIGQLRQLKTGDVLPVGGCFAPEVTIRVNDRIIGQGELIACGN 306
+ IP ++ E+GR + I +L +L G V+ + G + I +N +I QGE++ +
Sbjct: 57 IMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVAD 116

Query: 307 EFMVRIT 313
++ VRIT
Sbjct: 117 KYGVRIT 123


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1440    TYPE3IMPPROT  231    9e-80    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 231 bits (592), Expect = 9e-80
Identities = 79/215 (36%), Positives = 130/215 (60%), Gaps = 8/215 (3%)

Query: 8 LQLIGILFLLSILPLIIVMGTSFLKLAVVFSILRNALGIQQVPPNIALYGLALVLSLFIM 67
+ LI +L ++LP II GT F+K ++VF ++RNALG+QQ+P N+ L G+AL+LS+F+M
Sbjct: 5 ISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLSMFVM 64

Query: 68 GPTLLAVKERWHPVQVAGAPFWT-SEWDSKALAPYRQFLQKNSEEKEANYFRNLIKRTWP 126
P + + V + S+ + L YR +L K S+ + +F N +
Sbjct: 65 WPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQLKRQY 124

Query: 127 ED-------IKRKIKPDSLLILIPAFTVSQLTQAFRIGLLIYLPFLAIDLLISNILLAMG 179
+ K +I+ S+ L+PA+ +S++ AF+IG +YLPF+ +DL++S++LLA+G
Sbjct: 125 GEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVLLALG 184

Query: 180 MMMVSPMTISLPFKLLIFLLAGGWDLTLAQLVQSF 214
MMM+SP+TIS P KL++F+ GW L L+ +
Sbjct: 185 MMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQY 219


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1441    TYPE3IMQPROT  72     9e-21    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 72.5 bits (178), Expect = 9e-21
Identities = 30/85 (35%), Positives = 50/85 (58%)

Query: 4 SELTQFVTQLLWIVLFTSMPVVLVASVVGVIVSLVQALTQIQDQTLQFMIKLLAIAITLM 63
+L + L++VL S +VA+++G++V L Q +TQ+Q+QTL F IKLL + + L
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 VSYPWLSGILLNYTRQIMLRIGEHG 88
+ W +LL+Y RQ++ G
Sbjct: 62 LLSGWYGEVLLSYGRQVIFLALAKG 86


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1442    TYPE3IMRPROT  163    7e-52    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 163 bits (415), Expect = 7e-52
Identities = 55/229 (24%), Positives = 101/229 (44%), Gaps = 5/229 (2%)

Query: 8 WLIALAVAFIRPLSLSLLLPLLKSGSLGAALLRNGVLMSLTFPILPIIYQQKIMMHIGKD 67
WL +R L+L P+L S+ ++ G+ M +TF I P + + +
Sbjct: 12 WLNLYFWPLLRVLALISTAPILSERSVPK-RVKLGLAMMITFAIAPSLPANDVPVF---S 67

Query: 68 YSWLGLVTGEVIIGFLIGFCAAVPFWAVDMAGFLLDTLRGATMGTIFNSTIEAETSLFGL 127
+ L L +++IG +GF F AV AG ++ G + T + +
Sbjct: 68 FFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLAR 127

Query: 128 LFSQFLCVIFFISGGMEFILNILYESYQYLPPGRTLLFDQQFLKYIQAEWRTLYQLCISF 187
+ ++F G +++++L +++ LP G L FL +A ++ +
Sbjct: 128 IMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSL-IFLNGLML 186

Query: 188 SLPAIICMVLADLALGLLNRSAQQLNVFFLSMPLKSILVLLTLLISFPY 236
+LP I ++ +LALGLLNR A QL++F + PL + + + P
Sbjct: 187 ALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPL 235


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1443    TYPE3IMSPROT  385    e-136    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 385 bits (990), Expect = e-136
Identities = 126/350 (36%), Positives = 204/350 (58%), Gaps = 4/350 (1%)

Query: 2 SEKTEQPTEKKLRDGRKEGQVVKSIEITSLFQLIALYLYFHFFTEKMILILIESITFTLQ 61
EKTEQPT KK+RD RK+GQV KS E+ S ++AL ++ + + +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAE 62

Query: 62 LVNKPFSYALTQL-SHALIESLTSALLFLGAGVIVATVGSVFLQVGVVIASKAIGFKSEH 120
PFS AL+ + + L+E L ++A + S +Q G +I+ +AI +
Sbjct: 63 QSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMA-IASHVVQYGFLISGEAIKPDIKK 121

Query: 121 INPVSNFKQIFSLHSVVELCKSSLKVIMLSLIFAFFFYYYASTFRALPYCGLACGLLVVS 180
INP+ K+IFS+ S+VE KS LKV++LS++ T LP CG+ C ++
Sbjct: 122 INPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLG 181

Query: 181 SLIKWLWVGVMAFYIVVGILDYSFQYYKIRKDLKMSKDDVKQEHKDLEGDPQMKTRRREM 240
+++ L V ++V+ I DY+F+YY+ K+LKMSKD++K+E+K++EG P++K++RR+
Sbjct: 182 QILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQF 241

Query: 241 QSEIQSGSLAQSVKQSVAVVRNPTHIAVGLGYHPTDMPIPRVLEKGSDAQANYIVNIAER 300
EIQS ++ ++VK+S VV NPTHIA+G+ Y + P+P V K +DAQ + IAE
Sbjct: 242 HQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEE 301

Query: 301 NCIPVVENVELARSLFFEVERGDKIPETLFEPVAALLRMVMK--IDYAHS 348
+P+++ + LAR+L+++ IP E A +LR + + I+ HS
Sbjct: 302 EGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQNIEKQHS 351


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC1447    TCRTETB  76     2e-17    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 76.5 bits (188), Expect = 2e-17
Identities = 48/194 (24%), Positives = 84/194 (43%), Gaps = 3/194 (1%)

Query: 8 LVWLAGLSMLGFLATDMYLPAFAAIRADLQTPAAAVSASLSLFLAGFAVAQLLWGPLSDR 67
L+WL LS L + + I D P A+ + + F+ F++ ++G LSD+
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 68 YGRKPILLLGLSIFALGSLGMLWVESAAALLTL-RFVQAVGVCAATVIWQALVTDYYPSQ 126
G K +LL G+ I GS+ S +LL + RF+Q G A + +V Y P +
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 127 KINRIFATIMPLVGLSPALAPLLGSWILTHFSWQAIFATLFVITLLLMLPALRLKPSGKA 186
+ F I +V + + P +G I + W + L + ++ +P L +
Sbjct: 136 NRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL--LIPMITIITVPFLMKLLKKEV 193

Query: 187 RTEGQDKLTFATLL 200
R +G + L+
Sbjct: 194 RIKGHFDIKGIILM 207


84  SC1572  SC1576  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC1572    -1      12       0.886297   new outer membrane protein; porin
SC1573    -2      11       1.547529   hypothetical protein
SC1574    -1      11       1.325596   methyl viologen resistance
SC1575    -1      11       0.333944   TetR family transcriptional regulator
SC1576    -2      11       -1.132152  major facilitator superfamily nitrate extrusion
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SC1572    ECOLIPORIN  510    0.0      E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 510 bits (1314), Expect = 0.0
Identities = 247/385 (64%), Positives = 286/385 (74%), Gaps = 25/385 (6%)

Query: 1 MKLKLVAVAVTSLLAAGVVNAAEVYNKDGNKLDLYGKVHAQHYFSDDNGSDGDKTYARLG 60
MK K++A+ + +LLAAG +AAE+YNKDGNKLDLYGKV HYFSDD+ DGD+TY R+G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 FKGETQINDQLTGFGQWEYEFKGNRTESQGADKDKTRLAFAGLKFADYGSFDYGRNYGVA 120
FKGETQINDQLTG+GQWEY + N TE +GA+ TRLAFAGLKF DYGSFDYGRNYGV
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGANS-WTRLAFAGLKFGDYGSFDYGRNYGVL 119

Query: 121 YDIGAWTDVLPEFGGDTWTQTDVFMTGRTTGVATYRNTDFFGLVEGLNFAAQYQGKNDRD 180
YD+ WTD+LPEFGGD++T D +MTGR GVATYRNTDFFGLV+GLNFA QYQGKN+
Sbjct: 120 YDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQ 179

Query: 181 GAY----------------ESNGDGFGLSATYEY-EGFGVGAAYAKSDRTNNQVKAASNL 223
A NGDGFG+S TY+ GF GAAY SDRTN QV A
Sbjct: 180 SADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAG-GT 238

Query: 224 NAAGKNAEVWAAGLKYDANNIYLATTYSETLNMTTFGE-DAAGDAFIANKTQNFEAVAQY 282
A G A+ W AGLKYDANNIYLAT YSET NMT +G+ D D +ANKTQNFE AQY
Sbjct: 239 IAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQY 298

Query: 283 QFDFGLRPSIAYLKSKGKNLGT----YGDQDLVEYIDVGATYYFNKNMSTFVDYKINLLD 338
QFDFGLRP++++L SKGK+L D+DLV+Y DVGATYYFNKN ST+VDYKINLLD
Sbjct: 299 QFDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLLD 358

Query: 339 DSD-FTKAAKVSTDNIVAVGLNYQF 362
D D F K A +STD+IVA+G+ YQF
Sbjct: 359 DDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC1574    TCRTETB  396    e-136    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 396 bits (1019), Expect = e-136
Identities = 87/395 (22%), Positives = 171/395 (43%), Gaps = 15/395 (3%)

Query: 20 IDATVLHVAAPTLSMTLGASGNELLWIIDIYSLVMAGMVLPMGALGDRIGFKRLLMLGGT 79
++ VL+V+ P ++ W+ + L + G L D++G KRLL+ G
Sbjct: 28 LNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGII 87

Query: 80 LFGLASLAAAFSYT-ASWLIATRVLLAIGAAMIVPATLAGIRATFCEEKHRNMALGVWAA 138
+ S+ ++ S LI R + GAA PA + + A + +++R A G+ +
Sbjct: 88 INCFGSVIGFVGHSFFSLLIMARFIQGAGAAAF-PALVMVVVARYIPKENRGKAFGLIGS 146

Query: 139 VGSGGAAFGPLIGGILLEHFYWGSVFLINVPIVLVVMGLTARYVPRQAGRRDQPLNLGHA 198
+ + G GP IGG++ + +W +L+ +P++ ++ + ++ R ++
Sbjct: 147 IVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGI 204

Query: 199 VMLIIAILLLVYSAKTALKGSLSLWAISLTLLTGTLLLGLFIRTQLATSRPMIDMRLFTH 258
+++ + I+ + + L + +S +F++ + P +D L +
Sbjct: 205 ILMSVGIVFFMLFTTSYSISFLIVSVLS---------FLIFVKHIRKVTDPFVDPGLGKN 255

Query: 259 RIILSGVVMAMTAMITLVGFELLMAQELQFVHGLSPYEAG-VFMLPVMVASGFSGPIAGA 317
+ GV+ T+ GF ++ ++ VH LS E G V + P ++ G I G
Sbjct: 256 IPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGI 315

Query: 318 LVSRLGLRLVATGGMALSALSFYGLAMTDFSTQQWQAWGLMALLGFSAASALLASTSAIM 377
LV R G V G+ ++SF + +T + ++ +LG + + + ST
Sbjct: 316 LVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSS 375

Query: 378 AAAPAEKAAAAGAIETMAYELGAGLGIAIFGLLLS 412
+ E A + ++ L G GIAI G LLS
Sbjct: 376 SLKQQEAGAGMSLLNFTSF-LSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC1575    HTHTETR  51     3e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 50.8 bits (121), Expect = 3e-10
Identities = 26/167 (15%), Positives = 54/167 (32%), Gaps = 9/167 (5%)

Query: 5 NRDERREVILQAAMRVALAEGFTAMTVRRIASEADVAAGQVHHHFSSAGEL-KALAFVH- 62
E R+ IL A+R+ +G ++ ++ IA A V G ++ HF +L + +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 63 -----LIRTLLDAGQVPPPATWRARLHAMLGS--EDGGFEPYIKLWREAQILADRDPHIK 115
L P + R L +L S + +++ ++
Sbjct: 68 SNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQ 127

Query: 116 DAYLLTMQMWHEETVTIIEQGKQAGEFTFTANATDIAWRLIALVCGL 162
A ++ ++ +A A + + GL
Sbjct: 128 QAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGL 174


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC1576    TCRTETB  29     0.040    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.1 bits (65), Expect = 0.040
Identities = 50/272 (18%), Positives = 95/272 (34%), Gaps = 36/272 (13%)

Query: 126 TPFGVFMLIALLCGFAGANF-ASSMGNISFFFPKARQGSALGINGGLGNLGVSVMQLIA- 183
+ F + ++ + G A F A M ++ + PK +G A G+ G + +G V I
Sbjct: 101 SFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGG 160

Query: 184 --------PLVIFLPIFTFLGV---RSVPQPDGSLLALTNAAWIWVPLLAVATLAAWFGM 232
++ +P+ T + V + + + + + I + + + +
Sbjct: 161 MIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTS 220

Query: 233 NDIGSSKASVASQLPVLKRL--------------HLWLLSLLYLATFGSFIGFSAGFAML 278
I SV S L +K + ++ + + G G AGF +
Sbjct: 221 YSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCG--GIIFGTVAGFVSM 278

Query: 279 AKTQFPDVNILQLAFFGP---FIGALARSA----GGVISDKFGGVRVTLINFIFMALFTA 331
DV+ L A G F G ++ GG++ D+ G + V I F+++
Sbjct: 279 VPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFL 338

Query: 332 LLFLTLPGSGAGSFSAFYLVFMGLFLTAGLGS 363
L + V GL T + S
Sbjct: 339 TASFLLETTSWFMTIIIVFVLGGLSFTKTVIS 370


85  SC1682  SC1688  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC1682    0       13       0.900181   phage shock protein B
SC1683    -1      12       0.613813   phage shock protein PspA
SC1684    -2      13       0.878430   phage shock protein operon transcriptional
SC1685    -2      13       0.161457   peptide ABC transporter substrate-binding
SC1686    -1      17       -3.146454  peptide ABC transporter
SC1687    0       20       -4.486726  peptide ABC transporter
SC1688    0       25       -6.342701  peptide ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1682    MPTASEINHBTR  26     0.015    Metalloprotease inhibitor signature.
		>MPTASEINHBTR#Metalloprotease inhibitor signature.

Length = 122

Score = 25.7 bits (56), Expect = 0.015
Identities = 6/43 (13%), Positives = 14/43 (32%)

Query: 30 AGRGELSQSEQQRLLQLTDDAQRMRERIQALEDILDAEHPNWR 72
AG+ + + + A + + E L + +W
Sbjct: 37 AGQLGIEATGSGVCAGPAEQANALAGDVACAEQWLGDKPVSWS 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC1683    RTXTOXIND  28     0.028    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 28.3 bits (63), Expect = 0.028
Identities = 19/104 (18%), Positives = 43/104 (41%), Gaps = 5/104 (4%)

Query: 40 LVEVRSNSARALAEKKQLSRRIEQATTQQTEWQEKAELA-LRKDKDDLARAALIEKQKLT 98
+ + R + +L K+ +++ + + EL + + + L K++
Sbjct: 232 VEKSRLDDFSSLLHKQAIAK-HAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQ 290

Query: 99 DLIATLEQEVTLVDDTLARMKKEIGELENKLSETRARQQALMLR 142
+ + E+ D L + IG L +L++ RQQA ++R
Sbjct: 291 LVTQLFKNEIL---DKLRQTTDNIGLLTLELAKNEERQQASVIR 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC1684    HTHFIS  344    e-118    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 344 bits (883), Expect = e-118
Identities = 124/341 (36%), Positives = 176/341 (51%), Gaps = 22/341 (6%)

Query: 6 DNLLGEANRFLEVLEQVSRLAPLDKPVLIIGERGTGKELIANRLHYLSSRWQGPLISLNC 65
L+G + E+ ++RL D ++I GE GTGKEL+A LH R GP +++N
Sbjct: 137 MPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINM 196

Query: 66 AALNENLLDSELFGHEAGAFTGAQKRHPGRFERADGGTLFLDELATAPMLVQEKLLRVIE 125
AA+ +L++SELFGHE GAFTGAQ R GRFE+A+GGTLFLDE+ PM Q +LLRV++
Sbjct: 197 AAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQ 256

Query: 126 YGELERVGGSQPLQVNVRLVCATNADLPAMVKEGTFRADLLDRLAFDVVQLPPLRERQSD 185
GE VGG P++ +VR+V ATN DL + +G FR DL RL ++LPPLR+R D
Sbjct: 257 QGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAED 316

Query: 186 IMLMAEHFAIQMCRELRLPLFPGFTDRAKETLLHYAWPGNVRELKNVVERSVYRHGSSE- 244
I + HF Q +E F A E + + WPGNVREL+N+V R +
Sbjct: 317 IPDLVRHFVQQAEKEGLDVK--RFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVI 374

Query: 245 -------HPLDEIVIDPFQRHPAEPPAPALPAA------------SATPDLPLKLREFQL 285
EI P ++ A + ++ A
Sbjct: 375 TREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLA 434

Query: 286 QQEKALLQRSLQQAKFNQKRAADLLALTYHQFRALLKKHQL 326
+ E L+ +L + NQ +AADLL L + R +++ +
Sbjct: 435 EMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC1688    HTHFIS  31     0.007    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.0 bits (70), Expect = 0.007
Identities = 9/16 (56%), Positives = 14/16 (87%)

Query: 38 LVGESGSGKSLIAKAI 53
+ GESG+GK L+A+A+
Sbjct: 165 ITGESGTGKELVARAL 180


86  SC1759  SC1762  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC1759    -1      13       0.035098   major facilitator superfamily nitrite extrusion
SC1760    -2      16       -0.735458  nitrate/nitrite sensor protein NarX
SC1761    -3      15       -0.768279  transcriptional regulator NarL
SC1762    -2      12       -1.514150  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC1759    TCRTETB  30     0.022    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.8 bits (67), Expect = 0.022
Identities = 18/58 (31%), Positives = 28/58 (48%), Gaps = 1/58 (1%)

Query: 140 TPFSTFIIISLLCGFAGANF-ASSMANISFFFPKQKQGGALGLNGGLGNMGVSVMQLV 196
+ FS I+ + G A F A M ++ + PK+ +G A GL G + MG V +
Sbjct: 101 SFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAI 158


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC1760    PF06580  51     4e-09    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 51.4 bits (123), Expect = 4e-09
Identities = 30/123 (24%), Positives = 54/123 (43%), Gaps = 17/123 (13%)

Query: 473 SARFGFTVKLDYQLPPRL----VPSHQAIHLLQIAREALSNALKH-----SHADDVVVTV 523
S +F ++ + Q+ P + VP L+Q E N +KH +++
Sbjct: 233 SIQFEDRLQFENQINPAIMDVQVPPM----LVQTLVE---NGIKHGIAQLPQGGKILLKG 285

Query: 524 TQCGKQVKLKVQDNGCGVPENAERSNHYGMIIMRDRAQSLRG-DCQVRRRETGGTEVTVT 582
T+ V L+V++ G +N + S G+ +R+R Q L G + Q++ E G +
Sbjct: 286 TKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 583 FIP 585
IP
Sbjct: 346 LIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC1761    HTHFIS  72     6e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 72.2 bits (177), Expect = 6e-17
Identities = 33/117 (28%), Positives = 56/117 (47%), Gaps = 2/117 (1%)

Query: 7 ATILLIDDHPMLRTGVKQLVSMAPDISVVGEASNGEQGIDLAESLDPDLILLDLNMPGMN 66
ATIL+ DD +RT + Q +S A V SN + D DL++ D+ MP N
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 67 GLETLDKLREKALSGRIVVFSVSNHEEDVVTALKRGADGYLLKDMEPEDLLKALQQA 123
+ L ++++ ++V S N + A ++GA YL K + +L+ + +A
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC1762    INTIMIN  247    1e-74    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 247 bits (632), Expect = 1e-74
Identities = 125/444 (28%), Positives = 216/444 (48%), Gaps = 24/444 (5%)

Query: 57 SFSLSLLLLTASGTIRAQAQDPFDQNHL----PDLGMMPESHEGEKHFAEMAKAFGEASM 112
F S L L S + A N L PD+ + + ++A A + +
Sbjct: 118 PFEYSALPLLGSAPLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQL 177

Query: 113 KNNDLDTGEQARQFAFGQVRDVVSEQVNQQLESWLSAWGSASVDINVDNEGHFNGSRGSW 172
++ L+ G+ A+ A G + Q + QL++WL +G+A V++ N F+GS +
Sbjct: 178 QSRSLN-GDYAKDTALG----IAGNQASSQLQAWLQHYGTAEVNLQSGNN--FDGSSLDF 230

Query: 173 FIPLQDKQRYLTWSQLGLTQQTDGLVSNIGVGQRWAQDGWLLGYNTFYDNLLDENLQRAG 232
+P D ++ L + Q+G +N+G GQR+ +LGYN F D + R G
Sbjct: 231 LLPFYDSEKMLAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDFSGDNTRLG 290

Query: 233 FGAEAWGEYLRLSANYYQPFADWQT--HTATLEQRMARGYDINAQMRLPFYQHINTSVSL 290
G E W +Y + S N Y + W + ++R A G+DI LP Y + +
Sbjct: 291 IGGEYWRDYFKSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMY 350

Query: 291 EQYFGDSVDLFDSGTGYHNPVALKLGLNYTPVPLLTVTAQHKQGESGVSQNNLGLTLNYR 350
EQY+GD+V LF+S NP A +G+NYTP+PL+T+ ++ G + + Y+
Sbjct: 351 EQYYGDNVALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQ 410

Query: 351 FGVPLKKQLAASEVAQSQSLRGSRYDTLQRNSLPTMEYRQRKTLTVFLATPPWDLTPGET 410
F P +Q+ V + ++L GSRYD +QRN+ +EY+++ L++ + + T T
Sbjct: 411 FDKPWSQQIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNI-PHDINGTERST 469

Query: 411 VALKLQVRSVHGIRHLSWQGDTQALSLTAG----TDTRSTEGWTIIMPAWDHREGAANRW 466
++L V+S +G+ + W D AL G + ++S + + I+PA+ +G +N +
Sbjct: 470 QKIQLIVKSKYGLDRIVW--DDSALRSQGGQIQHSGSQSAQDYQAILPAY--VQGGSNVY 525

Query: 467 RLSVVVEDEKGQRVSSNEITLALT 490
+++ D G SSN + L +T
Sbjct: 526 KVTARAYDRNGN--SSNNVLLTIT 547


87  SC1921  SC1930  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC1921    -1      9        1.986074   flagellar biosynthesis protein FlhB
SC1922    -2      11       1.551729   chemotaxis regulator CheZ
SC1923    -1      9        1.328706   chemotaxis regulatory protein CheY
SC1924    -1      9        1.513427   chemotaxis-specific methylesterase
SC1925    -1      10       0.917147   chemotaxis methyltransferase CheR
SC1926    -1      10       0.765745   methyl accepting chemotaxis protein II,
SC1927    -1      12       -0.134232  purine-binding chemotaxis protein
SC1928    0       12       -0.493128  chemotaxis protein CheA
SC1929    -1      17       -1.982441  flagellar motor protein MotB
SC1930    -2      16       -2.539211  flagellar motor protein MotA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1921    TYPE3IMSPROT  419    e-149    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 419 bits (1080), Expect = e-149
Identities = 100/351 (28%), Positives = 179/351 (50%), Gaps = 14/351 (3%)

Query: 7 DDKTEAPTPHRLEKAREEGQIPRSRELTSLLILLVGVCIIWFGGESLARQLAGMLSAGLH 66
+KTE PTP ++ AR++GQ+ +S+E+ S +++ ++ + + ++ +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLML--IP 60

Query: 67 FDHRMVNDPNLILGQIILLIKAAMMALLPLIAGVVLVALISPVMLGGLIFSGKSLQPKFS 126
+ + + + ++ PL+ L+A+ S V+ G + SG++++P
Sbjct: 61 AEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIK 120

Query: 127 KLNPLPGIKRMFSAQTGAELLKAVLKSTLVGCVTGFYLWHHWPQMMRLMAESPIVAMGNA 186
K+NP+ G KR+FS ++ E LK++LK L+ + + + +++L P +
Sbjct: 121 KINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQL----PTCGIECI 176

Query: 187 LDLVGLCALLVVLGVIPMVGF------DVFFQIFSHLKKLRMSRQDIRDEFKESEGDPHV 240
L+G +L L VI VGF D F+ + ++K+L+MS+ +I+ E+KE EG P +
Sbjct: 177 TPLLG--QILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEI 234

Query: 241 KGKIRQMQRAAAQRRMMEDVPKADVIVTNPTHYSVALQYDENKMSAPKVVAKGAGLIALR 300
K K RQ + R M E+V ++ V+V NPTH ++ + Y + P V K
Sbjct: 235 KSKRRQFHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQT 294

Query: 301 IRELGAEHRVPTLEAPPLARALYRHAEIGQQIPGQLYAAVAEVLAWVWQLK 351
+R++ E VP L+ PLARALY A + IP + A AEVL W+ +
Sbjct: 295 VRKIAEEEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQN 345


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC1923    HTHFIS  89     7e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.7 bits (220), Expect = 7e-24
Identities = 29/105 (27%), Positives = 51/105 (48%), Gaps = 3/105 (2%)

Query: 7 KFLVVDDFSTMRRIVRNLLKELGFNNVEEAEDGVDALNKLQAGGFGFIISDWNMPNMDGL 66
LV DD + +R ++ L G++ V + + AG +++D MP+ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYD-VRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 67 ELLKTIRADSAMSALPVLMVTAEAKKENIIAAAQAGASGYVVKPF 111
+LL I+ LPVL+++A+ I A++ GA Y+ KPF
Sbjct: 64 DLLPRIKKARPD--LPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC1924    HTHFIS  65     7e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 65.2 bits (159), Expect = 7e-14
Identities = 31/142 (21%), Positives = 62/142 (43%), Gaps = 6/142 (4%)

Query: 1 MSKIRVLSVDDSALMRQIMTEIINSHSDMEMVATAPDPLVARDLIKKFNPDVLTLDVEMP 60
M+ +L DD A +R ++ + ++ V + I + D++ DV MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 RMDGLDFLEKLMRLRPMPVVMVSSLTGKGS-EVTLRALELGAIDFVTKPQLGIREGMLAY 119
+ D L ++ + RP V+V ++ + + ++A E GA D++ KP + E +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLV--MSAQNTFMTAIKASEKGAYDYLPKP-FDLTELIGII 115

Query: 120 SEMIAEKVRTAARARIAAHKPM 141
+AE R ++ + M
Sbjct: 116 GRALAEPKRRPSKLEDDSQDGM 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC1928    PF06580  42     4e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 42.2 bits (99), Expect = 4e-06
Identities = 23/151 (15%), Positives = 49/151 (32%), Gaps = 52/151 (34%)

Query: 378 ELDKSLIERIIDPLT--HLVRNSLDHGIEMPEKRLEAGKNAVGNLILSAEHQGGNICIEV 435
+++ ++++ + P+ LV N + HGI G ++L G + +EV
Sbjct: 245 QINPAIMDVQVPPMLVQTLVENGIKHGIA--------QLPQGGKILLKGTKDNGTVTLEV 296

Query: 436 TDDGAGLNRERILAKAMSQGMAVNENMTDDEVGMLIFAPGFSTAEQVTDVSGRGVGMDVV 495
+ G+ + G G+ V
Sbjct: 297 ENTGSLALKNTK--------------------------------------ESTGTGLQNV 318

Query: 496 KRNIQEMGG---HVEIQSKQGSGTTIRILLP 523
+ +Q + G +++ KQG +L+P
Sbjct: 319 RERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SC1929    OMPADOMAIN  42     1e-06    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 42.2 bits (99), Expect = 1e-06
Identities = 25/118 (21%), Positives = 46/118 (38%), Gaps = 11/118 (9%)

Query: 162 FKTGSAEVEPYMRDILRAIAPVL---NGIPNRISLAGHTDDFPYANGEKGYSNWELSADR 218
F A ++P + L + L + + + G+TD G Y N LS R
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRI----GSDAY-NQGLSERR 277

Query: 219 ANASRRELVAGGLDNGKVLRVVGMAATMRLSDRGPDDAINRR--ISLLVLNKQAEQAI 274
A + L++ G+ K+ GM + ++ D+ R I L +++ E +
Sbjct: 278 AQSVVDYLISKGIPADKI-SARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC1930    PF05844  32     0.002    YopD protein
		>PF05844#YopD protein

Length = 295

Score = 31.9 bits (72), Expect = 0.002
Identities = 12/28 (42%), Positives = 21/28 (75%), Gaps = 2/28 (7%)

Query: 76 MDLLALLYRLMAKSRQQGMFSLERDIEN 103
++LL +L+R+ K+R+ G+ L+RD EN
Sbjct: 74 VELLLILFRIAQKARELGV--LQRDNEN 99


88  SC1969  SC1986  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC1969    -2      13       1.097974   inner membrane protein
SC1970    -2      16       0.499679   hypothetical protein
SC1971    -1      13       1.934431   hypothetical protein
SC1972    -1      14       3.926198   flagellar hook-basal body protein FliE
SC1973    0       13       3.595015   hypothetical protein
SC1974    0       13       4.905976   flagellar MS-ring protein
SC1975    0       15       4.930755   flagellar motor switch protein G
SC1976    -1      16       4.657759   flagellar assembly protein H
SC1977    -1      15       3.949932   flagellum-specific ATP synthase
SC1978    0       15       3.383233   flagellar biosynthesis chaperone
SC1979    -1      17       3.420480   flagellar hook-length control protein
SC1980    -1      15       1.190045   flagellar basal body protein FliL
SC1981    0       13       0.525135   flagellar motor switch protein FliM
SC1982    1       14       -2.246300  flagellar motor switch protein FliN
SC1983    1       14       -2.558198  flagellar biosynthesis protein FliO
SC1984    0       15       -4.066970  flagellar biosynthesis protein FliP
SC1985    -1      16       -3.429783  flagellar biosynthesis protein FliQ
SC1986    -3      15       -2.092401  flagellar biosynthesis protein FliR
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC1969    RTXTOXIND  29     0.025    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 29.4 bits (66), Expect = 0.025
Identities = 10/53 (18%), Positives = 17/53 (32%), Gaps = 2/53 (3%)

Query: 184 RFTLLPIFRIPVKMQKVSAASPLTQKPDQARRRF--RLGMLVFIGMIGWALLT 234
R L R + + + A L + P R R M + ++L
Sbjct: 26 RKQLDTPVREKDENEFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLG 78


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC1970    PF01206  93     6e-29    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 92.5 bits (230), Expect = 6e-29
Identities = 16/71 (22%), Positives = 37/71 (52%)

Query: 7 DYRLDMVGEPCPYPAVATLEAMPQLKKGEILEVVSDCPQSINNIPLDARNHGYTVLDIQQ 66
D LD G CP P + + + + GE+L V++ P S+ + ++ G+ +L+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 67 DGPTIRYLIQK 77
+ T + +++
Sbjct: 65 EDGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC1972    FLGHOOKFLIE  110    2e-35    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.
Length = 103

Score = 110 bits (277), Expect = 2e-35
Identities = 90/103 (87%), Positives = 95/103 (92%)

Query: 2 AAIQGIEGVISQLQATAMAASGQETHSQSTVSFAGQLHAALDRISDRQTAARVQAEKFTL 61
+AIQGIEGVISQLQATAM+A QE+ Q T+SFAGQLHAALDRISD QTAAR QAEKFTL
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 60

Query: 62 GEPGIALNDVMADMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 104
GEPG+ALNDVM DMQKASVSMQMGIQVRNKLVAAYQEVMSMQV
Sbjct: 61 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1974    FLGMRINGFLIF  784    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 784 bits (2026), Expect = 0.0
Identities = 557/559 (99%), Positives = 558/559 (99%)

Query: 2 SATASTATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQ 61
SATASTATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQ
Sbjct: 1 SATASTATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQ 60

Query: 62 DGGAIVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKF 121
DGGAIVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKF
Sbjct: 61 DGGAIVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKF 120

Query: 122 GISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLE 181
GISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLE
Sbjct: 121 GISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLE 180

Query: 182 PGRALDEGQISAVVHLVSSAVAGLPLGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDV 241
PGRALDEGQISAVVHLVSSAVAGLP GNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDV
Sbjct: 181 PGRALDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDV 240

Query: 242 ESRIQRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNIS 301
ESRIQRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNIS
Sbjct: 241 ESRIQRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNIS 300

Query: 302 EQVGAGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRNTQRN 361
EQVGAGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPR+TQRN
Sbjct: 301 EQVGAGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRN 360

Query: 362 ETSNYEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMG 421
ETSNYEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMG
Sbjct: 361 ETSNYEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMG 420

Query: 422 FSDKRGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVR 481
FSDKRGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVR
Sbjct: 421 FSDKRGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVR 480

Query: 482 PQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSD 541
PQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSD
Sbjct: 481 PQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSD 540

Query: 542 NDPRVVALVIRQWMSNDHE 560
NDPRVVALVIRQWMSNDHE
Sbjct: 541 NDPRVVALVIRQWMSNDHE 559


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1975    FLGMOTORFLIG  339    e-118    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 339 bits (870), Expect = e-118
Identities = 114/329 (34%), Positives = 196/329 (59%), Gaps = 2/329 (0%)

Query: 1 MSNLSGTDKSVILLMTIGEDRAAEVFKHLSTREVQALSTAMANVRQISNKQLTDVLSEFE 60
+S L+G K+ ILL++IG + +++VFK+LS E+++L+ +A + I+++ +VL EF+
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 61 QEAEQFAALNINANEYLRSVLVKALGEERASSLLEDILETRDTTSGIETLNFMEPQSAAD 120
+ + +Y R +L K+LG ++A ++ + L + + E + +P + +
Sbjct: 72 ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPANILN 130

Query: 121 LIRDEHPQIIATILVHLKRSQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLN 180
I+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 131 FIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLE 190

Query: 181 GLLDGQ-NLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENL 239
L + + GGV EIIN+ + E+ +I ++ E D ELA++I +MF+FE++
Sbjct: 191 KKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDI 250

Query: 240 VDVDDRSIQRLLQEVDSESLLIALKGAEPPLREKFLRNMSQRAADILRDDLANRGPVRLS 299
V +DDRSIQR+L+E+D + L ALK + P++EK +NMS+RAA +L++D+ GP R
Sbjct: 251 VLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRK 310

Query: 300 QVENEQKAILLIVRRLAETGEMVIGSGED 328
VE Q+ I+ ++R+L E GE+VI G +
Sbjct: 311 DVEESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC1976    FLGFLIH  368    e-133    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 368 bits (945), Expect = e-133
Identities = 193/235 (82%), Positives = 209/235 (88%), Gaps = 7/235 (2%)

Query: 1 MSNELPWQVWTPDDLAPPPETFVPVEADNVTLTEDTPEPELTAEQQLEQELAQLKIQAHE 60
MS+ LPW+ WTPDDLAPP FVP+ T+ E+ AE LEQ+LAQL++QAHE
Sbjct: 1 MSDNLPWKTWTPDDLAPPQAEFVPIVEPEETIIEE-------AEPSLEQQLAQLQMQAHE 53

Query: 61 QGYNAGLAEGRQKGHAQGYQEGLAQGLEQGQAQAQTQQAPIHARMQQLVSEFQNTLDALD 120
QGY AG+AEGRQ+GH QGYQEGLAQGLEQG A+A++QQAPIHARMQQLVSEFQ TLDALD
Sbjct: 54 QGYQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALD 113

Query: 121 SVIASRLMQMALEAARQVIGQTPAVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRV 180
SVIASRLMQMALEAARQVIGQTP VDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRV
Sbjct: 114 SVIASRLMQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRV 173

Query: 181 EEMLGATLSLHGWRLRGDPTLHHGGCKVSADEGDLDASVATRWQELCRLAAPGVL 235
++MLGATLSLHGWRLRGDPTLH GGCKVSADEGDLDASVATRWQELCRLAAPGV+
Sbjct: 174 DDMLGATLSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC1978    FLGFLIJ  206    4e-72    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 206 bits (526), Expect = 4e-72
Identities = 130/147 (88%), Positives = 138/147 (93%)

Query: 1 MAQHGALETLKDLAEKEVDDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRSNLNTDMGNG 60
MA+HGAL TLKDLAEKEV+DAARLLGEMRRGCQQAEEQLKMLIDYQNEYR+NLN+DM G
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 IASNRWINYQQFIQTLEKAIEQHRLQLTQWTQKVDLALKSWREKKQRLQAWQTLQDRQTA 120
I SNRWINYQQFIQTLEKAI QHR QL QWTQKVD+AL SWREKKQRLQAWQTLQ+RQ+
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 AALLAENRMDQKKMDEFAQRAAMRKPE 147
AALLAENR+DQKKMDEFAQRAAMRKPE
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC1979    FLGHOOKFLIK  408    e-144    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 408 bits (1049), Expect = e-144
Identities = 191/409 (46%), Positives = 231/409 (56%), Gaps = 38/409 (9%)

Query: 1 MITLPQLITTDTDMTAGLTSGKTTGSAEDFLALLAGALGADGAQGKDARITLADLQAAGG 60
MI L LIT D D T L GK + +A+DFLALL+ AL + K A L
Sbjct: 1 MIRLAPLITADVDTTT-LPGGKASDAAQDFLALLSEALAGETTTDKAAPQLL-------- 51

Query: 61 KLSKELLTQHGEPGQAVKLADLLAQKAN---ATDETLTDLTQAQHLLSTLTPSLKTSALA 117
++ + T GEP + ++D AQ+AN DET + Q + LT + + A
Sbjct: 52 -VATDKPTTKGEPLISDIVSD--AQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAA 108

Query: 118 ALSKTAQHDEKTPALSDEDLASLSALFAMLPGQPVATPVAGETPAENHIALPSLLRGDMP 177
K DEK L+++ ASLSALFAMLPG V D P
Sbjct: 109 VADKNTTKDEKADDLNEDVTASLSALFAMLPGFDNTPKVT-----------------DAP 151

Query: 178 SAPQEETHTLSFSEHEKGKTEASLARASDDRATGPSLTPLVVAAAATSAKVEVDSPSAPV 237
S F++ T L A D A G PL A +K EV S +PV
Sbjct: 152 STVLPTEKPTLFTK----LTSEQLTTAQPDDAPGTPAQPLTPLVAEAQSKAEVISTPSPV 207

Query: 238 THGAAMPTLSSATAQPLPVASAPELSAPLGSHEWQQTFSQQVMLFTRQGQQSAQLRLHPE 297
T AA P ++ QPLP +AP LSAPLGSHEWQQ+ SQ + LFTRQGQQSA+LRLHP+
Sbjct: 208 T-AAASPLITPHQTQPLPTVAAPVLSAPLGSHEWQQSLSQHISLFTRQGQQSAELRLHPQ 266

Query: 298 ELGQVHISLKLDDNQAQLQMVSPHSHVRAALEAALPMLRTQLAESGIQLGQSSISSESFA 357
+LG+V ISLK+DDNQAQ+QMVSPH HVRAALEAALP+LRTQLAESGIQLGQS+IS ESF+
Sbjct: 267 DLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAALPVLRTQLAESGIQLGQSNISGESFS 326

Query: 358 GQQQ-SSSQQQSSRAQHTDAFGAEDDIALAAPASLQAAARGNGAVDIFA 405
GQQQ +S QQQS R + + EDD L P SLQ GN VDIFA
Sbjct: 327 GQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVSLQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1981    FLGMOTORFLIM  384    e-136    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 384 bits (987), Expect = e-136
Identities = 86/324 (26%), Positives = 148/324 (45%), Gaps = 10/324 (3%)

Query: 5 ILSQAEIDALLNGDS--DTKDEPTPGIASDSDIRPYDPNTQRRVVRERLQALEIINERFA 62
+LSQ EID LL S D E I+ I YD + +E+++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 63 RQFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSPSL 122
R L LR + V ++ Y EF R++P P+ L +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 123 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 182
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 183 EMQVKFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 240
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 241 HEDQNWRDNLVRQVQHSELELVANFADIPLRLSQILKLKPGDVLPIEKP---DRIIAHVD 297
+ L ++ ++++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 298 GVPVLTSQYGTVNGQYALRVEHLI 321
Q G V + A ++ I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1982    FLGMOTORFLIN  209    2e-73    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 209 bits (534), Expect = 2e-73
Identities = 136/137 (99%), Positives = 136/137 (99%)

Query: 1 MSDMNNPSDENTGALDDLWADALNEQKATTNKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60
MSDMNNPSDENTGALDDLWADALNEQKATT KSAADAVFQQLGGGDVSGAMQDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1984    FLGBIOSNFLIP  330    e-117    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.
Length = 245

Score = 330 bits (847), Expect = e-117
Identities = 225/245 (91%), Positives = 233/245 (95%)

Query: 1 MRRLLFLSLAGLWLFSPAAAAQLPGLISQPLAGGGQSWSLSVQTLVFITSLTFLPAILLM 60
MRRLL ++ LWL +P A AQLPG+ SQPL GGGQSWSL VQTLVFITSLTF+PAILLM
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60

Query: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEQK 120
MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSE+K
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120

Query: 121 ISMQEALDKGAQPLRAFMLRQTREADLALFARLANSGPLQGPEAVPMRILLPAYVTSELK 180
ISMQEAL+KGAQPLR FMLRQTREADL LFARLAN+GPLQGPEAVPMRILLPAYVTSELK
Sbjct: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180

Query: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240
TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA
Sbjct: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240

Query: 241 QSFYS 245
QSFYS
Sbjct: 241 QSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1985    TYPE3IMQPROT  67     1e-18    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.
Length = 86

Score = 67.5 bits (165), Expect = 1e-18
Identities = 23/78 (29%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALITGLIISILQAATQINEMTLSFIPKIVAVFIAII 63
+ ++ G +A+ + L L+ +VA I GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 VAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC1986    TYPE3IMRPROT  213    5e-71    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.
Length = 261

Score = 213 bits (543), Expect = 5e-71
Identities = 231/260 (88%), Positives = 246/260 (94%)

Query: 1 MIQVTSEQWLYWLHLYFWPLLRVLALISTAPILSERAIPKRVKLGLGIMITLVIAPSLPA 60
M+QVTSEQWL WL+LYFWPLLRVLALISTAPILSER++PKRVKLGL +MIT IAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDTPLFSIAALWLAMQQILIGIALGFTMQFAFAAVRTAGEFIGLQMGLSFATFVDPGSHL 120
ND P+FS ALWLA+QQILIGIALGFTMQFAFAAVRTAGE IGLQMGLSFATFVDP SHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLAMLLFLTFNGHLWLISLLVDTFHTLPIGSNPVNSNAFMALARAGGLIF 180
NMPVLARIMDMLA+LLFLTFNGHLWLISLLVDTFHTLPIG P+NSNAF+AL +AG LIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPVITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGIMLMAALMPLIAPFC 240
LNGLMLALP+ITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGI LMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEIFNLLADIVSEMPI 260
EHLFSEIFNLLADI+SE+P+
Sbjct: 241 EHLFSEIFNLLADIISELPL 260


89  SC2128  SC2133  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC2128    -1      17       4.487950   chaperone
SC2129    -1      14       3.318759   multidrug efflux system subunit MdtA
SC2130    0       14       3.111102   multidrug efflux system subunit MdtB
SC2131    0       13       1.785958   multidrug efflux system subunit MdtC
SC2132    0       14       0.852763   signal transduction histidine-protein kinase
SC2133    1       15       -2.525396  DNA-binding transcriptional regulator BaeR
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2128    SHAPEPROTEIN  49     2e-08    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.
Length = 347

Score = 49.0 bits (117), Expect = 2e-08
Identities = 31/129 (24%), Positives = 56/129 (43%), Gaps = 20/129 (15%)

Query: 132 TMMVHIRHTAHSQ-LPEAITQAVIGRPINFQGLGGDDANRQAQGILERAAKRAGFQDVVF 190
M+ H HS + ++ P+ R+A + +A+ AG ++V
Sbjct: 89 KMLQHFIKQVHSNSFMRPSPRVLVCVPVGA-----TQVERRA---IRESAQGAGAREVFL 140

Query: 191 QYEPVAAGLDYEATLREEKRVLVVDIGGGTTDCSMLLMGPQWRQRADRENSLLGHSGCRV 250
EP+AA + + E +VVDIGGGTT+ +++ + ++ S R+
Sbjct: 141 IEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLN-----------GVVYSSSVRI 189

Query: 251 GGNDLDIAL 259
GG+ D A+
Sbjct: 190 GGDRFDEAI 198



Score = 35.9 bits (83), Expect = 2e-04
Identities = 25/81 (30%), Positives = 39/81 (48%), Gaps = 12/81 (14%)

Query: 377 ALDQPLARILEQVRLALDSAQEKPDV--------IYLTGGSARSPLIKKALSEQLPGIPV 428
AL +PL I+ V +AL+ Q P++ + LTGG A + + L E+ GIPV
Sbjct: 259 ALQEPLTGIVSAVMVALE--QCPPELASDISERGMVLTGGGALLRNLDRLLMEET-GIPV 315

Query: 429 AGGDD-FGSVTAGLARWAEVV 448
+D V G + E++
Sbjct: 316 VVAEDPLTCVARGGGKALEMI 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC2129    RTXTOXIND  42     3e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 42.1 bits (99), Expect = 3e-06
Identities = 36/172 (20%), Positives = 71/172 (41%), Gaps = 10/172 (5%)

Query: 107 KVALAQAQGQLAKDNATLANARRDLARYQQ---LAKTNLVSRQELDAQQAL--VNETQGT 161
K A+ + + + + L + L + + AK +L + L + +T
Sbjct: 251 KHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDN 310

Query: 162 IKADEANVASAQLQLDWSRITAPVSGRV-GLKQVDVGNQISSSDTAGIVVITQTHPIDLI 220
I +A + + S I APVS +V LK G +++++T +V++ + +++
Sbjct: 311 IGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL-MVIVPEDDTLEVT 369

Query: 221 FTLPESDIATVVQAQKAGKTLVVEAWDRTNSHKL-SEGVLLSLDNQIDPTTG 271
+ DI + Q A + VEA+ T L + ++LD D G
Sbjct: 370 ALVQNKDIGFINVGQNA--IIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLG 419



Score = 40.6 bits (95), Expect = 9e-06
Identities = 20/122 (16%), Positives = 46/122 (37%), Gaps = 13/122 (10%)

Query: 63 GTVTAA-NTVTVRSRVDGQLIALHFQEGQQVNAGDLLAQIDPSQFKVALAQAQGQLAKDN 121
G +T + + ++ + + + +EG+ V GD+L ++ + + Q
Sbjct: 88 GKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQ------- 140

Query: 122 ATLANARRDLARYQQLAKTNLVSRQELDAQQALVNETQGTIKADEANVASAQLQLDWSRI 181
++L AR + RYQ L+++ EL+ L + + L +
Sbjct: 141 SSLLQARLEQTRYQILSRS-----IELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQF 195

Query: 182 TA 183
+
Sbjct: 196 ST 197


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2130    ACRIFLAVINRP  886    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 886 bits (2290), Expect = 0.0
Identities = 291/1036 (28%), Positives = 503/1036 (48%), Gaps = 29/1036 (2%)

Query: 13 SRLFILRPVATTLLMAAILLAGIIGYRFLPVAALPEVDYPTIQVVTLYPGASPDVMTSAV 72
+ FI RP+ +L +++AG + LPVA P + P + V YPGA + V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 73 TAPLERQFGQMSGLKQMSSQS-SGGASVVTLQFQLTLPLDVAEQEVQAAINAATNLLPSD 131
T +E+ + L MSS S S G+ +TL FQ D+A+ +VQ + AT LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 132 LPNPPIYSKVNPADPPIMTLAVTSNAMPMTQVE--DMVETRVAQKISQVSGVGLVTLAGG 189
+ I S + +M S+ TQ + D V + V +S+++GVG V L G
Sbjct: 122 VQQQGI-SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 190 QRPAVRVKLNAQAVAALGLTSETVRTAITGANVNSAKGSLDGP------ERAVTLSANDQ 243
Q A+R+ L+A + LT V + N A G L G + ++ A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 244 MQSADEYRRLII-AYQNGAPVRLGDVATVEQGAENSWLGAWANQAPAIVMNVQRQPGANI 302
++ +E+ ++ + +G+ VRL DVA VE G EN + A N PA + ++ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 303 IATADSIRQMLPQLTESLPKSVKVTVLSDRTTNIRASVRDTQFELMLAIALVVMIIYLFL 362
+ TA +I+ L +L P+ +KV D T ++ S+ + L AI LV +++YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 363 RNIPATIIPGVAVPLSLIGTFAVMVFLDFSINNLTLMALTIATGFVVDDAIVVIENISRY 422
+N+ AT+IP +AVP+ L+GTFA++ +SIN LT+ + +A G +VDDAIVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 423 I-EKGEKPLAAALKGAGEIGFTIISLTFSLIAVLIPLLFMGDIVGRLFREFAVTLAVAIL 481
+ E P A K +I ++ + L AV IP+ F G G ++R+F++T+ A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 482 ISAVVSLTLTPMMCARML---SQQSLRKQNRFSRACERMFDRVIASYGRGLAKVLNHPWL 538
+S +V+L LTP +CA +L S + + F FD + Y + K+L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 539 TLSVAFATLLLSVMLWIVIPKGFFPVQDNGIIQGTLQAPQSSSYASMAQRQRQVAERILQ 598
L + + V+L++ +P F P +D G+ +Q P ++ + QV + L+
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 599 DPA--VQSLTTFVGVDGANSTLNSTRLQINLKPLDARDDR---VQQVISRLQTAVATIPG 653
+ V+S+ T G + N+ ++LKP + R+ + VI R + + I
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 654 VALYLQPTQDLTIDTQVSRTQYQFSLQ---ATTLDALSHWVPKL-QNALQSLPQLSEVSS 709
++ P I + T + F L DAL+ +L A Q L V
Sbjct: 660 G--FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 710 DWQDRGLAAWVNVDRDSASRLGISMADVDNALYNAFGQRLISTIYTQANQYRVVLEHNTA 769
+ + + VD++ A LG+S++D++ + A G ++ + ++ ++ +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 770 STPGLAALETIRLTSRDGGTVPLSAIARIEQRFAPLSINHLDQFPITTFSFNVPEGYSLD 829
++ + + S +G VP SA + + + P G S
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 830 DAVQAILDTEKTLALPADITTQFQGSTLAFQAALGSTVWLIVAAVVAMYIVLGVLYESFI 889
DA+ + + LPA I + G + + + L+ + V +++ L LYES+
Sbjct: 838 DAMALMENLAS--KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 890 HPITILSTLPTAGVGALLALIIAGSELDIIAIIGIILLIGIVKKNAIMMIDFALAAEREQ 949
P++++ +P VG LLA + + D+ ++G++ IG+ KNAI++++FA ++
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 950 GMSPRDAIFQACLLRFRPILMTTLAALLGALPLMLSTGVGAELRRPLGIAMVGGLLVSQV 1009
G +A A +R RPILMT+LA +LG LPL +S G G+ + +GI ++GG++ + +
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1010 LTLFTTPVIYLLFDRL 1025
L +F PV +++ R
Sbjct: 1016 LAIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2131    ACRIFLAVINRP  880    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 880 bits (2275), Expect = 0.0
Identities = 282/1035 (27%), Positives = 503/1035 (48%), Gaps = 36/1035 (3%)

Query: 6 LFIYRPVATILIAAAITLCGILGFRLLPVAPLPQVDFPVIMVSASLPGASPETMASSVAT 65
FI RP+ ++A + + G L LPVA P + P + VSA+ PGA +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERSLGRIAGVNEMTSSS-SLGSTRIILEFNFDRDINGAARDVQAAINAAQSLLPGGMP 124
+E+++ I + M+S+S S GS I L F D + A VQ + A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKANPSDAPIMILTLTSES--WSQGKLYDFASTQLAQTIAQIDGVGDVDVGGSSL 182
+ S + +M+ S++ +Q + D+ ++ + T+++++GVGDV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVGLNPQALFNQGVSLDEVREAIDSANVRRPQGAIEDSV------HRWQIQTNDELK 236
A+R+ L+ L ++ +V + N + G + + I K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAAEYQPLIIHYN-NGAAVRLGDVASVTDSVQDVRNAGMTNAKPAILLMIRKLPEANIIQ 295
E+ + + N +G+ VRL DVA V ++ N KPA L I+ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDGIRAKLPELRAMIPAAIDLQIAQDRSPTIRASLQEVEETLAISVALVIMVVFLFLRS 355
T I+AKL EL+ P + + D +P ++ S+ EV +TL ++ LV +V++LFL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATLIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHL- 414
RATLIP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGMKPLQAALQGTREVGFTVISMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIGIS 474
E + P +A + ++ ++ +++ L AVF+P+ GG G + R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LVVSLTLTPMMCGWMLKSSKPRTQPRKRGVG----RLLVALQQGYGTSLKWVLNHTRLVG 530
++V+L LTP +C +LK K G Y S+ +L T
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 VVFLGTVALNIWLYIAIPKTFFPEQDTGVLMGGIQADQSISFQ----AMRGKLQDFMKII 586
+++ VA + L++ +P +F PE+D GV + IQ + + + ++K
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 587 RD-DPAVNNVTGFT-GGSRVNSGMMFITLKPRGER---KETAQQIIDRLRVKLAKEPGAR 641
+ +V V GF+ G N+GM F++LKP ER + +A+ +I R +++L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 642 LFLMAVQDIRVGGRQANASYQYTLLSDSLAALREWEPKIRKALSAL-----PQLADVNSD 696
+ + I G ++ L D + + R L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 697 QQDNGAEMNLIYDRDTMSRLGIDVQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRY 756
++ A+ L D++ LG+ + N ++ A G ++ K+ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 757 SQDISALEKMFVINRDGKAIPLSYFAQWRPANAPLSVNHQGLSAASTIAFNLPTGTSLSQ 816
++K++V + +G+ +P S F + + I GTS
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 817 ATEAINRTMTQLGVPPTVRGSFSGTAQVFQQTMNSQLILIVAAIATVYIVLGILYESYVH 876
A + ++L P + ++G + + + N L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 877 PLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRSGG 936
P++++ +P VG LLA LFN + ++G++ IG+ KNAI++V+FA + G
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 937 LTPAQAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQLL 996
+A A +R RPI+MT+LA + G LPL +S G GS + +GI ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 997 TLYTTPVVYLFFDRL 1011
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2132    BCTERIALGSPF  31     0.010    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.
Length = 408

Score = 31.0 bits (70), Expect = 0.010
Identities = 20/66 (30%), Positives = 26/66 (39%), Gaps = 14/66 (21%)

Query: 187 RGLLAPVKRLVEGTHRLAAGDFTTRVTPTSADEL-----------GKLAQDFNQLASTLE 235
L+A V+ V H LA + P S + L G L N+LA E
Sbjct: 104 SQLMAAVRSKVMEGHSLAD---AMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTE 160

Query: 236 KNQQMR 241
+ QQMR
Sbjct: 161 QRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC2133    HTHFIS  75     1e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 74.9 bits (184), Expect = 1e-17
Identities = 28/140 (20%), Positives = 65/140 (46%), Gaps = 2/140 (1%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLINHGDKLLPYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + ++ L ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDIPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVARVKTIL-RRC 128
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 129 KPQRELQQQDAESPLMIDES 148
+ +L+ + ++ S
Sbjct: 124 RRPSKLEDDSQDGMPLVGRS 143


90  SC2271  SC2278  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC2271    0       13       -1.196268  porin
SC2272    0       11       -1.929528  phosphotransfer intermediate protein in
SC2273    0       10       -2.909145  transcriptional regulator RcsB
SC2274    -1      10       -2.516930  hybrid sensory kinase in two-component
SC2275    0       15       -2.721941  DNA gyrase subunit A
SC2276    -1      13       -4.360583  dehydratase
SC2277    0       17       -2.962398  permease
SC2278    0       15       -2.222534  GntR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SC2271    ECOLIPORIN  537    0.0      E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 537 bits (1386), Expect = 0.0
Identities = 261/389 (67%), Positives = 298/389 (76%), Gaps = 17/389 (4%)

Query: 1 MKVKVLSLLVPALLVAGAANAAEIYNKDGNKLDLFGKVDGLHYFSDDKGSDGDQTYMRIG 60
MK KVL+L++PALL AGAA+AAEIYNKDGNKLDL+GKVDGLHYFSDD DGDQTYMR+G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 FKGETQVNDQLTGYGQWEYQIQGNQTEG-SNDSWTRVAFAGLKFADAGSFDYGRNYGVTY 119
FKGETQ+NDQLTGYGQWEY +Q N TEG +SWTR+AFAGLKF D GSFDYGRNYGV Y
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYGVLY 120

Query: 120 DVTSWTDVLPEFGGDTYG-ADNFMQQRGNGYATYRNTDFFGLVDGLDFALQYQGKNGSVS 178
DV WTD+LPEFGGD+Y ADN+M R NG ATYRNTDFFGLVDGL+FALQYQGKN S S
Sbjct: 121 DVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQS 180

Query: 179 GEN--------TNGRSLLNQNGDGYGGSLTYAIGEGFSVGGAITTSKRTADQNNTADEHL 230
++ NG + NGDG+G S TY IG GFS G A TTS RT +Q N
Sbjct: 181 ADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGG--T 238

Query: 231 YGNGDRATVYTGGLKYDANNIYLAAQYSQTYNATRFGTSNGNNKSTSYGFANKAQNFEVV 290
GD+A +T GLKYDANNIYLA YS+T N T +G ++ G ANK QNFEV
Sbjct: 239 IAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDK---GYDGGVANKTQNFEVT 295

Query: 291 AQYQFDFGLRPSVAYLQSKGKDISNGYGASYGDQDIVKYVDVGATYYFNKNMSTYVDYKI 350
AQYQFDFGLRP+V++L SKGKD++ + D+D+VKY DVGATYYFNKN STYVDYKI
Sbjct: 296 AQYQFDFGLRPAVSFLMSKGKDLTYN-NVNGDDKDLVKYADVGATYYFNKNFSTYVDYKI 354

Query: 351 NLLDKND-FTRDAGINTDDIVALGLVYQF 378
NLLD +D F +DAGI+TDDIVALG+VYQF
Sbjct: 355 NLLDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC2273    HTHFIS  49     5e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.1 bits (117), Expect = 5e-09
Identities = 26/145 (17%), Positives = 60/145 (41%), Gaps = 20/145 (13%)

Query: 16 MNNMNVIIADDHPIVLFGIRKSLEQIEWVNVVGEFEDSTALINNLPKLDAHVLITDLSMP 75
M +++ADD + + ++L + + + ++ L + D +++TD+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRI--TSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 76 GDKYGDGITLIKYIKRHFPSLSIIVLTMNNNPAILSAVLDLDIEGIVLKQGA------PT 129
+ L+ IK+ P L ++V++ N +A+ ++GA P
Sbjct: 59 D---ENAFDLLPRIKKARPDLPVLVMSAQNTFM--TAIKA-------SEKGAYDYLPKPF 106

Query: 130 DLPKALAALQKGKKFTPESVSRLLE 154
DL + + + + S+L +
Sbjct: 107 DLTELIGIIGRALAEPKRRPSKLED 131


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC2274    HTHFIS  80     1e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.9 bits (197), Expect = 1e-17
Identities = 29/104 (27%), Positives = 47/104 (45%)

Query: 827 ILVVDDHPINRRLLADQLGSLGYQCKTANDGVDALNVLSKNAIDIVLSDVNMPNMDGYRL 886
ILV DD R +L L GY + ++ ++ D+V++DV MP+ + + L
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 887 TQRIRQLGLTLPVVGVTANALAEEKQRCLESGMDSCLSKPVTLD 930
RI++ LPV+ ++A + E G L KP L
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2278    NUCEPIMERASE  28     0.031    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 27.8 bits (62), Expect = 0.031
Identities = 16/75 (21%), Positives = 29/75 (38%), Gaps = 15/75 (20%)

Query: 133 AERDTQAYLKLDHDFHYVFVKYADNKYISQAHLLISARLLAIRYRLDFTAEYITSSNRGH 192
A+R+ L F VF+ + A+RY L+ Y S+ G
Sbjct: 62 ADREGMTDLFASGHFERVFISPH------RL---------AVRYSLENPHAYADSNLTGF 106

Query: 193 ATILDMLKNNNVEGV 207
IL+ ++N ++ +
Sbjct: 107 LNILEGCRHNKIQHL 121


91  SC2362  SC2370  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
SC2362    0       10       0.427193  diaminopimelate decarboxylase
SC2363    -1      11       1.748125  regulatory protein
SC2364    0       15       2.023054  amidophosphoribosyltransferase
SC2365    -2      16       2.676400  colicin V production protein
SC2366    -2      16       3.434319  hypothetical protein
SC2367    -1      14       2.972366  bifunctional folylpolyglutamate synthase/
SC2368    -1      13       2.159750  acetyl-CoA carboxylase subunit beta
SC2369    -1      13       2.865691  hypothetical protein
SC2370    -2      14       3.058415  tRNA pseudouridine synthase A
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC2362    ALARACEMASE  32     0.006    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 31.7 bits (72), Expect = 0.006
Identities = 23/133 (17%), Positives = 47/133 (35%), Gaps = 20/133 (15%)

Query: 87 VLKAIRDAGICAEANSQYEVRKCLEIGFRGDQIVFNGVVKKPADLEYAIANDLYLINVDS 146
+ AI A N + E E G++G ++ G DLE + L
Sbjct: 46 IWSAIGATDGFALLNLE-EAITLRERGWKGPILMLEGFFH-AQDLEIYDQHRLTT----C 99

Query: 147 LYELEHIDAIS-RKLKKVANVCVRVEPNVPSATHAELVTAFHAKSGLDLEQAEETCRRIL 205
++ + A+ +LK ++ ++V + + + G ++ +++
Sbjct: 100 VHSNWQLKALQNARLKAPLDIYLKVN------------SGMN-RLGFQPDRVLTVWQQLR 146

Query: 206 AMPYVHLRGLHMH 218
AM V L H
Sbjct: 147 AMANVGEMTLMSH 159


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC2363    HTHFIS  343    e-116    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 343 bits (882), Expect = e-116
Identities = 120/371 (32%), Positives = 183/371 (49%), Gaps = 24/371 (6%)

Query: 122 NMSGVRRLQEQVVELNQLLYADHHE---KHHAIITENPEMLSNIAKAKRLAASNIPVTIV 178
+++ + + + + + + + ++ + M RL +++ + I
Sbjct: 107 DLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMIT 166

Query: 179 GETGTGKELFSRLIHQCSKRANKPFIALNCGALPPTLIESTLFDTVRGAYTGAENS-QGY 237
GE+GTGKEL +R +H KR N PF+A+N A+P LIES LF +GA+TGA+ G
Sbjct: 167 GESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGR 226

Query: 238 LELANGGTLFLDELNAMPIEMQSKLLRFLQDKTFWRLGGQQQLHSDVRIVAAMNEAPVKL 297
E A GGTLFLDE+ MP++ Q++LLR LQ + +GG+ + SDVRIVAA N+ +
Sbjct: 227 FEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQS 286

Query: 298 IQQERLRADLFYRLSVGMLTLPPLRARPEDIPLLANYFFDKYRNDVPQDIHGLSETARAD 357
I Q R DL+YRL+V L LPPLR R EDIP L +F + + D+ + A
Sbjct: 287 INQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALEL 345

Query: 358 LLNHAWPGNVRMLENAIVRSMIMQEKDGLLKHIIF-------------------EQDELN 398
+ H WPGNVR LEN + R + +D + + II ++
Sbjct: 346 MKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSIS 405

Query: 399 LGVPETAPENPLPSSPDPQYEGSLEVRVANYERHLIETALDTHQGNIAAAARSLNVSRTT 458
V E + G + +A E LI AL +GN AA L ++R T
Sbjct: 406 QAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNT 465

Query: 459 LQYKVQKYAIR 469
L+ K+++ +
Sbjct: 466 LRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2364    ANTHRAXTOXNA  34     0.002    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 33.6 bits (76), Expect = 0.002
Identities = 13/37 (35%), Positives = 24/37 (64%), Gaps = 2/37 (5%)

Query: 469 KDVDQQYLDFLDSLRND-DAKAVLFQNEM-ENLEMHN 503
K +D ++L+ + SL +D D+ +LF + E LE++N
Sbjct: 186 KSLDPEFLNLIKSLSDDSDSSDLLFSQKFKEKLELNN 222


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC2366    PERTACTIN  29     0.019    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 28.9 bits (64), Expect = 0.019
Identities = 19/60 (31%), Positives = 22/60 (36%), Gaps = 4/60 (6%)

Query: 99 PIPVETPKPKPVEKPKPQPKPQQPVVAASTPTPAPQPATDDKPAPTGKAYVVQLGALKNA 158
P P P+P P P+P PQ P P QP P G+ L A NA
Sbjct: 569 PAPKPAPQPGPQPGPQPPQPPQPPQPPQPPQPPQRQPEAPAPQPPAGRE----LSAAANA 624



Score = 28.5 bits (63), Expect = 0.022
Identities = 16/49 (32%), Positives = 17/49 (34%)

Query: 106 KPKPVEKPKPQPKPQQPVVAASTPTPAPQPATDDKPAPTGKAYVVQLGA 154
K P KP PQP PQ P P P P +A Q A
Sbjct: 566 KAPPAPKPAPQPGPQPGPQPPQPPQPPQPPQPPQPPQRQPEAPAPQPPA 614


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2370    FbpA_PF05833  29     0.026    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 28.7 bits (64), Expect = 0.026
Identities = 20/63 (31%), Positives = 31/63 (49%), Gaps = 6/63 (9%)

Query: 204 VRNIVGS-LLEVGAHNQPESWIAELLAARDRTLAAATAKAEGLYLVAVDYPDRFDLPKPP 262
+NI GS ++ + PES + E AA LAA +K++ V VDY + ++ KP
Sbjct: 496 TKNIPGSHVIVKNIMDIPESTLLE--AAN---LAAYYSKSQNSSNVPVDYTEVKNVKKPN 550

Query: 263 MGP 265

Sbjct: 551 GAK 553


92  SC2394  SC2400  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC2394    -1      15       -1.365126  lipoprotein
SC2395    -1      16       -1.935803  hypothetical protein
SC2396    -1      18       -1.985788  *outer membrane protease
SC2397    -1      15       -0.926014  phosphoglycerate transport activator
SC2398    -1      14       -1.188252  phosphoglycerate transporter
SC2399    0       15       -2.032946  phosphoglycerate transporter
SC2400    -1      13       -1.666849  phosphoglycerate transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2394    VACJLIPOPROT  397    e-144    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 397 bits (1021), Expect = e-144
Identities = 236/251 (94%), Positives = 248/251 (98%)

Query: 1 MKLRLSALALGTTLLVGCASSGTEQQGRSDPFEGFNRTMYNFNFNVLDPYVVRPVAVAWR 60
MKLRLSALALGTTLLVGCASSGT+QQGRSDP EGFNRTMYNFNFNVLDPY+VRPVAVAWR
Sbjct: 1 MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR 60

Query: 61 DYVPQPARNGLSNFTGNLEEPAIMVNYFLQGDPYQGMVHFTRFFLNTLLGMGGFIDVAGM 120
DYVPQPARNGLSNFTGNLEEPA+MVNYFLQGDPYQGMVHFTRFFLNT+LGMGGFIDVAGM
Sbjct: 61 DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM 120

Query: 121 ANPKLQRVEPHRFGSTLGHYGVGYGPYMQIPFYGSFTLREDGGDMADTLYPVLSWLTWPM 180
ANPKLQR EPHRFGSTLGHYGVGYGPY+Q+PFYGSFTLR+DGGDMAD LYPVLSWLTWPM
Sbjct: 121 ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADALYPVLSWLTWPM 180

Query: 181 SIGKWTIEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGKLKPQENPNAQA 240
S+GKWT+EGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGG+LKPQENPNAQA
Sbjct: 181 SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA 240

Query: 241 IQDELKEIDSE 251
IQD+LK+IDSE
Sbjct: 241 IQDDLKDIDSE 251


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC2395    PF06580  28     0.036    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.3 bits (63), Expect = 0.036
Identities = 22/113 (19%), Positives = 46/113 (40%), Gaps = 12/113 (10%)

Query: 199 WIIATMVWMFPAAGGAKIVVIILMTWLIALGDTTHIVVGSVEILYLV-FNGTLPWSDFFW 257
I W+ G I+ ++ +I + V + I L+ F T P + F
Sbjct: 61 SFIKRQGWLKLNMG-QIILRVLPACVVIGM----VWFVANTSIWRLLAFINTKPVA-FTL 114

Query: 258 PFALPTLAGNICGGTFIFALMSHAQIRNDMSNKRKEEARLRGERLERERKKAE 310
P AL ++ N+ TF+++L+ K ++A + ++ ++A+
Sbjct: 115 PLAL-SIIFNVVVVTFMWSLLYFGWHFF----KNYKQAEIDQWKMASMAQEAQ 162


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC2396    OMPTIN  476    e-173    Omptin serine protease signature.
		>OMPTIN#Omptin serine protease signature.

Length = 317

Score = 476 bits (1226), Expect = e-173
Identities = 149/320 (46%), Positives = 213/320 (66%), Gaps = 11/320 (3%)

Query: 1 MKKHAIAVMMIAVFSESVYAESTLFIPDVSPDSVTTSLSVGVLNGKSRELVYD-TDTGRK 59
M+ + +++ + S +A + +PD++ +S+G L+GK++E VY + GRK
Sbjct: 1 MRAKLLGIVLTTPIAISSFASTET--LSFTPDNINADISLGTLSGKTKERVYLAEEGGRK 58

Query: 60 LSQLDWKIKNVATLQGDLSWEPYSFMTLDARGWTSLASGSGHMVDHDWMSSEQPG-WTDR 118
+SQLDWK N A ++G ++W+ +++ A GWT+L S G+MVD DWM S PG WTD
Sbjct: 59 VSQLDWKFNNAAIIKGAINWDLMPQISIGAAGWTTLGSRGGNMVDQDWMDSSNPGTWTDE 118

Query: 119 SIHPDTSVNYANEYDLNVKGWLLQGDNYKAGVTAGYQETRFSWTARGGSYIYDNGR---- 174
S HPDT +NYANE+DLN+KGWLL NY+ G+ AGYQE+R+S+TARGGSYIY +
Sbjct: 119 SRHPDTQLNYANEFDLNIKGWLLNEPNYRLGLMAGYQESRYSFTARGGSYIYSSEEGFRD 178

Query: 175 YIGNFPHGVRGIGYSQRFEMPYIGLAGDYRINDFECNVLFKYSDWVNAHDNDEHY--MRK 232
IG+FP+G R IGY QRF+MPYIGL G YR DFE FKYS WV + DNDEHY ++
Sbjct: 179 DIGSFPNGERAIGYKQRFKMPYIGLTGSYRYEDFELGGTFKYSGWVESSDNDEHYDPGKR 238

Query: 233 LTFREKTENSRYYGASIDAGYYITSNAKIFAEFAYSKYEEGKGGTQIIDKTSGDTAYFGG 292
+T+R K ++ YY +++AGYY+T NAK++ E A+++ KG T + D + +T+ +
Sbjct: 239 ITYRSKVKDQNYYSVAVNAGYYVTPNAKVYVEGAWNRVTNKKGNTSLYDHNN-NTSDYSK 297

Query: 293 DAAGIANNNYTVTAGLQYRF 312
+ AGI N N+ TAGL+Y F
Sbjct: 298 NGAGIENYNFITTAGLKYTF 317


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC2397    HTHFIS  204    1e-63    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 204 bits (520), Expect = 1e-63
Identities = 105/427 (24%), Positives = 169/427 (39%), Gaps = 73/427 (17%)

Query: 1 MSDVCMPGCSGIDLMTLFHQDDDQLPILLITGHGDVPMAVDAVKKGAWDFLQKPVDPGKL 60
++DV MP + DL+ + LP+L+++ A+ A +KGA+D+L KP D +L
Sbjct: 52 VTDVVMPDENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTEL 111

Query: 61 LILIEDALRQRRSVIARRQYCQQTLQVELIGRSEWMNQFRQRLQQLAETDIAVWFYGEHG 120
+ +I AL + + ++ + Q L+GRS M + + L +L +TD+ + GE G
Sbjct: 112 IGIIGRALAEPKRRPSKLEDDSQDGM-PLVGRSAAMQEIYRVLARLMQTDLTLMITGESG 170

Query: 121 TGRMTGARYLHQLGRNAKGPFVRYELT--PENAGQLETF-----------------IDQA 161
TG+ AR LH G+ GPFV + P + + E F +QA
Sbjct: 171 TGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQA 230

Query: 162 QGGTLVLSHPEYLTREQQHHLAR-LQSLEHRP----------FRLVGVGSASLVEQAAAN 210
+GGTL L + + Q L R LQ E+ R+V + L +
Sbjct: 231 EGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQG 290

Query: 211 QIAAELYYCFAMTQIACQSLSQRPDDIEPLFRHYLRKACLRLNHPVPEIAGELLKGIMRR 270
+LYY + + L R +DI L RH++++A + V E L+ +
Sbjct: 291 LFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAE-KEGLDVKRFDQEALELMKAH 349

Query: 271 AWPSNVRELANAAELFAV-----------------------------------GVLPLAE 295
WP NVREL N + E
Sbjct: 350 PWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVE 409

Query: 296 TVNPQLL------LQEPTPLDRRVEEYERQIITEALNIHQGRINEVAEYLQIPRKKLYLR 349
Q L DR + E E +I AL +G + A+ L + R L +
Sbjct: 410 ENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKK 469

Query: 350 MKKYGLS 356
+++ G+S
Sbjct: 470 IRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2399    FLGMOTORFLIM  29     0.049    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 28.7 bits (64), Expect = 0.049
Identities = 5/35 (14%), Positives = 15/35 (42%), Gaps = 4/35 (11%)

Query: 342 QQLVQRMFDTAISFRLAQLKDAWRALHSAETRLKR 376
+++ + LA ++++W + RL +
Sbjct: 150 NSVMEGVIVRI----LANVRESWTQVIDLRPRLGQ 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC2400    TCRTETA  34     8e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.4 bits (79), Expect = 8e-04
Identities = 72/429 (16%), Positives = 140/429 (32%), Gaps = 45/429 (10%)

Query: 28 IQALLSVFLGYLAYYIVRNNFTLSTPYLKEQLDLSATQI---GLLSSCMLIAYGISKGVM 84
I L +V L + ++ P L L S G+L + + V+
Sbjct: 8 IVILSTVALDAVGIGLI----MPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVL 63

Query: 85 SSLADKASPKVFMACGLVLCAIVNVGLGFSSAFWIFAALVVFNGLFQGMGVGPSFITIAN 144
+L+D+ + + L A+ + + W+ + G+ G IA+
Sbjct: 64 GALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAG-AYIAD 122

Query: 145 WFPRRERGRVGAFWNISHNVGGGIVA-PIVGAAFAILGSEHWQSASYIVPACVAVIFALI 203
ER R F +S G G+VA P++G A + A + + L
Sbjct: 123 ITDGDERAR--HFGFMSACFGFGMVAGPVLGGLMGGFSPH----APFFAAAALNGLNFLT 176

Query: 204 VLVLGKGSPREEGLPSLEQMMPEEKVVLKTKNTAKAPENMSAWQIFCTYVLRNKNAWYIS 263
L +PE + + P A ++ +
Sbjct: 177 GCFL----------------LPE------SHKGERRPLRREALNPLASFRWARGMTVVAA 214

Query: 264 LVDVFVYMVRFGMISWLPIYLLTVKHFSKEQMSVAFLFFEWA---AIPSTLLAGWLSDKL 320
L+ VF M G + + F + ++ + ++ ++ G ++ +L
Sbjct: 215 LMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARL 274

Query: 321 FKGRRMPLAMICMALIFVCLIGYWKSESLLMVTIFAAIVGCLIYVPQFLASVQTMEIVPS 380
+ R + L MI ++ L + + + A G + Q + S Q E
Sbjct: 275 GERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQG 334

Query: 381 FAVGSAVGLRGFMSYIFGASLGTSLFGVMVDKLGWYGGFYLLMGGIVCCILFCYLSHRGA 440
GS L ++ I G L T+++ + + G+ + G + + L RG
Sbjct: 335 QLQGSLAALTS-LTSIVGPLLFTAIYAA---SITTWNGWAWIAGAALYLLCLPAL-RRGL 389

Query: 441 LELERQRQN 449
QR +
Sbjct: 390 WSGAGQRAD 398


93  SC2745  SC2752  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC2745    -3      11       0.615651   glycine betaine transporter periplasmic subunit
SC2746    -3      11       0.074967   inner membrane protein
SC2747    -2      11       -1.253539  transcriptional repressor MprA
SC2748    -3      10       -0.909392  multidrug resistance secretion protein
SC2749    -3      12       -1.234950  MFS superfamily, multidrug transport protein
SC2750    -3      14       -2.362608  glycoporin
SC2751    -3      17       -0.898599  hypothetical protein
SC2752    -1      13       0.846187   S-ribosylhomocysteinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC2745    PF06057  30     0.014    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 29.8 bits (67), Expect = 0.014
Identities = 8/55 (14%), Positives = 17/55 (30%)

Query: 277 FAIMKLPLADINAQNAMMHAGKSSEADVQGHVDGWINAHQQQFDGWVKEALAAQK 331
F + ++P + S +D + HV + + Q + Q
Sbjct: 133 FVLNEMPARYRKNVLGAVLLSPSQSSDFEIHVSEMVTSDNQSARYLTLPEVNKQT 187


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC2746    TCRTETB  46     1e-07    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 46.4 bits (110), Expect = 1e-07
Identities = 31/165 (18%), Positives = 66/165 (40%), Gaps = 2/165 (1%)

Query: 34 LDTIAHHFSLSASSAGFIVTAAQLGYAAGLLFLVPLGDMFE-RRTLIVSMTLLAAGGMLI 92
L IA+ F+ +S ++ TA L ++ G L D +R L+ + + G ++
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 93 TASSQSLSMMILGTALTGLFSVVAQILVPLA-ATLATPATRGKVVGTIMSGLLLGILLAR 151
S++I+ + G + LV + A RGK G I S + +G +
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 152 TVAGLLANLGGWRTVFWVASALMALMAVALWRGLPKLKSDTHLNY 196
+ G++A+ W + + + + + +++ H +
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC2748    RTXTOXIND  76     5e-17    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 75.6 bits (186), Expect = 5e-17
Identities = 63/418 (15%), Positives = 128/418 (30%), Gaps = 101/418 (24%)

Query: 28 KRKTALLLLTLLFVIIAVAYGIYWFLVLRHIEETDDA----YVAGNQVQITAQVSGSVTK 83
+R + + F++IA + L +E A +G +I + V +
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSV-----LGQVEIVATANGKLTHSGRSKEIKPIENSIVKE 109

Query: 84 VWADNTDFVKEGDVLVTLDQT--------------------------------------- 104
+ + V++GDVL+ L
Sbjct: 110 IIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELK 169

Query: 105 -------------DAKQAFEKAKTALASSVRQTHQLMINSKQ-------LQANIDVQKTA 144
+ + K ++ Q +Q +N + + A I+ +
Sbjct: 170 LPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENL 229

Query: 145 LAQAQSDLNRRVPLGNANLIGREELQHARDAVASAQAQLDVAIQQYNANQAMILNSNLED 204
+S L+ L + I + + + A +L V Q ++ IL++ E
Sbjct: 230 SRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEY 289

Query: 205 QPAVQQAATEVRN------------------AWLALERTRIVSPMTGYVSRRAVQ-PGAQ 245
Q Q E+ + + + I +P++ V + V G
Sbjct: 290 QLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGV 349

Query: 246 ISPTTPLMAVVPATD-LWVDANFKETQLANMRIGQPVTIITDIYGDDVKY---TGKVVGL 301
++ LM +VP D L V A + + + +GQ I + + +Y GKV +
Sbjct: 350 VTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAF-PYTRYGYLVGKVKNI 408

Query: 302 DMGTGSAFSLLPAQNATGNWIKVVQRLPVRVELDARQLEQHPLRIGLSTLVTVDTANR 359
+ ++ G V+ + + PL G++ + T R
Sbjct: 409 -----NLDAIE--DQRLGLVFNVIISIEENCLSTGNK--NIPLSSGMAVTAEIKTGMR 457


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC2749    TCRTETB  130    5e-35    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 130 bits (328), Expect = 5e-35
Identities = 94/405 (23%), Positives = 164/405 (40%), Gaps = 23/405 (5%)

Query: 17 IALSLATFMQVLDSTIANVAIPTIAGNLGSSLSQGTWVITSFGVANAISIPLTGWLAKRF 76
I L + +F VL+ + NV++P IA + + WV T+F + +I + G L+ +
Sbjct: 17 IWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQL 76

Query: 77 GEVKLFMWSTVAFAAASWACGVS-SSLNMLIFFRVVQGVVAGPLIPLSQSLLLNNYPPAK 135
G +L ++ + S V S ++LI R +QG A L ++ P
Sbjct: 77 GIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKEN 136

Query: 136 RSIALALWSMTVIVAPICGPILGGYISDNYHWGWIFFINVPIGIAVVLMTLHTLRGRETH 195
R A L V + GP +GG I+ HW ++ I + I I V + L+
Sbjct: 137 RGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPM-ITIITVPFLMKLLKKEVRI 195

Query: 196 TERRRIDAVGLALLVIGIGSLQIMLDRGKELDWFSSQEIIILTVVAVIAISFLIVWELTD 255
D G+ L+ +GI + ML F++ I +V+V++ +
Sbjct: 196 KG--HFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIRKV 243

Query: 256 DHPIVDLSLFKSRNFTIGCLCISLAYMLYFGAIVLLPQLLQEVYGYTATWAGLASAPVGI 315
P VD L K+ F IG LC + + G + ++P ++++V+ + G G
Sbjct: 244 TDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGT 303

Query: 316 IPVILS-PIIGRFAHKLDMRRLVTFSFIMYAVCFYWRAWTFEPGMDFGASAWPQFIQRF- 373
+ VI+ I G + ++ +V F ++ S + I F
Sbjct: 304 MSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFL-----LETTSWFMTIIIVFV 358

Query: 374 --AVACFFMPLTTITLSGLPPERLAAASSLSNFTRTLAGSIGTSI 416
++ ++TI S L + A SL NFT L+ G +I
Sbjct: 359 LGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC2752    LUXSPROTEIN  287    e-103    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 287 bits (736), Expect = e-103
Identities = 130/170 (76%), Positives = 145/170 (85%)

Query: 2 PLLDSFAVDHTRMQAPAVRVAKTMNTPHGDAITVFDLRFCIPNKEVMPEKGIHTLEHLFA 61
PLLDSF VDHTRM APAVRVAKTM TP GD ITVFDLRF PNK+++ EKGIHTLEHL+A
Sbjct: 1 PLLDSFTVDHTRMNAPAVRVAKTMQTPKGDTITVFDLRFTAPNKDILSEKGIHTLEHLYA 60

Query: 62 GFMRDHLNGNGVEIIDISPMGCRTGFYMSLIGTPDEQRVADAWKAAMADVLKVQDQNQIP 121
GFMR+HLNG+ VEIIDISPMGCRTGFYMSLIGTP EQ+VADAW AAM DVLKV++QN+IP
Sbjct: 61 GFMRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKVENQNKIP 120

Query: 122 ELNVYQCGTYQMHSLSEAQDIARHILERDVRVNSNKELALPKEKLQELHI 171
ELN YQCGT MHSL EA+ IA++ILE V VN N ELALP+ L+EL I
Sbjct: 121 ELNEYQCGTAAMHSLDEAKQIAKNILEVGVAVNKNDELALPESMLRELRI 170


94  SC2810  SC2830  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC2810    2       30       -7.801258  protein tyrosine phosphate
SC2811    0       27       -6.978967  virulence associated chaperone
SC2812    0       24       -5.100243  hypothetical protein
SC2813    0       24       -5.101051  acyl carrier protein
SC2814    0       22       -5.460558  cell invasion protein
SC2815    1       20       -5.293561  cell invasion protein
SC2816    1       21       -5.195046  cell invasion protein
SC2817    1       22       -5.873024  cell invasion protein
SC2818    -1      26       -6.539191  surface presentation of antigens; secretory
SC2819    -1      25       -5.681451  surface presentation of antigens protein SpaS
SC2820    -1      26       -5.044157  surface presentation of antigens; secretory
SC2821    -2      22       -3.583639  surface presentation of antigens; secretory
SC2822    -2      23       -3.920187  surface presentation of antigens protein SpaP
SC2823    -1      22       -4.358818  surface presentation of antigens protein SpaO
SC2824    -1      23       -5.544941  surface presentation of antigens; secretory
SC2825    -1      24       -6.090562  surface presentation of antigens; secretory
SC2826    -2      23       -5.938995  ATP synthase SpaL
SC2827    -1      27       -7.700174  surface presentation of antigens; secretory
SC2828    -1      25       -7.442681  invasion protein
SC2829    -1      28       -7.204965  invasion protein
SC2830    0       25       -6.573654  invasion protein; outer membrane
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2810    BACYPHPHTASE  304    e-100    Salmonella/Yersinia modular tyrosine phosphatase si...
		>BACYPHPHTASE#Salmonella/Yersinia modular tyrosine phosphatase signature.

Length = 468

Score = 304 bits (778), Expect = e-100
Identities = 67/212 (31%), Positives = 102/212 (48%), Gaps = 17/212 (8%)

Query: 340 GKPVALAGSYPKNTPDALEAHMKMLLEKECSCLVVLTSEDQMQAKQ--LPPYFRGSYTFG 397
G +A YP LE+H +ML E L VL S ++ ++ +P YFR S T+G
Sbjct: 252 GNTRTIACQYP--LQSQLESHFRMLAENRTPVLAVLASSSEIANQRFGMPDYFRQSGTYG 309

Query: 398 EVHTNSQKVSSASQGEAI--DQYNMQL-SCGEKRYTIPVLHVKNWPDHQPLPS--TDQLE 452
+ S+ G+ I D Y + + G+K ++PV+HV NWPD + S T L
Sbjct: 310 SITVESKMTQQVGLGDGIMADMYTLTIREAGQKTISVPVVHVGNWPDQTAVSSEVTKALA 369

Query: 453 YLADRVKNSNQNGAPGRSSS-----DKHLPMIHCLGGVGRTGTMAAALVLKDNPHSNL-- 505
L D+ + +N + SS K P+IHC GVGRT + A+ + D+ +S L
Sbjct: 370 SLVDQTAETKRNMYESKGSSAVGDDSKLRPVIHCRAGVGRTAQLIGAMCMNDSRNSQLSV 429

Query: 506 EQVRADFRDSRNNRMLEDASQF-VQLKAMQAQ 536
E + + R RN M++ Q V +K + Q
Sbjct: 430 EDMVSQMRVQRNGIMVQKDEQLDVLIKLAEGQ 461


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC2811    PF05932  33     7e-05    Tir chaperone protein (CesT)
		>PF05932#Tir chaperone protein (CesT)

Length = 127

Score = 33.2 bits (76), Expect = 7e-05
Identities = 16/111 (14%), Positives = 39/111 (35%), Gaps = 7/111 (6%)

Query: 4 PLTFDDNNQCLLLLDSDIFTSIEAK--DDIWLLNGMIIPLSPVCGDSIWRQIMVINGELA 61
PL FDD+ C +++D+ ++ + LL G++ P D + ++
Sbjct: 21 PLVFDDHGTCNMIIDNTFALTLSCDYARERLLLIGLLEPH----KDIPQQCLLAGALNPL 76

Query: 62 ANNEGTLAYIDAAETLLFIHAI-TDLTNTYHIISQLESFVNQQEALKNILQ 111
N L + + +I + + + ++ + + Q
Sbjct: 77 LNAGPGLGLDEKSGLYHAYQSIPREKLSVPTLKREMAGLLEWMRGWREASQ 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC2816    BACINVASINC  515    0.0      Salmonella/Shigella invasin protein C signature.
		>BACINVASINC#Salmonella/Shigella invasin protein C signature.

Length = 409

Score = 515 bits (1327), Expect = 0.0
Identities = 407/409 (99%), Positives = 408/409 (99%)

Query: 1 MLISNVGINPAAYLNNHSVENSSQTASQSVSAKDILNSIGISSSKVSDLGLSPTLSAPAP 60
MLISNVGINPAAYLNNHSVENSSQTASQSVSAKDILNSIGISSSKVSDLGLSPTLSAPAP
Sbjct: 1 MLISNVGINPAAYLNNHSVENSSQTASQSVSAKDILNSIGISSSKVSDLGLSPTLSAPAP 60

Query: 61 GVLTQTPGTITSFLKASIQNTDMNQDLNALANNVTTKANEVVQTQLREQQAEVGKFFDIS 120
GVLTQTPGTITSFLKASIQNTDMNQDLNALANNVTTKANEVVQTQLREQQAEVGKFFDIS
Sbjct: 61 GVLTQTPGTITSFLKASIQNTDMNQDLNALANNVTTKANEVVQTQLREQQAEVGKFFDIS 120

Query: 121 GMSSSAVALLAAANTLMLTLNQADSKLSGKLSLVSFDAAKTTASSMMREGMNALSGSISQ 180
GMSSSAVALLAAANTLMLTLNQADSKLSGKLSLVSFDAAKTTASSMMREGMNALSGSISQ
Sbjct: 121 GMSSSAVALLAAANTLMLTLNQADSKLSGKLSLVSFDAAKTTASSMMREGMNALSGSISQ 180

Query: 181 SALQLGITGVGAKLEYKGLQNERGALKHNAAKIDKLTTESHSIKNVLNGQNSVKLGAEGV 240
SALQLGITGVGAKLEYKGLQNERGALKHNAAKIDKLTTESHSIKNVLNGQNSVKLGAEGV
Sbjct: 181 SALQLGITGVGAKLEYKGLQNERGALKHNAAKIDKLTTESHSIKNVLNGQNSVKLGAEGV 240

Query: 241 DSLKSLNMKKTGTDATKNLNDATLKSNAGTSATESLGIKDSNKQISPEHQAILSKRLESV 300
DSLKSLNMKKTGTDATKNLNDATLKSNAGTSATESLGIK+SNKQISPEHQAILSKRLESV
Sbjct: 241 DSLKSLNMKKTGTDATKNLNDATLKSNAGTSATESLGIKNSNKQISPEHQAILSKRLESV 300

Query: 301 ESDIRLEQNTMDMTRIDARKMQMTGDLIMKNSVTVGGIAGASGQYAATQERSEQQISQVN 360
ESDIRLEQNTMDMTRIDARKMQMTGDLIMKNSVTVGGIAGAS QYAATQERSEQQISQVN
Sbjct: 301 ESDIRLEQNTMDMTRIDARKMQMTGDLIMKNSVTVGGIAGASRQYAATQERSEQQISQVN 360

Query: 361 NRVASTASDEARESSRKSTSLIQEMLKTMESINQSKASALAAIAGNIRA 409
NRVASTASDEARESSRKSTSLIQEMLKTMESINQSKASALAAIAGNIRA
Sbjct: 361 NRVASTASDEARESSRKSTSLIQEMLKTMESINQSKASALAAIAGNIRA 409


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC2817    BACINVASINB  842    0.0      Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 842 bits (2175), Expect = 0.0
Identities = 592/593 (99%), Positives = 592/593 (99%)

Query: 1 MVNDASSISRSGYTQNPRLAEAAFEGVRKNTDFLKAADKAFKDVVATKAGDLKAGTKSGE 60
MVNDASSISRSGYTQNPRLAEAAFEGVRKNTDFLKAADKAFKDVVATKAGDLKAGTKSGE
Sbjct: 1 MVNDASSISRSGYTQNPRLAEAAFEGVRKNTDFLKAADKAFKDVVATKAGDLKAGTKSGE 60

Query: 61 SAINTVGLKPPTDAAREKLSSEGQLTLLLGKLMTLLGDVSLSQLESRLAVWQAMIESQKE 120
SAINTVGLKPPTDAAREKLSSEGQLTLLLGKLMTLLGDVSLSQLESRLAVWQAMIESQKE
Sbjct: 61 SAINTVGLKPPTDAAREKLSSEGQLTLLLGKLMTLLGDVSLSQLESRLAVWQAMIESQKE 120

Query: 121 MGIQVSKEFQTALGEAQEATDLYEASIKKTDTAKSVYDAATKKLTQAQNKLQSLDPADPG 180
MGIQVSKEFQTALGEAQEATDLYEASIKKTDTAKSVYDAATKKLTQAQNKLQSLDPADPG
Sbjct: 121 MGIQVSKEFQTALGEAQEATDLYEASIKKTDTAKSVYDAATKKLTQAQNKLQSLDPADPG 180

Query: 181 YAQTEAAVEQAGKEATEAKEALDKATDATVKAGTDAKAKAEKADNILTKFQGTANAASQN 240
YAQ EAAVEQAGKEATEAKEALDKATDATVKAGTDAKAKAEKADNILTKFQGTANAASQN
Sbjct: 181 YAQAEAAVEQAGKEATEAKEALDKATDATVKAGTDAKAKAEKADNILTKFQGTANAASQN 240

Query: 241 QVSQGEQDNLSNVARLTMLMAMFIEIVGKNTEESLQNDLALFNALQEGRQAEMEKKSAEF 300
QVSQGEQDNLSNVARLTMLMAMFIEIVGKNTEESLQNDLALFNALQEGRQAEMEKKSAEF
Sbjct: 241 QVSQGEQDNLSNVARLTMLMAMFIEIVGKNTEESLQNDLALFNALQEGRQAEMEKKSAEF 300

Query: 301 QEETRKAEETNRIMGCIGKVLGALLTIVSVVAAVFTGGASLALAAVGLAVMVADEIVKAA 360
QEETRKAEETNRIMGCIGKVLGALLTIVSVVAAVFTGGASLALAAVGLAVMVADEIVKAA
Sbjct: 301 QEETRKAEETNRIMGCIGKVLGALLTIVSVVAAVFTGGASLALAAVGLAVMVADEIVKAA 360

Query: 361 TGVSFIQQALNPIMEHVLKPLMELIGKAITKALEGLGVDKKTAEMAGSIVGAIVAAIAMV 420
TGVSFIQQALNPIMEHVLKPLMELIGKAITKALEGLGVDKKTAEMAGSIVGAIVAAIAMV
Sbjct: 361 TGVSFIQQALNPIMEHVLKPLMELIGKAITKALEGLGVDKKTAEMAGSIVGAIVAAIAMV 420

Query: 421 AVIVVVAVVGKGAAAKLGNALSKMMGETIKKLVPNVLKQLAQNGSKLFTQGMQRITSGLG 480
AVIVVVAVVGKGAAAKLGNALSKMMGETIKKLVPNVLKQLAQNGSKLFTQGMQRITSGLG
Sbjct: 421 AVIVVVAVVGKGAAAKLGNALSKMMGETIKKLVPNVLKQLAQNGSKLFTQGMQRITSGLG 480

Query: 481 NVGSKMGLQTNALSKELVGNTLNKVALGMEVTNTAAQSAGGVAEGVFIKNASEALADFML 540
NVGSKMGLQTNALSKELVGNTLNKVALGMEVTNTAAQSAGGVAEGVFIKNASEALADFML
Sbjct: 481 NVGSKMGLQTNALSKELVGNTLNKVALGMEVTNTAAQSAGGVAEGVFIKNASEALADFML 540

Query: 541 ARFAMDQIQQWLKQSVEIFGENQKVTAELQKAMSSAVQQNADASRFILRQSRA 593
ARFAMDQIQQWLKQSVEIFGENQKVTAELQKAMSSAVQQNADASRFILRQSRA
Sbjct: 541 ARFAMDQIQQWLKQSVEIFGENQKVTAELQKAMSSAVQQNADASRFILRQSRA 593


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2818    SYCDCHAPRONE  128    2e-40    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 128 bits (322), Expect = 2e-40
Identities = 39/160 (24%), Positives = 72/160 (45%), Gaps = 4/160 (2%)

Query: 4 QNNVSEERVAEMIWDAVSEGATLKDVHGIPQDMMDGLYAHAYEFYNQGRLDEAETFFRFL 63
Q + + + G T+ ++ I D ++ LY+ A+ Y G+ ++A F+ L
Sbjct: 3 QETTDTQEYQLAMESFLKGGGTIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQAL 62

Query: 64 CIYDFYNPDYTMGLAAVCQLKKQFQKACDLYAVAFTLLKNDYRPVFFTGQCQLLMRKAAK 123
C+ D Y+ + +GL A Q Q+ A Y+ + + R F +C L + A+
Sbjct: 63 CVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAE 122

Query: 124 ARQCF----ELVNERTEDESLRAKALVYLEALKTAETEQH 159
A EL+ ++TE + L + LEA+K + +H
Sbjct: 123 AESGLFLAQELIADKTEFKELSTRVSSMLEAIKLKKEMEH 162


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2819    TYPE3IMSPROT  341    e-118    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 341 bits (876), Expect = e-118
Identities = 119/360 (33%), Positives = 205/360 (56%), Gaps = 19/360 (5%)

Query: 1 MSSNKTEKPTKKRLEDSAKKGQSFKSKDLIIACLTLGGIAYLVSYGSFN-EFMGIIKIII 59
MS KTE+PT K++ D+ KKGQ KSK+++ L + A L+ + E + +I
Sbjct: 1 MSGEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIP 60

Query: 60 ADNFDQSMADYSLAVFGIGLKYLIPFMLLCL---VCSALPAL----LQAGFVLATEALKP 112
+QS +S A+ + L+ F LC +AL A+ +Q GF+++ EA+KP
Sbjct: 61 ---AEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKP 117

Query: 113 NLSALNPVEGAKKLFSMRTVKDTVKTLLYLSSFVVAAIICWKKYKVEIFSQLNGNVVDIA 172
++ +NP+EGAK++FS++++ + +K++L + V+ +I+ W K + + L I
Sbjct: 118 DIKKINPIEGAKRIFSIKSLVEFLKSILKV---VLLSILIWIIIKGNLVTLLQLPTCGIE 174

Query: 173 VIWRELLLALVLTCLACA---LIVLLLDAVAEYFRTMKDMKMDKEEVKREMKEQEGNPEV 229
I L L + C +++ + D EY++ +K++KM K+E+KRE KE EG+PE+
Sbjct: 175 CITPLLGQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEI 234

Query: 230 KSKRREVHMEILSEQVKSDIENSRLIVANPTHITIGIYFKPELMPIPMISVYETNQRALA 289
KSKRR+ H EI S ++ +++ S ++VANPTHI IGI +K P+P+++ T+ +
Sbjct: 235 KSKRRQFHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQT 294

Query: 290 VRAYAEKVGVPVIVDIKLARSLFKTHRRYDLVSLEEIDEVLRLLVWLE--EVENAGKDVI 347
VR AE+ GVP++ I LAR+L+ + E+I+ +L WLE +E +++
Sbjct: 295 VRKIAEEEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQNIEKQHSEML 354


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2820    TYPE3IMRPROT  184    5e-60    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 184 bits (470), Expect = 5e-60
Identities = 48/237 (20%), Positives = 103/237 (43%), Gaps = 4/237 (1%)

Query: 12 LVASAALGFARVAPIFFFLPFLNSGVLSGAPRNAIIILVALGVWPHALNEVPPFLSVAMI 71
+ RV + P L+ + + + +++ + P P S +
Sbjct: 12 WLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDVPVFSFFAL 71

Query: 72 PLVLQEAAVGVMLGCLLSWPFWVMHALGCIIDNQRGATLSSSIDPANGIDTSEMANFLNM 131
L +Q+ +G+ LG + + F + G II Q G + ++ +DPA+ ++ +A ++M
Sbjct: 72 WLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDM 131

Query: 132 SAAVVHLQNGGLVTMVDVLTKSYQLCDPMNEC--TPSLPPLLTFINQVAQNALVLASPVV 189
A ++ L G + ++ +L ++ E + + L + + N L+LA P++
Sbjct: 132 LALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFLNGLMLALPLI 191

Query: 190 LVLLLSEVFLGLLSRFAPQMNAFAISLTVKSGIAVLIMLLYFS--PVLPDNVLRLSF 244
+LL + LGLL+R APQ++ F I + + + +M +++ F
Sbjct: 192 TLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCEHLFSEIF 248


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2821    TYPE3IMQPROT  89     4e-27    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.
Length = 86

Score = 88.7 bits (220), Expect = 4e-27
Identities = 86/86 (100%), Positives = 86/86 (100%)

Query: 1 MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL 60
MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL
Sbjct: 1 MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL 60

Query: 61 FLLSGWYGEVLLSYGRQVIFLALAKG 86
FLLSGWYGEVLLSYGRQVIFLALAKG
Sbjct: 61 FLLSGWYGEVLLSYGRQVIFLALAKG 86


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2822    TYPE3IMPPROT  303    e-107    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.
Length = 224

Score = 303 bits (777), Expect = e-107
Identities = 224/224 (100%), Positives = 224/224 (100%)

Query: 1 MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS 60
MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS
Sbjct: 1 MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS 60

Query: 61 MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL 120
MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL
Sbjct: 61 MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL 120

Query: 121 KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL 180
KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL
Sbjct: 121 KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL 180

Query: 181 LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYMDIAT 224
LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYMDIAT
Sbjct: 181 LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYMDIAT 224


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2823    TYPE3OMOPROT  538    0.0      Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.
Length = 303

Score = 538 bits (1387), Expect = 0.0
Identities = 300/303 (99%), Positives = 302/303 (99%)

Query: 1 MSLRVRQIDRREWLLAQTATECQRHGQEATLEYPTRQGMWVRLSDAEKRWSAWIQPGDWL 60
MSLRVRQIDRREWLLAQTATECQRHG+EATLEYPTRQGMWVRLSDAEKRWSAWI+PGDWL
Sbjct: 1 MSLRVRQIDRREWLLAQTATECQRHGREATLEYPTRQGMWVRLSDAEKRWSAWIKPGDWL 60

Query: 61 EHVSPALAGAAVSAGAEHLVVPWLAATERPFELPVPHLSCRRLCVENPVPGSALPEGKLL 120
EHVSPALAGAAVSAGAEHLVVPWLAATERPFELPVPHLSCRRLCVENPVPGSALPEGKLL
Sbjct: 61 EHVSPALAGAAVSAGAEHLVVPWLAATERPFELPVPHLSCRRLCVENPVPGSALPEGKLL 120

Query: 121 HIMSDRGGLWFEHLPELPAVGGGRPKMLRWPLRFVIGSSDTQCSLLGRIGIGDVLLIRTS 180
HIMSDRGGLWFEHLPELPAVGGGRPKMLRWPLRFVIGSSDTQ SLLGRIGIGDVLLIRTS
Sbjct: 121 HIMSDRGGLWFEHLPELPAVGGGRPKMLRWPLRFVIGSSDTQRSLLGRIGIGDVLLIRTS 180

Query: 181 RAEVYCYAKKLGHFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLEFVLYR 240
RAEVYCYAKKLGHFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLEFVLYR
Sbjct: 181 RAEVYCYAKKLGHFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLEFVLYR 240

Query: 241 KNVTLAELEAMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESG 300
KNVTLAELEAMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESG
Sbjct: 241 KNVTLAELEAMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESG 300

Query: 301 NGE 303
NGE
Sbjct: 301 NGE 303


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2824    SSPANPROTEIN  600    0.0      Salmonella invasion protein InvJ signature.
		>SSPANPROTEIN#Salmonella invasion protein InvJ signature.

Length = 336

Score = 600 bits (1548), Expect = 0.0
Identities = 331/336 (98%), Positives = 334/336 (99%)

Query: 1 MGDVSAVSSSGNILLPQQDEVGGLSEALKKAVEKHKTEYSGDKKDRDYGDAFVMHKETAL 60
MGDVSAVSSSGNILLPQQDEVGGLSEALKKAVEKHKTEYSGDKKDRDYGDAFVMHKETAL
Sbjct: 1 MGDVSAVSSSGNILLPQQDEVGGLSEALKKAVEKHKTEYSGDKKDRDYGDAFVMHKETAL 60

Query: 61 PVLLAAWRHGAPAKSEHHNGNVSGLHHNGKGDLRIAEKLLKVTAEKSVGLISAEAKVDKS 120
P+LLAAWRHGAPAKSEHHNGNVSGLHHNGK +LRIAEKLLKVTAEKSVGLISAEAKVDKS
Sbjct: 61 PLLLAAWRHGAPAKSEHHNGNVSGLHHNGKSELRIAEKLLKVTAEKSVGLISAEAKVDKS 120

Query: 121 AALLSSKNRPLESVSGKKLSADLKAVESVSEVADNATGISDDNIKALPGDNKAIAGEGVR 180
AALLSSKNRPLESVSGKKLSADLKAVESVSEV DNATGISDDNIKALPGDNKAIAGEGVR
Sbjct: 121 AALLSSKNRPLESVSGKKLSADLKAVESVSEVTDNATGISDDNIKALPGDNKAIAGEGVR 180

Query: 181 KEGAPLARDVAPARMAAANTGKPDDKDHKKVKDVSQLPLQPTTIADLSQLTGGDEKMPLA 240
KEGAPLARDVAPARMAAANTGKP+DKDHKKVKDVSQLPLQPTTIADLSQLTGGDEKMPLA
Sbjct: 181 KEGAPLARDVAPARMAAANTGKPEDKDHKKVKDVSQLPLQPTTIADLSQLTGGDEKMPLA 240

Query: 241 AQSKPMMTIFPTADGVKGEDSSLTYRFQRWGNDYSVNIQARQAGEFSLIPSNTQVEHRLH 300
AQSKPMMTIFPTADGVKGEDSSLTYRFQRWGNDYSVNIQARQAGEFSLIPSNTQVEHRLH
Sbjct: 241 AQSKPMMTIFPTADGVKGEDSSLTYRFQRWGNDYSVNIQARQAGEFSLIPSNTQVEHRLH 300

Query: 301 DQWQNGNPQRWHLTRDDQQNPQQQQHRQQSGEEDDA 336
DQWQNGNPQRWHLTRDDQQNPQQQQHRQQSGEEDDA
Sbjct: 301 DQWQNGNPQRWHLTRDDQQNPQQQQHRQQSGEEDDA 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2825    SSPAMPROTEIN  169    3e-57    Salmonella surface presentation of antigen gene typ...
		>SSPAMPROTEIN#Salmonella surface presentation of antigen gene type M signature.
Length = 147

Score = 169 bits (429), Expect = 3e-57
Identities = 141/147 (95%), Positives = 143/147 (97%)

Query: 1 MHSLTRIKVLQRRCTVFHSQCESILLRYQDEDRGLQAEEEAILEQIAGLKLLLDTLRAEN 60
MHSLTRIKVLQRRCTVFHSQCESILLRYQDEDR LQ EEEAI+EQIAGLKLLLDTLRAEN
Sbjct: 1 MHSLTRIKVLQRRCTVFHSQCESILLRYQDEDRRLQVEEEAIVEQIAGLKLLLDTLRAEN 60

Query: 61 RQLSREEIYTLLRKQSIVRRQIKDLELQIIQIQEKRSELEKKREEFQKKSKYWLRKEGNY 120
RQLSREEIY LLRKQSIVRRQIKDLELQIIQIQEKRSELEKKREEFQ+KSKYWLRKEGNY
Sbjct: 61 RQLSREEIYALLRKQSIVRRQIKDLELQIIQIQEKRSELEKKREEFQEKSKYWLRKEGNY 120

Query: 121 QRWIIRQKRFYIQREIQQEEAESEEII 147
QRWIIRQKR YIQREIQQEEAESEEII
Sbjct: 121 QRWIIRQKRLYIQREIQQEEAESEEII 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2827    SSPAKPROTEIN  205    9e-72    Invasion protein B family signature.
		>SSPAKPROTEIN#Invasion protein B family signature.

Length = 133

Score = 205 bits (522), Expect = 9e-72
Identities = 43/133 (32%), Positives = 75/133 (56%)

Query: 1 MQHLDIAELVRSALEVSGCDPSLIGGIDSHSTIVLDLFALPSICISVKEDDVWIWAQLGA 60
M ++++ +LVR +L GC PS+I +DSHS I + L ++P+I I++ + V +WA A
Sbjct: 1 MSNINLVQLVRDSLFTIGCPPSIITDLDSHSAITISLDSMPAINIALVNEQVMLWANFDA 60

Query: 61 DSMVVLQQRAYEILMTIMEGCHFARGGQLLLGEQNGELTLKALVHPDFLSDGEKFSTALN 120
S V LQ AY IL ++ ++ + L + L L+ ++ D++ DG F+ L+
Sbjct: 61 PSDVKLQSSAYNILNLMLMNFSYSINELVELHRSDEYLQLRVVIKDDYVHDGIVFAEILH 120

Query: 121 GFYNYLEVFSRSL 133
FY +E+ + L
Sbjct: 121 EFYQRMEILNGVL 133


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC2829    INVEPROTEIN  604    0.0      Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.
Length = 372

Score = 604 bits (1558), Expect = 0.0
Identities = 371/372 (99%), Positives = 371/372 (99%)

Query: 1 MIPGSTSGISFSRILSRQTSHQDATQHTDAQQAEIQQAAEDSSPGAEVQKFVQSTDEMSA 60
MIPGSTSGISFSRILSRQ SHQDATQHTDAQQAEIQQAAEDSSPGAEVQKFVQSTDEMSA
Sbjct: 1 MIPGSTSGISFSRILSRQASHQDATQHTDAQQAEIQQAAEDSSPGAEVQKFVQSTDEMSA 60

Query: 61 ALAQFRNRRDYEKKSSNLSNSFERVLEDEALPKAKQILKLISVHGGALEDFLRQARSLFP 120
ALAQFRNRRDYEKKSSNLSNSFERVLEDEALPKAKQILKLISVHGGALEDFLRQARSLFP
Sbjct: 61 ALAQFRNRRDYEKKSSNLSNSFERVLEDEALPKAKQILKLISVHGGALEDFLRQARSLFP 120

Query: 121 DPSDLVLVLRELLRRKDLEEIVRKKLESLLKHVEEQTDPKTLKAGINCALKARLFGKTLS 180
DPSDLVLVLRELLRRKDLEEIVRKKLESLLKHVEEQTDPKTLKAGINCALKARLFGKTLS
Sbjct: 121 DPSDLVLVLRELLRRKDLEEIVRKKLESLLKHVEEQTDPKTLKAGINCALKARLFGKTLS 180

Query: 181 LKPGLLRASYRQFIQSESHEVEIYSDWIASYGYQRRLVVLDFIEGSLLTDIDANDASCSR 240
LKPGLLRASYRQFIQSESHEVEIYSDWIASYGYQRRLVVLDFIEGSLLTDIDANDASCSR
Sbjct: 181 LKPGLLRASYRQFIQSESHEVEIYSDWIASYGYQRRLVVLDFIEGSLLTDIDANDASCSR 240

Query: 241 LEFGQLLRRLTQLKMLRSADLLFVSTLLSYSFTKAFNAEESSWLLLMLSLLQQPHEVDSL 300
LEFGQLLRRLTQLKMLRSADLLFVSTLLSYSFTKAFNAEESSWLLLMLSLLQQPHEVDSL
Sbjct: 241 LEFGQLLRRLTQLKMLRSADLLFVSTLLSYSFTKAFNAEESSWLLLMLSLLQQPHEVDSL 300

Query: 301 LADIIGLNALLLSHKEHASFLQIFYQVCKAIPSSLFYEEYWQEELLMALRSMTDIAYKHE 360
LADIIGLNALLLSHKEHASFLQIFYQVCKAIPSSLFYEEYWQEELLMALRSMTDIAYKHE
Sbjct: 301 LADIIGLNALLLSHKEHASFLQIFYQVCKAIPSSLFYEEYWQEELLMALRSMTDIAYKHE 360

Query: 361 MAEQRRTIEKLS 372
MAEQRRTIEKLS
Sbjct: 361 MAEQRRTIEKLS 372


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2830    TYPE3OMGPROT  576    0.0      Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.
Length = 607

Score = 576 bits (1485), Expect = 0.0
Identities = 169/540 (31%), Positives = 271/540 (50%), Gaps = 57/540 (10%)

Query: 4 HILLARVLACAALVLVTPGYSSE----KIPVTGSGFVAKDDSLRTFFDAMALQLKEPVIV 59
H RVL L+L + ++ E IP +VAK +SLR V+V
Sbjct: 6 HSFFKRVLTGTLLLLSSYSWAQELDWLPIPYV---YVAKGESLRDLLTDFGANYDATVVV 62

Query: 60 SKMAARKKITGNFEFHDPNALLEKLSLQLGLIWYFDGQAIYIYDASEMRNAVVSLRNVSL 119
S K++G FE +P L+ ++ L+WY+DG +YI+ SE+ + ++ L+
Sbjct: 63 SD-KINDKVSGQFEHDNPQDFLQHIASLYNLVWYYDGNVLYIFKNSEVASRLIRLQESEA 121

Query: 120 NEFNNFLKRSGLYNKNYPLRGDNRKGTFYVSGPPVYVDMVVNAATMMDKQND--GIELGR 177
E L+RSG++ + R D YVSGPP Y+++V A +++Q + G
Sbjct: 122 AELKQALQRSGIWEPRFGWRPDASNRLVYVSGPPRYLELVEQTAAALEQQTQIRSEKTGA 181

Query: 178 QKIGVMRLNNTFVGDRTYNLRDQKMVIPGIATAIERLLQGEEQPLGNIVSSEPPAMPAFS 237
I + L DRT + RD ++ PG+AT ++R+L + + P
Sbjct: 182 LAIEIFPLKYASASDRTIHYRDDEVAAPGVATILQRVLSDATIQQVTVDNQRIP------ 235

Query: 238 ANGEKGKAANYAGGMSLQEALKQNAAAGNIKIVAYPDTNSLLVKGTAEQVHFIEMLVKAL 297
Q A + +A A ++ A P N+++V+ + E++ + L+ AL
Sbjct: 236 -----------------QAATRASAQA---RVEADPSLNAIIVRDSPERMPMYQRLIHAL 275

Query: 298 DVAKRHVELSLWIVDLNKSDLERLGTSWSGSI-----------TIGDKLGVSLNQSSIST 346
D +E++L IVD+N L LG W I T GD+ ++ N + S
Sbjct: 276 DKPSARIEVALSIVDINADQLTELGVDWRVGIRTGNNHQVVIKTTGDQSNIASNGALGSL 335

Query: 347 LDG---SRFIAAVNALEEKKQATVVSRPVLLTQENVPAIFDNNRTFYTKLIGERNVALEH 403
+D +A VN LE + A VVSRP LLTQEN A+ D++ T+Y K+ G+ L+
Sbjct: 336 VDARGLDYLLARVNLLENEGSAQVVSRPTLLTQENAQAVIDHSETYYVKVTGKEVAELKG 395

Query: 404 VTYGTMIRVLPRFSADG---QIEMSLDIEDGNDKTPQSDTTTSVDALPEVGRTLISTIAR 460
+TYGTM+R+ PR G +I ++L IEDGN Q ++ ++ +P + RT++ T+AR
Sbjct: 396 ITYGTMLRMTPRVLTQGDKSEISLNLHIEDGN----QKPNSSGIEGIPTISRTVVDTVAR 451

Query: 461 VPHGKSLLVGGYTRDANTDTVQSIPFLGKLPLIGSLFRYSSKNKSNVVRVFMIEPKEIVD 520
V HG+SL++GG RD + + +P LG +P IG+LFR S+ VR+F+IEP+ I +
Sbjct: 452 VGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELTRRTVRLFIIEPRIIDE 511


95  SC2886  SC2893  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC2886     0      21        0.211032  phosphopyruvate hydratase
SC2887    -1      12        0.573321  CTP synthetase
SC2888    -1      12        0.687737  nucleoside triphosphate pyrophosphohydrolase
SC2889     1      13        0.746766  fimbrial subunit
SC2890     2      13       -1.225546  outer membrane usher protein
SC2891     0      14       -1.377119  periplasmic fimbrial chaperone
SC2892    -2      13       -1.202620  minor fimbrial subunit
SC2893    -2      10       -1.233715  minor fimbrial subunit
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2886    ANTHRAXTOXNA  29     0.036    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 29.3 bits (65), Expect = 0.036
Identities = 31/132 (23%), Positives = 51/132 (38%), Gaps = 9/132 (6%)

Query: 211 GYAPNLGSNAEALAVIAEAVKAAGYELGKDITLAMDCAASEFYKDGKYVLA-----GEGN 265
P L N + A+ +E K YE+GK I+L + + ++ + +
Sbjct: 147 RETPKLIINIKDYAINSEQSKEVYYEIGKGISLDIISKDKSLDPEFLNLIKSLSDDSDSS 206

Query: 266 KAFTSEEFTHFLEELTKQYPIVSIEDGLDESDW---DGFAYQTKVLG-DKIQLVGDDLFV 321
S++F LE K I I++ L E F+Y ++L D+F
Sbjct: 207 DLLFSQKFKEKLELNNKSIDINFIKENLTEFQHAFSLAFSYYFAPDHRTVLELYAPDMFE 266

Query: 322 TNTKILKEGIEK 333
K+ K G EK
Sbjct: 267 YMNKLEKGGFEK 278


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC2890    PF00577  703    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 703 bits (1816), Expect = 0.0
Identities = 217/875 (24%), Positives = 375/875 (42%), Gaps = 73/875 (8%)

Query: 12 PIACGVGMLLSVSPYSASGKDIEFNTDFLDVKNRDNVNIAQFSRKGFILPGVYLLQIKIN 71
+AC + +P S++ ++ FN FL + ++++F + PG Y + I +N
Sbjct: 31 FVACA---FAAQAPLSSA--ELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLN 85

Query: 72 GQTLPQEFPVNWVIPEHDPQGSEVCAEPELVTQLGIKPELAEKLVWITHGERQCLAPDSL 131
+ V + QG C + +G+ + + + C+ S+
Sbjct: 86 NGYMATR-DVTFN-TGDSEQGIVPCLTRAQLASMGLNTASVSGMNLL--ADDACVPLTSM 141

Query: 132 -KGMDFQADLGHSTLLVNLPQAYMEYSDVDWDPPARWDNGIPGIILDYNINNQLRHDQES 190
Q D+G L + +PQA+M + PP WD GI +L+YN + ++
Sbjct: 142 IHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIG 201

Query: 191 GSEEQSISGNGTLGANLGAWRLRADWQASYDHRDDDENTSTLHDQSWSRYYAYRALPTLG 250
G+ N G N+GAWRLR + SY+ D + + R + L
Sbjct: 202 GNS-HYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKN--KWQHINTWLERDIIPLR 258

Query: 251 AKLTLGESYLQSDVFDSFNYIGASVVSDDQMLPPKLRGYAPEIVGIARSNAKVKVSWQGR 310
++LTLG+ Y Q D+FD N+ GA + SDD MLP RG+AP I GIAR A+V + G
Sbjct: 259 SRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGY 318

Query: 311 VLYETQVPAGPFRIQDLNQ-SVSGTLHVTVEEQNGQTQEFDVNTASVPFLTRPGMVRYKM 369
+Y + VP GPF I D+ SG L VT++E +G TQ F V +SVP L R G RY +
Sbjct: 319 DIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSI 378

Query: 370 ALGRPQDWDHHPITGTFASAEASWGVTNGWSLYGGAIGESNYQAVALGSGKDLGVVGAVA 429
G + + F + G+ GW++YGG Y+A G GK++G +GA++
Sbjct: 379 TAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALS 438

Query: 430 VDITHSIAHMPQDDGFDGETLQGNSYRISYSRDFDEIDSRLTFAGYRFSEKNFMSMSDYL 489
VD+T + + +P D G S R Y++ +E + + GYR+S + + +D
Sbjct: 439 VDMTQANSTLPDD-----SQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTT 493

Query: 490 DAKT--YHHLNA-----------------GHEKERYTVTYNQNFREQGMSAYFSYSRSTF 530
++ Y+ +++ + +T Q + Y S S T+
Sbjct: 494 YSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTS-TLYLSGSHQTY 552

Query: 531 WDSPDQS-NYNLSLSWYFDLGSIKNLSASLNGYRSEYNGDKDDGVYISLSVPWG------ 583
W + + + L+ F+ + + S + ++ + +D + +++++P+
Sbjct: 553 WGTSNVDEQFQAGLNTAFEDIN---WTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSD 609

Query: 584 ------NDSISYNGT-FNGSQHRNQLGYSGH--SQNGDNWQLHVG-----QDEQGAQADG 629
+ S SY+ + + N G G N ++ + G G+
Sbjct: 610 SKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYA 669

Query: 630 YYSHQGALTDIDLSADYEEGSYRSLGMSLRGGMTLTTQGGALHRGSLAGSTRLLVDTDGI 689
+++G + ++ + + + L + GG+ G L + T +LV G
Sbjct: 670 TLNYRGGYGNANIGYSHSDD-IKQLYYGVSGGVLAHANGVTLGQPL--NDTVVLVKAPGA 726

Query: 690 ADVPVSGNDSPTSTNIFGKAVIADVGSYSRSLARIDLNKLPEKAEATKSVVQITLTEGAI 749
D V N + T+ G AV+ Y + +D N L + + +V + T GAI
Sbjct: 727 KDAKV-ENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAI 785

Query: 750 GYRHFDVVSGEKMMAVFRLADGDFPPFGAEVKNERQQQLGLVANDGNAWLAGVKAGETLK 809
F G K++ + PFGA V +E Q G+VA++G +L+G+ ++
Sbjct: 786 VRAEFKARVGIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQ 844

Query: 810 VFW--DGAAQCEA--SLPPTFTPELLANALLLPCK 840
V W + A C A LPP +LL L C+
Sbjct: 845 VKWGEEENAHCVANYQLPPESQQQLL-TQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2892    FIMBRIALPAPF  36     4e-05    Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.
Length = 167

Score = 35.8 bits (82), Expect = 4e-05
Identities = 37/144 (25%), Positives = 62/144 (43%), Gaps = 26/144 (18%)

Query: 57 PPCTIGGAS---VEFGDVLTTKVGDASQTKPVGYSLNCDGRASDYLKLQIQGTTTTISGE 113
PPCTI V+FG++ V ++ S++C + S L +++ G T +
Sbjct: 32 PPCTINNGQNIVVDFGNINPEHVDNSRGEVTKNISISCPYK-SGSLWIKVTGNTMGVGQN 90

Query: 114 QVLQTSVQGLGIRIQQ-------------AGNKQLVPVGI-TDWLNFTLSGSNGPELEAV 159
VL T++ GI + Q +GN V G+ T FT + +V
Sbjct: 91 NVLATNITHFGIALYQGKGMSTPLTLGNGSGNGYRVTAGLDTARSTFTFT--------SV 142

Query: 160 PVKEPTTQLAGGDFNASATLVVDY 183
P + + L GGDF +A++ + Y
Sbjct: 143 PFRNGSGILNGGDFRTTASMSMIY 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC2893    FIMBRIALPAPF  43     7e-08    Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.
Length = 167

Score = 43.2 bits (101), Expect = 7e-08
Identities = 44/170 (25%), Positives = 77/170 (45%), Gaps = 18/170 (10%)

Query: 5 IVKRVLILTLLITQFAC-AD-NLTFHGKLINPPACTINNGETLEVSFGSVIIDNIDGVNY 62
+++ L ++LL+T A AD + G + PP CTINNG+ + V FG++ +++D N
Sbjct: 1 MIRLSLFISLLLTSVAVLADVQINIRGNVYIPP-CTINNGQNIVVDFGNINPEHVD--NS 57

Query: 63 LTEIPWTLTCDSSFRDDALTFTLSYLGTATPYSANALTTNVPELGIELQQNGTVFPPGT- 121
E+ ++ ++ +L ++ T N L TN+ GI L Q + P T
Sbjct: 58 RGEVTKNISISCPYKSGSLWIKVTG-NTMGVGQNNVLATNITHFGIALYQGKGMSTPLTL 116

Query: 122 ----------SLTIDES-SLPTLKAVPVKQPGKEPAEGDFEAFATLQVDY 160
+ +D + S T +VP + GDF A++ + Y
Sbjct: 117 GNGSGNGYRVTAGLDTARSTFTFTSVPFRNGSGILNGGDFRTTASMSMIY 166


96  SC3202  SC3209  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC3202     0      13        2.149785  galactitol-1-phosphate dehydrogenase
SC3203     1      13        2.248056  sugar metabolism transcriptional regulator
SC3204     1      14        3.342561  hypothetical protein
SC3205     0      15        3.220692  transglycosylase
SC3206     2      16        2.111598  hypothetical protein
SC3207     2      16        1.677096  chromosome replication initiator DnaA
SC3208     1      17        2.547493  hypothetical protein
SC3209     0      18        1.669315  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3202    DHBDHDRGNASE  38     3e-05    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.
Length = 261

Score = 38.1 bits (88), Expect = 3e-05
Identities = 26/92 (28%), Positives = 39/92 (42%), Gaps = 2/92 (2%)

Query: 156 AQGCEGKNVIIVGAGT-IGLLALQCARELGARSVTAIDINPQKLELAKALGATHTCNSRE 214
A+G EGK I GA IG + GA + A+D NP+KLE + ++
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAH-IAAVDYNPEKLEKVVSSLKAEARHAEA 61

Query: 215 MTADDIQTALSDIQFDQLVLETAGTPQTVSLA 246
AD +A D ++ E V++A
Sbjct: 62 FPADVRDSAAIDEITARIEREMGPIDILVNVA 93


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC3205    IGASERPTASE  33     0.003    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.5 bits (76), Expect = 0.003
Identities = 23/120 (19%), Positives = 38/120 (31%), Gaps = 8/120 (6%)

Query: 289 QAVEMQPAAAPDAPVEPGVEETQPQMTNGVASPSQASVSDLTDDAPAQSATPVSAPQTPP 348
Q+ +QP A P +P V +PQ + A + + PV+ T
Sbjct: 1135 QSETVQPQAEPARENDPTVNIKEPQSQTN----TTADTEQPAKETSSNVEQPVTESTTVN 1190

Query: 349 ATASAPADPSAELKIYDTSSQPLD-QVLAQVQQDGASIVVGPLLKNNVEALMKSNTPLNV 407
S +P ++QP + ++ V + N A SN V
Sbjct: 1191 TGNSVVENPENTT---PATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTV 1247


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC3207    RTXTOXINA  28     0.028    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.
Length = 1024

Score = 28.0 bits (62), Expect = 0.028
Identities = 26/111 (23%), Positives = 44/111 (39%), Gaps = 22/111 (19%)

Query: 42 NKILCCGNGTSAANAQHFAASMINRFETERPSLPAIALNTDNVVLTAIA-------NDRL 94
K+L GN + A T + IA + V AI+ D+
Sbjct: 277 TKVL--GNVGKGISQYIIAQRAAQGLSTSAAAAGLIA----SAVTLAISPLSFLSIADKF 330

Query: 95 HD----EVYAKQVRALGHAGDVLLAISTRGNSRDIVKAVEAAVTRDMTIVA 141
E Y+++ + LG+ GD LLA + A++A++T T++A
Sbjct: 331 KRANKIEEYSQRFKKLGYDGDSLLAAFHKETG-----AIDASLTTISTVLA 376


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3209    NUCEPIMERASE  30     0.007    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 29.8 bits (67), Expect = 0.007
Identities = 14/55 (25%), Positives = 22/55 (40%), Gaps = 16/55 (29%)

Query: 15 VLITGATGLVGGHLLRMLINTPQVSAIAAPTRRPLTDIVGV--YNP-HDPQLTDA 66
L+TGA G +G H+ + L+ +VG+ N +D L A
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGH-------------QVVGIDNLNDYYDVSLKQA 44


97  SC3323  SC3330  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC3323    -1      18       -5.576257  Fis family transcriptional regulator
SC3324    -1      18       -5.751871  methyltransferase
SC3325    -1      22       -5.184998  hypothetical protein
SC3326    -1      22       -4.934267  DNA-binding transcriptional regulator EnvR
SC3327     1      21       -3.870324  hypothetical protein
SC3328     1      19       -2.898068  multidrug transporter acriflavin resistance
SC3329     2      24       -3.738404  hypothetical protein
SC3330     1      26       -2.255873  outer membrane lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3323    DNABINDNGFIS  157    3e-54    DNA-binding protein FIS signature.
		>DNABINDNGFIS#DNA-binding protein FIS signature.

Length = 98

Score = 157 bits (399), Expect = 3e-54
Identities = 98/98 (100%), Positives = 98/98 (100%)

Query: 1 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ 60
MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ
Sbjct: 1 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ 60

Query: 61 PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98
PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN
Sbjct: 61 PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3326    HTHTETR  129    2e-39    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 129 bits (325), Expect = 2e-39
Identities = 83/216 (38%), Positives = 130/216 (60%), Gaps = 3/216 (1%)

Query: 1 MAKKTKADALKTRQHLIETAIAQFALRGVANTTLNDIADAADVTRGAIYWHFENKTQLFN 60
MA+KTK +A +TRQH+++ A+ F+ +GV++T+L +IA AA VTRGAIYWHF++K+ LF+
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EVW-LQQPPLRELIQDRLTGCWNDNPLQDLREKFIAALQYIAAVPRQQALMQILYHKCEF 119
E+W L + + EL + +PL LRE I L+ R++ LM+I++HKCEF
Sbjct: 61 EIWELSESNIGELELEYQAKF-PGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 120 HNGM-ISEQAIREKIGFHHQSLLEVLQRCMDKKLISGSLDLDVILIILHGSFSGIVKNWL 178
M + +QA R + + + L+ C++ K++ L II+ G SG+++NWL
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWL 179

Query: 179 MNPTSYDLYKQAPALVDNLLKMLSPDGSVRQLMPNE 214
P S+DL K+A V LL+M ++R NE
Sbjct: 180 FAPQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3328    ACRIFLAVINRP  1386   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1386 bits (3590), Expect = 0.0
Identities = 914/1032 (88%), Positives = 972/1032 (94%)

Query: 1 MANFFIRRPIFAWVLAIILMMAGALAIMQLPVAQYPTIAPPAVSISATYPGADAQTVQDT 60
MANFFIRRPIFAWVLAIILMMAGALAI+QLPVAQYPTIAPPAVS+SA YPGADAQTVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120
VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGISVEKSSSSFLMVAGFVSDNPNTTQDDISDYVASNIKDSISRLNGVGDVQLFGA 180
EVQQQGISVEKSSSS+LMVAGFVSDNP TTQDDISDYVASN+KD++SRLNGVGDVQLFGA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWLDANLLNKYQLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRL 240
QYAMRIWLDA+LLNKY+LTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 KDPEEFGKVTLRVNTDGSVVHLKDVARIELGGENYNVVARINGKPASGLGIKLATGANAL 300
K+PEEFGKVTLRVN+DGSVV LKDVAR+ELGGENYNV+ARINGKPA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTATAIKAKLAELQPFFPQGMKVVYPYDTTPFVKISIHEVVKTLFEAIILVFLVMYLFLQ 360
DTA AIKAKLAELQPFFPQGMKV+YPYDTTPFV++SIHEVVKTLFEAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NIRATLIPTIAVPVVLLGTFAVLAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N+RATLIPTIAVPVVLLGTFA+LAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 MEDNLSPREATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTRAIYRQFSITIVSAMAL 480
MED L P+EATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGST AIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATLLKPVSAEHHEKKSGFFGWFNTRFDHSVNHYTNSVSGIVRNTGRY 540
SVLVALILTPALCATLLKPVSAEHHE K GFFGWFNT FDHSVNHYTNSV I+ +TGRY
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 541 LIIYLLIVVGMAVLFLRLPTSFLPEEDQGVFLTMIQLPSGATQERTQKVLDQVTHYYLNN 600
L+IY LIV GM VLFLRLP+SFLPEEDQGVFLTMIQLP+GATQERTQKVLDQVT YYL N
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 601 EKANVESVFTVNGFSFSGQGQNSGMAFVSLKPWEERNGEENSVEAVIARATRAFSQIRDG 660
EKANVESVFTVNGFSFSGQ QN+GMAFVSLKPWEERNG+ENS EAVI RA +IRDG
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 661 LVFPFNMPAIVELGTATGFDFELIDQGGLGHDALTKARNQLLGMVAKHPDLLVRVRPNGL 720
V PFNMPAIVELGTATGFDFELIDQ GLGHDALT+ARNQLLGM A+HP LV VRPNGL
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 721 EDTPQFKLDVDQEKAQALGISLSDINETISAALGGYYVNDFIDRGRVKKVYVQADAQFRM 780
EDT QFKL+VDQEKAQALG+SLSDIN+TIS ALGG YVNDFIDRGRVKK+YVQADA+FRM
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 781 LPGDINNLYVRSANGEMVPFSTFSSARWIYGSPRLERYNGMPSMELLGEAAPGRSTGEAM 840
LP D++ LYVRSANGEMVPFS F+++ W+YGSPRLERYNG+PSME+ GEAAPG S+G+AM
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 841 SLMENLASQLPNGIGYDWTGMSYQERLSGNQEPALYAISLIVVFLCLAALYESWSIPFSV 900
+LMENLAS+LP GIGYDWTGMSYQERLSGNQ PAL AIS +VVFLCLAALYESWSIP SV
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 901 MLVVPLGVVGALLAASLRGLNNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMEKEGRGLI 960
MLVVPLG+VG LLAA+L NDVYF VGLLTTIGLSAKNAILIVEFAKDLMEKEG+G++
Sbjct: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960

Query: 961 EATLEASRMRLRPILMTSLAFILGVMPLVISRGAGSGAQNAVGTGVMGGMLTATLLAIFF 1020
EATL A RMRLRPILMTSLAFILGV+PL IS GAGSGAQNAVG GVMGGM++ATLLAIFF
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 1021 VPVFFVVVKRRF 1032
VPVFFVV++R F
Sbjct: 1021 VPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits      Score  E-value  Comments
SC3330    adhesinb  29     0.001    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 29.0 bits (65), Expect = 0.001
Identities = 14/68 (20%), Positives = 26/68 (38%), Gaps = 10/68 (14%)

Query: 1 MKR---LIPVALLTTLLAGCAHDSPCVPVYDDQGRLVHTNTCMKGTTQDNWETAGAIAGG 57
MK+ L+ + L LA C+ + +V TN+ + T++ IAG
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKN-------IAGD 53

Query: 58 AAAVAGLT 65
+ +
Sbjct: 54 KINLHSIV 61


98  SC3376  SC3380  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC3376     4      45       -2.261990  leader peptidase HopD
SC3377     6      53       -2.270112  bacterioferritin
SC3378     7      55       -1.334608  bacterioferritin-associated ferredoxin
SC3379     7      54       -0.793222  elongation factor Tu
SC3380     5      43       -0.657873  elongation factor G
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3376    PREPILNPTASE  148    2e-46    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.
Length = 290

Score = 148 bits (376), Expect = 2e-46
Identities = 59/143 (41%), Positives = 86/143 (60%), Gaps = 2/143 (1%)

Query: 46 ALPFLIFYASFSLLLGIYDARTGLLPDRFTCPLLWGGLLYHQICLPERLPDALWGAIAGY 105
L L+ + L D LLPD+ T PLLWGGLL++ + L DA+ GA+AGY
Sbjct: 134 TLAALLLT-WVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDAVIGAMAGY 192

Query: 106 GGFALIYWGYRLRYQKEGLGYGDVKYLAALGAWHCWETLPLLVFLAAMLACGGFGVALLV 165
+YW ++L KEG+GYGD K LAALGAW W+ LP+++ L++++ G+ L++
Sbjct: 193 LVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLV-GAFMGIGLIL 251

Query: 166 RGKSALINPLPFGPWLAVAGFIT 188
P+PFGP+LA+AG+I
Sbjct: 252 LRNHHQSKPIPFGPYLAIAGWIA 274


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC3377    HELNAPAPROT  36     1e-05    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.
Length = 153

Score = 36.4 bits (84), Expect = 1e-05
Identities = 18/103 (17%), Positives = 43/103 (41%), Gaps = 10/103 (9%)

Query: 44 EYHESIDEMKHADKYIERILFLEGIPN--LQDLGKL------GIGEDVEEMLRSDLRLEL 95
E ++ E D ER+L + G P +++ + G EM+++ +
Sbjct: 52 ELYDHAAE--TVDTIAERLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVNDYK 109

Query: 96 EGAKDLREAIAYADSVHDYVSRDMMIEILADEEGHIDRLETEL 138
+ + + + I A+ D + D+ + ++ + E + L + L
Sbjct: 110 QISSESKFVIGLAEENQDNATADLFVGLIEEVEKQVWMLSSYL 152


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC3379    TCRTETOQM  80     3e-18    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.
Length = 639

Score = 79.5 bits (196), Expect = 3e-18
Identities = 57/198 (28%), Positives = 87/198 (43%), Gaps = 13/198 (6%)

Query: 13 VNVGTIGHVDHGKTTLTAAI------TTVLAKTYGGAARAFDQIDNAPEEKARGITINTS 66
+N+G + HVD GKTTLT ++ T L G R DN E+ RGITI T
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRT----DNTLLERQRGITIQTG 59

Query: 67 HVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGPMPQTREHILLGRQV 126
+ +D PGH D++ + + +DGAIL+++A DG QTR R++
Sbjct: 60 ITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKM 119

Query: 127 GVPYIIVFLNKCDMVDDEELLELVEMEVRELLSQYDFPGDDTPIVRGSALKALEGDAEWE 186
G+P I F+NK D + L V +++E LS + + +W+
Sbjct: 120 GIP-TIFFINKIDQNGID--LSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWD 176

Query: 187 AKIIELAGFLDSYIPEPE 204
I L+ Y+
Sbjct: 177 TVIEGNDDLLEKYMSGKS 194


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC3380    TCRTETOQM  616    0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.
Length = 639

Score = 616 bits (1591), Expect = 0.0
Identities = 178/698 (25%), Positives = 305/698 (43%), Gaps = 81/698 (11%)

Query: 9 RYRNIGISAHIDAGKTTTTERILFYTGVNHKIGEVHDGAATMDWMEQEQERGITITSAAT 68
+ NIG+ AH+DAGKTT TE +L+ +G ++G V G D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 TAFWSGMAKQYEPHRINIIDTPGHVDFTIEVERSMRVLDGAVMVYCAVGGVQPQSETVWR 128
+ W ++NIIDTPGH+DF EV RS+ VLDGA+++ A GVQ Q+ ++
Sbjct: 62 SFQWEN-------TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFH 114

Query: 129 QANKYKVPRIAFVNKMDRMGANFLKVVGQIKTRLGANPVPLQLAIGAEEGFTGVVDLVKM 188
K +P I F+NK+D+ G + V IK +L A V Q V M
Sbjct: 115 ALRKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQ----------KVELYPNM 164

Query: 189 KAINWNDADQGVTFEYEDIPADMQDLANEWHQNLIESAAEASEELMEKYLGGEELTEEEI 248
N+ +++Q ++ E +++L+EKY+ G+ L E+
Sbjct: 165 CVTNFTESEQ------------------------WDTVIEGNDDLLEKYMSGKSLEALEL 200

Query: 249 KQALRQRVLNNEIILVTCGSAFKNKGVQAMLDAVIDYLPSPVDVPAINGILDDGKDTPAE 308
+Q R N + V GSA N G+ +++ + + S
Sbjct: 201 EQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH----------------- 243

Query: 309 RHASDDEPFSALAFKIATDPFVGNLTFFRVYSGVVNSGDTVLNSVKTARERFGRIVQMHA 368
FKI L + R+YSGV++ D+V S K + + +
Sbjct: 244 ---RGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEK-EKIKITEMYTSIN 299

Query: 369 NKREEIKEVRAGDIAAAIG----LKDVTTGDTLCDPENPIILERMEFPEPVISIAVEPKT 424
+ +I + +G+I L V GDT P+ ER+E P P++ VEP
Sbjct: 300 GELCKIDKAYSGEIVILQNEFLKLNSV-LGDTKLLPQR----ERIENPLPLLQTTVEPSK 354

Query: 425 KADQEKMGLALGRLAKEDPSFRVWTDEESNQTIIAGMGELHLDIIVDRMKREFNVEANVG 484
+E + AL ++ DP R + D +++ I++ +G++ +++ ++ +++VE +
Sbjct: 355 PQQREMLLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIK 414

Query: 485 KPQVAYREAIRAKVTDIEGKHAKQSGGRGQYGHVVIDMYPLEPGSNPKGYEFINDIKGGV 544
+P V Y E K E + + + + + PL GS G ++ + + G
Sbjct: 415 EPTVIYMERPLKKA---EYTIHIEVPPNPFWASIGLSVSPLPLGS---GMQYESSVSLGY 468

Query: 545 IPGEYIPAVDKGIQEQLKSGPLAGYPVVDLGVRLHFGSYHDVDSSELAFKLAASIAFKEG 604
+ + AV +GI+ + G L G+ V D + +G Y+ S+ F++ A I ++
Sbjct: 469 LNQSFQNAVMEGIRYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQV 527

Query: 605 FKKAKPVLLEPIMKVEVETPEENTGDVIGDLSRRRGMLKGQESEVTGVKIHAEVPLSEMF 664
KKA LLEP + ++ P+E D + + + + V + E+P +
Sbjct: 528 LKKAGTELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQ 587

Query: 665 GYATQLRSLTKGRASYTMEFLKYDDAPNNVAQAVIEAR 702
Y + L T GR+ E Y + V + R
Sbjct: 588 EYRSDLTFFTNGRSVCLTELKGYHVT---TGEPVCQPR 622


99  SC3386  SC3398  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC3386    -1      14       -0.765474  hypothetical protein
SC3387    -2      13       0.571697   FKBP-type peptidylprolyl isomerase
SC3388    0       12       2.318057   hypothetical protein
SC3389    0       13       2.419074   FKBP-type peptidylprolyl isomerase
SC3390    0       14       1.904109   hypothetical protein
SC3391    -1      14       2.092474   glutathione-regulated potassium-efflux system
SC3392    1       17       2.089820   glutathione-regulated potassium-efflux system
SC3393    0       19       2.048573   ABC transporter ATP-binding protein
SC3394    -2      14       0.269499   ABC transporter ATPase
SC3395    -1      15       -0.323841  hypothetical protein
SC3396    -1      13       0.696704   hydrolase
SC3397    -1      13       0.861453   hypothetical protein
SC3398    -1      12       1.075653   phosphoribulokinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3386    ACRIFLAVINRP  29     0.024    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.7 bits (64), Expect = 0.024
Identities = 14/62 (22%), Positives = 29/62 (46%), Gaps = 1/62 (1%)

Query: 164 ASSVEDLVTQTLEFTIEEVNADRNV-SNNAKNRQIVLNLYEKGIFDIKDAINQVADRLNI 222
A +V+D VTQ +E + ++ + S + + + L + D A QV ++L +
Sbjct: 54 AQTVQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQL 113

Query: 223 SK 224
+
Sbjct: 114 AT 115
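
The input parameters of this report list an RPSBlast E-value of 0.05, and every hit shown has an `Expect` value below that figure. The following is a minimal sketch of how a score line such as the one above could be checked against that cutoff. The function names are hypothetical, and whether PredictBias applies the cutoff as "less than" or "less than or equal to" is an assumption.

```python
import re

# Matches lines such as "Score = 28.7 bits (64), Expect = 0.024".
SCORE_RE = re.compile(r"Score = ([\d.]+) bits \((\d+)\), Expect = (\S+)")

# "E-value (RPSBlast)" from the input parameters of this report.
E_VALUE_THRESHOLD = 0.05


def parse_score_line(line):
    """Return (bit_score, raw_score, expect) parsed from a BLAST score line."""
    match = SCORE_RE.search(line)
    if match is None:
        raise ValueError(f"not a score line: {line!r}")
    return float(match.group(1)), int(match.group(2)), float(match.group(3))


def passes_threshold(expect, threshold=E_VALUE_THRESHOLD):
    """Assumed rule: a hit is kept when its E-value does not exceed the cutoff."""
    return expect <= threshold


bits, raw, expect = parse_score_line("Score = 28.7 bits (64), Expect = 0.024")
print(bits, raw, expect, passes_threshold(expect))
```

Scientific notation such as `1e-04` and the value `0.0` are both accepted by `float`, so the same parser covers the strong and the marginal hits in this listing.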


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3387    INFPOTNTIATR  128    2e-38    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 128 bits (323), Expect = 2e-38
Identities = 80/226 (35%), Positives = 121/226 (53%), Gaps = 9/226 (3%)

Query: 28 AAKPAATADSKAAFKNDDQKAAYALGASLGRYMENSLKEQEKLGIKLDKDQLIAGVQDAF 87
A A A + D K +Y++GA LG K + GI ++ D L G+QD
Sbjct: 14 AMSTAMAATDATSLTTDKDKLSYSIGADLG-------KNFKNQGIDINPDVLAKGMQDGM 66

Query: 88 A-DKSKLSDQEIEQTLQTFEARVKSAAQAKMEKDAADNEAKGKTFRDAFAKEKGVKTSST 146
+ + L++++++ L F+ + + A+ K A +N+AKG F A + G+ +
Sbjct: 67 SGAQLILTEEQMKDVLSKFQKDLMAKRSAEFNKKAEENKAKGDAFLSANKSKPGIVVLPS 126

Query: 147 GLLYKVEKEGTGEAPKDSDTVVVNYKGTLIDGKEFDNSYTRGEPLSFRLDGVIPGWTEGL 206
GL YK+ GTG P SDTV V Y GTLIDG FD++ G+P +F++ VIPGWTE L
Sbjct: 127 GLQYKIIDAGTGAKPGKSDTVTVEYTGTLIDGTVFDSTEKAGKPATFQVSQVIPGWTEAL 186

Query: 207 KNIKKGGKIKLVIPPALAYGKTGVPG-IPANSTLVFDVELLDIKPA 251
+ + G ++ +P LAYG V G I N TL+F + L+ +K A
Sbjct: 187 QLMPAGSTWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISVKKA 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3392    ISCHRISMTASE  28     0.025    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 27.7 bits (61), Expect = 0.025
Identities = 35/138 (25%), Positives = 52/138 (37%), Gaps = 22/138 (15%)

Query: 11 YAHPESQDSVANRVLLKPAIQHNNVTVHDLYARYPDFFID--TPYEQ-----ALLREHDV 63
Y P + D N+V P + +HD+ + D F +P + L+ V
Sbjct: 9 YQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCV 68

Query: 64 IVFQH--PLYTYSCPALLKEWLDRVLSRGFASGPGGNQLVGKYWRSVITTGEPESA---- 117
Q P+ + P DR L F GPG N G Y +IT PE
Sbjct: 69 ---QLGIPVVYTAQPGSQNP-DDRALLTDFW-GPGLNS--GPYEEKIITELAPEDDDLVL 121

Query: 118 --YRYDALNRYPMSDVLR 133
+RY A R + +++R
Sbjct: 122 TKWRYSAFKRTNLLEMMR 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3393    PYOCINKILLER  31     0.019    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.9 bits (69), Expect = 0.019
Identities = 21/85 (24%), Positives = 33/85 (38%), Gaps = 7/85 (8%)

Query: 522 VQKQENQADDAPKENNANSAQSRKDQKRREAELRTLT---QPLRKEITRLEKEMEKLNAQ 578
+ E + A +E N N ++ RE E T + + I+ L+ M L A
Sbjct: 151 TRTAEEIGEQAVREGNINGPEAYMRFLDREMEGLTAAYNVKLFTEAISSLQIRMNTLTAA 210

Query: 579 LA----QAEEKLGDSSLYDPSRKAE 599
A A K + + + RKAE
Sbjct: 211 KASIEAAAANKAREQAAAEAKRKAE 235


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3397    FLGFLIH  25     0.024    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 25.1 bits (54), Expect = 0.024
Identities = 17/46 (36%), Positives = 23/46 (50%), Gaps = 3/46 (6%)

Query: 3 IPWQGLAPDTLDNLIESFV---LREGTDYGEHERSLEQKVADVKRQ 45
+PW+ PD L FV E T E E SLEQ++A ++ Q
Sbjct: 5 LPWKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQ 50


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3398    PF07299  36     1e-04    Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 36.0 bits (83), Expect = 1e-04
Identities = 10/46 (21%), Positives = 21/46 (45%), Gaps = 2/46 (4%)

Query: 71 PEANDFSLLEHTFIEYGQTGKGQSRKYLHTYDEAVPWNQVPGTFTP 116
P+ + + E ++ KG SRK++ ++ + + GTF
Sbjct: 112 PDMEELDMKELSY--LSWIDKGSSRKFIIAKNDKNKFVGLQGTFQS 155
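
The `Sbjct` coordinates and the `Length` line together show how much of the signature a hit actually spans. The sketch below computes that fraction for the alignment above. It is an illustration only: `aligned_span` is a hypothetical helper, and "coverage" is not a quantity that PredictBias itself reports.

```python
import re

# Matches coordinate lines such as "Sbjct: 112 PDMEEL...QS 155".
COORD_RE = re.compile(r"^(Query|Sbjct): (\d+) (\S+) (\d+)$")


def aligned_span(lines, label):
    """Return (first, last) residue positions covered by all rows with this label."""
    starts, ends = [], []
    for line in lines:
        match = COORD_RE.match(line.strip())
        if match and match.group(1) == label:
            starts.append(int(match.group(2)))
            ends.append(int(match.group(4)))
    if not starts:
        raise ValueError(f"no {label} rows found")
    return min(starts), max(ends)


alignment = [
    "Query: 71 PEANDFSLLEHTFIEYGQTGKGQSRKYLHTYDEAVPWNQVPGTFTP 116",
    "Sbjct: 112 PDMEELDMKELSY--LSWIDKGSSRKFIIAKNDKNKFVGLQGTFQS 155",
]
signature_length = 219  # from the "Length = 219" line of this hit

first, last = aligned_span(alignment, "Sbjct")
coverage = (last - first + 1) / signature_length
print(first, last, round(coverage, 3))
```

For this hit the aligned region covers positions 112 to 155 of a 219-residue signature, roughly a fifth of its length, which helps put a marginal E-value in context.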


100  SC3475  SC3486  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC3475    -1      13       -3.393702  dehydrogenase
SC3476    0       14       -3.153411  acetyltransferase YhhY
SC3477    0       12       -2.438540  sugar metabolism transcriptional regulator
SC3478    -1      10       0.946183   hypothetical protein
SC3479    -2      12       2.284634   inner membrane protein
SC3480    -2      17       2.979131   gamma-glutamyltranspeptidase
SC3481    -3      15       2.172862   hypothetical protein
SC3482    -3      14       2.031251   glycerophosphodiester phosphodiesterase
SC3483    -3      16       1.517337   glycerol-3-phosphate transporter ATP-binding
SC3484    -3      18       1.458331   glycerol-3-phosphate transporter membrane
SC3485    -2      16       2.221605   glycerol-3-phosphate transporter permease
SC3486    -2      19       2.773421   glycerol-3-phosphate transporter periplasmic
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC3475    MICOLLPTASE  30     0.015    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 30.1 bits (67), Expect = 0.015
Identities = 15/50 (30%), Positives = 25/50 (50%), Gaps = 3/50 (6%)

Query: 268 FAADESVGVLEYVNDDGVTVKEEVKPETGDYGRVYDALYQTLTVGTPNYV 317
++AD+ ++Y N DG + K + G+ Y +YQ GT NY+
Sbjct: 1052 YSADDLSNYVDYANADGNKLSNTCK---LNPGKYYLCVYQFENSGTGNYI 1098


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3476    SACTRNSFRASE  35     8e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 34.5 bits (79), Expect = 8e-05
Identities = 18/92 (19%), Positives = 33/92 (35%), Gaps = 16/92 (17%)

Query: 55 VACIDDIVVGHLSIQVTQRPRRSHVADFGICVDARWHNRGIASALIRTMID------MCD 108
+ +++ +G + I+ + + D + D R G+ +AL+ I+ C
Sbjct: 69 LYYLENNCIGRIKIRSNWN-GYALIEDIAVAKDYRKK--GVGTALLHKAIEWAKENHFCG 125

Query: 109 NWLRVDRIELTVFVDNEPAVAVYKKYGFEIEG 140
L I N A Y K+ F I
Sbjct: 126 LMLETQDI-------NISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC3480    STREPTOPAIN  31     0.012    Streptopain (C10) cysteine protease family signature.
		>STREPTOPAIN#Streptopain (C10) cysteine protease family signature.

Length = 398

Score = 31.2 bits (70), Expect = 0.012
Identities = 33/176 (18%), Positives = 57/176 (32%), Gaps = 22/176 (12%)

Query: 204 NSKAIFWKDGEPLKKGDKLVQKNLAKSLEMIAENGPDAFYKGAIADQIAGEMQ----KNG 259
+SK I + G P +++K + ++ G +A A M+ N
Sbjct: 154 DSKGIHYNQGNPYNLLTPVIEKVKPGEQSFVGQHA----ATGCVATATAQIMKYHNYPNK 209

Query: 260 GLMTKEDLASYKAVERTPISGDY-----RGYQVFSMPPPSSGGIHIVQILNILE------ 308
GL S + R Y ++ P SG VQ + I E
Sbjct: 210 GLKDYTYTLSSNNPYFNHPKNLFAAISTRQYNWNNILPTYSGRESNVQKMAISELMADVG 269

Query: 309 ---NFDMKKYGFGSADAMQIMAEAEKYAYADRSEYLGDSDFVKVPWQALTNKDYAK 361
+ D + + A E + Y + DF K W+A +K+ ++
Sbjct: 270 ISVDMDYGPSSGSAGSSRVQRALKENFGYNQSVHQINRGDFSKQDWEAQIDKELSQ 325


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3482    PF04619  28     0.027    Dr-family adhesin
		>PF04619#Dr-family adhesin

Length = 160

Score = 28.0 bits (62), Expect = 0.027
Identities = 11/60 (18%), Positives = 21/60 (35%), Gaps = 4/60 (6%)

Query: 29 VGAQYGHTMIEFDAKLSKDGEIFLLHDDNLERTSNGWGVAGELNWQD----LLRVDAGGW 84
+G ++ D + G+ FL+ D+N ++ W + D G W
Sbjct: 70 LGCDARQVALKADTDNFEQGKFFLISDNNRDKLYVNIRPTDNSAWTTDNGVFYKNDVGSW 129


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3483    PF05272  29     0.040    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.9 bits (64), Expect = 0.040
Identities = 10/29 (34%), Positives = 16/29 (55%)

Query: 33 IVMVGPSGCGKSTLLRMVAGLERVTSGDI 61
+V+ G G GKSTL+ + GL+ +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHF 627


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC3486    MALTOSEBP  40     1e-05    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 40.1 bits (93), Expect = 1e-05
Identities = 43/176 (24%), Positives = 70/176 (39%), Gaps = 17/176 (9%)

Query: 133 SGHLLSQPFNSSTPVLYYNKDAFKKAGLDPEQLPKTWQELADYTAKLRAAGMKCGYASGW 192
+G L++ P L YNKD PKTW+E+ +L+A G +
Sbjct: 126 NGKLIAYPIAVEALSLIYNKDLLPNP-------PKTWEEIPALDKELKAKGKSALMFNLQ 178

Query: 193 QGWIQLENFSAWNGLPFASKNNGFDGTDAVLEF--NKPEQVKHIALLEEMNKKGDFSYVG 250
+ + +A G F +N +D D ++ K + L++ + D Y
Sbjct: 179 EPYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDY-- 236

Query: 251 RKDESTEKFYNGDCAMTTASSGSLANIRQYAKFNYGVGMMPYDADIKGAPQNAIIG 306
+ F G+ AMT + +NI +K NYGV ++P KG P +G
Sbjct: 237 --SIAEAAFNKGETAMTINGPWAWSNIDT-SKVNYGVTVLP---TFKGQPSKPFVG 286


101  SC3500  SC3518  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
SC3500    2       14       3.985025  cell division protein FtsY
SC3501    1       15       4.035601  16S rRNA m(2)G966-methyltransferase
SC3502    1       16       3.960652  hypothetical protein
SC3503    2       14       3.657304  hypothetical protein
SC3504    1       15       3.593317  hypothetical protein
SC3505    1       15       3.917992  zinc/cadmium/mercury/lead-transporting ATPase
SC3506    0       14       1.531731  methyl-accepting transmembrane citrate/phenol
SC3507    2       16       1.697535  sulfur transfer protein SirA
SC3508    1       15       1.709710  hypothetical protein
SC3509    0       15       2.169660  hypothetical protein
SC3510    -2      14       3.220884  major facilitator superfamily transporter
SC3511    -2      14       3.408308  PerM family permease
SC3512    -2      15       4.089079  holo-(acyl carrier protein) synthase 2
SC3513    -2      14       3.343466  nickel responsive regulator
SC3514    -2      14       3.173602  ABC transporter ATP-binding protein
SC3515    -2      13       3.161690  multidrug ABC transporter permease/ATPase
SC3516    -2      12       1.144067  hypothetical protein
SC3517    -2      13       1.031761  hypothetical protein
SC3518    -2      16       1.214994  PiT family, low-affinity phosphate transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC3500    IGASERPTASE  30     0.024    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.024
Identities = 15/114 (13%), Positives = 34/114 (29%), Gaps = 2/114 (1%)

Query: 17 DKEQKQEQTEEQQIVEEQRPVEPPVETAADVDAQTPAHSKAETEAFAEEVVDVTEKVQES 76
+++ K E + Q++ + V P E + V Q + + +E T ++
Sbjct: 1109 EEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADT 1168

Query: 77 EKP-QPVEPEPAAAIETAAPQIAVEREELPLPEEVKDEAISPEEWQAEAETVEV 129
E+P + + + PE P + +
Sbjct: 1169 EQPAKETSSNVEQPVTESTTVNTGNSVV-ENPENTTPATTQPTVNSESSNKPKN 1221


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SC3503    SHIGARICIN  27     0.027    Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 26.7 bits (59), Expect = 0.027
Identities = 6/29 (20%), Positives = 16/29 (55%)

Query: 7 FFIIIIALIVVAASFRFVQQRREKAANEA 35
+++I AA ++F++Q+ K ++
Sbjct: 173 ALMVLIQSTSEAARYKFIEQQIGKRVDKT 201


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3505    ACRIFLAVINRP  30     0.039    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 30.2 bits (68), Expect = 0.039
Identities = 17/78 (21%), Positives = 34/78 (43%), Gaps = 3/78 (3%)

Query: 336 AEERRAPIERFIDRFSRIYTPVIMVIALLVTLIPPLMFDGGWQEWIYKGLTLLLIGCPCA 395
E++ P E S+I ++ + +L + P+ F GG IY+ ++ ++ A
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVS---A 477

Query: 396 LVISTPAAITSGLAAAAR 413
+ +S A+ A A
Sbjct: 478 MALSVLVALILTPALCAT 495


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3507    PF01206  101    2e-32    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 101 bits (254), Expect = 2e-32
Identities = 28/72 (38%), Positives = 42/72 (58%)

Query: 9 DHTLDALGLRCPEPVMMVRKTVRNMQTGETLLIIADDPATTRDIPGFCTFMEHDLLAQET 68
D +LDA GL CP P++ +KT+ M GE L ++A DP + +D F H+LL Q+
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 69 EGLPYRYLLRKA 80
E Y + L++A
Sbjct: 65 EDGTYHFRLKRA 76


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3508    PF04183  28     0.035    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 27.9 bits (62), Expect = 0.035
Identities = 17/91 (18%), Positives = 28/91 (30%), Gaps = 14/91 (15%)

Query: 121 LGQILDVHVFNRLRQNRRWWLAPTASTLFGNISDTLAFFFIAFWRSPDAFMAEHWMEIAL 180
LG I + L+ + +TL + + AE W+
Sbjct: 347 LGVIWRENPCRWLKPDES---PVLMATLMECDENNQPL--AGAYIDRSGLDAETWLT--- 398

Query: 181 VDYCFKVLISIIFFLPMYGVLL-----NMLL 206
V++ + L YGV L N+ L
Sbjct: 399 -QLFRVVVVPLYHLLCRYGVALIAHGQNITL 428


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3510    TCRTETA  48     3e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 48.3 bits (115), Expect = 3e-08
Identities = 75/403 (18%), Positives = 137/403 (33%), Gaps = 42/403 (10%)

Query: 13 LRLNLRIVSIVMFNFASYLTIGLPLAVLPGYVHD--AMGFSAFWAGLIISLQYFATLLSR 70
++ N ++ I+ + IGL + VLPG + D G++++L
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 71 PHAGRYADVLGPKKIVVFGLCGCFLSGFGYLLADIASAWPMISLLLLGLGRVILGI-GQS 129
P G +D G + +++ L G + Y + A L +L +GR++ GI G +
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAG---AAVDYAIMATAPF-----LWVLYIGRIVAGITGAT 112

Query: 130 FAGTGSTLWGVGVVGSLHIGRVISWNGIVTYGAMAMGAPLGVLCYAWGGLQGLALTVMGV 189
A G+ + + R + M G LG L G
Sbjct: 113 GAVAGAYIADITDGDER--ARHFGFMSACFGFGMVAGPVLGGLM----GGFSPHAPFFAA 166

Query: 190 ALLAILLAL----------PRPSVKANKGKPLPFRAVLGRVWLYGMALALA-----SAGF 234
A L L L + P + + +A +A
Sbjct: 167 AALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVG 226

Query: 235 GVIATFITLFYDAK-GWDGAAFALTLFSVAFVGT---RLLFPNGINRLGGLNVAMICFGV 290
V A +F + + WD ++L + + + ++ RLG M+
Sbjct: 227 QVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIA 286

Query: 291 EIIGLLLVGTAAMPWMAKIGVLLTGMGFSLVFPALGVVAVKAVPPQNQGAALATYTVFMD 350
+ G +L+ A WMA ++L + PAL + + V + QG +
Sbjct: 287 DGTGYILLAFATRGWMAFPIMVLLA-SGGIGMPALQAMLSRQVDEERQGQLQGSLAALTS 345

Query: 351 MSLGVTGPLAGLVMTWAGVPV----IYLAAAGLVAMALLLTWR 389
++ + GPL + A + ++A A L + L R
Sbjct: 346 LT-SIVGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3512    ENTSNTHTASED  34     2e-04    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 34.2 bits (78), Expect = 2e-04
Identities = 30/116 (25%), Positives = 54/116 (46%), Gaps = 9/116 (7%)

Query: 30 RRASWLAGRVLLSRALSPL---PEMVYGEQGKPAFSAGTPLWFNLSHSGDTIALLLSDEG 86
R+A LAGR+ AL + G++ +P + G L+ ++SH T ++S +
Sbjct: 46 RKAEHLAGRIAAVHALREVGVRTVPGMGDKRQPLWPDG--LFGSISHCATTALAVISRQ- 102

Query: 87 EVGCDIEVIRPRDNWRSLANAVFSLGEHAEMEAERPERQLADFWRI-WTRKEAIVK 141
+G DIE I + LA ++ E ++A LA + ++ KE++ K
Sbjct: 103 RIGIDIEKIMSQHTATELAPSIIDSDERQILQASLLPFPLAL--TLAFSAKESVYK 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3514    ABC2TRNSPORT  48     2e-08    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.
Length = 262

Score = 48.0 bits (114), Expect = 2e-08
Identities = 43/171 (25%), Positives = 73/171 (42%), Gaps = 7/171 (4%)

Query: 200 REREHGTVEHLLVMPVTPFEIMMAKV-WSMGLVVLVVSGLSLMLMVKGVLGVPIEGSIPL 258
R T E +L + +I++ ++ W+ L +G + +V LG + L
Sbjct: 93 RMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAG---IGVVAAALGY-TQWLSLL 148

Query: 259 FMLGV-ALSLFATTSIGIFMGTIARSMPQLGLLMILVLLPLQMLSGGSTPRESMPQAVQD 317
+ L V AL+ A S+G+ + +A S LV+ P+ LSG P + +P Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 318 IMLTMPTTHFVSLAQAILYRGAGLSIVWPQFLTLLAIGGVFFL-IALLRFR 367
+P +H + L + I+ + + + I FFL ALLR R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC3516    RTXTOXIND  71     7e-16    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 71.0 bits (174), Expect = 7e-16
Identities = 61/355 (17%), Positives = 115/355 (32%), Gaps = 75/355 (21%)

Query: 25 IATKIAGRIDTILVSEGQFVRQGEVLAKMDTRV----------------LQEQRLEAI-- 66
I + I+V EG+ VR+G+VL K+ L++ R + +
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSR 158

Query: 67 ----------------------------------AQIKEAESAVAAARALLEQRQSEMRA 92
Q ++ L+++++E
Sbjct: 159 SIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLT 218

Query: 93 AQSVVKQREAELDSVSKRHVRSRSLSQRGAVSVQQLDDDRAAAESARAALETAKAQVSAA 152
+ + + E R SL + A++ + + A L K+Q+
Sbjct: 219 VLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQI 278

Query: 153 KAAIEAARTSIIQ-------------AQTRVEAAQATERRIVADID--DSELKAPRDGRV 197
++ I +A+ QT T + S ++AP +V
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKV 338

Query: 198 -QYRVAEPGEVLSAGGRVLNMVDLSDVY-MTFFLPTEQAGLLKIGGEARLVLDAAPDLRI 255
Q +V G V++ ++ +V D +T + + G + +G A + ++A P R
Sbjct: 339 QQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRY 398

Query: 256 PATISFVASVAQFTPKTVETHDERLKLMFRVKARIPPELLRQHLEYVKTGLPGMA 310
V V +E D+RL L+F V I L + + GMA
Sbjct: 399 G---YLVGKVKNINLDAIE--DQRLGLVFNVIISIEENCLSTGNKNIPLS-SGMA 447


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3518    TYPE3IMSPROT  30     0.022    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.
Length = 354

Score = 30.1 bits (68), Expect = 0.022
Identities = 23/194 (11%), Positives = 57/194 (29%), Gaps = 40/194 (20%)

Query: 12 TGLLLLLALAFVLFYEAINGFHDTANAVATVIY------TRAMRSQLAVVMAAVFNFFGV 65
L++AL+ +L + F + + ++A+ + V+ F
Sbjct: 30 VSTALIVALSAMLMGLSDYYFEHFSKLMLIPAEQSYLPFSQALSYVVDNVLLEFFYLCFP 89

Query: 66 LLGGLSVAYAIVHML-------------------PTDLLLNMGSAHGLAMVFSMLLAAII 106
LL ++ H++ P + + S L +L ++
Sbjct: 90 LLTVAALMAIASHVVQYGFLISGEAIKPDIKKINPIEGAKRIFSIKSLVEFLKSILKVVL 149

Query: 107 WNLGTWYFGLPASSSHTLIGAIIGIGLTNAMMTGTSVVDALNIPKVINIFGSLIISPIVG 166
++ W + ++ + T + T ++ + L++ VG
Sbjct: 150 LSILIWIIIKG------NLVTLLQLP-TCGIECITPLLGQI--------LRQLMVICTVG 194

Query: 167 LVFAGGLIFLLRRY 180
V + Y
Sbjct: 195 FVVISIADYAFEYY 208


102  SC3538  SC3545  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC3538    -1      14       -0.653154  hypothetical protein
SC3539    -2      12       -0.968478  phage endolysin
SC3540    -2      12       -0.662977  LuxR family transcriptional regulator
SC3541    -1      12       0.012840   transposase of Tn10
SC3542    -1      14       1.466542   hypothetical protein
SC3543    -2      15       1.694315   MFS family transporter
SC3544    -3      13       0.971186   hypothetical protein
SC3545    -1      14       3.051411   diguanylate phosphodiesterase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3538    YERSSTKINASE  36     4e-04    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 35.9 bits (82), Expect = 4e-04
Identities = 24/89 (26%), Positives = 41/89 (46%), Gaps = 12/89 (13%)

Query: 21 RQASIEILLLLGIHTTEGKEPRWFMEQLEQARLNLGGWGAVAKKLRINDAQLSQFMLQLR 80
R + +++ LG+H+ G++P+ F E + L +G GA K S L +
Sbjct: 280 RASGEPVVIDLGLHSRSGEQPKGFTESFKAPELGVGNLGASEK---------SDVFLVVS 330

Query: 81 HLQQHVPQYDSGQEVSENQLLAALRFVTS 109
L + ++ E+ NQ LRF+TS
Sbjct: 331 TLLHCIEGFEKNPEIKPNQ---GLRFITS 356


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC3540    HTHFIS  33     6e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.9 bits (75), Expect = 6e-04
Identities = 23/138 (16%), Positives = 48/138 (34%), Gaps = 26/138 (18%)

Query: 11 GISIQSVGQAEELWQKIESAPDALVMLDSGLDAEFCREVLQRIAQQFPEVK-IIITAMDG 69
G ++ A LW+ I + LV+ D + E ++L RI + P++ ++++A
Sbjct: 27 GYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLLPRIKKARPDLPVLVMSA--- 83

Query: 70 SQKWLHEVMQFNVQAVVPRDSDAETFVLALNAVARGMMFLPGDWLNSTELESRDIKALSA 129
+ T + A A + P D + R AL+
Sbjct: 84 -------------------QNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR---ALAE 121

Query: 130 RQREILQMLAAGESNKQI 147
+R ++ + +
Sbjct: 122 PKRRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3543    TCRTETB  34     8e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.5 bits (79), Expect = 8e-04
Identities = 83/367 (22%), Positives = 126/367 (34%), Gaps = 59/367 (16%)

Query: 79 IGSALFGHFGDRVGRKVTLVASLLTMGISTVIIGLLPGYATIGIFAPLLLALARFGQGLG 138
IG+A++G D++G K LL GI G + G+ F+ LL +ARF QG G
Sbjct: 64 IGTAVYGKLSDQLGIK-----RLLLFGIIINCFGSVIGFVGHSFFS--LLIMARFIQGAG 116

Query: 139 LGGEWGGAALLATENAPPRKR----ALYGSFPQLGAPIGFFFANGTFLLLSW-------- 186
++ P R L GS +G +G + W
Sbjct: 117 AAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPM 176

Query: 187 -----LLTDEQFMSWGWRV--PF-IFSAVLVIIG-------------LYVRVSLHETPVF 225
+ + + R+ F I +L+ +G ++ VS+ +F
Sbjct: 177 ITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFLIF 236

Query: 226 AKVAAAKKQVKIPLGTLLTKHVRVTVLGTFIMLATYTLFYIMTVYSMTYSTAAAPVGLGL 285
K + G + VL I+ T F M Y M + +G
Sbjct: 237 VKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIG- 295

Query: 286 PRNEILWMLMMAVIGFGVMVPIAGLLADAFGRRKSMVIITTLIIL-FALFAFTPLLGSGN 344
+ I++ M+VI FG I G+L D G + I T + + F +F S
Sbjct: 296 --SVIIFPGTMSVIIFG---YIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWF 350

Query: 345 PALVFVFLLLGLSLMGL---TFGPMGALLPELFPTEVRYTGASFS-YNVSSILGASVAPY 400
++ VF+L GLS T L E GA S N +S L
Sbjct: 351 MTIIIVFVLGGLSFTKTVISTIVSSS-----LKQQEA---GAGMSLLNFTSFLSEGTGIA 402

Query: 401 IAAWLQS 407
I L S
Sbjct: 403 IVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SC3545    SALSPVBPROT  29     0.016    Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 29.3 bits (65), Expect = 0.016
Identities = 44/160 (27%), Positives = 63/160 (39%), Gaps = 30/160 (18%)

Query: 93 DFFTRHHLLASVNVDGPTLIAMRRQPDILAAMERLPWLRFELV----EHIRLPKDSSFAS 148
DF+ H +++ G T A R D AA WL E V EHI ++
Sbjct: 157 DFWLLHDSNGILHLLGKT--AAARLSDPQAASHTAQWLVEESVTPAGEHI------YYSY 208

Query: 149 MCEFGPLWLDDFGTGMANFSA---LSEVRYDYIKVALELFVMLRQSAEGRNLFTLLLQLM 205
+ E G + + SA LS+V+Y A +L++ + + LFTL+
Sbjct: 209 LAENGDNVDLNGNEAGRDRSAMRYLSKVQYGNATPAADLYLWTSATPAVQWLFTLVFDYG 268

Query: 206 NRYCRGVIVEGVETLEEWRDVQRSPAFAAQGYFLSRPVPL 245
R GV D Q PAF AQ +L+R P
Sbjct: 269 ER--------GV-------DPQVPPAFTAQNSWLARQDPF 293


103  SC3706  SC3710  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC3706    -1      14       1.796668   GntR family transcriptional regulator
SC3707    -1      17       2.811510   hypothetical protein
SC3708    -2      15       2.082153   sugar phosphate antiporter
SC3709    -2      12       1.779947   sensory histidine kinase UhpB
SC3710    -2      11       -0.345074  DNA-binding transcriptional activator UhpA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SC3706    CABNDNGRPT  28     0.030    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 28.4 bits (63), Expect = 0.030
Identities = 13/69 (18%), Positives = 27/69 (39%), Gaps = 9/69 (13%)

Query: 51 TVKKAVDQLVREGVLVQVQGKGTFVKKENVAYPLGEGLLSFAEALASQKINFTTSVITSR 110
++ +A Q+ RE V G F K N+ + F ++++S T V +
Sbjct: 49 SIDQAAAQITREN--VSWNGTNVFGKSANLTF-------KFLQSVSSIPSGDTGFVKFNA 99

Query: 111 LEPANRFVA 119
+ ++
Sbjct: 100 EQIEQAKLS 108


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3708    TCRTETB  36     3e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 36.0 bits (83), Expect = 3e-04
Identities = 27/168 (16%), Positives = 64/168 (38%), Gaps = 16/168 (9%)

Query: 49 FNIAQNDMISTYGLSMTELGMIGLGFSITYGVGKTLVSYYADGKNTKQFLPFMLILSAIC 108
N++ D+ + + + F +T+ +G + +D K+ L F +I++
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIIN--- 89

Query: 109 MLGFSASMGAGSTSLFLMIAFYALSGFFQSTGGSCSYSTI----TKWTPRRKRGTFLGFW 164
F + +G S F ++ + F Q G + + + ++ P+ RG G
Sbjct: 90 --CFGSVIGFVGHSFFSLLIM---ARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLI 144

Query: 165 NISHNLGGAGAAGVALFGANYLFDGHVIGMFIFPSIIALIVGFIGLRF 212
+G + A+Y+ + + + P +I +I ++
Sbjct: 145 GSIVAMGEGVGPAIGGMIAHYIHWSY---LLLIP-MITIITVPFLMKL 188


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3709    PF06580  39     3e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.1 bits (91), Expect = 3e-05
Identities = 30/142 (21%), Positives = 55/142 (38%), Gaps = 11/142 (7%)

Query: 378 LRPRQLDDLTLAQAIRSLLREMELESRGIVSHLDWRIDETALSESQRVTLFRVCQEGLNN 437
LR ++LA + + ++L S L + +V + Q + N
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPM-LVQTLVEN 266

Query: 438 IVKHA-----NASAVTLQGWQQDERLMLVIEDDGSGLPPGSHQ-QGFGLTGMRERVSALG 491
+KH + L+G + + + L +E+ GS + + G GL +RER+ L
Sbjct: 267 GIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLY 326

Query: 492 G---TLTISCTHG-TRVSVSLP 509
G + +S G V +P
Sbjct: 327 GTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC3710    HTHFIS  61     3e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.4 bits (149), Expect = 3e-13
Identities = 23/116 (19%), Positives = 45/116 (38%), Gaps = 5/116 (4%)

Query: 23 ITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGRGVQVCICDISMPDIS 82
T+ + DD +R+ Q L V + + + + D+ MPD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 83 GLELLSQLPK---GMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHT 135
+LL ++ K + +++S ++ +A GA +L K ELI +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


104  SC3739  SC3744  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
SC3739    -2      11       0.933594  chaperone protein TorD
SC3740    -3      11       1.477849  trimethylamine N-oxide reductase subunit
SC3741    -2      11       1.380654  trimethylamine N-oxide reductase, cytochrome
SC3742    -2      11       1.881030  DNA-binding transcriptional regulator TorR
SC3743    -1      11       1.660425  hybrid sensory histidine kinase TorS
SC3744    -3      12       1.575860  major facilitator superfamily D-galactonate
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3739    PF06872  29     0.021    EspG protein
		>PF06872#EspG protein

Length = 398

Score = 28.5 bits (63), Expect = 0.021
Identities = 14/54 (25%), Positives = 27/54 (50%)

Query: 111 LLLEAGMEVNDDFKEPADHLAIYLELLSHLHFSLGESFQQRRMNKLRQKTLSSL 164
L+L+A +++N D+K+P + + +LL L L + + Q L+ L
Sbjct: 29 LVLDATIKINSDYKKPWNEMTCAEKLLKILTLGLWNPKYSQDERQQFQGLLTVL 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC3742    HTHFIS  76     3e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 76.4 bits (188), Expect = 3e-18
Identities = 28/115 (24%), Positives = 56/115 (48%), Gaps = 1/115 (0%)

Query: 4 HIVIVEDEPVTQARLQAYFEQEGYSVSVTDSGAGLRDIMEHEHVSLILLDINLPDENGLM 63
I++ +D+ + L + GY V +T + A L + L++ D+ +PDEN
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 LTRALRER-STVGIILVTGRCDQIDRIVGLEMGADDYVTKPLELRELVVRVKNLL 117
L +++ + +++++ + + I E GA DY+ KP +L EL+ + L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC3743    HTHFIS  55     7e-10    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 55.2 bits (133), Expect = 7e-10
Identities = 25/137 (18%), Positives = 54/137 (39%), Gaps = 3/137 (2%)

Query: 681 RLLLIEDNMLTQRITAEMLTGKGVKVSVAESANDALRCLAEGESFDVALVDFDLPDYDGL 740
+L+ +D+ + + + L+ G V + +A R +A G+ D+ + D +PD +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD-GDLVVTDVVMPDENAF 63

Query: 741 TLAQQLMSLYPAMKRIGFSAH-VIDDNLRQRTAGLFCGIIQKPVPREELYRMIAHYLQGK 799
L ++ P + + SA ++ G + + KP EL +I L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAY-DYLPKPFDLTELIGIIGRALAEP 122

Query: 800 SHNARAMLNEHQLAGDM 816
+ ++ Q +
Sbjct: 123 KRRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3744    TCRTETA  47     8e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 47.1 bits (112), Expect = 8e-08
Identities = 65/384 (16%), Positives = 118/384 (30%), Gaps = 36/384 (9%)

Query: 66 AEMGYVFSAFAWLYTLCQIPGGWFLDRIGSRLTYFIAIFGWSVATLLQGFATGLLSLIGL 125
A G + + +A + C G DR G R +++ G +V + A L L
Sbjct: 43 AHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIG 102

Query: 126 RAITGIFEAPAFPANNRMVTSWFPEHERASAVGFYTSGQFVGLAFLTPLLIWIQEMLSWH 185
R + GI A + ERA GF ++ G+ P+L + S H
Sbjct: 103 RIVAGITGAT-GAVAGAYIADITDGDERARHFGFMSACFGFGMV-AGPVLGGLMGGFSPH 160

Query: 186 WVFIVTGGIGIIWSLVWFKVYQPPRLTKSLSQAELEYIRDGGGLVDGDAPAKKEARQPLT 245
F + + L + + P ++EA PL
Sbjct: 161 APFFAAAALNGLNFLTGCFLLPESHKGE-------------------RRPLRREALNPLA 201

Query: 246 KADWKLVFHRKLVGVYLGQFAVNSTLWFFLTWFPNYLTQEKGITALKAGFMTTV-PFLAA 304
W + + F + + + A G L +
Sbjct: 202 SFRWARGM-TVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHS 260

Query: 305 FFGVLLSGWLADKLVKKGFSLGVARKTPIICGLLISTC--IMGANYTNDPLWIMALMAIA 362
+++G +A +L + ++ G++ I+ A T + ++ +A
Sbjct: 261 LAQAMITGPVAARL---------GERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLA 311

Query: 363 FFGNGFASITWSLISSLAPMRLIGLTGGMFNFIGGLGGISVPLVIGYL-AQSYGFAPALV 421
G G ++ +++S G G + L I PL+ + A S
Sbjct: 312 SGGIGMPALQ-AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTWNGWA 370

Query: 422 YISVVALLGALSYILLVGDVKRVG 445
+I+ AL L G G
Sbjct: 371 WIAGAALYLLCLPALRRGLWSGAG 394


105  SC3895  SC3900  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC3895    0       20       1.469228   hypothetical protein
SC3896    -1      17       0.869714   coproporphyrinogen III oxidase
SC3897    -1      16       0.688654   nitrogen regulation protein NR(I)
SC3898    -1      13       -0.871924  nitrogen regulation protein NR(II)
SC3899    -1      15       -2.435316  glutamine synthetase
SC3900    -2      12       -3.665461  GTP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC3895    SECA  28     0.017    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 28.3 bits (63), Expect = 0.017
Identities = 14/74 (18%), Positives = 27/74 (36%)

Query: 11 KAFGKQRRKTREELNQEARDRKRLKKHRGHAPGSRAAGGNSASGGGNQNQQKDPRIGSKT 70
K + + EE+ + + R+ + +SA+ Q + ++G
Sbjct: 824 STLSKVQVRMPEEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRND 883

Query: 71 PVPLGVTEKVTQQH 84
P P G +K Q H
Sbjct: 884 PCPCGSGKKYKQCH 897


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC3897    HTHFIS  597    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 597 bits (1540), Expect = 0.0
Identities = 204/478 (42%), Positives = 299/478 (62%), Gaps = 11/478 (2%)

Query: 1 MQRGIVWVVDDDSSIRWVLERALAGAGLTCTTFENGNEVLAALASKTPDVLLSDIRMPGM 60
M + V DDD++IR VL +AL+ AG N + +A+ D++++D+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGLALLKQIKQRHPMLPVIIMTAHSDLDAAVSAYQQGAFDYLPKPFDIDEAVALVERAIS 120
+ LL +IK+ P LPV++M+A + A+ A ++GA+DYLPKPFD+ E + ++ RA++
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 HYQEQQQPRNIEVNGPTTDMIGEAPAMQDVFRIIGRLSRSSISVLINGESGTGKELVAHA 180
+ + + ++G + AMQ+++R++ RL ++ ++++I GESGTGKELVA A
Sbjct: 121 EPKRRPSKLEDDSQDGM-PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 181 LHRHSPRAKAPFIALNMAAIPKDLIESELFGHEKGAFTGANTIRQGRFEQADGGTLFLDE 240
LH + R PF+A+NMAAIP+DLIESELFGHEKGAFTGA T GRFEQA+GGTLFLDE
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 241 IGDMPLDVQTRLLRVLADGQFYRVGGYAPVKVDVRIIAATHQNLERRVQEGKFREDLFHR 300
IGDMP+D QTRLLRVL G++ VGG P++ DVRI+AAT+++L++ + +G FREDL++R
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 301 LNVIRIHLPPLRERREDIPRLARHFLQVAARELGVEAKLLHPETETALTRLAWPGNVRQL 360
LNV+ + LPPLR+R EDIP L RHF+Q A +E G++ K E + WPGNVR+L
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 361 ENTCRWLTVMAAGQEVLIQDLPGELFEASAPDSPSHLPPDSWATLLAQWADRALRS---- 416
EN R LT + + + + EL S + ++Q + +R
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 417 -----GHQNLLSEAQPELERTLLTTALRHTQGHKQEAARLLGWGRNTLTRKLKELGME 469
L E+E L+ AL T+G++ +AA LLG RNTL +K++ELG+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3898    PF06580  29     0.034    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.7 bits (64), Expect = 0.034
Identities = 33/189 (17%), Positives = 71/189 (37%), Gaps = 39/189 (20%)

Query: 171 IIEQADRLRNLVDRL-------LGPQHPGMHIT--ESIHKVAERVVALVSMELPDNVRLI 221
I+E + R ++ L L ++ + + V + + L S++ D ++
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYS-NARQVSLADELTVV-DSYLQLASIQFEDRLQFE 243

Query: 222 RDYDPSLPELPHDPEQIEQVLL-NIVRNALQALGPEGGEITLRTRTAFQLTLHGERYRLA 280
+P++ ++ P + Q L+ N +++ + L P+GG+I L+
Sbjct: 244 NQINPAIMDVQV-PPMLVQTLVENGIKHGIAQL-PQGGKILLKGT------KDNGTVT-- 293

Query: 281 ARIDVEDNGPGIPPHLQDTLFYPMVSGREGGTGLGLSIARNLIDQHAGK---IEFTSWPG 337
++VE+ G + ++ TG GL R + G I+ + G
Sbjct: 294 --LEVENTGSLALKNTKE------------STGTGLQNVRERLQMLYGTEAQIKLSEKQG 339

Query: 338 HTEFSVYLP 346
V +P
Sbjct: 340 KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC3900    TCRTETOQM  178    1e-50    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 178 bits (454), Expect = 1e-50
Identities = 100/448 (22%), Positives = 170/448 (37%), Gaps = 87/448 (19%)

Query: 4 NLRNIAIIAHVDHGKTTLVDKLLQQSGTFDARAETQE--RVMDSNDLEKERGITILAKNT 61
+ NI ++AHVD GKTTL + LL SG + D+ LE++RGITI T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 62 AIKWNDYRINIVDTPGHADFGGEVERVMSMVDSVLLVVDAFDGPMPQTRFVTKKAFAHGL 121
+ +W + ++NI+DTPGH DF EV R +S++D +L++ A DG QTR + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 122 KPIVVINKVDRPGARPDWVVDQVFD-------------LFVNLDATDEQLD--------- 159
I INK+D+ G V + + L+ N+ T+
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 160 --------------------------------FPIIYASALNGIAGLDHEDMAEDMTPLY 187
FP+ + SA N I G+D+ L
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNI-GIDN---------LI 231

Query: 188 QAIIDHVPAPDVDLDGPLQMQISQLDYNNYVGVIGIGRIKRGKVKPNQQVTIIDSEGKTR 247
+ I + + L ++ +++Y+ + R+ G + V I + E
Sbjct: 232 EVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-- 289

Query: 248 NAKVGKVLTHLGLERIDSDIAEAGDIIAITGLG-ELN--ISDTICDPQNVEALPALSVDE 304
K+ ++ T + E D A +G+I+ + +LN + DT PQ +
Sbjct: 290 --KITEMYTSINGELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQR----ERIENPL 343

Query: 305 PTVSMFFCVNTSPFCGKEGKFVTSRQILDRLNKELVHNVALRVEETEDADAFRVSGRGEL 364
P + + + D L LR +S G++
Sbjct: 344 PLLQTTVEPSKPQQREMLLDALLEISDSDPL---------LRYYVDSATHEIILSFLGKV 394

Query: 365 HLSVLIENMRRE-GFELAVSRPKVIFRE 391
+ V ++ + E+ + P VI+ E
Sbjct: 395 QMEVTCALLQEKYHVEIEIKEPTVIYME 422



Score = 32.5 bits (74), Expect = 0.005
Identities = 13/75 (17%), Positives = 29/75 (38%), Gaps = 1/75 (1%)

Query: 398 EPYENVTLDVEEQHQGSVMQALGERKGDLKNMNPDGKGRVRLDYVIPSRGLIGFRSEFMT 457
EPY + + +++ + ++ + V L IP+R + +RS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKN-NEVILSGEIPARCIQEYRSDLTF 595

Query: 458 MTSGTGLLYSTFSHY 472
T+G + + Y
Sbjct: 596 FTNGRSVCLTELKGY 610


106  SC3946  SC3956  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC3946     1      15        2.112673  superoxide dismutase
SC3947     0      13        1.607482  hypothetical protein
SC3948     0      14        0.952141  inner membrane protein
SC3949    -1      14        0.568873  two-component sensor protein
SC3950    -1      13       -0.586903  DNA-binding transcriptional regulator CpxR
SC3951    -2      12       -1.424758  repressor CpxP
SC3952    -1      12       -0.196067  ferrous iron efflux protein F
SC3953    -2      11        0.626268  6-phosphofructokinase
SC3954    -1      11        0.394664  sulfate transporter subunit
SC3955    -1      12        0.537963  CDP-diacylglycerol pyrophosphatase
SC3956    -1      15        0.227062  Na+:galactoside symporter family permease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC3946    UREASE  28     0.023    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 28.2 bits (63), Expect = 0.023
Identities = 11/27 (40%), Positives = 13/27 (48%), Gaps = 1/27 (3%)

Query: 2 SYTLPSLPYAYDALEPHFDKQTMAIHH 28
S T P+ PY + L H D M HH
Sbjct: 298 SSTNPTRPYTVNTLAEHLD-MLMVCHH 323


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3949    PF06580  29     0.037    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.1 bits (65), Expect = 0.037
Identities = 19/108 (17%), Positives = 37/108 (34%), Gaps = 28/108 (25%)

Query: 354 LENIVRNALRY------SHTKIEVGFSVDKDGITITVDDDGPGVSPEDREQIFRPFYRTD 407
++ +V N +++ KI + + D +T+ V++ G +E
Sbjct: 260 VQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE---------- 309

Query: 408 EARDRESGGTGLGLAIVESAMQQHRGWVKAD---DSPLGGLRLTLWLP 452
TG GL V +Q G +A G + + +P
Sbjct: 310 --------STGTGLQNVRERLQMLYG-TEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC3950    HTHFIS  94     2e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.1 bits (234), Expect = 2e-24
Identities = 36/128 (28%), Positives = 64/128 (50%), Gaps = 2/128 (1%)

Query: 3 KILLVDDDRELTSLLKELLEMEGFNVLVAHDGEQALELL-DDSIDLLLLDVMMPKKNGID 61
IL+ DDD + ++L + L G++V + + + DL++ DV+MP +N D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 TLKALRQTH-QTPVIMLTARGSELDRVLGLELGADDYLPKPFNDRELVARIRAILRRSHW 120
L +++ PV++++A+ + + + E GA DYLPKPF+ EL+ I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 121 SEQQQSSD 128
+ D
Sbjct: 125 RPSKLEDD 132


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC3952    ABC2TRNSPORT  28     0.030    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 28.4 bits (63), Expect = 0.030
Identities = 31/140 (22%), Positives = 58/140 (41%), Gaps = 27/140 (19%)

Query: 10 SRASIAATAMASALLLIKIFAWWYTGSVSILAALVD-SLVDIAASLTNLLVVRYSLQPAD 68
++A++A + + W S+L AL +L +A + ++V +L P+
Sbjct: 123 TKAALAGAGIGVVAAALGYTQWL-----SLLYALPVIALTGLAFASLGMVVT--ALAPSY 175

Query: 69 DEHTFGHGKAESLAALAQSMFISGSAL--------------FLFLTSIQNLIKPTPMNDP 114
D F ++L + +F+SG+ FL L+ +LI+P + P
Sbjct: 176 DYFIF----YQTLV-ITPILFLSGAVFPVDQLPIVFQTAARFLPLSHSIDLIRPIMLGHP 230

Query: 115 GVGIGVTVIALICTIILVTF 134
V + V AL I++ F
Sbjct: 231 VVDVCQHVGALCIYIVIPFF 250


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC3956    TCRTETB  29     0.041    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.1 bits (65), Expect = 0.041
Identities = 12/58 (20%), Positives = 24/58 (41%), Gaps = 1/58 (1%)

Query: 16 MAINVVIIAMQLLLAYFYTDIYGLSAADVGVLFVVVRMIDAII-DPAMGVLTDKLNTR 72
I + ++ Y D++ LS A++G + + + II G+L D+
Sbjct: 266 GIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPL 323


107  SC4051  SC4063  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC4051    -1      12        2.689683  transcriptional regulator HU subunit alpha
SC4052    -2      12        3.142976  hypothetical protein
SC4053    -3      12        2.672284  zinc resistance protein
SC4054    -2      12        1.628756  sensor protein ZraS
SC4055    -2      12        1.101892  transcriptional regulatory protein ZraR
SC4056    -1      15        1.247951  phosphoribosylamine--glycine ligase
SC4059    -2      12       -0.636179  *hypothetical protein
SC4060    -3      12       -0.408400  hypothetical protein
SC4061    -2      11        0.578610  homoserine O-succinyltransferase
SC4062    -2      19        2.549372  malate synthase
SC4063    -2      21        3.408434  isocitrate lyase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC4051    DNABINDINGHU  120    1e-39    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 120 bits (303), Expect = 1e-39
Identities = 49/89 (55%), Positives = 66/89 (74%)

Query: 2 NKTQLIDVIADKAELSKTQAKAALESTLAAITESLKEGDAVQLVGFGTFKVNHRAERTGR 61
NK LI +A+ EL+K + AA+++ +A++ L +G+ VQL+GFG F+V RA R GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGKEIKIAAANVPAFVSGKALKDAVK 90
NPQTG+EIKI A+ VPAF +GKALKDAVK
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAVK 91


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC4054    PF06580  39     5e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.7 bits (90), Expect = 5e-05
Identities = 48/268 (17%), Positives = 102/268 (38%), Gaps = 53/268 (19%)

Query: 294 IVLSALAAVLLATLLAFFWHQRYQRSHRELLDAMKRKEKLVAMGHLAAGVA----HEIRN 349
I+ + + + +LL F WH + +++++ + + L A A H + N
Sbjct: 120 IIFNVVVVTFMWSLLYFGWH--FFKNYKQAEIDQWKMASMAQEAQLMALKAQINPHFMFN 177

Query: 350 PLSSIKGLAKYFAERTPAGGESHELAQVM---TKEADRLNRVVSELLELVKPAHLTLQAV 406
L++I+ L + A L+++M + ++ +++ L +V ++L L ++
Sbjct: 178 ALNNIRALILEDPTK--AREMLTSLSELMRYSLRYSNARQVSLADELTVVD-SYLQLASI 234

Query: 407 NLNDIITHSLNLVSQDAQSREIQLRFTANETLKRIQADPDRLTQVLLNLYLNAI-HAIGR 465
D + + + ++Q+ P L Q L+ N I H I +
Sbjct: 235 QFEDRLQFENQI---NPAIMDVQV--------------PPMLVQTLVE---NGIKHGIAQ 274

Query: 466 Q---GTITVEAKESGTDRVIITVTDSGKGIAPDQLEAIFTPYFTTKADGTGLGLAVVQNI 522
G I ++ + V + V ++G + E TG GL V+
Sbjct: 275 LPQGGKILLKGTKDN-GTVTLEVENTGSLALKNTKE------------STGTGLQNVRER 321

Query: 523 IEQHGG---AIKVKSIEGKGAVFTIWLP 547
++ G IK+ +GK + +P
Sbjct: 322 LQMLYGTEAQIKLSEKQGKVNA-MVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC4055    HTHFIS  528    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 528 bits (1361), Expect = 0.0
Identities = 183/475 (38%), Positives = 256/475 (53%), Gaps = 37/475 (7%)

Query: 1 MIRGKIDILVVDDDVSHCTILQALLRGWGYNVALAYSGHDALAQVREKVFDLVLCDVRMA 60
M I LV DDD + T+L L GY+V + + + DLV+ DV M
Sbjct: 1 MTGATI--LVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 EMDGIATLKEIKALNPAIPILIMTAFSSVETAVEALKAGALDYLIKPLDFDRLQETLEKA 120
+ + L IK P +P+L+M+A ++ TA++A + GA DYL KP D L + +A
Sbjct: 59 DENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118

Query: 121 LAHTRETGAELPSASAAQFGMIGSSPAMQHLLNEIAMVAPSDATVLIHGDSGTGKELVAR 180
LA + ++L S ++G S AMQ + +A + +D T++I G+SGTGKELVAR
Sbjct: 119 LAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVAR 178

Query: 181 ALHACSARSDKPLVTLNCAALNESLLESELFGHEKGAFTGADKRREGRFVEADGGTLFLD 240
ALH R + P V +N AA+ L+ESELFGHEKGAFTGA R GRF +A+GGTLFLD
Sbjct: 179 ALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLD 238

Query: 241 EIGDISPLMQVRLLRAIQEREVQRVGSNQTISVDVRLIAATHRDLAEEVSAGRFRQDLYY 300
EIGD+ Q RLLR +Q+ E VG I DVR++AAT++DL + ++ G FR+DLYY
Sbjct: 239 EIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYY 298

Query: 301 RLNVVAIEMPSLRQRREDIPLLADHFLRRFAERNRKAVKGFTPQAMDLLIHYDWPGNIRE 360
RLNVV + +P LR R EDIP L HF+++ + VK F +A++L+ + WPGN+RE
Sbjct: 299 RLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLD-VKRFDQEALELMKAHPWPGNVRE 357

Query: 361 LENAIERAVVLLTGEYISERELPLAIAATPIKTECSGEIQP------------------- 401
LEN + R L + I+ + + + +
Sbjct: 358 LENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFA 417

Query: 402 ---------------LVEVEKEVILAALEKTGGNKTEAARQLGITRKTLLAKISR 441
L E+E +ILAAL T GN+ +AA LG+ R TL KI
Sbjct: 418 SFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRE 472


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC4060    SACTRNSFRASE  39     2e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 38.8 bits (90), Expect = 2e-06
Identities = 16/54 (29%), Positives = 22/54 (40%), Gaps = 5/54 (9%)

Query: 78 VDPDVRGQGIGKRLVEHALTLAP-----GLTTNVNEQNTQAVGFYKKMGFKVTG 126
V D R +G+G L+ A+ A GL + N A FY K F +
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC4063    BINARYTOXINB  32     0.008    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 31.6 bits (71), Expect = 0.008
Identities = 14/58 (24%), Positives = 23/58 (39%)

Query: 289 ETSTPDLELARRFADAIHAKYPGKLLAYNCSPSFNWQKNLDDKTIASFQQQLSDMGYK 346
ET+ PD+ L A P L Y + N D +T + + QL+++
Sbjct: 544 ETTKPDMTLKEALKIAFGFNEPNGNLQYQGKDITEFDFNFDQQTSQNIKNQLAELNAT 601


108  SC4166  SC4172  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC4166     0      13       -0.189508  aminoalkylphosphonic acid N-acetyltransferase
SC4167    -1      11       -0.987374  hypothetical protein
SC4168    -1      13       -0.226387  hypothetical protein
SC4169    -2      12       -1.095077  proline/glycine betaine transporter
SC4170    -1      13       -0.816313  sensor protein BasS/PmrB
SC4171    -1      12       -1.247027  DNA-binding transcriptional regulator BasR
SC4172    -1      12       -0.750907  cell division protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC4166    SACTRNSFRASE  34     1e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 33.8 bits (77), Expect = 1e-04
Identities = 20/86 (23%), Positives = 33/86 (38%), Gaps = 9/86 (10%)

Query: 61 LALRNGEVVGMISLHMQFHLHHANWIG--EIQELVVLPPMRGQKIGSQLLAWAEEEARQA 118
L +G I + +NW G I+++ V R + +G+ LL A E A++
Sbjct: 69 LYYLENNCIGRIKIR-------SNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKEN 121

Query: 119 GAELTELSTNIKRRDAHRFYLREGYK 144
L T A FY + +
Sbjct: 122 HFCGLMLETQDINISACHFYAKHHFI 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC4169    TCRTETA  43     2e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 42.9 bits (101), Expect = 2e-06
Identities = 53/284 (18%), Positives = 104/284 (36%), Gaps = 40/284 (14%)

Query: 85 FFGMLGDKYGRQKILAITIVIMSISTFCIGLIPSYATIGIWAPILLLLCKMAQGFSVGGE 144
G L D++GR+ +L +++ ++ + P +W +L + ++ G + G
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPF-----LW---VLYIGRIVAGIT-GAT 112

Query: 145 YTGASIFVAEYSPDRKR----GFMGSWLDFGSIAGFVLGAGVVVLISTIVGEENFLEWGW 200
A ++A+ + +R GFM + FG +AG VLG G++ S
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLG-GLMGGFSP------------ 159

Query: 201 RIPFFIALPLGIIGLYLRHALEETPAFQQHVDKLEQGDREGLQDGPKVSFKEIATKHWRS 260
PFF A L + L K E+ P SF+ +
Sbjct: 160 HAPFFAAAALNGLNFLTGCFLLPESH------KGERRPLRREALNPLASFRWARGMTVVA 213

Query: 261 LLSCIGLVIATNVTYYMLLTYMPSYLSHNLHYS-EDHGVLIIIAIMIGMLFVQPVMGLLS 319
L + ++ + + + H+ G+ + ++ L + G ++
Sbjct: 214 ALMAVFFIM--QLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVA 271

Query: 320 DRFGRRPFVIMGSIA-LFALAIPAFILINSNVIGLIFAGLLMLA 362
R G R +++G IA + AF + F +++LA
Sbjct: 272 ARLGERRALMLGMIADGTGYILLAFA----TRGWMAFPIMVLLA 311



Score = 37.9 bits (88), Expect = 7e-05
Identities = 37/164 (22%), Positives = 73/164 (44%), Gaps = 16/164 (9%)

Query: 286 LSHNLHYSEDHGVLI-IIAIMIGMLFVQPVMGLLSDRFGRRPFVIMGSIALFALAIPAFI 344
L H+ + +G+L+ + A+M PV+G LSDRFGRRP ++ ++L A+ I
Sbjct: 35 LVHSNDVTAHYGILLALYALM--QFACAPVLGALSDRFGRRPVLL---VSLAGAAVDYAI 89

Query: 345 LINSNVIGLIFAGLLMLAVILNCFTGVMASTLPAMFPTHIR---YSALAAAFNISVLIAG 401
+ + + +++ G ++A I V + + + R + ++A F ++AG
Sbjct: 90 MATAPFLWVLYIG-RIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFG-MVAG 147

Query: 402 LTPTLAAWLVESSQDLMMPAYYLMVIAVIGLVTGI-SMKETANR 444
P L + S P + + + +TG + E+
Sbjct: 148 --PVLGGLMGGFS--PHAPFFAAAALNGLNFLTGCFLLPESHKG 187


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC4170    PF06580  38     6e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 37.5 bits (87), Expect = 6e-05
Identities = 39/182 (21%), Positives = 78/182 (42%), Gaps = 34/182 (18%)

Query: 184 ARLDQMMDSVSQLLQLARVGQSFSSGNYQEVKLLEDV-ILPSYDELNTM-LETR-QQTLL 240
+ +M+ S+S+L++ S N ++V L +++ ++ SY +L ++ E R Q
Sbjct: 191 TKAREMLTSLSELMR-----YSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 241 LPESAADVVVRGDATLLRMLLRNLVENAHRY----SPEGTHITIHISADPDAI-MAVEDE 295
+ + DV V ML++ LVEN ++ P+G I + + D + + VE+
Sbjct: 246 INPAIMDVQV------PPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENT 299

Query: 296 GPGIDESKCGKLSEAFVRMDSRYGGIGLGLSIV-SRITQLHQGQFFLQNRTERTGTRAWV 354
G + + G GL V R+ L+ + ++ ++ A V
Sbjct: 300 GSLA--------------LKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 355 LL 356
L+
Sbjct: 346 LI 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC4171    HTHFIS  90     5e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.9 bits (223), Expect = 5e-23
Identities = 45/144 (31%), Positives = 68/144 (47%), Gaps = 1/144 (0%)

Query: 2 KILIVEDDTLLLQGLILAAQTEGYACDGFSTVRAAEHSLESGHYSLMVLDLGLPDEDGLH 61
IL+ +DD + L A GY S + +G L+V D+ +PDE+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 FLTRIRQKKYTLPVLILTARDTLNDRITGLDVGADDYLVKPFALEELHARI-RALLRRHN 120
L RI++ + LPVL+++A++T I + GA DYL KPF L EL I RAL
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 121 NQGESELTVGNLTLNIGRHQAWRD 144
+ E + +GR A ++
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAMQE 148


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC4172    BCTERIALGSPF  32     0.006    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 32.1 bits (73), Expect = 0.006
Identities = 39/163 (23%), Positives = 60/163 (36%), Gaps = 13/163 (7%)

Query: 80 CVFILVGAAAQYFILTYGIIIDHSMIANMMDTTPAETFALM-TPQMVLTLG---LSGVLA 135
CV +V A +L+ + +M P T LM V T G L +LA
Sbjct: 177 CVLTVVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLA 236

Query: 136 AVIAFWVKIRPATPRLRSGLYRLASVLISILLVILVAAFFYKDYASLFRNNKQLIKALSP 195
+AF V +R R+ L LI + L A + + + L + L++A+
Sbjct: 237 GFMAFRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRI 296

Query: 196 SNSIVASWSWYSHQRLANLPLVRIGEDAHRN--------PLML 230
S V S + H+ VR G H+ P+M
Sbjct: 297 S-GDVMSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMR 338


109  SC4234  SC4240  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC4234    -2      13        2.970522  N-acetylmuramoyl-L-alanine amidase
SC4235    -1      16        2.350702  DNA mismatch repair protein
SC4236     1      18        1.231570  tRNA delta(2)-isopentenylpyrophosphate
SC4237     4      23        1.091278  RNA-binding protein Hfq
SC4238     3      21        0.953230  GTPase HflX
SC4239     3      20        1.354885  FtsH protease regulator HflK
SC4240     4      19        1.226684  FtsH protease regulator HflC
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SC4234    PF03544  31     0.007    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 31.1 bits (70), Expect = 0.007
Identities = 16/65 (24%), Positives = 26/65 (40%), Gaps = 7/65 (10%)

Query: 130 PPPPPPPVVAKRVESAPRPTEPARNPFKSSDDRLTGVTSSNTVTRPAARASAGAGDKVVI 189
P P P P K+VE R +P + S + + RP + + A K V
Sbjct: 99 PKPKPKPKPVKKVEQPKRDVKPVESRPASPFE-------NTAPARPTSSTATAATSKPVT 151

Query: 190 AIDAG 194
++ +G
Sbjct: 152 SVASG 156


ORFs having significant similarity with Known Virulence factors
LocusTagHitsScoreE-valueComments
SC4235ALARACEMASE300.028 Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 30.1 bits (68), Expect = 0.028
Identities = 26/161 (16%), Positives = 57/161 (35%), Gaps = 18/161 (11%)

Query: 31 VENSLDAGATRVDIDIER---GGAKLIR-IRDNGCGIKKEELALALARHATSKIASLDDL 86
++ SLD A + ++ I R A++ ++ N G E + A+ + +L++
Sbjct: 5 IQASLDLQALKQNLSIVRQAATHARVWSVVKANAYGHGIERIWSAIGATDGFALLNLEEA 64

Query: 87 EAIISLGFRGEAL----------ASISSVSRLTLTSRTAEQAEAWQAYAEGRDMDVTVK- 135
+ G++G L I RLT + Q +A Q +D+ +K
Sbjct: 65 ITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAPLDIYLKV 124

Query: 136 -PAAHPVGTTLEVLDLFYNTPARRKFMRTEK--TEFNHIDE 173
+ +G + + + + + F +
Sbjct: 125 NSGMNRLGFQPDRVLTVWQQLRAMANVGEMTLMSHFAEAEH 165


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SC4238    SECA  33     0.002    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 33.3 bits (76), Expect = 0.002
Identities = 26/144 (18%), Positives = 55/144 (38%), Gaps = 6/144 (4%)

Query: 282 HVVDAADVRVQENIEAVNTVLEEIDAHEIPTLMVMNKIDMLDDFEPRIDRDEENK-PIRV 340
++D +DV N + IDA+ P + ++ + + R+ D + PI
Sbjct: 665 ELLDVSDVSETINSIREDVFKATIDAYIPPQSL--EEMWDIPGLQERLKNDFDLDLPIAE 722

Query: 341 WLSAQSGVGIPQLFQALTERLSGEVAQHTLRLPPQEGRLRSRFYQLQAIEKEWMEEDGSV 400
WL + + L + + + + + + R + LQ ++ W E ++
Sbjct: 723 WLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAAM 782

Query: 401 SLQVRMPIVDWRRLCKQEPALIEY 424
+R I R +++P EY
Sbjct: 783 D-YLRQGIH-LRGYAQKDP-KQEY 803


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC4240    PYOCINKILLER  29     0.030    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 29.0 bits (64), Expect = 0.030
Identities = 18/65 (27%), Positives = 30/65 (46%), Gaps = 3/65 (4%)

Query: 225 NRMRAEREAVARRHRSQGQEEAEKLRAAADYEVTK---TLAEAERQGRIMRGEGDAEAAK 281
N+ R + A A+R + + +RAA Y + +A A +G I +G A A+
Sbjct: 220 NKAREQAAAEAKRKAEEQARQQAAIRAANTYAMPANGSVVATAAGRGLIQVAQGAASLAQ 279

Query: 282 LFADA 286
+DA
Sbjct: 280 AISDA 284


110  SC4406  SC4412  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SC4406     0      13        1.193333  ribosomal-protein-alanine N-acetyltransferase
SC4407    -1      14        2.133371  nucleotidase
SC4408    -1      15        2.540582  peptide chain release factor 3
SC4409    -1      12        2.337300  hypothetical protein
SC4410     0      12        2.815725  hypothetical protein
SC4411    -1      12        2.774508  hypothetical protein
SC4412    -2      17        2.820436  deoxyribonuclease YjjV
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC4406    SACTRNSFRASE  48     8e-10    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 48.4 bits (115), Expect = 8e-10
Identities = 17/59 (28%), Positives = 29/59 (49%)

Query: 62 DEATLFNIAVDPDFQRRGLGRMLLEHLIDELEKRGVVTLWLEVRASNAAAIALYESLGF 120
A + +IAV D++++G+G LL I+ ++ L LE + N +A Y F
Sbjct: 88 GYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SC4408    TCRTETOQM  213    6e-64    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 213 bits (545), Expect = 6e-64
Identities = 109/452 (24%), Positives = 209/452 (46%), Gaps = 44/452 (9%)

Query: 12 KRRTFAIISHPDAGKTTITEKVLLFGQAIQTAGTVKGRGSSQHAKSDWMEMEKQRGISIT 71
K +++H DAGKTT+TE +L AI G+V ++D +E+QRGI+I
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGT----TRTDNTLLERQRGITIQ 57

Query: 72 TSVMQFPYHDCLVNLLDTPGHEDFSEDTYRTLTAVDCCLMVIDAAKGVEDRTRKLMEVTR 131
T + F + + VN++DTPGH DF + YR+L+ +D +++I A GV+ +TR L R
Sbjct: 58 TGITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALR 117

Query: 132 LRDTPILTFMNKLDRDIRDPMELLDEVENELKIGCAPITWPIGCGKLFKGVYHLYKDETY 191
P + F+NK+D++ D + +++ +L K +
Sbjct: 118 KMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVI------------------KQKVE 159

Query: 192 LYQTGKGHTIQEVRIVKGLNNPDLDAAVGEDLAQQLRDELELVQGASNEFDEELFLAGEI 251
LY E + + D + + ++ + + LEL Q S F +
Sbjct: 160 LYPNMCVTNFTESEQWDTVIEGN-DDLLEKYMSGKSLEALELEQEES-----IRFHNCSL 213

Query: 252 TPVFFGTALGNFGVDHMLDGLVAWAPAPMPRQTDTRTVEASEEKFTGFVFKIQANMDPKH 311
PV+ G+A N G+D++++ + + + + G VFKI+ K
Sbjct: 214 FPVYHGSAKNNIGIDNLIEVITNKFYSS---------THRGQSELCGKVFKIE--YSEK- 261

Query: 312 RDRVAFMRVVSGKYEKGMKLRQVRTGKDVVISDALTFMAGDRSHVEEAYPGDILGLHNHG 371
R R+A++R+ SG +R K + I++ T + G+ +++AY G+I+ L N
Sbjct: 262 RQRLAYIRLYSGVLHLRDSVRISEKEK-IKITEMYTSINGELCKIDKAYSGEIVILQNEF 320

Query: 372 TIQIGDTFTQGEMMKFTGIPNFA-PELFRRIRLKDPLKQKQLLKGLVQLSEEG-AVQVFR 429
+++ +++ P L + P +++ LL L+++S+ ++ +
Sbjct: 321 -LKLNSVLGDTKLLPQRERIENPLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYV 379

Query: 430 PISNNDLIVGAVGVLQFDVVVARLKSEYNVEA 461
+ +++I+ +G +Q +V A L+ +Y+VE
Sbjct: 380 DSATHEIILSFLGKVQMEVTCALLQEKYHVEI 411


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SC4410    CHANLCOLICIN  27     0.002    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 27.3 bits (60), Expect = 0.002
Identities = 16/49 (32%), Positives = 20/49 (40%), Gaps = 8/49 (16%)

Query: 4 WGIIFLVIALIA--------AALGFGGLAGTAAGAAKIVFVVGIVLFLV 44
W +FL + A AL F LAGT G I V GI+ +
Sbjct: 460 WKPLFLTLEKKAADAGVSYVVALLFSLLAGTTLGIWGIAIVTGILCSYI 508


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SC4412    UREASE  29     0.028    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 28.6 bits (64), Expect = 0.028
Identities = 32/141 (22%), Positives = 51/141 (36%), Gaps = 37/141 (26%)

Query: 6 IDTHCHFDFPPFTGDERASIQRACEAGVEKIIVPATEAA-------------HFPRVLAL 52
+D+H HF P I+ A +G+ ++ T A H R++
Sbjct: 133 MDSHIHFICP-------QQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIARMIEA 185

Query: 53 AARFPSLYAALGLHPIVIERHADDDPDKLQQALAQQQNVVAVGEIGLDLYRDDPQFARQE 112
A FP A G + P AL + V G L L+ D +
Sbjct: 186 ADAFPMNLAFAG-------KGNASLPG----ALVEM---VLGGATSLKLHED---WGTTP 228

Query: 113 RFLDAQLQLAKRYDLPVILHS 133
+D L +A YD+ V++H+
Sbjct: 229 AAIDCCLSVADEYDVQVMIHT 249



 
Contact Sachin Pundhir for Bugs/Comments.