PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
A) Input parameters
Genome: 2250.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
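
One way to read the three bias thresholds above is as per-ORF cutoffs. The sketch below is illustrative only: this page does not say how PredictBias combines the thresholds, so the rule used here (flag an ORF when its %GC bias reaches the cutoff, or when both dinucleotide and codon bias do) and the names `Thresholds` and `exceeds_thresholds` are assumptions, not the tool's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Cutoffs copied from the input parameters on this page."""
    dinucleotide: float = 2.0
    codon: float = 4.0
    gc_percent: float = 3.0


def exceeds_thresholds(dn_bias, cdn_bias, gc_bias, t=Thresholds()):
    """Return True if an ORF looks compositionally biased.

    ASSUMPTION: the OR/AND combination below is a guess made for
    illustration; the page lists the thresholds but not the rule.
    """
    gc_hit = abs(gc_bias) >= t.gc_percent
    dn_and_codon_hit = (abs(dn_bias) >= t.dinucleotide
                        and abs(cdn_bias) >= t.codon)
    return gc_hit or dn_and_codon_hit


# Values taken from two rows of the first island's table.
print(exceeds_thresholds(-1, 13, 3.446769))  # True: |%GC bias| >= 3
print(exceeds_thresholds(0, 19, 0.700763))   # False under the assumed rule
```

Under this assumed rule, individual ORFs inside a reported island can still fall below the cutoffs, which suggests the real tool evaluates regions rather than single ORFs.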
 
C) Potential GIs and PAIs in NC_010334
S.No  Start      End        Bias  Virulence  Insertion elements  Prediction
1     Shal_0045  Shal_0087  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0045  -1      13       3.446769   preprotein translocase subunit SecB
Shal_0046  -2      19       3.901968   NAD(P)H-dependent glycerol-3-phosphate
Shal_0047  -1      22       3.738405   hypothetical protein
Shal_0048  0       22       3.313975   hypothetical protein
Shal_0049  -2      21       3.330484   hypothetical protein
Shal_0050  0       19       2.327030   ABC transporter-like protein
Shal_0051  1       20       2.050455   ABC-2 type transporter
Shal_0052  2       23       1.570348   NLP/P60 protein
Shal_0053  2       25       1.564311   hypothetical protein
Shal_0054  1       18       1.002728   hypothetical protein
Shal_0055  1       19       1.355922   molybdenum cofactor biosynthesis protein MogA
Shal_0056  1       16       1.694524   hypothetical protein
Shal_0057  1       14       2.850660   hypothetical protein
Shal_0058  -1      14       2.560740   hypothetical protein
Shal_0059  -1      13       2.566683   methyl-accepting chemotaxis sensory transducer
Shal_0060  -2      17       3.267560   ABC-2 type transporter
Shal_0061  -1      16       2.994348   ABC transporter-like protein
Shal_0062  -1      15       2.235174   TAP domain-containing protein
Shal_0063  1       18       1.156375   AMP-dependent synthetase and ligase
Shal_0064  0       9        -2.312396  thioesterase superfamily protein
Shal_0065  1       10       -3.594353  hypothetical protein
Shal_0066  -1      10       -1.921397  thioesterase superfamily protein
Shal_0067  0       10       -1.254035  2,3-diketo-5-methylthio-1-phosphopentane
Shal_0068  -2      11       0.855412   lytic transglycosylase
Shal_0069  -3      13       1.823241   ATPase AAA
Shal_0070  -2      18       4.150326   hypothetical protein
Shal_0071  -1      19       4.921984   imidazolonepropionase
Shal_0072  -1      20       4.499978   histidine utilization repressor
Shal_0073  -1      20       4.587655   urocanate hydratase
Shal_0074  0       19       3.275843   histidine ammonia-lyase
Shal_0075  0       19       0.700763   N-acetyltransferase GCN5
Shal_0076  0       18       0.698481   LysR family transcriptional regulator
Shal_0077  1       24       0.274782   putative endoribonuclease L-PSP
Shal_0078  1       24       -0.091407  Na+/H+ antiporter NhaC
Shal_0079  2       23       -1.407678  1-aminocyclopropane-1-carboxylate deaminase
Shal_0080  2       17       -1.821543  hypothetical protein
Shal_0081  0       13       -0.461748  hypothetical protein
Shal_0082  -1      12       -0.330407  hypothetical protein
Shal_0083  -1      12       -0.181914  iron-containing alcohol dehydrogenase
Shal_0084  0       14       0.510386   hypothetical protein
Shal_0085  1       14       1.183027   mechanosensitive ion channel protein MscS
Shal_0086  2       16       2.272035   lysine exporter protein LysE/YggA
Shal_0087  1       16       3.097490   phospholipid/glycerol acyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0045  SECBCHAPRONE  207    1e-71    Bacterial protein-transport SecB chaperone protein ...
		>SECBCHAPRONE#Bacterial protein-transport SecB chaperone protein signature.

Length = 170

Score = 207 bits (528), Expect = 1e-71
Identities = 80/161 (49%), Positives = 113/161 (70%), Gaps = 4/161 (2%)

Query: 4 VANNEQ--QGPQFNIQRVYTKDISFETPNSPAVFQKEWNPEVKLDLDTRSNKLSDDVYEV 61
A + Q Q P IQR+Y KD+SFE PN P +FQ++W P++ DL T + ++ DD+YEV
Sbjct: 8 NAADTQATQQPVLQIQRIYVKDVSFEAPNLPHIFQQDWEPKLSFDLSTEAKQVGDDLYEV 67

Query: 62 VLSLTVTA--KNGEETAFLCEVQQAGIFAIAGLTEQQLAHSLGAYCPNVLFPYAREAIGS 119
L+++V ++ + AF+CEV+QAG+F I+GL E Q+AH L + CPN+LFPYARE + S
Sbjct: 68 CLNISVETTMESSGDVAFICEVKQAGVFTISGLEEMQMAHCLTSQCPNMLFPYARELVSS 127

Query: 120 LVSRGTFPQLNLAPVNFDALFAQYVQQRQAAAADAPAEEAN 160
LV+RGTFP LNL+PVNFDALF Y+Q+++ A E +
Sbjct: 128 LVNRGTFPALNLSPVNFDALFMDYLQRQEQAEQTTEEENKD 168


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0047  DNABINDNGFIS  35     1e-04    DNA-binding protein FIS signature.
		>DNABINDNGFIS#DNA-binding protein FIS signature.

Length = 98

Score = 34.6 bits (79), Expect = 1e-04
Identities = 27/104 (25%), Positives = 46/104 (44%), Gaps = 18/104 (17%)

Query: 249 NPGEAISINLLPGEDVRAKIQQALEASPKQSLRNALSQWLPKRLVEVLFDESLLNKALNQ 308
+ ++++ + +D Q+ L S KQ+L+N +Q + + + L +
Sbjct: 6 VNSDVLTVSTVNSQD--QVTQKPLRDSVKQALKNYFAQLNGQDVND-----------LYE 52

Query: 309 LVHAEREQLALDLESWTLVMNGTEGYRTAEVTLGGIDTDELSSK 352
LV AE EQ LD +VM T G +T + GI+ L K
Sbjct: 53 LVLAEVEQPLLD-----MVMQYTRGNQTRAALMMGINRGTLRKK 91


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0049  RTXTOXIND     44     5e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.0 bits (104), Expect = 5e-07
Identities = 21/152 (13%), Positives = 48/152 (31%), Gaps = 17/152 (11%)

Query: 45 VVVALPVAQGSIVTKGTVLVQLDDTQQKSHVAKALADVAQATANYEKLLKGAREEEIAAA 104
+V + V +G V KG VL++L ++ K + + QA +
Sbjct: 106 IVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLE---------QTRYQIL 156

Query: 105 RAAVAGTRARLQESEANYRRIASMAKDNLASKADLDRALASRDADAANLESAQENLLELV 164
++ + + ++ L + + ++ E +
Sbjct: 157 SRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLD------ 210

Query: 165 SGSREEDIRFALANLQASEAVLLGEQKKLDDL 196
+ + LA + E + E+ +LDD
Sbjct: 211 --KKRAERLTVLARINRYENLSRVEKSRLDDF 240


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0050  ANTHRAXTOXNA  29     0.025    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 29.3 bits (65), Expect = 0.025
Identities = 14/34 (41%), Positives = 20/34 (58%)

Query: 112 EQTARIQELLNTYGLEQKADQLAGSMSGGQKQRL 145
E+ + LL YG+E+K D G++S QKQ L
Sbjct: 524 EKQKGVTNLLIKYGIERKPDSTKGTLSNWQKQML 557


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0051  ABC2TRNSPORT  45     2e-07    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 44.9 bits (106), Expect = 2e-07
Identities = 44/161 (27%), Positives = 75/161 (46%), Gaps = 13/161 (8%)

Query: 194 GVILTMTMIMFT----SAAIVRERERGNLEMLITTPIRPIELMLGKII----PYMFIGIV 245
G++ T M T AA R + E ++ T +R +++LG++ G
Sbjct: 72 GMVATSAMTAATFETIYAAFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAG 131

Query: 246 QVIIILGLGYSVFDVPINGSLLQLAGATLLFIMASLTLGLVISTIAKSQLQSMQMTIFVL 305
++ LGY+ + SLL L +A +LG+V++ +A S + V+
Sbjct: 132 IGVVAAALGYTQWL-----SLLYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVI 186

Query: 306 LPSILLSGFMFPYEGMPIEAQYIAEALPATHFMRLIRGVVL 346
P + LSG +FP + +PI Q A LP +H + LIR ++L
Sbjct: 187 TPILFLSGAVFPVDQLPIVFQTAARFLPLSHSIDLIRPIML 227


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0065  SACTRNSFRASE  26     0.028    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 25.7 bits (56), Expect = 0.028
Identities = 14/73 (19%), Positives = 27/73 (36%), Gaps = 10/73 (13%)

Query: 9 DESRFVINVDGAQAVLAYELYEDQTAADSPGRRCNFN------NTYVPPELRGRGLAEKL 62
+ +V A A+ Y + R N+N + V + R +G+ L
Sbjct: 55 MDVSYVEEEGKA----AFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTAL 110

Query: 63 VRRGLKWAKAEGL 75
+ + ++WAK
Sbjct: 111 LHKAIEWAKENHF 123


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0071  UREASE        37     1e-04    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 37.4 bits (87), Expect = 1e-04
Identities = 14/33 (42%), Positives = 19/33 (57%)

Query: 348 LAGMTRNAAKALGIDDHVGVIEVGMTADFCIWN 380
+A T N A A G+ +G +EVG AD +WN
Sbjct: 406 IAKYTINPAIAHGLSHEIGSLEVGKRADLVLWN 438



Score = 34.3 bits (79), Expect = 8e-04
Identities = 19/61 (31%), Positives = 32/61 (52%), Gaps = 6/61 (9%)

Query: 23 YGAITDAALAVQDGKIAWVGKRSD---LPEFDVF---GTPIYKGKGGWITPGLIDAHTHL 76
+ I A + ++DG+IA +GK + P + GT + G+G +T G +D+H H
Sbjct: 80 HWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIVTAGGMDSHIHF 139

Query: 77 I 77
I
Sbjct: 140 I 140


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0075  SACTRNSFRASE  29     0.005    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 28.8 bits (64), Expect = 0.005
Identities = 21/127 (16%), Positives = 42/127 (33%), Gaps = 8/127 (6%)

Query: 14 TETIELYQANEWSSADKPDLLILALRNS-HTLVTARYEGKLVGIGNAISDGHLVVYYSHM 72
T T E + + + D+ + + E +G S+ + +
Sbjct: 36 TYTEERFSKPYFKQYEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRSNWNGYALIEDI 95

Query: 73 LVHPSLHGKGIGRKMMA-----AMQSVYCGFHQQMLTADGDAVEFYKALGFERAGKTEPM 127
V KG+G ++ A ++ +CG + + A FY F G + M
Sbjct: 96 AVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF-IIGAVDTM 154

Query: 128 WIYAGTE 134
+Y+
Sbjct: 155 -LYSNFP 160


2     Shal_0104  Shal_0112  Y     N          N                   Genomic Island
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0104  3       33       1.619514   formate dehydrogenase region TAT target
Shal_0105  3       34       1.556996   molybdopterin oxidoreductase
Shal_0106  4       35       1.239575   4Fe-4S ferredoxin
Shal_0107  3       34       1.206565   formate dehydrogenase subunit gamma
Shal_0108  6       43       1.866200   formate dehydrogenase region TAT target
Shal_0109  6       45       1.903933   molybdopterin oxidoreductase
Shal_0110  6       42       1.519105   4Fe-4S ferredoxin
Shal_0111  4       36       0.936249   formate dehydrogenase subunit gamma
Shal_0112  3       30       1.632986   formate dehydrogenase region TAT target
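
The island summary rows on this page pair the Bias and Virulence flags with a Prediction label: rows with Bias Y and Virulence Y are labelled "Pathogenicity Island (biased-composition)", rows with Bias Y and Virulence N are labelled "Genomic Island", and the Insertion elements flag does not change the label in any row shown. A hedged sketch of that mapping follows. The function names are hypothetical, and flag combinations that do not appear on this page raise an error rather than guess at a label.

```python
def flags_from_text(text):
    """Turn a flag string such as 'Y Y N' into three booleans."""
    parts = text.split()
    if len(parts) != 3 or any(p not in ("Y", "N") for p in parts):
        raise ValueError(f"expected three Y/N flags, got {text!r}")
    bias, virulence, insertion = (p == "Y" for p in parts)
    return bias, virulence, insertion


def classify_island(bias, virulence):
    """Map the flags to the labels observed in this page's summary rows.

    ASSUMPTION: only the two combinations visible on this page are
    covered; this is not the tool's documented decision procedure.
    """
    if bias and virulence:
        return "Pathogenicity Island (biased-composition)"
    if bias and not virulence:
        return "Genomic Island"
    raise ValueError("flag combination not shown on this page")


bias, virulence, _insertion = flags_from_text("Y N N")
print(classify_island(bias, virulence))  # Genomic Island
```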
3     Shal_0173  Shal_0182  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0173  3       21       0.814850   major facilitator transporter
Shal_0174  3       22       0.657632   two component LuxR family transcriptional
Shal_0175  4       23       0.422679   histidine kinase
Shal_0176  5       26       0.892163   major facilitator transporter
Shal_0177  4       25       1.275985   nitrate reductase subunit alpha
Shal_0178  2       19       0.500366   nitrate reductase subunit beta
Shal_0179  2       14       -0.896385  nitrate reductase molybdenum cofactor assembly
Shal_0180  2       17       -0.598158  respiratory nitrate reductase subunit gamma
Shal_0181  2       16       -0.656218  PpiC-type peptidyl-prolyl cis-trans isomerase
Shal_0182  2       18       -0.848029  HPP family protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0174  HTHFIS        70     2e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 70.2 bits (172), Expect = 2e-16
Identities = 32/122 (26%), Positives = 52/122 (42%), Gaps = 2/122 (1%)

Query: 1 MPIAKILLVDDHPALRQGLAQLMSLEESLDVVAQASSGEEAISLAVQYEPDIILLDLNMR 60
M A IL+ DD A+R L Q +S DV S+ + D+++ D+ M
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAG-YDVRI-TSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 GLSGIDTLISLKNAQVKSKVIIFTVSDNESDVVQAIKFNTDGYLLKDSEPEELIEKIHLA 120
+ D L +K A+ V++ + + ++A + YL K + ELI I A
Sbjct: 59 DENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118

Query: 121 LQ 122
L
Sbjct: 119 LA 120


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0175  PF06580       31     0.008    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.4 bits (71), Expect = 0.008
Identities = 16/94 (17%), Positives = 32/94 (34%), Gaps = 8/94 (8%)

Query: 481 IRFSYDLPQHCLSANQEIHLLQIIREGLSNIHRHA---KASEAGVRLS--PKGELIKLDI 535
++F + + L+Q + E N +H + L + L++
Sbjct: 240 LQFENQINPAIMDVQVPPMLVQTLVE---NGIKHGIAQLPQGGKILLKGTKDNGTVTLEV 296

Query: 536 WDNGAGLPGTLQTQGHFGLGIMRERVSSLKGQMY 569
+ G+ + GL +RER+ L G
Sbjct: 297 ENTGSLALKNTKESTGTGLQNVRERLQMLYGTEA 330


4     Shal_0226  Shal_0236  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0226  2       9        -0.043609  carnitine operon protein CaiE
Shal_0227  2       10       -1.629994  alpha/beta hydrolase domain-containing protein
Shal_0228  1       12       -3.393154  LysR family transcriptional regulator
Shal_0229  3       14       -2.841674  disulfide bond formation protein DsbB
Shal_0230  3       14       -3.590839  DSBA oxidoreductase
Shal_0231  5       13       -3.212916  arylsulfotransferase
Shal_0232  3       12       -2.891186  hypothetical protein
Shal_0233  2       11       -1.848690  TetR family transcriptional regulator
Shal_0234  2       11       -0.347915  disulfide bond formation protein DsbB
Shal_0235  2       13       -0.099891  DSBA oxidoreductase
Shal_0236  2       12       -0.029783  arylsulfotransferase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0233  HTHTETR       50     8e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 50.4 bits (120), Expect = 8e-10
Identities = 17/58 (29%), Positives = 31/58 (53%)

Query: 3 SKTLEHILNVSEQLIYKDGVIGFKFCTVAKEAGISTTSLYKFFGNKEDILVALASKSF 60
+T +HIL+V+ +L + GV +AK AG++ ++Y F +K D+ + S
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67


5     Shal_0251  Shal_0265  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0251  2       18       1.367746   protoheme IX farnesyltransferase
Shal_0252  2       16       1.873264   cytochrome oxidase assembly
Shal_0253  2       13       1.679397   hypothetical protein
Shal_0254  2       12       1.471505   hypothetical protein
Shal_0255  2       12       1.570762   cytochrome c oxidase subunit III
Shal_0256  3       13       2.651698   cytochrome C oxidase assembly protein
Shal_0257  4       13       2.603669   cytochrome c oxidase subunit I type
Shal_0258  4       13       2.410379   cytochrome c oxidase subunit II
Shal_0259  3       16       2.683146   hypothetical protein
Shal_0260  3       18       2.639312   LexA repressor
Shal_0261  3       21       2.780382   FGGY-family pentulose kinase
Shal_0262  2       20       2.325837   monosaccharide-transporting ATPase
Shal_0263  2       20       2.115194   monosaccharide-transporting ATPase
Shal_0264  2       22       2.171503   ABC transporter-like protein
Shal_0265  3       23       1.747457   putative periplasmic-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0255  THERMOLYSIN   29     0.026    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 28.8 bits (64), Expect = 0.026
Identities = 17/53 (32%), Positives = 28/53 (52%), Gaps = 4/53 (7%)

Query: 219 GLTLNSGIYGNTFFLLT---GFHGMHVT-LGTVFLIVLFFRVLKGHFTPDKHF 267
G+ NSGI +LL+ +G+ VT +G + +F+R L + TP +F
Sbjct: 457 GVHTNSGIINKAAYLLSQGGVHYGVSVTGIGRDKMGKIFYRALVYYLTPTSNF 509


6     Shal_0280  Shal_0303  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0280  2       24       -1.369671  hypothetical protein
Shal_0281  1       23       -0.627880  hypothetical protein
Shal_0282  5       17       1.637122   hypothetical protein
Shal_0283  4       18       2.420960   hypothetical protein
Shal_0284  3       16       2.126029   NapC/NirT cytochrome c domain-containing
Shal_0285  3       16       2.286349   hypothetical protein
Shal_0286  4       16       2.475123   putative methyltransferase
Shal_0287  0       16       1.335312   signal recognition particle-docking protein
Shal_0288  -1      14       -0.279819  cell division ATP-binding protein FtsE
Shal_0289  -2      11       0.778867   hypothetical protein
Shal_0290  -1      12       1.342778   RNA polymerase factor sigma-32
Shal_0291  0       14       1.980263   hypothetical protein
Shal_0292  0       13       2.373491   proprotein convertase P
Shal_0293  3       17       2.176249   hypothetical protein
Shal_0294  2       16       1.300467   o-succinylbenzoate--CoA ligase
Shal_0295  2       12       -0.478169  O-succinylbenzoate synthase
Shal_0296  1       12       -1.497986  alpha/beta hydrolase fold protein
Shal_0297  0       11       -2.267261  2-succinyl-5-enolpyruvyl-6-hydroxy-3-
Shal_0298  0       13       -3.866604  LysR family transcriptional regulator
Shal_0299  0       14       -3.166764  LysR family transcriptional regulator
Shal_0300  2       15       -1.658984  hypothetical protein
Shal_0301  2       13       -0.379444  hypothetical protein
Shal_0302  2       11       -0.711325  4Fe-4S ferredoxin
Shal_0303  2       12       0.237166   polysulfide reductase NrfD
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0287  IGASERPTASE   55     7e-10    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 55.1 bits (132), Expect = 7e-10
Identities = 40/241 (16%), Positives = 79/241 (32%), Gaps = 13/241 (5%)

Query: 12 KDKSNEEAQA-----VEAAKLEAEKLELEEAEKLELERAEAERLAAEQAETERLAEEARL 66
SN E A A E E ++ EQ TE A+ +
Sbjct: 1009 SVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREV 1068

Query: 67 AVEQAEAERIESERVEAERHAAEQAEAQRVA---EETAAEQARLAAELAAQAETQRIEAE 123
A E + ++ E + +E E Q T ++ + E E ++ ++
Sbjct: 1069 AKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQ 1128

Query: 124 RLESEREAAEQAEGQRIEAARLESERVIAEQAEAQ---RVATEQASQAEAARIEAERIEA 180
S ++ + + E AR V ++ ++Q TEQ ++ ++ +E E+
Sbjct: 1129 --VSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTES 1186

Query: 181 ARIESERVVAEQAESARIAAEQAEAQRIEATRIEAARIEAERIEAARIEAARIEAARIEA 240
+ + V E E+ A Q + + + + R +E A +
Sbjct: 1187 TTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRST 1246

Query: 241 E 241

Sbjct: 1247 V 1247



Score = 49.3 bits (117), Expect = 4e-08
Identities = 27/184 (14%), Positives = 59/184 (32%), Gaps = 17/184 (9%)

Query: 102 AEQARLAAELAAQAETQRIEAERLESEREAAEQAEGQRIEAARLESERVIAEQAEAQRVA 161
E+ + I+A+ E A + +E E VA
Sbjct: 985 VEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPP-APATPSETTET--VA 1041

Query: 162 TEQASQAEAARIEAERIEAARIESERVVAEQAESARIAAEQAEAQRIEATRIEAARIEAE 221
+++ + ++ V E + + + E + + E E +
Sbjct: 1042 ENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETK 1101

Query: 222 RIEAARIEAARIEAARIEAERIEAERIEAERIEAERIEAEAADQAAQLEAAEEQ-QPEPQ 280
E A +E E A++E E+ + E ++ ++ + + Q E + Q +P +
Sbjct: 1102 --ETATVEKE--EKAKVETEKTQ---------EVPKVTSQVSPKQEQSETVQPQAEPARE 1148

Query: 281 AKPV 284
P
Sbjct: 1149 NDPT 1152


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0292  SUBTILISIN    117    3e-31    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 117 bits (295), Expect = 3e-31
Identities = 68/357 (19%), Positives = 109/357 (30%), Gaps = 89/357 (24%)

Query: 31 SDPLYKHQWHLKNTGQAAFSENGGKAGNDLKTKQAHKKGALGQGVQISVVDTGLQIDHID 90
Q N G + + G+GV+++V+DTG DH D
Sbjct: 8 IPYQVIKQEQQVNEIPR-----GVEMIQ---APAVWNQTR-GRGVKVAVLDTGCDADHPD 58

Query: 91 LADNVVPG-SMDLITGSDY--PEDTHGHGTAVGGIIAAVGFNGEGVRGIAPKAGINGFNF 147
L ++ G + D +D +GHGT V G IAA N GV G+AP+A +
Sbjct: 59 LKARIIGGRNFTDDDEGDPEIFKDYNGHGTHVAGTIAATE-NENGVVGVAPEADLLIIKV 117

Query: 148 LFEQ--SMNSWLLSHGYGEGTENTELFNQSYGSQALFIPQFDLENDPQLAIENAVMEDVA 205
L +Q W++ Y + ++ + S G + + A++ AV
Sbjct: 118 LNKQGSGQYDWIIQGIYYAIEQKVDIISMSLGG-------PEDVPELHEAVKKAVA---- 166

Query: 206 SNSNNGRGAAFVSSAGNSYNYYGYAGYYFLPADYFDVEASNHGLPMQNSNMTNSKATHWS 265
+ +AGN + P Y
Sbjct: 167 ------SQILVMCAAGNEGDGDDRTDELGYPGCYN-----------------------EV 197

Query: 266 TTVSAINANGERSSYSTVGANVFMAATGGEYGSETPAMVTTDLMGCDSGFNDSTGLGKNG 325
+V AIN + S +S V + A G +++T G
Sbjct: 198 ISVGAINFDRHASEFSNSNNEVDLVAPGE-------DILSTVPGGK-------------- 236

Query: 326 LHGGTADDPNCNYTSVMNGTSSAAPSMVGAIAAVMSANPQLTTRDAKHVLALTAKKT 382
+ +GTS A P + GA+A + RD
Sbjct: 237 -------------YATFSGTSMATPHVAGALALIKQLANASFERDLTEPELYAQLIK 280


7     Shal_0360  Shal_0383  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0360  -2      22       3.764326   ribonuclease BN
Shal_0361  -1      22       4.424551   prolyl aminopeptidase
Shal_0362  0       25       4.630931   hypothetical protein
Shal_0363  -1      26       5.177311   D-tyrosyl-tRNA(Tyr) deacylase
Shal_0365  2       26       5.063488   rhodanese domain-containing protein
Shal_0366  1       24       4.710042   3-oxoacyl-(acyl carrier protein) synthase II
Shal_0367  2       26       4.137896   3-ketoacyl-ACP reductase
Shal_0368  3       27       3.996389   beta-hydroxyacyl-(acyl-carrier-protein)
Shal_0369  3       28       4.114924   3-oxoacyl-ACP synthase
Shal_0370  2       29       4.720029   hypothetical protein
Shal_0371  1       30       5.343638   tryptophan halogenase
Shal_0372  1       31       5.809446   hypothetical protein
Shal_0373  0       31       6.032309   outer membrane lipoprotein carrier protein LolA
Shal_0374  -1      29       5.773163   4-hydroxybenzoyl-CoA thioesterase
Shal_0375  -1      29       5.635353   histidine ammonia-lyase
Shal_0376  1       28       4.833877   glycosyl transferase family protein
Shal_0377  0       25       4.034473   beta-hydroxyacyl-(acyl-carrier-protein)
Shal_0378  0       22       3.894531   hypothetical protein
Shal_0379  0       17       2.584289   hypothetical protein
Shal_0380  1       17       3.767540   acyl carrier protein
Shal_0381  0       15       3.198465   acyl carrier protein
Shal_0382  -1      15       3.126191   phospholipid/glycerol acyltransferase
Shal_0383  -1      16       3.060966   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0367  DHBDHDRGNASE  107    2e-30    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 107 bits (269), Expect = 2e-30
Identities = 71/257 (27%), Positives = 118/257 (45%), Gaps = 29/257 (11%)

Query: 3 KRVLITGSSRGIGKAIALKLAASGHDIAMHFHSNQTAADATKAELEQLEIKVSCLQF--- 59
K ITG+++GIG+A+A LA+ G I A D +LE++ +
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHI--------AAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 60 ----DIADRAAVKQAIEQDIEQHGAYYGVVLNAGINADTAFPAMTESEWDSVVHTNLDGF 115
D+ D AA+ + + + G +V AG+ ++++ EW++ N G
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 116 YNVIHPTVMPMVQGRQGGRIITLASVSGIAGNRGQVNYSASKAGIIGATKALSLELAKRK 175
+N +V + R+ G I+T+ S Y++SKA + TK L LELA+
Sbjct: 121 FNASR-SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYN 179

Query: 176 ITVNCIAPGLIETDM-----VSDIPKEMV--------NNLVPMRRMGKPSEIAGLANYLM 222
I N ++PG ETDM + E V +P++++ KPS+IA +L+
Sbjct: 180 IRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLV 239

Query: 223 SDDAAYITRQVISVNGG 239
S A +IT + V+GG
Sbjct: 240 SGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0372  ACRIFLAVINRP  46     9e-07    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 45.6 bits (108), Expect = 9e-07
Identities = 29/145 (20%), Positives = 57/145 (39%), Gaps = 25/145 (17%)

Query: 675 LLALALLIAGIIFTFRFGVRLAAVV-VAVPA--LSALLTLACLGVSNNPLTLFHALALIL 731
L +L+ +++ F +R + +AVP L LA G S N LT+F ++L
Sbjct: 344 LFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMF---GMVL 400

Query: 732 VFGIGIDYSL----------------FFAESKQQTRGVMMAVFMSAMSTLLAFGLLAF-- 773
G+ +D ++ +++ + A+ AM F +AF
Sbjct: 401 AIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFG 460

Query: 774 -SQTPAINAFGLTLLLGIGFTFLLS 797
S F +T++ + + L++
Sbjct: 461 GSTGAIYRQFSITIVSAMALSVLVA 485



Score = 41.0 bits (96), Expect = 2e-05
Identities = 49/301 (16%), Positives = 104/301 (34%), Gaps = 32/301 (10%)

Query: 203 VLLKTIAPKVLNQSPQFAAIVMAKGRDSAFNPNAQQLQLSALTAAFDAVKQQDSDI---- 258
V LK +A +V + I G+ +A +AL A A+K + +++
Sbjct: 260 VRLKDVA-RVELGGENYNVIARINGKPAAGLGIKLATGANALDTA-KAIKAKLAELQPFF 317

Query: 259 -EIIKAGALFHATAATDNAKKEVSTIGLISLLGVIALVWFAFRSFRPLTIALVTVSSSFL 317
+ +K + T + EV +++ V +++ ++ R I + V L
Sbjct: 318 PQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLL 377

Query: 318 FAVVTTTAVFGELHLLTLVFGTSLIGISIDY-------CFHYYCERLNHPSHSSDRVIQQ 370
A ++ LT+ IG+ +D E P ++++ + Q
Sbjct: 378 GTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQ 437

Query: 371 IFIAISLALLTSVIAYS---AIGLAPFPGMQQVAIFCASGLVGAYLTLLLAYPTLAARPL 427
I A+ + + G + +Q +I S + + L L+ P L A L
Sbjct: 438 IQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLL 497

Query: 428 RDG---------------NRALDYASRYQGFMLSLMPTVPRKRLLLAVGTCMFIIIGFTQ 472
+ N D++ + + + + LL+ +++ F +
Sbjct: 498 KPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLR 557

Query: 473 L 473
L
Sbjct: 558 L 558


8     Shal_0486  Shal_0495  Y     N          N                   Genomic Island
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0486  0       17       3.587068   hypothetical protein
Shal_0487  1       17       3.721395   phosphoribosylamine--glycine ligase
Shal_0488  1       18       2.456211   bifunctional
Shal_0489  1       17       2.095159   zinc-responsive transcriptional regulator
Shal_0490  2       19       2.100337   permease
Shal_0491  1       20       2.495923   DSBA oxidoreductase
Shal_0492  1       19       3.273657   molybdopterin oxidoreductase Fe4S4 region
Shal_0493  1       16       3.009768   formate dehydrogenase subunit alpha
Shal_0494  1       15       2.616292   formate dehydrogenase subunit beta
Shal_0495  2       18       2.689448   formate dehydrogenase subunit gamma
9     Shal_0506  Shal_0525  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0506  1       18       -3.292597  EmrB/QacA family drug resistance transporter
Shal_0507  1       25       -5.804529  N-acetyltransferase GCN5
Shal_0508  2       27       -6.319123  hypothetical protein
Shal_0509  2       28       -6.844362  hypothetical protein
Shal_0510  3       28       -7.192005  phage integrase family protein
Shal_0511  3       26       -6.782626  integrase family protein
Shal_0512  3       31       -8.014963  hypothetical protein
Shal_0513  4       30       -7.709537  hypothetical protein
Shal_0514  4       28       -7.478725  RNA-directed DNA polymerase
Shal_0515  4       26       -6.905941  hypothetical protein
Shal_0516  2       26       -5.708351  hypothetical protein
Shal_0517  3       24       -5.672979  RNA-directed DNA polymerase
Shal_0518  4       24       -5.557223  hypothetical protein
Shal_0519  3       22       -5.024137  XRE family transcriptional regulator
Shal_0520  3       22       -4.592524  hypothetical protein
Shal_0521  3       23       -4.834686  type III restriction protein res subunit
Shal_0522  3       24       -6.322031  DNA repair ATPase-like protein
Shal_0523  1       22       -5.290052  hypothetical protein
Shal_0524  1       17       -3.949549  hypothetical protein
Shal_0525  1       16       -3.125918  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0506  TCRTETB       107    4e-27    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 107 bits (268), Expect = 4e-27
Identities = 88/401 (21%), Positives = 168/401 (41%), Gaps = 19/401 (4%)

Query: 37 AFMAILDIQITNASMKEIQGGLGATLEEGSWIATAYLVAEMIAIPLSGWLSKGLDIRRYM 96
+F ++L+ + N S+ +I +W+ TA+++ I + G LS L I+R +
Sbjct: 23 SFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLL 82

Query: 97 LWNSAIFIFASLLCSIAWNLES-MIAFRAMQGFFGGALIPMAFRLILEYLPDSKRAVGMA 155
L+ I F S++ + + S +I R +QG A + ++ Y+P R
Sbjct: 83 LFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFG 142

Query: 156 LFGVTATFAPSIGPTLGGWLTEQFSWHYLFYINVPPGIVVMSMLAYGLVKKPINWPELKN 215
L G +GP +GG + W YL +P ++ L+KK + +
Sbjct: 143 LIGSIVAMGEGVGPAIGGMIAHYIHWSYLL--LIPMITIITVPFLMKLLKKEVRIKG--H 198

Query: 216 VDVSSIITMALGMGCLEVVLEEGNRKDWFGSDFIRNLAIVAAVNIALFVYLQLKSKAPLV 275
D+ II M++G+ + F + + + IV+ ++ +FV K P V
Sbjct: 199 FDIKGIILMSVGIVFFML----------FTTSYSISFLIVSVLSFLIFVKHIRKVTDPFV 248

Query: 276 NLKLLANRDFSISTIAYFLLGLALFSSIYMVPLYLSQIQDYNSLEIGEVLMWLGFPQLLI 335
+ L N F I + ++ + + MVP + + ++ EIG V+++ G ++I
Sbjct: 249 DPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVII 308

Query: 336 L-PFMPMLMQRFDNRYLAAFGFFMFGVSYYMNSHMTADFAGPQLIASMVVRAIG-QPFIM 393
+L+ R Y+ G VS+ S + + ++V +G F
Sbjct: 309 FGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFL--LETTSWFMTIIIVFVLGGLSFTK 366

Query: 394 VPIGMLATARLQKHENASASTVLNVVRNLGGAVGIAMVSTL 434
I + ++ L++ E + ++LN L GIA+V L
Sbjct: 367 TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGL 407


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0507  AUTOINDCRSYN  33     3e-04    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 32.5 bits (74), Expect = 3e-04
Identities = 14/63 (22%), Positives = 25/63 (39%), Gaps = 12/63 (19%)

Query: 5 SLSFSELSLNELYDLLKLRVDVFV--------VEQNCAYPELDDKDRHSQTQHLLGLNEQ 56
++ + LS + +L LR + F E D D ++ T +L G+ +
Sbjct: 6 DVNHTLLSETKSGELFTLRKETFKDRLNWAVQCTDGM---EFDQYD-NNNTTYLFGIKDN 61

Query: 57 GVI 59
VI
Sbjct: 62 TVI 64


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0522  GPOSANCHOR    33     0.006    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 33.1 bits (75), Expect = 0.006
Identities = 34/289 (11%), Positives = 90/289 (31%), Gaps = 38/289 (13%)

Query: 585 KQLNGLHNKVNEVGARLTQTERDKSTLEAEYKSIENRVVELEKSVWGLSHVELVSRVEAE 644
+ L +++ +L + ++ S ++ + +E R +LEK++ G
Sbjct: 85 DHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGA---------MNF 135

Query: 645 LKELLIKRSNLLEQKTSATTQITLFTKSFKAKDLELQSLISEVERKGSEHAYVTVQAYLN 704
K L +K + + K+ + + ++++ +E A + +
Sbjct: 136 STADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAEL 195

Query: 705 ENAIVASSLQKHCEIKRKELDTEVLKYKVEVESLTSQCNALQQEMQVGGTWINFPQLKQQ 764
E +E E A + +++ + +
Sbjct: 196 EK--------ALEGAMNFSTADSAKIKTLEAEKAAL--AARKADLE---------KALEG 236

Query: 765 KEALELRLAHSQSAVNAFYGSLSHIILVRSEDTLEKVKEFITGAIEERQRRSEELNILSS 824
+ + A +L E ++++ + GA+ S ++ L +
Sbjct: 237 AMNFSTADSAKIKTLEAEKAAL--------EARQAELEKALEGAMNFSTADSAKIKTLEA 288

Query: 825 KIKLLSELLASFKPYIRRISLQEELTDVERQLEQRSQVYETLAAEKEVV 873
+ L A + + L + R L+ + + L AE + +
Sbjct: 289 EKAALEAEKADLEHQSQ--VLNANRQSLRRDLDASREAKKQLEAEHQKL 335



Score = 32.3 bits (73), Expect = 0.011
Identities = 52/343 (15%), Positives = 98/343 (28%), Gaps = 48/343 (13%)

Query: 336 AESTDTFLQIESRLLELSTQKKALTEERSKTSTQLEALNQTLSELNTELKAVDNRTLLLR 395
TDT +++ R + + L + S S +AL EL EL +
Sbjct: 46 RSQTDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKND 105

Query: 396 TSIDNASPVYTEISNRRAHFAALGLQIAEKEIAIQKEKVQFNVLNRELSEISTLKITPDL 455
S+ + E+ R+A L + + + L E + ++ K +
Sbjct: 106 KSLSEKASKIQELEARKAD---LEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEK 162

Query: 456 LLTGKVDALLFEQGKIDQLTRCYADLDLIEVNITALHTTKKALTEQMELHERLLSIGLDY 515
L G ++ + KI L A L+ + + + + L
Sbjct: 163 ALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAA 222

Query: 516 LSVQPSNICPLCTSPHASVEALQDKIKNRNLLSDLSQENTKKLSLSSKRQKELRDIIQSI 575
L+ + +++ A KIK + +L + + +
Sbjct: 223 LAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAK 282

Query: 576 VQQ----------------------------------AIEAQVKQLNGLHNKVNE----- 596
++ A KQL H K+ E
Sbjct: 283 IKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKIS 342

Query: 597 ------VGARLTQTERDKSTLEAEYKSIENRVVELEKSVWGLS 633
+ L + K LEAE++ +E + E S L
Sbjct: 343 EASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLR 385


10    Shal_0596  Shal_0613  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0596  2       15       3.581928   3-dehydroquinate dehydratase
Shal_0597  2       17       4.052225   peptidyl-tRNA hydrolase domain-containing
Shal_0598  2       21       5.424006   hypothetical protein
Shal_0599  1       21       5.364513   hypothetical protein
Shal_0600  1       21       5.098435   hypothetical protein
Shal_0601  0       21       4.993708   outer membrane efflux family protein
Shal_0602  0       21       4.498069   RND family efflux transporter MFP subunit
Shal_0603  0       22       4.330358   CzcA family heavy metal efflux protein
Shal_0604  1       20       2.746142   penicillin-insensitive murein endopeptidase
Shal_0605  1       21       2.849956   hypothetical protein
Shal_0606  0       22       3.514633   hypothetical protein
Shal_0607  1       23       3.346371   ATP-dependent protease La
Shal_0608  1       23       3.756470   pseudouridine synthase
Shal_0609  1       22       3.797609   hypothetical protein
Shal_0610  0       21       4.125466   dTDP-4-dehydrorhamnose reductase
Shal_0611  0       21       3.760401   histidine kinase
Shal_0612  0       18       3.374347   two component Fis family transcriptional
Shal_0613  0       18       3.074062   sulfate transporter
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0599  LIPOLPP20     25     0.010    LPP20 lipoprotein precursor signature.
		>LIPOLPP20#LPP20 lipoprotein precursor signature.

Length = 175

Score = 25.5 bits (55), Expect = 0.010
Identities = 17/54 (31%), Positives = 26/54 (48%), Gaps = 2/54 (3%)

Query: 3 LSKLSGLSLVCALVFGLSACAPEVGSEAWCKQMKEKPSG--DWTANEAADYAKH 54
+ K+ G+S+V A+V + AP+ G K KE G DW + AK+
Sbjct: 5 VKKILGMSVVAAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKY 58


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0602  RTXTOXIND     51     7e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 51.0 bits (122), Expect = 7e-09
Identities = 35/208 (16%), Positives = 72/208 (34%), Gaps = 24/208 (11%)

Query: 142 ELTAAQLALAKIEVAQLSEQTFSLEDVATATIVVDRDRTVTIAPQVDVRVLKRNVVPGQE 201
+ +L L K A+ + V++ R + + + + ++ V QE
Sbjct: 201 QKYQKELNLDKKR-AERLTVLARINR-YENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQE 258

Query: 202 VNKGDILLTLGGVAVAQAQADYINAAAEWSRVKRMSTSAVSASRRLMAQVDAELKRATLE 261
+ +N + S + +++ V K L+
Sbjct: 259 ----------------NKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILD 302

Query: 262 AMKMTPAQIKA----LANAPETIGSYQLIAPISGRVQQ-DIALLGQIVPAGTALMQLT-D 315
++ T I LA E + + AP+S +VQQ + G +V LM + +
Sbjct: 303 KLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPE 362

Query: 316 ESHLWVEAELTPTQAEKVSIGSKTVVKV 343
+ L V A + +++G ++KV
Sbjct: 363 DDTLEVTALVQNKDIGFINVGQNAIIKV 390



Score = 36.0 bits (83), Expect = 3e-04
Identities = 26/140 (18%), Positives = 50/140 (35%), Gaps = 5/140 (3%)

Query: 168 VATATIVVDRDRTVTIAPQVDVRVLKRNVVPGQEVNKGDILLTLGGV----AVAQAQADY 223
A + R+ I P + V + V G+ V KGD+LL L + + Q+
Sbjct: 85 TANGKLTHS-GRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSL 143

Query: 224 INAAAEWSRVKRMSTSAVSASRRLMAQVDAELKRATLEAMKMTPAQIKALANAPETIGSY 283
+ A E +R + +S S + D + E + + + Y
Sbjct: 144 LQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKY 203

Query: 284 QLIAPISGRVQQDIALLGQI 303
Q + + + + +L +I
Sbjct: 204 QKELNLDKKRAERLTVLARI 223


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0603  ACRIFLAVINRP  652    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 652 bits (1684), Expect = 0.0
Identities = 222/1091 (20%), Positives = 436/1091 (39%), Gaps = 85/1091 (7%)

Query: 5 LIDVAIRNRLMVVLTLVGLIIACVAMLPKLNLDAFPDVTNVQVTVNTAAEGLAAEEVEKL 64
+ + IR + + + L++A + +L + +P + V+V+ G A+ V+
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 65 ISYPVESAMYALPAVTEVRSLS-RTGLSIVTVVFAEGTDIYFARQQVFEQLQAAREMIPD 123
++ +E M + + + S S G +T+ F GTD A+ QV +LQ A ++P
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 124 GVGVPEIGPNTSGLGQIYQYILRATPESGVDASELRSINDYMVKLIMMPVGGVTEVLSFG 183
V I S + + ++ VK + + GV +V FG
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPG-TTQDDISDYVASNVKDTLSRLNGVGDVQLFG 179

Query: 184 GEVRQYQVQIEPNKLLSYGLSMAQVTSALESNNRNAGGWFMDQGQE------QLVVRGYG 237
+ ++ ++ + L Y L+ V + L+ N + +
Sbjct: 180 AQ-YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQT 238

Query: 238 LLPAGDAGLKAIAQIPLTEVA-GTPVRVGDIAKVDYGSEIRVGAVTMTRRDEAGNAQDLG 296
+ ++ L + G+ VR+ D+A+V+ G E + N +
Sbjct: 239 RF----KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARI-------NGK--- 284

Query: 297 EVVAGVVLKRMGANTKETIDDISARTAMIEQALPDGVSFEVFYDQSDLVNQAVTTVRDAL 356
+ GAN +T I A+ A ++ P G+ YD + V ++ V L
Sbjct: 285 PAAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTL 344

Query: 357 LMAFVFIVVILALFLVNIRATMLVLLSIPVSIGLALLVMSYFGMSANLMSLGGLAVAIGM 416
A + + +++ LFL N+RAT++ +++PV + +++ FG S N +++ G+ +AIG+
Sbjct: 345 FEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGL 404

Query: 417 LVDGSVVMVENIFKHLTQPDRRHLANARKRASDEDDPYHADEDGTNTTAHGESSGGITMR 476
LVD ++V+VEN+ R D+ P A E
Sbjct: 405 LVDDAIVVVENV--------------ERVMMEDKLPPKEATEKSM--------------- 435

Query: 477 VMLAAKEVCSPIFFATAIIIVVFAPLFALEGVEGKLFQPMAVSIILAMISALLVALIAVP 536
++ + ++ VF P+ G G +++ +++I+ AM ++LVALI P
Sbjct: 436 -----SQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTP 490

Query: 537 ALAVYLFK----------RGVVLRESAVLKPIESVYRKLLTSTMAHPKVVGITAVVMFAM 586
AL L K G + + Y + + + ++ A
Sbjct: 491 ALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYLLIYALIVAG 550

Query: 587 SMMLLPRLGTEFVPELEEGTINLRVTLAPTASLATSLDVAPKLEALLLEFPEVDYALSRI 646
++L RL + F+PE ++G + L A+ + V ++ L+ + + S
Sbjct: 551 MVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNEKANVE-SVF 609

Query: 647 GAAELGGDPEPVNNIEIYIGLKPVDEWVSATNRFE--LQRKMEAKLNVYPGLLFTFSQPI 704
+ N ++ LKP +E N E + R + G + F+ P
Sbjct: 610 TVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGFVIPFNMP- 668

Query: 705 ATRVDELLSGVKAQLA-IKIFGPDLDVLSERGQVLTELVSQIPGAV-DVSLEQVSGEAQL 762
+ EL + I G D L++ L + +Q P ++ V + AQ
Sbjct: 669 --AIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLEDTAQF 726

Query: 763 VVRPKRDQLARYGISVDEIMALVSQGVGGASAGQVIDGNARYDIYVRLGEQYRSSPDILE 822
+ +++ G+S+ +I +S +GG ID +YV+ ++R P+ ++
Sbjct: 727 KLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRMLPEDVD 786

Query: 823 DLLLTGVSGATVRLGEVADVVIEMAPPNIRRDDVQRRVVVQANVADRDMGSVVNDIYAIV 882
L + +G V P + R + + +Q A G+ D A++
Sbjct: 787 KLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAP---GTSSGDAMALM 843

Query: 883 PQA--ELPPGYTVVVGGQYENQQRAQQKLMLVVPVSIALIALLLFFSFGSVRQVGLIMAN 940
+LP G G ++ + + +V +S ++ L L + S +M
Sbjct: 844 ENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLV 903

Query: 941 VPLALIGGVVALFASGTYLSVPSSIGFITLFGVAVLNGVVLVDSINQRRASVKEETDESL 1000
VPL ++G ++A V +G +T G++ N +++V+ E+ + +
Sbjct: 904 VPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDL----MEKEGKGV 959

Query: 1001 YDAVYEGTVGRLRPVLMTALTSALGLIPILLSSGVGSEIQQPLAVVIIGGLFSSTALTLL 1060
+A RLRP+LMT+L LG++P+ +S+G GS Q + + ++GG+ S+T L +
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1061 VLPTLYRWIYQ 1071
+P + I +
Sbjct: 1020 FVPVFFVVIRR 1030



Score = 105 bits (263), Expect = 5e-25
Identities = 79/552 (14%), Positives = 177/552 (32%), Gaps = 70/552 (12%)

Query: 10 IRNRLMVVLTLVGLIIACVAMLPKLNLDAFPDVTNVQVTVN-TAAEGLAAEEVEKLI--- 65
+ + +L ++ V + +L P+ G E +K++
Sbjct: 534 LGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQV 593

Query: 66 -SYPVESAMYALPAVTEVRSLSRTGLS----IVTVVFAEGTDIYFARQQVFEQLQAAREM 120
Y +++ + +V V S +G + + V + + A+
Sbjct: 594 TDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKME 653

Query: 121 ---IPDGVGVPEIGPNTSGLGQIYQYILRATPESGVDASELRSINDYMVKLIMMPVGGVT 177
I DG +P P LG + ++G+ L + ++ + +
Sbjct: 654 LGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLV 713

Query: 178 EVLSFGGEVRQYQVQIEPN--KLLSYGLSMAQVTSALES--NNRNAGGWFMDQGQEQLVV 233
V G Q ++E + K + G+S++ + + + + ++L V
Sbjct: 714 SVRP-NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYV 772

Query: 234 RGYGLLPAGDAGLKAIAQIPLTEVAGTPVRVGDIAKVDYGSEIRVGAVTMTRRDEAGNAQ 293
+ + + ++ + G V + G+ + R + + +
Sbjct: 773 Q---ADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVY----GSPRLERYNGLPSME 825

Query: 294 DLGEVVAGVVLKRMGANTKETIDDISARTAMIEQALPDGVSFEVFYDQSDLVNQAVTTVR 353
GE G D A + LP G+ ++ + S +
Sbjct: 826 IQGEAAPGTSS-----------GDAMALMENLASKLPAGIGYD-WTGMSYQERLSGNQAP 873

Query: 354 DALLMAFVFIVVILALFLVNIRATMLVLLSIPVSIGLALLVMSYFGMSANLMSLGGLAVA 413
+ ++FV + + LA + + V+L +P+ I LL + F ++ + GL
Sbjct: 874 ALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTT 933

Query: 414 IGMLVDGSVVMVENIFKHLTQPDRRHLANARKRASDEDDPYHADEDGTNTTAHGESSGGI 473
IG+ ++++VE + L + E
Sbjct: 934 IGLSAKNAILIVEFA---------KDLMEKEGKGVVE----------------------- 961

Query: 474 TMRVMLAAKEVCSPIFFATAIIIVVFAPLFALEGVEGKLFQPMAVSIILAMISALLVALI 533
++A + PI + I+ PL G + + ++ M+SA L+A+
Sbjct: 962 --ATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 534 AVPALAVYLFKR 545
VP V + +
Sbjct: 1020 FVPVFFVVIRRC 1031



Score = 94.1 bits (234), Expect = 2e-21
Identities = 92/521 (17%), Positives = 191/521 (36%), Gaps = 53/521 (10%)

Query: 578 ITAVVMFAMSMML----LPRLGTEFVPELEEGTINLRVTLAPTASLATSLD-VAPKLEAL 632
I A V+ + MM + +L P + +++ P A T D V +E
Sbjct: 10 IFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSAN-YPGADAQTVQDTVTQVIEQN 68

Query: 633 LLEFPEVDYALSRIGAAELGGDPEPVNNIEIYIGLKPVDEWVSATNRFELQRKMEAKLNV 692
+ + Y S + ++ I + + T+ Q +++ KL +
Sbjct: 69 MNGIDNLMYMSST---------SDSAGSVTITLTFQS------GTDPDIAQVQVQNKLQL 113

Query: 693 YPGLLFTFSQPIATRVDELLSGVKAQLAIKIFGPDLDVLSERGQVLTEL---VSQIPGAV 749
LL Q V++ S P V + + +S++ G
Sbjct: 114 ATPLLPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVG 173

Query: 750 DVSLEQVSGEAQLVVRPKRDQLARYGISVDEIM-ALVSQ----GVGGASAGQVIDGNARY 804
DV L + + + D L +Y ++ +++ L Q G + G +
Sbjct: 174 DVQL--FGAQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQ-QL 230

Query: 805 DIYVRLGEQYRSSPDILEDLLLTGVSGATVRLGEVADVVIEMAPPNIR-RDDVQRRVVVQ 863
+ + ++++ + + L G+ VRL +VA V + N+ R + + +
Sbjct: 231 NASIIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLG 290

Query: 864 ANVA-DRDMGSVVNDIYAIVP--QAELPPGYTVVVGGQYENQQRAQQKLMLVVP---VSI 917
+A + I A + Q P G V+ Y+ Q + VV +I
Sbjct: 291 IKLATGANALDTAKAIKAKLAELQPFFPQGMKVLY--PYDTTPFVQLSIHEVVKTLFEAI 348

Query: 918 ALIALLLFFSFGSVRQVGLIMANVPLALIGGVVALFASGTYLSVPSSIGFITLFGVAVLN 977
L+ L+++ ++R + VP+ L+G L A G SI +T+FG+ +
Sbjct: 349 MLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGY------SINTLTMFGMVLAI 402

Query: 978 GVVLVDSI----NQRRASVKEETDESLYDAVYEGTVGRLRPVLMTALTSALGLIPILLSS 1033
G+++ D+I N R ++++ +A + ++ A+ + IP+
Sbjct: 403 GLLVDDAIVVVENVERVMMEDKLPP--KEATEKSMSQIQGALVGIAMVLSAVFIPMAFFG 460

Query: 1034 GVGSEIQQPLAVVIIGGLFSSTALTLLVLPTLYRWIYQGRK 1074
G I + ++ I+ + S + L++ P L + +
Sbjct: 461 GSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVS 501


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0610  NUCEPIMERASE  56     5e-11    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 55.6 bits (134), Expect = 5e-11
Identities = 36/182 (19%), Positives = 69/182 (37%), Gaps = 23/182 (12%)

Query: 6 KIMVTGASGLLGRALIKQLGQNSQQQI----IACGY------------SRFGPNIERLDL 49
K +VTGA+G +G + K+L + Q + + Y ++ G ++DL
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 50 TQAAQVSSFVAKHKPDIILHCAAERRPDVSEQDPQAALALNSEATQFLTQAASQHG-AWL 108
++ A + + S ++P A N + + + L
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 109 LYISTDYVFDGSQS-PYREDAETN-PVNFYGQSKLKGELVVSDAQQGFAI----LRLPIL 162
LY S+ V+ ++ P+ D + PV+ Y +K EL+ + + LR +
Sbjct: 122 LYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRFFTV 181

Query: 163 YG 164
YG
Sbjct: 182 YG 183


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0612  HTHFIS        88     7e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.0 bits (218), Expect = 7e-23
Identities = 27/111 (24%), Positives = 58/111 (52%), Gaps = 1/111 (0%)

Query: 1 MTK-KLLIIEDDIALASVLVRRMTKQGFECQQCHDATQGLLLARQFRPTHLLLDMKLDQE 59
MT +L+ +DD A+ +VL + +++ G++ + +A ++ D+ + E
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 60 NGLKLISPLRQLLPDVVMVLLTGYASIATAVEAIRLGADNYLAKPVDTQTL 110
N L+ +++ PD+ +++++ + TA++A GA +YL KP D L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTEL 111



Score = 47.9 bits (114), Expect = 5e-09
Identities = 27/172 (15%), Positives = 56/172 (32%), Gaps = 20/172 (11%)

Query: 8 IEDDIALASVLVRRMTKQGFECQQCHDATQGLLLA-------RQFRPTHLLLDMKLDQEN 60
ED L V++ K+G + ++ L+ A R+ +L+ +
Sbjct: 314 AEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELE--NLVRRLTALYPQ 371

Query: 61 GLKLISPLRQLLPDVVMVLLTGYASIATAVEAIRLGADNYLAKPVDTQTLFNALSTTPFD 120
+ + L + A+ + +I + + + F
Sbjct: 372 DVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFA-----------SFG 420

Query: 121 SMIESLIEEPLSPKRLEWEHIQQVLNANSGNVSATARQLGMHRRTLQRKLLK 172
+ +E+ I L A GN A LG++R TL++K+ +
Sbjct: 421 DALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRE 472


11  Shal_0662  Shal_0686  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
Shal_0662  -1      18       3.281265    formate dehydrogenase subunit alpha
Shal_0663  -1      19       2.615476    hypothetical protein
Shal_0664  0       17       2.078888    aldose 1-epimerase
Shal_0665  0       16       1.867693    galactokinase
Shal_0666  2       16       1.535249    SSS family solute/sodium (Na+) symporter
Shal_0667  2       15       1.401459    beta-galactosidase
Shal_0668  2       11       0.575964    hypothetical protein
Shal_0669  1       11       0.356949    sodium/hydrogen exchanger
Shal_0670  0       10       0.745658    thiol:disulfide interchange protein
Shal_0671  2       14       -0.182539   CutA1 divalent ion tolerance protein
Shal_0672  2       18       -0.324663   FxsA cytoplasmic membrane protein
Shal_0673  1       19       0.100758    serine/threonine protein kinase
Shal_0674  2       26       -1.347856   hypothetical protein
Shal_0675  3       27       -1.260022   thiopurine S-methyltransferase
Shal_0676  2       27       -2.740986   porin
Shal_0677  1       19       -3.309644   porin
Shal_0678  0       17       -4.910948   UBA/THIF-type NAD/FAD-binding protein
Shal_0679  0       28       -8.558095   hypothetical protein
Shal_0680  2       31       -9.226908   hypothetical protein
Shal_0681  0       32       -9.143562   *AraC family transcriptional regulator
Shal_0682  1       29       -7.913076   aspartate carbamoyltransferase regulatory
Shal_0683  3       29       -7.276793   aspartate carbamoyltransferase catalytic
Shal_0684  3       28       -6.880760   major facilitator transporter
Shal_0685  2       25       -5.022890   hypothetical protein
Shal_0686  1       22       -4.078450   endoribonuclease L-PSP
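
Rows of the per-ORF bias table (LocusTag, DNBias, CDNBias, %GCBias, Product) can be split into typed fields with a single regular expression, whether or not the columns are separated by whitespace. The following Python sketch is hypothetical and rests on assumptions inferred from the rows in this report rather than from any PredictBias documentation: the locus tag ends in an underscore and four digits, DNBias is a single signed digit, CDNBias has one or two digits, and %GCBias always carries six decimals. A fused row whose true values break these assumptions would be split incorrectly.

```python
import re
from typing import NamedTuple, Optional


class BiasRow(NamedTuple):
    locus_tag: str
    dn_bias: int      # dinucleotide bias
    cdn_bias: int     # codon bias
    gc_bias: float    # %GC bias
    product: str


# Assumed field shapes (see the lead-in); whitespace between fields is optional.
ROW_RE = re.compile(
    r"^([A-Za-z0-9]+_\d{4})\s*(-?\d)\s*(\d{1,2})\s*(-?\d+\.\d{6})\s*(.*)$"
)


def parse_row(line: str) -> Optional[BiasRow]:
    """Split one table row into fields; return None for headers and other lines."""
    m = ROW_RE.match(line.strip())
    if m is None:
        return None
    return BiasRow(
        locus_tag=m.group(1),
        dn_bias=int(m.group(2)),
        cdn_bias=int(m.group(3)),
        gc_bias=float(m.group(4)),
        product=m.group(5),
    )


# Works on both the fused and the whitespace-separated form of a row.
print(parse_row("Shal_0686122-4.078450endoribonuclease L-PSP"))
print(parse_row("Shal_0686  1  22  -4.078450  endoribonuclease L-PSP"))
```

Fixing the number of decimals is what lets the pattern find the end of %GCBias when the product name itself starts with a digit.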
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0668  MYCMG045      29     0.028    Hypothetical mycoplasma lipoprotein (MG045) signature.
		>MYCMG045#Hypothetical mycoplasma lipoprotein (MG045) signature.

Length = 483

Score = 28.9 bits (64), Expect = 0.028
Identities = 24/75 (32%), Positives = 33/75 (44%), Gaps = 7/75 (9%)

Query: 64 QPNFSPDSSQLLFTSDRQAAPNSNSQTPAAHNDIYQFQI--ADSHIEQLTFTPEQSEYSP 121
QP SP + LL + +Q SN Q AH I+ + AD EQL T E+
Sbjct: 300 QPKISPVALDLLVINKQQ----SNFQK-EAHEIIFDLALDGADQTKEQLIKTDEELGTDD 354

Query: 122 QAFKTDKATKQITYV 136
+ F A + +YV
Sbjct: 355 EDFYLKGAMQNFSYV 369


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0676  ECOLNEIPORIN  56     4e-11    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 56.3 bits (136), Expect = 4e-11
Identities = 75/367 (20%), Positives = 130/367 (35%), Gaps = 54/367 (14%)

Query: 1 MKKTLISASVASVLTLASFGALADGPGFYGRLDLAVTNSDT----GATTQNGKSGTVFEN 56
MKK+LI+ ++A++ A YG + V S + GA + ++GT +
Sbjct: 1 MKKSLIALTLAALPVAAMADV-----TLYGTIKAGVETSRSVAHNGAQAASVETGTGIVD 55

Query: 57 NFSFVGVKGTESLTNNIDVVYQMEFQVENTTNSGETFKARNTYLGLKSQAGTVLVGRNDH 116
S +G KG E L N + ++Q+E + + + + R +++GLK G + VGR +
Sbjct: 56 LGSKIGFKGQEDLGNGLKAIWQVEQKA-SIAGTDSGWGNRQSFIGLKGGFGKLRVGRLNS 114

Query: 117 VFKQS------EGGVDIFGNTNADIDRLVAGQDRVADGIWYYSPKIAGLV-TLNATYLME 169
V K + + D G +A + + Y SP+ AGL ++
Sbjct: 115 VLKDTGDINPWDSKSDYLGVNK------IAEPEARLISVRYDSPEFAGLSGSVQYALNDN 168

Query: 170 DNYIDADDKDAVASYD-----AQYALSATIGDKKLKEQNYYVAAAYNTIKGIDAYRGVAQ 224
+++ A +Y QY + + + N + + G D A
Sbjct: 169 AGRHNSESYHAGFNYKNGGFFVQYGGAYKRHHQVQENVNIEKYQIHRLVSGYDNDALYAS 228

Query: 225 V--KLGDFKVGGLFQNTESQTTDQEGNSYFVNVAYNLNGVNLKAEYGMDEGGFGKYYQNS 282
V + D K+ + SQ +AY V + Y GF +
Sbjct: 229 VAVQQQDAKLVEENYSHNSQ------TEVAATLAYRFGNVTPRVSYAH---GFKGSFDA- 278

Query: 283 VPKVDGSKVIGTDINVQVITVGADYKISKSTMVYGHYAMYEGDHKVAGNKFDLKDDNVFT 342
+ + + VGA+Y SK T + G
Sbjct: 279 ---------TNYNNDYDQVVVGAEYDFSKRTSALVSAGWLQE-----GKGESKFVSTAGG 324

Query: 343 VGLRYNF 349
VGLR+ F
Sbjct: 325 VGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0677  ECOLNEIPORIN  61     1e-12    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 61.4 bits (149), Expect = 1e-12
Identities = 68/364 (18%), Positives = 121/364 (33%), Gaps = 44/364 (12%)

Query: 1 MKKTVLSATIISALAATSFTALADGPNFYGRADLAITNSDM----GIATQNQKSGTVIEN 56
MKK++++ T+ + A YG + S G + ++GT I +
Sbjct: 1 MKKSLIALTLAALPVAAMADV-----TLYGTIKAGVETSRSVAHNGAQAASVETGTGIVD 55

Query: 57 NFSWLGVKGTEAINSELEVVYQMEFGVNNFDNSGNTFSARNTFLGLKSATAGTILVGRND 116
S +G KG E + + L+ ++Q+E + + R +F+GLK G + VGR +
Sbjct: 56 LGSKIGFKGQEDLGNGLKAIWQVEQKASIAGTDSGWGN-RQSFIGLK-GGFGKLRVGRLN 113

Query: 117 TVFK------ASEGGFDIFGNTNSDIDLFAAGQSRNADGFTYYSPKIAGLVTLNATYLMD 170
+V K + D G ++ A ++R Y SP+ AGL
Sbjct: 114 SVLKDTGDINPWDSKSDYLG-----VNKIAEPEARLI-SVRYDSPEFAGLS-------GS 160

Query: 171 DNYAQEVDGKKVDTDNMYALSATIGDKGLKAQNYYVAAAYNDSIDNVKAYRGVAQVKLGQ 230
YA + + ++++ Y + G Q ++ +NV +
Sbjct: 161 VQYALNDNAGRHNSES-YHAGFNYKNGGFFVQYGGAYKRHHQVQENVNIEKYQIHR---- 215

Query: 231 VILGGFYQNSEHVDSKYANLEGDTYFVNAAYVMGDLKLKAMYGSDDSGLGKYVSRYVGSV 290
L Y N S + ++ A + VS Y
Sbjct: 216 --LVSGYDNDALYASVAVQQQDAKLVEENYSHNSQTEVAATLAYRFGNVTPRVS-YAHGF 272

Query: 291 DGATLENVSDVDL-QQFSVGADYRLSKNTLVYGHYTKYDGDIKLAGATQDLSDDIFTVGM 349
+ + + + Q VGA+Y SK T + VG+
Sbjct: 273 -KGSFDATNYNNDYDQVVVGAEYDFSKRTSALVSAGWLQEGKGESKFVS----TAGGVGL 327

Query: 350 RFDF 353
R F
Sbjct: 328 RHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0684  TCRTETA       39     4e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 38.7 bits (90), Expect = 4e-05
Identities = 58/379 (15%), Positives = 128/379 (33%), Gaps = 41/379 (10%)

Query: 55 FFPNYDTTSALMATFAIFAIS-FIVRPLGGIFWGHIGDKIGRKTALSMSIIIMSCATFCI 113
+ D T+ A++A+ F P+ G + D+ GR+ L +S+ + +
Sbjct: 35 LVHSNDVTAHYGILLALYALMQFACAPVLG----ALSDRFGRRPVLLVSLAGAAVDYAIM 90

Query: 114 ALLPDYNSIGIMAPLLLLVARMFQGFSASGEYAGAATFLTEIAPKEEKGFYASIVPASAA 173
A P +L + R+ G + A A ++ +I +E+ + + A
Sbjct: 91 ATAP--------FLWVLYIGRIVAGITG-ATGAVAGAYIADITDGDERARHFGFMSACFG 141

Query: 174 AGLLFGSIFISILYAFLSSAQLHEWGWRIPFLLAAPFGLIGLYIRVKIEDSPQFVKFKEE 233
G++ G + + + PF AA + + + +
Sbjct: 142 FGMVAGPVL---------GGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPL 192

Query: 234 NKEKQSP---IPLQVILAHHKKPLLLGFLVTSLNALGFYLLLSYMPTYMTVHLEVKDTT- 289
+E +P + + + F++ + + L + + H +
Sbjct: 193 RREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIF--GEDRFHWDATTIGI 250

Query: 290 AFAISSIVLAFYILFVFSIGKLSDKFGRRKLLMLASLCFICLSIPIFIILESANIFQMTL 349
+ A I+ + + G ++ + G R+ LML + I + F + +
Sbjct: 251 SLAAFGILHSLAQAMIT--GPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMV 308

Query: 350 VLILFSAFLALNDGCLSCYLCELFPTNVRYTGFALSFNSANALFGGTAPFIATTLIAVT- 408
+L + LS + E + G + S ++ G P + T + A +
Sbjct: 309 LLASGGIGMPALQAMLSRQVDE--ERQGQLQGSLAALTSLTSIVG---PLLFTAIYAASI 363

Query: 409 ----GLSYAPGFYLMLIAL 423
G ++ G L L+ L
Sbjct: 364 TTWNGWAWIAGAALYLLCL 382



Score = 34.8 bits (80), Expect = 6e-04
Identities = 27/89 (30%), Positives = 44/89 (49%), Gaps = 7/89 (7%)

Query: 252 KPLLLGFLVTSLNALGFYLLLSYMPTYMTVHLEVKDTTAFAISSIVLAFYIL--FVFS-- 307
+PL++ +L+A+G L++ +P + V A I+LA Y L F +
Sbjct: 5 RPLIVILSTVALDAVGIGLIMPVLPGLLRDL--VHSNDVTAHYGILLALYALMQFACAPV 62

Query: 308 IGKLSDKFGRRKLLMLASLCFICLSIPIF 336
+G LSD+FGRR ++L SL + I
Sbjct: 63 LGALSDRFGRR-PVLLVSLAGAAVDYAIM 90


12  Shal_0755  Shal_0771  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
Shal_0755  -1      15       3.140418    sugar fermentation stimulation protein A
Shal_0756  0       14       3.728132    aminopeptidase B
Shal_0757  -1      15       3.011358    transcriptional regulator CadC
Shal_0758  0       16       3.757466    hypothetical protein
Shal_0759  0       17       4.110349    2'-5' RNA ligase
Shal_0760  0       15       3.222131    ATP-dependent helicase HrpB
Shal_0761  -1      12       2.337520    penicillin-binding protein 1B
Shal_0762  -1      14       1.048781    hypothetical protein
Shal_0763  -1      15       1.744744    sugar efflux transporter
Shal_0764  -1      15       0.408979    N-acetyltransferase GCN5
Shal_0765  0       16       2.164789    hypothetical protein
Shal_0766  1       20       3.015218    TonB-dependent receptor
Shal_0767  2       25       3.877601    hypothetical protein
Shal_0768  2       25       4.011443    amino acid carrier protein
Shal_0769  3       27       4.248253    hypothetical protein
Shal_0770  3       25       3.997479    hypothetical protein
Shal_0771  2       21       3.023636    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0757  HTHFIS        41     6e-06    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 40.6 bits (95), Expect = 6e-06
Identities = 15/95 (15%), Positives = 31/95 (32%), Gaps = 3/95 (3%)

Query: 214 GRILWVDDHPENNLVEKAFFEQKGIGVYSTVTSEEALMLLSMYHYQAVISDMGRHGDSLA 273
IL DD V + G V T + ++ V++D+ ++
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN-- 61

Query: 274 GLKLLQSIRASGNKTPFYLYT-YVESAGVVDAIDE 307
LL I+ + P + + + A ++
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEK 96


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0763  TCRTETB       35     4e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 35.2 bits (81), Expect = 4e-04
Identities = 37/180 (20%), Positives = 70/180 (38%), Gaps = 3/180 (1%)

Query: 12 NWAPVWLLALTAFVFVSTEFVPVGILAALGQSFNMSAVAVGPMLTVYAAVVALASLPAVL 71
N +WL L+ F ++ + V L + FN + + T + ++ +
Sbjct: 13 NQILIWLCILSFFSVLNEMVLNVS-LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGK 71

Query: 72 LFARFERKKLLLGIMAVFIISHGICVWAPSY-DILLSGRIGIALTHALFWAIAPALVVRL 130
L + K+LLL + + I S+ +L+ R A F A+ +V R
Sbjct: 72 LSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARY 131

Query: 131 APEGQGAKALSIFATGCVLALVLGIPLGRVIGQLIGWRSIFALIGLLTFAMMLLLWRGLP 190
P+ KA + + + +G +G +I I W + LI ++T + L + L
Sbjct: 132 IPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL-LIPMITIITVPFLMKLLK 190


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0764  SACTRNSFRASE  35     3e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 35.3 bits (81), Expect = 3e-05
Identities = 17/60 (28%), Positives = 24/60 (40%), Gaps = 2/60 (3%)

Query: 70 LINVVWVDESARGTGLGRDLMQRAEVEAKQRGCTMAQLDTLSFQAPV--FYQKLGFEIIG 127
LI + V + R G+G L+ +A AK+ L+T FY K F I
Sbjct: 91 LIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0770  RTXTOXIND     39     7e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.4 bits (92), Expect = 7e-05
Identities = 42/215 (19%), Positives = 78/215 (36%), Gaps = 21/215 (9%)

Query: 696 RIDSDIKQLQGQ-QQEWLEEQKEQALEARMEKNAYWQEVVGALDSQLNQVKSNIQSRRDH 754
++D + Q Q LE+ + Q L +E N E+ + V R
Sbjct: 131 GAEADTLKTQSSLLQARLEQTRYQILSRSIELNKL-PELKLPDEPYFQNVSEEEVLRLTS 189

Query: 755 AASEQKACEQWYKNELKSRGVDEAKIVALKAEIRTLESSISKAEQRRSEVLRYEDWYQHT 814
EQ + Q K + E + +AE T+ + I++ E S V +
Sbjct: 190 LIKEQFSTWQNQKYQK------ELNLDKKRAERLTVLARINRYENL-SRVEKSR------ 236

Query: 815 WLKRKPKLQSELSKVKHASLDLEQQLKSKANEVKAKRNELDGERKRSDVVQIEASENLTK 874
L L + + KHA L+ E + NE++ +++L+ + E
Sbjct: 237 -LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQL 295

Query: 875 LRSVMRKLSELKLPASNDDAQGSLTERLRQGEDQL 909
++ + KL + D+ LT L + E++
Sbjct: 296 FKNEILD----KLRQTTDNIG-LLTLELAKNEERQ 325


13  Shal_0817  Shal_0843  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
Shal_0817  2       20       1.970973    hypothetical protein
Shal_0818  0       16       1.335008    Fe-S metabolism associated SufE
Shal_0819  -1      14       1.253144    SufS subfamily cysteine desulfurase
Shal_0820  -1      16       0.482201    aldo/keto reductase
Shal_0821  -1      16       0.149193    MarR family transcriptional regulator
Shal_0822  1       19       -0.110378   NTPase
Shal_0823  2       21       -0.451117   aromatic amino acid permease
Shal_0824  2       17       0.240808    L-serine dehydratase 1
Shal_0825  1       16       -1.036120   N-acetyltransferase GCN5
Shal_0826  0       14       -0.871268   hypothetical protein
Shal_0827  0       15       -0.503060   sodium:dicarboxylate symporter
Shal_0828  -1      15       -1.818548   hypothetical protein
Shal_0829  -2      16       -2.671679   hypothetical protein
Shal_0830  -1      16       -2.596046   putative diguanylate cyclase
Shal_0831  -1      19       -4.337747   N-acetyltransferase GCN5
Shal_0832  0       22       -5.012458   hypothetical protein
Shal_0833  1       22       -5.473075   hypothetical protein
Shal_0834  0       21       -4.629197   hypothetical protein
Shal_0835  1       19       -3.253311   hypothetical protein
Shal_0836  1       18       -1.226804   hypothetical protein
Shal_0837  1       21       1.271559    hypothetical protein
Shal_0838  0       16       2.057607    hypothetical protein
Shal_0839  -1      15       3.026052    N-acetyltransferase GCN5
Shal_0840  -1      15       3.001721    LuxR family transcriptional regulator
Shal_0841  -1      16       3.160733    fumarate reductase/succinate dehydrogenase
Shal_0842  1       23       2.802722    thiamine pyrophosphate protein central region
Shal_0843  3       32       2.064453    S-adenosylmethionine synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0820  HELNAPAPROT   35     2e-04    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 34.8 bits (80), Expect = 2e-04
Identities = 27/123 (21%), Positives = 38/123 (30%), Gaps = 17/123 (13%)

Query: 111 AAVNASLERLQIDTIDLY----QIHWPDRNTNFFGELSYNQQDEHEKLTPIIDTLEALSD 166
V SL + LY + HW + +FF HEK + D D
Sbjct: 11 TLVENSLNTQLSNWFLLYSKLHRFHWYVKGPHFF--------TLHEKFEELYDHAAETVD 62

Query: 167 LIKAGKIRYIGISNETPWGFM-EYLRLAEKHDLPKIVSVQNPYNLLNRSYEVGMAEISHR 225
I A ++ IG P + EY A D S L Y+ +E
Sbjct: 63 TI-AERLLAIGGQ---PVATVKEYTEHASITDGGNETSASEMVQALVNDYKQISSESKFV 118

Query: 226 EEV 228
+
Sbjct: 119 IGL 121


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0821  SACTRNSFRASE  37     4e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 36.8 bits (85), Expect = 4e-05
Identities = 14/49 (28%), Positives = 18/49 (36%)

Query: 240 PEVRGKGFAKRLALQAIRYANVQGYTSCYLETTANLIEAIKLYESLGFK 288
+ R KG L +AI +A + LET I A Y F
Sbjct: 99 KDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFI 147


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0839  SACTRNSFRASE  30     0.004    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 29.5 bits (66), Expect = 0.004
Identities = 9/24 (37%), Positives = 14/24 (58%)

Query: 70 QLAILTGMLVHPDYRGQGVGHRLM 93
A++ + V DYR +GVG L+
Sbjct: 88 GYALIEDIAVAKDYRKKGVGTALL 111


14  Shal_0857  Shal_0869  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
Shal_0857  0       17       3.412036    ATP-dependent RNA helicase SrmB
Shal_0858  0       17       3.299249    methyltransferase small
Shal_0859  -1      16       3.009930    branched-chain amino acid transport system II
Shal_0860  0       17       2.567139    tyrosine recombinase XerD
Shal_0861  1       14       0.929279    thiol:disulfide interchange protein DsbC
Shal_0862  2       15       1.644190    single-stranded-DNA-specific exonuclease RecJ
Shal_0863  2       14       0.308934    AsnC family transcriptional regulator
Shal_0864  2       13       0.487180    hypothetical protein
Shal_0865  0       11       0.316803    serine transporter
Shal_0866  -1      14       1.790603    hypothetical protein
Shal_0867  -1      20       3.051230    FAD dependent oxidoreductase
Shal_0868  0       17       2.171351    AraC family transcriptional regulator
Shal_0869  -1      19       3.006645    lysine exporter protein LysE/YggA
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0866  ECOLNEIPORIN  27     0.046    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 26.7 bits (59), Expect = 0.046
Identities = 15/85 (17%), Positives = 33/85 (38%), Gaps = 2/85 (2%)

Query: 82 DTKIGVKFGDDYLGFHPYLTVAYGEGKATIYNYSGDDNYIFVGAGAEVQIAKNYVISAEF 141
T++ + P ++ A+G + + +++Y V GAE +K
Sbjct: 248 QTEVAATLAYRFGNVTPRVSYAHGFKG-SFDATNYNNDYDQVVVGAEYDFSKRTSALVSA 306

Query: 142 GNLKLDE-LSEISANVTRINFAYKF 165
G L+ + S+ + + +KF
Sbjct: 307 GWLQEGKGESKFVSTAGGVGLRHKF 331


15  Shal_0941  Shal_0946  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
Shal_0941  2       13       1.348380    pseudouridine synthase
Shal_0942  2       13       1.464484    GPR1/FUN34/yaaH family protein
Shal_0943  2       14       2.133557    major facilitator transporter
Shal_0944  4       23       3.582815    LrgB family protein
Shal_0945  2       22       3.091011    LrgA family protein
Shal_0946  0       21       3.516432    LysR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0943  TCRTETA       68     3e-14    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 67.5 bits (165), Expect = 3e-14
Identities = 64/308 (20%), Positives = 108/308 (35%), Gaps = 36/308 (11%)

Query: 71 SDYGIIIAIWAFMQTFIPVFTGGISDRVGYKETIFASTIIKIAGYLVMAFFPSFYGFLFG 130
+ YGI++A++A MQ G +SDR G + + S Y +MA P + L+
Sbjct: 43 AHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLW-VLYI 101

Query: 131 AILLAGGTGIFKPGIQGTLVLSTGRQNTSMAWGIFYQVVNIGGFLGPLVAVHMRQLSWDN 190
++AG TG + T + +G G GP++ M S
Sbjct: 102 GRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHA 161

Query: 191 VFYACAAIISLNFLF-LLAYKEPGKEERMERNRKIKAGEIHQEALWRDALHELKKPIVIY 249
F+A AA+ LNFL E K ER R+ AL+ L
Sbjct: 162 PFFAAAALNGLNFLTGCFLLPESHKGERRPLRRE--------------ALNPLASFRWAR 207

Query: 250 YMLVFAGFWFLFNSLFDVLPIHISEWVDTSVIVTSLFGPDGTSNGFFQFWLGLDNTGTKV 309
M V A +F + V + + WV + F D T+ G G+ +
Sbjct: 208 GMTVVAALMAVFFIMQLVGQVPAALWV---IFGEDRFHWDATTIGISLAAFGILH----- 259

Query: 310 MPEGMLNLNAGMIMTTCFLVAALTAKYRITTAMLAGCLLSILAFVLIGATNAAWMIVLAI 369
+ + + A+ A++ G + ++L+ WM +
Sbjct: 260 ------------SLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIM 307

Query: 370 AMFSFGEM 377
+ + G +
Sbjct: 308 VLLASGGI 315



Score = 50.6 bits (121), Expect = 8e-09
Identities = 34/145 (23%), Positives = 62/145 (42%), Gaps = 10/145 (6%)

Query: 66 LGISLSDYGIIIAIWAFMQTFIPVFTGGISDRVGYKETIFASTIIKIAGYLVMAFFPSFY 125
+GISL+ +GI+ ++ + TG ++ R+G + + I GY+++AF +
Sbjct: 248 IGISLAAFGILHSL------AQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 126 GFLFGAILLAGGTGIFKPGIQGTLVLSTGRQNTSMAWGIFYQVVNIGGFLGPLVAVHMRQ 185
+LLA G GI P +Q L + G + ++ +GPL+ +
Sbjct: 302 MAFPIMVLLASG-GIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYA 360

Query: 186 LS---WDNVFYACAAIISLNFLFLL 207
S W+ + A + L L L
Sbjct: 361 ASITTWNGWAWIAGAALYLLCLPAL 385


16  Shal_0997  Shal_1004  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
Shal_0997  -1      22       4.085663    spermidine/putrescine ABC transporter ATPase
Shal_0998  -1      22       4.343890    binding-protein-dependent transport system inner
Shal_0999  -1      16       3.469897    ornithine carbamoyltransferase
Shal_1000  -1      12       2.366493    FAD dependent oxidoreductase
Shal_1001  0       12       1.886734    succinate-semialdehyde dehydrogenase
Shal_1002  2       14       0.874791    4-aminobutyrate aminotransferase
Shal_1003  3       17       0.216772    UspA domain-containing protein
Shal_1004  2       17       0.247915    *RNA polymerase sigma 70 subunit RpoD
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0998  PF06057       29     0.023    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 28.7 bits (64), Expect = 0.023
Identities = 11/50 (22%), Positives = 17/50 (34%), Gaps = 7/50 (14%)

Query: 98 LLIGY-------PMAYAIARAPKRNQTMLILLVMLPSWTSFLIRVYAWMG 140
+LIGY P A R + +L+ + F I V +
Sbjct: 120 ILIGYSFGAEVIPFVLNEMPARYRKNVLGAVLLSPSQSSDFEIHVSEMVT 169


17  Shal_1269  Shal_1274  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
Shal_1269  2       14       0.223564    hypothetical protein
Shal_1270  2       15       -0.437178   beta-lactamase
Shal_1271  2       12       -0.771360   hypothetical protein
Shal_1272  3       13       -0.481358   hypothetical protein
Shal_1273  3       13       0.003661    putative lipoprotein
Shal_1274  2       15       -0.502539   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1272  ACRIFLAVINRP  27     0.011    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 26.7 bits (59), Expect = 0.011
Identities = 8/24 (33%), Positives = 14/24 (58%)

Query: 3 KHLGVAAVLVVLLILIFGCSQRSA 26
K L A +LV L++ +F + R+
Sbjct: 342 KTLFEAIMLVFLVMYLFLQNMRAT 365


18  Shal_1283  Shal_1345  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
Shal_1283  2       23       -3.135923   PEBP family protein
Shal_1284  2       25       -3.762064   AraC family transcriptional regulator
Shal_1285  3       28       -6.527364   alkylphosphonate utilization operon protein
Shal_1286  3       30       -7.597263   AraC family transcriptional regulator
Shal_1287  2       39       -11.727898  alkylhydroperoxidase
Shal_1288  3       41       -12.685826  hypothetical protein
Shal_1289  3       44       -13.027301  integrase, catalytic region
Shal_1291  1       34       -9.751100   LysR family transcriptional regulator
Shal_1292  1       33       -9.344582   hypothetical protein
Shal_1293  0       30       -8.670245   diguanylate cyclase/phosphodiesterase
Shal_1294  0       24       -6.533178   peroxiredoxin-like protein
Shal_1295  0       27       -7.484218   methyl-accepting chemotaxis sensory transducer
Shal_1296  0       22       -6.341929   MSHA biogenesis protein MshQ
Shal_1297  2       28       -8.592795   hypothetical protein
Shal_1298  2       29       -8.842024   hypothetical protein
Shal_1299  2       24       -6.748319   phage integrase family protein
Shal_1300  2       23       -6.928524   ATP-dependent OLD family endonuclease
Shal_1301  2       19       -4.830307   hypothetical protein
Shal_1302  3       20       -3.635318   phage integrase family protein
Shal_1303  1       17       -2.895423   hypothetical protein
Shal_1304  1       17       -1.917916   integrase family protein
Shal_1305  0       22       -2.877462   hypothetical protein
Shal_1306  4       22       -0.616098   CI repressor
Shal_1307  5       21       1.820787    hypothetical protein
Shal_1308  6       20       1.534193    regulatory CII family protein
Shal_1309  4       18       1.046723    hypothetical protein
Shal_1310  3       17       0.538153    hypothetical protein
Shal_1311  4       16       0.790986    hypothetical protein
Shal_1312  2       15       0.392580    hypothetical protein
Shal_1313  2       17       -1.030926   hypothetical protein
Shal_1314  2       14       -0.315715   replication gene A
Shal_1315  2       20       0.319685    hypothetical protein
Shal_1316  2       18       1.039768    transcriptional activator Ogr/delta
Shal_1317  3       17       0.972749    SH3 domain-containing protein
Shal_1318  3       18       1.502190    hypothetical protein
Shal_1319  4       20       2.533747    PBSX family phage portal protein
Shal_1320  5       22       2.835147    hypothetical protein
Shal_1321  4       24       2.634928    capsid scaffolding
Shal_1322  5       22       2.028661    P2 family phage major capsid protein
Shal_1323  6       26       2.982403    hypothetical protein
Shal_1324  5       26       2.619774    small terminase subunit
Shal_1325  5       27       3.097664    hypothetical protein
Shal_1326  4       27       2.111626    hypothetical protein
Shal_1327  5       27       1.935783    hypothetical protein
Shal_1328  7       30       2.216392    hypothetical protein
Shal_1329  6       30       0.698919    prophage PSPPH06 tail tube protein
Shal_1330  5       27       1.935394    TraR/DksA family transcriptional regulator
Shal_1331  6       27       1.487487    hypothetical protein
Shal_1332  7       29       2.178193    hypothetical protein
Shal_1333  6       30       2.339978    hypothetical protein
Shal_1334  6       28       2.699677    hypothetical protein
Shal_1335  6       28       2.579420    TP901 family phage tail tape measure protein
Shal_1336  6       30       2.492738    hypothetical protein
Shal_1337  6       30       2.854575    putative bacteriophage protein
Shal_1338  4       28       1.382913    hypothetical protein
Shal_1340  2       26       0.565212    hypothetical protein
Shal_1341  1       25       0.058839    hypothetical protein
Shal_1342  0       24       -0.566411   hypothetical protein
Shal_1343  3       19       -2.108088   hypothetical protein
Shal_1344  3       15       -1.373865   hypothetical protein
Shal_1345  2       17       -0.443205   SsrA-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1289  HTHFIS  27  0.021  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 26.7 bits (59), Expect = 0.021
Identities = 7/24 (29%), Positives = 14/24 (58%)

Query: 18 LAEELGNVSRACKVMGVSRDTFYR 41
L GN +A ++G++R+T +
Sbjct: 445 LTATRGNQIKAADLLGLNRNTLRK 468


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1292  BCTERIALGSPG  50  3e-10  Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 49.9 bits (119), Expect = 3e-10
Identities = 20/56 (35%), Positives = 35/56 (62%)

Query: 2 TKKTFGFSLIELVVVLVILGVLSVIVIPRFINLHRDAKIATLKATKGSIESAFDMF 57
T K GF+L+E++VV+VI+GVL+ +V+P + A + ++E+A DM+
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMY 59


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1303  ADHESNFAMILY  30  0.005  Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 29.8 bits (67), Expect = 0.005
Identities = 13/52 (25%), Positives = 25/52 (48%), Gaps = 1/52 (1%)

Query: 112 LQIEKANVQIENTEKQILETQQKNKTDLYLAHYKHFSEHLVALEAKWKNTVN 163
L +E + +N KQ+ NK + Y + K +++ L L+ + K+ N
Sbjct: 142 LNLENGIIFAKNIAKQLSAKDPNNK-EFYEKNLKEYTDKLDKLDKESKDKFN 192


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1315  TONBPROTEIN  26  0.040  Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 26.1 bits (57), Expect = 0.040
Identities = 9/24 (37%), Positives = 12/24 (50%)

Query: 102 EVEPEPEPEPQSPVRKGPTRKGPG 125
E EP PEP ++PV + P
Sbjct: 74 EPEPIPEPPKEAPVVIEKPKPKPK 97


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1316  BLACTAMASEA  26  0.015  Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 26.3 bits (58), Expect = 0.015
Identities = 9/30 (30%), Positives = 17/30 (56%)

Query: 49 TLSPSAQAASAIVSELARALSPTQRQQLQQ 78
T +P++ AA+ ++ LS ++QL Q
Sbjct: 176 TTTPASMAATLRKLLTSQRLSARSQRQLLQ 205


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1327  PF05043  30  0.009  Transcriptional activator
		>PF05043#Transcriptional activator

Length = 493

Score = 29.5 bits (66), Expect = 0.009
Identities = 9/62 (14%), Positives = 20/62 (32%), Gaps = 10/62 (16%)

Query: 55 DGKSFAP--RKKNSESKSLKRLAKGLKPYVTGGNRLELK-HKNKLTGR-------IAAFH 104
+G ++ S SL R+ + + + E+ ++ G A +
Sbjct: 99 EGCQAESICKEFYISSSSLYRIISQINKVIKRQFQFEVSLTPVQIIGNERDIRYFFAQYF 158

Query: 105 QE 106
E
Sbjct: 159 SE 160


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1344  RTXTOXINA  27  0.021  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 27.2 bits (60), Expect = 0.021
Identities = 14/70 (20%), Positives = 31/70 (44%)

Query: 40 LKDFTDAVKASQKLNEMEKKQIRDFIEDNMDIVKEGLESWIEGVEGISSAERCIQKMFSG 99
++F +S K++E+ KKQ + ++ K +E + V+ ++S +
Sbjct: 149 FQNFLGTALSSMKIDELIKKQKSGGNVSSSELAKASIELINQLVDTVASLNNNVNSFSQQ 208

Query: 100 LSELATFSEN 109
L+ L + N
Sbjct: 209 LNTLGSVLSN 218


19  Shal_1410  Shal_1417  Y  N  N  Genomic Island
LocusTag   DNBias  CDNBias     %GCBias  Product
Shal_1410      -1       20   -3.404466  formate dehydrogenase
Shal_1411      -1       27   -5.857723  thiamine biosynthesis protein ThiH
Shal_1412       1       36   -9.481989  hypothetical protein
Shal_1413       1       35   -9.394088  biotin synthase
Shal_1414       1       31   -8.975527  small GTP-binding protein
Shal_1415       1       32   -9.383413  beta-lactamase domain-containing protein
Shal_1416       0       28   -8.139840  hypothetical protein
Shal_1417      -1       22   -6.126086  helicase domain-containing protein
20  Shal_1439  Shal_1456  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias     %GCBias  Product
Shal_1439       2       15   -1.815929  flagellin domain-containing protein
Shal_1440       1       16   -1.266631  flagellar protein FlaG protein
Shal_1441       1       17   -0.655373  flagellar hook-associated 2 domain-containing
Shal_1442       2       17   -0.496509  hypothetical protein
Shal_1443       1       14    0.622418  flagellar protein FliS
Shal_1444       1       12    0.899281  sigma-54 dependent transcriptional regulator
Shal_1445       2       12    1.307608  PAS/PAC sensor-containing signal transduction
Shal_1446       1       12    2.364775  Fis family two component sigma54 specific
Shal_1447       1       13    2.458869  flagellar hook-basal body complex subunit FliE
Shal_1448       2       10    2.788139  flagellar MS-ring protein
Shal_1449       3       10    2.551944  flagellar motor switch protein G
Shal_1450       1       11    2.392242  flagellar assembly protein FliH
Shal_1451       0       13    2.190674  flagellum-specific ATP synthase
Shal_1452       2       15    0.572255  flagellar export protein FliJ
Shal_1453       1       15    0.282376  flagellar hook-length control protein
Shal_1454       2       21   -0.918450  flagellar basal body-associated protein FliL
Shal_1455       3       20   -0.277721  flagellar motor switch protein FliM
Shal_1456       2       22    0.108608  flagellar motor switch protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1439  FLAGELLIN  132  1e-37  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 132 bits (333), Expect = 1e-37
Identities = 95/271 (35%), Positives = 134/271 (49%), Gaps = 10/271 (3%)

Query: 2 AITVNTNVTSMKAQKNLNASSQNLATSMERLSSGLRINSAKDDAAGLAISNRLDSQVRGL 61
A +NTN S+ Q NLN S +L++++ERLSSGLRINSAKDDAAG AI+NR S ++GL
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 DVGMRNANDAISIAQISEGAMQEQTNMLQRMRDLTIQASNGANSSDDIASIKKEIDALAG 121
RNAND ISIAQ +EGA+ E N LQR+R+L++QA+NG NS D+ SI+ EI
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 EITSIGNTTAFGNTKLMNGDFSAGKSFQVGHQKGEDITVKVQKVNASSLAVGSL------ 175
EI + N T F K+++ D QVG GE IT+ +QK++ SL +
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQ--MKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPK 178

Query: 176 --TLTNSANRQSALTKIDAAIKTIDSQRADLGAVQNRLSHNISNSANTQANVADAKSRIV 233
T+ + + +T D + R D+ + + A
Sbjct: 179 EATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTT 238

Query: 234 DVDFAKETAAMTKNQVLQQTGSAMLAQANQL 264
D + K + A A +
Sbjct: 239 DDAENNTAVDLFKTTKSTAGTAEAKAIAGAI 269



Score = 87.0 bits (215), Expect = 2e-21
Identities = 57/213 (26%), Positives = 97/213 (45%), Gaps = 4/213 (1%)

Query: 60 GLDVGMRNANDAISIAQISEGAMQEQTNMLQRMRDLTIQASNGANSSDDIASIKKEIDAL 119
+ + +++A I+ GA LQ +++ NG + DD + +
Sbjct: 298 KVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSD 357

Query: 120 AGEITSIGNTTAFGNTKLMNGDFSAGKSFQVGHQKGEDITVKVQKVNASSLAVGSLTLTN 179
++ + +AG + + + + S +
Sbjct: 358 LEANNAVKGESKITVNGAEYTANAAGDKVTLAGK----TMFIDKTASGVSTLINEDAAAA 413

Query: 180 SANRQSALTKIDAAIKTIDSQRADLGAVQNRLSHNISNSANTQANVADAKSRIVDVDFAK 239
+ + L ID+A+ +D+ R+ LGA+QNR I+N NT N+ A+SRI D D+A
Sbjct: 414 KKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYAT 473

Query: 240 ETAAMTKNQVLQQTGSAMLAQANQLPQVALSLL 272
E + M+K Q+LQQ G+++LAQANQ+PQ LSLL
Sbjct: 474 EVSNMSKAQILQQAGTSVLAQANQVPQNVLSLL 506


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1441  SHIGARICIN  29  0.025  Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 29.4 bits (66), Expect = 0.025
Identities = 10/59 (16%), Positives = 19/59 (32%), Gaps = 6/59 (10%)

Query: 276 NTLVESISNLSKYDTEKEEAAALQGDSMI------RSIESQMRNMISNRVDVDGETIAL 328
L +I+ L Y+ +A + + IE Q+ + I+L
Sbjct: 153 PALDSAITTLFYYNANSAASALMVLIQSTSEAARYKFIEQQIGKRVDKTFLPSLAIISL 211


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1444  HTHFIS  436  e-152  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 436 bits (1124), Expect = e-152
Identities = 166/478 (34%), Positives = 256/478 (53%), Gaps = 17/478 (3%)

Query: 7 RILLVGNQSERINRLSCVFEFLGEQVELIDFDKLELHTKQTRFRAIVLPSENQPK----E 62
IL+ + + L+ G V + +V+ P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 63 LIQSLTGMLPWQPFLTLGERDDIKA------SNILGCIEEPLNYPQLTELLHFCQVYGQV 116
L+ + P P L + ++ + +P + +L ++ +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 117 KRPEIPTSANQTKLFRSLVGRSEGIASVRHLINQVSGSEATVLVLGQSGTGKEVVARNIH 176
+ ++ + LVGRS + + ++ ++ ++ T+++ G+SGTGKE+VAR +H
Sbjct: 125 RPSKLEDDSQD---GMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALH 181

Query: 177 YISDRRDGPFIPVNCGAIPAELLESELFGHEKGSFTGAISARKGRFELAEKGTLFLDEIG 236
RR+GPF+ +N AIP +L+ESELFGHEKG+FTGA + GRFE AE GTLFLDEIG
Sbjct: 182 DYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIG 241

Query: 237 DMPLQMQVKLLRVLQERMFERVGGSKSIAADVRVVAATHRNLETMIEEGGFREDLYYRLN 296
DMP+ Q +LLRVLQ+ + VGG I +DVR+VAAT+++L+ I +G FREDLYYRLN
Sbjct: 242 DMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLN 301

Query: 297 VFPIEMPALCERKEDIPLLLQELVSRVYNEGRGRVRFTQRAIESLKEHAWSGNVRELSNL 356
V P+ +P L +R EDIP L++ V + EG RF Q A+E +K H W GNVREL NL
Sbjct: 302 VVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELENL 361

Query: 357 VERLTILYPGGLVDVNDLPIKYRHIDVPEYCVEISEEQQERDALASIFNDEEPIEIPETR 416
V RLT LYP ++ + + R ++P+ +E + + +++ EE +
Sbjct: 362 VRRLTALYPQDVITREIIENELRS-EIPDSPIEKAAARSGSLSISQAV--EENMRQYFAS 418

Query: 417 FPSELPPEGVNLKDLLAELEIDMIRQALDQQDNVVARAAEMLGIRRTTLVEKMRKYGM 474
F LPP G+ +LAE+E +I AL +AA++LG+ R TL +K+R+ G+
Sbjct: 419 FGDALPPSGL-YDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1446  HTHFIS  467  e-164  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 467 bits (1202), Expect = e-164
Identities = 170/483 (35%), Positives = 250/483 (51%), Gaps = 39/483 (8%)

Query: 1 MSEGKLLLVEDDASLREALLDTLMLAHYDCVDVGSAEEAILALKADRFDLVISDVQMEGI 60
M+ +L+ +DDA++R L L A YD +A + A DLV++DV M
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 GGIGLLNYLQQHQPKLPVLLMTAYATIDNAVNAMKLGAVDYLAKPFSPEVLLNQVSRYL- 119
LL +++ +P LPVL+M+A T A+ A + GA DYL KPF L+ + R L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 ---------PAKAAEGTPIVADEKSL-ALLALAQRVAASDASVMIMGPSGSGKEVLARFI 169
+ +G P+V ++ + + R+ +D ++MI G SG+GKE++AR +
Sbjct: 121 EPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180

Query: 170 HQNSQRAEQPFIAINCAAIPENMLEATLFGYEKGAFTGAYQACPGKFEQAQGGTLLLDEI 229
H +R PF+AIN AAIP +++E+ LFG+EKGAFTGA G+FEQA+GGTL LDEI
Sbjct: 181 HDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEI 240

Query: 230 SEMEVGLQAKLLRVLQEREVERLGGRKTIKLDVRVLATSNRDLKAMAASGEFREDLYYRI 289
+M + Q +LLRVLQ+ E +GGR I+ DVR++A +N+DLK G FREDLYYR+
Sbjct: 241 GDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRL 300

Query: 290 NVFPLTWPSLNQRPADILPLARHLLQRHALIANRSELPLLSECASRRLLTHRWPGNVREL 349
NV PL P L R DI L RH +Q+ ++ + A + H WPGNVREL
Sbjct: 301 NVVPLRLPPLRDRAEDIPDLVRHFVQQAE--KEGLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 350 DNVVQRALIMCGSHEITAADIIID--SLELDFNLESTPLAD------------------- 388
+N+V+R + IT I + S D +E
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 389 --NKPTELDGLGDELKAQEHVIILETLTQCNGSRKLVAEKLGISARTLRYKMAKMRDSGI 446
+ L E+ +IL LT G++ A+ LG++ TLR K+ ++ G+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIREL---GV 475

Query: 447 QIP 449
+
Sbjct: 476 SVY 478


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1447  FLGHOOKFLIE  54  6e-13  Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 53.9 bits (129), Expect = 6e-13
Identities = 28/71 (39%), Positives = 44/71 (61%)

Query: 40 FGQLLSQAVGNVSELQSNASNLATRLDMGDTTVTLSDTVIAREKSSVAFEATVQVRNKLV 99
F L A+ +S+ Q+ A A + +G+ V L+D + +K+SV+ + +QVRNKLV
Sbjct: 33 FAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQVRNKLV 92

Query: 100 EAYKEIMSMPV 110
AY+E+MSM V
Sbjct: 93 AAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1448  FLGMRINGFLIF  301  1e-97  Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 301 bits (773), Expect = 1e-97
Identities = 150/562 (26%), Positives = 257/562 (45%), Gaps = 47/562 (8%)

Query: 30 LGGVDMLRQVTMILALAICLALAVFVMIWAQEPEYRPL-GQMSTAEMVQVLDVLDNNQVK 88
L + ++ +I+A + +A+ V +++WA+ P+YR L +S + ++ L +
Sbjct: 16 LNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIP 75

Query: 89 YEIQGD--VVKVPEDKFQDVRMLLSREGLDNTDANNDFLNKDSGFGVSQRMEQARLKHSQ 146
Y ++VP DK ++R+ L+++GL A L FG+SQ EQ + +
Sbjct: 76 YRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQVNYQRAL 135

Query: 147 EQNLARIIEELKSVTRAKVILALPKENVFARNRSKPSATVVVSTRRS-GLSQEEVDSIVD 205
E LAR IE L V A+V LA+PK ++F R + PSA+V V+ L + ++ ++V
Sbjct: 136 EGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALDEGQISAVVH 195

Query: 206 IVASAVHNLEPNKVTVTDANGRLLNSGTQDGSSAIARRELEIVQQKESEYRNKVESILMP 265
+V+SAV L P VT+ D +G LL G +L+ ES + ++E+IL P
Sbjct: 196 LVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDL-NDAQLKFANDVESRIQRRIEAILSP 254

Query: 266 ILGPENFTSQVDVSMDFTAVEQTAKRYNPDLPSLRSEMVVENNS-----NGGTSGGIPGA 320
I+G N +QV +DF EQT + Y+P+ + ++ + + G GG+PGA
Sbjct: 255 IVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPGGVPGA 314

Query: 321 LSNQPP---------------MASDIPQEVNAEETVTRNSGSTHKEATRNFELDTTISHT 365
LSNQP A + PQ + + + ST + T N+E+D TI HT
Sbjct: 315 LSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSNYEVDRTIRHT 374

Query: 366 RQQIGTLRRVSVSVAVDFKNAPMAEDGSVNRVPRTEQELANIRRLLEGAVGFNTQRGDVI 425
+ +G + R+SV+V V++K + +P T ++ I L A+GF+ +RGD +
Sbjct: 375 KMNVGDIERLSVAVVVNYKTLADGKP-----LPLTADQMKQIEDLTREAMGFSDKRGDTL 429

Query: 426 EVVTVPFMDQLIEAAPAPELWEQPWFWRVLRLVLGALVILVLILAVVRPMLKKLVYPDSV 485
VV PF + W+Q F L L++LV+ + R ++ +
Sbjct: 430 NVVNSPF-SAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLTRRVE 488

Query: 486 NMPDEPQRGGELAEIEDQYAADTLGMLQRPEAEYSYADDGSIL---IPNLHKDDDMIKAI 542
++ E E+ E + D + + M + I
Sbjct: 489 EAKAAQEQAQVRQETEE-------------AVEVRLSKDEQLQQRRANQRLGAEVMSQRI 535

Query: 543 RALVANEPELSTQVVKNWLLED 564
R + N+P + V++ W+ D
Sbjct: 536 REMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1449  FLGMOTORFLIG  287  1e-97  Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 287 bits (736), Expect = 1e-97
Identities = 108/341 (31%), Positives = 187/341 (54%)

Query: 8 EAKTAPGFKPSELNGIEKTAILLLSLSESDAASILKYLEPKQVQKVGMAMAAMRDFGQEK 67
E K S L G +K AILL+S+ ++ + KYL ++++ + +A + E
Sbjct: 3 EKKEKEILDVSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSEL 62

Query: 68 VIGVHKLFLDDIQKYSSIGFNSEEFVRKALTAALGEDKAGNLIEQIIMGSGAKGLDSLKW 127
V F + + I ++ R+ L +LG KA ++I + ++ + ++
Sbjct: 63 KDNVLLEFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRR 122

Query: 128 MDARQVATIIQNEHPQIQTIVLSYLEPDQAAEIFAQFPENTRLDLMMRIANLEEVQPAAL 187
D + IQ EHPQ ++LSYL+P +A+ I + P + ++ RIA ++ P +
Sbjct: 123 ADPANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVV 182

Query: 188 QELNDIMEKQFAGQGGAQAAKMGGLKAAANIMNYLDTGVESHLMETMRESDEEMAQQIQD 247
+E+ ++EK+ A GG+ I+N D E ++E++ E D E+A++I+
Sbjct: 183 REVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKK 242

Query: 248 LMFVFENLSDVDDMGIQVLLREVQQDVLMKALKGADEQLKEKLLGNMSKRAAELLRDDLE 307
MFVFE++ +DD IQ +LRE+ L KALK D ++EK+ NMSKRAA +L++D+E
Sbjct: 243 KMFVFEDIVLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDME 302

Query: 308 AMGPIRISEVEVAQKEILSIARRLSDSGEIMLGGGGGEEFL 348
+GP R +VE +Q++I+S+ R+L + GEI++ GG E+ L
Sbjct: 303 FLGPTRRKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDVL 343


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1450  FLGFLIH  80  1e-19  Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 79.8 bits (196), Expect = 1e-19
Identities = 54/198 (27%), Positives = 97/198 (48%), Gaps = 4/198 (2%)

Query: 50 PLETETILPPTLSEIEDIRAHAEQEGFA---EGQEKGFTEGLEKGRLEGLEQGHGEGYSQ 106
E I+ P + IE+ EQ+ + E+G+ G+ +GR +G +QG+ EG +Q
Sbjct: 19 QAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQEGLAQ 78

Query: 107 GQQQGYEEGLKTAAEMLQRFENLLGQFEAPLALLDTEIEKELLTTSMTLAKAVIGHELKT 166
G +QG E A + R + L+ +F+ L LD+ I L+ ++ A+ VIG
Sbjct: 79 GLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVIGQTPTV 138

Query: 167 YPEHILSALRQGVDSLPIKEQRVNIRVTPSDEALIGELYSQTQLERNRWQIEADPSLSPG 226
++ ++Q + P+ + +RV P D + ++ T L + W++ DP+L PG
Sbjct: 139 DNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT-LSLHGWRLRGDPTLHPG 197

Query: 227 DCIIDSVRSHIDMTVETR 244
C + + +D +V TR
Sbjct: 198 GCKVSADEGDLDASVATR 215


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1452  FLGFLIJ  40  7e-07  Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 39.8 bits (92), Expect = 7e-07
Identities = 38/145 (26%), Positives = 71/145 (48%)

Query: 1 MAKPDPLHMVLKLTQDAEEQASLQLRSAQLELQRRQNQLEALQNYRLDYMKQMEQQQGQS 60
MA+ L + L + E A+ L + Q+ + QL+ L +Y+ +Y +
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 ISASHYHQFHQFVRQVDAAIVQQVNAVKDADNQRQHRQVYWQEKQQKRKAVELLLANKAE 120
I+++ + + QF++ ++ AI Q + + W+EK+Q+ +A + L ++
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 KAKLAELRAEQKMVDEFASQQFYRK 145
A LAE R +QK +DEFA + RK
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRK 145


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1453  FLGHOOKFLIK  46  3e-07  Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 45.6 bits (107), Expect = 3e-07
Identities = 30/109 (27%), Positives = 52/109 (47%), Gaps = 1/109 (0%)

Query: 376 MNQQLITMVSNGVQHAEIRLDPPELGHMMVRIQVQGDTTQVQFQVSQHQTRDLVEQAMPR 435
++Q + G Q AE+RL P +LG + + ++V + Q+Q R +E A+P
Sbjct: 244 LSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAALPV 303

Query: 436 LREMLAEQGMQLTDGQVSQGSGGNSDGEQGSGRGNDNGMAETDEISAEE 484
LR LAE G+QL +S G + + S + A + ++ E+
Sbjct: 304 LRTQLAESGIQLGQSNIS-GESFSGQQQAASQQQQSQRTANHEPLAGED 351


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1455  FLGMOTORFLIM  249  4e-83  Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 249 bits (638), Expect = 4e-83
Identities = 86/326 (26%), Positives = 168/326 (51%), Gaps = 11/326 (3%)

Query: 1 MSDLLSQDEIDALLHGVDDVEEDIID----DNELDARSYDFSSQDRIVRGRMPTLEIVNE 56
M+++LSQDEID LL + + I D + YDF D+ + +M TL +++E
Sbjct: 1 MTEVLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHE 60

Query: 57 RFARHLRISMFNMMRRAAEVSINGVQMLKFGEYVHTLFVPTSLNMVRFSPLKGTALITME 116
FAR S+ +R V + V L + E++ ++ P++L ++ PLKG A++ ++
Sbjct: 61 TFARLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVD 120

Query: 117 ARLVFILVDNFFGGDGRFHAKIEGREFTPTERRIVQLLLKIIFEDYKDAWAPVMDVQFDY 176
+ F ++D FGG G+ R+ T E +++ ++ I + +++W V+D++
Sbjct: 121 PSITFSIIDRLFGGTGQAAKVQ--RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRL 178

Query: 177 LDSEVNPAMANIVSPTEVVVVSSFHIEVDGGGGDFHITMPYSMIEPIRELLDAG--VQSD 234
E NP A IV P+E+VV+ + +V G + +PY IEPI L + S
Sbjct: 179 GQIETNPQFAQIVPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSV 238

Query: 235 TQDTDMRWSQALRDEIMDVDVGIDATIVEHKVTLKQVLEFKAGDVIPVE---LPEYIILK 291
+ + ++ LRD++ VD+ + A + +++++ +L + GD+I + + + +L
Sbjct: 239 RRSSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLS 298

Query: 292 VEDLPTYRCKMGRTKENLALKICEQI 317
+ + + C+ G + +A +I E+I
Sbjct: 299 IGNRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1456  FLGMOTORFLIN  114  4e-36  Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 114 bits (287), Expect = 4e-36
Identities = 56/119 (47%), Positives = 81/119 (68%)

Query: 6 DDWASAMAEQAIDEAKAVEFDEFSNESAPLSEDEVSKLDAIMDIPVTISMEVGRSFINIR 65
D WA A+ EQ K+ F + +D IMDIPV +++E+GR+ + I+
Sbjct: 17 DLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDIPVKLTVELGRTRMTIK 76

Query: 66 NLLQLNQGSVVELDRIAGEPLDVMVNGTLIAHGEVVVVNDKFGIRLTDVISQTERIKKL 124
LL+L QGSVV LD +AGEPLD+++NG LIA GEVVVV DK+G+R+TD+I+ +ER+++L
Sbjct: 77 ELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGVRITDIITPSERMRRL 135


21  Shal_1483  Shal_1506  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias     %GCBias  Product
Shal_1483       0       19   -3.027056  polysaccharide export protein
Shal_1484       0       24   -4.954569  hypothetical protein
Shal_1485       2       28   -7.101831  lipopolysaccharide biosynthesis protein
Shal_1486       2       32   -8.761263  dTDP-glucose-4,6-dehydratase
Shal_1487       3       36  -10.225760  glucose-1-phosphate thymidylyltransferase
Shal_1488       1       39  -11.474410  dTDP-4-dehydrorhamnose reductase
Shal_1489       1       41  -12.648653  dTDP-4-dehydrorhamnose 3,5-epimerase
Shal_1490       0       41  -13.451380  polysaccharide biosynthesis protein
Shal_1491       2       40  -13.343723  glycosyl transferase family protein
Shal_1492       2       38  -12.766911  4Fe-4S ferredoxin
Shal_1493       3       35  -12.233455  hypothetical protein
Shal_1494       2       32  -10.934617  glycosyl transferase family protein
Shal_1495       1       24   -8.480357  hypothetical protein
Shal_1496       0       19   -6.420273  group 1 glycosyl transferase
Shal_1497      -1       14   -4.800017  hypothetical protein
Shal_1498      -1       13   -3.766956  undecaprenyl-phosphate
Shal_1499      -2       12   -2.655139  undecaprenyl-phosphate
Shal_1500      -2       12   -2.156343  beta-lactamase domain-containing protein
Shal_1501      -1       11   -2.648910  UDP-glucose/GDP-mannose dehydrogenase
Shal_1502      -2       13   -1.898591  NAD-dependent epimerase/dehydratase
Shal_1503      -1       16   -1.749806  dTDP-4-dehydrorhamnose 3,5-epimerase
Shal_1504       0       16   -1.272648  CDP-glycerol:poly(glycerophosphate)
Shal_1505       1       17   -0.588306  glycerol-3-phosphate cytidylyltransferase
Shal_1506       2       16   -0.352836  nucleotidyl transferase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1484  NUCEPIMERASE  27  0.030  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 26.7 bits (59), Expect = 0.030
Identities = 23/110 (20%), Positives = 41/110 (37%), Gaps = 20/110 (18%)

Query: 21 EIYKELNTCRDFGLKDQMTRAAVSIASNIAEGEERES------RAESARFLYFAKGSSGE 74
++Y RDF D + A + + I + + + A A + + G+S
Sbjct: 206 DVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSP 265

Query: 75 LATQIYIAIEIGVIEKKAGLNLIKEAREISAMLGALIKIRKGCVRETSAD 124
+ YI +E G+ K ++ ++ G V ETSAD
Sbjct: 266 VELMDYIQA----LEDALGIEAKKN----------MLPLQPGDVLETSAD 301


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1486  NUCEPIMERASE  175  5e-54  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 175 bits (446), Expect = 5e-54
Identities = 82/374 (21%), Positives = 144/374 (38%), Gaps = 66/374 (17%)

Query: 1 MKILVTGGAGFIGSAVVRHIINNTQDSVINVDKLT--YAGNL-ESLASIESNERYVFEQV 57
MK LVTG AGFIG V + ++ V+ +D L Y +L ++ + + + F ++
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKI 59

Query: 58 DICDRAELDRVFVKCKPDAVMHLAAESHVDRSITGPADFIQTNIVGTYTLLEATRAYWNT 117
D+ DR + +F + V V S+ P + +N+ G +LE R
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHN--- 116

Query: 118 LAEDAKKAFRFHHISTDEVYGDLPHPDDIEDETEPSTGKEPSADNEPSANKSLPLFTETT 177
K + S+ VYG N+ +P T+ +
Sbjct: 117 ------KIQHLLYASSSSVYG---------------------------LNRKMPFSTDDS 143

Query: 178 SYEPSSPYSASKASSDHLVRAWLRTYGLPTIVTNCSNNYGPYHFPEKLIPLVILNALEGK 237
P S Y+A+K +++ + + YGLP YGP+ P+ + LEGK
Sbjct: 144 VDHPVSLYAATKKANELMAHTYSHLYGLPATGLRFFTVYGPWGRPDMALFKFTKAMLEGK 203

Query: 238 PLPIYGKGDQIRDWLYVEDHARALYKVV------------------TEGVVGETYNIGGH 279
+ +Y G RD+ Y++D A A+ ++ YNIG
Sbjct: 204 SIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNS 263

Query: 280 NEKQNLEVVQTICAILDSLVPKETKYKEQITYVTDRPGHDRRYAIDSSKMQRELGWTPIE 339
+ + ++ +Q + L K + +PG + D+ + +G+TP
Sbjct: 264 SPVELMDYIQALEDALGIEAKKN--------MLPLQPGDVLETSADTKALYEVIGFTPET 315

Query: 340 TFETGLKKTIEWYL 353
T + G+K + WY
Sbjct: 316 TVKDGVKNFVNWYR 329


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1488  NUCEPIMERASE  51  1e-09  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 51.3 bits (123), Expect = 1e-09
Identities = 50/244 (20%), Positives = 89/244 (36%), Gaps = 43/244 (17%)

Query: 1 MRVLITGCNGQVGCSLTEQLAKDTNTTVLALD--RDY----------------------L 36
M+ L+TG G +G ++++L + + V+ +D DY +
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGH-QVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKI 59

Query: 37 DITNLDAVHAAVVEFKPSIIINAAAHTAVDKAEEEVELSYAINRDGPKYLAQAAQDVG-A 95
D+ + + + + + AV + E N G + + +
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ 119

Query: 96 AILHISTDYVFEGNKVGEYVESD-TTNPQGVYGESKLAGEVAVAQACDKHI------ILR 148
+L+ S+ V+ N+ + D +P +Y +K A E+ H+ LR
Sbjct: 120 HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTY--SHLYGLPATGLR 177

Query: 149 TAWVFGENGN------NFVKTMLRLGTTRDALNVVGDQFGGPTYAGDIANALIQIANRIT 202
V+G G F K ML G + D N G TY DIA A+I++ + I
Sbjct: 178 FFTVYGPWGRPDMALFKFTKAMLE-GKSIDVYN-YGKMKRDFTYIDDIAEAIIRLQDVIP 235

Query: 203 QGDA 206
D
Sbjct: 236 HADT 239


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1502  NUCEPIMERASE  561  0.0  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 561 bits (1447), Expect = 0.0
Identities = 225/333 (67%), Positives = 270/333 (81%), Gaps = 1/333 (0%)

Query: 1 MKYLVTGAAGFIGSKVSERLCAAGHEVVGIDNINDYYDVNLKLDRLKNLQSQTLFSFKKL 60
MKYLVTGAAGFIG VS+RL AGH+VVGIDN+NDYYDV+LK RL+ L F F K+
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQP-GFQFHKI 59

Query: 61 DLADREGIATLFAEEGFDRVIHLAAQAGVRYSIDNPMAYADSNLVGHLTILEGCRHHKIQ 120
DLADREG+ LFA F+RV + VRYS++NP AYADSNL G L ILEGCRH+KIQ
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ 119

Query: 121 HLVYASSSSVYGLNSKMPFSTDDSVDHPISLYAATKKANELMSHTYSHLYGVPTTGLRFF 180
HL+YASSSSVYGLN KMPFSTDDSVDHP+SLYAATKKANELM+HTYSHLYG+P TGLRFF
Sbjct: 120 HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRFF 179

Query: 181 TVYGPWSRPDMALLKFTNKIVKGEAIDVYNHGNLSRDFTYIDDIVEGIIRIQDSVPVANP 240
TVYGPW RPDMAL KFT +++G++IDVYN+G + RDFTYIDDI E IIR+QD +P A+
Sbjct: 180 TVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHADT 239

Query: 241 EWNAAEATPATSSAPYRVFNIGNGSPVKLMDYISALEKSLGIEAIKNMMDMQPGDVHSTW 300
+W TPA S APYRV+NIGN SPV+LMDYI ALE +LGIEA KNM+ +QPGDV T
Sbjct: 240 QWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGDVLETS 299

Query: 301 ADTEDLFKTVGYKPQTSVEEGVQKFVEWYKEYY 333
ADT+ L++ +G+ P+T+V++GV+ FV WY+++Y
Sbjct: 300 ADTKALYEVIGFTPETTVKDGVKNFVNWYRDFY 332


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1505  LPSBIOSNTHSS  43  4e-08  Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 43.3 bits (102), Expect = 4e-08
Identities = 16/55 (29%), Positives = 26/55 (47%), Gaps = 5/55 (9%)

Query: 8 GTFDLFHYGHVRLFKRLKALGDKLIVAVSTDEFNALKGKAAFFSYFQRAEIVEAC 62
G+FD +GH+ + +R L D++ VAV + K FS +R E +
Sbjct: 7 GSFDPITFGHLDIIERGCRLFDQVYVAVLRN-----PNKQPMFSVQERLEQIAKA 56


22  Shal_1549  Shal_1554  Y  N  N  Genomic Island
LocusTag   DNBias  CDNBias     %GCBias  Product
Shal_1549       2       15   -1.376108  hypothetical protein
Shal_1550       2       15   -1.082447  lysozyme
Shal_1551       3       16   -0.505403  hypothetical protein
Shal_1552       4       18   -0.487233  hypothetical protein
Shal_1553       3       18   -0.258006  S-adenosylmethionine--tRNA
Shal_1554       3       21   -0.558835  queuine tRNA-ribosyltransferase
23  Shal_1647  Shal_1654  Y  N  N  Genomic Island
LocusTag   DNBias  CDNBias     %GCBias  Product
Shal_1647       2       22    1.444232  Na(+)-translocating NADH-quinone reductase
Shal_1648       2       20    2.298845  Na(+)-translocating NADH-quinone reductase
Shal_1649       1       19    2.800275  Na(+)-translocating NADH-quinone reductase
Shal_1650       1       18    3.156382  hypothetical protein
Shal_1651       1       19    3.489804  hypothetical protein
Shal_1652       1       19    3.553525  arylsulfotransferase
Shal_1653       0       20    3.831958  tryptophan synthase subunit alpha
Shal_1654      -1       17    3.075296  tryptophan synthase subunit beta
24  Shal_1738  Shal_1767  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias     %GCBias  Product
Shal_1738      -2       17   -3.097037  diguanylate cyclase
Shal_1739      -1       18   -5.394651  hypothetical protein
Shal_1740       0       19   -5.534434  diacylglycerol kinase
Shal_1741      -1       20   -5.610237  DNA ligase
Shal_1742       0       21   -6.095140  nitroreductase
Shal_1743       0       21   -6.108030  hypothetical protein
Shal_1744       0       23   -6.254690  hypothetical protein
Shal_1745       1       28   -5.671786  hypothetical protein
Shal_1746       1       29   -5.874717  hypothetical protein
Shal_1747       3       32   -6.229884  hypothetical protein
Shal_1748       1       31   -6.985605  putative lipoprotein
Shal_1749       0       33   -7.282681  hypothetical protein
Shal_1750       0       30   -6.641966  hypothetical protein
Shal_1751       0       29   -7.770400  hypothetical protein
Shal_1752       0       25   -7.137084  kinase
Shal_1753      -1       24   -7.312291  GTP-binding protein HSR1-like
Shal_1754      -1       24   -7.119072  ankyrin
Shal_1755       0       21   -6.671421  hypothetical protein
Shal_1756       0       20   -6.418070  hypothetical protein
Shal_1757       0       15   -3.032986  hypothetical protein
Shal_1758       1       16   -3.183254  hypothetical protein
Shal_1760       0       18   -2.859139  hypothetical protein
Shal_1761      -2       16   -1.838782  protein tyrosine phosphatase
Shal_1762       2       19   -0.876109  GTP1/OBG protein
Shal_1763       2       20   -0.756037  ribosomal biogenesis GTPase
Shal_1764       2       20   -0.794868  2OG-Fe(II) oxygenase
Shal_1765       3       20   -0.514924  bifunctional methionine sulfoxide reductase B/A
Shal_1766       4       22   -0.451692  sulfate transporter
Shal_1767       5       22   -0.701273  ribonuclease
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_1767  IGASERPTASE  60  7e-11  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 59.7 bits (144), Expect = 7e-11
Identities = 50/279 (17%), Positives = 97/279 (34%), Gaps = 27/279 (9%)

Query: 578 PKTEPKKQDKDSRNQGRNNNRRNNN---RRNDPRRNRNDNRSDSRNDSRNDSEQSKDKRA 634
P+ E + Q D+ N NN + + N+ R D SE + + A
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETT-ETVA 1041

Query: 635 KSSRSDSSDTRLPTEDTNKRQPRNEKVRKDTKSEAP----TTEQPKQEAARERRQRRNMR 690
++S+ +S +D + +N +V K+ KS T E + + + Q +
Sbjct: 1042 ENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETK 1101

Query: 691 --------RKVRVNNEQQEELKAEVAATEAPVQAKPSNDDKPARPKRQPRKPKAKAETVD 742
K +V E+ +E +V + +P Q + + +P+ +P + +
Sbjct: 1102 ETATVEKEEKAKVETEKTQE-VPKVTSQVSPKQEQS----ETVQPQAEPARENDPTVNIK 1156

Query: 743 VEAVEQIETAGSVESAVQAPEAPKTAVETKVETETT--AIETAIEATEDAVATEDN---A 797
+ TA + + A + + V T +E T N +
Sbjct: 1157 EPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESS 1216

Query: 798 REPREGQRRSRRS-PRHLRAAGQRRRRDEDAQQNEDGSS 835
+P+ RRS RS P ++ A + S+
Sbjct: 1217 NKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTST 1255


25  Shal_1782  Shal_1805  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias     %GCBias  Product
Shal_1782       0       17   -3.177614  6-phosphogluconate dehydrogenase
Shal_1783       1       19   -4.862476  cytidine deaminase
Shal_1784       5       22   -5.981356  exonuclease I
Shal_1785       7       29   -7.169434  integrase family protein
Shal_1786       5       28   -7.704038  hypothetical protein
Shal_1787       5       32   -7.752443  hypothetical protein
Shal_1788       6       33   -7.917336  restriction endonuclease-like protein
Shal_1789       6       35   -8.407718  hypothetical protein
Shal_1790       6       34   -8.605965  resolvase domain-containing protein
Shal_1791       5       32   -8.441524  hypothetical protein
Shal_1792       6       32   -8.249269  UBA/THIF-type NAD/FAD-binding protein
Shal_1793       5       30   -8.049099  hypothetical protein
Shal_1794       3       23   -6.218838  hypothetical protein
Shal_1795       4       20   -4.597853  hypothetical protein
Shal_1796       4       19   -3.021202  23S rRNA methyltransferase A
Shal_1797       3       18   -2.980688  cold-shock DNA-binding domain-containing
Shal_1798       2       13   -2.107532  hypothetical protein
Shal_1799       1       11   -2.382747  ABC transporter-like protein
Shal_1800       0       11   -2.135042  ABC-2 type transporter
Shal_1801       0       11   -1.803684  hypothetical protein
Shal_1802      -2       11   -1.935156  hypothetical protein
Shal_1803      -1       12   -1.690250  arginine decarboxylase
Shal_1804      -1       16   -2.552191  adenosylmethionine decarboxylase
Shal_1805      -2       12   -3.176302  agmatinase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_1791  PF07675  34     7e-04    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 34.3 bits (78), Expect = 7e-04
Identities = 36/137 (26%), Positives = 51/137 (37%), Gaps = 16/137 (11%)

Query: 119 SENLHIDFPIYRNNNGTL--ELAVGKANSNQ--DNREWAISDPKGLRSWIKDSSTYGNSA 174
+EN + D I R+N + ++ G+ + Q N K W S+ +
Sbjct: 304 TENGNYDVVITRSNYLPVIKQIQAGEPSPYQPVSNLTATAQGQKVTLKWDAPSAKKAEGS 363

Query: 175 ELKQQQFNRLVRYVKRWRDHKFTDDSVRSKVYSIGLTVMLKADFVWALDDDGYPDDLKAL 234
R VKR D F + V + V+L AD VW D+ GY L A
Sbjct: 364 -----------REVKRIGDGLFVTIEPANDVRANEAKVVLAADNVWG-DNTGYQFLLDAD 411

Query: 235 RDTISSVISTNYFRFIG 251
+T SVI F G
Sbjct: 412 HNTFGSVIPATGPLFTG 428


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_1799  PF05272  32     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.4 bits (73), Expect = 0.002
Identities = 11/44 (25%), Positives = 18/44 (40%)

Query: 31 PIALVGPNGAGKTTLFSLLCGYITPTSGEVSLLGHKPGSPELLG 74
+ L G G GK+TL + L G + + K ++ G
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAG 641


26  Shal_1904  Shal_1909  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_1904  -1      20       -3.819749  thiazole biosynthesis family protein
Shal_1905  -1      18       -4.979907  thiamine biosynthesis protein ThiH
Shal_1906  -2      17       -5.335547  hypothetical protein
Shal_1907  -2      15       -4.285397  response regulator receiver protein
Shal_1908  -2      12       -3.939834  methyl-accepting chemotaxis protein
Shal_1909  -2      12       -3.680656  histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_1907  HTHFIS  79     1e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.1 bits (195), Expect = 1e-20
Identities = 31/111 (27%), Positives = 49/111 (44%), Gaps = 5/111 (4%)

Query: 1 MKKARILVVDDDPVCSSLLLSILGD-DYQVTTVNTGCDVADIANIQRPDVIFLDIMMPGK 59
M A ILV DDD ++L L Y V + + D++ D++MP +
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 60 NGYQVLKDLKQ-DPQTSKLPVIIVSALSEQSDENLALRIGADGYITKPIVP 109
N + +L +K+ P LPV+++SA + A GA Y+ KP
Sbjct: 61 NAFDLLPRIKKARPD---LPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDL 108


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_1909  HTHFIS  60     2e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 60.2 bits (146), Expect = 2e-11
Identities = 23/119 (19%), Positives = 51/119 (42%), Gaps = 7/119 (5%)

Query: 698 VLVVEDNKVNQQVVSINLNKLNLPYLIANDGREALEHYKRHIGNFSVILMDCMMPVMDGF 757
+LV +D+ + V++ L++ I ++ G+ +++ D +MP + F
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAA--GDGDLVVTDVVMPDENAF 63

Query: 758 EATRAIREFEKEEDAKRVQIIALTASILDDDIQKCFDSGMDDYLPKPFKREVLVDKLAK 816
+ I+ + + ++ ++A K + G DYLPKPF L+ + +
Sbjct: 64 DLLPRIK-----KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


27  Shal_1952  Shal_1968  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_1952  -2      15       -4.136059  keto-hydroxyglutarate-aldolase/keto-deoxy-
Shal_1953  0       12       -2.016983  ecotin
Shal_1954  0       13       -1.580920  cation diffusion facilitator family transporter
Shal_1955  1       13       -0.692201  MarR family transcriptional regulator
Shal_1956  -1      11       0.150585   sulfatase
Shal_1957  0       14       2.895622   diguanylate cyclase
Shal_1958  -1      16       4.257717   N-acetyltransferase GCN5
Shal_1960  0       19       4.565410   GntR family transcriptional regulator
Shal_1961  0       17       4.231129   2-methylisocitrate lyase
Shal_1962  0       16       3.417613   2-methylcitrate synthase/citrate synthase II
Shal_1963  -1      12       1.086134   aconitate hydratase
Shal_1964  0       14       -1.336452  putative AcnD-accessory protein PrpF
Shal_1965  -1      19       -2.524338  hypothetical protein
Shal_1966  -1      19       -2.585725  type 11 methyltransferase
Shal_1967  -1      19       -2.845096  HNH nuclease
Shal_1968  1       18       -3.073155  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
Shal_1968  MICOLLPTASE  41     2e-05    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 41.2 bits (96), Expect = 2e-05
Identities = 29/119 (24%), Positives = 45/119 (37%), Gaps = 17/119 (14%)

Query: 493 YQFQMTSQASFSPPFNASYDYKIEF-------ILVSATDGLPIANAGEDQTARLGDTIIL 545
Y+ + N +Y Y + F + P A D + + + I
Sbjct: 734 YKTVTAYFVNHKVDGNGNYVYDVVFHGMNTDTNTDVHVNKEPKAVIKSDSSVIVEEEINF 793

Query: 546 DGTSSFDDNTPTESLGYNWSFSSLPSGSTVSLTQANSANPSFVIDAFGEYTVELIVTDN 604
DGT S D++ ++ Y W F G +N A + + GEY V+L VTDN
Sbjct: 794 DGTESKDEDGEIKA--YEWDFGD---GEK-----SNEAKATHKYNKTGEYEVKLTVTDN 842


28  Shal_1978  Shal_1994  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_1978  3       17       -1.087215  hypothetical protein
Shal_1979  3       16       -1.433136  peptidase S8/S53 subtilisin kexin sedolisin
Shal_1980  3       17       -2.606081  two component LuxR family transcriptional
Shal_1981  3       16       -2.579943  histidine kinase
Shal_1982  3       18       -3.065902  hypothetical protein
Shal_1983  3       18       -3.094785  hypothetical protein
Shal_1984  0       23       -4.783879  glutathione S-transferase domain-containing
Shal_1985  -1      20       -2.620724  NAD(P)H dehydrogenase (quinone)
Shal_1986  -1      16       -1.198464  phasin family protein
Shal_1987  -1      16       -1.333001  poly(R)-hydroxyalkanoic acid synthase
Shal_1988  -1      14       -0.843875  dehydratase
Shal_1989  0       14       -1.383872  signal transduction protein
Shal_1990  0       13       -1.286680  hypothetical protein
Shal_1991  -1      16       -2.441713  glutathione synthase
Shal_1992  1       17       -3.189129  hypothetical protein
Shal_1993  -1      15       -2.604000  hypothetical protein
Shal_1994  0       16       -3.015716  toxin secretion, membrane fusion protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
Shal_1979  SUBTILISIN  140    3e-39    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 140 bits (354), Expect = 3e-39
Identities = 74/214 (34%), Positives = 100/214 (46%), Gaps = 29/214 (13%)

Query: 124 AGMKVCVIDSGLDRSNPDFEWSSITG----DNDIGTGNWDENGGPHGTHVAGTIGAADND 179
G+KV V+D+G D +PD + I G D+D G ++ HGTHVAGTI A +N+
Sbjct: 41 RGVKVAVLDTGCDADHPDLKARIIGGRNFTDDDEGDPEIFKDYNGHGTHVAGTIAATENE 100

Query: 180 IGVVGMAPGVEMHIIKVFNAQGWGYSSDLAHAANLCNAAGANIISMSLGGGGANSTEENA 239
GVVG+AP ++ IIKV N QG G + +IISMSLGG A
Sbjct: 101 NGVVGVAPEADLLIIKVLNKQGSGQYDWIIQGIYYAIEQKVDIISMSLGGPEDVPELHEA 160

Query: 240 FKTFSSSGGLVVAAAGNDG-----NNVRSYPAGYTSVMMVGANDADNNIAAFSQFPACST 294
K +S LV+ AAGN+G + YP Y V+ VGA + D + + FS
Sbjct: 161 VKKAVASQILVMCAAGNEGDGDDRTDELGYPGCYNEVISVGAINFDRHASEFSNS----- 215

Query: 295 FSTGRGKKNKTEIIDGSCVEITAGGVSTLSTYPQ 328
+ V++ A G LST P
Sbjct: 216 ---------------NNEVDLVAPGEDILSTVPG 234



Score = 48.3 bits (115), Expect = 4e-08
Identities = 18/72 (25%), Positives = 26/72 (36%), Gaps = 5/72 (6%)

Query: 448 SDYGFMSGTSMATPAISGLAALLWSN-----HSGCTGNDIREALKASAYDAGESGRDNYF 502
Y SGTSMATP ++G AL+ T ++ L G S +
Sbjct: 235 GKYATFSGTSMATPHVAGALALIKQLANASFERDLTEPELYAQLIKRTIPLGNSPKMEGN 294

Query: 503 GYGIAKAADASA 514
G A + +
Sbjct: 295 GLLYLTAVEELS 306


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_1980  HTHFIS  63     9e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.9 bits (153), Expect = 9e-14
Identities = 22/122 (18%), Positives = 47/122 (38%), Gaps = 6/122 (4%)

Query: 4 KITIILVDDHILVRAGIRSLLESIANVEVIKESGDGIEALSLLRTHQPTLLILDISLPGL 63
TI++ DD +R + L A +V + + L++ D+ +P
Sbjct: 3 GATILVADDDAAIRTVLNQALS-RAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 64 NGLEVARSVTKMKTKTKILMLSMHSDVEYVAKALTIGSHGYLMK----ESAVEELETAIS 119
N ++ + K + +L++S + KA G++ YL K + + A++
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 TL 121

Sbjct: 121 EP 122


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_1981  PF06580  39     6e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.5 bits (92), Expect = 6e-05
Identities = 41/304 (13%), Positives = 97/304 (31%), Gaps = 69/304 (22%)

Query: 778 WSDPAKLNVKIVPPWWMKTSVRIASITIIIMMFIAFHKMRLARWQKHTARLQALQAQKQV 837
W A +N K V I ++ ++ M+ + A + +
Sbjct: 99 WRLLAFINTKPVAFTLPLALSIIFNVVVVTFMWSLLYFGWHFFKNYKQAEIDQWKMASMA 158

Query: 838 ALDEVENREQQLST--AYQGLRSLASQIQNAKEEERKSISRELHDQFGQTLTATKINLQL 895
++ + Q++ + L ++ + I + R+ ++ L + +L +
Sbjct: 159 QEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTS-LSELMRYSLRYSNARQVS 217

Query: 896 YKKFNYQEQERIESAISITHTMIQQMR-----SISFNLRPALLDDEGLVAGVKLQLEKMS 950
E ++S + + ++ + PA++D +Q+ M
Sbjct: 218 LA----DELTVVDSYLQL-----ASIQFEDRLQFENQINPAIMD---------VQVPPM- 258

Query: 951 ALIAKPIKLSVSRDFPMVNQGITINVFRIIQESVNNAIRHA-----NASQIIVSLSYHSE 1005
++Q V N I+H +I++ + +
Sbjct: 259 ----------------------------LVQTLVENGIKHGIAQLPQGGKILLKGTKDNG 290

Query: 1006 QLFIEIKDDGKGFDVDKVKENTFSGVHLGLLGMEERVHSLSGK---LLLHSSISNGSIIK 1062
+ +E+++ G +NT GL + ER+ L G + L + +
Sbjct: 291 TVTLEVENTGSLA-----LKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM- 344

Query: 1063 VVIP 1066
V+IP
Sbjct: 345 VLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1990  ACRIFLAVINRP  29     0.029    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 29.0 bits (65), Expect = 0.029
Identities = 12/41 (29%), Positives = 23/41 (56%), Gaps = 2/41 (4%)

Query: 248 SFPYLIMLAALLMPTCLRQNYLLFSWLITMVILLVVDLAML 288
P L+ ++ +++ CL Y SW I + ++LVV L ++
Sbjct: 871 QAPALVAISFVVVFLCLAALYE--SWSIPVSVMLVVPLGIV 909


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_1994  RTXTOXIND  136    7e-38    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 136 bits (345), Expect = 7e-38
Identities = 82/427 (19%), Positives = 155/427 (36%), Gaps = 54/427 (12%)

Query: 25 QPISIYVSATVLLVIIVAIVFFL--SLSHYARKETVRGYLVPDKGLIKTYANRSGNIDVL 82
P+S ++ ++ F+ L T G L + + + +
Sbjct: 51 TPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEI 110

Query: 83 HVSEGDIIKKGSPLATIV--------------LSRSMLSGEEL-SESLIDELKLQLTLLT 127
V EG+ ++KG L + L ++ L S EL L
Sbjct: 111 IVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKL 170

Query: 128 TEQKVNQSLLAKETQRLKNSITDN-----KQALNVSVNLEALLTEKLALQVK-------- 174
++ Q++ +E RL + I + Q +NL+ E+L + +
Sbjct: 171 PDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLS 230

Query: 175 -----QQTQHQKLFDDGYLSTLDYQSQQQKLIAVRQEIENLKSNQVQIRSQLNDALSELD 229
+ L ++ Q+ K + E+ KS QI S++ A E
Sbjct: 231 RVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQ 290

Query: 230 ILPSQFVLKDGDIERRRSELKRQIDETENSY--------RFVIRAAEAGTVASIQI-VEG 280
++ F +I + + I VIRA + V +++ EG
Sbjct: 291 LVTQLF---KNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEG 347

Query: 281 EFIATNRPLMSLIPQGALLVAELLLPTRSAGFVKSGDEARLRFDAIPYQRFGFLESRVSR 340
+ T LM ++P+ L L+ + GF+ G A ++ +A PY R+G+L +V
Sbjct: 348 GVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKN 407

Query: 341 IDKALLLDGEAKVPVALSEPVYRIRTQLSKQDMLAYSDSFPLKSGMLLEADIVLDRRSLL 400
I+ + D V+ + + + + + + PL SGM + A+I RS++
Sbjct: 408 INLDAIEDQR-------LGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRSVI 460

Query: 401 DWLLDPI 407
+LL P+
Sbjct: 461 SYLLSPL 467


29  Shal_2041  Shal_2047  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2041  2       23       -0.274616  phosphoserine aminotransferase
Shal_2042  3       26       -0.368950  aromatic amino acid aminotransferase
Shal_2043  2       24       -1.180130  3-phosphoshikimate 1-carboxyvinyltransferase
Shal_2044  4       29       -1.087810  cytidylate kinase
Shal_2045  3       30       -1.004284  30S ribosomal protein S1
Shal_2046  2       16       -1.686702  integration host factor subunit beta
Shal_2047  2       16       -2.142400  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2046  DNABINDINGHU  114    4e-37    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 114 bits (287), Expect = 4e-37
Identities = 32/89 (35%), Positives = 56/89 (62%), Gaps = 1/89 (1%)

Query: 2 TKSELIEKLATRQSQLSAKEVEAAIKEMLEQMADTLEAGDRIEIRGFGSFSLHYRAPRTG 61
K +LI K+A ++L+ K+ AA+ + ++ L G+++++ GFG+F + RA R G
Sbjct: 3 NKQDLIAKVA-EATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKG 61

Query: 62 RNPKTGTSVELEGKYVPHFKPGKELRERV 90
RNP+TG ++++ VP FK GK L++ V
Sbjct: 62 RNPQTGEEIKIKASKVPAFKAGKALKDAV 90


30  Shal_2059  Shal_2077  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2059  -1      15       -3.042523  DTW domain-containing protein
Shal_2060  -1      15       -3.117257  hypothetical protein
Shal_2061  0       13       -3.041342  two component LuxR family transcriptional
Shal_2062  1       13       -2.837525  histidine kinase
Shal_2063  3       14       -3.588698  ApbE family lipoprotein
Shal_2064  2       15       -3.220806  FMN-binding domain-containing protein
Shal_2065  2       18       -3.036970  porin
Shal_2066  2       21       -2.757898  pseudouridine synthase
Shal_2067  0       17       -2.369175  integrase family protein
Shal_2068  1       20       -3.204466  hypothetical protein
Shal_2069  1       22       -2.607451  hypothetical protein
Shal_2070  2       22       -2.545664  hypothetical protein
Shal_2071  2       23       -2.543425  short-chain dehydrogenase/reductase SDR
Shal_2072  3       22       -2.774889  alpha-L-glutamate ligase
Shal_2073  2       21       -3.482244  hypothetical protein
Shal_2076  2       19       -3.051884  glyceraldehyde-3-phosphate dehydrogenase
Shal_2077  0       19       -3.275227  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_2061  HTHFIS  95     3e-25    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 95.3 bits (237), Expect = 3e-25
Identities = 33/150 (22%), Positives = 69/150 (46%)

Query: 2 TNLYLVDDDQAVLDSLTWMLNGLGYQPKGFLSADSFLQQVDINNTGIAILDVQMPGMDGS 61
+ + DDD A+ L L+ GY + +A + + + + + + DV MP +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 ALLTHMSKAQSPIAVIMLSGHGSIAMAVQAIQKGALDFLEKPVDGDKLVRLLDQAGELTE 121
LL + KA+ + V+++S + A++A +KGA D+L KP D +L+ ++ +A +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 122 QNLRSKQERQALSDKLATLTPREHEVMEKV 151
+ ++ L + E+ +
Sbjct: 124 RRPSKLEDDSQDGMPLVGRSAAMQEIYRVL 153


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2065  ECOLNEIPORIN  66     2e-14    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 66.4 bits (162), Expect = 2e-14
Identities = 61/350 (17%), Positives = 112/350 (32%), Gaps = 24/350 (6%)

Query: 2 MKIVKLTLLVAAVLASPSVMADAY-KFYGRIDYSITHSDSGSATHSGKSGTVLENNWSRL 60
MK + L +AA+ + Y ++ S + + +G+ S ++GT + + S++
Sbjct: 1 MKKSLIALTLAALPVAAMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLGSKI 60

Query: 61 GIKGDAAINEDFTVFYQIEVGVNGATEGKQNNPFSARPTFLGIKHSTVGQLAAGRIDPVF 120
G KG + +Q+E + A R +F+G+K G+L GR++ V
Sbjct: 61 GFKGQEDLGNGLKAIWQVEQKASIAGTDSGW---GNRQSFIGLK-GGFGKLRVGRLNSVL 116

Query: 121 KMAKGTADAMDMYSLKHDRLFAGDKRWGDSLEYKTVKWNKLQFGASYILEDNYYGEDDVR 180
K A + S+ Y + ++ L Y D+
Sbjct: 117 KDTGDINPWDSKSDYLGVNKIAEPEARLISVRYDSPEFAGLSGSV------QYALNDNAG 170

Query: 181 RDNGN-YQVALTYGDKLFKSGDLYLAAAYTDGVEDIKGFRAVAQYKIDKLMLGSIYQSSE 239
R N Y Y + F + E++ + + ++Y S
Sbjct: 171 RHNSESYHAGFNYKNGGFFVQYGGAYKRHHQVQENVNIEKYQIHRLVSGYDNDALYASVA 230

Query: 240 IVNPNLDNWQQRDGDG--FIVSAKYQFDKLTLKAQYGQDDSGTGKIAGRVYDKLGAAATE 297
+ + ++ V+A + + + G Y+
Sbjct: 231 VQQQDAKLVEENYSHNSQTEVAATLAYRFGNVTPRVSYAHGFKGSFDATNYNND------ 284

Query: 298 VPEVSQWAIGAEYRLSKSTRVHTELGQFDVKQYSD-FDDTIMSVGFRLDF 346
Q +GAEY SK T G + F T VG R F
Sbjct: 285 ---YDQVVVGAEYDFSKRTSALVSAGWLQEGKGESKFVSTAGGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_2066  PF00577  29     0.025    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 28.7 bits (64), Expect = 0.025
Identities = 8/34 (23%), Positives = 15/34 (44%)

Query: 154 SSLDDDAPLLSTNEQAVDRSVSVTRVEFIPQTGR 187
++L D+ L + V ++ R EF + G
Sbjct: 763 NTLADNVDLDNAVANVVPTRGAIVRAEFKARVGI 796


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_2067  PF05272  30     0.018    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.018
Identities = 10/52 (19%), Positives = 20/52 (38%), Gaps = 5/52 (9%)

Query: 199 LRDKAILQLGLQGGFRRSELAEIRVEHISFL-REKLKVRVPYSKSNQQGQRE 249
+ +L FRR++ ++ +F K + R Y + Q R+
Sbjct: 639 IAGIVAYELSEMTAFRRADAEAVK----AFFSSRKDRYRGAYGRYVQDHPRQ 686


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
Shal_2070  NEISSPPORIN  26     0.035    Neisseria sp. porin signature.
		>NEISSPPORIN#Neisseria sp. porin signature.

Length = 348

Score = 26.5 bits (58), Expect = 0.035
Identities = 19/71 (26%), Positives = 33/71 (46%), Gaps = 16/71 (22%)

Query: 1 MRRYLLSLGLLLLPVSAMANIIVD---KTGVDEKDYVYDLHQCTEMSTQVKQKQTEGSAI 57
M++ L++L L LPV+AMA++ + K GV + V+ + S +
Sbjct: 1 MKKSLIALTLAALPVAAMADVTLYGAIKAGV-------------QTYRSVEHTDGKVSKV 47

Query: 58 GTAAKGAAIGS 68
T ++ A GS
Sbjct: 48 ETGSEIADFGS 58


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2071  DHBDHDRGNASE  73     9e-18    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 73.2 bits (179), Expect = 9e-18
Identities = 47/195 (24%), Positives = 80/195 (41%), Gaps = 16/195 (8%)

Query: 2 KHVVITGANRGIGLAFVGHYLTTGWQVTA--CCRNLNDAVALQHQQSKFTALKLVELDVT 59
K ITGA +GIG A + G + A + V + A DV
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAF-PADVR 67

Query: 60 IPSSIAELKRSLGSEA--IDLLINNAGYYGPKGVRFGTTD---INQWQAVLAVNTIAPLI 114
++I E+ + E ID+L+N AG +R G +W+A +VN+
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGV-----LRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 115 LTETLYPNLKIAQNCVLAFISSKVGSMQDNSSGGGYYYRSSKAALNSVVKSLSIDLIQDG 174
+ ++ + ++ + + S + S Y SSKAA K L ++L +
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAA---YASSKAAAVMFTKCLGLELAEYN 179

Query: 175 IKCVVLHPGWVQTEM 189
I+C ++ PG +T+M
Sbjct: 180 IRCNIVSPGSTETDM 194


31  Shal_2119  Shal_2142  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2119  2       21       -1.493270  short-chain dehydrogenase/reductase SDR
Shal_2120  1       20       -1.498859  hypothetical protein
Shal_2121  2       18       -1.078862  long-chain-acyl-CoA synthetase
Shal_2122  2       15       0.227293   nonspecific lipid-transfer protein
Shal_2123  3       17       -0.476470  hypothetical protein
Shal_2124  3       16       0.101634   glyoxalase/bleomycin resistance
Shal_2125  2       17       0.070304   coenzyme A transferase
Shal_2126  2       18       0.304924   coenzyme A transferase
Shal_2127  3       20       0.542353   enoyl-CoA hydratase
Shal_2128  2       21       1.010510   2-nitropropane dioxygenase
Shal_2129  2       20       1.549108   enoyl-CoA hydratase
Shal_2130  2       19       1.394116   acyl-CoA dehydrogenase domain-containing
Shal_2131  0       24       -2.015689  acyl-CoA dehydrogenase domain-containing
Shal_2132  -1      23       -2.444856  acetyl-CoA acetyltransferase
Shal_2133  0       21       -3.133877  short chain dehydrogenase
Shal_2134  -1      23       -3.656367  amidase
Shal_2135  -2      24       -4.769323  hypothetical protein
Shal_2136  -2      22       -3.510207  hypothetical protein
Shal_2137  0       18       -1.041004  hypothetical protein
Shal_2138  -1      17       -0.914761  hypothetical protein
Shal_2139  2       20       -0.877479  short chain dehydrogenase
Shal_2140  2       21       -1.365723  dehydratase
Shal_2141  2       21       -1.769555  acetyl-CoA acetyltransferase
Shal_2142  2       23       -2.814369  enoyl-CoA hydratase/isomerase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2119  DHBDHDRGNASE  137    1e-41    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 137 bits (346), Expect = 1e-41
Identities = 78/249 (31%), Positives = 121/249 (48%), Gaps = 12/249 (4%)

Query: 6 KVIFVTGAGQGMGLAMVKLFAEQGAKVAAIDINEAAAKQVAEQQSAESGSEVIGIGCDIS 65
K+ F+TGA QG+G A+ + A QGA +AA+D N ++V AE D+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE-ARHAEAFPADVR 67

Query: 66 QSSSVRDAIAEVVQRLGSVDVVINNAGIGSIDSFIDTPDENWHKVINVNLTGTFYCCREA 125
S+++ + A + + +G +D+++N AG+ DE W +VN TG F R
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 126 ARVMKEQGSGCIINISSTAVMSGD-GPSHYCASKAGVIGLTRSIAKELAASGIRVNTIVP 184
++ M ++ SG I+ + S + Y +SKA + T+ + ELA IR N + P
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 185 GPTNTPMMADI--PEEWTQQMIN--------AIPLGRMGEPEDIAKLASFIASEDASFIT 234
G T T M + E +Q+I IPL ++ +P DIA F+ S A IT
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHIT 247

Query: 235 GQNLAVNGG 243
NL V+GG
Sbjct: 248 MHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2133  DHBDHDRGNASE  67     3e-15    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 67.4 bits (164), Expect = 3e-15
Identities = 61/259 (23%), Positives = 99/259 (38%), Gaps = 19/259 (7%)

Query: 2 KACNSRTVIITGSGGGLGRAYALALAAEGANVVVNDIRAGAAAAVVDEILTQGGQAIANS 61
K + ITG+ G+G A A LA++GA++ D VV + + A A
Sbjct: 4 KGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 62 DDITRMDTATNIVDAALEAFGEVHVLINNAGVLADRMFISLSEADWDKVMQVHLKGHFCL 121
D+ I G + +L+N AGVL + SLS+ +W+ V+ G F
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 122 ANILGRRWRDLAKAGQPVDARIINTSSGAGLQGSIGQSNYSAAKGGIASLTLVQAAELGR 181
+ + + D I+ S + Y+++K T EL
Sbjct: 124 SRSVSKYMMDRRSGS------IVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAE 177

Query: 182 YGVTVNALAPAA-RTSMTQSAMPD-----VVKKPEDGSFDLW-------APENVAPLVVW 228
Y + N ++P + T M S D V K +F P ++A V++
Sbjct: 178 YNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLF 237

Query: 229 LSSSESKHISGQILESQGG 247
L S ++ HI+ L GG
Sbjct: 238 LVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2139  DHBDHDRGNASE  92     2e-24    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 92.0 bits (228), Expect = 2e-24
Identities = 66/245 (26%), Positives = 114/245 (46%), Gaps = 13/245 (5%)

Query: 14 LKGKSVLITAAAGAGIGFAAARRAAEEGCRALMISDIHPRRLDEAVTRLRVETGLEHIYG 73
++GK IT AA GIG A AR A +G + D +P +L++ V+ L+ E +
Sbjct: 6 IEGKIAFITGAA-QGIGEAVARTLASQGAH-IAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 74 QLCNVTDQQDVTTLIEQAELKLAGVDVLINNAGLGGQKNVVDMTDKEWSTVLDITLTGTF 133
+V D + + + E ++ +D+L+N AG+ + ++D+EW + TG F
Sbjct: 64 --ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVF 121

Query: 134 RMIREILPHMQARGSGVIVNNASVLGWRAQKEQAHYAAAKAGVMALTRCSALEAAEHGVR 193
R + +M R SG IV S + A YA++KA + T+C LE AE+ +R
Sbjct: 122 NASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIR 181

Query: 194 INAVSPSIALHDFLKKASSEE---------LLNQLASKEAFGRAAEVWEVANVMMFLASD 244
N VSP D ++E L + + A+ ++A+ ++FL S
Sbjct: 182 CNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSG 241

Query: 245 YSSYM 249
+ ++
Sbjct: 242 QAGHI 246


32  Shal_2168  Shal_2196  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2168  -1      16       -3.444313  RND efflux system outer membrane lipoprotein
Shal_2169  0       20       -5.307264  DNA-directed DNA polymerase
Shal_2170  1       28       -7.895824  peptidase S24/S26 domain-containing protein
Shal_2171  3       31       -8.584438  diguanylate cyclase
Shal_2172  4       31       -7.111121  hypothetical protein
Shal_2173  4       31       -7.321717  hypothetical protein
Shal_2174  5       30       -7.881138  hypothetical protein
Shal_2175  4       29       -7.940849  hypothetical protein
Shal_2176  3       28       -7.767638  helicase domain-containing protein
Shal_2177  1       27       -7.238772  hypothetical protein
Shal_2178  0       25       -6.574569  hypothetical protein
Shal_2179  0       25       -6.280024  hypothetical protein
Shal_2180  0       23       -5.624439  hypothetical protein
Shal_2181  1       22       -5.204702  type III restriction protein res subunit
Shal_2182  1       21       -4.686603  deoxyribonuclease I
Shal_2183  1       22       -5.528697  DNA-directed DNA polymerase
Shal_2184  1       25       -6.744250  peptidase S24/S26 domain-containing protein
Shal_2185  1       25       -6.839218  hypothetical protein
Shal_2186  2       28       -7.748707  FRG domain-containing protein
Shal_2187  2       27       -7.630757  hypothetical protein
Shal_2188  1       24       -7.269342  hypothetical protein
Shal_2189  1       19       -5.624524  hypothetical protein
Shal_2190  0       17       -5.067070  hypothetical protein
Shal_2191  1       17       -4.924324  hypothetical protein
Shal_2192  1       18       -2.894627  integrase family protein
Shal_2193  1       18       -1.777677  phage integrase
Shal_2194  2       19       -0.157573  integrase family protein
Shal_2195  6       29       -0.977753  hypothetical protein
Shal_2196  6       28       -0.993742  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2184  TYPE3OMBPROT  27     0.026    Type III secretion system outer membrane B protein ...
		>TYPE3OMBPROT#Type III secretion system outer membrane B protein family signature.

Length = 538

Score = 27.0 bits (59), Expect = 0.026
Identities = 11/36 (30%), Positives = 19/36 (52%), Gaps = 4/36 (11%)

Query: 66 HVDVRNQDIIVANYNGSFVC----KIIDKVNRLLLS 97
H+ + N++I V YNG +C + D + + LS
Sbjct: 174 HMKIGNKNIFVKEYNGKGICCASTRESDHIANMWLS 209


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
Shal_2191  MICOLLPTASE  31     0.024    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 30.8 bits (69), Expect = 0.024
Identities = 16/86 (18%), Positives = 35/86 (40%), Gaps = 13/86 (15%)

Query: 409 VKNRRKREDQFYTLKQIDDFLKQTITKS----KELSADMIYRVRDSSNDYSYDVEYEKLL 464
+++ + D +++D LK+ KS K ++A + D + +Y YDV + +
Sbjct: 703 GRSQGEENDWKDMNSKLNDILKELSKKSWNGYKTVTAYFVNHKVDGNGNYVYDVVFHGMN 762

Query: 465 FIT---------PVGSTALNYSGMIK 481
T P + S +++
Sbjct: 763 TDTNTDVHVNKEPKAVIKSDSSVIVE 788


33  Shal_2267  Shal_2289  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2267  2       14       -0.542460  hydrogenase nickel incorporation protein HypB
Shal_2268  0       15       -0.686408  hydrogenase assembly chaperone HypC/HupF
Shal_2269  0       15       -0.818520  hydrogenase expression/formation protein HypD
Shal_2270  1       17       -1.667606  hydrogenase expression/formation protein HypE
Shal_2271  2       20       -3.640567  hydrogenase nickel incorporation protein HypA
Shal_2272  2       21       -4.092375  hypothetical protein
Shal_2273  1       22       -4.337228  type II secretion system protein
Shal_2274  3       21       -3.654509  type II secretion system protein
Shal_2275  3       22       -3.860494  type II secretion system protein E
Shal_2276  2       23       -3.999219  Flp pilus assembly protein ATPase CpaE-like
Shal_2277  3       25       -3.083921  TadE family protein
Shal_2278  3       23       -2.943523  TadE family protein
Shal_2279  3       24       -3.313541  hypothetical protein
Shal_2280  2       26       -4.195855  hypothetical protein
Shal_2281  2       26       -4.649614  type II and III secretion system protein
Shal_2282  4       28       -4.975704  SAF domain-containing protein
Shal_2283  5       24       -5.160248  ATPase AAA
Shal_2284  4       20       -4.907741  peptidase A24A prepilin type IV
Shal_2285  3       18       -4.624654  hypothetical protein
Shal_2286  2       18       -4.477472  transposase and inactivated derivative
Shal_2287  2       18       -4.072067  peptidoglycan-binding domain-containing protein
Shal_2288  1       18       -3.330709  sulfatase
Shal_2289  0       18       -3.206347  glycosyl transferase family protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2272  SYCDCHAPRONE  32     0.002    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 32.2 bits (73), Expect = 0.002
Identities = 16/106 (15%), Positives = 36/106 (33%), Gaps = 4/106 (3%)

Query: 206 ILELQPHSSILITNLGYSYYLTAELTMAEKYLRQAIREDSSFNRAWTNLGLVYVRKGFYK 265
+ E+ + + +L ++ Y + + A K + D +R + LG G Y
Sbjct: 28 LNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYD 87

Query: 266 RAL----AAFEQTMSPADALNDLGYFLMLEGQYEKAIGLFERAIDM 307
A+ + L+ +G+ +A A ++
Sbjct: 88 LAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFLAQEL 133


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_2275  HTHFIS  29     0.046    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.046
Identities = 11/59 (18%), Positives = 27/59 (45%)

Query: 170 ALDGPSLSIRRFAVDKLNARQLIDIGSVTESMIELLKGSVKGKLNILVSGGTGSGKTTL 228
AL P + D + L+ + + + +L ++ L ++++G +G+GK +
Sbjct: 118 ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELV 176


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2281  BCTERIALGSPD  146    5e-40    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 146 bits (370), Expect = 5e-40
Identities = 58/256 (22%), Positives = 112/256 (43%), Gaps = 23/256 (8%)

Query: 176 QVMLEVVVAEVQRNVARQFDSKFLI--------FNQGSNLSGGVIGGGGGFDPGSIGG-- 225
QV++E ++AEVQ ++ N G +S + G G++
Sbjct: 346 QVLVEAIIAEVQDADGLNLGIQWANKNAGMTQFTNSGLPISTAIAGANQYNKDGTVSSSL 405

Query: 226 VDGKGLFAQYINGDLMMNFALDVA--KQNGLAKVLAEPNVTAMSGQSAEFLSGGEFPIPV 283
F G N+A+ + + +LA P++ + A F G E P+
Sbjct: 406 ASALSSFNGIAAGFYQGNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLT 465

Query: 284 P-----GENGNTTIEYRDYGVGVKFVPTVLDSGQINLNLNVVVSEISTANGFSISGNTST 338
G+N T+E + G+ +K P + + + L + VS ++ A +++
Sbjct: 466 GSQTTSGDNIFNTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSSVADAAS------STS 519

Query: 339 SLVVPSLVKRSTATTVELADGQTIAISGLISDTLRENIDKLPGLGDVPVLGQLFTSKSFQ 398
S + + R+ V + G+T+ + GL+ ++ + DK+P LGD+PV+G LF S S +
Sbjct: 520 SDLGATFNTRTVNNAVLVGSGETVVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKK 579

Query: 399 SGQSELVILVTPRLVR 414
+ L++ + P ++R
Sbjct: 580 VSKRNLMLFIRPTVIR 595


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_2283  HTHFIS  30     0.019    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.2 bits (68), Expect = 0.019
Identities = 9/34 (26%), Positives = 17/34 (50%)

Query: 159 KDLLFKIGPAMNSSRPVLIYGPPGTGKSYLCRHL 192
+++ + M + ++I G GTGK + R L
Sbjct: 147 QEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2284  PREPILNPTASE  33     3e-04    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 33.2 bits (76), Expect = 3e-04
Identities = 37/184 (20%), Positives = 69/184 (37%), Gaps = 39/184 (21%)

Query: 7 LLQVVIAGVFFILAIVVDLTREKIPNWLCLFAIFCGFLVN---GYFAHLSGLMISFIGFS 63
L ++ + +DL + +P+ L L ++ G L N G+ + ++ + G+
Sbjct: 134 TLAALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDAVIGAMAGYL 193

Query: 64 LAFIILFPTFIFKI------LGAGDIKLMMGIGALMGPQLLAWSIAYGIIAGAVTSILLI 117
+ + + + FK+ +G GD KL+ +GA +G Q L + + GA I LI
Sbjct: 194 VLWSL---YWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLI 250

Query: 118 IWKSGFSGCIKTFKRYWDCFYLRTYFKPEEGEAAGQRVPYAPALAIGWVWACSLNQDITH 177
+ + + +P+ P LAI A IT
Sbjct: 251 LLR---------------------------NHHQSKPIPFGPYLAIAGWIALLWGDSITR 283

Query: 178 LYTS 181
Y +
Sbjct: 284 WYLT 287


34  Shal_2332  Shal_2339  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2332  0       19       -3.368945  DNA replication initiation factor
Shal_2333  0       19       -3.726309  hypothetical protein
Shal_2334  0       19       -4.008864  uracil-xanthine permease
Shal_2335  1       19       -4.445842  glutaredoxin-like protein
Shal_2336  1       18       -4.273572  superoxide dismutase
Shal_2337  0       18       -3.851018  putative serine protein kinase PrkA
Shal_2338  -1      18       -3.622695  hypothetical protein
Shal_2339  -2      19       -3.540975  SpoVR family protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_2338  PF07201  31     0.011    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 30.6 bits (69), Expect = 0.011
Identities = 17/93 (18%), Positives = 30/93 (32%), Gaps = 14/93 (15%)

Query: 122 EYLELLFEDLELPN--------LQNNRLNKLVEYQVYRAGFTNDGVPANINIVRSLRSSL 173
+YL +LE L N+ L + + Y G + + ++ LR +L
Sbjct: 89 QYLSK-VPELEQKQNVSELLSLLSNSPNISLSQLKAYLEGKSEEPSEQFK-MLCGLRDAL 146

Query: 174 ARRTAMTSTKRSKLKALEQELNQLENTPGSEAE 206
R + +EQ L + G
Sbjct: 147 KGRPELAHL----SHLVEQALVSMAEEQGETIV 175


35  Shal_2410  Shal_2422  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2410  2       16       -2.347075  4-hydroxyphenylpyruvate dioxygenase
Shal_2411  1       18       -3.342066  PAS/PAC sensor-containing diguanylate cyclase
Shal_2412  0       13       -0.468828  hypothetical protein
Shal_2413  0       14       -1.122324  transposase
Shal_2414  -1      13       -0.844098  hypothetical protein
Shal_2415  1       20       1.743413   4-hydroxyphenylpyruvate dioxygenase
Shal_2416  1       20       2.282547   hypothetical protein
Shal_2417  2       20       2.966756   lysine exporter protein LysE/YggA
Shal_2418  2       18       2.375222   hypothetical protein
Shal_2419  2       17       3.017666   hypothetical protein
Shal_2420  1       17       2.977314   gamma-glutamyltransferase
Shal_2421  1       17       2.831505   ribonucleoside hydrolase 1
Shal_2422  2       18       2.908147   major facilitator transporter
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_2414  PF08280  28     0.046    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 28.3 bits (63), Expect = 0.046
Identities = 20/96 (20%), Positives = 40/96 (41%), Gaps = 15/96 (15%)

Query: 219 KSLLPFVGDEHKDMPKGLMFNAKEYL----QLVDDTGRIQRNDKRGAISQSSQQLLYRLN 274
+LLP + ++ + K LMF +K +L + +T +G +Q+L L
Sbjct: 333 ITLLPNLKEQKASLVKALMFFSKSFLFNLQHFIPETNLFVSPYYKG-----NQKLYTSLK 387

Query: 275 IPAENWLKMTAEFTKIFKGAVGNTQELSTYCEHLER 310
+ E W+ K+ N + +C ++E+
Sbjct: 388 LIVEEWM------AKLPGKRYLNHKHFHLFCHYVEQ 417


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_2422  TCRTETA  58     3e-11    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 57.5 bits (139), Expect = 3e-11
Identities = 76/387 (19%), Positives = 134/387 (34%), Gaps = 26/387 (6%)

Query: 2 NPKVYLLALITFATGMDENIAGGIVPFIAEDLGVSISAA---GQLTTVFSLTFALAAPVL 58
N + ++ + + ++P + DL S G L +++L APVL
Sbjct: 4 NRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVL 63

Query: 59 LFATAKFNRLRLLQVALFTFCLGNLFSALSPSYAFLFGMRIISAASCALIVVLVTTIATE 118
+ +F R +L V+L + A +P L+ RI++ + A V IA +
Sbjct: 64 GALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIA-D 122

Query: 119 MVEAEFKGRAIGIIFMGISGSLVLGVPVGLLFCNWLGWRAPFAFIALMALLLCIIVPLML 178
+ + + + R G + +V G +G APF A + L + +L
Sbjct: 123 ITDGDERARHFGFMSACFGFGMVAGPVLG-GLMGGFSPHAPFFAAAALNGLNFLTGCFLL 181

Query: 179 PNVSSNGQRIPLS----------KYAEHLKNPKLLTAQLVSILMIGGHFVLFAYLAPYLQ 228
P S G+R PL ++A + L A + ++G A + +
Sbjct: 182 PE-SHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPA--ALWVIFGE 238

Query: 229 ATIGASKSELTFYYFLFGAA-AITGGYMGGWLSDKLGHITAII--LIPSIFILFLGAIPF 285
+ + FG ++ + G ++ +LG A++ +I L A
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLA--- 295

Query: 286 TLNSLVLFIPCMMLWGGLSWTISPIVQNYL-IQSDHANAGVSIAVNNTSMHLGVALGSLL 344
+ P M+L P +Q L Q D G L +G LL
Sbjct: 296 FATRGWMAFPIMVLLASGG-IGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLL 354

Query: 345 GGVVIKYSSVNWTPAIGVFVVVLSLLC 371
+ S W + L LLC
Sbjct: 355 FTAIYAASITTWNGWAWIAGAALYLLC 381


36  Shal_2436  Shal_2442  Y  N  N  Genomic Island
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2436  3       16       -1.460480  hypothetical protein
Shal_2437  3       15       -1.502887  hypothetical protein
Shal_2438  3       19       -2.921496  hypothetical protein
Shal_2439  3       20       -1.242606  hypothetical protein
Shal_2440  3       18       -0.802126  hypothetical protein
Shal_2441  4       18       -0.441735  hypothetical protein
Shal_2442  3       18       0.231515   hypothetical protein
37  Shal_2660  Shal_2674  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2660  0       13       -3.180501  chorismate synthase
Shal_2661  -1      17       -4.142975  N5-glutamine S-adenosyl-L-methionine-dependent
Shal_2662  0       18       -4.407756  hypothetical protein
Shal_2663  0       19       -4.614319  phosphohistidine phosphatase SixA
Shal_2664  0       17       -3.769584  peptidase M16 domain-containing protein
Shal_2665  -1      19       -3.161176  PAS/PAC sensor-containing diguanylate
Shal_2666  0       25       0.048495   hypothetical protein
Shal_2668  0       11       1.628513   hypothetical protein
Shal_2669  0       12       1.888586   hypothetical protein
Shal_2670  0       13       1.941567   multifunctional fatty acid oxidation complex
Shal_2671  2       15       1.726889   3-ketoacyl-CoA thiolase
Shal_2672  2       16       1.672861   ATPase
Shal_2673  3       15       1.108491   hypothetical protein
Shal_2674  3       16       0.603844   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_2672  HTHFIS  32     0.003    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.7 bits (72), Expect = 0.003
Identities = 34/131 (25%), Positives = 50/131 (38%), Gaps = 17/131 (12%)

Query: 34 DGHLLVEGPPGLAKT---RAVKALCDGVEGDFHRIQ---FTPDLLPADLTG------TDI 81
D L++ G G K RA+ G F I DL+ ++L G T
Sbjct: 160 DLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGA 219

Query: 82 YRAQTATFEFEAGPIFHNLILADEINRAPAKVQSALLEAMAEGQVT-VGKHSYQLPELFL 140
T FE G + DEI P Q+ LL + +G+ T VG + ++ +
Sbjct: 220 QTRSTGRFEQAEG----GTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRI 275

Query: 141 VMATQNPLENE 151
V AT L+
Sbjct: 276 VAATNKDLKQS 286


38  Shal_2683  Shal_2707  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2683  2       15       -0.731186  RND family efflux transporter MFP subunit
Shal_2684  2       14       -0.596939  acriflavin resistance protein
Shal_2685  3       17       -2.360132  hypothetical protein
Shal_2686  2       17       -1.910319  rhodanese domain-containing protein
Shal_2687  2       16       -1.189024  hypothetical protein
Shal_2688  1       15       -0.878242  hypothetical protein
Shal_2689  0       15       -0.069318  GTP cyclohydrolase II
Shal_2690  1       15       -0.368458  hypothetical protein
Shal_2691  0       15       -0.474609  anaerobic ribonucleoside-triphosphate reductase
Shal_2692  0       15       -0.605464  anaerobic ribonucleoside triphosphate reductase
Shal_2693  1       11       -0.744284  hypothetical protein
Shal_2694  0       11       0.686881   patatin
Shal_2695  2       12       1.016845   DEAD/DEAH box helicase
Shal_2696  2       12       0.857821   hypothetical protein
Shal_2697  2       12       1.038829   hypothetical protein
Shal_2698  2       14       1.271081   hypothetical protein
Shal_2699  2       14       1.312212   hypothetical protein
Shal_2700  1       13       0.788018   hypothetical protein
Shal_2701  1       13       -0.889391  60 kDa inner membrane insertion protein
Shal_2702  1       15       -0.734683  peptidase M23B
Shal_2703  1       14       -0.987897  SMC domain-containing protein
Shal_2704  1       13       -1.802398  nuclease SbcCD subunit D
Shal_2705  2       15       -2.320242  valine--pyruvate transaminase
Shal_2706  3       16       -2.530844  hypothetical protein
Shal_2707  2       15       -1.061286  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_2683  RTXTOXIND  47     6e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 47.1 bits (112), Expect = 6e-08
Identities = 25/155 (16%), Positives = 48/155 (30%), Gaps = 44/155 (28%)

Query: 79 AALLEITNKEQGAQLAAAEADYAKASALNTEAQLQLERYQKLFPKGAISK---------- 128
E+ ++ A+ A + L+ + +L+ + L K AI+K
Sbjct: 202 KYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKY 261

Query: 129 ----GEMDEAEANAKSTKQAVSVAKARITQATES-------------------------- 158
E+ ++ + + + AK T+
Sbjct: 262 VEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKN 321

Query: 159 ---LKYTVVSAPFSGVVTQRHV-EQGETVTPGQPL 189
+ +V+ AP S V Q V +G VT + L
Sbjct: 322 EERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL 356



Score = 30.2 bits (68), Expect = 0.012
Identities = 16/82 (19%), Positives = 30/82 (36%), Gaps = 5/82 (6%)

Query: 163 VVSAPFSGVVTQRHVEQGETVTPGQPLYSGYSLKQMRAVTQVPQRYIQAL-----KQQPK 217
+ + +V + V++GE+V G L +L + +QA Q
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 218 FKLTLNNGQQLDVSNPTLFSFV 239
+ LN +L + + F V
Sbjct: 158 RSIELNKLPELKLPDEPYFQNV 179


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2684  ACRIFLAVINRP  505    e-164    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 505 bits (1301), Expect = e-164
Identities = 227/1066 (21%), Positives = 438/1066 (41%), Gaps = 72/1066 (6%)

Query: 25 LLALLGLLFGLFAVLVTPKEEEPQIDVTFADVFVPFPGATPREVESLVTLPTEQIISEIK 84
+LA++ ++ G A+L P + P I V +PGA + V+ VT EQ ++ I
Sbjct: 14 VLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQVIEQNMNGID 73

Query: 85 GIDTLYSFSQPDGALFIVV-FEVGIERNDAIVKLYNQIYSNLDKLPLAAGVGEPQIKPRG 143
+ + S S G++ I + F+ G + + A V++ N++ LP V + I
Sbjct: 74 NLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLP--QEVQQQGISVEK 131

Query: 144 IDDVPIVSLTLWSKQPQVSAEQLTHIAN-GLETEIKRIPGTREISSVGEHQLILNVRIDP 202
++ S P + + ++ ++ + R+ G ++ G Q + + +D
Sbjct: 132 SSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA-QYAMRIWLDA 190

Query: 203 LKLNYYGISFDEINQSLSSNN------HISMPASLVQGNQEIKVQAGQFLQNIEDVKQLV 256
LN Y ++ ++ L N + +L + A +N E+ ++
Sbjct: 191 DLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNPEEFGKVT 250

Query: 257 VAVKQDASGQASPVYLADLAQISLKSDIPVKSAWHGDKQEIYPAVTIAIGKQPGKNAVDV 316
+ V D S V L D+A++ L + A K PA + I G NA+D
Sbjct: 251 LRVNSDGS----VVRLKDVARVELGGENYNVIARINGK----PAAGLGIKLATGANALDT 302

Query: 317 ANAILERVNQVENVLIPDGVEVSISRNYGETAGDKANTLMLKLVFATAAVVILVLITMG- 375
A AI ++ +++ P G++V + + ++ L A V +++ + +
Sbjct: 303 AKAIKAKLAELQPFF-PQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 376 FREAIVVGIAIIITLALTLFASWAWGFTLNRISLFALIFSIGILVDDAIVVVENIHRHMA 435
R ++ IA+ + L T A+G+++N +++F ++ +IG+LVDDAIVVVEN+ R M
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 436 MGKRSFSELIPIAVDEVGGPTILATFTVIAALLPMAFVSGLMGPYMSPIPINASMGMLIS 495
K E ++ ++ G + + A +PMAF G G I M +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 496 LGVAFMITPWLSRKFLKYPQTPHNEKQHSEQHLNQGIMFKLFNHLIGPFVQGKNARKARL 555
+ VA ++TP L LK P + + H + G F+H + +
Sbjct: 482 VLVALILTPALCATLLK----PVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGST 537

Query: 556 GLAAGIFILIAGAIALPMAKLVVLKMLPFDNKSEFQVMVDLPEGTPVEQTQKALKELGRY 615
G I+ LI + + + + LP +++ F M+ LP G E+TQK L ++ Y
Sbjct: 538 GRYLLIYALIVAGMVV-LFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDY 596

Query: 616 LNQVEEVESYQLYAGTSAPMNFNGLVRHYFLRQTQELGDIQVNLVDKKHRDRDSHSIALS 675
+ E+ ++ T +F+G Q Q G V+L + R+ D +S
Sbjct: 597 YLKNEKANVESVF--TVNGFSFSG--------QAQNAGMAFVSLKPWEERNGDENSAEAV 646

Query: 676 VREPLQAIGEKYNANIKVVEVPPGPPVWSPILAEVYGPS-EAIREQAANALLDVLKQ--- 731
+ +G+ + V P + A + +AL Q
Sbjct: 647 IHRAKMELGKIRDGF---VIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLG 703

Query: 732 -----TKDVVDIDIYLPESAAKWQVIIDRSKASLLGVPYSKIVDLVATAVGGKDISYLYK 786
+V + E A++++ +D+ KA LGV S I ++TA+GG ++
Sbjct: 704 MAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFID 763

Query: 787 ANQKHPVPINIQLQEGAKVDLDQVLNLTLRGNNNQAVALSELVRIDKGNIDAPIIHKNMI 846
+ + +Q ++ + V L +R N + V S + N +
Sbjct: 764 --RGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGL 821

Query: 847 PMIMVVADMAGPLDSPLYGMFDMVGKINDKQGLGFDQHYISQPNGLDSVAILWDGEWKIT 906
P + + + A P D + + + P G + W G
Sbjct: 822 PSMEIQGEAA-----PGTSSGDAMALMENLAS--------KLPAG---IGYDWTGMSYQE 865

Query: 907 YETFRDMGIAYAVGMIAIYLLVVAQFRSYIVPLIIMAPIPLTIIGVMPGHALLGAQFTAT 966
+ A+ + ++L + A + S+ +P+ +M +PL I+GV+ L +
Sbjct: 866 RLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVY 925

Query: 967 SMIGMIALAGIIVRNSILLVDFIVQETAA-GVPFEQAVIHSGAVRAKPIMLTALAAMIGA 1025
M+G++ G+ +N+IL+V+F G +A + + +R +PI++T+LA ++G
Sbjct: 926 FMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGV 985

Query: 1026 LFIIDDP-----IFNGLAISLIFGILISTLLTLVVIPVLYYAVMKK 1066
L + N + I ++ G++ +TLL + +PV + + +
Sbjct: 986 LPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRC 1031



Score = 90.3 bits (224), Expect = 2e-20
Identities = 82/518 (15%), Positives = 184/518 (35%), Gaps = 41/518 (7%)

Query: 8 GISGRIASAFQSSAITPLLALLGLLFGLFAVLVTPKEEEPQIDVTFADVFVPFPGATPRE 67
+ + S+ L+ L + + L P P+ D + P +E
Sbjct: 525 HYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQE 584

Query: 68 VESLVTLPTEQII--SEIKGIDTLYSF--------SQPDGALFIVV--FEVGIERNDAIV 115
V +E ++++++ +Q G F+ + +E ++
Sbjct: 585 RTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAE 644

Query: 116 KLYNQIYSNLDKLPLAAGVGEPQIKPRGIDDVPIVSLTLWSKQPQVSAEQLTHIANGLET 175
+ ++ L K+ + + L Q + + LT N L
Sbjct: 645 AVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFEL-IDQAGLGHDALTQARNQLLG 703

Query: 176 EIKRIPGT--REISSVGEHQLILNVRIDPLKLNYYGISFDEINQSLSSNNHISMPASLVQ 233
+ P + + E + +D K G+S +INQ++S+ + +
Sbjct: 704 MAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFID 763

Query: 234 GNQEIKV---QAGQFLQNIEDVKQLVVAVKQDASGQASPVYLADLAQISLKSDIPVKSAW 290
+ K+ +F EDV +L V A+G+ P + P +
Sbjct: 764 RGRVKKLYVQADAKFRMLPEDVDKLYVR---SANGEMVP--FSAFTTSHWVYGSPRLERY 818

Query: 291 HGDKQEIYPAVTIAIGKQPGKNAVDVANAILERVNQVENVLIPDGVEVSISRNYGETAGD 350
+G P++ I PG ++ D + ++ +P G+ + G + +
Sbjct: 819 NG-----LPSMEIQGEAAPGTSSGDAMALMENLASK-----LPAGIGYDWT---GMSYQE 865

Query: 351 KANTLMLKLVFATAAVVILVLITMGFREAIVVGIAIIITLALT----LFASWAWGFTLNR 406
+ + + A + VV+ + + + E+ + +++++ + L L A+ + +
Sbjct: 866 RLSGNQAPALVAISFVVVFLCLAALY-ESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDV 924

Query: 407 ISLFALIFSIGILVDDAIVVVENIHRHMAMGKRSFSELIPIAVDEVGGPTILATFTVIAA 466
+ L+ +IG+ +AI++VE M + E +AV P ++ + I
Sbjct: 925 YFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILG 984

Query: 467 LLPMAFVSGLMGPYMSPIPINASMGMLISLGVAFMITP 504
+LP+A +G + + I GM+ + +A P
Sbjct: 985 VLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVP 1022



Score = 61.8 bits (150), Expect = 1e-11
Identities = 33/166 (19%), Positives = 78/166 (46%), Gaps = 10/166 (6%)

Query: 908 ETFRDMGIAYAVGMIAIYLLVVAQFRSYIVPLIIMAPIPLTIIGVMPGHALLGAQFTATS 967
E + + A + + +YL + R+ ++P I +P+ ++G A G +
Sbjct: 339 EVVKTLFEAIMLVFLVMYL-FLQNMRATLIPTIA---VPVVLLGTFAILAAFGYSINTLT 394

Query: 968 MIGMIALAGIIVRNSILLVDFIVQE-TAAGVPFEQAVIHS-----GAVRAKPIMLTALAA 1021
M GM+ G++V ++I++V+ + + +P ++A S GA+ ++L+A+
Sbjct: 395 MFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFI 454

Query: 1022 MIGALFIIDDPIFNGLAISLIFGILISTLLTLVVIPVLYYAVMKKR 1067
+ I+ +I+++ + +S L+ L++ P L ++K
Sbjct: 455 PMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPV 500


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
Shal_2701  60KDINNERMP  91     2e-23    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 91.2 bits (226), Expect = 2e-23
Identities = 39/214 (18%), Positives = 93/214 (43%), Gaps = 15/214 (7%)

Query: 4 WTNFTDLLIQTIGFLTTEVGVSQGVAIILLTLVVRLLFSPISCSAMINSYKNKKTMAAIK 63
+ L + + ++ + VG + G +II++T +VR + P++ Y + M ++
Sbjct: 333 LWFISQPLFKLLKWIHSFVG-NWGFSIIIITFIVRGIMYPLT----KAQYTSMAKMRMLQ 387

Query: 64 PELDKLKSVYKDKPNEMAKQTMALYRKHDIKFISKTLVANVSSQAIFGFGMFQVLQKAIF 123
P++ ++ D ++++ MALY+ + + + Q ++ +L ++
Sbjct: 388 PKIQAMRERLGDDKQRISQEMMALYKAEKVNPLGGCF--PLLIQMPIFLALYYMLMGSVE 445

Query: 124 ---SSKFAWISNIAKPD--IALAMLVGALTYLSMLMMPDSAE--QANALFLLIPAMVSVA 176
+ WI +++ D L +L+G + M P + + +P + +V
Sbjct: 446 LRQAPFALWIHDLSAQDPYYILPILMGVTMFFIQKMSPTTVTDPMQQKIMTFMPVIFTV- 504

Query: 177 VLVTAPSALSLYWATSSAFGALQSLVVNKYCEKQ 210
+ PS L LY+ S+ +Q ++ + EK+
Sbjct: 505 FFLWFPSGLVLYYIVSNLVTIIQQQLIYRGLEKR 538


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_2703  RTXTOXIND  39     1e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 38.7 bits (90), Expect = 1e-04
Identities = 39/206 (18%), Positives = 67/206 (32%), Gaps = 21/206 (10%)

Query: 231 LIELEPKLAQALVLKEQSQLALINANKQFESAKLLVNDFDALDKLKQTA------ALLDE 284
L++L A+A LK QS L + + + L +LK +E
Sbjct: 124 LLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEE 183

Query: 285 KKQAIEQDKTRLENGQKALQLKPVLDMSLARDAEVVQSQGACVNAQVAKE----SSAQSL 340
+ K + Q K + + V ++ E SL
Sbjct: 184 VLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSL 243

Query: 341 AHSQ--TQIDVLPEQEQKLQSAQNELQQLSQLLPQLQGFDALQHQVNQAIVQLEQAKQKG 398
H Q + VL EQE K A NEL+ L Q++ + L + + + Q +
Sbjct: 244 LHKQAIAKHAVL-EQENKYVEAVNELRVYKSQLEQIES-EILSAK--EEYQLVTQLFKNE 299

Query: 399 IN-----AKGQLDALQAQKQQAEQQL 419
I + L + + E++
Sbjct: 300 ILDKLRQTTDNIGLLTLELAKNEERQ 325



Score = 30.6 bits (69), Expect = 0.037
Identities = 19/157 (12%), Positives = 46/157 (29%), Gaps = 33/157 (21%)

Query: 585 ELQMHLHILQQQL----KQTNDAALQLGQLRGQLQEWQTKERALQTQLETEREHFSD--- 637
E+ ++++Q Q L L + R + + + E+ D
Sbjct: 183 EVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSS 242

Query: 638 --------------QQSLLAELQGQLKHASTAIPEHYRSVDALNGAIAQVNVQIAVMQQ- 682
Q++ E +L+ + ++ + I + ++ Q
Sbjct: 243 LLHKQAIAKHAVLEQENKYVEAVNELR-------VYKSQLEQIESEILSAKEEYQLVTQL 295

Query: 683 ----VINSTRQNHSKAAELDASNTSALAAIQSSLAQA 715
+++ RQ L Q+S+ +A
Sbjct: 296 FKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRA 332


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_2706  TCRTETB  37     1e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 37.2 bits (86), Expect = 1e-04
Identities = 41/216 (18%), Positives = 87/216 (40%), Gaps = 10/216 (4%)

Query: 4 RNRILLTWISFLSYALTGSLIIVTGIVMGDIAKFFNLPISSMSNTFTFLNTGVLISIFFN 63
R+ +L W+ LS+ + +V + + DIA FN P +S + T I
Sbjct: 11 RHNQILIWLCILSF-FSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVY 69

Query: 64 VWLMEIISLKKQLIFGFILMLLAVLGLMFGHNLA-IFSGSMFVLGVVSGITMSIGTYLIT 122
L + + +K+ L+FG I+ + GH+ + + F+ G + ++ ++
Sbjct: 70 GKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVA 129

Query: 123 RLYHGKQRGSRLLFTDSFFSMAGMIFPLISAALLAHSIAWYWVYVAIGMIYVAIFILALI 182
R + RG S +M + P I + + +W Y+ + + I + L+
Sbjct: 130 RYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHY---IHWSYLLLIPMITIITVPFLM 186

Query: 183 CEFPALVKSDEQQAVKEKWSLGILFLAIAALCYILG 218
V+ +K GI+ +++ + ++L
Sbjct: 187 KLLKKEVRIKGHFDIK-----GIILMSVGIVFFMLF 217


39  Shal_2722  Shal_2752  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2722  2       28       -1.555007  biopolymer transport protein ExbD/TolR
Shal_2723  2       31       -1.722985  MotA/TolQ/ExbB proton channel
Shal_2724  1       23       -1.729717  MotA/TolQ/ExbB proton channel
Shal_2725  -1      18       -1.492952  hypothetical protein
Shal_2726  -2      15       -1.420094  hypothetical protein
Shal_2727  -3      14       -1.569495  TonB-dependent receptor
Shal_2728  -2      12       -1.023075  porin
Shal_2729  -1      12       -0.913251  DNA polymerase II
Shal_2730  -2      14       -1.438538  ATP-dependent DNA helicase DinG
Shal_2731  0       17       -3.482244  hypothetical protein
Shal_2732  -1      20       -3.168161  primosomal replication protein N''
Shal_2733  0       20       -2.754559  hypothetical protein
Shal_2734  1       21       -2.783667  hypothetical protein
Shal_2735  2       22       -2.594479  alpha/beta hydrolase domain-containing protein
Shal_2736  1       20       -2.173087  major facilitator transporter
Shal_2737  1       20       -2.066031  phosphate acetyltransferase
Shal_2738  2       19       -2.801475  acetate kinase
Shal_2739  3       20       -3.628214  trans-2-enoyl-CoA reductase
Shal_2740  3       23       -4.937915  dehydratase
Shal_2741  3       23       -5.227313  coenzyme A transferase
Shal_2742  4       25       -5.969669  hypothetical protein
Shal_2743  2       23       -5.120706  lipase class 2
Shal_2744  2       23       -5.060104  alpha/beta hydrolase fold protein
Shal_2745  1       19       -3.879802  alpha/beta hydrolase fold protein
Shal_2746  0       15       -1.908218  LysR family transcriptional regulator
Shal_2747  -1      11       -0.942083  short-chain dehydrogenase/reductase SDR
Shal_2748  -1      11       -0.931025  acetyl-CoA acetyltransferase
Shal_2749  -1      14       -1.441613  helix-hairpin-helix repeat-containing competence
Shal_2750  -1      15       -1.415543  anion transporter
Shal_2751  1       17       -1.912453  cystathionine gamma-synthase
Shal_2752  2       14       -2.286078  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
Shal_2728  ECOLIPORIN  64     2e-13    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 64.2 bits (156), Expect = 2e-13
Identities = 92/414 (22%), Positives = 160/414 (38%), Gaps = 45/414 (10%)

Query: 1 MKKTLVASALAAVIFVPTASAIEIYKDDKNAVEIGGYVDARIINTQGATEVVNGA-SRIN 59
MK+ ++A + A++ A A EIY D N +++ G VD + +++ + R+
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 60 FGFTRQMSDGWKAYTKLEWGVNPFGDSKIVYNSNSTFESKSGDFLNNRLGYVGLAHDTYG 119
F Q++D Y + E YN + G RL + GL YG
Sbjct: 61 FKGETQINDQLTGYGQWE------------YNVQANTTEGEGANSWTRLAFAGLKFGDYG 108

Query: 120 SITIGKQWGAWYDVVYNTNYGFVWDGNASGTYTYNKADGAINGTGRGDKTVQYRNA--FG 177
S G+ +G YDV T+ + G+ +Y AD + TGR + YRN FG
Sbjct: 109 SFDYGRNYGVLYDVEGWTDMLPEFGGD-----SYTYADNYM--TGRANGVATYRNTDFFG 161

Query: 178 ---DFSFAIQAQLKQDTFEIGEVPSASNPLFPAADFMSVKAAGVNVNSSVTTVEYNYTYG 234
+FA+Q Q K ++ +V +N D G ++ TT + +
Sbjct: 162 LVDGLNFALQYQGKNESQSADDVNIGTNNRNNGDDIRYDNGDGFGIS---TTYDIGMGFS 218

Query: 235 GAATY---NVTDMLTLTAGFNLGEFEATTSAGKRLTETDSIYGVGATWGNWNSEGVYAAF 291
A Y + T+ G G +A + ++IY + +
Sbjct: 219 AGAAYTTSDRTNEQVNAGGTIAGGDKADAWTAGLKYDANNIY-LATMYSE--------TR 269

Query: 292 NVNKQEFHDTDNLGRMMPEAKGLESLASYKFENGIRTFVSYNIL---DAGSEYEAAYNGD 348
N+ D G + + + E A Y+F+ G+R VS+ + D + D
Sbjct: 270 NMTPYGKTDKGYDGGVANKTQNFEVTAQYQFDFGLRPAVSFLMSKGKDLTYNNVNGDDKD 329

Query: 349 VFKRQFVVAGVHYVWDSNTVLYLEGRKDYSDFTGVHAEAMAKSEDDGIAIGIRY 402
+ K + G Y ++ N Y++ + + D + S DD +A+G+ Y
Sbjct: 330 LVK--YADVGATYYFNKNFSTYVDYKINLLDDDDPFYKDAGISTDDIVALGMVY 381


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_2736  TCRTETB  36     2e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 36.4 bits (84), Expect = 2e-04
Identities = 29/126 (23%), Positives = 54/126 (42%), Gaps = 3/126 (2%)

Query: 42 AAATWMTNAITIAKIIGNLAAAWLLVKMGPKKAFTFASILIVLGALGAFA--SSYPIYVI 99
A+ W+ A + IG L ++G K+ F I+ G++ F S + + ++
Sbjct: 49 ASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIM 108

Query: 100 SRLIMGFGGAFVIVYFNPIVVKYFTAEERPMINGINAAAFNTGNLLAILLTGTLIANFQT 159
+R I G G A +V +Y E R G+ + G + + G +IA++
Sbjct: 109 ARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAI-GGMIAHYIH 167

Query: 160 WQNVIL 165
W ++L
Sbjct: 168 WSYLLL 173


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2738  ACETATEKNASE  485    e-173    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 485 bits (1249), Expect = e-173
Identities = 179/401 (44%), Positives = 262/401 (65%), Gaps = 12/401 (2%)

Query: 6 VLVLNCGSSSLKFAVIDAISGEDLLSGLSECFGLEDACIKWKTRVGQQESKHQESLGAFS 65
+LV+NCGSSSLK+ +I++ G L GL+E G+ D+ + + +
Sbjct: 3 ILVINCGSSSLKYQLIESKDGNVLAKGLAERIGINDSLLTHNA-----NGEKIKIKKDMK 57

Query: 66 AHTEAMAYIVDTILAAHPTLKQQ---ILAVGHRVVHGGEQFSGSVIINDEVLQGIEACAT 122
H +A+ ++D ++ + + + I AVGHRVVHGGE F+ SV+I D+VL+ I C
Sbjct: 58 DHKDAIKLVLDALVNSDYGVIKDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDCIE 117

Query: 123 LAPLHNPAHIIGILAAKVAFPQLAQIAVFDTAFHQTMPQKAFIYALPYQLYREHSIRRFG 182
LAPLHNPA+I GI A P + +AVFDTAFHQTMP A++Y +PY+ Y ++ IR++G
Sbjct: 118 LAPLHNPANIEGIKACTQIMPDVPMVAVFDTAFHQTMPDYAYLYPIPYEYYTKYKIRKYG 177

Query: 183 MHGTSHRYICQTAAKTLGIAVEDLNLISAHLGNGASVTAIKAGISVDTSMGMTPLEGVVM 242
HGTSH+Y+ Q AA+ L +E L +I+ HLGNG+S+ A+K G S+DTSMG TPLEG+ M
Sbjct: 178 FHGTSHKYVSQRAAEILNKPIESLKIITCHLGNGSSIAAVKNGKSIDTSMGFTPLEGLAM 237

Query: 243 GTRCGDLDPSIIFYLVQNLGYSLKEVEDMMNKQSGLLGISELTNDCRGIEEGH-ANGHKG 301
GTR G +DPSII YL++ S +EV +++NK+SG+ GIS +++D R +E+ NG K
Sbjct: 238 GTRSGSIDPSIISYLMEKENISAEEVVNILNKKSGVYGISGISSDFRDLEDAAFKNGDKR 297

Query: 302 ASLALDIFCYRLAKYIASYTVPLGRIDALVFTGGIGENSALIRQKVIELLSIFNFKLDRQ 361
A LAL++F YR+ K I SY +G +D +VFT GIGEN IR+ +++ L FKLD++
Sbjct: 298 AQLALNVFAYRVKKTIGSYAAAMGGVDVIVFTAGIGENGPEIREFILDGLEFLGFKLDKE 357

Query: 362 RNLAARFGQQGQITADTGPV-AMVIPTNEEWVIAQDALKLI 401
+N G++ I+ V MV+PTNEE++IA+D K++
Sbjct: 358 KNKVR--GEEAIISTADSKVNVMVVPTNEEYMIAKDTEKIV 396


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2747  DHBDHDRGNASE  140    4e-43    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 140 bits (355), Expect = 4e-43
Identities = 85/255 (33%), Positives = 126/255 (49%), Gaps = 19/255 (7%)

Query: 2 LNNKICIVTGGAQGIGRCIVETFAQQGALKVYACDMNIEAMRDLEEQYSNVS----AKEL 57
+ KI +TG AQGIG + T A QGA + A D N E + + + A
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGA-HIAAVDYNPEKLEKVVSSLKAEARHAEAFPA 64

Query: 58 NVCDRKGIQALIETLKAEHGKIDVLVNNAGITRDNLLDRMSEDDWDMVINVNLKGVFNMT 117
+V D I + ++ E G ID+LVN AG+ R L+ +S+++W+ +VN GVFN +
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 118 QAVAPLMIENGVGSIITMSSVVGTDGNVGQTNYAATKGGVITMTKGWAKEFSRKGAQVRA 177
++V+ M++ GSI+T+ S YA++K + TK E + +R
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYN--IRC 182

Query: 178 NCVAPGFIETPMTIEL------PEKVL-----NFMKGKTPLGRMGKPEDIANGVLFLASD 226
N V+PG ET M L E+V+ F G PL ++ KP DIA+ VLFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTG-IPLKKLAKPSDIADAVLFLVSG 241

Query: 227 NSNFITGQTLKIDGG 241
+ IT L +DGG
Sbjct: 242 QAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_2748  UREASE  30     0.019    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 30.1 bits (68), Expect = 0.019
Identities = 28/97 (28%), Positives = 46/97 (47%), Gaps = 20/97 (20%)

Query: 30 GGEAIKAALAQARVSPEH--LDEVIVGNVL-------SAGQGMGPGRQAA-RYAGIPDTV 79
GG+ I+ + Q++V+ E +D VI ++ A G+ GR AA AG PD
Sbjct: 48 GGKVIRDGMGQSQVTREGGAVDTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQ 107

Query: 80 PAYTLNMICGSGMKTVIEAATKIKAGDADILVAAGME 116
P T+ ++ T++ AG+ I+ A GM+
Sbjct: 108 PGVTI----------IVGPGTEVIAGEGKIVTAGGMD 134


40  Shal_2778  Shal_2784  Y  N  N  Genomic Island
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2778  3       18       0.462370   cytochrome C family protein
Shal_2779  4       20       -0.016279  outer membrane protein
Shal_2780  6       22       0.417394   decaheme cytochrome c MtrF
Shal_2781  5       25       -0.148074  decaheme cytochrome c
Shal_2782  5       28       -0.571210  hypothetical protein
Shal_2783  5       30       -0.290596  hypothetical protein
Shal_2784  2       22       -0.421447  decaheme cytochrome c
41  Shal_2845  Shal_2860  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2845  -1      18       -3.092173  leucyl aminopeptidase
Shal_2846  -2      22       -5.128294  ApbE family lipoprotein
Shal_2847  -1      21       -5.030055  flavocytochrome c
Shal_2848  1       21       -5.607542  hypothetical protein
Shal_2849  2       21       -5.186448  histidine kinase A domain-containing protein
Shal_2850  -1      17       -2.508799  two component transcriptional regulator
Shal_2851  -1      15       -2.270123  LuxR family transcriptional regulator
Shal_2852  -1      17       -1.332428  hypothetical protein
Shal_2853  0       16       -0.941160  transcriptional antiterminator Rof
Shal_2854  2       17       -1.349040  hypothetical protein
Shal_2855  1       17       -2.455754  nucleoside transporter
Shal_2856  1       20       -3.592063  rod shape-determining protein RodA
Shal_2857  0       19       -3.397514  UspA domain-containing protein
Shal_2859  1       21       -5.116630  hypothetical protein
Shal_2860  0       16       -4.433337  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_2849  PF06580  29     0.046    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.7 bits (64), Expect = 0.046
Identities = 19/102 (18%), Positives = 36/102 (35%), Gaps = 16/102 (15%)

Query: 293 DLLQQVISDQQAAKGDYLPIELLCDETVISVQRDPLTIILNNLVRNSLEHGTEG------ 346
++ + D L E + ++ VQ P+ ++ LV N ++HG
Sbjct: 223 TVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPM--LVQTLVENGIKHGIAQLPQGGK 280

Query: 347 --VIIRQSNSKIEII-----NSMSPHNSSGFGLGLMLV-ERL 380
+ + N + + + + G GL V ERL
Sbjct: 281 ILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERL 322


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_2850  HTHFIS  76     4e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.6 bits (186), Expect = 4e-18
Identities = 30/125 (24%), Positives = 60/125 (48%), Gaps = 1/125 (0%)

Query: 3 VLLVEDNLLLAKNIIQYLELIDIECDYAGSLAQAEVQVYSRSFDAIILDLNLPDGDGLEA 62
+L+ +D+ + + Q L + + A + + D ++ D+ +PD + +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 63 CKKWKEQRVVSPVIMLTARSGLQDKLSGFEVGADDYLIKPFAMPELVARL-RVVAERRPA 121
+ K+ R PV++++A++ + E GA DYL KPF + EL+ + R +AE +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRR 125

Query: 122 PKVLR 126
P L
Sbjct: 126 PSKLE 130


42  Shal_2870  Shal_2880  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2870  2       16       2.699394   hypothetical protein
Shal_2871  2       16       3.842909   hypothetical protein
Shal_2872  2       15       4.169748   3-ketoacyl-ACP reductase
Shal_2873  2       17       5.075864   3-hydroxyisobutyrate dehydrogenase
Shal_2874  2       17       4.555712   enoyl-CoA hydratase/isomerase
Shal_2875  3       16       3.587222   enoyl-CoA hydratase
Shal_2876  2       14       2.173057   acyl-CoA dehydrogenase domain-containing
Shal_2877  0       11       -0.369996  methylmalonate-semialdehyde dehydrogenase
Shal_2878  1       16       -2.714247  acetyl-CoA acetyltransferase
Shal_2879  0       16       -4.001445  LuxR family transcriptional regulator
Shal_2880  1       15       -3.923760  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2872  DHBDHDRGNASE  110    2e-31    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 110 bits (277), Expect = 2e-31
Identities = 74/263 (28%), Positives = 128/263 (48%), Gaps = 22/263 (8%)

Query: 3 LKDKVVVITGGAGGLGYAMAENLAAAGAKLALIDVDQEKLEKACANIGATTEVQ-GYAVD 61
++ K+ ITG A G+G A+A LA+ GA +A +D + EKLEK +++ A + D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 62 ITDEEDVFATFQFIKEDFGRVNVLINNAGILRDGLLLKAKEGQVFERMSFEQFQSVINVN 121
+ D + I+ + G +++L+N AG+LR GL+ +S E++++ +VN
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLI---------HSLSDEEWEATFSVN 116

Query: 122 LTGSFLCGREAAAAMIETEQEGVIINISSLAKAGNVGQTNYAASKAGVAAMSVGWAKELA 181
TG F R + M++ ++ S+ A YA+SKA + ELA
Sbjct: 117 STGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELA 176

Query: 182 RYNIRSAAVAPGVIETEMTAAMKPE----------ALERLEKMVPVGRLGQAEEIASTVR 231
YNIR V+PG ET+M ++ + +LE + +P+ +L + +IA V
Sbjct: 177 EYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVL 236

Query: 232 FIIEND--YVNGRVFEIDGGIRL 252
F++ ++ +DGG L
Sbjct: 237 FLVSGQAGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_2877  PF01540  29     0.041    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 29.3 bits (65), Expect = 0.041
Identities = 21/64 (32%), Positives = 28/64 (43%)

Query: 31 ETIAVVNAAVDAEVLTAVASAKQAFQSWKEVPVSERARVMLRYQHLLKEHHDEIATILAQ 90
+T A VN +L+ + S K+ SW E VSE V + L E E LA+
Sbjct: 181 KTFATVNTIKKDFLLSELESFKEFNTSWLEKIVSEWEEVKKAWSKELAEIKAEDDKKLAE 240

Query: 91 ETGK 94
E K
Sbjct: 241 ENQK 244


43  Shal_2897  Shal_2905  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias   Product
Shal_2897  -2      12       3.083147  short-chain dehydrogenase/reductase SDR
Shal_2898  0       16       3.650673  1,4-dihydroxy-2-naphthoate
Shal_2899  0       18       3.936109  6-phosphogluconate dehydrogenase
Shal_2900  0       21       4.376049  AMP-dependent synthetase and ligase
Shal_2901  1       23       4.374299  MerR family transcriptional regulator
Shal_2902  0       23       4.901462  acyl-CoA dehydrogenase domain-containing
Shal_2903  0       23       4.792120  propionyl-CoA carboxylase
Shal_2904  -1      23       4.359251  enoyl-CoA hydratase/isomerase
Shal_2905  -1      14       3.059288  carbamoyl-phosphate synthase L chain
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2897  DHBDHDRGNASE  82     3e-20    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 81.6 bits (201), Expect = 3e-20
Identities = 53/197 (26%), Positives = 91/197 (46%), Gaps = 17/197 (8%)

Query: 4 EGQVAIVTGAASGLGRAYAIALAERGARVVLVDNETTQTKDVSSSMTNLEPDSGLQACGA 63
EG++A +TGAA G+G A A LA +GA + VD + + V SS+
Sbjct: 7 EGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLK------------- 53

Query: 64 AITALGGEFKSFTLDVTSIDDINSMVDDVVKSWGRVDILINNAGIHKSCAFEQMSASSWK 123
A ++F DV I+ + + + G +DIL+N AG+ + +S W+
Sbjct: 54 ---AEARHAEAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWE 110

Query: 124 HNLDVDLNGTFYMTQAVWPFMKRQSYGRVV-MSSASSIYGDMYETSFSTSKMAVMGLVNS 182
V+ G F +++V +M + G +V + S + ++++SK A +
Sbjct: 111 ATFSVNSTGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKC 170

Query: 183 LHLEGAEHGIKVNSIIP 199
L LE AE+ I+ N + P
Sbjct: 171 LGLELAEYNIRCNIVSP 187


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_2899  RTXTOXIND  32  0.005  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 32.1 bits (73), Expect = 0.005
Identities = 21/105 (20%), Positives = 43/105 (40%), Gaps = 3/105 (2%)

Query: 1 MKNQSNLYDIGVIG-LGVMGKNLALNIADNQYRVAAFDLDELKVQDTVAQEELERKR--Y 57
+ + S+L I V+ + A N+ RV L++++ + A+EE + +
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLF 296

Query: 58 LKQITEPRIEGCNNLSEMLSKLEKPRVLLLSVPAGAPVDGVCQSL 102
+I + + +N+ + +L K + APV Q L
Sbjct: 297 KNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQL 341


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_2905  RTXTOXIND  31  0.014  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.3 bits (71), Expect = 0.014
Identities = 20/112 (17%), Positives = 35/112 (31%), Gaps = 12/112 (10%)

Query: 543 NLLTLSGELIDEILHAEIVQHKLGDSQAEQASKANGHKIKLPVSQVGNDFTLFINSKSYH 602
+L + ++ E+ +K Q E K V F I K
Sbjct: 253 AVLEQENKYVE--AVNELRVYKSQLEQIESEIL----SAKEEYQLVTQLFKNEILDKLRQ 306

Query: 603 YRALESEEIEEQDNLEDKL-----KAPMNGTIVTQLV-SVGDVVKAGQGIMV 648
E E++ +AP++ + V + G VV + +MV
Sbjct: 307 TTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMV 358


44  Shal_2919  Shal_2934  Y  N  N  Genomic Island
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2919  2       16       1.102325   hypothetical protein
Shal_2920  1       13       1.475412   nitrite reductase
Shal_2921  4       22       2.089072   hypothetical protein
Shal_2922  3       21       1.962937   endoribonuclease L-PSP
Shal_2923  2       21       1.939726   hypothetical protein
Shal_2924  2       21       1.631042   cytoplasmic chaperone TorD family protein
Shal_2925  2       17       1.829831   dimethylsulfoxide reductase chain B
Shal_2926  1       15       1.613616   anaerobic dimethyl sulfoxide reductase subunit
Shal_2927  0       12       1.185324   outer membrane protein
Shal_2928  -1      12       0.882885   cytochrome C family protein
Shal_2929  0       15       -0.133008  isoprenylcysteine carboxyl methyltransferase
Shal_2930  1       15       -0.200791  hypothetical protein
Shal_2931  1       17       -1.348846  hypothetical protein
Shal_2932  1       19       -2.020508  hypothetical protein
Shal_2933  2       19       -2.045990  hypothetical protein
Shal_2934  2       22       -2.557836  hypothetical protein
45  Shal_2950  Shal_2956  Y  N  N  Genomic Island
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2950  2       16       2.557699   hypothetical protein
Shal_2951  3       17       3.121825   hexapaptide repeat-containing transferase
Shal_2952  2       21       3.909663   amino acid permease
Shal_2953  1       25       5.686971   LysR family transcriptional regulator
Shal_2954  1       23       5.115700   hypothetical protein
Shal_2955  0       16       4.517954   3-oxoacid CoA-transferase subunit A
Shal_2956  -1      13       3.373489   3-oxoacid CoA-transferase subunit B
46  Shal_3195  Shal_3200  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_3195  2       15       0.282314   cyclophilin type peptidyl-prolyl cis-trans
Shal_3196  3       16       0.323571   hypothetical protein
Shal_3197  3       15       0.141330   major facilitator transporter
Shal_3198  4       15       -0.138196  putative nucleotide-binding protein
Shal_3199  5       16       -0.001029  VanZ family protein
Shal_3200  3       13       0.238838   2-dehydropantoate 2-reductase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3197  TCRTETB  31  0.013  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 30.6 bits (69), Expect = 0.013
Identities = 37/195 (18%), Positives = 78/195 (40%), Gaps = 33/195 (16%)

Query: 261 PFIDFFKRYGRSAILILLLISCYRISDIVMGIMANVFYV---DMGFSKTEIATLSKIYGL 317
PF+D G++ ++ ++ I V G ++ V Y+ S EI ++ G
Sbjct: 246 PFVDP--GLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGT 303

Query: 318 IMTLVGAGAGGLLIARYGTMKILFLGAFLVAVTNLLFAYQAMIGYDVSFLTFAISVDNFS 377
+ ++ GG+L+ R G + +L +G ++ VSFLT + ++
Sbjct: 304 MSVIIFGYIGGILVDRRGPLYVLNIGVTFLS---------------VSFLTASFLLE--- 345

Query: 378 AGIATAAFIAYLSSLTSSGYSATQYALLSSI-MLLFPKFVAGFSGAYIDAYDYVSFFIAA 436
T+ F+ + G S T+ + + + L + GA + ++ SF
Sbjct: 346 ---TTSWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEA----GAGMSLLNFTSFLSEG 398

Query: 437 SLIGFPVLGLIMLVQ 451
G ++G ++ +
Sbjct: 399 --TGIAIVGGLLSIP 411


47  Shal_3348  Shal_3361  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_3348  2       20       -4.027937  hypothetical protein
Shal_3349  2       21       -3.423549  hypothetical protein
Shal_3350  1       23       -3.525924  PqiA family integral membrane protein
Shal_3351  2       22       -1.638936  zinc-binding CMP/dCMP deaminase
Shal_3352  2       22       -1.796903  cation diffusion facilitator family transporter
Shal_3353  1       19       -0.556787  hypothetical protein
Shal_3354  1       22       1.676248   17 kDa surface antigen
Shal_3355  2       21       0.816822   hypothetical protein
Shal_3356  2       24       2.567892   cation antiporter
Shal_3357  3       26       3.713069   multiple resistance and pH regulation protein F
Shal_3358  2       25       3.192374   monovalent cation/proton antiporter subunit
Shal_3359  3       26       3.888648   putative monovalent cation/H+ antiporter subunit
Shal_3360  2       22       2.605673   NADH-ubiquinone oxidoreductase chain 4L
Shal_3361  3       22       2.545078   putative monovalent cation/H+ antiporter subunit
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3353  CHANLCOLICIN  33  0.001  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 33.1 bits (75), Expect = 0.001
Identities = 15/60 (25%), Positives = 31/60 (51%), Gaps = 3/60 (5%)

Query: 44 PIVIGTNKLALLASGINNVRRYAFSMDLPWPLIVP--AIISGCICAFIGSSLLDHVSEAW 101
P+ + K A +G++ V FS+ L + AI++G +C++I + L+ ++E
Sbjct: 462 PLFLTLEKKAA-DAGVSYVVALLFSLLAGTTLGIWGIAIVTGILCSYIDKNKLNTINEVL 520


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3354  ACRIFLAVINRP  27  0.035  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 27.1 bits (60), Expect = 0.035
Identities = 17/53 (32%), Positives = 24/53 (45%), Gaps = 1/53 (1%)

Query: 51 EPVTIAASDKTNIIAMIAGAAIGGILGSEVGGGAGSDI-AAIGGSVAGGYAGS 102
E +A + I M + A I G+L + GAGS A+G V GG +
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSA 1013


48  Shal_3371  Shal_3386  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_3371  -1      20       3.787920   hypothetical protein
Shal_3372  -1      22       4.917828   hypothetical protein
Shal_3373  -1      23       5.051903   glycine dehydrogenase
Shal_3374  -1      18       3.770917   glycine cleavage system protein H
Shal_3375  0       18       2.710998   glycine cleavage system aminomethyltransferase
Shal_3376  1       15       1.399625   UbiH/UbiF/VisC/COQ6 family ubiquinone
Shal_3377  2       15       0.402725   2-polyprenyl-6-methoxyphenol 4-hydroxylase
Shal_3378  2       15       -0.905713  yecA family protein
Shal_3379  3       15       -1.235308  hypothetical protein
Shal_3380  2       14       -1.371972  glycerophosphoryl diester phosphodiesterase
Shal_3381  1       13       -0.716342  hypothetical protein
Shal_3382  1       16       -0.475971  hypothetical protein
Shal_3383  0       18       0.381965   hypothetical protein
Shal_3384  0       19       1.746192   hypothetical protein
Shal_3385  -1      18       2.034355   CRP/FNR family transcriptional regulator
Shal_3386  0       23       3.481549   5-formyltetrahydrofolate cyclo-ligase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3381  TCRTETA  35  0.001  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.8 bits (80), Expect = 0.001
Identities = 24/133 (18%), Positives = 48/133 (36%), Gaps = 15/133 (11%)

Query: 30 LQKVQKKQQSAAAKQQKQQKVEQQIAITGKPF-WWLMTMFICSVVYLFPEAIFNAALTDV 88
L + K ++ ++ + A + FI +V P A++ V
Sbjct: 181 LPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALW------V 234

Query: 89 AGGRNASHDDLRAVELFGRTISGIGVTLLLADMLLSGKRVARVGH----ALGYFSLIAVL 144
G + H D + G +++ G+ LA +++G AR+G LG +
Sbjct: 235 IFGEDRFHWDATTI---GISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGY 291

Query: 145 VWPTVFFGQKWLV 157
+ F + W+
Sbjct: 292 IL-LAFATRGWMA 303


49  Shal_3450  Shal_3478  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_3450  -2      17       3.037256   hypothetical protein
Shal_3451  -2      18       3.016606   hypothetical protein
Shal_3452  -1      18       3.840888   periplasmic-binding protein
Shal_3453  -1      20       3.731824   cob(I)yrinic acid a,c-diamide
Shal_3454  1       21       3.585279   cobyric acid synthase
Shal_3455  2       20       3.702295   cobalbumin biosynthesis protein
Shal_3456  1       20       3.532009   cobalamin 5'-phosphate synthase
Shal_3457  0       18       4.025335   hypothetical protein
Shal_3458  0       16       3.542306   nicotinate-nucleotide--dimethylbenzimidazole
Shal_3459  -1      16       3.509988   transport system permease
Shal_3460  0       14       2.846259   ABC transporter-like protein
Shal_3461  1       13       2.705750   phosphoglycerate mutase
Shal_3462  1       13       2.473135   B12-dependent methionine synthase
Shal_3464  2       18       0.539349   urease accessory protein UreD
Shal_3465  1       19       0.730987   urease subunit gamma
Shal_3466  2       19       -0.814529  urease subunit beta
Shal_3467  2       18       -0.656673  urease subunit alpha
Shal_3468  0       19       -2.330667  UreE urease accessory domain-containing protein
Shal_3469  1       20       -3.003103  urease accessory protein UreF
Shal_3470  -2      17       -3.370680  urease accessory protein UreG
Shal_3471  -1      17       -3.838055  endoribonuclease L-PSP
Shal_3472  -3      13       -2.811698  hypothetical protein
Shal_3473  -3      12       -2.516060  hypothetical protein
Shal_3474  1       11       -2.190952  glutathione-dependent formaldehyde-activating
Shal_3475  0       11       -2.209212  mechanosensitive ion channel protein MscS
Shal_3476  2       13       -1.951584  hypothetical protein
Shal_3477  2       12       -1.866931  adenosine deaminase
Shal_3478  2       12       -0.705083  TM2 domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3452  FERRIBNDNGPP  41  2e-06  Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 41.5 bits (97), Expect = 2e-06
Identities = 40/156 (25%), Positives = 60/156 (38%), Gaps = 9/156 (5%)

Query: 10 LLTACALPSAFALADNAAPA----KRIVALSPHSVEMLYAIGAGESIVATTDHADF---- 61
LLTA AL + A A RIVAL VE+L A+G VA T +
Sbjct: 12 LLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSE 71

Query: 62 PEAALQIPRIGGYYGIQIERVLELEPDLVVVWSG-GNKAEDIARIKELGFRVFDSNPSTL 120
P + +G +E + E++P +V +G G E +ARI F L
Sbjct: 72 PPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPL 131

Query: 121 ELVADELVSLGELTGHQAQATKVADDYRQQLSQLRS 156
+ L + +L Q+ A Y + ++
Sbjct: 132 AMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKP 167


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3455  FLGHOOKFLIK  27  0.037  Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 27.1 bits (59), Expect = 0.037
Identities = 13/65 (20%), Positives = 27/65 (41%), Gaps = 3/65 (4%)

Query: 13 KSRFAEAQVAELQNNVDSECFYFATAQALDSEMAQRIRHHQTQRESDTINWRAIECPLEL 72
+++ AE+ + Q+N+ E F A + +QR +H+ D + P+ L
Sbjct: 305 RTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDD---TLPVPVSL 361

Query: 73 TAALK 77
+
Sbjct: 362 QGRVT 366


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3462  BCTERIALGSPD  31  0.043  Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 30.7 bits (69), Expect = 0.043
Identities = 14/71 (19%), Positives = 30/71 (42%), Gaps = 5/71 (7%)

Query: 361 SGLEPLTIDDNSLFLNVGERTN---VTGSAKFLRLIKTGEYEEALSVAREQVESGAQIID 417
+P+ D ++ + +TN VT + + ++ L + R QV A I +
Sbjct: 298 QAAKPVAALDKNIIIKAHGQTNALIVTAAPDVMNDLE--RVIAQLDIRRPQVLVEAIIAE 355

Query: 418 INMDEGMLDGV 428
+ +G+ G+
Sbjct: 356 VQDADGLNLGI 366


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3467  UREASE  1026  0.0  Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1026 bits (2654), Expect = 0.0
Identities = 376/567 (66%), Positives = 451/567 (79%), Gaps = 2/567 (0%)

Query: 3 KISRQAYADMFGPTVGDRVRLADTALMLEVEQDLTTYGEEVKFGGGKVIRDGMGQSQAL- 61
++SR AYA+MFGPTVGD+VRLADT L +EVE+D TT+GEEVKFGGGKVIRDGMGQSQ
Sbjct: 4 RMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTR 63

Query: 62 SKDCVDAVITNALILDHWGIVKADIGIKHGRIVGIGKAGNPDVQPNVDIVIGAGTEAIAG 121
VD VITNALILDHWGIVKADIG+K GRI IGKAGNPD+QP V I++G GTE IAG
Sbjct: 64 EGGAVDTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAG 123

Query: 122 EGKIITAGAIDTHVHFICPQQVEEALCAGTTTFIGGGTGPVAGSTATTVTPGIWNMHRML 181
EGKI+TAG +D+H+HFICPQQ+EEAL +G T +GGGTGP G+ ATT TPG W++ RM+
Sbjct: 124 EGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIARMI 183

Query: 182 EAADELPINVGLFGKGTVSNPDAIREQIANGAMGLKIHEDWGATPAAIDTCMTVADEMDV 241
EAAD P+N+ GKG S P A+ E + GA LK+HEDWG TPAAID C++VADE DV
Sbjct: 184 EAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCLSVADEYDV 243

Query: 242 QVAIHSDTLNEGGFYEATTKAIGDRVIHVFHTEGAGGGHAPDVIKSVGEPNIIPASTNPT 301
QV IH+DTLNE GF E T AI R IH +HTEGAGGGHAPD+I+ G+PN+IP+STNPT
Sbjct: 244 QVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNVIPSSTNPT 303

Query: 302 MPYTINTVDEHLDMLMVCHHLDPAIAEDVAFAESRIRRETIAAEDILHDLGAISVMSSDS 361
PYT+NT+ EHLDMLMVCHHL P I ED+AFAESRIR+ETIAAEDILHD+GA S++SSDS
Sbjct: 304 RPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGAFSIISSDS 363

Query: 362 QAMGRVAEVAMRTWQCAHKMKLQRGPLEGDSELSDNTRLKRYVAKYTINPAITHGISHEV 421
QAMGRV EVA+RTWQ A KMK QRG L+ ++ +DN R+KRY+AKYTINPAI HG+SHE+
Sbjct: 364 QAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAIAHGLSHEI 423

Query: 422 GSIEKGKLADIVLWDPAFFGVKPALVLKGGLAVYAPMGDINGAIPTPQPVHYRPMYAATG 481
GS+E GK AD+VLW+PAFFGVKP +VL GG APMGD N +IPTPQPVHYRPM+ A G
Sbjct: 424 GSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHYRPMFGAYG 483

Query: 482 KARAATSMTFLPKAAIEAGITEKLKLTHLIGEVKGCR-NIRKKDMIHNSYTPVIELDSQT 540
++R +S+TF+ +A+++AG+ +L + + V+ R I K MIHNS TP IE+D +T
Sbjct: 484 RSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTPHIEVDPET 543

Query: 541 YVVKADGMALVCEPLAELPMAQRYFLF 567
Y V+ADG L CEP LPMAQRYFLF
Sbjct: 544 YEVRADGELLTCEPATVLPMAQRYFLF 570


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3475  GPOSANCHOR  48  2e-07  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 47.8 bits (113), Expect = 2e-07
Identities = 50/307 (16%), Positives = 104/307 (33%), Gaps = 33/307 (10%)

Query: 46 RLNSESPKNEALISQYETLLSRVNGYKKQKATRDEYRNIISQFPVQRERLLKKI-NDVEQ 104
+ N+ L + ++ K + + + E+ L+ N
Sbjct: 79 NNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTA 138

Query: 105 LDIFNVLHKTKYNDLSQSIATLQAAVTEWRSTSQSNSDKNKQLINASTTLPQSLAQIDRE 164
+ + L+ A L+ A+ + S ++S K K L L A++++
Sbjct: 139 DSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKA 198

Query: 165 IEKA---SIIKSDTDGQLDQWLESSN---------LTTLKLQRQVTSAQLQSLDERTELL 212
+E A S S L+ + L SA++++L+ L
Sbjct: 199 LEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAAL 258

Query: 213 GLEQQLLNKKIQTASPILIELQDKLTRVEQASVKTLIEQARKISEDSALKG--------- 263
Q L K ++ A K+ +E E+A + L
Sbjct: 259 EARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDL 318

Query: 264 -----------AENQQQIEQLKAYTVELEQVLAEIDQARISYQKVEAEQRSLMDEQRLIK 312
AE+Q+ EQ K + + ++D +R + +++EAE + L ++ ++ +
Sbjct: 319 DASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISE 378

Query: 313 NNLAWLR 319
+ LR
Sbjct: 379 ASRQSLR 385


50  Shal_3545  Shal_3577  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_3545  -1      16       3.141096   oxidoreductase
Shal_3546  -2      17       3.110554   NAD(P)(+) transhydrogenase
Shal_3547  -2      15       2.585451   MltA-interacting MipA family protein
Shal_3548  -2      18       1.950794   TetR family transcriptional regulator
Shal_3549  -1      13       2.830333   hypothetical protein
Shal_3550  -1      13       2.679958   bifunctional heptose 7-phosphate kinase/heptose
Shal_3551  0       14       2.358488   lipid A biosynthesis lauroyl (or palmitoleoyl)
Shal_3552  0       14       2.507208   ECF subfamily RNA polymerase sigma-24 factor
Shal_3553  0       16       2.842875   hypothetical protein
Shal_3554  0       18       3.599872   nitric-oxide reductase
Shal_3555  1       20       3.610963   anaerobic nitric oxide reductase transcriptional
Shal_3558  1       22       3.390362   peptidase M16 domain-containing protein
Shal_3559  -1      22       3.448379   StbA family protein
Shal_3560  0       19       2.663704   hypothetical protein
Shal_3561  -1      19       1.799205   D-serine dehydratase
Shal_3562  -1      22       0.623218   permease DsdX
Shal_3563  1       21       -0.331886  DNA-binding transcriptional regulator DsdC
Shal_3564  1       17       -4.864929  hypothetical protein
Shal_3565  1       18       -4.162625  hypothetical protein
Shal_3566  0       16       -4.503638  hypothetical protein
Shal_3567  -2      13       -1.002621  hypothetical protein
Shal_3568  -1      13       -0.105215  hypothetical protein
Shal_3569  -1      13       -0.311150  putative DNA mismatch repair protein
Shal_3570  0       18       3.036802   hypothetical protein
Shal_3571  -1      18       2.456910   hypothetical protein
Shal_3572  -1      18       2.920964   bifunctional glutamine-synthetase
Shal_3573  -2      19       1.965382   hypothetical protein
Shal_3574  -2      17       0.677881   hypothetical protein
Shal_3575  -2      15       0.923290   spermidine synthase
Shal_3576  1       13       -1.062423  hypothetical protein
Shal_3577  3       12       -1.002123  phage shock protein A
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3547  ECOLNEIPORIN  31  0.012  E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 30.5 bits (69), Expect = 0.012
Identities = 24/106 (22%), Positives = 39/106 (36%), Gaps = 8/106 (7%)

Query: 169 YTWESSDWKVKPYAEFQWK-SADFNDYYYGLGQYEVGGGVS-FNAGVKARYHVISNLYLL 226
+ E+ V ++ +K D + + Q E ++ ++G R I L
Sbjct: 45 ASVETGTGIVDLGSKIGFKGQEDLGNGLKAIWQVEQKASIAGTDSGWGNRQSFIG---LK 101

Query: 227 GQFG---VGRLEDDVYDLPTINNRFQYESFFGFGFFPEPNSKKYSA 269
G FG VGRL + D IN + G EP ++ S
Sbjct: 102 GGFGKLRVGRLNSVLKDTGDINPWDSKSDYLGVNKIAEPEARLISV 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3548  HTHTETR  51  1e-10  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 51.2 bits (122), Expect = 1e-10
Identities = 20/75 (26%), Positives = 37/75 (49%)

Query: 1 MAKRSKIQTKQTVQQILDEAMKQILDLGYDAMSYSTLSEATGISRTGISHHFPHKIDFLK 60
MA+++K + ++T Q ILD A++ G + S +++A G++R I HF K D
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 QLDSRIANLFIERLD 75
++ + E
Sbjct: 61 EIWELSESNIGELEL 75


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3550  LPSBIOSNTHSS  31  0.007  Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 30.6 bits (69), Expect = 0.007
Identities = 10/37 (27%), Positives = 21/37 (56%)

Query: 348 GCFDILHAGHVSYLQQARALGDRLIVAVNTDASVKRL 384
G FD + GH+ +++ L D++ VAV + + + +
Sbjct: 7 GSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPM 43


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3555  HTHFIS  385  e-131  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 385 bits (990), Expect = e-131
Identities = 132/370 (35%), Positives = 201/370 (54%), Gaps = 17/370 (4%)

Query: 166 KLEKQAVQSQE--PSSFNHSSDSEVEIIGQSPTMQAMQNELSVVASTDLNVLILGDTGTG 223
+ +A+ + PS S + ++G+S MQ + L+ + TDL ++I G++GTG
Sbjct: 113 GIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTG 172

Query: 224 KELVAKSIHLHSLRANKPLVYLNCAALPESVAESELFGHVKGAFTGAISHRSGKFEIADN 283
KELVA+++H + R N P V +N AA+P + ESELFGH KGAFTGA + +G+FE A+
Sbjct: 173 KELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEG 232

Query: 284 GTLFLDEIGELPLSLQSKLLRVLQYGDLQKVGSDKSLKVDVRIIAATNKDLKQEVLAGRF 343
GTLFLDEIG++P+ Q++LLRVLQ G+ VG ++ DVRI+AATNKDLKQ + G F
Sbjct: 233 GTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLF 292

Query: 344 RADLYHRLSVFPILVPPLNEREGDVILLSGFFVERSREKLGLKSLSLSPDSISLLNNYDW 403
R DLY+RL+V P+ +PPL +R D+ L FV+++ K GL +++ L+ + W
Sbjct: 293 REDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAE-KEGLDVKRFDQEALELMKAHPW 351

Query: 404 PGNVRELEHVLRRAAVLSRAQSQSDTAVITPQHFDLRGQDS-PQLDSAFKTVTKASHSVA 462
PGNVRELE+++RR L VIT + + + P S S++
Sbjct: 352 PGNVRELENLVRRLTALYPQD------VITREIIENELRSEIPDSPIEKAAARSGSLSIS 405

Query: 463 IAQGQN-------QTNNQTNNQTLKQATEAFQAQYIRQALVGNGNNWAATARALDVDSGN 515
A +N + + + + I AL N A L ++
Sbjct: 406 QAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNT 465

Query: 516 LHRLAKRIGI 525
L + + +G+
Sbjct: 466 LRKKIRELGV 475


51  Shal_3613  Shal_3630  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_3613  3       25       2.236530   glutamine amidotransferase of anthranilate
Shal_3614  5       26       2.775032   hypothetical protein
Shal_3615  4       24       3.054973   TonB family protein
Shal_3616  4       25       2.865743   biopolymer transport protein ExbD/TolR
Shal_3617  4       23       2.981861   MotA/TolQ/ExbB proton channel
Shal_3618  3       20       2.997151   MotA/TolQ/ExbB proton channel
Shal_3619  1       20       3.412766   hypothetical protein
Shal_3620  -2      14       1.604562   hypothetical protein
Shal_3621  -2      14       1.230575   hypothetical protein
Shal_3622  -1      13       0.854425   ABC transporter-like protein
Shal_3623  0       12       0.281929   transport system permease
Shal_3624  1       17       -0.499440  periplasmic-binding protein
Shal_3625  3       21       -1.005884  TonB-dependent receptor plug
Shal_3626  2       27       -0.657612  ClpXP protease specificity-enhancing factor
Shal_3627  2       26       -0.503768  stringent starvation protein A
Shal_3628  1       22       -0.040719  cytochrome c1
Shal_3629  1       23       -0.088484  cytochrome b/b6 domain-containing protein
Shal_3630  2       19       -0.139792  ubiquinol-cytochrome c reductase, iron-sulfur
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3615  PF03544  62  5e-14  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 62.3 bits (151), Expect = 5e-14
Identities = 38/160 (23%), Positives = 58/160 (36%), Gaps = 10/160 (6%)

Query: 51 ELPLLPPEKPKLPTPKEVQSKPVTASAATPVSSSIPIMPSVADMPLQVPVLAIPTLNGLS 110
E P++ + P PK K V + P + T
Sbjct: 89 EAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTAT----- 143

Query: 111 LPSIAPVIAGIKDVDKAPELLRFIQPKMPVAGRKFKGGGRVLLRLIVEADGVVSQAQVLE 170
A + V P L QP+ P + + G+V ++ V DG V Q+L
Sbjct: 144 ----AATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKFDVTPDGRVDNVQILS 199

Query: 171 AKPKQVFDQSAVEAARRWRFKPAVLSGEAVRVFVDVPINF 210
AKP +F++ A RRWR++P G + V + IN
Sbjct: 200 AKPANMFEREVKNAMRRWRYEPG-KPGSGIVVNILFKING 238


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3618  RTXTOXIND  31  0.016  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.016
Identities = 25/126 (19%), Positives = 54/126 (42%), Gaps = 12/126 (9%)

Query: 17 FAQKSATQETASVKGSAEYQALERRVD--QRASEVERQNLLWSSNL-------KQEIDEV 67
+ + +E K AE + R++ + S VE+ L S+L K + E
Sbjct: 198 WQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQ 257

Query: 68 KTNQQQLTKRLKTKRGQLEQLNQQLIDLNQQKQSLSQDYQLAVDDMKLVQGAYQKALSSL 127
+ + L+ + QLEQ+ +++ ++ Q ++Q ++ + D KL Q + L
Sbjct: 258 ENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILD-KLRQ--TTDNIGLL 314

Query: 128 SQQWQQ 133
+ + +
Sbjct: 315 TLELAK 320


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3622  PF05272  31  0.012  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.8 bits (69), Expect = 0.012
Identities = 27/118 (22%), Positives = 40/118 (33%), Gaps = 14/118 (11%)

Query: 30 LAILGANGAGKTTLINCLCGLNQPSEGAVHLNSRALDKMSRAEIAQQMALVPQQQETAFA 89
+ + G G GK+TLIN L GL+ S+ + E + + TAF
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIG----TGKDSYEQIAGIVAYELSEMTAFR 654

Query: 90 FTALEMVVMGRTPFLS----TFESPSEQDINDAKRVMQLLDVGGLAKADYNRLSGGER 143
E V F S + + + D R Q++ K Y G R
Sbjct: 655 RADAEAV----KAFFSSRKDRYRGAYGRYVQDHPR--QVVIWCTTNKRQYLFDITGNR 706


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3628  RTXTOXIND  27  0.050  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 27.5 bits (61), Expect = 0.050
Identities = 8/26 (30%), Positives = 16/26 (61%)

Query: 196 PIKLERERLGWWVMGFLFIFFVVAYL 221
P+ + +++MGFL I F+++ L
Sbjct: 52 PVSRRPRLVAYFIMGFLVIAFILSVL 77


52  Shal_3720  Shal_3727  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_3720  -1      25       3.648654   TOBE domain-containing protein
Shal_3721  3       32       5.020172   hypothetical protein
Shal_3722  3       30       5.334795   hypothetical protein
Shal_3723  3       28       5.594763   hypothetical protein
Shal_3724  3       27       5.326053   ABC transporter-like protein
Shal_3725  4       26       5.341202   ABC transporter-like protein
Shal_3726  4       27       4.991980   binding-protein-dependent transport system inner
Shal_3727  2       21       3.364908   binding-protein-dependent transport system inner
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3724  PF05272  30  0.006  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.006
Identities = 8/22 (36%), Positives = 14/22 (63%)

Query: 28 VLGISGPSGVGKSSLASVLAGM 49
+ + G G+GKS+L + L G+
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGL 619


53  Shal_3759  Shal_3783  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_3759  2       17       0.671636   pentapeptide repeat-containing protein
Shal_3760  0       10       -0.035175  amino acid-binding ACT domain-containing
Shal_3761  -1      10       -0.889374  BCCT transporter
Shal_3762  -1      13       -1.504882  N-acyl-D-aspartate deacylase
Shal_3763  -1      15       -2.951548  hypothetical protein
Shal_3764  -1      14       -3.541564  hypothetical protein
Shal_3765  0       16       -3.631563  metallophosphoesterase
Shal_3766  -1      19       -3.632671  diguanylate cyclase
Shal_3767  1       18       -3.265987  chaperone DnaJ domain-containing protein
Shal_3768  1       20       -3.253700  hypothetical protein
Shal_3769  2       24       -3.409084  hypothetical protein
Shal_3770  3       25       -5.490943  integrase family protein
Shal_3771  3       29       -3.533923  hypothetical protein
Shal_3772  3       31       -3.522727  hypothetical protein
Shal_3773  0       21       -1.241807  resolvase domain-containing protein
Shal_3774  1       19       -0.258053  hypothetical protein
Shal_3775  1       19       0.578925   HNH endonuclease
Shal_3776  0       20       2.655263   chaperonin GroEL
Shal_3777  -2      22       3.525979   co-chaperonin GroES
Shal_3778  -2      24       3.697924   MATE efflux family protein
Shal_3779  -1      22       3.488100   pentapeptide repeat-containing protein
Shal_3780  0       21       3.034359   hypothetical protein
Shal_3781  0       21       2.761296   LysR family transcriptional regulator
Shal_3782  -1      20       2.579848   RND family efflux transporter MFP subunit
Shal_3783  -1      19       3.208114   acriflavin resistance protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3762  UREASE  38  1e-04  Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 37.8 bits (88), Expect = 1e-04
Identities = 24/96 (25%), Positives = 42/96 (43%), Gaps = 20/96 (20%)

Query: 16 LIKNATVIDGTGGAAFKGDVLIDNQLII--------DVAPLIEL---AADSVIDAGGAYL 64
+I NA ++D G K D+ + + I D+ P + + VI G +
Sbjct: 71 VITNALILDHWG--IVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIV 128

Query: 65 TPGFIDVHSHSDLATFL-PQGFKPKVLQGITTELIG 99
T G +D H H F+ PQ + ++ G+T ++G
Sbjct: 129 TAGGMDSHIH-----FICPQQIEEALMSGLTC-MLG 158


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3767  RTXTOXINA  30  0.015  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.9 bits (67), Expect = 0.015
Identities = 17/66 (25%), Positives = 28/66 (42%)

Query: 130 KGQDSEFVFTVDFVDAVQGVEKVVELPVNGETKKIKVKIPAGIKDGEKIRFTGKGSAGVN 189
KG D V V AV +++ G + +++I + + DG+ F GSA +
Sbjct: 574 KGVDKWTVKGVQDKGAVYDYSNLIQHASVGNNQYREIRIESHLGDGDDKVFLSAGSANIY 633

Query: 190 GGAAGD 195
G D
Sbjct: 634 AGKGHD 639


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3778  SECFTRNLCASE  30  0.025  Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 29.8 bits (67), Expect = 0.025
Identities = 19/113 (16%), Positives = 48/113 (42%), Gaps = 12/113 (10%)

Query: 169 MMLAALINLILDPLLIFGIGPFPRLEIEGAAIATVISWVVALSLSTHLLIFKRHLVDFVE 228
L A++ L+ D LL G+ +L+ + +A +++ + S++ +++F R +
Sbjct: 178 FALGAVVALVHDVLLTVGLFAVLQLKFDLTTVAALLT-ITGYSINDTVVVFDR-----LR 231

Query: 229 PNLERLKCNWKQLAHIAQPAAMMNLLNPLANAIIMAMLARIDHSAVAAFGAGT 281
NL + K L + +++ L+ ++ M + + +G
Sbjct: 232 ENLIKYK--TMPLRDVMN----LSVNETLSRTVMTGMTTLLALVPMLIWGGDV 278


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3782  RTXTOXIND  45  4e-07  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.8 bits (106), Expect = 4e-07
Identities = 28/200 (14%), Positives = 67/200 (33%), Gaps = 29/200 (14%)

Query: 104 EADYELAKADFKRKGELLRRELISQAEYDLASAQLKSS--KANLASAQDQLSYTELTAPY 161
E++ AK +++ +L + E++ + L LA +++ + + AP
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDK----LRQTTDNIGLLTLELAKNEERQQASVIRAPV 334

Query: 162 DGTVAKISI-DNYQMVQANQPVL-VLQKDSDIDIVIQVPESLASKVTQFNPNAVTQPV-V 218
V ++ + +V + ++ ++ +D +++ V + V Q +
Sbjct: 335 SVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFIN------VGQNAII 388

Query: 219 RFANDPSVAYPVL------------LKEHATQVTPGTQSYEVVFTLARPSNMTVLPGMSA 266
+ P Y L + V S E N+ + GM+
Sbjct: 389 KVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAV 448

Query: 267 ELTMDIAQQKSQALTAILPP 286
+I ++ +L P
Sbjct: 449 T--AEIKTGMRSVISYLLSP 466



Score = 33.3 bits (76), Expect = 0.001
Identities = 18/83 (21%), Positives = 31/83 (37%), Gaps = 7/83 (8%)

Query: 78 EGQQVKKAMVLARLDRRDAQNTLLNREADYELAKADFKRKGELLR-------RELISQAE 130
EG+ V+K VL +L A+ L ++ A+ + R L R EL E
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 131 YDLASAQLKSSKANLASAQDQLS 153
+ + + ++Q S
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFS 196


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Shal_3783  ACRIFLAVINRP  494  e-160  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 494 bits (1274), Expect = e-160
Identities = 203/1053 (19%), Positives = 436/1053 (41%), Gaps = 61/1053 (5%)

Query: 4 AEYSITHKVISWMFALLLLIGGTVSFFSLGQLEFPEFTIKQALVVTTYPGASPEQVEEEV 63
A + I + +W+ A++L++ G ++ L ++P V YPGA + V++ V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 64 TLPLEDALQQLDGIKHITSV-NSAGLSQIEIEIKENYDASELPQVWDEVRRKVNDKAVEL 122
T +E + +D + +++S +SAG I + + D +V+ K+ L
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTD---PDIAQVQVQNKLQLATPLL 118

Query: 123 PPGVHTPSVIDDFGD---VYGILLNVSGDGYSDRELQNYADF-LRRELVLVDGIKKVTIA 178
P V + + + G + ++ +Y ++ L ++G+ V +
Sbjct: 119 PQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLF 178

Query: 179 GGVNEQVVVEISQQKLNALGLDQNYIYGLINSQNVVSNAGSMLVGDN------RIRIHPT 232
G + + + LN L + + QN AG + I
Sbjct: 179 GA-QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQ 237

Query: 233 GEFDNVRQMERLLISPPGSPKLIYLGDIAKIYKGSEETPSNIYHTDGKQALSIGIAFSSG 292
F N + ++ + ++ L D+A++ G E + I +GK A +GI ++G
Sbjct: 238 TRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGEN-YNVIARINGKPAAGLGIKLATG 296

Query: 293 VNVVKVGEAVNQRMSELYSELPIGMTLNTVYDQSKMVDQTVNGFLVNLAESVAIVIGVLL 352
N + +A+ +++EL P GM + YD + V +++ + L E++ +V V+
Sbjct: 297 ANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMY 356

Query: 353 IFMG-VRSGLLMGLVLLLTILGTFIVMNVLNIELQIISLGALIIALGMLVDNAIVVTEGI 411
+F+ +R+ L+ + + + +LGTF ++ + +++ +++A+G+LVD+AIVV E +
Sbjct: 357 LFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENV 416

Query: 412 L-IGIKRGQTRLETAKQVVSQTQWPLLGATIIAIIAFAPIGLSDNATGEFCASLFQVLLI 470
+ ++ E ++ +SQ Q L+G ++ F P+ +TG ++
Sbjct: 417 ERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVS 476

Query: 471 SLFISWITAMTLTPFFCNLMFKDGVISEDENDDPYKGWLFAIYRQSLNLA-------MRF 523
++ +S + A+ LTP C + K EN + GW + S+N +
Sbjct: 477 AMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGS 536

Query: 524 RSLTLILVIAALVSSVIGFGHVKNVFFPASNTPIFFVDVWMPEGSDIKATERLLSRIEAD 583
L++ + V+ F + + F P + +F + +P G+ + T+++L ++
Sbjct: 537 TGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDY 596

Query: 584 LLQQQKEQDTGLVNLTTVIGQG-AQRFVLSYVPEKGYKAYGQILLEMTDLTTLNQYMRVL 642
L+ +K + + G AQ +++V K ++ + +
Sbjct: 597 YLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWE----------ERNGDENSAEAV 646

Query: 643 ERELSLKFPEAEYRFKYMENGPS---------PAAKIEARFYGEDPQVLRQLAVQAEDIL 693
++ + F N P+ ++ + + +
Sbjct: 647 IHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAA 706

Query: 694 KAEPTAVGVRHNWRNQVTLVRPQLAQAQARETGISKQDLDNALLTNFSGKQIGTYRENSH 753
+ + V VR N + ++ Q +A+ G+S D++ + T G + + +
Sbjct: 707 QHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGR 766

Query: 754 LLPIIARAPDEERLDAQSIWKLQVWSSDNHTFVPVTQVVSDFSTEWEDPLIMRRDRKRVI 813
+ + +A + R+ + + KL V S N VP + + P + R + +
Sbjct: 767 VKKLYVQADAKFRMLPEDVDKLYV-RSANGEMVPFSAFT-TSHWVYGSPRLERYNGLPSM 824

Query: 814 SVLADPLNGTRETADSVFRKVKADIEAIP--LPAGYELEWGGEYETSMEAQESVFSSIPL 871
+ + G + A +E + LPAG +W G + + + +
Sbjct: 825 EIQGEAAPG------TSSGDAMALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAI 878

Query: 872 GYLAMFIITVLLFNSVRQPLVIWFTVPLALIGVVSGLLIFDAPFSFMALLGLLSLTGMII 931
++ +F+ L+ S P+ + VPL ++GV+ +F+ ++GLL+ G+
Sbjct: 879 SFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSA 938

Query: 932 KNGIVLVDQIN-LELSSGKEAYQAVVDSAVSRVRPVLMAAITTMLGMLPLISDAFFGS-- 988
KN I++V+ L GK +A + + R+RP+LM ++ +LG+LPL GS
Sbjct: 939 KNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGA 998

Query: 989 ---MAITIIFGLGFASVLTLIVLPVTYTLAFRI 1018
+ I ++ G+ A++L + +PV + + R
Sbjct: 999 QNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRC 1031
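
The "Score = ..., Expect = ..." line that opens each alignment in this report can be read programmatically. The following is a minimal sketch in Python, assuming the RPS-BLAST text layout shown here; the function names are hypothetical, and treating the 0.05 E-value from the report's input parameters as an inclusive cutoff is an assumption, not something PredictBias documents.

```python
import re

# Matches lines such as "Score = 494 bits (1274), Expect = e-160".
_SCORE = re.compile(r"Score\s*=\s*([\d.]+) bits \((\d+)\),\s*Expect\s*=\s*(\S+)")

def parse_score_line(line):
    """Return (bit_score, raw_score, e_value), or None if the line does not match."""
    m = _SCORE.search(line)
    if m is None:
        return None
    expect = m.group(3)
    # BLAST abbreviates very small E-values as "e-160"; float() needs "1e-160".
    if expect.startswith("e"):
        expect = "1" + expect
    return float(m.group(1)), int(m.group(2)), float(expect)

def passes_cutoff(line, cutoff=0.05):
    """True when the hit's E-value is at or below the cutoff (assumed inclusive)."""
    parsed = parse_score_line(line)
    return parsed is not None and parsed[2] <= cutoff

if __name__ == "__main__":
    print(parse_score_line("Score = 494 bits (1274), Expect = e-160"))
```

Lines that are not score lines (for example "Length = 1034") return None, so the function can be applied to every line of the report without pre-filtering.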


54  Shal_3892  Shal_3904  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias    %GCBias  Product
Shal_3892       2       17   0.882981  ATP-dependent RNA helicase RhlB
Shal_3893       1       16   1.478395  thioredoxin
Shal_3894       0       15   1.441110  transcription termination factor Rho
Shal_3895      -1       12   2.150988  LysR family transcriptional regulator
Shal_3896       0       12   2.254035  fumarate reductase iron-sulfur subunit
Shal_3897       0       10   2.002287  fumarate reductase flavoprotein subunit
Shal_3898      -2       13   1.830051  fumarate reductase respiratory complex
Shal_3899      -1       12   2.622917  fumarate reductase cytochrome b-556 subunit
Shal_3900      -1       13   3.078761  50S ribosomal protein L11 methyltransferase
Shal_3901       0       12   3.233211  nifR3 family TIM-barrel protein
Shal_3902       0       13   3.322575  DNA-binding protein Fis
Shal_3903       0       13   3.413287  leucyl aminopeptidase
Shal_3904      -1       14   3.397330  CzcA family heavy metal efflux protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3902  DNABINDNGFIS    119  7e-39    DNA-binding protein FIS signature.
		>DNABINDNGFIS#DNA-binding protein FIS signature.

Length = 98

Score = 119 bits (299), Expect = 7e-39
Identities = 59/101 (58%), Positives = 82/101 (81%), Gaps = 3/101 (2%)

Query: 1 MFDQTTHTEVHQLTVGKIETASGAIKPQLLRDAVKRAVTNFFAQMDGQEAEEVYEMVLSE 60
MF+Q +++V LTV + + + + LRD+VK+A+ N+FAQ++GQ+ ++YE+VL+E
Sbjct: 1 MFEQRVNSDV--LTVSTVNSQDQVTQ-KPLRDSVKQALKNYFAQLNGQDVNDLYELVLAE 57

Query: 61 VEAPLLDIIMQHTRGNQTRAANMLGINRGTLRKKLKKYGMN 101
VE PLLD++MQ+TRGNQTRAA M+GINRGTLRKKLKKYGMN
Sbjct: 58 VEQPLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98
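
The "Identities / Positives / Gaps" line of each alignment can be converted into fractions for sorting or filtering hits. This is a minimal sketch in Python, assuming the summary-line layout used throughout this report; `parse_alignment_stats` is a hypothetical helper, not part of PredictBias or BLAST.

```python
import re

# Matches fields in lines such as
# "Identities = 59/101 (58%), Positives = 82/101 (81%), Gaps = 3/101 (2%)".
# The "Gaps" field is optional: ungapped alignments in this report omit it.
_FIELD = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+)")

def parse_alignment_stats(line):
    """Return {field: (count, aligned_length, fraction)} for one summary line."""
    stats = {}
    for name, count, length in _FIELD.findall(line):
        count, length = int(count), int(length)
        stats[name.lower()] = (count, length, count / length)
    return stats

if __name__ == "__main__":
    line = "Identities = 59/101 (58%), Positives = 82/101 (81%), Gaps = 3/101 (2%)"
    for field, (count, length, frac) in parse_alignment_stats(line).items():
        print(f"{field}: {count}/{length} = {frac:.1%}")
```

Recomputing the fraction from the counts avoids relying on the rounded percentages printed in parentheses.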


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3904  ACRIFLAVINRP    741  0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 741 bits (1914), Expect = 0.0
Identities = 216/1052 (20%), Positives = 429/1052 (40%), Gaps = 47/1052 (4%)

Query: 7 RLAINRRWMMMFLAMLLVAVGLWSYQKLPIDAVPDITNVQVQISTRAEGYSPQETEQRIT 66
I R LA++L+ G + +LP+ P I V +S G Q + +T
Sbjct: 3 NFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVT 62

Query: 67 YPIENALMGIPQLSYTRSLS-RYGLSQVTVVFDEGTDLYFARNLISERLNGIKGKLPAEI 125
IE + GI L Y S S G +T+ F GTD A+ + +L LP E+
Sbjct: 63 QVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEV 122

Query: 126 EPEMGPISTGLGEIFMYTIEALPDAKQADGREFDATALREIQDWIIKPQLAQVPGVTEIN 185
+ + + M +D + + +K L+++ GV ++
Sbjct: 123 QQQGISVEKSSSSYLMVA------GFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQ 176

Query: 186 TIGGYDKEYHITPLPEQMLAYGVSFADIKVALLNSNSNRGAGYIER----EGQQLTVRSA 241
G I + + Y ++ D+ L N AG + GQQL
Sbjct: 177 LFGA-QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASII 235

Query: 242 AQ--LKTMSDIEQVVVK-HIDNAPIFITDIAEVGIGKALRTGAATRDGKETVLGSAMMLV 298
AQ K + +V ++ + D + + + D+A V +G A +GK +
Sbjct: 236 AQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLAT 295

Query: 299 GENSRQVSLAVAAKLEDVKASLPDGIKVEAVYDRTTLVDKAVYTVQKNLVEGALLVIVVL 358
G N+ + A+ AKL +++ P G+KV YD T V +++ V K L E +LV +V+
Sbjct: 296 GANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVM 355

Query: 359 FILLGNARAALITAAVIPLAMLATMTGMVQAGVSANLMSLG--ALDFGLIVDGAVIIVEN 416
++ L N RA LI +P+ +L T + G S N +++ L GL+VD A+++VEN
Sbjct: 356 YLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVEN 415

Query: 417 AIRRLGQAQVQNGGHVLGRKQRLQLVFEATNEVIRPSLFGVLIITVVYIPLFSLTGVEGK 476
R + + + + ++ +++ + ++++ V+IP+ G G
Sbjct: 416 VERVM----------MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGA 465

Query: 477 MFHPMAATVVMALLAALVLSLTLVPAAVALFMTGKISDKE----------SRVITATKSL 526
++ + T+V A+ +++++L L PA A + ++ + + +
Sbjct: 466 IYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNH 525

Query: 527 YKPLLIAAMKLRWLVLSAAVALVAFSIWLATTLGSEFIPQLNEEDILLQAIRIPGTGLEQ 586
Y + + L +VA + L L S F+P+ ++ L G E+
Sbjct: 526 YTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQER 585

Query: 587 SVDMQVLLEERIKHFPQVLNVFSKIGTPEVANDPMPPNIADTFIMLKPRSEWPNPKMSHD 646
+ + + + + NV S + N F+ LKP E + S +
Sbjct: 586 TQKVLDQVTDYYLKNEKA-NVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAE 644

Query: 647 ALAADIVTAVSGQPGNNFELTQPIEM-RFNELISGVRADLG-IKVVGDDLQQLLTSANAI 704
A+ + P M EL + D I G L + N +
Sbjct: 645 AVIHRAKMELGKIRDGF---VIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQL 701

Query: 705 HEVLEQ-IDGVADLQVEQVTGLPMLSIQPDRMALARYGLTVNELQDFVATAIGGEEAGLI 763
+ Q + ++ + ++ D+ G++++++ ++TA+GG
Sbjct: 702 LGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDF 761

Query: 764 FEGDRRFKMVIRLPETIRKDVEHLQELPIVLPSGEYVPLSEVASLDIAPSPNQISRENGK 823
+ R K+ ++ R E + +L + +GE VP S + ++ R NG
Sbjct: 762 IDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGL 821

Query: 824 RRIVVTANVRGRDLGSFVSEAKLRINQQVDIPAGYWLEYGGTFEQLESASQRLSIVVPLT 883
+ + + + +PAG ++ G Q + + +V ++
Sbjct: 822 PSMEIQGEAAPGTSSGDAMALMENLASK--LPAGIGYDWTGMSYQERLSGNQAPALVAIS 879

Query: 884 LALILGLLVMAFSSVKDALIIFTGIPLALTGGVISLWLRDMPLSISAGVGFIALSGVAVL 943
++ L + S + + +PL + G +++ L + + VG + G++
Sbjct: 880 FVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAK 939

Query: 944 NGLVLLSFIRQLYQEKGE-LMSAIIDGALIRLRPVLMTALVAGLGFVPMALNIGTGAEVQ 1002
N ++++ F + L +++G+ ++ A + +RLRP+LMT+L LG +P+A++ G G+ Q
Sbjct: 940 NAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQ 999

Query: 1003 RPIATVVIGGIISSTLLTLFVLPILYRMAHQR 1034
+ V+GG++S+TLL +F +P+ + + +
Sbjct: 1000 NAVGIGVMGGMVSATLLAIFFVPVFFVVIRRC 1031


55  Shal_3917  Shal_3922  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias    %GCBias  Product
Shal_3917       5       21   2.002894  porphobilinogen deaminase
Shal_3918       5       20   1.931438  uroporphyrinogen III synthase HEM4
Shal_3919       5       20   1.935089  hypothetical protein
Shal_3920       5       20   2.044599  HemY domain-containing protein
Shal_3921       6       20   2.091609  putative outer membrane adhesin-like protein
Shal_3922       5       23   1.750218  putative outer membrane adhesin-like protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3919  RTXTOXIND        31  0.007    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.3 bits (71), Expect = 0.007
Identities = 12/79 (15%), Positives = 27/79 (34%), Gaps = 5/79 (6%)

Query: 80 GYYLYQQLQAQQAETAALQQELKQELQTVLVAPNQRITSLEQQ----QNQFKSSVDSTLK 135
+ Q+ + EL+ + I S +++ FK+ + L+
Sbjct: 247 QAIAKHAVLEQENKYVEAVNELRVYKSQLEQI-ESEILSAKEEYQLVTQLFKNEILDKLR 305

Query: 136 KTVDQQTQLEERVSIIAQR 154
+T D L ++ +R
Sbjct: 306 QTTDNIGLLTLELAKNEER 324


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3921  CABNDNGRPT       85  1e-18    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 84.6 bits (209), Expect = 1e-18
Identities = 41/169 (24%), Positives = 61/169 (36%), Gaps = 17/169 (10%)

Query: 1880 TIVGTDNINNLIFGSTNSDSLTGAN-LDDRIFGREDNDILIGLSGNDELIGGSGNDNVQG 1938
G +N+ + + G + N + + IGGSGND + G
Sbjct: 295 VWDAGGTDTFDFSGYSNNQRINLNEGSFSDVGGLKGNVSIAHGVTIENAIGGSGNDILVG 354

Query: 1939 GEGNDFVIGGIGDDTLNGGIGRDYLSGGQGNDSLDGGALNGSDDGERDFFVWESDSADGS 1998
++ + GG G+D L GG G D L GG G D+ G+ DS +
Sbjct: 355 NSADNILQGGAGNDVLYGGAGADTLYGGAGRDTFVYGS--------------GQDSTVAA 400

Query: 1999 TDTVLNFNLDIDVLDLSDLLIGEESGNLEDFLSFSFSGGNTTITIDADG 2047
D + +F ID +DLS E F+ G + DA
Sbjct: 401 YDWIADFQKGIDKIDLSAF--RNEGQLSFVQDQFTGKGQEVMLQWDAAN 447



Score = 67.3 bits (164), Expect = 3e-13
Identities = 33/159 (20%), Positives = 49/159 (30%), Gaps = 43/159 (27%)

Query: 1875 ADDSITIVGTDNINNLIFGSTNSDSLTGANLDDRIFGREDNDILIGLSGNDELIGGSGND 1934
++ I + + + G + S+ + G NDIL+G S ++ L GG+GND
Sbjct: 309 YSNNQRINLNEGSFSDVGGLKGNVSIAHGVTIENAIGGSGNDILVGNSADNILQGGAGND 368

Query: 1935 NVQGGEGNDFVIGGIGDDTLNGGIG-----------RDYLSGGQGNDSLDGGALNG---- 1979
+ GG G D + GG G DT G G D+ G D
Sbjct: 369 VLYGGAGADTLYGGAGRDTFVYGSGQDSTVAAYDWIADFQKGIDKIDLSAFRNEGQLSFV 428

Query: 1980 ----------------------------SDDGERDFFVW 1990
+ DF V
Sbjct: 429 QDQFTGKGQEVMLQWDAANSITNLWLHEAGHSSVDFLVR 467



Score = 33.4 bits (76), Expect = 0.012
Identities = 11/70 (15%), Positives = 19/70 (27%), Gaps = 4/70 (5%)

Query: 1922 SGNDELIGGSGNDNVQGGE----GNDFVIGGIGDDTLNGGIGRDYLSGGQGNDSLDGGAL 1977
N G D++ + N G N RD+ + + +L
Sbjct: 237 DYNGHYGGAPMIDDIAAIQRLYGANMTTRTGDSVYGFNSNTDRDFYTATDSSKALIFSVW 296

Query: 1978 NGSDDGERDF 1987
+ DF
Sbjct: 297 DAGGTDTFDF 306


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3922  CABNDNGRPT       67  1e-12    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 66.5 bits (162), Expect = 1e-12
Identities = 52/206 (25%), Positives = 73/206 (35%), Gaps = 22/206 (10%)

Query: 3789 GDDKVNAGEGNDIIFGDLVSFDGIDGQGFSALQAFVAQETNQQPTNVTVQDIHEFISSNT 3848
D A + + + + G D FS + N + N
Sbjct: 278 DRDFYTATDSSKALIFSVWDAGGTDTFDFSG-----YSNNQRINLNEGSFSDVGGLKGNV 332

Query: 3849 HLFGESNTDDGADTLEGGEGNDILFGQGGNDELIGGLGADTLIGGLGDDKLTGNEGADTF 3908
+ + GG GNDIL G ++ L GG G D L GG G D L G G DTF
Sbjct: 333 SIAH----GVTIENAIGGSGNDILVGNSADNILQGGAGNDVLYGGAGADTLYGGAGRDTF 388

Query: 3909 VWDRNSIDNSDPTDRNTDHITDFNMAEDKLDLSDILQGDTVSELAQH---------LSFT 3959
V+ D T D I DF DK+DLS +S + L +
Sbjct: 389 VYG----SGQDSTVAAYDWIADFQKGIDKIDLSAFRNEGQLSFVQDQFTGKGQEVMLQWD 444

Query: 3960 DENGSTSINIDTDGNGSFDQHIVLDG 3985
N T++ + G+ S D + + G
Sbjct: 445 AANSITNLWLHEAGHSSVDFLVRIVG 470


56  Shal_3935  Shal_3943  Y  N  N  Genomic Island
LocusTag   DNBias  CDNBias    %GCBias  Product
Shal_3935      -1       17   3.537648  PA-phosphatase-like phosphoesterase
Shal_3936      -1       18   4.319429  AsnC family transcriptional regulator
Shal_3937       0       18   4.539547  sodium:neurotransmitter symporter
Shal_3938       0       17   4.878285  TonB-dependent copper receptor
Shal_3939      -1       18   5.257719  serine--pyruvate transaminase
Shal_3940      -1       16   4.527015  threonine dehydratase
Shal_3941      -1       14   4.528092  dihydroxy-acid dehydratase
Shal_3942      -1       15   3.509761  branched-chain amino acid aminotransferase
Shal_3943      -1       16   3.312250  amino acid-binding ACT domain-containing
57  Shal_3998  Shal_4015  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias    %GCBias  Product
Shal_3998       2       18  -2.642824  hypothetical protein
Shal_3999       3       18  -2.604577  hypothetical protein
Shal_4000       2       17  -0.751699  hypothetical protein
Shal_4001       2       16  -0.431221  hypothetical protein
Shal_4002       0       16  -0.108540  hypothetical protein
Shal_4003      -2       15   2.063118  histidine kinase
Shal_4004      -2       17   3.951497  hypothetical protein
Shal_4005      -2       16   3.992044  peptidase S9 prolyl oligopeptidase
Shal_4006      -1       15   3.328272  hypothetical protein
Shal_4007       0       17   3.493594  hypothetical protein
Shal_4008      -1       17   3.345105  molydopterin dinucleotide-binding region
Shal_4009       0       19   1.560963  4Fe-4S ferredoxin
Shal_4010       0       21   1.743549  LysR family transcriptional regulator
Shal_4011       1       25   2.606901  molybdopterin oxidoreductase
Shal_4012       0       22   2.312201  4Fe-4S ferredoxin
Shal_4013       0       17   0.469759  cytoplasmic chaperone TorD family protein
Shal_4014       1       17  -0.186085  formate dehydrogenase subunit gamma
Shal_4015       2       16  -0.075568  porin
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4001  OMPADOMAIN       30  0.011    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 30.3 bits (68), Expect = 0.011
Identities = 20/69 (28%), Positives = 32/69 (46%), Gaps = 9/69 (13%)

Query: 278 YGESHQLRKSKGVTAGVRYDVTSSISLNLEYQMIFSQEEGNRAQFVTSPLVYDESSDVDL 337
YG++H S GV Y +T I+ LEYQ ++ G+ T P D +
Sbjct: 132 YGKNHDTGVSPVFAGGVEYAITPEIATRLEYQ--WTNNIGDAHTIGTRP-------DNGM 182

Query: 338 VTLSVNFVF 346
++L V++ F
Sbjct: 183 LSLGVSYRF 191


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4003  HTHFIS           68  8e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 67.9 bits (166), Expect = 8e-14
Identities = 28/111 (25%), Positives = 55/111 (49%), Gaps = 8/111 (7%)

Query: 759 RIVILIVEDNLVNQKVASLLVKQAGFDCIIANNGQEALDFISSGEAFHAILMDCMMPVMD 818
IL+ +D+ + V + + +AG+D I +N +I++G+ ++ D +MP +
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD-GDLVVTDVVMPDEN 61

Query: 819 GFTATEKIREWEGLHSQSRLPIIALTA-SVLDQDIKRCYQSGMDDYLAKPF 868
F +I++ + LP++ ++A + IK + G DYL KPF
Sbjct: 62 AFDLLPRIKKA-----RPDLPVLVMSAQNTFMTAIK-ASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4011  adhesinb         34  0.002    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 34.1 bits (78), Expect = 0.002
Identities = 28/126 (22%), Positives = 45/126 (35%), Gaps = 18/126 (14%)

Query: 219 ETATNVHGIKFINQARDKGAKLLV-----VNPMRTPIASQAD-----VWLQPKPSTDTYL 268
ET N K + A+ K K V+ + S+ WL +
Sbjct: 94 ETGGNAWFTKLVENAKKKENKDYYAVSEGVDVIYLEGQSEKGKEDPHAWLNLENGI--IY 151

Query: 269 ATGIMKYLVENDLADMKFIEANTLDYQDLLKRLDEMSYDEIEQVTGVPREKMFE------ 322
A I K L E D A+ + E N Y + L LD+ + ++ + G + +
Sbjct: 152 AQNIAKRLSEKDPANKETYEKNLKAYVEKLSALDKEAKEKFNNIPGEKKMIVTSEGCFKY 211

Query: 323 FAKVYG 328
F+K Y
Sbjct: 212 FSKAYN 217


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4015  ECOLNEIPORIN     59  4e-12    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 59.4 bits (144), Expect = 4e-12
Identities = 51/321 (15%), Positives = 120/321 (37%), Gaps = 29/321 (9%)

Query: 49 SQVSLYGSLRPSLSYLDD---NQDKTWDVR------DALSRIGIKASTEFADGWSAIAQG 99
+ V+LYG+++ + N + V D S+IG K + +G AI Q
Sbjct: 19 ADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLGSKIGFKGQEDLGNGLKAIWQV 78

Query: 100 EWSVDIANSGNFGKARQAYAAIASPYGQVGIGKQRPAQYTLVAEYVDIFNNANSPFGYDH 159
E IA + + RQ++ + +G++ +G+ ++ +++ + G +
Sbjct: 79 EQKASIAGTDSGWGNRQSFIGLKGGFGKLRVGRLNSVLKD--TGDINPWDSKSDYLGVN- 135

Query: 160 ESPFFVDNF---VTYQLVTGDFTWMAGA---QFNGDSGEDATDMVNVGVGYDLNQLHIGL 213
+ V Y + +F ++G+ N ++G ++ + G Y +
Sbjct: 136 -KIAEPEARLISVRYD--SPEFAGLSGSVQYALNDNAGRHNSESYHAGFNYKNGGFFVQY 192

Query: 214 GYVSQNTVD-NQHIEGDDNTLGGVVAYTFSNDLYLAVSYQDKQYNYDAGSMNKDRSGSTL 272
G + +++ + + +V+ ++ LY +V+ Q Q + N + T
Sbjct: 193 GGAYKRHHQVQENVNIEKYQIHRLVSGYDNDALYASVAVQ--QQDAKLVEENYSHNSQTE 250

Query: 273 DTALAYPIFDDYKVKLGYFQFK----DGIDDNTSADYDGFNTTLEWNPVDNVRVHLEYLD 328
A F + ++ Y D + N D +++ + V +L
Sbjct: 251 VAATLAYRFGNVTPRVSYAHGFKGSFDATNYNNDYDQVVVGAEYDFSKRTSALVSAGWLQ 310

Query: 329 KN-FDNRDDDRSITIGFRYDF 348
+ +++ + +G R+ F
Sbjct: 311 EGKGESKFVSTAGGVGLRHKF 331


58  Shal_4045  Shal_4064  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias    %GCBias  Product
Shal_4045      -1       20  -4.520999  hypothetical protein
Shal_4046       3       31  -8.708874  hypothetical protein
Shal_4047       4       28  -9.284211  hypothetical protein
Shal_4048       5       28  -8.907929  hypothetical protein
Shal_4049       5       27  -8.934707  phage transcriptional regulator AlpA
Shal_4050       4       26  -8.903289  hypothetical protein
Shal_4051       5       26  -9.314944  hypothetical protein
Shal_4052       5       27  -8.900976  hypothetical protein
Shal_4053       4       26  -7.846153  hypothetical protein
Shal_4054       3       22  -6.358998  hypothetical protein
Shal_4055       4       18  -4.894761  hypothetical protein
Shal_4056       4       18  -3.809597  hypothetical protein
Shal_4057       4       19  -2.728085  hypothetical protein
Shal_4058       4       20  -2.270686  hypothetical protein
Shal_4059       5       19  -2.094383  integrase family protein
Shal_4060       4       16  -4.511011  ATPase central domain-containing protein
Shal_4061       2       21  -5.842443  hypothetical protein
Shal_4062       0       20  -5.356915  hypothetical protein
Shal_4063      -2       15  -3.902324  hypothetical protein
Shal_4064      -2       14  -3.532749  DNA-cytosine methyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4047  SSPAMPROTEIN     27  0.014    Salmonella surface presentation of antigen gene typ...
		>SSPAMPROTEIN#Salmonella surface presentation of antigen gene type M signature.

Length = 147

Score = 27.3 bits (60), Expect = 0.014
Identities = 21/72 (29%), Positives = 36/72 (50%)

Query: 39 LRCMSGDISQQEFFDEQRKLSSMVWQIKDLEAKIEQFNQTQGVKRGQLSDSEKNQIYHYY 98
LR + +S++E + RK S + QIKDLE +I Q + + + + ++ Y
Sbjct: 56 LRAENRQLSREEIYALLRKQSIVRRQIKDLELQIIQIQEKRSELEKKREEFQEKSKYWLR 115

Query: 99 KTGNYTQMQLSQ 110
K GNY + + Q
Sbjct: 116 KEGNYQRWIIRQ 127


59  Shal_4194  Shal_4206  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias    %GCBias  Product
Shal_4194       3       19   1.507521  hypothetical protein
Shal_4195       1       18   1.675841  hypothetical protein
Shal_4196       2       19   1.439140  ECF subfamily RNA polymerase sigma-24 factor
Shal_4197       1       17  -0.018191  hypothetical protein
Shal_4198       0       13  -0.978439  hypothetical protein
Shal_4199      -1       14  -1.791728  hypothetical protein
Shal_4200      -2       15  -3.572589  sodium/glutamate symporter
Shal_4201       0       22  -7.081139  hypothetical protein
Shal_4202      -1       20  -7.221000  glycosyl transferase family protein
Shal_4203      -1       18  -6.409801  group 1 glycosyl transferase
Shal_4204      -2       19  -6.326784  glycosyl transferase family protein
Shal_4205      -1       20  -5.755227  polysaccharide pyruvyl transferase
Shal_4206      -1       19  -3.399325  glycosyl transferase family protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4199  OMPADOMAIN       27  0.011    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 26.8 bits (59), Expect = 0.011
Identities = 16/58 (27%), Positives = 27/58 (46%), Gaps = 6/58 (10%)

Query: 1 MKSTKTAVGVTVAGILAASLSLGAQAVPEQPKEWEKCAGVAKAGANDCGALDGSHGCA 58
MK KTA+ + VA A++ AQA P + W A + + +D G ++ +
Sbjct: 1 MK--KTAIAIAVALAGFATV---AQAAP-KDNTWYTGAKLGWSQYHDTGFINNNGPTH 52


60  Shal_4236  Shal_4269  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias    %GCBias  Product
Shal_4236       2       15   2.294205  OmpA/MotB domain-containing protein
Shal_4237       0       15   3.698357  sigma-54 dependent trancsriptional regulator
Shal_4238       0       18   4.133722  flagellar hook-basal body complex subunit FliE
Shal_4239       1       18   4.418730  flagellar MS-ring protein
Shal_4240       1       20   4.424915  flagellar motor switch protein G
Shal_4241       1       22   5.255394  flagellar assembly protein H
Shal_4242       2       24   4.891267  FliI/YscN family ATPase
Shal_4243       6       22   3.551486  hypothetical protein
Shal_4244       3       16   2.929414  hypothetical protein
Shal_4245       1       18   3.534251  hypothetical protein
Shal_4246       0       20   4.598564  SAF domain-containing protein
Shal_4247       0       24   5.064026  flagellar basal body rod protein FlgB
Shal_4248       0       22   5.342771  flagellar basal-body rod protein FlgC
Shal_4249       0       24   5.651564  flagellar hook capping protein
Shal_4250       0       24   6.174322  flagellar hook protein FlgE
Shal_4251      -2       24   5.615182  flagellar basal-body rod protein FlgF
Shal_4252      -2       23   4.333284  flagellar basal-body rod protein FlgG
Shal_4253      -1       19   3.547382  flagellar basal body L-ring protein
Shal_4254      -1       16   2.294873  flagellar basal body P-ring protein
Shal_4255       1       15   1.094767  peptidoglycan hydrolase
Shal_4256       1       15   0.822069  flagellar hook-associated protein FlgK
Shal_4257       1       15   0.349777  flagellar hook-associated protein 3
Shal_4258       2       16   0.450881  hypothetical protein
Shal_4259       1       13   0.571621  hypothetical protein
Shal_4260       2       19   0.908903  flagellin domain-containing protein
Shal_4261       3       17   0.939525  flagellar hook-associated 2 domain-containing
Shal_4262       2       17   0.905485  flagellar protein FliS
Shal_4263       1       16   1.046477  hypothetical protein
Shal_4264       0       16   0.564191  flagellar hook-length control protein
Shal_4265       0       18  -0.950889  flagellar basal body-associated protein FliL
Shal_4266      -2       15  -1.632736  flagellar biosynthesis sigma factor
Shal_4267      -2       17  -2.144640  MotA/TolQ/ExbB proton channel
Shal_4268      -2       16  -1.795952  OmpA/MotB domain-containing protein
Shal_4269       0       19  -3.017816  type IV pilus assembly PilZ
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4236  OMPADOMAIN       63  2e-13    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 63.4 bits (154), Expect = 2e-13
Identities = 45/174 (25%), Positives = 63/174 (36%), Gaps = 19/174 (10%)

Query: 126 ALEQGYTWQVNIQAADTSGYQISSSPVSTRLVAN-----QFRLCRQQLLPKPFTYLRRIE 180
A Y W NI A T G + + +S + + P P +
Sbjct: 157 ATRLEYQWTNNIGDAHTIGTRPDNGMLSLGVSYRFGQGEAAPVVAPAPAPAPEVQTKHFT 216

Query: 181 L----LFAPSSSLLNNSHEQDLYAVYRYL-QADSSIVEILVDGHADASGDHLANLVLSKE 235
L LF + + L + L +Y L D ++V G+ D G N LS+
Sbjct: 217 LKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSER 276

Query: 236 RADEVVSRLIELGVSAKMIQTRHHGTRAPVASN--NNTEGREL-------NRRV 280
RA VV LI G+ A I R G PV N +N + R +RRV
Sbjct: 277 RAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRV 330


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4237  HTHFIS          380  e-131    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 380 bits (978), Expect = e-131
Identities = 141/401 (35%), Positives = 211/401 (52%), Gaps = 47/401 (11%)

Query: 74 ELASIAMQCGVQDYLLLPIDEEQLCSLLQR--------LRRLELPNNE---LICAAPVSR 122
A A + G DYL P D +L ++ R +LE + + L+ + +
Sbjct: 88 MTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQ 147

Query: 123 QLLMLAHRAANTEATVLLTGESGTGKEPLARYIHRHSNRTDKPFIAINCAAIPESILESI 182
++ + R T+ T+++TGESGTGKE +AR +H + R + PF+AIN AAIP ++ES
Sbjct: 148 EIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESE 207

Query: 183 LFGHVKGAFTGATTDQIGKFELANGGTLLLDEIGEMPLLLQAKLLRVLQEREVERLGSHK 242
LFGH KGAFTGA T G+FE A GGTL LDEIG+MP+ Q +LLRVLQ+ E +G
Sbjct: 208 LFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRT 267

Query: 243 SITLDIRVIAATNKDLRQAVQDGKFREDLFYRLDVLPIKILPLRQRKEDILPIAEHFLQR 302
I D+R++AATNKDL+Q++ G FREDL+YRL+V+P+++ PLR R EDI + HF+Q+
Sbjct: 268 PIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQ 327

Query: 303 YKILAANQQCYFSEQARNLLLSHDWPGNVRELENTIQRALVMRRGQALQAEELGL----- 357
+ + + F ++A L+ +H WPGNVRELEN ++R + + E +
Sbjct: 328 AEKEGLDVKR-FDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSE 386

Query: 358 -------------VNQDGSALVEQNELG--------------LKASKRQAEFQYIIDTLK 390
+ S VE+N + E+ I+ L
Sbjct: 387 IPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALT 446

Query: 391 RYNGHRNNTAQALGMTTRALRYKLVQMREEGIDIDQILSQS 431
G++ A LG+ LR K +RE G+ + + +
Sbjct: 447 ATRGNQIKAADLLGLNRNTLRKK---IRELGVSVYRSSRSA 484


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4238  FLGHOOKFLIE      49  6e-11    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 48.9 bits (116), Expect = 6e-11
Identities = 21/72 (29%), Positives = 34/72 (47%), Gaps = 1/72 (1%)

Query: 42 SFTELMKHKVSSINADQNASSALVAAVDSGQSD-DLVGAMVASQKSSLAFSAMIQIRNRL 100
SF + + I+ Q A+ G+ L M QK+S++ IQ+RN+L
Sbjct: 32 SFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQVRNKL 91

Query: 101 VQAFDDVMKMPI 112
V A+ +VM M +
Sbjct: 92 VAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4239  FLGMRINGFLIF    315  e-103    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 315 bits (809), Expect = e-103
Identities = 172/597 (28%), Positives = 275/597 (46%), Gaps = 82/597 (13%)

Query: 2 STTLTQTSPVMTDVNGNNDRLNSLKQKWHQFSRGDRHAATLAILAIVAACVIVLMLWSTG 61
++T TQ P+ + LN L+ + + A V+ ++LW+
Sbjct: 4 ASTATQPKPL--------EWLNRLRAN--------PRIPLIVAGSAAVAIVVAMVLWAKT 47

Query: 62 QGYSPLYGNQENVETSHIIEVLEAEGISYRLDPTSGLILVPEDRVGNARMVLAARGVKAK 121
Y L+ N + + I+ L I YR SG I VP D+V R+ LA +G+
Sbjct: 48 PDYRTLFSNLSDQDGGAIVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKG 107

Query: 122 VPSGMESLDSSAIGTSQFMEQAKYRYSLEGELSRTIMALKSVKTARVHLAIPKKTLFIRQ 181
G E LD G SQF EQ Y+ +LEGEL+RTI L VK+ARVHLA+PK +LF+R+
Sbjct: 108 GAVGFELLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVRE 167

Query: 182 QPELPSASVMLDLYAGQHLQPEQITSIANLVAGSVTGMTPERVQIVDQEGNHLSSEINAN 241
Q + PSASV + L G+ L QI+++ +LV+ +V G+ P V +VDQ G+ L+ +
Sbjct: 168 Q-KSPSASVTVTLEPGRALDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSG 226

Query: 242 QDLTQARDKQLQYTQELEQSLINRASSMLQPILGQDNFQVQVAALVNFNQVEETRESLDP 301
+DL D QL++ ++E + R ++L PI+G N QV A ++F E+T E P
Sbjct: 227 RDLN---DAQLKFANDVESRIQRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSP 283

Query: 302 Q------TVVSQEKQSTNQTSGDMALGIPGALSNQPPTADAATNNSTSNLNQ-------- 347
T+ S++ + Q G+PGALSNQP + A + Q
Sbjct: 284 NGDASKATLRSRQLNISEQVGAGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQT 343

Query: 348 ----------------QESRQFEVGRSVKHTRYQQMQLENLSISVLLNNQ---AAGETGW 388
E+ +EV R+++HT+ +E LS++V++N +
Sbjct: 344 STSTNSNSAGPRSTQRNETSNYEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPL 403

Query: 389 TQPQLDQMSTMVQDAIGYSAARGDQFSISSFNFAPVKIAEFEPLPWWQGESYQAYLRYFI 448
T Q+ Q+ + ++A+G+S RGD ++ + F+ V E LP+WQ +S+ L
Sbjct: 404 TADQMKQIEDLTREAMGFSDKRGDTLNVVNSPFSAVDNTGGE-LPFWQQQSFIDQLLAAG 462

Query: 449 GAILGLGMIF----FVLRPLVQHLTRTVEHNIKDTLPSTQPLMPPAEAATAHLSQDAAQE 504
+L L + + +RP LTR VE A+ + +E
Sbjct: 463 RWLLVLVVAWILWRKAVRPQ---LTRRVEE------------AKAAQEQAQ--VRQETEE 505

Query: 505 AADNLVALNNANINSNWTSNLNLPAPGSPLTVQMEHLSLLANQEPARVAEVISHWIS 561
A + L+ +N L A V + + +++ +P VA VI W+S
Sbjct: 506 AVEV--RLSKDEQLQQRRANQRLGA-----EVMSQRIREMSDNDPRVVALVIRQWMS 555


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4240  FLGMOTORFLIG    178  1e-55    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 178 bits (453), Expect = 1e-55
Identities = 82/324 (25%), Positives = 169/324 (52%), Gaps = 1/324 (0%)

Query: 5 EQAAMLLLSMGEKGAAQVMAHLDRNDVQHLSHKMARLSSITQQEAEAVLGRFFTRYQEQS 64
++AA+LL+S+G + +++V +L + +++ L+ ++A+L +IT + + VL F Q
Sbjct: 19 QKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFKELMMAQE 78

Query: 65 GIARASRTYLQKTLDLALGDRVAKSLIDSIYGDEIKVLVKRLEWVDPQLLAREIANEHCQ 124
I + Y ++ L+ +LG + A +I+++ + + DP + I EH Q
Sbjct: 79 FIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRADPANILNFIQQEHPQ 138

Query: 125 LQAVLLGLLPPESAAQVLQGLPAEGQDEVLIRIAQLGDLDREVVDELKQLVERCMLMAME 184
A++L L P+ A+ +L LP E Q V RIA + EVV E+++++E+ +
Sbjct: 139 TIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLEKKLASLSS 198

Query: 185 KSHTQISGVRQVADILNRFD-GDREQLMEMLKLHDKQLANNVADNMFDFIILGRQKPETL 243
+ +T GV V +I+N D + ++E L+ D +LA + MF F + ++
Sbjct: 199 EDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDIVLLDDRSI 258

Query: 244 QEIMAIVPADVLALALKGIDSELKMTLLRALPKRMSSAIETQVEAIGTVPLSQAIAARKE 303
Q ++ + LA ALK +D ++ + + + KR +S ++ +E +G ++++
Sbjct: 259 QRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRKDVEESQQK 318

Query: 304 IMELAKQMMDEGQIELQLFEEQVV 327
I+ L +++ ++G+I + E+ V
Sbjct: 319 IVSLIRKLEEQGEIVISRGGEEDV 342


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4241  FLGFLIH          60  6e-13    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 59.8 bits (144), Expect = 6e-13
Identities = 41/182 (22%), Positives = 85/182 (46%), Gaps = 2/182 (1%)

Query: 42 QDFQQAFDKGYDEGVQKGHQAGFTSGEEEGRQTGYAAGFNQGRIEGQQKGKDNIDDQLNS 101
++ + + ++ + + H+ G+ +G EGRQ G+ G+ +G +G ++G Q
Sbjct: 34 EEAEPSLEQQLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAP 93

Query: 102 IIAPLGALKSLLEEGHNQQILQQQSLILDLVRRVSLQVIRCELTLQPQQILSLIEETLSA 161
I A + L S + + S ++ + + QVI T+ ++ I++ L
Sbjct: 94 IHARMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVIGQTPTVDNSALIKQIQQLLQQ 153

Query: 162 LPDDPSQVKIHLEPSAVDKLKEL--AADKIQSWSLVPDATISAGGCRIVSETSDADASVE 219
P + ++ + P + ++ ++ A + W L D T+ GGC++ ++ D DASV
Sbjct: 154 EPLFSGKPQLRVHPDDLQRVDDMLGATLSLHGWRLRGDPTLHPGGCKVSADEGDLDASVA 213

Query: 220 TR 221
TR
Sbjct: 214 TR 215


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4248  FLGHOOKAP1       33  3e-04    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 33.0 bits (75), Expect = 3e-04
Identities = 8/39 (20%), Positives = 19/39 (48%)

Query: 98 SNVNTVEEMADMMAASRSFETNVEIMNRARSMQQGLLQL 136
S VN EE ++ + + N +++ A ++ L+ +
Sbjct: 507 SGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545



Score = 27.2 bits (60), Expect = 0.021
Identities = 14/59 (23%), Positives = 25/59 (42%), Gaps = 4/59 (6%)

Query: 9 IAGAGMNAQTIRLNTVASNLANAGAAAESPDEAFRALKPVFSTIYKQTQEGQVAGAHVE 67
A +G+NA LNT ++N+++ A + + ST+ G G +V
Sbjct: 6 NAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--MAQANSTLGAGGWVGN--GVYVS 60


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4250  FLGHOOKAP1       34  0.001    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 34.2 bits (78), Expect = 0.001
Identities = 15/41 (36%), Positives = 22/41 (53%)

Query: 360 LEGSNVDQTAEMVNLMTAQRNYQSNAKVLDTNSTMQQALLN 400
S V+ E NL Q+ Y +NA+VL T + + AL+N
Sbjct: 504 QSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALIN 544



Score = 34.2 bits (78), Expect = 0.001
Identities = 21/59 (35%), Positives = 27/59 (45%), Gaps = 5/59 (8%)

Query: 2 SFNIALSGLQATTQDLNTISNNIANSSTVGFRSGR----SEFSAIYNGGQAG-GVNVMN 55
N A+SGL A LNT SNNI++ + G+ S + GG G GV V
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSG 61


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4252  FLGHOOKAP1       42  1e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 42.3 bits (99), Expect = 1e-06
Identities = 11/47 (23%), Positives = 21/47 (44%)

Query: 213 QIRQGALEGANVNVVEEMVEMISTQRAYEMNAKVVSASDDMLKFLNQ 259
Q+ + VN+ EE + Q+ Y NA+V+ ++ + L
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALIN 544



Score = 39.2 bits (91), Expect = 1e-05
Identities = 10/37 (27%), Positives = 19/37 (51%)

Query: 3 SALWVSKTGLTAQDTKMTTIANNLANVNTTGFKRDRV 39
S + + +GL A + T +NN+++ N G+ R
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTT 38


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4253  FLGLRINGFLGH    148  1e-46    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 148 bits (375), Expect = 1e-46
Identities = 76/224 (33%), Positives = 112/224 (50%), Gaps = 17/224 (7%)

Query: 12 LTLLLSGCVAHIPEPDTAPGKPEWAPPEIDYSLPDAENGSVYRPGFMLT-----LFKDKR 66
L L L+GC A IP + P + NGS+++ + LF+D+R
Sbjct: 15 LVLSLTGC-AWIP---STPLVQGATSAQPVPGPTPVANGSIFQSAQPINYGYQPLFEDRR 70

Query: 67 AYREGDILTVALDEKTYSSKRADTKTSKSGGVSIDGQGTTGTSSIAGSG------EANMG 120
GD LT+ L E +SK + S+ G + G T G EA+ G
Sbjct: 71 PRNIGDTLTIVLQENVSASKSSSANASRDGKTNF-GFDTVPRYLQGLFGNARADVEASGG 129

Query: 121 RSFNGTGSSTQQNQLSGSITVTVAKVLPNGALLIRGEKWLRLNQGDEYLRLLGLIRADDI 180
+FNG G + N SG++TVTV +VL NG L + GEK + +NQG E++R G++ I
Sbjct: 130 NTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVNPRTI 189

Query: 181 DNDNTISSQRIADARIIYGGQGAISDSNRMGWAARYFNSPWFPL 224
NT+ S ++ADARI Y G G I+++ MGW R+F + P+
Sbjct: 190 SGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLN-LSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4254  FLGPRINGFLGI  343    e-118    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 343 bits (880), Expect = e-118
Identities = 151/370 (40%), Positives = 211/370 (57%), Gaps = 14/370 (3%)

Query: 11 MLLSLSPLLPVKAQPQHRYLMDIVDVQGLRDNQLVGYGLVVGLDGTGDRT-QVRFTSQSI 69
+ +L L AQ + DI +Q RDNQL+GYGLVVGL GTGD FT QS+
Sbjct: 12 VFSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSM 71

Query: 70 VNMLKQFGVQIDDKTDPKLKNVAAVAVHATVPPLASPGQTLDITVSSLGDAKSLRGGTLL 129
ML+ G+ KN+AAV V A +PP ASPG +D+TVSSLGDA SLRGG L+
Sbjct: 72 RAMLQNLGITTQG-GQSNAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLI 130

Query: 130 MTPMRAVDGEIYAVAQGNLVVGGVSAQGRNGTSVTINVPTVGSIPNGALLEAAMHSNFND 189
MT + DG+IYAVAQG L+V G SAQG + ++T V T +PNGA++E + S F D
Sbjct: 131 MTSLSGADGQIYAVAQGALIVNGFSAQG-DAATLTQGVTTSARVPNGAIIERELPSKFKD 189

Query: 190 NENIVLNLIDPSFKTARNIERAVNEL----FGPDVAQADSSAKVIVRAPSSNRERVTFMS 245
+ N+VL L +P F TA + VN +G +A+ S ++ V+ P + M+
Sbjct: 190 SVNLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKP-RVADLTRLMA 248

Query: 246 MLEELQIEQGRKSPRVVFNSRTGTVVMGGDVVVRKAAVSHGNLTVTIVEQEFVSQPNGAY 305
+E L +E +VV N RTGT+V+G DV + + AVS+G LTV + E V QP
Sbjct: 249 EIENLTVETDTP-AKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPAPF- 306

Query: 306 LGQAQGETVVTTDSQVGIDEGNGHMFVWPEGTALNDIVRAVNSLGASPMDLMAILQALNE 365
++G+T V + + + + + EG L +V +NS+G ++AILQ +
Sbjct: 307 ---SRGQTAVQPQTDIMAMQEGSKVAI-VEGPDLRTLVAGLNSIGLKADGIIAILQGIKS 362

Query: 366 AGALEAELVV 375
AGAL+AELV+
Sbjct: 363 AGALQAELVL 372


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_4255  FLGFLGJ  50     3e-10    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 50.1 bits (119), Expect = 3e-10
Identities = 27/96 (28%), Positives = 48/96 (50%), Gaps = 5/96 (5%)

Query: 26 GALKLVSQQFEAQFLQTVLKQMRSASDVMADKDSPLSSQNDGMYRDWHDAELAGRLSQMQ 85
++ V++Q E F+Q +LK MR A KD SS++ +Y +D ++A +++ +
Sbjct: 31 ANIRPVARQVEGMFVQMMLKSMRDALP----KDGLFSSEHTRLYTSMYDQQIAQQMTAGK 86

Query: 86 STGLAEVMTKQLSAGLKSEPEMVASNKQVNSSPNTA 121
GLAE+M KQ++ + PE + T
Sbjct: 87 GLGLAEMMVKQMTPE-QPLPEESTPAAPMKFPLETV 121


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
Shal_4256  FLGHOOKAP1  168    1e-48    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 168 bits (427), Expect = 1e-48
Identities = 95/320 (29%), Positives = 156/320 (48%), Gaps = 8/320 (2%)

Query: 2 SMLNIGMSGLNASMAALTATSNNINNAMVPGYSRQQVMLSSVGNGVYGS---GSGVMVDG 58
S++N MSGLNA+ AAL SNNI++ V GY+RQ +++ + + G+GV V G
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSG 61

Query: 59 VRRISDQYEVAQLWNTTSGLGYANTQSSYFGQVEQIFGSEGNSISAGLDLLFASLNSAME 118
V+R D + QL + + +++ + + +S++ + F SL + +
Sbjct: 62 VQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLVS 121

Query: 119 QPNEIAHRQGVLNEAKALTQRFNSISEGLNSQVTQVEGQINASAKEINTQLETIASLNAE 178
+ A RQ ++ +++ L +F + + L Q QV I AS +IN + IASLN +
Sbjct: 122 NAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLNDQ 181

Query: 179 IQSS--NASGNVPLALLDARDSAIDDLSSIIDVNVVEDSSGMLNISLAQGQPLLSGTTAS 236
I +G P LLD RD + +L+ I+ V V G NI++A G L+ G+TA
Sbjct: 182 ISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTAR 241

Query: 237 KLEV---TPDPSNPKFSQISIQFGQSSFPLDETAGGSLGALIDYRDNSLVDSMAFIDELA 293
+L + DPS + + G P GSLG ++ +R L + + +LA
Sbjct: 242 QLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQLA 301

Query: 294 MTMADEFNAVLAGGTDLNGN 313
+ A+ FN G D NG+
Sbjct: 302 LAFAEAFNTQHKAGFDANGD 321



Score = 73.1 bits (179), Expect = 6e-16
Identities = 36/105 (34%), Positives = 60/105 (57%), Gaps = 3/105 (2%)

Query: 351 GDNSNLKELVDIANKSFTFSSMGVDTTMGDAFGSKIGELGSASRQAQMSKTTAENLQMEA 410
DN N + L+D+ + S ++G + DA+ S + ++G+ + + S T N+ +
Sbjct: 443 SDNRNGQALLDLQSNS---KTVGGAKSFNDAYASLVSDIGNKTATLKTSSATQGNVVTQL 499

Query: 411 QKQWASTSGVNMDEEGVNLIIYQQSYQANAKVISTADQLFQTILN 455
Q S SGVN+DEE NL +QQ Y ANA+V+ TA+ +F ++N
Sbjct: 500 SNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALIN 544


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_4257  FLAGELLIN  45     1e-07    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 45.4 bits (107), Expect = 1e-07
Identities = 28/143 (19%), Positives = 53/143 (37%)

Query: 11 NNLQSLQNSTFDIAKLNEMMSTGSSILRPSDDPIGAVKVIGNERDMAATNQYIKNTESLS 70
+L S ++ E +S+G I DD G ++ Q +N
Sbjct: 12 LTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNANDGI 71

Query: 71 TSFSRSETYMSSMVELQGRMREITVSANNGSLSPEDRAAYAAEMNELLEAFADTLNAKDE 130
+ +E ++ + R+RE++V A NG+ S D + E+ + LE N
Sbjct: 72 SIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSNQTQF 131

Query: 131 SGNYLFSGNKTDTPPIGKDADGN 153
+G + S + +G +
Sbjct: 132 NGVKVLSQDNQMKIQVGANDGET 154


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_4260  FLAGELLIN  98     3e-25    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 97.8 bits (243), Expect = 3e-25
Identities = 74/269 (27%), Positives = 125/269 (46%), Gaps = 9/269 (3%)

Query: 4 VMTNNASNIAQNAVTKNNDLLSNAMERLSTGLRINSASDDAAGLQIATRLNANVTGMETA 63
+ TN+ S + QN + K+ LS+A+ERLS+GLRINSA DDAAG IA R +N+ G+ A
Sbjct: 4 INTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQA 63

Query: 64 NRNVSDATSMLQTADGALDELNNIASRQKELATQAANGVNSAEDIKALGAEYKELNAEAN 123
+RN +D S+ QT +GAL+E+NN R +EL+ QA NG NS D+K++ E ++ E +
Sbjct: 64 SRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEID 123

Query: 124 RIIDSTEYGGNKLFTALDTGVGFQIGASNTASEQLTVKTDVAAVKTLFAGE--------I 175
R+ + T++ G K+ + D + Q+GA++ + + ++ L +
Sbjct: 124 RVSNQTQFNGVKVLSQ-DNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATV 182

Query: 176 TDSATAKAQIDNVDAIIDAVGTQRSNLGASINRLGHTASNLTNVTENTKAAAGRIMDTDF 235
D ++ + D R ++ + TA + + A D
Sbjct: 183 GDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAE 242

Query: 236 AVETAAMTKNQLLVQAGTNILSSSNQNTG 264
+ K + + G
Sbjct: 243 NNTAVDLFKTTKSTAGTAEAKAIAGAIKG 271



Score = 65.8 bits (160), Expect = 2e-14
Identities = 50/266 (18%), Positives = 83/266 (31%)

Query: 5 MTNNASNIAQNAVTKNNDLLSNAMERLSTGLRINSASDDAAGLQIATRLNANVTGMETAN 64
NN + + + D G+ G +
Sbjct: 241 AENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVS 300

Query: 65 RNVSDATSMLQTADGALDELNNIASRQKELATQAANGVNSAEDIKALGAEYKELNAEANR 124
++ L AD N A+ + + VN ++
Sbjct: 301 TTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEA 360

Query: 125 IIDSTEYGGNKLFTALDTGVGFQIGASNTASEQLTVKTDVAAVKTLFAGEITDSATAKAQ 184
+ A T + KT + +
Sbjct: 361 NNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKSTANP 420

Query: 185 IDNVDAIIDAVGTQRSNLGASINRLGHTASNLTNVTENTKAAAGRIMDTDFAVETAAMTK 244
+ ++D+ + V RS+LGA NR +NL N N +A RI D D+A E + M+K
Sbjct: 421 LASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSNMSK 480

Query: 245 NQLLVQAGTNILSSSNQNTGLVMGLL 270
Q+L QAGT++L+ +NQ V+ LL
Sbjct: 481 AQILQQAGTSVLAQANQVPQNVLSLL 506


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
Shal_4264  FLGHOOKFLIK  47     8e-08    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 46.8 bits (110), Expect = 8e-08
Identities = 36/129 (27%), Positives = 56/129 (43%)

Query: 232 AVAQWGPVSVSQSAPLPQQSHEMLSPLREQLRFQIDQQIKQAELRLDPPELGKIELNVRL 291
Q P + P SHE L + + Q + AELRL P +LG++++++++
Sbjct: 218 HQTQPLPTVAAPVLSAPLGSHEWQQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKV 277

Query: 292 DGDRLHIQMHAANSSVRDALLMGLDRLRAELAMDHGGQIDVDISQGEPQQKQHQGRSQSA 351
D ++ IQM + + VR AL L LR +LA +IS +Q Q
Sbjct: 278 DDNQAQIQMVSPHQHVRAALEAALPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQ 337

Query: 352 IAAANLHEP 360
HEP
Sbjct: 338 SQRTANHEP 346


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
Shal_4268  OMPADOMAIN  36     1e-04    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 36.4 bits (84), Expect = 1e-04
Identities = 32/132 (24%), Positives = 50/132 (37%), Gaps = 24/132 (18%)

Query: 186 FTRGSADMKPYFEDLLMALGPLLSQVE---NKITISGHTDSSPYAGKKFTNWELSSKRAL 242
F A +KP + L L LS ++ + + G+TD G N LS +RA
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRI---GSDAYNQGLSERRAQ 279

Query: 243 LARRVLEYGGLRRHQVIQVTGM------ADQAPYVIKDPAA-----AANRRIEVLVMTSD 291
L G+ + I GM +K AA A +RR+E+ V
Sbjct: 280 SVVDYLISKGIPADK-ISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV---- 334

Query: 292 AENQLRQMMGQP 303
++ ++ QP
Sbjct: 335 --KGIKDVVTQP 344


61  Shal_4288  Shal_4297  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_4288   2      21        2.009160  UDP-N-acetylglucosamine pyrophosphorylase
Shal_4289   1      19        1.521598  DsrE family protein
Shal_4290   0      19        1.663955  hypothetical protein
Shal_4291   4      25        1.658815  ABC transporter-like protein
Shal_4292   5      28        1.374054  RND family efflux transporter MFP subunit
Shal_4293   6      33        0.325453  F0F1 ATP synthase subunit epsilon
Shal_4294   5      33       -0.096693  F0F1 ATP synthase subunit beta
Shal_4295   4      29       -0.941291  F0F1 ATP synthase subunit gamma
Shal_4296   2      27       -1.301481  F0F1 ATP synthase subunit alpha
Shal_4297   2      25       -2.575941  F0F1 ATP synthase subunit delta
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4290  ACRIFLAVINRP  30     0.048    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 29.8 bits (67), Expect = 0.048
Identities = 23/105 (21%), Positives = 44/105 (41%), Gaps = 17/105 (16%)

Query: 338 LLIGLASLFALLLTSFAFNMIGYLGEELLPRLDALSLNANMLLFAVLVTLVIAFVFSWIE 397
IGL++ A+L+ FA ++ + +E ++A + M L +L+T +AF+ +
Sbjct: 932 TTIGLSAKNAILIVEFAKDL---MEKEGKGVVEATLMAVRMRLRPILMT-SLAFILGVLP 987

Query: 398 LKAIDESQLQSCLQASGKGTGKQLSKGVSHGLIGLQLLFAVLTLV 442
L A G G V G++G + +L +
Sbjct: 988 L-------------AISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_4292  RTXTOXIND  37     1e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 36.7 bits (85), Expect = 1e-04
Identities = 57/389 (14%), Positives = 115/389 (29%), Gaps = 93/389 (23%)

Query: 49 GEFTQRIRGYGTLM-SQNQRLITASSLAVVDEIKLHPGAEVSANSILLILKNPQLDSALQ 107
G+ G L S + I ++V EI + G V +LL L ++
Sbjct: 78 GQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTL 137

Query: 108 QAETNVHNSRTTKRKL------------------------------------MLEQQREL 131
+ ++++ +R + + ++++Q
Sbjct: 138 KTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFST 197

Query: 132 LDQASSLAEFKADAEIANLQVEAESPLAAEGIISAMDMRLSRLRA-------------KQ 178
E D + A E + RL + +Q
Sbjct: 198 WQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQ 257

Query: 179 LNERLKLGQD---KLGKLREVHAEHLLIQTEII-------KQAESELATAQNRREQL--- 225
N+ ++ + +L ++ +E L + E + +L + L
Sbjct: 258 ENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLE 317

Query: 226 -----------VVRAGIEGVLQRLPIK-LGQSVNVGDEL-ALVGSLSPLIAEIKVPQMLI 272
V+RA + +Q+L + G V + L +V L V I
Sbjct: 318 LAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDI 377

Query: 273 QLVRHGAKADID------TRSGIVSGQVIRVDP---------VVSQGAVRVDIQLL--DE 315
+ G A I TR G + G+V ++ +V + ++ L
Sbjct: 378 GFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLSTGN 437

Query: 316 ISSDIRPMQMVDAVIFGKSRAQVNYVDKP 344
+ + V A I R+ ++Y+ P
Sbjct: 438 KNIPLSSGMAVTAEIKTGMRSVISYLLSP 466


62  Shal_0041  Shal_0051  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0041   0      16       -0.613494  carboxyl-terminal protease
Shal_0042  -1      13        0.565991  peptidase M23B
Shal_0043  -1      14        1.166398  phosphoglyceromutase
Shal_0044  -2      13        2.517046  rhodanese domain-containing protein
Shal_0045  -1      13        3.446769  preprotein translocase subunit SecB
Shal_0046  -2      19        3.901968  NAD(P)H-dependent glycerol-3-phosphate
Shal_0047  -1      22        3.738405  hypothetical protein
Shal_0048   0      22        3.313975  hypothetical protein
Shal_0049  -2      21        3.330484  hypothetical protein
Shal_0050   0      19        2.327030  ABC transporter-like protein
Shal_0051   1      20        2.050455  ABC-2 type transporter
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0041  TYPE3OMGPROT  29     0.040    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 29.1 bits (65), Expect = 0.040
Identities = 45/228 (19%), Positives = 77/228 (33%), Gaps = 42/228 (18%)

Query: 129 GDRVIKLNKQRVTAEKLESILNE------IKQHSLDNQPIELEMVRANSDVHYSVTISPS 182
DR I V A + +IL I+Q ++DNQ I RA++ V PS
Sbjct: 195 SDRTIHYRDDEVAAPGVATILQRVLSDATIQQVTVDNQRIPQAATRASAQA--RVEADPS 252

Query: 183 LIAINSVEAEILDNQIGYIRLSSFQENTTQELVKQLSAWQSE-KLNGIILDLRNNPGGLL 241
L AI +R S + Q L+ L + ++ I+D+ N L
Sbjct: 253 LNAI-------------IVRDSPERMPMYQRLIHALDKPSARIEVALSIVDI--NADQLT 297

Query: 242 DQAIT-VADIFLDKGRIVATEGRFFDANSDYYASPQTMATNIPLSVLINKGSASASEVLA 300
+ + I V + +N +A+N L L++
Sbjct: 298 ELGVDWRVGIRTGNNHQVVIKTTGDQSN---------IASNGALGSLVDARGLDYLLARV 348

Query: 301 AALQENGRAKLIGQTSFGKGTIQSLIPTLMDGNAIKLTIAKYTTPNGK 348
L+ G A+++ + + Q ++D + Y GK
Sbjct: 349 NLLENEGSAQVVSRPTL---LTQENAQAVIDHSET-----YYVKVTGK 388


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_0042  RTXTOXIND  40     1e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 40.2 bits (94), Expect = 1e-05
Identities = 31/226 (13%), Positives = 82/226 (36%), Gaps = 14/226 (6%)

Query: 13 AGFLSLPLTVNATDLERRQSELKSIQVQINKQQSAVKNTSKQRERLVSLLKDDEKAIAQA 72
G + L LT + + +++ +Q ++ + + + + S + +L L DE
Sbjct: 120 KGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNV 179

Query: 73 ARKVNTTKQALAKTDKKIAELDRKHVQLDKLKKAQQKSLSNQLASAYLAGNHDYTKMLLN 132
+ + +L K + + +L+ KK ++ + Y
Sbjct: 180 SEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRY------------- 226

Query: 133 QQSPASIERLLAYYQYLNNARIDSINQLQTTLKELDQIQADQQEQQNRLNQLVLEQQQQA 192
+ + L + L + + + + + + + + + +++L Q+ E
Sbjct: 227 ENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAK 286

Query: 193 KKLNQEQSQRQN-TLTQLQRTLNNSGAKLEQLQIEEASLKRVVEQA 237
++ +N L +L++T +N G +L E + V +A
Sbjct: 287 EEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRA 332


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0045  SECBCHAPRONE  207    1e-71    Bacterial protein-transport SecB chaperone protein ...
		>SECBCHAPRONE#Bacterial protein-transport SecB chaperone protein signature.

Length = 170

Score = 207 bits (528), Expect = 1e-71
Identities = 80/161 (49%), Positives = 113/161 (70%), Gaps = 4/161 (2%)

Query: 4 VANNEQ--QGPQFNIQRVYTKDISFETPNSPAVFQKEWNPEVKLDLDTRSNKLSDDVYEV 61
A + Q Q P IQR+Y KD+SFE PN P +FQ++W P++ DL T + ++ DD+YEV
Sbjct: 8 NAADTQATQQPVLQIQRIYVKDVSFEAPNLPHIFQQDWEPKLSFDLSTEAKQVGDDLYEV 67

Query: 62 VLSLTVTA--KNGEETAFLCEVQQAGIFAIAGLTEQQLAHSLGAYCPNVLFPYAREAIGS 119
L+++V ++ + AF+CEV+QAG+F I+GL E Q+AH L + CPN+LFPYARE + S
Sbjct: 68 CLNISVETTMESSGDVAFICEVKQAGVFTISGLEEMQMAHCLTSQCPNMLFPYARELVSS 127

Query: 120 LVSRGTFPQLNLAPVNFDALFAQYVQQRQAAAADAPAEEAN 160
LV+RGTFP LNL+PVNFDALF Y+Q+++ A E +
Sbjct: 128 LVNRGTFPALNLSPVNFDALFMDYLQRQEQAEQTTEEENKD 168


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0047  DNABINDNGFIS  35     1e-04    DNA-binding protein FIS signature.
		>DNABINDNGFIS#DNA-binding protein FIS signature.

Length = 98

Score = 34.6 bits (79), Expect = 1e-04
Identities = 27/104 (25%), Positives = 46/104 (44%), Gaps = 18/104 (17%)

Query: 249 NPGEAISINLLPGEDVRAKIQQALEASPKQSLRNALSQWLPKRLVEVLFDESLLNKALNQ 308
+ ++++ + +D Q+ L S KQ+L+N +Q + + + L +
Sbjct: 6 VNSDVLTVSTVNSQD--QVTQKPLRDSVKQALKNYFAQLNGQDVND-----------LYE 52

Query: 309 LVHAEREQLALDLESWTLVMNGTEGYRTAEVTLGGIDTDELSSK 352
LV AE EQ LD +VM T G +T + GI+ L K
Sbjct: 53 LVLAEVEQPLLD-----MVMQYTRGNQTRAALMMGINRGTLRKK 91


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_0049  RTXTOXIND  44     5e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.0 bits (104), Expect = 5e-07
Identities = 21/152 (13%), Positives = 48/152 (31%), Gaps = 17/152 (11%)

Query: 45 VVVALPVAQGSIVTKGTVLVQLDDTQQKSHVAKALADVAQATANYEKLLKGAREEEIAAA 104
+V + V +G V KG VL++L ++ K + + QA +
Sbjct: 106 IVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLE---------QTRYQIL 156

Query: 105 RAAVAGTRARLQESEANYRRIASMAKDNLASKADLDRALASRDADAANLESAQENLLELV 164
++ + + ++ L + + ++ E +
Sbjct: 157 SRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLD------ 210

Query: 165 SGSREEDIRFALANLQASEAVLLGEQKKLDDL 196
+ + LA + E + E+ +LDD
Sbjct: 211 --KKRAERLTVLARINRYENLSRVEKSRLDDF 240


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0050  ANTHRAXTOXNA  29     0.025    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 29.3 bits (65), Expect = 0.025
Identities = 14/34 (41%), Positives = 20/34 (58%)

Query: 112 EQTARIQELLNTYGLEQKADQLAGSMSGGQKQRL 145
E+ + LL YG+E+K D G++S QKQ L
Sbjct: 524 EKQKGVTNLLIKYGIERKPDSTKGTLSNWQKQML 557


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0051  ABC2TRNSPORT  45     2e-07    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 44.9 bits (106), Expect = 2e-07
Identities = 44/161 (27%), Positives = 75/161 (46%), Gaps = 13/161 (8%)

Query: 194 GVILTMTMIMFT----SAAIVRERERGNLEMLITTPIRPIELMLGKII----PYMFIGIV 245
G++ T M T AA R + E ++ T +R +++LG++ G
Sbjct: 72 GMVATSAMTAATFETIYAAFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAG 131

Query: 246 QVIIILGLGYSVFDVPINGSLLQLAGATLLFIMASLTLGLVISTIAKSQLQSMQMTIFVL 305
++ LGY+ + SLL L +A +LG+V++ +A S + V+
Sbjct: 132 IGVVAAALGYTQWL-----SLLYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVI 186

Query: 306 LPSILLSGFMFPYEGMPIEAQYIAEALPATHFMRLIRGVVL 346
P + LSG +FP + +PI Q A LP +H + LIR ++L
Sbjct: 187 TPILFLSGAVFPVDQLPIVFQTAARFLPLSHSIDLIRPIML 227


63  Shal_0129  Shal_0135  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0129  -1      16        1.601178  ABC transporter-like protein
Shal_0130  -1      14        0.464541  hypothetical protein
Shal_0131  -2      16        1.434551  hypothetical protein
Shal_0132  -1      17        2.246808  two component transcriptional regulator
Shal_0133  -2      17        2.112774  histidine kinase
Shal_0134  -1      16        2.570206  pseudouridine synthase Rlu family protein
Shal_0135  -1      16        2.742005  phosphopantetheine adenylyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_0129  PF05272  32     0.004    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.0 bits (72), Expect = 0.004
Identities = 11/27 (40%), Positives = 15/27 (55%)

Query: 26 CKAGEVLAVVGPSGGGKSTLLRMIAGL 52
CK + + G G GKSTL+ + GL
Sbjct: 593 CKFDYSVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_0132  HTHFIS  81     4e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 81.4 bits (201), Expect = 4e-20
Identities = 27/122 (22%), Positives = 54/122 (44%)

Query: 2 KILMVEDDATTIEYVVKGFVEQGHNIETATDGHQGLLLATSMKYDLIILDRMLPQLDGLK 61
IL+ +DDA + + G+++ ++ + DL++ D ++P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 LLAALRATGSQTPVLILSALSHVDERVKGLRAGGDDYMTKPFAFSELLVRAEKLMQRGES 121
LL ++ PVL++SA + +K G DY+ KPF +EL+ + + +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 LP 123
P
Sbjct: 125 RP 126


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_0133  PF06580  30     0.024    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.8 bits (67), Expect = 0.024
Identities = 18/99 (18%), Positives = 38/99 (38%), Gaps = 27/99 (27%)

Query: 356 LVDNAVKY----SGEGAEICIS----QNRNVISIQDNGPGIPEASREKVFERLVRLDPSR 407
LV+N +K+ +G +I + + +++ G + ++E
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE------------- 309

Query: 408 HHKGTGLGLSMVKAILSR---HNAKIALTDNQPGLKVII 443
TG GL V+ L A+I L++ Q + ++
Sbjct: 310 ---STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0135  LPSBIOSNTHSS  225    8e-79    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 225 bits (575), Expect = 8e-79
Identities = 79/155 (50%), Positives = 110/155 (70%)

Query: 5 AIYPGTFDPVTNGHADLIERAANLFEHVIIGIAANPSKQPRFTLAERVELLKTVTAHLDN 64
AIYPG+FDP+T GH D+IER LF+ V + + NP+KQP F++ ER+E + AHL N
Sbjct: 3 AIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLPN 62

Query: 65 VEVVGFSGLLVDFAKEQNASVLVRGLRAVSDFEYEFQLANMNRRLSPDLESVFLTPAEEN 124
+V F GL V++A+++ A ++RGLR +SDFE E Q+AN N+ L+ DLE+VFLT + E
Sbjct: 63 AQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTEY 122

Query: 125 SFISSTLVKEVALHGGDVSQFVHAEVANALTKKAN 159
SF+SS+LVKEVA GG+V FV + VA AL + +
Sbjct: 123 SFLSSSLVKEVARFGGNVEHFVPSHVAAALYDQFH 157


64  Shal_0239  Shal_0243  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0239  -1      15        2.571396  osmolarity sensor protein
Shal_0240  -1      13        2.427795  osmolarity response regulator
Shal_0241   1      13        2.068813  transcription elongation factor GreB
Shal_0242   0      12        2.171416  RNA-binding S1 domain-containing protein
Shal_0243   1      14        1.989503  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_0239  PF06580  43     1e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 43.3 bits (102), Expect = 1e-06
Identities = 27/186 (14%), Positives = 61/186 (32%), Gaps = 42/186 (22%)

Query: 251 IVNDIEDMDAIINQFISYIRQDQEGTRE----LEQINDLIQDVIQAESNREG-------S 299
I+ D ++ +R + L ++ +Q S +
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 300 IDSELVECPRIPMQGIAVKRVLSNLVENAYRYG------NGWVRINSQFNGQYVGFSVED 353
I+ +++ PM ++ LVEN ++G G + + + V VE+
Sbjct: 246 INPAIMDVQVPPM-------LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVEN 298

Query: 354 NGPGIDEEQIPKLFQPFTQGDTARGSVGSGLGLA-IIKRIVDRHQGKVILT-NRSEGGLH 411
G + +G GL + +R+ + + + + +G ++
Sbjct: 299 TGSLALKNTKE----------------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN 342

Query: 412 AQVWLP 417
A V +P
Sbjct: 343 AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_0240  HTHFIS  99     4e-26    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 99.1 bits (247), Expect = 4e-26
Identities = 39/130 (30%), Positives = 71/130 (54%), Gaps = 3/130 (2%)

Query: 6 SKILVVDDDMRLRALLERYLMEQGYQVRSAANAEQMDRLLERENFHLLVLDLMLPGEDGL 65
+ ILV DDD +R +L + L GY VR +NA + R + + L+V D+++P E+
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 66 SICRRLRQMGNPIPIVMLTAKGDEVDRIIGLELGADDYLPKPFNPRELLARIKAVM---R 122
+ R+++ +P+++++A+ + I E GA DYLPKPF+ EL+ I + +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 123 RQTPEVPGAP 132
R+ ++
Sbjct: 124 RRPSKLEDDS 133


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0242  ANTHRAXTOXNA  30     0.030    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 30.5 bits (68), Expect = 0.030
Identities = 30/93 (32%), Positives = 39/93 (41%), Gaps = 4/93 (4%)

Query: 352 TTIFPHAPQNQWDKSVRTLSNLVKMHKVELIAIGNGTASRETDKLAADLISQVKADLPTL 411
T I PQ +WDK V T ++L K V + I G R+ D L + K L L
Sbjct: 502 TEIKKQIPQKEWDKVVNTPNSLEKQKGVTNLLIKYGI-ERKPDSTKGTLSNWQKQMLDRL 560

Query: 412 T---KIMVSEAGASVYSASELASEEFPDIDVSI 441
K G V +E +EEFP+ D I
Sbjct: 561 NEAVKYTGYTGGDVVNHGTEQDNEEFPEKDNEI 593


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0243  56KDTSANTIGN  27     0.030    Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.

Length = 533

Score = 27.2 bits (60), Expect = 0.030
Identities = 17/73 (23%), Positives = 28/73 (38%), Gaps = 15/73 (20%)

Query: 59 QNQAKLRESVQQKQQGQSQQQEQSGQEKQKQAA---------NLNLKQL------LSQRA 103
Q Q +QQ QQ+Q+ Q+ A + + QL L + A
Sbjct: 329 QIHLNFVMPPQAQQQQGQGQQQQAQATAQEAVAAAAVRLLNGSDQIAQLYKDLVKLQRHA 388

Query: 104 PLKRSDIKLPGQQ 116
++++ KL QQ
Sbjct: 389 GIRKAMEKLAAQQ 401


65  Shal_0331  Shal_0342  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0331   1      18       -1.653131  flavocytochrome c
Shal_0332  -1      14       -2.150335  tetraheme cytochrome c
Shal_0333  -1      14       -1.506686  hypothetical protein
Shal_0334  -1      14       -1.144281  hypothetical protein
Shal_0335   1      14        0.917702  LysR family transcriptional regulator
Shal_0336   2      16        2.594107  regulatory protein TetR
Shal_0337   2      17        2.853308  histidine kinase
Shal_0338   1      15        2.756920  two component transcriptional regulator
Shal_0339   0      15        2.736781  hypothetical protein
Shal_0340  -1      15        3.362662  cation diffusion facilitator family transporter
Shal_0341  -2      16        2.791980  nitrogen metabolism transcriptional regulator
Shal_0342  -3      16        1.558227  signal transduction histidine kinase, nitrogen
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_0331  HTHFIS  31     0.013    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.0 bits (70), Expect = 0.013
Identities = 11/31 (35%), Positives = 16/31 (51%), Gaps = 1/31 (3%)

Query: 39 KWDQQVEVLIIGSGFAGLAAAIEATRKGAKD 69
K + VL++ S AI+A+ KGA D
Sbjct: 71 KARPDLPVLVM-SAQNTFMTAIKASEKGAYD 100


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_0336  HTHTETR  38     2e-05    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 37.7 bits (87), Expect = 2e-05
Identities = 26/165 (15%), Positives = 49/165 (29%), Gaps = 7/165 (4%)

Query: 3 NWQQRESYLTDIAERCLRGHKSFDLRRSHLVEASQISKGTIYNHFPTEADLVVAVATAHY 62
Q+ ++ D+A R + +A+ +++G IY HF ++DL +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 63 RKRLERAA-IDDDLYQDYLTRFL-----MHHCWGLRDDLLYDRFIISRVMPNSELLAQVT 116
E D L+ + + II +A V
Sbjct: 68 SNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQ 127

Query: 117 DDNRETFEQVYGEYVRWNNDLIKAIGVVEGFNRAELVANYLRGAL 161
R + Y + I+A + A +RG +
Sbjct: 128 QAQRNLCLESYDRIEQTLKHCIEAKMLPADLM-TRRAAIIMRGYI 171


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_0337  PF06580  38     5e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.3 bits (89), Expect = 5e-05
Identities = 23/125 (18%), Positives = 48/125 (38%), Gaps = 16/125 (12%)

Query: 282 EADQLEQMIAELLELSRVKLNANETKRNLELSETLSQVLDDADFEAQQ----QQKQLHID 337
+ + +M+ L EL R L R + L++ L+ V D+ + + Q
Sbjct: 189 DPTKAREMLTSLSELMRYSL-RYSNARQVSLADELTVV--DSYLQLASIQFEDRLQFENQ 245

Query: 338 IDESIVAPLYPR----PLCRAVENLLRNAI--RYANSQVSIQAIATASNIQIEIIDDGPG 391
I+ +I+ P L VEN +++ I ++ ++ + +E+ + G
Sbjct: 246 INPAIMDVQVPPMLVQTL---VENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSL 302

Query: 392 IADET 396
T
Sbjct: 303 ALKNT 307


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_0338  HTHFIS  98     9e-26    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 98.0 bits (244), Expect = 9e-26
Identities = 40/146 (27%), Positives = 71/146 (48%), Gaps = 2/146 (1%)

Query: 2 SRILLIDDDLGLSELLAQLLELEGFELTLAHDGQSGLDLAIEQQFDLILLDVMLPKLNGF 61
+ IL+ DDD + +L Q L G+++ + + + DL++ DV++P N F
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 EVLRALRS-KKQTPVLMLTARGDEIDRVVGLEIGADDYLPKPFNDRELVARI-RAIIRRT 119
++L ++ + PVL+++A+ + + E GA DYLPKPF+ EL+ I RA+
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 120 HSQANETPLSTRQYGDINLDPARQEV 145
+ S + A QE+
Sbjct: 124 RRPSKLEDDSQDGMPLVGRSAAMQEI 149


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_0341  HTHFIS  565    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 565 bits (1458), Expect = 0.0
Identities = 205/474 (43%), Positives = 297/474 (62%), Gaps = 12/474 (2%)

Query: 5 VWILDDDSSIRWVLEKALQSAKFSCASFAAAESLWQALEMTQPQVIVSDIRMPGTDGLTL 64
+ + DDD++IR VL +AL A + + A +LW+ + ++V+D+ MP + L
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 65 LERLQNHYPHIPVIIMTAHSDLDSAVSAYQAGAFEYLPKPFDIDEAISLVDRALTHAKEQ 124
L R++ P +PV++M+A + +A+ A + GA++YLPKPFD+ E I ++ RAL K +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRR 125

Query: 125 SSSITVEEPLVATPEIIGEAPAMQEVFRAIGRLSRSSISVLINGQSGTGKELVASALHKH 184
S +E+ ++G + AMQE++R + RL ++ ++++I G+SGTGKELVA ALH +
Sbjct: 126 PS--KLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDY 183

Query: 185 SPRKGKPFIAINMAAIPKDLIESELFGHEKGAFTGAGSVRQGRFEQANGGTLFLDEIGDM 244
R+ PF+AINMAAIP+DLIESELFGHEKGAFTGA + GRFEQA GGTLFLDEIGDM
Sbjct: 184 GKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDM 243

Query: 245 PLDVQTRLLRVLADGQFYRVGGHSPVQVDVRIIAATHQDLEQLVYQGGFREDLFHRLNVI 304
P+D QTRLLRVL G++ VGG +P++ DVRI+AAT++DL+Q + QG FREDL++RLNV+
Sbjct: 244 PMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVV 303

Query: 305 RVHLPPLSQRREDIPQLARHFLMIAAKEIAVEPKVLTKEAATKLSQLPWPGNVRQLENTC 364
+ LPPL R EDIP L RHF+ A KE ++ K +EA + PWPGNVR+LEN
Sbjct: 304 PLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALELMKAHPWPGNVRELENLV 362

Query: 365 RWLTVMASGQEILPLDLPPELLQEPKLNNAQTSDSDDWQGALKLFIDQRLSD-------- 416
R LT + I + EL E + + + + ++ +++ +
Sbjct: 363 RRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDA 422

Query: 417 -GDSDLLTEVQPAFERILLQTALKHTNGHKQEAAKRLGWGRNTLTRKLKELEMD 469
S L V E L+ AL T G++ +AA LG RNTL +K++EL +
Sbjct: 423 LPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_0342  PF06580  41     3e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.4 bits (97), Expect = 3e-06
Identities = 40/257 (15%), Positives = 86/257 (33%), Gaps = 58/257 (22%)

Query: 98 KYLALIELRQVDQQRRIHQQLTQDAQQQAAQYLVRNLAHEIKNPLGGLRGAAQLLSRELK 157
+ + ++DQ + + Q+AQ A + + H + N L +R
Sbjct: 139 HFFKNYKQAEIDQWKM--ASMAQEAQLMALKAQIN--PHFMFNALNNIRA---------- 184

Query: 158 EPELHEFTDLIIEQADRLRNLVDRL-------LGPQRPTQHSLYNIHEVIQKVVQLVNVT 210
LI+E + R ++ L L Q SL + V+ +QL ++
Sbjct: 185 ---------LILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQ 235

Query: 211 LPSNIELTQDYDPSIPDIEMDPDQLQQAILNIVQNAIQ-ALEPKGGNIRLKTRTQHQITI 269
++ +P+I D+++ P +Q +V+N I+ + +I +
Sbjct: 236 FEDRLQFENQINPAIMDVQVPPMLVQT----LVENGIKHGIAQL--------PQGGKILL 283

Query: 270 GTKRHKLVLMLSIIDDGPGIQPELMDTLFYPMVTGREQGSGLGLSIAHNFARLHGG---R 326
+ + L + + G ++ +G GL ++ G +
Sbjct: 284 KGTKDNGTVTLEVENTGSLALKN------------TKESTGTGLQNVRERLQMLYGTEAQ 331

Query: 327 IDCDSTVGHTEFTITLP 343
I G + +P
Sbjct: 332 IKLSEKQGKVNAMVLIP 348


66  Shal_0456  Shal_0469  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0456  -1      15        1.403109  UDP-N-acetylmuramate--L-alanine ligase
Shal_0457   0      15        0.607084  polypeptide-transport-associated
Shal_0458   2      20        0.374758  cell division protein FtsA
Shal_0459   2      22       -0.023201  cell division protein FtsZ
Shal_0460   1      21       -1.141566  UDP-3-O-[3-hydroxymyristoyl] N-acetylglucosamine
Shal_0461   1      21       -1.327399  hypothetical protein
Shal_0462   2      19       -1.629399  peptidase M23B
Shal_0463   0      19       -1.277856  preprotein translocase subunit SecA
Shal_0464   0      18       -0.600752  **mutator MutT protein
Shal_0465   0      20        0.973184  hypothetical protein
Shal_0466   0      19        0.877634  hypothetical protein
Shal_0467   0      18        1.377580  dephospho-CoA kinase
Shal_0468   0      17        1.649343  prepilin peptidase
Shal_0469   0      16        1.551364  type II secretion system protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0456  ACETATEKNASE  32     0.006    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 31.7 bits (72), Expect = 0.006
Identities = 29/136 (21%), Positives = 52/136 (38%), Gaps = 21/136 (15%)

Query: 261 QALDFKQTGYSCEFTVKRAGLADLRLSVNLPGEHNVLN--------ALAAIAVATEDDIE 312
Q ++ K + +R G+ D L+ N GE + A+ + A +
Sbjct: 16 QLIESKDGNVLAKGLAERIGINDSLLTHNANGEKIKIKKDMKDHKDAIKLVLDALVN--S 73

Query: 313 DAAIIKALAEFQGIGRRFQQIGEFATSK--------GEIKLVDDYG--HHPSEVAATIKA 362
D +IK ++E +G R GE+ TS I + H+P+ + IKA
Sbjct: 74 DYGVIKDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDCIELAPLHNPANIEG-IKA 132

Query: 363 ARLGWPERRLVMIYQP 378
P+ +V ++
Sbjct: 133 CTQIMPDVPMVAVFDT 148


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0458  SHAPEPROTEIN  61     1e-12    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 61.3 bits (149), Expect = 1e-12
Identities = 49/222 (22%), Positives = 90/222 (40%), Gaps = 20/222 (9%)

Query: 150 SGMRMEAKVHIVTC----ANDMAKNITK-SVERCGLKVDDLVFSAIASADSVLTDDEKDL 204
S M ++ C A + + + S + G + L+ +A+A +
Sbjct: 100 SNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEAT 159

Query: 205 GVCLVDIGGGTTDIAVYTNGALRHCAVVPVAGNQVTNDIAKIFR------TPLSHAEQIK 258
G +VDIGGGTT++AV + + + + V + G++ I R + AE+IK
Sbjct: 160 GSMVVDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIK 219

Query: 259 VQHASARSAMVSREDSIEVPS---VGGRPSR-SMSRHTLAEVVEPRYQELFELILKQLRD 314
+ SA IEV G P +++ + + E ++ + ++ L
Sbjct: 220 HEIGSAYPG--DEVREIEVRGRNLAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQ 277

Query: 315 SGLE---DQVAAGIVITGGTASIEGAVDIAEATFGMPVRMAQ 353
E D G+V+TGG A + + G+PV +A+
Sbjct: 278 CPPELASDISERGMVLTGGGALLRNLDRLLMEETGIPVVVAE 319


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_0462  PF06580  32     0.002    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.1 bits (73), Expect = 0.002
Identities = 15/65 (23%), Positives = 28/65 (43%)

Query: 6 FIQGRNGATRWQPGKRWLLLPILLIAAGTGLYQHNAKQLTQQQATVDGERMAREEQKDEL 65
FI + A + +++ + LY +QA +D +MA Q+ +L
Sbjct: 104 FINTKPVAFTLPLALSIIFNVVVVTFMWSLLYFGWHFFKNYKQAEIDQWKMASMAQEAQL 163

Query: 66 IALKS 70
+ALK+
Sbjct: 164 MALKA 168


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_0463  SECA  1313   0.0      SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 1313 bits (3400), Expect = 0.0
Identities = 646/906 (71%), Positives = 749/906 (82%), Gaps = 6/906 (0%)

Query: 1 MFGKLLTKVFGSRNDRTLKAFGKVVTKINALEAEYEKLSDEELKAKTLHFRERLDGGETL 60
M KLLTKVFGSRNDRTL+ KVV INA+E E EKLSDEELK KT FR RL+ GE L
Sbjct: 1 MLIKLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGEVL 60

Query: 61 EGVLPEAFATVREASKRVFEMRHFDVQLIGGMILDSNRIAEMRTGEGKTLTATLPAYLNG 120
E ++PEAFA VREASKRVF MRHFDVQL+GGM+L+ IAEMRTGEGKTLTATLPAYLN
Sbjct: 61 ENLIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNA 120

Query: 121 LTGKGVHVITVNDYLAGRDAENNRPLFEFLGLTVGINVAGLGQVEKKAAYDADITYGTNN 180
LTGKGVHV+TVNDYLA RDAENNRPLFEFLGLTVGIN+ G+ K+ AY ADITYGTNN
Sbjct: 121 LTGKGVHVVTVNDYLAQRDAENNRPLFEFLGLTVGINLPGMPAPAKREAYAADITYGTNN 180

Query: 181 EFGFDYLRDNMAFSPSERVQRPLHYALIDEVDSILIDEARTPLIISGAAEDSSELYIKIN 240
E+GFDYLRDNMAFSP ERVQR LHYAL+DEVDSILIDEARTPLIISG AEDSSE+Y ++N
Sbjct: 181 EYGFDYLRDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYKRVN 240

Query: 241 TLIPNLIAQDKEDTEDEIGVGDYSVDEKSKQVHMTERGQEKVEVLLTERGMLAEGDSLYS 300
+IP+LI Q+KED+E G G +SVDEKS+QV++TERG +E LL + G++ EG+SLYS
Sbjct: 241 KIIPHLIRQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYS 300

Query: 301 AANISLLHHVNAALRAHTLFEKDVDYIVQDNEVVIVDEHTGRTMPGRRWSEGLHQAVEAK 360
ANI L+HHV AALRAH LF +DVDYIV+D EV+IVDEHTGRTM GRRWS+GLHQAVEAK
Sbjct: 301 PANIMLMHHVTAALRAHALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAK 360

Query: 361 EGVHIQNENQTLASITFQNFFRQYEKLAGMTGTADTEAFEFQHIYGLDTVVVPTNRPMVR 420
EGV IQNENQTLASITFQN+FR YEKLAGMTGTADTEAFEF IY LDTVVVPTNRPM+R
Sbjct: 361 EGVQIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMIR 420

Query: 421 KDHADLVYLTAEEKYAAIVKDIVGCRERGQPVLVGTVSIEQSELLHSLLKKEKIPHEILN 480
KD DLVY+T EK AI++DI +GQPVLVGT+SIE+SEL+ + L K I H +LN
Sbjct: 421 KDLPDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLN 480

Query: 481 AKFHEREADIVAQAGRTGAVTVATNMAGRGTDIVLGGNWNMEIEVLANPTTEQKAKIKAD 540
AKFH EA IVAQAG AVT+ATNMAGRGTDIVLGG+W E+ L NPT EQ KIKAD
Sbjct: 481 AKFHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKAD 540

Query: 541 WQIRHDEVVGAGGLHILGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDSLMRIFAS 600
WQ+RHD V+ AGGLHI+GTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMED+LMRIFAS
Sbjct: 541 WQVRHDAVLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMRIFAS 600

Query: 601 DRVSSMMKKLGMEEGEAIEHPWVSRAIENAQRKVEARNFDIRKQLLEFDDVANDQRQVVY 660
DRVS MM+KLGM+ GEAIEHPWV++AI NAQRKVE+RNFDIRKQLLE+DDVANDQR+ +Y
Sbjct: 601 DRVSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVANDQRRAIY 660

Query: 661 AQRNELMDADSIQDTITNIQEDVVNGLVDQYIPRQSVEELWDIEGLEQRLKQEYAMSLPI 720
+QRNEL+D + +TI +I+EDV +D YIP QS+EE+WDI GL++RLK ++ + LPI
Sbjct: 661 SQRNELLDVSDVSETINSIREDVFKATIDAYIPPQSLEEMWDIPGLQERLKNDFDLDLPI 720

Query: 721 QEWLDKEDDLHEETLRERIVDTWVNAYKAKEEMVGEQVLRQFEKAVMLQTLDGLWKEHLS 780
EWLDKE +LHEETLRERI+ + Y+ KEE+VG +++R FEK VMLQTLD LWKEHL+
Sbjct: 721 AEWLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLA 780

Query: 781 AMDHLRQGIHLRGYAQKNPKQEYKRESFELFQQMLESLKHDVISILSKVQVQAQSDVEEM 840
AMD+LRQGIHLRGYAQK+PKQEYKRESF +F MLESLK++VIS LSKVQV+ +VEE+
Sbjct: 781 AMDYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSKVQVRMPEEVEEL 840

Query: 841 EERRRQEDAKIRRDYQHAEAEALVGAEESAALAATQPTVRDGEKIGRNDPCPCGSGKKYK 900
E++RR E ++ A+ + L ++ +A AA K+GRNDPCPCGSGKKYK
Sbjct: 841 EQQRRMEAERL------AQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCPCGSGKKYK 894

Query: 901 QCHGKL 906
QCHG+L
Sbjct: 895 QCHGRL 900


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0468  PREPILNPTASE  332    e-116    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 332 bits (852), Expect = e-116
Identities = 161/303 (53%), Positives = 203/303 (66%), Gaps = 14/303 (4%)

Query: 1 MTELISVFGQNPWLFIALSFVFAAVIGSFLNVVIHRMPVMMKRDWQQECNHYLSEYKKEV 60
M L+ + PWL+ +L F+F+ +IGSFLNVVIHR+P+M++R+WQ E Y +
Sbjct: 1 MALLLELAHGLPWLYFSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPD---- 56

Query: 61 FDSNKKQLEAPIDAFPEKYNLVVPGSACPKCKANIKAVHNLPILGWLMLKGKCATCSTKI 120
YNL+VP S CP C I A+ N+P+L WL L+G+C C I
Sbjct: 57 ----------DEGVDEPPYNLMVPRSCCPHCNHPITALENIPLLSWLWLRGRCRGCQAPI 106

Query: 121 SPRYPIVELITGALVAFLAWHFGPTLEFALTSLLTFCLVALTGIDLDEMLLPDQITLPLL 180
S RYP+VEL+T L +A P LLT+ LVALT IDLD+MLLPDQ+TLPLL
Sbjct: 107 SARYPLVELLTALLSVAVAMTLAPGWGTLAALLLTWVLVALTFIDLDKMLLPDQLTLPLL 166

Query: 181 WLGLIINLSGTFATTSDALIGAAAGYLSLWSVFWLFKLLTGKDGMGYGDFKLLAVFGAWF 240
W GL+ NL G F + DA+IGA AGYL LWS++W FKLLTGK+GMGYGDFKLLA GAW
Sbjct: 167 WGGLLFNLLGGFVSLGDAVIGAMAGYLVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWL 226

Query: 241 GWQVLPLVILLSSLVGAVVGIALIVFKKLNHGNPIPFGPYIAAAGWIAMIWGESITNWYL 300
GWQ LP+V+LLSSLVGA +GI LI+ + + PIPFGPY+A AGWIA++WG+SIT WYL
Sbjct: 227 GWQALPIVLLLSSLVGAFMGIGLILLRNHHQSKPIPFGPYLAIAGWIALLWGDSITRWYL 286

Query: 301 STL 303
+
Sbjct: 287 TNF 289


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0469  BCTERIALGSPF  399    e-140    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 399 bits (1028), Expect = e-140
Identities = 118/405 (29%), Positives = 214/405 (52%), Gaps = 10/405 (2%)

Query: 24 TFQWKGTNRDGAKTSGELRGVSSAEIRAQLKAQGVVPKNVRKKSAP---------LFKSE 74
+ ++ + G K G S+ + R L+ +G+VP +V + + +
Sbjct: 3 QYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLRRK 62

Query: 75 KKIKPMDIAMMTRQIATMLAAGVPIVTTLELLGKGHEKAKMRELLATIVNDVQSGIPLSD 134
++ D+A++TRQ+AT++AA +P+ L+ + K EK + +L+A + + V G L+D
Sbjct: 63 IRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD 122

Query: 135 SLRPHRKYFDDLYVDLVAAGEHSGSLDAVFDRIATYREKSEALKSKIKKAMFYPAAVVIV 194
+++ F+ LY +VAAGE SG LDAV +R+A Y E+ + ++S+I++AM YP + +V
Sbjct: 123 AMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLTVV 182

Query: 195 AIMVTALLLLFVVPQFEDIFNGFGAELPAFTQFIISISRWLQASWYIFVIAIIVGIWSFR 254
AI V ++LL VVP+ + F LP T+ ++ +S ++ ++A++ G +F
Sbjct: 183 AIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMAF- 241

Query: 255 RAHLNSQKFRDRVDEFVLKIPAIGDILHKGAMARFARTLATTFAAGVPLIDGLESAAGAS 314
R L +K R +L +P IG I AR+ARTL+ A+ VPL+ + +
Sbjct: 242 RVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDVM 301

Query: 315 GNAVYRKAILKIRTEVMSGMQMNVAMRTTGLFPDMLIQMVMIGEESGSLDNMLNKVANIY 374
N R + V G+ ++ A+ T LFP M+ M+ GE SG LD+ML + A+
Sbjct: 302 SNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADNQ 361

Query: 375 EMQVDDAVDGLSSLIEPVMMVVIGTVVGGLIVGMYLPIFEMGNVV 419
+ + + L EP+++V + VV +++ + PI ++ ++
Sbjct: 362 DREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


67  Shal_0501  Shal_0507  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0501  -1      15       2.271456   hypothetical protein
Shal_0502  -2      14       2.191267   hypothetical protein
Shal_0503  -1      19       2.245456   hypothetical protein
Shal_0504  0       23       2.291029   LysR family transcriptional regulator
Shal_0505  0       18       -0.784025  secretion protein HlyD family protein
Shal_0506  1       18       -3.292597  EmrB/QacA family drug resistance transporter
Shal_0507  1       25       -5.804529  N-acetyltransferase GCN5
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_0501  PF01206  92     8e-29    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 92.1 bits (229), Expect = 8e-29
Identities = 18/71 (25%), Positives = 37/71 (52%)

Query: 8 DYNLEIYGEPCPYPAVATLEAMQSLKPGEVLEVITDCSQSINNIPNDAKNHGYEVLDISQ 67
D +L+ G CP P + + + ++ GEVL V+ S+ + + +K G+E+L+ +
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 68 QGVMLRYLLKK 78
+ + LK+
Sbjct: 65 EDGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_0505  RTXTOXIND  102    4e-26    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 102 bits (255), Expect = 4e-26
Identities = 53/411 (12%), Positives = 123/411 (29%), Gaps = 81/411 (19%)

Query: 2 SKTKLFFGIPVLIIALLTGGYYWWTQVRNFESTDNAYVESDISH-ISVKVPGYVTATFVT 60
S+ ++ ++ QV + + S S I V V
Sbjct: 54 SRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 61 DNQHVKKGQLLAQLEDNQFRAKVAQNQAVLASAKAKLQTLNAQIDLQQALITQAQASVVA 120
+ + V+KG +L +L A + Q+ L A+ + +
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 121 AEAERVRSQQQLTRATKLKVKNYSSQDAVDEAQAAFESATAHVDEIHAV----------- 169
+ V ++ L + +K + + Q+ + + + A + A
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 170 --------------------LVAKQRELKVFNAQLVEADSSVEQAKAGLQLAQI------ 203
++ ++ + +L S +EQ ++ + A+
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 204 ---------QLADTQ---------------------ITAPFTGIIGKRGAQ-QGQYVQPG 232
+L T I AP + + + +G V
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 233 QAIYSLVPDEAV-WITANFKETQIAKMHAGQKVTIELDAFSGKEFSGVIDSLSPASGAKF 291
+ + +VP++ +TA + I ++ GQ I+++AF + G + K
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRY-GYLV-------GKV 405

Query: 292 SLLPAENATGNFTKIVQRIPVRIRLDSNAA-EKLEQGRIVPGLSALVKVDT 341
+ + +V V I ++ N + + G++ ++ T
Sbjct: 406 KNINLDAIEDQRLGLVFN--VIISIEENCLSTGNKNIPLSSGMAVTAEIKT 454


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_0506  TCRTETB  107    4e-27    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 107 bits (268), Expect = 4e-27
Identities = 88/401 (21%), Positives = 168/401 (41%), Gaps = 19/401 (4%)

Query: 37 AFMAILDIQITNASMKEIQGGLGATLEEGSWIATAYLVAEMIAIPLSGWLSKGLDIRRYM 96
+F ++L+ + N S+ +I +W+ TA+++ I + G LS L I+R +
Sbjct: 23 SFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLL 82

Query: 97 LWNSAIFIFASLLCSIAWNLES-MIAFRAMQGFFGGALIPMAFRLILEYLPDSKRAVGMA 155
L+ I F S++ + + S +I R +QG A + ++ Y+P R
Sbjct: 83 LFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFG 142

Query: 156 LFGVTATFAPSIGPTLGGWLTEQFSWHYLFYINVPPGIVVMSMLAYGLVKKPINWPELKN 215
L G +GP +GG + W YL +P ++ L+KK + +
Sbjct: 143 LIGSIVAMGEGVGPAIGGMIAHYIHWSYLL--LIPMITIITVPFLMKLLKKEVRIKG--H 198

Query: 216 VDVSSIITMALGMGCLEVVLEEGNRKDWFGSDFIRNLAIVAAVNIALFVYLQLKSKAPLV 275
D+ II M++G+ + F + + + IV+ ++ +FV K P V
Sbjct: 199 FDIKGIILMSVGIVFFML----------FTTSYSISFLIVSVLSFLIFVKHIRKVTDPFV 248

Query: 276 NLKLLANRDFSISTIAYFLLGLALFSSIYMVPLYLSQIQDYNSLEIGEVLMWLGFPQLLI 335
+ L N F I + ++ + + MVP + + ++ EIG V+++ G ++I
Sbjct: 249 DPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVII 308

Query: 336 L-PFMPMLMQRFDNRYLAAFGFFMFGVSYYMNSHMTADFAGPQLIASMVVRAIG-QPFIM 393
+L+ R Y+ G VS+ S + + ++V +G F
Sbjct: 309 FGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFL--LETTSWFMTIIIVFVLGGLSFTK 366

Query: 394 VPIGMLATARLQKHENASASTVLNVVRNLGGAVGIAMVSTL 434
I + ++ L++ E + ++LN L GIA+V L
Sbjct: 367 TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGL 407


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0507  AUTOINDCRSYN  33     3e-04    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 32.5 bits (74), Expect = 3e-04
Identities = 14/63 (22%), Positives = 25/63 (39%), Gaps = 12/63 (19%)

Query: 5 SLSFSELSLNELYDLLKLRVDVFV--------VEQNCAYPELDDKDRHSQTQHLLGLNEQ 56
++ + LS + +L LR + F E D D ++ T +L G+ +
Sbjct: 6 DVNHTLLSETKSGELFTLRKETFKDRLNWAVQCTDGM---EFDQYD-NNNTTYLFGIKDN 61

Query: 57 GVI 59
VI
Sbjct: 62 TVI 64


68  Shal_0632  Shal_0639  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0632  1       17       -0.272266  major facilitator transporter
Shal_0633  1       19       -0.775157  ssDNA-binding protein
Shal_0634  1       17       -0.957946  putative sigma-54 specific transcriptional
Shal_0635  0       16       1.503364   aspartate/ornithine carbamoyltransferase family
Shal_0636  -1      18       2.963971   diaminopropionate ammonia-lyase
Shal_0637  -1      17       2.882228   peptidase
Shal_0638  -2      16       3.150139   phenylhydantoinase
Shal_0639  -1      17       3.406154   carbamate kinase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_0632  TCRTETA  84     1e-19    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 83.7 bits (207), Expect = 1e-19
Identities = 76/370 (20%), Positives = 133/370 (35%), Gaps = 27/370 (7%)

Query: 8 KTEKKVAFSLASVFGLRMMGLFMIMPV--FALYGQHLEGFSPLWVGIAIGAYGLTQAILQ 65
K + + L++V L +G+ +IMPV L GI + Y L Q
Sbjct: 2 KPNRPLIVILSTVA-LDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 66 IPMGILSDKFGRKPIILTGLLIFALGSLVAANAETIYGVVAGRALQGM-GAIAAAVLALA 124
+G LSD+FGR+P++L L A+ + A A ++ + GR + G+ GA A A
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYI 120

Query: 125 ADLTRDEQRTKVMAIIGMCIGFSFALSLLVGPIVAQHLGLSGLFFMTAGLALLGMLIVQL 184
AD+T ++R + + C GF ++G ++ FF A L L L
Sbjct: 121 ADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCF 179

Query: 185 LVPTPVGQAPKGDTVAAPEKLKRMLLDPQLFRLDAGIFILHLVLTAVFVALPLDLVDAGL 244
L+P + A L FR G+ ++ ++ F+ + V A L
Sbjct: 180 LLPESHKGERRPLRREALNPLAS-------FRWARGMTVVAALMAVFFIMQLVGQVPAAL 232

Query: 245 ------------PKEEHWMLYFPAFMGAFFLMVPLIIIGVKRNNTKGMFQVALIIMMVAL 292
L + + + + R + + +I
Sbjct: 233 WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPV-AARLGERRALMLGMIADGTGY 291

Query: 293 GSMAMFANNLVVLSIAVVLFFTGFNYLEASLPSLIAKFCPVGDKGSAMGVYSTSQFLGAF 352
+A + I V+L G +L +++++ +G G + L +
Sbjct: 292 ILLAFATRGWMAFPIMVLLASGGIG--MPALQAMLSRQVDEERQGQLQGSLAALTSLTSI 349

Query: 353 CGGLLGGGAY 362
G LL Y
Sbjct: 350 VGPLLFTAIY 359


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_0633  PF05616  35     1e-04    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 35.5 bits (81), Expect = 1e-04
Identities = 19/62 (30%), Positives = 27/62 (43%), Gaps = 4/62 (6%)

Query: 157 PKPQQAPAQQGGYQQQPQQQAPQQGGYAPKPQSAPAPQQAPQQRPAPQPQQNFTPDLDDG 216
P+P P G + P Q + A P + PAP + P RP P+P + PD +
Sbjct: 311 PRPDLTP----GSAEAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPD 366

Query: 217 WD 218
D
Sbjct: 367 TD 368


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_0634  HTHFIS  381    e-129    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 381 bits (979), Expect = e-129
Identities = 126/349 (36%), Positives = 187/349 (53%), Gaps = 34/349 (9%)

Query: 237 TPASPFSNIIGESNKMLQLKTLIARVAKSPSSILISGESGTGKEVFAQAIHKYSDRSEQN 296
+ ++G S M ++ ++AR+ ++ +++I+GESGTGKE+ A+A+H Y R
Sbjct: 131 DDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGP 190

Query: 297 FVAINCAAIPEQLLESELFGYVKGAFTGALTKGKVGLIQAANNGTLFLDEIGDMSMTLQS 356
FVAIN AAIP L+ESELFG+ KGAFTGA T+ G + A GTLFLDEIGDM M Q+
Sbjct: 191 FVAINMAAIPRDLIESELFGHEKGAFTGAQTRST-GRFEQAEGGTLFLDEIGDMPMDAQT 249

Query: 357 KLLRVLEEREVMPLGSNASTPVNIRIISATNRNFAEMIAANEFREDLYYRLNVIPLYLPA 416
+LLRVL++ E +G ++RI++ATN++ + I FREDLYYRLNV+PL LP
Sbjct: 250 RLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPP 309

Query: 417 LREREGDIELLVYYFLDRHSQAIGTCYPGITEEVINSLNAYHWPGNIRELSNLVEYLVNI 476
LR+R DI LV +F+ + + G +E + + A+ WPGN+REL NLV L
Sbjct: 310 LRDRAEDIPDLVRHFVQQAEK-EGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLT-A 367

Query: 477 VPSGDQIDIDLLPPYF-DTPQESCTPMVSINDDNVS------------------------ 511
+ D I +++ +S + ++S
Sbjct: 368 LYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSG 427

Query: 512 -----LEEMERIKIEDSINR-LGNRKLVAKELGIGVATLYRKLKKYRLS 554
L EME I ++ GN+ A LG+ TL +K+++ +S
Sbjct: 428 LYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_0638  UREASE  42     5e-06    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 41.6 bits (98), Expect = 5e-06
Identities = 22/70 (31%), Positives = 31/70 (44%), Gaps = 9/70 (12%)

Query: 6 TLIINGTLVDAQKQSQQQLLIIDGKIAAV---------DKVITDYPVDTEVIDASGLYVM 56
T+I N ++D + + + DG+IAA+ V TEVI G V
Sbjct: 70 TVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIVT 129

Query: 57 PGGIDVHTHF 66
GG+D H HF
Sbjct: 130 AGGMDSHIHF 139


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0639  CARBMTKINASE  362    e-128    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 362 bits (931), Expect = e-128
Identities = 137/312 (43%), Positives = 181/312 (58%), Gaps = 14/312 (4%)

Query: 1 MKPIAVVALGGNALLRRGDAPSFQNQLANIKIAAVAIAE-IAKEYRVAIVHGNGPQVGLI 59
M V+ALGGNAL +RG S++ + N++ A IAE IA+ Y V I HGNGPQVG +
Sbjct: 1 MGKRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSL 60

Query: 60 SLQNL---SYKDVAPYPLDVLGAESEGMIGYMLAQELQNVMPHS----EVSTILTRIEVD 112
L + + P+DV GA S+G IGYM+ Q L+N + +V TI+T+ VD
Sbjct: 61 LLHMDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVD 120

Query: 113 SQDPSMLDPTKFVGPVYNEDEAGLLAEANNWIMKRD-GDYVRRVVPSPKPMSILDRSSIN 171
DP+ +PTK VGP Y+E+ A LA WI+K D G RRVVPSP P ++ +I
Sbjct: 121 KNDPAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIK 180

Query: 172 TLLGSGHIVICCGGGGIPVCKSDHGYIGVEGVIDKDLSATLLAKQLKADKLLILTDADAV 231
L+ G IVI GGGG+PV D GVE VIDKDL+ LA+++ AD +ILTD +
Sbjct: 181 KLVERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGA 240

Query: 232 YLDWGTENQRALHRTTPEELSQY----SFAAGSMGPKVEAVTDFALSGGK-AYIGALEHG 286
L +GTE ++ L EEL +Y F AGSMGPKV A F GG+ A I LE
Sbjct: 241 ALYYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAIIAHLEKA 300

Query: 287 LDVLAERSGTCV 298
++ L ++GT V
Sbjct: 301 VEALEGKTGTQV 312


69  Shal_0783  Shal_0790  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0783  -2      13       1.138135   hypothetical protein
Shal_0784  -2      14       2.215837   fructose-1,6-bisphosphatase
Shal_0785  -2      14       2.105868   peptidase S9B dipeptidylpeptidase IV subunit
Shal_0786  -2      12       1.941485   two-component response regulator
Shal_0787  0       18       2.446235   hypothetical protein
Shal_0788  0       19       3.144181   aspartate kinase III
Shal_0789  1       18       2.037101   succinylglutamate desuccinylase/aspartoacylase
Shal_0790  -1      19       1.441349   DNA-binding transcriptional regulator AsnC
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
Shal_0783  BACINVASINB  28     0.030    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 27.8 bits (61), Expect = 0.030
Identities = 24/100 (24%), Positives = 43/100 (43%), Gaps = 14/100 (14%)

Query: 66 NKQASLKAQNCSCCNKSFTTRMKIAELAGEFQSYFLCVKCEKQISKRAESTFLLNQLLSP 125
K LKA S + T +K A +++S + T LL +L++
Sbjct: 47 TKAGDLKAGTKSGESAINTVGLKPPTDAAR-----------EKLSSEGQLTLLLGKLMT- 94

Query: 126 DFIQKNSSFSDLESMVESSGIQLKTQDDLKLEVWDEFITA 165
+ + S S LES + +++Q ++ ++V EF TA
Sbjct: 95 --LLGDVSLSQLESRLAVWQAMIESQKEMGIQVSKEFQTA 132


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_0786  HTHFIS  83     1e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 83.3 bits (206), Expect = 1e-20
Identities = 30/124 (24%), Positives = 56/124 (45%), Gaps = 1/124 (0%)

Query: 1 MQNPHILIVEDEAVTRNTLRSIFEAEGYVVTEANDGAEMHKAMQDNKVNLVVMDINLPGK 60
M IL+ +D+A R L GY V ++ A + + + +LVV D+ +P +
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 NGLLLARELREIN-NIGLIFLTGRDNEVDKILGLEIGADDYITKPFNPRELTIRARNLLT 119
N L +++ ++ ++ ++ ++ + I E GA DY+ KPF+ EL L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 RVNS 123

Sbjct: 121 EPKR 124


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0788  CARBMTKINASE  30     0.019    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 30.2 bits (68), Expect = 0.019
Identities = 20/87 (22%), Positives = 29/87 (33%), Gaps = 11/87 (12%)

Query: 213 DYSAALLAEALQAYAVEIWTDVAGIYTTDPRLAPNASPIAEISFNEAAEMATFGAKVLHP 272
D + LAE + A I TDV G + E+ E + G
Sbjct: 216 DLAGEKLAEEVNADIFMILTDVNGAALY--YGTEKEQWLREVKVEELRKYYEEGH--FKA 271

Query: 273 ATILPAVRQKIQVFVGSSWAPEQGGTW 299
++ P V I+ F+ E GG
Sbjct: 272 GSMGPKVLAAIR-FI------EWGGER 291


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_0790  PF01540  27     0.034    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 27.0 bits (59), Expect = 0.034
Identities = 12/17 (70%), Positives = 13/17 (76%)

Query: 79 SAGDYPAAISKLNALEE 95
S GDYPA ISKL+A E
Sbjct: 84 SYGDYPAIISKLSAAVE 100


70  Shal_0881  Shal_0891  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0881  -1      14       -0.815363  FKBP-type peptidylprolyl isomerase
Shal_0882  -1      13       -1.093188  PA-phosphatase-like phosphoesterase
Shal_0883  1       13       -1.666120  hypothetical protein
Shal_0884  -2      11       -0.835285  two component LuxR family transcriptional
Shal_0885  -1      11       -0.951883  nitrate/nitrite sensor protein NarQ
Shal_0886  -3      15       -0.347965  cytochrome c552
Shal_0887  -1      15       -0.136748  isochorismatase hydrolase
Shal_0888  -1      17       -0.110325  nitrogen regulatory protein P-II
Shal_0889  0       17       -0.285157  Rh family protein/ammonium transporter
Shal_0890  -1      17       -0.960117  hypothetical protein
Shal_0891  0       17       -0.406847  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0881  INFPOTNTIATR  186    6e-61    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 186 bits (473), Expect = 6e-61
Identities = 102/242 (42%), Positives = 149/242 (61%), Gaps = 12/242 (4%)

Query: 6 KYSLVALAVVGLTACNQEQAVAVDAPVELKTEAQKEAYSVGASIGTYMAGHIKEQEELGL 65
K LV A++GL A+A L T+ K +YS+GA +G K + G+
Sbjct: 2 KMKLVTAAIMGLA---MSTAMAATDATSLTTDKDKLSYSIGADLG-------KNFKNQGI 51

Query: 66 NVDRSLIVTGFSEGLN-SELKLTQEEMQTILQSLDEKLNEKRQAQAAMLAEKSLAESQAF 124
+++ ++ G +G++ ++L LT+E+M+ +L + L KR A+ AE++ A+ AF
Sbjct: 52 DINPDVLAKGMQDGMSGAQLILTEEQMKDVLSKFQKDLMAKRSAEFNKKAEENKAKGDAF 111

Query: 125 LDANKAKEGVVTTDSGLQYEVITEGTGDKPVAEDTVKVHYVGTLTDGTEFDSSVARGEPA 184
L ANK+K G+V SGLQY++I GTG KP DTV V Y GTL DGT FDS+ G+PA
Sbjct: 112 LSANKSKPGIVVLPSGLQYKIIDAGTGAKPGKSDTVTVEYTGTLIDGTVFDSTEKAGKPA 171

Query: 185 TFPLNRVIPGWTEGVQLMSVGSKYKFVIPADLAYGDRDT-GTIPANSTLVFEVELLDIEK 243
TF +++VIPGWTE +QLM GS ++ +PADLAYG R G I N TL+F++ L+ ++K
Sbjct: 172 TFQVSQVIPGWTEALQLMPAGSTWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISVKK 231

Query: 244 AA 245
AA
Sbjct: 232 AA 233


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_0884  HTHFIS  63     9e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.5 bits (152), Expect = 9e-14
Identities = 25/159 (15%), Positives = 63/159 (39%), Gaps = 9/159 (5%)

Query: 6 SVLVIDDHPLLRRGICQLVTSDSDFTLFGEAGSGLDALSAVSDNEPDIILLDLNMKGMTG 65
++LV DD +R + Q ++ + + + + ++ + D+++ D+ M
Sbjct: 5 TILVADDDAAIRTVLNQALS-RAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 66 LDTLNALRQEGVTSRIVILTVSDAKQDVIRLVRAGADGYLLKDTEPDLLLDMLKNAMLGH 125
D L +++ +++++ + I+ GA YL K + L+ ++ A+
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL--- 119

Query: 126 RVISNTVQEYLSELNSSVNEQEWIENLTPRELEILQHLA 164
+ S+L + + + EI + LA
Sbjct: 120 ----AEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLA 154


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_0885  PF06580  38     6e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.3 bits (89), Expect = 6e-05
Identities = 20/113 (17%), Positives = 40/113 (35%), Gaps = 11/113 (9%)

Query: 458 ITLDYKLPLQLLGAHQHIHILQLTREATLNAIKHASAKHIHISCQKVSSEK----VVISI 513
+ + ++ ++ ++Q E N IKH A+ + K V + +
Sbjct: 240 LQFENQINPAIMDVQVPPMLVQTLVE---NGIKHGIAQLPQGGKILLKGTKDNGTVTLEV 296

Query: 514 SDDGVGIEELKERDQHFGLGIMHERASRLSGV---VEFSQNKKGGATVTLTFP 563
+ G + + GL + ER L G ++ S K+G + P
Sbjct: 297 ENTGSLALKNTKESTGTGLQNVRERLQMLYGTEAQIKLS-EKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0887  ISCHRISMTASE  41     7e-07    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 41.2 bits (96), Expect = 7e-07
Identities = 21/91 (23%), Positives = 36/91 (39%)

Query: 74 ITKHHFSGWQADAFRERLSAMQQKQIILAGIETHVCVYQTCRDLLANGYQTHLVADAMSS 133
+TK +S ++ E + + Q+I+ GI H+ T + + V DA++
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 134 REQSNKVSGLEMMLSHGAMQTNVESLLFELQ 164
LE A +SLL +LQ
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQ 211


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0891  SYCDCHAPRONE  29     0.019    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 29.1 bits (65), Expect = 0.019
Identities = 17/124 (13%), Positives = 41/124 (33%), Gaps = 4/124 (3%)

Query: 276 LRVNPEDMNALNNYAIMLSAQDRLDEWADVHKVLELARIRNPYYYYGMAQQAYFDKDYQD 335
++ + + L + A + ++ V + L + + ++ G+ Y
Sbjct: 29 NEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDL 88

Query: 336 ALLWYKRAIA--KADYRHEFYFGLSRTYWVTGDEMRAQQSLKKALSLTTDGQHKRRYQSK 393
A+ Y + R F+ G+ A+ L A L D + ++
Sbjct: 89 AIHSYSYGAIMDIKEPRFPFHAAEC--LLQKGELAEAESGLFLAQELIADKTEFKELSTR 146

Query: 394 LHAM 397
+ +M
Sbjct: 147 VSSM 150


71  Shal_0924  Shal_0934  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_0924  1       16       1.993947   phosphoribosylglycinamide formyltransferase 2
Shal_0925  1       14       1.168708   hypothetical protein
Shal_0926  1       13       0.815681   secretion protein HlyD family protein
Shal_0927  2       13       0.126202   ATPase central domain-containing protein
Shal_0928  0       16       -1.235602  hypothetical protein
Shal_0929  1       15       -1.374569  hypothetical protein
Shal_0930  1       14       -1.424805  histidine kinase
Shal_0931  2       14       -2.114722  two component transcriptional regulator
Shal_0932  1       19       -2.015424  hypothetical protein
Shal_0933  1       19       -1.664918  MltA-interacting MipA family protein
Shal_0934  0       22       -1.738685  N-acetyltransferase GCN5
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_0924  PF06057  31     0.006    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 31.0 bits (70), Expect = 0.006
Identities = 10/28 (35%), Positives = 13/28 (46%), Gaps = 2/28 (7%)

Query: 17 GCGELGKEVAIELQRYGIEVIGVD--RY 42
G L K V LQ+ G V+G +Y
Sbjct: 62 GWATLDKAVGGILQQQGWPVVGWSSLKY 89


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_0926  RTXTOXIND  53     7e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 53.3 bits (128), Expect = 7e-10
Identities = 48/321 (14%), Positives = 102/321 (31%), Gaps = 82/321 (25%)

Query: 66 INPAVRGVVVSVEVEPNTPIKKGDVLFRIDPTPFQAIVKQKRAALVAAELEVPQLEA--- 122
I P +V + V+ ++KGDVL ++ +A + +++L+ A LE + +
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSR 158

Query: 123 AWETTKAAVTRATADRDRTKSAFERYEKGRKRGGVNSPFTEL--ELDNKRQFYFASEAQL 180
+ E K + + + E E R + F+ + K A+
Sbjct: 159 SIELNKLPELKLPDEPYFQNVSEE--EVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAER 216

Query: 181 TAANA-------------EEL-------------RMRL---------------AYESNVD 199
A L + + Y+S ++
Sbjct: 217 LTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLE 276

Query: 200 GVNTKVAGIQ---------------GELEKAQFDLDQ--------------TVVKAPADG 230
+ +++ + +L + ++ +V++AP
Sbjct: 277 QIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSV 336

Query: 231 MVTQMALRPGIVAVPMPLRPLLSFIPDEERMFVGAFWQNSLL-RLKEGDEAEIILDAAPG 289
V Q+ + V L+ +P+++ + V A QN + + G A I ++A P
Sbjct: 337 KVQQLKVH-TEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPY 395

Query: 290 Q---VFKGKVAKVLPAMAEGE 307
GKV + E +
Sbjct: 396 TRYGYLVGKVKNINLDAIEDQ 416


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0927  ISCHRISMTASE  29     0.048    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 28.8 bits (64), Expect = 0.048
Identities = 18/79 (22%), Positives = 33/79 (41%), Gaps = 9/79 (11%)

Query: 390 DLPDT------EIRQAIFLIH-LQRRQLDPMQFDLAQLAKVSQGFSG--AEIEQAVISAI 440
D+P + +A+ LIH +Q +D + + ++S + Q I +
Sbjct: 16 DMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCVQLGIPVV 75

Query: 441 YTAKASEREVDQSALLEEL 459
YTA+ + D ALL +
Sbjct: 76 YTAQPGSQNPDDRALLTDF 94


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_0930  PF06580  43     2e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 42.5 bits (100), Expect = 2e-06
Identities = 22/127 (17%), Positives = 43/127 (33%), Gaps = 30/127 (23%)

Query: 301 LDVPEETIYLIAEPSLVERALQNLVTNA------QRFSTDDIIVKISQDSDGVRLSVTDD 354
+ I + P ++ +Q LV N Q I++K ++D+ V L V +
Sbjct: 244 NQINPA-IMDVQVPPML---VQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENT 299

Query: 355 GEGILEEDQSKIFEPFYRSASSKNGNKGHGLGLAIIKRIMDRHHGE---VSLQSRPGFTQ 411
G L+ + + G GL ++ + +G + L + G
Sbjct: 300 GSLALKNTK-----------------ESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN 342

Query: 412 FTLYWPA 418
+ P
Sbjct: 343 AMVLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_0931  HTHFIS  74     2e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 74.5 bits (183), Expect = 2e-17
Identities = 32/135 (23%), Positives = 61/135 (45%), Gaps = 2/135 (1%)

Query: 9 RVLLVEDDIRLANLIVDFLKSHGMHVEVERRGDTVLTRLINYKPDIILLDIMLPGMDGLT 68
+L+ +DD + ++ L G V + T+ + D+++ D+++P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 69 LCEKLPDYFAG-PILLMSALGSNEDQIKGLELGADDYVVKPVDPSLLVARINNLL-RRQA 126
L ++ P+L+MSA + IK E GA DY+ KP D + L+ I L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 127 KPAQVESHCLSFGKL 141
+P+++E L
Sbjct: 125 RPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0934  SACTRNSFRASE  41     7e-07    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 40.7 bits (95), Expect = 7e-07
Identities = 22/86 (25%), Positives = 37/86 (43%), Gaps = 4/86 (4%)

Query: 70 IAVRDGDIVGQIGMEVFNNPRRKHVANIGMAVDEAYRGIGIASAMLEAMISLAQNWLAVR 129
+ + + +G+I + N + +I AV + YR G+ +A+L I A+
Sbjct: 69 LYYLENNCIGRIKIRSNWN-GYALIEDI--AVAKDYRKKGVGTALLHKAIEWAKE-NHFC 124

Query: 130 RIELEVYTDNHLAISLYKKHGFVIEG 155
+ LE N A Y KH F+I
Sbjct: 125 GLMLETQDINISACHFYAKHHFIIGA 150


72  Shal_0967  Shal_0977  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias   Product
Shal_0967  -2      11       2.454024  N-acetyltransferase GCN5
Shal_0968  -1      12       2.449063  carboxypeptidase Taq
Shal_0969  0       12       1.774468  peptidase S8/S53 subtilisin kexin sedolisin
Shal_0970  0       13       0.857359  hypothetical protein
Shal_0971  -2      10       1.802892  DEAD/DEAH box helicase
Shal_0972  -2      12       1.801170  peptidase M24
Shal_0974  -1      13       1.775153  peptidase M24
Shal_0975  -1      16       1.683734  FKBP-type peptidylprolyl isomerase
Shal_0976  -1      16       1.745119  OmpA/MotB domain-containing protein
Shal_0977  -1      16       1.803322  Ig domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0967  SACTRNSFRASE  35     5e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 34.9 bits (80), Expect = 5e-05
Identities = 20/67 (29%), Positives = 34/67 (50%), Gaps = 5/67 (7%)

Query: 83 LYLHDIALSSKAQGKGAGRAALAILIEFARCHHYPNISLVAVQG----AHHYWAKQGFKA 138
+ DIA++ + KG G A L IE+A+ +H+ + L Q A H++AK F
Sbjct: 90 ALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLML-ETQDINISACHFYAKHHFII 148

Query: 139 KSINKDL 145
+++ L
Sbjct: 149 GAVDTML 155


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
Shal_0969  SUBTILISIN  136    1e-37    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 136 bits (343), Expect = 1e-37
Identities = 71/210 (33%), Positives = 96/210 (45%), Gaps = 24/210 (11%)

Query: 124 AGMKVCIIDSGLDRSNGDFVWNNISG----DNDGGTGNWDENGGPHGTHVAGTIGAADNN 179
G+KV ++D+G D + D I G D+D G ++ HGTHVAGTI A +N
Sbjct: 41 RGVKVAVLDTGCDADHPDLKARIIGGRNFTDDDEGDPEIFKDYNGHGTHVAGTIAATENE 100

Query: 180 IGVVGMAPGVDMHIIKVFNADGWGYSSDLAHAANLCSDAGANIISMSLGGGGSNSTESNA 239
GVVG+AP D+ IIKV N G G + + +IISMSLGG A
Sbjct: 101 NGVVGVAPEADLLIIKVLNKQGSGQYDWIIQGIYYAIEQKVDIISMSLGGPEDVPELHEA 160

Query: 240 FQSFSDAGGLVVAAAGNDGNSVRS-----YPAGYPSVMMVGANDATDTIADFSQYPGCTT 294
+ + LV+ AAGN+G+ YP Y V+ VGA + ++FS
Sbjct: 161 VKKAVASQILVMCAAGNEGDGDDRTDELGYPGCYNEVISVGAINFDRHASEFSNSNN--- 217

Query: 295 GRGKKQRTDTSICVEIAAGGVDTLSTYPAG 324
V++ A G D LST P G
Sbjct: 218 ------------EVDLVAPGEDILSTVPGG 235



Score = 46.4 bits (110), Expect = 2e-07
Identities = 25/106 (23%), Positives = 35/106 (33%), Gaps = 6/106 (5%)

Query: 408 TLGDTNATTIPAAGATFEDRTAIIAASSANLSIG-TSDYGFMSGTSMATPAVSGIAARVW 466
++G N + + + ++A LS Y SGTSMATP V+G A +
Sbjct: 199 SVGAINFDRHASEFSNSNNEVDLVAPGEDILSTVPGGKYATFSGTSMATPHVAGALALIK 258

Query: 467 SN-----HNQCTGEEIRAALNASARDSGASGHDVYFGHGIADAAAA 507
T E+ A L G S G A
Sbjct: 259 QLANASFERDLTEPELYAQLIKRTIPLGNSPKMEGNGLLYLTAVEE 304


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_0971  SECA  31     0.012    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 30.6 bits (69), Expect = 0.012
Identities = 53/267 (19%), Positives = 103/267 (38%), Gaps = 35/267 (13%)

Query: 10 PEILRAISECGYE--KMTPIQQQAIPAVRRGQDVLASAQTGTGKTAAFALPILQKMLDNP 67
PE + E M Q + + + +A +TG GKT LP L
Sbjct: 65 PEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNAL--- 121

Query: 68 STTGRSNARALILTPTRELAAQIADNINDYAKYLDMKVVTVLGGVKMDSQATKLKRGADI 127
TG+ ++T LA + A+N ++L + V L G M + A + ADI
Sbjct: 122 --TGKG---VHVVTVNDYLAQRDAENNRPLFEFLGLTVGINLPG--MPAPAKREAYAADI 174

Query: 128 IIATPGRLLEHIVACNLSLS-------NVDFLVLDEADRMLDMGFSADIQKILQAVNKKR 180
T + N++ S + + ++DE D +L +++ R
Sbjct: 175 TYGTNNEYGFDYLRDNMAFSPEERVQRKLHYALVDEVDSIL--------------IDEAR 220

Query: 181 QNLLFSATFSTAVKQLANVMMVKPNIIAADKQNTTAVTVSQVVYPVEQRRKRELLSELIG 240
L+ S + + V + P++I +K+++ + + V+++ ++ L+E G
Sbjct: 221 TPLIISGPAEDSSEMYKRVNKIIPHLIRQEKEDSETFQG-EGHFSVDEKSRQVNLTER-G 278

Query: 241 KKNWKQVLVFTATRDAADKLEKELNLD 267
+++LV D + L N+
Sbjct: 279 LVLIEELLVKEGIMDEGESLYSPANIM 305



Score = 29.1 bits (65), Expect = 0.048
Identities = 35/173 (20%), Positives = 64/173 (36%), Gaps = 42/173 (24%)

Query: 220 SQVVYPVEQRRKRELLSELIGKKNWKQ-VLVFTATRDAADKLEKELNLDGIPTAVVHGEK 278
+VY E + + ++ ++ + Q VLV T + + ++ + EL GI V++
Sbjct: 424 PDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLN--- 480

Query: 279 AQGSRRRALREFKEGKM-RVLVATEVAARGLDIQ---------------GLEYVVNYDLP 322
A+ A + G V +AT +A RG DI E +
Sbjct: 481 AKFHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKAD 540

Query: 323 FLAED---------YV--------HRI-----GRTGRAGKSGVAISFVSREEE 353
+ ++ RI GR+GR G +G + ++S E+
Sbjct: 541 WQVRHDAVLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDA 593


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_0975  INFPOTNTIATR  138    7e-44    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 138 bits (349), Expect = 7e-44
Identities = 69/132 (52%), Positives = 89/132 (67%), Gaps = 2/132 (1%)

Query: 25 KAAKENIAIGSAYLADNKLKDGVTTTASGLQYQVLEPGTGTVHPKASDTVTVHYHGTLID 84
K A+EN A G A+L+ NK K G+ SGLQY++++ GTG P SDTVTV Y GTLID
Sbjct: 99 KKAEENKAKGDAFLSANKSKPGIVVLPSGLQYKIIDAGTGA-KPGKSDTVTVEYTGTLID 157

Query: 85 GTVFDSSVERGEPIAFPLNRVIKGWTEGVQLMVVGEKARFFIPSELAYGNRS-AGKISGG 143
GTVFDS+ + G+P F +++VI GWTE +QLM G F+P++LAYG RS G I
Sbjct: 158 GTVFDSTEKAGKPATFQVSQVIPGWTEALQLMPAGSTWEVFVPADLAYGPRSVGGPIGPN 217

Query: 144 STLIFDVELISI 155
TLIF + LIS+
Sbjct: 218 ETLIFKIHLISV 229


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
Shal_0976  OMPADOMAIN  200    6e-64    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 200 bits (511), Expect = 6e-64
Identities = 104/342 (30%), Positives = 157/342 (45%), Gaps = 30/342 (8%)

Query: 24 AVYAAETDNSEAANEFDPYFYLGAKAGQMHYQNAC-ESWSISCDGNDVGFGGFVGYQAWQ 82
AV A A D +Y GAK G Y + + + N +G G F GYQ
Sbjct: 9 AVALAGFATVAQAAPKDNTWYTGAKLGWSQYHDTGFINNNGPTHENQLGAGAFGGYQVNP 68

Query: 83 YLGFETAYLDLGEAVAHYSESGVNNTYTGSMKGWEISAVTRFSLSESFELFAKAGSVYWD 142
Y+GFE Y LG Y S N Y +G +++A + +++ +++ + G + W
Sbjct: 69 YVGFEMGYDWLGRMP--YKGSVENGAYKA--QGVQLTAKLGYPITDDLDIYTRLGGMVWR 124

Query: 143 GDNRGPQS-RNSDSDWAPMLGAGVEYQLSPSWVARLEYQYIDSLGS--DLIGGSNGHLTT 199
D + +N D+ +P+ GVEY ++P RLEYQ+ +++G + + + +
Sbjct: 125 ADTKSNVYGKNHDTGVSPVFAGGVEYAITPEIATRLEYQWTNNIGDAHTIGTRPDNGMLS 184

Query: 200 LGISYRFGQSATKAQPPVKETPVTLPEKIIPVTPKQIVLPALTVISLFDFDSSELTHSGS 259
LG+SYRFGQ A P V P PE V K L + LF+F+ + L G
Sbjct: 185 LGVSYRFGQGE--AAPVVAPAPAPAPE----VQTKHFTLKSDV---LFNFNKATLKPEGQ 235

Query: 260 --LAPVIERLKQ--SPEAIANIKGYTDSKGNATYNQALSERRAQAVADYLIAAGIKPEQI 315
L + +L + + GYTD G+ YNQ LSERRAQ+V DYLI+ GI ++I
Sbjct: 236 AALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYLISKGIPADKI 295

Query: 316 EVHGYGEQFPLMKNDT---------PEHRHENRRVLIHIQST 348
G GE P+ N + +RRV I ++
Sbjct: 296 SARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVKGI 337


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_0977  INTIMIN  36     0.002    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 35.8 bits (82), Expect = 0.002
Identities = 54/268 (20%), Positives = 93/268 (34%), Gaps = 28/268 (10%)

Query: 72 SDGATKDVSQEVSWSSSEPTLATISTN--GVAMGAL-AGEVQITASLPPVQEGGMVLTDS 128
A VS + ++ + + +TN G A L + + E L +
Sbjct: 589 VAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNAN 648

Query: 129 AALKVTDAALMALNIEPGQAQILVGMSQSYRALALFADGHQ----QDVTNDASWTSLDTA 184
A + V I+ + + + G + Q+VT + L +
Sbjct: 649 AVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNS 708

Query: 185 TATIESAGESIGLATGVTVGVTQVKANFS--AQEALATLVVLATKPS------QLLINPV 236
T ++ G + T T G + V A S A + A V T + +++ V
Sbjct: 709 TEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGV 768

Query: 237 DEVLPNGTSLQYQAHLILEDGLSIDVTTQSAWHSSSVDIASID-NQGFLVAKTLGQTRVT 295
LP Q +L G + W S++ IAS+D + G + K G T ++
Sbjct: 769 KGKLPTVWLQYGQVNLKASGG-----NGKYTWRSANPAIASVDASSGQVTLKEKGTTTIS 823

Query: 296 ATLSFAGISLLDTTSATVVDAKVSNLIV 323
S D +AT A ++LIV
Sbjct: 824 VISS-------DNQTATYTIATPNSLIV 844


73  Shal_1139  Shal_1145  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_1139  -1      22       -2.642109  methylation site containing protein
Shal_1140  -1      19       -1.731631  hypothetical protein
Shal_1141  -1      17       -1.188436  methylation site containing protein
Shal_1142  -1      15       -1.210683  general secretion pathway protein H
Shal_1143  -2      13       -0.812894  nitrogen regulatory protein P-II
Shal_1144  -2      13       -0.439314  LacI family transcriptional regulator
Shal_1145  -2      12       0.351038   TonB-dependent receptor plug
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1139  BCTERIALGSPG  60     2e-14    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 59.5 bits (144), Expect = 2e-14
Identities = 25/74 (33%), Positives = 44/74 (59%), Gaps = 2/74 (2%)

Query: 1 MHKTRGFTLIELMIVVAIVGILASIAYPSYIDYVTKSARSEGVAAVLRVANLQEQYYLDN 60
K RGFTL+E+M+V+ I+G+LAS+ P+ + K+ + + V+ ++ + N + Y LDN
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDN 63

Query: 61 RAFATDLKKLGLSA 74
+ T GL +
Sbjct: 64 HHYPT--TNQGLES 75


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1141  BCTERIALGSPG  47     2e-09    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 47.2 bits (112), Expect = 2e-09
Identities = 18/45 (40%), Positives = 30/45 (66%)

Query: 3 NKKQGFTLVELMITIVVAGILLSIAVPSLVSMYEQTRVNSNVEKI 47
+K++GFTL+E+M+ IV+ G+L S+ VP+L+ E+ V I
Sbjct: 5 DKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDI 49


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1142  BCTERIALGSPG  41     5e-07    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 40.6 bits (95), Expect = 5e-07
Identities = 17/51 (33%), Positives = 30/51 (58%), Gaps = 3/51 (5%)

Query: 4 RTAGFTLIELMVTLVVATILIVIAVPSL---SQFYEAQRAKSAIRVIQQTL 51
+ GFTL+E+MV +V+ +L + VP+L + + Q+A S I ++ L
Sbjct: 6 KQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENAL 56


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1145  ACRIFLAVINRP  34     0.003    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 34.0 bits (78), Expect = 0.003
Identities = 27/146 (18%), Positives = 56/146 (38%), Gaps = 29/146 (19%)

Query: 67 SVVDAITAEDIGKFPDGDVGESLARIPGVTVNRQFGQGQQVSIRGASSQLTRTLLNGHAV 126
S T +DI + +V ++L+R+ GV + FG + I
Sbjct: 144 SDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQYAMRI----------------- 186

Query: 127 ASTGWFDQQAIDRSFNYSLLPPEMVSGIEVYKSSQADLPEGGIGGT-VIVKTRKPLDLDA 185
W D + Y L P ++++ + K + G +GGT + + + A
Sbjct: 187 ----WLDADLL---NKYKLTPVDVINQL---KVQNDQIAAGQLGGTPALPGQQLNASIIA 236

Query: 186 NTVFVSAKGDYGTISESTDPELSGLY 211
T F + + ++G ++ + + S +
Sbjct: 237 QTRFKNPE-EFGKVTLRVNSDGSVVR 261


74  Shal_1194  Shal_1199  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias   Product
Shal_1194  -2      12       0.946510  RND family efflux transporter MFP subunit
Shal_1195  -2      12       1.140242  hydrophobe/amphiphile efflux-1 (HAE1) family
Shal_1196  -2      15       0.909923  iron-containing alcohol dehydrogenase
Shal_1197  -3      16       1.296586  hypothetical protein
Shal_1198  -3      14       1.674718  acriflavin resistance protein
Shal_1199  -3      15       1.604444  RND family efflux transporter MFP subunit
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_1194  RTXTOXIND  34     7e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 34.4 bits (79), Expect = 7e-04
Identities = 20/120 (16%), Positives = 44/120 (36%), Gaps = 9/120 (7%)

Query: 101 AELAQQNAVLKQAVASRDVAVMNWERGRRLLPDGMISAQDMDELTSRKLTTA-AGVVQAE 159
E + V K + + +++ + +L+ Q KL +
Sbjct: 262 VEAVNELRVYKSQLEQIESEILSAKEEYQLV------TQLFKNEILDKLRQTTDNIGLLT 315

Query: 160 AAVQAAELQLSYTKVYAPISGRISHSKVST-GDIITPQSEMASLV-QLQPMWVSFQVAEK 217
+ E + + + AP+S ++ KV T G ++T + +V + + V+ V K
Sbjct: 316 LELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNK 375



Score = 31.0 bits (70), Expect = 0.009
Identities = 22/162 (13%), Positives = 50/162 (30%), Gaps = 21/162 (12%)

Query: 72 GQLLKRTFV-EGDDITAGDLLFEIDPSTYKAELAQQNAVLKQAVASR---DVAVMNWERG 127
++K V EG+ + GD+L ++ +A+ + + L QA + + + E
Sbjct: 104 NSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELN 163

Query: 128 RRLL-----PDGMISAQDMDELTSRKLTTAAGVVQAEAAVQAAELQLSYTKVYAPISGRI 182
+ + + + L L + Q + +L+ K A +
Sbjct: 164 KLPELKLPDEPYFQNVSEEEVLRLTSLIKEQ---FSTWQNQKYQKELNLDKKRAERLTVL 220

Query: 183 SHSKVSTGDIITPQSEMASLVQLQPMWVSFQVAEKKLISAQQ 224
+ +S + L K+ I+
Sbjct: 221 ARINRYENLSRVEKSRLDDFSSL---------LHKQAIAKHA 253


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1195  ACRIFLAVINRP  998    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 998 bits (2582), Expect = 0.0
Identities = 441/1034 (42%), Positives = 647/1034 (62%), Gaps = 12/1034 (1%)

Query: 2 ISDFFINRPKFAFVISTVLTLVGLIAIPVLSVSEFPEIAPPQVSVSTSYSGASADIVKDT 61
+++FFI RP FA+V++ +L + G +AI L V+++P IAPP VSVS +Y GA A V+DT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 62 IAQPLEAEVNGVEGMLYMESKSANDGSYSLNVTFEVGTDADMAQVKVQNRVQQAMPRLPE 121
+ Q +E +NG++ ++YM S S + GS ++ +TF+ GTD D+AQV+VQN++Q A P LP+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 122 EVKRQGVKVEKQSPNMLMVVNLVSPNETFDSLFITNYAGLNVKDALARQYGVSKVQVIGA 181
EV++QG+ VEK S + LMV VS N I++Y NVKD L+R GV VQ+ GA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 182 LDYAMRIWLDPDKMASLGVTATDVIAALQEQNIQVAAGRIGAAPVDPEQQFQYTLQTKGR 241
YAMRIWLD D + +T DVI L+ QN Q+AAG++G P P QQ ++ + R
Sbjct: 181 -QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 242 LKDPKEFYDVMIRANNDGSKVVVGDVARVELGSQTYDAQGKLNNKPSAIIAIYQSPDANA 301
K+P+EF V +R N+DGS V + DVARVELG + Y+ ++N KP+A + I + ANA
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 302 LEVGKAIKAEMENLSERFPNDLEYEVLYDTTEFVETSIKEVVQTLFISISLVVLVVFIFL 361
L+ KAIKA++ L FP ++ YDTT FV+ SI EVV+TLF +I LV LV+++FL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 362 QDVRSTLVPAIAIPVSLIGTFAFLLAFDMSINTVSLFALILAIGIVVDDAIVVVENVTRL 421
Q++R+TL+P IA+PV L+GTFA L AF SINT+++F ++LAIG++VDDAIVVVENV R+
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 422 MQDEGLSPKEATSKAMKEVTGPIIATTLVLLAVFAPTAVMPGITGQMYAQFSVTICISVL 481
M ++ L PKEAT K+M ++ G ++ +VL AVF P A G TG +Y QFS+TI ++
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 482 ISSINALTLSPALCASLLRP----PKTHTKGFHAVFNKYFERFTGKYMKLVSLLTRKLAI 537
+S + AL L+PALCA+LL+P + GF FN F+ Y V +
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 538 VGVVYICLIVVTGGVAKILPSGFVPMEDKKAFMVDIQLPDGASLNRTEDVMRELVELTLA 597
++Y ++ + LPS F+P ED+ F+ IQLP GA+ RT+ V+ ++ + L
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 598 --EPGVENVIHASGFSILTGSVSSNGGLMIVTLSTWSERESPDMVESAIVAKLQAKYAAN 655
+ VE+V +GFS + N G+ V+L W ER + A++ + + +
Sbjct: 600 NEKANVESVFTVNGFS--FSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKI 657

Query: 656 PAVKAMAFSLPPIPGVGSVGGFEFVLQDTQGRTPQELASVMRALIMKANEQP-EIAMAFS 714
+ F++P I +G+ GF+F L D G L L+ A + P +
Sbjct: 658 RDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 715 NFRADVPQMFVDVDRDKAKALGISLNEIFSTMQTMLGSMYVNDFNRFGKVFRVILQAESE 774
N D Q ++VD++KA+ALG+SL++I T+ T LG YVNDF G+V ++ +QA+++
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 775 YRNSDRDISRFYVRSNTGEMVPLSTLVTVTPILGPDVMNNYNMFSSTTINGFPAPGFSSG 834
+R D+ + YVRS GEMVP S T + G + YN S I G APG SSG
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 835 DAITAMERAANESLPAGYTFEWTGQTYQEIKAGNLAPLIFSLALVFTYLFLVAQYESWTI 894
DA+ ME A++ LPAG ++WTG +YQE +GN AP + +++ V +L L A YESW+I
Sbjct: 838 DAMALMENLASK-LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 895 PFAVILAVPIAVLGAFLNILLVGSDLNLYAQIGLVLLIGLACKNAILIVEFAKQLHE-EG 953
P +V+L VP+ ++G L L ++Y +GL+ IGL+ KNAILIVEFAK L E EG
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 954 KSILDAAQTAARLRFRAVLMTAFSFLLGVLPLVIATGAGAGSRRALGYSVFGGMLAATIV 1013
K +++A A R+R R +LMT+ +F+LGVLPL I+ GAG+G++ A+G V GGM++AT++
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 1014 GTLLVPVFYVIMQK 1027
VPVF+V++++
Sbjct: 1017 AIFFVPVFFVVIRR 1030



Score = 83.0 bits (205), Expect = 4e-18
Identities = 101/526 (19%), Positives = 197/526 (37%), Gaps = 58/526 (11%)

Query: 534 KLAIVGVVYICLIVVTGGVA-KILPSGFVPMEDKKAFMVDIQLPDGASLNRTEDVMRELV 592
+ I V ++++ G +A LP P A V P GA +D + +++
Sbjct: 7 RRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYP-GADAQTVQDTVTQVI 65

Query: 593 ELTLAEPGVENVIHASGFSILTGSVSSNGGLMIVTLSTWSERESPDMVESAIVAKLQAKY 652
E + G++N+++ S S S + G + +TL T+ PD+ + + KLQ
Sbjct: 66 EQNMN--GIDNLMYMS-------STSDSAGSVTITL-TFQSGTDPDIAQVQVQNKLQLAT 115

Query: 653 AANPAVKAMAFSLPPIPGVGSVGGFEFVL---QDTQGRTPQEL----ASVMRALIMKANE 705
P I S + V D G T ++ AS ++ + + N
Sbjct: 116 PLLPQ----EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNG 171

Query: 706 QPEIAMAFSNFRADVPQMFVDVDRDKAKALGISLNEIFSTMQT----MLGSMYVNDFNRF 761
++ + + + M + +D D ++ ++ + ++ +
Sbjct: 172 VGDVQLFGAQY-----AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALP 226

Query: 762 GKVFRVILQAESEYRNSDRDISRFYVRSNT-GEMVPLSTLVTVTPILGPDVMNNYNMFSS 820
G+ + A++ ++ + + + +R N+ G +V L + V LG + NYN
Sbjct: 227 GQQLNASIIAQTRFK-NPEEFGKVTLRVNSDGSVVRLKDVARVE--LGGE---NYN--VI 278

Query: 821 TTINGFPAPGFS-----------SGDAITAMERAANESLPAGYTFEWTGQTYQEIKAGNL 869
ING PA G + AI A P G + T ++
Sbjct: 279 ARINGKPAAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIH 338

Query: 870 A---PLIFSLALVFTYLFLVAQYESWTIPFAVILAVPIAVLGAFLNILLVGSDLNLYAQI 926
L ++ LVF ++L ++ +AVP+ +LG F + G +N
Sbjct: 339 EVVKTLFEAIMLVFLVMYLF--LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMF 396

Query: 927 GLVLLIGLACKNAILIVE-FAKQLHEEGKSILDAAQTAARLRFRAVLMTAFSFLLGVLPL 985
G+VL IGL +AI++VE + + E+ +A + + A++ A +P+
Sbjct: 397 GMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPM 456

Query: 986 VIATGAGAGSRRALGYSVFGGMLAATIVGTLLVPVFYVIMQKMREK 1031
G+ R ++ M + +V +L P + K
Sbjct: 457 AFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSA 502


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1198  ACRIFLAVINRP  775    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 775 bits (2004), Expect = 0.0
Identities = 308/1032 (29%), Positives = 515/1032 (49%), Gaps = 28/1032 (2%)

Query: 3 LSDVSVKRPVVAIVLSLLLCVFGAVSFSKLAVREMPDVESPVVTVMTTYEGASATIMESQ 62
+++ ++RP+ A VL+++L + GA++ +L V + P + P V+V Y GA A ++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 63 ITTTLEGELTGISGVDQIQSIT-RNGMSRITVTFLLGWDLTEGVSDVRDAVARAQRRLPD 121
+T +E + GI + + S + G IT+TF G D V++ + A LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 122 EAKDPIVSKDNGSGEPSVYVNLSSSVMDRTQ--LTDYAQRVLEDRFSLISGVSSVSISGG 179
E + +S + S + S TQ ++DY ++D S ++GV V + G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 180 LYKVMYVQLNPELMAGRNITTRDIVNVLNKENVETPGGEVRNDTTV------MAVRTARL 233
Y + + L+ +L+ +T D++N L +N + G++ + ++
Sbjct: 181 QYAMR-IWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 234 YQSPEDFNYLVLRTAADGSQVYLKDVANVFIGAENENSTFKSDGVVNISLGVVSQSDANP 293
+++PE+F + LR +DGS V LKDVA V +G EN N + +G LG+ + AN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 294 LEVAQDVHKEVELIQKFLPDGTKLIVDYDSTVFIDRSIDEVFSTLAVTALLVILVLYIFI 353
L+ A+ + ++ +Q F P G K++ YD+T F+ SI EV TL +LV LV+Y+F+
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 354 GQARATLIPAVTVPVSLISAFIAANVFGYSINLLTLMALILSIGLVVDDAIVVVENIFHH 413
RATLIP + VPV L+ F FGYSIN LT+ ++L+IGL+VDDAIVVVEN+
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 414 I-EKGEPPILAAYNGTREVGFAVIATTAVLVMVFLPISFMEGMVGLLFTEFSVMLAVSVL 472
+ E PP A ++ A++ VL VF+P++F G G ++ +FS+ + ++
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 473 FSSLIALTLTPVLGSKLLKANVK-----PNRFNLWVESGFSKLEHAYRNMVAKAVTFRVA 527
S L+AL LTP L + LLK F W + F + Y N V K +
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 528 SPLVIIACILGSAWLMQQVPAQLAPQEDRGVIFAFIKGAEGTSYNRMAANMEIAEKKLMP 587
L+ + G L ++P+ P+ED+GV I+ G + R ++ +
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 588 LLGQGVVKSFSIQAPAFGGRAGDQTGFVIIQLEDWDERTINAQQALGVVAKA---LKGIP 644
V F++ +F G+ G + L+ W+ER + A V+ +A L I
Sbjct: 600 NEKANVESVFTVNGFSFSGQ-AQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR 658

Query: 645 DVMVRP--LLPGFRGGSSEPVQFVL---GGSDYEELFKWA-KLLEEEALYSPIFESPEID 698
D V P + G++ F L G ++ L + +LL A + S +
Sbjct: 659 DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 699 YQETTPELVVTVDKERAAELGISVAEVSETLEIMLGGRSETTFVERGEEYDVYLRGDENS 758
E T + + VD+E+A LG+S++++++T+ LGG F++RG +Y++ D
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 759 FSNVADLSQIYMRSNKGELITLDTITHIEEVASAQKLSHNNKQKSITLKANLGKGYTLGE 818
D+ ++Y+RS GE++ T V + +L N S+ ++ G + G+
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 819 ALDFLDEKAIEMLPSDISVAYTGESKDFKENQSSIFIVFGLALLVAYLVLAAQFESFINP 878
A+ + E LP+ I +TG S + + + + ++ +V +L LAA +ES+ P
Sbjct: 839 AMALM-ENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIP 897

Query: 879 LVVMFTVPMGVFGGFLGLYLTGQGLNIYSQIGMIMLIGMVTKNGILIVEFANQLRDK-GL 937
+ VM VP+G+ G L L Q ++Y +G++ IG+ KN ILIVEFA L +K G
Sbjct: 898 VSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGK 957

Query: 938 EIEQAIIDASVRRLRPILMTAFTTLIGAIPLILSTGAGSESRIAVGTVVFFGMAFATFVT 997
+ +A + A RLRPILMT+ ++G +PL +S GAGS ++ AVG V GM AT +
Sbjct: 958 GVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLA 1017

Query: 998 LLVIPAMYRLIS 1009
+ +P + +I
Sbjct: 1018 IFFVPVFFVVIR 1029


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_1199  RTXTOXIND  39     2e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.0 bits (91), Expect = 2e-05
Identities = 47/254 (18%), Positives = 86/254 (33%), Gaps = 33/254 (12%)

Query: 47 EQHELSQSLSLIGKLDAEQSVFIAPQIAGKIKSIRVSTEQEIDQGQLL------------ 94
Q++ Q + K AE+ +A +I RV + D LL
Sbjct: 198 WQNQKYQKELNLDKKRAERLTVLA-RINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLE 256

Query: 95 VQLDDAKAQAALAEAAAYLAD---EKRKLKEFLKLIDQNAITKTEIDAQKASVDMA--NA 149
+ +A L + L E KE +L+ Q + ++ + ++
Sbjct: 257 QENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTL 316

Query: 150 RLAAAQADLDYHYLSAPFSGTT-GLIDFSRGKMVTAGTELLTL-DDLSSMRLDLQVPEHY 207
LA + + AP S L + G +VT L+ + + ++ + V
Sbjct: 317 ELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKD 376

Query: 208 LSQLSVGMKVSAKSRAWPNQ---VFIGKVLAID-SRVNQDTLNL--RVRVQFDNPT---- 257
+ ++VG K A+P +GKV I+ + L L V + +
Sbjct: 377 IGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLSTG 436

Query: 258 ---QQLKPGMMMSA 268
L GM ++A
Sbjct: 437 NKNIPLSSGMAVTA 450



Score = 37.5 bits (87), Expect = 6e-05
Identities = 25/159 (15%), Positives = 50/159 (31%), Gaps = 26/159 (16%)

Query: 10 IMLVVAGAISYSSFSEAKRTDSKDRPVRTVPVVTGIVEQHELSQSLSLIGKLDAEQSVFI 69
IM + A S V V G + + +S I
Sbjct: 64 IMGFLVIAFILSVLG----------QVEIVATANGKL--------------THSGRSKEI 99

Query: 70 APQIAGKIKSIRVSTEQEIDQGQLLVQLDDAKAQAALAEAAAYLADEKRKLKEFLKLIDQ 129
P +K I V + + +G +L++L A+A + + L +L++ I
Sbjct: 100 KPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQA--RLEQTRYQILS 157

Query: 130 NAITKTEIDAQKASVDMANARLAAAQADLDYHYLSAPFS 168
+I ++ K + ++ + + FS
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFS 196


75  Shal_1387  Shal_1393  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_1387  -3      13       -1.043337  acriflavin resistance protein
Shal_1388  -2      16       -3.828662  RND family efflux transporter MFP subunit
Shal_1389  -1      15       -4.101779  TetR family transcriptional regulator
Shal_1390  -1      15       -4.366458  hypothetical protein
Shal_1391  -2      14       -3.928310  hypothetical protein
Shal_1392  -2      14       -3.885867  hypothetical protein
Shal_1393  -1      16       -2.544383  polysaccharide biosynthesis protein CapD
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1387  ACRIFLAVINRP  360    e-110    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 360 bits (925), Expect = e-110
Identities = 205/1050 (19%), Positives = 429/1050 (40%), Gaps = 60/1050 (5%)

Query: 1 MVKSLVENGRLISLIIALLIVAGFGAISSLPRMEDPEITNRFASVITHYPGASAERVEAL 60
M + ++ +L++AG AI LP + P I SV +YPGA A+ V+
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTEVLESELRRLEELKLVQSTS-RPGISVIQLELKDSVVETAPVWSR--ARDLIADAKAL 117
VT+V+E + ++ L + STS G I L + T P ++ ++ + A L
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQS---GTDPDIAQVQVQNKLQLATPL 117

Query: 118 LPVSAQTPTL-DDQLGYANTAILGIVWRGGGEVRTDMLNRYAKELQSQLRLLSGTDFVNL 176
LP Q + ++ + + G V G + D+ + A ++ L L+G V L
Sbjct: 118 LPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQL 177

Query: 177 YGQPTEEILVQLDGHKVNQLALSAQSIAQILQNADAKVSAGEINND------NFRALIEV 230
+G + + LD +N+ L+ + L+ + +++AG++ A I
Sbjct: 178 FGAQ-YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIA 236

Query: 231 SGELDSISRIAQVPLKVTEDGQIIRLADIAHISRQPKEPANSIALIDQQQGVMVAARMLS 290
+ +V L+V DG ++RL D+A + E N IA I+ + + ++ +
Sbjct: 237 QTRFKNPEEFGKVTLRVNSDGSVVRLKDVARV-ELGGENYNVIARINGKPAAGLGIKLAT 295

Query: 291 NTRVDLWLEQVKHSVDQLQTSIPANIEIEWLFDQEGYTTERLSDLVGSLLLG-FLIILAV 349
+ +K + +LQ P +++ + +D + + ++V +L L+ L +
Sbjct: 296 GANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVM 355

Query: 350 LMLTLGLRNAVIVALSLPLTALFTLACMKYIGLPIHQMSVTGLVVALGIMVDNAIVIVDA 409
+ +R +I +++P+ L T A + G I+ +++ G+V+A+G++VD+AIV+V+
Sbjct: 356 YLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVEN 415

Query: 410 ISQRRQK-GMDRLDAVKQTLEHLWLPLAGSTITTMLAFAPIVLMPGAAGEFVGGIAISVM 468
+ + + + +A ++++ + L G + F P+ G+ G +I+++
Sbjct: 416 VERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIV 475

Query: 469 FALLGSFIISHTIIAGLAGRFGVDGKGTHWYHRGINLPFMSHAFRQTLSLA-------LS 521
A+ S +++ + L H ++G + + F +++ L
Sbjct: 476 SAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILG 535

Query: 522 RPLLAAVIIGVLPVMGFIAAGKMTEQFFPPSDRDMFQIEVYLAPQASISNTRAQVEKI-- 579
+I ++ + ++ F P D+ +F + L A+ T+ ++++
Sbjct: 536 STGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTD 595

Query: 580 DAKLRQTDGVTRIDWVVGGNAPSFYYNLLQRQQGASHYAQAMVKARDFA-----SANQLI 634
+ V + V G + Q + A +K + SA +I
Sbjct: 596 YYLKNEKANVESVFTVNGFSFSG--------QAQNAGMAFVSLKPWEERNGDENSAEAVI 647

Query: 635 PQLQMTLDKQFPGAQII------VRKLEQGPPFNAPVEVRIFGPNLDQLKLLGEQV-RKA 687
+ +M L + +I + +L F+ + + G D L Q+ A
Sbjct: 648 HRAKMEL-GKIRDGFVIPFNMPAIVELGTATGFDFELIDQ-AGLGHDALTQARNQLLGMA 705

Query: 688 LSDTQNVIHTRATLSAGSPKVWLQINEDASLISGLSLTDIAKQIEMATTGINGGSILEQT 747
+++ R + + L+++++ + G+SL+DI + I A G +++
Sbjct: 706 AQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRG 765

Query: 748 ESLPVRVRLSDSIREQQTKLSAITLVSGQGTGIPLSAISSSQIEVSRGAIPRRDGQRVNV 807
+ V+ R + + + S G +P SA ++S + R +G
Sbjct: 766 RVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSME 825

Query: 808 IEAYITSGVLPQTVVDNVSDKLSAIELPSGYRVELGGESAKRNEAIGNLMSNLVLVITLL 867
I+ G + + + S +LP+G + G S + + + + + ++
Sbjct: 826 IQGEAAPGTSSGDAMALMENLAS--KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVV 883

Query: 868 LATVVLSFNSFRLTGIILFSAIQSAGLGLLAVYTFGYPFGFTVIIGLLGLMGLAINAAIV 927
+ + S+ + ++ LLA F ++GLL +GL+ AI+
Sbjct: 884 FLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAIL 943

Query: 928 ILAELEEIPATRNGDTQLIVELVTRCG----RHIGSTTITTIGGFLPLII---AGGGFWP 980
I+ +++ + G +VE R I T++ I G LPL I AG G
Sbjct: 944 IVEFAKDLME-KEGKG--VVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQN 1000

Query: 981 PFAIAIAGGTLLTTLLSLVWVPTMYHLIMR 1010
I + GG + TLL++ +VP + +I R
Sbjct: 1001 AVGIGVMGGMVSATLLAIFFVPVFFVVIRR 1030


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_1388  RTXTOXIND  49     2e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 48.7 bits (116), Expect = 2e-08
Identities = 29/210 (13%), Positives = 74/210 (35%), Gaps = 40/210 (19%)

Query: 109 ELDASLAQNKADLDLAKATLGRSLELQQQGYVSEQQLDELKGQLSSLLAAKNR------- 161
E + + +L + K+ L +++ + ++++ + + + K R
Sbjct: 256 EQENKYVEAVNELRVYKSQLE---QIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIG 312

Query: 162 -LSASLLANQLKIEKSDLLAPFDGVISKRS-HNLGEVIGVGSPVYTII-EQNNIQAYVGV 218
L+ L N+ + + S + AP + + H G V+ + I+ E + ++ V
Sbjct: 313 LLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALV 372

Query: 219 PVQIANNLTTGQSVDLRV---------------------RQQD------YHASIAGIGAE 251
+ + GQ+ ++V +D ++ I+
Sbjct: 373 QNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENC 432

Query: 252 LNPVTRTVELRLTLPDDANVINGELSYLNY 281
L+ + + L + A + G S ++Y
Sbjct: 433 LSTGNKNIPLSSGMAVTAEIKTGMRSVISY 462



Score = 38.3 bits (89), Expect = 5e-05
Identities = 22/106 (20%), Positives = 42/106 (39%), Gaps = 7/106 (6%)

Query: 76 GKIKTLAVDSGDRVKQGQLLAKLDTRLLMAEKNELDASLAQNKADLDLAKATLGRSLELQ 135
+K + V G+ V++G +L KL A+ + +SL Q + + L RS+EL
Sbjct: 105 SIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQ-TRYQILSRSIELN 163

Query: 136 QQGYVSEQQLDELKGQLSSLLAAKNRLSA-SLLANQLKIEKSDLLA 180
+ +L ++ + L SL+ Q ++
Sbjct: 164 K-----LPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQ 204


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_1389  HTHTETR  78     4e-20    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 78.1 bits (192), Expect = 4e-20
Identities = 39/204 (19%), Positives = 65/204 (31%), Gaps = 5/204 (2%)

Query: 1 MNSPVLSRSEQKREQILQAAKDLFCEQGFPNTSMDEVAKLAGVSKQTVYSHFGCKDDLFV 60
M +++ R+ IL A LF +QG +TS+ E+AK AGV++ +Y HF K DLF
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 A--SIESKCLVHGVNQEVFSDPTMPEDSLLLFAKHFGEVITSPEAVTVFKACVAQADSHP 118
+ + + P P L H E + E + +
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 119 ---ELSELFYSAGPQHILGLLRDYLTRVSELGVYHFANPHHCAVRLCLMIFGEMRMRLDL 175
+ + + L E + A + +
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 176 GLDVSDLLDTREQYINETVEMFLR 199
DL Y+ +EM+L
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLL 204


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1393	NUCEPIMERASE	78	2e-18	Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 78.3 bits (193), Expect = 2e-18
Identities = 41/245 (16%), Positives = 90/245 (36%), Gaps = 54/245 (22%)

Query: 6 SILITGGTGSFGRMYTRTILERY-----------------KPKRLIIFSRDELKQYEMQQ 48
L+TG G G ++ +LE K RL + ++ + ++
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHK--- 58

Query: 49 EFNEPCMRYFIGDVRDGDRLKQAFRDVDF--VIHAAALKQVPAAEYNPMECIKTNIHGAE 106
D+ D + + F F V + V + NP +N+ G
Sbjct: 59 -----------IDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFL 107

Query: 107 NVIKAAIENGVEKVIALST---------------DKAANPINLYGATKLASDKLFVAANN 151
N+++ N ++ ++ S+ D +P++LY ATK A++ + ++
Sbjct: 108 NILEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSH 167

Query: 152 IVGEGRTRFSAVRYGNVVGSRGS---VVPLFKAIVARGETCLPITHEEMTRFWISLQDGV 208
+ G + +R+ V G G + F + G++ + +M R + + D
Sbjct: 168 LYG---LPATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIA 224

Query: 209 DFVLK 213
+ +++
Sbjct: 225 EAIIR 229


76	Shal_1425	Shal_1466	N	Y	N	Pathogenicity Island (unbiased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
Shal_1425	-1	14	0.126229	response regulator receiver modulated CheW
Shal_1426	0	14	1.729745	CheR-type MCP methyltransferase
Shal_1427	1	17	2.379698	flagellar basal body rod protein FlgB
Shal_1428	1	16	2.343958	flagellar basal body rod protein FlgC
Shal_1429	2	17	2.915026	flagellar basal body rod modification protein
Shal_1430	2	18	2.649657	flagellar hook protein FlgE
Shal_1431	1	15	2.336608	flagellar basal body rod protein FlgF
Shal_1432	0	13	1.764916	flagellar basal body rod protein FlgG
Shal_1433	1	12	1.373361	flagellar basal body L-ring protein
Shal_1434	1	12	1.189408	flagellar basal body P-ring protein
Shal_1435	0	12	0.629188	flagellar rod assembly protein/muramidase FlgJ
Shal_1436	0	11	-0.147885	flagellar hook-associated protein FlgK
Shal_1437	1	15	-1.423120	flagellar hook-associated protein FlgL
Shal_1438	1	17	-2.116587	flagellin domain-containing protein
Shal_1439	2	15	-1.815929	flagellin domain-containing protein
Shal_1440	1	16	-1.266631	flagellar protein FlaG protein
Shal_1441	1	17	-0.655373	flagellar hook-associated 2 domain-containing
Shal_1442	2	17	-0.496509	hypothetical protein
Shal_1443	1	14	0.622418	flagellar protein FliS
Shal_1444	1	12	0.899281	sigma-54 dependent trancsriptional regulator
Shal_1445	2	12	1.307608	PAS/PAC sensor-containing signal transduction
Shal_1446	1	12	2.364775	Fis family two component sigma54 specific
Shal_1447	1	13	2.458869	flagellar hook-basal body complex subunit FliE
Shal_1448	2	10	2.788139	flagellar MS-ring protein
Shal_1449	3	10	2.551944	flagellar motor switch protein G
Shal_1450	1	11	2.392242	flagellar assembly protein FliH
Shal_1451	0	13	2.190674	flagellum-specific ATP synthase
Shal_1452	2	15	0.572255	flagellar export protein FliJ
Shal_1453	1	15	0.282376	flagellar hook-length control protein
Shal_1454	2	21	-0.918450	flagellar basal body-associated protein FliL
Shal_1455	3	20	-0.277721	flagellar motor switch protein FliM
Shal_1456	2	22	0.108608	flagellar motor switch protein
Shal_1457	1	18	1.585357	flagellar biosynthesis protein FliO
Shal_1458	0	17	1.364366	flagellar biosynthesis protein FliP
Shal_1459	-1	18	1.551770	flagellar biosynthetic protein FliQ
Shal_1460	-1	18	1.613298	flagellar biosynthesis protein FliR
Shal_1461	-1	17	1.311020	flagellar biosynthesis protein FlhB
Shal_1462	-1	16	0.914239	flagellar biosynthesis protein FlhA
Shal_1463	-1	16	0.076615	flagellar biosynthesis regulator FlhF
Shal_1464	-1	14	0.423480	cobyrinic acid ac-diamide synthase
Shal_1465	-1	15	0.170121	flagellar biosynthesis sigma factor
Shal_1466	-2	15	-0.091289	response regulator receiver protein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1425	HTHFIS	64	2e-13	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 63.7 bits (155), Expect = 2e-13
Identities = 23/128 (17%), Positives = 54/128 (42%), Gaps = 12/128 (9%)

Query: 180 HIMVIDDSSVARKQIIRALTSLDLQIDTAKDGKEALEKLRAMSAELEDVSTEIPLIISDI 239
I+V DD + R + +AL+ + + + A + L+++D+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGD---------LVVTDV 55

Query: 240 EMPEMDGYTLTAEIRDDPKLKNIKVVLHTSLSGVFNQAMVQKVGANDFIAK-FNPDELAA 298
MP+ + + L I+ ++ V++ ++ + + GA D++ K F+ EL
Sbjct: 56 VMPDENAFDLLPRIKK--ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIG 113

Query: 299 AVNKHLSL 306
+ + L+
Sbjct: 114 IIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1428	FLGHOOKAP1	34	2e-04	Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 33.8 bits (77), Expect = 2e-04
Identities = 11/38 (28%), Positives = 18/38 (47%)

Query: 99 NVNVMEEMANMISASRSYQMNVQVTEAAKSMLQQTLRI 136
VN+ EE N+ + Y N QV + A ++ + I
Sbjct: 508 GVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545



Score = 26.5 bits (58), Expect = 0.046
Identities = 10/25 (40%), Positives = 17/25 (68%)

Query: 8 NVAGSGMSAQSVRLNTTASNIANAD 32
N A SG++A LNT ++NI++ +
Sbjct: 5 NNAMSGLNAAQAALNTASNNISSYN 29


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1430	FLGHOOKAP1	41	5e-06	Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.5 bits (97), Expect = 5e-06
Identities = 28/94 (29%), Positives = 41/94 (43%), Gaps = 7/94 (7%)

Query: 2 SFNIALSGISSAQKDLNTTANNIANVNTTGFKESRAEFADVYASSIFANSKTTVGGGVAT 61
N A+SG+++AQ LNT +NNI++ N G+ A S++ A VG GV
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQA-NSTLGAGG--WVGNGVYV 59

Query: 62 SQVAQQFHQGSMAFTNNSLDMAINGGGFFVTSSE 95
S V +++ F N L A E
Sbjct: 60 SGVQREYDA----FITNQLRAAQTQSSGLTARYE 89



Score = 38.0 bits (88), Expect = 6e-05
Identities = 12/49 (24%), Positives = 24/49 (48%)

Query: 405 SIRSSALEQSNVDLTTELVDLISAQRNFQANSRTLEVNNTLQQTILQIR 453
+ + S V+L E +L Q+ + AN++ L+ N + ++ IR
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1431	FLGHOOKAP1	29	0.018	Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 29.2 bits (65), Expect = 0.018
Identities = 10/34 (29%), Positives = 18/34 (52%)

Query: 4 LLYVAMSGAKQNMNSLAVSANNLANANTDGFKSS 37
L+ AMSG +L ++NN+++ N G+
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQ 36


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1432	FLGHOOKAP1	43	5e-07	Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.4 bits (102), Expect = 5e-07
Identities = 19/117 (16%), Positives = 39/117 (33%), Gaps = 4/117 (3%)

Query: 147 ATSITVSAEGEVSVKTPGNAENQVVGQLAIADFINPSGLDPMGQNLYLETG---ASGTPI 203
I +++E + N + + Q + +L + G A+
Sbjct: 429 EAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGNKTATLKTS 488

Query: 204 QGTASLDGMGAIRQGALETSNVNVTEELVNLIESQRIYEMNSKVISAVDQMLSYVNQ 260
T + + S VN+ EE NL Q+ Y N++V+ + + +
Sbjct: 489 SATQGNV-VTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALIN 544



Score = 35.3 bits (81), Expect = 2e-04
Identities = 8/36 (22%), Positives = 21/36 (58%)

Query: 5 LWISKTGLDAQQTDISVISNNVANASTVGFKKSRAV 40
+ + +GL+A Q ++ SNN+++ + G+ + +
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI 39


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1433	FLGLRINGFLGH	144	3e-45	Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 144 bits (365), Expect = 3e-45
Identities = 73/219 (33%), Positives = 106/219 (48%), Gaps = 15/219 (6%)

Query: 8 LFMAALSGCNSTNGKPIADDPYYAPVYPEAPPTKIAATGSMYQDSQ-----ASSLYSDIK 62
L + +L+GC P+ A P P A GS++Q +Q L+ D +
Sbjct: 14 LLVLSLTGCAWIPSTPLVQGATSAQPVPGPTP---VANGSIFQSAQPINYGYQPLFEDRR 70

Query: 63 ALKVGDIITVYLMEQTQAKKSANNEISK----GTDLSLDPIYAGGGNVTIGGNPIDLRYK 118
+GD +T+ L E A KS++ S+ P Y G G D+
Sbjct: 71 PRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLF---GNARADVEAS 127

Query: 119 DSMNTKRESDADQSNSLSGSISANVMQVLNNGNLVIRGEKWISINNGDEFVRITGIVRAQ 178
+ A+ SN+ SG+++ V QVL NGNL + GEK I+IN G EF+R +G+V +
Sbjct: 128 GGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVNPR 187

Query: 179 DIRPDNTIDSQRVANARIQYSGTGTFAEVQKVGWLASFF 217
I NT+ S +VA+ARI+Y G G E Q +GWL FF
Sbjct: 188 TISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFF 226


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1434	FLGPRINGFLGI	386	e-136	Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 386 bits (994), Expect = e-136
Identities = 166/367 (45%), Positives = 226/367 (61%), Gaps = 14/367 (3%)

Query: 6 LVLLCAMLVLTTPVHAQ--RIKDIANVQGVRSNQLIGYGLVVGLPGTGEKTR---YTEQT 60
LV + T P A RIKDIA++Q R NQLIGYGLVVGL GTG+ R +TEQ+
Sbjct: 11 LVFSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQS 70

Query: 61 FKTMLKNFGINLPDNFRPKIKNIAVVAVSADMPPFIKPGQTLDVTVSSLGEAKSLRGGML 120
+ ML+N GI + KNIA V V+A++PPF PG +DVTVSSLG+A SLRGG L
Sbjct: 71 MRAMLQNLGITTQGG-QSNAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNL 129

Query: 121 LQTFLKGVDGNVYAIAQGSMVVSGFSAEGMDGSKVIQNTPTVGRIPNGAIIERTVATPFS 180
+ T L G DG +YA+AQG+++V+GFSA+G D + + Q T R+PNGAIIER + + F
Sbjct: 130 IMTSLSGADGQIYAVAQGALIVNGFSAQG-DAATLTQGVTTSARVPNGAIIERELPSKFK 188

Query: 181 TGDHLTFNLRRADFSTAKRLADSINDL----LGPGMARPLDAASVQVSAPRDVSQRVSFL 236
+L LR DFSTA R+AD +N G +A P D+ + V PR V+ +
Sbjct: 189 DSVNLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPR-VADLTRLM 247

Query: 237 ATLENIEVEPAAESAKVIVNSRTGTIVVGQNVKLLPAAVTHGGLTVTIAEATQVSQPNPF 296
A +EN+ VE AKV++N RTGTIV+G +V++ AV++G LTV + E+ QV QP PF
Sbjct: 248 AEIENLTVETDTP-AKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPAPF 306

Query: 297 GNGQTVVTTDSTIDVAEEDSRMFMFNPGTTLDELVRAVNLVGAAPSDVLAILEALKMAGA 356
GQT V + I +E S++ + G L LV +N +G ++AIL+ +K AGA
Sbjct: 307 SRGQTAVQPQTDIMAMQEGSKVAIVE-GPDLRTLVAGLNSIGLKADGIIAILQGIKSAGA 365

Query: 357 LHGELII 363
L EL++
Sbjct: 366 LQAELVL 372


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1435	FLGFLGJ	196	2e-62	Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 196 bits (499), Expect = 2e-62
Identities = 106/349 (30%), Positives = 173/349 (49%), Gaps = 63/349 (18%)

Query: 7 ASQFLDLGGLDSLRSRAQKDETSALKEVAQQFEGIFVQMLMKSMRDANAVFESDSPMNSQ 66
AS D L+ L+++A +D + ++ VA+Q EG+FVQM++KSMRDA D +S+
Sbjct: 9 ASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALP---KDGLFSSE 65

Query: 67 YTKFYEQMHDQQMSLNLSGEGMLGLADLMVQQLDPANSPMTPASVLRGDINGGSKAAALT 126
+T+ Y M+DQQ++ ++ LGLA++MV+Q+ P
Sbjct: 66 HTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQ----------------------P 103

Query: 127 MGKPDTLQMPTSRVSGSDAMAAQAAPFIAQSQIQAQSQAQVTAAQTALTSSAQSQTLDSV 186
+ + T P F ++ ++ Q+QA Q A+ +
Sbjct: 104 LPEESTPAAPMK--------------FPLETVVRYQNQALSQLVQKAVPRNYDDSL---- 145

Query: 187 LSGKVLPSAAVTADKSQANFTSQDEFVARLYPHAQKAAQTLGTTPELLIAQSALETGWGQ 246
S+ F+A+L AQ A+Q G L++AQ+ALE+GWGQ
Sbjct: 146 ------------------PGDSKA-FLAQLSLPAQLASQQSGVPHHLILAQAALESGWGQ 186

Query: 247 KMVKGHQGQQSNNLFNIKADNRWQGEHASVSTLEYEQGIAVKQRANFRVYDDIGQSFNDF 306
+ ++ G+ S NLF +KA W+G ++T EYE G A K +A FRVY ++ +D+
Sbjct: 187 RQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYLEALSDY 246

Query: 307 VSFVSNGERYQDAMKQAANPQAFIRSLQDAGYATDPKYADKVIQVMKTI 355
V ++ RY A+ AA+ + ++LQDAGYATDP YA K+ +++ +
Sbjct: 247 VGLLTRNPRYA-AVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQM 294


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1436	FLGHOOKAP1	224	1e-67	Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 224 bits (572), Expect = 1e-67
Identities = 121/454 (26%), Positives = 199/454 (43%), Gaps = 19/454 (4%)

Query: 4 DLMNIARTGVLASQSQLAITSNNIANANTAGYNRQVVSQSALDSQRMGNEFYGAGTYVSD 63
L+N A +G+ A+Q+ L SNNI++ N AGY RQ + +S + G G YVS
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSG 61

Query: 64 VKRVYNDYATRELRIGSTAVSESQTTFSKMSQLDQLFSQMGKAVPQGLNDFFASMNALAD 123
V+R Y+ + T +LR T S + +MS++D + S ++ + DFF S+ L
Sbjct: 62 VQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLVS 121

Query: 124 IPNDMGLRGSFLTSANQLANSVNQMQSHLDSQMTQTNDQISAVTDRINEISNELGSINRE 183
D R + + + L N +L Q Q N I A D+IN + ++ S+N +
Sbjct: 122 NAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLNDQ 181

Query: 184 LMKSQGQDM-----QLLDKQDALILELSEYASVNVIPLESGAKSIMLGGSMMLVSGEMSM 238
+ + G LLD++D L+ EL++ V V + G +I + LV G +
Sbjct: 182 ISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTAR 241

Query: 239 QVGTAPGDPYPNELQITAQSGS--KSMIVDASKMGGQLGALVNYRDETLIPSQMELGQFA 296
Q+ P P+ + G+ I + G LG ++ +R + L ++ LGQ A
Sbjct: 242 QLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQLA 301

Query: 297 LGVSDAFNQAQAQGFDLNGQVGANIFTDINDPSMQLGRVGALSSNTGTANLSVNIDDVGS 356
L ++AFN GFD NG G + F + V + N G + + D +
Sbjct: 302 LAFAEAFNTQHKAGFDANGDAGEDFFA------IGKPAVLQNTKNKGDVAIGATVTDASA 355

Query: 357 LTGSSYELKF--TAPSTYELKDAVSGNITPLTLNGTKLEGANGFSINIDAGALASGDTFE 414
+ + Y++ F L + +TP +G G A D+F
Sbjct: 356 VLATDYKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFT----GTPAVNDSFT 411

Query: 415 IRPTSSAAVSLSVEMTDAKGIAAAGAKITADAAN 448
++P S A V++ V +TD IA A + D+ N
Sbjct: 412 LKPVSDAIVNMDVLITDEAKIAMASEEDAGDSDN 445



Score = 95.8 bits (238), Expect = 6e-23
Identities = 38/104 (36%), Positives = 58/104 (55%)

Query: 533 AEGDNTNMVSMAKLSEAKLMNGGKTTLTDVFENTTIDVGSKTKSAEISMGSAEAIYKQAY 592
+ DN N ++ L GG + D + + D+G+KT + + S + + Q
Sbjct: 441 GDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGNKTATLKTSSATQGNVVTQLS 500

Query: 593 TRVQSVSGVNLDEEAANLMRFQQSYQASARIMTTANEIFNTLFS 636
+ QS+SGVNLDEE NL RFQQ Y A+A+++ TAN IF+ L +
Sbjct: 501 NQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALIN 544


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1437	FLAGELLIN	51	5e-09	Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 50.8 bits (121), Expect = 5e-09
Identities = 37/145 (25%), Positives = 66/145 (45%), Gaps = 4/145 (2%)

Query: 1 MRIST--AQMFHQNINSVTQKQSQTSQIIEQLSTGKRVNTAGDDPVAAAGINNLNQKNSV 58
I+T + QN + +Q ++ E+LS+G R+N+A DD A N
Sbjct: 2 QVINTNSLSLLTQNNLNKSQSSLSSAI--ERLSSGLRINSAKDDAAGQAIANRFTSNIKG 59

Query: 59 VDQFLKNIDYAKNRLSISESKLGSAATLASSVREQVLRAVNGSLTDSDRQTVADEMRGSL 118
+ Q +N + + +E L VRE ++A NG+ +DSD +++ DE++ L
Sbjct: 60 LTQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRL 119

Query: 119 EELMAIANSKDESGNYLFGGFNTGS 143
EE+ ++N +G + N
Sbjct: 120 EEIDRVSNQTQFNGVKVLSQDNQMK 144


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1438	FLAGELLIN	135	9e-39	Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 135 bits (341), Expect = 9e-39
Identities = 95/271 (35%), Positives = 130/271 (47%), Gaps = 10/271 (3%)

Query: 2 AITVNTNVTSMKAQKNLNTSSQGLATSMERLSSGLRINSAKDDAAGLAISNRLDSQVRGL 61
A +NTN S+ Q NLN S L++++ERLSSGLRINSAKDDAAG AI+NR S ++GL
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 DVGMRNANDAISIAQIAEGAMQEQTNMLQRMRDLTIQASNGANSTDDIASIQKEIDALAT 121
RNAND ISIAQ EGA+ E N LQR+R+L++QA+NG NS D+ SIQ EI
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 EITSIGNSTAFGNTKLLDGGFSAGKSFQVGHQKGEDISVKVSKVNSSSLAVGGL------ 175
EI + N T F K+L QVG GE I++ + K++ SL + G
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQ--MKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPK 178

Query: 176 --TLTNSANRQSALTKIDAAIKTIDTQRADLGAVQNRLAHNISNSANTQANVADAKSRIV 233
T+ + + +T D + R D+ + + A
Sbjct: 179 EATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTT 238

Query: 234 DVDFAKETATMTKNQVLQQTGSAMLAQANQL 264
D + K + A A +
Sbjct: 239 DDAENNTAVDLFKTTKSTAGTAEAKAIAGAI 269



Score = 85.1 bits (210), Expect = 7e-21
Identities = 57/213 (26%), Positives = 94/213 (44%), Gaps = 4/213 (1%)

Query: 60 GLDVGMRNANDAISIAQIAEGAMQEQTNMLQRMRDLTIQASNGANSTDDIASIQKEIDAL 119
+ + +++A I GA LQ +++ NG + DD + +
Sbjct: 298 KVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSD 357

Query: 120 ATEITSIGNSTAFGNTKLLDGGFSAGKSFQVGHQKGEDISVKVSKVNSSSLAVGGLTLTN 179
++ + +AG + + + S +
Sbjct: 358 LEANNAVKGESKITVNGAEYTANAAGDKVTLAGK----TMFIDKTASGVSTLINEDAAAA 413

Query: 180 SANRQSALTKIDAAIKTIDTQRADLGAVQNRLAHNISNSANTQANVADAKSRIVDVDFAK 239
+ + L ID+A+ +D R+ LGA+QNR I+N NT N+ A+SRI D D+A
Sbjct: 414 KKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYAT 473

Query: 240 ETATMTKNQVLQQTGSAMLAQANQLPQVALSLL 272
E + M+K Q+LQQ G+++LAQANQ+PQ LSLL
Sbjct: 474 EVSNMSKAQILQQAGTSVLAQANQVPQNVLSLL 506


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1439	FLAGELLIN	132	1e-37	Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 132 bits (333), Expect = 1e-37
Identities = 95/271 (35%), Positives = 134/271 (49%), Gaps = 10/271 (3%)

Query: 2 AITVNTNVTSMKAQKNLNASSQNLATSMERLSSGLRINSAKDDAAGLAISNRLDSQVRGL 61
A +NTN S+ Q NLN S +L++++ERLSSGLRINSAKDDAAG AI+NR S ++GL
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 DVGMRNANDAISIAQISEGAMQEQTNMLQRMRDLTIQASNGANSSDDIASIKKEIDALAG 121
RNAND ISIAQ +EGA+ E N LQR+R+L++QA+NG NS D+ SI+ EI
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 EITSIGNTTAFGNTKLMNGDFSAGKSFQVGHQKGEDITVKVQKVNASSLAVGSL------ 175
EI + N T F K+++ D QVG GE IT+ +QK++ SL +
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQ--MKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPK 178

Query: 176 --TLTNSANRQSALTKIDAAIKTIDSQRADLGAVQNRLSHNISNSANTQANVADAKSRIV 233
T+ + + +T D + R D+ + + A
Sbjct: 179 EATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTT 238

Query: 234 DVDFAKETAAMTKNQVLQQTGSAMLAQANQL 264
D + K + A A +
Sbjct: 239 DDAENNTAVDLFKTTKSTAGTAEAKAIAGAI 269



Score = 87.0 bits (215), Expect = 2e-21
Identities = 57/213 (26%), Positives = 97/213 (45%), Gaps = 4/213 (1%)

Query: 60 GLDVGMRNANDAISIAQISEGAMQEQTNMLQRMRDLTIQASNGANSSDDIASIKKEIDAL 119
+ + +++A I+ GA LQ +++ NG + DD + +
Sbjct: 298 KVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSD 357

Query: 120 AGEITSIGNTTAFGNTKLMNGDFSAGKSFQVGHQKGEDITVKVQKVNASSLAVGSLTLTN 179
++ + +AG + + + + S +
Sbjct: 358 LEANNAVKGESKITVNGAEYTANAAGDKVTLAGK----TMFIDKTASGVSTLINEDAAAA 413

Query: 180 SANRQSALTKIDAAIKTIDSQRADLGAVQNRLSHNISNSANTQANVADAKSRIVDVDFAK 239
+ + L ID+A+ +D+ R+ LGA+QNR I+N NT N+ A+SRI D D+A
Sbjct: 414 KKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYAT 473

Query: 240 ETAAMTKNQVLQQTGSAMLAQANQLPQVALSLL 272
E + M+K Q+LQQ G+++LAQANQ+PQ LSLL
Sbjct: 474 EVSNMSKAQILQQAGTSVLAQANQVPQNVLSLL 506


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1441	SHIGARICIN	29	0.025	Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 29.4 bits (66), Expect = 0.025
Identities = 10/59 (16%), Positives = 19/59 (32%), Gaps = 6/59 (10%)

Query: 276 NTLVESISNLSKYDTEKEEAAALQGDSMI------RSIESQMRNMISNRVDVDGETIAL 328
L +I+ L Y+ +A + + IE Q+ + I+L
Sbjct: 153 PALDSAITTLFYYNANSAASALMVLIQSTSEAARYKFIEQQIGKRVDKTFLPSLAIISL 211


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1444	HTHFIS	436	e-152	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 436 bits (1124), Expect = e-152
Identities = 166/478 (34%), Positives = 256/478 (53%), Gaps = 17/478 (3%)

Query: 7 RILLVGNQSERINRLSCVFEFLGEQVELIDFDKLELHTKQTRFRAIVLPSENQPK----E 62
IL+ + + L+ G V + +V+ P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 63 LIQSLTGMLPWQPFLTLGERDDIKA------SNILGCIEEPLNYPQLTELLHFCQVYGQV 116
L+ + P P L + ++ + +P + +L ++ +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 117 KRPEIPTSANQTKLFRSLVGRSEGIASVRHLINQVSGSEATVLVLGQSGTGKEVVARNIH 176
+ ++ + LVGRS + + ++ ++ ++ T+++ G+SGTGKE+VAR +H
Sbjct: 125 RPSKLEDDSQD---GMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALH 181

Query: 177 YISDRRDGPFIPVNCGAIPAELLESELFGHEKGSFTGAISARKGRFELAEKGTLFLDEIG 236
RR+GPF+ +N AIP +L+ESELFGHEKG+FTGA + GRFE AE GTLFLDEIG
Sbjct: 182 DYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIG 241

Query: 237 DMPLQMQVKLLRVLQERMFERVGGSKSIAADVRVVAATHRNLETMIEEGGFREDLYYRLN 296
DMP+ Q +LLRVLQ+ + VGG I +DVR+VAAT+++L+ I +G FREDLYYRLN
Sbjct: 242 DMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLN 301

Query: 297 VFPIEMPALCERKEDIPLLLQELVSRVYNEGRGRVRFTQRAIESLKEHAWSGNVRELSNL 356
V P+ +P L +R EDIP L++ V + EG RF Q A+E +K H W GNVREL NL
Sbjct: 302 VVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELENL 361

Query: 357 VERLTILYPGGLVDVNDLPIKYRHIDVPEYCVEISEEQQERDALASIFNDEEPIEIPETR 416
V RLT LYP ++ + + R ++P+ +E + + +++ EE +
Sbjct: 362 VRRLTALYPQDVITREIIENELRS-EIPDSPIEKAAARSGSLSISQAV--EENMRQYFAS 418

Query: 417 FPSELPPEGVNLKDLLAELEIDMIRQALDQQDNVVARAAEMLGIRRTTLVEKMRKYGM 474
F LPP G+ +LAE+E +I AL +AA++LG+ R TL +K+R+ G+
Sbjct: 419 FGDALPPSGL-YDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1446	HTHFIS	467	e-164	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 467 bits (1202), Expect = e-164
Identities = 170/483 (35%), Positives = 250/483 (51%), Gaps = 39/483 (8%)

Query: 1 MSEGKLLLVEDDASLREALLDTLMLAHYDCVDVGSAEEAILALKADRFDLVISDVQMEGI 60
M+ +L+ +DDA++R L L A YD +A + A DLV++DV M
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 GGIGLLNYLQQHQPKLPVLLMTAYATIDNAVNAMKLGAVDYLAKPFSPEVLLNQVSRYL- 119
LL +++ +P LPVL+M+A T A+ A + GA DYL KPF L+ + R L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 ---------PAKAAEGTPIVADEKSL-ALLALAQRVAASDASVMIMGPSGSGKEVLARFI 169
+ +G P+V ++ + + R+ +D ++MI G SG+GKE++AR +
Sbjct: 121 EPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180

Query: 170 HQNSQRAEQPFIAINCAAIPENMLEATLFGYEKGAFTGAYQACPGKFEQAQGGTLLLDEI 229
H +R PF+AIN AAIP +++E+ LFG+EKGAFTGA G+FEQA+GGTL LDEI
Sbjct: 181 HDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEI 240

Query: 230 SEMEVGLQAKLLRVLQEREVERLGGRKTIKLDVRVLATSNRDLKAMAASGEFREDLYYRI 289
+M + Q +LLRVLQ+ E +GGR I+ DVR++A +N+DLK G FREDLYYR+
Sbjct: 241 GDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRL 300

Query: 290 NVFPLTWPSLNQRPADILPLARHLLQRHALIANRSELPLLSECASRRLLTHRWPGNVREL 349
NV PL P L R DI L RH +Q+ ++ + A + H WPGNVREL
Sbjct: 301 NVVPLRLPPLRDRAEDIPDLVRHFVQQAE--KEGLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 350 DNVVQRALIMCGSHEITAADIIID--SLELDFNLESTPLAD------------------- 388
+N+V+R + IT I + S D +E
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 389 --NKPTELDGLGDELKAQEHVIILETLTQCNGSRKLVAEKLGISARTLRYKMAKMRDSGI 446
+ L E+ +IL LT G++ A+ LG++ TLR K+ ++ G+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIREL---GV 475

Query: 447 QIP 449
+
Sbjct: 476 SVY 478


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1447	FLGHOOKFLIE	54	6e-13	Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 53.9 bits (129), Expect = 6e-13
Identities = 28/71 (39%), Positives = 44/71 (61%)

Query: 40 FGQLLSQAVGNVSELQSNASNLATRLDMGDTTVTLSDTVIAREKSSVAFEATVQVRNKLV 99
F L A+ +S+ Q+ A A + +G+ V L+D + +K+SV+ + +QVRNKLV
Sbjct: 33 FAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQVRNKLV 92

Query: 100 EAYKEIMSMPV 110
AY+E+MSM V
Sbjct: 93 AAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1448	FLGMRINGFLIF	301	1e-97	Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 301 bits (773), Expect = 1e-97
Identities = 150/562 (26%), Positives = 257/562 (45%), Gaps = 47/562 (8%)

Query: 30 LGGVDMLRQVTMILALAICLALAVFVMIWAQEPEYRPL-GQMSTAEMVQVLDVLDNNQVK 88
L + ++ +I+A + +A+ V +++WA+ P+YR L +S + ++ L +
Sbjct: 16 LNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIP 75

Query: 89 YEIQGD--VVKVPEDKFQDVRMLLSREGLDNTDANNDFLNKDSGFGVSQRMEQARLKHSQ 146
Y ++VP DK ++R+ L+++GL A L FG+SQ EQ + +
Sbjct: 76 YRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQVNYQRAL 135

Query: 147 EQNLARIIEELKSVTRAKVILALPKENVFARNRSKPSATVVVSTRRS-GLSQEEVDSIVD 205
E LAR IE L V A+V LA+PK ++F R + PSA+V V+ L + ++ ++V
Sbjct: 136 EGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALDEGQISAVVH 195

Query: 206 IVASAVHNLEPNKVTVTDANGRLLNSGTQDGSSAIARRELEIVQQKESEYRNKVESILMP 265
+V+SAV L P VT+ D +G LL G +L+ ES + ++E+IL P
Sbjct: 196 LVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDL-NDAQLKFANDVESRIQRRIEAILSP 254

Query: 266 ILGPENFTSQVDVSMDFTAVEQTAKRYNPDLPSLRSEMVVENNS-----NGGTSGGIPGA 320
I+G N +QV +DF EQT + Y+P+ + ++ + + G GG+PGA
Sbjct: 255 IVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPGGVPGA 314

Query: 321 LSNQPP---------------MASDIPQEVNAEETVTRNSGSTHKEATRNFELDTTISHT 365
LSNQP A + PQ + + + ST + T N+E+D TI HT
Sbjct: 315 LSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSNYEVDRTIRHT 374

Query: 366 RQQIGTLRRVSVSVAVDFKNAPMAEDGSVNRVPRTEQELANIRRLLEGAVGFNTQRGDVI 425
+ +G + R+SV+V V++K + +P T ++ I L A+GF+ +RGD +
Sbjct: 375 KMNVGDIERLSVAVVVNYKTLADGKP-----LPLTADQMKQIEDLTREAMGFSDKRGDTL 429

Query: 426 EVVTVPFMDQLIEAAPAPELWEQPWFWRVLRLVLGALVILVLILAVVRPMLKKLVYPDSV 485
VV PF + W+Q F L L++LV+ + R ++ +
Sbjct: 430 NVVNSPF-SAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLTRRVE 488

Query: 486 NMPDEPQRGGELAEIEDQYAADTLGMLQRPEAEYSYADDGSIL---IPNLHKDDDMIKAI 542
++ E E+ E + D + + M + I
Sbjct: 489 EAKAAQEQAQVRQETEE-------------AVEVRLSKDEQLQQRRANQRLGAEVMSQRI 535

Query: 543 RALVANEPELSTQVVKNWLLED 564
R + N+P + V++ W+ D
Sbjct: 536 REMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1449	FLGMOTORFLIG	287	1e-97	Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 287 bits (736), Expect = 1e-97
Identities = 108/341 (31%), Positives = 187/341 (54%)

Query: 8 EAKTAPGFKPSELNGIEKTAILLLSLSESDAASILKYLEPKQVQKVGMAMAAMRDFGQEK 67
E K S L G +K AILL+S+ ++ + KYL ++++ + +A + E
Sbjct: 3 EKKEKEILDVSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSEL 62

Query: 68 VIGVHKLFLDDIQKYSSIGFNSEEFVRKALTAALGEDKAGNLIEQIIMGSGAKGLDSLKW 127
V F + + I ++ R+ L +LG KA ++I + ++ + ++
Sbjct: 63 KDNVLLEFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRR 122

Query: 128 MDARQVATIIQNEHPQIQTIVLSYLEPDQAAEIFAQFPENTRLDLMMRIANLEEVQPAAL 187
D + IQ EHPQ ++LSYL+P +A+ I + P + ++ RIA ++ P +
Sbjct: 123 ADPANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVV 182

Query: 188 QELNDIMEKQFAGQGGAQAAKMGGLKAAANIMNYLDTGVESHLMETMRESDEEMAQQIQD 247
+E+ ++EK+ A GG+ I+N D E ++E++ E D E+A++I+
Sbjct: 183 REVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKK 242

Query: 248 LMFVFENLSDVDDMGIQVLLREVQQDVLMKALKGADEQLKEKLLGNMSKRAAELLRDDLE 307
MFVFE++ +DD IQ +LRE+ L KALK D ++EK+ NMSKRAA +L++D+E
Sbjct: 243 KMFVFEDIVLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDME 302

Query: 308 AMGPIRISEVEVAQKEILSIARRLSDSGEIMLGGGGGEEFL 348
+GP R +VE +Q++I+S+ R+L + GEI++ GG E+ L
Sbjct: 303 FLGPTRRKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDVL 343


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1450	FLGFLIH	80	1e-19	Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 79.8 bits (196), Expect = 1e-19
Identities = 54/198 (27%), Positives = 97/198 (48%), Gaps = 4/198 (2%)

Query: 50 PLETETILPPTLSEIEDIRAHAEQEGFA---EGQEKGFTEGLEKGRLEGLEQGHGEGYSQ 106
E I+ P + IE+ EQ+ + E+G+ G+ +GR +G +QG+ EG +Q
Sbjct: 19 QAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQEGLAQ 78

Query: 107 GQQQGYEEGLKTAAEMLQRFENLLGQFEAPLALLDTEIEKELLTTSMTLAKAVIGHELKT 166
G +QG E A + R + L+ +F+ L LD+ I L+ ++ A+ VIG
Sbjct: 79 GLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVIGQTPTV 138

Query: 167 YPEHILSALRQGVDSLPIKEQRVNIRVTPSDEALIGELYSQTQLERNRWQIEADPSLSPG 226
++ ++Q + P+ + +RV P D + ++ T L + W++ DP+L PG
Sbjct: 139 DNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT-LSLHGWRLRGDPTLHPG 197

Query: 227 DCIIDSVRSHIDMTVETR 244
C + + +D +V TR
Sbjct: 198 GCKVSADEGDLDASVATR 215


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1452	FLGFLIJ	40	7e-07	Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 39.8 bits (92), Expect = 7e-07
Identities = 38/145 (26%), Positives = 71/145 (48%)

Query: 1 MAKPDPLHMVLKLTQDAEEQASLQLRSAQLELQRRQNQLEALQNYRLDYMKQMEQQQGQS 60
MA+ L + L + E A+ L + Q+ + QL+ L +Y+ +Y +
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 ISASHYHQFHQFVRQVDAAIVQQVNAVKDADNQRQHRQVYWQEKQQKRKAVELLLANKAE 120
I+++ + + QF++ ++ AI Q + + W+EK+Q+ +A + L ++
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 KAKLAELRAEQKMVDEFASQQFYRK 145
A LAE R +QK +DEFA + RK
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRK 145


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1453	FLGHOOKFLIK	46	3e-07	Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 45.6 bits (107), Expect = 3e-07
Identities = 30/109 (27%), Positives = 52/109 (47%), Gaps = 1/109 (0%)

Query: 376 MNQQLITMVSNGVQHAEIRLDPPELGHMMVRIQVQGDTTQVQFQVSQHQTRDLVEQAMPR 435
++Q + G Q AE+RL P +LG + + ++V + Q+Q R +E A+P
Sbjct: 244 LSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAALPV 303

Query: 436 LREMLAEQGMQLTDGQVSQGSGGNSDGEQGSGRGNDNGMAETDEISAEE 484
LR LAE G+QL +S G + + S + A + ++ E+
Sbjct: 304 LRTQLAESGIQLGQSNIS-GESFSGQQQAASQQQQSQRTANHEPLAGED 351


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1455	FLGMOTORFLIM	249	4e-83	Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 249 bits (638), Expect = 4e-83
Identities = 86/326 (26%), Positives = 168/326 (51%), Gaps = 11/326 (3%)

Query: 1 MSDLLSQDEIDALLHGVDDVEEDIID----DNELDARSYDFSSQDRIVRGRMPTLEIVNE 56
M+++LSQDEID LL + + I D + YDF D+ + +M TL +++E
Sbjct: 1 MTEVLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHE 60

Query: 57 RFARHLRISMFNMMRRAAEVSINGVQMLKFGEYVHTLFVPTSLNMVRFSPLKGTALITME 116
FAR S+ +R V + V L + E++ ++ P++L ++ PLKG A++ ++
Sbjct: 61 TFARLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVD 120

Query: 117 ARLVFILVDNFFGGDGRFHAKIEGREFTPTERRIVQLLLKIIFEDYKDAWAPVMDVQFDY 176
+ F ++D FGG G+ R+ T E +++ ++ I + +++W V+D++
Sbjct: 121 PSITFSIIDRLFGGTGQAAKVQ--RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRL 178

Query: 177 LDSEVNPAMANIVSPTEVVVVSSFHIEVDGGGGDFHITMPYSMIEPIRELLDAG--VQSD 234
E NP A IV P+E+VV+ + +V G + +PY IEPI L + S
Sbjct: 179 GQIETNPQFAQIVPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSV 238

Query: 235 TQDTDMRWSQALRDEIMDVDVGIDATIVEHKVTLKQVLEFKAGDVIPVE---LPEYIILK 291
+ + ++ LRD++ VD+ + A + +++++ +L + GD+I + + + +L
Sbjct: 239 RRSSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLS 298

Query: 292 VEDLPTYRCKMGRTKENLALKICEQI 317
+ + + C+ G + +A +I E+I
Sbjct: 299 IGNRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
Shal_1456	FLGMOTORFLIN	114	4e-36	Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 114 bits (287), Expect = 4e-36
Identities = 56/119 (47%), Positives = 81/119 (68%)

Query: 6 DDWASAMAEQAIDEAKAVEFDEFSNESAPLSEDEVSKLDAIMDIPVTISMEVGRSFINIR 65
D WA A+ EQ K+ F + +D IMDIPV +++E+GR+ + I+
Sbjct: 17 DLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDIPVKLTVELGRTRMTIK 76

Query: 66 NLLQLNQGSVVELDRIAGEPLDVMVNGTLIAHGEVVVVNDKFGIRLTDVISQTERIKKL 124
LL+L QGSVV LD +AGEPLD+++NG LIA GEVVVV DK+G+R+TD+I+ +ER+++L
Sbjct: 77 ELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGVRITDIITPSERMRRL 135


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1458  FLGBIOSNFLIP  264    1e-91    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 264 bits (676), Expect = 1e-91
Identities = 125/246 (50%), Positives = 181/246 (73%), Gaps = 3/246 (1%)

Query: 2 MKWILLLAGLALTFMAPTALAENGILPAVTVSTGADGSTQYSVTMQILLLMTALSFIPAM 61
M+ +L +A + L + P A A+ LP +T G +S+ +Q L+ +T+L+FIPA+
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQ---LPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAI 57

Query: 62 VIMLTSFTRIIVVLSILRQAIGLQQTPSNQVLIGISMFMTFFIMSPVFDKIYVQAVQPYI 121
++M+TSFTRII+V +LR A+G P NQVL+G+++F+TFFIMSPV DKIYV A QP+
Sbjct: 58 LLMMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFS 117

Query: 122 EQGMPLQDAFTTGQGPLKDFMLAQTRNSDLNTFIEISGYQNINEPEDAPMTVIIPAFITS 181
E+ + +Q+A G PL++FML QTR +DL F ++ + PE PM +++PA++TS
Sbjct: 118 EEKISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTS 177

Query: 182 ELKTAFQIGFMLFVPFLVLDLVVASILMAMGMMMLSPMIVSLPFKIMLFVLVDGWSLVMG 241
ELKTAFQIGF +F+PFL++DLV+AS+LMA+GMMM+ P ++LPFK+MLFVLVDGW L++G
Sbjct: 178 ELKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVG 237

Query: 242 TLANSF 247
+LA SF
Sbjct: 238 SLAQSF 243


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1459  TYPE3IMQPROT  42     7e-09    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 42.4 bits (100), Expect = 7e-09
Identities = 23/81 (28%), Positives = 44/81 (54%)

Query: 4 EALIDIFREALAVIVIIVSMIIVPGLIIGLVVAVFQAATSINEQTLSFLPRLLTTLLALM 63
+ L+ +AL +++I+ + IIGL+V +FQ T + EQTL F +LL L L
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 LMGHMLVQMMMDFFFQMVDMI 84
L+ ++++ + Q++ +
Sbjct: 62 LLSGWYGEVLLSYGRQVIFLA 82


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1460  TYPE3IMRPROT  121    3e-35    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 121 bits (304), Expect = 3e-35
Identities = 80/243 (32%), Positives = 139/243 (57%), Gaps = 1/243 (0%)

Query: 15 YLWPLTRISSMFMVMAVFGATTTPTRVRLLLSVAVTVAVAPVLPAMPTIDLFSLSAAFVT 74
Y WPL R+ ++ + + P RV+L L++ +T A+AP LPA +FS A ++
Sbjct: 16 YFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDVP-VFSFFALWLA 74

Query: 75 AQQIIIGVAMGFATLLLMQTFVLTGQIIGMQTSLGFASMVDPSSGQQTPVIGNFFLLLTT 134
QQI+IG+A+GF G+IIG+Q L FA+ VDP+S PV+ +L
Sbjct: 75 VQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDMLAL 134

Query: 135 VIFLAVDGHLLLIKMVIASFDSIPVSMQGLSLVSYRLFTEFVGYMFGAALTMSLSAIVAL 194
++FL +GHL LI +++ +F ++P+ + L+ ++ T+ +F L ++L I L
Sbjct: 135 LLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFLNGLMLALPLITLL 194

Query: 195 LTINLSFGVMTRASPQLNIFSIGFPVTMIAGLFILWLTLSPIMTHFDEVWREVQLLLCNV 254
LT+NL+ G++ R +PQL+IF IGFP+T+ G+ ++ + I + ++ E+ LL ++
Sbjct: 195 LTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCEHLFSEIFNLLADI 254

Query: 255 LEL 257
+
Sbjct: 255 ISE 257


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1461  TYPE3IMSPROT  332    e-115    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 332 bits (853), Expect = e-115
Identities = 106/363 (29%), Positives = 181/363 (49%), Gaps = 23/363 (6%)

Query: 7 SQEKTEEATSRKLQQAKDKGQVARSKELGTSAVLIAASVGLLMTGPDIAQAMHNIMTKLF 66
S EKTE+ T +K++ A+ KGQVA+SKE+ ++A+++A S L+ + +KL
Sbjct: 2 SGEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFE----HFSKLM 57

Query: 67 TMSRDEIFD------TKQMMNVWGVIGTELALPLMGFIVFLALIAFAGNVALGGISFSVK 120
+ ++ + + + NV L PL+ +A+ + +V G S +
Sbjct: 58 LIPAEQSYLPFSQALSYVVDNVLLEFFY-LCFPLLTVAALMAIAS---HVVQYGFLISGE 113

Query: 121 AFMPKASKMSPMAGFKRMFGVQALVELTKGIAKFSVVAITAFLLLSTYLNDILMLSQDHL 180
A P K++P+ G KR+F +++LVE K I K +++I ++++ L +L L
Sbjct: 114 AIKPDIKKINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPT--- 170

Query: 181 PGNIYHALDIVVWIFILLCAST---FLIV-VIDVPFQIWNHAKQLKMTKQEVKDEYKDTE 236
I ++ I L F+++ + D F+ + + K+LKM+K E+K EYK+ E
Sbjct: 171 -CGIECITPLLGQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEME 229

Query: 237 GKPEVKGRIRQLQHEMSQRRMMGEVPNADVIVVNPEHYAVAVKYDATRSTAPFVVAKGVD 296
G PE+K + RQ E+ R M V + V+V NP H A+ + Y + P V K D
Sbjct: 230 GSPEIKSKRRQFHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTD 289

Query: 297 EVAFKIREIAREHDVAIVSAPPLARAIFHTTKIDQEIPEGLFTAVAQVLAYVFQLR-QYQ 355
+R+IA E V I+ PLARA++ +D IP A A+VL ++ + + Q
Sbjct: 290 AQVQTVRKIAEEEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQNIEKQ 349

Query: 356 KGK 358
+
Sbjct: 350 HSE 352


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1466  HTHFIS        91     1e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 90.7 bits (225), Expect = 1e-24
Identities = 34/105 (32%), Positives = 51/105 (48%), Gaps = 3/105 (2%)

Query: 6 KILVVDDFSTMRRIIKNLLRDLGFNNTQEADDGSTALPMLQKGDFDFVVTDWNMPGMQGI 65
ILV DD + +R ++ L G++ + +T + GD D VVTD MP
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 66 DLLKAIRADDNLKHLPVLMVTAEAKREQIIAAAQAGVNGYVVKPF 110
DLL I+ LPVL+++A+ I A++ G Y+ KPF
Sbjct: 64 DLLPRIKKAR--PDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


77  Shal_1515  Shal_1524  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_1515  0       17       -0.197605  response regulator receiver modulated serine
Shal_1516  -2      15       0.232731   phosphonate ABC transporter periplasmic
Shal_1517  -2      16       0.478182   multi-sensor hybrid histidine kinase
Shal_1518  -1      15       1.201429   Fis family two component sigma54 specific
Shal_1519  0       14       1.767481   putative hydrolase
Shal_1520  -1      14       2.008728   dTDP-4-dehydrorhamnose reductase
Shal_1521  1       14       1.593780   hypothetical protein
Shal_1522  1       17       1.162826   glycoside hydrolase family 3
Shal_1523  1       17       0.892393   3'(2'),5'-bisphosphate nucleotidase
Shal_1524  0       17       0.279044   fructokinase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1515  HTHFIS        84     1e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 83.7 bits (207), Expect = 1e-19
Identities = 35/168 (20%), Positives = 70/168 (41%), Gaps = 11/168 (6%)

Query: 14 KIILVEDSDSERCLLLNLLDSMGFTAQGFSNAQGAIKHLQNEQVDMVITDWMMPQISGIE 73
I++ +D + R +L L G+ + SNA + + D+V+TD +MP + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 74 LCKTIKSMPCSPYTILLTGNNQNAHMIEGIESGADDFIAKPFHSGVLKVRILAGLRIIEM 133
L IK ++++ N I+ E GA D++ KPF +I +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT----------ELIGI 114

Query: 134 QQK-LASHNKTLNQMLVKEQDYLNCLRQDLSLAAQLQRALLPKNSDLT 180
+ LA + +++ QD + + + ++ + +DLT
Sbjct: 115 IGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLT 162


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1517  HTHFIS        79     5e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 78.7 bits (194), Expect = 5e-17
Identities = 26/112 (23%), Positives = 55/112 (49%)

Query: 878 ILLAEDSPANQIVASALLNKSGFKVEIANNGIEALKMASTKEYGLILMDMRMPEMDGIEA 937
IL+A+D A + V + L+++G+ V I +N + + + L++ D+ MP+ + +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 938 TQKILQLDPSQIVIAMTANVQKEDVELCMSAGMKDFVPKPVNRENLVNVVNQ 989
+I + P V+ M+A G D++PKP + L+ ++ +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1518  HTHFIS        489    e-172    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 489 bits (1260), Expect = e-172
Identities = 164/486 (33%), Positives = 264/486 (54%), Gaps = 18/486 (3%)

Query: 5 SSFTALLVEDSMSLGALYTEYLRADGARVTHVNHGEDALNELKRWQPDLLVLDIQLPDMS 64
+ T L+ +D ++ + + L G V ++ + DL+V D+ +PD +
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 65 GMDILETVQQQYPDITVIMITAHGTIDIAVDAMRSGAFDFLVKPFDAKRLSITVRNALKQ 124
D+L +++ PD+ V++++A T A+ A GA+D+L KPFD L + AL +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 125 RQLVNLVAKYESSLPKPHYMGFVGESLAMQTVYKTIDCVANSKASAFIIGESGTGKEVCA 184
+ + M VG S AMQ +Y+ + + + + I GESGTGKE+ A
Sbjct: 122 PKR----RPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVA 177

Query: 185 HAIHNAGNRHDGPFVALNCASIPKDLIESEIFGHTKGAFTGAIANRDGAATRAHNGTLFL 244
A+H+ G R +GPFVA+N A+IP+DLIESE+FGH KGAFTGA G +A GTLFL
Sbjct: 178 RALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFL 237

Query: 245 DEICEMDLELQSKLLRFIQTGVFQRVGGTKEEKVDVRFVSATNRMPWDEVKAGRFREDLF 304
DEI +M ++ Q++LLR +Q G + VGG + DVR V+ATN+ + G FREDL+
Sbjct: 238 DEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLY 297

Query: 305 YRLHVIPIELPPLRMRGKDILLLASNLLKEYNKEEGKRFKGISCDAKQCLKSYQWPGNVR 364
YRL+V+P+ LPPLR R +DI L + +++ K EG K +A + +K++ WPGNVR
Sbjct: 298 YRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEK-EGLDVKRFDQEALELMKAHPWPGNVR 356

Query: 365 QLQNVIRQIVVLNDGELVELDMLPIQMTATSETTLAKVSEPVQQDQSVSIAASNSAPVSF 424
+L+N++R++ L +++ +++ ++ + P+++ + S + S S V
Sbjct: 357 ELENLVRRLTALYPQDVITREIIENELRSE------IPDSPIEKAAARSGSLSISQAVE- 409

Query: 425 LEGIESAGFSADEHFPAVTPEAKSDDIVPLWLTEKQTIENAIALCKGNVPKAAALLDISA 484
E+ A+ P D + L E I A+ +GN KAA LL ++
Sbjct: 410 ----ENMRQYFASFGDALPPSGLYDRV--LAEMEYPLILAALTATRGNQIKAADLLGLNR 463

Query: 485 STIYRK 490
+T+ +K
Sbjct: 464 NTLRKK 469


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1520  NUCEPIMERASE  59     6e-12    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 59.4 bits (144), Expect = 6e-12
Identities = 41/190 (21%), Positives = 70/190 (36%), Gaps = 28/190 (14%)

Query: 1 MRVLITGAAGQLGQALLSIAELTQVTAAELTAPQQML----------VALLPEVLACIAT 50
M+ L+TGAAG +G V+ L A Q++ V+L L +A
Sbjct: 1 MKYLVTGAAGFIG---------FHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQ 51

Query: 51 TDEVIGVSHQQLDICALHSIQAAFDAFKPDVVINCAAFNGVDKAETDTDKAIAVNATGPK 110
G ++D+ + F + + V V + + N TG
Sbjct: 52 P----GFQFHKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFL 107

Query: 111 LLAGECKRLNI-RLVHISTDFVFDGALKRPYTEQDMPS-PLSAYGRSKLEGERWV---ND 165
+ C+ I L++ S+ V+ K P++ D P+S Y +K E +
Sbjct: 108 NILEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSH 167

Query: 166 ILGAKATIIR 175
+ G AT +R
Sbjct: 168 LYGLPATGLR 177


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1524  ACETATEKNASE  32     0.003    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 31.7 bits (72), Expect = 0.003
Identities = 9/42 (21%), Positives = 18/42 (42%), Gaps = 1/42 (2%)

Query: 211 MEEGDALATAAFERYIDRLARSLAHIINVLDP-DIIVLGGGV 251
+ GD A A + R+ +++ + D+IV G+
Sbjct: 291 FKNGDKRAQLALNVFAYRVKKTIGSYAAAMGGVDVIVFTAGI 332


78  Shal_1627  Shal_1632  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_1627  -1      19       2.586664   TetR family transcriptional regulator
Shal_1628  -2      17       2.515986   RND family efflux transporter MFP subunit
Shal_1629  -2      15       1.434994   acriflavin resistance protein
Shal_1630  -2      15       -2.857941  hypothetical protein
Shal_1631  -1      15       -3.544005  hypothetical protein
Shal_1632  -2      16       -4.006521  purine phosphorylase family 1
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1627  HTHTETR       55     1e-11    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 55.4 bits (133), Expect = 1e-11
Identities = 19/112 (16%), Positives = 38/112 (33%), Gaps = 6/112 (5%)

Query: 29 RAALIRSANQCFTASDYDSVSIRLIAQQADVNMAMIRYYFGDKLGLFEAMV---TEQIQP 85
R ++ A + F+ S S+ IA+ A V I ++F DK LF + I
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGE 72

Query: 86 IYQRAKALQKSHDKPTIADLITEFYQTMIPNPEFPRF---LFRLMNSDGSSE 134
+ +A + +++ ++ + +F G
Sbjct: 73 LELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMA 124


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1628  RTXTOXIND     51     5e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 51.0 bits (122), Expect = 5e-09
Identities = 58/289 (20%), Positives = 100/289 (34%), Gaps = 60/289 (20%)

Query: 110 RLAQAQADRKALEGQLEGKKLQLKNLQLSLEIENNRYDLVKSDLKRKETLRKQNLISQSE 169
+ + Q + E L+ K+ + + + N + KS L +L + I+
Sbjct: 194 QFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIA--- 250

Query: 170 LDGERQNLLAQQQKLQELDNSLN-----LMPNETQILEA-----------------QRLQ 207
+ +L Q+ K E N L L E++IL A + Q
Sbjct: 251 ----KHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQ 306

Query: 208 AVAREQEAQSQLTK-------TEIRLPFEGRVAEVNVES-SQVVSPQQVLARINGI-EVM 258
+L K + IR P +V ++ V + VV+ + L I + +
Sbjct: 307 TTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTL 366

Query: 259 EVEAQVSLGDVMTLMKTVNRPQEQEKQLPRAETLGLSAEITLKGATYSF--SWPAQITRI 316
EV A V Q K + G +A I ++ Y+ ++ I
Sbjct: 367 EVTALV-----------------QNKDIGFINV-GQNAIIKVEAFPYTRYGYLVGKVKNI 408

Query: 317 G--ETVDPTLATVSVVLQVEQQYRDLKMGQSPPLVNGMFVSARIKGGER 363
D L V V+ ++ ++ PL +GM V+A IK G R
Sbjct: 409 NLDAIEDQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMR 457


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1629  ACRIFLAVINRP  452    e-144    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 452 bits (1164), Expect = e-144
Identities = 217/1050 (20%), Positives = 441/1050 (42%), Gaps = 53/1050 (5%)

Query: 1 MIGFFVRHPTATTLLMLAFIVLGIKALPELKRETFPEFSKNYITAQVVLPGASPQDVEEN 60
M FF+R P +L + ++ G A+ +L +P + ++ PGA Q V++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 LCLRMEDAVDSLGSIIETKCDAL-EGVARMTLKLDDKADLSRSLVDVQTKISAIKD-FPA 118
+ +E ++ + +++ + G +TL D + V VQ K+ P
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 119 EIEPPIVEELDFNERII----DIAVSAATSKPELKAYAED-LKRRLKLDTSISQVEISGF 173
E++ + + + ++ + T++ ++ Y +K L + V++ G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG- 179

Query: 174 SSHQLLVEISLGAIKRLGMSVADVAKQIEQQNVQLPSGTVETPSK------NILIRFDQR 227
+ + + + + + + ++ DV Q++ QN Q+ +G + N I R
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 228 EVEPERLANIVIRSDALGGVVRLRDIATITDRFELDEESIRFDGEQAAILTVYKNKSQDS 287
PE + +R ++ G VVRL+D+A + E R +G+ AA L + ++
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 288 LRLKEEVIGFLDAEKLRTPNGISISTSNDLSSLLWDRLTMLVKNGWQGVVLVFLCMWLF- 346
L + + L + P G+ + D + + + +VK ++ ++LVFL M+LF
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 347 FSLRYSFWVAMGLPIAFMGSLFFMAQVGLTINIMTLVAFLMAIGIMMDDAIVIAESIA-A 405
++R + + +P+ +G+ +A G +IN +T+ ++AIG+++DDAIV+ E++
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 406 HIERGMSKADAVIQGVKRVLPGVVSSFLTTVFIFSSIAFMEGDMGKVLRVVPQTLLLILT 465
+E + +A + + ++ +V + +F +AF G G + R T++ +
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 466 ISLVEAFLILPNHLAHSAKGKQRKTSRFKQKFNHKFE---HFRTVQLVAAVEWVINWRYV 522
+S++ A ++ P A K + K F F +V ++
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 523 FMGSVISLLFISISLVAGGAIKFVGFPELDGDVAEARIILPPGSTLQQTEHVVNRVVSVA 582
++ +I L ++ +V + PE D V I LP G+T ++T+ V+++V
Sbjct: 540 YL--LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYY 597

Query: 583 KALGDKYSEKEDGHRLIQHITERFNFNADAGENGAHVATVKVDLLSAEVRQTLMSTFI-R 641
K + ++ + F+ A +A V + + +
Sbjct: 598 L----KNEKAN-----VESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIH 648

Query: 642 EWQQGVGEMVDPIAIVFKQPK---MGPAGRA--IEIRVRGDDLDELKSAAIDV-QQYLQG 695
+ +G++ D I F P +G A I G D L A + Q
Sbjct: 649 RAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQH 708

Query: 696 FNGVSGVMDSMRPGKAEILMTLKPG-AEAFGVN----GMMLASQLRGAYFHQTADNIQVG 750
+ V + A+ + + A+A GV+ +++ L G Y +
Sbjct: 709 PASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVND----FIDR 764

Query: 751 PESIQIDVQLDKTDAAKLENLANFPITIGTDGEQVPLSAVADFEWQRGYVKISRLDGMRF 810
++ VQ D E++ + +GE VP SA W G ++ R +G+
Sbjct: 765 GRVKKLYVQADAKFRMLPEDVDKLYVRSA-NGEMVPFSAFTTSHWVYGSPRLERYNGLPS 823

Query: 811 VTITGEVDTHLANASEINDAFKAELLPELKKRYPGIRVSYEGEVKESKTTQNSMASGFIV 870
+ I GE A + DA A + K GI + G + + + N + +
Sbjct: 824 MEIQGEA----APGTSSGDA-MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAI 878

Query: 871 GLLAVFAILSLQFKSYIEPLVVMAVIPLGLIGVLWGHLLLGYSMSMPSIMGFVALAGVVV 930
+ VF L+ ++S+ P+ VM V+PLG++GVL L + ++G + G+
Sbjct: 879 SFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSA 938

Query: 931 NDSILLVQYIR-YHVEEGQSVHEAVVSASKERFRAVFITSLTTAAGMLPLLLETSIQAQV 989
++IL+V++ + +EG+ V EA + A + R R + +TSL G+LPL + +
Sbjct: 939 KNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGA 998

Query: 990 LQPLVVSMVFGIFASTALVLFMVPACYAIL 1019
+ + ++ G+ ++T L +F VP + ++
Sbjct: 999 QNAVGIGVMGGMVSATLLAIFFVPVFFVVI 1028


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1632  ANTHRAXTOXNA  29     0.027    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 28.6 bits (63), Expect = 0.027
Identities = 13/40 (32%), Positives = 25/40 (62%)

Query: 221 LVVIRTISDKADGSAHLVYEEAKQVTADNSVAITLNMIRE 260
L +I+++SD +D S L ++ K+ N+ +I +N I+E
Sbjct: 193 LNLIKSLSDDSDSSDLLFSQKFKEKLELNNKSIDINFIKE 232


79  Shal_1698  Shal_1705  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_1698  -2      13       -1.033075  PAS/PAC sensor-containing hybrid histidine
Shal_1699  -3      17       -1.914900  hypothetical protein
Shal_1700  0       19       -1.174396  FAD dependent oxidoreductase
Shal_1701  -1      16       -0.985526  two component LuxR family transcriptional
Shal_1702  0       18       -1.064015  thioesterase superfamily protein
Shal_1703  -1      17       -1.210966  phospho-2-dehydro-3-deoxyheptonate aldolase
Shal_1704  0       15       -1.228923  hypothetical protein
Shal_1705  -1      15       -1.127352  phosphoenolpyruvate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1698  HTHFIS        53     6e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 52.9 bits (127), Expect = 6e-09
Identities = 27/115 (23%), Positives = 44/115 (38%), Gaps = 5/115 (4%)

Query: 1022 KVLCIDNEEAILAGLESLLSRWQCEVICAKDLADARIKLGLKGVAPDIVLADYHLDDGQN 1081
+L D++ AI L LSR +V + A + D+V+ D + D N
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRW--IAAGDGDLVVTDVVMPDE-N 61

Query: 1082 GVDAMDGIRGLYGEHLPGILITANTRKDL-IDDMHKRGYHYMAKMVKPAALRALI 1135
D + I+ LP ++++A I K Y Y+ K L +I
Sbjct: 62 AFDLLPRIKKAR-PDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGII 115


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1701  HTHFIS        71     1e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 71.0 bits (174), Expect = 1e-16
Identities = 30/119 (25%), Positives = 53/119 (44%), Gaps = 6/119 (5%)

Query: 8 IIIADDHPLFRNALRQALSSEFKNTQWFEADSADALQALLES-SDIEYDVVLLDLQMPGS 66
I++ADD R L QALS ++ L + + D+V+ D+ MP
Sbjct: 6 ILVADDDAAIRTVLNQALSRAG-----YDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 67 HGYSTLIHLRTHFQELPVVVISAHEDNLTISRAIHYGSSGFIPKSSSMETLAEALEAVL 125
+ + L ++ +LPV+V+SA +T +A G+ ++PK + L + L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1702  TYPE3OMGPROT  27     0.028    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 27.2 bits (60), Expect = 0.028
Identities = 12/44 (27%), Positives = 24/44 (54%), Gaps = 1/44 (2%)

Query: 66 VTVSSDRIDFNKPIPAGSLAEIIARVIHVGNTSIKVEVNIFVED 109
V V+ + K I G++ + RV+ G+ S ++ +N+ +ED
Sbjct: 383 VKVTGKEVAELKGITYGTMLRMTPRVLTQGDKS-EISLNLHIED 425


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1705  PHPHTRNFRASE  299    6e-94    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 299 bits (766), Expect = 6e-94
Identities = 106/418 (25%), Positives = 180/418 (43%), Gaps = 65/418 (15%)

Query: 384 QPGDVLVTDMTDPDWEPIMK-RASAIVTNRGGRTCHAAIIARELGVPAVVGCGDVTERIK 442
+ ++ D+T D + K T+ GGRT H+AI++R L +PAVVG +VTE+I+
Sbjct: 155 EETVIIAEDLTPSDTAQLNKQFVKGFATDIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQ 214

Query: 443 NGQEITVSCAEG---------DTGLIYNGLQDFEVISSRVDSMPELP--------MKITM 485
+G + V EG + FE + P +++
Sbjct: 215 HGDMVIVDGIEGIVIVNPTEEEVKAYEEKRAAFEKQKQEWAKLVGEPSTTKDGAHVELAA 274

Query: 486 NVGNPDRAFDFARLPNEGVGLARLEFIINRMIGIHPKALLEFNDQTAELQQEIHEMIAGY 545
N+G P EG+GL R EF+ + ++E
Sbjct: 275 NIGTPKDVDGVLANGGEGIGLYRTEFLY-------------MDRDQLPTEEE-------- 313

Query: 546 ESPVEFYVARLVEGIATIASAFHPKKVIVRMSDFKSNEYANLVGGDRYEPEEENPMLGFR 605
E + K V++R D ++ + + P+E NP LGFR
Sbjct: 314 ----------QFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYL----QLPKELNPFLGFR 359

Query: 606 GASRYISESFRDCFALECEAIKRVRNEMGLKNVEIMIPFVRTLSEAAQVIELLKEQGLER 665
+ + +D F + A+ R N+++M P + TL E Q +++E+ +
Sbjct: 360 AIRLCLEK--QDIFRTQLRALLRAS---TYGNLKVMFPMIATLEELRQAKAIMQEEKDKL 414

Query: 666 GKDG------LRVIMMCELPSNALLAEQFLEHFDGFSIGSNDLTQLTLGLDRDSGIISHL 719
+G + V +M E+PS A+ A F + D FSIG+NDL Q T+ DR + +S+L
Sbjct: 415 LSEGVDVSDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYL 474

Query: 720 FDERNDAVKALLSMAIKAAKAKGAYVGICGQGPSDHPDFAAWLVEQGIDTVSLNPDTV 777
+ + A+ L+ M IKAA ++G +VG+CG+ D L+ G+D S++ ++
Sbjct: 475 YQPYHPAILRLVDMVIKAAHSEGKWVGMCGEMAGD-EVAIPLLLGLGLDEFSMSATSI 531


80  Shal_1729  Shal_1735  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_1729  1       17       0.781524   putative chaperone
Shal_1730  0       17       0.467621   CreA family protein
Shal_1731  0       14       0.251324   hypothetical protein
Shal_1732  0       14       0.128867   cystathionine beta-lyase
Shal_1733  0       13       -0.463648  histidine kinase
Shal_1734  0       13       -1.437921  two component transcriptional regulator
Shal_1735  -1      15       -1.644822  OmpA/MotB domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1729  SHAPEPROTEIN  54     4e-10    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 54.0 bits (130), Expect = 4e-10
Identities = 38/155 (24%), Positives = 66/155 (42%), Gaps = 20/155 (12%)

Query: 109 VRSPKSFLGATGLRDSQIALFEDIVTLMMMHIKAQAEKNLGTKVITHAVIGRPVNFQGIG 168
R+P + ++D IA F + M+ H Q N + ++ PV +
Sbjct: 64 GRTPGNIAAIRPMKDGVIADFF-VTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQV- 121

Query: 169 GEQSNQQAEAILSLAAKRAGFTDVDFLFEPLAAGMDYEASLEENMTVLVVDVGGGTTDCS 228
+ AI +A+ AG +V + EP+AA + + E +VVD+GGGTT+ +
Sbjct: 122 ------ERRAIRE-SAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVA 174

Query: 229 MVKMGPNHIDNRDRSADFLGHSGQRIGGNDLDIAL 263
++ + + S RIGG+ D A+
Sbjct: 175 VISLN-----------GVVYSSSVRIGGDRFDEAI 198


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1731  BCTERIALGSPG  44     4e-08    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 44.1 bits (104), Expect = 4e-08
Identities = 16/56 (28%), Positives = 32/56 (57%)

Query: 1 MHRPTASRGFTLIELVIVIIVLGILAVIATAKYVDLKRDAEVARVKATAAALQQSV 56
M RGFTL+E+++VI+++G+LA + + K A+ + + AL+ ++
Sbjct: 1 MRATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENAL 56


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1734  HTHFIS        66     1e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 66.0 bits (161), Expect = 1e-14
Identities = 30/130 (23%), Positives = 54/130 (41%), Gaps = 4/130 (3%)

Query: 3 RIAIVEDEAAIRENYKEVLQQQGYCVQAYANRPQAMLAFNTRLPDLAIIDIGLENEIDGG 62
I + +D+AAIR + L + GY V+ +N DL + D+ + +E
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE--NA 62

Query: 63 FTLCQSLRAMSSTLPIIFLTARDSDFDTVCGLRLGADDYLSKDVSFPHLIA--RLAALFR 120
F L ++ LP++ ++A+++ + GA DYL K LI A
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122

Query: 121 RSDLKNVSND 130
+ + +D
Sbjct: 123 KRRPSKLEDD 132


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1735  OMPADOMAIN    76     3e-18    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 76.1 bits (187), Expect = 3e-18
Identities = 28/95 (29%), Positives = 49/95 (51%), Gaps = 2/95 (2%)

Query: 126 ELALGLNVQFKTGSSTIEPHFQNQLNDIAYAMS--LSPELTLDLTGYADRRGDGDYNQAL 183
L +V F +T++P Q L+ + +S + ++ + GY DR G YNQ L
Sbjct: 214 HFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGL 273

Query: 184 SEQRVAEVKNYLIEQGVAEARLHNQAYGDSSPLMA 218
SE+R V +YLI +G+ ++ + G+S+P+
Sbjct: 274 SERRAQSVVDYLISKGIPADKISARGMGESNPVTG 308


81  Shal_1968  Shal_1981  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_1968  1       18       -3.073155  hypothetical protein
Shal_1969  0       17       -0.709967  LuxR family transcriptional regulator
Shal_1970  -1      16       1.696146   hypothetical protein
Shal_1971  -2      14       1.191646   hypothetical protein
Shal_1972  -2      13       1.164851   signal transduction protein
Shal_1973  -3      14       1.080513   TetR family transcriptional regulator
Shal_1974  -2      11       1.346517   RND family efflux transporter MFP subunit
Shal_1975  -1      11       0.833546   acriflavin resistance protein
Shal_1976  -1      18       -1.145119  vault protein inter-alpha-trypsin subunit
Shal_1977  1       19       -1.260022  MerR family transcriptional regulator
Shal_1978  3       17       -1.087215  hypothetical protein
Shal_1979  3       16       -1.433136  peptidase S8/S53 subtilisin kexin sedolisin
Shal_1980  3       17       -2.606081  two component LuxR family transcriptional
Shal_1981  3       16       -2.579943  histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1968  MICOLLPTASE   41     2e-05    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 41.2 bits (96), Expect = 2e-05
Identities = 29/119 (24%), Positives = 45/119 (37%), Gaps = 17/119 (14%)

Query: 493 YQFQMTSQASFSPPFNASYDYKIEF-------ILVSATDGLPIANAGEDQTARLGDTIIL 545
Y+ + N +Y Y + F + P A D + + + I
Sbjct: 734 YKTVTAYFVNHKVDGNGNYVYDVVFHGMNTDTNTDVHVNKEPKAVIKSDSSVIVEEEINF 793

Query: 546 DGTSSFDDNTPTESLGYNWSFSSLPSGSTVSLTQANSANPSFVIDAFGEYTVELIVTDN 604
DGT S D++ ++ Y W F G +N A + + GEY V+L VTDN
Sbjct: 794 DGTESKDEDGEIKA--YEWDFGD---GEK-----SNEAKATHKYNKTGEYEVKLTVTDN 842


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1973  HTHTETR       57     2e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 57.3 bits (138), Expect = 2e-12
Identities = 20/107 (18%), Positives = 43/107 (40%), Gaps = 1/107 (0%)

Query: 3 RSEQKRLAIINAAKEEFIHHGFIAANMDRICTTAEVSKRTLYRHYESKEKLFESVLTIIQ 62
+++ R I++ A F G + ++ I A V++ +Y H++ K LF + + +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 63 ASIDETL-TYPFNPELSLYEQLKAITYLEVDTLYNICGMALARTILL 108
++I E Y L+ I +++ L I+
Sbjct: 68 SNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIF 114


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1974  RTXTOXIND     53     1e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 52.5 bits (126), Expect = 1e-09
Identities = 20/108 (18%), Positives = 35/108 (32%), Gaps = 1/108 (0%)

Query: 64 VVRAAERAELAFQVGGRLINILVKEGDEVKQGQVLARLDARDATTALESAQLELKNTALD 123
+ + E+ + I+VKEG+ V++G VL +L A A Q L L+
Sbjct: 90 LTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLE 149

Query: 124 HNRALVVYEKSQAISKSELD-TITTRLELARNHVEDATLQLEYTELKA 170
R ++ + EL + L +
Sbjct: 150 QTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFST 197



Score = 34.8 bits (80), Expect = 5e-04
Identities = 30/196 (15%), Positives = 64/196 (32%), Gaps = 15/196 (7%)

Query: 93 KQGQVLARLDARDATTALESAQLELKNTALDHNRALVVYEKSQAISKSELDTITTRLELA 152
+Q + Q+E + + LV I +L T + L
Sbjct: 256 EQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEIL-DKLRQTTDNIGLL 314

Query: 153 RNHVEDATLQLEYTELKAPFAGIIGRKLVDNHIQ-IQANTPVF-ILHDLSDLEVVINIPH 210
+ + + + ++AP + + + V + + I+ + LEV + +
Sbjct: 315 TLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQN 374

Query: 211 KVMLSSAIASKANAELSAIPGTLFP-LALRTFGTQADAVS-----QTYPVVLGFD----- 259
K + + A ++ A P T + L + DA+ + V++ +
Sbjct: 375 KDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLS 434

Query: 260 -DLKGFRVLPGMAVKV 274
K + GMAV
Sbjct: 435 TGNKNIPLSSGMAVTA 450


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_1975  ACRIFLAVINRP  467    e-150    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 467 bits (1203), Expect = e-150
Identities = 239/1049 (22%), Positives = 452/1049 (43%), Gaps = 60/1049 (5%)

Query: 3 IARYTIAKRTSVWVLIALILLGGYISYLKLGRFEDPEFVIRQAVINTAYSGATAQEVSDE 62
+A + I + WVL ++++ G ++ L+L + P ++ Y GA AQ V D
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 63 ITDVIEGAVQTLQELKEVKSVSKQGMSEVTVEIKLEFAGSSEELQQVWDKLRRKIADVQR 122
+T VIE + + L + S S S VT+ + + + Q +++ K+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGS-VTITLTFQSGTDPDIAQV---QVQNKLQLATP 116

Query: 123 QLPPGA-GPSIVNDDFSDVYALFFAV--TGEGFTDKQLQDYVD-TLRRDLVLVDGVAKTA 178
LP I + S Y + G T + DYV ++ L ++GV
Sbjct: 117 LLPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGD-V 175

Query: 179 TLAEQQETIFVEISSERLAKFGVSAEKVYQVLQKQNMVTVAGSIETD------AMRIAVI 232
L Q + + + ++ L K+ ++ V L+ QN AG + + ++I
Sbjct: 176 QLFGAQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASII 235

Query: 233 PSSGIDSFTDLNNLQIGIGDNNTVLRLGDIANVTRGYKDPASMLIRYNGERAIGFGLSNV 292
+ + + + + + + +V+RL D+A V G ++ + R NG+ A G G+
Sbjct: 236 AQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIA-RINGKPAAGLGIKLA 294

Query: 293 TGGNVVDMGDAVKARIAELDSQRPLGMDLNVISMQSDSVRDSVTNFIDNLIAAVVIVFLV 352
TG N +D A+KA++AEL P GM + + V+ S+ + L A+++VFLV
Sbjct: 295 TGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLV 354

Query: 353 LLLFMG-VRSGVIIGFVLLLTVAGTLCIMLIDNIAMQRISLGALIIALGMLVDNAIVVTD 411
+ LF+ +R+ +I + + + GT I+ ++ +++ +++A+G+LVD+AIVV +
Sbjct: 355 MYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVE 414

Query: 412 GILVRMQQNPDEDRETIVSEVVDATKWPLLGGTVVGICAFSAIGLSPSDMGEYAGSLFWV 471
+ M ++ +E + L+G +V F + G
Sbjct: 415 NVERVMMEDKLPPKEATEKSMSQIQG-ALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSIT 473

Query: 472 ILYSMLLSWVFAVTITPMLCFDFLKVKALKVGQKPG-----------RIVTAYSTILRWV 520
I+ +M LS + A+ +TP LC LK + + + G V Y+ + +
Sbjct: 474 IVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKI 533

Query: 521 IGHRALSSLMLLGTLATAIWGVQFVPPGFMPESQRAQFVVDVYLPQGSDIKRTEALVANI 580
+G L+ +A + +P F+PE + F+ + LP G+ +RT+ ++ +
Sbjct: 534 LGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQV 593

Query: 581 EQDVKAKEGITNITSFIGGGGLRFMLTYAPEPRNPSYGQL-LIDIDDYKKIAP----LLG 635
E N+ S G F + + +N + L ++ ++
Sbjct: 594 TDYYLKNE-KANVESVFTVNGFSF----SGQAQNAGMAFVSLKPWEERNGDENSAEAVIH 648

Query: 636 ELQSELDAKYPDASVKVWKFM----LGRGGGKKIE-AGFKGPDSQVLRQLAEE-AKVILL 689
+ EL K D V + LG G E G L Q + +
Sbjct: 649 RAKMEL-GKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQ 707

Query: 690 NDSNLIAVQDDWRQQVPVLRPVYSAEEAQRYGLTTQEINQAIAQTLTGRNVGVYREGDDL 749
+ ++L++V+ + + + E+AQ G++ +INQ I+ L G V + + +
Sbjct: 708 HPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRV 767

Query: 750 IPIVVRSPQNERNHQRAIENTEVFSHTQGGFIPVSQLVDSVDVVYQDDMLRRIDRMPTIL 809
+ V++ R ++ V S G +P S VY L R + +P++
Sbjct: 768 KKLYVQADAKFRMLPEDVDKLYVRSAN-GEMVPFSAFTT-SHWVYGSPRLERYNGLPSME 825

Query: 810 VQADPAPGVLTADAFNKVRSKIEAI--ELPAGYELIWYGEYKASKDANEGLVISAPYGFA 867
+Q + APG + DA + +E + +LPAG W G + + F
Sbjct: 826 IQGEAAPGTSSGDA----MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFV 881

Query: 868 AMILSVIFMFNAIRQPLVIWLTAPLAIIGVSVGLIIFQTPFEFMAILGFLSLIGMMVKNA 927
+ L + ++ + P+ + L PL I+GV + +F + ++G L+ IG+ KNA
Sbjct: 882 VVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNA 941

Query: 928 IVLVDQA-DVEVREGKSGYDAIINAALSRARPVLLGALTTILGVAPLLIDP-----FFKS 981
I++V+ A D+ +EGK +A + A R RP+L+ +L ILGV PL I +
Sbjct: 942 ILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNA 1001

Query: 982 MAVTIMFGLMFATILTLVVIPLFYSIFFR 1010
+ + +M G++ AT+L + +P+F+ + R
Sbjct: 1002 VGIGVMGGMVSATLLAIFFVPVFFVVIRR 1030


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
Shal_1979  SUBTILISIN  140    3e-39    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 140 bits (354), Expect = 3e-39
Identities = 74/214 (34%), Positives = 100/214 (46%), Gaps = 29/214 (13%)

Query: 124 AGMKVCVIDSGLDRSNPDFEWSSITG----DNDIGTGNWDENGGPHGTHVAGTIGAADND 179
G+KV V+D+G D +PD + I G D+D G ++ HGTHVAGTI A +N+
Sbjct: 41 RGVKVAVLDTGCDADHPDLKARIIGGRNFTDDDEGDPEIFKDYNGHGTHVAGTIAATENE 100

Query: 180 IGVVGMAPGVEMHIIKVFNAQGWGYSSDLAHAANLCNAAGANIISMSLGGGGANSTEENA 239
GVVG+AP ++ IIKV N QG G + +IISMSLGG A
Sbjct: 101 NGVVGVAPEADLLIIKVLNKQGSGQYDWIIQGIYYAIEQKVDIISMSLGGPEDVPELHEA 160

Query: 240 FKTFSSSGGLVVAAAGNDG-----NNVRSYPAGYTSVMMVGANDADNNIAAFSQFPACST 294
K +S LV+ AAGN+G + YP Y V+ VGA + D + + FS
Sbjct: 161 VKKAVASQILVMCAAGNEGDGDDRTDELGYPGCYNEVISVGAINFDRHASEFSNS----- 215

Query: 295 FSTGRGKKNKTEIIDGSCVEITAGGVSTLSTYPQ 328
+ V++ A G LST P
Sbjct: 216 ---------------NNEVDLVAPGEDILSTVPG 234



Score = 48.3 bits (115), Expect = 4e-08
Identities = 18/72 (25%), Positives = 26/72 (36%), Gaps = 5/72 (6%)

Query: 448 SDYGFMSGTSMATPAISGLAALLWSN-----HSGCTGNDIREALKASAYDAGESGRDNYF 502
Y SGTSMATP ++G AL+ T ++ L G S +
Sbjct: 235 GKYATFSGTSMATPHVAGALALIKQLANASFERDLTEPELYAQLIKRTIPLGNSPKMEGN 294

Query: 503 GYGIAKAADASA 514
G A + +
Sbjct: 295 GLLYLTAVEELS 306
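
Each alignment block in this report is preceded by a "Score = ..." line and an "Identities = ..." line. The following is a minimal sketch of how those two lines could be read programmatically. The helper names (parse_hsp_stats, truncated_percent) are hypothetical and are not part of PredictBias or BLAST. For the sample line used here, the reported percentages match the counts when the fraction is rounded down to a whole percent.

```python
import re

# Hypothetical helpers for illustration; not part of PredictBias or BLAST.
SCORE_RE = re.compile(r"Score = ([\d.]+) bits \((\d+)\), Expect = (\S+)")
COUNT_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_hsp_stats(score_line, counts_line):
    """Return the numbers found on one Score line and one Identities line."""
    match = SCORE_RE.search(score_line)
    if match is None:
        raise ValueError(f"not a Score line: {score_line!r}")
    stats = {
        "bits": float(match.group(1)),
        "raw_score": int(match.group(2)),
        "expect": match.group(3),  # kept as text; forms such as "e-134" occur
    }
    # The Gaps field is optional: some blocks list only Identities and Positives.
    for name, num, den, pct in COUNT_RE.findall(counts_line):
        stats[name.lower()] = (int(num), int(den), int(pct))
    return stats


def truncated_percent(num, den):
    """Integer percentage, rounded down."""
    return num * 100 // den


stats = parse_hsp_stats(
    "Score = 140 bits (354), Expect = 3e-39",
    "Identities = 74/214 (34%), Positives = 100/214 (46%), Gaps = 29/214 (13%)",
)
for key in ("identities", "positives", "gaps"):
    num, den, reported = stats[key]
    print(key, f"{num}/{den}", reported, truncated_percent(num, den))
```

The Expect value is deliberately left as a string, because the report also contains abbreviated forms that a plain float conversion would reject.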


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_1980  HTHFIS  63     9e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.9 bits (153), Expect = 9e-14
Identities = 22/122 (18%), Positives = 47/122 (38%), Gaps = 6/122 (4%)

Query: 4 KITIILVDDHILVRAGIRSLLESIANVEVIKESGDGIEALSLLRTHQPTLLILDISLPGL 63
TI++ DD +R + L A +V + + L++ D+ +P
Sbjct: 3 GATILVADDDAAIRTVLNQALS-RAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 64 NGLEVARSVTKMKTKTKILMLSMHSDVEYVAKALTIGSHGYLMK----ESAVEELETAIS 119
N ++ + K + +L++S + KA G++ YL K + + A++
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 TL 121

Sbjct: 121 EP 122


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_1981  PF06580  39     6e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.5 bits (92), Expect = 6e-05
Identities = 41/304 (13%), Positives = 97/304 (31%), Gaps = 69/304 (22%)

Query: 778 WSDPAKLNVKIVPPWWMKTSVRIASITIIIMMFIAFHKMRLARWQKHTARLQALQAQKQV 837
W A +N K V I ++ ++ M+ + A + +
Sbjct: 99 WRLLAFINTKPVAFTLPLALSIIFNVVVVTFMWSLLYFGWHFFKNYKQAEIDQWKMASMA 158

Query: 838 ALDEVENREQQLST--AYQGLRSLASQIQNAKEEERKSISRELHDQFGQTLTATKINLQL 895
++ + Q++ + L ++ + I + R+ ++ L + +L +
Sbjct: 159 QEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTS-LSELMRYSLRYSNARQVS 217

Query: 896 YKKFNYQEQERIESAISITHTMIQQMR-----SISFNLRPALLDDEGLVAGVKLQLEKMS 950
E ++S + + ++ + PA++D +Q+ M
Sbjct: 218 LA----DELTVVDSYLQL-----ASIQFEDRLQFENQINPAIMD---------VQVPPM- 258

Query: 951 ALIAKPIKLSVSRDFPMVNQGITINVFRIIQESVNNAIRHA-----NASQIIVSLSYHSE 1005
++Q V N I+H +I++ + +
Sbjct: 259 ----------------------------LVQTLVENGIKHGIAQLPQGGKILLKGTKDNG 290

Query: 1006 QLFIEIKDDGKGFDVDKVKENTFSGVHLGLLGMEERVHSLSGK---LLLHSSISNGSIIK 1062
+ +E+++ G +NT GL + ER+ L G + L + +
Sbjct: 291 TVTLEVENTGSLA-----LKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM- 344

Query: 1063 VVIP 1066
V+IP
Sbjct: 345 VLIP 348


82  Shal_2061  Shal_2071  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2061   0      13       -3.041342  two component LuxR family transcriptional
Shal_2062   1      13       -2.837525  histidine kinase
Shal_2063   3      14       -3.588698  ApbE family lipoprotein
Shal_2064   2      15       -3.220806  FMN-binding domain-containing protein
Shal_2065   2      18       -3.036970  porin
Shal_2066   2      21       -2.757898  pseudouridine synthase
Shal_2067   0      17       -2.369175  integrase family protein
Shal_2068   1      20       -3.204466  hypothetical protein
Shal_2069   1      22       -2.607451  hypothetical protein
Shal_2070   2      22       -2.545664  hypothetical protein
Shal_2071   2      23       -2.543425  short-chain dehydrogenase/reductase SDR
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_2061  HTHFIS  95     3e-25    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 95.3 bits (237), Expect = 3e-25
Identities = 33/150 (22%), Positives = 69/150 (46%)

Query: 2 TNLYLVDDDQAVLDSLTWMLNGLGYQPKGFLSADSFLQQVDINNTGIAILDVQMPGMDGS 61
+ + DDD A+ L L+ GY + +A + + + + + + DV MP +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 ALLTHMSKAQSPIAVIMLSGHGSIAMAVQAIQKGALDFLEKPVDGDKLVRLLDQAGELTE 121
LL + KA+ + V+++S + A++A +KGA D+L KP D +L+ ++ +A +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 122 QNLRSKQERQALSDKLATLTPREHEVMEKV 151
+ ++ L + E+ +
Sbjct: 124 RRPSKLEDDSQDGMPLVGRSAAMQEIYRVL 153


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2065  ECOLNEIPORIN  66     2e-14    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 66.4 bits (162), Expect = 2e-14
Identities = 61/350 (17%), Positives = 112/350 (32%), Gaps = 24/350 (6%)

Query: 2 MKIVKLTLLVAAVLASPSVMADAY-KFYGRIDYSITHSDSGSATHSGKSGTVLENNWSRL 60
MK + L +AA+ + Y ++ S + + +G+ S ++GT + + S++
Sbjct: 1 MKKSLIALTLAALPVAAMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLGSKI 60

Query: 61 GIKGDAAINEDFTVFYQIEVGVNGATEGKQNNPFSARPTFLGIKHSTVGQLAAGRIDPVF 120
G KG + +Q+E + A R +F+G+K G+L GR++ V
Sbjct: 61 GFKGQEDLGNGLKAIWQVEQKASIAGTDSGW---GNRQSFIGLK-GGFGKLRVGRLNSVL 116

Query: 121 KMAKGTADAMDMYSLKHDRLFAGDKRWGDSLEYKTVKWNKLQFGASYILEDNYYGEDDVR 180
K A + S+ Y + ++ L Y D+
Sbjct: 117 KDTGDINPWDSKSDYLGVNKIAEPEARLISVRYDSPEFAGLSGSV------QYALNDNAG 170

Query: 181 RDNGN-YQVALTYGDKLFKSGDLYLAAAYTDGVEDIKGFRAVAQYKIDKLMLGSIYQSSE 239
R N Y Y + F + E++ + + ++Y S
Sbjct: 171 RHNSESYHAGFNYKNGGFFVQYGGAYKRHHQVQENVNIEKYQIHRLVSGYDNDALYASVA 230

Query: 240 IVNPNLDNWQQRDGDG--FIVSAKYQFDKLTLKAQYGQDDSGTGKIAGRVYDKLGAAATE 297
+ + ++ V+A + + + G Y+
Sbjct: 231 VQQQDAKLVEENYSHNSQTEVAATLAYRFGNVTPRVSYAHGFKGSFDATNYNND------ 284

Query: 298 VPEVSQWAIGAEYRLSKSTRVHTELGQFDVKQYSD-FDDTIMSVGFRLDF 346
Q +GAEY SK T G + F T VG R F
Sbjct: 285 ---YDQVVVGAEYDFSKRTSALVSAGWLQEGKGESKFVSTAGGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_2066  PF00577  29     0.025    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 28.7 bits (64), Expect = 0.025
Identities = 8/34 (23%), Positives = 15/34 (44%)

Query: 154 SSLDDDAPLLSTNEQAVDRSVSVTRVEFIPQTGR 187
++L D+ L + V ++ R EF + G
Sbjct: 763 NTLADNVDLDNAVANVVPTRGAIVRAEFKARVGI 796


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_2067  PF05272  30     0.018    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.018
Identities = 10/52 (19%), Positives = 20/52 (38%), Gaps = 5/52 (9%)

Query: 199 LRDKAILQLGLQGGFRRSELAEIRVEHISFL-REKLKVRVPYSKSNQQGQRE 249
+ +L FRR++ ++ +F K + R Y + Q R+
Sbjct: 639 IAGIVAYELSEMTAFRRADAEAVK----AFFSSRKDRYRGAYGRYVQDHPRQ 686


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
Shal_2070  NEISSPPORIN  26     0.035    Neisseria sp. porin signature.
		>NEISSPPORIN#Neisseria sp. porin signature.

Length = 348

Score = 26.5 bits (58), Expect = 0.035
Identities = 19/71 (26%), Positives = 33/71 (46%), Gaps = 16/71 (22%)

Query: 1 MRRYLLSLGLLLLPVSAMANIIVD---KTGVDEKDYVYDLHQCTEMSTQVKQKQTEGSAI 57
M++ L++L L LPV+AMA++ + K GV + V+ + S +
Sbjct: 1 MKKSLIALTLAALPVAAMADVTLYGAIKAGV-------------QTYRSVEHTDGKVSKV 47

Query: 58 GTAAKGAAIGS 68
T ++ A GS
Sbjct: 48 ETGSEIADFGS 58


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2071  DHBDHDRGNASE  73     9e-18    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 73.2 bits (179), Expect = 9e-18
Identities = 47/195 (24%), Positives = 80/195 (41%), Gaps = 16/195 (8%)

Query: 2 KHVVITGANRGIGLAFVGHYLTTGWQVTA--CCRNLNDAVALQHQQSKFTALKLVELDVT 59
K ITGA +GIG A + G + A + V + A DV
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAF-PADVR 67

Query: 60 IPSSIAELKRSLGSEA--IDLLINNAGYYGPKGVRFGTTD---INQWQAVLAVNTIAPLI 114
++I E+ + E ID+L+N AG +R G +W+A +VN+
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGV-----LRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 115 LTETLYPNLKIAQNCVLAFISSKVGSMQDNSSGGGYYYRSSKAALNSVVKSLSIDLIQDG 174
+ ++ + ++ + + S + S Y SSKAA K L ++L +
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAA---YASSKAAAVMFTKCLGLELAEYN 179

Query: 175 IKCVVLHPGWVQTEM 189
I+C ++ PG +T+M
Sbjct: 180 IRCNIVSPGSTETDM 194


83  Shal_2144  Shal_2150  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2144  -1      16       -2.146725  short-chain dehydrogenase/reductase SDR
Shal_2145   0      14       -1.975291  nuclear transport factor 2
Shal_2146  -1      13       -1.957580  acyl-CoA dehydrogenase domain-containing
Shal_2147  -1      11       -1.310724  acyl-CoA dehydrogenase domain-containing
Shal_2148  -2       9       -0.956508  acriflavin resistance protein
Shal_2149  -3       9       -1.199083  RND family efflux transporter MFP subunit
Shal_2150  -2       9       -1.654888  SecC motif-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2144  DHBDHDRGNASE  122    4e-36    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 122 bits (308), Expect = 4e-36
Identities = 80/260 (30%), Positives = 125/260 (48%), Gaps = 21/260 (8%)

Query: 5 LAGKVALITGAARGVGLATAKLMAKEGAQVIFTDINAEQGQEIANSISSNALFLEH---D 61
+ GK+A ITGAA+G+G A A+ +A +GA + D N E+ +++ +S+ + A E D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 62 VTKEADWSSILTHIKSKYGQLNILVNNAAILQLGDIKEETLAGWQKVHRVNSDSVFLGIH 121
V A I I+ + G ++ILVN A +L+ G I + W+ VNS VF
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 122 YALPLMEESGGGSIVNMSSSSAIFGMPHFAAYGASKAAIRGLSQSVAVYCSQTKNNVRCN 181
M + GSIV + S+ A AAY +SKAA ++ + + ++ N+RCN
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEY--NIRCN 183

Query: 182 TLHPDSIMTPMVMEMSAQAGDRSLAEPERAKAYLCQ------------PEDVANTILFLA 229
+ P S T M + A + K L P D+A+ +LFL
Sbjct: 184 IVSPGSTETDMQWSLWADEN----GAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLV 239

Query: 230 SDESKHINGAAIALDGGATV 249
S ++ HI + +DGGAT+
Sbjct: 240 SGQAGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2148  ACRIFLAVINRP  424    e-134    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 424 bits (1092), Expect = e-134
Identities = 202/1054 (19%), Positives = 417/1054 (39%), Gaps = 62/1054 (5%)

Query: 12 FARNSVAANLLMIIIIIGGLLTANTIRKQFFPAVEINWLEFNAVYPGAAPQEVEEGITIK 71
F R + A +L II+++ G L + +P + + +A YPGA Q V++ +T
Sbjct: 5 FIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQV 64

Query: 72 IEEALESVQGLKRVITYSNRNVSSG-YFRVEDSYDPQVVLEEVKSEIDSI-SSFPDGMER 129
IE+ + + L + + S+ S + DP + +V++++ P +++
Sbjct: 65 IEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQQ 124

Query: 130 PKVERIKLRQE-VMYMSLY---GDLSQRQLKDLGEK-IHDELLQLPLVNITDFYGGLGYE 184
+ K +M +Q + D + D L +L V +G Y
Sbjct: 125 QGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA-QYA 183

Query: 185 IAIEVSKDRLREFGLSFNDVAEAVRGYSRNMSAGQIRAE------NGYINLRVQNQAYVG 238
+ I + D L ++ L+ DV ++ + ++AGQ+ ++ Q +
Sbjct: 184 MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNP 243

Query: 239 YEFESLPLITLEDGTTLLLGDVATVIDGFEEGIQYSKFNGKNSVTFFIGAANDQSLTDVA 298
EF + L DG+ + L DVA V G E ++ NGK + I A + D A
Sbjct: 244 EEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDTA 303

Query: 299 DVVKGYILDKQKVLPQGVKLEPWVDMTYYLEGRLNLMLDSMKSGAILVFILLALFLR-VR 357
+K + + Q PQG+K+ D T +++ ++ ++ ++ +LVF+++ LFL+ +R
Sbjct: 304 KAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNMR 363

Query: 358 LAFWVMMGLPVCFLGTLLFMPMGMIDVTINVISLFAFILVLGIVVDDAIVMGESAH-AEC 416
+ +PV LGT + +IN +++F +L +G++VDDAIV+ E+
Sbjct: 364 ATLIPTIAVPVVLLGTFAILAA--FGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 417 EEKGQTLDNVIRGVKRVAMPATFGVLTTIAAFLPITLDDGPSSAFGQAIGFVVILCLIFS 476
E+K + + + ++ + A F+P+ G + A + ++ + S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 477 LVESKLILPAHLARMKPKTIVKPGSKNPVDWLRNIVNFLQAKVDTGLKIFITQYYRPFLE 536
++ + ++ PA A + +KP S + + D +Y +
Sbjct: 482 VLVALILTPALCATL-----LKPVSAEHHENKGGFFGWFNTTFD-----HSVNHYTNSVG 531

Query: 537 VAVKYRYTVIMVFFSLILICAGLYSGGLIRFIGQPKIPHDF-PRI---SFEMNIDASEKA 592
+ ++++ ++ L+ ++P F P F I A
Sbjct: 532 KILGSTGRYLLIYALIVAGMVVLF----------LRLPSSFLPEEDQGVFLTMIQLPAGA 581

Query: 593 TLSAALSIEEALRLVDSKLEEEYGQKMISDMQVELRGRTS----AQVMTKLVDPEIRPID 648
T + + + K E+ + + + G+ A V K + +
Sbjct: 582 TQERTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDEN 641

Query: 649 TFAVAELWRQNMPL--IPGMKSFTIQDNLFGGGRDDGDISFRLE---GKDDAQLVAAAKE 703
+ A A + R M L I F L G L A +
Sbjct: 642 S-AEAVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQ 700

Query: 704 LKAKLNTLKG-VGDVNDSRQSSAKEIQFELK-PLAHSLGLTLADIARQVGNSFYGLEAQR 761
L + V + + + E+ A +LG++L+DI + + + G
Sbjct: 701 LLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVND 760

Query: 762 ILRNGEEIKVMLRYPEEQRNSIAQVSDVIIKTPQGAEIPLSEVAAIVVTDGVNSIRRENG 821
+ G K+ ++ + R V + +++ G +P S G + R NG
Sbjct: 761 FIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNG 820

Query: 822 NRTINVWGSVDADQAEPFKLAKDIRDNFLPELLNKYPR-VKSEVSGNIQEQLDSADTQLR 880
++ + G + +A + L +K P + + +G ++ S +
Sbjct: 821 LPSMEIQGEAAPGTSSGDAMAL------MENLASKLPAGIGYDWTGMSYQERLSGNQAPA 874

Query: 881 DFLISMLVIYSLLAVPLKSYSQPIMIMAVIPFGVIGSVLGHMILGLDLSALSVFGIIAAA 940
IS +V++ LA +S+S P+ +M V+P G++G +L + + G++
Sbjct: 875 LVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTI 934

Query: 941 GVVVNDSLVMVDYINKSRE-SGIAMKLSVLEAGCRRFRAILLTSLTTFIGLVPIMTETSM 999
G+ +++++V++ E G + + L A R R IL+TSL +G++P+
Sbjct: 935 GLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGA 994

Query: 1000 QAQMVIPMAVSLAFGVLFATVVTLVLIPCLYVTI 1033
+ + + + G++ AT++ + +P +V I
Sbjct: 995 GSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVI 1028
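
The Expect value of the hit above is printed as "e-134", an abbreviated form with no leading mantissa, so it cannot be passed directly to a float conversion. The sketch below shows one way to normalise such strings and compare them with the 0.05 "E-value (RPSBlast)" input parameter listed at the top of this report. The function name parse_evalue is hypothetical, the "example_orf" entry is an invented value used only to show a rejected hit, and treating the cutoff as inclusive is an assumption, since the report does not say how PredictBias applies it.

```python
def parse_evalue(text):
    """Convert a BLAST-style E-value string to a float.

    Abbreviated values such as "e-134" are read as 1e-134.
    """
    text = text.strip()
    if text.startswith("e"):
        text = "1" + text
    return float(text)


# Cutoff taken from the input parameters of this report.
E_VALUE_CUTOFF = 0.05

# E-values copied from hit tables in this report, plus one hypothetical
# entry ("example_orf") that is not in the report.
hits = {
    "Shal_2148": "e-134",
    "Shal_2149": "2e-06",
    "Shal_2066": "0.025",
    "Shal_2167": "0.0",
    "example_orf": "0.2",
}

# Assumption: a hit is kept when its E-value is at or below the cutoff.
passing = {
    tag: parse_evalue(raw)
    for tag, raw in hits.items()
    if parse_evalue(raw) <= E_VALUE_CUTOFF
}

for tag, value in passing.items():
    print(tag, value)
```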


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_2149  RTXTOXIND  43     2e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 42.9 bits (101), Expect = 2e-06
Identities = 31/154 (20%), Positives = 56/154 (36%), Gaps = 26/154 (16%)

Query: 52 PMSFAVTSYGVVNAKYETELVSQLNGEIVFLSDKFVR-GGFVKKGDVLAKIDPSDYEAAL 110
+ T+ G + ++ + + IV + V+ G V+KGDVL K+ EA
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIV--KEIIVKEGESVRKGDVLLKLTALGAEADT 136

Query: 111 IDAKANMASARA--------------------TLVQEKAFGKVAEAEWKRIENGVPTELS 150
+ ++++ AR L E F V+E E R+ + + + S
Sbjct: 137 LKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFS 196

Query: 151 LRKPQLAQEVAKLNSSEA---GLKRAIRNLERTI 181
+ Q Q+ L+ A + I E
Sbjct: 197 TWQNQKYQKELNLDKKRAERLTVLARINRYENLS 230



Score = 38.3 bits (89), Expect = 5e-05
Identities = 33/210 (15%), Positives = 71/210 (33%), Gaps = 28/210 (13%)

Query: 105 DYEAALIDAKANMASARATLVQEKAFGKVAEAEWKRIENGVPTELSLRKPQLAQEVAKLN 164
+ E ++A + ++ L Q ++ A+ E++ + E+ +L Q +
Sbjct: 256 EQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEIL---DKLRQTTDNIG 312

Query: 165 SSEAGLKRAIRNLERTIIKAPFDALIEARNI-GLGSYVSMGTPLGKVLSTAH---AEVRL 220
L + + ++I+AP ++ + G V+ L ++ +
Sbjct: 313 LLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALV 372

Query: 221 PIADKELQFLDNKGKGAEVML---------TGELAGQTKQWPARIVRSEGVIDSRSRMTY 271
D + G ++ G L G+ K + + + D R + +
Sbjct: 373 QNKD-----IGFINVGQNAIIKVEAFPYTRYGYLVGKVKN-----INLDAIEDQRLGLVF 422

Query: 272 LVAEVTDPYGLNANNN--ELRYGTYVTASI 299
V + L+ N L G VTA I
Sbjct: 423 NVIISIEENCLSTGNKNIPLSSGMAVTAEI 452


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
Shal_2150  SECA  51     1e-10    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 50.6 bits (121), Expect = 1e-10
Identities = 18/75 (24%), Positives = 31/75 (41%), Gaps = 5/75 (6%)

Query: 37 LTLLVQNAEREAEILQLLSDNELVGLITVDETQEENVVQLMGLLNKPQTTRFEKTPNRND 96
+ + + E E + + L + + +++ E+ RND
Sbjct: 829 VQVRMPEEVEELEQQRRMEAERLAQMQQLSHQDDDSAAA-----AALAAQTGERKVGRND 883

Query: 97 PCVCGSGQKYKKCCG 111
PC CGSG+KYK+C G
Sbjct: 884 PCPCGSGKKYKQCHG 898


84  Shal_2163  Shal_2167  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2163  -2      13       -0.056318  EmrB/QacA family drug resistance transporter
Shal_2164  -2      13        0.116894  secretion protein HlyD family protein
Shal_2165  -1      14        0.070547  hypothetical protein
Shal_2166  -1      14       -1.759946  RND family efflux transporter MFP subunit
Shal_2167  -1      13       -1.880374  hydrophobe/amphiphile efflux-1 (HAE1) family
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_2163  TCRTETB  91     1e-21    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 90.7 bits (225), Expect = 1e-21
Identities = 76/395 (19%), Positives = 157/395 (39%), Gaps = 16/395 (4%)

Query: 38 LDMTIANVALPHMMGALGVTSDQVTWVLTSYSMAEAIFIPLASFLALKFGVRNLLLISVS 97
L+ + NV+LP + WV T++ + +I + L+ + G++ LLL +
Sbjct: 28 LNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGII 87

Query: 98 GFMISSALCGQADSIAEMVTF-RVMQGAFGASVIPLSQSIMVQIYPANQRGKAMALFSVG 156
S + S ++ R +QGA A+ L ++ + P RGKA L
Sbjct: 88 INCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSI 147

Query: 157 VLLGPILGPTLGGIITENMNWRWIFYVNLPIGAICLTLIYTFVKLGGKGKPKIDWPIVIA 216
V +G +GP +GG+I ++W ++ + + L+ +K + K D +I
Sbjct: 148 VAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMK-LLKKEVRIKGHFDIKGIIL 206

Query: 217 MTIGIGLLQMVLDRGNQESWFESNTILFSAIISAIAIIFFVARSFITKSEIAPVWLLHDR 276
M++GI + S +I F I+S ++ + FV L +
Sbjct: 207 MSVGIVFFMLFTT---------SYSISF-LIVSVLSFLIFVKHIRKVTDPFVDPGLGKNI 256

Query: 277 NLAMSCLVMAGFSMG-MFGITQLQPMMLEQLLNY-PVETTGFAMAPRGLASAIVLLAMAR 334
M ++ G G + G + P M++ + E + P ++ I
Sbjct: 257 PF-MIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGI 315

Query: 335 YMDKIDARLLIAVGLSLNALGTYLMTQYSLEIDIYWILLPSIIQGAGMGLVFAPLSQLAY 394
+D+ ++ +G++ ++ +L + LE +++ + + G+ +S +
Sbjct: 316 LVDRRGPLYVLNIGVTFLSVS-FLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVS 374

Query: 395 TTLSPKDTIGGAVVFNLCRTIGGSFGISIVNTYFS 429
++L ++ G + N + GI+IV S
Sbjct: 375 SSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_2164  RTXTOXIND  71     8e-16    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 71.4 bits (175), Expect = 8e-16
Identities = 37/274 (13%), Positives = 81/274 (29%), Gaps = 34/274 (12%)

Query: 88 FQAKLDEARAAYEMALQNNAATDDAILAASANVKSAVAQLSDAQVTYKRTKDLVDKELLP 147
+ + + N L A + + L+ K+ +
Sbjct: 191 IKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIA 250

Query: 148 EQQLDDTRAKLSAAEQSVIAARATMAQLIKS-------------------QGAHGNAAPE 188
+ + + K A + ++ + Q+
Sbjct: 251 KHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDN 310

Query: 189 VKKAAAALSQASLSLSYTNIFAPKDGHLGKLSAHA-GSVVSPGQAIVPLV-VDNTYWVQA 246
+ L++ + I AP + +L H G VV+ + ++ +V D+T V A
Sbjct: 311 IGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTA 370

Query: 247 NFKETQLERLTAGQTATITLDLYPKVDYH---GTIEAISPASGASFTLLPPENATGNWVK 303
+ + + GQ A I ++ +P Y G ++ I+ + + G
Sbjct: 371 LVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA-------IEDQRLGLVFN 423

Query: 304 VPQRFPVLVRLTNESEHPDFPLRVGASANVSIDT 337
V + + + + PL G + I T
Sbjct: 424 VIIS---IEENCLSTGNKNIPLSSGMAVTAEIKT 454



Score = 43.3 bits (102), Expect = 8e-07
Identities = 15/116 (12%), Positives = 43/116 (37%)

Query: 55 IAPQVSGKVSSVNASDYQKVSQGDLLVQIDSAPFQAKLDEARAAYEMALQNNAATDDAIL 114
I P + V + + + V +GD+L+++ + +A + +++ A
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSR 158

Query: 115 AASANVKSAVAQLSDAQVTYKRTKDLVDKELLPEQQLDDTRAKLSAAEQSVIAARA 170
+ N + + ++++ L ++Q + + E ++ RA
Sbjct: 159 SIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRA 214


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_2166  RTXTOXIND  43     2e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 42.5 bits (100), Expect = 2e-06
Identities = 24/112 (21%), Positives = 50/112 (44%), Gaps = 4/112 (3%)

Query: 102 VNRLSANVESQQSALEKAQRDVERLKPLYEQDAASQLDFDNALSVLSQAKSSVAASKAEL 161
+ +S LE+ + ++ K Y+ +QL + L L Q ++ EL
Sbjct: 261 YVEAVNELRVYKSQLEQIESEILSAKEEYQL--VTQLFKNEILDKLRQTTDNIGLLTLEL 318

Query: 162 EEAKLELSYTEIKSPISGLVSRSEV-DIGALVGSSGQSLLTRVKQVDPIYVT 212
+ + + I++P+S V + +V G +V + ++L+ V + D + VT
Sbjct: 319 AKNEERQQASVIRAPVSVKVQQLKVHTEGGVVT-TAETLMVIVPEDDTLEVT 369


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2167  ACRIFLAVINRP  941    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 941 bits (2435), Expect = 0.0
Identities = 421/1028 (40%), Positives = 631/1028 (61%), Gaps = 9/1028 (0%)

Query: 1 MAQYFVNRPVFASVISIVIVLLGLIAMFQLPIDQYPYITPPQVKISASYPGATSTTAAES 60
MA +F+ RP+FA V++I++++ G +A+ QLP+ QYP I PP V +SA+YPGA + T ++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VATPLEQELNGLPNMIYMSSKSTNSGSSNITITFDVGTNPDLAAVDAQNSTQQATGSLPI 120
V +EQ +NG+ N++YMSS S ++GS IT+TF GT+PD+A V QN Q AT LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 DVQTEGVSVSKEASVELLKLALTSEDERYDEIYLSNYATINIQSALKRIPGVGRVRNTGA 180
+VQ +G+SV K +S L+ S++ + +S+Y N++ L R+ GVG V+ GA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 RSYSMRVWLNPDTMAGYGLTTSDVIDAIKAQNKESPAGSIGSQPNADTLSMTLPITAAGR 240
+ Y+MR+WL+ D + Y LT DVI+ +K QN + AG +G P + I A R
Sbjct: 181 Q-YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 241 MSSVPQFNEIIVRASADGSIIRLRDIANIELGSSSYTLQSQLNGNNATILQVYLLPGANA 300
+ +F ++ +R ++DGS++RL+D+A +ELG +Y + +++NG A L + L GANA
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 301 LEVTKKVKAEMAKLAQKFPQGMNWEVFFDASVFIENSIDEVVKTLVEALILVILVVFMFL 360
L+ K +KA++A+L FPQGM +D + F++ SI EVVKTL EA++LV LV+++FL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 361 QNIRATLIPAIAVPVSLIGTLAAMLAFGFTINTVSLLALVLAIGIVVDDAIVVVENVERL 420
QN+RATLIP IAVPV L+GT A + AFG++INT+++ +VLAIG++VDDAIVVVENVER+
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 421 MDEKGLSSSQATKVAMKELSGALIATSLVLAAVFVPVSFLSGITGIMYREFAVAITVAVL 480
M E L +AT+ +M ++ GAL+ ++VL+AVF+P++F G TG +YR+F++ I A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 481 ISTVVALTLSPALCALLLKPG----DKATSGFFKWMNDRLDTATTKYVRLVVLTNKHAKR 536
+S +VAL L+PALCA LLKP + GFF W N D + Y V R
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 537 SYLLFALMVGGVYLTMSSLPSSFMPDEDQGRFFIDVSLPNGATVNRTQDVLKKAEATVLA 596
L++AL+V G+ + LPSSF+P+EDQG F + LP GAT RTQ VL + L
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 597 HP-AIAYSFTLAGENRRSGSNQANGQFEIILKPWSDRVDNDATVQKVMNEIKQSLHDVLE 655
+ A S SG Q G + LKPW +R ++ + + V++ K L + +
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 656 AEFRIYLPSAVPGLGNGSGVEMELQDTSGSNFKGLMETADELVEALKLQP-EIATAGLSL 714
+ A+ LG +G + EL D +G L + ++L+ P + + +
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 715 QSAIPQLHLNVDEAKAMAIGVKVSDIYGTIKTFTDSSTVNDFNLFGRVYRVKVQAEEQYR 774
Q L VD+ KA A+GV +SDI TI T + VNDF GRV ++ VQA+ ++R
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFR 779

Query: 775 QFPDQIEDYHVRSSSGAMVPIGVLAESNYSVGPAAVTHYNMFTSASINASPAPGYASGDV 834
P+ ++ +VRS++G MVP S++ G + YN S I APG +SGD
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 835 IRAIERVAKPMLPDEFSYEWTGITYQEVQSANQTAIAVTLAMVFVFLFLAALYESWTLPI 894
+ +E +A LP Y+WTG++YQE S NQ V ++ V VFL LAALYESW++P+
Sbjct: 840 MALMENLASK-LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPV 898

Query: 895 AVLLIAPIAMLGASVGTLVSGMESNLFFQVAFIALIGMAAKNSILIVEFANQLH-KAGKS 953
+V+L+ P+ ++G + + +++++F V + IG++AKN+ILIVEFA L K GK
Sbjct: 899 SVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKG 958

Query: 954 RLDSAIEAANMRFRPILMTSLAFILGVLPLVFSVGPGAVSRQSISIPILCGMIFATTIGI 1013
+++ + A MR RPILMTSLAFILGVLPL S G G+ ++ ++ I ++ GM+ AT + I
Sbjct: 959 VVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAI 1018

Query: 1014 IMVPLFFV 1021
VP+FFV
Sbjct: 1019 FFVPVFFV 1026



Score = 106 bits (266), Expect = 3e-25
Identities = 77/509 (15%), Positives = 179/509 (35%), Gaps = 29/509 (5%)

Query: 5 FVNRPVFASVISIVIVLLGLIAMFQLPIDQYPYITPPQVKISASYPGATSTTAAESVATP 64
+ +I +IV ++ +LP P P + + V
Sbjct: 533 ILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQ 592

Query: 65 LEQEL-----NGLPNMIYMSSKSTNSGSSNITITF------DVGTNPDLAAVDAQNSTQQ 113
+ + ++ ++ S + + N + F + + +A + +
Sbjct: 593 VTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKM 652

Query: 114 ATGSLPIDVQTEGVSVSKEASVELL---KLALTSEDERYDEI-YLSNYATINIQSALKRI 169
G + + + A VEL D+ L+ + A +
Sbjct: 653 ELGKIRDGF---VIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHP 709

Query: 170 PGVGRVRNTG-ARSYSMRVWLNPDTMAGYGLTTSDVIDAIKAQNKESPAGSIGSQPNADT 228
+ VR G + ++ ++ + G++ SD+ I + +
Sbjct: 710 ASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKK 769

Query: 229 LSMTLPITAAGRMSSVPQFNEIIVRASADGSIIRLRDIANIELGSSSYTLQSQLNGNNAT 288
L + +++ VR SA+G ++ S L + NG +
Sbjct: 770 LYVQADAKFR---MLPEDVDKLYVR-SANGEMVPFSAFTTSHWVYGSPRL-ERYNGLPSM 824

Query: 289 ILQVYLLPGANALEVTKKVKAEMAKLAQKFPQGMNWEVFFDASVFIENSIDEVVKTLVEA 348
+Q PG + A M LA K P G+ ++ + S ++ + +
Sbjct: 825 EIQGEAAPGT----SSGDAMALMENLASKLPAGIGYDWTGMSYQERL-SGNQAPALVAIS 879

Query: 349 LILVILVVFMFLQNIRATLIPAIAVPVSLIGTLAAMLAFGFTINTVSLLALVLAIGIVVD 408
++V L + ++ + + VP+ ++G L A F + ++ L+ IG+
Sbjct: 880 FVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAK 939

Query: 409 DAIVVVENVERLMDEKGLSSSQATKVAMKELSGALIATSLVLAAVFVPVSFLSGITGIMY 468
+AI++VE + LM+++G +AT +A++ ++ TSL +P++ +G
Sbjct: 940 NAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQ 999

Query: 469 REFAVAITVAVLISTVVALTLSPALCALL 497
+ + ++ +T++A+ P ++
Sbjct: 1000 NAVGIGVMGGMVSATLLAIFFVPVFFVVI 1028



Score = 66.4 bits (162), Expect = 4e-13
Identities = 84/510 (16%), Positives = 181/510 (35%), Gaps = 52/510 (10%)

Query: 538 YLLFALMVGGVYLTMSSLPSSFMPDEDQGRFFIDVSLPNGATVNRTQD-VLKKAEATVLA 596
++L +++ L + LP + P + + P GA QD V + E +
Sbjct: 13 WVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYP-GADAQTVQDTVTQVIEQNMNG 71

Query: 597 HPAIAYSFTLAGENRRSGSNQANGQFEIILKPWSDRVDNDATVQKVMNEIKQSLHDVLEA 656
+ Y ++ + +GS F+ D D +V N+++ +
Sbjct: 72 IDNLMY---MSSTSDSAGSVTITLTFQ-------SGTDPDIAQVQVQNKLQLATPL---- 117

Query: 657 EFRIYLPSAVPGLG-----NGSGVEMELQDTSGSNFKGLMETADELVEALKLQPEIAT-- 709
LP V G + S M S + + +D + +K ++
Sbjct: 118 -----LPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVK--DTLSRLN 170

Query: 710 --AGLSLQSAIPQLHLNVDEAKAMAIGVKVSDIYGTIKTFTDS----STVNDFNLFGRVY 763
+ L A + + +D + D+ +K D L G+
Sbjct: 171 GVGDVQLFGAQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQL 230

Query: 764 RVKVQAEEQYRQFPDQIEDYHVRSS-SGAMVPIG-----VLAESNYSVGPAAVTHYNMFT 817
+ A+ ++ + P++ +R + G++V + L NY+V + N
Sbjct: 231 NASIIAQTRF-KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNV----IARINGKP 285

Query: 818 SASINASPAPGYASGDVIRAI-ERVA--KPMLPDEFSYEWTGITYQEVQSANQTAI-AVT 873
+A + A G + D +AI ++A +P P + T VQ + + +
Sbjct: 286 AAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLF 345

Query: 874 LAMVFVFLFLAALYESWTLPIAVLLIAPIAMLGASVGTLVSGMESNLFFQVAFIALIGMA 933
A++ VFL + ++ + + P+ +LG G N + IG+
Sbjct: 346 EAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLL 405

Query: 934 AKNSILIVEFANQLHKAGKSR-LDSAIEAANMRFRPILMTSLAFILGVLPLVFSVGPGAV 992
++I++VE ++ K ++ ++ + ++ ++ +P+ F G
Sbjct: 406 VDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGA 465

Query: 993 SRQSISIPILCGMIFATTIGIIMVPLFFVT 1022
+ SI I+ M + + +I+ P T
Sbjct: 466 IYRQFSITIVSAMALSVLVALILTPALCAT 495


85  Shal_2495  Shal_2503  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias   Product
Shal_2495  -1      14       0.543644  RND family efflux transporter MFP subunit
Shal_2496  -1      14       0.565646  acriflavin resistance protein
Shal_2497   0      14       0.062869  acriflavin resistance protein
Shal_2498   0      11       0.020671  hypothetical protein
Shal_2499   1      12       0.613942  hypothetical protein
Shal_2500   0      12       0.577189  sigma-54 dependent trancsriptional regulator
Shal_2501   1      12       1.016667  DSBA oxidoreductase
Shal_2502   0      14       0.842920  hypothetical protein
Shal_2503  -1      15       0.863168  peptidase M11 gametolysin
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_2495  RTXTOXIND  41     6e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 40.6 bits (95), Expect = 6e-06
Identities = 18/117 (15%), Positives = 48/117 (41%), Gaps = 12/117 (10%)

Query: 65 KVVTRVAGLIQSIKVEEGDRVTKGQLLAVIDSKRQKFDLDRSQA------------EVEI 112
++ +++ I V+EG+ V KG +L + + + D ++Q+ ++
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 113 IEQELNRLKKINNKEFFSADSMAKLEYNLQAAIAKRDLAALYVQESMIRSPIEGVVA 169
ELN+L ++ + ++++ E ++ K + Q+ ++ A
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRA 214



Score = 39.8 bits (93), Expect = 1e-05
Identities = 40/192 (20%), Positives = 74/192 (38%), Gaps = 30/192 (15%)

Query: 102 DLDRSQAEVEIIEQELNRLKK--INNKEFFSADSMAKL--------EYNLQAAIAKRDLA 151
+L ++++E IE E+ K+ + F + + KL L+ A +
Sbjct: 267 ELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQ 326

Query: 152 ALYVQESMIRSPIEGVVATRFVKS-GNMAKEFDELFYVVNQDELYGI-VHLPEQQLQHLR 209
A S+IR+P+ V V + G + + L +V +D+ + + + + +
Sbjct: 327 A-----SVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFIN 381

Query: 210 LGQDAQIFANKHTQETTH---ATVLRISP--IVDAQSGTF--------KVTLSVPNQNAA 256
+GQ+A I V I+ I D + G + LS N+N
Sbjct: 382 VGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLSTGNKNIP 441

Query: 257 LKAGMFTRVELK 268
L +GM E+K
Sbjct: 442 LSSGMAVTAEIK 453


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2496  ACRIFLAVINRP  644    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 644 bits (1663), Expect = 0.0
Identities = 248/1100 (22%), Positives = 470/1100 (42%), Gaps = 96/1100 (8%)

Query: 3 IIKTAVNRPVTVWMFMFAVILFGMVGFSRLAVKLLPDLSYPTITIRTQYVGAAPVEVEQL 62
+ + RP+ W+ +++ G + +L V P ++ P +++ Y GA V+
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 63 VSKPIEEAAGIVKGLRKISSISRS-GMSDVVLEFEWGTDMDMASLDVREKLDTIE--LPL 119
V++ IE+ + L +SS S S G + L F+ GTD D+A + V+ KL LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 120 DVKKPLLLRFNPNLDPIVRLALSVPETSESTTEGMSETELKQMRTYAEEELKRQLESLTG 179
+V++ + + ++ S G ++ ++ Y +K L L G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFV------SDNPGTTQDDI---SDYVASNVKDTLSRLNG 171

Query: 180 VAAVRLSGGLQQEVHIQLNQQKLTQLNLSADLIRNRIAEENINLSAGKVIQGDK------ 233
V V+L G Q + I L+ L + L+ + N++ +N ++AG++
Sbjct: 172 VGDVQLFGA-QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQL 230

Query: 234 EYLVRTLNQFNSLEELGQIVIYRDEQ-TLVRLFEVAQIVDAHKERNDITRIGDKESIELA 292
+ +F + EE G++ + + ++VRL +VA++ + N I RI K + L
Sbjct: 231 NASIIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLG 290

Query: 293 IYKEGDANTVAVAQKVTNELNKLNKHNAKA-ELKVIYDQSEFIESAVNEVTTAALIGSLL 351
I AN + A+ + +L +L + ++ YD + F++ +++EV +L
Sbjct: 291 IKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIML 350

Query: 352 SMLVIYLFLRDIVPTLIISISIPFSVIATFNMMYFADISLNIMSLGGIALAVGLLVDNAI 411
LV+YLFL+++ TLI +I++P ++ TF ++ S+N +++ G+ LA+GLLVD+AI
Sbjct: 351 VFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAI 410

Query: 412 VVLENIDRC-KSLGMNRLDAAVTGTKEVAGAIFASTLTTLAVFVPLVFVDGVAGALFSDQ 470
VV+EN++R + +A ++ GA+ + AVF+P+ F G GA++
Sbjct: 411 VVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQF 470

Query: 471 ALTVTFALLASLLVALTTIPMLASREGFKALPPLLEKTAKPKPETKMAKLKHYSATVFSF 530
++T+ A+ S+LVAL P L + LL+ + E K
Sbjct: 471 SITIVSAMALSVLVALILTPALCAT--------LLKPVSAEHHENK-------------- 508

Query: 531 PFVLLFNYLPSALLTLVLIIGRSLSWLTGLVMRPISGAFNWGYHKLERFYHKLLAAALKF 590
G FN + Y + L
Sbjct: 509 --------------------------------GGFFGWFNTTFDHSVNHYTNSVGKILGS 536

Query: 591 KVLTLSIAIAVTAGAGLLIPRLGMELIPPMNQGEFYVEVLLPPGTEVTQTDRVLRTLALS 650
L I + AG +L RL +P +QG F + LP G +T +VL +
Sbjct: 537 TGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDY 596

Query: 651 IKDRSDVKHAYSQAGSGGLMTSDTSRGGENWGRLQVVLQN-------HNAFDAVADKLRS 703
+++ + S G S ++ N G V L+ N+ +AV + +
Sbjct: 597 YL-KNEKANVESVFTVNGFSFSGQAQ---NAGMAFVSLKPWEERNGDENSAEAVIHRAKM 652

Query: 704 TAMRIPE---LEAKMQHPELFSFKTPLEIELV---GYDLGQLKQTADNLVDALSDS-DRF 756
+I + + M T + EL+ G L Q + L+ +
Sbjct: 653 ELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASL 712

Query: 757 ADINTSLRDGQPELSIRFDHERLAALGMDAPTVANRIAQRIGGTIASQYTVRDRKIDILV 816
+ + + + + D E+ ALG+ + I+ +GGT + + R R + V
Sbjct: 713 VSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYV 772

Query: 817 RSELDERNQISDIDSMIINPNSSHPISLSAVADVSLKLGPSAINRISQQRVAIVSANLAY 876
+++ R D+D + + + + SA G + R + + A
Sbjct: 773 QADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAP 832

Query: 877 GDLNDAVLNARDILSAQTLPTSIQARFGGQNEEMEHSFKSLQIALVLAVFLVYLVMASQF 936
G + + + L+++ LP I + G + + S + ++ +V+L +A+ +
Sbjct: 833 GTSSGDAMALMENLASK-LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALY 891

Query: 937 ESLLHPLLILIAVPMAVGGSILGLFITQTHLSVVVFIGLIMLAGIVVNNAIVLVDRINQL 996
ES P+ +++ VP+ + G +L + V +GL+ G+ NAI++V+ L
Sbjct: 892 ESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDL 951

Query: 997 -RQDGEEKITAISNAAKSRLRPIIMTTMTTALGLSPMALGLGDGSEVRAPMAITVIFGLS 1055
++G+ + A A + RLRPI+MT++ LG+ P+A+ G GS + + I V+ G+
Sbjct: 952 MEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMV 1011

Query: 1056 LSTLLTLVVIPVLYALFDRK 1075
+TLL + +PV + + R
Sbjct: 1012 SATLLAIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2497  ACRIFLAVINRP  486    e-157    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 486 bits (1252), Expect = e-157
Identities = 212/1054 (20%), Positives = 445/1054 (42%), Gaps = 79/1054 (7%)

Query: 3 ITRFALARPVTTTMFFVAILLFGLASSRLLPLEMFPGIDIPQVIVEVPYKGSTPAEVERD 62
+ F + RP+ + + +++ G + LP+ +P I P V V Y G+ V+
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 63 ITNVLEESLATMGGIEELRSSSSQNG-AEIDLRMKWGQNVATKSLEAREKIDAVRHLLPK 121
+T V+E+++ + + + S+S G I L + G + ++ + K+ LLP+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 122 DVERVFIRQFSTADMPVLNLRISSDRELSSAFDLLD---KQLKKPLERVEGVSQVTLYGV 178
+V++ I ++ ++ SD ++ D+ D +K L R+ GV V L+G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG- 179

Query: 179 EQKQIEIRINADKLSASNISVQSLNRRLQQENFVINAGVLKTDSRV------YQVSPKGE 232
Q + I ++AD L+ ++ + +L+ +N I AG L + + +
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 233 FRNLDDITALVLAPG-----ITLGDIANVSFSLPERLDGRHLDQNYAVGLDVFKESGANL 287
F+N ++ + L + L D+A V ++ A GL + +GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 288 VDVSQRVMKVINEAKRDSQFEGIKLFVMEDQAYGVTSSLRDLLTAGLIGALLSFVVLYLF 347
+D ++ + + E + +G+K+ D V S+ +++ +L F+V+YLF
Sbjct: 300 LDTAKAIKAKLAELQPFFP-QGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLF 358

Query: 348 LRDLKMTLVIVSSVPIAICMTLAAMYFLGYSLNILSMMGLLLAVGMLIDNAVVVTESVLQ 407
L++++ TL+ +VP+ + T A + GYS+N L+M G++LA+G+L+D+A+VV E+V +
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVER 418

Query: 408 QKQAQIADGSAQEAGVAKINSSAILRGVDKVSLAVLAGTLTTAIVFLPNIFGVKVELTIF 467
A + + ++ A++ + + VF+P F I+
Sbjct: 419 VMMEDKLPPKE-----------ATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIY 467

Query: 468 LEHVAIAICISLAASLLVAKTLLPLMLSRMSFSQKKAPKK-------------SHLQARY 514
+ +I I ++A S+LVA L P + + + + H Y
Sbjct: 468 RQ-FSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHY 526

Query: 515 QTSLNWILAHPRISGVLAMVILASTALPLSMVKQDQSDGEGNNRLYINYQVEGRHSLDVT 574
S+ IL ++ +I+A + + E Q+ + + T
Sbjct: 527 TNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERT 586

Query: 575 EAMITKMETYLYAN--KDEFQIDSVYSYFAADRGQS------TLILKEDTEVDMKALKKT 626
+ ++ ++ Y N + + +V + + + Q+ +L E+ D + +
Sbjct: 587 QKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAV 646

Query: 627 IREGFPKFAIAKPQFGWGGENNGVRVSLTGRST--------------TELIHLSEQVIPL 672
I + K + G+ N + G +T L Q++ +
Sbjct: 647 IHRAKMEL--GKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGM 704

Query: 673 LS-NIDGLTDVRSELNGAQQEVVIRIDRQMAARLDLKLNEVASSISMALRGTPLRSFRHD 731
+ + L VR + + +D++ A L + L+++ +IS AL GT + F D
Sbjct: 705 AAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFI-D 763

Query: 732 PSGELRIEMAYEQQWRLSLEKLKQLPVIRIDNRVYTLDSLAKIEILPRFDTIRHYDRQTA 791
++ + + ++R+ E + +L V + + + + + Y+ +
Sbjct: 764 RGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPS 823

Query: 792 LSIGANLDE-LTTEEAQEKITQVMDSVNFPAGYGYSLRGGFQKQDEDEAVMATNMILAIA 850
+ I ++ +A + + PAG GY G ++ + ++
Sbjct: 824 MEIQGEAAPGTSSGDAMALMENLAS--KLPAGIGYDWTGMSYQERLSGNQAPALVAISFV 881

Query: 851 MIYIVMAALFESLLLPTAIITSILFSITGVFWALLFTGTPMSIMAMIGILILMGIVVNNG 910
++++ +AAL+ES +P +++ + I GV A + M+G+L +G+ N
Sbjct: 882 VVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNA 941

Query: 911 IVLVDQINQLSPELDELSDTISAV---CYTRLRPVLMTVGTTVLGLVPLAMGDTQLGGGG 967
I++V+ L E + A RLRP+LMT +LG++PLA+ G G
Sbjct: 942 ILIVEFAKDL--MEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAIS---NGAGS 996

Query: 968 PSYSPMAIAIIGGLTFSTVTSLYLVPLCYQALYR 1001
+ + + I ++GG+ +T+ +++ VP+ + + R
Sbjct: 997 GAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRR 1030


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_2500  HTHFIS  362    e-124    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 362 bits (932), Expect = e-124
Identities = 125/455 (27%), Positives = 212/455 (46%), Gaps = 43/455 (9%)

Query: 38 CILADTLASANKAVQDYNFFVAIA--VLSKKTQHRVLNSVNEINQRQQNIIWIAVLIDID 95
+ A+ + + + + + V+ + +L + + ++ ++
Sbjct: 30 VRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLLPRIKKARPDLPVLV-MSAQNTFM 88

Query: 96 AVDAALISRLPYYFTDYHHLPIDWDHLNQTLGHAYGMAILKQKEQGGCRHLGQKTPFLGD 155
A Y DY P D L +G A +A K++ P +G
Sbjct: 89 TAIKAS--EKGAY--DYLPKPFDLTELIGIIGRA--LAEPKRRPSKLEDDSQDGMPLVGR 142

Query: 156 SHLINKLRSNITKVAASDEAVLISGETGTGKGLCARLIHSQSARKRGPFITINCGALPQS 215
S + ++ + ++ +D ++I+GE+GTGK L AR +H R+ GPF+ IN A+P+
Sbjct: 143 SAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRD 202

Query: 216 LIHSELFGHEKGAFTGADKQYIGHIERANKGTLFLDEIGDLTLESQVNLLQFLEEHIIER 275
LI SELFGHEKGAFTGA + G E+A GTLFLDEIGD+ +++Q LL+ L++
Sbjct: 203 LIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTT 262

Query: 276 LGGNKTINIDCRIIFASHINLETAVEEGRFREDLYYRINILHLHAPSLREHKEDILLLAN 335
+GG I D RI+ A++ +L+ ++ +G FREDLYYR+N++ L P LR+ EDI L
Sbjct: 263 VGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVR 322

Query: 336 EYLNLFSPEYQK-YTLSPKALDIMLAYEWPGNVRELKNRIHRGIIMANSDQLSAADLGIK 394
++ E +AL++M A+ WPGNVREL+N + R + D ++ + +
Sbjct: 323 HFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENE 382

Query: 395 ITNIKPDENDVVDLAQHRVV---------------------------------IDTELLL 421
+ + PD A+ + ++ L+L
Sbjct: 383 LRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLIL 442

Query: 422 DAIKRNNHNISAAARELKISRTTFYRLIKKCKIKL 456
A+ N AA L ++R T + I++ + +
Sbjct: 443 AALTATRGNQIKAADLLGLNRNTLRKKIRELGVSV 477


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2501  ADHESNFAMILY  32     0.002    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 32.1 bits (73), Expect = 0.002
Identities = 26/125 (20%), Positives = 39/125 (31%), Gaps = 18/125 (14%)

Query: 9 LALLLTPLLLTGCQPQNNAELQKEMASLKQEITQLKKEMSTIGGQVNDIHTIAMRSQKPQ 68
L L L+ ++L C + + + + I G D+H+I Q P
Sbjct: 8 LVLFLSAIILVACASGKKDTTSGQKLKVVATNSIIADITKNIAGDKIDLHSIVPIGQDPH 67

Query: 69 -YKTLPTQSNYGEDGKLPLQGD-ATAQLAIIEFSDYQCPYCKRFIDQTFTKLKSNYIDTG 126
Y+ PL D A + F Y + + FTKL N T
Sbjct: 68 EYE--------------PLPEDVKKTSEADLIF--YNGINLETGGNAWFTKLVENAKKTE 111

Query: 127 KVQYL 131
Y
Sbjct: 112 NKDYF 116


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
Shal_2503  CABNDNGRPT  31     0.028    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 31.1 bits (70), Expect = 0.028
Identities = 14/45 (31%), Positives = 20/45 (44%)

Query: 347 IGGSPSRAFINGSMTLRTVGHEMGHNLGLYHAKDYDCSEGVLTGR 391
S R + +T HE+GH LGL H +Y+ EG +
Sbjct: 168 YNQSNIRNPGSEEYGRQTFTHEIGHALGLAHPGEYNAGEGDPSYN 212


86  Shal_2576  Shal_2582  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2576  0       10       -0.580689  cell division protein ZipA
Shal_2577  0       11       -1.011265  chromosome segregation protein SMC
Shal_2578  0       12       -1.818864  putative sulfate transport protein CysZ
Shal_2579  -2      11       -1.461176  response regulator receiver modulated
Shal_2580  -3      11       -0.679975  RDD domain-containing protein
Shal_2581  -1      10       -0.050108  cysteine synthase A
Shal_2582  -1      10       0.253334   TetR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
Shal_2576  IGASERPTASE  30     0.016    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.016
Identities = 23/102 (22%), Positives = 39/102 (38%), Gaps = 5/102 (4%)

Query: 70 EAETSKPDVNEQAPQFIEPVEEAFELSQAPAIKVNKTRVEPSLSAEAPAFTAEAPVQESL 129
+A T +V + + E + + K K +VE + E P T++ ++
Sbjct: 1077 KANTQTNEVAQSGSETKETQTTETKETATVE-KEEKAKVETEKTQEVPKVTSQVSPKQEQ 1135

Query: 130 FATDEPLLAEPVIEARQAPLDVAPQEPVVDEPKQVATEALGE 171
T +P AEP AR+ V +EP TE +
Sbjct: 1136 SETVQP-QAEP---ARENDPTVNIKEPQSQTNTTADTEQPAK 1173


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
Shal_2577  GPOSANCHOR  47     4e-07    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 47.0 bits (111), Expect = 4e-07
Identities = 56/329 (17%), Positives = 103/329 (31%), Gaps = 6/329 (1%)

Query: 168 AGISRYKERRRETENRIRHTRENLARLGDIRSELAKQLDKLAEQAETAKKYRELKQAERK 227
A + + E + L + L+K E A K +
Sbjct: 123 ADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLE 182

Query: 228 CDAELSVSRYHELLQQIAKIDEQLGKLALQQAQFLAEKQTIELRLTELNLKLSELDTKEA 287
+ +R EL + + + + AEK + R +L L
Sbjct: 183 AEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFST 242

Query: 288 HQVEDFYLTKTHIAKLEQSLKHREQQDESLSLRLQEIAAQMQAYRARLAEDEAKQTLLNE 347
+ A LE E+ E +A+++ A A EA++ L
Sbjct: 243 ADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEH 302

Query: 348 QQASALPEASLLRQQLAEHDAELSTLVEQLEYLSEQLNSLNEAHT--QAHLAFEMNRNQL 405
Q LR+ L L + + L EQ + + L +
Sbjct: 303 QSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQ 362

Query: 406 SHAETDISNKHKLVARLEDQALQ---TASDLELLKSQ-DITEQVSELSGVIEQEQEVLDE 461
AE + ++ Q+L+ AS + + + E S+L+ + + +E+ +
Sbjct: 363 LEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEES 422

Query: 462 LNFLLEAKQQTEQTLFAEQDTLKVKLADE 490
+ K + + L AE LK KLA +
Sbjct: 423 KKLTEKEKAELQAKLEAEAKALKEKLAKQ 451



Score = 43.5 bits (102), Expect = 5e-06
Identities = 41/284 (14%), Positives = 106/284 (37%), Gaps = 4/284 (1%)

Query: 616 EKTQAAGSIVELKNEQIALQQSVTNSQANLARLTESLTEIKLMVVPLVSQAQQKQQLIQA 675
++ A I EL+ + L++++ + + + ++ L ++ ++ ++
Sbjct: 107 SLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEG 166

Query: 676 KQIELASLSSQLRSQQQSIVDIASRLAKVEQELVEAKQERQDFQLNVEQRKLQHQLLNET 735
+ S+++++ + + +R A++E+ L A ++ + + L
Sbjct: 167 AMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAAR 226

Query: 736 LEQTLRAKTQGQQQRALAAEKVAQLKPLRQQLDAKVTQTSLSEQGLHTQLVLIKQQVNQH 795
+A + K+ L+ + L+A+ + + +G ++
Sbjct: 227 KADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTL 286

Query: 796 EQRILELEQNERQLHKQLDAKDSAQEEGQSQPL---REQLDQALQAQQIKQEALTVLRRQ 852
E LE + L Q + A + + L RE Q Q +E +
Sbjct: 287 EAEKAALEAEKADLEHQSQVLN-ANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEAS 345

Query: 853 QAELQELCDSAGTNKKQQLAKLEDLTQSSSTLKLRREGLKGQID 896
+ L+ D++ KKQ A+ + L + + + R+ L+ +D
Sbjct: 346 RQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLD 389



Score = 34.3 bits (78), Expect = 0.003
Identities = 34/245 (13%), Positives = 75/245 (30%), Gaps = 27/245 (11%)

Query: 271 RLTELNLKLSELDTKEAHQVEDFYLTKTHIA------------------KLEQSLKHREQ 312
++ E K + + D + K ++SL +
Sbjct: 54 KVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKAS 113

Query: 313 QDESLSLRLQEIAAQMQAYRARLAEDEAKQTLLNEQQASALPEASLLRQQLAEHDAELST 372
+ + L R ++ ++ D AK L ++A+ + L + L +
Sbjct: 114 KIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTA 173

Query: 373 LVEQLEYLSEQLNSLNEAHTQAHLAFEMNRNQLSHAETDISNKHKLVARLEDQ------- 425
+++ L + +L + A E N + I A L +
Sbjct: 174 DSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKA 233

Query: 426 --ALQTASDLELLKSQDITEQVSELSGVIEQEQEVLDELNFLLEAKQQTEQTLFAEQDTL 483
S + K + + + + L + ++ L+ A +TL AE+ L
Sbjct: 234 LEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAAL 293

Query: 484 KVKLA 488
+ + A
Sbjct: 294 EAEKA 298



Score = 33.5 bits (76), Expect = 0.006
Identities = 36/304 (11%), Positives = 95/304 (31%), Gaps = 11/304 (3%)

Query: 674 QAKQIELASLSSQLRSQQQSIVDIASRLAKVEQELVEAKQERQDFQLNVEQRKLQHQLLN 733
+ Q + + + D++ ++ E +E + + + +
Sbjct: 53 EKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKA 112

Query: 734 ETLEQTLRAKTQGQQQRALAAEKVAQLKPLRQQLDAKVTQTSLSEQGLHTQLVLIKQQVN 793
+++ K ++ A + L+A+ + + L L
Sbjct: 113 SKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFST 172

Query: 794 QHEQRILELEQNERQLHKQLDAKDSAQEEGQSQPLREQLDQALQAQQIKQEALTVLRRQQ 853
+I LE L+A+ + E+ + + + L ++
Sbjct: 173 ADSAKIKTLEAE----KAALEARQAELEK-ALEGAMNFSTADSAKIKTLEAEKAALAARK 227

Query: 854 AELQELCDSAGTNKKQQLAKLEDLTQSSSTLKLRREGLKGQIDSQLRLLNEQRVDVELIL 913
A+L++ + A AK++ L + L+ R+ L+ ++ + ++ +
Sbjct: 228 ADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLE 287

Query: 914 STLDDKKTMLWRQKELERLREQIGLLGAINLAAIEEYDQQNQRKLYLDSQDDDLNAALSS 973
+ + E L Q +L A + + D + K L+++ L
Sbjct: 288 AEKAAL------EAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKI 341

Query: 974 LEEA 977
E +
Sbjct: 342 SEAS 345



Score = 30.8 bits (69), Expect = 0.040
Identities = 43/267 (16%), Positives = 92/267 (34%), Gaps = 9/267 (3%)

Query: 616 EKTQAAGSIVELKNEQIALQQSVTNSQANLARLTESLTEIKLMVVPLVSQAQQKQQLIQA 675
EK EL+ T A + L + L + A
Sbjct: 184 EKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTA 243

Query: 676 KQIELASLSSQLRSQQQSIVDIASRLAKVEQELVEAKQERQDFQLNVEQRKLQHQLLNET 735
++ +L ++ + + ++ L + + + + + L
Sbjct: 244 DSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQ 303

Query: 736 LEQTLRAKTQGQQQRALAAEKVAQLKPLRQQLDAKVTQTSLSEQGLHTQLVLIKQQVNQH 795
+ + ++ + E QL+ Q+L+ + + S Q L L ++ Q
Sbjct: 304 SQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQL 363

Query: 796 EQRILELEQN------ERQ-LHKQLDAKDSAQE--EGQSQPLREQLDQALQAQQIKQEAL 846
E +LE+ RQ L + LDA A++ E + +L + + +E+
Sbjct: 364 EAEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEESK 423

Query: 847 TVLRRQQAELQELCDSAGTNKKQQLAK 873
+ +++AELQ ++ K++LAK
Sbjct: 424 KLTEKEKAELQAKLEAEAKALKEKLAK 450


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_2579  HTHFIS  64     4e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 63.7 bits (155), Expect = 4e-13
Identities = 24/112 (21%), Positives = 47/112 (41%), Gaps = 3/112 (2%)

Query: 120 QSVKVLVADDSVVSRKFIRSLLEQHLFQVIEADDGISALETLNDNPDITLLITDYNMPRL 179
+LVADD R + L + + V + + + L++TD MP
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD-GDLVVTDVVMPDE 60

Query: 180 DGFGLIIKVRERLGREELAIIGLSSDSDESLSARFIKNGANDFLQKPFVHEE 231
+ F L+ ++++ + ++ + ++ A + GA D+L KPF E
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKA--SEKGAYDYLPKPFDLTE 110



Score = 57.1 bits (138), Expect = 5e-11
Identities = 22/116 (18%), Positives = 51/116 (43%), Gaps = 5/116 (4%)

Query: 2 RILVVEDSQVVSRVMRHLLTQELCCEVDVASDMASAKELLAHNEYFVAITDLNLPDAQEG 61
ILV +D + V+ L++ +V + S+ A+ +A + + +TD+ +PD
Sbjct: 5 TILVADDDAAIRTVLNQALSRA-GYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 EIVKFVLE--KQIPCIVLTGSWDAEQRARLLQLGIVDYVFKENRFSYEYTAKLVKR 115
+++ + + +P +V++ + + G DY+ K F ++ R
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKP--FDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_2582  HTHTETR  42     4e-07    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 42.3 bits (99), Expect = 4e-07
Identities = 15/57 (26%), Positives = 27/57 (47%)

Query: 11 QSSRSDGQARRIAILEATLRLIVREGIRGVRHRAVASEANVPLSSTTYYFDDIKDLI 67
+ ++ + Q R IL+ LRL ++G+ +A A V + ++F D DL
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLF 59


87  Shal_2756  Shal_2767  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2756  -2      11       -1.001378  phage shock protein A
Shal_2757  -1      12       -1.083572  Fis family transcriptional regulator
Shal_2758  -1      10       -0.842655  extracellular solute-binding protein
Shal_2759  0       13       -0.862144  binding-protein-dependent transport system inner
Shal_2760  0       16       -0.507851  binding-protein-dependent transport system inner
Shal_2761  1       16       -0.641749  oligopeptide/dipeptide ABC transporter ATPase
Shal_2762  2       18       -1.075409  ABC transporter-like protein
Shal_2763  2       19       -0.671528  trans-2-enoyl-CoA reductase
Shal_2764  4       24       -0.930675  PpiC-type peptidyl-prolyl cis-trans isomerase
Shal_2765  2       23       -0.247199  histone family protein DNA-binding protein
Shal_2766  1       19       -0.249921  ATP-dependent protease La
Shal_2767  1       24       -0.811314  ATP-dependent protease ATP-binding subunit ClpX
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_2756  RTXTOXIND  30     0.008    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.8 bits (67), Expect = 0.008
Identities = 26/156 (16%), Positives = 61/156 (39%), Gaps = 18/156 (11%)

Query: 42 EVRSTSAKVLAEKKEIIRR-IAKVQEQVQDWESKAELALSKDREDLAKAALVEKQKANEL 100
+ A++ + +I+ R I + + + E L +L+++Q +
Sbjct: 140 QSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQ 199

Query: 101 AQ---------TLAAELVVVEEHILRLKEEVNLLQEKLADAKARQKTIIMRKQTASSRLE 151
Q AE + V I R + + + +L D + + + A ++
Sbjct: 200 NQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSS------LLHKQAIAKHA 253

Query: 152 VKKQLDSSKIDNAMSKFEQYERRVENLESQVDSYDL 187
V +Q +K A+++ Y+ ++E +ES++ S
Sbjct: 254 VLEQ--ENKYVEAVNELRVYKSQLEQIESEILSAKE 287


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_2757  HTHFIS  353    e-121    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 353 bits (908), Expect = e-121
Identities = 113/353 (32%), Positives = 185/353 (52%), Gaps = 9/353 (2%)

Query: 6 QQDNLIGQSNALLEVLEHISQVAPLSKPVLIIGERGTGKELIAERLHYLSKRWDQSFIKL 65
L+G+S A+ E+ ++++ ++I GE GTGKEL+A LH KR + F+ +
Sbjct: 135 DGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAI 194

Query: 66 NCSSLSENLLESELFGHDAGAFTGASKKHEGRFERADGGTLFLDELANTSGLIQEKLLRV 125
N +++ +L+ESELFGH+ GAFTGA + GRFE+A+GGTLFLDE+ + Q +LLRV
Sbjct: 195 NMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRV 254

Query: 126 IEYGEFERVGGSKTVQTNVRLICAANEDLPSLAEAGEFRPDLLDRLAFDVITLPPLRHRS 185
++ GE+ VGG ++++VR++ A N+DL G FR DL RL + LPPLR R+
Sbjct: 255 LQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRA 314

Query: 186 EDIMALAEYFAVGMARQLKLELFEGFSRSAVEQLMEYQWPGNIRELKNVVERSVYRNADT 245
EDI L +F ++ F + A+E + + WPGN+REL+N+V R
Sbjct: 315 EDIPDLVRHFVQQAEKEGLDVK--RFDQEALELMKAHPWPGNVRELENLVRRLTALYPQ- 371

Query: 246 NAAIEQIIIDPFASPYRPTKRVKTKERQQIVSPEVNVTAPDSTTEVSANTSTLANAAVSF 305
+ I + + R + ++ + +++ + E A+
Sbjct: 372 ----DVITREIIENELRSE--IPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPP 425

Query: 306 PIDFKTHCEQGEVRILKQALEAGQFNQKKTAELLGLSYHQLRGILKKYNLLDK 358
+ + E ++ AL A + NQ K A+LLGL+ + LR +++ +
Sbjct: 426 SGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVSVY 478


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_2761  HTHFIS  30     0.011    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.2 bits (68), Expect = 0.011
Identities = 9/16 (56%), Positives = 14/16 (87%)

Query: 38 LVGESGSGRTLLARAI 53
+ GESG+G+ L+ARA+
Sbjct: 165 ITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_2762  PF05272  30     0.013    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.013
Identities = 12/50 (24%), Positives = 19/50 (38%), Gaps = 1/50 (2%)

Query: 43 LAIVGEAGSGKSTIARILVGAEIRSGGEIFFEGEPLDKHDLKQRCRLIRM 92
+ + G G GKST+ LVG + S G D ++ +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDI-GTGKDSYEQIAGIVAYEL 647


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2765  DNABINDINGHU  118    6e-39    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 118 bits (298), Expect = 6e-39
Identities = 50/88 (56%), Positives = 67/88 (76%)

Query: 2 NKSELIEKIASGADISKAAAGRALDSFIGAVTDGLKEGDKIALVGFGTFEVRQRAERTGR 61
NK +LI K+A +++K + A+D+ AV+ L +G+K+ L+GFG FEVR+RA R GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGKEIKIAAANIPAFKAGKALKDAV 89
NPQTG+EIKI A+ +PAFKAGKALKDAV
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_2766  HTHFIS  34     0.002    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 34.4 bits (79), Expect = 0.002
Identities = 28/152 (18%), Positives = 60/152 (39%), Gaps = 26/152 (17%)

Query: 304 HKRSKIKRDLAKAQDVLD--TDHFGLEKVKERILEYLAVQSRVKQLKGPILCLVGPPGVG 361
+ + + +QD + ++++ + +R+ Q ++ + G G G
Sbjct: 121 EPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVL-------ARLMQTDLTLM-ITGESGTG 172

Query: 362 KTSLGQSIAKATGRK---YVRVALGGVRD---EAEIRGHRRTYIGSMPGKVIQKMSKVGV 415
K + +++ R+ +V + + + E+E+ GH + G+ G + +
Sbjct: 173 KELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEK---GAFTGAQTRSTGRFEQ 229

Query: 416 KN--PLFLLDEIDKMSSDMRGDPASALLEVLD 445
LFL DEI M D + + LL VL
Sbjct: 230 AEGGTLFL-DEIGDMPMDAQ----TRLLRVLQ 256


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_2767  HTHFIS  30     0.021    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.8 bits (67), Expect = 0.021
Identities = 14/74 (18%), Positives = 30/74 (40%), Gaps = 13/74 (17%)

Query: 60 QDQDKLPTPHELRAHLDDYVIGQDKAKKVLAVAVYNHYKRLRNATPKDGVELGKSNILLI 119
+ + P+ E + ++G+ A + + Y+ L D +++
Sbjct: 120 AEPKRRPSKLEDDSQDGMPLVGRSAAMQEI-------YRVLARLMQTD------LTLMIT 166

Query: 120 GPTGSGKTLLAETL 133
G +G+GK L+A L
Sbjct: 167 GESGTGKELVARAL 180


88  Shal_2789  Shal_2794  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2789  0       15       -2.126114  transcriptional regulator CdaR
Shal_2790  0       16       -2.534344  type IV pilin
Shal_2791  1       15       -2.331676  OmpA/MotB domain-containing protein
Shal_2792  0       13       -2.715725  hypothetical protein
Shal_2793  0       13       -3.161368  hypothetical protein
Shal_2794  -2      10       -2.137120  P pilus assembly protein porin PapC-like
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_2789  HTHFIS  29     0.028    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.4 bits (66), Expect = 0.028
Identities = 9/32 (28%), Positives = 19/32 (59%)

Query: 325 MNAYLLHFGDLQQCANVLFIHRNTLRYRLDRI 356
+ A G+ + A++L ++RNTLR ++ +
Sbjct: 442 LAALTATRGNQIKAADLLGLNRNTLRKKIREL 473


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2790  BCTERIALGSPG  43     5e-08    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 43.3 bits (102), Expect = 5e-08
Identities = 20/54 (37%), Positives = 37/54 (68%), Gaps = 4/54 (7%)

Query: 8 KNGFTLIELVVVIIVLGILAVIALPKFVNF--HADSKVAVLDGIAGAMKSGLDL 59
+ GFTL+E++VVI+++G+LA + +P + AD + AV D + A+++ LD+
Sbjct: 7 QRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIV--ALENALDM 58


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
Shal_2791  OMPADOMAIN  151    3e-45    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 151 bits (382), Expect = 3e-45
Identities = 82/337 (24%), Positives = 133/337 (39%), Gaps = 53/337 (15%)

Query: 21 AVAAKSFYSDGDFYFGGKLGGVLLDSEEPLEPGENKMVN-LSSGLTLGYNINSYLSVETD 79
A A++ D +Y G KLG + N L +G GY +N Y+ E
Sbjct: 16 ATVAQAAPKDNTWYTGAKLGWSQYHDTGFINNNGPTHENQLGAGAFGGYQVNPYVGFEMG 75

Query: 80 MSYLGQYQDELSGEDKNLFAIGASLS--TRYRLSDDAALYVKLGPAWVEDN--------- 128
+LG+ + S E+ A G L+ Y ++DD +Y +LG +
Sbjct: 76 YDWLGRMPYKGSVENGAYKAQGVQLTAKLGYPITDDLDIYTRLGGMVWRADTKSNVYGKN 135

Query: 129 ----VSLSSGLGIKYRLSPNWELDSGYRWI-----KDTPSTDDDLYEFTLGINYKFGVVS 179
VS G++Y ++P Y+W T T D +LG++Y+FG
Sbjct: 136 HDTGVSPVFAGGVEYAITPEIATRLEYQWTNNIGDAHTIGTRPDNGMLSLGVSYRFGQGE 195

Query: 180 HQPVRPAVEPTLKHKKKVELLDATSTTAITVSKPVSISAKSLFGFDSSKLVATEALAEVL 239
PV A P + +K ++ + LF F+ + L L
Sbjct: 196 AAPV-VAPAP--------------APAPEVQTKHFTLKSDVLFNFNKATL--KPEGQAAL 238

Query: 240 KNVIA------SKDTEICITAYTDSLGAKEYNLTLSKQRAEATRSYFISNGVEPSRIHVD 293
+ + KD + + YTD +G+ YN LS++RA++ Y IS G+ +I
Sbjct: 239 DQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYLISKGIPADKISAR 298

Query: 294 WKGEASPVATNRTAEGR---------ALNRRVKIELN 321
GE++PV N + A +RRV+IE+
Sbjct: 299 GMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVK 335


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_2794  PF00577  56     5e-10    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 56.0 bits (135), Expect = 5e-10
Identities = 70/476 (14%), Positives = 142/476 (29%), Gaps = 91/476 (19%)

Query: 285 RIEVYRDGHLIYSKNVDAGQQSIAFRDLPYGSYTASIVVI---SAGREILKERQQIVN-- 339
++ + ++G+ IY+ V G +I D+ + + V + G + V
Sbjct: 310 QVTIKQNGYDIYNSTVPPGPFTI--NDIYAAGNSGDLQVTIKEADGS----TQIFTVPYS 363

Query: 340 NSAFSLNKGEYDYSFSAGRFNDRYDNSYDGGVEQSQLTKIQAKLGYLVGPDSFIQADGFI 399
+ +G YS +AG + + + F Q
Sbjct: 364 SVPLLQREGHTRYSITAGEY-RSGNAQQEK--------------------PRFFQ----- 397

Query: 400 HDGDYSYKRTGGLNAYLKTL-TEPYS---LQLESNNFVEGKLSYQLTDSTMIGGRILSNS 455
+ G Y T + Y + N G LS +T + + +S
Sbjct: 398 --STLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQAN---STLPDDS 452

Query: 456 DSTLSELGVKTSLSDDST-AQLKFAS-------FSNGSQFIAADVSFYNIGVGYEKFDSA 507
+ + S + + ++ + N + + ++ YNI +
Sbjct: 453 QHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNI--ETQDGVIQ 510

Query: 508 DSDFGLDNFMLSNTGYQRLNVNLSSDLWGGQGYVLYVNNKLDAENEAQPFLDQSDYW--- 564
D + L+ +L + ++ L G LY++ YW
Sbjct: 511 VKPKFTDYYNLAYNKRGKLQLTVTQQL-GRTST-LYLS------------GSHQTYWGTS 556

Query: 565 ----SVSAGFSHSFVADSVINFSATFQGGESFGVKDDWYASVLW-SVPLSAGWSASS--- 616
AG + +F IN++ ++ ++ K L ++P S + S
Sbjct: 557 NVDEQFQAGLNTAF---EDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQ 613

Query: 617 --SVSVSRQGLDEFRNSVANDRQLSRNLSMNNELGISYNGTDIERNMSSDLSS---NINY 671
S S + + N + L +N L S + S+ +NY
Sbjct: 614 WRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNY 673

Query: 672 NNGYVASDTYAYISSDGTHSVSSSFNSTQVLSGKGEVYFSSEQSDAYIIVDAQNQG 727
GY ++ S D + + + G V +D ++V A
Sbjct: 674 RGGYGNANIGYSHSDDIK-QLYYGVSGGVLAHANG-VTLGQPLNDTVVLVKAPGAK 727



Score = 34.1 bits (78), Expect = 0.003
Identities = 41/329 (12%), Positives = 94/329 (28%), Gaps = 35/329 (10%)

Query: 428 ESNNFVEGKLSYQLTDSTMIGGRILSNSDSTLSELGVKTSLSDDSTAQLKFASFSNGSQF 487
E F + L + L I G G+ ++ + + +N +
Sbjct: 391 EKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVD-MTQANSTLP 449

Query: 488 IAADVSFYNIGVGYEK-FDSADSDFGLDNFMLSNTGYQRLNVNLSSDLWGGQGYVLYVNN 546
+ ++ Y K + + ++ L + S +GY S + G
Sbjct: 450 DDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVI 509

Query: 547 KLDAENEAQPFLDQSDYWSVSAGFSHSFVADSVINFSATFQGGESFGVKDDWYASVLWSV 606
++ + L + + + S + S + Q ++ +
Sbjct: 510 QVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQ---------TYWGTSNVDE 560

Query: 607 PLSAGWS-----ASSSVSVSRQGLDEFRNSVANDRQLSRNLSMNNELGISYNGTDIERNM 661
AG + + ++S S + ++ D+ L+ N+++ + + R+
Sbjct: 561 QFQAGLNTAFEDINWTLSYSLT-KNAWQKG--RDQMLALNVNIPFSHWLRSDSKSQWRHA 617

Query: 662 SSDLSSNINYNNGYVASDTY----------------AYISSDGTHSVSSSFNSTQVLSGK 705
S+ S + + N Y +S S+ + + G
Sbjct: 618 SASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGY 677

Query: 706 GEVYFSSEQSDAYIIVDAQNQGSDDAHRG 734
G SD + G AH
Sbjct: 678 GNANIGYSHSDDIKQLYYGVSGGVLAHAN 706


89  Shal_2833  Shal_2840  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2833  1       14       -1.424434  alanine racemase
Shal_2834  0       15       -0.733737  TetR family transcriptional regulator
Shal_2835  0       11       1.412378   xanthine/uracil/vitamin C permease
Shal_2836  -1      11       1.429740   cryptic adenine deaminase
Shal_2837  -2      12       1.636320   hypothetical protein
Shal_2838  -2      12       1.389551   hypothetical protein
Shal_2839  -2      12       1.594679   hypothetical protein
Shal_2840  -3      12       1.974859   collagenase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
Shal_2833  ALARACEMASE  188    3e-58    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 188 bits (479), Expect = 3e-58
Identities = 93/372 (25%), Positives = 159/372 (42%), Gaps = 28/372 (7%)

Query: 42 SWLEVNLTQFDSNIKELKNHIQSNVKVCAIMKADAYGNGIAGLMPTVLANNIPCIGVTSN 101
++L N+ ++ + +V +++KA+AYG+GI + + A + + +
Sbjct: 5 IQASLDLQALKQNLSIVRQAAT-HARVWSVVKANAYGHGIERIWSAIGATD--GFALLNL 61

Query: 102 EEIRIVRDAGFRGSLIRVRSASIAEISDTL-QYDVEEFVGDANQATLLSAMAAKLDKPLK 160
EE +R+ G++G ++ + A+ + Q+ + V Q L A+L PL
Sbjct: 62 EEAITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQN--ARLKAPLD 119

Query: 161 VHLVLNSGGMGRNGVDVSTQSGLNEAVQIATTADLDVVGIMTHFPSYDRDDVLVKSKSFY 220
++L +NSG M R G L Q+ A++ + +M+HF + D
Sbjct: 120 IYLKVNSG-MNRLGF--QPDRVLTVWQQLRAMANVGEMTLMSHFAEAEHPD-------GI 169

Query: 221 NSALNIIEESGLKRDSVTIHSGNSYVALNVPEAHFDMVRPGGVLYGDQPT-------NPE 273
+ A+ IE++ + S NS L PEAHFD VRPG +LYG P+ N
Sbjct: 170 SGAMARIEQAAEGLECRRSLS-NSAATLWHPEAHFDWVRPGIILYGASPSGQWRDIANTG 228

Query: 274 FPSIVTFKTRIASLVDLPKGATVGYDSTIKLERESLLANLPVGYSDSFPRKMGNTADVLV 333
++T + I + L G VGY E + + GY+D +PR VLV
Sbjct: 229 LRPVMTLSSEIIGVQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAPTGTPVLV 288

Query: 334 NGQRAKVMGVISMNTSMIDVTDLSGVKEGDEVVLFGYQGKESILAPEFEKNADVIFPEIY 393
+G R +G +SM+ +D+T G V L+G + I + A + E+
Sbjct: 289 DGVRTMTVGTVSMDMLAVDLTPCPQAGIGTPVELWG----KEIKIDDVAAAAGTVGYELM 344

Query: 394 TQWGQTNPRIYV 405
P + V
Sbjct: 345 CALALRVPVVTV 356
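
The percentages on the summary line of this hit can be re-derived from the raw counts. The sketch below is illustrative only and is not part of PredictBias: the names `STATS_LINE`, `parse_stats` and `floor_percent` are hypothetical, and rounding down is an assumption that happens to reproduce the figures printed for this hit.

```python
import re

# Summary line copied from the Shal_2833 / ALARACEMASE hit above.
STATS_LINE = "Identities = 93/372 (25%), Positives = 159/372 (42%), Gaps = 28/372 (7%)"

# Each field has the shape "Name = matched/length (percent%)".
FIELD_RE = re.compile(r"(\w+) = (\d+)/(\d+) \((\d+)%\)")


def parse_stats(line):
    """Return {name: (count, length, reported_percent)} for one summary line."""
    return {
        name: (int(count), int(length), int(pct))
        for name, count, length, pct in FIELD_RE.findall(line)
    }


def floor_percent(count, length):
    """Percentage rounded down (assumed rounding rule, see lead-in)."""
    return (100 * count) // length


stats = parse_stats(STATS_LINE)
for name, (count, length, reported) in stats.items():
    # Print the reported value next to the recomputed one for comparison.
    print(name, f"{count}/{length}", reported, floor_percent(count, length))
```

The same check can be applied to any other "Identities = ..." line in this report, including those that carry no Gaps field.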


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_2834  HTHTETR  68     4e-16    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 67.7 bits (165), Expect = 4e-16
Identities = 28/179 (15%), Positives = 57/179 (31%), Gaps = 14/179 (7%)

Query: 1 METKSKTRGRPKQSPVHTPCQQNILSVARRLFLTFSYSKVTTRKIADEADVDVALIGYYF 60
M K+K + + Q+IL VA RLF S + +IA A V I ++F
Sbjct: 1 MARKTKQEAQETR--------QHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHF 52

Query: 61 GSKAELYKAVFDDIYAPFFEKLNQL--KQASSEINTIEELFMTIFNLDIEHP--ELLSII 116
K++L+ +++ + E + K ++ + E+ + + + LL I
Sbjct: 53 KDKSDLFSEIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEI 112

Query: 117 YKTLVLNDG--PKRNYFNKQILNYTDTFHVNIFAKLQRDGFIDKNLDVELLKDSFNALV 173
G + + + + +L +
Sbjct: 113 IFHKCEFVGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYI 171


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_2836  UREASE  34     0.002    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 33.9 bits (78), Expect = 0.002
Identities = 18/72 (25%), Positives = 32/72 (44%), Gaps = 12/72 (16%)

Query: 31 DLKLTNVSLLDLINGEVIPGPILIDKGHIIAVGQE--------VDLL--AAVRVVDCHGQ 80
D +TN +LD ++ I + G I A+G+ V ++ V+ G+
Sbjct: 69 DTVITNALILDHWG--IVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGK 126

Query: 81 IAVPGFIDAHMH 92
I G +D+H+H
Sbjct: 127 IVTAGGMDSHIH 138


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
Shal_2840  MICOLLPTASE  277    2e-80    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 277 bits (709), Expect = 2e-80
Identities = 108/589 (18%), Positives = 221/589 (37%), Gaps = 50/589 (8%)

Query: 92 PMGASGPTAEAEDNMC--SSELANLSGEALFDAVRQAEISCISELYSRNDAVSVAAYQTE 149
P+G S + A +N EL ++ L + ++ + +L++ ND +
Sbjct: 77 PLGPSIAPSRARNNKIYTFDELNRMNYSDLVELIKTISYENVPDLFNFNDGSYTFFSNRD 136

Query: 150 NVVSVANQAAAIAATYDSSTGYEMRNLFYFLRGAFYIEFYNDDLAYSDTQA-ADAVYAAL 208
V ++ TY + + L FLR +Y+ FYN L+Y +T + A+
Sbjct: 137 RVQAIIYGLEDSGRTYTADDDKGIPTLVEFLRAGYYLGFYNKQLSYLNTPQLKNECLPAM 196

Query: 209 VEYAKNPKLFEITPSAGDTLMEFFTSWASSDHILESVPVITDYLQMFNADFLASNRHRAA 268
N T + + ++ E + L F + + +
Sbjct: 197 KAIQYNSNFRLGTKAQDGVVEALGRLIGNASADPEVINNCIYVLSDFKDNIDKYGSNYSK 256

Query: 269 MTSALTTLYYGSW---EEAYNSKAMEH---------GELIDALLNIATADYIINSDYQYE 316
+ + + YN+K + ++ L ++ T +N+D +
Sbjct: 257 GNAVFNLMKGIDYYTNSVIYNTKGYDAKNTEFYNRIDPYMERLESLCTIGDKLNNDNAWL 316

Query: 317 STDAFHEFGRFYEYQKYWDLPQSLKTRLNDGVELYMSKFERMSAQ---WADGAGYLDYYN 373
+A + GR ++++ + Q R M ++ +S Q A+
Sbjct: 317 VNNALYYTGRMGKFREDPSISQRALERA-------MKEYPYLSYQYIEAANDLDLNFGGK 369

Query: 374 PGDCEQFGICGWEEELEQTALPINYSCSDTIYI-RAQQLTNDE-LQSSCDLMGGEETLFH 431
+ + + LP Y+ D ++ +A +E ++ + F
Sbjct: 370 NSSGNDIDFNKIKADAREKYLPKTYTFDDGKFVVKAGDKVTEEKIKRLYWASKEVKAQFM 429

Query: 432 DVLATGYQPVADDLNESLEVNIFDSYDDYAQYASVIFGINTNNGGMYLEGTPSQEGNHAR 491
V+ + ++ L V I++S ++Y + +I G +T+NGG+Y+E N
Sbjct: 430 RVVQNDKALEEGNPDDILTVVIYNSPEEY-KLNRIINGFSTDNGGIYIE-------NIGT 481

Query: 492 FIAHEATWTDEI-LVWNL-KHEYVHYLDGRFNLYGAF-NYFDIDTGKSVWWTEGLAEYIS 548
F +E T + I + L +HE+ HYL GR+ + G + G W+ EG AE+ +
Sbjct: 482 FFTYERTPEESIYTLEELFRHEFTHYLQGRYVVPGMWGQGEFYQEGVLTWYEEGTAEFFA 541

Query: 549 KQNRNDG---------AIELGRSQAYSLSEILSNTYDSGSDRVYSWGYLAVRFLFENHRS 599
R DG + R+ SL +L Y S Y++G+ +++ N+
Sbjct: 542 GSTRTDGIKPRKSVTQGLAYDRNNRMSLYGVLHAKYGS--WDFYNYGFALSNYMYNNNMG 599

Query: 600 DVDALLVLARGGDADGWLAYIDNTIGQ-NYNSEWNTWLASVSSNDDSIS 647
+ + + D G+ YI + N ++ ++ S+ +N D++
Sbjct: 600 MFNKMTNYIKNNDVSGYKDYIASMSSDYGLNDKYQDYMDSLLNNIDNLD 648



Score = 79.4 bits (195), Expect = 5e-17
Identities = 34/244 (13%), Positives = 73/244 (29%), Gaps = 32/244 (13%)

Query: 454 FDSYDDYA-QYASVIFGINTNNGGMYLEGTPSQEGNHARFIAHEATWTDEILVWNLKHEY 512
+ S+D Y +A + N N G + + ++ + L +Y
Sbjct: 577 YGSWDFYNYGFALSNYMYNNNMGMFNKMTNYIKNND---VSGYKDYIASMSSDYGLNDKY 633

Query: 513 VHYLDGRFNLYGAFNYFDIDTGKSVWWTEGLAEYISKQNRNDGAIE-LGRSQAYSLSEIL 571
Y+D N D+ + A+ I++ + + + +
Sbjct: 634 QDYMDSLLNNIDNL---DVPLVSDEYVNGHEAKDINEITNDIKEVSNIKDLSSNVEKSQF 690

Query: 572 SNTYDSGSDRVYSWGYLAVRFLFENH-----RSDVDALLVLARGGDADGWLAYI------ 620
TYD Y+ R E + S ++ +L +G+
Sbjct: 691 FTTYD------MRGTYVGGRSQGEENDWKDMNSKLNDILKELSKKSWNGYKTVTAYFVNH 744

Query: 621 --DNTIGQNYNSEWNTWLASVSSNDDSISTGITPPVDSDGDGVIDSQDAFPHDPSETHDT 678
D Y+ ++ ++ D ++ + SD +++ + F T
Sbjct: 745 KVDGNGNYVYDVVFHGMNTDTNT-DVHVNKEPKAVIKSDSSVIVEEEINF----DGTESK 799

Query: 679 DGDG 682
D DG
Sbjct: 800 DEDG 803


90  Shal_2882  Shal_2890  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2882  0       12       -2.372769  fimbrial biogenesis outer membrane usher
Shal_2883  0       13       -2.190612  hypothetical protein
Shal_2884  0       10       -0.561691  homoserine O-succinyltransferase
Shal_2885  0       11       -1.174000  hypothetical protein
Shal_2886  0       11       -1.408710  short-chain dehydrogenase/reductase SDR
Shal_2887  0       13       -2.311252  outer membrane protein W
Shal_2888  0       13       -1.940473  RND family efflux transporter MFP subunit
Shal_2889  0       11       -1.725926  acriflavin resistance protein
Shal_2890  0       14       -2.713598  LysR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_2882  PF00577  110    6e-27    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 110 bits (277), Expect = 6e-27
Identities = 93/672 (13%), Positives = 197/672 (29%), Gaps = 84/672 (12%)

Query: 95 AQKGLDLSFNPSTLTLDLEIDPEAFGQFDVDFNNQFNAFVP----SQSGTFSWLN-SINF 149
+ L+L I N+ ++P LN + +
Sbjct: 141 MIHDATAQLDVGQQRLNLTIPQAFMS-------NRARGYIPPELWDPGINAGLLNYNFSG 193

Query: 150 THSENWESTTNSRFSTADWLAQMNFGGA---NGINITLANHLEANDSETNLLRGEWTAFY 206
+N NS ++ + + +N G + + + ++ S+
Sbjct: 194 NSVQN-RIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLER 252

Query: 207 DRPSAPFRLAMGDVESGSSVAGHLSGTSMGGLSIKSDYAELQPERIIGPNNSQELILKES 266
D RL +GD + + G + G + SD + P+ G I + +
Sbjct: 253 DIIPLRSRLTLGDGYTQGDI---FDGINFRGAQLASDDN-MLPDSQRGFAPVIHGIARGT 308

Query: 267 AEVEITVNGQIIFSGRQEAGRFNLNNLPMTNGANDIIVNVTYLSGKSERFVFSQFYNNKL 326
A+V I NG I++ G F +N++ + D+ V + G ++ F L
Sbjct: 309 AQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLL 368

Query: 327 LNKDMLNFGVTAGAPSIYAENGVEYLDTWTVIGFTEYGVSSWLTLGANAAIAQYGQILGA 386
+ + +TAG Y + +G+ + T+ +A +
Sbjct: 369 QREGHTRYSITAGE---YRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNF 425

Query: 387 TATLGT-SWGNLSSRVSFSNL----EDTGMGNIVSFTFES---------TVAGNSNNAPN 432
+ G LS ++ +N + G V F + + G +
Sbjct: 426 GIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSG 485

Query: 433 LRLSADFSDTFTSSPWDK--DILATSYERYLANYVWIFNNAWDTTISGSY---------- 480
AD + + + + D + ++ Y +N ++ +
Sbjct: 486 YFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYL 545

Query: 481 ------YKDTHDVEQS-NLSMTLNWRMGDWTVGTGANYNDTDTYENADIEYFVTLDWRRS 533
Y T +V++ + + +WT+ N + + V + +
Sbjct: 546 SGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHW 605

Query: 534 NTQNKVNLAANYNSANNYARLEASKTS-----SDRVGS---IGYRAQAEYEDGRDSQNAQ 585
+ + + +++ + + + + + + Y Q Y G D +
Sbjct: 606 LRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGS 665

Query: 586 F---DYTANRLRVELEVERNNSNASIDNDA-SYSASIRGNTAIGIVDGQVGWGRAQDGPF 641
N S +D + G + V G+ +
Sbjct: 666 TGYATLNYRGGYGNA-----NIGYSHSDDIKQLYYGVSG--GVLAHANGVTLGQPLNDTV 718

Query: 642 IISKLHPSLPQQQVMLGIDQKQNYEASGTSMIG-ALLPLEVAYVSNVVDLNVPDAPIGYD 700
++ K P +V +N T G A+LP Y N V L+ D
Sbjct: 719 VLVKA-PGAKDAKV-------ENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVD 770

Query: 701 WGESRLTVSPGA 712
+ V P
Sbjct: 771 LDNAVANVVPTR 782


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2886  DHBDHDRGNASE  46     4e-08    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 46.2 bits (109), Expect = 4e-08
Identities = 36/165 (21%), Positives = 69/165 (41%), Gaps = 10/165 (6%)

Query: 6 AIIVGASSLLSREIAKQLADEGVELALLAQDPQSLIEFSQSLPTKA---QVFPLQIDEPE 62
A I GA+ + +A+ LA +G +A + +P+ L + SL +A + FP + +
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 63 AIISTLNAVWQSLGGAHLILVNTGLTHYDP--ELPWQPEQDIITVNVQGFSAICNTAFRL 120
AI + + +G +++ G+ L + + +VN G + +
Sbjct: 71 AIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKY 130

Query: 121 FRDQGYGQLAAINSIAGLRGGPSV---AYHASKAYAQNYFEGLSM 162
D+ G + + S G P AY +SKA A + + L +
Sbjct: 131 MMDRRSGSIVTVGSNPA--GVPRTSMAAYASSKAAAVMFTKCLGL 173


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_2888  RTXTOXIND  53     1e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 52.5 bits (126), Expect = 1e-09
Identities = 37/188 (19%), Positives = 78/188 (41%), Gaps = 21/188 (11%)

Query: 103 SVQAAQLKEQEVQLPAVRDDYNRQLRLYKDKSISQQSLQAAQAKYEALQANIESLQANIE 162
V +QL++ E ++ + +++Y +L+K++ + + L+ L + + +
Sbjct: 269 RVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDK--LRQTTDNIGLLTLELAKNEERQQ 326

Query: 163 LRQIKAPFNG-VLGIRNINLGQFLSAGTN---IVRLEDLSVMRVRFSVPQSQIGKLSVGQ 218
I+AP + V ++ G ++ IV +D + V V IG ++VGQ
Sbjct: 327 ASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDD--TLEVTALVQNKDIGFINVGQ 384

Query: 219 PISLTVDPYPDRIY---SGKINAIEP--VINYKTGLV------MIQAELPNDDQT--LRG 265
+ V+ +P Y GK+ I + + + GLV + + L ++ L
Sbjct: 385 NAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLSTGNKNIPLSS 444

Query: 266 GMFAEVSV 273
GM +
Sbjct: 445 GMAVTAEI 452


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2889  ACRIFLAVINRP  797    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 797 bits (2059), Expect = 0.0
Identities = 319/1030 (30%), Positives = 540/1030 (52%), Gaps = 34/1030 (3%)

Query: 7 FIHRPVMAASLSILLFILGLNAMKDLQVRQYPEMTNTVITVTTPYPGADAELIEGFITQP 66
FI RP+ A L+I+L + G A+ L V QYP + ++V+ YPGADA+ ++ +TQ
Sbjct: 5 FIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQV 64

Query: 67 LEQALSQVDNLDFMTSSSQ-LGTSTIILNMRLNTNPEEALANVLSQINSVTNQLPKEAYN 125
+EQ ++ +DNL +M+S+S G+ TI L + T+P+ A V +++ T LP+E
Sbjct: 65 IEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQQ 124

Query: 126 PSVTSSTGATTSIMYISFFSKELDSSQ--IVDYLERVVNPQLSTVNGVSNIQMMGGTPFA 183
++ +++ +M F S ++Q I DY+ V LS +NGV ++Q+ G +A
Sbjct: 125 QGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA-QYA 183

Query: 184 LRIWPNPVKLGQYNLTTADLITVLQQNNYQSAVGQFNSSLTI------LNGTINTQVDTV 237
+RIW + L +Y LT D+I L+ N Q A GQ + + + T+
Sbjct: 184 MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNP 243

Query: 238 DGLKKLVIKSD-SGQVVQLQDVADISLGKSSDSTRALANGREAVIIGINATPTANPLDVA 296
+ K+ ++ + G VV+L+DVA + LG + + A NG+ A +GI AN LD A
Sbjct: 244 EEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDTA 303

Query: 297 EGIRTKFDLVEKNLPTTIQSTLIYDSTLAITDSIDEVVKTIAEAAVIVIVVILLFLGSFR 356
+ I+ K ++ P ++ YD+T + SI EVVKT+ EA ++V +V+ LFL + R
Sbjct: 304 KAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNMR 363

Query: 357 AVIIPIITIPLSLIGVIAVMQVFGFSLNLMTLLAMVLAIGLVVDDAIVVVENVDRHIKM- 415
A +IP I +P+ L+G A++ FG+S+N +T+ MVLAIGL+VDDAIVVVENV+R +
Sbjct: 364 ATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMMED 423

Query: 416 GTKPLEAALVATREIAVPIISMTITLAAVYAPIALMGGVTGSLFKEFALTLAGSVVISGF 475
P EA + +I ++ + + L+AV+ P+A GG TG+++++F++T+ ++ +S
Sbjct: 424 KLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVL 483

Query: 476 VALTLSPMMCSKMLKP-----HAQPSKFEETVERVLGGLTSRYHKTLLAVLDHRGAMVFF 530
VAL L+P +C+ +LKP H F + Y ++ +L G +
Sbjct: 484 VALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYLLI 543

Query: 531 AVIVFVSLPIIFSFIPSQLAPDEDQGVVVVIGTTPSTSNNDYIQANMELITQIISKQPQ- 589
++ + ++F +PS P+EDQGV + + P+ + + Q ++ +T K +
Sbjct: 544 YALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNEKA 603

Query: 590 -VEASLALIGV----PTSSQGLAIAPLVPWSDR---ALSQKQISANVNKQAKKIPGITAT 641
VE+ + G + G+A L PW +R S + + + KI
Sbjct: 604 NVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGFVI 663

Query: 642 AYQMP--SLPGASGGLPIQFVITSPASFETLYNIGNKLLEKAKKSPL-LVYTDLNLKYDS 698
+ MP G + G + + + + L N+LL A + P LV N D+
Sbjct: 664 PFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLEDT 723

Query: 699 GLVQLKIDREAAGAYGVTMQEIGSTLGTMMGGGYINRVNIDGRSYEVIPQVERVYRANPK 758
+L++D+E A A GV++ +I T+ T +GG Y+N GR ++ Q + +R P+
Sbjct: 724 AQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRMLPE 783

Query: 759 LIDQFYVTAQNGTAIPLSSLVSYSVNGNAKSLPHFNQMNSISIEAVPAPNVAMGDAIAYF 818
+D+ YV + NG +P S+ + + L +N + S+ I+ AP + GDA+A
Sbjct: 784 DVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMALM 843

Query: 819 EQLAKTDLPPGFTYTFMGEARQFVEEGSSLYVTFALALAIIFLVLASQFESVRDPLVIMF 878
E L + LP G Y + G + Q G+ A++ ++FL LA+ +ES P+ +M
Sbjct: 844 ENL-ASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVML 902

Query: 879 TVPLAISGALIALGWTHVSGETSLNIYSQVGLITLVGLITKHGILMCEVAKEEQIFKGAT 938
VPL I G L+A + ++Y VGL+T +GL K+ IL+ E AK+ +G
Sbjct: 903 VVPLGIVGVLLAATLFN----QKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKG 958

Query: 939 KHDAIIEAATVRLRPILMTTSAMIAGLIPLLFASGAGAASRYNIGLVIVAGLAVGTAFTL 998
+A + A +RLRPILMT+ A I G++PL ++GAG+ ++ +G+ ++ G+ T +
Sbjct: 959 VVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAI 1018

Query: 999 FVLPVMYTYI 1008
F +PV + I
Sbjct: 1019 FFVPVFFVVI 1028


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_2890  RTXTOXINC  30     0.006    Gram-negative bacterial RTX toxin-activating protein C...
		>RTXTOXINC#Gram-negative bacterial RTX toxin-activating protein C signature.

Length = 170

Score = 30.2 bits (68), Expect = 0.006
Identities = 15/60 (25%), Positives = 26/60 (43%), Gaps = 2/60 (3%)

Query: 3 FPSRLLLLLEVAELGSFSKVAEQR--NVDRSVISKQINRLEHELGVHLLNRTTRSLSLTA 60
FP L + V KV+E +D+ + +K + HEL + ++ + SLT
Sbjct: 111 FPDELFRAIRVDPKTHVGKVSEFHGGKIDKQLANKIFKQYHHELITEVKRKSDFNFSLTG 170


91  Shal_2905  Shal_2917  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_2905  -1      14       3.059288   carbamoyl-phosphate synthase L chain
Shal_2906  -1      15       2.140936   hypothetical protein
Shal_2907  -1      14       1.999299   pyruvate carboxyltransferase
Shal_2908  -1      13       0.729247   3-oxoacid CoA-transferase subunit A
Shal_2909  -1      13       0.030389   3-oxoacid CoA-transferase subunit B
Shal_2910  -1      14       -0.317156  hydrophobe/amphiphile efflux-1 (HAE1) family
Shal_2911  -1      15       -1.270104  RND family efflux transporter MFP subunit
Shal_2912  0       15       -1.912965  two component transcriptional regulator
Shal_2913  0       15       -1.675946  histidine kinase
Shal_2914  1       15       -1.339346  hypothetical protein
Shal_2915  0       15       -0.822460  alpha/beta hydrolase fold protein
Shal_2916  0       14       -0.750488  beta-lactamase domain-containing protein
Shal_2917  1       17       -0.383822  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_2905  RTXTOXIND  31     0.014    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.3 bits (71), Expect = 0.014
Identities = 20/112 (17%), Positives = 35/112 (31%), Gaps = 12/112 (10%)

Query: 543 NLLTLSGELIDEILHAEIVQHKLGDSQAEQASKANGHKIKLPVSQVGNDFTLFINSKSYH 602
+L + ++ E+ +K Q E K V F I K
Sbjct: 253 AVLEQENKYVE--AVNELRVYKSQLEQIESEIL----SAKEEYQLVTQLFKNEILDKLRQ 306

Query: 603 YRALESEEIEEQDNLEDKL-----KAPMNGTIVTQLV-SVGDVVKAGQGIMV 648
E E++ +AP++ + V + G VV + +MV
Sbjct: 307 TTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMV 358


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_2906  RTXTOXIND  25     0.022    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 24.8 bits (54), Expect = 0.022
Identities = 10/44 (22%), Positives = 20/44 (45%)

Query: 9 AIESPFDGVVGAFFFEPGELVSDGMLLAEVEAKVETAELEPTET 52
I+ + +V + GE V G +L ++ A A+ T++
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQS 141


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2910  ACRIFLAVINRP  996    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 996 bits (2576), Expect = 0.0
Identities = 474/1033 (45%), Positives = 676/1033 (65%), Gaps = 7/1033 (0%)

Query: 1 MSYFFISRPVFAWVIAILISIAGVIAIPKLAVERFPSVAPPSIGLYLSYPGASPQTINDS 60
M+ FFI RP+FAWV+AI++ +AG +AI +L V ++P++APP++ + +YPGA QT+ D+
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VVTLIEREISGINGLLYFSSSSDASGSASITVTFENGTDPEMAQVEIQNKIRAIEPRLPL 120
V +IE+ ++GI+ L+Y SS+SD++GS +IT+TF++GTDP++AQV++QNK++ P LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 AVRQIGITVETTSSNNLMYLGLISPSGEYSEQQLSDYMVRNIVEEIKRVPGVGRVQMYGA 180
V+Q GI+VE +SS+ LM G +S + ++ +SDY+ N+ + + R+ GVG VQ++GA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 EFAMRIWVDPVALTGYNLTMEDVTSAIAEQNVQISPGKVGDSPTSGGQMTVHSLTAQGQL 240
++AMRIW+D L Y LT DV + + QN QI+ G++G +P GQ S+ AQ +
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 ETPEQFRAIVLRSNIDAGMVQLGDVAKVELGAQNYLFSNRENGQQSTSAAIQLSPGANAI 300
+ PE+F + LR N D +V+L DVA+VELG +NY R NG+ + I+L+ GANA+
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 ATSQGIRDRLAELKLAMPSNMDYSITSDAAIFAKISIQKVVVTLLEAMVLVFLVMYLFLQ 360
T++ I+ +LAEL+ P M D F ++SI +VV TL EA++LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 KIRYTLIPAIVAPIALLGTFAVMWLMGFSINVFTMFGMVLAIGIIVDDAIVVVENVERVM 420
+R TLIP I P+ LLGTFA++ G+SIN TMFGMVLAIG++VDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 AETGLPPKAATEQAMKSIYSAVIGITAVLSAVFIPMAFASGSVGTIFQQFSLAMAVSILF 480
E LPPK ATE++M I A++GI VLSAVFIPMAF GS G I++QFS+ + ++
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SAFLALTLTPALCATLLKPLTEHDHKS-GGFFGWFNRNFHNFTQKVTAGVAHLVKRTGRV 539
S +AL LTPALCATLLKP++ H++ GGFFGWFN F + T V ++ TGR
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 540 MALYIVLVVACIYSFTLVPSTFLPEEDQGYYMTSIQLPPDATVERTMDVVSKIEAFTLGR 599
+ +Y ++V + F +PS+FLPEEDQG ++T IQLP AT ERT V+ ++ + L
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 600 EA--IDASMSVLGFSFSGTGPGSAFGITTLKDWQLRGG--SSTQQEVAALSDYLSDLTEG 655
E +++ +V GFSFSG + +LK W+ R G +S + + L + +G
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 656 EVFSVLPAAIDGLGNSSNLSFQLQDSGNAGYERFQLIQKEFLEQA-YASEKLSSVFYEAL 714
V AI LG ++ F+L D G++ + + L A L SV L
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 715 PETSGIHLNIDRAKARALGVPFNAISDTISTAMGSNYVNDFPNNGRLQQVIVQVQAESRM 774
+T+ L +D+ KA+ALGV + I+ TISTA+G YVNDF + GR++++ VQ A+ RM
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 775 QLEDILQLAIRNEQGGTVYLSEFVTPVWQMAPQQLTRFNGLLAVAVSASPAPGYSTGEAM 834
ED+ +L +R+ G V S F T W +L R+NGL ++ + APG S+G+AM
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 835 QEVERIAASLPTAAGIEWTGLSLQEKQAESQTVWLLLFSMLVIFLVLAALYESWSIPLAV 894
+E +A+ LP G +WTG+S QE+ + +Q L+ S +V+FL LAALYESWSIP++V
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 895 ILVVPLGIIGAVGFVLLNGVANDVFFKVGLVTIIGLSAKNAILIVEFAKQAQ-AEGICLV 953
+LVVPLGI+G + L NDV+F VGL+T IGLSAKNAILIVEFAK EG +V
Sbjct: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960

Query: 954 DSVLQATKQRLRPILMTSLAFSCGIIPLYLATGPSSEIQNSIGTGVLGGMFTGTVFAVLF 1013
++ L A + RLRPILMTSLAF G++PL ++ G S QN++G GV+GGM + T+ A+ F
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 1014 VPVFFVFICQLLE 1026
VPVFFV I + +
Sbjct: 1021 VPVFFVVIRRCFK 1033



Score = 86.4 bits (214), Expect = 3e-19
Identities = 48/326 (14%), Positives = 121/326 (37%), Gaps = 17/326 (5%)

Query: 720 IHLNIDRAKARALGVPFNAISDTISTA---MGSNYVNDFPNNGRLQQVIVQVQAESRMQ- 775
+ + +D + + + + + + + P QQ+ + A++R +
Sbjct: 184 MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPG-QQLNASIIAQTRFKN 242

Query: 776 LEDILQLAIR-NEQGGTVYLSEFVTPVWQMAPQQ-LTRFNGLLAVAVSASPAPGYSTGEA 833
E+ ++ +R N G V L + + R NG A + A G + +
Sbjct: 243 PEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDT 302

Query: 834 ----MQEVERIAASLPTAAGIEWTGLSLQE---KQAESQTVWLLLFSMLVIFLVLAALYE 886
++ + P G++ + + + V L +++++FLV+ +
Sbjct: 303 AKAIKAKLAELQPFFP--QGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 887 SWSIPLAVILVVPLGIIGAVGFVLLNGVANDVFFKVGLVTIIGLSAKNAILIVE-FAKQA 945
+ L + VP+ ++G + G + + G+V IGL +AI++VE +
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 946 QAEGICLVDSVLQATKQRLRPILMTSLAFSCGIIPLYLATGPSSEIQNSIGTGVLGGMFT 1005
+ + ++ ++ Q ++ ++ S IP+ G + I ++ M
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 1006 GTVFAVLFVPVFFVFICQLLERVRPK 1031
+ A++ P + + + +
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHE 506


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_2911  RTXTOXIND  42     3e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 42.1 bits (99), Expect = 3e-06
Identities = 30/158 (18%), Positives = 60/158 (37%), Gaps = 8/158 (5%)

Query: 58 QGRVRPL-RSAEIRPQVEGIITQRLFKQGDTLKQGDALFQIDQDIFTIDVEIKQAALEQS 116
G++ RS EI+P I+ + + K+G+++++GD L ++ D Q++L Q+
Sbjct: 87 NGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQA 146

Query: 117 MAHLALIQS-----QVDRFTQLDRTKAVSKQAFEEVSFNLQIAAATVQQNRAQLKHSELQ 171
Q ++++ +L Q E + Q + Q + + +
Sbjct: 147 RLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKE 206

Query: 172 LEYAKVTAPISGIIGEALVTEGSLVSRTEPQPLAIIQQ 209
L K A + A + +SR E L
Sbjct: 207 LNLDKKRAERLTV--LARINRYENLSRVEKSRLDDFSS 242



Score = 37.1 bits (86), Expect = 1e-04
Identities = 43/273 (15%), Positives = 91/273 (33%), Gaps = 41/273 (15%)

Query: 39 ANIATITAELTQVTLFDELQGRVRPLRSAEIRPQVEGIITQRLFKQGDTLKQGDALFQID 98
N+ AE + R+ + + L + K A+ + +
Sbjct: 207 LNLDKKRAER------LTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKH--AVLEQE 258

Query: 99 QDIFTIDVEIKQAALEQSMAHLALIQSQVDRFTQLDRTKAVSKQAFEEVSFNLQIAAATV 158
L + L I+S++ + + V++ E+ L+ +
Sbjct: 259 NKYVEA-----VNELRVYKSQLEQIESEILSAKE--EYQLVTQLFKNEILDKLRQTTDNI 311

Query: 159 QQNRAQLKHSELQLEYAKVTAPISGIIGEALV-TEGSLVSRTEPQPLAIIQQIDKVYIDV 217
+L +E + + + + AP+S + + V TEG +V+ E + I+ + D + +
Sbjct: 312 GLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL-MVIVPEDDTLEVTA 370

Query: 218 RQPASKLGLLRRLLANEQKNSEQGVKVIVQSSA---ESADDIEGQI--LFSGISVDENTG 272
+G + G I++ A + G++ + D+ G
Sbjct: 371 LVQNKDIGFIN-----------VGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLG 419

Query: 273 ---DLIIRI-----IADNAQQKLLPGMYVRTQI 297
++II I N L GM V +I
Sbjct: 420 LVFNVIISIEENCLSTGNKNIPLSSGMAVTAEI 452


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_2912  HTHFIS  85     2e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 85.3 bits (211), Expect = 2e-21
Identities = 34/135 (25%), Positives = 60/135 (44%), Gaps = 1/135 (0%)

Query: 2 IRVLLIEDDRDLAETLLQLMELAQMLPDHVSNGIAGLTLARSNSYDVLVIDVGLPKLDGY 61
+L+ +DD + L Q + A SN + D++V DV +P + +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 SVCKQLRASGIDTPILFLTAFSSIDNKLEGFASGGDDYLAKPFDNRELIARI-TALSGRK 120
+ +++ + D P+L ++A ++ ++ G DYL KPFD ELI I AL+ K
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 121 SSHVKRLEIAGLIMN 135
K + + M
Sbjct: 124 RRPSKLEDDSQDGMP 138


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_2917  BCTERIALGSPG  48     9e-10    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 48.3 bits (115), Expect = 9e-10
Identities = 13/55 (23%), Positives = 36/55 (65%)

Query: 23 KDEGFTLIELVVVIIILGIVSVIAIPKFISFEDESRDAVMRQTLASVKSAISLVR 77
K GFTL+E++VVI+I+G+++ + +P + ++++ + ++++A+ + +
Sbjct: 6 KQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYK 60
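
As a rough illustration of how the hit tables for island 91 relate to the RPSBlast E-value cutoff of 0.05 listed under the input parameters, the sketch below collects the six hits reported above and filters them. It is not PredictBias code: the names `ISLAND_91_HITS`, `significant` and `best_hit` are hypothetical, and treating the cutoff as inclusive is an assumption.

```python
# RPSBlast E-value cutoff taken from the input parameters of this report.
E_VALUE_CUTOFF = 0.05

# (locus tag, hit, score, E-value) copied from the island 91 hit tables above.
ISLAND_91_HITS = [
    ("Shal_2905", "RTXTOXIND", 31, "0.014"),
    ("Shal_2906", "RTXTOXIND", 25, "0.022"),
    ("Shal_2910", "ACRIFLAVINRP", 996, "0.0"),
    ("Shal_2911", "RTXTOXIND", 42, "3e-06"),
    ("Shal_2912", "HTHFIS", 85, "2e-21"),
    ("Shal_2917", "BCTERIALGSPG", 48, "9e-10"),
]


def significant(hits, cutoff=E_VALUE_CUTOFF):
    """Keep hits whose E-value is at or below the cutoff (inclusive by assumption)."""
    return [hit for hit in hits if float(hit[3]) <= cutoff]


def best_hit(hits):
    """Return the hit with the lowest E-value, breaking ties by highest score."""
    return min(hits, key=lambda hit: (float(hit[3]), -hit[2]))


kept = significant(ISLAND_91_HITS)
print(len(kept), "of", len(ISLAND_91_HITS), "hits pass the cutoff")
print("strongest hit:", best_hit(ISLAND_91_HITS)[:2])
```

Passing a stricter value, for example `significant(ISLAND_91_HITS, cutoff=1e-5)`, leaves only the efflux, response-regulator and secretion-pathway hits, which is one way to separate strong matches from marginal ones.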


92  Shal_3056  Shal_3060  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_3056  -1      11       0.048194   collagenase
Shal_3057  0       11       -0.055565  phosphate-binding protein
Shal_3058  -1      9        -1.463895  PAS/PAC sensor-containing signal transduction
Shal_3059  1       10       -2.833203  two component transcriptional regulator
Shal_3060  1       10       -2.893001  porin
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
Shal_3056  MICOLLPTASE  181    1e-53    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 181 bits (460), Expect = 1e-53
Identities = 51/279 (18%), Positives = 106/279 (37%), Gaps = 44/279 (15%)

Query: 39 ESVLNQNHACSDTI-IIRSQA-LTKEQISSACELLVRQEANFHQLFGTLNKPVADDNNHS 96
E L + + D ++++ +T+E+I +A F ++ + +
Sbjct: 387 EKYLPKTYTFDDGKFVVKAGDKVTEEKIKRLYWASKEVKAQFMRVVQNDKALEEGNPDDI 446

Query: 97 MRANVYHSREDYVAHVTNHFDVPSDNGGMYLEGLPWKSDNQAEFVAYEKKGQ-----VWN 151
+ +Y+S E+Y + +DNGG+Y+E N F YE+ + +
Sbjct: 447 LTVVIYNSPEEYKLNRI-INGFSTDNGGIYIE-------NIGTFFTYERTPEESIYTLEE 498

Query: 152 LA-HEYVHYLDGRFNLYGDFCLSLHDSHSGPEYCPKPAPLYPHTVWWSEGVAEYISLGEN 210
L HE+ HYL GR+ + G + W+ EG AE+ +
Sbjct: 499 LFRHEFTHYLQGRYVVPGMWGQGEFYQEG-------------VLTWYEEGTAEFFAGSTR 545

Query: 211 NP---------KAFALIGEKNLELSDIFNTSYEQNGGTDRVYRWGYLAVRFMIENHKDKV 261
+ A + L + + Y G+ Y +G+ +M N+
Sbjct: 546 TDGIKPRKSVTQGLAYDRNNRMSLYGVLHAKY----GSWDFYNYGFALSNYMYNNNMGMF 601

Query: 262 DEMLSFTRKGDYPRYQALIKQWGN--SMDDEFQIWLKTL 298
++M ++ + D Y+ I + ++D++Q ++ +L
Sbjct: 602 NKMTNYIKNNDVSGYKDYIASMSSDYGLNDKYQDYMDSL 640



Score = 80.5 bits (198), Expect = 5e-19
Identities = 25/167 (14%), Positives = 50/167 (29%), Gaps = 40/167 (23%)

Query: 147 GQVWNLAHEYVHYLDGRFNLYGDFCLSLHDSHSGPEYCPKPAPLYPHTVWWSEGVAEYIS 206
+ L +Y Y+D N + + L + + A+ I+
Sbjct: 624 SSDYGLNDKYQDYMDSLLNNIDNLDVPLVSD-----------------EYVNGHEAKDIN 666

Query: 207 LGENNPKAFALIGEKNLE-LSDIFNTSYEQNGGTDRVYRWGYLAVRFMIENH-----KDK 260
N+ K + I + + F T+Y+ R Y+ R E + K
Sbjct: 667 EITNDIKEVSNIKDLSSNVEKSQFFTTYD--------MRGTYVGGRSQGEENDWKDMNSK 718

Query: 261 VDEMLSFTRKGDYPRYQALIKQWGNSMDD---------EFQIWLKTL 298
++++L K + Y+ + + N D F
Sbjct: 719 LNDILKELSKKSWNGYKTVTAYFVNHKVDGNGNYVYDVVFHGMNTDT 765


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_3058  PF06580  35     5e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.8 bits (80), Expect = 5e-04
Identities = 34/196 (17%), Positives = 68/196 (34%), Gaps = 38/196 (19%)

Query: 241 NVRAMDQMLQQTTRMRSMVEQLLALSRIEDASDVDLEVKVSMSQMMDILRDEALALAQ-- 298
N+RA+ +L+ T+ R M+ L L R + +VS++ + ++ L LA
Sbjct: 181 NIRAL--ILEDPTKAREMLTSLSELMRY--SLRYSNARQVSLADELTVVDS-YLQLASIQ 235

Query: 299 --EQYEVSFYCEPGLDLYGNELLVRSACSNLISNAIRY----TEPGGNIEVSWRKVAIGG 352
++ + P + + + L+ N I++ GG I + K
Sbjct: 236 FEDRLQFENQINPAIM---DVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTV 292

Query: 353 LFSVKDNGEGVAPQHIGRLTERFYRVDSARSRQSGGTGLGLAIAKHALSH---HQSELSI 409
V++ G TG GL + L ++++ +
Sbjct: 293 TLEVENTGSLALKN------------------TKESTGTGLQNVRERLQMLYGTEAQIKL 334

Query: 410 MSQLGKGSTFSFVIPA 425
+ GK + +IP
Sbjct: 335 SEKQGKVNAM-VLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_3059  HTHFIS  90     6e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.5 bits (222), Expect = 6e-23
Identities = 30/120 (25%), Positives = 57/120 (47%), Gaps = 2/120 (1%)

Query: 3 ARILIVEDELAIREMLTFVLEQHGFTTAAAEDYDSALAMLTEPYPDLVLLDWMFPGGNGI 62
A IL+ +D+ AIR +L L + G+ + + + DLV+ D + P N
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 63 QLAKRLRQDEFTRHIPIIMLTARGEEEDKVRGLEVGADDFMTKPFSPKELVARIKAVMRR 122
L R+++ +P+++++A+ ++ E GA D++ KPF EL+ I +
Sbjct: 64 DLLPRIKKAR--PDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3060  ECOLNEIPORIN  64     9e-14    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 64.1 bits (156), Expect = 9e-14
Identities = 64/331 (19%), Positives = 112/331 (33%), Gaps = 33/331 (9%)

Query: 13 AIATATLSSAYAADPLTVYGKLN--VTAQSNDVND-------ESTTTIQSNASRFGVKGA 63
A+ A L A AD +T+YG + V + ++ E+ T I S+ G KG
Sbjct: 7 ALTLAALPVAAMAD-VTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLGSKIGFKGQ 65

Query: 64 FELSRSLEAFYTIEYEVDTGDDVKENFKARNQFVGLKGNFGAFSVGRNDTMLKVA---QG 120
+L L+A + +E + R F+GLKG FG VGR +++LK
Sbjct: 66 EDLGNGLKAIWQVEQKASIAGTDSGWGN-RQSFIGLKGGFGKLRVGRLNSVLKDTGDINP 124

Query: 121 KVDQFNDLSGDLKNLFKGENRIEQTATYVTPSFSGFKVGVTYAAEGAGSQYGQDGFSVAA 180
+ + L + + + E R+ + Y +P F+G V YA ++ + +
Sbjct: 125 WDSKSDYLGVN--KIAEPEARL-ISVRYDSPEFAGLSGSVQYALNDNAGRHNSESYHAGF 181

Query: 181 IYGDAKLKQSPLYASVAYDADVKGYEVVRATVQGKIAGLKLGGMYQQQEETYDKDGSPTI 240
Y + A + + + + + ++G +Y D
Sbjct: 182 NYKNGGFFVQYGGAYKRHHQVQENVNIEKYQIHRLVSGYDNDALYASVAVQQQ-DAKLVE 240

Query: 241 GGESKTG-YLVSAAYQIDAVVLKAQF-------QDMEDKGDS-----WSVGGDYKLGKPT 287
S V+A + + + + VG +Y K T
Sbjct: 241 ENYSHNSQTEVAATLAYRFGNVTPRVSYAHGFKGSFDATNYNNDYDQVVVGAEYDFSKRT 300

Query: 288 KLFAFYT-NRSFDNVDN-DDSYFGVGLEHKF 316
+ + GVGL HKF
Sbjct: 301 SALVSAGWLQEGKGESKFVSTAGGVGLRHKF 331


93  Shal_3147  Shal_3158  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_3147  2       26       0.074251   translation initiation factor IF-2
Shal_3148  1       17       0.412412   transcription elongation factor NusA
Shal_3149  1       16       0.696949   hypothetical protein
Shal_3150  2       16       0.941614   **preprotein translocase subunit SecG
Shal_3151  2       16       0.765411   triosephosphate isomerase
Shal_3152  2       16       0.587137   phosphoglucosamine mutase
Shal_3153  0       12       0.746441   dihydropteroate synthase
Shal_3154  1       14       0.434245   ATP-dependent metalloprotease FtsH
Shal_3155  -1      13       0.020360   23S rRNA methyltransferase J
Shal_3156  1       18       0.851090   hypothetical protein
Shal_3157  1       17       0.591830   protein-export membrane protein SecF
Shal_3158  2       17       1.045188   preprotein translocase subunit SecD
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_3147  TCRTETOQM  76     3e-16    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 76.0 bits (187), Expect = 3e-16
Identities = 52/202 (25%), Positives = 80/202 (39%), Gaps = 30/202 (14%)

Query: 400 IMGHVDHGKTSLLDYI-----RRAKVASGEAG-------------GITQHIGAYHVETEN 441
++ HVD GKT+L + + ++ S + G GIT G + EN
Sbjct: 8 VLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWEN 67

Query: 442 GMITFLDTPGHAAFTAMRARGAKATDIVILVVAADDGVMPQTIEAIQHAKAGGVPLIVAV 501
+ +DTPGH F A R D IL+++A DGV QT + G+P I +
Sbjct: 68 TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFI 127

Query: 502 NKMDKPEADPDRV----KSELSQHGVMS-------EDWGGNNMFVNVSAKTGAGIDELLE 550
NK+D+ D V K +LS V+ N G D+LLE
Sbjct: 128 NKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGNDDLLE 187

Query: 551 GILLEAEVLELKAIKEGMAAGV 572
+ + LE +++ +
Sbjct: 188 K-YMSGKSLEALELEQEESIRF 208


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
Shal_3150  SECGEXPORT  121    1e-39    Protein-export SecG membrane protein signature.
		>SECGEXPORT#Protein-export SecG membrane protein signature.

Length = 110

Score = 121 bits (306), Expect = 1e-39
Identities = 66/112 (58%), Positives = 83/112 (74%), Gaps = 3/112 (2%)

Query: 1 MYEVLMIIYLMVAIGLVGLILIQQGKGADMGASFGAGASGTLFGSSGSGNFLTRSTAVLA 60
MYE L++++L+VAIGLVGLI++QQGKGADMGASFGAGAS TLFGSSGSGNF+TR TA+LA
Sbjct: 1 MYEALLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLFGSSGSGNFMTRMTALLA 60

Query: 61 VAFFALSLTIGNLSANHTKAEGAWDDLGSDAAQVVEQVQQEA-EKSEDKIPD 111
FF +SL +GN+++N T W++L A EQ Q A K IP+
Sbjct: 61 TLFFIISLVLGNINSNKTNKGSEWENL--SAPAKTEQTQPAAPAKPTSDIPN 110


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits      Score  E-value  Comments
Shal_3151  adhesinb  33     0.001    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 32.5 bits (74), Expect = 0.001
Identities = 18/95 (18%), Positives = 34/95 (35%), Gaps = 16/95 (16%)

Query: 142 REARRTFEVIAEELDVVIEKNGTMAFDNAIIAY----EPLWAVGTGKSATPEQAQEVHAF 197
+EA+ F I E +++ G F AY +W + T + TP+Q + +
Sbjct: 186 KEAKEKFNNIPGEKKMIVTSEG--CFKYFSKAYNVPSAYIWEINTEEEGTPDQIKTLVEK 243

Query: 198 IRKRLSEVSPFIGENIRILYGGSVTPSNAADLFAQ 232
+RK + L+ S ++
Sbjct: 244 LRKT----------KVPSLFVESSVDDRPMKTVSK 268


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_3154  HTHFIS  36     4e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 36.0 bits (83), Expect = 4e-04
Identities = 23/82 (28%), Positives = 32/82 (39%), Gaps = 18/82 (21%)

Query: 193 VLLVGPPGTGKTLLAKAIAGESK---VPFFT-----ISGSDFVEMFVGV------GASRV 238
+++ G GTGK L+A+A+ K PF I G GA
Sbjct: 163 LMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTR 222

Query: 239 RD-MFEQAKKSAPCIIFIDEID 259
FEQA+ +F+DEI
Sbjct: 223 STGRFEQAEGGT---LFLDEIG 241


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3157  SECFTRNLCASE  240    3e-80    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 240 bits (614), Expect = 3e-80
Identities = 89/302 (29%), Positives = 155/302 (51%), Gaps = 18/302 (5%)

Query: 14 KARYLSSIFSVLIMLSSIGIILVNGFNWGLDFTGGIVTEVKLDPKIKSAQISELLAETSE 73
+ ++ + ++++M++S+ + LV G N+G+DF GG + I L
Sbjct: 18 RWQWATFGAAIVMMIASVILPLVIGLNFGIDFKGGTTIRTESTTAIDVGVYRAALEPLEL 77

Query: 74 QEVSV------------------ISAGEPGRWILRYAKVDNAGAHDIKTVLAPLTDNVEV 115
+V + I E G+ + ++T L + +++
Sbjct: 78 GDVIISEVRDPSFREDQHVAMIRIQMQEDGQGAEGQGAQGQELVNKVETALTAVDPALKI 137

Query: 116 LNSSIVGPQIGQELAEQGGLALLAAMLCILAYLSFRFEWRLASGALLALFHDVIFVLAFF 175
+ VGP++ EL +LLAA + I+ Y+ RFEW+ A GA++AL HDV+ + F
Sbjct: 138 TSFESVGPKVSGELVWTAVWSLLAATVVIMFYIWVRFEWQFALGAVVALVHDVLLTVGLF 197

Query: 176 SLTQMEFNLTILAAVLAILGYSLNDSIVIADRVREVLIAKPKMAIDEICSSAVQATFSRT 235
++ Q++F+LT +AA+L I GYS+ND++V+ DR+RE LI M + ++ + +V T SRT
Sbjct: 198 AVLQLKFDLTTVAALLTITGYSINDTVVVFDRLRENLIKYKTMPLRDVMNLSVNETLSRT 257

Query: 236 MVTSGTTLFTVAALWIMGGAPLQGFSIAMFLGIMIGTISSVSVGTCLPEYLKVSAEHYKV 295
++T TTL + + I GG ++GF AM G+ GT SSV V + ++ + K
Sbjct: 258 VMTGMTTLLALVPMLIWGGDVIRGFVFAMVWGVFTGTYSSVYVAKNIVLFIGLDRNKEKK 317

Query: 296 EP 297
+P
Sbjct: 318 DP 319


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3158  SECFTRNLCASE  83     2e-19    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 83.0 bits (205), Expect = 2e-19
Identities = 40/214 (18%), Positives = 94/214 (43%), Gaps = 5/214 (2%)

Query: 383 SVATIQAQLGDRFSITGSADYASAQQLALLLRAGSMTAP-VTIVEERTIGPSLGEENITN 441
VA I+ Q+ + + + + A + P + I ++GP + E +
Sbjct: 95 HVAMIRIQMQEDGQGAEGQGAQGQELVNKVETALTAVDPALKITSFESVGPKVSGELVWT 154

Query: 442 GFSALALGMAVTLLFMGLWYR-RLGWIANVALIANMILIFGLMALIPGAVLTLPGIAGLV 500
+L V + ++ + + + A VAL+ +++L GL A++ L +A L+
Sbjct: 155 AVWSLLAATVVIMFYIWVRFEWQFALGAVVALVHDVLLTVGLFAVL-QLKFDLTTVAALL 213

Query: 501 LTVGMAVDTNVLIFERIKDKLK--EGRSFAHAIDRGFDSAFSTIVDANFTTMITAVVLYA 558
G +++ V++F+R+++ L + ++ + S V TT++ V +
Sbjct: 214 TITGYSINDTVVVFDRLRENLIKYKTMPLRDVMNLSVNETLSRTVMTGMTTLLALVPMLI 273

Query: 559 IGNGPIQGFALTLGLGLLTSMFTGIFASRALVNW 592
G I+GF + G+ T ++ ++ ++ +V +
Sbjct: 274 WGGDVIRGFVFAMVWGVFTGTYSSVYVAKNIVLF 307


94  Shal_3245  Shal_3251  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_3245  -2      13       -1.194041  hypothetical protein
Shal_3246  -1      13       -0.475794  ATPase
Shal_3247  -1      10       -0.256005  hypothetical protein
Shal_3248   0      11       -0.142498  RluA family pseudouridine synthase
Shal_3249   1      14       -0.167445  putative lipoprotein
Shal_3250   1      15        0.701044  exporter of the RND superfamily protein-like
Shal_3251   1      14        1.312888  *methyl-accepting chemotaxis sensory transducer
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3245  VACJLIPOPROT  28     0.006    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 28.3 bits (63), Expect = 0.006
Identities = 12/32 (37%), Positives = 21/32 (65%)

Query: 8 MKKQLAVIALVSLSVVGCSSSNYEKEQTSAPY 39
MK +L+ +AL + +VGC+SS +++ S P
Sbjct: 1 MKLRLSALALGTTLLVGCASSGTDQQGRSDPL 32


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_3246  HTHFIS  41     2e-05    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 40.6 bits (95), Expect = 2e-05
Identities = 37/168 (22%), Positives = 62/168 (36%), Gaps = 30/168 (17%)

Query: 570 VIGQNEAVNSVSNAIRRSRAGLSDPDRPIGSFLFLGPTGVGKTELCKALAKFLFDTESAL 629
++G++ A+ + + R L D + + G +G GK + +AL +
Sbjct: 139 LVGRSAAMQEIYRVLAR----LMQTDLTL---MITGESGTGKELVARALHDYGKRRNGPF 191

Query: 630 VRIDMSEFMEKHSVSRLVGAPPGYVGYEEGGYLTEAVRRKPYSV-------ILLDEVEKA 682
V I+M+ S L G +E+G + T A R + LDE+
Sbjct: 192 VAINMAAIPRDLIESELFG-------HEKGAF-TGAQTRSTGRFEQAEGGTLFLDEIGDM 243

Query: 683 HPDVFNILLQVLDDG---RLTDGQGRTVDFRNTVVIMTSNLGSDLIQM 727
D LL+VL G + D R ++ +N DL Q
Sbjct: 244 PMDAQTRLLRVLQQGEYTTVGGRTPIRSDVR---IVAATN--KDLKQS 286



Score = 31.7 bits (72), Expect = 0.015
Identities = 17/88 (19%), Positives = 37/88 (42%), Gaps = 4/88 (4%)

Query: 131 ATKELMESTIDQIRNGQNVDDPNAEDQRQALKKFTVDLTERAEQG-KLDPVIGRDDEIRR 189
A K + D + ++ + +AL + ++ + P++GR ++
Sbjct: 90 AIKASEKGAYDYLPKPFDLTE-LIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQE 148

Query: 190 TIQVLQRRSKNN-PVLI-GEPGVGKTAI 215
+VL R + + ++I GE G GK +
Sbjct: 149 IYRVLARLMQTDLTLMITGESGTGKELV 176


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3250  ACRIFLAVINRP  33     0.005    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 32.9 bits (75), Expect = 0.005
Identities = 24/141 (17%), Positives = 55/141 (39%), Gaps = 10/141 (7%)

Query: 196 IIFLMAALF---IRQALWIFAICINSGIALVLSMGLAAWLKLTLAAITAFVPVIIVTLGL 252
++FL+ LF +R L I I + + L+ + + A ++ +T F V+ + L +
Sbjct: 350 LVFLVMYLFLQNMRATL-IPTIAVP--VVLLGTFAILAAFGYSINTLTMFGMVLAIGLLV 406

Query: 253 AYASHLYFG-WRAEINRGQTHQQALRYTVKVNQAPLFYSTITTIFGFSLLMFSPSPP--- 308
A + R + ++A ++ Q L + F + F
Sbjct: 407 DDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAI 466

Query: 309 IQSFGLLVAFAVLCNYVLSLT 329
+ F + + A+ + +++L
Sbjct: 467 YRQFSITIVSAMALSVLVALI 487



Score = 31.0 bits (70), Expect = 0.023
Identities = 25/170 (14%), Positives = 61/170 (35%), Gaps = 17/170 (10%)

Query: 144 LDDVLQKVDRDSREYWLPLGVNTYFNGTHALNWQYAKVLGHDLSW-FAPALLGIIFLMAA 202
D + ++ + LP G+ + G ++ G+ A + + + +AA
Sbjct: 836 SGDAMALMENLAS--KLPAGIGYDWTGMS----YQERLSGNQAPALVAISFVVVFLCLAA 889

Query: 203 LF--IRQALWIFAICINSGIALVLSMGLAAWLKLTLAAITAFVPVIIVTLGLAYASHLY- 259
L+ + + + + ++L+ L K + + + T+GL+ + +
Sbjct: 890 LYESWSIPVSVMLVVPLGIVGVLLAATLFN-QKNDVYFMVGL----LTTIGLSAKNAILI 944

Query: 260 --FGWRAEINRGQTHQQALRYTVKVNQAPLFYSTITTIFGFSLLMFSPSP 307
F G+ +A V++ P+ +++ I G L S
Sbjct: 945 VEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGA 994


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3251  BONTOXILYSIN  30     0.026    Bontoxilysin signature.
		>BONTOXILYSIN#Bontoxilysin signature.

Length = 1196

Score = 30.3 bits (68), Expect = 0.026
Identities = 11/87 (12%), Positives = 34/87 (39%)

Query: 182 NNLEAFINTANVSQLDGQAKQQVLNYIQQYNADFKALVDKEKQLGLTISEGERGQLRQAV 241
N ++ F+N A++ + ++Y+++Y + + Q I++ E+ L +
Sbjct: 742 NRVDNFLNKASICVFVEDIYPKFISYMEKYINNINIKTREFIQRCTNINDNEKSILINSY 801

Query: 242 ANTDSALNQLESLALDAISDKESSAFI 268
L+ ++ + + +
Sbjct: 802 TFKTIDFKFLDIQSIKNFFNSQVEQVM 828


95  Shal_3400  Shal_3407  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_3400   2      15        2.484409  TonB family protein
Shal_3401   2      15        1.497480  MotA/TolQ/ExbB proton channel
Shal_3402   0      15        0.306234  biopolymer transport protein ExbD/TolR
Shal_3403   1      15        0.693163  periplasmic-binding protein
Shal_3404  -1      17        0.026128  transport system permease
Shal_3405  -2      20       -2.553239  hemin importer ATP-binding subunit
Shal_3406  -1      17       -1.776464  TetR family transcriptional regulator
Shal_3407   0      17       -1.180315  deaminase-reductase domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_3400  PF03544  74     3e-18    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 74.2 bits (182), Expect = 3e-18
Identities = 34/195 (17%), Positives = 67/195 (34%), Gaps = 9/195 (4%)

Query: 43 AVSISIAMQASQANKAVEQIKPALEKQTKPSPRPLAKPLPESAATPVVKAKPITEPELAT 102
A IS+ M A + + ++P E +P P P P P A V++
Sbjct: 47 AQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPK 106

Query: 103 QPKPAPSQDTEKPIAEINEIQVAHNQIQVDAKQGVTQQSITLSQPTFAAPP-----AQPH 157
K + E N + + + A+ P QP
Sbjct: 107 PVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQ 166

Query: 158 YPKKARKRGFQGTATVEVMFNQLGEQLSLTLVGSSGYRLLDKAALNAVEQWQFAAPTPQT 217
YP +A+ +G V+ G ++ ++ + + ++ NA+ +W++ P +
Sbjct: 167 YPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGS 226

Query: 218 AYAYTVRVPVKFALN 232
+ V + F +N
Sbjct: 227 ----GIVVNILFKIN 237


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3403  FERRIBNDNGPP  45     2e-07    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 44.6 bits (105), Expect = 2e-07
Identities = 62/299 (20%), Positives = 114/299 (38%), Gaps = 40/299 (13%)

Query: 9 LVNNKMSISAALLSVLV--INQPVLAHQDKAHSETARVVSAGAGVTELVLAL-------- 58
L++ + ++A LS L+ +N A D R+V+ EL+LAL
Sbjct: 6 LISRRRLLTAMALSPLLWQMNTAHAAAID-----PNRIVALEWLPVELLLALGIVPYGVA 60

Query: 59 DAGNELVAVDSTSQLPEEYTNIEKLGYHRMLSAEGILALSPDLLLGSDAMGPQTTLDVLT 118
D N + V LP+ ++ G + E + + P ++ S GP + ++L
Sbjct: 61 DTINYRLWVSEPP-LPDSVIDV---GLRTEPNLELLTEMKPSFMVWSAGYGP--SPEML- 113

Query: 119 AAKVKVVQLATANSQSQLI---NNISEMGKLLNREAQAKKLQAEIKQSFKNIEQKQQKIA 175
A ++ + L +++EM LLN ++ A+ A+ + + +
Sbjct: 114 ARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFI---RSMKPRFV 170

Query: 176 KQGEAPKVLFLLLQTDRPARVGGDDTAADIIIKLAGGKNI----AGFSGYKSVSQEGILS 231
K+G P +L L+ R V G ++ I+ G N F G +VS + + +
Sbjct: 171 KRGARPLLLTTLIDP-RHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAA 229

Query: 232 LQ-PDVILISNRSNKAVQNPVDALLNNMPLLAHTPAGKNKQIQFLQPQALLGGLGLSAI 289
+ DV+ + L PL P + + Q + G LSA+
Sbjct: 230 YKDVDVLCFD-----HDNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGAT-LSAM 282


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3406  TETREPRESSOR  48     2e-09    Tetracycline repressor protein signature.
		>TETREPRESSOR#Tetracycline repressor protein signature.

Length = 218

Score = 48.4 bits (115), Expect = 2e-09
Identities = 27/92 (29%), Positives = 50/92 (54%), Gaps = 4/92 (4%)

Query: 16 SQLSKEAIVIQAKALMLSDG-KIPSIRHLAGKLSVDAMAIYHYFKNKDALLEAITTSLVS 74
++L++E+++ A L+ G + R LA KL ++ +Y + KNK ALL+A+ +++
Sbjct: 2 ARLNRESVIDAALELLNETGIDGLTTRKLAQKLGIEQPTLYWHVKNKRALLDALAVEILA 61

Query: 75 E---ICQPEAGDKWQQALTELSLSYLNLLKQY 103
P AG+ WQ L ++S+ L +Y
Sbjct: 62 RHHDYSLPAAGESWQSFLRNNAMSFRRALLRY 93


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_3407  PF01540  27     0.042    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 27.4 bits (60), Expect = 0.042
Identities = 11/46 (23%), Positives = 23/46 (50%), Gaps = 4/46 (8%)

Query: 75 NKPVFVLSNTLTEVPAELEGKAFIING----ELEHIVQQLNSKGYQ 116
++ + ++T+ +LEGK F I+ +L ++ LN K +
Sbjct: 134 SEKIQSFADTIALTITKLEGKKFQIDETFKKQLISTIELLNKKSAE 179


96  Shal_3782  Shal_3789  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_3782  -1      20        2.579848  RND family efflux transporter MFP subunit
Shal_3783  -1      19        3.208114  acriflavin resistance protein
Shal_3784   0      18        2.901691  hypothetical protein
Shal_3785  -1      17        2.323078  hypothetical protein
Shal_3786   0      15        2.640560  hypothetical protein
Shal_3787   0      15        2.801399  hypothetical protein
Shal_3788   0      16        2.705790  hypothetical protein
Shal_3789   1      15        1.815582  PAS/PAC sensor-containing hybrid histidine
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_3782  RTXTOXIND  45     4e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 44.8 bits (106), Expect = 4e-07
Identities = 28/200 (14%), Positives = 67/200 (33%), Gaps = 29/200 (14%)

Query: 104 EADYELAKADFKRKGELLRRELISQAEYDLASAQLKSS--KANLASAQDQLSYTELTAPY 161
E++ AK +++ +L + E++ + L LA +++ + + AP
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDK----LRQTTDNIGLLTLELAKNEERQQASVIRAPV 334

Query: 162 DGTVAKISI-DNYQMVQANQPVL-VLQKDSDIDIVIQVPESLASKVTQFNPNAVTQPV-V 218
V ++ + +V + ++ ++ +D +++ V + V Q +
Sbjct: 335 SVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFIN------VGQNAII 388

Query: 219 RFANDPSVAYPVL------------LKEHATQVTPGTQSYEVVFTLARPSNMTVLPGMSA 266
+ P Y L + V S E N+ + GM+
Sbjct: 389 KVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAV 448

Query: 267 ELTMDIAQQKSQALTAILPP 286
+I ++ +L P
Sbjct: 449 T--AEIKTGMRSVISYLLSP 466



Score = 33.3 bits (76), Expect = 0.001
Identities = 18/83 (21%), Positives = 31/83 (37%), Gaps = 7/83 (8%)

Query: 78 EGQQVKKAMVLARLDRRDAQNTLLNREADYELAKADFKRKGELLR-------RELISQAE 130
EG+ V+K VL +L A+ L ++ A+ + R L R EL E
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 131 YDLASAQLKSSKANLASAQDQLS 153
+ + + ++Q S
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFS 196


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3783  ACRIFLAVINRP  494    e-160    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 494 bits (1274), Expect = e-160
Identities = 203/1053 (19%), Positives = 436/1053 (41%), Gaps = 61/1053 (5%)

Query: 4 AEYSITHKVISWMFALLLLIGGTVSFFSLGQLEFPEFTIKQALVVTTYPGASPEQVEEEV 63
A + I + +W+ A++L++ G ++ L ++P V YPGA + V++ V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 64 TLPLEDALQQLDGIKHITSV-NSAGLSQIEIEIKENYDASELPQVWDEVRRKVNDKAVEL 122
T +E + +D + +++S +SAG I + + D +V+ K+ L
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTD---PDIAQVQVQNKLQLATPLL 118

Query: 123 PPGVHTPSVIDDFGD---VYGILLNVSGDGYSDRELQNYADF-LRRELVLVDGIKKVTIA 178
P V + + + G + ++ +Y ++ L ++G+ V +
Sbjct: 119 PQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLF 178

Query: 179 GGVNEQVVVEISQQKLNALGLDQNYIYGLINSQNVVSNAGSMLVGDN------RIRIHPT 232
G + + + LN L + + QN AG + I
Sbjct: 179 GA-QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQ 237

Query: 233 GEFDNVRQMERLLISPPGSPKLIYLGDIAKIYKGSEETPSNIYHTDGKQALSIGIAFSSG 292
F N + ++ + ++ L D+A++ G E + I +GK A +GI ++G
Sbjct: 238 TRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGEN-YNVIARINGKPAAGLGIKLATG 296

Query: 293 VNVVKVGEAVNQRMSELYSELPIGMTLNTVYDQSKMVDQTVNGFLVNLAESVAIVIGVLL 352
N + +A+ +++EL P GM + YD + V +++ + L E++ +V V+
Sbjct: 297 ANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMY 356

Query: 353 IFMG-VRSGLLMGLVLLLTILGTFIVMNVLNIELQIISLGALIIALGMLVDNAIVVTEGI 411
+F+ +R+ L+ + + + +LGTF ++ + +++ +++A+G+LVD+AIVV E +
Sbjct: 357 LFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENV 416

Query: 412 L-IGIKRGQTRLETAKQVVSQTQWPLLGATIIAIIAFAPIGLSDNATGEFCASLFQVLLI 470
+ ++ E ++ +SQ Q L+G ++ F P+ +TG ++
Sbjct: 417 ERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVS 476

Query: 471 SLFISWITAMTLTPFFCNLMFKDGVISEDENDDPYKGWLFAIYRQSLNLA-------MRF 523
++ +S + A+ LTP C + K EN + GW + S+N +
Sbjct: 477 AMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGS 536

Query: 524 RSLTLILVIAALVSSVIGFGHVKNVFFPASNTPIFFVDVWMPEGSDIKATERLLSRIEAD 583
L++ + V+ F + + F P + +F + +P G+ + T+++L ++
Sbjct: 537 TGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDY 596

Query: 584 LLQQQKEQDTGLVNLTTVIGQG-AQRFVLSYVPEKGYKAYGQILLEMTDLTTLNQYMRVL 642
L+ +K + + G AQ +++V K ++ + +
Sbjct: 597 YLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWE----------ERNGDENSAEAV 646

Query: 643 ERELSLKFPEAEYRFKYMENGPS---------PAAKIEARFYGEDPQVLRQLAVQAEDIL 693
++ + F N P+ ++ + + +
Sbjct: 647 IHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAA 706

Query: 694 KAEPTAVGVRHNWRNQVTLVRPQLAQAQARETGISKQDLDNALLTNFSGKQIGTYRENSH 753
+ + V VR N + ++ Q +A+ G+S D++ + T G + + +
Sbjct: 707 QHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGR 766

Query: 754 LLPIIARAPDEERLDAQSIWKLQVWSSDNHTFVPVTQVVSDFSTEWEDPLIMRRDRKRVI 813
+ + +A + R+ + + KL V S N VP + + P + R + +
Sbjct: 767 VKKLYVQADAKFRMLPEDVDKLYV-RSANGEMVPFSAFT-TSHWVYGSPRLERYNGLPSM 824

Query: 814 SVLADPLNGTRETADSVFRKVKADIEAIP--LPAGYELEWGGEYETSMEAQESVFSSIPL 871
+ + G + A +E + LPAG +W G + + + +
Sbjct: 825 EIQGEAAPG------TSSGDAMALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAI 878

Query: 872 GYLAMFIITVLLFNSVRQPLVIWFTVPLALIGVVSGLLIFDAPFSFMALLGLLSLTGMII 931
++ +F+ L+ S P+ + VPL ++GV+ +F+ ++GLL+ G+
Sbjct: 879 SFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSA 938

Query: 932 KNGIVLVDQIN-LELSSGKEAYQAVVDSAVSRVRPVLMAAITTMLGMLPLISDAFFGS-- 988
KN I++V+ L GK +A + + R+RP+LM ++ +LG+LPL GS
Sbjct: 939 KNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGA 998

Query: 989 ---MAITIIFGLGFASVLTLIVLPVTYTLAFRI 1018
+ I ++ G+ A++L + +PV + + R
Sbjct: 999 QNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3786  ECOLNEIPORIN  26     0.028    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 26.3 bits (58), Expect = 0.028
Identities = 9/20 (45%), Positives = 11/20 (55%)

Query: 30 DFSKTVAARADYDFSKNISA 49
+ V A+YDFSK SA
Sbjct: 283 NDYDQVVVGAEYDFSKRTSA 302


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
Shal_3789  HTHFIS  74     2e-15    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 73.7 bits (181), Expect = 2e-15
Identities = 29/112 (25%), Positives = 47/112 (41%), Gaps = 2/112 (1%)

Query: 1077 DSGYVLLVEDNFINQQVATELLKSAGYTVDVAENGQIALDMIDKTQYDAVLMDIQMPVMD 1136
+L+ +D+ + V + L AGY V + N I D V+ D+ MP +
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 1137 GLTATKELRKRYSTEELPVIAMTAHAMSGDREKSLAAGMNAHITKPIVLTEL 1188
++K +LPV+ M+A K+ G ++ KP LTEL
Sbjct: 62 AFDLLPRIKKAR--PDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTEL 111



Score = 64.5 bits (157), Expect = 1e-12
Identities = 20/108 (18%), Positives = 42/108 (38%), Gaps = 6/108 (5%)

Query: 937 KTLVIDDNPTALQIYSSVLRDFHFNVDTAASGPEGLYKLGKNPVDLLLLDWMMPEMDGIE 996
LV DD+ + + L ++V ++ + DL++ D +MP+ + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 997 VIKQIDAMVADGRLEKRPIIIMMTAYTAEPMHKDVEQANVYALLQKPF 1044
++ +I P+++ M+A + Y L KPF
Sbjct: 65 LLPRIKK-----ARPDLPVLV-MSAQNTFMTAIKASEKGAYDYLPKPF 106


97  Shal_3825  Shal_3830  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_3825   0      13        2.325601  peptidase U62 modulator of DNA gyrase
Shal_3826   1      14        2.412039  hypothetical protein
Shal_3827   0      14        3.235402  Ig domain-containing protein
Shal_3828   0      17        3.150826  ABC-2 type transporter
Shal_3829  -2      16        2.671093  ABC-2 type transporter
Shal_3830  -2      14        2.403123  secretion protein HlyD family protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
Shal_3825  FRAGILYSIN  30     0.017    Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.
Length = 405

Score = 30.4 bits (68), Expect = 0.017
Identities = 15/68 (22%), Positives = 26/68 (38%)

Query: 373 LVKEMGTGLIVTEVMGQGVNTVTGDYSRGAAGFYVENGVILYPVEEITIAGNLKDMYQNI 432
++E G+ + EV Q + Y+ YV +LY E +G+ K+ +
Sbjct: 232 CLRENGSTIYPNEVSAQMQDAANSVYAVHGLKRYVNFHFVLYTTEYSCPSGDAKEGLEGF 291

Query: 433 VAVAKDRD 440
A K
Sbjct: 292 TASLKSNP 299


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3826  BCTERIALGSPD  30     0.005    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.
Length = 660

Score = 30.3 bits (68), Expect = 0.005
Identities = 17/121 (14%), Positives = 36/121 (29%), Gaps = 14/121 (11%)

Query: 44 SKTQVEKLNLDETLHDNVLKAKTIKMNTEAHRRHIQYIGKL--------------MRYVD 89
SK+ + + + D A + + +R I I +L ++Y
Sbjct: 219 SKSALPGSMVANVVADERTNAVLVSGEPNSRQRIIAMIKQLDRQQATQGNTKVIYLKYAK 278

Query: 90 LEELEVAIRNVLNKNSNESAKTNVADKTRDQLLAEGDSAVQALIEQHPEFDRQKLRQFIR 149
+L + + + +E ++ + ALI L + I
Sbjct: 279 ASDLVEVLTGISSTMQSEKQAAKPVAALDKNIIIKAHGQTNALIVTAAPDVMNDLERVIA 338

Query: 150 Q 150
Q
Sbjct: 339 Q 339


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_3827  INTIMIN  34     0.003    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 34.3 bits (78), Expect = 0.003
Identities = 46/239 (19%), Positives = 95/239 (39%), Gaps = 34/239 (14%)

Query: 278 NKLATTLPFEADKYIVSAGGTFGVTADLATKNDDGSYTRLQTPTSVSFNSSCISSNNASI 337
N + T+ ++ +V G TAD + DG+ + + A++
Sbjct: 540 NNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGT-----EAITYTATVKKNGVAQANV 594

Query: 338 DSPVTTLSGTASSTFQNTNCSGNGE--------RDDQIIASVVAGNQTLTAELNFS-LAS 388
+SGTA + + N +G+G+ + Q++ S T N
Sbjct: 595 PVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVD 654

Query: 389 QTLANLSFISAEPTSIRIKGAGGTGSSESSLITFKV-SDANGQAIAQQAVDFSLDTTVGG 447
QT A+++ I A+ T+ ++ IT+ V + ++ Q V F TT+G
Sbjct: 655 QTKASITEIKADKTTAV--------ANGQDAITYTVKVMKGDKPVSNQEVTF--TTTLGK 704

Query: 448 ISFANGGTSTSNTSNSAGLVSATVLSGTMPTPVRVLATATANGESVTTQSEQLTINTGL 506
+ ++++ +++ G T+ S TP + L +A + +V ++ ++ T L
Sbjct: 705 L------SNSTEKTDTNGYAKVTLTST---TPGKSLVSARVSDVAVDVKAPEVEFFTTL 754



Score = 33.1 bits (75), Expect = 0.006
Identities = 32/148 (21%), Positives = 62/148 (41%), Gaps = 11/148 (7%)

Query: 82 SGQLISFSTSFGELSVATKLTNSNGLAEIIVSNPTAAEGAGTLTASFTEGENNANASKNF 141
S Q ++F+T+ G+LS +T+ T++NG A++ +++ T G ++A + +
Sbjct: 692 SNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTP--GKSLVSARVS-DVAVDVKAPEV 748

Query: 142 EFTSATAPSEPKFSLNSAIINGTTVVTQFKAGETVQLQAQFLDEQGQGVAGKKASFTAGS 201
EF + + + + G + G V L+A G S
Sbjct: 749 EFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYG-QVNLKA-----SGGNGKYTWRSANPAI 802

Query: 202 ASLNPDSALT--KDNGIAQVSYTPSDSE 227
AS++ S K+ G +S SD++
Sbjct: 803 ASVDASSGQVTLKEKGTTTISVISSDNQ 830


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_3830  RTXTOXIND  43     1e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 42.5 bits (100), Expect = 1e-06
Identities = 32/176 (18%), Positives = 62/176 (35%), Gaps = 12/176 (6%)

Query: 2 RTNRILAAVALIVLIGALLYGLLLAYTPKPQLLQGQI--EAREYNVSSKVPGRVEEVLVR 59
R R++A + L+ A + +L G++ R + V+E++V+
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVL-GQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 60 RGDLVTEGDLLFAINSPELEAKLMQAEGGRDAAVAMQKEADNGARKQQIAAAKEEWLKAQ 119
G+ V +GD+L + + EA ++ + A ++ + I K LK
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQ--SSLLQARLEQTRYQILSRSIELNKLPELKLP 171

Query: 120 AAAKLYRTTYERLEVLFNEGVLARQKRDEAFTQWQAAKYTEQAALAMYQMTDEGAR 175
+ E E + E F+ WQ KY ++ L +
Sbjct: 172 DEPYFQNVSEE-------EVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVL 220


98  Shal_3839  Shal_3853  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_3839  -2      11        0.790418  rod shape-determining protein MreB
Shal_3840  -2      13        0.331166  PA14 domain-containing protein
Shal_3841   1      21        1.013352  MSHA biogenesis protein MshP
Shal_3842   1      21        0.378095  methylation site containing protein
Shal_3843   0      22       -0.276170  methylation site containing protein
Shal_3844  -2      18        0.733599  MSHA pilin protein MshC
Shal_3845   0      13        0.930245  methylation site containing protein
Shal_3846  -1      15        1.543556  MSHA pilin protein MshA
Shal_3847  -1      17        1.725855  methylation site containing protein
Shal_3848  -1      15        1.407745  hypothetical protein
Shal_3849   0      15        1.246839  type II secretion system protein
Shal_3850  -1      15        0.592672  type II secretion system protein E
Shal_3851   0      17       -0.304685  hypothetical protein
Shal_3852   1      15       -1.477113  MSHA biogenesis protein MshM
Shal_3853   0      13       -2.117259  pilus (MSHA type) biogenesis protein MshL
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3839  SHAPEPROTEIN  551    0.0      Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.
Length = 347

Score = 551 bits (1422), Expect = 0.0
Identities = 310/348 (89%), Positives = 330/348 (94%), Gaps = 1/348 (0%)

Query: 1 MFKKLRGIFSNDLSIDLGTANTLIYVREEGIVLNEPSVVAIRGERSGSGQKSVAAVGTEA 60
M KK RG+FSNDLSIDLGTANTLIYV+ +GIVLNEPSVVAIR +R KSVAAVG +A
Sbjct: 1 MLKKFRGMFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDR-AGSPKSVAAVGHDA 59

Query: 61 KQMLGRTPGNIQAIRPMKDGVIADFYVTEKMLQHFIKQVHNNSVFRPSPRVLVCVPVGAT 120
KQMLGRTPGNI AIRPMKDGVIADF+VTEKMLQHFIKQVH+NS RPSPRVLVCVPVGAT
Sbjct: 60 KQMLGRTPGNIAAIRPMKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGAT 119

Query: 121 QVERRAIRESAMGAGAREVYLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAIISLN 180
QVERRAIRESA GAGAREV+LIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVA+ISLN
Sbjct: 120 QVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLN 179

Query: 181 GVVYSSSVRIGGDKFDDAIINYVRRNYGSLIGEATAERIKHTIGTAYPGDEVLEIEVRGR 240
GVVYSSSVRIGGD+FD+AIINYVRRNYGSLIGEATAERIKH IG+AYPGDEV EIEVRGR
Sbjct: 180 GVVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGR 239

Query: 241 NLAEGVPRSFTLNSNEILEALQEPLSGIVSAVMVALEQSPPELASDISERGMVLTGGGAL 300
NLAEGVPR FTLNSNEILEALQEPL+GIVSAVMVALEQ PPELASDISERGMVLTGGGAL
Sbjct: 240 NLAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGAL 299

Query: 301 IRDLDRLLMQETGIPVMIADDPLTCVARGGGRALEMIDMHGGDLFSEE 348
+R+LDRLLM+ETGIPV++A+DPLTCVARGGG+ALEMIDMHGGDLFSEE
Sbjct: 300 LRNLDRLLMEETGIPVVVAEDPLTCVARGGGKALEMIDMHGGDLFSEE 347


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3840  BINARYTOXINB  42     2e-05    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 42.4 bits (99), Expect = 2e-05
Identities = 27/98 (27%), Positives = 49/98 (50%), Gaps = 14/98 (14%)

Query: 347 IFEGYIDAPETGEYTFAIDGDDAIELLIDGEVITGFYGAHGTCNCTRYQGKVSLEQG-AH 405
I+ G+I ++ EYTFA D+ + + +D + N K+ LE+G +
Sbjct: 93 IWSGFIKVKKSDEYTFATSADNHVTMWVDDQ---------EVINKASNSNKIRLEKGRLY 143

Query: 406 TIELRFHEAFGSE---AFRLYWQTPSSNRFEIVPADNL 440
I++++ +E F+LYW S N+ E++ +DNL
Sbjct: 144 QIKIQYQRENPTEKGLDFKLYWTD-SQNKKEVISSDNL 180


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3842  BCTERIALGSPG  35     1e-04    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 34.9 bits (80), Expect = 1e-04
Identities = 13/29 (44%), Positives = 21/29 (72%)

Query: 20 RSRGFTLVEMITVVIILGVLVLGVSSFLL 48
+ RGFTL+E++ V++I+GVL V L+
Sbjct: 6 KQRGFTLLEIMVVIVIIGVLASLVVPNLM 34


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
Shal_3843  GPOSANCHOR  33     4e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 33.5 bits (76), Expect = 4e-04
Identities = 14/69 (20%), Positives = 26/69 (37%), Gaps = 2/69 (2%)

Query: 10 KRQTQKAFTLIELVVGMVVISIAFVLLSTMLFPQAERAADTLHRVRTAELAHSILNEIWG 69
K T + ++L +L G +++A +L L + R +T L + E
Sbjct: 3 KNNTNRHYSLRKLKTGTASVAVALTVLGAGLVVNTNEVSAVATRSQTDTL--EKVQERAD 60

Query: 70 KRYDQNTNA 78
K +N
Sbjct: 61 KFEIENNTL 69


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3844  BCTERIALGSPG  45     8e-09    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 45.3 bits (107), Expect = 8e-09
Identities = 19/69 (27%), Positives = 36/69 (52%), Gaps = 8/69 (11%)

Query: 1 MHRAQRHAGFTLVELVTTIILIAILAVVVVPRLLTSSSYSAFTLRDEFISELRKVQIMAM 60
M + GFTL+E++ I++I +LA +VVP L+ + +++ + I+A+
Sbjct: 1 MRATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGN--------KEKADKQKAVSDIVAL 52

Query: 61 NNQDRCYRL 69
N Y+L
Sbjct: 53 ENALDMYKL 61


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3845  BCTERIALGSPG  48     8e-10    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 48.0 bits (114), Expect = 8e-10
Identities = 18/54 (33%), Positives = 31/54 (57%), Gaps = 4/54 (7%)

Query: 2 KMQKQNGFTLIELVVVIIILGILAVTAAPKFINLQGDAR----LSTLQGVKGAI 51
KQ GFTL+E++VVI+I+G+LA P + + A +S + ++ A+
Sbjct: 3 ATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENAL 56


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3846  BCTERIALGSPG  48     2e-09    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 47.6 bits (113), Expect = 2e-09
Identities = 18/53 (33%), Positives = 30/53 (56%), Gaps = 4/53 (7%)

Query: 1 MQKQNGFTLIELVVVIIILGILAVTAAPKFINLQSDARA----SALQGVKGAI 49
KQ GFTL+E++VVI+I+G+LA P + + A S + ++ A+
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENAL 56


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3847  BCTERIALGSPG  43     1e-07    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 42.6 bits (100), Expect = 1e-07
Identities = 14/35 (40%), Positives = 27/35 (77%)

Query: 6 QKGFSLIELVIVIVILGLLAATAIPRFLNVTDDAE 40
Q+GF+L+E+++VIVI+G+LA+ +P + + A+
Sbjct: 7 QRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKAD 41


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3849  BCTERIALGSPF  301    e-101    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.
Length = 408

Score = 301 bits (773), Expect = e-101
Identities = 109/406 (26%), Positives = 205/406 (50%), Gaps = 4/406 (0%)

Query: 1 MPTYQYRGRSAQGEQIKGQVDASSESAAADQLMSRGVIPLEL----VLAKEVKEFSLKAL 56
M Y Y+ AQG++ +G +A S A L RG++PL + ++ L
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 57 FKGKVALEELQIFTRQMYSLTRSGIPILRAIAGLSETTHSVRMKEALDDISAQLTSGRPL 116
K +++ +L + TRQ+ +L + +P+ A+ +++ + + + + + +++ G L
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 117 SSAMNQHPDVFDALFVSMVHVGENTGKLEDAFIQLSGYIEREQETRRRIKAAMRYPIFVL 176
+ AM P F+ L+ +MV GE +G L+ +L+ Y E+ Q+ R RI+ AM YP +
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 177 IAIAIAMVILNIMVIPKFAEMFSRFGADLPWATKILINTSNLFVNYWPAMIIALIGTFVG 236
+ + IL +V+PK E F LP +T++L+ S+ + P M++AL+ F+
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 237 IRYWHSTEKGEKQWDKWKLNIPVVGSIIERSTLSRYCRSFSMMLSAGVPMTQALSLVADA 296
R EK + + L++P++G I +RY R+ S++ ++ VP+ QA+ + D
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 297 VDNAYMHDRIVGMRRGIESGESMLRVSNNCQLFTPLVLQMVAVGEETGQIDQLLNDAADF 356
+ N Y R+ + G S+ + LF P++ M+A GE +G++D +L AAD
Sbjct: 301 MSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADN 360

Query: 357 YEGEVDYDLKNLTAKLEPILIGFVACIVLVLALGIYLPMWDMLNVV 402
+ E + EP+L+ +A +VL + L I P+ + ++
Sbjct: 361 QDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3851  SYCDCHAPRONE  38     3e-05    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.
Length = 168

Score = 37.6 bits (87), Expect = 3e-05
Identities = 22/109 (20%), Positives = 44/109 (40%)

Query: 321 VQQAAGQNEQALKSLAMIPDASELAIKKWHQQSDLAQKQQDYPVAEQSFRQLAKHEPYQG 380
Q A + ++AM+ + S +++ + + + Y A + F+ L + Y
Sbjct: 11 YQLAMESFLKGGGTIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDS 70

Query: 381 RWWMGLGYALDAQQKYTEAKLAYNQALSQGNLSAQAKVYVDNRLLQLGA 429
R+++GLG A +Y A +Y+ + + LLQ G
Sbjct: 71 RFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGE 119


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3853  BCTERIALGSPD  166    2e-46    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.
Length = 660

Score = 166 bits (422), Expect = 2e-46
Identities = 73/307 (23%), Positives = 136/307 (44%), Gaps = 28/307 (9%)

Query: 243 GNTGGGRQVVVT--PQAGLVTVRAYPNEIRQIRNFIKTAESHLQRQVILEAKVLEVTLSD 300
+ +++ Q + V A P+ + + I + + QV++EA + EV +D
Sbjct: 302 PVAALDKNIIIKAHGQTNALIVTAAPDVMNDLERVIAQLDIR-RPQVLVEAIIAEVQDAD 360

Query: 301 GYQQGIHWESVL-GHAGSTDITFGTSPTTGLSDQ-------ISSVLGGVTSIK-----LE 347
G GI W + G T+ S ++Q SS+ ++S
Sbjct: 361 GLNLGIQWANKNAGMTQFTNSGLPISTAIAGANQYNKDGTVSSSLASALSSFNGIAAGFY 420

Query: 348 GSDFSTMISLLDTQGDVDVLSSPRVTASNNQKAVIKVGTDEYFVTNVSSTTVASNTPITT 407
+++ +++ L + D+L++P + +N +A VG + +T S T + + T
Sbjct: 421 QGNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLT--GSQTTSGDNIFNT 478

Query: 408 PDVELTPFFSGIALDVTPQIDEQGNVLLHIHPSVIDIKEQTKIIKISGSTLELPLAQSEI 467
+ + GI L V PQI+E +VLL I V + + S ++ +L +
Sbjct: 479 VERKTV----GIKLKVKPQINEGDSVLLEIEQEVSSVADAA-----SSTSSDLGATFN-T 528

Query: 468 RESDTVIKAASGDVVVIGGLMKSENTEIVSKVPLLGDIPFLGEAFTNRANSTVKTELVIL 527
R + + SG+ VV+GGL+ ++ KVPLLGDIP +G F + + K L++
Sbjct: 529 RTVNNAVLVGSGETVVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNLMLF 588

Query: 528 LKPIVVG 534
++P V+
Sbjct: 589 IRPTVIR 595


99  Shal_3915  Shal_3934  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_3915  1       10       -0.428410  frataxin family protein
Shal_3916  0       13       1.101546   adenylate cyclase
Shal_3917  5       21       2.002894   porphobilinogen deaminase
Shal_3918  5       20       1.931438   uroporphyrinogen III synthase HEM4
Shal_3919  5       20       1.935089   hypothetical protein
Shal_3920  5       20       2.044599   HemY domain-containing protein
Shal_3921  6       20       2.091609   putative outer membrane adhesin-like protein
Shal_3922  5       23       1.750218   putative outer membrane adhesin-like protein
Shal_3923  -1      14       0.465306   ABC transporter transmembrane protein
Shal_3924  -1      14       -0.788341  HlyD family type I secretion membrane fusion
Shal_3925  -1      10       -0.823666  TolC family type I secretion outer membrane
Shal_3926  -1      11       -1.225190  OmpA/MotB domain-containing protein
Shal_3927  -1      12       -1.128529  hypothetical protein
Shal_3928  -1      12       -0.065715  diguanylate cyclase/phosphodiesterase
Shal_3929  -1      12       -0.066882  ****GAF sensor-containing diguanylate
Shal_3930  -2      13       0.924250   ATP-dependent DNA helicase Rep
Shal_3931  -1      17       1.002966   TetR family transcriptional regulator
Shal_3932  -1      17       1.176255   RND family efflux transporter MFP subunit
Shal_3933  -1      19       2.212201   acriflavin resistance protein
Shal_3934  -1      18       2.867911   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_3915  MALTOSEBP  30     0.001    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 30.1 bits (67), Expect = 0.001
Identities = 20/62 (32%), Positives = 29/62 (46%), Gaps = 6/62 (9%)

Query: 44 QLEFNGTSKIVINKQEPLHEIWLATQFGGYHFAYVDGKW------MDERNGNEFLPFVVE 97
+L+ G S ++ N QEP L GGY F Y +GK+ +D L F+V+
Sbjct: 164 ELKAKGKSALMFNLQEPYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVD 223

Query: 98 SI 99
I
Sbjct: 224 LI 225


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_3919  RTXTOXIND  31     0.007    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 31.3 bits (71), Expect = 0.007
Identities = 12/79 (15%), Positives = 27/79 (34%), Gaps = 5/79 (6%)

Query: 80 GYYLYQQLQAQQAETAALQQELKQELQTVLVAPNQRITSLEQQ----QNQFKSSVDSTLK 135
+ Q+ + EL+ + I S +++ FK+ + L+
Sbjct: 247 QAIAKHAVLEQENKYVEAVNELRVYKSQLEQI-ESEILSAKEEYQLVTQLFKNEILDKLR 305

Query: 136 KTVDQQTQLEERVSIIAQR 154
+T D L ++ +R
Sbjct: 306 QTTDNIGLLTLELAKNEER 324


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
Shal_3921  CABNDNGRPT  85     1e-18    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 84.6 bits (209), Expect = 1e-18
Identities = 41/169 (24%), Positives = 61/169 (36%), Gaps = 17/169 (10%)

Query: 1880 TIVGTDNINNLIFGSTNSDSLTGAN-LDDRIFGREDNDILIGLSGNDELIGGSGNDNVQG 1938
G +N+ + + G + N + + IGGSGND + G
Sbjct: 295 VWDAGGTDTFDFSGYSNNQRINLNEGSFSDVGGLKGNVSIAHGVTIENAIGGSGNDILVG 354

Query: 1939 GEGNDFVIGGIGDDTLNGGIGRDYLSGGQGNDSLDGGALNGSDDGERDFFVWESDSADGS 1998
++ + GG G+D L GG G D L GG G D+ G+ DS +
Sbjct: 355 NSADNILQGGAGNDVLYGGAGADTLYGGAGRDTFVYGS--------------GQDSTVAA 400

Query: 1999 TDTVLNFNLDIDVLDLSDLLIGEESGNLEDFLSFSFSGGNTTITIDADG 2047
D + +F ID +DLS E F+ G + DA
Sbjct: 401 YDWIADFQKGIDKIDLSAF--RNEGQLSFVQDQFTGKGQEVMLQWDAAN 447



Score = 67.3 bits (164), Expect = 3e-13
Identities = 33/159 (20%), Positives = 49/159 (30%), Gaps = 43/159 (27%)

Query: 1875 ADDSITIVGTDNINNLIFGSTNSDSLTGANLDDRIFGREDNDILIGLSGNDELIGGSGND 1934
++ I + + + G + S+ + G NDIL+G S ++ L GG+GND
Sbjct: 309 YSNNQRINLNEGSFSDVGGLKGNVSIAHGVTIENAIGGSGNDILVGNSADNILQGGAGND 368

Query: 1935 NVQGGEGNDFVIGGIGDDTLNGGIG-----------RDYLSGGQGNDSLDGGALNG---- 1979
+ GG G D + GG G DT G G D+ G D
Sbjct: 369 VLYGGAGADTLYGGAGRDTFVYGSGQDSTVAAYDWIADFQKGIDKIDLSAFRNEGQLSFV 428

Query: 1980 ----------------------------SDDGERDFFVW 1990
+ DF V
Sbjct: 429 QDQFTGKGQEVMLQWDAANSITNLWLHEAGHSSVDFLVR 467



Score = 33.4 bits (76), Expect = 0.012
Identities = 11/70 (15%), Positives = 19/70 (27%), Gaps = 4/70 (5%)

Query: 1922 SGNDELIGGSGNDNVQGGE----GNDFVIGGIGDDTLNGGIGRDYLSGGQGNDSLDGGAL 1977
N G D++ + N G N RD+ + + +L
Sbjct: 237 DYNGHYGGAPMIDDIAAIQRLYGANMTTRTGDSVYGFNSNTDRDFYTATDSSKALIFSVW 296

Query: 1978 NGSDDGERDF 1987
+ DF
Sbjct: 297 DAGGTDTFDF 306


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
Shal_3922  CABNDNGRPT  67     1e-12    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 66.5 bits (162), Expect = 1e-12
Identities = 52/206 (25%), Positives = 73/206 (35%), Gaps = 22/206 (10%)

Query: 3789 GDDKVNAGEGNDIIFGDLVSFDGIDGQGFSALQAFVAQETNQQPTNVTVQDIHEFISSNT 3848
D A + + + + G D FS + N + N
Sbjct: 278 DRDFYTATDSSKALIFSVWDAGGTDTFDFSG-----YSNNQRINLNEGSFSDVGGLKGNV 332

Query: 3849 HLFGESNTDDGADTLEGGEGNDILFGQGGNDELIGGLGADTLIGGLGDDKLTGNEGADTF 3908
+ + GG GNDIL G ++ L GG G D L GG G D L G G DTF
Sbjct: 333 SIAH----GVTIENAIGGSGNDILVGNSADNILQGGAGNDVLYGGAGADTLYGGAGRDTF 388

Query: 3909 VWDRNSIDNSDPTDRNTDHITDFNMAEDKLDLSDILQGDTVSELAQH---------LSFT 3959
V+ D T D I DF DK+DLS +S + L +
Sbjct: 389 VYG----SGQDSTVAAYDWIADFQKGIDKIDLSAFRNEGQLSFVQDQFTGKGQEVMLQWD 444

Query: 3960 DENGSTSINIDTDGNGSFDQHIVLDG 3985
N T++ + G+ S D + + G
Sbjct: 445 AANSITNLWLHEAGHSSVDFLVRIVG 470


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_3924  RTXTOXIND  300    2e-99    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 300 bits (770), Expect = 2e-99
Identities = 91/441 (20%), Positives = 195/441 (44%), Gaps = 11/441 (2%)

Query: 20 MMTDAPASHRLTIWALTALIITFLLWAYFAELDQVTTGMGKVIPSSQVQVIQSLDGGVLQ 79
+ T RL + + ++ + + +++ V T GK+ S + + I+ ++ +++
Sbjct: 49 IETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVK 108

Query: 80 EMYVQEGLIVTKGQPLVRIDATRFQSDFAQQEQEVNSLSANVIRLNAELNSITISAITND 139
E+ V+EG V KG L+++ A ++D + + + R SI
Sbjct: 109 EIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSI-------- 160

Query: 140 WREQVKISPQALIFPSSLEDSEPELTSRQREEYVGRLDNLSNQLEIQARQIQQRNQEIQE 199
E K+ L ++ E R + NQ + + ++ E
Sbjct: 161 --ELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLT 218

Query: 200 LASKIRTLTTSFQLVSRELELTRPLADKGIVPEVELLKLQRVVNDIQGELASLRLLRPKV 259
+ ++I ++ L+ L K + + +L+ + + EL + ++
Sbjct: 219 VLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQI 278

Query: 260 KSTMDEAILKRRESVLIYAADTRAQLNEMQTKLSRMNEAQVGAQDKVSKAEIVSPVNGTV 319
+S + A + + ++ + +L + + + +++ + I +PV+ V
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKV 338

Query: 320 KTVHINTLGGVVQPGVDIIEIVPSEDKLLIETKIVPKDIAFLHPGLPAVVKVTAYDFTRY 379
+ + ++T GGVV ++ IVP +D L + + KDI F++ G A++KV A+ +TRY
Sbjct: 339 QQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRY 398

Query: 380 GGLNGVVEHISADTTQDEEGNSFYIVKVRTEFSSLTKDDGTEMPIIPGMLTSVDVITGQR 439
G L G V++I+ D +D+ + V + E + L+ +P+ GM + ++ TG R
Sbjct: 399 GYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLS-TGNKNIPLSSGMAVTAEIKTGMR 457

Query: 440 SVLEYILNPILRAKDTALRER 460
SV+ Y+L+P+ + +LRER
Sbjct: 458 SVISYLLSPLEESVTESLRER 478


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
Shal_3926  OMPADOMAIN  85     8e-22    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 84.6 bits (209), Expect = 8e-22
Identities = 33/118 (27%), Positives = 55/118 (46%), Gaps = 12/118 (10%)

Query: 76 KVLFANDSYYIDPQYYPQVEVIATFMRDY--PNTQAVIEGHCSKTGSHEHNQVLSQNRAN 133
VLF + + P+ ++ + + + + + V+ G+ + GS +NQ LS+ RA
Sbjct: 220 DVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQ 279

Query: 134 AVSSLLAERFGIDSDRLSAVGYSFDRPV-----DPTHTRAAHK----VNRRVIAELTG 182
+V L + GI +D++SA G PV D RAA +RRV E+ G
Sbjct: 280 SVVDYLISK-GIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVKG 336


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_3931  HTHTETR  63     2e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 62.7 bits (152), Expect = 2e-14
Identities = 29/145 (20%), Positives = 53/145 (36%), Gaps = 10/145 (6%)

Query: 21 YELILRAAEKIIAAGGIQGLSMQLVATEAGVAAGTIYRYFKDKDQLILELRKDVLSQVAS 80
+ IL A ++ + G+ S+ +A AGV G IY +FKDK L E+ + S +
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGE 72

Query: 81 AILE--DHQIGTIEQRFKRIWMKMYNYGKQP-------SPANLSYEQYANLPVINTQEIR 131
LE G + I + + E + V+ Q R
Sbjct: 73 LELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQ-QAQR 131

Query: 132 QLEMQLFAPLQQLFEQGKAQGLIQP 156
L ++ + ++Q + ++
Sbjct: 132 NLCLESYDRIEQTLKHCIEAKMLPA 156


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
Shal_3932  RTXTOXIND  47     6e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 47.1 bits (112), Expect = 6e-08
Identities = 20/96 (20%), Positives = 36/96 (37%), Gaps = 1/96 (1%)

Query: 71 TIANQVSGVVSAIRFENGSKVEVGQMLIELDSKVEKANLKSKQVQLPAAEADFKRLSKLY 130
I + +V I + G V G +L++L + +A+ Q L A + R L
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 131 KQ-NSVSKQDLDNSQSKYLALMADSESLSATIGRRE 165
+ +L Y +++ E L T +E
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKE 193



Score = 43.7 bits (103), Expect = 7e-07
Identities = 36/215 (16%), Positives = 81/215 (37%), Gaps = 28/215 (13%)

Query: 78 GVVSAIRFENGSKVEVGQMLIEL--DSKVEKANLKSKQVQLPAAEADFKRLSKLYKQNSV 135
++ +E +E + +V K+ L+ + ++ +A+ +++ +++L+ +N +
Sbjct: 247 QAIAKHAV-----LEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLF-KNEI 300

Query: 136 SKQDLDNSQSKYLALMADSESLSATIGRREIRAPFAGLVGIRNVN-LGEYLQTGS---EI 191
+ L + L + IRAP + V V+ G + T I
Sbjct: 301 LDK-LRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVI 359

Query: 192 VRLEDISTMKIRFTIPQTQLSRIATGQKVHVYVDAYP---TEPFEGVISAIEP------- 241
V +D T+++ + + I GQ + V+A+P G + I
Sbjct: 360 VPEDD--TLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQR 417

Query: 242 -AVFYQSGLIQVQARIP--NTDAKLRSGMFAKVDI 273
+ + + + + N + L SGM +I
Sbjct: 418 LGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEI 452


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3933  ACRIFLAVINRP  795    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 795 bits (2054), Expect = 0.0
Identities = 312/1037 (30%), Positives = 533/1037 (51%), Gaps = 34/1037 (3%)

Query: 5 DIFIRRPVLAASISFLILLLGFYALKSMQVREYPEMTNTVVTVSTSYYGADSNLIQGFIT 64
+ FIRRP+ A ++ ++++ G A+ + V +YP + V+VS +Y GAD+ +Q +T
Sbjct: 3 NFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVT 62

Query: 65 QPLEQALAQADNVDFMTSDSF-LGSSKISVYMKLNTDPNGALADILAKVNSVRSQLPKEA 123
Q +EQ + DN+ +M+S S GS I++ + TDP+ A + K+ LP+E
Sbjct: 63 QVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEV 122

Query: 124 EDSTVEMSTGAQTSALYISFYSDQINSSQ--ITDYLERVVKPELFTIDGVAKVNLYGGIK 181
+ + + + + + F SD ++Q I+DY+ VK L ++GV V L+G +
Sbjct: 123 QQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA-Q 181

Query: 182 YAMRIWLDPARMGGFNLSSTDVMSVLQANNYQSAVGQTNNTFTL------LNGTADTQVA 235
YAMRIWLD + + L+ DV++ L+ N Q A GQ T L + A T+
Sbjct: 182 YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 236 TVAELEKLVI-GSKDGLVIRLGDIADVSLEKSHDVYRALADGQEAVVVGLDVTPTANPLV 294
E K+ + + DG V+RL D+A V L + A +G+ A +G+ + AN L
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 295 VAADARAMMPQITRNLPISMKARILYDSSLAIDESINEVIKTIGEASLIVIVVITLFLGS 354
A +A + ++ P MK YD++ + SI+EV+KT+ EA ++V +V+ LFL +
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 355 FRAVIIPIVTIPLSLIGVAVIMQMFGFTLNLMTLLAMVLAIGLVVDDAIVVVENVDRHIK 414
RA +IP + +P+ L+G I+ FG+++N +T+ MVLAIGL+VDDAIVVVENV+R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 LGETPFRAAII-GTREIAVPVISMTITLAAVYAPIALMGGITGSLFKEFALTLAGAVFIS 473
+ P + A +I ++ + + L+AV+ P+A GG TG+++++F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 474 GIVALTLSPMMCSKILKA-----HSKPNRFERGVENFLSGLTSRYNRMLTAVLNKRPVVI 528
+VAL L+P +C+ +LK H F + Y + +L +
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 529 AFAIIVFASLPVMFSFIPSELAPNEDKSVVMMMGTAPSSANLDYIQANMTLVTDMISAQP 588
++ A + V+F +PS P ED+ V + M P+ A + Q + VTD
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 589 EAAASLAF----VGVPNANQAFGIA--PLVPWSERDKSQKEMQKFFGE---EVKKVPGMA 639
+A F Q G+A L PW ER+ + + E+ K+
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 640 VTTFQMPE--LPGASSGLPIQFVITSSNSFESLFQIGSGVLNKVQQSPLFVYS-EVNLKF 696
V F MP G ++G + + + ++L Q + +L Q P + S N
Sbjct: 662 VIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLE 721

Query: 697 DSGSMKIHIKRDVAGAYGITMKDIGLTLTTMMSDGYVNRINLDGRSYEVIPQVERKLRAN 756
D+ K+ + ++ A A G+++ DI T++T + YVN GR ++ Q + K R
Sbjct: 722 DTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRML 781

Query: 757 PEALAGYYLTAADGRSIPLSSLVDVEIVSEPRSLPHFNQMNAITVGGVAAPGVAIGDAIA 816
PE + Y+ +A+G +P S+ V L +N + ++ + G AAPG + GDA+A
Sbjct: 782 PEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMA 841

Query: 817 FLQNIGDNELPKGYSYDFLGEARQYVTEGSALYATFGLALAIIFLVLASQFESLRDPLVI 876
++N+ ++LP G YD+ G + Q G+ A ++ ++FL LA+ +ES P+ +
Sbjct: 842 LMENL-ASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 877 LVSVPLAISGALIALGWTHVFGLSSMNIYTQVGLITLVGLITKHGILMCEVAKEEQLHNG 936
++ VPL I G L+A + ++Y VGL+T +GL K+ IL+ E AK+ G
Sbjct: 901 MLVVPLGIVGVLLAATLFN----QKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 937 LSKLDAIKHAATIRLRPILMTTAAMIAGLIPLLFAVGAGAVARFNIGLVIVSGLAIGTVF 996
++A A +RLRPILMT+ A I G++PL + GAG+ A+ +G+ ++ G+ T+
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 997 TLFVLPVIYTYLAEKHE 1013
+F +PV + + +
Sbjct: 1017 AIFFVPVFFVVIRRCFK 1033


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_3934  ADHESNFAMILY  25     0.042    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 24.8 bits (54), Expect = 0.042
Identities = 15/50 (30%), Positives = 25/50 (50%), Gaps = 5/50 (10%)

Query: 4 PKKIEEVAKQLSENLPGGLKQFAGEFEERSKQILQNQLQKLDVVSREEFE 53
+ +AKQLS P EF E++ + ++L KLD S+++F
Sbjct: 148 IIFAKNIAKQLSAKDPNN-----KEFYEKNLKEYTDKLDKLDKESKDKFN 192


100  Shal_4079  Shal_4087  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_4079  -2      15       -2.035937  DNA adenine methylase
Shal_4080  -2      15       -2.250651  sporulation domain-containing protein
Shal_4081  -1      15       -2.547549  3-dehydroquinate synthase
Shal_4082  -1      16       -3.030371  shikimate kinase I
Shal_4083  -1      13       -1.956150  type IV pilus secretin PilQ
Shal_4084  -2      12       -0.441934  pilus assembly protein PilP
Shal_4085  -2      11       0.002910   pilus assembly protein PilO
Shal_4086  -3      11       0.219447   fimbrial assembly family protein
Shal_4087  -2      11       0.545848   type IV pilus assembly protein PilM
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4079  TYPE3IMSPROT  31     0.007    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.
Length = 354

Score = 30.5 bits (69), Expect = 0.007
Identities = 16/42 (38%), Positives = 18/42 (42%), Gaps = 6/42 (14%)

Query: 205 DDQALLARKARHTAIERGIPVLISNHDIPLTRELYHGARFDT 246
D Q R A E G+P+L IPL R LY A D
Sbjct: 289 DAQVQ---TVRKIAEEEGVPIL---QRIPLARALYWDALVDH 324


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
Shal_4080  HOKGEFTOXIC  27     0.017    Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 27.5 bits (61), Expect = 0.017
Identities = 10/32 (31%), Positives = 19/32 (59%), Gaps = 3/32 (9%)

Query: 224 VQLPQKSRLKMIIIVSTLCLIALLASWLLIDS 255
++LP+ S + ++IV CL L+ ++L S
Sbjct: 1 MKLPRSSLVWCVLIV---CLTLLIFTYLTRKS 29


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
Shal_4082  PF05272  28     0.016    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.016
Identities = 16/65 (24%), Positives = 24/65 (36%), Gaps = 10/65 (15%)

Query: 9 LVGPMGAGKSTIGRHLAQMLHLEFHDSDQEIESRTGADIA------WVFDVEGEEGFRNR 62
L G G GKST+ L + SD + TG D +++ FR
Sbjct: 601 LEGTGGIGKSTLINTLVGLDFF----SDTHFDIGTGKDSYEQIAGIVAYELSEMTAFRRA 656

Query: 63 ETQVV 67
+ + V
Sbjct: 657 DAEAV 661


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4083  BCTERIALGSPD  245    4e-74    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.
Length = 660

Score = 245 bits (628), Expect = 4e-74
Identities = 100/412 (24%), Positives = 187/412 (45%), Gaps = 38/412 (9%)

Query: 310 GDITLRLDDVPWDQALDLILQAKGLDKRIEGNILMIAPSEEIAIRESQEL-------KNR 362
GD ++ + W A D++ L+K + L + + E +R
Sbjct: 190 GDRSVVTVPLSWASAADVVKLVTELNKDTSKSALPGSMVANVVADERTNAVLVSGEPNSR 249

Query: 363 QEVKQLA-PLY--------SEYLQINYAKAKDIAALLKGEDSSLLTARG----------- 402
Q + + L ++ + + YAKA D+ +L G S++ + +
Sbjct: 250 QRIIAMIKQLDRQQATQGNTKVIYLKYAKASDLVEVLTGISSTMQSEKQAAKPVAALDKN 309

Query: 403 -SVAVDERTNTLLVKDTEETLENVHRLIEVLDIPIKQVLIEARMVTVKDDVSEDLGVRWG 461
+ +TN L+V + + ++ R+I LDI QVL+EA + V+D +LG++W
Sbjct: 310 IIIKAHGQTNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLNLGIQWA 369

Query: 462 VTDQQGDKGTSGSLEGAGDIANGIVPSLSDRLNVNLPAAVANPTSIAFHVAKLADGTVLD 521
+ + T+ L + IA + ++ +L +A+++ IA +
Sbjct: 370 NKNAGMTQFTNSGLPISTAIAGANQYNKDGTVSSSLASALSSFNGIAAGFYQGN----WA 425

Query: 522 LELSALEQENKGEIIASPRITTSNQKAAYIEQGVEIPYVEASSSG-----ATTVTFKKAV 576
+ L+AL K +I+A+P I T + A G E+P + S + TV K
Sbjct: 426 MLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTGSQTTSGDNIFNTVERKTVG 485

Query: 577 LSLRVTPQITPDNRVILDLEITQDSQGKT-VDTPTGQAVSIDTQRIGTQVLVDHGETIVL 635
+ L+V PQI + V+L++E S T + + +T+ + VLV GET+V+
Sbjct: 486 IKLKVKPQINEGDSVLLEIEQEVSSVADAASSTSSDLGATFNTRTVNNAVLVGSGETVVV 545

Query: 636 GGIYQQQLITKVSKVPILGDIPYLGFMFRNSSDKNERQELLIFVTPKIVSEA 687
GG+ + + KVP+LGDIP +G +FR++S K ++ L++F+ P ++ +
Sbjct: 546 GGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVIRDR 597



Score = 43.4 bits (102), Expect = 3e-06
Identities = 30/175 (17%), Positives = 72/175 (41%), Gaps = 14/175 (8%)

Query: 279 SLNFQNISVRTVLQIIADYNNFNLVTSDTVEGDITLR-LDDVPWDQALDLILQAKGLDKR 337
S +F+ ++ + ++ N ++ +V G IT+R D + +Q L
Sbjct: 31 SASFKGTDIQEFINTVSKNLNKTVIIDPSVRGTITVRSYDMLNEEQYYQFFLSVL----D 86

Query: 338 IEGNILMIAPSEEIAIRESQELKNRQEVK--QLAP-----LYSEYLQINYAKAKDIAALL 390
+ G ++ + + + S++ K AP + + + + A+D+A LL
Sbjct: 87 VYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAPGIGDEVVTRVVPLTNVAARDLAPLL 146

Query: 391 KGEDSSLLTARGSVAVDERTNTLLVKDTEETLENVHRLIEVLDIPIKQVLIEARM 445
+ + + GSV E +N LL+ ++ + ++E +D + ++ +
Sbjct: 147 RQLNDN--AGVGSVVHYEPSNVLLMTGRAAVIKRLLTIVERVDNAGDRSVVTVPL 199


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4087  SHAPEPROTEIN  46     2e-07    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.
Length = 347

Score = 45.5 bits (108), Expect = 2e-07
Identities = 31/156 (19%), Positives = 59/156 (37%), Gaps = 34/156 (21%)

Query: 199 VDIGANITTFSVVEHGETTFVREQAFGGEQFTQSILSFYGMTY------EQAEKAKLE-- 250
VDIG T +V+ + GG++F ++I+++ Y AE+ K E
Sbjct: 164 VDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIG 223

Query: 251 -------------------GDLPRNY------VFEVLSPFQTQLLQQIKRTLQIYCTSSG 285
+PR + + E L T ++ + L+
Sbjct: 224 SAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELA 283

Query: 286 RDKVDH-IVLCGGTSKLEGMANLMVNELGVHTIIAD 320
D + +VL GG + L + L++ E G+ ++A+
Sbjct: 284 SDISERGMVLTGGGALLRNLDRLLMEETGIPVVVAE 319


101  Shal_4168  Shal_4175  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_4168  -1      15       1.447560   general secretion pathway protein J
Shal_4169  1       17       1.399359   general secretion pathway protein I
Shal_4170  0       16       1.466160   general secretion pathway protein H
Shal_4171  1       16       1.412208   general secretion pathway protein G
Shal_4172  1       14       1.448982   general secretion pathway protein F
Shal_4173  2       17       1.838054   general secretory pathway protein E
Shal_4174  1       16       1.478187   general secretion pathway protein D
Shal_4175  1       16       1.420391   general secretion pathway protein C
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4168  BCTERIALGSPG  36     4e-05    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 36.0 bits (83), Expect = 4e-05
Identities = 19/55 (34%), Positives = 32/55 (58%), Gaps = 7/55 (12%)

Query: 3 LKRIKTQRGFTLLEMLIAIAIFAMLGLAANTVLSTVMKNDIATRDFAAKLKAMQQ 57
++ QRGFTLLE+++ I I +G+ A+ V+ +M N ++ A K KA+
Sbjct: 1 MRATDKQRGFTLLEIMVVIVI---IGVLASLVVPNLMGN----KEKADKQKAVSD 48


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4169  BCTERIALGSPH  30     0.002    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.
Length = 170

Score = 29.9 bits (67), Expect = 0.002
Identities = 9/31 (29%), Positives = 19/31 (61%)

Query: 5 IKQRGMTLLEVIVALAVFSIAAVSITKSLGD 35
++QRG TLLE+++ L + ++A + +
Sbjct: 1 MRQRGFTLLEMMLILLLMGVSAGMVLLAFPA 31


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4170  BCTERIALGSPH  85     3e-23    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.
Length = 170

Score = 85.4 bits (211), Expect = 3e-23
Identities = 39/170 (22%), Positives = 71/170 (41%), Gaps = 39/170 (22%)

Query: 23 RQTGFTLMEVLLVVLLMGLAATAVTLSMGGASKEKALERTAQQFMMATEMVLDETVLSGH 82
RQ GFTL+E++L++LLMG++A V L+ + + A + T +F V + +G
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQ-TLARFEAQLRFVQQRGLQTGQ 60

Query: 83 FVGIVIEDSSYKFVYYDEG---------------KWKPLEQDRLLAERQMEPGVEMVLVL 127
F G+ + ++F+ + +W PL R+ +
Sbjct: 61 FFGVSVHPDRWQFLVLEARDGADPAPADDGWSGYRWLPLRAGRVATSGS----------I 110

Query: 128 DGLPLVQEDEEQDSWFDEPLIEKSADDKKKFPEPQIMLFPSGEMSAFELS 177
G L + ++W D+ P +++FP GEM+ F L+
Sbjct: 111 AGGKLNLAFAQGEAW-------TPGDN------PDVLIFPGGEMTPFRLT 147


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4171  BCTERIALGSPG  235    1e-83    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 235 bits (602), Expect = 1e-83
Identities = 99/144 (68%), Positives = 121/144 (84%)

Query: 1 MQTRNKQQGFTLLEVMVVIVILGILASMVVPNLMGNKDKADQQKAVSDIVALENALDMYK 60
M+ +KQ+GFTLLE+MVVIVI+G+LAS+VVPNLMGNK+KAD+QKAVSDIVALENALDMYK
Sbjct: 1 MRATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYK 60

Query: 61 LDNSVYPTTEQGLEALVQKPTISPEPRNYRADGYVKRLPQDPWRSDYLLLSPGENGKLDI 120
LDN YPTT QGLE+LV+ PT+ P NY +GY+KRLP DPW +DY+L++PGE+G D+
Sbjct: 61 LDNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDL 120

Query: 121 FSAGPDGQPGTEDDIGNWNLQNFQ 144
SAGPDG+ GTEDDI NW L +
Sbjct: 121 LSAGPDGEMGTEDDITNWGLSKKK 144


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4172  BCTERIALGSPF  503    0.0      Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.
Length = 408

Score = 503 bits (1296), Expect = 0.0
Identities = 233/407 (57%), Positives = 304/407 (74%), Gaps = 1/407 (0%)

Query: 1 MPAFEYKALDKTGKQQKGVVEADTARHARTQLREQRLMPLEITPVVEKESKAKSAGFSF- 59
M + Y+ALD GK+ +G EAD+AR AR LRE+ L+PL + + K+ S G S
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 60 LKRGISTAELALITRQIATLVAAGLPIEEALKAVGQQCEKDRLASMVMAVRSRVVEGYSL 119
K +ST++LAL+TRQ+ATLVAA +P+EEAL AV +Q EK L+ ++ AVRS+V+EG+SL
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 120 ADSMAEFPHIFDELYRAMVASGEKSGHLEVVLNRLADYTERRQQLKSKMTQAMIYPIVLT 179
AD+M FP F+ LY AMVA+GE SGHL+ VLNRLADYTE+RQQ++S++ QAMIYP VLT
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 180 VVAIGVIAILLAAVVPQVVGQFEHMGQELPWTTELLIASSDFIRDYGLIVLVAIVGLFFI 239
VVAI V++ILL+ VVP+VV QF HM Q LP +T +L+ SD +R +G +L+A++ F
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 240 AKRLLVTPKNRMIYDSMLLRLPVISKVSKSLNTARFARTLSILTASSVPLLDAMRIASDV 299
+ +L K R+ + LL LP+I ++++ LNTAR+ARTLSIL AS+VPLL AMRI+ DV
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 300 LVNVKVKAAVEDATLRVREGTSLGTALANTKLFPPMMLYMITSGEKSGQLEQMLERAADN 359
+ N + + AT VREG SL AL T LFPPMM +MI SGE+SG+L+ MLERAADN
Sbjct: 301 MSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADN 360

Query: 360 QDREFESNVTIALGVFEPMLVVSMAAVVLFIVLAILQPILALNNMIS 406
QDREF S +T+ALG+FEP+LVVSMAAVVLFIVLAILQPIL LN ++S
Sbjct: 361 QDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLMS 407


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4174  BCTERIALGSPD  608    0.0      Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.
Length = 660

Score = 608 bits (1568), Expect = 0.0
Identities = 330/677 (48%), Positives = 448/677 (66%), Gaps = 29/677 (4%)

Query: 6 IRRRLIAGMVMGASLLAPQLAWSEQYAANFKGTDIQEFINIVGKNLERTIIVDPTVRGKI 65
IR + ++ A L P A +E+++A+FKGTDIQEFIN V KNL +T+I+DP+VRG I
Sbjct: 7 IRSFSLTLLIFAALLFRP--AAAEEFSASFKGTDIQEFINTVSKNLNKTVIIDPSVRGTI 64

Query: 66 NVRSYDLLNDEQYYQFFLNVLQVYGYAVVEMDNNIIKVIKDKDAKTASIRVADDSAPGLG 125
VRSYD+LN+EQYYQFFL+VL VYG+AV+ M+N ++KV++ KDAKTA++ VA D+APG+G
Sbjct: 65 TVRSYDMLNEEQYYQFFLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAPGIG 124

Query: 126 DEMVTRIVALYNTEAKQLAPLLRQLNDNAGGGNVVNYDPSNVLMISGRAAVVNKLVEIVR 185
DE+VTR+V L N A+ LAPLLRQLNDNAG G+VV+Y+PSNVL+++GRAAV+ +L+ IV
Sbjct: 125 DEVVTRVVPLTNVAARDLAPLLRQLNDNAGVGSVVHYEPSNVLLMTGRAAVIKRLLTIVE 184

Query: 186 RVDKQGDTAVEVVRLEYASAGEIVRIIDTLYRSTANQAQMPGQAPKVVADERINAVVVSG 245
RVD GD +V V L +ASA ++V+++ L + T+ A VVADER NAV+VSG
Sbjct: 185 RVDNAGDRSVVTVPLSWASAADVVKLVTELNKDTSKSALPGSMVANVVADERTNAVLVSG 244

Query: 246 DEKSRERVVKLIKKLDAEQATTGNTKVRYLRYAKAEDLVEVLTGFAEQLVDDKGAPQGGG 305
+ SR+R++ +IK+LD +QAT GNTKV YL+YAKA DLVEVLTG + + +K A +
Sbjct: 245 EPNSRQRIIAMIKQLDRQQATQGNTKVIYLKYAKASDLVEVLTGISSTMQSEKQAAKPVA 304

Query: 306 SKRRNEINIMAHTDTNALVISAEPDQMRTIESVINQLDIRRAQVLVEAIIVEVAEGDDVG 365
+ +N I I AH TNAL+++A PD M +E VI QLDIRR QVLVEAII EV + D +
Sbjct: 305 ALDKN-IIIKAHGQTNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLN 363

Query: 366 FGVQWATEAGGGTQFNNLGPTIGEIGAGIWQAQGEEGSTVCNDGTCTENPDTRGDITLLA 425
G+QWA + G TQF N G I AG + DGT + + LA
Sbjct: 364 LGIQWANKNAGMTQFTNSGLPISTAIAG--------ANQYNKDGTVS---------SSLA 406

Query: 426 QALGKVNGMAWGVAMGDFGALIQAVSSDTKSNVLATPSITTLDNQEASFIVGDEVPILTG 485
AL NG+A G G++ L+ A+SS TK+++LATPSI TLDN EA+F VG EVP+LTG
Sbjct: 407 SALSSFNGIAAGFYQGNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTG 466

Query: 486 SQNSSNGNSNPFQTVERKEVGVKLKVVPQVNEGNAVKLTIEQEVSGINGK-----TGVDV 540
SQ +S N F TVERK VG+KLKV PQ+NEG++V L IEQEVS + + +
Sbjct: 467 SQTTSGD--NIFNTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSSVADAASSTSSDLGA 524

Query: 541 TFATRRLTTTVMADSGQIVVLGGLIKEEVQESVQKVPFLGDIPIIGHLFKSSSSGKKKTN 600
TF TR + V+ SG+ VV+GGL+ + V ++ KVP LGDIP+IG LF+S+S K N
Sbjct: 525 TFNTRTVNNAVLVGSGETVVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRN 584

Query: 601 LMVFIKPTIIRDGITMEGIAGRKYNYFRALQLEQQ-ERGVNLMPNTDVPILEEWNQADYL 659
LM+FI+PT+IRD + +Y F Q +Q+ + + M N D+ + Q
Sbjct: 585 LMLFIRPTVIRDRDEYRQASSGQYTAFNDAQSKQRGKENNDAMLNQDLLEIYP-RQDTAA 643

Query: 660 PPEVNEVLNRYKEGKGL 676
+V+ ++ + G L
Sbjct: 644 FRQVSAAIDAFNLGGNL 660


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4175  BCTERIALGSPC  200    3e-65    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.
Length = 272

Score = 200 bits (509), Expect = 3e-65
Identities = 77/286 (26%), Positives = 136/286 (47%), Gaps = 36/286 (12%)

Query: 17 KPVSTAVFGLGLLIALYLLAQITWKL-VPDNSAATRWVPTPVASNASGQVNILGLQQLSL 75
+ +F L +L+ LA I W++ +PDN+ + TP + L +L
Sbjct: 12 SVIRRILFYLLMLLFCQQLAMIFWRIGLPDNAPVSSVQITPAQARQQPV----TLNDFTL 67

Query: 76 FGQPDAAGAKPKAAPVEE--IITDAPKTSLSIQLTGVVASTTEQKGLAVIASSGSQDTYG 133
FG + A + +++ P ++L++ LTGV+A + + +A+I+ Q + G
Sbjct: 68 FG----VSPEKNKAGALDASQMSNLPPSTLNLSLTGVMAGDDDSRSIAIISKDNEQFSRG 123

Query: 134 LGDKIRGTSASLKEVYADRIIITNSGRYETLMLDGLEYNTNGAANQHLQKAKSVSKGKTI 193
+ +++ G +A + + DR+++ GRYE L L E + G +
Sbjct: 124 VNEEVPGYNAKIVSIRPDRVVLQYQGRYEVLGLYSQEDS-----------GSDGVPGAQV 172

Query: 194 DNRKNRAVAAELSQSREEILADPSKITDYLSISPVKSGGELAGYRLNPGKDRELFKQAGF 253
+ E Q R + ++DY+S SP+ + +L GYRLNPG + F + G
Sbjct: 173 N---------EQLQQR-----ASTTMSDYVSFSPIMNDNKLQGYRLNPGPKSDSFYRVGL 218

Query: 254 KANDLAKSINGYDLTDMGQALEVMAQLPEMTEVALMVERDGQLIEI 299
+ ND+A ++NG DL D QA + M ++ ++ L VERDGQ +I
Sbjct: 219 QDNDMAVALNGLDLRDAEQAKKAMERMADVHNFTLTVERDGQRQDI 264


102  Shal_4183  Shal_4188  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_4183  0       16       1.248390   putative lipoprotein
Shal_4184  1       14       0.920090   signal peptidase I
Shal_4185  0       14       1.411859   excinuclease ABC subunit C
Shal_4186  -1      13       1.083455   oligopeptidase B
Shal_4187  -1      13       0.642960   3,4-dihydroxy-2-butanone 4-phosphate synthase
Shal_4188  -1      10       -0.314939  PAS/PAC sensor-containing diguanylate
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4183  RTXTOXINA     28     0.029    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 28.4 bits (63), Expect = 0.029
Identities = 23/102 (22%), Positives = 37/102 (36%), Gaps = 1/102 (0%)

Query: 145 DGSGDLWIKGGHDLNTQDGSGSVEISNTTGKLTLEDGSGSITLSGIGGDTHIDDGSGDLN 204
DG + G+D D G+ +S G L G G+ L G+ G+ +++ G GD
Sbjct: 744 DGDDLIEGNDGNDRLYGD-KGNDTLSGGNGDDQLYGGDGNDKLIGVAGNNYLNGGDGDDE 802

Query: 205 VNNVNGRVVIDDGSGDIDVDNTLGLSIIESGSGDLSVDNING 246
+ + G D G + G D + G
Sbjct: 803 FQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDDLLKG 844


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4184  IGASERPTASE   30     0.007    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.007
Identities = 22/86 (25%), Positives = 38/86 (44%), Gaps = 20/86 (23%)

Query: 44 KEGDRIVVDKMA---YDLRVPFTQISLATTGEPERGEIVVFESKAADKRLIKRVIGLPGD 100
K+GD++VV K A + L+V TGEP E+ +F++ A + D
Sbjct: 909 KQGDKVVVTKSATGNFTLQVA------DKTGEPNHNELTLFDASKAQR-----------D 951

Query: 101 KISLSHEVLFINGKALDYSLVTSDQR 126
+++S ++ A Y L + R
Sbjct: 952 HLNVSLVGNTVDLGAWKYKLRNVNGR 977


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4187  ACRIFLAVINRP  27     0.045    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 27.5 bits (61), Expect = 0.045
Identities = 15/68 (22%), Positives = 25/68 (36%), Gaps = 4/68 (5%)

Query: 79 QLPPMVSDNNSQY---GTAFTVSIEAKQGVTTGVSAADRVTTIKTAIADNAKPDDLARPG 135
Q P + F + ++ ++ GVS +D TI TA+ +D G
Sbjct: 707 QHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALG-GTYVNDFIDRG 765

Query: 136 HVYPLRAR 143
V L +
Sbjct: 766 RVKKLYVQ 773


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4188  ANTHRAXTOXNA  31     0.027    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 31.3 bits (70), Expect = 0.027
Identities = 39/169 (23%), Positives = 68/169 (40%), Gaps = 21/169 (12%)

Query: 58 NVIYELQKERGLSAGYLSSGGRAFKQRLEQ-QWLNTDRKFQELLQSSALEKTVETASNDA 116
V YE+ K G+S +S + L + L+ D +LL S ++ +E +
Sbjct: 168 EVYYEIGK--GISLDIISKDKSLDPEFLNLIKSLSDDSDSSDLLFSQKFKEKLELNNKSI 225

Query: 117 ELFHALQN----QLSKASLARQQLEIARQKVLELNTPEYFNFYSGLTQKLINFVSLLRFK 172
++ +N Q + + + VLEL P+ F + + L + K
Sbjct: 226 DINFIKENLTEFQHAFSLAFSYYFAPDHRTVLELYAPDMFEYMNKLEKGGFE-KISESLK 284

Query: 173 SDNSYQVLVQSDLINMLKIQELAGQERGLVNSLLAAPTLEPDSFKQIAQ 221
+ V+ D I++LK E A + GLV D+FK+IA+
Sbjct: 285 KEG-----VEKDRIDVLK-GEKALKASGLVPE-------HADAFKKIAR 320


103  Shal_4228  Shal_4241  N        Y        N        Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_4228  -1      23       2.436135   alpha/beta hydrolase fold protein
Shal_4229  -1      24       2.521677   flagellar biosynthesis protein FlhA
Shal_4230  1       22       0.917985   flagellar biosynthetic protein FlhB
Shal_4231  0       19       0.479335   flagellar biosynthetic protein FliR
Shal_4232  0       17       0.585603   export protein FliQ
Shal_4233  0       16       0.498625   flagellar biosynthesis protein FliP
Shal_4234  1       16       1.560027   flagellar motor switch protein FliN
Shal_4235  1       15       1.563827   hypothetical protein
Shal_4236  2       15       2.294205   OmpA/MotB domain-containing protein
Shal_4237  0       15       3.698357   sigma-54 dependent transcriptional regulator
Shal_4238  0       18       4.133722   flagellar hook-basal body complex subunit FliE
Shal_4239  1       18       4.418730   flagellar MS-ring protein
Shal_4240  1       20       4.424915   flagellar motor switch protein G
Shal_4241  1       22       5.255394   flagellar assembly protein H
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4228  HTHFIS        31     0.007    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.0 bits (70), Expect = 0.007
Identities = 15/73 (20%), Positives = 28/73 (38%), Gaps = 5/73 (6%)

Query: 199 LGGKDYHAEPFDKNDLTHSKLRYDDYRELYRQQPQLQLGSPTNHWLV---ESIDAAEQAI 255
G DY +PFD +L R + R+ +L+ S LV ++ + +
Sbjct: 96 KGAYDYLPKPFDLTELIGIIGRALAEPK--RRPSKLEDDSQDGMPLVGRSAAMQEIYRVL 153

Query: 256 DAARNSKIPLLIL 268
+ + L+I
Sbjct: 154 ARLMQTDLTLMIT 166


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4230  TYPE3IMSPROT  313    e-107    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 313 bits (804), Expect = e-107
Identities = 111/352 (31%), Positives = 186/352 (52%), Gaps = 6/352 (1%)

Query: 10 KTEKATPQKLKKAREQGQVPRSKDLASSALIIGCSLLLTTTADWIAAKVAAMTRINMSFT 69
KTE+ TP+K++ AR++GQV +SK++ S+ALI+ S +L +D+ + + M
Sbjct: 5 KTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKL----MLIP 60

Query: 70 KAQLDEPGMMARHLAYS--LLEVLNILGPLFLMVALIAMVAGAMPGGPIFNFNNANFKYS 127
Q P A LLE + PL + AL+A+ + + G + +
Sbjct: 61 AEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIK 120

Query: 128 RIDPIAGLGRMASVKSLVELVKSILKIVLLIGIMLLFLQKNFQALMAVSQLPIDEAVTRG 187
+I+PI G R+ S+KSLVE +KSILK+VLL ++ + ++ N L+ + I+
Sbjct: 121 KINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLL 180

Query: 188 IEMLSLAMFYMGLGLLVITFIDVPYQYWHHHKELRMSLQEVKDEYKQQDGKPEVKAKIRQ 247
++L M +G +VI+ D ++Y+ + KEL+MS E+K EYK+ +G PE+K+K RQ
Sbjct: 181 GQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQ 240

Query: 248 MQQRISRSRADVSVPKADVLLVNPSHYAVALKYDMNKADAPYVLAKGTDELALYMRDIAK 307
Q I +V ++ V++ NP+H A+ + Y + P V K TD +R IA+
Sbjct: 241 FHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAE 300

Query: 308 RNDVEVLELPPLTRAIYYSTQVEQQIPSGLFIAIAHVLTYVMQLRAARQGQQ 359
V +L+ PL RA+Y+ V+ IP+ A A VL ++ + +Q +
Sbjct: 301 EEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQNIEKQHSE 352


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4231  TYPE3IMRPROT  107    2e-30    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 107 bits (269), Expect = 2e-30
Identities = 78/229 (34%), Positives = 115/229 (50%), Gaps = 2/229 (0%)

Query: 1 MVMPFLGNAYIPATVRILLAISISALIAPMLPPFPAVDAISFQALILAVEQLLIGFMLAL 60
P L +P V++ LA+ I+ IAP LP SF AL LAV+Q+LIG L
Sbjct: 28 STAPILSERSVPKRVKLGLAMMITFAIAPSLPANDVPV-FSFFALWLAVQQILIGIALGF 86

Query: 61 FLSIMIHVMTLFGAMMSMQMGLSMAVMNDPSSGGANPILGLWFMLYGTLLFLALDGHLVA 120
+ + G ++ +QMGLS A DP+S P+L + LLFL +GHL
Sbjct: 87 TMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDMLALLLFLTFNGHLWL 146

Query: 121 IGILVDSFHLWPIG-MGVFDLPLMGLIARFSWIFAAAFMLALPAILAMLVVNLTFGVLSR 179
I +LVD+FH PIG + + L S IF MLALP I +L +NL G+L+R
Sbjct: 147 ISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFLNGLMLALPLITLLLTLNLALGLLNR 206

Query: 180 AAPSLNVFALGFPMSMLMGLLCVFFSFSGLPTRYSDLCLDALSAMYQFI 228
AP L++F +GFP+++ +G+ + + L + + + I
Sbjct: 207 MAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCEHLFSEIFNLLADII 255


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4232  TYPE3IMQPROT  49     1e-11    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 49.4 bits (118), Expect = 1e-11
Identities = 19/79 (24%), Positives = 40/79 (50%)

Query: 3 VNELTALFAEAIFLVVAMVGVLVVPGLLVGLFIAVFQAATQVNEQTLSFLPRLVITLLMV 62
+++L +A++LV+ + G + ++GL + +FQ TQ+ EQTL F +L+ L +
Sbjct: 1 MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL 60

Query: 63 VFAGEWLLMKICDFFDRLF 81
W + + ++
Sbjct: 61 FLLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4233  FLGBIOSNFLIP  237    4e-81    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 237 bits (606), Expect = 4e-81
Identities = 112/238 (47%), Positives = 162/238 (68%)

Query: 4 VLLVFVLSFSSAAYASDGLTLFSLADGDKTQAVSVKLEILALMTVLSFLPALLMMLTSFT 63
+ +L + + + S Q+ S+ ++ L +T L+F+PA+L+M+TSFT
Sbjct: 6 SVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLMMTSFT 65

Query: 64 RIIVVLAILRQALGLQQSPPNKVLIGIALVMTIFIMRPVGQEIYDDAFLPYDMGQIELPE 123
RII+V +LR ALG +PPN+VL+G+AL +T FIM PV +IY DA+ P+ +I + E
Sbjct: 66 RIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEKISMQE 125

Query: 124 AVARGEVPLRQFMLAQTRETDLEQMLKIADEPITLTADEIPFFVLMPAFVLSELKTAFQI 183
A+ +G PLR+FML QTRE DL ++A+ + +P +L+PA+V SELKTAFQI
Sbjct: 126 ALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELKTAFQI 185

Query: 184 GFLLFLPFLVIDLVVASVLMSMGMMMLSPLIISLPFKLMVFVLVDGWSMTVSTLVASY 241
GF +F+PFL+IDLV+ASVLM++GMMM+ P I+LPFKLM+FVLVDGW + V +L S+
Sbjct: 186 GFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLAQSF 243


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4234  FLGMOTORFLIN  80     9e-23    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 80.3 bits (198), Expect = 9e-23
Identities = 41/125 (32%), Positives = 67/125 (53%), Gaps = 17/125 (13%)

Query: 1 MAEHNILQDEDFLLDDDFFAEEQSHKPKVQAK-----------------PVKDMSFFHQL 43
M++ N DE+ DD +A+ + + K ++D+ +
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 44 PVHVTLELASVEMSLGELAQMGEGDVVALDRMVGEPLDIRVNGALLGRGEVVEVNGRYGV 103
PV +T+EL M++ EL ++ +G VVALD + GEPLDI +NG L+ +GEVV V +YGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 104 RLLEV 108
R+ ++
Sbjct: 121 RITDI 125


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4235  FLGMOTORFLIM  31     0.004    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 31.4 bits (71), Expect = 0.004
Identities = 17/93 (18%), Positives = 37/93 (39%), Gaps = 7/93 (7%)

Query: 184 LCFYMAEHLLELMAEQP----SQYQADPELSLKLAHRLKQIPLRTLLELGHQSMPVTSLQ 239
+ + E ++ ++ Q + + + L +L + + + E+G + V +
Sbjct: 217 IPYITIEPIISKLSSQFWFSSVRRSSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDIL 276

Query: 240 SLAVGDILPMN-LHSRCPVT--VGKRPLFYATV 269
L VGDI+ ++ H P +G R F
Sbjct: 277 GLRVGDIIRLHDTHVGDPFVLSIGNRKKFLCQP 309


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4236  OMPADOMAIN    63     2e-13    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 63.4 bits (154), Expect = 2e-13
Identities = 45/174 (25%), Positives = 63/174 (36%), Gaps = 19/174 (10%)

Query: 126 ALEQGYTWQVNIQAADTSGYQISSSPVSTRLVAN-----QFRLCRQQLLPKPFTYLRRIE 180
A Y W NI A T G + + +S + + P P +
Sbjct: 157 ATRLEYQWTNNIGDAHTIGTRPDNGMLSLGVSYRFGQGEAAPVVAPAPAPAPEVQTKHFT 216

Query: 181 L----LFAPSSSLLNNSHEQDLYAVYRYL-QADSSIVEILVDGHADASGDHLANLVLSKE 235
L LF + + L + L +Y L D ++V G+ D G N LS+
Sbjct: 217 LKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSER 276

Query: 236 RADEVVSRLIELGVSAKMIQTRHHGTRAPVASN--NNTEGREL-------NRRV 280
RA VV LI G+ A I R G PV N +N + R +RRV
Sbjct: 277 RAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRV 330


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4237  HTHFIS        380    e-131    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 380 bits (978), Expect = e-131
Identities = 141/401 (35%), Positives = 211/401 (52%), Gaps = 47/401 (11%)

Query: 74 ELASIAMQCGVQDYLLLPIDEEQLCSLLQR--------LRRLELPNNE---LICAAPVSR 122
A A + G DYL P D +L ++ R +LE + + L+ + +
Sbjct: 88 MTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQ 147

Query: 123 QLLMLAHRAANTEATVLLTGESGTGKEPLARYIHRHSNRTDKPFIAINCAAIPESILESI 182
++ + R T+ T+++TGESGTGKE +AR +H + R + PF+AIN AAIP ++ES
Sbjct: 148 EIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESE 207

Query: 183 LFGHVKGAFTGATTDQIGKFELANGGTLLLDEIGEMPLLLQAKLLRVLQEREVERLGSHK 242
LFGH KGAFTGA T G+FE A GGTL LDEIG+MP+ Q +LLRVLQ+ E +G
Sbjct: 208 LFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRT 267

Query: 243 SITLDIRVIAATNKDLRQAVQDGKFREDLFYRLDVLPIKILPLRQRKEDILPIAEHFLQR 302
I D+R++AATNKDL+Q++ G FREDL+YRL+V+P+++ PLR R EDI + HF+Q+
Sbjct: 268 PIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQ 327

Query: 303 YKILAANQQCYFSEQARNLLLSHDWPGNVRELENTIQRALVMRRGQALQAEELGL----- 357
+ + + F ++A L+ +H WPGNVRELEN ++R + + E +
Sbjct: 328 AEKEGLDVKR-FDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSE 386

Query: 358 -------------VNQDGSALVEQNELG--------------LKASKRQAEFQYIIDTLK 390
+ S VE+N + E+ I+ L
Sbjct: 387 IPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALT 446

Query: 391 RYNGHRNNTAQALGMTTRALRYKLVQMREEGIDIDQILSQS 431
G++ A LG+ LR K +RE G+ + + +
Sbjct: 447 ATRGNQIKAADLLGLNRNTLRKK---IRELGVSVYRSSRSA 484


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4238  FLGHOOKFLIE   49     6e-11    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 48.9 bits (116), Expect = 6e-11
Identities = 21/72 (29%), Positives = 34/72 (47%), Gaps = 1/72 (1%)

Query: 42 SFTELMKHKVSSINADQNASSALVAAVDSGQSD-DLVGAMVASQKSSLAFSAMIQIRNRL 100
SF + + I+ Q A+ G+ L M QK+S++ IQ+RN+L
Sbjct: 32 SFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQVRNKL 91

Query: 101 VQAFDDVMKMPI 112
V A+ +VM M +
Sbjct: 92 VAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4239  FLGMRINGFLIF  315    e-103    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 315 bits (809), Expect = e-103
Identities = 172/597 (28%), Positives = 275/597 (46%), Gaps = 82/597 (13%)

Query: 2 STTLTQTSPVMTDVNGNNDRLNSLKQKWHQFSRGDRHAATLAILAIVAACVIVLMLWSTG 61
++T TQ P+ + LN L+ + + A V+ ++LW+
Sbjct: 4 ASTATQPKPL--------EWLNRLRAN--------PRIPLIVAGSAAVAIVVAMVLWAKT 47

Query: 62 QGYSPLYGNQENVETSHIIEVLEAEGISYRLDPTSGLILVPEDRVGNARMVLAARGVKAK 121
Y L+ N + + I+ L I YR SG I VP D+V R+ LA +G+
Sbjct: 48 PDYRTLFSNLSDQDGGAIVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKG 107

Query: 122 VPSGMESLDSSAIGTSQFMEQAKYRYSLEGELSRTIMALKSVKTARVHLAIPKKTLFIRQ 181
G E LD G SQF EQ Y+ +LEGEL+RTI L VK+ARVHLA+PK +LF+R+
Sbjct: 108 GAVGFELLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVRE 167

Query: 182 QPELPSASVMLDLYAGQHLQPEQITSIANLVAGSVTGMTPERVQIVDQEGNHLSSEINAN 241
Q + PSASV + L G+ L QI+++ +LV+ +V G+ P V +VDQ G+ L+ +
Sbjct: 168 Q-KSPSASVTVTLEPGRALDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSG 226

Query: 242 QDLTQARDKQLQYTQELEQSLINRASSMLQPILGQDNFQVQVAALVNFNQVEETRESLDP 301
+DL D QL++ ++E + R ++L PI+G N QV A ++F E+T E P
Sbjct: 227 RDLN---DAQLKFANDVESRIQRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSP 283

Query: 302 Q------TVVSQEKQSTNQTSGDMALGIPGALSNQPPTADAATNNSTSNLNQ-------- 347
T+ S++ + Q G+PGALSNQP + A + Q
Sbjct: 284 NGDASKATLRSRQLNISEQVGAGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQT 343

Query: 348 ----------------QESRQFEVGRSVKHTRYQQMQLENLSISVLLNNQ---AAGETGW 388
E+ +EV R+++HT+ +E LS++V++N +
Sbjct: 344 STSTNSNSAGPRSTQRNETSNYEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPL 403

Query: 389 TQPQLDQMSTMVQDAIGYSAARGDQFSISSFNFAPVKIAEFEPLPWWQGESYQAYLRYFI 448
T Q+ Q+ + ++A+G+S RGD ++ + F+ V E LP+WQ +S+ L
Sbjct: 404 TADQMKQIEDLTREAMGFSDKRGDTLNVVNSPFSAVDNTGGE-LPFWQQQSFIDQLLAAG 462

Query: 449 GAILGLGMIF----FVLRPLVQHLTRTVEHNIKDTLPSTQPLMPPAEAATAHLSQDAAQE 504
+L L + + +RP LTR VE A+ + +E
Sbjct: 463 RWLLVLVVAWILWRKAVRPQ---LTRRVEE------------AKAAQEQAQ--VRQETEE 505

Query: 505 AADNLVALNNANINSNWTSNLNLPAPGSPLTVQMEHLSLLANQEPARVAEVISHWIS 561
A + L+ +N L A V + + +++ +P VA VI W+S
Sbjct: 506 AVEV--RLSKDEQLQQRRANQRLGA-----EVMSQRIREMSDNDPRVVALVIRQWMS 555


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4240  FLGMOTORFLIG  178    1e-55    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 178 bits (453), Expect = 1e-55
Identities = 82/324 (25%), Positives = 169/324 (52%), Gaps = 1/324 (0%)

Query: 5 EQAAMLLLSMGEKGAAQVMAHLDRNDVQHLSHKMARLSSITQQEAEAVLGRFFTRYQEQS 64
++AA+LL+S+G + +++V +L + +++ L+ ++A+L +IT + + VL F Q
Sbjct: 19 QKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFKELMMAQE 78

Query: 65 GIARASRTYLQKTLDLALGDRVAKSLIDSIYGDEIKVLVKRLEWVDPQLLAREIANEHCQ 124
I + Y ++ L+ +LG + A +I+++ + + DP + I EH Q
Sbjct: 79 FIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRADPANILNFIQQEHPQ 138

Query: 125 LQAVLLGLLPPESAAQVLQGLPAEGQDEVLIRIAQLGDLDREVVDELKQLVERCMLMAME 184
A++L L P+ A+ +L LP E Q V RIA + EVV E+++++E+ +
Sbjct: 139 TIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLEKKLASLSS 198

Query: 185 KSHTQISGVRQVADILNRFD-GDREQLMEMLKLHDKQLANNVADNMFDFIILGRQKPETL 243
+ +T GV V +I+N D + ++E L+ D +LA + MF F + ++
Sbjct: 199 EDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDIVLLDDRSI 258

Query: 244 QEIMAIVPADVLALALKGIDSELKMTLLRALPKRMSSAIETQVEAIGTVPLSQAIAARKE 303
Q ++ + LA ALK +D ++ + + + KR +S ++ +E +G ++++
Sbjct: 259 QRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRKDVEESQQK 318

Query: 304 IMELAKQMMDEGQIELQLFEEQVV 327
I+ L +++ ++G+I + E+ V
Sbjct: 319 IVSLIRKLEEQGEIVISRGGEEDV 342


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4241  FLGFLIH       60     6e-13    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 59.8 bits (144), Expect = 6e-13
Identities = 41/182 (22%), Positives = 85/182 (46%), Gaps = 2/182 (1%)

Query: 42 QDFQQAFDKGYDEGVQKGHQAGFTSGEEEGRQTGYAAGFNQGRIEGQQKGKDNIDDQLNS 101
++ + + ++ + + H+ G+ +G EGRQ G+ G+ +G +G ++G Q
Sbjct: 34 EEAEPSLEQQLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAP 93

Query: 102 IIAPLGALKSLLEEGHNQQILQQQSLILDLVRRVSLQVIRCELTLQPQQILSLIEETLSA 161
I A + L S + + S ++ + + QVI T+ ++ I++ L
Sbjct: 94 IHARMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVIGQTPTVDNSALIKQIQQLLQQ 153

Query: 162 LPDDPSQVKIHLEPSAVDKLKEL--AADKIQSWSLVPDATISAGGCRIVSETSDADASVE 219
P + ++ + P + ++ ++ A + W L D T+ GGC++ ++ D DASV
Sbjct: 154 EPLFSGKPQLRVHPDDLQRVDDMLGATLSLHGWRLRGDPTLHPGGCKVSADEGDLDASVA 213

Query: 220 TR 221
TR
Sbjct: 214 TR 215
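
The "Identities / Positives / Gaps" summary lines can be parsed in the same spirit. This is a hedged Python sketch with illustrative names, not something the tool provides. It also checks an observation that holds for the lines shown in this report: the printed percentages match integer truncation rather than rounding (77/286 is about 26.9% but is printed as 26%).

```python
import re

# Matches each "Name = count/length (NN%)" field on a summary line such as
#   "Identities = 77/286 (26%), Positives = 136/286 (47%), Gaps = 36/286 (12%)"
FIELD_RE = re.compile(
    r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)"
)


def parse_alignment_summary(line):
    """Map each field name to (count, alignment_length, printed_percent)."""
    summary = {}
    for name, count, length, printed in FIELD_RE.findall(line):
        summary[name] = (int(count), int(length), int(printed))
    return summary


def floor_percent(count, length):
    """Percentage truncated toward zero, as the report appears to print it."""
    return 100 * count // length
```

Ungapped alignments simply omit the Gaps field, so callers should treat a missing "Gaps" key as zero gaps rather than as a parse failure.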


104  Shal_4248  Shal_4260  N        Y        N        Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_4248  0       22       5.342771   flagellar basal-body rod protein FlgC
Shal_4249  0       24       5.651564   flagellar hook capping protein
Shal_4250  0       24       6.174322   flagellar hook protein FlgE
Shal_4251  -2      24       5.615182   flagellar basal-body rod protein FlgF
Shal_4252  -2      23       4.333284   flagellar basal-body rod protein FlgG
Shal_4253  -1      19       3.547382   flagellar basal body L-ring protein
Shal_4254  -1      16       2.294873   flagellar basal body P-ring protein
Shal_4255  1       15       1.094767   peptidoglycan hydrolase
Shal_4256  1       15       0.822069   flagellar hook-associated protein FlgK
Shal_4257  1       15       0.349777   flagellar hook-associated protein 3
Shal_4258  2       16       0.450881   hypothetical protein
Shal_4259  1       13       0.571621   hypothetical protein
Shal_4260  2       19       0.908903   flagellin domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4248  FLGHOOKAP1    33     3e-04    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 33.0 bits (75), Expect = 3e-04
Identities = 8/39 (20%), Positives = 19/39 (48%)

Query: 98 SNVNTVEEMADMMAASRSFETNVEIMNRARSMQQGLLQL 136
S VN EE ++ + + N +++ A ++ L+ +
Sbjct: 507 SGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545



Score = 27.2 bits (60), Expect = 0.021
Identities = 14/59 (23%), Positives = 25/59 (42%), Gaps = 4/59 (6%)

Query: 9 IAGAGMNAQTIRLNTVASNLANAGAAAESPDEAFRALKPVFSTIYKQTQEGQVAGAHVE 67
A +G+NA LNT ++N+++ A + + ST+ G G +V
Sbjct: 6 NAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--MAQANSTLGAGGWVGN--GVYVS 60


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4250  FLGHOOKAP1    34     0.001    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 34.2 bits (78), Expect = 0.001
Identities = 15/41 (36%), Positives = 22/41 (53%)

Query: 360 LEGSNVDQTAEMVNLMTAQRNYQSNAKVLDTNSTMQQALLN 400
S V+ E NL Q+ Y +NA+VL T + + AL+N
Sbjct: 504 QSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALIN 544



Score = 34.2 bits (78), Expect = 0.001
Identities = 21/59 (35%), Positives = 27/59 (45%), Gaps = 5/59 (8%)

Query: 2 SFNIALSGLQATTQDLNTISNNIANSSTVGFRSGR----SEFSAIYNGGQAG-GVNVMN 55
N A+SGL A LNT SNNI++ + G+ S + GG G GV V
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSG 61


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4252  FLGHOOKAP1    42     1e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 42.3 bits (99), Expect = 1e-06
Identities = 11/47 (23%), Positives = 21/47 (44%)

Query: 213 QIRQGALEGANVNVVEEMVEMISTQRAYEMNAKVVSASDDMLKFLNQ 259
Q+ + VN+ EE + Q+ Y NA+V+ ++ + L
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALIN 544



Score = 39.2 bits (91), Expect = 1e-05
Identities = 10/37 (27%), Positives = 19/37 (51%)

Query: 3 SALWVSKTGLTAQDTKMTTIANNLANVNTTGFKRDRV 39
S + + +GL A + T +NN+++ N G+ R
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTT 38


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4253  FLGLRINGFLGH  148    1e-46    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 148 bits (375), Expect = 1e-46
Identities = 76/224 (33%), Positives = 112/224 (50%), Gaps = 17/224 (7%)

Query: 12 LTLLLSGCVAHIPEPDTAPGKPEWAPPEIDYSLPDAENGSVYRPGFMLT-----LFKDKR 66
L L L+GC A IP + P + NGS+++ + LF+D+R
Sbjct: 15 LVLSLTGC-AWIP---STPLVQGATSAQPVPGPTPVANGSIFQSAQPINYGYQPLFEDRR 70

Query: 67 AYREGDILTVALDEKTYSSKRADTKTSKSGGVSIDGQGTTGTSSIAGSG------EANMG 120
GD LT+ L E +SK + S+ G + G T G EA+ G
Sbjct: 71 PRNIGDTLTIVLQENVSASKSSSANASRDGKTNF-GFDTVPRYLQGLFGNARADVEASGG 129

Query: 121 RSFNGTGSSTQQNQLSGSITVTVAKVLPNGALLIRGEKWLRLNQGDEYLRLLGLIRADDI 180
+FNG G + N SG++TVTV +VL NG L + GEK + +NQG E++R G++ I
Sbjct: 130 NTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVNPRTI 189

Query: 181 DNDNTISSQRIADARIIYGGQGAISDSNRMGWAARYFNSPWFPL 224
NT+ S ++ADARI Y G G I+++ MGW R+F + P+
Sbjct: 190 SGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLN-LSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4254  FLGPRINGFLGI  343    e-118    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 343 bits (880), Expect = e-118
Identities = 151/370 (40%), Positives = 211/370 (57%), Gaps = 14/370 (3%)

Query: 11 MLLSLSPLLPVKAQPQHRYLMDIVDVQGLRDNQLVGYGLVVGLDGTGDRT-QVRFTSQSI 69
+ +L L AQ + DI +Q RDNQL+GYGLVVGL GTGD FT QS+
Sbjct: 12 VFSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSM 71

Query: 70 VNMLKQFGVQIDDKTDPKLKNVAAVAVHATVPPLASPGQTLDITVSSLGDAKSLRGGTLL 129
ML+ G+ KN+AAV V A +PP ASPG +D+TVSSLGDA SLRGG L+
Sbjct: 72 RAMLQNLGITTQG-GQSNAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLI 130

Query: 130 MTPMRAVDGEIYAVAQGNLVVGGVSAQGRNGTSVTINVPTVGSIPNGALLEAAMHSNFND 189
MT + DG+IYAVAQG L+V G SAQG + ++T V T +PNGA++E + S F D
Sbjct: 131 MTSLSGADGQIYAVAQGALIVNGFSAQG-DAATLTQGVTTSARVPNGAIIERELPSKFKD 189

Query: 190 NENIVLNLIDPSFKTARNIERAVNEL----FGPDVAQADSSAKVIVRAPSSNRERVTFMS 245
+ N+VL L +P F TA + VN +G +A+ S ++ V+ P + M+
Sbjct: 190 SVNLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKP-RVADLTRLMA 248

Query: 246 MLEELQIEQGRKSPRVVFNSRTGTVVMGGDVVVRKAAVSHGNLTVTIVEQEFVSQPNGAY 305
+E L +E +VV N RTGT+V+G DV + + AVS+G LTV + E V QP
Sbjct: 249 EIENLTVETDTP-AKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPAPF- 306

Query: 306 LGQAQGETVVTTDSQVGIDEGNGHMFVWPEGTALNDIVRAVNSLGASPMDLMAILQALNE 365
++G+T V + + + + + EG L +V +NS+G ++AILQ +
Sbjct: 307 ---SRGQTAVQPQTDIMAMQEGSKVAI-VEGPDLRTLVAGLNSIGLKADGIIAILQGIKS 362

Query: 366 AGALEAELVV 375
AGAL+AELV+
Sbjct: 363 AGALQAELVL 372


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4255  FLGFLGJ       50     3e-10    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 50.1 bits (119), Expect = 3e-10
Identities = 27/96 (28%), Positives = 48/96 (50%), Gaps = 5/96 (5%)

Query: 26 GALKLVSQQFEAQFLQTVLKQMRSASDVMADKDSPLSSQNDGMYRDWHDAELAGRLSQMQ 85
++ V++Q E F+Q +LK MR A KD SS++ +Y +D ++A +++ +
Sbjct: 31 ANIRPVARQVEGMFVQMMLKSMRDALP----KDGLFSSEHTRLYTSMYDQQIAQQMTAGK 86

Query: 86 STGLAEVMTKQLSAGLKSEPEMVASNKQVNSSPNTA 121
GLAE+M KQ++ + PE + T
Sbjct: 87 GLGLAEMMVKQMTPE-QPLPEESTPAAPMKFPLETV 121


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4256  FLGHOOKAP1    168    1e-48    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 168 bits (427), Expect = 1e-48
Identities = 95/320 (29%), Positives = 156/320 (48%), Gaps = 8/320 (2%)

Query: 2 SMLNIGMSGLNASMAALTATSNNINNAMVPGYSRQQVMLSSVGNGVYGS---GSGVMVDG 58
S++N MSGLNA+ AAL SNNI++ V GY+RQ +++ + + G+GV V G
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSG 61

Query: 59 VRRISDQYEVAQLWNTTSGLGYANTQSSYFGQVEQIFGSEGNSISAGLDLLFASLNSAME 118
V+R D + QL + + +++ + + +S++ + F SL + +
Sbjct: 62 VQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLVS 121

Query: 119 QPNEIAHRQGVLNEAKALTQRFNSISEGLNSQVTQVEGQINASAKEINTQLETIASLNAE 178
+ A RQ ++ +++ L +F + + L Q QV I AS +IN + IASLN +
Sbjct: 122 NAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLNDQ 181

Query: 179 IQSS--NASGNVPLALLDARDSAIDDLSSIIDVNVVEDSSGMLNISLAQGQPLLSGTTAS 236
I +G P LLD RD + +L+ I+ V V G NI++A G L+ G+TA
Sbjct: 182 ISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTAR 241

Query: 237 KLEV---TPDPSNPKFSQISIQFGQSSFPLDETAGGSLGALIDYRDNSLVDSMAFIDELA 293
+L + DPS + + G P GSLG ++ +R L + + +LA
Sbjct: 242 QLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQLA 301

Query: 294 MTMADEFNAVLAGGTDLNGN 313
+ A+ FN G D NG+
Sbjct: 302 LAFAEAFNTQHKAGFDANGD 321



Score = 73.1 bits (179), Expect = 6e-16
Identities = 36/105 (34%), Positives = 60/105 (57%), Gaps = 3/105 (2%)

Query: 351 GDNSNLKELVDIANKSFTFSSMGVDTTMGDAFGSKIGELGSASRQAQMSKTTAENLQMEA 410
DN N + L+D+ + S ++G + DA+ S + ++G+ + + S T N+ +
Sbjct: 443 SDNRNGQALLDLQSNS---KTVGGAKSFNDAYASLVSDIGNKTATLKTSSATQGNVVTQL 499

Query: 411 QKQWASTSGVNMDEEGVNLIIYQQSYQANAKVISTADQLFQTILN 455
Q S SGVN+DEE NL +QQ Y ANA+V+ TA+ +F ++N
Sbjct: 500 SNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALIN 544


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4257  FLAGELLIN     45     1e-07    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 45.4 bits (107), Expect = 1e-07
Identities = 28/143 (19%), Positives = 53/143 (37%)

Query: 11 NNLQSLQNSTFDIAKLNEMMSTGSSILRPSDDPIGAVKVIGNERDMAATNQYIKNTESLS 70
+L S ++ E +S+G I DD G ++ Q +N
Sbjct: 12 LTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNANDGI 71

Query: 71 TSFSRSETYMSSMVELQGRMREITVSANNGSLSPEDRAAYAAEMNELLEAFADTLNAKDE 130
+ +E ++ + R+RE++V A NG+ S D + E+ + LE N
Sbjct: 72 SIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSNQTQF 131

Query: 131 SGNYLFSGNKTDTPPIGKDADGN 153
+G + S + +G +
Sbjct: 132 NGVKVLSQDNQMKIQVGANDGET 154


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4260  FLAGELLIN     98     3e-25    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 97.8 bits (243), Expect = 3e-25
Identities = 74/269 (27%), Positives = 125/269 (46%), Gaps = 9/269 (3%)

Query: 4 VMTNNASNIAQNAVTKNNDLLSNAMERLSTGLRINSASDDAAGLQIATRLNANVTGMETA 63
+ TN+ S + QN + K+ LS+A+ERLS+GLRINSA DDAAG IA R +N+ G+ A
Sbjct: 4 INTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQA 63

Query: 64 NRNVSDATSMLQTADGALDELNNIASRQKELATQAANGVNSAEDIKALGAEYKELNAEAN 123
+RN +D S+ QT +GAL+E+NN R +EL+ QA NG NS D+K++ E ++ E +
Sbjct: 64 SRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEID 123

Query: 124 RIIDSTEYGGNKLFTALDTGVGFQIGASNTASEQLTVKTDVAAVKTLFAGE--------I 175
R+ + T++ G K+ + D + Q+GA++ + + ++ L +
Sbjct: 124 RVSNQTQFNGVKVLSQ-DNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATV 182

Query: 176 TDSATAKAQIDNVDAIIDAVGTQRSNLGASINRLGHTASNLTNVTENTKAAAGRIMDTDF 235
D ++ + D R ++ + TA + + A D
Sbjct: 183 GDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAE 242

Query: 236 AVETAAMTKNQLLVQAGTNILSSSNQNTG 264
+ K + + G
Sbjct: 243 NNTAVDLFKTTKSTAGTAEAKAIAGAIKG 271



Score = 65.8 bits (160), Expect = 2e-14
Identities = 50/266 (18%), Positives = 83/266 (31%)

Query: 5 MTNNASNIAQNAVTKNNDLLSNAMERLSTGLRINSASDDAAGLQIATRLNANVTGMETAN 64
NN + + + D G+ G +
Sbjct: 241 AENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVS 300

Query: 65 RNVSDATSMLQTADGALDELNNIASRQKELATQAANGVNSAEDIKALGAEYKELNAEANR 124
++ L AD N A+ + + VN ++
Sbjct: 301 TTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEA 360

Query: 125 IIDSTEYGGNKLFTALDTGVGFQIGASNTASEQLTVKTDVAAVKTLFAGEITDSATAKAQ 184
+ A T + KT + +
Sbjct: 361 NNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKSTANP 420

Query: 185 IDNVDAIIDAVGTQRSNLGASINRLGHTASNLTNVTENTKAAAGRIMDTDFAVETAAMTK 244
+ ++D+ + V RS+LGA NR +NL N N +A RI D D+A E + M+K
Sbjct: 421 LASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSNMSK 480

Query: 245 NQLLVQAGTNILSSSNQNTGLVMGLL 270
Q+L QAGT++L+ +NQ V+ LL
Sbjct: 481 AQILQQAGTSVLAQANQVPQNVLSLL 506


105  Shal_4305  Shal_4311  N        Y        N        Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
Shal_4305  0       17       -2.064325  tRNA uridine 5-carboxymethylaminomethyl
Shal_4306  -1      17       -2.359472  flavodoxin
Shal_4307  -1      17       -2.228447  inosine/uridine-preferring nucleoside hydrolase
Shal_4308  -1      18       -2.348091  SecC motif-containing protein
Shal_4309                              lysine exporter protein LysE/YggA
Shal_4310                              tRNA modification GTPase TrmE
Shal_4311                              60 kDa inner membrane insertion protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4305  ECOLIPORIN    30     0.032    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 29.9 bits (67), Expect = 0.032
Identities = 13/41 (31%), Positives = 21/41 (51%)

Query: 358 ETKAISGLFFAGQINGTTGYEEAGAQGLLAGMNASLQVQGK 398
++ + + G+ NG Y GL+ G+N +LQ QGK
Sbjct: 135 DSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGK 175


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4308  SECA          51     3e-09    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 51.4 bits (123), Expect = 3e-09
Identities = 16/20 (80%), Positives = 18/20 (90%)

Query: 294 QTGRNDPCPCGSGKKYKKCC 313
+ GRNDPCPCGSGKKYK+C
Sbjct: 878 KVGRNDPCPCGSGKKYKQCH 897


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4310  PF05272       31     0.009    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.009
Identities = 39/225 (17%), Positives = 71/225 (31%), Gaps = 45/225 (20%)

Query: 211 IIREGMKV----VIAGRPNAGKSSLLNALAGKESAIVTEI-AGTTRDVLREHIHLDGMPL 265
++ G K V+ G GKS+L+N L G + T GT +D + + G+
Sbjct: 588 VMEPGCKFDYSVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKD---SYEQIAGIVA 644

Query: 266 H-IIDTAGLRDTADTVEKIGIERAWDEIR------------------TADRVLFMVDGTT 306
+ + + R K D R T ++ ++ D T
Sbjct: 645 YELSEMTAFRRADAEAVKAFFSSRKDRYRGAYGRYVQDHPRQVVIWCTTNKRQYLFDITG 704

Query: 307 TPAVDPHEIWPDFIDRLPNNLGVTVVRNK--AD-----LTGEDLAITTEAGHSVYRISAK 359
WP + N + + R + A+ L GE + E +R +
Sbjct: 705 N-----RRFWPVLVPGRANLVWLQKFRGQLFAEALHLYLAGERYFPSPEDEEIYFRPEQE 759

Query: 360 TGLGVDDLKQHLKSLMGYQSNLEG------GFIARRRHLEALDLA 398
L ++ L +L+ + G+ + DL
Sbjct: 760 LRLVETGVQGRLWALLTREGAPAAEGAAQKGYSVNTTFVTIADLV 804


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
Shal_4311  60KDINNERMP   621    0.0      60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 621 bits (1604), Expect = 0.0
Identities = 279/552 (50%), Positives = 379/552 (68%), Gaps = 21/552 (3%)

Query: 1 MESQRNILLIGLLFVSFLLWQQWQADQAPQPVAAAQTQSSIPASTVADSHSSDVPDADSA 60
M+SQRN+L+I LLFVSF++WQ W+ D+ PQP A TQ++ A+ AD
Sbjct: 1 MDSQRNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQTTTTAAG---------SAADQG 51

Query: 61 VPEAITASKELITVFTDQLEIKIDPVGGDIVYSALLSHKLEENGDDPFVLLEQTNDIYYI 120
VP + +LI+V TD L++ I+ GGD+ + L ++ E N PF LLE + Y
Sbjct: 52 VPASGQG--KLISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQ 109

Query: 121 AQSGLIGRDGIDSSVKG-RAHFDSASREYKLADGQDTLSVPLTYVSDKGVTYTKSFVFTR 179
AQSGL GRDG D+ G R ++ Y LA+GQ+ L VP+TY G T+TK+FV R
Sbjct: 110 AQSGLTGRDGPDNPANGPRPLYNVEKDAYVLAEGQNELQVPMTYTDAAGNTFTKTFVLKR 169

Query: 180 GKYNIAVDYKINNTSDASLQVQMYGQIKHSIKK------SESSMMMPTYLGGAFSTDDIR 233
G Y + V+Y + N + L++ +GQ+K SI S+ + T+ G A+ST D +
Sbjct: 170 GDYAVNVNYNVQNAGEKPLEISSFGQLKQSITLPPHLDTGSSNFALHTFRGAAYSTPDEK 229

Query: 234 YEKYSFDDMADK-NLDDSTLGGWVAMLQHYFVSAWVPPATEKNIIFSS-MKNGLANIGFR 291
YEKY FD +AD NL+ S+ GGWVAMLQ YF +AW+P N +++ + NG+A IG++
Sbjct: 230 YEKYKFDTIADNENLNISSKGGWVAMLQQYFATAWIPHNDGTNNFYTANLGNGIAAIGYK 289

Query: 292 GELHDIAPGSEQVIKSEFYVGPKDQKALSVLSPSLNLVVDYGFLWWLAVPIYKLLMFFHS 351
+ + PG + S +VGP+ Q ++ ++P L+L VDYG+LW+++ P++KLL + HS
Sbjct: 290 SQPVLVQPGQTGAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLWFISQPLFKLLKWIHS 349

Query: 352 IVGNWGFAIILITLTVRGLLYPLTKAQYTSMAKMRNLQPKLAELKERFGDDRQKMGQAMM 411
VGNWGF+II+IT VRG++YPLTKAQYTSMAKMR LQPK+ ++ER GDD+Q++ Q MM
Sbjct: 350 FVGNWGFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRERLGDDKQRISQEMM 409

Query: 412 ELYKKEKVNPMGGCLPILLQMPIFIALYWVLLESVELRHAPFMLWITDLSVQDPYYVLPI 471
LYK EKVNP+GGC P+L+QMPIF+ALY++L+ SVELR APF LWI DLS QDPYY+LPI
Sbjct: 410 ALYKAEKVNPLGGCFPLLIQMPIFLALYYMLMGSVELRQAPFALWIHDLSAQDPYYILPI 469

Query: 472 LMGISMFVMQKMQPMAPTMDPMQVKMMQWMPVIFTVFFLWFPAGLVLYWLVGNLVAITQQ 531
LMG++MF +QKM P DPMQ K+M +MPVIFTVFFLWFP+GLVLY++V NLV I QQ
Sbjct: 470 LMGVTMFFIQKMSP-TTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYYIVSNLVTIIQQ 528

Query: 532 KIIYAGLEKKGL 543
++IY GLEK+GL
Sbjct: 529 QLIYRGLEKRGL 540
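
The statistics lines in the alignment blocks above can be checked with a few lines of Python. This is a minimal sketch, not part of PredictBias itself: the names `parse_stats`, `percent`, `HITS` and `passing_hits` are hypothetical, the E-values are copied from the four hit tables above, and the 0.05 cutoff is the "E-value (RPSBlast)" input parameter listed at the top of this report. Whether the tool applies the cutoff as "less than" or "less than or equal" is an assumption here.

```python
import re

# Matches the "Name = n/d" pairs in a BLAST-style statistics line.
STAT_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+)")

# "E-value (RPSBlast)" input parameter shown in this report.
E_VALUE_CUTOFF = 0.05

# E-values copied from the four hit tables above.
HITS = {
    "Shal_4305": 0.032,
    "Shal_4308": 3e-09,
    "Shal_4310": 0.009,
    "Shal_4311": 0.0,
}


def parse_stats(line):
    """Return e.g. {'Identities': (13, 41), 'Positives': (21, 41)}."""
    return {name: (int(n), int(d)) for name, n, d in STAT_RE.findall(line)}


def percent(n, d):
    """Integer percentage, truncated; this matches the values printed above."""
    return 100 * n // d


def passing_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Locus tags whose E-value is at or below the cutoff (assumed '<=')."""
    return [tag for tag, e_value in hits.items() if e_value <= cutoff]


if __name__ == "__main__":
    line = "Identities = 13/41 (31%), Positives = 21/41 (51%)"
    for name, (n, d) in parse_stats(line).items():
        print(name, f"{n}/{d}", f"{percent(n, d)}%")
    print(passing_hits(HITS))
```

Truncating division is used because rounding would not reproduce the report: 13/41 is 31.7%, which the Shal_4305 block prints as 31%. All four hits listed above fall under the 0.05 cutoff.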



 
Contact Sachin Pundhir for Bugs/Comments.