PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: 2265.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
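
The E-value threshold above relates to the RPS-BLAST statistics lines that appear in the hit reports on this page (for example `Score = 29.4 bits (66), Expect = 0.040`). The following is a minimal Python sketch of how such lines could be parsed and compared against the threshold. It assumes an inclusive cutoff, since a hit with Expect = 0.050 is listed in this report. The names `HitStats`, `parse_score_line`, and `passes_cutoff` are hypothetical helpers for illustration and are not part of PredictBias.

```python
import re
from typing import NamedTuple, Optional


class HitStats(NamedTuple):
    bits: float      # bit score, e.g. 29.4
    raw_score: int   # raw score shown in parentheses, e.g. 66
    evalue: float    # Expect value, e.g. 0.040


# Matches lines such as "Score = 227 bits (581), Expect = 5e-73".
_SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*([\d.eE+-]+)"
)


def parse_score_line(line: str) -> Optional[HitStats]:
    """Return the statistics from a BLAST 'Score = ...' line, or None."""
    match = _SCORE_RE.search(line)
    if match is None:
        return None
    bits, raw, expect = match.groups()
    # Some BLAST versions print "e-100" without a leading digit.
    if expect[0] in "eE":
        expect = "1" + expect
    return HitStats(float(bits), int(raw), float(expect))


def passes_cutoff(stats: HitStats, cutoff: float = 0.05) -> bool:
    """Assumed rule: keep a hit when its E-value is at or below the cutoff."""
    return stats.evalue <= cutoff
```

Calling `parse_score_line` on every line of a report and keeping the non-`None` results gives one `HitStats` per alignment segment. `passes_cutoff` then applies the 0.05 threshold from the input parameters.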
 
 
C) Potential GIs and PAIs in NC_004741
S.No  Start  End    Bias  Virulence  Insertion elements  Prediction
1     S0055  S0061  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
S0055     -3      17       3.332514    23S rRNA/tRNA pseudouridine synthase A
S0056     -2      16       3.166701    ATP-dependent helicase HepA
S0057     -1      14       3.270932    DNA polymerase II
S0058     -1      15       3.545592    L-ribulose-5-phosphate 4-epimerase
S0059     -1      16       4.227489    L-arabinose isomerase
S0060     0       16       3.940178    ribulokinase
S0061     0       16       3.201156    DNA-binding transcriptional regulator AraC
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0060     TCRTETOQM     29     0.040    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 29.4 bits (66), Expect = 0.040
Identities = 19/103 (18%), Positives = 39/103 (37%), Gaps = 18/103 (17%)

Query: 300 ILIADKQSVGERAVKGICGQVDGSVV------PGFIGLEAGQS-AFGDIYAWFGRVLGWP 352
+ I++K+ + + + ++G + G I + + + G P
Sbjct: 281 VRISEKEKIK---ITEMYTSINGELCKIDKAYSGEIVILQNEFLKLNSV---LGDTKLLP 334

Query: 353 L-EQLAAQHPELKAQINASQKQ----LLPALTEAWAKNPSLDH 390
E++ P L+ + S+ Q LL AL E +P L +
Sbjct: 335 QRERIENPLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRY 377


S.No  Start  End    Bias  Virulence  Insertion elements  Prediction
2     S0099  S0114  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
S0099     2       16       -0.318120   hypothetical protein
S0100     2       11       -0.222045   zinc-binding protein
S0101     2       11       -0.109997   hypothetical protein
S0102     1       14       0.303960    dephospho-CoA kinase
S0103     -1      15       0.498128    guanosine 5'-monophosphate oxidoreductase
S0104     0       17       0.636884    hypothetical protein
S0105     0       13       0.616622    type IV pilin biogenesis protein
S0107     -2      11       1.157354    major pilin subunit
S0108     0       18       1.123292    quinolinate phosphoribosyltransferase
S0109     2       26       1.402271    N-acetyl-anhydromuranmyl-L-alanine amidase
S0110     3       32       1.245421    regulatory protein AmpE
S0111     3       36       1.910614    aromatic amino acid transporter
S0112     4       37       2.036318    transcriptional regulator PdhR
S0113     3       38       1.231519    pyruvate dehydrogenase subunit E1
S0114     2       31       1.034715    dihydrolipoamide acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0105     BCTERIALGSPF  227    5e-73    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F

signature.
Length = 408

Score = 227 bits (581), Expect = 5e-73
Identities = 94/405 (23%), Positives = 184/405 (45%), Gaps = 13/405 (3%)

Query: 6 LWRWHGITGDGNAQDGMLWAESRTLLLMALQQQMVTPLSLKRIAINSAQ----------- 54
+ + + G G A+S L+++ + PLS+ + +
Sbjct: 3 QYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLRRK 62

Query: 55 WRGDKS--AEVIHQLATLLKAGLTLSEGLALLAEQHPSKQWQALLQSLAHDLEQGIAFSN 112
R S A + QLATL+ A + L E L +A+Q L+ ++ + +G + ++
Sbjct: 63 IRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD 122

Query: 113 ALLPWSEAFPPLYQAMIRTGELTGKLDECCFELARQQKSQRQLTDKVKSALRYPIIILAM 172
A+ + +F LY AM+ GE +G LD LA + ++Q+ +++ A+ YP ++ +
Sbjct: 123 AMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLTVV 182

Query: 173 AIMVVVAMLHFVLPEFAAIYKTFNTPLPALTQGIMTLADFSGEWGWLLVLFGFLLAIANK 232
AI VV +L V+P+ + LP T+ +M ++D +G ++L +A +
Sbjct: 183 AIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMAFR 242

Query: 233 LLMRRPTWLIARQKLLLRIPIMGSLMRGQKLTQIFTILALTQSAGITFLQGVESVRETMR 292
+++R+ ++ + LL +P++G + RG + L++ ++ + LQ + + M
Sbjct: 243 VMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDVMS 302

Query: 293 CPYWVQLLTQIQHDISNGHPIWLALKNAGEFSPLCLQLVRTGEASGSLDLMLDNLAHHHR 352
Y L+ + G + AL+ F P+ ++ +GE SG LD ML+ A +
Sbjct: 303 NDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADNQD 362

Query: 353 DNTMALADNLAALLEPALLIITGGIIGTLVVAMYLPIFHLGDAMS 397
+ L EP L++ ++ +V+A+ PI L MS
Sbjct: 363 REFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLMS 407


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0107     BCTERIALGSPG  50     2e-10    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G

signature.
Length = 145

Score = 49.5 bits (118), Expect = 2e-10
Identities = 27/79 (34%), Positives = 43/79 (54%), Gaps = 1/79 (1%)

Query: 1 MDKQRGFTLIELMVVIGIIAILSSIGIPAYQNYLRKAALTDMLQTFVPYRTAVELCALEH 60
DKQRGFTL+E+MVVI II +L+S+ +P KA + V A+++ L++
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDN 63

Query: 61 GGLDTCD-GGSNGIPSPTT 78
T + G + + +PT
Sbjct: 64 HHYPTTNQGLESLVEAPTL 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0114     RTXTOXIND     32     0.007    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 32.1 bits (73), Expect = 0.007
Identities = 15/60 (25%), Positives = 29/60 (48%), Gaps = 2/60 (3%)

Query: 119 EVTEILVKVGDKV-EAEQSLITVEGDKASMEVPAPFAGTVKEIKVN-VGDKVSTGSLIMV 176
E+ + L + D + L E + + + AP + V+++KV+ G V+T +MV
Sbjct: 299 EILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMV 358



Score = 32.1 bits (73), Expect = 0.008
Identities = 16/63 (25%), Positives = 27/63 (42%), Gaps = 2/63 (3%)

Query: 26 DKVEAEQSLITVEGDKASMEVPSPQAGIVKEIKVSVGDKTQTGALIMIFDSADGAADAAP 85
+ V +T G S E+ + IVKEI V G+ + G +++ + AD
Sbjct: 81 EIVATANGKLTHSGR--SKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLK 138

Query: 86 AQA 88
Q+
Sbjct: 139 TQS 141



Score = 31.0 bits (70), Expect = 0.015
Identities = 17/106 (16%), Positives = 38/106 (35%), Gaps = 4/106 (3%)

Query: 230 DKVAAEQSLITVEGDKASMEVPAPFAGVVKELKVNVGDKVKTGSLIMIFEVEGAAPAAAP 289
+ VA +T G S E+ +VKE+ V G+ V+ G + + ++ A
Sbjct: 81 EIVATANGKLTHSGR--SKEIKPIENSIVKEIIVKEGESVRKGDV--LLKLTALGAEADT 136

Query: 290 AKQEAAAPAPAAKAEAPAAKAEGKSEFAENDAYVHATPLIRRLARE 335
K +++ + + + + P + ++ E
Sbjct: 137 LKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEE 182


S.No  Start  End    Bias  Virulence  Insertion elements  Prediction
3     S0202  S0246  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
S0202     -1      24       -3.045222   transcriptional regulator LYSR-type
S0203     -1      22       -2.943944   hypothetical protein
S0204     -1      22       -3.395653   biotin synthesis protein
S0205     -2      17       -1.208129   membrane-bound lytic murein transglycosylase D
S0206     -2      19       -1.002865   hydroxyacylglutathione hydrolase
S0207     -2      14       2.458525    hypothetical protein
S0208     -1      18       5.115203    ribonuclease H
S0209     -1      20       5.650839    DNA polymerase III subunit epsilon
S0211     0       24       7.063767    *hypothetical protein
S0212     1       23       5.844886    hypothetical protein
S0213     1       21       4.593950    periplasmic chaperone of fimbral assembly
S0214     2       21       3.578413    outer membrane usher protein
S0215     2       22       1.453489    hypothetical protein
S0217     2       22       0.226583    hypothetical protein
S0218     3       26       -0.619485   hypothetical protein
S0219     2       27       -0.298666   hypothetical protein
S0220     1       28       -0.127471   insertion sequence 2 OrfA protein
S0221     2       25       -0.037418   insertion element IS2 transposase InsD
S0223     6       25       -2.277540   IS911 orfA
S0224     9       30       -2.130349   IS911 orfB
S0229     8       30       -2.383505   endolysin R of prophage CP-933V
S0230     8       33       -1.199449   hypothetical protein
S0231     9       33       -0.692440   lysis protein S
S0232     6       35       -0.662469   hypothetical protein
S0233     4       31       -0.445320   hypothetical protein
S0235     4       30       0.114890    IS911 orfA
S0234     5       28       -0.066516   IS911 orfB
S0236     5       29       -0.615672   terminase large subunit
S0237     4       29       -0.926595   terminase large subunit
S0238     5       28       -1.025118   packaging glycoprotein
S0239     5       28       -1.099186   scaffolding protein
S0240     5       28       -1.728850   coat protein
S0241     4       29       -1.959784   hypothetical protein
S0242     3       29       -1.565238   DNA stabilization protein
S0243     4       31       -1.364092   packaged DNA stabilization protein
S0244     4       30       -1.099271   packaged DNA stabilization protein
S0245     2       29       -0.564505   head assembly protein
S0246     3       29       -0.615224   DNA transfer protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0205     INTIMIN       29     0.050    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 28.9 bits (64), Expect = 0.050
Identities = 20/84 (23%), Positives = 41/84 (48%), Gaps = 9/84 (10%)

Query: 316 RESLASGE---IAAVQSTLVANNTPLNSRVYTVRSGDTLSSIASRLGVSTKDLQQWNKLR 372
+ S A+GE S L+ +N+ N YT+++G+T++ ++ ++ + NK
Sbjct: 35 QNSFANGENYFKLGSDSKLLTHNSYQNRLFYTLKTGETVADLSKSQDINLSTIWSLNKHL 94

Query: 373 GS------KLKPGQSLTIGAQRLA 390
S K +PGQ + + ++L
Sbjct: 95 YSSESEMMKAEPGQQIILPLKKLP 118


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0231     FLAGELLIN     25     0.039    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 24.6 bits (53), Expect = 0.039
Identities = 14/77 (18%), Positives = 31/77 (40%), Gaps = 9/77 (11%)

Query: 2 KSMDKISTGIAYGTSSGSAGYWFL--------QWLDQVSPSQWAAIGVLGSLVLGFLTYL 53
+++++S+G+ ++ A + + L Q S + I + G L +
Sbjct: 26 SAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNANDGISIA-QTTEGALNEI 84

Query: 54 TNLYFKIREDKRKAARG 70
N ++RE +A G
Sbjct: 85 NNNLQRVRELSVQATNG 101


S.No  Start  End    Bias  Virulence  Insertion elements  Prediction
4     S0270  S0298  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
S0270     -1      18       3.253480    delta-aminolevulinic acid dehydratase
S0271     -1      20       3.496234    taurine dioxygenase
S0272     0       17       0.602973    taurine transporter subunit
S0273     1       16       -1.446401   taurine transporter ATP-binding subunit
S0274     -1      15       -2.351106   taurine transporter substrate binding subunit
S0275     0       19       -3.718404   IS1 orfB
S0276     1       21       -3.986970   IS1 orfA
S0277     1       21       -3.986970   hypothetical protein
S0278     0       18       -1.444027   hypothetical protein
S0279     0       13       0.133837    transporter
S0280     1       15       -0.051150   hypothetical protein
S0281     2       16       0.394355    dehydrogenase subunit
S0282     3       19       1.030814    IS1 orfA
S0283     3       24       3.514964    IS1 orfB
S0284     4       26       3.835466    Rhs-family protein
S0285     3       29       4.573447    hypothetical protein
S0286     3       28       4.358125    hypothetical protein
S0287     0       23       4.368409    Rhs-family protein
S0288     -1      23       4.053704    Rhs-family protein
S0289     -1      18       2.490976    Rhs-family protein
S0290     -1      16       1.508154    hypothetical protein
S0291     0       16       1.436909    C-lysozyme inhibitor
S0292     1       16       0.955779    acyl-CoA dehydrogenase
S0293     2       21       1.403205    phosphoheptose isomerase
S0294     2       22       1.521420    amidotransferase
S0295     1       22       1.563185    hypothetical protein
S0296     2       21       1.489720    lipoprotein
S0297     3       19       0.479370    hypothetical protein
S0299     2       19       0.325214    flagellar biosynthetic protein FlhA
S0298     2       19       -2.113651   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0270     BINARYTOXINB  30     0.019    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 29.7 bits (66), Expect = 0.019
Identities = 19/69 (27%), Positives = 30/69 (43%)

Query: 265 DIVRELRERTELPIGAYQVSGEYAMIKFAALAGAIDEEKVVLESLGSIKRAGADLIFSYF 324
+ EL + +L + QV G A F +D E L I+ A +IF+
Sbjct: 466 NQFLELEKTKQLRLDTDQVYGNIATYNFENGRVRVDTGSNWSEVLPQIQETTARIIFNGK 525

Query: 325 ALDLAEKKI 333
L+L E++I
Sbjct: 526 DLNLVERRI 534


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0284     CHANLCOLICIN  30     0.027    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 30.4 bits (68), Expect = 0.027
Identities = 31/101 (30%), Positives = 42/101 (41%), Gaps = 9/101 (8%)

Query: 10 PVGNGGPVITT-----PPIAGESGGMSTGSAVTDVSGAAEEMAEQAAADLFGALPEPSGL 64
P + G VI T P +G GG G + ++ S A A+ + A L E +
Sbjct: 13 PYDDKGQVIITLLNGTPDGSGSGGGGGKGGSKSESSAAIHATAKWSTAQLKKTQAEQAAR 72

Query: 65 VKAAVAAAQAAAAA---AGISDMAGAVQDAAASLAAGAPGA 102
KAA A AQA A A A + V +A A+ P A
Sbjct: 73 AKAA-AEAQAKAKANRDALTQRLKDIVNEALRHNASRTPSA 112


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0298     OMPADOMAIN    38     2e-05    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 38.4 bits (89), Expect = 2e-05
Identities = 29/118 (24%), Positives = 45/118 (38%), Gaps = 22/118 (18%)

Query: 121 FERGSAQIMPFFKTLLVELAPVFDSLY---NKIIITGHTDAM---AYKNNIYNNWNLSGD 174
F A + P + L +L +L +++ G+TD + AY N LS
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAY------NQGLSER 276

Query: 175 RALSARRVLEEAGMPEDKVMQVS-----AMADQMLLDAKNPQS-----AGNRRIEIMV 222
RA S L G+P DK+ + + K + A +RR+EI V
Sbjct: 277 RAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


S.No  Start  End    Bias  Virulence  Insertion elements  Prediction
5     S0316  S0339  Y     N          Y                   Genomic Island
LocusTag  DNBias  CDNBias  %GCBias     Product
S0316     1       35       -8.382286   *phage integrase
S0317     3       34       -8.721665   IS600 orfA
S0318     4       41       -11.952412  IS600 orfB
S0319     4       44       -13.117992  hypothetical protein
S4801     5       46       -14.261602  phage tail fibre protein
S4802     5       45       -13.793496  hypothetical protein
S4808     5       38       -10.359435  hypothetical protein
S4803     4       39       -10.982845  glucosyl tranferase II
S0320     3       33       -5.609885   bactoprenol glucosyl transferase
S0321     1       39       -8.308594   flippase
S0322     2       37       -7.605374   integrase fragment
S0323     1       33       -6.628618   IS600 orfB
S0324     1       35       -7.300845   IS600 orfA
S0325     0       29       -5.339177   phage integrase
S4804     2       23       -3.856693   hypothetical protein
S0326     0       14       1.742408    IS600 orfB
S0327     1       15       2.000476    IS629 orfB
S0328     1       18       1.626348    IS629, orfA
S0329     0       17       0.168095    diguanylate cyclase AdrA
S0330     0       17       -0.432593   pyrroline-5-carboxylate reductase
S0331     0       25       -1.779152   hypothetical protein
S0332     -1      28       -4.156612   shikimate kinase
S0333     -1      22       -3.763659   hypothetical protein
S0334     -1      21       -1.913456   hypothetical protein
S0335     1       21       -1.153271   hypothetical protein
S0336     3       18       0.203446    hypothetical protein
S0337     2       15       0.450472    hypothetical protein
S0339     2       16       1.070632    recombination associated protein
S.No  Start  End    Bias  Virulence  Insertion elements  Prediction
6     S0367  S0387  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
S0367     0       16       -5.144273   exodeoxyribonuclease VII small subunit
S0368     0       23       -8.627308   thiamine biosynthesis protein ThiI
S0369     1       31       -10.099907  hypothetical protein
S0370     2       33       -11.028355  2-dehydropantoate 2-reductase
S0371     3       36       -12.652417  nucleotide-binding protein
S4809     3       41       -13.047688  hypothetical protein
S4810     2       27       -8.672991   hypothetical protein
S0372     1       23       -0.930806   IS1 orfA
S4811     1       22       -0.551789   IS1 orfB
S4812     1       18       -0.131343   hypothetical protein
S0373     2       19       0.446717    hypothetical protein
S0374     4       22       0.733822    protoheme IX farnesyltransferase
S0375     5       23       1.105478    cytochrome o ubiquinol oxidase subunit IV
S0376     3       24       1.191023    cytochrome o ubiquinol oxidase subunit III
S0379     0       19       1.493853    ISSfl3 orfD
S0380     -1      18       1.182492    ISSfl3 orfC
S0381     -1      17       0.738842    ISSfl3 orfC
S0382     0       20       0.266116    ISSfl3 orfB
S0383     0       18       -0.252073   ISSfl3 orfA
S0384     0       21       -0.265271   muropeptide transporter
S0385     3       26       -0.646966   hypothetical protein
S0386     3       26       -0.699185   transcriptional regulator BolA
S0387     3       26       -0.407193   trigger factor
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0384     TCRTETA       39     3e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.0 bits (91), Expect = 3e-05
Identities = 71/347 (20%), Positives = 135/347 (38%), Gaps = 20/347 (5%)

Query: 62 KFLWSPLMDRYTPPFFGRRRGWLLATQILLLVAIAAMGFLEPGTQLRWMAALAVVIAFCS 121
+F +P++ + F RR LL + V A M W+ + ++A +
Sbjct: 56 QFACAPVLGALSDRF--GRRPVLLVSLAGAAVDYAIMAT----APFLWVLYIGRIVAGIT 109

Query: 122 ASQDIVFDAWKTDVLPAEERGAGAAISVLGYRLGMLVSGGLALWLADKWLGWQGMYWLMA 181
+ V A+ D+ +ER + GM+ L + ++ A
Sbjct: 110 GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG--FSPHAPFFAAA 167

Query: 182 AL-LIPCIIATLLAPEP--TDTIPVPKTLEQAVVAPLRDFFGRNNAWLILLLIVLYKLGD 238
AL + + L PE + P+ + + + A L+ + ++ +G
Sbjct: 168 ALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQ 227

Query: 239 AFAMSLTTTFLIRGVGFDAGEVGVVNKTLGLLATIVGALYGGILMQRLSLFRALLIFGIL 298
A +L F +DA +G+ G+L ++ A+ G + RL RAL+ G++
Sbjct: 228 VPA-ALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALM-LGMI 285

Query: 299 QGASNAGYWLLSITDKHLYSMGAAVFFENLCGGMGTSAFVALLMTLCNKSFSATQFALLS 358
A GY LL+ + + V GG+G A A+L ++ L+
Sbjct: 286 --ADGTGYILLAFATRGWMAFPIMVLL--ASGGIGMPALQAMLSRQVDEERQGQLQGSLA 341

Query: 359 ALSAVGRVYVGPVAGWFVEAHGWSTF--YLFSVAAAVPGLILLLVCR 403
AL+++ + VGP+ + A +T+ + + AA+ L L + R
Sbjct: 342 ALTSLTSI-VGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0385     PF06291       27     0.030    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 26.5 bits (58), Expect = 0.030
Identities = 11/34 (32%), Positives = 18/34 (52%)

Query: 3 KKILFPLVALFMLAGCAKPPTTIEVSPTITLPQQ 36
KK+LF ++ GCA+ T+ PT P++
Sbjct: 7 KKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKE 40


S.No  Start  End    Bias  Virulence  Insertion elements  Prediction
7     S0405  S0422  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
S0405     2       20       -2.024000   IS4 orf
S0406     2       19       -4.772272   hypothetical protein
S0408     0       12       -1.984099   hypothetical protein
S0409     0       13       -1.379524   hypothetical protein
S0410     2       16       -1.164483   hypothetical protein
S0412     0       14       -0.631675   hemolysin expression-modulating protein
S0413     0       14       -0.435979   hypothetical protein
S0414     0       15       0.446071    acriflavine resistance protein
S0415     1       11       -0.148454   acriflavin resistance protein AcrA precursor
S0416     1       13       -0.339300   DNA-binding transcriptional repressor AcrR
S0417     2       14       1.832591    potassium efflux protein KefA
S0418     4       15       3.540497    hypothetical protein
S0419     3       16       4.106587    primosomal replication protein N''
S0420     3       21       2.630976    hypothetical protein
S0421     3       26       2.455158    adenine phosphoribosyltransferase
S0422     2       21       2.491635    DNA polymerase III subunits gamma and tau
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0409     BCTERIALGSPF  30     0.026    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F

signature.
Length = 408

Score = 29.8 bits (67), Expect = 0.026
Identities = 31/137 (22%), Positives = 54/137 (39%), Gaps = 24/137 (17%)

Query: 245 IWLPLGLVIGLLAAMFVLRILRRIQSPHHRLQDAIENRDICVHYQPIVSLANGKIVGAEA 304
W+ L L+ G +A +LR R+ + + P++ G+I
Sbjct: 228 PWMLLALLAGFMAFRVMLR------QEKRRVS-----FHRRLLHLPLI----GRIARGLN 272

Query: 305 LARWPQTDGSWLSPDSFIPLAQQTGLS-EPLTLLIIRSVFEDMGDCLRQHPQQHISINLE 363
AR+ +T + S +PL Q +S + ++ R D +R+ H + LE
Sbjct: 273 TARYARTLSILNA--SAVPLLQAMRISGDVMSNDYARHRLSLATDAVREGVSLHKA--LE 328

Query: 364 STVLTSEKIPQLLREMI 380
T L P ++R MI
Sbjct: 329 QTAL----FPPMMRHMI 341


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0414     ACRIFLAVINRP  1366   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1366 bits (3536), Expect = 0.0
Identities = 800/1033 (77%), Positives = 913/1033 (88%), Gaps = 1/1033 (0%)

Query: 1 MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISASYPGADAKTVQDT 60
M NFFI RPIFAWV+AII+M+AG LAIL+LPVAQYPTIAPPAV++SA+YPGADA+TVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQ 120
VTQVIEQNMNGIDNLMYMSS SDS G+V ITLTF+SGTD DIAQVQVQNKLQLA PLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVSVEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDAISRTSGVGDVQLFGS 180
EVQQQG+SVEKSSSS+LMV G ++ + TQ+DISDYVA+N+KD +SR +GVGDVQLFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWMNPNELNKFQLTPVDVITAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQTRL 240
QYAMRIW++ + LNK++LTPVDVI +K QN Q+AAGQLGGTP + GQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 TSTEEFGKILLKVNQDGSRVLLRDVAKIELGGENYDIIAEFNGQPASGLGIKLATGANAL 300
+ EEFGK+ L+VN DGS V L+DVA++ELGGENY++IA NG+PA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTAAAIRAELAKMEPFFPSGLKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFLQ 360
DTA AI+A+LA+++PFFP G+K++YPYDTTPFV++SIHEVVKTL EAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTFA+LAAFG+SINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 AEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGSTGAIYRQFSITIVSAMAL 480
E+ LPPKEAT KSM QIQGALVGIAMVLSAVF+PMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATMLKPIAKGDHGEGKKGFFGWLNRMFEKSTHHYTDSVGGILRSTGR 540
SVLVALILTPALCAT+LKP++ H E K GFFGW N F+ S +HYT+SVG IL STGR
Sbjct: 481 SVLVALILTPALCATLLKPVSAE-HHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 541 YLVLYLIIVVGMAYLFVRLPSSFLPDEDQGVFMTMVQLPAGATQERTQKVLNEVTHYYLT 600
YL++Y +IV GM LF+RLPSSFLP+EDQGVF+TM+QLPAGATQERTQKVL++VT YYL
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 601 KEKNNVESVFAVNGFGFAGRGQNTGIAFVSLKDWADRPGEENKVEAITMRATRAFSQIKD 660
EK NVESVF VNGF F+G+ QN G+AFVSLK W +R G+EN EA+ RA +I+D
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 661 AMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLLAEAAKHPDMLTSVRPNG 720
V FN+PAIVELGTATGFDFELIDQAGLGH+ LTQARNQLL AA+HP L SVRPNG
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 721 LEDTPQFKIDIDQEKAQALGVSINDINTTLGAAWGGSYVNDFIDRGRVKKVYVMSEAKYR 780
LEDT QFK+++DQEKAQALGVS++DIN T+ A GG+YVNDFIDRGRVKK+YV ++AK+R
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFR 779

Query: 781 MLPDDIGDWYVRAADGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILGQAAPGKSTGEA 840
MLP+D+ YVR+A+G+MVPFSAF++S W YGSPRLERYNGLPSMEI G+AAPG S+G+A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 841 MELMEQLASKLPTGVGYDWTGMSYQERLSGNQAPSLYAISLIVVFLCLAALYESWSTPFS 900
M LME LASKLP G+GYDWTGMSYQERLSGNQAP+L AIS +VVFLCLAALYESWS P S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 901 VMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGL 960
VMLVVPLG++G LLAAT NDVYF VGLLTTIGLSAKNAILIVEFAKDLM+KEGKG+
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 961 IEATLDAVRMRLRPILMTSLAFILGVMPLVISTGAGSGAQNAVGTGVMGGMVTATVLAIF 1020
+EATL AVRMRLRPILMTSLAFILGV+PL IS GAGSGAQNAVG GVMGGMV+AT+LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1021 FVPVFFVVVRRRF 1033
FVPVFFVV+RR F
Sbjct: 1020 FVPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0415     RTXTOXIND     45     3e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 45.2 bits (107), Expect = 3e-07
Identities = 33/212 (15%), Positives = 72/212 (33%), Gaps = 23/212 (10%)

Query: 100 TYQATYDSAKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQGYDQALADAQQANAAVTA 159
+ Y A +L + + Q+ Q +++ ++ L +Q +
Sbjct: 256 EQENKYVEAVNELR--VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 160 AKAAVETARINLAYTKVTSPISGRIGKSNV-TEGALVQNGQATVLATVQQLDPIYVDVTQ 218
+ + + +P+S ++ + V TEG +V + T++ V + D + V
Sbjct: 314 LTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTALV 372

Query: 219 SSNDFLRLKQELA----------NGTLKQENGKAKVSLITSDGIKFPQDGTLEFSDVTVD 268
+ D + KV I D I+ + G + ++++
Sbjct: 373 QNKDIGFINVGQNAIIKVEAFPYTRYGYLV---GKVKNINLDAIEDQRLGLVFNVIISIE 429

Query: 269 QTTGSITLRAIFPNPDHTLLPGMFVRARLEEG 300
+ S + I L GM V A ++ G
Sbjct: 430 ENCLSTGNKNIP------LSSGMAVTAEIKTG 455



Score = 32.9 bits (75), Expect = 0.002
Identities = 26/127 (20%), Positives = 50/127 (39%), Gaps = 10/127 (7%)

Query: 49 PLQITTELPGR-TSAYRIAEVRPQVSGIILKRNFKEGSDIEAGVSLYQIDPATYQATYDS 107
++I G+ T + R E++P + I+ + KEG + G L ++ +A
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA---- 134

Query: 108 AKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQGYDQALADAQQANAAVTAAKAAVETA 167
D K Q++ A+L RYQ L + I + + V+ + T+
Sbjct: 135 ---DTLKTQSSLLQARLEQTRYQILSRS--IELNKLPELKLPDEPYFQNVSEEEVLRLTS 189

Query: 168 RINLAYT 174
I ++
Sbjct: 190 LIKEQFS 196


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0416     HTHTETR       222    5e-76    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 222 bits (567), Expect = 5e-76
Identities = 215/215 (100%), Positives = 215/215 (100%)

Query: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60
MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120
EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180
GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215
APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0417     RTXTOXIND     32     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 31.7 bits (72), Expect = 0.017
Identities = 19/125 (15%), Positives = 40/125 (32%), Gaps = 6/125 (4%)

Query: 28 QNTAFARASSNGDLPTKADLQAQLDSLNKQKDLSAQDKLVQQDLTDTLATLDKIDRIKEE 87
N RA L + + L L+ + A L++ ++ E
Sbjct: 207 LNLDKKRAERLTVLARINRYENLSRVEKSR--LDDFSSLLHKQAIAKHAVLEQENKYVEA 264

Query: 88 TVQLRQKVAEAPEKMRQATAALTALSDVDND--EETRKIL--STLSLRQLETRVAQALDD 143
+LR ++ + + +A V E L +T ++ L +A+ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEER 324

Query: 144 LQNAQ 148
Q +
Sbjct: 325 QQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0422     IGASERPTASE   39     7e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.9 bits (90), Expect = 7e-05
Identities = 40/251 (15%), Positives = 77/251 (30%), Gaps = 31/251 (12%)

Query: 402 PLPETTSQVLAARQQLQRVQGATKAKKSEPAA----ATRARPVNNAALERLASVTDRVQA 457
P E +Q + + + P+ AR + A + A T
Sbjct: 983 PEVEKRNQTVDTTN----ITTPNNIQADVPSVPSNNEEIARV-DEAPVPPPAPATPSETT 1037

Query: 458 RPVPSALEKAPAKKEAYRWKATTPVMQQKE--------VVATPKALKKA---LEHEKTPE 506
V ++ E AT Q +E V A + + A E ++T
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 507 LAVKLAA---------EAIERDPWAAQVSQLSLPKLVEQVALNAWKE-ESDNAVCLHLRS 556
K A E+ +V+ PK + + E +N ++++
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 557 SQRHLNNRGAQQKLAEALS-MLKGSTVELTIVEDDNPAVRTPLEWRQAIYEEKLAQARES 615
Q N ++ A+ S ++ E T V N V P A + + +
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 616 IIADNNIQTLR 626
+ + +++R
Sbjct: 1218 KPKNRHRRSVR 1228


S.No  Start  End    Bias  Virulence  Insertion elements  Prediction
8     S0461  S0482  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
S0461     2       19       2.284159    phosphoribosylaminoimidazole carboxylase ATPase
S0462     3       20       1.813726    phosphoribosylaminoimidazole carboxylase
S0463     3       18       1.268307    UDP-2,3-diacylglucosamine hydrolase
S0464     2       17       -0.199020   peptidyl-prolyl cis-trans isomerase B (rotamase
S0465     0       15       -1.141008   cysteinyl-tRNA synthetase
S0466     -1      22       -2.346666   hypothetical protein
S0467     -1      21       -2.516468   hypothetical protein
S0468     0       19       -3.708235   bifunctional 5,10-methylene-tetrahydrofolate
S0469     -1      22       -4.346839   fimbrial-like protein
S0470     -1      19       -2.552100   chaperone
S0471     -1      16       -0.853208   IS1 orfB
S0472     2       17       -0.898277   IS1 orfA
S0475     1       18       -1.357690   *envelope protein
S0476     0       15       -0.591663   hypothetical protein
S0479     -2      15       -0.249616   sensor kinase CusS
S0480     -1      15       -0.006975   DNA-binding transcriptional activator CusR
S0481     -2      17       0.035820    copper/silver efflux system outer membrane
S0482     2       24       -0.040492   periplasmic copper-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0465     RTXTOXIND     29     0.031    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 29.4 bits (66), Expect = 0.031
Identities = 16/150 (10%), Positives = 44/150 (29%), Gaps = 8/150 (5%)

Query: 299 RSQLNYSEENLKQARAALERLYTALRGTDKTVAPAGGEAFEARFIEAMDDDFNTP----- 353
+ ++ +L QAR R R + P E F +++
Sbjct: 133 EADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIK 192

Query: 354 EAYSVLFDMAREVNRLKAEDMAAANAMASHLRKLSAVLGLLEQEPEAFLQSGAQADDSEV 413
E +S + + + A + + + + + + + + F +
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSS---LLHKQAI 249

Query: 414 AEIEALIQQRLDARKAKDWAAADAARDRLN 443
A+ L Q+ + + +++
Sbjct: 250 AKHAVLEQENKYVEAVNELRVYKSQLEQIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0480     HTHFIS        86     1e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 85.7 bits (212), Expect = 1e-21
Identities = 35/117 (29%), Positives = 62/117 (52%)

Query: 2 KLLIVEDEKKTGEYLTKGLTEAGFVVDLADNGLNGYHLAMTGDYDLIILDIMLPDVNGWD 61
+L+ +D+ L + L+ AG+ V + N + GD DL++ D+++PD N +D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 IVRMLRSANKGMPILLLTALGTIEHRVKGLELGADDYLVKPFAFAELLARVRTLLRR 118
++ ++ A +P+L+++A T +K E GA DYL KPF EL+ + L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0481     RTXTOXIND     39     2e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 39.4 bits (92), Expect = 2e-05
Identities = 26/189 (13%), Positives = 61/189 (32%), Gaps = 13/189 (6%)

Query: 254 QAQTVNSGSLQSVKLPA-GLSSQILLQRPDIMEAEHALM-----AANANIGAARAAFFPS 307
+ +SG + +K + +I+++ + + L+ A A+ ++
Sbjct: 87 NGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQS----- 141

Query: 308 ISLTSGISTASSDLSSLFNASSGMWNFIPKIEIPIFNAGRNQANLDIAEIRQQQSVVNYE 367
SL + + + + P F + L + + ++Q
Sbjct: 142 -SLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQN 200

Query: 368 QKIQNAFKEVADALALRQSLDDQISAQQRYLASLQITLQRARALYQHGAVSYLEVLDAER 427
QK Q + A R ++ +I+ + + L +L A++ VL+ E
Sbjct: 201 QKYQ-KELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQEN 259

Query: 428 SLFATRQTL 436
L
Sbjct: 260 KYVEAVNEL 268


S.No  Start  End    Bias  Virulence  Insertion elements  Prediction
9     S0493  S0522  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
S0493     2       21       -2.347386   dihydropteridine reductase
S0494     3       27       -2.740283   hypothetical protein
S0495     2       28       -2.561225   hypothetical protein
S0496     1       27       -2.856358   carboxylate-amine ligase
S0497     1       16       -1.125387   IS103 orf
S0498     0       13       0.179731    IS150 orfB
S0499     -2      11       2.333690    hypothetical protein
S0500     -2      10       2.379946    hypothetical protein
S0501     -2      10       2.476798    phosphopantetheinyltransferase component of
S0502     -2      11       2.828782    outer membrane receptor FepA
S0503     0       12       3.521628    enterobactin/ferric enterobactin esterase
S0504     0       13       4.416236    enterobactin synthase subunit F
S0505     1       14       4.637732    IS1 orfA
S0506     1       15       4.689317    IS1 orfB
S0508     1       17       4.856707    iron-enterobactin transporter ATP-binding
S0509     1       17       4.992121    iron-enterobactin transporter permease
S0510     0       17       4.847707    iron-enterobactin transporter membrane protein
S0511     0       18       4.281988    enterobactin exporter EntS
S0512     -1      19       4.353705    iron-enterobactin transporter periplasmic
S0514     -1      19       4.285311    enterobactin synthase subunit E
S0515     0       17       3.834202    2,3-dihydro-2,3-dihydroxybenzoate synthetase
S0516     -1      15       3.744174    2,3-dihydroxybenzoate-2,3-dehydrogenase
S0517     0       15       2.945038    hypothetical protein
S0518     -1      13       2.450216    carbon starvation protein
S0519     -2      13       0.479091    hypothetical protein
S0520     -1      17       -1.449498   aminotransferase
S0522     2       23       -2.647901   IS911 orfB
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0499     HOKGEFTOXIC   56     2e-15    Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 56.4 bits (136), Expect = 2e-15
Identities = 18/50 (36%), Positives = 27/50 (54%)

Query: 1 MLTKYALVAVIVLCLTVLGFTLLAGDSLCEFTVKERNIEFRAVLAYEPKK 50
+ + V+++CLT+L FT L SLCE ++ E A +AYE K
Sbjct: 3 LPRSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYESGK 52


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0500     HOKGEFTOXIC   64     9e-18    Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 63.7 bits (155), Expect = 9e-18
Identities = 18/52 (34%), Positives = 28/52 (53%)

Query: 32 INMLTKYALVAVIVLCLTVLGFTLLVGDSLCEFTVKERNIEFKAVLAYEPKK 83
+ + + V+++CLT+L FT L SLCE ++ E A +AYE K
Sbjct: 1 MKLPRSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYESGK 52


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0501     ENTSNTHTASED  275    7e-97    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 275 bits (705), Expect = 7e-97
Identities = 105/183 (57%), Positives = 130/183 (71%), Gaps = 1/183 (0%)

Query: 4 MKTTHTSLPFAGHTLHFVEFDPANFCEQDLLWLPHYAQLQHAGRKRKTEHLAGRIAAVYA 63
M T+H LPFAGH LH V+FD ++F E DLLWLPH+ +L+ AGRKRK EHLAGRIAAV+A
Sbjct: 1 MLTSHFPLPFAGHRLHIVDFDASSFREHDLLWLPHHDRLRSAGRKRKAEHLAGRIAAVHA 60

Query: 64 LREYGYKCVPAIGELRQPVWPAEVYGSISHCGATALAVVSRQPIGVDIEEIFSAQTATEL 123
LRE G + VP +G+ RQP+WP ++GSISHC TALAV+SRQ IG+DIE+I S TATEL
Sbjct: 61 LREVGVRTVPGMGDKRQPLWPDGLFGSISHCATTALAVISRQRIGIDIEKIMSQHTATEL 120

Query: 124 TDNIITPAEHERLADCGLAFSLALTLAFSAKESAFKA-SEIQTDAGFLDYQIISWNKQQV 182
+II E + L L F LALTLAFSAKES +KA S+ T GF ++ S +
Sbjct: 121 APSIIDSDERQILQASLLPFPLALTLAFSAKESVYKAFSDRVTLPGFNSAKVTSLTATHI 180

Query: 183 IIH 185
+H
Sbjct: 181 SLH 183


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0511     TCRTETA       37     1e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.7 bits (85), Expect = 1e-04
Identities = 81/391 (20%), Positives = 142/391 (36%), Gaps = 38/391 (9%)

Query: 27 FISIVSLGLLGVAVPVQIQMMTHSTWQV---GLSVTLTGGAMFVGLMVGGVLADRYERKK 83
+ V +GL+ +P ++ + HS G+ + L F V G L+DR+ R+
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 84 VILLARGTCGIGFIGLCLNALL--PEPSLLAIYLLGLWDGFFASLGVTALLAATSALVGR 141
V+L + G ++ + P L +Y+ + G + G A A + +
Sbjct: 75 VLL-------VSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVA-GAYIADITDG 126

Query: 142 ENLMQAGAITMLTVRLGSVISPMIGGLLLATGGVAWNYGLAAAGTFITLLPLLSLPE-LP 200
+ + G V P++GGL+ A + AA L LPE
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHK 186

Query: 201 PPPQPLEHPLKSLLAGFRFLLASPLLGGLLTMA----------SAVLVLYPALADNWQMS 250
+PL + LA FR+ ++ L+ + +A+ V++ D +
Sbjct: 187 GERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG--EDRFHWD 244

Query: 251 AAQIGFLYAAIP-LGAAIGALTSGKLAHSARPGLLMLLSTLGS---FLAIGLFGLMPMWI 306
A IG AA L + A+ +G +A ++L + ++ + M
Sbjct: 245 ATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAF 304

Query: 307 LGVVCLALFGWLSAVSSLLQYTMLQTQTPEAMLGRINGLWTAQNVTGDAIGAALLGGLGA 366
+V LA G ML Q E G++ G A +G L + A
Sbjct: 305 PIMVLLASGGIGMPALQ----AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYA 360

Query: 367 MMTPVASASASGFGLLIIGVLLLLVLVELRR 397
+ + +G+ + L LL L LRR
Sbjct: 361 ----ASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0512     FERRIBNDNGPP  63     2e-13    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 63.0 bits (153), Expect = 2e-13
Identities = 61/285 (21%), Positives = 102/285 (35%), Gaps = 35/285 (12%)

Query: 40 HTLESQPQRIVSTSVTLTGSLLAIDAPVIASGATTPNNRVADDQGFLRQWSKVAKERKLQ 99
H P RIV+ LLA+ VAD + R W E L
Sbjct: 29 HAAAIDPNRIVALEWLPVELLLALGIVPYG---------VADTINY-RLW---VSEPPLP 75

Query: 100 RLYIG-----EPSAEAVAAQMPDLILISATGGDSALALYDQLSTIAPTLIINYDDKS--- 151
I EP+ E + P ++ SA G S + L+ IAP N+ D
Sbjct: 76 DSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPL 131

Query: 152 --WQSLLTQLGEITGHEKQAAERIAQFDKQLAAAKEQIKLPPQPVTAIVYTAAAHSANLW 209
+ LT++ ++ + A +AQ++ + + K + + ++
Sbjct: 132 AMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVF 191

Query: 210 TPESAQGQMLEQLGFTLAKLPAGLNASQSQGKRHDIIQLGGENLAAGLNGESLFLFAGDQ 269
P S ++L++ G NA Q + + + LAA + + L +
Sbjct: 192 GPNSLFQEILDEYGIP--------NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNS 243

Query: 270 KDADAIYANPLLAHLPAVQNKQVYALGTETFRLDYYSAMQVLERL 314
KD DA+ A PL +P V+ + + F SAM + L
Sbjct: 244 KDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVL 288


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0515     ISCHRISMTASE  440    e-159    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 440 bits (1133), Expect = e-159
Identities = 145/299 (48%), Positives = 194/299 (64%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQAYALPESHDIPQNKVDWAFELQRAALLIHDMQDYFVSFWGENCPMMEQVIANI 60
MAIP +Q Y +P + D+PQNKV W + RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRDYCKQHNIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVDRLTPDADDTV 120
L++ C Q IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P+ DD V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKESGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M+++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSRDEHLMSLKYVAGRSGRVVMTEELL------PAPIPASKA-----------ALREVIL 223
FS ++H M+L+Y AGR VMT+ LL PA + + A +R+ I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDESDEPFDDD-NLIDYGLDSVRMMALAARWRKVHGDIDFVMLAKNPTIDAWWKLLS 281
LL E+ E D +L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W KLL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299
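
The E-value for this hit is printed as `e-159`, with no leading digit, which a plain `float()` call rejects. A small sketch of normalising such strings before comparing them with the E-value (RPSBlast) threshold of 0.05 from the input parameters follows. Treating the threshold as inclusive is an assumption, and both function names are hypothetical.

```python
def parse_evalue(text: str) -> float:
    """Convert an E-value string such as '2e-15', '0.014' or 'e-159'."""
    text = text.strip()
    # Some listings drop the mantissa when it is 1, e.g. "e-159".
    if text.lower().startswith("e"):
        text = "1" + text
    return float(text)


def passes_threshold(evalue_text: str, threshold: float = 0.05) -> bool:
    """True if the hit is at or below the threshold (inclusive by assumption)."""
    return parse_evalue(evalue_text) <= threshold
```

With this helper, `passes_threshold("e-159")` and `passes_threshold("0.042")` are both true, while a value such as `"0.5"` would be rejected.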


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0516     DHBDHDRGNASE  359    e-129    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 359 bits (922), Expect = e-129
Identities = 108/258 (41%), Positives = 149/258 (57%), Gaps = 20/258 (7%)

Query: 5 GKNVWVTGAGKGIGYATALTFVEAGAKVTGFD---------------QAFTQEQYPFATE 49
GK ++TGA +GIG A A T GA + D +A E +P
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP---- 63

Query: 50 VMDVADAGQVAQVCQRLLAETERLDVLINAAGILRMGATDQLSKEDWQQTFAVNVGGAFN 109
DV D+ + ++ R+ E +D+L+N AG+LR G LS E+W+ TF+VN G FN
Sbjct: 64 -ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 110 LFQQTMNQFRRQRGGAIVTVASDAAHTARIGMSAYGASKAALKSLALSVGLELAGSGVRC 169
+ +R G+IVTV S+ A R M+AY +SKAA +GLELA +RC
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 170 NVVSPGSTDTDMQRTLWVSDDAEEQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDL 229
N+VSPGST+TDMQ +LW ++ EQ I+G E FK GIPL K+A+P +IA+ +LFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 230 ASHITLQDIVVDGGSTLG 247
A HIT+ ++ VDGG+TLG
Sbjct: 243 AGHITMHNLCVDGGATLG 260


10  S0561  S0571  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
S0561          3       18  -1.135002  PnuC protein
S0562          3       20  -0.709663  quinolinate synthetase
S0568          3       20  -0.633698  *****tol-pal system protein YbgF
S0569          3       22  -0.424026  peptidoglycan-associated outer membrane
S0570          3       18  -0.491094  translocation protein TolB
S0571          3       18  -0.368936  cell envelope integrity inner membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0569     OMPADOMAIN    116    5e-34    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 116 bits (292), Expect = 5e-34
Identities = 35/119 (29%), Positives = 54/119 (45%), Gaps = 4/119 (3%)

Query: 55 EEQARLQMQQLQQNNIVYFDLDKYDIRSDFAQMLDAHANFLRSN--PSYKVTVEGHADER 112
+Q + + V F+ +K ++ + LD + L + V V G+ D
Sbjct: 205 APAPEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRI 264

Query: 113 GTPEYNISLGERRANAVKMYLQGKGVSADQISIVSYGKEKPAVLGHDEAAYSKNRRAVL 171
G+ YN L ERRA +V YL KG+ AD+IS G+ P V G+ K R A++
Sbjct: 265 GSDAYNQGLSERRAQSVVDYLISKGIPADKISARGMGESNP-VTGN-TCDNVKQRAALI 321


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0571     IGASERPTASE   64     7e-13    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 63.5 bits (154), Expect = 7e-13
Identities = 39/210 (18%), Positives = 70/210 (33%), Gaps = 9/210 (4%)

Query: 79 EQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQAELKQKQAEVAA 138
E+R A+ + +E E A +E + AE +
Sbjct: 986 EKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSK 1045

Query: 139 AKAAADAKAAEEAAKKAAADAKKKAEAEAAKAAAEAQKKAEVAAAALKKKAEAAEAAAAE 198
++ K E+ A + A ++ A+ + A Q EVA + +E E E
Sbjct: 1046 QESKTVEKN-EQDATETTAQNREVAKEAKSNVKANTQT-NEVA----QSGSETKETQTTE 1099

Query: 199 ARKKAATEAAEKAKAEAEKKAAAEKAAADKKAAAEKAAADKKAAEKAAADKAAADKKAAA 258
++ A E EKAK E EK K + E++ + AE A +
Sbjct: 1100 TKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTV---NIK 1156

Query: 259 EKAAADKKAAAAKAAAEKAAAAKAAAEADD 288
E + A + A++ ++ +
Sbjct: 1157 EPQSQTNTTADTEQPAKETSSNVEQPVTES 1186



Score = 55.1 bits (132), Expect = 3e-10
Identities = 30/244 (12%), Positives = 81/244 (33%), Gaps = 20/244 (8%)

Query: 51 DAVMVDSGAVVEQYKRMQSQESSAKRSDEQRKMKEQQAAE-ELREKQAAEQER------L 103
D V A + ++ ++K+ + + EQ A E + ++ A++ +
Sbjct: 1021 DEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANT 1080

Query: 104 KQLEKERLAAQEQKKQAEEAAKQAELKQKQAEVAAAKAAADAKAAEEAAKKAAADAKKKA 163
+ E + ++ ++ Q K+ +K+ KA + + +E K + + K+
Sbjct: 1081 QTNEVAQSGSETKETQ-TTETKETATVEKE-----EKAKVETEKTQEVPKVTSQVSPKQE 1134

Query: 164 EAEAAKAAAEAQKKAEVAAAALKKKAEAAEAAAAEARKKAATEAAEKAKAEAEKKAAAEK 223
++E + AE ++ + E + + A++ + E+
Sbjct: 1135 QSETVQPQAEPARENDPTVN-------IKEPQSQTNTTADTEQPAKETSSNVEQPVTEST 1187

Query: 224 AAADKKAAAEKAAADKKAAEKAAADKAAADKKAAAEKAAADKKAAAAKAAAEKAAAAKAA 283
+ E A + + +++K + + + A +
Sbjct: 1188 TVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTV 1247

Query: 284 AEAD 287
A D
Sbjct: 1248 ALCD 1251



Score = 54.7 bits (131), Expect = 5e-10
Identities = 33/221 (14%), Positives = 76/221 (34%), Gaps = 8/221 (3%)

Query: 66 RMQSQESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAK 125
R ++E+ + + + Q+ E +E Q E + +EKE A E +K E
Sbjct: 1066 REVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKV 1125

Query: 126 QAELKQKQAEVAAAKAAADAKAAEEAAKKAAADAKKKAEAEAAKAAAEAQKKAEVAAAAL 185
+++ KQ ++ AE A + K+ +++ A Q E ++
Sbjct: 1126 TSQVSPKQ-----EQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVE 1180

Query: 186 KKKAEAAEAAAAEARKKAATEAAEKAKAEAEKKAAAEKAAADKKAAAEKAAADKKAAEKA 245
+ E+ + + ++ K + + + + A +
Sbjct: 1181 QPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTS 1240

Query: 246 AADKAAADKKAAAEKAAADKKAAAAKAAAEKAAAAKAAAEA 286
+ D++ A + + + A + A A+ A +A
Sbjct: 1241 SNDRSTV---ALCDLTSTNTNAVLSDARAKAQFVALNVGKA 1278



Score = 52.4 bits (125), Expect = 3e-09
Identities = 32/184 (17%), Positives = 67/184 (36%), Gaps = 12/184 (6%)

Query: 99 EQERLKQLEKERLAAQEQKKQAEEAAKQAELKQK----QAEVAAAKAAADAKAAEEAAK- 153
E E+ Q QA+ + + ++ +A V A ++ E A+
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 154 -KAAADAKKKAEAEAAKAAAEAQKKAEVAAAALKKKAEAAEAAAAEARKKAATEAAEKAK 212
K + +K E +A + A+ ++ A+ A + +K + E A ++ +E E
Sbjct: 1044 SKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVA------QSGSETKETQT 1097

Query: 213 AEAEKKAAAEKAAADKKAAAEKAAADKKAAEKAAADKAAADKKAAAEKAAADKKAAAAKA 272
E ++ A EK K + K ++ + + + + AE A + K
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 273 AAEK 276
+
Sbjct: 1158 PQSQ 1161



Score = 47.4 bits (112), Expect = 9e-08
Identities = 29/213 (13%), Positives = 68/213 (31%), Gaps = 6/213 (2%)

Query: 71 ESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQAELK 130
+S ++ + Q ++ A E EK E E+ +++ K +++Q+E QAE
Sbjct: 1087 QSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPA 1146

Query: 131 QKQAEVAAAKAAADAKAAEEAAKKAAADAKKKAEAEAAKAAAEAQKKAEVAAAALKKKAE 190
++ K ++ A + E ++ + V A
Sbjct: 1147 RENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPAT 1206

Query: 191 AAEAAAAEARKKAATEAAEKAKAEAEKKAAAEKAAADKKAAAEKAAADKKAAEKAAADKA 250
+E+ K ++ A ++ D+ A +
Sbjct: 1207 TQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTN------TNAV 1260

Query: 251 AADKKAAAEKAAADKKAAAAKAAAEKAAAAKAA 283
+D +A A+ A + A ++ ++ +
Sbjct: 1261 LSDARAKAQFVALNVGKAVSQHISQLEMNNEGQ 1293



Score = 40.4 bits (94), Expect = 1e-05
Identities = 22/156 (14%), Positives = 44/156 (28%), Gaps = 11/156 (7%)

Query: 126 QAELKQKQAEVAAAKAAADAKAAEEAAK-KAAADAKKKAEAEAAKAAAEAQKKAEVAAAA 184
+ E + + + + +A + A+ A A + E A
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 185 LKKKAEAAEAAAAEARKKAATEAAEKAKAEAEKKAAAEKAAADKKAAAEKAAADKKAAEK 244
K++++ E K +A E E A+ E A + + E
Sbjct: 1044 SKQESKTVE--------KNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKET 1095

Query: 245 AAADKAAADKKAAAEKAAA--DKKAAAAKAAAEKAA 278
+ EKA +K K ++ +
Sbjct: 1096 QTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSP 1131


11  S0700  S0765  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
S0700          2       29  -1.180409  bacteriophage protein
S0701          4       34  -0.637401  IS911 orfB
S0702          6       44  -1.653573  IS911 orfB
S0703          4       32  -2.179854  IS911 orfA
S0704          4       32  -2.732735  hypothetical protein
S0705          4       30  -3.499858  IS911 orfA
S0706          3       30  -1.696861  IS911 orfB
S0707          3       25  -2.801092  IS911 orfB
S0708          2       27  -2.386607  IS911 orfB
S0709          3       26  -1.924853  bacteriophage protein
S0714          3       27  -0.926726  ****hypothetical protein
S0715          3       27  -1.396479  IS600 orfB
S0716          2       24  -1.673677  IS600 orfA
S0718          3       23  -0.958581  bacteriophage protein
S0719          2       20  -0.600835  IS911 orfB
S0720          3       22  -1.377978  IS911 orfB
S0721          3       21  -1.949237  IS911 orfA
S0722          4       23  -1.386287  replication protein DnaC
S0723          4       24  -1.985437  helicase
S0724          1       25  -3.158803  bacteriophage protein
S0725          3       26  -2.529714  bacteriophage protein
S0726          1       26  -2.638130  Q protein
S0731          2       27  -1.891658  ****S protein
S0732          2       21  -1.734760  hypothetical protein
S0733          1       22  -0.947812  endolysin R of prophage CP-933V
S0734          1       22   0.127661  endopeptidase
S0735          1       24   0.530285  IS600 orfA
S0736          1       23   2.400609  IS600 orfB
S0737          2       24   2.894242  DNA packaging protein of prophage CP-933R;
S0738          0       27   4.363606  head completion protein gp3
S0739          0       28   4.231587  head-tail preconnector gp5
S0740          1       28   3.849877  head-tail preconnector gp5
S0741          4       26   3.681308  head-tail preconnector gp5
S0742          4       28   1.925845  capsid protein small subunit
S0743          5       23   1.904122  major capsid protein
S0744          6       27   1.710754  DNA-packaging protein
S0745          8       25   1.561069  tail attachment protein
S0746          3       25   2.802788  tail component of prophage CP-933K
S0747          3       25   2.653887  tail component of prophage CP-933K
S0748          2       25   2.983490  tail component of prophage CP-933K
S0749          3       29   3.336449  tail component of prophage CP-933K
S0750          3       28   3.670567  tail component of prophage CP-933K
S0751          2       26   3.779782  tail length tape measure protein precursor
S0752          5       25   4.009152  minor tail protein
S0753          4       24   4.130213  minor tail protein
S0754          4       24   3.844460  tail assembly protein
S0755          3       23   3.404341  tail component
S0756          2       20   3.119542  host specificity protein
S0757          1       15   2.783599  membrane protein precursor
S0758          1       15   2.707982  tail component encoded by cryptic prophage
S0759          2       15   3.398639  tail component of prophage CP-933K
S0763          2       16   3.757749  hypothetical protein
S0764          2       15   3.725627  kinase inhibitor protein
S0765         -1       13   3.217855  adenosylmethionine-8-amino-7-oxononanoate
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0718     PF05272       54     2e-09    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 53.9 bits (129), Expect = 2e-09
Identities = 30/82 (36%), Positives = 48/82 (58%), Gaps = 2/82 (2%)

Query: 4 SELSDLLWAQVDRVAPHLLPNGKIEGHEWVAGNVNGDKGNSLKVNLIGKKKWADFAEGDG 63
+ L+D L + + P LP G + GHE+ G++ G KG+S KVN + KW DF+ G+
Sbjct: 12 TSLADALLTRAKDLLPEWLPGGVLVGHEYECGSLAGGKGDSCKVN-VTTGKWCDFSTGES 70

Query: 64 G-DMLDLWMACRGINLHQAMQE 84
G D+LDL+ G+ + +A +
Sbjct: 71 GRDLLDLYAEIHGLKVSKAAAQ 92


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0725     DNABINDNGFIS  30     3e-04    DNA-binding protein FIS signature.
		>DNABINDNGFIS#DNA-binding protein FIS signature.

Length = 98

Score = 29.6 bits (66), Expect = 3e-04
Identities = 12/33 (36%), Positives = 19/33 (57%)

Query: 3 VKIQTIPELLIQTRGNMTEVSRMLNCNRATVRK 35
V+ + ++ TRGN T + M+ NR T+RK
Sbjct: 58 VEQPLLDMVMQYTRGNQTRAALMMGINRGTLRK 90


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0726     FIMREGULATRY  27     0.014    Escherichia coli: P pili regulatory PapB protein si...
		>FIMREGULATRY#Escherichia coli: P pili regulatory PapB protein signature.

Length = 104

Score = 26.8 bits (59), Expect = 0.014
Identities = 8/31 (25%), Positives = 15/31 (48%)

Query: 71 LVDYYVFGMTFMTLARKHGCSDGYIGKKLQK 101
+ DY V G + + K+ ++GY L +
Sbjct: 51 MKDYLVGGHSRKEVCEKYQMNNGYFSTTLGR 81


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0751     RTXTOXIND     40     6e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.8 bits (93), Expect = 6e-05
Identities = 14/194 (7%), Positives = 43/194 (22%), Gaps = 12/194 (6%)

Query: 29 LNGAASDAERSSARMQRFMERQTQAARQTMQAASSAATAASAHAQTVEKNARAHERMARE 88
L ++A+ + Q + + Q S + + E
Sbjct: 127 LTALGAEADTLKTQSSL---LQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEE 183

Query: 89 VEQTRLRVDALNQKMREEQAQARALAEAQDKAAAAFYRQIDSVKQAGAGLQELQRIQQQI 148
V + + + ++ Q + + +I+ + +
Sbjct: 184 VLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDD---F 240

Query: 149 RQARNSGGVGQQDYLALISEITAKTRALTQAE------EQATRQKAAFIRQLKEQATRQN 202
+ + + L ++ L + E + + + +
Sbjct: 241 SSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEI 300

Query: 203 LSSSELLRARAAQL 216
L L
Sbjct: 301 LDKLRQTTDNIGLL 314



Score = 38.7 bits (90), Expect = 1e-04
Identities = 25/224 (11%), Positives = 63/224 (28%), Gaps = 30/224 (13%)

Query: 545 NYQEQQKRRNAENAALNRMNETEAARHQREIARINAMQYADQAVRDAAIQRENERYEKAI 604
+ + + A L + + E+ ++ ++ D+ + E R I
Sbjct: 133 EADTLKTQSSLLQARLEQ-TRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLI 191

Query: 605 KKNTRATRNDEATRLLLQYSQQQAQVEGQIAAARQSAGIATERMTEAHKQLLALQQRISD 664
K+ + Q+ Q E + R R+ + R+ D
Sbjct: 192 KEQ------------FSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDD 239

Query: 665 LDGKKLTADEKSVLARKNELIQALTLLDVKQQELQKQTALNDLRKKTVQLTSQLADKERA 724
L + I +L+ + + ++ L + + Q+ S++ +
Sbjct: 240 F--SSL---------LHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEE 288

Query: 725 LREQHNLDIATAGMGDKQRQRYQAQLRIRQEYRQQLQQLENDSR 768
+ + + Q I +L + E +
Sbjct: 289 YQ-LVTQLFKN----EILDKLRQTTDNIGL-LTLELAKNEERQQ 326



Score = 32.5 bits (74), Expect = 0.009
Identities = 31/238 (13%), Positives = 69/238 (28%), Gaps = 42/238 (17%)

Query: 402 DPVNAAKALDNALHFLNATQLEQIRVLGEQGRSSDAARIAMSALAEETGKRTSDIDNNLN 461
+ A L +LEQ R RS + ++ L +E + + L
Sbjct: 128 TALGAEADTLKTQSSLLQARLEQTRYQILS-RSIELNKLPELKLPDEPYFQNVSEEEVLR 186

Query: 462 ALGSTLQTLSDWWKQFWDAAMNIGREDSLDAQIDALQEKIQRAKKYPWTNASTQVEYDQQ 521
+ S W Q + +N+ D A+ + +I R + ++
Sbjct: 187 LTSLIKEQFSTWQNQKYQKELNL---DKKRAERLTVLARINRYEN--------LSRVEKS 235

Query: 522 RLNDLQEKKRRKDLQDAKAQAERNYQEQQKRRNAENAALNRMNETEAARHQREIARINAM 581
RL+D L +A A+ EQ+ + L +++ +
Sbjct: 236 RLDDFSS------LLHKQAIAKHAVLEQENKYVEAVNELRVY-----------KSQLEQI 278

Query: 582 QYADQAVRDAAIQRENERYEKAIKKNTRATRNDEATRLLLQYSQQQAQVEGQIAAARQ 639
+ + ++ + + K + T + ++A +
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTT-------------DNIGLLTLELAKNEE 323



Score = 31.0 bits (70), Expect = 0.030
Identities = 26/185 (14%), Positives = 57/185 (30%), Gaps = 13/185 (7%)

Query: 13 IDAAEFKNEIPRIKNLLNGAASDAERSSARMQRFMERQTQAARQTMQAASSAATAASAHA 72
+ E + +E R+ ++ Q + A
Sbjct: 157 SRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAER 216

Query: 73 QTVEKNARAHERMAREVEQTRLRVDALNQKMREEQAQARALAEAQDKAAAAFYRQIDSVK 132
TV +E ++R VE++RL + +QA A+ Q+ ++
Sbjct: 217 LTVLARINRYENLSR-VEKSRL---DDFSSLLHKQAIAKHAVLEQENKYVEAVNEL---- 268

Query: 133 QAGAGLQELQRIQQQIRQARNSGGVGQQDYLALISEITAKTRA-LTQAEEQATRQKAAFI 191
+L++I+ +I A+ + Q + I + +T + + K
Sbjct: 269 --RVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLE--LAKNEER 324

Query: 192 RQLKE 196
+Q
Sbjct: 325 QQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0757     ENTEROVIROMP  134    2e-42    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 134 bits (339), Expect = 2e-42
Identities = 62/200 (31%), Positives = 98/200 (49%), Gaps = 30/200 (15%)

Query: 1 MRKLYAAILSAAICLAVSGAPAWASEHRSTLSAGYLHASTNAPGSDDLNGINVKYRYEFT 60
M+K+ AA+ +G A+ ST++ GY + + + G N+KYRYE
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAAT---STVTGGYAQSDAQGQMNK-MGGFNLKYRYEED 56

Query: 61 DT-LGLVTSFSYANAEDEQKTHYSDTRWHEDSVRNRWFSVMAGLSVRVNEWFSAYAMAGV 119
++ LG++ SF+Y T S T D +N+++ + AG + R+N+W S Y + GV
Sbjct: 57 NSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 AYSRVSTFSGDYLRVTDNKGKTHDVLTGSDDNRHSNTSLAWGAGVQFNPTESVAIDLAYE 179
Y + T T+ HD S+ ++GAG+QFNP E+VA+D +YE
Sbjct: 109 GYGKFQT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYE 151

Query: 180 GSGSGDWRTDGFIVGVGYKF 199
S +I GVGY+F
Sbjct: 152 QSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0758     FLAGELLIN     28     0.042    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 28.5 bits (63), Expect = 0.042
Identities = 14/93 (15%), Positives = 30/93 (32%)

Query: 110 VAQAQQSAGAAAGNAQQTAQDVAAAATARDDAQRFAEKARQDATVTAEDRKATAEDVTST 169
V Q + N D+ A + +++ A A + + +
Sbjct: 337 VVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFID 396

Query: 170 GANAAAAGQSAQDAAGYARAAEQAKNDIDAALT 202
+ + +DAA ++ ID+AL+
Sbjct: 397 KTASGVSTLINEDAAAAKKSTANPLASIDSALS 429


12  S0857  S4814  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
S0857          2       16  -0.464408  arginine transporter permease subunit ArtM
S0858          0       19  -7.711028  arginine transporter permease subunit ArtQ
S0859         -1       23  -8.677361  arginine 3rd transport system periplasmic
S0860          1       22  -7.209411  arginine transporter ATP-binding subunit
S0861          1       21  -6.449057  lipoprotein
S4813          1       18  -4.887754  hypothetical protein
S4814          1       14  -3.227214  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0860     PF05272       30     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.010
Identities = 9/18 (50%), Positives = 12/18 (66%)

Query: 31 LVLLGPSGAGKSSLLRVL 48
+VL G G GKS+L+ L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


13  S0898  S0959  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
S0898          2       21  -0.988224  MFS family transporter protein
S0899          3       23  -1.177570  ISSfl3 orfA
S0900          3       21  -1.208006  ISSfl3 orfB
S0901          4       20  -2.142385  ISSfl3 orfC,D
S0905          3       26  -3.238498  IS911 orfA
S0906          2       26  -2.386210  bacteriophage protein
S0907          3       27  -1.715735  bacteriophage protein
S0908          3       23  -2.235719  hypothetical protein
S0909          4       25  -1.809306  IS600 orfA
S0910          3       26  -2.141784  IS600 orfB
S0911          2       22  -2.272310  hypothetical protein
S0912          2       24  -2.359978  host cell-killing modulation protein
S0913          1       21  -2.623366  hypothetical protein
S0914          3       25  -2.333455  bacteriophage protein
S0917         -1       27  -0.647912  bacteriophage protein
S0918         -1       25   0.831383  bacteriophage protein
S0919         -1       26   0.370476  DNA-binding transcriptional regulator DicC
S0920          0       25  -0.030317  hypothetical protein
S0921          1       26  -0.469325  hypothetical protein
S0922          1       27  -0.245592  insertion element IS2 transposase InsD
S0923          3       25  -1.732747  insertion sequence 2 OrfA protein
S0924          2       26  -2.162707  bacteriophage protein
S0925          4       27  -2.684898  hypothetical protein
S0927          3       23  -1.716825  IS600 orfA
S0928          6       26  -3.092364  IS600 orfB
S0929          4       20  -2.440600  exodeoxyribonuclease VIII of prophage CP-933R
S0930          2       21  -2.651197  IS600 orfB
S0931          1       17  -2.485000  IS600 orfA
S0932          0       13  -1.916924  hypothetical protein
S0934          0       13  -1.695836  invasion plasmid antigen
S0935          1       12  -1.212989  6-phosphogluconolactonase
S0937          0       11  -1.586911  hypothetical protein
S0938         -1       13  -1.533509  membrane pump protein
S0939         -1       15  -1.755272  enzyme
S0940          2       25  -2.125335  pectinesterase
S0941          4       32  -2.897789  integrase encoded by prophage CP-933K; partial
S0942          4       28  -1.613844  IS600 orfB
S0943          5       29  -1.580462  IS600 orfA
S0944          4       24  -0.104820  lysozyme protein R of prophage CP-933K
S0945          4       27   0.791435  ISSfl3 orfA
S0946          4       25   1.011696  ISSfl3 orfB
S0947          5       24   0.739105  ISSfl3 orfC
S0948          3       21   0.461047  IS629 orfA
S0949          4       25   0.153183  IS629 orfB
S0950          5       27  -0.365467  hypothetical protein
S0952          4       30  -0.305726  IS1 orfA
S0953          3       30  -0.930806  IS1 orfB
S0956          3       24  -1.557169  IS600 orfA
S0957          6       32  -1.138508  IS600 orfB
S0958          4       32  -1.685372  IS1 orfB
S0959          2       29  -1.717362  IS600 orfB
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0901     RTXTOXIND     41     8e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 41.4 bits (97), Expect = 8e-06
Identities = 9/74 (12%), Positives = 33/74 (44%), Gaps = 1/74 (1%)

Query: 16 LRKQQSRLRQYACQVAGYEQEIERLKAQLDRLRRMLFGQSSEKKRHKLENQIRQAEKRLS 75
+ +Q+++ + ++ Y+ ++E++++++ + + K L+ ++RQ +
Sbjct: 254 VLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILD-KLRQTTDNIG 312

Query: 76 ELENRLNTARNLLE 89
L L +
Sbjct: 313 LLTLELAKNEERQQ 326


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0912     HOKGEFTOXIC   43     1e-10    Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 43.3 bits (102), Expect = 1e-10
Identities = 17/31 (54%), Positives = 20/31 (64%)

Query: 2 TALVTRKDLCEVRIRTGQTEVAVFVDYESRK 32
+TRK LCE+R R G EVA F+ YES K
Sbjct: 22 FTYLTRKSLCEIRYRDGYREVAAFMAYESGK 52


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0947     RTXTOXIND     40     3e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.8 bits (93), Expect = 3e-06
Identities = 9/74 (12%), Positives = 33/74 (44%), Gaps = 1/74 (1%)

Query: 16 LRKQQSRLRQYACQVAGYEQEIERLKAQLDRLRRMLFGQSSEKKRHKLENQIRQAEKRLS 75
+ +Q+++ + ++ Y+ ++E++++++ + + K L+ ++RQ +
Sbjct: 254 VLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILD-KLRQTTDNIG 312

Query: 76 ELENRLNTARNLLE 89
L L +
Sbjct: 313 LLTLELAKNEERQQ 326


14  S1002  S1008  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
S1002          1       20  -3.015855  IS1 orfB, A
S1004          1       22  -3.674845  IS1 orfB
S1005          0       24  -3.755093  IS1 orfA
S1006          0       22  -4.207889  outer membrane protein
S1007          0       22  -3.898759  fimbrial protein
S1008          0       21  -3.389975  fimbrial-like protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S1006     PF00577       822    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 822 bits (2125), Expect = 0.0
Identities = 414/862 (48%), Positives = 570/862 (66%), Gaps = 18/862 (2%)

Query: 15 GVPSFIGGLVVFVSAAFNAQAETWFDPAFFKDDPSMVADLSRFEKGQKITPGVYRVDIVL 74
G + F + A + AE +F+P F DDP VADLSRFE GQ++ PG YRVDI L
Sbjct: 25 GFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYL 84

Query: 75 NQTIVDTRNVNFVEITPEKGIAACLTTESLDAMGVNTDAFPAFKQLDKQVCVPLAEIIPD 134
N + TR+V F E+GI CLT L +MG+NT + L CVPL +I D
Sbjct: 85 NNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHD 144

Query: 135 ASVTFNVNKLRLEISVPQIAIKSNARGYVPPERWDEGINALLLGYSFSGANSIHSSADSD 194
A+ +V + RL +++PQ + + ARGY+PPE WD GINA LL Y+FSG + + +
Sbjct: 145 ATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNS 204

Query: 195 SGDSYFLNLNSGVNLGPWRLRNNSTWSR-----SSGQTAEWKNLSSYLQRAVIPLKGELT 249
+LNL SG+N+G WRLR+N+TWS SSG +W++++++L+R +IPL+ LT
Sbjct: 205 --HYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLT 262

Query: 250 VGDDYTAGDFFDSVSFRGVQLASDDNMLPDSLKGFAPVVRGIAKSNAQITIKQNGYTIYQ 309
+GD YT GD FD ++FRG QLASDDNMLPDS +GFAPV+ GIA+ AQ+TIKQNGY IY
Sbjct: 263 LGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYN 322

Query: 310 TYVSPGAFEISDLYSTSSSGDLLVEIKEADGSVNSYSVPFSSVPLLQRQGRIKYAVTLAK 369
+ V PG F I+D+Y+ +SGDL V IKEADGS ++VP+SSVPLLQR+G +Y++T +
Sbjct: 323 STVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGE 382

Query: 370 YRTNSNEQQESKFAQTTLQWGGPWGTTWYGGGQYAEYYRAAMFGLGFNLGDFGAISFDAT 429
YR+ + +Q++ +F Q+TL G P G T YGG Q A+ YRA FG+G N+G GA+S D T
Sbjct: 383 YRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMT 442

Query: 430 QAKSTLADQSEHKGQSYRFLYAKTLNQLGTNFQLMGYRYSTSGFYTLSDTMYKHMDGY-- 487
QA STL D S+H GQS RFLY K+LN+ GTN QL+GYRYSTSG++ +DT Y M+GY
Sbjct: 443 QANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNI 502

Query: 488 EFNDGDDEDTPMWSRYYNLFYTKRGKLQVNISQQLGEYGSFYLSGSQQTYWHTDQQDRLL 547
E DG + P ++ YYNL Y KRGKLQ+ ++QQLG + YLSGS QTYW T D
Sbjct: 503 ETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQF 562

Query: 548 QFGYNTQIKDLSLGVSWNYSKSRGQPDADQVFALNFSLPLNLLLPRSNDSYTRKKNYAWM 607
Q G NT +D++ +S++ +K+ Q DQ+ ALN ++P + L + S R +A
Sbjct: 563 QAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWR---HASA 619

Query: 608 TSNTSIDNEGHITQNLGLTETLLDDGNLSYSVQQGYNSEGKTANGS---ASMDYKGAFAD 664
+ + S D G +T G+ TLL+D NLSYSVQ GY G +GS A+++Y+G + +
Sbjct: 620 SYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGN 679

Query: 665 ARVGYNYSDNGSQQQLNYALSGSLVAHSQGITLGQSLGETNVLIAAPGAENTRVANSTGL 724
A +GY++SD+ +QL Y +SG ++AH+ G+TLGQ L +T VL+ APGA++ +V N TG+
Sbjct: 680 ANIGYSHSDD--IKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGV 737

Query: 725 KTDWRGYTVVPYATSYRENRIALDAASLKRNVDLENAVVNVVPTKGALVLAEFNAHAGAR 784
+TDWRGY V+PYAT YRENR+ALD +L NVDL+NAV NVVPT+GA+V AEF A G +
Sbjct: 738 RTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIK 797

Query: 785 VLMKTSKQGIPLRFGAIATLDGIQTNSGIIDDDGSLYMSGLPAQGAITVRWGEAPDQICH 844
+LM + PL FGA+ T + +SGI+ D+G +Y+SG+P G + V+WGE + C
Sbjct: 798 LLMTLTHNNKPLPFGAMVTSES-SQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCV 856

Query: 845 ISYQLTEQQINSAITRMDAICR 866
+YQL + +T++ A CR
Sbjct: 857 ANYQLPPESQQQLLTQLSAECR 878


15  S1018  S1023  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
S1018          3       16  -0.439807  hypothetical protein
S1019          2       17  -0.956688  ribosome modulation factor
S1020          2       16  -1.449498  3-hydroxydecanoyl-ACP dehydratase
S1021          2       18  -1.434860  ATP-dependent protease
S1022          4       24  -0.917369  hypothetical protein
S1023          3       16  -0.192655  outer membrane protein A
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S1023     OUTRMMBRANEA  599    0.0      Outer membrane protein A signature.
		>OUTRMMBRANEA#Outer membrane protein A signature.

Length = 346

Score = 599 bits (1545), Expect = 0.0
Identities = 335/350 (95%), Positives = 339/350 (96%), Gaps = 6/350 (1%)

Query: 1 MKKTAIAIAVALAGFATVAQAAPKDNTWYTGAKLGWSQYHDTGFIPNNGPTHENQLGAGA 60
MKKTAIAIAVALAGFATVAQAAPKDNTWYTGAKLGWSQYHDTGFI NNGPTHENQLGAGA
Sbjct: 1 MKKTAIAIAVALAGFATVAQAAPKDNTWYTGAKLGWSQYHDTGFINNNGPTHENQLGAGA 60

Query: 61 FGGYQVNPYVGFEMGYDWLGRMPYKGDNINGAYKAQGVQLTAKLGYPITDDLDIYTRLGG 120
FGGYQVNPYVGFEMGYDWLGRMPYKG NGAYKAQGVQLTAKLGYPITDDLDIYTRLGG
Sbjct: 61 FGGYQVNPYVGFEMGYDWLGRMPYKGSVENGAYKAQGVQLTAKLGYPITDDLDIYTRLGG 120

Query: 121 MVWRADTKANVPGGASFKDHDTGVSPVFAGGVEYAITPEIATRLEYQWTNNIGDANTIGT 180
MVWRADTK+NV G K+HDTGVSPVFAGGVEYAITPEIATRLEYQWTNNIGDA+TIGT
Sbjct: 121 MVWRADTKSNVYG----KNHDTGVSPVFAGGVEYAITPEIATRLEYQWTNNIGDAHTIGT 176

Query: 181 RPDNGLLSLGVSYRFGQGEAAPVVAPAP--APEVQTKHFTLKSDVLFNFNKATLKPEGQA 238
RPDNG+LSLGVSYRFGQGEAAPVVAPAP APEVQTKHFTLKSDVLFNFNKATLKPEGQA
Sbjct: 177 RPDNGMLSLGVSYRFGQGEAAPVVAPAPAPAPEVQTKHFTLKSDVLFNFNKATLKPEGQA 236

Query: 239 ALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYLISKGIPADKIS 298
ALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYLISKGIPADKIS
Sbjct: 237 ALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYLISKGIPADKIS 296

Query: 299 ARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVKGIKDVVTQPQA 348
ARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVKGIKDVVTQPQA
Sbjct: 297 ARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVKGIKDVVTQPQA 346


16  S1066  S1080  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias    %GCBias  Product
S1066         -1       18  -5.086420  trimethylamine N-oxide reductase, cytochrome
S1069         -1       18  -5.031321  IS1 orfB
S1070         -1       20  -5.735316  IS1 orfA
S1072          0       19  -5.102332  chaperone-modulator protein CbpM
S1073          0       20  -5.497039  curved DNA-binding protein CbpA
S1074          0       19  -4.536089  hypothetical protein
S1075          1       16   1.134927  glucose-1-phosphatase/inositol phosphatase
S1076          1       19   2.228731  hypothetical protein
S1077          0       17   3.309971  TrpR binding protein WrbA
S1078          0       18   3.325445  hypothetical protein
S1079         -1       17   3.739524  transporter
S1080         -2       18   3.973460  hypothetical protein
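
In the island summaries of this listing, the Prediction label tracks the Virulence flag: every island with Bias = Y and Virulence = Y is reported as "Pathogenicity Island (biased-composition)", while islands 16 and 17 (Virulence = N) are reported as "Genomic Island". The sketch below encodes only that observed pattern. It is an inference from these rows, not PredictBias's documented decision rule, and `predicted_label` is a hypothetical helper.

```python
from typing import Optional

# (S.No, Bias, Virulence, Insertion elements, Prediction) as listed above.
OBSERVED = [
    (9, "Y", "Y", "Y", "Pathogenicity Island (biased-composition)"),
    (10, "Y", "Y", "Y", "Pathogenicity Island (biased-composition)"),
    (11, "Y", "Y", "Y", "Pathogenicity Island (biased-composition)"),
    (12, "Y", "Y", "N", "Pathogenicity Island (biased-composition)"),
    (13, "Y", "Y", "Y", "Pathogenicity Island (biased-composition)"),
    (14, "Y", "Y", "Y", "Pathogenicity Island (biased-composition)"),
    (15, "Y", "Y", "N", "Pathogenicity Island (biased-composition)"),
    (16, "Y", "N", "Y", "Genomic Island"),
    (17, "Y", "N", "Y", "Genomic Island"),
    (18, "Y", "Y", "Y", "Pathogenicity Island (biased-composition)"),
]


def predicted_label(bias: bool, virulence: bool) -> Optional[str]:
    """Label implied by the observed rows; None where no row covers the case."""
    if not bias:
        return None  # no Bias = N island appears in this listing
    if virulence:
        return "Pathogenicity Island (biased-composition)"
    return "Genomic Island"
```

The Insertion elements flag does not change the label in any of these rows, which is why the helper ignores it.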
17  S1094  S1140  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias    %GCBias  Product
S1094          1       19  -4.898279  IS3 orfA
S1096          0       20  -5.634858  hypothetical protein
S1101         -1       19  -4.916798  *hydrolase
S1102          1       25  -5.907274  oxidoreductase component
S1103          3       29  -7.415486  hypothetical protein
S1105          1       29  -7.324902  curli assembly protein CsgF
S1106          2       33  -7.119062  curli assembly protein CsgE
S1107          2       30  -6.889917  DNA-binding transcriptional regulator CsgD
S1108          3       29  -4.599353  curlin minor subunit
S1110          0       24  -2.698563  IS600
S1111          0       23  -3.984952  IS600 ORF2
S1113          0       17  -4.164304  autoagglutination protein
S1114         -2       12  -1.710610  hypothetical protein
S1115         -1       13  -1.105757  hypothetical protein
S1116         -2       13  -1.145654  synthase
S1117         -1       16  -2.609978  glucans biosynthesis protein
S1118          0       19  -2.066475  glucan biosynthesis protein G
S1119          1       21  -1.493777  glucosyltransferase MdoH
S1120          4       30  -3.660518  IS91 orf
S4816          3       35  -5.558356  hypothetical protein
S4815          4       34  -4.597841  hypothetical protein
S4817          4       25  -2.384286  hypothetical protein
S1123          3       24  -0.851562  IS629 orfB
S1124          2       26  -0.298105  IS629 orfA
S1126          2       26  -0.331405  IS911 orfA
S1127          3       26   0.833738  IS911 orfB
S1128          2       25  -0.110603  hypothetical protein
S1129          2       26   1.780542  insertion sequence 2 OrfA protein
S1130          3       25   1.300469  insertion element IS2 transposase InsD
S1131          6       27  -2.418618  IS629 orfA
S1132          3       21  -2.041917  IS629 orfB
S1133         -1       18  -2.915284  IS600 orfA
S1134         -1       15  -3.096334  IS600 orfB
S1135         -3       14  -3.581278  IS600 orfB
S1136         -3       15  -3.749841  hypothetical protein
S1138          2       17  -1.684121  lipid A biosynthesis lauroyl acyltransferase
S1139          1       17  -2.620023  hypothetical protein
S1140          2       22  -2.786371  hypothetical protein
18  S1237  S1244  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S1237     0       17       -3.758773  23S rRNA pseudouridine synthase E
S1238     0       16       -4.265006  isocitrate dehydrogenase
S1239     0       22       -4.547524  IS1 orfB
S1240     1       24       -4.977183  IS1 orfA
S1241     0       25       -5.497676  hypothetical protein
S1242     0       22       -4.337503  hypothetical protein
S1244     1       20       -3.170711  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S1242     PRTACTNFAMLY  97     7e-23    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 97.4 bits (242), Expect = 7e-23
Identities = 142/655 (21%), Positives = 226/655 (34%), Gaps = 93/655 (14%)

Query: 137 DVDITTHGDNAHAIAARQGTVSFNQGEIYTTGPDAAIAKIYNGGTVTLKNTSAVAHQGSG 196
D + + +V Q + AAI + G VT+ S A G+
Sbjct: 285 PGGFGPVLDGWYGVDVSGSSVELAQSIVEAPELGAAIR-VGRGARVTVSGGSLSAPHGNV 343

Query: 197 IVLESSIN--GQEATVDILSGSSLRSANEILYHKDETSNVTITDSEVSSAADVFINNIKG 254
I + Q A + I + + + L ++ V +T ++ AD + +
Sbjct: 344 IETGGARRFAPQAAPLSITLQAGAHAQGKALLYRVLPEPVKLT---LTGGADAQGDIVAT 400

Query: 255 HLTVDATNSKITGSANISTDDN------THTYLSLS-DNSTWDIKADSTVSNLTV--DNS 305
L S G +++ T SLS DN+TW + +S V L + D S
Sbjct: 401 ELPSIPGTS--IGPLDVALASQARWTGATRAVDSLSIDNATWVMTDNSNVGALRLASDGS 458

Query: 306 TVYISRADGRDVEPTRLTITENYVGNNGVLHLRTELDDDNSATDKVVINGNTSGTTRVKV 365
+ A+ + +T N + +G+ + D +DK+V+ + SG R+ V
Sbjct: 459 VDFQQPAEAGRFK----VLTVNTLAGSGLFRMNVFAD--LGLSDKLVVMQDASGQHRLWV 512

Query: 366 TNAGGSGAYTLNGIEIISVEGESNGEFI---KDSRIFAGAYEYSLTRGNTEATNKNWYLT 422
N+G S + N + ++ S F KD ++ G Y Y L N W L
Sbjct: 513 RNSG-SEPASANTLLLVQTPLGSAATFTLANKDGKVDIGTYRYRLA----ANGNGQWSLV 567

Query: 423 NFQAT-------SGGETNSGGSSAPTVAPTPVLRPEAGSYVANLAAANTLFVMRLNDRAG 475
+A G AP P A AA NT V A
Sbjct: 568 GAKAPPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAGRELSAAANAAVNTGGV----GLAS 623

Query: 476 ETRYIDPVTEQERSSRLWLRQIGGHNAWRDSNGQLRTTSHRY-------VS--QLGGDLL 526
Y + +R L L G AW Q + +R V+ +LG D
Sbjct: 624 TLWYAESNALSKRLGELRLNPDAG-GAWGRGFAQRQQLDNRAGRRFDQKVAGFELGADHA 682

Query: 527 TGGFTDSDSWRLGVMAGYARDYNLTHSSVSDYRSKGSVRGYSAGLYATWFADDISKKGAY 586
W LG +AGY R + G G YAT+ AD G Y
Sbjct: 683 VAVAGGR--WHLGGLAGYTR----GDRGFTGDGG-GHTDSVHVGGYATYIADS----GFY 731

Query: 587 IDSWAQYSWFKN----------SVKGDELAYESYSAKGATVSLEAGYGFALNKSFGLEAA 636
+D+ + S +N +VKG Y G SLEAG F
Sbjct: 732 LDATLRASRLENDFKVAGSDGYAVKGK------YRTHGVGASLEAGRRFTHADG------ 779

Query: 637 KYTWIFQPQAQAIWMGVDHNAHTEANGSRIENDANNNIQTRLGFRTFIRTQEKNSGPHGD 696
W +PQA+ A+ ANG R+ ++ +++ RLG R + G
Sbjct: 780 ---WFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEVGKRIELAG----GR 832

Query: 697 DFEPFVEMNWIHNSK-DFAVSMNGVKVEQDGVSNLGEIKLGVNGNLNPAASVWGN 750
+P+++ + + V NG+ + E+ LG+ L S++ +
Sbjct: 833 QVQPYIKASVLQEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRGHSLYAS 887


19  S1333  S1344  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S1333     0       18       -3.284219  oligopeptide transporter ATP-binding component
S1334     -1      18       -3.423174  ATP-binding protein of oligopeptide ABC
S1335     -1      20       -4.128831  dsDNA-mimic protein
S1336     -2      21       -3.132324  cardiolipin synthetase
S1337     -2      24       -4.216128  voltage-gated potassium channel
S1338     1       21       -2.094414  IS600 orfA
S1339     0       16       -1.947839  IS600 orfB
S1340     2       18       -2.242726  YciI-like protein
S1341     0       20       -3.505718  transporter
S1342     0       24       -5.756401  acyl-CoA thioester hydrolase
S1343     0       24       -5.906039  intracellular septation protein A
S1344     0       21       -4.090586  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S1334     HTHFIS  31     0.008    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.6 bits (69), Expect = 0.008
Identities = 9/16 (56%), Positives = 11/16 (68%)

Query: 55 VVGESGCGKSTFARAI 70
+ GESG GK ARA+
Sbjct: 165 ITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
S1340     adhesinmafb  31     4e-04    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 31.2 bits (70), Expect = 4e-04
Identities = 16/57 (28%), Positives = 20/57 (35%), Gaps = 2/57 (3%)

Query: 41 GPMPAVDSNDPGAAGFTGSTVIAEFESLEAAQAWADADPYVAAGVYEHVSVKPFKKV 97
P+PA G GS E + EA W +P A V +V KV
Sbjct: 268 APLPA--EGKFAVIGGLGSVAGFEKNTREAVDRWIQENPNAAETVEAVFNVAAAAKV 322


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
S1341     TONBPROTEIN  256    2e-88    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 256 bits (654), Expect = 2e-88
Identities = 236/239 (98%), Positives = 237/239 (99%)

Query: 4 ITLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVAPADLEPPQA 63
+TLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMV PADLEPPQA
Sbjct: 1 MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADLEPPQA 60

Query: 64 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 123
VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR
Sbjct: 61 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 120

Query: 124 PASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF 183
PASPFENTAPAR TSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF
Sbjct: 121 PASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF 180

Query: 184 DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ 242
DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ
Sbjct: 181 DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ 239


20  S1384  S1406  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S1384     -1      15       3.244500   glutamine synthetase
S1385     0       16       3.253432   gamma-glutamyl-gamma-aminobutyrate hydrolase
S1386     1       17       2.947399   DNA-binding transcriptional repressor PuuR
S1387     2       18       3.026689   gamma-glutamyl-gamma-aminobutyraldehyde
S1388     1       15       1.982581   oxidoreductase
S1389     3       14       0.797896   4-aminobutyrate transaminase
S1390     2       13       -1.392015  phage shock protein operon transcriptional
S1391     2       14       -1.784286  phage shock protein PspA
S1392     1       16       -0.739989  phage shock protein B
S1393     0       17       0.487760   DNA-binding transcriptional activator PspC
S1394     -2      20       0.730232   peripheral inner membrane phage-shock protein
S1395     -1      21       1.422576   thiosulfate:cyanide sulfurtransferase
S1397     0       23       1.861699   IS1 orfA
S1398     0       18       1.753234   IS1 orfB
S1400     1       17       2.105392   binding-protein dependent transport protein
S1401     2       17       1.678222   transport system permease
S1402     2       16       0.635109   oxidoreductase
S1404     2       16       0.589864   dehydrogenase
S1405     3       16       0.764305   hypothetical protein
S1406     2       17       0.321080   beta-phosphoglucomutase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S1390     HTHFIS  342    e-118    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 342 bits (879), Expect = e-118
Identities = 126/341 (36%), Positives = 183/341 (53%), Gaps = 23/341 (6%)

Query: 6 DNLLGEANSFLEVLEQVSHLAPLDKPVLIIGERGTGKELIASRLHYLSSRWQGPFISLNC 65
L+G + + E+ ++ L D ++I GE GTGKEL+A LH R GPF+++N
Sbjct: 137 MPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINM 196

Query: 66 AALNENLLDSELFGHEAGAFTGAQKRHPGRFERADGGTLFLDELATAPMMVQEKLLRVIE 125
AA+ +L++SELFGHE GAFTGAQ R GRFE+A+GGTLFLDE+ PM Q +LLRV++
Sbjct: 197 AAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQ 256

Query: 126 YGELERVGGSQPLQVNVRLVCATNADLPAMVNEGTFRADLLDRLAFDVVQLPPLRERESD 185
GE VGG P++ +VR+V ATN DL +N+G FR DL RL ++LPPLR+R D
Sbjct: 257 QGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAED 316

Query: 186 IMLMAEHFAIQMCREIKLPLFPGFTERARETLLNYRWPGNIRELKNVVERSVYRHGTSDY 245
I + HF Q +E F + A E + + WPGN+REL+N+V R +
Sbjct: 317 IPDLVRHFVQQAEKEGLDVK--RFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVI 374

Query: 246 PLDDIIID---PFKRRPPEEAIAVSENTSLPTLPLD------------------LREFQM 284
+ I + P E+A A S + S+ +
Sbjct: 375 TREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLA 434

Query: 285 QQEKELLQLSLQQGKYNQKRAAELLGLTYHQFRALLKKHQI 325
+ E L+ +L + NQ +AA+LLGL + R +++ +
Sbjct: 435 EMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S1392     MPTASEINHBTR  25     0.041    Metalloprotease inhibitor signature.
		>MPTASEINHBTR#Metalloprotease inhibitor signature.

Length = 122

Score = 24.6 bits (53), Expect = 0.041
Identities = 6/43 (13%), Positives = 16/43 (37%)

Query: 30 SGRSELSQSEQQRLAQLVDEAKRMRERIQALESILDAEHPNWR 72
+G+ + + A ++A + + E L + +W
Sbjct: 37 AGQLGIEATGSGVCAGPAEQANALAGDVACAEQWLGDKPVSWS 79


21  S1421  S1427  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
S1421     -1      18       -4.830307  LYSR-type transcriptional regulator
S1422     -1      15       -4.098822  IS600 orfB
S1423     0       15       -3.838053  IS600 orfA
S1424     0       15       -3.318074  transport periplasmic protein
S1425     1       17       -1.833033  hypothetical protein
S1426     2       19       0.372813   IS911 orfA
S1427     2       19       0.655995   IS91 orf
22  S1459  S1499  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
S1459     -3      21       -4.894245  D-lactate dehydrogenase
S1461     -1      28       -4.304112  IS1 orfB
S1462     -1      27       -4.244952  IS1 orfA
S1463     -2      28       -4.573968  hypothetical protein
S1464     -1      22       -3.195736  hypothetical protein
S1465     -1      10       -0.613309  phosphatidate cytidylyltransferase
S1466     -1      10       0.223864   hypothetical protein
S1468     0       10       0.299277   IS600 orfB
S1469     -1      10       0.045208   IS600 orfA
S1471     -1      11       0.105567   azoreductase
S1472     -1      9        -0.896197  ATP-dependent RNA helicase HrpA
S1473     1       23       -3.854719  IS600 orfB
S1475     0       21       -3.608206  aldehyde dehydrogenase A
S1478     0       27       -4.417664  cytochrome b561
S1479     0       25       -3.320395  hypothetical protein
S1482     -2      20       -2.763704  IS103 orf
S1483     -3      18       -1.964101  IS103 orf
S1485     -3      18       -1.817990  methyl-accepting chemotaxis protein III, ribose
S1486     -2      18       -2.675262  LYSR-type transcriptional regulator
S1487     -1      14       -2.601405  hypothetical protein
S1488     -2      15       -2.730592  glucan biosynthesis protein D
S1489     -2      20       -3.952863  hypothetical protein
S1490     -1      18       -4.502933  hypothetical protein
S1491     2       15       -2.556291  ribosomal-protein-L7/L12-serine
S1492     2       16       -1.812013  hypothetical protein
S1493     3       14       -1.134790  potassium-tellurite ethidium and proflavin
S1494     4       15       -1.451436  tellurite resistance protein TehB
S1495     3       16       -0.999339  hypothetical protein
S1497     4       25       0.504534   IS91 orf
S1498     1       23       -1.587857  IS91 orf
S1499     2       22       -1.823247  IS1 orfA
23  S1554  S1567  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S1554     2       19       -1.355818  IS600 orfB
S1555     1       19       -1.769184  hypothetical protein
S1556     0       23       -2.077954  hypothetical protein
S1557     0       20       -2.473805  aldehyde reductase
S1558     -1      17       -2.760121  hypothetical protein
S1559     -1      19       -3.810205  glyceraldehyde-3-phosphate dehydrogenase
S1560     0       23       -4.819877  methionine sulfoxide reductase B
S1561     1       22       -4.790366  hypothetical protein
S1562     0       23       -5.337331  oxidoreductase
S1563     1       25       -6.208676  transporter
S1565     1       23       -6.108132  aldolase
S1566     1       20       -5.030380  kinase
S1567     -1      15       -3.369630  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
S1558     INVEPROTEIN  29     0.021    Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 28.9 bits (64), Expect = 0.021
Identities = 18/81 (22%), Positives = 34/81 (41%), Gaps = 13/81 (16%)

Query: 158 ETTSALHTYFNVGDIAKVSVSGLGDRFIDKVNDAKED-----------VLTDGIQTFPDR 206
E ++AL + N D K S S L + F ++V + + V ++ F +
Sbjct: 57 EMSAALAQFRNRRDYEKKS-SNLSNSF-ERVLEDEALPKAKQILKLISVHGGALEDFLRQ 114

Query: 207 TDRVYLNPQDCSVINDEALNR 227
++ +P D ++ E L R
Sbjct: 115 ARSLFPDPSDLVLVLRELLRR 135


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S1563     TCRTETB  29     0.041    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.1 bits (65), Expect = 0.041
Identities = 32/142 (22%), Positives = 47/142 (33%), Gaps = 23/142 (16%)

Query: 71 MFLGALVGGIIGDKTGRRNAFILYEAIHIASMVVGAFSPNMDF-LIACRFVMDVGLGALL 129
+G V G + D+ G + + I+ V+G + LI RF+ G A
Sbjct: 62 FSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFP 121

Query: 130 VTLFAGFTEYMPGRNR----GTWSSRVSFIGNWSYPLCSLIAMGLTPLISA----EWNWR 181
+ Y+P NR G S V+ + G+ P I +W
Sbjct: 122 ALVMVVVARYIPKENRGKAFGLIGSIVA------------MGEGVGPAIGGMIAHYIHWS 169

Query: 182 VQLLIPAILSLIATALAWRYFP 203
LLIP I I T
Sbjct: 170 YLLLIPMI--TIITVPFLMKLL 189


24  S1591  S1605  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S1591     2       15       1.482873   hypothetical protein
S1592     2       13       1.726848   hypothetical protein
S1593     1       11       2.127041   hypothetical protein
S1594     2       12       2.619822   exonuclease III
S1596     2       14       2.243865   arginine succinyltransferase
S1597     2       13       1.001839   succinylglutamic semialdehyde dehydrogenase
S1599     0       14       -1.111535  succinylglutamate desuccinylase
S1600     1       17       -2.474637  periplasmic protein
S1601     0       19       -2.803229  hypothetical protein
S1602     -1      19       -2.661634  nucleotide excision repair endonuclease
S1603     -1      17       -2.842868  NAD synthetase
S1604     -1      16       -3.898608  DNA-binding transcriptional activator OsmE
S1605     0       19       -3.559658  N,N'-diacetylchitobiose-specific PTS system
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S1597     DNABINDINGHU  31     0.002    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 31.2 bits (71), Expect = 0.002
Identities = 14/61 (22%), Positives = 28/61 (45%), Gaps = 5/61 (8%)

Query: 74 SNKAELTAIIARETGKPRWEAATEVTAMINKIAISIKAYHVRTGEQRSEMPDGAASLRHR 133
+NK +L A +A T + ++A V A+ + ++ + GE+ + G +R R
Sbjct: 2 ANKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAK-----GEKVQLIGFGNFEVRER 56

Query: 134 P 134

Sbjct: 57 A 57


25  S1650  S1669  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S1650     0       16       -4.622669  electron transfer flavoprotein subunit YdiR
S1651     0       22       -4.480855  electron transfer flavoprotein YdiQ
S1652     0       24       -4.818833  ARAC-type regulatory protein
S1653     1       22       -4.064709  IS1
S1654     0       20       -3.668553  transcriptional regulator YdeO
S1655     2       18       -1.731857  hypothetical protein
S1656     2       18       -0.866056  oxidoreductase
S1658     4       25       0.494078   IS1 orfA
S1659     3       29       -0.990031  IS1 orfB
S1660     3       30       -1.298879  IS600 orfB
S1661     2       30       -2.770737  lysis protein S of prophage CP-933V
S4818     1       29       -3.819283  hypothetical protein
S1665     2       25       -2.656258  **endodeoxyribonuclease RUS (Holliday junction
S1667     3       22       -3.957355  hypothetical protein
S1668     5       27       -1.869214  prophage maintenance protein
S1669     4       25       -2.475483  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
S1668     HOKGEFTOXIC  61     5e-17    Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 60.6 bits (147), Expect = 5e-17
Identities = 19/46 (41%), Positives = 32/46 (69%)

Query: 4 QKAMLIALIVICLTVIVTALVTRKDLCEVRIRTGQTEVAVFTAYEP 49
+ +++ ++++CLT+++ +TRK LCE+R R G EVA F AYE
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50


26  S1687  S1694  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S1687     2       17       -2.817208  transporter
S1688     2       20       -2.252383  O-acetylserine/cysteine export protein
S1689     1       21       -4.992712  hypothetical protein
S1690     0       20       -4.194209  DNA-binding transcriptional activator MarA
S1691     0       19       -3.644481  DNA-binding transcriptional repressor MarR
S1692     -1      19       -3.300860  multiple drug resistance protein MarC
S1693     -2      14       -2.756426  sugar efflux transporter
S1694     -3      14       -3.429402  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S1687     TCRTETA  39     2e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.0 bits (91), Expect = 2e-05
Identities = 38/222 (17%), Positives = 73/222 (32%), Gaps = 21/222 (9%)

Query: 1 MATLPFMTIYLSRQYSLSVDLI---GYAMTIALTIGVVFSLGFGILADKFDKKRYMLLAI 57
M LP + R S D+ G + + + + G L+D+F ++ +L+++
Sbjct: 25 MPVLPGLL----RDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSL 80

Query: 58 TAFASGFIAIPLVNNVTLVVLFFALINCAYSVFATVLKAWFADNLSSTSKTKIFSINYTM 117
A + + + ++ + + + A A+ AD + + F
Sbjct: 81 AGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAG-AYIADITDGDERARHFGFMSAC 139

Query: 118 LNIGWTIGPPLGTLLVMQSINLPFWLAAICSAFPMLFIQIWVKRSEK---------IIAT 168
G GP LG L+ S + PF+ AA + L + S K +
Sbjct: 140 FGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNP 199

Query: 169 ETGSVWSPKVLLQDKALLWFTCSGFLASFVSGAFASCISQYV 210
W + A L F+ V A+ +
Sbjct: 200 LASFRW--ARGMTVVAALMAV--FFIMQLVGQVPAALWVIFG 237



Score = 29.8 bits (67), Expect = 0.017
Identities = 19/130 (14%), Positives = 52/130 (40%), Gaps = 2/130 (1%)

Query: 9 IYLSRQYSLSVDLIGYAMTIALTIGVVF-SLGFGILADKFDKKRYMLLAITAFASGFIAI 67
I+ ++ IG ++ + + ++ G +A + ++R ++L + A +G+I +
Sbjct: 235 IFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILL 294

Query: 68 PLVNNVTLVVLFFALINCAYSVFATVLKAWFADNLSSTSKTKIFSINYTMLNIGWTIGPP 127
+ L+ + L+A + + + ++ + ++ +GP
Sbjct: 295 AFATRGWMAFPIMVLLASG-GIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPL 353

Query: 128 LGTLLVMQSI 137
L T + SI
Sbjct: 354 LFTAIYAASI 363


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S1693     TCRTETB  53     7e-10    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 53.3 bits (128), Expect = 7e-10
Identities = 40/192 (20%), Positives = 83/192 (43%), Gaps = 8/192 (4%)

Query: 36 LSDIAQSFHMQTAQVGIMLTIYAWVVALMSLPFMLMTSQVERRKLLICLFVVFIASHVLS 95
L DIA F+ A + T + ++ + + ++ Q+ ++LL+ ++ V+
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 96 FLSWS-FTVLVISRIGVAFAHAIFWSITASLAIRMAPAGKRAQALSLIATGTALAMVLGL 154
F+ S F++L+++R A F ++ + R P R +A LI + A+ +G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 155 PLGRIVGQYFGWRMTFFAIGIGALVTLLCLIKLLPLLPSEHSGSLKSLPLLFRRPALMSI 214
+G ++ Y W I + ++T+ L+KLL + LMS+
Sbjct: 157 AIGGMIAHYIHWSY-LLLIPMITIITVPFLMKLLK------KEVRIKGHFDIKGIILMSV 209

Query: 215 YLLTVVVVTAHY 226
++ ++ T Y
Sbjct: 210 GIVFFMLFTTSY 221


27  S1719  S1739  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S1719     5       25       -1.202756  hypothetical protein
S1720     8       30       -0.986047  hypothetical protein
S1722     11      34       -0.961435  hypothetical protein
S1723     13      41       1.000754   hypothetical protein
S1727     12      38       0.891578   hypothetical protein
S1728     9       32       0.030733   hypothetical protein
S1730     12      34       -0.264387  hypothetical protein
S1731     7       27       -0.164440  hypothetical protein
S1732     4       23       -0.008420  hypothetical protein
S1736     3       19       0.033006   IS911 orfB
S1738     1       19       0.367715   IS911 orfA
S1739     2       15       0.768509   integrase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S1728     PYOCINKILLER  27     0.048    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 26.7 bits (58), Expect = 0.048
Identities = 11/31 (35%), Positives = 16/31 (51%)

Query: 78 AYGRRQMNKRAEAERIAKEQRLQAERMREEN 108
A + + AEA+R A+EQ Q +R N
Sbjct: 218 AANKAREQAAAEAKRKAEEQARQQAAIRAAN 248


28  S1850  S1859  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S1850     2       17       -3.684262  inner membrane protein
S1851     1       19       -4.757705  hypothetical protein
S1852     1       18       -4.345257  transport system permease
S1853     0       19       -5.948307  amino acid/amine transport protein
S1854     0       17       -6.120030  quinate/shikimate dehydrogenase
S1855     0       16       -5.772728  3-dehydroquinate dehydratase
S1857     2       17       -6.101418  IS1 orfB
S1856     2       18       -6.387254  IS1 orfA
S1859     2       17       -6.002973  enzyme
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S1852     TCRTETA  31     0.007    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 31.3 bits (71), Expect = 0.007
Identities = 63/331 (19%), Positives = 113/331 (34%), Gaps = 19/331 (5%)

Query: 41 FAGLLSDRFGRRPFIMLGMCCYMAFFFDILQTNNIIIAYVFGFLAGMANSFLDAGTYPSL 100
G LSDRFGRRP +++ + + + + + Y+ +AG+ + A +
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATG-AVAGAYI 120

Query: 101 MEAFPRSPGTANI-LIKAFVSSGQFLLPLIISLLVWAELWFGWSFMIAAGIMFINALFLY 159
+ + + A G P++ L+ F AA + +N L
Sbjct: 121 ADITDGDERARHFGFMSACFGFGMVAGPVLGGLM--GGFSPHAPFFAAAALNGLNFLTGC 178

Query: 160 RCTFPPHPGRRLPV---IKKTTSSTEHRCSIIDLASYTLYGYISMATFYLVSQWLAQYGQ 216
H G R P+ +S + +A+ +I + + +G+
Sbjct: 179 FLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGE 238

Query: 217 FVAGMSYTM-SIKLLSIYTVGSLLCVFITAPLIRNTVRPTTLLMLYTFISFIALFTVCLH 275
T I L + + SL IT P+ L++ IA T +
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALML-----GMIADGTGYIL 293

Query: 276 PTFYVVIIFAF-VIGFTSAGGVVQIGLTLMAERF--PYAKGKATGIYYSTGSIATFTIPL 332
F AF ++ ++GG+ L M R +G+ G + S+ + PL
Sbjct: 294 LAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPL 353

Query: 333 ITAHLSQRSIA---DIMWFDTAIGFLLALFI 360
+ + SI W A +LL L
Sbjct: 354 LFTAIYAASITTWNGWAWIAGAALYLLCLPA 384



Score = 29.8 bits (67), Expect = 0.019
Identities = 25/137 (18%), Positives = 51/137 (37%), Gaps = 3/137 (2%)

Query: 20 NAAGVSIVISSLGI-GRLSVLLFAGLLSDRFGRRPFIMLGMCCYMAFFFDILQTNNIIIA 78
+A + I +++ GI L+ + G ++ R G R +MLGM + + +A
Sbjct: 244 DATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMA 303

Query: 79 YVFGFLAGMANSFLDAGTYPSLMEAFPRSPGTANILIKAFVSSGQFLLPLIISLL--VWA 136
+ L + A + G + A S + PL+ + +
Sbjct: 304 FPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASI 363

Query: 137 ELWFGWSFMIAAGIMFI 153
W GW+++ A + +
Sbjct: 364 TTWNGWAWIAGAALYLL 380


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S1853     TCRTETB  30     0.021    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.8 bits (67), Expect = 0.021
Identities = 31/150 (20%), Positives = 63/150 (42%), Gaps = 7/150 (4%)

Query: 4 LAEKFSTDNAGIAYLISGIGLGRLISILFFGVISDKFGRRAVILMAVIMY----LLFFFG 59
+A F+ A ++ + L I +G +SD+ G + ++L +I+ ++ F G
Sbjct: 40 IANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVG 99

Query: 60 IPACPNLTLAYGLAVCVGIANSALDTGGYPALMECFPKASGSAVILVKAMVSFGQMFYPM 119
L +A + A AL + G A L+ ++V+ G+ P
Sbjct: 100 HSFFSLLIMARFIQGAGAAAFPALVMVVVARY--IPKENRGKAFGLIGSIVAMGEGVGP- 156

Query: 120 LVSYMLLNNIWYGYGLIIPGILFVLITLML 149
+ M+ + I + Y L+IP I + + ++
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMITIITVPFLM 186


29  S1882  S1888  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S1882     0       12       -3.231376  formate dehydrogenase-N subunit gamma
S1883     0       13       -3.187516  formate dehydrogenase-N, nitrate-inducible,
S1884     0       13       -3.631766  formate dehydrogenase-N, nitrate-inducible,
S1885     1       24       -6.608034  hypothetical protein
S1887     1       20       -5.268869  outer membrane porin protein
S1888     1       25       -5.201295  glycoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
S1887     ECOLIPORIN  473    e-170    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 473 bits (1219), Expect = e-170
Identities = 223/386 (57%), Positives = 271/386 (70%), Gaps = 23/386 (5%)

Query: 1 MKLKIVAVVVTGLLAANVAHAAEVYNKDGNKLDLYGKVTALRYFTDDKRDDGDKTYARLG 60
MK K++A+V+ LLAA AHAAE+YNKDGNKLDLYGKV L YF+DD DGD+TY R+G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 FKGETQINDQMIGFGHWEYDFKGYNDEANGSRGNKTRLAYAGLKISEFGSLDYGRNYGVG 120
FKGETQINDQ+ G+G WEY+ + E G+ + TRLA+AGLK ++GS DYGRNYGV
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGAN-SWTRLAFAGLKFGDYGSFDYGRNYGVL 119

Query: 121 YDIGSWTDMLPEFGGDTWSQKDVFMTYRTTGLATYRNYDFFGLIEGLNFAAQYQGKNKR- 179
YD+ WTDMLPEFGGD+++ D +MT R G+ATYRN DFFGL++GLNFA QYQGKN+
Sbjct: 120 YDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQ 179

Query: 180 -------TDNSHLYGADDTRANGDGFGISSTYVYD-GFGIGAVYTKSDRTNAQERAAANP 231
N+ G D NGDGFGIS+TY GF GA YT SDRTN Q A
Sbjct: 180 SADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGGT- 238

Query: 232 LNASGKNAELWATGIKYDANNIYFAANYDETLNMTTYG------DGYISNKAQSFEVVAQ 285
A G A+ W G+KYDANNIY A Y ET NMT YG DG ++NK Q+FEV AQ
Sbjct: 239 -IAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQ 297

Query: 286 YQFDFGLRPSLAYLKSKGRDLGR----YADQDMIEYIDVGATYFFNKNMSTYVDYKINLI 341
YQFDFGLRP++++L SKG+DL D+D+++Y DVGATY+FNKN STYVDYKINL+
Sbjct: 298 YQFDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLL 357

Query: 342 DESD-FTRAVDIRTDNIVATGITYQF 366
D+ D F + I TD+IVA G+ YQF
Sbjct: 358 DDDDPFYKDAGISTDDIVALGMVYQF 383


30  S1904  S1911  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
S1904     2       19       0.756645   hypothetical protein
S1905     3       19       1.645288   gamma-aminobutyraldehyde dehydrogenase
S1906     4       20       1.949266   transport system permease
S1908     5       22       0.702722   IS1294 transposase
S1909     3       20       -0.414995  IS1294 transposase
S1910     2       20       -1.072792  IS629 orfB
S1911     2       22       -0.830033  IS629 orfB
31  S1939  S1954  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
S1939     2       21       1.986539   Holliday junction resolvase
S1940     2       20       -0.203172  hypothetical protein
S1941     2       19       -0.047433  dATP pyrophosphohydrolase
S1942     2       19       0.081515   aspartyl-tRNA synthetase
S1944     3       27       -0.771713  ISSfl4 orf
S1946     5       27       -2.811420  hypothetical protein
S1947     4       26       -1.307060  invasion plasmid antigen
S1948     2       22       -0.104013  hypothetical protein
S1949     1       20       -1.053956  hypothetical protein
S1950     1       23       -1.620379  hypothetical protein
S1952     0       22       -2.338342  insertion sequence 2 OrfA protein
S1953     1       20       -2.084920  insertion element IS2 transposase InsD
S1954     2       23       -2.497440  hypothetical protein
32  S1964  S1992  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S1964     3       21       -0.922045  Iron transport protein
S1965     3       20       -0.568796  Iron transport protein, ATP-binding component
S1966     3       20       -0.609840  Iron transport protein
S1967     3       24       -0.577913  Iron transport protein, inner membrane
S1968     4       27       -0.241843  hypothetical protein
S1969     4       26       -0.252073  tail fiber protein
S1970     4       23       -0.744603  tail fiber assembly protein
S1971     3       23       -1.227424  hypothetical protein
S1972     7       28       -2.199470  IS600 orfB
S1973     6       26       -3.250313  IS1 orfB
S1975     5       27       -3.002808  IS1 orfA
S1978     6       28       -2.736235  IS911 orfB
S1980     5       28       -2.835079  IS911 orfA
S1981     5       27       -2.162321  integrase
S1986     1       22       -1.522673  ****Q antiterminator encoded by prophage CP-933P
S1987     1       23       -0.928586  crossover junction endodeoxyribonuclease
S1989     2       24       -0.952094  IS600 orfB
S1990     1       22       -1.843269  serine protease
S1991     1       18       -2.565701  hypothetical protein
S1992     -1      19       -3.024343  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits      Score  E-value  Comments
S1964     adhesinb  332    e-116    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 332 bits (852), Expect = e-116
Identities = 90/296 (30%), Positives = 163/296 (55%), Gaps = 7/296 (2%)

Query: 1 MLLGCLALTCSIAFQASATEKFKVITTFTIIADMAKNVAGDAAEVSSITKPGAEIHEYQP 60
+G A + + + + K V+ T +IIAD+ KN+AGD + SI G + HEY+P
Sbjct: 13 AFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKNIAGDKINLHSIVPVGQDPHEYEP 72

Query: 61 TPGDIKRAQGAQLILANGMNLEL----WFQRFYQHLNGVPE---VIVSSGVTPVGITEGP 113
P D+K+ A LI NG+NLE WF + ++ VS GV + +
Sbjct: 73 LPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENAKKKENKDYYAVSEGVDVIYLEGQS 132

Query: 114 YEGKPNPHAWMSPDNALIYVDNIRDALIKYDPANAQTYQRNADTYKAKITQTLAPLRKQI 173
+GK +PHAW++ +N +IY NI L + DPAN +TY++N Y K++ +++
Sbjct: 133 EKGKEDPHAWLNLENGIIYAQNIAKRLSEKDPANKETYEKNLKAYVEKLSALDKEAKEKF 192

Query: 174 TELPENQRWMVTSEGAFSYLARDLGLKELYLWPINADQQGTPQQVRKVVDIVKKNHIPAV 233
+P ++ +VTSEG F Y ++ + Y+W IN +++GTP Q++ +V+ ++K +P++
Sbjct: 193 NNIPGEKKMIVTSEGCFKYFSKAYNVPSAYIWEINTEEEGTPDQIKTLVEKLRKTKVPSL 252

Query: 234 FSESTISDKPARQVARETGAHYGGVLYVDSLSTENGPVPTYIDLLKVTTSTLVQGI 289
F ES++ D+P + V+++T ++ DS++ + +Y ++K + +G+
Sbjct: 253 FVESSVDDRPMKTVSKDTNIPIYAKIFTDSVAEKGEEGDSYYSMMKYNLEKIAEGL 308


33  S2067  S2184  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S2067     0       16       -3.596168  hypothetical protein
S2068     1       22       -5.091645  inner membrane protein
S2069     5       33       -7.803997  hypothetical protein
S2072     4       32       -7.176284  virulence protein
S2073     1       25       -5.834963  outer membrane porin protein C precurser
S4819     2       23       -3.110399  regulator
S4820     0       17       1.839292   kinase inhibitor
S4821     0       17       3.259934   multidrug efflux protein
S2076     0       18       3.839510   flagellar hook-basal body protein FliE
S2078     -1      18       3.366277   flagellar motor switch protein G
S2079     -2      16       3.257279   flagellar assembly protein H
S2080     -1      17       3.027042   flagellum-specific ATP synthase
S2082     -2      16       1.679687   flagellar hook-length control protein
S2083     -1      17       -0.095495  flagellar basal body-associated protein FliL
S2084     0       16       -2.689047  flagellar motor switch protein FliM
S2085     1       18       -3.930486  flagellar motor switch protein FliN
S2086     1       21       -5.156422  flagellar biosynthesis protein FliO
S2088     1       23       -6.160225  flagellar biosynthesis protein FliQ
S2089     1       20       -4.535261  flagellar biosynthesis protein FliR
S2090     2       19       -2.620954  positive regulator for ctr capsule biosynthesis,
S2091     0       20       0.879927   hypothetical protein
S2092     2       17       0.564921   hypothetical protein
S2096     0       16       0.358572   hypothetical protein
S2097     0       15       0.182912   hypothetical protein
S2098     -2      12       -1.200542  hypothetical protein
S2099     0       15       -0.977514  DNA mismatch endonuclease, patch repair protein
S2100     0       16       -1.388622  DNA cytosine methylase
S2103     4       25       -2.386167  ISEhe3 orfB
S2104     3       27       -2.264139  ISEhe3 orfA
S2105     4       23       -1.641428  outer membrane pore protein
S2106     0       21       -0.825297  insertion element IS2 transposase InsD
S2107     -1      27       -6.226235  insertion sequence 2 OrfA protein
S2108     0       30       -7.066478  outer membrane protein
S2109     -1      32       -7.166493  IS1 orfA
S2110     -1      32       -7.237532  IS1 orfB
S2111     0       34       -9.020675  chaperone protein HchA
S2112     2       40       -9.659134  2-component sensor protein
S2113     5       35       -6.609871  transcriptional regulatory protein YedW
S2114     4       31       -5.581252  hypothetical protein
S2116     5       29       -4.664219  sulfite oxidase subunit YedZ
S2117     6       26       -3.999051  hypothetical protein
S2118     5       21       -1.276975  bacteriophage protein
S2119     4       20       -1.328423  invasion plasmid antigen
S2121     2       22       0.010242   hypothetical protein
S2122     0       21       0.287143   IS1 orfA
S2123     1       23       0.165179   IS1 orfB
S2124     1       22       0.106579   tail protein
S2126     1       22       -0.306445  bacteriophage protein
S2127     2       23       0.246159   hypothetical protein
S2128     2       24       0.247066   sheath protein
S2129     3       24       -1.981128  hypothetical protein
S2130     1       25       -1.767041  IS600 orfB
S2131     0       21       -1.381876  IS600 orfA
S2133     0       23       -0.460035  ISSfl3 orfB
S2134     0       26       -0.818805  ISSfl3 orfA
S2139     0       26       -1.332231  ***Q antiterminator of prophage
S2140     -1      26       -0.236809  crossover junction endodeoxyribonuclease
S2142     0       25       -2.531115  hypothetical protein
S2141     4       29       -5.052363  hypothetical protein
S2143     1       25       -4.773243  cell killing protein of prophage
S2144     2       25       -5.078691  hypothetical protein
S2145     2       27       -5.190203  hypothetical protein
S2146     2       23       -4.182430  integrase for prophage CP-933U
S2148     -1      23       -2.888848  *hypothetical protein
S2152     -1      24       -3.723648  *integrase
S2155     -1      26       -4.718005  IS911 orfA
S2157     -1      27       -4.526024  IS600 orfA
S2158     -2      28       -4.224806  IS600 orfB
S2160     -1      31       -4.152761  AMP nucleosidase
S2161     0       32       -3.802831  hypothetical protein
S2165     1       27       -2.091202  **transcriptional regulator Cbl
S2166     1       26       -1.797522  nitrogen assimilation transcriptional regulator
S2168     0       25       -0.994533  *hypothetical protein
S2169     0       22       0.145634   nicotinate-nucleotide--dimethylbenzimidazole
S2170     -1      18       -0.201363  cobalamin synthase
S2172     3       25       -0.923272  IS600 orfB
S2173     3       23       -1.119961  IS600 orfA
S2175     3       25       -0.197018  IS629 orfA
S2176     3       27       -0.091645  IS629 orfB
S2177     3       26       -0.443769  hypothetical protein
S2179     3       28       -0.418403  ISSfl3 orfA
S2180     4       28       -0.275588  ISSfl3 orfB
S2181     4       29       -0.496356  IS629 orfA
S2182     4       30       -0.383589  IS629 orfB
S2184     5       29       -0.735711  ISSfl3 orfC
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2068 | RTXTOXIND | 30 | 0.017 | Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.017
Identities = 10/57 (17%), Positives = 17/57 (29%), Gaps = 2/57 (3%)

Query: 164 RFTLLPIFRIPVKMQKVSAASPLTQKPDQARRRF--RLGMLVFFGMLGWALLTAMNQ 218
R L R + + + A L + P R R M ++L +
Sbjct: 26 RKQLDTPVREKDENEFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEI 82


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2069 | PF01206 | 93 | 6e-29 | SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 92.5 bits (230), Expect = 6e-29
Identities = 16/71 (22%), Positives = 37/71 (52%)

Query: 7 DYRLDMVGEPCPYPAVATLEAMPQLKKGEILEVVSDCPQSINNIPLDARNHGYTVLDIQQ 66
D LD G CP P + + + + GE+L V++ P S+ + ++ G+ +L+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 67 DGPTIRYLIQK 77
+ T + +++
Sbjct: 65 EDGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2073 | ECOLIPORIN | 509 | 0.0 | E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 509 bits (1312), Expect = 0.0
Identities = 239/388 (61%), Positives = 282/388 (72%), Gaps = 33/388 (8%)

Query: 1 MKKLTVAISAVAASVLMAMSAQAAEIYNKDSNKLDLYGKVNAKHYFSSNDADDGDTTYVR 60
MK+ +A+ V ++L A +A AAEIYNKD NKLDLYGKV+ HYFS + + DGD TY+R
Sbjct: 1 MKRKVLAL--VIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMR 58

Query: 61 LGFKGETQINDQLTGFGQWEYEFKGNRAESQGSSKDKTRLAFAGLKFGDYGSIDYGRNYG 120
+GFKGETQINDQLTG+GQWEY + N E +G++ TRLAFAGLKFGDYGS DYGRNYG
Sbjct: 59 VGFKGETQINDQLTGYGQWEYNVQANTTEGEGANS-WTRLAFAGLKFGDYGSFDYGRNYG 117

Query: 121 VAYDIGAWTDVLPEFGGDTWTQTDVFMTGRTTGVATYRNNDFFGLVDGLNFAAQYQGKND 180
V YD+ WTD+LPEFGGD++T D +MTGR GVATYRN DFFGLVDGLNFA QYQGKN+
Sbjct: 118 VLYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNE 177

Query: 181 R----------------TDVTEANGDGFGFSTTYEY-EGFGVGATYAKSDRTNDQVIYGN 223
D+ NGDGFG STTY+ GF GA Y SDRTN+QV G
Sbjct: 178 SQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGG 237

Query: 224 NSLNASGQNAEVWAAGLKYDANNIYLATTYSETQNMTVFG------NNHIANKAQNFEVV 277
A G A+ W AGLKYDANNIYLAT YSET+NMT +G + +ANK QNFEV
Sbjct: 238 T--IAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVT 295

Query: 278 AQYQFDFGLRPSVAYLQSKGKDLG----AWGDQDLIEYIDVGATYYFNKNMSTFVDYKIN 333
AQYQFDFGLRP+V++L SKGKDL D+DL++Y DVGATYYFNKN ST+VDYKIN
Sbjct: 296 AQYQFDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKIN 355

Query: 334 LIDKSD-FTKASGVATDDIVAVGLVYQF 360
L+D D F K +G++TDDIVA+G+VYQF
Sbjct: 356 LLDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S4819 | HTHFIS | 29 | 0.017 | FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.017
Identities = 8/30 (26%), Positives = 16/30 (53%)

Query: 176 RTKWTANKVARYLYISVSTLHRRLASEGIS 205
T+ K A L ++ +TL +++ G+S
Sbjct: 447 ATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2076 | FLGHOOKFLIE | 117 | 8e-38 | Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 117 bits (293), Expect = 8e-38
Identities = 102/103 (99%), Positives = 102/103 (99%)

Query: 2 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTVARTQAEKFTL 61
SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQT ARTQAEKFTL
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 60

Query: 62 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 104
GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV
Sbjct: 61 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2078 | FLGMOTORFLIG | 338 | e-118 | Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 338 bits (868), Expect = e-118
Identities = 117/329 (35%), Positives = 197/329 (59%), Gaps = 2/329 (0%)

Query: 1 MSNLTGTDKSVILLMTIGEDRAAEVFKHLSQREVQTLSAAMANVTQISNKQLTDVLAEFE 60
+S LTG K+ ILL++IG + +++VFK+LSQ E+++L+ +A + I+++ +VL EF+
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 61 QEAEQFAALNINANDYLRSVLVKALGEERAASLLEDILETRDTASGIETLNFMEPQSAAD 120
+ + DY R +L K+LG ++A ++ + L + + E + +P + +
Sbjct: 72 ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPANILN 130

Query: 121 LIRDEHPQIIATILVHLRRAQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLN 180
I+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 131 FIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLE 190

Query: 181 GLLDGQ-NLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENL 239
L + + GGV EIIN+ + E+ +I ++ E D ELA++I +MF+FE++
Sbjct: 191 KKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDI 250

Query: 240 VDVDDRSIQRLLQEVDSESLLIALKGAEQPLREKFLRNMSQRAADILRDDLAKRGPVRLS 299
V +DDRSIQR+L+E+D + L ALK + P++EK +NMS+RAA +L++D+ GP R
Sbjct: 251 VLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRK 310

Query: 300 QVENEQKAILLIVRRLAETGEMVIGSGED 328
VE Q+ I+ ++R+L E GE+VI G +
Sbjct: 311 DVEESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2079 | FLGFLIH | 373 | e-135 | Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 373 bits (958), Expect = e-135
Identities = 223/228 (97%), Positives = 226/228 (99%)

Query: 1 MSDNLPWKTWTPDDLAPPPAEFVPMVESEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60
MSDNLPWKTWTPDDLAPP AEFVP+VE EETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI
Sbjct: 1 MSDNLPWKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60

Query: 61 AEGRQQGHEQGYQEGLAQGLEQGLAEAKAQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120
AEGRQQGH+QGYQEGLAQGLEQGLAEAK+QQAPIHARMQQLVSEFQTTLDALDSVIASRL
Sbjct: 61 AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120

Query: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180
MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT
Sbjct: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180

Query: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228
LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV
Sbjct: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2082 | FLGHOOKFLIK | 468 | e-168 | Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 468 bits (1204), Expect = e-168
Identities = 364/375 (97%), Positives = 369/375 (98%)

Query: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLTLLSEALAGETTTDKAAPQLLVATDKPTTK 60
MIRLAPLITADVDTTTLPGGKASDAAQDFL LLSEALAGETTTDKAAPQLLVATDKPTTK
Sbjct: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60

Query: 61 GEPLVSDILADAQQADLLIPVDETLPVINDEQSTSTPLTTAQTMTLAAVADKNTTKDEKA 120
GEPL+SDI++DAQQA+LLIPVDET PVINDEQSTSTPLTTAQTM LAAVADKNTTKDEKA
Sbjct: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120

Query: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPAEKPTLFTKLTSAQLTTAQPDDAP 180
DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLP EKPTLFTKLTS QLTTAQPDDAP
Sbjct: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP 180

Query: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTADASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240
GTPAQPLTPLVAEAQSKAEVISTPSPVTA ASPLITPHQTQPLPTVAAPVLSAPLGSHEW
Sbjct: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240

Query: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMISPHQHVRAALEAA 300
QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQM+SPHQHVRAALEAA
Sbjct: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300

Query: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360
LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS
Sbjct: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360

Query: 361 LQGRVTGNSGVDIFA 375
LQGRVTGNSGVDIFA
Sbjct: 361 LQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2084 | FLGMOTORFLIM | 380 | e-134 | Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 380 bits (977), Expect = e-134
Identities = 86/324 (26%), Positives = 148/324 (45%), Gaps = 10/324 (3%)

Query: 5 ILSQAEIDALLNGDS--EVKDEPTASISGESDIRPYDPNTQRRVVRERLQALEIINERFA 62
+LSQ EID LL S + E IS I YD + +E+++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 63 RHFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSPSL 122
R L LR + V ++ Y EF R++P P+ L +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 123 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 182
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 183 EMQVEFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 240
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 241 NEDQNWRDNLVRQVQHSQLELVANFADISLRLSQILKLKPGDVLPIEKP---DRIIAHVD 297
+ + L ++ +++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 298 GVPVLTSQYGTLNGQYALRIEHLI 321
Q G + + A +I I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2085 | FLGMOTORFLIN | 210 | 6e-74 | Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 210 bits (537), Expect = 6e-74
Identities = 125/137 (91%), Positives = 133/137 (97%)

Query: 1 MSDMNNPADDNNGAMDDLWAEALSEQKSTSEKSAADAVFQQFGGGDVSGTLQDIDLIMDI 60
MSDMNNP+D+N GA+DDLWA+AL+EQK+T+ KSAADAVFQQ GGGDVSG +QDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2088 | TYPE3IMQPROT | 67 | 1e-18 | Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 67.1 bits (164), Expect = 1e-18
Identities = 22/78 (28%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALVTGLIISILQAATQINEMTLSFIPKIIAVFIAII 63
+ ++ G +A+ + L L+ +VA + GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2089 | TYPE3IMRPROT | 203 | 4e-67 | Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 203 bits (517), Expect = 4e-67
Identities = 254/261 (97%), Positives = 257/261 (98%)

Query: 1 MMQETSDQWLSWLSLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60
M+Q TS+QWLSWL+LYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPGSHL 120
NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDP SHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGSEPLNSNAFLAPTKAGSLIF 180
NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIG EPLNSNAFLA TKAGSLIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240
LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEIFNLLADIISELPLI 261
EHLFSEIFNLLADIISELPLI
Sbjct: 241 EHLFSEIFNLLADIISELPLI 261


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2100 | PF05272 | 29 | 0.045 | Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.045
Identities = 20/62 (32%), Positives = 29/62 (46%), Gaps = 15/62 (24%)

Query: 320 AKYILTPVLWKYLYRYAKKHQARGNGFGYGMVYPNNPQSVTRTLSARYYKDGAEILIDRG 379
A+Y + PVLW Y+ R+ K + G+ VY +R +DG+E RG
Sbjct: 166 ARYQVGPVLWGYVVRFIK---SDGDKLTLPYVY------------SRSQRDGSEAWKWRG 210

Query: 380 WD 381
WD
Sbjct: 211 WD 212


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2104 | HTHFIS | 27 | 0.013 | FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 26.7 bits (59), Expect = 0.013
Identities = 7/45 (15%), Positives = 16/45 (35%), Gaps = 1/45 (2%)

Query: 4 KRYPEEFKTEAVKQVVDR-GYSVASVATRLDITTHSLYAWIKKYG 47
R E + + + + A L + ++L I++ G
Sbjct: 430 DRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELG 474


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2105 | ECOLIPORIN | 296 | e-101 | E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 296 bits (758), Expect = e-101
Identities = 137/268 (51%), Positives = 165/268 (61%), Gaps = 31/268 (11%)

Query: 31 DTSYARVGVKGETQINPEMTGYGQFELDLEASNRHNPDQ---TRLAYAGLSYKDFGSFDY 87
D +Y RVG KGETQIN ++TGYGQ+E +++A+ TRLA+AGL + D+GSFDY
Sbjct: 53 DQTYMRVGFKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDY 112

Query: 88 SRNVGVAYDAEAFTDMFVEWGGDSWAGTDLFMTNRTNGVATYRNTDFFGMVEGLNFALQY 147
RN GV YD E +TDM E+GGDS+ D +MT R NGVATYRNTDFFG+V+GLNFALQY
Sbjct: 113 GRNYGVLYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQY 172

Query: 148 QGKNEGTGNY----------------KANGDGHGLSATYTID-GFSFAGAYANSDRTDWQ 190
QGKNE NGDG G+S TY I GFS AY SDRT+ Q
Sbjct: 173 QGKNESQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQ 232

Query: 191 SGDGK----GERAEVWALSTKYDANNVYAAVMYGESHNM-------NSDDGDVVNKTQNF 239
G G++A+ W KYDANN+Y A MY E+ NM DG V NKTQNF
Sbjct: 233 VNAGGTIAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNF 292

Query: 240 EAVLQYQFDFGLRPSIGYSYSKALDVAG 267
E QYQFDFGLRP++ + SK D+
Sbjct: 293 EVTAQYQFDFGLRPAVSFLMSKGKDLTY 320


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2108 | ECOLIPORIN | 75 | 5e-20 | E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 75.0 bits (184), Expect = 5e-20
Identities = 29/67 (43%), Positives = 41/67 (61%), Gaps = 1/67 (1%)

Query: 4 DSGGQSTGYKDSDRLNYIEIGTWYYFNKNMNIYTAYQINLLDKSD-YVLAHGLNTDDQLA 62
D + D D + Y ++G YYFNKN + Y Y+INLLD D + G++TDD +A
Sbjct: 317 DLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLLDDDDPFYKDAGISTDDIVA 376

Query: 63 VGIVYQF 69
+G+VYQF
Sbjct: 377 LGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2112 | PF06580 | 35 | 4e-04 | Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 35.2 bits (81), Expect = 4e-04
Identities = 38/195 (19%), Positives = 73/195 (37%), Gaps = 34/195 (17%)

Query: 261 TLSQIRSIAEYQKTIAGN-IEELENISRLTENILFLARADKNNVLVKLDSLSLNKEVENL 319
L+ IR++ T A + L + R + L ++ V SL E+ +
Sbjct: 178 ALNNIRALILEDPTKAREMLTSLSELMRYS-----LRYSNARQV-------SLADELTVV 225

Query: 320 LDYL--EYLSDEKEICFKVKCNQQIFADKI---LLQRMLSNLIVNAIRYSPEKSRIHITS 374
YL + E + F+ + N I ++ L+Q ++ N I + I P+ +I +
Sbjct: 226 DSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKG 285

Query: 375 FLDANGSLNIDIASPGTKINEPEKLFRRFWRGDNSRHSVGQGLGLSLVKA-IAELHGGSA 433
D NG++ +++ + G+ + K G GL V+ + L+G A
Sbjct: 286 TKD-NGTVTLEVENTGSLALKNTKE--------------STGTGLQNVRERLQMLYGTEA 330

Query: 434 TYHYLSKHNVFRITL 448
K +
Sbjct: 331 QIKLSEKQGKVNAMV 345


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2113 | HTHFIS | 83 | 1e-20 | FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.6 bits (204), Expect = 1e-20
Identities = 30/117 (25%), Positives = 60/117 (51%), Gaps = 1/117 (0%)

Query: 2 KILLIEDNQRTQEWVTQGLSEAGYVIDAVSDGRDGLYLALKDDYALIILDIMLPGMDGWQ 61
IL+ +D+ + + Q LS AGY + S+ D L++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 ILQTLRTA-KQTPVICLTARDSVDDRVRGLDSGANDYLVKPFSFSELLARVRAQLRQ 117
+L ++ A PV+ ++A+++ ++ + GA DYL KPF +EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2143 | HOKGEFTOXIC | 64 | 5e-18 | Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 64.1 bits (156), Expect = 5e-18
Identities = 19/46 (41%), Positives = 32/46 (69%)

Query: 23 QKAMLIALIVICLTVIVTALVTRKDLCEVRIRTGQTEVAVFTAYEP 68
+ +++ ++++CLT+++ +TRK LCE+R R G EVA F AYE
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2177 | FbpA_PF05833 | 28 | 0.010 | Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 28.3 bits (63), Expect = 0.010
Identities = 13/83 (15%), Positives = 33/83 (39%), Gaps = 6/83 (7%)

Query: 38 RLFRRKNKLQREIQDVEKKIRDNQKRVLLLDNLSDYIKPGMSVEAIQGIIASMKGDYEDR 97
+++ NKL++ + +++ N++ + L ++ I + + I+ I E
Sbjct: 385 SYYKKYNKLKKSEEAANEQLLQNEEELNYLYSVLTNINNADNYDEIEEIKK------ELI 438

Query: 98 VDDYIIKNAELSKERRDISKKLK 120
YI ++ SK +
Sbjct: 439 ETGYIKFKKIYKSKKSKTSKPMH 461


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2184 | RTXTOXIND | 41 | 7e-06 | Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 41.4 bits (97), Expect = 7e-06
Identities = 9/74 (12%), Positives = 33/74 (44%), Gaps = 1/74 (1%)

Query: 16 LRKQQSRLRQYACQVAGYEQEIERLKAQLDRLRRMLFGQSSEKKRHKLENQIRQAEKRLS 75
+ +Q+++ + ++ Y+ ++E++++++ + + K L+ ++RQ +
Sbjct: 254 VLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILD-KLRQTTDNIG 312

Query: 76 ELENRLNTARNLLE 89
L L +
Sbjct: 313 LLTLELAKNEERQQ 326


34 | S2202 | S2225 | Y | N | N | Genomic Island
LocusTag | DNBias | CDNBias | %GCBias | Product
S2202 | -1 | 23 | 3.207555 | ATP phosphoribosyltransferase
S2203 | 0 | 22 | 2.868224 | histidinol dehydrogenase
S2204 | 0 | 24 | 1.771084 | histidinol-phosphate aminotransferase
S2205 | -1 | 18 | -0.171190 | imidazole glycerol-phosphate
S2206 | -1 | 16 | -0.821536 | imidazole glycerol phosphate synthase subunit
S2207 | -1 | 14 | -2.681480 | 1-(5-phosphoribosyl)-5-[(5-
S2208 | 0 | 16 | -5.622429 | imidazole glycerol phosphate synthase subunit
S2209 | 2 | 25 | -8.208457 | bifunctional phosphoribosyl-AMP
S2210 | 3 | 34 | -10.337151 | regulator of length of O-antigen component of
S2212 | 4 | 38 | -11.855056 | 6-phosphogluconate dehydrogenase
S2213 | 6 | 64 | -18.410040 | hypothetical protein
S2214 | 6 | 65 | -20.425336 | hypothetical protein
S2215 | 7 | 63 | -20.943507 | hypothetical protein
S2216 | 6 | 63 | -21.039351 | hypothetical protein
S2217 | 7 | 64 | -20.602962 | glycosyl transferase
S2218 | 7 | 62 | -20.353580 | glycosyl translocase
S2219 | 5 | 57 | -18.360926 | O-antigen polymerase
S2220 | 4 | 49 | -14.368949 | dTDP-rhamnosyl transferase
S2221 | 2 | 45 | -11.971806 | dTDP-rhamnosyl transferase
S2222 | 1 | 34 | -8.937057 | polysaccharide biosynthesis protein
S2223 | -1 | 20 | -5.826892 | dTDP-6-deoxy-L-mannose-dehydrogenase
S2225 | -1 | 14 | -3.585406 | glucose-1-phosphate thymidylyltransferase
35 | S2256 | S2340 | Y | Y | Y | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
S2256 | -2 | 15 | 3.937706 | ISSfl4 orf
S2258 | -2 | 13 | 3.548149 | 3-methyladenine DNA glycosylase
S2261 | -2 | 13 | 3.366214 | hypothetical protein
S2262 | -2 | 15 | 3.586812 | hypothetical protein
S2263 | -2 | 16 | 3.558725 | ISSfl4 orf
S2266 | -1 | 18 | 2.983116 | multidrug efflux system subunit MdtC
S2267 | -2 | 12 | 0.819857 | multidrug efflux system protein MdtE
S2268 | -1 | 9 | -0.471273 | signal transduction histidine-protein kinase
S2269 | 1 | 12 | -2.312525 | DNA-binding transcriptional regulator BaeR
S2270 | 1 | 14 | -3.237066 | hypothetical protein
S2271 | 0 | 13 | -2.746593 | hypothetical protein
S2273 | 0 | 24 | -4.107675 | hypothetical protein
S2274 | 3 | 19 | -2.429040 | lipid kinase
S2275 | 2 | 20 | -2.825775 | galactitol utilization operon repressor
S2277 | 1 | 21 | -3.094083 | galactitol-1-phosphate dehydrogenase
S2278 | 2 | 18 | -3.022623 | IS1 orfB
S2279 | 3 | 17 | -2.703053 | IS1 orfA
S2280 | 2 | 16 | -2.507254 | PTS system galactitol-specific enzyme IIC
S2281 | 1 | 17 | -1.689780 | galactitol-specific PTS system component IIB
S2282 | 2 | 18 | -1.191684 | galactitol-specific PTS system component IIA
S2283 | 2 | 15 | -0.933502 | tagatose 6-phosphate kinase 1
S2284 | 1 | 15 | 0.285236 | tagatose-bisphosphate aldolase
S2285 | 1 | 16 | 0.452265 | fructose-bisphosphate aldolase
S2287 | 1 | 16 | 0.470181 | kinase
S2288 | 1 | 22 | -3.066907 | transcriptional regulator
S2289 | 0 | 23 | -4.119376 | hypothetical protein
S2290 | 0 | 22 | -3.655254 | phosphomethylpyrimidine kinase
S2291 | 0 | 22 | -4.523088 | hydroxyethylthiazole kinase
S2292 | 0 | 25 | -5.933198 | hypothetical protein
S2296 | 1 | 19 | -4.298604 | outer membrane protein
S2298 | 1 | 15 | -0.912920 | fimbrial-like protein
S2299 | -1 | 14 | 0.445229 | IS1 orfB
S2300 | 0 | 11 | 1.226964 | IS1 orfA
S2301 | 0 | 10 | 1.275515 | hypothetical protein
S2302 | 1 | 11 | 1.737760 | ATPase
S2303 | 1 | 13 | 1.627389 | methionyl-tRNA synthetase
S2313 | 2 | 20 | 1.851852 | hypothetical protein
S2314 | 3 | 19 | 1.291570 | hypothetical protein
S2315 | 2 | 17 | 0.457493 | hypothetical protein
S2316 | 3 | 17 | 0.196482 | hypothetical protein
S2317 | 0 | 17 | -2.139918 | IS1 orfB
S2318 | -2 | 15 | -2.187373 | IS1 orfA
S2319 | -2 | 14 | -2.790224 | hypothetical protein
S2321 | -2 | 17 | -1.929456 | IS1 orfB
S2322 | -2 | 17 | -1.292120 | IS1 orfA
S2324 | -1 | 17 | -0.423427 | two-component response-regulatory protein YehT
S2325 | 0 | 18 | -0.212857 | 2-component sensor protein
S2328 | 3 | 26 | 0.253810 | DNA-damage-inducible protein
S2329 | 3 | 25 | 1.008154 | hypothetical protein
S2331 | 3 | 26 | 1.267670 | hypothetical protein
S2332 | 1 | 25 | 0.778596 | phage tail fiber protein
S2333 | 1 | 20 | 0.525997 | phage tail fiber assembly protein
S2334 | 2 | 22 | 1.707363 | phage tail fiber protein
S2335 | 2 | 23 | 1.873461 | IS1 orfA
S2336 | 1 | 23 | 2.684637 | IS1 B
S2337 | -1 | 19 | 3.299083 | IS629 orfA
S2338 | 1 | 19 | 3.831382 | IS629 orfB
S2340 | 3 | 19 | 3.703449 | hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2266 | ACRIFLAVINRP | 907 | 0.0 | Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 907 bits (2345), Expect = 0.0
Identities = 287/1035 (27%), Positives = 502/1035 (48%), Gaps = 40/1035 (3%)

Query: 6 LFIYRPVATILLSVAITLCGILGFRMLPVAPLPQVDFPVIMVSASLPGASPETMASSVAT 65
FI RP+ +L++ + + G L LPVA P + P + VSA+ PGA +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERSLGRIAGVSEMTSSS-SLGSTRIILQFDFDRDINGAARDVQAAINAAQSLLPSGMP 124
+E+++ I + M+S+S S GS I L F D + A VQ + A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKANPSDAPIMILTLTSDT--YSQGELYDFASTQLAPTISQIDGVGDVDVGGSSL 182
+ S + +M+ SD +Q ++ D+ ++ + T+S+++GVGDV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVGLTPQALFNQGVSLDDVRTAISNANVRKPQG------ALEDGTHRWQIQTNDELK 236
A+R+ L L ++ DV + N + G AL I K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAAEYQPLIIHYN-NGGAVRLGDVATVTDSVQDVRNAGMTNAKPAILLMIRKLPEANIIQ 295
E+ + + N +G VRL DVA V ++ N KPA L I+ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDSIRAKLPELQETIPAAIDLQIAQDRSPTIRASLEEVEQTLIISVALVILVVFLFLRS 355
T +I+AKL ELQ P + + D +P ++ S+ EV +TL ++ LV LV++LFL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATIIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHL- 414
RAT+IP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGMKPLQAALQGTREVGFTVLSMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIGIS 474
E + P +A + ++ ++ +++ L AVF+P+ GG G + R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LLVSLTLTPMMCGWMLKASKPREQKRLRGFG----RMLVALQQGYGKSLKWVLNHTRLVG 530
+LV+L LTP +C +LK + GF Y S+ +L T
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 VVLLGTIALNI----SIPKTFFPEQDTGVLMGGIQADQSISFQ----AMRGKLQDFMKII 582
++ +A + +P +F PE+D GV + IQ + + + ++K
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 583 RD-DPAVDNVTGFT-GGSRVNSGMMFITLKPRDERS---ETAQQIIDRLRVKLAKEPGAN 637
+ +V V GF+ G N+GM F++LKP +ER+ +A+ +I R +++L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 638 LFLMAVQDIRVGGRQSNASYQYTLLSDDLAALREWEPKIRKKLATL-----PELADVNSD 692
+ + I G + ++ L D + + R +L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 693 QQDNGAEMNLVYDRDTMARLGIDVQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRY 752
++ A+ L D++ LG+ + N ++ A G ++ K+ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 753 TQDISALEKMFVINNEGKAIPLSYFAKWQPANAPLSVNHQGLSAASTISFNLPTGKSLSD 812
++K++V + G+ +P S F + + I G S D
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 813 ASAAIDRAMTQLGVPSTVRGSFAGTAQVFQETMNSQVILIIAAIATVYIVLGILYESYVH 872
A A ++ ++L P+ + + G + + + N L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 873 PLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRHGN 932
P++++ +P VG LLA LFN + ++G++ IG+ KNAI++V+FA +
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 933 LTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLEITIVGGLVMSQLL 992
EA A +R RPI+MT+LA + G LPL +S G GS + + I ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 993 TLYTTPVVYLFFDRL 1007
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031



Score = 78.3 bits (193), Expect = 1e-16
Identities = 77/446 (17%), Positives = 162/446 (36%), Gaps = 26/446 (5%)

Query: 588 VDNVTGFTGGS-RVNSGMMFITLKPRDERSETAQQIIDRLRVKLAKEPGANLFLMAVQDI 646
+DN+ + S S + +T + + Q+ ++L++ P + Q I
Sbjct: 72 IDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE----VQQQGI 127

Query: 647 RVGGRQSNASYQYTLLSDDLAALREW-----EPKIRKKLATLPELADVNSDQQDNGAE-- 699
V S+ +SD+ ++ ++ L+ L + DV GA+
Sbjct: 128 SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQL----FGAQYA 183

Query: 700 MNLVYDRDTMARLGID----VQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRYTQD 755
M + D D + + + + + + T P Q + R+
Sbjct: 184 MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNP 243

Query: 756 ISALEKMFVINNEGKAIPLSYFAK--WQPANAPLSVNHQGLSAASTISFNLPTGKSLSDA 813
+ +N++G + L A+ N + G AA +L D
Sbjct: 244 EEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL-DT 302

Query: 814 SAAIDRAMTQL--GVPSTVRGSFA-GTAQVFQETMNSQVILIIAAIATVYIVLGILYESY 870
+ AI + +L P ++ + T Q +++ V + AI V++V+ + ++
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 871 VHPLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRH 930
L +P +G L F + + + G++L IG++ +AI++V+
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 931 GNLTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLEITIVGGLVMSQ 990
L P+EA ++ ++ + +P+ GG + + ITIV + +S
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 991 LLTLYTTPVVYLFFDRLRLRFSRKPK 1016
L+ L TP + + + K
Sbjct: 483 LVALILTPALCATLLKPVSAEHHENK 508


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2267 | TCRTETB | 125 | 2e-33 | Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 125 bits (315), Expect = 2e-33
Identities = 97/429 (22%), Positives = 188/429 (43%), Gaps = 23/429 (5%)

Query: 20 FMQSLDTTIVNTALPSMAQSLGESPLHMHMVIVSYVLTVAVMLPASGWLADKVGVRNIFF 79
F L+ ++N +LP +A + P + V +++LT ++ G L+D++G++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 TAIVLFTLGSLFCALSGTLNELL-LARALQGVGGAMMVPVGRLTVMKIVPREQYMAAMTF 138
I++ GS+ + + LL +AR +QG G A + + V + +P+E A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQVGPLLGPALGGLLVEYASWHWIFLINIPVGIIGAIATLM-LMPNYTMQTRRFDL 197
+ +G +GPA+GG++ Y HW +L+ IP+ I + LM L+ FD+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 SGFLLLAVGMAVLTLALDGSKGTGLSPLAIAGLVAVGVVALVLYLLHARNNNRALFSLKL 257
G +L++VG+ L + + V V++ ++++ H R L
Sbjct: 202 KGIILMSVGIVFFMLF---------TTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FRTRTFSLGLAGSFAGRIGSGMLPFMTPVFLQIGLGFSPFHAG-LMMIPMVLGSMGMKRI 316
+ F +G+ M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 VVQVVNRFGYRRVLVATTLGLSLVTLLFMTTALL----GWYYVLPFVLFLQGMVNSTRFS 372
+V+R G VL +G++ +++ F+T + L W+ + V L G+ S +
Sbjct: 313 GGILVDRRGPLYVL---NIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL--SFTKT 367

Query: 373 SMNTLTLKDLPDNLASSGNSLLSMIMQLSMSIGVTIAGLLLGLFGSQHISVDSGTTQTVF 432
++T+ L A +G SLL+ LS G+ I G LL + + Q+ +
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTY 427

Query: 433 MYTWLSMAF 441
+Y+ L + F
Sbjct: 428 LYSNLLLLF 436


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2268 | BCTERIALGSPF | 31 | 0.009 | Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 31.3 bits (71), Expect = 0.009
Identities = 27/93 (29%), Positives = 34/93 (36%), Gaps = 27/93 (29%)

Query: 173 LATLLAALATFLLA-------------RGLLAPVKRLVDGTHKLAAGDFTTRVTPTSEDE 219
LATL+AA A L+A V+ V H LA + P S +
Sbjct: 77 LATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD---AMKCFPGSFER 133

Query: 220 L-----------GKLAQDFNQLASTLEKNQQMR 241
L G L N+LA E+ QQMR
Sbjct: 134 LYCAMVAAGETSGHLDAVLNRLADYTEQRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2269 | HTHFIS | 76 | 6e-18 | FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.6 bits (186), Expect = 6e-18
Identities = 28/136 (20%), Positives = 65/136 (47%), Gaps = 1/136 (0%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLISHGDQVLAYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + S+ + ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDVPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVARVKTILRRCK 129
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L K
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 130 PQRELQQQDAESPLII 145
+ + D++ + +
Sbjct: 124 RRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2273 | LIPOLPP20 | 27 | 0.026 | LPP20 lipoprotein precursor signature.
		>LIPOLPP20#LPP20 lipoprotein precursor signature.

Length = 175

Score = 26.6 bits (58), Expect = 0.026
Identities = 13/38 (34%), Positives = 24/38 (63%), Gaps = 1/38 (2%)

Query: 18 EGEMKKIAAISLISIFLISGCAVHNDETSIGKFGLAYK 55
+ ++KKI +S+++ +I GC+ H ++ I K AYK
Sbjct: 2 KNQVKKILGMSVVAAMVIVGCS-HAPKSGISKSNKAYK 38


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
S2277 | DHBDHDRGNASE | 34 | 7e-04 | 2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 33.9 bits (77), Expect = 7e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 2/92 (2%)

Query: 156 AQGCENKNVIIIGAGT-IGLLAIQCAVALGAKSVTAIDISSEKLALAKSFGAMQTFNSSE 214
A+G E K I GA IG + + GA + A+D + EKL S + ++
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAH-IAAVDYNPEKLEKVVSSLKAEARHAEA 61

Query: 215 MSAPQMQSVLRELRFNQLILETAGVPQTVELA 246
A S + ++ E + V +A
Sbjct: 62 FPADVRDSAAIDEITARIEREMGPIDILVNVA 93


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S2296     PF00577  713    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 713 bits (1843), Expect = 0.0
Identities = 239/843 (28%), Positives = 389/843 (46%), Gaps = 35/843 (4%)

Query: 2 LRMTPLASAI---VALLLGIEAYAAEETFDTHFMIGGMKDQQVANIRL--DDNQPLPGQY 56
R+ + A +AE F+ F+ Q VA++ + + PG Y
Sbjct: 21 HRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADD--PQAVADLSRFENGQELPPGTY 78

Query: 57 DIDIYVNKQWRGKYEIIVKDNPQET----CLSREVIKRLGIN-----SDNFASGKQCLTF 107
+DIY+N + ++ E CL+R + +G+N N + C+
Sbjct: 79 RVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPL 138

Query: 108 EQLVQGGSYSWDIGVFRLDFSVPQAWVEELESGYVPPENWERGINAFYTSYYVSQYYSDY 167
++ + D+G RL+ ++PQA++ GY+PPE W+ GINA +Y S
Sbjct: 139 TSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQN 198

Query: 168 KASGNNKSTYVRFNSGLNLLEWQLHSDASFSKTNNNPGV-----WKSNTLYLERGFAQFL 222
+ GN+ Y+ SGLN+ W+L + ++S +++ W+ +LER
Sbjct: 199 RIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLR 258

Query: 223 GTLRVGDMYTSSDIFDSVRFSGVRLFRDMQMLPNSKQNFTPRVQGIAQSNALVTIEQNGF 282
L +GD YT DIFD + F G +L D MLP+S++ F P + GIA+ A VTI+QNG+
Sbjct: 259 SRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGY 318

Query: 283 VVYQKEVPPGPFAITDLQLAGGGADLDVSVKEADGSVTTYLVPYAAVPNMLQPGVSKYDF 342
+Y VPPGPF I D+ AG DL V++KEADGS + VPY++VP + + G ++Y
Sbjct: 319 DIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSI 378

Query: 343 AAGRSHIEGASKQSD-FVQAGYQYGFNNLLTLYGGTMVANNYYAFTLGTGWNT-RIGAIS 400
AG A ++ F Q+ +G T+YGGT +A+ Y AF G G N +GA+S
Sbjct: 379 TAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALS 438

Query: 401 VDATKSHSKQDNGDVFDGQSYQIAYNKFVSQTSTRFGLAAWRYSSRDYRTFNDHVWANNK 460
VD T+++S + DGQS + YNK ++++ T L +RYS+ Y F D ++
Sbjct: 439 VDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMN 498

Query: 461 DNYRRDENDIYDI----ADYYQNDFGRKNSFSANMSQSLPEGWGSVSLSTLWRDYWGRSG 516
++ + + DYY + ++ ++Q L ++ LS + YWG S
Sbjct: 499 GYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQTYWGTSN 557

Query: 517 SSKDYQLSYSNNWRRISYTLAASQAYGENHHE-EKRFNIFISIPCD--WGDDVTTPRRQI 573
+ +Q + + I++TL+ S ++ + ++IP D + R
Sbjct: 558 VDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHA 617

Query: 574 YMSNSTTFDDQGFASNNTGLSGTVGSRDQFNYGVNLSHQHQGN---ETTAGANLTWNAPV 630
S S + D G +N G+ GT+ + +Y V + G+ +T A L +
Sbjct: 618 SASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGY 677

Query: 631 ATVNGSYSQSSTYRQTGASVSGGIVAWSGGVNLANRLSETFAVMNAPGIKDAYVNGQKYR 690
N YS S +Q VSGG++A + GV L L++T ++ APG KDA V Q
Sbjct: 678 GNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGV 737

Query: 691 TTNRNGVVVYDGMTPYRENHLMLDVSQSDSEAELRGNRKIAAPYRGAVVLVNFDTDQRKP 750
T+ G V T YREN + LD + +L P RGA+V F +
Sbjct: 738 RTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKA-RVGI 796

Query: 751 WFIKALRADGQPLTFGYEVNDIHGHNIGVVGQGSQLFIRTNEIPPSVNVAIDKQQGLSCT 810
+ L + +PL FG V + G+V Q+++ + V V +++ C
Sbjct: 797 KLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCV 856

Query: 811 ITF 813
+
Sbjct: 857 ANY 859


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S2319     INTIMIN  28     0.015    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 28.1 bits (62), Expect = 0.015
Identities = 20/94 (21%), Positives = 32/94 (34%)

Query: 36 LNGTEIAITYVYKGDKVLKQSSETKIQFASIGATTKEDAAKTLEPLSAKYKNIAGVEEKS 95
+ + AITY K K K S ++ F + KT AK + KS
Sbjct: 671 VANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKS 730

Query: 96 TYTDTYAQENVTIDMEKVDFKALQGISGINVSAE 129
+ + V + +V+F I N+
Sbjct: 731 LVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIV 764


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S2324     HTHFIS  71     2e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 71.4 bits (175), Expect = 2e-16
Identities = 41/178 (23%), Positives = 76/178 (42%), Gaps = 14/178 (7%)

Query: 2 IKVLIVDDEPLARENL-RIFLQEQSDIEIVGECSNAVEGIGAVHKLRPDVLFLDIQMPRI 60
+L+ DD+ R L + + D+ I NA + D++ D+ MP
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITS---NAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 SGLEMVGMLDPEHRPYI--VFLTAFD--EYAIKAFEEHAFDYLLKPIDEARLEKTLARLR 116
+ +++ + + RP + + ++A + AIKA E+ A+DYL KP D L + R
Sbjct: 61 NAFDLLPRIK-KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 117 QERSKQDVSLLPENQQALKFIPCTGHSRIYLLQMKDVAFVSSRMSGVYVT--SHEGKE 172
E ++ L ++Q + + G S + +A + + +T S GKE
Sbjct: 120 AEPKRRPSKLEDDSQDGMPLV---GRSAAMQEIYRVLARLMQTDLTLMITGESGTGKE 174


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S2325     PF06580  220    4e-69    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 220 bits (562), Expect = 4e-69
Identities = 63/216 (29%), Positives = 115/216 (53%), Gaps = 3/216 (1%)

Query: 343 LGEGIAQLLSAQILAGQYERQKAMLTQSEIKLLHAQVNPHFLFNALNTIKAVIRRDSEQA 402
L G + + + +M ++++ L AQ+NPHF+FNALN I+A+I D +A
Sbjct: 134 LYFGWHFFKNYKQAEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKA 193

Query: 403 SQLVQYLSTFFRKNLKR-PSEFVTLADEIEHVNAYLQIEKARFQSRLQVNIAIPQELSQQ 461
+++ LS R +L+ + V+LADE+ V++YLQ+ +F+ RLQ I +
Sbjct: 194 REMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDV 253

Query: 462 QLPAFTLQPIVENAIKHGTSQLLDTGRVAISARREGQHLMLEIEDNAGL-YQPVTNASGL 520
Q+P +Q +VEN IKHG +QL G++ + ++ + LE+E+ L + ++G
Sbjct: 254 QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGT 313

Query: 521 GMNLVDKRLRERFGDDYGISVACEPDSYTRITLRLP 556
G+ V +RL+ +G + I ++ + + +P
Sbjct: 314 GLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
S2332     FLAGELLIN  30     0.022    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 30.0 bits (67), Expect = 0.022
Identities = 14/93 (15%), Positives = 30/93 (32%)

Query: 76 VAQAQQSAGAAAGNAQQTAQDVAAAATARDDAQRFAEKARQDATVTAEDRKATAEDVTST 135
V Q + N D+ A + +++ A A + + +
Sbjct: 337 VVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFID 396

Query: 136 GANAAAAGQSAQDAAGYARAAEQAKNDIDAALT 168
+ + +DAA ++ ID+AL+
Sbjct: 397 KTASGVSTLINEDAAAAKKSTANPLASIDSALS 429


36  S2407  S2418  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S2407     0       20       3.049763   transcriptional regulator NarP
S2408     0       21       3.476487   subunit of heme lyase
S2409     1       20       3.741857   disulfide oxidoreductase
S2410     0       18       3.893604   cytochrome c-type biogenesis protein
S2411     0       15       2.579186   cytochrome c-type biogenesis protein CcmE
S2412     1       16       2.774978   heme exporter protein C
S2413     0       15       2.835342   heme exporter protein C
S2414     -1      17       3.428169   heme exporter protein B, cytochrome c-type
S2415     -1      18       3.587283   cytochrome c biogenesis protein CcmA
S2416     -1      20       3.547769   cytochrome c-type protein NapC
S2417     -1      19       3.913475   citrate reductase cytochrome c-type subunit
S2418     -1      18       3.348994   quinol dehydrogenase membrane component
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S2407     HTHFIS  63     7e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.9 bits (153), Expect = 7e-14
Identities = 22/114 (19%), Positives = 46/114 (40%), Gaps = 2/114 (1%)

Query: 9 VMIVDDHPLMRRGVRQLLELDPGFEVVAEAGEGASAIDLANRLDIDVILLDLNMKGMSGL 68
+++ DD +R + Q L G++V A+ D D+++ D+ M +
Sbjct: 6 ILVADDDAAIRTVLNQALS-RAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 69 DTLNALRRDGVTAQIIILTVSDASSDVFALIDAGADGYLLKDSDPEVLLEAIRT 122
D L +++ +++++ + + GA YL K D L+ I
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


37  S2567  S2494  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S2567     0       14       3.485791   sucrose-6 phosphate hydrolase
S2471     0       14       4.243599   hypothetical protein
S2473     1       14       4.197399   O-succinylbenzoic acid--CoA ligase
S2474     0       13       4.014719   O-succinylbenzoate synthase
S2475     -1      13       2.738563   naphthoate synthase
S2476     0       14       1.981526   acyl-CoA thioester hydrolase YfbB
S2477     0       14       1.611448   2-succinyl-5-enolpyruvyl-6-hydroxy-3-
S2478     -1      17       -0.272400  menaquinone-specific isochorismate synthase
S2479     0       13       1.151057   hypothetical protein
S2480     0       15       1.710997   hypothetical protein
S2481     1       21       3.026114   ribonuclease Z
S2482     2       25       3.697848   IS1 orfA
S2483     1       27       3.665696   IS1 orfB
S2487     0       28       3.685073   NADH dehydrogenase subunit N
S2488     0       28       3.081874   NADH dehydrogenase subunit M
S2489     -1      29       3.544195   NADH dehydrogenase subunit L
S2490     0       28       3.358217   NADH dehydrogenase subunit K
S2491     0       28       3.387866   NADH dehydrogenase subunit J
S2492     1       27       3.541262   NADH dehydrogenase subunit I
S2493     0       26       3.355183   NADH dehydrogenase subunit H
S2494     1       24       3.378368   NADH dehydrogenase subunit G
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2567     BCTERIALGSPC  29     0.003    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 29.2 bits (65), Expect = 0.003
Identities = 12/31 (38%), Positives = 18/31 (58%), Gaps = 1/31 (3%)

Query: 34 KHIVLWLGLALACLGLAMMLWLLVL-QNVPV 63
+ I+ +L + L C LAM+ W + L N PV
Sbjct: 15 RRILFYLLMLLFCQQLAMIFWRIGLPDNAPV 45


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2480     AUTOINDCRSYN  35     6e-05    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 34.8 bits (80), Expect = 6e-05
Identities = 14/79 (17%), Positives = 32/79 (40%), Gaps = 12/79 (15%)

Query: 1 MIEWQDLHHSELSVSQLYALLQLRCAVFV--------VEQNCPYQDIDGDDLTGDNRHIL 52
M+E D++H+ LS ++ L LR F + D + + ++
Sbjct: 1 MLEIFDVNHTLLSETKSGELFTLRKETFKDRLNWAVQCTDGMEFDQYDNN----NTTYLF 56

Query: 53 GWKNDELVAYARILKSDDD 71
G K++ ++ R +++
Sbjct: 57 GIKDNTVICSLRFIETKYP 75


38  S2555  S2577  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S2555     -1      17       -3.570485  3-ketoacyl-CoA thiolase
S2556     0       19       -5.785842  hypothetical protein
S2557     1       18       -4.883207  long-chain fatty acid outer membrane
S2558     1       19       -4.502014  hypothetical protein
S2559     1       15       -0.927716  lipoprotein precursor
S2560     1       17       -1.249414  transporter
S2565     1       24       -3.217642  *IS911 orfA
S2566     0       26       -4.267267  IS4 orf
S2470     0       28       -5.602213  hypothetical protein
S2568     0       33       -8.567696  sucrose specific repressor
S2570     0       35       -9.299638  D-serine dehydratase
S2571     0       36       -9.718984  multidrug resistance protein Y
S2572     0       35       -8.999473  multidrug resistance protein K
S2573     1       34       -8.571625  DNA-binding transcriptional activator EvgA
S2574     1       35       -8.537495  hybrid sensory histidine kinase in two-component
S2575     3       32       -6.327381  hypothetical protein
S2577     1       28       -5.107069  oxalyl-CoA decarboxylase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2559     VACJLIPOPROT  407    e-148    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 407 bits (1048), Expect = e-148
Identities = 251/251 (100%), Positives = 251/251 (100%)

Query: 1 MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR 60
MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR
Sbjct: 1 MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR 60

Query: 61 DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM 120
DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM
Sbjct: 61 DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM 120

Query: 121 ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADALYPVLSWLTWPM 180
ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADALYPVLSWLTWPM
Sbjct: 121 ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADALYPVLSWLTWPM 180

Query: 181 SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA 240
SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA
Sbjct: 181 SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA 240

Query: 241 IQDDLKDIDSE 251
IQDDLKDIDSE
Sbjct: 241 IQDDLKDIDSE 251


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S2571     TCRTETB  119    3e-31    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 119 bits (300), Expect = 3e-31
Identities = 98/408 (24%), Positives = 168/408 (41%), Gaps = 25/408 (6%)

Query: 19 VTIALSLATFMQMLDSTISNVAIPTISGFLGASTDEGTWVITSFGVANAIAIPVTGRLAQ 78
+ I L + +F +L+ + NV++P I+ WV T+F + +I V G+L+
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 79 RIGELRLFLLSVTFFSLSSLMCSLS-TNLDVLIFFRVVQGLMAGPLIPLSQSLLLRNYPP 137
++G RL L + S++ + + +LI R +QG A L ++ R P
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 138 EKRTFALALWSMTVIIAPICGPILGGYICDNFSWGWIFLINVPMGIIVLTLCLTLLKGRE 197
E R A L V + GP +GG I W +L+ +PM I+ L L +E
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKE 192

Query: 198 TETSPVKMNLPRLTLLVLGVGGLQIMLDKGRDLDWFNSSTIIILTVVSVISLISLVIWES 257
K + ++++ VG + ML F +S I +VSV+S + V
Sbjct: 193 VRI---KGHFDIKGIILMSVGIVFFML--------FTTSYSISFLIVSVLSFLIFVKHIR 241

Query: 258 TSENPILDLSLFKSRNFTIGIVSITCAYLFYSGAIVLMPQLLQKTMGYNAIWAGLAYAPI 317
+P +D L K+ F IG++ + +G + ++P +++ + G
Sbjct: 242 KVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFP 301

Query: 318 GIMPLLISPLIG-----RYGNKIDMRVLVTFSFLMYAVCYYWRSVTFMPTIDFTGIILPQ 372
G M ++I IG R G + + VTF +V + S T F II+
Sbjct: 302 GTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL----SVSFLTASFLLETTSWFMTIIIVF 357

Query: 373 FFQGFAVACFFLPLTTISFSGLPDNKFANASSMSNFFRTLSGSVGTSL 420
G + ++TI S L + S+ NF LS G ++
Sbjct: 358 VLGGLSFTK--TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
S2572     RTXTOXIND  77     1e-17    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 77.2 bits (190), Expect = 1e-17
Identities = 63/419 (15%), Positives = 125/419 (29%), Gaps = 96/419 (22%)

Query: 8 KKQSNRKKYFSLLVIVLFIAFSGAYAYWSMELEDMISTDDAYVT-GNADPISAQVSGSVT 66
+ +R+ I+ F+ + + ++E + + + G + I + V
Sbjct: 50 ETPVSRRPRLVAYFIMGFLVIAFILSVLG-QVEIVATANGKLTHSGRSKEIKPIENSIVK 108

Query: 67 VVNHKDTNYVRQGDILVSLDKTDATIALNKA----------------------------- 97
+ K+ VR+GD+L+ L A K
Sbjct: 109 EIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPEL 168

Query: 98 -----------------------KNNLANIVRQTNKLYLQDKQYSAEVASARIQ---YQQ 131
K + Q + L + AE + + Y+
Sbjct: 169 KLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYEN 228

Query: 132 SLEDYNRRV----PLAKQGVISKE----------TLEHTKDTLISSKAALNAAIQAYKAN 177
R+ L + I+K + S + + I + K
Sbjct: 229 LSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEE 288

Query: 178 KALVMN-------TPLNR-QPQVVEAADATKEAWLVLKRTDIRSPVTGYIAQRSVQ-VGE 228
LV L + + + + + IR+PV+ + Q V G
Sbjct: 289 YQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGG 348

Query: 229 TVSSGQSLMAVVPARQ-MWVNANFKETQLTDVRIGQSVNIISDLYGENVVFHGRVTGINM 287
V++ ++LM +VP + V A + + + +GQ+ I + F G +
Sbjct: 349 VVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE------AFPYTRYGYLV 402

Query: 288 GTGNAFSLLPAQNATGNWIKIVQRVPVEVSLDPKELMEH----PLRIGLSMTATIDTKD 342
G + + +V V +S++ L PL G+++TA I T
Sbjct: 403 GK---VKNINLDAIEDQRLGLVFNVI--ISIEENCLSTGNKNIPLSSGMAVTAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S2573     HTHFIS  49     3e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.1 bits (117), Expect = 3e-09
Identities = 22/148 (14%), Positives = 53/148 (35%), Gaps = 31/148 (20%)

Query: 4 IIIDDHPLAIAAIRNLLIKNDIEILAELTEGGSAVQRVETLKPDIVIIDVDIPGVNGIQV 63
++ DD + L + ++ + + + + D+V+ DV +P N +
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 64 LETLRKRQYSGIIIIVSAKNDHFYGKHCADAGANGFVSKKEGMNNIIAAIEAAKNGYCYF 123
L ++K + ++++SA+N + AI+A++ G +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNT------------------------FMTAIKASEKGAYDY 101

Query: 124 ---PFSLNRFVGSLTSDQQKLDSLSKQE 148
PF L + + L ++
Sbjct: 102 LPKPFDLTE---LIGIIGRALAEPKRRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S2574     HTHFIS  80     2e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.9 bits (197), Expect = 2e-17
Identities = 30/105 (28%), Positives = 51/105 (48%)

Query: 960 SILIADDHPTNRLLLKRQLNLLGYDVDEATDGVQALHKVSMQHYDLLITDVNMPNMDGFE 1019
+IL+ADD R +L + L+ GYDV ++ ++ DL++TDV MP+ + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 1020 LTRKLREQNSSLPIWGLTANAQANEREKGLSCGMNLCLFKPLTLD 1064
L ++++ LP+ ++A K G L KP L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


39  S2632  S2649  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S2632     2       17       0.708105   hypothetical protein
S2633     1       18       0.624022   hypothetical protein
S2634     1       18       1.708712   hypothetical protein
S2635     1       16       1.529833   acetyltransferase
S2636     1       16       0.383013   N-acetylmuramoyl-l-alanine amidase I
S2638     0       18       0.481109   ARAC-type regulatory protein
S2639     -1      19       -0.947492  hypothetical protein
S2641     0       20       -2.656017  hypothetical protein
S2642     -1      24       -4.967292  hypothetical protein
S2643     -1      19       -5.566672  hypothetical protein
S2644     5       35       -8.210206  hypothetical protein
S2645     5       34       -7.355714  hypothetical protein
S2646     6       35       -7.733690  hypothetical protein
S2647     5       34       -7.229359  hypothetical protein
S2648     3       32       -6.870858  hypothetical protein
S2649     0       14       -3.175484  amino acid antiporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2635     SACTRNSFRASE  31     6e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 31.5 bits (71), Expect = 6e-04
Identities = 15/102 (14%), Positives = 38/102 (37%), Gaps = 4/102 (3%)

Query: 24 LRPWNDPEMDIERKMNHDVSLFLVAEVNGEVVG--TVMGGYDGHRGSAYYLGVHPEFRGR 81
+ + D +MD+ + FL + +G + ++G + V ++R +
Sbjct: 47 FKQYEDDDMDVSYVEEEGKAAFL-YYLENNCIGRIKIRSNWNG-YALIEDIAVAKDYRKK 104

Query: 82 GIANALLNRLEKKLIARGCPKIQINVPEDNDMVLGMYERLGY 123
G+ ALL++ + + + + N Y + +
Sbjct: 105 GVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


40  S2697  S2720  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S2697     4       19       0.121773   IS600 orfA
S2698     2       17       1.289089   IS600 orfB
S2700     0       23       4.334762   hypothetical protein
S2701     1       24       4.959327   outer membrane lipoprotein
S2702     3       32       8.064111   hypothetical protein
S2703     1       37       9.746723   dihydropteroate synthase
S2704     1       40       10.024905  phosphoglucosamine mutase
S2705     2       43       10.124166  transposase
S2706     5       49       10.908880  hypothetical protein
S2707     4       44       10.661855  resolvase
S2708     4       36       7.807284   transcriptional regulator
S2709     4       35       7.619684   arsenical resistance protein
S2710     4       34       6.797560   arsenate reductase
S2711     2       33       6.624628   sodium bile acid symporter family protein
S2715     3       28       4.894061   hypothetical protein
S2716     2       30       4.070854   mating pair formation protein
S2718     1       32       5.218917   conjugal transfer protein TrbJ
S2719     0       22       3.723496   stabilization protein
S2720     1       22       3.450573   replication protein C
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
S2702     IGASERPTASE  28     0.024    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.1 bits (62), Expect = 0.024
Identities = 19/124 (15%), Positives = 40/124 (32%), Gaps = 6/124 (4%)

Query: 34 QQGKNEEQRQHDEWVAERNREIQQEKQRRANAQAAANKRAATAAANKKARQDKLDAEATA 93
Q + ++ + + + E+ Q Q K AT +KA+ + +
Sbjct: 1064 QNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVET-EKTQEV 1122

Query: 94 DKKRDQSYEDELRSLEIQKQKLALAKEEARVKRENEFIDQELKHKAAQTDVVQSEADANR 153
K Q + +S +Q Q + + V I + D Q + +
Sbjct: 1123 PKVTSQVSPKQEQSETVQPQAEPARENDPTVN-----IKEPQSQTNTTADTEQPAKETSS 1177

Query: 154 NMTE 157
N+ +
Sbjct: 1178 NVEQ 1181


41  S2733  S2747  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S2733     2       15       1.197629   4-hydroxy-3-methylbut-2-en-1-yl diphosphate
S2734     0       12       1.823668   hypothetical protein
S2735     1       15       1.631772   hypothetical protein
S2736     1       14       1.844394   nucleoside diphosphate kinase
S2738     0       12       3.118351   ISSfl4 orf
S2741     0       16       2.167808   enhanced serine sensitivity protein SseB
S2742     0       18       3.128455   aminopeptidase B
S2743     2       21       2.226977   hypothetical protein
S2744     2       25       2.075326   [2FE-2S] ferredoxin, electron carrier protein
S2745     2       25       2.136788   chaperone protein HscA
S2746     1       24       0.592154   co-chaperone HscB
S2747     2       28       0.918456   iron-sulfur cluster assembly protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
S2734     IGASERPTASE  28     0.044    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.5 bits (63), Expect = 0.044
Identities = 18/120 (15%), Positives = 34/120 (28%), Gaps = 5/120 (4%)

Query: 137 KAQQEEITTMADQSSAELSSNSEQGQSVPLNTSTTTDPATTSTPPASVDTTATNTQTPAV 196
++ E T ++ + E+ V + T+ P + Q
Sbjct: 1087 QSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPA 1146

Query: 197 TAPAPAVDPQQNAVVSPSQANVDTAATPVPTAATTPDGAAPLPTDQAGVTTPAADPNALV 256
P V+ ++ P TA T P T+ + P+ T + N
Sbjct: 1147 RENDPTVNIKE-----PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPEN 1201


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
S2741     STREPKINASE  30     0.014    Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 29.7 bits (66), Expect = 0.014
Identities = 27/120 (22%), Positives = 52/120 (43%), Gaps = 21/120 (17%)

Query: 127 GNPLSSQEILEGGESLILSE-----VAEPPAQMIDSLTTLFKTIKPVKRAFICSIKENEE 181
G+ ++SQE+L +S++ + E + ++ +F+TI P+ + F +K E+
Sbjct: 217 GDTITSQELLAQAQSILNKNHPGYTIYERDSSIVTHDNDIFRTILPMDQEFTYRVKNREQ 276

Query: 182 A-QPNLLIGIEADGDIEEIIQATGSVATDTLPGDEPIDICQVKKGEKGISHFITEHIAPF 240
A + N G+ + + ++I V +KKGEK F H+ F
Sbjct: 277 AYRINKKSGLNEEINNTDLISEKYYV---------------LKKGEKPYDPFDRSHLKLF 321


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2745     SHAPEPROTEIN  114    9e-30    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 114 bits (286), Expect = 9e-30
Identities = 80/371 (21%), Positives = 143/371 (38%), Gaps = 74/371 (19%)

Query: 23 GIDLGTTNSLVATVRSGQAETLADHEGRHLLPSVVHYQQQGHS-------VGYDARTNAA 75
IDLGT N+L+ G + +E PSVV +Q VG+DA+
Sbjct: 14 SIDLGTANTLIYVKGQG----IVLNE-----PSVVAIRQDRAGSPKSVAAVGHDAK-QML 63

Query: 76 LDTANTISSVKRLMGRSLADIQQRYPHLPYQFQASENGLPMIETAAGLLNPVRVSADILK 135
T I++++ + +AD V+ +L+
Sbjct: 64 GRTPGNIAAIRPMKDGVIADF-------------------------------FVTEKMLQ 92

Query: 136 ALAARATEALAGE-LDGVVITVPAYFDDAQRQGTKDAARLAGLHVLRLLNEPTAAAIAYG 194
+ V++ VP +R+ +++A+ AG + L+ EP AAAI G
Sbjct: 93 HFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAG 152

Query: 195 LDSGQEGVIAVYDLGGGTFDISILRLSRGVFEVLATGGDSALGGDDFDHLLADYIREQAD 254
L + V D+GGGT +++++ L+ V +GGD FD + +Y+R
Sbjct: 153 LPVSEATGSMVVDIGGGTTEVAVISLNGVV-----YSSSVRIGGDRFDEAIINYVRRNYG 207

Query: 255 --IPDRSDNRVQRELLDATIAAKIALSDADSVTVNVAG---WQG-----EISREQFNELI 304
I + + R++ E+ A + + V G +G ++ + E +
Sbjct: 208 SLIGEATAERIKHEI-------GSAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEAL 260

Query: 305 APLVKRTLLACRRALKDAGVE-ADEVLE--VVMVGGSTRVPLVRERVGEFFGRPPLTSID 361
+ + A AL+ E A ++ E +V+ GG + + + E G P + + D
Sbjct: 261 QEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLLMEETGIPVVVAED 320

Query: 362 PDKVVAIGAAI 372
P VA G
Sbjct: 321 PLTCVARGGGK 331


42  S2779  S2790  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S2779     2       24       -1.398758  hypothetical protein
S2780     4       25       -1.698340  DNA-binding transcriptional regulator
S2781     5       28       -2.042789  DNA-invertase
S2782     5       29       -1.791600  invasion plasmid antigen
S2783     3       23       1.373982   hypothetical protein
S2784     2       22       0.354027   tail fiber protein
S2785     3       23       -0.409105  tail fiber assembly protein
S2786     3       24       -0.064193  tail fiber protein
S2787     2       24       -0.397675  insertion sequence 2 OrfA protein
S2788     3       24       -0.430728  insertion element IS2 transposase InsD
S2789     3       25       -1.262388  tail fiber protein
S2790     2       27       -1.260476  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
S2781     STREPKINASE  28     0.019    Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 27.8 bits (61), Expect = 0.019
Identities = 15/24 (62%), Positives = 18/24 (75%), Gaps = 2/24 (8%)

Query: 42 RPGLK--KLLKTLSAGDTLVVWKL 63
RPGLK KLLKTL+ GDT+ +L
Sbjct: 202 RPGLKDTKLLKTLAIGDTITSQEL 225


43  S2848  S2885  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
S2848     2       14       -1.086591  hypothetical protein
S2850     2       13       -1.333195  transporter
S2851     2       15       -1.627843  heat shock protein GrpE
S2852     1       15       -1.444808  inorganic polyphosphate/ATP-NAD kinase
S2853     2       17       -1.709226  recombination and repair protein
S2854     1       22       -2.264457  small membrane protein A
S2855     2       22       -2.081881  hypothetical protein
S2856     1       21       -1.147694  hypothetical protein
S2857     1       24       -0.395946  SsrA-binding protein
S2859     2       27       -1.986613  integrase
S2860     -1      16       -0.270902  IS1 orfA
S2861     -3      17       1.694781   IS3 orfA
S2862     -1      20       3.350624   IS3 orfB
S2863     1       15       2.055200   IS3 orfB
S2870     1       14       1.790283   *hypothetical protein
S2871     0       15       2.169546   hypothetical protein
S2872     2       17       2.265336   hydroxyglutarate oxidase
S2874     0       15       1.664347   4-aminobutyrate aminotransferase
S2875     -1      17       0.157232   gamma-aminobutyrate transporter
S2876     -1      25       0.775891   DNA-binding transcriptional regulator CsiR
S2877     -1      23       -0.586841  LysM domain/BON superfamily protein
S2878     -1      23       -0.615713  hypothetical protein
S2879     -2      22       -0.959469  insertion element IS2 transposase InsD
S2880     1       23       -3.153886  insertion sequence 2 OrfA protein
S2881     0       24       -3.511844  hypothetical protein
S2882     2       22       -2.800598  hypothetical protein
S2883     1       12       -1.039503  DNA binding protein, nucleoid-associated
S2884     1       12       -1.238975  hypothetical protein
S2885     3       15       -1.539361  hypothetical protein
44  S2921  S2932  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S2921     1       15       3.119981   DNA-binding transcriptional repressor SrlR
S2922     1       15       3.606439   D-arabinose 5-phosphate isomerase
S2923     1       17       3.173515   anaerobic nitric oxide reductase transcriptional
S2924     1       17       2.736264   anaerobic nitric oxide reductase
S2925     0       16       3.441376   nitric oxide reductase
S2926     0       17       3.974127   transcriptional regulatory protein
S2927     -1      22       4.054598   electron transport protein HydN
S2928     -1      21       4.423020   ascBF operon repressor
S2929     -1      24       4.617452   iron-sulfur protein of hydrogenase 3 (part of
S2930     0       23       4.530536   large subunit of hydrogenase 3 (part of FHL
S2931     2       21       4.211168   membrane-spanning protein of hydrogenase 3 (part
S2932     2       21       3.784401   formate hydrogenlyase subunit 3
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2921     ARGREPRESSOR  29     0.014    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 28.7 bits (64), Expect = 0.014
Identities = 20/105 (19%), Positives = 35/105 (33%), Gaps = 17/105 (16%)

Query: 1 MKPRQRQAAILEYLQKQGKCSVEEL-----AQYFDTTGTTIRKDLVILEHAGTVIRTYGG 55
M QR I E + + +EL ++ T T+ +D+ E + T G
Sbjct: 1 MNKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIK--ELHLVKVPTNNG 58

Query: 56 ---VVLNKEESDPPIDHKTLINTHKKELIAEAAVSFIHDGDSIIL 97
L ++ P+ K + +A V I+L
Sbjct: 59 SYKYSLPADQRFNPLS-------KLKRSLMDAFVKIDSASHLIVL 96


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S2923     HTHFIS  374    e-127    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 374 bits (961), Expect = e-127
Identities = 125/388 (32%), Positives = 194/388 (50%), Gaps = 33/388 (8%)

Query: 149 IAALAAGALS----------NALLIEQLESQNMLPGDATPFEAVKQTQMIGLSPGMTQLK 198
I A GA +I + ++ ++ ++G S M ++
Sbjct: 91 IKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIY 150

Query: 199 KEIEIVAASDLNVLISGETGTGKELVAKAIHEASPRAVNPLVYLNCAALPESVAESELFG 258
+ + + +DL ++I+GE+GTGKELVA+A+H+ R P V +N AA+P + ESELFG
Sbjct: 151 RVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFG 210

Query: 259 HVKGAFTGAISNRSGKFEMADNGTLFLDEIGELSLALQAKLLRVLQYGDIQRVGDDRSLR 318
H KGAFTGA + +G+FE A+ GTLFLDEIG++ + Q +LLRVLQ G+ VG +R
Sbjct: 211 HEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIR 270

Query: 319 VDVRVLAATNRDLREEVLAGRFRADLFHRLSVFPLSVPPLRERGDDVILLAGYFCEQCRL 378
DVR++AATN+DL++ + G FR DL++RL+V PL +PPLR+R +D+ L +F +Q
Sbjct: 271 SDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAE- 329

Query: 379 RQGLSRVVLSAGARNLLQHYSFPGNVRELEHAIHRAVVLARATRNGDEVIL-----EAQH 433
++GL A L++ + +PGNVRELE+ + R L E+I E
Sbjct: 330 KEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPD 389

Query: 434 FAFPEVTLPPPEAAAVPVVKQNLR-----------------EATEAFQRETIRQALAQNH 476
+ + V++N+R + I AL
Sbjct: 390 SPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATR 449

Query: 477 HNWAACARMLETDVANLHRLAKRLGLKD 504
N A +L + L + + LG+
Sbjct: 450 GNQIKAADLLGLNRNTLRKKIRELGVSV 477


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S2932     PF00577  30     0.037    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 29.8 bits (67), Expect = 0.037
Identities = 9/47 (19%), Positives = 13/47 (27%), Gaps = 2/47 (4%)

Query: 4 PLFWHFSFQKALSGWIAGIGGAVGS--LYTAAAGFTVLTGAVGVSGA 48
P F+ + L GG + G GA+G
Sbjct: 393 PRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSV 439


45  S3052  S3060  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
S3052     2       18       -1.000736  acetyl-CoA acetyltransferase
S3053     5       25       -1.803112  transporter protein
S3055     7       28       -0.590438  IS3 orfA
S3056     10      33       -1.101746  IS3 orfB
S3059     10      32       -1.920105  hypothetical protein
S3060     3       21       -1.230586  hypothetical protein
46  S3148  S3155  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S3148     2       24       -3.709691  hypothetical protein
S3149     1       23       -3.325895  deoxyribonucleotide triphosphate
S3150     0       23       -5.200618  coproporphyrinogen III oxidase
S3151     0       26       -7.259188  hypothetical protein
S3152     -2      19       -5.392541  hypothetical protein
S3153     -2      14       -3.928390  hypothetical protein
S3154     -2      10       -3.378660  IS1 orfA
S3155     -2      10       -3.059981  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3152     ANTHRAXTOXNA  29     0.012    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 29.3 bits (65), Expect = 0.012
Identities = 17/55 (30%), Positives = 23/55 (41%), Gaps = 3/55 (5%)

Query: 128 ETAAKKSEAYQQKLWEKIDADTRAQAKAMGGEIVKVDKAPFR-KAVQPLFDDFKK 181
ET K + Q L +KI D +GGEI D K +Q L ++ K
Sbjct: 74 ETLDKIQQT--QDLLKKIPKDVLEIYSELGGEIYFTDIDLVEHKELQDLSEEEKN 126


47  S3165  S3209  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S3165     0       20       -4.882937  ornithine decarboxylase
S3166     2       24       -5.766992  transporter
S4822     4       24       -4.858793  *P4-type integrase
S3169     4       23       -4.728580  superfamily I DNA helicase
S4823     6       25       -4.005477  hypothetical protein
S4824     6       26       -4.523413  serine protease
S3177     6       25       -0.212857  hypothetical protein
S3178     6       24       -0.021894  serine protease precursor
S3180     4       24       0.277918   reverse transcriptase-like protein
S3181     4       25       0.660706   reverse transcriptase-like protein
S3184     4       26       0.628619   IS629 orfA
S3185     5       26       0.642802   hypothetical protein
S3186     4       25       0.619818   hypothetical protein
S3187     4       23       -0.255077  hypothetical protein
S3188     2       28       -4.044850  insertion sequence 2 OrfA protein
S3189     3       24       -2.290497  insertion element IS2 transposase InsD
S3190     6       24       -2.031133  hypothetical protein
S3191     7       22       0.908891   hypothetical protein
S4825     6       23       0.718401   hypothetical protein
S4826     6       23       0.648455   hypothetical protein
S3193     8       24       3.112664   hypothetical protein
S3194     8       24       2.820792   outer membrane fluffing protein
S3195     8       25       2.896897   outer membrane fluffing protein
S3196     7       25       1.341360   hypothetical protein
S3197     8       29       3.471761   hypothetical protein
S3199     8       30       4.609438   hypothetical protein
S3200     8       28       4.055125   hypothetical protein
S3201     7       27       1.641533   RADC family DNA repair protein
S3202     6       27       0.392507   hypothetical protein
S3203     4       27       1.028310   structural protein
S3204     3       24       -0.688956  hypothetical protein
S3205     2       21       -2.855607  hypothetical protein
S3207     3       21       -2.986667  hypothetical protein
S3209     2       21       -2.230759  IS3 orfA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
S4824     IGASERPTASE  283    3e-80    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 283 bits (726), Expect = 3e-80
Identities = 136/571 (23%), Positives = 223/571 (39%), Gaps = 122/571 (21%)

Query: 33 KKVILGIILSSIYGSYGETAFA-AMLDINNIWTRDYLDLAQNRGEFRPGATNVQLMMKDG 91
KK L I ++ +Y T + A L +++ + + D A+N+G+F GATNV + K+
Sbjct: 4 KKFKLNFIALTV--AYALTPYTEAALVRDDVDYQIFRDFAENKGKFSVGATNVLVKDKNN 61

Query: 92 KIFH--FPE-LPVPDFSAVS-NKGATTSIGGAYSVTATH--------------------N 127
K P +P+ DFS V +K T I Y V H N
Sbjct: 62 KDLGTALPNGIPMIDFSVVDVDKRIATLINPQYVVGVKHVSNGVSELHFGNLNGNMNNGN 121

Query: 128 GTQHHAITTQSWDQTAYKASNRVSS----------------GDFSVHRLNKFVVETTGVT 171
H ++++ + + + + D+ + RL+KFV T
Sbjct: 122 AKAHRDVSSEENRYFSVEKNEYPTKLNGKTVTTEDQTQKRREDYYMPRLDKFV------T 175

Query: 172 ESADFSLSPEDAMKRYGVNYNGKEQ-IIGFRAGAGTTSTILNGKQY-------------- 216
E A S + YN + + R G+G+ G Y
Sbjct: 176 EVAPIEASTASS---DAGTYNDQNKYPAFVRLGSGSQFIYKKGDNYSLILNNHEVGGNNL 232

Query: 217 -LFGQNYNPDLLSASLFNLDWKNKSYIYT--------------NRTPFKNSPIFGDSGSG 261
L G Y ++ + + ++ +N I ++ P N + GDSGS
Sbjct: 233 KLVGDAYTY-GIAGTPYKVNHENNGLIGFGNSKEEHSDPKGILSQDPLTNYAVLGDSGSP 291

Query: 262 SYLYDKEQQKWVFHGVTSTVGFISSTNIAWTNYSLFNNILVNNLKKNFTNTMQLDGKKQE 321
++YD+E+ KW+F G + +W ++++ + ++ + + K
Sbjct: 292 LFVYDREKGKWLFLGSYD--FWAGYNKKSWQEWNIYKSQFTKDVLNKDSAGSLIGSKTDY 349

Query: 322 LSSIIKD-------------------------KDLSVSGGGVLTLKQDTDLGIGGLIFDK 356
S K ++ G G LTL + D G GGL F+
Sbjct: 350 SWSSNGKTSTITGGEKSLNVDLADGKDKPNHGKSVTFEGSGTLTLNNNIDQGAGGLFFEG 409

Query: 357 NQTYKVYGKDKSYKGAGIDIDNNTTVEWNVKGVAGDNLHKIGSGTLDVKIAQGN--NLKI 414
+ K + ++KGAG+ + TV W V D L KIG GTL V+ N +LK+
Sbjct: 410 DYEVKGTSDNTTWKGAGVSVAEGKTVTWKVHNPQYDRLAKIGKGTLIVEGTGDNKGSLKV 469

Query: 415 GNGTVIL------SAEKAFNKIYMAGGKGTVKINAKDALSESGNGEIYFTRNGGTLDLNG 468
G+GTVIL S + AF + + G+ T+ +N + + IYF GG LDLNG
Sbjct: 470 GDGTVILKQQTNGSGQHAFASVGIVSGRSTLVLNDDKQVDPNS---IYFGFRGGRLDLNG 526

Query: 469 YDQSFQKIAATDAGTTVTNSNVKQ-STLSLT 498
+F I D G + N N+ S +++T
Sbjct: 527 NSLTFDHIRNIDDGARLVNHNMTNASNITIT 557


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3178     IGASERPTASE   481    e-150    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 481 bits (1239), Expect = e-150
Identities = 162/546 (29%), Positives = 255/546 (46%), Gaps = 52/546 (9%)

Query: 6 DAPVDFVSGLGPLNWTYDKTSGTGTLSQGSKNWTMHGQKDNDLNAGKNLVFSGQNGAIIL 65
D+ + +W+ + + T T + S N + KD N GK++ F G +G + L
Sbjct: 337 DSAGSLIGSKTDYSWSSNGKTSTITGGEKSLNVDLADGKDKP-NHGKSVTFEG-SGTLTL 394

Query: 66 KDSVTQGAGYLEFKDSYTVSAES-GKTWTGAGIITDKGTNVTWKVNGVAGDNLHKLGEGT 124
+++ QGAG L F+ Y V S TW GAG+ +G VTWKV+ D L K+G+GT
Sbjct: 395 NNNIDQGAGGLFFEGDYEVKGTSDNTTWKGAGVSVAEGKTVTWKVHNPQYDRLAKIGKGT 454

Query: 125 LTINGTGVNPGGLKTGDGIVVLNQQADTAGNIQAFSSVNLASGRPTVVLGDARQVNPDNI 184
L + GTG N G LK GDG V+L QQ + +G AF+SV + SGR T+VL D +QV+P++I
Sbjct: 455 LIVEGTGDNKGSLKVGDGTVILKQQTNGSGQ-HAFASVGIVSGRSTLVLNDDKQVDPNSI 513

Query: 185 SWGYRGGKLDLNGNAVTFTRLQAADYGAVITN-NAQQKSQLLLDLKAQDT--------NV 235
+G+RGG+LDLNGN++TF ++ D GA + N N S + + ++ T N+
Sbjct: 514 YFGFRGGRLDLNGNSLTFDHIRNIDDGARLVNHNMTNASNITITGESLITDPNTITPYNI 573

Query: 236 SEPTIGNISPFGGTGTPGNLYSMILNSQTRFYILKSASYGNTLWGNSLNDPAQWEFVGMN 295
P N F G LY + L + T + + K AS + L NS W ++G
Sbjct: 574 DAPDEDNPYAFRRIKDGGQLY-LNLENYTYYALRKGASTRSELPKNSGESNENWLYMGKT 632

Query: 296 KNKAVQTVKDRILAGRAKQPVIF----HGQLTGNMDVAIPQVPGGRKVIFDGSVNLPEGT 351
++A + V + I R + G+ GN++V + + G NL G
Sbjct: 633 SDEAKRNVMNHINNERMNGFNGYFGEEEGKNNGNLNVTFKGKSEQNRFLLTGGTNL-NGD 691

Query: 352 LSQDSGTLIFQGHPVIHA-SISGSAPVSLN-----------QKDWENRQFTMKTLSLK-D 398
L+ + GTL G P HA I+G + + + DW NR F T+++ +
Sbjct: 692 LTVEKGTLFLSGRPTPHARDIAGISSTKKDPHFAENNEVVVEDDWINRNFKATTMNVTGN 751

Query: 399 ADFHLSRN-ASLNSDIKSDNS---HITLGSDRAFVDKNDGTGNYVIPEEGTSVPDTVNDR 454
A + RN A++ S+I + N HI + ++D TG T D ++D+
Sbjct: 752 ASLYSGRNVANITSNITASNKAQVHIGYKTGDTVCVRSDYTGY------VTCTTDKLSDK 805

Query: 455 -------SQYEGNITLNHNSALDIGSR--FTGGIDAYDSAVSITSPDVLLTAPGAFAGSS 505
+ GN+ L ++ +G F +S V +T + G
Sbjct: 806 ALNSFNPTNLRGNVNLTESANFVLGKANLFGTIQSRGNSQVRLTE-NSHWHLTGNSDVHQ 864

Query: 506 LTVHDG 511
L + +G
Sbjct: 865 LDLANG 870



Score = 77.0 bits (189), Expect = 3e-16
Identities = 41/154 (26%), Positives = 65/154 (42%), Gaps = 18/154 (11%)

Query: 563 DNAALEITRGAHASGDIHASAASTVTIGSDTPAELASAETAASAFAG--------SLLEG 614
+ A + I + + + VT +D ++ A + G + + G
Sbjct: 771 NKAQVHIGYKTGDTVCVRSDYTGYVTCTTDKLSDKALNSFNPTNLRGNVNLTESANFVLG 830

Query: 615 YNAAFNGAITGGRADVSM-HNALWTLGGDSAIHSLTVRNSRI------SSEGDRTFRTLT 667
F + G + V + N+ W L G+S +H L + N I +S + TLT
Sbjct: 831 KANLFGTIQSRGNSQVRLTENSHWHLTGNSDVHQLDLANGHIHLNSADNSNNVTKYNTLT 890

Query: 668 VNKLDATGSDFVLRTDLKN--ADKINVTEKATGS 699
VN L GS + L TDL N DK+ VT+ ATG+
Sbjct: 891 VNSLSGNGSFYYL-TDLSNKQGDKVVVTKSATGN 923


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3195     PRTACTNFAMLY  42     6e-06    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 42.3 bits (99), Expect = 6e-06
Identities = 110/539 (20%), Positives = 181/539 (33%), Gaps = 70/539 (12%)

Query: 103 VNGGLFTARGGTLAGTTTLNNGAILTLSGKTV---NNDTLTIR-EGDALLQGGALTGNGS 158
G T GG+L+ G ++ G L+I + A QG AL
Sbjct: 324 GRGARVTVSGGSLSAPH----GNVIETGGARRFAPQAAPLSITLQAGAHAQGKALLYRVL 379

Query: 159 VEKSGSGTLTVSNTTLTQKAVNLNEGTLTLNDSTVTTDVIAQRGTALKLTGSTVLNGAID 218
E LT++ Q + E S DV GA
Sbjct: 380 PEPV---KLTLTGGADAQGDIVATELPSIPGTSIGPLDVALASQARWT--------GATR 428

Query: 219 PTNVTLASGATWNIPDNATVQSVVDDLSHAGQIHF-TSTRTGKFVPATLKVKNLNGQNGT 277
+ ATW + DN+ V ++ L+ G + F G+F L V L G +G
Sbjct: 429 AVDSLSIDNATWVMTDNSNVGAL--RLASDGSVDFQQPAEAGRF--KVLTVNTLAG-SGL 483

Query: 278 ISLRVRPDMAQNNADRLVIDGGRATGKTILNLVNAGNSASGLATSGKGIQVVEAINGATT 337
+ V D+ + D+LV+ A+G+ L + N+G+ + T + V + A T
Sbjct: 484 FRMNVFADLGLS--DKLVVMQD-ASGQHRLWVRNSGSEPASANTL---LLVQTPLGSAAT 537

Query: 338 EEGAFIQGNKLQAGAFNYSLNRDSDESWYLRSENAYRAEVPLYASMLTQAMDYDRILAGS 397
A + K+ G + Y L + + W L A A P
Sbjct: 538 FTLAN-KDGKVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQPQPEAP 596

Query: 398 RSHQTGVSGENNSVRLSIQGGHLGHDNNGGIARG-----------ATPESSGSYG--FVR 444
+ + ++ G +G + A P++ G++G F +
Sbjct: 597 APQPPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRLNPDAGGAWGRGFAQ 656

Query: 445 LE------GDLLRTEVAG--------MSVTAGVYGAAGHSSVDVKDDDGSRAGTVRDDAG 490
+ G +VAG ++V G + G + D + G D+
Sbjct: 657 RQQLDNRAGRRFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRGFTGDGGGHTDSV 716

Query: 491 SLGGYLNLIHNASGLWADIVAQGTRH-------SMKASSDNNDFRVRGWGWLGSLETGLP 543
+GGY I + SG + D + +R + +R G G SLE G
Sbjct: 717 HVGGYATYIAD-SGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGA--SLEAGRR 773

Query: 544 FSITDNLMLEPQLQYTWQGLSLDDGQ-DNASYVKFGHGSAQHVRAGFRLGSHHDMNFGK 601
F+ D LEPQ + + N V+ GS+ R G +G ++ G+
Sbjct: 774 FTHADGWFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEVGKRIELAGGR 832


48  S3365  S3370  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S3365     0       13       -3.126768  hypothetical protein
S3366     0       14       -4.150515  formate acetyltransferase 3
S3367     0       20       -6.499716  propionate/acetate kinase
S3368     -1      16       -5.171978  threonine/serine transporter TdcC
S3369     1       20       -4.759774  threonine dehydratase
S3370     0       13       -3.592239  DNA-binding transcriptional activator TdcA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3367     ACETATEKNASE  533    0.0      Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 533 bits (1374), Expect = 0.0
Identities = 173/397 (43%), Positives = 254/397 (63%), Gaps = 11/397 (2%)

Query: 11 VLVINCGSSSIKFSVLDASDCEVLMSGIADGINSENAFLSVN-GGEPAP--LAHHSYEGA 67
+LVINCGSSS+K+ ++++ D VL G+A+ I ++ L+ N GE ++ A
Sbjct: 3 ILVINCGSSSLKYQLIESKDGNVLAKGLAERIGINDSLLTHNANGEKIKIKKDMKDHKDA 62

Query: 68 LKAIAFELEKRNLN-----DSVALIGHRIAHGGSIFTESAIITDEVIDNIRRVSPLAPLH 122
+K + L + + +GHR+ HGG FT S +ITD+V+ I LAPLH
Sbjct: 63 IKLVLDALVNSDYGVIKDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDCIELAPLH 122

Query: 123 NYANLSGIESAQQLFPGVTQVAVFDTSFHQTMAPEAYLYGLPWKYYEELGVRRYGFHGTS 182
N AN+ GI++ Q+ P V VAVFDT+FHQTM AYLY +P++YY + +R+YGFHGTS
Sbjct: 123 NPANIEGIKACTQIMPDVPMVAVFDTAFHQTMPDYAYLYPIPYEYYTKYKIRKYGFHGTS 182

Query: 183 HRYVSQRAHSLLNLAEDDSGLVVAHLGNGASICAVRNGQSVDTSMGMTPLEGLMMGTRSG 242
H+YVSQRA +LN + ++ HLGNG+SI AV+NG+S+DTSMG TPLEGL MGTRSG
Sbjct: 183 HKYVSQRAAEILNKPIESLKIITCHLGNGSSIAAVKNGKSIDTSMGFTPLEGLAMGTRSG 242

Query: 243 DVDFGAMSWVASQTNQSLGDLERVVNKESGLLGISGLSSDLR-VLEKAWHEGHERAQLAI 301
+D +S++ + N S ++ ++NK+SG+ GISG+SSD R + + A+ G +RAQLA+
Sbjct: 243 SIDPSIISYLMEKENISAEEVVNILNKKSGVYGISGISSDFRDLEDAAFKNGDKRAQLAL 302

Query: 302 KTFVHRIARHIAGHAASLRRLDGIIFTGGIGENSSLIRRLVMEHLAVLGVEIDTEMNNRS 361
F +R+ + I +AA++ +D I+FT GIGEN IR +++ L LG ++D E N
Sbjct: 303 NVFAYRVKKTIGSYAAAMGGVDVIVFTAGIGENGPEIREFILDGLEFLGFKLDKEKNKVR 362

Query: 362 NSFGERIVSSENAHVICVVIPTNEEKMIALDAIHLGK 398
E I+S+ ++ V +V+PTNEE MIA D + +
Sbjct: 363 GE--EAIISTADSKVNVMVVPTNEEYMIAKDTEKIVE 397


49  S3412  S3426  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S3412     1       19       3.087061   hypothetical protein
S3413     -1      18       3.231983   GIY-YIG nuclease superfamily protein
S3414     -1      18       2.760965   hypothetical protein
S3415     -1      17       2.991178   hypothetical protein
S3416     -1      14       2.343133   collagenase
S3417     0       21       1.731916   hypothetical protein
S3418     1       25       1.419024   hypothetical protein
S3419     1       28       0.986068   tryptophan permease
S3420     4       32       0.917354   ATP-dependent RNA helicase DeaD
S3421     5       32       0.329285   lipoprotein NlpI
S3422     5       37       0.859137   polynucleotide phosphorylase/polyadenylase
S3423     5       32       0.372933   30S ribosomal protein S15
S3424     4       28       0.362339   tRNA pseudouridine synthase B
S3425     3       26       0.082201   ribosome-binding factor A
S3426     3       26       -0.064709  translation initiation factor IF-2
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3426     TCRTETOQM     73     2e-15    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.
Length = 639

Score = 73.4 bits (180), Expect = 2e-15
Identities = 69/313 (22%), Positives = 109/313 (34%), Gaps = 77/313 (24%)

Query: 388 IMGHVDHGKTSLLDYI-----RSTKVASGEAG-------------GITQHIGAYHVETEN 429
++ HVD GKT+L + + T++ S + G GIT G + EN
Sbjct: 8 VLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWEN 67

Query: 430 GMITFLDTPGHAAFTSMRARGAQATDIVVLVVAADDGVMPQTIEAIQHAKAAQVPVVVAV 489
+ +DTPGH F + R D +L+++A DGV QT + +P + +
Sbjct: 68 TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFI 127

Query: 490 NKIDKPEADPDRV----KNELSQYGI-----------------LPEEWG----------- 517
NKID+ D V K +LS + E+W
Sbjct: 128 NKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGNDDLLE 187

Query: 518 ---------------GESQFV---------HVSAKAGTGIDELLDAILLQAEVLELKAVR 553
ES H SAK GID L++ I +
Sbjct: 188 KYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVIT--NKFYSSTHRG 245

Query: 554 KGMASGAVIESFLDKGRGPVATVLVREGTLHKGDIVL-CGFEYGRVRAMRNELGQEVLEA 612
+ G V + + R +A + + G LH D V E ++ M + E+ +
Sbjct: 246 QSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKIKITEMYTSINGELCKI 305

Query: 613 GPSIPVEILGLSG 625
+ EI+ L
Sbjct: 306 DKAYSGEIVILQN 318


50  S3719  S3732  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
S3719     -2      17       3.260057   fructose-like phosphotransferase EIIB subunit 2
S3720     -2      17       3.355099   fructose-like permease EIIC subunit 2
S3727     -1      17       3.311968   catalase; hydroperoxidase HPI(I)
S3728     0       17       2.951587   5,10-methylenetetrahydrofolate reductase
S3729     2       16       4.025424   bifunctional aspartate kinase II/homoserine
S3730     1       13       2.329307   cystathionine gamma-synthase
S3731     2       13       2.829397   transcriptional repressor protein MetJ
S3732     3       12       2.770892   peptidoglycan peptidase
51  S3795  S3816  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S3795     0       12       -3.110480  aldose-1-epimerase
S3796     -1      13       -3.849976  alpha-glucosidase
S3798     -1      19       -5.769088  permease
S3800     1       13       -3.043856  IS1 orfB
S3801     2       20       -2.540281  IS1 orfA
S3802     1       17       -1.949886  resistance protein
S3803     1       17       0.007560   hypothetical protein
S3804     1       19       1.213033   transcriptional regulator
S3805     2       21       1.561226   GTP-binding protein
S3806     0       16       1.505634   glutamine synthetase
S3807     0       14       0.994253   nitrogen regulation protein NR(II)
S3808     0       14       -0.127072  nitrogen regulation protein NR(I)
S3809     -1      11       -2.991067  coproporphyrinogen III oxidase
S3810     -1      14       -3.814178  hypothetical protein
S3811     -2      15       -4.108018  ribosome biogenesis GTP-binding protein YsxC
S3813     -2      16       -4.509028  DNA polymerase I
S3814     -2      21       -6.489780  acyltransferase
S3815     -1      22       -6.105474  hypothetical protein
S3816     -1      16       -3.271282  periplasmic protein disulfide isomerase I
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3805     TCRTETOQM     180    2e-51    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.
Length = 639

Score = 180 bits (459), Expect = 2e-51
Identities = 97/445 (21%), Positives = 170/445 (38%), Gaps = 81/445 (18%)

Query: 4 KLRNIAIIAHVDHGKTTLVDKLLQQSGTFDSRAETQE--RVMDSNDLEKERGITILAKNT 61
K+ NI ++AHVD GKTTL + LL SG + D+ LE++RGITI T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 62 AIKWNDYRINIVDTPGHADFGGEVERVMSMVDSVLLVVDAFDGPMPQTRFVTKKAFAYGL 121
+ +W + ++NI+DTPGH DF EV R +S++D +L++ A DG QTR + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 122 KPIVVINKVDRPGARPDWVVDQVFD-------------LFVNLDATDEQLD--------- 159
I INK+D+ G V + + L+ N+ T+
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 160 --------------------------------FPIVYASALNGIAGLDHEDMAEDMTPLY 187
FP+ + SA N I G+D+ L
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNI-GIDN---------LI 231

Query: 188 QAIVDHVPAPDVDLDGPFQMQISQLDYNSYVGVIGIGRIKRGKVKPNQQVTIIDSEGKTR 247
+ I + + ++ +++Y+ + R+ G + V I + E
Sbjct: 232 EVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-- 289

Query: 248 NAKVGKVLGHLGLERIETDLAEAGDIVAITGLGELNISDTVCDTQNVEALPALSVDEPTV 307
K+ ++ + E + D A +G+IV + L ++ + DT+ + + P +
Sbjct: 290 --KITEMYTSINGELCKIDKAYSGEIVILQNEF-LKLNSVLGDTKLLPQRERIENPLPLL 346

Query: 308 SMFFCVNTSPFSGKEGKFVTSRQILDRLNKKLVHNVALRVEETEDADAFRVSGRGELHLS 367
+ + D L LR +S G++ +
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPL---------LRYYVDSATHEIILSFLGKVQME 397

Query: 368 VLIENMRRE-GFELAVSRPKVIFRE 391
V ++ + E+ + P VI+ E
Sbjct: 398 VTCALLQEKYHVEIEIKEPTVIYME 422



Score = 32.5 bits (74), Expect = 0.005
Identities = 13/75 (17%), Positives = 29/75 (38%), Gaps = 1/75 (1%)

Query: 398 EPYENVTLDVEEQHQGSVMQALGERKGDLKNMNPDGKGRVRLDYVIPSRGLIGFRSEFMT 457
EPY + + +++ + ++ + V L IP+R + +RS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKN-NEVILSGEIPARCIQEYRSDLTF 595

Query: 458 MTSGTGLLYSTFSHY 472
T+G + + Y
Sbjct: 596 FTNGRSVCLTELKGY 610


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3807     PF06580       28     0.042    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.3 bits (63), Expect = 0.042
Identities = 34/190 (17%), Positives = 72/190 (37%), Gaps = 41/190 (21%)

Query: 171 IIEQADRLRNLVDRL---LGPQLPGTRVTE-SIHKVAERV---VTLVSMELPDNVRLIRD 223
I+E + R ++ L + L + + S+ V + L S++ D ++
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 224 YDPSLPELAHDPDQIEQVLLN-IVRNALQ---ALGPEGGEIILRTRTAFQLTLHGERYRL 279
+P++ ++ Q+ +L+ +V N ++ A P+GG+I+L+
Sbjct: 246 INPAIMDV-----QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGT------KDNGTVT- 293

Query: 280 AARIDVEDNGPGIPPHLQDTLFYPMVSGREGGTGLGLSIARNLIDQHSGK---IEFTSWP 336
++VE+ G + ++ TG GL R + G I+ +
Sbjct: 294 ---LEVENTGSLALKNTKE------------STGTGLQNVRERLQMLYGTEAQIKLSEKQ 338

Query: 337 GHTEFSVYLP 346
G V +P
Sbjct: 339 GKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3808     HTHFIS        597    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 597 bits (1542), Expect = 0.0
Identities = 206/478 (43%), Positives = 300/478 (62%), Gaps = 11/478 (2%)

Query: 1 MQRGIVWVVDDDSSIRWVLERALAGAGLTCTTFENGAEVLEALASKTPDVLLSDIRMPGM 60
M + V DDD++IR VL +AL+ AG N A + +A+ D++++D+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGLALLKQIKQRHPMLPVIIMTAHSDLDAAVSAYQQGAFDYLPKPFDIDEAVALVERAIS 120
+ LL +IK+ P LPV++M+A + A+ A ++GA+DYLPKPFD+ E + ++ RA++
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 HYQEQQQPRNIQLNGPTTDIIGEAQAMQDVFRIIGRLSRSSISVLINGESGTGKELVAHA 180
+ + ++G + AMQ+++R++ RL ++ ++++I GESGTGKELVA A
Sbjct: 121 EPKRRPSKLEDDSQDGM-PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 181 LHRHSPRTKAPFIALNMAAIPKDLIESELFGHEKGAFTGANTIRQGRFEQADGGTLFLDE 240
LH + R PF+A+NMAAIP+DLIESELFGHEKGAFTGA T GRFEQA+GGTLFLDE
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 241 IGDMPLDVQTRLLRVLADGQFYRVGGYAPVKVDVRIIAATHQNLEQRVQEGKFREDLFHR 300
IGDMP+D QTRLLRVL G++ VGG P++ DVRI+AAT+++L+Q + +G FREDL++R
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 301 LNVIRVHLPPLRERREDIPRLARHFLQVAARELGVEAKLLHPETEAALTRLAWPGNVRQL 360
LNV+ + LPPLR+R EDIP L RHF+Q A +E G++ K E + WPGNVR+L
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 361 ENTCRWLTVMAAGQEVLIQDLPGELFESTVAESTSQMQPDSWATLLAQWADRALRS---- 416
EN R LT + + + + EL + S + ++Q + +R
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 417 -----GHQNLLSEAQPELERTLLTTALRHTQGHKQEAARLLGWGRNTLTRKLKELGME 469
L E+E L+ AL T+G++ +AA LLG RNTL +K++ELG+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3810     SECA          30     0.004    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 30.2 bits (68), Expect = 0.004
Identities = 11/71 (15%), Positives = 30/71 (42%)

Query: 14 AKARRKTREELDQEARDRKRQKKRRGHAPGSRAAGGNTTSGSKGQNAPKDPRIGSKTPIP 73
+K + + EE+++ + R+ + +R ++ + + + ++G P P
Sbjct: 827 SKVQVRMPEEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCP 886

Query: 74 LGVTEKVTKQH 84
G +K + H
Sbjct: 887 CGSGKKYKQCH 897


52  S3856  S3865  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S3856     0       13       -4.098227  phospholipase A
S3857     -1      14       -4.693275  hypothetical protein
S3858     1       15       -3.257425  hypothetical protein
S3859     2       16       -3.975031  hypothetical protein
S3860     0       13       0.853224   hypothetical protein
S3861     -2      15       2.348274   magnesium/nickel/cobalt transporter CorA
S3862     -2      15       2.582617   hypothetical protein
S3863     -2      16       2.988321   IS4 orf
S3864     -1      18       3.223863   hypothetical protein
S3865     -1      18       3.663095   DNA-dependent helicase II
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3856     PHPHLIPASEA1  499    0.0      Bacterial phospholipase A1 protein signature.
		>PHPHLIPASEA1#Bacterial phospholipase A1 protein signature.

Length = 289

Score = 499 bits (1286), Expect = 0.0
Identities = 289/289 (100%), Positives = 289/289 (100%)

Query: 1 MRTLQGWLLPVFMLPMAVYAQEATVKEVHDAPAVRGSIIANMLQEHDNPFTLYPYDTNYL 60
MRTLQGWLLPVFMLPMAVYAQEATVKEVHDAPAVRGSIIANMLQEHDNPFTLYPYDTNYL
Sbjct: 1 MRTLQGWLLPVFMLPMAVYAQEATVKEVHDAPAVRGSIIANMLQEHDNPFTLYPYDTNYL 60

Query: 61 IYTQTSDLNKEAIASYDWAENARKDEVKFQLSLAFPLWRGILGPNSVLGASYTQKSWWQL 120
IYTQTSDLNKEAIASYDWAENARKDEVKFQLSLAFPLWRGILGPNSVLGASYTQKSWWQL
Sbjct: 61 IYTQTSDLNKEAIASYDWAENARKDEVKFQLSLAFPLWRGILGPNSVLGASYTQKSWWQL 120

Query: 121 SNSEESSPFRETNYEPQLFLGFATDYRFAGWTLRDVEMGYNHDSNGRSDPTSRSWNRLYT 180
SNSEESSPFRETNYEPQLFLGFATDYRFAGWTLRDVEMGYNHDSNGRSDPTSRSWNRLYT
Sbjct: 121 SNSEESSPFRETNYEPQLFLGFATDYRFAGWTLRDVEMGYNHDSNGRSDPTSRSWNRLYT 180

Query: 181 RLMAENGNWLVEVKPWYVVGNTDDNPDITKYMGYYQLKIGYHLGDAVLSAKGQYNWNTGY 240
RLMAENGNWLVEVKPWYVVGNTDDNPDITKYMGYYQLKIGYHLGDAVLSAKGQYNWNTGY
Sbjct: 181 RLMAENGNWLVEVKPWYVVGNTDDNPDITKYMGYYQLKIGYHLGDAVLSAKGQYNWNTGY 240

Query: 241 GGAELGLSYPITKHVRLYTQVYSGYGESLIDYNFNQTRVGVGVMLNDLF 289
GGAELGLSYPITKHVRLYTQVYSGYGESLIDYNFNQTRVGVGVMLNDLF
Sbjct: 241 GGAELGLSYPITKHVRLYTQVYSGYGESLIDYNFNQTRVGVGVMLNDLF 289


53  S3948  S3960  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S3948     2       19       -1.332231  16S rRNA methyltransferase GidB
S3949     4       34       -0.004022  F0F1 ATP synthase subunit I
S3950     3       32       0.260839   F0F1 ATP synthase subunit A
S3951     4       40       1.362828   F0F1 ATP synthase subunit C
S3952     5       38       1.473779   F0F1 ATP synthase subunit B
S3953     3       34       1.421721   F0F1 ATP synthase subunit delta
S3954     3       35       1.636414   F0F1 ATP synthase subunit alpha
S3955     1       22       1.785196   F0F1 ATP synthase subunit gamma
S3956     2       22       1.308882   F0F1 ATP synthase subunit beta
S3957     0       14       1.251256   F0F1 ATP synthase subunit epsilon
S3958     0       13       1.104790   bifunctional N-acetylglucosamine-1-phosphate
S3959     1       17       0.318444   glucosamine--fructose-6-phosphate
S3960     3       25       -0.267465  IS4 orf
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3952     IGASERPTASE   27     0.028    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 27.3 bits (60), Expect = 0.028
Identities = 20/101 (19%), Positives = 37/101 (36%), Gaps = 18/101 (17%)

Query: 31 AAIEKRQKEIADGLASAERAHKDLDLAKASATDQLKKAKAEAQVIIEQ--ANKRRSQILD 88
+EK +++ + A K+ + T + A++ ++ Q K + +
Sbjct: 1049 KTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEK 1108

Query: 89 EAKAEAEQERTKIVA----------------QAQAEIEAER 113
E KA+ E E+T+ V Q QAE E
Sbjct: 1109 EEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREN 1149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3958     RTXTOXINA     29     0.047    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.
Length = 1024

Score = 29.2 bits (65), Expect = 0.047
Identities = 23/80 (28%), Positives = 31/80 (38%), Gaps = 10/80 (12%)

Query: 367 LGDAEIGDNVNIGAGTITCNYDGANKFKTIIGDDVFVGSDTQLVAPVTVGKGATIAAGTT 426
LGD + D V + AG+ N G DV T G AT A T
Sbjct: 616 LGDGD--DKVFLSAGSA--NIYAGK------GHDVVYYDKTDTGYLTIDGTKATEAGNYT 665

Query: 427 VTRNVGENALAISRVPQTQK 446
VTR +G + + V + Q+
Sbjct: 666 VTRVLGGDVKVLQEVVKEQE 685


54  S3973  S3982  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S3973     0       16       3.121655   sugar phosphate antiporter
S3974     1       17       3.822097   regulatory protein UhpC
S3975     1       17       4.350523   sensory histidine kinase UhpB
S3976     2       18       3.898254   DNA-binding transcriptional activator UhpA
S3977     2       15       2.992422   acetolactate synthase 1 regulatory subunit
S3978     2       15       3.198085   acetolactate synthase catalytic subunit
S3979     1       16       1.649356   ilvB operon leader peptide
S3980     1       15       1.205090   multidrug resistance protein D
S3981     1       15       -2.012054  transcriptional regulator
S3982     2       15       -2.390540  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3973     TCRTETB       34     0.001    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.1 bits (78), Expect = 0.001
Identities = 28/168 (16%), Positives = 61/168 (36%), Gaps = 17/168 (10%)

Query: 49 FNIAQNDMISTYGLSMTQLGMIGLGFSITYGVGKTLVSYYADGKNTKQFLPFMLILSAIC 108
N++ D+ + + + F +T+ +G + +D K+ L F +I++ C
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIIN--C 90

Query: 109 MLGFSASMGSGSVSLFLMIAFYALSGFFQSTGGSCSYSTI----TKWTPRRKRGTFLGFW 164
+G SL +M + F Q G + + + ++ P+ RG G
Sbjct: 91 FGSVIGFVGHSFFSLLIM------ARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLI 144

Query: 165 NISHNLGGAGAAGVALFGANYLFDGHVIGMFIFPSIIALIVGFIGLRY 212
+G + A+Y+ + + + P + I+ L
Sbjct: 145 GSIVAMGEGVGPAIGGMIAHYIHWSY---LLLIP--MITIITVPFLMK 187


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3974     TCRTETB       41     8e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 40.6 bits (95), Expect = 8e-06
Identities = 65/408 (15%), Positives = 137/408 (33%), Gaps = 60/408 (14%)

Query: 29 RHILLTIWLGYALFY--FTRKSFNAAVPEILANGVLSRSDIGLLATLFYITYGVSKFVSG 86
RH + IWL F+ N ++P+I + + + T F +T+ + V G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 87 IVSDRSNARYFMGIGLIATGIMNILFGFSTSLWAFAVLWVLNAFFQGWGS---PVCARLL 143
+SD+ + + G+I +++ S F L ++ F QG G+ P ++
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHS---FFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 144 TAWY-SRTERGGWWALWNTAHNVGGALIPIVMAASALHYGWRAGMMIAGCMAIVVGIFLC 202
A Y + RG + L + +G + P + A + W ++ M ++ +
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIPMITIITVPF- 184

Query: 203 WRLRDRPQALGLPAVGEWRHDALEIAQQQEGAGLTRKEILTKYVLLNPYIWLLSFCYVLV 262
+ L +I G L I+ + Y VL
Sbjct: 185 --------LMKLLKKEVRIKGHFDIK----GIILMSVGIVFFMLFTTSYSISFLIVSVLS 232

Query: 263 YVV-----RAAINDWGNLYMSETLGVDLVTANTAVTMFELGGFI-----------GALVA 306
+++ R + + + + + + + + + GF+ A
Sbjct: 233 FLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTA 292

Query: 307 GWGSDKLFNGNRGPMNLIFAAGILL-SVGSLWLMPFASYVMQATCFFTIGFFVFGPQMLI 365
GS +F G + + GIL+ G L+++ + + F T F + +
Sbjct: 293 EIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL-SVSFLTASFLLETTSWFM 351

Query: 366 ---------GMAAAECS---------HKEAAGAATGFVGLFAYLGASL 395
G++ + ++ AGA + ++L
Sbjct: 352 TIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGT 399


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3975     PF06580       40     2e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.8 bits (93), Expect = 2e-05
Identities = 28/142 (19%), Positives = 56/142 (39%), Gaps = 11/142 (7%)

Query: 365 LRPRQLDDLTLEQAIRSLMREMELEGRGIVSHLEWRIDESALSENQRVTLFRVCQEGLNN 424
LR ++L + + ++L L++ + + +V + Q + N
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPM-LVQTLVEN 266

Query: 425 IVKHA-----DASAVTLQGWQQDERLMLVIEDDGSGLPPGSGQ-QGFGLTGMRERVTALG 478
+KH + L+G + + + L +E+ GS + + G GL +RER+ L
Sbjct: 267 GIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLY 326

Query: 479 G---TLHISCLHG-TRVSVSLP 496
G + +S G V +P
Sbjct: 327 GTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3976     HTHFIS        61     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.4 bits (149), Expect = 2e-13
Identities = 29/174 (16%), Positives = 59/174 (33%), Gaps = 20/174 (11%)

Query: 2 ITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGRGVQVCICDISMPDIS 61
T+ + DD +R+ Q L V + + + + D+ MPD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLELLSQLPK---GMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHTVATG 118
+LL ++ K + +++S ++ +A GA +L K ELI +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR---- 117

Query: 119 GCYLTPDIAIKLASGRQDPLTKRERQVAEKLAQG---MAVKEIAAELGLSPKTV 169
A+ R L + + + + + A L + T+
Sbjct: 118 --------ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTL 163


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3980     TCRTETB       57     4e-11    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 57.2 bits (138), Expect = 4e-11
Identities = 39/176 (22%), Positives = 76/176 (43%), Gaps = 1/176 (0%)

Query: 2 LVLLVAVGQMAQTIYIPAIADMARDLNVREGAVQSVMGAYLLTYGVSQLFYGPISDRVGR 61
L +L + + + ++ D+A D N + V A++LT+ + YG +SD++G
Sbjct: 19 LCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGI 78

Query: 62 RPVILVGMSIFMLATLVA-VTTSSLTVLIAASAMQGMGTGVGGVMARTLPRDLYERTQLR 120
+ ++L G+ I +++ V S ++LI A +QG G + + +
Sbjct: 79 KRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRG 138

Query: 121 HANSLLNMGILVSPLLAPLIGGLLDTMWNWRACYLFLLVLCAGVTFSMARWMPETR 176
A L+ + + + P IGG++ +W L ++ V F M E R
Sbjct: 139 KAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVR 194


55  S4045  S4831  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S4045     2       14       -1.309737  phosphate ABC transporter substrate-binding
S4046     3       17       -2.166835  IS1 orfA
S4048     3       15       -1.397160  long polar fimbriae
S4049     3       18       0.594933   insertion sequence 2 OrfA protein
S4050     3       18       0.754718   insertion element IS2 transposase InsD
S4052     4       18       1.042224   ferric siderophore receptor
S4053     4       19       -0.263503  lysine:N6-hydroxylase
S4054     4       19       0.816789   siderophore biosynthesis protein
S4055     4       20       0.306523   siderophore biosynthesis protein
S4056     4       21       -1.364408  siderophore biosynthesis protein
S4057     5       29       -2.468853  membrane transport protein
S4829     2       29       -3.571314  hypothetical protein
S4058     1       26       -0.305010  IS1 orfA
S4059     2       26       0.168095   IS1 orfB
S4830     1       29       -2.892162  colV-immunity protein
S4060     0       26       -1.397462  insertion sequence 2 OrfA protein
S4061     0       25       -0.741793  insertion element IS2 transposase InsD
S4062     -1      28       -2.609233  IS629 orfA
S4063     1       26       -4.025097  IS629 orfB
S4831     2       29       -6.080541  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4048     PF00577       756    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 756 bits (1954), Expect = 0.0
Identities = 328/870 (37%), Positives = 484/870 (55%), Gaps = 54/870 (6%)

Query: 6 IVVGLTAGTCLIFSQNLMAEVSVFNPALLEINHQSGVDIRQFNRANLMPPGVYSVDIFIN 65
V L L + FNP L + Q+ D+ +F +PPG Y VDI++N
Sbjct: 26 FFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLN 85

Query: 66 GKMFERQDVTFVQDNPDADLHACFIAIKKTLSSFGIKVDALKSFNDVDETVCLDPAPRIE 125
+DVTF + + + C L+S G+ ++ N + + C+ I
Sbjct: 86 NGYMATRDVTFNTGDSEQGIVPCLTR--AQLASMGLNTASVSGMNLLADDACVPLTSMIH 143

Query: 126 GSSWQFDSDKLQLNISIHQIYMDAMAYDYISPTRWDEGINALTINYDFSGSHTLRSDYGS 185
++ Q D + +LN++I Q +M A YI P WD GINA +NY+FSG+ +
Sbjct: 144 DATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSV--QNRIG 201

Query: 186 QETDTSYLNLRNGLNIGPWRLRNYSTLN------TSDGRAEYNSISTWIQRDIAALRSQI 239
+ +YLNL++GLNIG WRLR+ +T + +S + ++ I+TW++RDI LRS++
Sbjct: 202 GNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRL 261

Query: 240 MIGDTWTASDIFDSTQIRGARLYTDNDMLPASQNGFAPVVRGIAKSNATVIIRQNGYVIY 299
+GD +T DIFD RGA+L +D++MLP SQ GFAPV+ GIA+ A V I+QNGY IY
Sbjct: 262 TLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIY 321

Query: 300 QSAVPQGAFEITDLNTASTGGDLDVTIKEEDGSEQRFTQPYASLAILKREGLTDVDVSVG 359
S VP G F I D+ A GDL VTIKE DGS Q FT PY+S+ +L+REG T ++ G
Sbjct: 322 NSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAG 381

Query: 360 ELRDEDG--FTPDVLQAQILHGFSHGITLYGGMQAAENYGSAALGVGKDLGALGAISFDV 417
E R + P Q+ +LHG G T+YGG Q A+ Y + G+GK++GALGA+S D+
Sbjct: 382 EYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDM 441

Query: 418 THARANFSHDDTETGQSYRFLYSKLFDDTDTSLRLVGYRYSTEGYYTLNEWASRRNS--- 474
T A + D GQS RFLY+K +++ T+++LVGYRYST GY+ + R +
Sbjct: 442 TQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYN 501

Query: 475 --------------PEDFWETGNRRSRVEGTLTQSLGRDYGNLYLTLSRQQYWHTDDVER 520
+ + N+R +++ T+TQ LGR LYL+ S Q YW T +V+
Sbjct: 502 IETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQTYWGTSNVDE 560

Query: 521 LMQFGYSSSWKRLSWNVSWSYSNTARQGTGNNHASDNTSEQIYMLSLSVPLSGW------ 574
Q G +++++ ++W +S+S + A +Q+ L++++P S W
Sbjct: 561 QFQAGLNTAFEDINWTLSYSLTKN---------AWQKGRDQMLALNVNIPFSHWLRSDSK 611

Query: 575 --WGNSYATYSVSQNDNSGSSHQLGLSGTALERNNLSWNLMQSYNSHDDEVGGN---MSL 629
W ++ A+YS+S + N ++ G+ GT LE NNLS+++ Y D G+ +L
Sbjct: 612 SQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATL 671

Query: 630 TYDGSYGTVNGSYNYSQNSQRLNYGIRGGILAHSEGVTLSQELGETIALVKAPGAAGLEI 689
Y G YG N Y++S + ++L YG+ GG+LAH+ GVTL Q L +T+ LVKAPGA ++
Sbjct: 672 NYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKV 731

Query: 690 DNMRGAATDWRGYTVKTQLNPYDENRVAISDNYFSKSNIELDNTVVTMVPTRGAVVKAEF 749
+N G TDWRGY V Y ENRVA+ N + N++LDN V +VPTRGA+V+AEF
Sbjct: 732 ENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLA-DNVDLDNAVANVVPTRGAIVRAEF 790

Query: 750 VTHVGYRVLFRVLNANGKPVPFGAIAAIQDASLADSGIVGDRGELYLSGLPEKGQVTLSW 809
VG ++L L N KP+PFG A + S SGIV D G++YLSG+P G+V + W
Sbjct: 791 KARVGIKLLMT-LTHNNKPLPFG--AMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKW 847

Query: 810 GENASTKCIFNYSFSTPESESGLIEQGVTC 839
GE + C+ NY + L + C
Sbjct: 848 GEEENAHCVANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4054     PF04183       816    0.0      IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 816 bits (2109), Expect = 0.0
Identities = 565/580 (97%), Positives = 571/580 (98%)

Query: 1 MNHKDWDFVNRRLVAKMLSEMEYEQVFHAESQGDDHYCINLPGAQWRFIAERGIWGWLWI 60
MNHKDWD VNRRLVAKMLSE+EYEQVFHAESQGDD YCINLPGAQWRFIAERGIWGWLWI
Sbjct: 1 MNHKDWDLVNRRLVAKMLSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIWGWLWI 60

Query: 61 DAQTLRCTDEPVLAQTLLMQLKPVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASD 120
DAQTLRC DEPVLAQTLLMQLK VLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASD
Sbjct: 61 DAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASD 120

Query: 121 LINLDADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYTNTFRLHWLAVKREHMIWRC 180
LINL+ADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEY NTFRLHWLAVKREHMIWRC
Sbjct: 121 LINLNADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRC 180

Query: 181 DNDLDIQQLLTAAMDPQEFTRFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG 240
DN++DI QLLTAAMDPQEF RFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG
Sbjct: 181 DNEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG 240

Query: 241 RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR 300
RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR
Sbjct: 241 RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR 300

Query: 301 WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK 360
WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK
Sbjct: 301 WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK 360

Query: 361 PDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI 420
PDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI
Sbjct: 361 PDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI 420

Query: 421 AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEAFPEMDSLPQEVRDVTSRLSADYLIHDL 480
AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKE FPEMDSLPQEVRDVTSRLSADYLIHDL
Sbjct: 421 AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVTSRLSADYLIHDL 480

Query: 481 QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMNKHPQMAERFALFSLFRPQIIR 540
QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYM KHPQM+ERFALFSLFRPQIIR
Sbjct: 481 QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMKKHPQMSERFALFSLFRPQIIR 540

Query: 541 VVLNPVKLTWPDLDGGSRMLPNYLENLQNPLWLVTQEYES 580
VVLNPVKLTWPDLDGGSRMLPNYLE+LQNPLWLVTQEYES
Sbjct: 541 VVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWLVTQEYES 580


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4056     PF04183       339    e-111    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 339 bits (872), Expect = e-111
Identities = 104/480 (21%), Positives = 178/480 (37%), Gaps = 46/480 (9%)

Query: 56 ELLIPLDEQKSLHFRVAYFSPTQHHRF-----AFPARLVTASGSYPVDFTTLSRLIIDKL 110
E + + Q + + P RF + + A D L++ ++ +L
Sbjct: 24 EQVFHAESQGDDRYCIN--LPGAQWRFIAERGIWGWLWIDAQTLRCADEPVLAQTLLMQL 81

Query: 111 RHQLFLPVPLCETFHQRVLESHVHTQQAIDARHDWAALREKALNFGEAEQALLTGHAFHP 170
+ L + Q + + + Q + AR +A LN + Q LL+GH
Sbjct: 82 KQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASDLINLNA-DRLQCLLSGHPKFV 140

Query: 171 APKSHEPFNRREAERYLPDMAPHFPLRWFSVDKTQIAGES-LHLNLQQRLTRFAAENAPQ 229
K + + ERY P+ A F L W +V + + +++ Q LT A PQ
Sbjct: 141 FNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEMDIHQLLT---AAMDPQ 197

Query: 230 LLNELS--------DNQWLF-PLHPWQGEYLLQQGWCQALVAKGLIKDLGEAGTSWLPTT 280
S D+ WL P+HPWQ + + + A+G + LGE G WL
Sbjct: 198 EFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADF-AEGRMVSLGEFGDQWLAQQ 256

Query: 281 SSRSLYCATSRD--MIKFSLSVRLTNSIRTLSVKEVKRGMRLARLAQ----TDGWQMLQ- 333
S R+L A+ R IK L++ T+ R + + + G +R Q TD +
Sbjct: 257 SLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWLQQVFATDATLVQSG 316

Query: 334 ---VRFPTFRVMQEDGWAGLLDLNGNIMQESLFALRENLLVDQPKSQTNVLVSLTQAAPD 390
+ P + +G+A L + REN ++ VL++ +
Sbjct: 317 AVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLKPDESPVLMATLMECDE 376

Query: 391 GGDSLLVSAVKRLSDRLGITVQQAAHAWVDAYCQQVLKPLFTAEADYGLVLLAHQQNILV 450
L + + DR G+ A W+ + V+ PL+ YG+ L+AH QNI +
Sbjct: 377 NNQPLAGAYI----DRSGLD----AETWLTQLFRVVVVPLYHLLCRYGVALIAHGQNITL 428

Query: 451 QMLGDLPVGFIYRDCQGSAFMPHATDWLDSIGEAQAENIFTHEQLLRYFPYYLLVNSTFA 510
M +P + +D QG M + + E + L++
Sbjct: 429 AMKEGVPQRVLLKDFQGD--MRLVKEEFPEMDSLPQE----VRDVTSRLSADYLIHDLQT 482


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4057     TCRTETA       48     5e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 47.5 bits (113), Expect = 5e-08
Identities = 81/375 (21%), Positives = 135/375 (36%), Gaps = 41/375 (10%)

Query: 20 FSAGLLGIGQNGLLVVLPVLVIQTNLSLSV---WAALLMLGSMLFLPSSPWWGKQISRTG 76
+ L +G ++ VLP L+ S V + LL L +++ +P G R G
Sbjct: 12 STVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFG 71

Query: 77 SKPVVLWALGGYGISFTLLGLGSVLMATSAITTAVGLGILIIARIAYGLTVSAMVPACQV 136
+PV+L +L G + + ++ L +L I RI G+T + A
Sbjct: 72 RRPVLLVSLAGAAVDYAIMATAPFLW------------VLYIGRIVAGITGATGAVAGAY 119

Query: 137 WALQRAGEGNRMAALATISSGLSCGRLFGPLCAAAMLAIHPLAPLGLLMAAPVLALLMLL 196
A R +S+ G + GP+ M P AP A L L
Sbjct: 120 IA-DITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGC 178

Query: 197 RL------PGTPPQPTPECKSVSLKRDCLPYLLCAILLAAAVSMMQLGLSPAL------T 244
L P ++ R + A L+A M +G PA
Sbjct: 179 FLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGE 238

Query: 245 RQFATDTTAISQQVAWLLGLSAVAALIAQFGVLRPQRLTPVALLLSAGVLMSGGLAIMLS 304
+F D T I +A L ++A + G + + AL+L +G + + +
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMI-TGPVAARLGERRALMLGMIADGTGYILLAFA 297

Query: 305 EQLWLFYPGCAVLSFGAALATPAYQLLLNDKLADGAGAGWLATSHTLGYGLCALLVPLVS 364
+ W+ +P +L+ G + PA Q +L+ + D G L L L S
Sbjct: 298 TRGWMAFPIMVLLASG-GIGMPALQAMLS-RQVDEERQGQLQ----------GSLAALTS 345

Query: 365 KTGVAIALIMAALFA 379
T + L+ A++A
Sbjct: 346 LTSIVGPLLFTAIYA 360


56  S4097  S4105  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
S4097     0       25       -7.224366   IS1 orfA
S4098     1       30       -9.610669   IS1 orfB
S4099     3       36       -12.604397  lipopolysaccharide core biosynthesis protein
S4100     2       42       -14.452306  LPS alpha1,3-glucosyltransferase
S4101     4       48       -16.977673  lipopolysaccharide core biosynthesis protein
S4102     4       48       -16.265793  UDP-D-galactose:(glucosyl)lipopolysaccharide-
S4103     3       39       -12.370648  lipopolysaccharide core biosynthesis protein
S4104     2       29       -9.686308   UDP-glucose:(galactosyl) LPS
S4105     0       19       -5.678661   lipopolysaccharide 1,2-N-
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4102     RTXTOXINA     32     0.003    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 32.2 bits (73), Expect = 0.003
Identities = 25/117 (21%), Positives = 45/117 (38%), Gaps = 10/117 (8%)

Query: 60 HVFTDYISDKDKLYFSDL-------AKQYNSRINIYVINCDKLKSLPSTKNWTYATYFRF 112
H+ D +DKL +D+ ++ N I + S+ T+ +F
Sbjct: 860 HIIDDDGGKEDKLSLADIDFRDVAFKREGNDLIMYKGEG--NVLSIGHKNGITFRNWFEK 917

Query: 113 IIADYFYHKHEKILYLDADIACKGSIKELLDYQFSTNEIAAVVAERDIEWWQNRASV 169
D H+ E+I I S+K+ L+YQ N A+ V D + ++ +
Sbjct: 918 ESGDISNHEIEQIFDKSGRIITPDSLKKALEYQ-QRNNKASYVYGNDALAYGSQGDL 973


57  S4155  S4177  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
S4155     -1      19       -3.167266   hypothetical protein
S4157     0       21       -3.489739   xylose transporter membrane component
S4158     0       19       -4.831657   xylose transporter ATP-binding subunit
S4159     1       16       -3.225318   D-xylose transporter subunit XylF
S4161     -1      14       -0.080232   xylulokinase
S4162     -2      16       -0.597502   hypothetical protein
S4163     -1      18       0.154349    hypothetical protein
S4164     -2      18       0.962217    hypothetical protein
S4165     0       17       0.627577    glycyl-tRNA synthetase subunit alpha
S4166     0       14       -1.989500   glycyl-tRNA synthetase subunit beta
S4167     -2      16       -3.527341   IS1 orfB
S4168     0       14       -3.978996   IS1 orfA
S4169     -1      14       -3.745647   hypothetical protein
S4171     -1      14       -3.644768   ARAC-type regulatory protein
S4172     -1      15       -3.658824   DNA-binding transcriptional regulator GadX
S4173     -1      12       -2.363589   glutamate decarboxylase
S4174     -1      19       -4.394513   cytochrome C peroxidase
S4175     3       21       -4.175741   IS1 orfB
S4177     0       21       -3.755784   IS1 orfA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4155     FLGFLGJ       39     1e-05    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 38.5 bits (89), Expect = 1e-05
Identities = 31/105 (29%), Positives = 46/105 (43%), Gaps = 17/105 (16%)

Query: 134 TRKIPWNTLLERVDIIPTSMVATMAAAESGWGTSKLARNN----NNLFGMKC---MKGRC 186
+ L + +P ++ AA ESGWG ++ R N NLFG+K KG
Sbjct: 154 AQLSLPAQLASQQSGVPHHLILAQAALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPV 213

Query: 187 T---------NAPGKVKG-YSQFSSVKESVSAYVTNLNTHPAYSS 221
T KVK + +SS E++S YV L +P Y++
Sbjct: 214 TEITTTEYENGEAKKVKAKFRVYSSYLEALSDYVGLLTRNPRYAA 258


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4163     FLGBIOSNFLIP  27     0.017    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 27.5 bits (61), Expect = 0.017
Identities = 19/66 (28%), Positives = 26/66 (39%), Gaps = 1/66 (1%)

Query: 77 MTCLTVFIISVALLMVGLWNATLLLSEKGFYGLAFFLSLFGAVAVQKNIRDAGINPPKET 136
MT T II LL L + + GLA FL+ F V I P E
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAP-PNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEE 119

Query: 137 QVTQEE 142
+++ +E
Sbjct: 120 KISMQE 125


58  S4217  S4235  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias     Product
S4217     -2      12       -3.429213   trehalase
S4218     0       18       -7.951204   IS1 orfA
S4219     1       20       -9.798327   IS1 orfB
S4221     2       24       -11.131590  hypothetical protein
S4222     3       26       -8.468483   acid-resistance membrane protein
S4223     2       25       -7.605014   acid-resistance protein
S4224     2       24       -6.687444   acid-resistance protein
S4226     2       24       -5.589763   hypothetical protein
S4227     2       18       -2.011447   outer membrane protein induced after carbon
S4229     1       17       -0.922704   hypothetical protein
S4230     0       20       0.883069    IS1 orfB
S4231     0       20       0.915151    IS1 orfA
S4232     0       19       1.315994    DNA-binding transcriptional repressor ArsR
S4233     -1      18       1.900527    arsenical pump membrane protein
S4234     -1      19       2.222405    arsenate reductase
S4235     -1      19       3.001973    IS1 orfB
59  S4246  S4288  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
S4246     -1      24       -4.921691   IS1 orfA
S4247     -1      24       -3.595082   IS1 orfB
S4248     0       26       -3.734442   IS600 orfA
S4249     1       27       -3.807646   hypothetical protein
S4250     2       31       -6.295952   chaperone
S4254     2       34       -7.609032   fimbrial protein remnant
S4252     3       36       -8.482449   insertion element IS2 transposase InsD
S4253     3       43       -13.004936  insertion sequence 2 OrfA protein
S4255     1       36       -9.902966   hypothetical protein
S4836     -1      21       -4.943781   hypothetical protein
S4837     0       18       -2.908631   hypothetical protein
S4838     0       16       -1.782055   hypothetical protein
S4259     -1      14       -0.449116   hypothetical protein
S4260     0       19       3.604318    hypothetical protein
S4261     0       19       3.412647    ATP-binding component of a transport system
S4262     1       20       3.668811    transporter
S4263     1       21       4.485417    hypothetical protein
S4264     0       22       4.412518    nickel responsive regulator
S4265     0       22       4.479200    nickel transporter ATP-binding protein NikE
S4266     0       20       3.614831    nickel transporter ATP-binding protein NikD
S4267     1       18       3.391008    nickel transporter permease NikC
S4268     1       16       2.337194    nickel transporter permease NikB
S4269     1       16       1.279628    periplasmic binding protein for nickel
S4270     2       19       1.347014    holo-(acyl carrier protein) synthase 2
S4271     2       16       2.663343    hypothetical protein
S4272     -1      15       3.536146    major facilitator superfamily transporter
S4273     0       14       2.920305    hypothetical protein
S4274     0       14       2.755092    hypothetical protein
S4275     0       16       3.288422    sulfur transfer protein SirA
S4276     2       12       2.724080    zinc/cadmium/mercury/lead-transporting ATPase
S4277     3       11       1.170228    enzyme
S4278     2       13       0.797026    receptor
S4279     1       15       1.132605    hypothetical protein
S4280     1       17       1.606944    16S rRNA m(2)G966-methyltransferase
S4281     1       18       1.760476    cell division protein FtsY
S4282     0       20       2.082676    cell division protein FtsE
S4283     -2      21       2.190345    cell division protein FtsX
S4284     -2      22       2.747773    RNA polymerase factor sigma-32
S4285     -1      20       2.744499    high-affinity amino acid transport protein,
S4286     -1      20       3.355566    hypothetical protein
S4287     -1      22       2.873829    high-affinity leucine-specific transport
S4288     -2      24       3.065487    branched-chain amino acid ABC transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4254     PF00577       321    e-105    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 321 bits (824), Expect = e-105
Identities = 119/360 (33%), Positives = 181/360 (50%), Gaps = 13/360 (3%)

Query: 3 YRLSLILIMALLA-GQLSAQEWSFDSSQLEGNVSADT-VAMFNQGEQ-LPGNYRVEIYLN 59
+ + L + A A LS+ E F+ L + A ++ F G++ PG YRV+IYLN
Sbjct: 26 FFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLN 85

Query: 60 GEKVDVGEFPFHRPESPEEKELVPCLTVDDLIHYGIKIDKSSSDTDNKKNQCFKWNS-IE 118
+ + F+ E+ +VPCLT L G+ S + C S I
Sbjct: 86 NGYMATRDVTFN--TGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIH 143

Query: 119 GLKVNYDFDSQRVQITVPQLYLQDKKSSLAPVSLWNEGVAAFRMVYQTNIDISKQNDNQS 178
D QR+ +T+PQ ++ ++ P LW+ G+ A + Y + + +
Sbjct: 144 DATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSG--NSVQNRIG 201

Query: 179 TTRNSRYGRFTPGFNLGAWRFRSSVTWSKELGQSE-----RWQRGYMWFERGINAIKSRL 233
+ Y G N+GAWR R + TWS S +WQ W ER I ++SRL
Sbjct: 202 GNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRL 261

Query: 234 TLGESYTSSEVFDSIPFRGGMLATDDAMTPPEDSYYTPVVHGIAQSEAQVIIKQNGQIIF 293
TLG+ YT ++FD I FRG LA+DD M P + PV+HGIA+ AQV IKQNG I+
Sbjct: 262 TLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIY 321

Query: 294 TRSVPPGPFALDNLPTLAVGGELDVTVRESNGEEQYFSVPFQTPAIALHEGYFKYSVMGG 353
+VPPGPF ++++ G+L VT++E++G Q F+VP+ + + EG+ +YS+ G
Sbjct: 322 NSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAG 381


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4260     RTXTOXIND     84     4e-20    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 84.5 bits (209), Expect = 4e-20
Identities = 71/408 (17%), Positives = 139/408 (34%), Gaps = 81/408 (19%)

Query: 6 RHLAWWVVGALAVAAVVAWWLLRPAGVP-EGFAVSNGRIEATEVDIASKIAGRIDTILVK 64
R +A++++G L +A +++ G +GR + I + I+VK
Sbjct: 58 RLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKE----IKPIENSIVKEIIVK 113

Query: 65 EGQFVREGEVLAKMDTRV----------------LQEQRLEAI----------------- 91
EG+ VR+G+VL K+ L++ R + +
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 92 -------------------AQIKEAQSAVAAAQALLEQRQSETRAAQSLVNQRQAELDSV 132
Q Q+ + L+++++E + +N+ +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 133 AKRHTRSRSLAQRGAISAQQLDDDRAAAESARAALESAKAQVSASKAAIEAARTNIIQ-- 190
R SL + AI+ + + A L K+Q+ ++ I +A+
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 191 -----------AQTRVEAAQATERRIAADID--DSELKAPRDGRV-QYRVAEPGEVLAAG 236
QT T + S ++AP +V Q +V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 237 GRVLNMVDLSDVY-MTFFLPTEQAGTLKLGGEARLILDAAPDLRIPATISFVASVAQFTP 295
++ +V D +T + + G + +G A + ++A P R V V
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYG---YLVGKVKNINL 410

Query: 296 KTVETSDERLKLMFRVKARIPPELLQQHLEYV--KTGLPGVAWVRVNE 341
+E D+RL L+F V I L + + +G+ A ++
Sbjct: 411 DAIE--DQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4261     PF05272       30     0.045    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.045
Identities = 9/26 (34%), Positives = 14/26 (53%)

Query: 37 ARCMVGLIGPDGVGKSSLLSLISGAR 62
V L G G+GKS+L++ + G
Sbjct: 595 FDYSVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4262     ABC2TRNSPORT  50     5e-09    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 49.9 bits (119), Expect = 5e-09
Identities = 41/171 (23%), Positives = 73/171 (42%), Gaps = 7/171 (4%)

Query: 200 REREHGTVEHLLVMPITPFEIMMAKI-WSMGLVVLVVSGLSLVLMVKGVLGVPIEGSIPL 258
R T E +L + +I++ ++ W+ L +G+ +V G + L
Sbjct: 93 RMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGY----TQWLSLL 148

Query: 259 FMLGV-ALSLFATTSIGIFMGTIARSMPQLGLLVILVLLPLQMLSGGSTPRESMPQMVQD 317
+ L V AL+ A S+G+ + +A S LV+ P+ LSG P + +P + Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 318 IMLTMPTTHFVSLAQAILYRGAGFEIVWPQFLTLMAIGGAFF-TIALLRFR 367
+P +H + L + I+ ++ + I FF + ALLR R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4265     HTHFIS        29     0.018    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.4 bits (66), Expect = 0.018
Identities = 10/34 (29%), Positives = 19/34 (55%)

Query: 25 QAVLNNVSLTLKSGETVALLGRSGCGKSTLARLL 58
Q + ++ +++ T+ + G SG GK +AR L
Sbjct: 147 QEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4272     TCRTETA       54     5e-10    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 53.7 bits (129), Expect = 5e-10
Identities = 80/398 (20%), Positives = 147/398 (36%), Gaps = 32/398 (8%)

Query: 13 LRLNLRIVSIVMFNFASYLTIGLPLAVLPGYVHDVM--GFSAFWAGLVISLQYFATLLSR 70
++ N ++ I+ + IGL + VLPG + D++ G++++L
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 71 PHAGRYADLLGPKKIVVFGLCGCFLSGLGYLTAGLTASLPVISLLLLCLGRVILGI-GQS 129
P G +D G + +++ L G + + Y L V L +GR++ GI G +
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAG---AAVDYAIMATAPFLWV-----LYIGRIVAGITGAT 112

Query: 130 FAGTGSTLWGVGVVGSL--HIGRVISWNGIVTYGAMAMGAPLGVVFYHWGGLQALALIIM 187
A G+ + + H G + + G +G +G H A AL +
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGL 172

Query: 188 GVALVAILLAIPRPTVK--ASKGKPLPFRAVLGRVWLYGMALALA-----SAGFGVIATF 240
LL + + P + + +A +A V A
Sbjct: 173 NFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAAL 232

Query: 241 ITLFYDAK-GWDGAAFALTLFSCAFVGT---RLLFPNGINRIGGLNVAMICFSVEIIGLL 296
+F + + WD ++L + + + ++ R+G M+ + G +
Sbjct: 233 WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYI 292

Query: 297 LVGVATMPWMAKIG-VLLAGAGFSLVFPALGVVAVKAVPQQNQGAALATYTVFMDLSLGV 355
L+ AT WMA VLLA G + PAL + + V ++ QG + L+ +
Sbjct: 293 LLAFATRGWMAFPIMVLLASGGIGM--PALQAMLSRQVDEERQGQLQGSLAALTSLT-SI 349

Query: 356 TGPLAGLVMSWAGVPV----IYLAAAGLVAIALLLTWR 389
GPL + A + ++A A L + L R
Sbjct: 350 VGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4275     PF01206       105    3e-34    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 105 bits (265), Expect = 3e-34
Identities = 24/72 (33%), Positives = 41/72 (56%)

Query: 9 DHTLDALGLRCPEPVMMVRKTVRNMQPGETLLIIADDPATTRDIPGFCTFMEHELVAKET 68
D +LDA GL CP P++ +KT+ M GE L ++A DP + +D F HEL+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 69 DGLPYRYLIRKG 80
+ Y + +++
Sbjct: 65 EDGTYHFRLKRA 76


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4278     SHIGARICIN    26     0.039    Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 25.9 bits (57), Expect = 0.039
Identities = 6/21 (28%), Positives = 13/21 (61%)

Query: 7 FFIVIIGLIVVAASFRFMQQR 27
+V+I AA ++F++Q+
Sbjct: 173 ALMVLIQSTSEAARYKFIEQQ 193


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4281     IGASERPTASE   51     9e-09    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 50.8 bits (121), Expect = 9e-09
Identities = 36/181 (19%), Positives = 60/181 (33%), Gaps = 13/181 (7%)

Query: 19 EQTPEKETEVQNEQPVVEEI---VQAQEPAKASEQAVEEQPQAHTEAEAETFAADVVEVT 75
TP + TE E E Q+ + + Q E +A + +A T EV
Sbjct: 1030 PATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTN---EVA 1086

Query: 76 EQVVESEKAQPEAEVVAQPEPVVEETPEPVAIEREELPLPEDVNAEEVSPEEWQAEAETV 135
+ E+++ Q + V +E V E+ + +VSP++ Q+E
Sbjct: 1087 QSGSETKETQTTE--TKETATVEKEEKAKVETEKTQ---EVPKVTSQVSPKQEQSETVQP 1141

Query: 136 EIVEAAEEEAAK--EEITDEELEAQALAAEAAEEAVMVVPPAEEEQPVAEIAQEQEKPTK 193
+ A E + +E + A E + V P E V E P
Sbjct: 1142 QAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPEN 1201

Query: 194 E 194

Sbjct: 1202 T 1202



Score = 47.4 bits (112), Expect = 1e-07
Identities = 40/192 (20%), Positives = 69/192 (35%), Gaps = 26/192 (13%)

Query: 20 QTPEK-ETEVQNEQPVVEEIVQAQE----------PAKASEQAVEEQPQAHTEAE----- 63
TP + +V + EEI + E P++ +E E Q E
Sbjct: 998 TTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQD 1057

Query: 64 AETFAADVVEVTEQVVESEKAQPEAEVVAQPEPVVEETPEPVAIEREELPLPEDVNAEEV 123
A A EV ++ + KA + VAQ +ET + E V EE
Sbjct: 1058 ATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKET------QTTETKETATVEKEEK 1111

Query: 124 SPEEWQAEAETVEIVEAAEEEAAKEEITDEELEAQALAAEAAEEAVMVVPPAEEEQPVAE 183
+ E +T E+ + + + K+E E ++ QA A + V + P + A+
Sbjct: 1112 AKVE---TEKTQEVPKVTSQVSPKQE-QSETVQPQAEPARENDPTVNIKEPQSQTNTTAD 1167

Query: 184 IAQEQEKPTKEG 195
Q ++ +
Sbjct: 1168 TEQPAKETSSNV 1179



Score = 46.2 bits (109), Expect = 3e-07
Identities = 36/177 (20%), Positives = 60/177 (33%), Gaps = 18/177 (10%)

Query: 22 PEKETE---VQNEQPVVEEIVQAQEPAKASEQAVEEQPQAHTEAEAETFAADVVEVTEQV 78
PE E V +QA P+ S + A E TE V
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVD--EAPVPPPAPATPSETTETV 1040

Query: 79 VESEKAQPEAEVVAQPEPVVEETPEPVAIEREELPLPEDVNAEEVSPEEWQAEAETVEIV 138
E+ K + + + + + + + + EV+ Q+ +ET E
Sbjct: 1041 AENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVA----QSGSETKETQ 1096

Query: 139 EAAEEEAAKEEITDEELEAQALAAEAAEEAVMV--VPP----AEEEQPVAEIAQEQE 189
+E A E +E +A+ + E + V P +E QP AE A+E +
Sbjct: 1097 TTETKETATVE---KEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND 1150



Score = 45.1 bits (106), Expect = 6e-07
Identities = 26/159 (16%), Positives = 48/159 (30%), Gaps = 7/159 (4%)

Query: 17 QKEQTPEKETEVQNEQPVVEEIVQAQEPAKASE------QAVEEQPQAHTEAEAETFAAD 70
Q +T E T + E+ VE + P S+ Q+ QPQA E +
Sbjct: 1096 QTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNI 1155

Query: 71 VVEVTEQVVESEKAQPEAEVVAQPEPVVEETPEPVAIEREELPLPEDVNAEEVSPEEWQA 130
++ ++ QP E + E V E+ V + PE+ P
Sbjct: 1156 KEPQSQTNTTADTEQPAKETSSNVEQPVTES-TTVNTGNSVVENPENTTPATTQPTVNSE 1214

Query: 131 EAETVEIVEAAEEEAAKEEITDEELEAQALAAEAAEEAV 169
+ + + + + + A +
Sbjct: 1215 SSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLT 1253



Score = 38.5 bits (89), Expect = 7e-05
Identities = 27/177 (15%), Positives = 52/177 (29%), Gaps = 12/177 (6%)

Query: 17 QKEQTPEKETEVQNEQPVVEEIVQAQEPAKASEQAVEEQPQAHTEAEAETFAAD------ 70
+E E ++ V+ E E + +E + E +
Sbjct: 1065 NREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE-TATVEKEEKAKVETEKTQEVP 1123

Query: 71 --VVEVTEQVVESEKAQPEAEVVAQPEPVV--EETPEPVAIEREELPLPEDVNAEEVSPE 126
+V+ + +SE QP+AE + +P V +E + ++ ++ P
Sbjct: 1124 KVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPV 1183

Query: 127 EWQAEAETV-EIVEAAEEEAAKEEITDEELEAQALAAEAAEEAVMVVPPAEEEQPVA 182
T +VE E E+ +V VP E +
Sbjct: 1184 TESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTS 1240


60  S4455  S4477  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
S4455     3       15       -0.588042   fructuronate transporter
S4456     1       22       -2.648397   minor fimbrial subunit, D-mannose specific
S4457     1       24       -2.621662   minor fimbrial subunit, precursor polypeptide
S4458     0       24       -2.910531   minor fimbrial subunit, precursor polypeptide
S4460     -1      24       -3.170013   IS1 orfB
S4461     1       30       -4.947351   IS1 orfA
S4463     2       31       -6.189168   periplasmic chaperone
S4464     2       31       -6.587083   fimbrial protein
S4465     0       31       -6.677358   major type 1 subunit fimbrin (pilin)
S4466     1       28       -6.519426   tyrosine recombinase
S4467     0       27       -6.290450   recombinase; regulator for fimA
S4468     0       25       -5.651217   hypothetical protein
S4469     -1      22       -4.721968   hypothetical protein
S4470     -1      22       -2.763978   hypothetical protein
S4471     1       21       -2.925501   IS1 orfB
S4472     2       26       -2.820823   IS1 orfA
S4473     2       26       -2.820823   DeoR family transcriptional regulator
S4474     3       25       -0.958867   hypothetical protein
S4475     2       24       -1.855564   ISEhe3a orf
S4476     1       22       -0.947730   IS600 orfA
S4477     2       23       -0.480184   IS600 orfB
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4455     PF06580       31     0.008    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.4 bits (71), Expect = 0.008
Identities = 10/49 (20%), Positives = 25/49 (51%)

Query: 230 LVPLIPAIIMISTTIANIWLVKDTPAWEVVNFIGSSPIAMFIAMVVAFV 278
+ +I ++ I +W V +T W ++ FI + P+A + + ++ +
Sbjct: 73 MGQIILRVLPACVVIGMVWFVANTSIWRLLAFINTKPVAFTLPLALSII 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4456     SURFACELAYER  28     0.047    Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 28.1 bits (62), Expect = 0.047
Identities = 19/79 (24%), Positives = 32/79 (40%), Gaps = 1/79 (1%)

Query: 211 SQNLGYYLSGTTADAGNSIFTNTASFSPAQGVGVQLTRNGTIIPANNTVSLGAVGTSAVS 270
S+N G ++ +A+ N FT PA V V L ++G ++ + + +
Sbjct: 133 SENAGKEITIGSAN-PNVTFTEKTGDQPASTVKVTLDQDGVAKLSSVQIKNVYAIDTTYN 191

Query: 271 LGLTANYARTGGQVTAGNV 289
+ TG VT G V
Sbjct: 192 SNVNFYDVTTGATVTTGAV 210


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4457     VACCYTOTOXIN  33     4e-04    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 33.5 bits (76), Expect = 4e-04
Identities = 30/158 (18%), Positives = 49/158 (31%), Gaps = 9/158 (5%)

Query: 3 WCKRGYVLAAMLALASATIQAADVTITVNGKVVAKPCTVSTTNATVDLGDLYSFSLMSAG 62
W R + A LA + +TI + VT VN + + + + G
Sbjct: 258 WMGRLQYVGAYLAPSYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTH------IG 311

Query: 63 AASAWHDVALELTNCPVG--TSRVTASFSGAADSTGYYKNQGTAQNIQLELQDDSGNTLN 120
W L + P G + S + Q ++QN + N+
Sbjct: 312 TLDLWQSAGLNIIAPPEGGYKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQVINPPNSAQ 371

Query: 121 TGATKTVQVDDSSQSAHFPLQVRALTVNGGATQGTIQA 158
+ QV D + V +N A GTI+
Sbjct: 372 KTEIQPTQVIDGPFAGGKNTVVNINRINTNA-DGTIRV 408


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4475     HTHFIS        27     0.013    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 26.7 bits (59), Expect = 0.013
Identities = 7/45 (15%), Positives = 16/45 (35%), Gaps = 1/45 (2%)

Query: 4 KRYPEEFKTEAVKQVVDR-GYSVASVATRLDITTHSLYAWIKKYG 47
R E + + + + A L + ++L I++ G
Sbjct: 430 DRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELG 474


61  S4548  S4554  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias     Product
S4548     3       19       -0.495049   hypothetical protein
S4549     5       23       -0.901668   IS1 orfB
S4550     4       20       -0.842994   IS1 orfA
S4551     2       14       -0.020409   IS600 orfB
S4552     2       13       -0.053613   transport of lysine/cadaverine
S4553     2       13       0.779688    IS1 orfA
S4554     2       19       -0.221669   IS1 orfB
62  S4594  S4602  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
S4594     2       18       1.384265    tRNA delta(2)-isopentenylpyrophosphate
S4595     4       24       1.468010    RNA-binding protein Hfq
S4596     4       22       1.380122    GTPase HflX
S4597     4       22       1.889707    FtsH protease regulator HflK
S4598     4       22       1.687504    FtsH protease regulator HflC
S4599     3       20       1.872384    hypothetical protein
S4600     3       19       1.916391    adenylosuccinate synthetase
S4601     2       14       1.423326    transcriptional repressor NsrR
S4602     2       15       1.704114    exoribonuclease R
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4596     SECA          32     0.005    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 32.2 bits (73), Expect = 0.005
Identities = 26/144 (18%), Positives = 54/144 (37%), Gaps = 6/144 (4%)

Query: 282 HVIDAADVRVQENIEAVNTVLEEIDAHEIPTLLVMNKIDMLEDFEPRIDRDEENK-PIRV 340
++D +DV N + IDA+ P L ++ + + R+ D + PI
Sbjct: 665 ELLDVSDVSETINSIREDVFKATIDAYIPPQSL--EEMWDIPGLQERLKNDFDLDLPIAE 722

Query: 341 WLSAQTGAGIPQLFQALTERLSGEVAQHTLRLPPQEGRLRSRFYQLQAIEKEWMEEDGSV 400
WL + L + + + + + + R + LQ ++ W E ++
Sbjct: 723 WLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAAM 782

Query: 401 SLQVRMPIVDWRRLCKQEPALIDY 424
+R I R +++P +Y
Sbjct: 783 D-YLRQGIH-LRGYAQKDP-KQEY 803


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4597     cloacin       32     0.006    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 31.6 bits (71), Expect = 0.006
Identities = 25/81 (30%), Positives = 30/81 (37%), Gaps = 10/81 (12%)

Query: 17 GSSKPGGNSEGNGNKGGRDQGPPDLDDIFRKLSKKLGGLGGGKGTGSGGGSSSQGP---- 72
S G +SE N GG G G GGG GTG G S+ P
Sbjct: 33 ASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTG-GNLSAVAAPVAFG 91

Query: 73 -----RPQLGGRVVTIAAAAI 88
P GG V+I+A A+
Sbjct: 92 FPALSTPGAGGLAVSISAGAL 112


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4602     RTXTOXIND     31     0.027    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.027
Identities = 12/55 (21%), Positives = 24/55 (43%), Gaps = 1/55 (1%)

Query: 165 VVPDDSRLSFDILIPPDQIMGARMGFVVVVELTQRPTRRTKAV-GKIVEVLGDNM 218
+VP+D L L+ I +G ++++ P R + GK+ + D +
Sbjct: 359 IVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAI 413


63  S4624  S4633  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
S4624     3       32       -1.801334   hypothetical protein
S4625     4       32       -0.523968   30S ribosomal protein S6
S4626     4       30       -0.538402   primosomal replication protein N
S4627     4       29       -1.195643   30S ribosomal protein S18
S4628     2       26       -2.156232   50S ribosomal protein L9
S4629     2       23       -6.194699   ISEhe3 orfA
S4630     1       23       -5.925214   ISEhe3 orfB
S4631     0       23       -6.134426   IS600 orfB
S4632     -1      18       -5.214134   IS600 orfA
S4633     -2      16       -4.990635   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4629     HTHFIS        27     0.013    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 26.7 bits (59), Expect = 0.013
Identities = 7/45 (15%), Positives = 16/45 (35%), Gaps = 1/45 (2%)

Query: 4 KRYPEEFKTEAVKQVVDR-GYSVASVATRLDITTHSLYAWIKKYG 47
R E + + + + A L + ++L I++ G
Sbjct: 430 DRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELG 474


64  S0338  S0345  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
S0338     1       15       1.253667    fructokinase
S0341     1       12       1.335530    MFS transport protein AraJ
S0342     0       12       1.581316    exonuclease subunit SbcC
S0343     -1      11       1.611322    exonuclease subunit SbcD
S0344     0       13       1.730861    transcriptional regulator PhoB
S0345     0       12       1.277839    phosphate regulon sensor protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0338     ACETATEKNASE  29     0.017    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 29.4 bits (66), Expect = 0.017
Identities = 17/69 (24%), Positives = 29/69 (42%), Gaps = 10/69 (14%)

Query: 187 FISGTGFATDYRRLSGHALKGSEIIRLVEESDPVAELALRRYELRLAKSLAHVVNILDP- 245
+G ++D+R L A + D A+LAL + R+ K++ +
Sbjct: 273 VYGISGISSDFRDLEDAAF---------KNGDKRAQLALNVFAYRVKKTIGSYAAAMGGV 323

Query: 246 DVIVLGGGM 254
DVIV G+
Sbjct: 324 DVIVFTAGI 332


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0341     TCRTETA       53     1e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 52.5 bits (126), Expect = 1e-09
Identities = 73/356 (20%), Positives = 126/356 (35%), Gaps = 36/356 (10%)

Query: 5 ILSLALGTFGLGMAEFGIMSVLTELAHNVGISIPAAGH---MISYYALVVVVGAPIIALF 61
+ ++AL G+G+ IM VL L ++ S H +++ YAL+ AP++
Sbjct: 11 LSTVALDAVGIGL----IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 62 SSRYSLKHILLFLVALCVIGNAMFTLSSSYLMLAIGRLVSGFPHGAFFGVGAIVLSKIIK 121
S R+ + +LL +A + A+ + +L IGR+V+G GA + I
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIAD--IT 124

Query: 122 PGKVTAAVAGMVSGMTVANLLGIP-LGTYLSQECWRYTFLLIAVFNIAVMASVYFWVPDI 180
G A G +S ++ P LG + F A N + F +P+
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 181 RDEAKGNLREQ----------FHFLRSPAPWLI--FAATMFGNAGVFAWFSYVKPYMMFI 228
+ LR + + A + F + G W +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG------E 238

Query: 229 SGFSETAMTFIMMLVGLGM---VLGNMLSGRISGRYSPLRIAAVTDFIIVLALLMLFFCG 285
F A T + L G+ + M++G ++ R R + ++L F
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 286 GMKTTSLIFAFICCAGLFALSAPLQILLLQNAKGGELLGAAGGQIAF--NLGSAVG 339
I + G+ LQ +L + E G G +A +L S VG
Sbjct: 299 RGWMAFPIMVLLASGGIG--MPALQAMLSRQV-DEERQGQLQGSLAALTSLTSIVG 351


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
S0342     RTXTOXIND  39     7e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.4 bits (92), Expect = 7e-05
Identities = 34/199 (17%), Positives = 71/199 (35%), Gaps = 14/199 (7%)

Query: 671 QQEAQSWQQRQNELTALQNRIQQLTPILETLPQSDDLPHSEETVALDNWRQVHEQCLALH 730
+ + Q + Q R Q L+ +E + E + +V +
Sbjct: 133 EADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIK 192

Query: 731 SQQQTLQQQDVLAAQSLQKAQAQFDTAL--------QASVFDDQQAFLAALMDEQTLTQL 782
Q T Q Q +L K +A+ T L + V + ++L+ +Q +
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIA-- 250

Query: 783 EQLKQNLENQRRQAQTLVTQTAETLAQHQQHRPDGLALTVTVEQIQQEL-AQTHQKLREN 841
K + Q + V + +Q +Q + L+ + + Q + KLR+
Sbjct: 251 ---KHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQT 307

Query: 842 TTSQGEIRQQLKQDADNRQ 860
T + G + +L ++ + +Q
Sbjct: 308 TDNIGLLTLELAKNEERQQ 326



Score = 39.4 bits (92), Expect = 7e-05
Identities = 25/204 (12%), Positives = 59/204 (28%), Gaps = 18/204 (8%)

Query: 487 EARIKTLEAQRAQLQAGQPCPLCGSTSHPAVEAYQALEPGVNQSRLLALENEVKKLGEEG 546
EA ++ Q + Q ++E + E + +E + L
Sbjct: 133 EADTLKTQSSLLQARLEQ---TRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLT- 188

Query: 547 AALRGQLDALTKQLQRDENEAQSLRQDEQALTQQWQAVTASLNITLQPQDDIQPWLDAQD 606
+ ++ Q Q + E R + + + + DD L Q
Sbjct: 189 SLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQA 248

Query: 607 -------EHERQL-RLLSQRHELQGQIAAHNQQIIQYQQQIEQRQQQLLTALAGYALTLP 658
E E + +++ + Q+ +I+ +++ + Q L
Sbjct: 249 IAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLF------KNEILD 302

Query: 659 QEDEEESWLATRQQEAQSWQQRQN 682
+ + + E ++RQ
Sbjct: 303 KLRQTTDNIGLLTLELAKNEERQQ 326



Score = 32.5 bits (74), Expect = 0.009
Identities = 16/150 (10%), Positives = 42/150 (28%), Gaps = 5/150 (3%)

Query: 731 SQQQTLQQQDVLAAQSLQKAQAQFDTA----LQASVFDDQQAFLAALMDEQTLTQLEQLK 786
+ Q + A + Q + L D+ F +E+ L +K
Sbjct: 134 ADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVS-EEEVLRLTSLIK 192

Query: 787 QNLENQRRQAQTLVTQTAETLAQHQQHRPDGLALTVTVEQIQQELAQTHQKLRENTTSQG 846
+ + Q + A+ + L L + ++
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKH 252

Query: 847 EIRQQLKQDADNRQQQQTLMQQIAQMTQQV 876
+ +Q + + + + Q+ Q+ ++
Sbjct: 253 AVLEQENKYVEAVNELRVYKSQLEQIESEI 282


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
S0343     FRAGILYSIN  31     0.010    Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.

Length = 405

Score = 30.8 bits (69), Expect = 0.010
Identities = 14/70 (20%), Positives = 25/70 (35%), Gaps = 4/70 (5%)

Query: 149 KQQHLLAAITDYYQQHYADACKLRGDQPLPIIATGHLTTVGSSKSDAVRDIYIGTLDAFP 208
K+ ++ I ++Y + + + I T S+ D + + I A
Sbjct: 135 KEAQMMNEIAEFYAAPFKKTRAINEKEAFECI-YDSRTR--SAGKD-IVSVKINIDKAKK 190

Query: 209 AQNFPPADYI 218
N P DYI
Sbjct: 191 ILNLPECDYI 200


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S0344     HTHFIS  95     1e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.5 bits (235), Expect = 1e-24
Identities = 32/149 (21%), Positives = 63/149 (42%), Gaps = 9/149 (6%)

Query: 4 RILVVEDEAPIREMVCFVLEQNGFQPVEAEDYDSAVNQLNEPWPDLILLDWMLPGGSGIQ 63
ILV +D+A IR ++ L + G+ + + + DL++ D ++P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 FIKHLKRESMTRDIPVVMLTARGEEEDRVRGLETGADDYITKPFSPKELVARIKAVMRRI 123
+ +K+ D+PV++++A+ ++ E GA DY+ KPF EL+ I +
Sbjct: 65 LLPRIKKARP--DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA-- 120

Query: 124 SPMAVEEVIKMQGLSLNPTSHRVMAGEEP 152
E + L + + G
Sbjct: 121 -----EPKRRPSKLEDDSQDGMPLVGRSA 144


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S0345     PF06580  34     0.001    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.1 bits (78), Expect = 0.001
Identities = 19/105 (18%), Positives = 33/105 (31%), Gaps = 26/105 (24%)

Query: 325 LVYNAVNH----TPEGTHITVRWQRVPHGAEFSVEDNGPGIAPEHIPRLTERFYRVDKAR 380
LV N + H P+G I ++ + VE+ G
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN---------------- 306

Query: 381 SRQTGGSGLGLAIVKHAVNH---HESRLNIESTVGKGTRFSFVIP 422
+G GL V+ + E+++ + GK +IP
Sbjct: 307 --TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM-VLIP 348


65  S0384  S0391  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S0384     0       21       -0.265271  muropeptide transporter
S0385     3       26       -0.646966  hypothetical protein
S0386     3       26       -0.699185  transcriptional regulator BolA
S0387     3       26       -0.407193  trigger factor
S0388     0       19       -0.123062  ATP-dependent Clp protease proteolytic subunit
S0389     1       20       -0.379676  ATP-dependent protease ATP-binding subunit ClpX
S0390     0       17       -0.428041  DNA-binding ATP-dependent protease La
S0391     -1      13       -0.828657  transcriptional regulator HU subunit beta
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S0384     TCRTETA  39     3e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.0 bits (91), Expect = 3e-05
Identities = 71/347 (20%), Positives = 135/347 (38%), Gaps = 20/347 (5%)

Query: 62 KFLWSPLMDRYTPPFFGRRRGWLLATQILLLVAIAAMGFLEPGTQLRWMAALAVVIAFCS 121
+F +P++ + F RR LL + V A M W+ + ++A +
Sbjct: 56 QFACAPVLGALSDRF--GRRPVLLVSLAGAAVDYAIMAT----APFLWVLYIGRIVAGIT 109

Query: 122 ASQDIVFDAWKTDVLPAEERGAGAAISVLGYRLGMLVSGGLALWLADKWLGWQGMYWLMA 181
+ V A+ D+ +ER + GM+ L + ++ A
Sbjct: 110 GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG--FSPHAPFFAAA 167

Query: 182 AL-LIPCIIATLLAPEP--TDTIPVPKTLEQAVVAPLRDFFGRNNAWLILLLIVLYKLGD 238
AL + + L PE + P+ + + + A L+ + ++ +G
Sbjct: 168 ALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQ 227

Query: 239 AFAMSLTTTFLIRGVGFDAGEVGVVNKTLGLLATIVGALYGGILMQRLSLFRALLIFGIL 298
A +L F +DA +G+ G+L ++ A+ G + RL RAL+ G++
Sbjct: 228 VPA-ALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALM-LGMI 285

Query: 299 QGASNAGYWLLSITDKHLYSMGAAVFFENLCGGMGTSAFVALLMTLCNKSFSATQFALLS 358
A GY LL+ + + V GG+G A A+L ++ L+
Sbjct: 286 --ADGTGYILLAFATRGWMAFPIMVLL--ASGGIGMPALQAMLSRQVDEERQGQLQGSLA 341

Query: 359 ALSAVGRVYVGPVAGWFVEAHGWSTF--YLFSVAAAVPGLILLLVCR 403
AL+++ + VGP+ + A +T+ + + AA+ L L + R
Sbjct: 342 ALTSLTSI-VGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S0385     PF06291  27     0.030    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 26.5 bits (58), Expect = 0.030
Identities = 11/34 (32%), Positives = 18/34 (52%)

Query: 3 KKILFPLVALFMLAGCAKPPTTIEVSPTITLPQQ 36
KK+LF ++ GCA+ T+ PT P++
Sbjct: 7 KKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKE 40


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S0389     HTHFIS  29     0.043    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.043
Identities = 16/73 (21%), Positives = 29/73 (39%), Gaps = 13/73 (17%)

Query: 60 ERSALPTPHEIRNHLDDYVIGQEQAKKVLAVAVYNHYKRLRNGDTSNGVELGKSNILLIG 119
E P+ E + ++G+ A + +Y RL D +++ G
Sbjct: 121 EPKRRPSKLEDDSQDGMPLVGRSAAMQ----EIYRVLARLMQTD---------LTLMITG 167

Query: 120 PTGSGKTLLAETL 132
+G+GK L+A L
Sbjct: 168 ESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
S0390     GPOSANCHOR  35     0.001    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 35.0 bits (80), Expect = 0.001
Identities = 34/133 (25%), Positives = 69/133 (51%), Gaps = 15/133 (11%)

Query: 191 ERLEYLMAMMESEIDLLQVEKRIRNRVKKQMEKSQREYYLNEQMKAIQKELGEMDDVPD- 249
LE A +E + +L R +++ ++ S+ +Q++A ++L E + + +
Sbjct: 291 AALEAEKADLEHQSQVLNAN---RQSLRRDLDASREAK---KQLEAEHQKLEEQNKISEA 344

Query: 250 ENEALKRKIDAAKMPKEAKEKAEAELQKLKMMSPMS-AEATVVRGYIDWMVQVPWNARSK 308
++L+R +DA++ EAK++ EAE QKL+ + +S A +R +D + A+ +
Sbjct: 345 SRQSLRRDLDASR---EAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASRE----AKKQ 397

Query: 309 VKKDLRQAQEILD 321
V+K L +A L
Sbjct: 398 VEKALEEANSKLA 410


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0391     DNABINDINGHU  117    3e-38    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 117 bits (294), Expect = 3e-38
Identities = 49/88 (55%), Positives = 67/88 (76%)

Query: 2 NKSQLIDKIAAGADISKAAAGRALDAIIASVTESLKEGDDVALVGFGTFAVKERAARTGR 61
NK LI K+A +++K + A+DA+ ++V+ L +G+ V L+GFG F V+ERAAR GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGKEITIAAAKVPSFRAGKALKDAV 89
NPQTG+EI I A+KVP+F+AGKALKDAV
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAV 90


66  S0409  S0422  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S0409     0       13       -1.379524  hypothetical protein
S0410     2       16       -1.164483  hypothetical protein
S0412     0       14       -0.631675  hemolysin expression-modulating protein
S0413     0       14       -0.435979  hypothetical protein
S0414     0       15       0.446071   acriflavine resistance protein
S0415     1       11       -0.148454  acriflavin resistance protein AcrA precursor
S0416     1       13       -0.339300  DNA-binding transcriptional repressor AcrR
S0417     2       14       1.832591   potassium efflux protein KefA
S0418     4       15       3.540497   hypothetical protein
S0419     3       16       4.106587   primosomal replication protein N''
S0420     3       21       2.630976   hypothetical protein
S0421     3       26       2.455158   adenine phosphoribosyltransferase
S0422     2       21       2.491635   DNA polymerase III subunits gamma and tau
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0409     BCTERIALGSPF  30     0.026    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 29.8 bits (67), Expect = 0.026
Identities = 31/137 (22%), Positives = 54/137 (39%), Gaps = 24/137 (17%)

Query: 245 IWLPLGLVIGLLAAMFVLRILRRIQSPHHRLQDAIENRDICVHYQPIVSLANGKIVGAEA 304
W+ L L+ G +A +LR R+ + + P++ G+I
Sbjct: 228 PWMLLALLAGFMAFRVMLR------QEKRRVS-----FHRRLLHLPLI----GRIARGLN 272

Query: 305 LARWPQTDGSWLSPDSFIPLAQQTGLS-EPLTLLIIRSVFEDMGDCLRQHPQQHISINLE 363
AR+ +T + S +PL Q +S + ++ R D +R+ H + LE
Sbjct: 273 TARYARTLSILNA--SAVPLLQAMRISGDVMSNDYARHRLSLATDAVREGVSLHKA--LE 328

Query: 364 STVLTSEKIPQLLREMI 380
T L P ++R MI
Sbjct: 329 QTAL----FPPMMRHMI 341


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0414     ACRIFLAVINRP  1366   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1366 bits (3536), Expect = 0.0
Identities = 800/1033 (77%), Positives = 913/1033 (88%), Gaps = 1/1033 (0%)

Query: 1 MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISASYPGADAKTVQDT 60
M NFFI RPIFAWV+AII+M+AG LAIL+LPVAQYPTIAPPAV++SA+YPGADA+TVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQ 120
VTQVIEQNMNGIDNLMYMSS SDS G+V ITLTF+SGTD DIAQVQVQNKLQLA PLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVSVEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDAISRTSGVGDVQLFGS 180
EVQQQG+SVEKSSSS+LMV G ++ + TQ+DISDYVA+N+KD +SR +GVGDVQLFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWMNPNELNKFQLTPVDVITAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQTRL 240
QYAMRIW++ + LNK++LTPVDVI +K QN Q+AAGQLGGTP + GQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 TSTEEFGKILLKVNQDGSRVLLRDVAKIELGGENYDIIAEFNGQPASGLGIKLATGANAL 300
+ EEFGK+ L+VN DGS V L+DVA++ELGGENY++IA NG+PA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTAAAIRAELAKMEPFFPSGLKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFLQ 360
DTA AI+A+LA+++PFFP G+K++YPYDTTPFV++SIHEVVKTL EAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTFA+LAAFG+SINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 AEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGSTGAIYRQFSITIVSAMAL 480
E+ LPPKEAT KSM QIQGALVGIAMVLSAVF+PMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATMLKPIAKGDHGEGKKGFFGWLNRMFEKSTHHYTDSVGGILRSTGR 540
SVLVALILTPALCAT+LKP++ H E K GFFGW N F+ S +HYT+SVG IL STGR
Sbjct: 481 SVLVALILTPALCATLLKPVSAE-HHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 541 YLVLYLIIVVGMAYLFVRLPSSFLPDEDQGVFMTMVQLPAGATQERTQKVLNEVTHYYLT 600
YL++Y +IV GM LF+RLPSSFLP+EDQGVF+TM+QLPAGATQERTQKVL++VT YYL
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 601 KEKNNVESVFAVNGFGFAGRGQNTGIAFVSLKDWADRPGEENKVEAITMRATRAFSQIKD 660
EK NVESVF VNGF F+G+ QN G+AFVSLK W +R G+EN EA+ RA +I+D
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 661 AMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLLAEAAKHPDMLTSVRPNG 720
V FN+PAIVELGTATGFDFELIDQAGLGH+ LTQARNQLL AA+HP L SVRPNG
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 721 LEDTPQFKIDIDQEKAQALGVSINDINTTLGAAWGGSYVNDFIDRGRVKKVYVMSEAKYR 780
LEDT QFK+++DQEKAQALGVS++DIN T+ A GG+YVNDFIDRGRVKK+YV ++AK+R
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFR 779

Query: 781 MLPDDIGDWYVRAADGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILGQAAPGKSTGEA 840
MLP+D+ YVR+A+G+MVPFSAF++S W YGSPRLERYNGLPSMEI G+AAPG S+G+A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 841 MELMEQLASKLPTGVGYDWTGMSYQERLSGNQAPSLYAISLIVVFLCLAALYESWSTPFS 900
M LME LASKLP G+GYDWTGMSYQERLSGNQAP+L AIS +VVFLCLAALYESWS P S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 901 VMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGL 960
VMLVVPLG++G LLAAT NDVYF VGLLTTIGLSAKNAILIVEFAKDLM+KEGKG+
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 961 IEATLDAVRMRLRPILMTSLAFILGVMPLVISTGAGSGAQNAVGTGVMGGMVTATVLAIF 1020
+EATL AVRMRLRPILMTSLAFILGV+PL IS GAGSGAQNAVG GVMGGMV+AT+LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1021 FVPVFFVVVRRRF 1033
FVPVFFVV+RR F
Sbjct: 1020 FVPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
S0415     RTXTOXIND  45     3e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 45.2 bits (107), Expect = 3e-07
Identities = 33/212 (15%), Positives = 72/212 (33%), Gaps = 23/212 (10%)

Query: 100 TYQATYDSAKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQGYDQALADAQQANAAVTA 159
+ Y A +L + + Q+ Q +++ ++ L +Q +
Sbjct: 256 EQENKYVEAVNELR--VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 160 AKAAVETARINLAYTKVTSPISGRIGKSNV-TEGALVQNGQATVLATVQQLDPIYVDVTQ 218
+ + + +P+S ++ + V TEG +V + T++ V + D + V
Sbjct: 314 LTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTALV 372

Query: 219 SSNDFLRLKQELA----------NGTLKQENGKAKVSLITSDGIKFPQDGTLEFSDVTVD 268
+ D + KV I D I+ + G + ++++
Sbjct: 373 QNKDIGFINVGQNAIIKVEAFPYTRYGYLV---GKVKNINLDAIEDQRLGLVFNVIISIE 429

Query: 269 QTTGSITLRAIFPNPDHTLLPGMFVRARLEEG 300
+ S + I L GM V A ++ G
Sbjct: 430 ENCLSTGNKNIP------LSSGMAVTAEIKTG 455



Score = 32.9 bits (75), Expect = 0.002
Identities = 26/127 (20%), Positives = 50/127 (39%), Gaps = 10/127 (7%)

Query: 49 PLQITTELPGR-TSAYRIAEVRPQVSGIILKRNFKEGSDIEAGVSLYQIDPATYQATYDS 107
++I G+ T + R E++P + I+ + KEG + G L ++ +A
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA---- 134

Query: 108 AKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQGYDQALADAQQANAAVTAAKAAVETA 167
D K Q++ A+L RYQ L + I + + V+ + T+
Sbjct: 135 ---DTLKTQSSLLQARLEQTRYQILSRS--IELNKLPELKLPDEPYFQNVSEEEVLRLTS 189

Query: 168 RINLAYT 174
I ++
Sbjct: 190 LIKEQFS 196


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S0416     HTHTETR  222    5e-76    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 222 bits (567), Expect = 5e-76
Identities = 215/215 (100%), Positives = 215/215 (100%)

Query: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60
MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120
EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180
GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215
APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
S0417     RTXTOXIND  32     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.017
Identities = 19/125 (15%), Positives = 40/125 (32%), Gaps = 6/125 (4%)

Query: 28 QNTAFARASSNGDLPTKADLQAQLDSLNKQKDLSAQDKLVQQDLTDTLATLDKIDRIKEE 87
N RA L + + L L+ + A L++ ++ E
Sbjct: 207 LNLDKKRAERLTVLARINRYENLSRVEKSR--LDDFSSLLHKQAIAKHAVLEQENKYVEA 264

Query: 88 TVQLRQKVAEAPEKMRQATAALTALSDVDND--EETRKIL--STLSLRQLETRVAQALDD 143
+LR ++ + + +A V E L +T ++ L +A+ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEER 324

Query: 144 LQNAQ 148
Q +
Sbjct: 325 QQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
S0422     IGASERPTASE  39     7e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.9 bits (90), Expect = 7e-05
Identities = 40/251 (15%), Positives = 77/251 (30%), Gaps = 31/251 (12%)

Query: 402 PLPETTSQVLAARQQLQRVQGATKAKKSEPAA----ATRARPVNNAALERLASVTDRVQA 457
P E +Q + + + P+ AR + A + A T
Sbjct: 983 PEVEKRNQTVDTTN----ITTPNNIQADVPSVPSNNEEIARV-DEAPVPPPAPATPSETT 1037

Query: 458 RPVPSALEKAPAKKEAYRWKATTPVMQQKE--------VVATPKALKKA---LEHEKTPE 506
V ++ E AT Q +E V A + + A E ++T
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 507 LAVKLAA---------EAIERDPWAAQVSQLSLPKLVEQVALNAWKE-ESDNAVCLHLRS 556
K A E+ +V+ PK + + E +N ++++
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 557 SQRHLNNRGAQQKLAEALS-MLKGSTVELTIVEDDNPAVRTPLEWRQAIYEEKLAQARES 615
Q N ++ A+ S ++ E T V N V P A + + +
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 616 IIADNNIQTLR 626
+ + +++R
Sbjct: 1218 KPKNRHRRSVR 1228


67  S0511  S0516  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
S0511     0       18       4.281988  enterobactin exporter EntS
S0512     -1      19       4.353705  iron-enterobactin transporter periplasmic
S0514     -1      19       4.285311  enterobactin synthase subunit E
S0515     0       17       3.834202  2,3-dihydro-2,3-dihydroxybenzoate synthetase
S0516     -1      15       3.744174  2,3-dihydroxybenzoate-2,3-dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S0511     TCRTETA  37     1e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.7 bits (85), Expect = 1e-04
Identities = 81/391 (20%), Positives = 142/391 (36%), Gaps = 38/391 (9%)

Query: 27 FISIVSLGLLGVAVPVQIQMMTHSTWQV---GLSVTLTGGAMFVGLMVGGVLADRYERKK 83
+ V +GL+ +P ++ + HS G+ + L F V G L+DR+ R+
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 84 VILLARGTCGIGFIGLCLNALL--PEPSLLAIYLLGLWDGFFASLGVTALLAATSALVGR 141
V+L + G ++ + P L +Y+ + G + G A A + +
Sbjct: 75 VLL-------VSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVA-GAYIADITDG 126

Query: 142 ENLMQAGAITMLTVRLGSVISPMIGGLLLATGGVAWNYGLAAAGTFITLLPLLSLPE-LP 200
+ + G V P++GGL+ A + AA L LPE
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHK 186

Query: 201 PPPQPLEHPLKSLLAGFRFLLASPLLGGLLTMA----------SAVLVLYPALADNWQMS 250
+PL + LA FR+ ++ L+ + +A+ V++ D +
Sbjct: 187 GERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG--EDRFHWD 244

Query: 251 AAQIGFLYAAIP-LGAAIGALTSGKLAHSARPGLLMLLSTLGS---FLAIGLFGLMPMWI 306
A IG AA L + A+ +G +A ++L + ++ + M
Sbjct: 245 ATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAF 304

Query: 307 LGVVCLALFGWLSAVSSLLQYTMLQTQTPEAMLGRINGLWTAQNVTGDAIGAALLGGLGA 366
+V LA G ML Q E G++ G A +G L + A
Sbjct: 305 PIMVLLASGGIGMPALQ----AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYA 360

Query: 367 MMTPVASASASGFGLLIIGVLLLLVLVELRR 397
+ + +G+ + L LL L LRR
Sbjct: 361 ----ASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0512     FERRIBNDNGPP  63     2e-13    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 63.0 bits (153), Expect = 2e-13
Identities = 61/285 (21%), Positives = 102/285 (35%), Gaps = 35/285 (12%)

Query: 40 HTLESQPQRIVSTSVTLTGSLLAIDAPVIASGATTPNNRVADDQGFLRQWSKVAKERKLQ 99
H P RIV+ LLA+ VAD + R W E L
Sbjct: 29 HAAAIDPNRIVALEWLPVELLLALGIVPYG---------VADTINY-RLW---VSEPPLP 75

Query: 100 RLYIG-----EPSAEAVAAQMPDLILISATGGDSALALYDQLSTIAPTLIINYDDKS--- 151
I EP+ E + P ++ SA G S + L+ IAP N+ D
Sbjct: 76 DSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPL 131

Query: 152 --WQSLLTQLGEITGHEKQAAERIAQFDKQLAAAKEQIKLPPQPVTAIVYTAAAHSANLW 209
+ LT++ ++ + A +AQ++ + + K + + ++
Sbjct: 132 AMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVF 191

Query: 210 TPESAQGQMLEQLGFTLAKLPAGLNASQSQGKRHDIIQLGGENLAAGLNGESLFLFAGDQ 269
P S ++L++ G NA Q + + + LAA + + L +
Sbjct: 192 GPNSLFQEILDEYGIP--------NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNS 243

Query: 270 KDADAIYANPLLAHLPAVQNKQVYALGTETFRLDYYSAMQVLERL 314
KD DA+ A PL +P V+ + + F SAM + L
Sbjct: 244 KDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVL 288


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0515     ISCHRISMTASE  440    e-159    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 440 bits (1133), Expect = e-159
Identities = 145/299 (48%), Positives = 194/299 (64%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQAYALPESHDIPQNKVDWAFELQRAALLIHDMQDYFVSFWGENCPMMEQVIANI 60
MAIP +Q Y +P + D+PQNKV W + RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRDYCKQHNIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVDRLTPDADDTV 120
L++ C Q IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P+ DD V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKESGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M+++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSRDEHLMSLKYVAGRSGRVVMTEELL------PAPIPASKA-----------ALREVIL 223
FS ++H M+L+Y AGR VMT+ LL PA + + A +R+ I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDESDEPFDDD-NLIDYGLDSVRMMALAARWRKVHGDIDFVMLAKNPTIDAWWKLLS 281
LL E+ E D +L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W KLL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0516     DHBDHDRGNASE  359    e-129    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 359 bits (922), Expect = e-129
Identities = 108/258 (41%), Positives = 149/258 (57%), Gaps = 20/258 (7%)

Query: 5 GKNVWVTGAGKGIGYATALTFVEAGAKVTGFD---------------QAFTQEQYPFATE 49
GK ++TGA +GIG A A T GA + D +A E +P
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP---- 63

Query: 50 VMDVADAGQVAQVCQRLLAETERLDVLINAAGILRMGATDQLSKEDWQQTFAVNVGGAFN 109
DV D+ + ++ R+ E +D+L+N AG+LR G LS E+W+ TF+VN G FN
Sbjct: 64 -ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 110 LFQQTMNQFRRQRGGAIVTVASDAAHTARIGMSAYGASKAALKSLALSVGLELAGSGVRC 169
+ +R G+IVTV S+ A R M+AY +SKAA +GLELA +RC
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 170 NVVSPGSTDTDMQRTLWVSDDAEEQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDL 229
N+VSPGST+TDMQ +LW ++ EQ I+G E FK GIPL K+A+P +IA+ +LFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 230 ASHITLQDIVVDGGSTLG 247
A HIT+ ++ VDGG+TLG
Sbjct: 243 AGHITMHNLCVDGGATLG 260


68  S0783  S0788  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
S0783     -2      19       2.680575  hypothetical protein
S0784     -2      17       2.736928  hypothetical protein
S0785     -2      16       2.423631  ATP-binding component of a transport system
S0786     -2      13       2.394127  hypothetical protein
S0787     -1      13       2.158728  DNA-binding transcriptional regulator
S0788     -1      13       2.012609  ATP-dependent RNA helicase RhlE
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0783     ABC2TRNSPORT  47     3e-08    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 47.2 bits (112), Expect = 3e-08
Identities = 36/146 (24%), Positives = 63/146 (43%), Gaps = 5/146 (3%)

Query: 197 AREREQGTLDQLLVSPLTTWQIFIGKAVPALIVATFQATIVLAIGIWAYQIPFAGSLALF 256
R Q T + +L + L I +G+ A A IG+ A + + L+L
Sbjct: 92 GRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGA---GIGVVAAALGYTQWLSLL 148

Query: 257 YFTMVI--YGLSLVGFGLLISSLCSTQQQAFIGVFVFMMPAILLSGYVSPVENMPVWLQN 314
Y VI GL+ G+++++L + + + P + LSG V PV+ +P+ Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 315 LTWINPIRHFTDITKQIYLKDASLDI 340
P+ H D+ + I L +D+
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDV 234


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S0785     PF05272  31     0.012    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.012
Identities = 20/90 (22%), Positives = 28/90 (31%), Gaps = 21/90 (23%)

Query: 298 TPRFEDAFIDLLGGAGTSESPLGAILHTVEGTPGETVIEAKELTKKFGDFAATDHVNFAV 357
PR E + +LG P + + + K HV +
Sbjct: 547 VPRLEKWLVHVLGKTPDDYKP-------------RRLRYLQLVGKYI----LMGHVARVM 589

Query: 358 KRGEIFG----LLGPNGAGKSTTFKMMCGL 383
+ G F L G G GKST + GL
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
S0786     RTXTOXIND  63     6e-13    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 62.5 bits (152), Expect = 6e-13
Identities = 42/259 (16%), Positives = 92/259 (35%), Gaps = 25/259 (9%)

Query: 83 ALMQAKAGVSVAQAQYDLMLAGYRDEEIAQAAAAVKQAQAAYDYAQNFYNRQQGLWKSRT 142
Q + + +A+ +LA E + + + + L +
Sbjct: 201 QKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENK 260

Query: 143 ISA--NDLENARSSRDQAQATLKSAQDKLRQYRSGNREQ---DIAQAKASLEQAQAQLAQ 197
N+L +S +Q ++ + SA+++ + + + + Q ++ +LA+
Sbjct: 261 YVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAK 320

Query: 198 AELNLQDSTLIAPSDGTLLTRAV-EPGTVLNEGGTVFTVSLT-RPVWVRAYVDERNLDQA 255
E Q S + AP + V G V+ T+ + + V A V +++
Sbjct: 321 NEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFI 380

Query: 256 QPGRKVLLYTDGRPDKPYH---GQIGFVSPTAEFTPKTVETPDLRTDLVYRLRIVVT--- 309
G+ ++ + P Y G++ ++ A D R LV+ + I +
Sbjct: 381 NVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA--------IEDQRLGLVFNVIISIEENC 432

Query: 310 ----DADDALRQGMPVTVQ 324
+ + L GM VT +
Sbjct: 433 LSTGNKNIPLSSGMAVTAE 451


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S0787     HTHTETR  73     6e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 73.1 bits (179), Expect = 6e-18
Identities = 33/214 (15%), Positives = 77/214 (35%), Gaps = 17/214 (7%)

Query: 13 KGEQAKKQLIAAALAQFGEYGMNATT-REIAAQAGQNIAAITYYFGSKEDLYLACAQWIA 71
+ ++ ++ ++ AL F + G+++T+ EIA AG AI ++F K DL+ +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 72 DFIGEQFRPHAEEAERLFAQPQPDRAAIRELILRACRNMIKLLTQDDTVNLSKFISREQL 131
IGE E + P + +RE+++ + + + + + F E +
Sbjct: 68 SNIGELEL---EYQAKFPGDP---LSVLREILIHVLESTVTEERRRLLMEII-FHKCEFV 120

Query: 132 SPTAAYHLVHEQVISPLHSHLTRLIAAWTGCDANDTRMILHTHALIGEILAFRLGKETIL 191
A + + + + + +A L T + + G
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKH--CIEAKMLPADLMTRRAAIIMRGYISG----- 173

Query: 192 LRTGWTAFDEEKTELINQTVTCHIDLILQGLSQR 225
L W + + + ++ ++L+
Sbjct: 174 LMENWLFAPQSFD--LKKEARDYVAILLEMYLLC 205


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
S0788     SECA  31     0.013    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 30.6 bits (69), Expect = 0.013
Identities = 20/67 (29%), Positives = 34/67 (50%), Gaps = 4/67 (5%)

Query: 246 QQVLVFTRTKHGANHLAEQLNKDGIRSVAIHG-NKSQGARTRALADFKSGDIRVLVATDI 304
Q VLV T + + ++ +L K GI+ ++ + A A A + + V +AT++
Sbjct: 450 QPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQAGYPAA---VTIATNM 506

Query: 305 AARGLDI 311
A RG DI
Sbjct: 507 AGRGTDI 513


69  S0838  S0843  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S0838     1       14       -0.773820  chloramphenicol resistance pump Cmr
S0839     0       16       -1.148156  hypothetical protein
S0840     1       17       -1.584897  hypothetical protein
S0841     0       16       -0.789533  DEOR-type transcriptional regulator
S0842     0       14       -1.566862  DEOR-type transcriptional regulator
S0843     0       13       -0.560355  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S0838     TCRTETB  39     3e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 38.7 bits (90), Expect = 3e-05
Identities = 28/155 (18%), Positives = 61/155 (39%), Gaps = 5/155 (3%)

Query: 48 QAGIDWVPTSMTAYLAGGMFLQWLLGPLSDRIGRRPVMLAGVVWFIVTCLAILLAQNIEQ 107
A +WV T+ + G + G LSD++G + ++L G++ + + +
Sbjct: 48 PASTNWVNTAFMLTFSIGTAV---YGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFS 104

Query: 108 FTLL-RFLQGISFCFIGAVGYAAIQESFEEAVCIKITALMANVALIAPLLGPLVGAAWIH 166
++ RF+QG A+ + + K L+ ++ + +GP +G H
Sbjct: 105 LLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAH 164

Query: 167 VLPWEGMFVLFAALAAISFFGLQRAMPETAMRIGE 201
+ W +L + I+ L + + + G
Sbjct: 165 YIHW-SYLLLIPMITIITVPFLMKLLKKEVRIKGH 198


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S0841     PF05272  33     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 33.1 bits (75), Expect = 0.002
Identities = 24/61 (39%), Positives = 29/61 (47%), Gaps = 1/61 (1%)

Query: 302 DSAWVAGVSVVLWGLGASLGFPLTISAASDTGPDAPTRVSVVATTGYLAFLVGPPLLGYL 361
D +W+AG +VVLW SL LT DT PD R ++A YL F P L
Sbjct: 270 DWSWLAGCTVVLWPDCDSLREKLTRQELKDT-PDPLAREKLLAAKPYLPFDKQPGQKAML 328

Query: 362 G 362
G
Sbjct: 329 G 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S0842     HTHTETR  50     6e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 49.6 bits (118), Expect = 6e-10
Identities = 14/83 (16%), Positives = 30/83 (36%), Gaps = 4/83 (4%)

Query: 2 RRANDPQRREKIIQATLEAVKLYGIHAVTHRKIATLAGVPLGSMTYYFSGIDELLLEAFS 61
+ + R+ I+ L G+ + + +IA AGV G++ ++F +L E +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIW- 63

Query: 62 SFTEIMSRQYQAFFSDVSDAQGA 84
E+ +
Sbjct: 64 ---ELSESNIGELELEYQAKFPG 83


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S0843     TCRTETA  32     0.006    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 32.1 bits (73), Expect = 0.006
Identities = 21/106 (19%), Positives = 34/106 (32%), Gaps = 6/106 (5%)

Query: 394 LMIGMITFQFSTFSFGMGNAAGLLFAGIML-GFMRANHPTFG-YIPQ--GALSMVKEFGL 449
L++ + +L+ G ++ G A G YI + FG
Sbjct: 76 LLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGF 135

Query: 450 MVFMAGVGLSAGSGINNGLGAIGGQM--LIAGLIVSLVPVVICFLF 493
M G G+ AG + +G A + L + CFL
Sbjct: 136 MSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLL 181


70  S0860  S0865  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S0860     1       22       -7.209411  arginine transporter ATP-binding subunit
S0861     1       21       -6.449057  lipoprotein
S4813     1       18       -4.887754  hypothetical protein
S4814     1       14       -3.227214  hypothetical protein
S0862     0       13       2.451293   hypothetical protein
S0863     0       13       2.600054   regulator
S0864     -2      14       2.401151   nucleotide di-P-sugar epimerase or dehydratase
S0865     -2      12       1.749847   dTDP-glucose enzyme
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S0860     PF05272  30     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.010
Identities = 9/18 (50%), Positives = 12/18 (66%)

Query: 31 LVLLGPSGAGKSSLLRVL 48
+VL G G GKS+L+ L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
S0863     ECOLIPORIN  30     0.009    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 30.3 bits (68), Expect = 0.009
Identities = 21/54 (38%), Positives = 27/54 (50%), Gaps = 9/54 (16%)

Query: 2 RRVFWLVAAALLLAGCAGEKGIVEKEGYQLDTRHQAQAAYPRIKVLVIHYTADD 55
R+V LV ALL AG A I K+G +LD Y ++ L HY +DD
Sbjct: 3 RKVLALVIPALLAAGAAHAAEIYNKDGNKLDL-------YGKVDGL--HYFSDD 47


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0864     NUCEPIMERASE  74     6e-17    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 74.0 bits (182), Expect = 6e-17
Identities = 70/363 (19%), Positives = 123/363 (33%), Gaps = 65/363 (17%)

Query: 13 MKVLVTGATSGLGRNAVEFLCQKGISVRA---------SGRNEAMGKLLEKMGAEFVPTD 63
MK LVTGA +G + + L + G V +A +LL + G +F D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 64 LTELVSSQAKVMLAGIDTLWHCS-------SFTSPWGTQQAFDLANVRATRRLGEWAVAW 116
L + + ++ S +P A+ +N+ + E
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPH----AYADSNLTGFLNILEGCRHN 116

Query: 117 GVRNFIHISSPSLYFDYHHHRNIKEDFRPHRFANEFARSKAASEEVINMLSQANPQTRFT 176
+++ ++ SS S+Y + D + +A +K A+E + + S T
Sbjct: 117 KIQHLLYASSSSVYGL-NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSH-LYGLPAT 174

Query: 177 ILRPQSLFGPHDK--VFIPRLAHMMHHYGSILLPHGGSALVDMTYYENAVHAMWLASQEA 234
LR +++GP + + + + M SI + + G D TY ++ A+
Sbjct: 175 GLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVI 234

Query: 235 CDKLPS--------------GRVYNITNGEHRTLRSIVQKLIDELNIDCRIRSVPYPMLD 280
RVYNI N L +Q L D L I+ + +P D
Sbjct: 235 PHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGD 294

Query: 281 MIARSMERLGRKSAKEPPLTHYGASKLNFDFTLDITRAQEELGYQPVITLDEGIEKTAAW 340
+ T D E +G+ P T+ +G++ W
Sbjct: 295 V----------------LETS-----------ADTKALYEVIGFTPETTVKDGVKNFVNW 327

Query: 341 LRD 343
RD
Sbjct: 328 YRD 330


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S0865     NUCEPIMERASE  56     2e-10    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 55.6 bits (134), Expect = 2e-10
Identities = 29/125 (23%), Positives = 52/125 (41%), Gaps = 17/125 (13%)

Query: 4 RILVLGASGYIGQHLVRTLSQQGHQILA---------AARHVDRLAKLQLANVSCHKVDL 54
+ LV GA+G+IG H+ + L + GHQ++ + RL L HK+DL
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 55 SWPDNLPALLQD--IDTVYFLVH------SMGEGGDFIAQERQLALNVRDALREVPVKQL 106
+ + + L + V+ H S+ + LN+ + R ++ L
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 107 IFLSS 111
++ SS
Sbjct: 122 LYASS 126


71  S1160  S1168  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S1160     0       13       1.882814   flagellar hook protein FlgE
S1162     0       13       1.941467   flagellar basal body rod protein FlgG
S1163     1       14       1.681996   flagellar basal body L-ring protein
S1164     1       14       1.380543   flagellar basal body P-ring protein
S1165     3       16       1.087021   flagellar rod assembly protein/muramidase FlgJ
S1168     3       17       1.006840   ribonuclease E
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
S1160     FLGHOOKAP1  39     3e-05    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 39.2 bits (91), Expect = 3e-05
Identities = 16/49 (32%), Positives = 28/49 (57%)

Query: 354 TLTNGALEASNVDLSKELVNMIVAQRNYKSNAQTIKTQDQILNTRVNLR 402
L+N S V+L +E N+ Q+ Y +NAQ ++T + I + +N+R
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 36.9 bits (85), Expect = 1e-04
Identities = 22/56 (39%), Positives = 30/56 (53%), Gaps = 4/56 (7%)

Query: 6 AVSGLNAAATNLDVIGNNIANSATYGFKSGTASFAD----MFAGSKVGLGVKVAGI 57
A+SGLNAA L+ NNI++ G+ T A + AG VG GV V+G+
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
S1162     FLGHOOKAP1  44     4e-07    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 4e-07
Identities = 18/81 (22%), Positives = 36/81 (44%), Gaps = 14/81 (17%)

Query: 3 SSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQTTL 62
S + A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTL 47

Query: 63 PSGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 48 GAGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 21/41 (51%)

Query: 220 ETSNVNVAEELVNMIQVQRAYEINSKAVSTTDQMLQKLTQL 260
S VN+ EE N+ + Q+ Y N++ + T + + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S1163     FLGLRINGFLGH  349    e-126    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 349 bits (897), Expect = e-126
Identities = 232/232 (100%), Positives = 232/232 (100%)

Query: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60
MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY
Sbjct: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60

Query: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120
GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120

Query: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180
RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF
Sbjct: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180

Query: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232
SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM
Sbjct: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S1164     FLGPRINGFLGI  427    e-152    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 427 bits (1100), Expect = e-152
Identities = 157/363 (43%), Positives = 213/363 (58%), Gaps = 9/363 (2%)

Query: 4 FLSALILLLVTTAAQAERIRDLTSVQGVRQNSLIGYGLVVGLDGTGDQTTQTPFTTQTLN 63
F + L A RI+D+ S+Q R N LIGYGLVVGL GTGD +PFT Q++
Sbjct: 13 FSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMR 72

Query: 64 NMLSQLGITVPTGTNMQLKNVAAVMVTASLPPFGRQGQTIDVVVSSMGNAKSLRGGTLLM 123
ML LGIT G + KN+AAVMVTA+LPPF G +DV VSS+G+A SLRGG L+M
Sbjct: 73 AMLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIM 131

Query: 124 TPLKGVDSQVYALAQGNILVGGAGASAGGSSVQVNQLNGGRITNGAVIERELPSQFGVGN 183
T L G D Q+YA+AQG ++V G A +++ R+ NGA+IERELPS+F
Sbjct: 132 TSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSV 191

Query: 184 TLNLQLNDEDFSMAQQIADTINRVR----GYGSATALDARTIQVRVPSGNSSQVRFLADI 239
L LQL + DFS A ++AD +N G A D++ I V+ P + R +A+I
Sbjct: 192 NLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRV-ADLTRLMAEI 250

Query: 240 QNMQVNVTPQDAKVVINSRTGSVVMNREVTLDSCAVAQGNLSVTVNRQANVSQPDTPFGG 299
+N+ V T AKVVIN RTG++V+ +V + AV+ G L+V V V QP PF
Sbjct: 251 ENLTVE-TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSR 308

Query: 300 GQTVVTPQTQIDLRQSGGSLQSVRSSASLNNVVRALNALGATPMDLMSILQSMQSAGCLR 359
GQT V PQT I Q G + ++ L +V LN++G +++ILQ ++SAG L+
Sbjct: 309 GQTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQ 367

Query: 360 AKL 362
A+L
Sbjct: 368 AEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S1165     FLGFLGJ  503    0.0      Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 503 bits (1297), Expect = 0.0
Identities = 308/313 (98%), Positives = 309/313 (98%)

Query: 1 MISDSKLLASAAWDAQSLNELKAKASEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60
MISDSKLLASAAWDAQSLNELKAKA EDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG
Sbjct: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60

Query: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120
LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET
Sbjct: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120

Query: 121 VVRYQNQTLSQLVQKAVPRNYDDSLPGDSRAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180
VVRYQNQ LSQLVQKAVPRNYDDSLPGDS+AFLAQLSLPAQLASQQSGVPHHLILAQAAL
Sbjct: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180

Query: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGQVTEITTTEYENGEAKKVKAKFRVYSSYL 240
ESGWGQRQIRRENGEPSYNLFGVKASGNWKG VTEITTTEYENGEAKKVKAKFRVYSSYL
Sbjct: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240

Query: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQVLQDAGYATDPHYARKLTNMIQQMKSISDK 300
EALSDYVGLLTRNPRYAAVTTAASAEQGAQ LQDAGYATDPHYARKLTNMIQQMKSISDK
Sbjct: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300

Query: 301 VSKTYSMNIDNLF 313
VSKTYSMNIDNLF
Sbjct: 301 VSKTYSMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
S1168     IGASERPTASE  68     2e-13    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 68.2 bits (166), Expect = 2e-13
Identities = 49/261 (18%), Positives = 87/261 (33%), Gaps = 26/261 (9%)

Query: 551 VAPAPKAATATPAAPAQPGLLSRFFGALKALFSGGEEAKPTEQP-TPKAEAKPERQQDRR 609
T P + S E A+ E P P A A P
Sbjct: 991 TVDTTNITTPNNIQADVPSVPSN----------NEEIARVDEAPVPPPAPATPSET---- 1036

Query: 610 KPRQSNRRDRNERRDTRSERTEGSDNREENRRNRRQAQQQTAETRESRQQAEV------T 663
+ N ++++++ D E +NR A++ + + + Q EV T
Sbjct: 1037 ----TETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 664 EKARTTDEQQAPRRERSRRRNDDKRQAQQEAKALNVEEQSVQETEQEERVRPVQPRRKQR 723
++ +TT+ ++ E+ + + + Q+ K + + QE + + + R
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPK-VTSQVSPKQEQSETVQPQAEPARENDP 1151

Query: 724 QLNQKVRYEQSVAEEAVVAPVVEETAAAEPIVQEAPAPRTELVKVPLPVVAQTAPEQQEE 783
+N K Q+ P E ++ E V E+ T V P A Q
Sbjct: 1152 TVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTV 1211

Query: 784 NNADNRDNGGMPRRSRRSPRH 804
N+ + RRS RS H
Sbjct: 1212 NSESSNKPKNRHRRSVRSVPH 1232



Score = 61.2 bits (148), Expect = 2e-11
Identities = 48/288 (16%), Positives = 88/288 (30%), Gaps = 36/288 (12%)

Query: 513 PSEEEFAERKRPEQPALATFAMPDVPPAPT-PAEPAATVVAPAPKAATATPAAPAQPGLL 571
P E+ + DVP P+ E A AP P A ATP+ +
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTE---- 1038

Query: 572 SRFFGALKALFSGGEEAKPTEQPTPKAEAKPERQQDRRKPRQSNRRDRNERRDTRSER-- 629
A E +K + K E Q+ + + + ++
Sbjct: 1039 ------TVA-----ENSKQESKTVEKNEQDATE-----TTAQNREVAKEAKSNVKANTQT 1082

Query: 630 TEGSDNREENRRNRRQAQQQTAETRESRQQAEVTEKARTTDEQQAPRRERSRRRNDDKRQ 689
E + + E + + ++TA + + TEK + + + + + + Q
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQ 1142

Query: 690 AQ---QEAKALNVEEQSVQETEQEERVRPVQPRRKQRQLNQKVRYEQSV--AEEAVVAPV 744
A+ + +N++E Q + +P + + Q V +V V P
Sbjct: 1143 AEPARENDPTVNIKEPQSQTNTTADTEQPA--KETSSNVEQPVTESTTVNTGNSVVENPE 1200

Query: 745 VEETAAAEPIVQEAPA------PRTELVKVPLPVVAQTAPEQQEENNA 786
A +P V + R + VP V T A
Sbjct: 1201 NTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVA 1248


72  S1278  S1293  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S1278     0       18       0.906777   ferric enterobactin transport ATP-binding
S1279     -2      17       0.083364   iron compound ABC transporter permease
S1280     -2      16       -0.078875  ABC transporter ATP-binding protein
S1281     -2      13       -0.374818  IS1 orfB
S1282     -2      15       -0.583038  IS1 orfA
S1284     -1      15       -1.124549  trehalase
S1285     -2      14       -1.250680  dihydroxyacetone kinase subunit M
S1286     -2      14       -1.177282  dihydroxyacetone kinase ADP-binding subunit
S1287     -1      14       -1.334324  dihydroxyacetone kinase subunit DhaK
S1288     -1      13       -0.785305  DNA-binding transcriptional regulator DhaR
S1289     0       17       -0.576493  adhesion and penetration protein
S1290     -1      16       0.606518   GTP-dependent nucleic acid-binding protein EngD
S1291     -1      14       0.833085   peptidyl-tRNA hydrolase
S1292     -2      12       1.323576   hypothetical protein
S1293     -2      13       1.270079   sulfate transporter YchM
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
S1278     LCRVANTIGEN  30     0.006    Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 30.4 bits (68), Expect = 0.006
Identities = 19/63 (30%), Positives = 29/63 (46%), Gaps = 7/63 (11%)

Query: 193 LMSTHHTLHANAIADSIIQVEPDGRVTQGLPTEQLTTNKLAAL------YRVSADQIHHH 246
+ H +L A+ I D I++V D G +L +LA L Y V +I+ H
Sbjct: 119 MAVMHFSLTADRIDDDILKVIVDSMNHHGDARSKL-REELAELTAELKIYSVIQAEINKH 177

Query: 247 LSA 249
LS+
Sbjct: 178 LSS 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S1280     FERRIBNDNGPP  40     1e-05    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 39.5 bits (92), Expect = 1e-05
Identities = 65/301 (21%), Positives = 102/301 (33%), Gaps = 48/301 (15%)

Query: 2 PITRRTFAQALASTLLLQSLPSFSQTVNRFASQSLPEAQNI--TRIVSAG-APADLLL-L 57
I+RR A+A + L + A I RIV+ P +LLL L
Sbjct: 6 LISRRRLLTAMALSPL-------------LWQMNTAHAAAIDPNRIVALEWLPVELLLAL 52

Query: 58 AVAPEKMVGFSSFDFARQALI--PLPEHIRQLPRLGRLAGRASTLSLEGLMALHPDLVVD 115
+ P G + R + PLP+ + + G + +LE L + P +V
Sbjct: 53 GIVP---YGVADTINYRLWVSEPPLPDSVIDV-------GLRTEPNLELLTEMKPSFMVW 102

Query: 116 CGNTDETLISQARQVSEQTQIPWLLLN-----GKLAQSAEQLTTLGKTLGEEHRAAEQAN 170
+ AR P N LA + + LT + L + A
Sbjct: 103 SAGYGPSPEMLARIA------PGRGFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLA 156

Query: 171 LASHFVGEAQA-FATSPAANLRFYAARGPRGLETGLQGSLHTEAAELLGLHNVAQ-IADR 228
F+ + F A L PR + SL E + G+ N Q +
Sbjct: 157 QYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNF 216

Query: 229 HGLTQVSMENLLRWQ-PDIILVQEAVTADF--IRRDPLWQGVKAVAEQRILFLSGLPFGW 285
G T VS++ L ++ D++ + D + PLWQ + V R +P W
Sbjct: 217 WGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATPLWQAMPFVRAGR---FQRVPAVW 273

Query: 286 L 286

Sbjct: 274 F 274


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S1285     PHPHTRNFRASE  140    2e-38    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 140 bits (355), Expect = 2e-38
Identities = 60/206 (29%), Positives = 100/206 (48%), Gaps = 1/206 (0%)

Query: 258 GKAFYYQPVLCTVQAKSTLTAEEEQDRLRQAIDFTLLDLMTLTAKAEASGLDDIAAIFSG 317
KAF + ++ S E ++L A++ + +L + + EAS D A IF+
Sbjct: 17 AKAFIHLEPNVDIEKTSITDVSTEIEKLTAALEKSKEELRAIKDQTEASMGADKAEIFAA 76

Query: 318 HHTLLGDPELLAAASELLQHEHCTAEYAWQQVLKELSQQYQQLDDEYLQARYIDVDDLLH 377
H +L DPEL+ +++E AEYA ++V ++ +D+EY++ R D+ D+
Sbjct: 77 HLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDNEYMKERAADIRDVSK 136

Query: 378 RTLVHLT-QTKEELPQFNSPTILLAENIYPSTVLQLDPAVVKGICLSAGSPVSHSALIAR 436
R L HL L T+++AE++ PS QL+ VKG G SHSA+++R
Sbjct: 137 RVLGHLIGVETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFATDIGGRTSHSAIMSR 196

Query: 437 ELGIGWICQQGEKLYAIQPEETLTLD 462
L I + E IQ + + +D
Sbjct: 197 SLEIPAVVGTKEVTEKIQHGDMVIVD 222


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
S1286     adhesinmafb  28     0.020    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 28.5 bits (63), Expect = 0.020
Identities = 10/47 (21%), Positives = 26/47 (55%)

Query: 138 VESLRQSSEQNLSVPVALEAASSIAESAAQSTITMQARKGRASYLGE 184
E++ + ++N + +EA ++A +A + + A+ G+A+ G+
Sbjct: 293 REAVDRWIQENPNAAETVEAVFNVAAAAKVAKLAKAAKPGKAAVSGD 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S1288     HTHFIS  244    1e-75    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 244 bits (624), Expect = 1e-75
Identities = 91/363 (25%), Positives = 155/363 (42%), Gaps = 33/363 (9%)

Query: 311 QMRQLMTSQLGKVSHTFAHMPQDDPQTRRLIHFGRQAARSSFPVLLCGEEGVGKALLSQA 370
+ S+L S + + + + ++ +++ GE G GK L+++A
Sbjct: 120 AEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 371 IHNESERAAGPYIAVNCELYGDAALAEEFIG---GDRTDNENGRLSRLELAHGGTLFLEK 427
+H+ +R GP++A+N + E G G T + R E A GGTLFL++
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 428 IEYLAVELQSALLQVIKQGVITRLDARRLIPIDVKVIATTTADLAMLVEQNRFSRQLYYA 487
I + ++ Q+ LL+V++QG T + R I DV+++A T DL + Q F LYY
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 488 LHAFEITIPPLRMRRGSIPALVNNKLRSLEKRFSTRLKIDDDALARLVSCAWPGNDFELY 547
L+ + +PPLR R IP LV + ++ EK + D +AL + + WPGN EL
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELE 359

Query: 548 SVIENLALSSDNGRIRVSDLPEHLFTEQATDDVSATRLSTS------------------- 588
+++ L I + L +E + +
Sbjct: 360 NLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASF 419

Query: 589 -----------LSFAEVEKEAIINAAQVTGGRIQEMSALLGIGRTTLWRKMKQHGIDAGQ 637
AE+E I+ A T G + + LLG+ R TL +K+++ G+ +
Sbjct: 420 GDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVSVYR 479

Query: 638 FKR 640
R
Sbjct: 480 SSR 482


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S1289     PRTACTNFAMLY  212    3e-59    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 212 bits (540), Expect = 3e-59
Identities = 247/980 (25%), Positives = 402/980 (41%), Gaps = 117/980 (11%)

Query: 14 RLAELKIRSPSIQLIKFGAIGLNAIIFSPLLIAADTGSQYGTNITINDGDRI---TGDTA 70
+ A L+ + ++ L GA ++ I Q+G +I +D + +G T
Sbjct: 10 KAAPLRRTTLAMALGALGAAPAAHADWNNQSIVKTGERQHGIHIQGSDPGGVRTASGTTI 69

Query: 71 DPSGN-LYGVMTPAGNTPGNINLGNDVTVN---VNDASGYAKGIIIQGKNSSLTANRLTV 126
SG G++ N + N + ++D + K L A+ T+
Sbjct: 70 KVSGRQAQGILLE--NPAAELQFRNGSVTSSGQLSDDGIRRFLGTVTVKAGKLVADHATL 127

Query: 127 DVVGQT---SAIGINLIGDYTHADLGTGSTIKSNDDGIIIGHSSTLTATQFTIENSNGIG 183
VG T I + + G+ A + ST++ G+ I + +T + I + G+
Sbjct: 128 ANVGDTWDDDGIALYVAGEQAQASIAD-STLQGAG-GVQIERGANVTVQRSAIVD-GGLH 184

Query: 184 LTINDYGTSVDLGSGSKIKTDGS-TGVYIGGLNGNNANGAARFTATDLTID---VQGYSA 239
+ DL + D + T V G + A++LT+D + G A
Sbjct: 185 IGALQSLQPEDLPPSRVVLRDTNVTAVPASGAPA----AVSVLGASELTLDGGHITGGRA 240

Query: 240 MGINVQKNSVVDLGTNSTIKTNGDNAHGLWSFGQVSANAL-------TVDVTGAAANGVE 292
G+ + +VV L +TI+ A G G V A+ GV+
Sbjct: 241 AGVAAMQGAVVHL-QRATIRRGDAPAGGAVPGGAVPGGAVPGGFGPGGFGPVLDGWYGVD 299

Query: 293 VRGGTTTIGADSHISSAQGGGLVTSSSDATINFSG---TAAQRNSIFSGGSYGASAQTAT 349
V G + + A S + + + G + A + SG +A N I +GG+ + Q A
Sbjct: 300 VSGSSVEL-AQSIVEAPELGAAIRVGRGARVTVSGGSLSAPHGNVIETGGARRFAPQAAP 358

Query: 350 AVINMQNTDITVDRNGSLALGLWALSGGRITGDSLAITGAAGARGIYAMTNSQIDLTSDL 409
I +Q G+ A G L L +TG A A+G T + +
Sbjct: 359 LSITLQA--------GAHAQGKALLYRVLPEPVKLTLTGGADAQGDIVATELPSIPGTSI 410

Query: 410 VIDMSTPDQMAIATQHDDGYAASRINASGRMLINGSVLSKGGLINLDMHPGSVWTGSSLS 469
P +A+A+ + WTG++
Sbjct: 411 -----GPLDVALAS------------------------------------QARWTGAT-- 427

Query: 470 DNVNGGKLDVAMNNSVWNVTSNSNLDTLAL-SHSTVDFASHGSTAGTFTTLNVENLSGNS 528
V+ +D N+ W +T NSN+ L L S +VDF + AG F L V L+G+
Sbjct: 428 RAVDSLSID----NATWVMTDNSNVGALRLASDGSVDFQQ-PAEAGRFKVLTVNTLAGSG 482

Query: 529 TFIMRADVVGEGNGVNNRGDLLNISGSSAGNHVLAIRNQGSEATTGNEVLTVVKTTDGAA 588
F M D L + ++G H L +RN GSE + N +L V AA
Sbjct: 483 LFRMNV------FADLGLSDKLVVMQDASGQHRLWVRNSGSEPASANTLLLVQTPLGSAA 536

Query: 589 SFSASS---QVELGGYLYDVRKNG-TNWELYASGTVPEPTPNPEPTPAPAQPPIVNPD-P 643
+F+ ++ +V++G Y Y + NG W L + P P P P+P P P QPP P+ P
Sbjct: 537 TFTLANKDGKVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQPQPEAP 596

Query: 644 TPEPAPTPKPTTTADAGGNYLNVGYL--LNYVENRTLMQRMGDLRNQSKDGNIWLRSYG- 700
P+P + + A+A N VG L Y E+ L +R+G+LR G W R +
Sbjct: 597 APQPPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRLNPDAGGAWGRGFAQ 656

Query: 701 -GSLDSFASGKLSGFDMGYSGIQFGGDKRLSDVM-PLYVGLYIDSTHASPDYSG-GDGTA 757
LD+ A + FD +G + G D ++ ++G T ++G G G
Sbjct: 657 RQQLDNRAGRR---FDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRGFTGDGGGHT 713

Query: 758 RSDYMGMYASYMAQNGFYSDLVIKASRQKNSFHVLDSQNNGVNANGTANGMSISLEAGQR 817
S ++G YA+Y+A +GFY D ++ASR +N F V S V +G+ SLEAG+R
Sbjct: 714 DSVHVGGYATYIADSGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGASLEAGRR 773

Query: 818 FNLSPTGYGFYIEPQTQLTYSHQNEMAMKASNGLNIHLNHYESLLGRASMILGYDIT-AG 876
F + G+++EPQ +L A +A+NGL + S+LGR + +G I AG
Sbjct: 774 FTHAD---GWFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEVGKRIELAG 830

Query: 877 NSQLNVYVKTGAIREFSGDTEYLLNDSREKYSFKGNGWNNGVGVSAQYNKQHTFYLEADY 936
Q+ Y+K ++EF G N + +G G+G++A + H+ Y +Y
Sbjct: 831 GRQVQPYIKASVLQEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRGHSLYASYEY 890

Query: 937 TQGNLFDQK-QVNGGYRFSF 955
++G + GYR+S+
Sbjct: 891 SKGPKLAMPWTFHAGYRYSW 910


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
S1293     RTXTOXINA  33     0.003    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 33.0 bits (75), Expect = 0.003
Identities = 24/81 (29%), Positives = 37/81 (45%), Gaps = 16/81 (19%)

Query: 279 LGAIESLLCAV----VL---DGMTGTKHKANSELVGQGLGNI---IAPFF------GGIT 322
L + +L A+ +L D T TK A EL + LGN+ I+ + G++
Sbjct: 242 LDTVSGILSAISASFILSNADADTRTKAAAGVELTTKVLGNVGKGISQYIIAQRAAQGLS 301

Query: 323 ATAAIARSAANVRAGATSPIS 343
+AA A A+ A SP+S
Sbjct: 302 TSAAAAGLIASAVTLAISPLS 322


73  S1302  S1310  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S1302     -1      14       -1.282006  2-dehydro-3-deoxyphosphooctonate aldolase
S1303     -2      18       -1.307565  calcium/sodium:proton antiporter
S1304     -1      17       0.105288   cation transport regulator
S1305     -2      13       0.552755   cation transport regulator
S1306     -2      12       1.279310   hypothetical protein
S1307     -2      15       1.357539   hypothetical protein
S1308     -2      18       1.732638   transcriptional regulator NarL
S1309     -1      20       2.010853   nitrate/nitrite sensor protein NarX
S1310     -2      23       1.520162   nitrate transport protein nark
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S1302     TRNSINTIMINR  29     0.032    Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 28.5 bits (63), Expect = 0.032
Identities = 35/157 (22%), Positives = 58/157 (36%), Gaps = 23/157 (14%)

Query: 82 QELKQTFGVKIITDVHEPSQAQPVADVVDVIQLPAFLARQTDLVEAMAKTGAVINVKKPQ 141
Q +QT T V + + P V + Q + + D ++A T
Sbjct: 391 QPAEQTTTTTTHTVVQQQTGGIPQHKVALMPQERRRFSDRRDSQGSVASTH--------- 441

Query: 142 FVSPGQMGNIVDKFKEGGNEKVILCDRGA-NFGYDNLVVDMLGFSIMKKVSGNSPVIFDV 200
+V+ + E G + L YD + D G+S+++ SG+ P V
Sbjct: 442 --WSDSSSEVVNPYAEVGGARNSLSAHQPEEHIYDEVAADP-GYSVIQNFSGSGP----V 494

Query: 201 THALQCRDPFGAASGGRRAQVAELA-RAGMAVGLAGL 236
T L G G ++ A LA G+ +G+ GL
Sbjct: 495 TGRL-----IGTPGQGIQSTYALLANSGGLRLGMGGL 526


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S1307     INTIMIN  253    8e-78    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 253 bits (646), Expect = 8e-78
Identities = 119/378 (31%), Positives = 196/378 (51%), Gaps = 21/378 (5%)

Query: 32 GEQAKAFAWGKVRDALSQQVNQHVESWLSPWGNASVDVKVDNEGHFTGSRGSWFVPLQDN 91
G+ AK A G + Q + +++WL +G A V+++ N F GS + +P D+
Sbjct: 184 GDYAKDTALGIAGN----QASSQLQAWLQHYGTAEVNLQSGNN--FDGSSLDFLLPFYDS 237

Query: 92 DRYLTWSQLGLTQQDNGLVSNVGVGQRWARGNWLVGYNTFYDNLQDENLQRAGFGAEAWG 151
++ L + Q+G D+ +N+G GQR+ ++GYN F D + R G G E W
Sbjct: 238 EKMLAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDFSGDNTRLGIGGEYWR 297

Query: 152 EYLRLSANFYQPFAAWHE--QTATQEQRMARGYDLTARMRMPFYQHLNTSVSLEQYFGDR 209
+Y + S N Y + WHE ++R A G+D+ +P Y L + EQY+GD
Sbjct: 298 DYFKSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDN 357

Query: 210 VDLFNSGTGYHNPVALSLGLNYTPVPLVTVTAQHKQGESGENQNNLGLNLNYRFGVPLKK 269
V LFNS NP A ++G+NYTP+PLVT+ ++ G EN + Y+F P +
Sbjct: 358 VALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQ 417

Query: 270 QLSAGEVAESQSLRGSRYDNPQRNNLPTLEYRQRKTLTVFLATPPWDLKPGETVPLKLQI 329
Q+ V E ++L GSRYD QRNN LEY+++ L++ + + T ++L +
Sbjct: 418 QIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNI-PHDINGTERSTQKIQLIV 476

Query: 330 RSRYGIRQLIWQGDTQILS-----LTPGAQANSAEGWTLIMPDWQNGEGASNHWRLSVVV 384
+S+YG+ +++W D+ + S G+Q SA+ + I+P + +G SN ++++
Sbjct: 477 KSKYGLDRIVWD-DSALRSQGGQIQHSGSQ--SAQDYQAILPAYV--QGGSNVYKVTARA 531

Query: 385 EDNQGQRVSSNEITLTLV 402
D G SSN + LT+
Sbjct: 532 YDRNGN--SSNNVLLTIT 547


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S1308     HTHFIS  74     2e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 73.7 bits (181), Expect = 2e-17
Identities = 32/117 (27%), Positives = 56/117 (47%), Gaps = 2/117 (1%)

Query: 7 ATILLIDDHPMLRTGVKQLISMAPDITVVGEASNGEQGIELAESLDPDLILLDLNMPGMN 66
ATIL+ DD +RT + Q +S A + SN + D DL++ D+ MP N
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRI--TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 67 GLETLDKLREKSLSGRIVVFSVSNHEEDVVTALKRGADGYLLKDMEPEDLLKALHQA 123
+ L ++++ ++V S N + A ++GA YL K + +L+ + +A
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S1309     PF06580  53     1e-09    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 53.3 bits (128), Expect = 1e-09
Identities = 36/172 (20%), Positives = 73/172 (42%), Gaps = 23/172 (13%)

Query: 424 PESSRELLSQIRNELNASWAQLRELLTTFRLQLTEPGLRPALEASCEEYSAKFGFPVKLD 483
P +RE+L+ + + S + +LT +++ + S +F ++ +
Sbjct: 190 PTKAREMLTSLSELMRYSLRYSNARQVSLADELT------VVDSYLQLASIQFEDRLQFE 243

Query: 484 YQLPPRL----VPSHQAIHLLQIAREALSNALKH-----SQASEVVVTVAQNDNQVKLTV 534
Q+ P + VP L+Q E N +KH Q ++++ +++ V L V
Sbjct: 244 NQINPAIMDVQVPPM----LVQTLVE---NGIKHGIAQLPQGGKILLKGTKDNGTVTLEV 296

Query: 535 QDNGCGVPENAIRSNHYGMIIMRDRAQSLRG-DCRVRRRESGGTEVVVTFIP 585
++ G +N S G+ +R+R Q L G + +++ E G + IP
Sbjct: 297 ENTGSLALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S1310     ACRIFLAVINRP  33     0.004    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 32.5 bits (74), Expect = 0.004
Identities = 35/166 (21%), Positives = 60/166 (36%), Gaps = 22/166 (13%)

Query: 258 IMSLLYLATFGSFIGFSAGFAMLSKTQFPDVQILQYAFFGPFIGALARSA---GGALSDR 314
I+S + L+ + I A A L K + + FFG F S ++
Sbjct: 474 IVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKI 533

Query: 315 LGGTRVTLVNFILMAIFSGLLFLTLPTD----GQGGSFMAFFAVFLALFLTAGLGSGSTF 370
LG T L+ + L+ +LFL LP+ G F+ L +G+T
Sbjct: 534 LGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTM----------IQLPAGATQ 583

Query: 371 QMISVIFRKLTMDRVKAEGGSDER-----AMREAATGTAAALGFIS 411
+ + ++T +K E + E + A + F+S
Sbjct: 584 ERTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVS 629


74  S2022  S2030  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S2022     1       14       0.810705   chemotaxis regulatory protein CheY
S2023     0       11       1.057682   chemotaxis-specific methylesterase
S2024     0       12       0.971560   chemotaxis methyltransferase CheR
S2025     0       13       0.715071   methyl-accepting protein IV
S2026     -1      14       0.139633   methyl-accepting chemotaxis protein II,
S2027     -2      14       0.054121   purine-binding chemotaxis protein
S2028     -2      14       0.170875   chemotaxis protein CheA
S2029     -2      16       -0.539901  flagellar motor protein MotB
S2030     -3      11       -1.073679  flagellar motor protein MotA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S2022     HTHFIS  90     4e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.5 bits (222), Expect = 4e-24
Identities = 30/105 (28%), Positives = 51/105 (48%), Gaps = 3/105 (2%)

Query: 7 KFLVVDDFSTMRRIVRNLLKELGFNNVEEAEDGVDALNKLQAGGYGFVISDWNMPNMDGL 66
LV DD + +R ++ L G++ V + + AG V++D MP+ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYD-VRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 67 ELLKTIRADGAMSALPVLMVTAEAKKENIIAAAQAGASGYVVKPF 111
+LL I+ LPVL+++A+ I A++ GA Y+ KPF
Sbjct: 64 DLLPRIKKARPD--LPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S2023     HTHFIS  65     9e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 65.2 bits (159), Expect = 9e-14
Identities = 35/188 (18%), Positives = 72/188 (38%), Gaps = 23/188 (12%)

Query: 1 MSKIRVLSVDDSALMRQIMTEIINSHSDMEMVATAPDPLVARDLIKKFNPDVLTLDVEMP 60
M+ +L DD A +R ++ + ++ V + I + D++ DV MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 RMDGLDFLEKLMRLRPMPVVMVSSLTGKGS-EVTLRALELGAIDFVTKPQLGIREGMLAY 119
+ D L ++ + RP V+V ++ + + ++A E GA D++ KP + E +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLV--MSAQNTFMTAIKASEKGAYDYLPKP-FDLTELIGII 115

Query: 120 SEMIAEKVRTAAKASLAAHKPLSAPTTLKAGPLLSSEKLIAIGASTGGTEAIRHVLQPLP 179
+AE R +K + + +G S E R + + +
Sbjct: 116 GRALAEPKRRPSKLEDDSQDGMP-----------------LVGRSAAMQEIYRVLARLMQ 158

Query: 180 LSSPALLI 187
++
Sbjct: 159 TDLTLMIT 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S2028     PF06580  42     4e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 42.2 bits (99), Expect = 4e-06
Identities = 23/151 (15%), Positives = 49/151 (32%), Gaps = 52/151 (34%)

Query: 361 ELDKSLIERIIDPLT--HLVRNSLDHGIELPEKRLAAGKNSVGNLILSAEHQGGNICIEV 418
+++ ++++ + P+ LV N + HGI G ++L G + +EV
Sbjct: 245 QINPAIMDVQVPPMLVQTLVENGIKHGIA--------QLPQGGKILLKGTKDNGTVTLEV 296

Query: 419 TDDGAGLNRERILAKAASQGLTVSENMSDDEVAMLIFAPGFSTAEQVTDVSGRGVGMDVV 478
+ G+ + G G+ V
Sbjct: 297 ENTGSLALKNTK--------------------------------------ESTGTGLQNV 318

Query: 479 KRNIQEMGG---HVEIQSKQGTGTTIRILLP 506
+ +Q + G +++ KQG +L+P
Sbjct: 319 RERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S2029     PF05272  31     0.009    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.8 bits (69), Expect = 0.009
Identities = 22/93 (23%), Positives = 35/93 (37%), Gaps = 11/93 (11%)

Query: 46 LISISSPKELIQIAEYFRTPLATAVTGGDRISNSESPIPGGGDDYTQSQGEVNKQPNIEE 105
L +SSP A P + G + ++ PGGGDD GE +++
Sbjct: 384 LADVSSPTAAAGGAGGGEPPKKRDPSAG---AGTDPGGPGGGDD-----GEDPFGEWLDD 435

Query: 106 LKKRM---EQSRLRKLRGDLDQLIESDPKLRAL 135
R+ + L+ R L + + S P L
Sbjct: 436 EVARLRLRGRWLLKPRRAALIEALRSAPALAGC 468


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2030     PF05844       33     0.001    YopD protein
		>PF05844#YopD protein

Length = 295

Score = 33.1 bits (75), Expect = 0.001
Identities = 12/28 (42%), Positives = 22/28 (78%), Gaps = 2/28 (7%)

Query: 76 MDLLALLYRLMAKSRQMGMFSLERDIEN 103
++LL +L+R+ K+R++G+ L+RD EN
Sbjct: 74 VELLLILFRIAQKARELGV--LQRDNEN 99


75  S2062  S2089  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S2062     -1      12       -1.733679  flagellin
S2063     -2      16       -0.115351  flagellar capping protein
S2064     -1      14       -0.379888  flagellar protein FliS
S2065     -2      12       0.013057   flagellar biosynthesis protein FliT
S2066     -1      11       -1.639463  alpha-amylase
S2067     0       16       -3.596168  hypothetical protein
S2068     1       22       -5.091645  inner membrane protein
S2069     5       33       -7.803997  hypothetical protein
S2072     4       32       -7.176284  virulence protein
S2073     1       25       -5.834963  outer membrane porin protein C precurser
S4819     2       23       -3.110399  regulator
S4820     0       17       1.839292   kinase inhibitor
S4821     0       17       3.259934   multidrug efflux protein
S2076     0       18       3.839510   flagellar hook-basal body protein FliE
S2078     -1      18       3.366277   flagellar motor switch protein G
S2079     -2      16       3.257279   flagellar assembly protein H
S2080     -1      17       3.027042   flagellum-specific ATP synthase
S2082     -2      16       1.679687   flagellar hook-length control protein
S2083     -1      17       -0.095495  flagellar basal body-associated protein FliL
S2084     0       16       -2.689047  flagellar motor switch protein FliM
S2085     1       18       -3.930486  flagellar motor switch protein FliN
S2086     1       21       -5.156422  flagellar biosynthesis protein FliO
S2088     1       23       -6.160225  flagellar biosynthesis protein FliQ
S2089     1       20       -4.535261  flagellar biosynthesis protein FliR
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2062     FLAGELLIN     234    9e-73    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 234 bits (599), Expect = 9e-73
Identities = 260/551 (47%), Positives = 311/551 (56%), Gaps = 47/551 (8%)

Query: 2 AQVINTNSLSLITQNNINKNQSALSSSIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 61
AQVINTNSLSL+TQNN+NK+QS+LSS+IERLSSGLRINSAKDDAAGQAIANRFTSNIKGL
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 TQAARNANDGISVAQTTEGALSEINNNLQRIRELTVQASTGTNSDSDLDSIQDEIKSRLD 121
TQA+RNANDGIS+AQTTEGAL+EINNNLQR+REL+VQA+ GTNSDSDL SIQDEI+ RL+
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 EIDRVSGQTQFNGVNVLAKDGSMKIQVGANDGQTITIDLKKIDSDTLGLNGFNVNGGGAV 181
EIDRVS QTQFNGV VL++D MKIQVGANDG+TITIDL+KID +LGL+GFNVNG
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEA 180

Query: 182 A---NTAASKADLVAANATVVGNKYTVSAGYDAAKASDLLAGVSDGDTVQATINNGFGTA 238
++ K V NKY V A V D V A N T
Sbjct: 181 TVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAA-NGQLTTD 239

Query: 239 ASATNYKYDSASKSYSFDTTTASAADVQKYLTPGVGDTAKGTITIDGSAQDVQISSDGKI 298
+ N D + S T + A GDT +GK+
Sbjct: 240 DAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKV 299

Query: 299 TASNGDKLYIDTTGRLTKNGSGASLTEASLSTLAANNTKATTIDIGGTSISFTGNSTTPD 358
+ T NG +LT A ++ AAN AT S T D
Sbjct: 300 ST--------------TINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFD 345

Query: 359 TITYSVTGAKVDQAAFDKAVSTSGNNVDFTTAGYSVNGTTGAVTKGVDSVYVDNNEALTT 418
T + + D A + S V+ + G +
Sbjct: 346 DKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLA---------------- 389

Query: 419 SDTVDFYLQDDGSVTNGSGKAVYKDADGKLTTDAETKAATTADPLKALDEAISSIDKFRS 478
+ DA +TA+PL ++D A+S +D RS
Sbjct: 390 -------------GKTMFIDKTASGVSTLINEDAAAAKKSTANPLASIDSALSKVDAVRS 436

Query: 479 SLGAVQNRLDSAVTNLNNTTTNLSEAQSRIQDADYATEVSNMSKAQIIQQAGNSVLAKAN 538
SLGA+QNR DSA+TNL NT TNL+ A+SRI+DADYATEVSNMSKAQI+QQAG SVLA+AN
Sbjct: 437 SLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSNMSKAQILQQAGTSVLAQAN 496

Query: 539 QVPQQVLSLLQ 549
QVPQ VLSLL+
Sbjct: 497 QVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2063     TYPE3OMBPROT  32     0.005    Type III secretion system outer membrane B protein ...
		>TYPE3OMBPROT#Type III secretion system outer membrane B protein family signature.

Length = 538

Score = 32.0 bits (72), Expect = 0.005
Identities = 24/72 (33%), Positives = 37/72 (51%), Gaps = 2/72 (2%)

Query: 214 NGMEVSVAAQNAQLTVNNVAIENSSNTISDALENITLNLNDVTTGNQTLTITQDTSKVQT 273
N E +VAA+N + + A+ + +S AL T++L V+T LT T T ++
Sbjct: 236 NSSERAVAARNKAEELVSAALYSRPELLSQALSGKTVDLKIVSTS--LLTPTSLTGGEES 293

Query: 274 AIKDWVNAYNSL 285
+KD VNA L
Sbjct: 294 MLKDQVNALKGL 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2068     RTXTOXIND     30     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.017
Identities = 10/57 (17%), Positives = 17/57 (29%), Gaps = 2/57 (3%)

Query: 164 RFTLLPIFRIPVKMQKVSAASPLTQKPDQARRRF--RLGMLVFFGMLGWALLTAMNQ 218
R L R + + + A L + P R R M ++L +
Sbjct: 26 RKQLDTPVREKDENEFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEI 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2069     PF01206       93     6e-29    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 92.5 bits (230), Expect = 6e-29
Identities = 16/71 (22%), Positives = 37/71 (52%)

Query: 7 DYRLDMVGEPCPYPAVATLEAMPQLKKGEILEVVSDCPQSINNIPLDARNHGYTVLDIQQ 66
D LD G CP P + + + + GE+L V++ P S+ + ++ G+ +L+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 67 DGPTIRYLIQK 77
+ T + +++
Sbjct: 65 EDGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2073     ECOLIPORIN    509    0.0      E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 509 bits (1312), Expect = 0.0
Identities = 239/388 (61%), Positives = 282/388 (72%), Gaps = 33/388 (8%)

Query: 1 MKKLTVAISAVAASVLMAMSAQAAEIYNKDSNKLDLYGKVNAKHYFSSNDADDGDTTYVR 60
MK+ +A+ V ++L A +A AAEIYNKD NKLDLYGKV+ HYFS + + DGD TY+R
Sbjct: 1 MKRKVLAL--VIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMR 58

Query: 61 LGFKGETQINDQLTGFGQWEYEFKGNRAESQGSSKDKTRLAFAGLKFGDYGSIDYGRNYG 120
+GFKGETQINDQLTG+GQWEY + N E +G++ TRLAFAGLKFGDYGS DYGRNYG
Sbjct: 59 VGFKGETQINDQLTGYGQWEYNVQANTTEGEGANS-WTRLAFAGLKFGDYGSFDYGRNYG 117

Query: 121 VAYDIGAWTDVLPEFGGDTWTQTDVFMTGRTTGVATYRNNDFFGLVDGLNFAAQYQGKND 180
V YD+ WTD+LPEFGGD++T D +MTGR GVATYRN DFFGLVDGLNFA QYQGKN+
Sbjct: 118 VLYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNE 177

Query: 181 R----------------TDVTEANGDGFGFSTTYEY-EGFGVGATYAKSDRTNDQVIYGN 223
D+ NGDGFG STTY+ GF GA Y SDRTN+QV G
Sbjct: 178 SQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGG 237

Query: 224 NSLNASGQNAEVWAAGLKYDANNIYLATTYSETQNMTVFG------NNHIANKAQNFEVV 277
A G A+ W AGLKYDANNIYLAT YSET+NMT +G + +ANK QNFEV
Sbjct: 238 T--IAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVT 295

Query: 278 AQYQFDFGLRPSVAYLQSKGKDLG----AWGDQDLIEYIDVGATYYFNKNMSTFVDYKIN 333
AQYQFDFGLRP+V++L SKGKDL D+DL++Y DVGATYYFNKN ST+VDYKIN
Sbjct: 296 AQYQFDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKIN 355

Query: 334 LIDKSD-FTKASGVATDDIVAVGLVYQF 360
L+D D F K +G++TDDIVA+G+VYQF
Sbjct: 356 LLDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4819     HTHFIS        29     0.017    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.017
Identities = 8/30 (26%), Positives = 16/30 (53%)

Query: 176 RTKWTANKVARYLYISVSTLHRRLASEGIS 205
T+ K A L ++ +TL +++ G+S
Sbjct: 447 ATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2076     FLGHOOKFLIE   117    8e-38    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 117 bits (293), Expect = 8e-38
Identities = 102/103 (99%), Positives = 102/103 (99%)

Query: 2 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTVARTQAEKFTL 61
SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQT ARTQAEKFTL
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 60

Query: 62 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 104
GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV
Sbjct: 61 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2078     FLGMOTORFLIG  338    e-118    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 338 bits (868), Expect = e-118
Identities = 117/329 (35%), Positives = 197/329 (59%), Gaps = 2/329 (0%)

Query: 1 MSNLTGTDKSVILLMTIGEDRAAEVFKHLSQREVQTLSAAMANVTQISNKQLTDVLAEFE 60
+S LTG K+ ILL++IG + +++VFK+LSQ E+++L+ +A + I+++ +VL EF+
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 61 QEAEQFAALNINANDYLRSVLVKALGEERAASLLEDILETRDTASGIETLNFMEPQSAAD 120
+ + DY R +L K+LG ++A ++ + L + + E + +P + +
Sbjct: 72 ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPANILN 130

Query: 121 LIRDEHPQIIATILVHLRRAQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLN 180
I+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 131 FIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLE 190

Query: 181 GLLDGQ-NLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENL 239
L + + GGV EIIN+ + E+ +I ++ E D ELA++I +MF+FE++
Sbjct: 191 KKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDI 250

Query: 240 VDVDDRSIQRLLQEVDSESLLIALKGAEQPLREKFLRNMSQRAADILRDDLAKRGPVRLS 299
V +DDRSIQR+L+E+D + L ALK + P++EK +NMS+RAA +L++D+ GP R
Sbjct: 251 VLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRK 310

Query: 300 QVENEQKAILLIVRRLAETGEMVIGSGED 328
VE Q+ I+ ++R+L E GE+VI G +
Sbjct: 311 DVEESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2079     FLGFLIH       373    e-135    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 373 bits (958), Expect = e-135
Identities = 223/228 (97%), Positives = 226/228 (99%)

Query: 1 MSDNLPWKTWTPDDLAPPPAEFVPMVESEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60
MSDNLPWKTWTPDDLAPP AEFVP+VE EETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI
Sbjct: 1 MSDNLPWKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60

Query: 61 AEGRQQGHEQGYQEGLAQGLEQGLAEAKAQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120
AEGRQQGH+QGYQEGLAQGLEQGLAEAK+QQAPIHARMQQLVSEFQTTLDALDSVIASRL
Sbjct: 61 AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120

Query: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180
MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT
Sbjct: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180

Query: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228
LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV
Sbjct: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2082     FLGHOOKFLIK   468    e-168    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 468 bits (1204), Expect = e-168
Identities = 364/375 (97%), Positives = 369/375 (98%)

Query: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLTLLSEALAGETTTDKAAPQLLVATDKPTTK 60
MIRLAPLITADVDTTTLPGGKASDAAQDFL LLSEALAGETTTDKAAPQLLVATDKPTTK
Sbjct: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60

Query: 61 GEPLVSDILADAQQADLLIPVDETLPVINDEQSTSTPLTTAQTMTLAAVADKNTTKDEKA 120
GEPL+SDI++DAQQA+LLIPVDET PVINDEQSTSTPLTTAQTM LAAVADKNTTKDEKA
Sbjct: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120

Query: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPAEKPTLFTKLTSAQLTTAQPDDAP 180
DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLP EKPTLFTKLTS QLTTAQPDDAP
Sbjct: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP 180

Query: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTADASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240
GTPAQPLTPLVAEAQSKAEVISTPSPVTA ASPLITPHQTQPLPTVAAPVLSAPLGSHEW
Sbjct: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240

Query: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMISPHQHVRAALEAA 300
QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQM+SPHQHVRAALEAA
Sbjct: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300

Query: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360
LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS
Sbjct: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360

Query: 361 LQGRVTGNSGVDIFA 375
LQGRVTGNSGVDIFA
Sbjct: 361 LQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2084     FLGMOTORFLIM  380    e-134    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 380 bits (977), Expect = e-134
Identities = 86/324 (26%), Positives = 148/324 (45%), Gaps = 10/324 (3%)

Query: 5 ILSQAEIDALLNGDS--EVKDEPTASISGESDIRPYDPNTQRRVVRERLQALEIINERFA 62
+LSQ EID LL S + E IS I YD + +E+++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 63 RHFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSPSL 122
R L LR + V ++ Y EF R++P P+ L +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 123 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 182
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 183 EMQVEFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 240
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 241 NEDQNWRDNLVRQVQHSQLELVANFADISLRLSQILKLKPGDVLPIEKP---DRIIAHVD 297
+ + L ++ +++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 298 GVPVLTSQYGTLNGQYALRIEHLI 321
Q G + + A +I I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2085     FLGMOTORFLIN  210    6e-74    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 210 bits (537), Expect = 6e-74
Identities = 125/137 (91%), Positives = 133/137 (97%)

Query: 1 MSDMNNPADDNNGAMDDLWAEALSEQKSTSEKSAADAVFQQFGGGDVSGTLQDIDLIMDI 60
MSDMNNP+D+N GA+DDLWA+AL+EQK+T+ KSAADAVFQQ GGGDVSG +QDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2088     TYPE3IMQPROT  67     1e-18    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 67.1 bits (164), Expect = 1e-18
Identities = 22/78 (28%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALVTGLIISILQAATQINEMTLSFIPKIIAVFIAII 63
+ ++ G +A+ + L L+ +VA + GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2089     TYPE3IMRPROT  203    4e-67    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 203 bits (517), Expect = 4e-67
Identities = 254/261 (97%), Positives = 257/261 (98%)

Query: 1 MMQETSDQWLSWLSLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60
M+Q TS+QWLSWL+LYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPGSHL 120
NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDP SHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGSEPLNSNAFLAPTKAGSLIF 180
NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIG EPLNSNAFLA TKAGSLIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240
LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEIFNLLADIISELPLI 261
EHLFSEIFNLLADIISELPLI
Sbjct: 241 EHLFSEIFNLLADIISELPLI 261


76  S2100  S2108  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S2100     0       16       -1.388622  DNA cytosine methylase
S2103     4       25       -2.386167  ISEhe3 orfB
S2104     3       27       -2.264139  ISEhe3 orfA
S2105     4       23       -1.641428  outer membrane pore protein
S2106     0       21       -0.825297  insertion element IS2 transposase InsD
S2107     -1      27       -6.226235  insertion sequence 2 OrfA protein
S2108     0       30       -7.066478  outer membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2100     PF05272       29     0.045    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.045
Identities = 20/62 (32%), Positives = 29/62 (46%), Gaps = 15/62 (24%)

Query: 320 AKYILTPVLWKYLYRYAKKHQARGNGFGYGMVYPNNPQSVTRTLSARYYKDGAEILIDRG 379
A+Y + PVLW Y+ R+ K + G+ VY +R +DG+E RG
Sbjct: 166 ARYQVGPVLWGYVVRFIK---SDGDKLTLPYVY------------SRSQRDGSEAWKWRG 210

Query: 380 WD 381
WD
Sbjct: 211 WD 212


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2104     HTHFIS        27     0.013    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 26.7 bits (59), Expect = 0.013
Identities = 7/45 (15%), Positives = 16/45 (35%), Gaps = 1/45 (2%)

Query: 4 KRYPEEFKTEAVKQVVDR-GYSVASVATRLDITTHSLYAWIKKYG 47
R E + + + + A L + ++L I++ G
Sbjct: 430 DRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELG 474


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2105     ECOLIPORIN    296    e-101    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 296 bits (758), Expect = e-101
Identities = 137/268 (51%), Positives = 165/268 (61%), Gaps = 31/268 (11%)

Query: 31 DTSYARVGVKGETQINPEMTGYGQFELDLEASNRHNPDQ---TRLAYAGLSYKDFGSFDY 87
D +Y RVG KGETQIN ++TGYGQ+E +++A+ TRLA+AGL + D+GSFDY
Sbjct: 53 DQTYMRVGFKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDY 112

Query: 88 SRNVGVAYDAEAFTDMFVEWGGDSWAGTDLFMTNRTNGVATYRNTDFFGMVEGLNFALQY 147
RN GV YD E +TDM E+GGDS+ D +MT R NGVATYRNTDFFG+V+GLNFALQY
Sbjct: 113 GRNYGVLYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQY 172

Query: 148 QGKNEGTGNY----------------KANGDGHGLSATYTID-GFSFAGAYANSDRTDWQ 190
QGKNE NGDG G+S TY I GFS AY SDRT+ Q
Sbjct: 173 QGKNESQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQ 232

Query: 191 SGDGK----GERAEVWALSTKYDANNVYAAVMYGESHNM-------NSDDGDVVNKTQNF 239
G G++A+ W KYDANN+Y A MY E+ NM DG V NKTQNF
Sbjct: 233 VNAGGTIAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNF 292

Query: 240 EAVLQYQFDFGLRPSIGYSYSKALDVAG 267
E QYQFDFGLRP++ + SK D+
Sbjct: 293 EVTAQYQFDFGLRPAVSFLMSKGKDLTY 320


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2108     ECOLIPORIN    75     5e-20    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 75.0 bits (184), Expect = 5e-20
Identities = 29/67 (43%), Positives = 41/67 (61%), Gaps = 1/67 (1%)

Query: 4 DSGGQSTGYKDSDRLNYIEIGTWYYFNKNMNIYTAYQINLLDKSD-YVLAHGLNTDDQLA 62
D + D D + Y ++G YYFNKN + Y Y+INLLD D + G++TDD +A
Sbjct: 317 DLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLLDDDDPFYKDAGISTDDIVA 376

Query: 63 VGIVYQF 69
+G+VYQF
Sbjct: 377 LGMVYQF 383


77  S2266  S2277  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S2266     -1      18       2.983116   multidrug efflux system subunit MdtC
S2267     -2      12       0.819857   multidrug efflux system protein MdtE
S2268     -1      9        -0.471273  signal transduction histidine-protein kinase
S2269     1       12       -2.312525  DNA-binding transcriptional regulator BaeR
S2270     1       14       -3.237066  hypothetical protein
S2271     0       13       -2.746593  hypothetical protein
S2273     0       24       -4.107675  hypothetical protein
S2274     3       19       -2.429040  lipid kinase
S2275     2       20       -2.825775  galactitol utilization operon repressor
S2277     1       21       -3.094083  galactitol-1-phosphate dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2266     ACRIFLAVINRP  907    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 907 bits (2345), Expect = 0.0
Identities = 287/1035 (27%), Positives = 502/1035 (48%), Gaps = 40/1035 (3%)

Query: 6 LFIYRPVATILLSVAITLCGILGFRMLPVAPLPQVDFPVIMVSASLPGASPETMASSVAT 65
FI RP+ +L++ + + G L LPVA P + P + VSA+ PGA +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERSLGRIAGVSEMTSSS-SLGSTRIILQFDFDRDINGAARDVQAAINAAQSLLPSGMP 124
+E+++ I + M+S+S S GS I L F D + A VQ + A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKANPSDAPIMILTLTSDT--YSQGELYDFASTQLAPTISQIDGVGDVDVGGSSL 182
+ S + +M+ SD +Q ++ D+ ++ + T+S+++GVGDV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVGLTPQALFNQGVSLDDVRTAISNANVRKPQG------ALEDGTHRWQIQTNDELK 236
A+R+ L L ++ DV + N + G AL I K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAAEYQPLIIHYN-NGGAVRLGDVATVTDSVQDVRNAGMTNAKPAILLMIRKLPEANIIQ 295
E+ + + N +G VRL DVA V ++ N KPA L I+ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDSIRAKLPELQETIPAAIDLQIAQDRSPTIRASLEEVEQTLIISVALVILVVFLFLRS 355
T +I+AKL ELQ P + + D +P ++ S+ EV +TL ++ LV LV++LFL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATIIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHL- 414
RAT+IP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGMKPLQAALQGTREVGFTVLSMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIGIS 474
E + P +A + ++ ++ +++ L AVF+P+ GG G + R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LLVSLTLTPMMCGWMLKASKPREQKRLRGFG----RMLVALQQGYGKSLKWVLNHTRLVG 530
+LV+L LTP +C +LK + GF Y S+ +L T
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 VVLLGTIALNI----SIPKTFFPEQDTGVLMGGIQADQSISFQ----AMRGKLQDFMKII 582
++ +A + +P +F PE+D GV + IQ + + + ++K
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 583 RD-DPAVDNVTGFT-GGSRVNSGMMFITLKPRDERS---ETAQQIIDRLRVKLAKEPGAN 637
+ +V V GF+ G N+GM F++LKP +ER+ +A+ +I R +++L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 638 LFLMAVQDIRVGGRQSNASYQYTLLSDDLAALREWEPKIRKKLATL-----PELADVNSD 692
+ + I G + ++ L D + + R +L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 693 QQDNGAEMNLVYDRDTMARLGIDVQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRY 752
++ A+ L D++ LG+ + N ++ A G ++ K+ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 753 TQDISALEKMFVINNEGKAIPLSYFAKWQPANAPLSVNHQGLSAASTISFNLPTGKSLSD 812
++K++V + G+ +P S F + + I G S D
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 813 ASAAIDRAMTQLGVPSTVRGSFAGTAQVFQETMNSQVILIIAAIATVYIVLGILYESYVH 872
A A ++ ++L P+ + + G + + + N L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 873 PLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRHGN 932
P++++ +P VG LLA LFN + ++G++ IG+ KNAI++V+FA +
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 933 LTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLEITIVGGLVMSQLL 992
EA A +R RPI+MT+LA + G LPL +S G GS + + I ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 993 TLYTTPVVYLFFDRL 1007
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031



Score = 78.3 bits (193), Expect = 1e-16
Identities = 77/446 (17%), Positives = 162/446 (36%), Gaps = 26/446 (5%)

Query: 588 VDNVTGFTGGS-RVNSGMMFITLKPRDERSETAQQIIDRLRVKLAKEPGANLFLMAVQDI 646
+DN+ + S S + +T + + Q+ ++L++ P + Q I
Sbjct: 72 IDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE----VQQQGI 127

Query: 647 RVGGRQSNASYQYTLLSDDLAALREW-----EPKIRKKLATLPELADVNSDQQDNGAE-- 699
V S+ +SD+ ++ ++ L+ L + DV GA+
Sbjct: 128 SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQL----FGAQYA 183

Query: 700 MNLVYDRDTMARLGID----VQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRYTQD 755
M + D D + + + + + + T P Q + R+
Sbjct: 184 MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNP 243

Query: 756 ISALEKMFVINNEGKAIPLSYFAK--WQPANAPLSVNHQGLSAASTISFNLPTGKSLSDA 813
+ +N++G + L A+ N + G AA +L D
Sbjct: 244 EEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL-DT 302

Query: 814 SAAIDRAMTQL--GVPSTVRGSFA-GTAQVFQETMNSQVILIIAAIATVYIVLGILYESY 870
+ AI + +L P ++ + T Q +++ V + AI V++V+ + ++
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 871 VHPLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRH 930
L +P +G L F + + + G++L IG++ +AI++V+
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 931 GNLTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLEITIVGGLVMSQ 990
L P+EA ++ ++ + +P+ GG + + ITIV + +S
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 991 LLTLYTTPVVYLFFDRLRLRFSRKPK 1016
L+ L TP + + + K
Sbjct: 483 LVALILTPALCATLLKPVSAEHHENK 508


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2267     TCRTETB       125    2e-33    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 125 bits (315), Expect = 2e-33
Identities = 97/429 (22%), Positives = 188/429 (43%), Gaps = 23/429 (5%)

Query: 20 FMQSLDTTIVNTALPSMAQSLGESPLHMHMVIVSYVLTVAVMLPASGWLADKVGVRNIFF 79
F L+ ++N +LP +A + P + V +++LT ++ G L+D++G++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 TAIVLFTLGSLFCALSGTLNELL-LARALQGVGGAMMVPVGRLTVMKIVPREQYMAAMTF 138
I++ GS+ + + LL +AR +QG G A + + V + +P+E A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQVGPLLGPALGGLLVEYASWHWIFLINIPVGIIGAIATLM-LMPNYTMQTRRFDL 197
+ +G +GPA+GG++ Y HW +L+ IP+ I + LM L+ FD+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 SGFLLLAVGMAVLTLALDGSKGTGLSPLAIAGLVAVGVVALVLYLLHARNNNRALFSLKL 257
G +L++VG+ L + + V V++ ++++ H R L
Sbjct: 202 KGIILMSVGIVFFMLF---------TTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FRTRTFSLGLAGSFAGRIGSGMLPFMTPVFLQIGLGFSPFHAG-LMMIPMVLGSMGMKRI 316
+ F +G+ M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 VVQVVNRFGYRRVLVATTLGLSLVTLLFMTTALL----GWYYVLPFVLFLQGMVNSTRFS 372
+V+R G VL +G++ +++ F+T + L W+ + V L G+ S +
Sbjct: 313 GGILVDRRGPLYVL---NIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL--SFTKT 367

Query: 373 SMNTLTLKDLPDNLASSGNSLLSMIMQLSMSIGVTIAGLLLGLFGSQHISVDSGTTQTVF 432
++T+ L A +G SLL+ LS G+ I G LL + + Q+ +
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTY 427

Query: 433 MYTWLSMAF 441
+Y+ L + F
Sbjct: 428 LYSNLLLLF 436


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2268     BCTERIALGSPF  31     0.009    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 31.3 bits (71), Expect = 0.009
Identities = 27/93 (29%), Positives = 34/93 (36%), Gaps = 27/93 (29%)

Query: 173 LATLLAALATFLLA-------------RGLLAPVKRLVDGTHKLAAGDFTTRVTPTSEDE 219
LATL+AA A L+A V+ V H LA + P S +
Sbjct: 77 LATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD---AMKCFPGSFER 133

Query: 220 L-----------GKLAQDFNQLASTLEKNQQMR 241
L G L N+LA E+ QQMR
Sbjct: 134 LYCAMVAAGETSGHLDAVLNRLADYTEQRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2269     HTHFIS        76     6e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.6 bits (186), Expect = 6e-18
Identities = 28/136 (20%), Positives = 65/136 (47%), Gaps = 1/136 (0%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLISHGDQVLAYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + S+ + ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDVPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVARVKTILRRCK 129
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L K
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 130 PQRELQQQDAESPLII 145
+ + D++ + +
Sbjct: 124 RRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2273     LIPOLPP20     27     0.026    LPP20 lipoprotein precursor signature.
		>LIPOLPP20#LPP20 lipoprotein precursor signature.

Length = 175

Score = 26.6 bits (58), Expect = 0.026
Identities = 13/38 (34%), Positives = 24/38 (63%), Gaps = 1/38 (2%)

Query: 18 EGEMKKIAAISLISIFLISGCAVHNDETSIGKFGLAYK 55
+ ++KKI +S+++ +I GC+ H ++ I K AYK
Sbjct: 2 KNQVKKILGMSVVAAMVIVGCS-HAPKSGISKSNKAYK 38


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2277     DHBDHDRGNASE  34     7e-04    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 33.9 bits (77), Expect = 7e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 2/92 (2%)

Query: 156 AQGCENKNVIIIGAGT-IGLLAIQCAVALGAKSVTAIDISSEKLALAKSFGAMQTFNSSE 214
A+G E K I GA IG + + GA + A+D + EKL S + ++
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAH-IAAVDYNPEKLEKVVSSLKAEARHAEA 61

Query: 215 MSAPQMQSVLRELRFNQLILETAGVPQTVELA 246
A S + ++ E + V +A
Sbjct: 62 FPADVRDSAAIDEITARIEREMGPIDILVNVA 93


78  S2571  S2574  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S2571     0       36       -9.718984  multidrug resistance protein Y
S2572     0       35       -8.999473  multidrug resistance protein K
S2573     1       34       -8.571625  DNA-binding transcriptional activator EvgA
S2574     1       35       -8.537495  hybrid sensory histidine kinase in two-component
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S2571     TCRTETB       119    3e-31    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 119 bits (300), Expect = 3e-31
Identities = 98/408 (24%), Positives = 168/408 (41%), Gaps = 25/408 (6%)

Query: 19 VTIALSLATFMQMLDSTISNVAIPTISGFLGASTDEGTWVITSFGVANAIAIPVTGRLAQ 78
+ I L + +F +L+ + NV++P I+ WV T+F + +I V G+L+
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 79 RIGELRLFLLSVTFFSLSSLMCSLS-TNLDVLIFFRVVQGLMAGPLIPLSQSLLLRNYPP 137
++G RL L + S++ + + +LI R +QG A L ++ R P
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 138 EKRTFALALWSMTVIIAPICGPILGGYICDNFSWGWIFLINVPMGIIVLTLCLTLLKGRE 197
E R A L V + GP +GG I W +L+ +PM I+ L L +E
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKE 192

Query: 198 TETSPVKMNLPRLTLLVLGVGGLQIMLDKGRDLDWFNSSTIIILTVVSVISLISLVIWES 257
K + ++++ VG + ML F +S I +VSV+S + V
Sbjct: 193 VRI---KGHFDIKGIILMSVGIVFFML--------FTTSYSISFLIVSVLSFLIFVKHIR 241

Query: 258 TSENPILDLSLFKSRNFTIGIVSITCAYLFYSGAIVLMPQLLQKTMGYNAIWAGLAYAPI 317
+P +D L K+ F IG++ + +G + ++P +++ + G
Sbjct: 242 KVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFP 301

Query: 318 GIMPLLISPLIG-----RYGNKIDMRVLVTFSFLMYAVCYYWRSVTFMPTIDFTGIILPQ 372
G M ++I IG R G + + VTF +V + S T F II+
Sbjct: 302 GTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL----SVSFLTASFLLETTSWFMTIIIVF 357

Query: 373 FFQGFAVACFFLPLTTISFSGLPDNKFANASSMSNFFRTLSGSVGTSL 420
G + ++TI S L + S+ NF LS G ++
Sbjct: 358 VLGGLSFTK--TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
S2572     RTXTOXIND   77     1e-17    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 77.2 bits (190), Expect = 1e-17
Identities = 63/419 (15%), Positives = 125/419 (29%), Gaps = 96/419 (22%)

Query: 8 KKQSNRKKYFSLLVIVLFIAFSGAYAYWSMELEDMISTDDAYVT-GNADPISAQVSGSVT 66
+ +R+ I+ F+ + + ++E + + + G + I + V
Sbjct: 50 ETPVSRRPRLVAYFIMGFLVIAFILSVLG-QVEIVATANGKLTHSGRSKEIKPIENSIVK 108

Query: 67 VVNHKDTNYVRQGDILVSLDKTDATIALNKA----------------------------- 97
+ K+ VR+GD+L+ L A K
Sbjct: 109 EIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPEL 168

Query: 98 -----------------------KNNLANIVRQTNKLYLQDKQYSAEVASARIQ---YQQ 131
K + Q + L + AE + + Y+
Sbjct: 169 KLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYEN 228

Query: 132 SLEDYNRRV----PLAKQGVISKE----------TLEHTKDTLISSKAALNAAIQAYKAN 177
R+ L + I+K + S + + I + K
Sbjct: 229 LSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEE 288

Query: 178 KALVMN-------TPLNR-QPQVVEAADATKEAWLVLKRTDIRSPVTGYIAQRSVQ-VGE 228
LV L + + + + + IR+PV+ + Q V G
Sbjct: 289 YQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGG 348

Query: 229 TVSSGQSLMAVVPARQ-MWVNANFKETQLTDVRIGQSVNIISDLYGENVVFHGRVTGINM 287
V++ ++LM +VP + V A + + + +GQ+ I + F G +
Sbjct: 349 VVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE------AFPYTRYGYLV 402

Query: 288 GTGNAFSLLPAQNATGNWIKIVQRVPVEVSLDPKELMEH----PLRIGLSMTATIDTKD 342
G + + +V V +S++ L PL G+++TA I T
Sbjct: 403 GK---VKNINLDAIEDQRLGLVFNVI--ISIEENCLSTGNKNIPLSSGMAVTAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S2573     HTHFIS  49     3e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.1 bits (117), Expect = 3e-09
Identities = 22/148 (14%), Positives = 53/148 (35%), Gaps = 31/148 (20%)

Query: 4 IIIDDHPLAIAAIRNLLIKNDIEILAELTEGGSAVQRVETLKPDIVIIDVDIPGVNGIQV 63
++ DD + L + ++ + + + + D+V+ DV +P N +
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 64 LETLRKRQYSGIIIIVSAKNDHFYGKHCADAGANGFVSKKEGMNNIIAAIEAAKNGYCYF 123
L ++K + ++++SA+N + AI+A++ G +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNT------------------------FMTAIKASEKGAYDY 101

Query: 124 ---PFSLNRFVGSLTSDQQKLDSLSKQE 148
PF L + + L ++
Sbjct: 102 LPKPFDLTE---LIGIIGRALAEPKRRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S2574     HTHFIS  80     2e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.9 bits (197), Expect = 2e-17
Identities = 30/105 (28%), Positives = 51/105 (48%)

Query: 960 SILIADDHPTNRLLLKRQLNLLGYDVDEATDGVQALHKVSMQHYDLLITDVNMPNMDGFE 1019
+IL+ADD R +L + L+ GYDV ++ ++ DL++TDV MP+ + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 1020 LTRKLREQNSSLPIWGLTANAQANEREKGLSCGMNLCLFKPLTLD 1064
L ++++ LP+ ++A K G L KP L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


79  S2894  S2901  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
S2894     0       11       1.683247  hypothetical protein
S2896     -1      12       0.691111  hypothetical protein
S2897     -1      13       0.415505  hypothetical protein
S2898     -1      12       0.346007  transcriptional repressor MprA
S2900     -1      13       0.624022  multidrug resistant protein emrB
S2901     -1      15       0.182658  S-ribosylhomocysteinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S2894     TCRTETB  46     9e-08    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 46.4 bits (110), Expect = 9e-08
Identities = 32/165 (19%), Positives = 70/165 (42%), Gaps = 2/165 (1%)

Query: 34 LDTIARNFSLSASSAGFIVTAAQLGYAAGLLFLVPLGDMFERRRLIVSMTLLAAGGMLIT 93
L IA +F+ +S ++ TA L ++ G L D +RL++ ++ G +I
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 94 ASSQSLA-MMILGTALTGLFSVVAQILVPLA-ATLASPDKRGKVVGTIMSGLLLGILLAR 151
S ++I+ + G + LV + A + RGK G I S + +G +
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 152 TVAGLLANLGGWRTVFWVASVLMALMALALWRGLPQMKSETHLNY 196
+ G++A+ W + + + + + + +++ + H +
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S2898     PF05272  28     0.020    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.1 bits (62), Expect = 0.020
Identities = 22/94 (23%), Positives = 36/94 (38%), Gaps = 12/94 (12%)

Query: 23 PYQEILLTRLCMHMQSKLLENRNKMLKAQGINETLFMALITLESQENHSIQPSELSCALG 82
P QE+ L + + L R A+G + + T + ++L ALG
Sbjct: 756 PEQELRLVETGVQGRLWALLTREGAPAAEGAAQKGYSVNTTFVTI-------ADLVQALG 808

Query: 83 -----SSRTNATRIADELEKRGWIERRKSDNDRR 111
SS ++ D L + GW R++ RR
Sbjct: 809 ADPGKSSPMLEGQVRDWLNENGWEYLRETSGQRR 842


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S2900     TCRTETB  133    3e-36    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 133 bits (337), Expect = 3e-36
Identities = 98/402 (24%), Positives = 168/402 (41%), Gaps = 17/402 (4%)

Query: 17 IALSLATFMQVLNSTIANVAIPTIAGNLGSSLSQGTWVITSFGVANAISIPLTGWLAKRV 76
I L + +F VLN + NV++P IA + + WV T+F + +I + G L+ ++
Sbjct: 17 IWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQL 76

Query: 77 GEVKLFLWSTIAFAIASWACGVS-SSLNMLIFFRVIQGIVAGPLIPLSQSLLLNNYPPAK 135
G +L L+ I S V S ++LI R IQG A L ++ P
Sbjct: 77 GIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKEN 136

Query: 136 RSIALALWSMTVIVAPICGPILGGYISDNYHWGWIFFINVPIGVAVVLMTLQTLRGRETR 195
R A L V + GP +GG I+ HW + + +P+ + + L L +E R
Sbjct: 137 RGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKEVR 194

Query: 196 TERRRIDAVGLALLVIGIGSLQIMLDRGKELDCFSSQEIIILTVVAVVAICFLIVWELTD 255
+ D G+ L+ +GI + ML F++ I +V+V++ +
Sbjct: 195 I-KGHFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIRKV 243

Query: 256 DNPIVDLSLFKSRNFTIGCLCISLAYMLYFGAIVLLPQLLQEVYGYTATWAGLASAPVGI 315
+P VD L K+ F IG LC + + G + ++P ++++V+ + G G
Sbjct: 244 TDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGT 303

Query: 316 IPVILS-PIIGRFAHKLDMRRLVTFSFIMYAVCFYWRAYTFEPGMDFCASAWPQFIQGFA 374
+ VI+ I G + ++ +V F ++ E F + G +
Sbjct: 304 MSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLS 363

Query: 375 VVCFFMPLTTITLSGLPPERLAAASSLSNFTRTLAGSIGTSI 416
++TI S L + A SL NFT L+ G +I
Sbjct: 364 FTK--TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
S2901     LUXSPROTEIN  291    e-105    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS

signature.
Length = 171

Score = 291 bits (747), Expect = e-105
Identities = 131/170 (77%), Positives = 148/170 (87%)

Query: 2 PLLDSFTVDHTRMEAPAVRVAKTMNTPHGDAITVFDLRFCVPNKEVMPERGIHTLEHLFA 61
PLLDSFTVDHTRM APAVRVAKTM TP GD ITVFDLRF PNK+++ E+GIHTLEHL+A
Sbjct: 1 PLLDSFTVDHTRMNAPAVRVAKTMQTPKGDTITVFDLRFTAPNKDILSEKGIHTLEHLYA 60

Query: 62 GFMRNHLNGNGVEIIDISPMGCRTGFYMSLIGTPDEQRVADAWKAAMEDVLKVQDQNQIP 121
GFMRNHLNG+ VEIIDISPMGCRTGFYMSLIGTP EQ+VADAW AAMEDVLKV++QN+IP
Sbjct: 61 GFMRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKVENQNKIP 120

Query: 122 ELNVYQCGTYQMHSLQEAQDIARSILEREVRINSNEELALPKEKLQELHI 171
ELN YQCGT MHSL EA+ IA++ILE V +N N+ELALP+ L+EL I
Sbjct: 121 ELNEYQCGTAAMHSLDEAKQIAKNILEVGVAVNKNDELALPESMLRELRI 170


80  S3403  S3410  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
S3403     0       13       0.371332  fimbrial protein
S3404     1       12       2.185544  hypothetical protein
S3405     0       13       2.296726  glycosylase
S3406     0       16       1.635297  hypothetical protein
S3407     1       17       1.562182  DnaA initiator-associating protein DiaA
S3408     1       18       2.279655  hypothetical protein
S3409     0       20       2.484370  hypothetical protein
S3410     0       20       1.067461  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3403     FIMBRIALPAPF  29     0.022    Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein

signature.
Length = 167

Score = 28.9 bits (64), Expect = 0.022
Identities = 41/160 (25%), Positives = 67/160 (41%), Gaps = 21/160 (13%)

Query: 208 VKLSIQGNLTAPQSCKINQGDVIKVNFGFINGQKFTTRNAMPDGFTPVDFDITYDCGDTS 267
V+++I+GN+ P C IN G I V+FG IN + V +I+ C S
Sbjct: 21 VQINIRGNVYIP-PCTINNGQNIVVDFGNINPEHVDNSRG------EVTKNISISCPYKS 73

Query: 268 KIKNSLQMRIDGTTGVVDQYNLVARRRSSDNVPDVGIRIENLGGGVANIPFQNG------ 321
SL +++ G T V Q N++A N+ GI + G + NG
Sbjct: 74 ---GSLWIKVTGNTMGVGQNNVLA-----TNITHFGIALYQGKGMSTPLTLGNGSGNGYR 125

Query: 322 ILPVDPSGHGTVNMRAWPVNLVGGELETGKFQGTATITVM 361
+ + T + P G L G F+ TA+++++
Sbjct: 126 VTAGLDTARSTFTFTSVPFRNGSGILNGGDFRTTASMSMI 165


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
S3405     IGASERPTASE  30     0.029    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.029
Identities = 41/266 (15%), Positives = 83/266 (31%), Gaps = 15/266 (5%)

Query: 277 QQGFEAAKNIGTQPVAAQVAAAPAADVAEQPQPQTADSVASPAQASVSDLTGDQPAAQPV 336
Q G E + T+ E + Q V S QP A+P
Sbjct: 1087 QSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPA 1146

Query: 337 PVSAPATSTAAVSAPANPSAELKIYDTSSQPLSQILSQVQQDGASIVVGPLLKNNVEELL 396
+ P + + N +A+ + + S + V + +++N
Sbjct: 1147 RENDPTVNIKEPQSQTNTTAD--TEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTP 1204

Query: 397 KSNTPLNVLALNQPENIENRVNICYFALSPEDEARDAARHIRDQGKQAPLVLIPR---SA 453
+ P + +R ++ + E A D+ A L +
Sbjct: 1205 ATTQPTVNSESSNKPKNRHRRSVRSVPHNVE----PATTSSNDRSTVALCDLTSTNTNAV 1260

Query: 454 LGDRVANAFAQEWQKLGGGTVLQQKFGSTSELRAGVNGGSGIALTGSPITPRATTDSGMT 513
L D A A ++ L G + Q S+L G + ++ + + ++
Sbjct: 1261 LSDARAKA---QFVALNVGKAVSQHI---SQLEMNNEGQYNVWVSNTSMNKNYSSSQYRR 1314

Query: 514 TNNPTLQTTPTDDQFTNNGGRVDAVY 539
++ + QT DQ +N ++ V+
Sbjct: 1315 FSSKSTQTQLGWDQTISNNVQLGGVF 1340


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
S3407     RTXTOXINA  28     0.036    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family

signature.
Length = 1024

Score = 27.6 bits (61), Expect = 0.036
Identities = 26/111 (23%), Positives = 44/111 (39%), Gaps = 22/111 (19%)

Query: 42 NKILCCGNGTSAANAQHFAASMINRFETERPSLPAIALNTDNVVLTAIA-------NDRL 94
K+L GN + A T + IA + V AI+ D+
Sbjct: 277 TKVL--GNVGKGISQYIIAQRAAQGLSTSAAAAGLIA----SAVTLAISPLSFLSIADKF 330

Query: 95 HD----EVYAKQVRALGHAGDVLLAISTRGNSRDIVKAVEAAVTRDMTIVA 141
E Y+++ + LG+ GD LLA + A++A++T T++A
Sbjct: 331 KRANKIEEYSQRFKKLGYDGDSLLAAFHKETG-----AIDASLTTISTVLA 376


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3410     NUCEPIMERASE  29     0.013    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 29.0 bits (65), Expect = 0.013
Identities = 8/22 (36%), Positives = 13/22 (59%)

Query: 19 VLITGATGLVGGHLLRMLINEP 40
L+TGA G +G H+ + L+
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAG 24


81  S3490  S3496  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S3490     -2      12       -1.001097  serine endoprotease
S3491     -2      13       -0.814612  malate dehydrogenase
S3492     -2      12       -1.156181  arginine repressor
S3493     -3      13       -0.529904  hypothetical protein
S3494     -2      12       0.489375   hypothetical protein
S3495     -3      10       0.707371   p-hydroxybenzoic acid efflux subunit AaeB
S3496     -2      10       0.663752   p-hydroxybenzoic acid efflux subunit AaeA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
S3490     V8PROTEASE  53     8e-10    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 52.7 bits (126), Expect = 8e-10
Identities = 31/160 (19%), Positives = 59/160 (36%), Gaps = 26/160 (16%)

Query: 77 RTLGSGVIMDQRGYIITNKHVINDADQIIVALQ------------DGRVFEALLVGSDSL 124
+ SGV++ + ++TNKHV++ AL+ +G +
Sbjct: 101 TFIASGVVVG-KDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGE 159

Query: 125 TDLAVLKI-------NATGGLPTIPINARRVPHIGDVVLAIGNPYNLGQTITQGIISATG 177
DLA++K + + ++ + + G P + T + G
Sbjct: 160 GDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGD-KPVATMW--ESKG 216

Query: 178 RIGLNPTGRQNFLQTDASINHGNSGGALVNSLGELMGINT 217
+I + +Q D S GNSG + N E++GI+
Sbjct: 217 KI---TYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHW 253


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
S3491     DHBDHDRGNASE 28     0.045    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 28.1 bits (62), Expect = 0.045
Identities = 37/167 (22%), Positives = 61/167 (36%), Gaps = 27/167 (16%)

Query: 3 VAVLGAAGGIGQALALLLKTQLPSGSELSLYDIAPVTPGVAVDLSHIPTAVKIKGFSGED 62
+ GAA GIG+A+A L G+ ++ D P V S A + F +
Sbjct: 11 AFITGAAQGIGEAVARTL---ASQGAHIAAVDYNP-EKLEKVVSSLKAEARHAEAFPADV 66

Query: 63 ATPA------------LEGADVVLISAGVARK------PGMDRSDLFNVNAGIVKNLVQQ 104
A + D+++ AGV R + F+VN+ V N +
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 105 VAKNCPK----ACIGIITNPVNTT-VAIAAEVLKKAGVYDKNKLFGV 146
V+K + + + +NP ++AA KA K G+
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGL 173


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3492     ARGREPRESSOR  169    4e-57    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 169 bits (430), Expect = 4e-57
Identities = 44/141 (31%), Positives = 71/141 (50%), Gaps = 5/141 (3%)

Query: 15 KALLKEEKFSSQGEIVAALQEQGFDNINQSKVSRMLTKFGAVRTRNAKMEMVYCLPAELG 74
+ ++ + +Q E+V L++ G+ N+ Q+ VSR + + V+ Y LPA+
Sbjct: 11 REIITANEIETQDELVDILKKDGY-NVTQATVSRDIKELHLVKVPTNNGSYKYSLPADQR 69

Query: 75 VPTTSSPLKNLV---LDIDYNDAVVVIHTSPGAAQLIARLLDSLGKAEGILGTIAGDDTI 131
S ++L+ + ID ++V+ T PG AQ I L+D+L E I+GTI GDDTI
Sbjct: 70 FNPLSKLKRSLMDAFVKIDSASHLIVLKTMPGNAQAIGALMDNLDWEE-IMGTICGDDTI 128

Query: 132 FTTPANGFTVKDLYEAILELF 152
K + + ILEL
Sbjct: 129 LIICRTHDDTKVVQKKILELL 149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
S3496     RTXTOXIND  53     5e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 53.3 bits (128), Expect = 5e-10
Identities = 28/163 (17%), Positives = 59/163 (36%), Gaps = 16/163 (9%)

Query: 6 RKFSRTAITVVLVILAFIAIFNAWVYYTE----SPWTRDARFSADVVAIAPDVSGLITQV 61
SR V I+ F+ I + + S I P + ++ ++
Sbjct: 51 TPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEI 110

Query: 62 NVHDNQLVKKGQILFTIDQPR-------YQKALEEAQADVAYYQVLAQEKRQEAGRRNRL 114
V + + V+KG +L + Q +L +A+ + YQ+L++ E + L
Sbjct: 111 IVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRS--IELNKLPEL 168

Query: 115 GVQAMSREEIDQANNVL---QTVLHQLAKAQATRDLAKLDLER 154
+ + VL + Q + Q + +L+L++
Sbjct: 169 KLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDK 211



Score = 51.4 bits (123), Expect = 2e-09
Identities = 28/147 (19%), Positives = 54/147 (36%), Gaps = 15/147 (10%)

Query: 100 LAQEKRQEAGRRNRLGVQ-AMSREEIDQANNVLQT-VLHQLAKAQAT-------RDLAKL 150
E R + ++ + ++EE + + +L +L + +
Sbjct: 264 AVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEE 323

Query: 151 DLERTVIRAPADGWVTNLNVYT-GEFITRGSTAVALVKQNSFY-VLAYMEETKLEGVRPG 208
+ +VIRAP V L V+T G +T T + +V ++ V A ++ + + G
Sbjct: 324 RQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVG 383

Query: 209 YRAEIT----PLGSNKVLKGTVDSVAA 231
A I P L G V ++
Sbjct: 384 QNAIIKVEAFPYTRYGYLVGKVKNINL 410


82  S3554  S3561  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S3554     -1      12       0.734894   hypothetical protein
S3555     0       14       -0.051512  hypothetical protein
S3556     -1      17       -0.221907  phosphate-starvation-inducible protein PsiE
S3557     -1      19       -0.102231  D-xylose transporter XylE
S3558     -1      19       0.829131   maltose transporter permease
S3559     0       17       0.518267   maltose transporter membrane protein
S3560     -1      15       0.153658   maltose ABC transporter periplasmic protein
S3561     -1      13       1.343628   maltose/maltodextrin transporter ATP-binding
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3554     CHANLCOLICIN  31     0.006    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 30.8 bits (69), Expect = 0.006
Identities = 21/95 (22%), Positives = 38/95 (40%), Gaps = 3/95 (3%)

Query: 20 AAGTVKVFSNGSSEAKTLTGAEHLIDLVGQPRLANSWWPGAVISEELATAAALRQQQALL 79
A + + + LT + L D+V + N+ + A AA++ + L
Sbjct: 73 AKAAAEAQAKAKANRDALT--QRLKDIVNEALRHNASRTPSATELAHANNAAMQAEDERL 130

Query: 80 TRLAEQGADSSTDDAAAINALRQQIQALKVTGRQK 114
RLA+ + + AA A ++ Q K R+K
Sbjct: 131 -RLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREK 164


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S3557     TCRTETA  36     3e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.0 bits (83), Expect = 3e-04
Identities = 20/87 (22%), Positives = 42/87 (48%), Gaps = 3/87 (3%)

Query: 279 VIGVMLSIFQQFVGINVVLYYAPEVFKTLGASTDIALLQTIIVGVINLTFTVLAIMT--- 335
+I ++ ++ VGI +++ P + + L S D+ I++ + L A +
Sbjct: 7 LIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 336 VDKFGRKPLQIIGALGMAIGMFSLGTA 362
D+FGR+P+ ++ G A+ + TA
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATA 93


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
S3559     FLGHOOKAP1  31     0.011    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 31.1 bits (70), Expect = 0.011
Identities = 22/124 (17%), Positives = 43/124 (34%), Gaps = 21/124 (16%)

Query: 128 GDEWQLALSDGETGKNYLSDAFKFGGEQKLQLKETTAQPEGERANLRVITQNRQALSDIT 187
++WQ+ T DA L+L T + L+ + A+ ++
Sbjct: 367 NNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPV---SDAIVNMD 423

Query: 188 AILPDGNKVMMSSLRQFSGTQPLYTLDGDGTLTNNQSGVKYRPNNQ--------IGFYQS 239
++ D K+ M+S GD N Q+ + + N++ Y S
Sbjct: 424 VLITDEAKIAMAS----------EEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYAS 473

Query: 240 ITAD 243
+ +D
Sbjct: 474 LVSD 477


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
S3560     MALTOSEBP  755    0.0      Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 755 bits (1951), Expect = 0.0
Identities = 395/396 (99%), Positives = 395/396 (99%)

Query: 1 MKIKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIK 60
MKIKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIK
Sbjct: 1 MKIKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIK 60

Query: 61 VTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTW 120
VTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTW
Sbjct: 61 VTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTW 120

Query: 121 DAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEP 180
DAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEP
Sbjct: 121 DAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEP 180

Query: 181 YFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAE 240
YFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAE
Sbjct: 181 YFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAE 240

Query: 241 AAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSTGINAASPNKE 300
AAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLS GINAASPNKE
Sbjct: 241 AAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGINAASPNKE 300

Query: 301 LAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIP 360
LAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIP
Sbjct: 301 LAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIP 360

Query: 361 QMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK 396
QMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK
Sbjct: 361 QMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK 396


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S3561     PF05272  35     6e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 34.7 bits (79), Expect = 6e-04
Identities = 13/35 (37%), Positives = 18/35 (51%)

Query: 32 VVFVGPSGCGKSTLLRMIAGLETITSGDLFIGEKR 66
VV G G GKSTL+ + GL+ + IG +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGK 633


83  S3613  S3620  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S3613     -1      14       0.072811   phosphonate/organophosphate ester transporter
S3614     -2      15       -0.316279  hypothetical protein
S3615     -3      15       -0.133264  hypothetical protein
S3616     -2      14       -0.417793  hypothetical protein
S3617     -1      12       0.206159   hypothetical protein
S3618     0       13       -1.482050  proline/glycine betaine transporter
S3619     -1      17       -0.817568  sensor protein BasS/PmrB
S3620     0       16       -1.425455  DNA-binding transcriptional regulator BasR
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S3613     PF05272  29     0.019    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.019
Identities = 12/22 (54%), Positives = 13/22 (59%)

Query: 32 MVALLGPSGSGKSTLLRHLSGL 53
V L G G GKSTL+ L GL
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S3618     TCRTETA  43     2e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 42.9 bits (101), Expect = 2e-06
Identities = 57/290 (19%), Positives = 105/290 (36%), Gaps = 55/290 (18%)

Query: 85 FFGMLGDKYGRQKILAITIVIMSISTFCIGLIPSYDTIGIWAPILLLICKMAQGFSVGGE 144
G L D++GR+ +L +++ ++ + P +W +L I ++ G + G
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPF-----LW---VLYIGRIVAGIT-GAT 112

Query: 145 YTGASIFVAEYSPDRKR----GFMGSWLDFGSIAGFVLGAGVVVLISTIVGEANFLDWGW 200
A ++A+ + +R GFM + FG +AG VLG G++ S
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLG-GLMGGFSP------------ 159

Query: 201 RIPFFIALPLGIIGLYLRHALEETPAFQQHVDKLEQGDREGLQDGPKVSFKEIATKYWRS 260
PFF A L + L K E+ P SF+ W
Sbjct: 160 HAPFFAAAALNGLNFLTGCFLLPESH------KGERRPLRREALNPLASFR------WAR 207

Query: 261 LLTCIGLVIATNVTYYML----LTYMPSYLSHNLHYS-EDHGVLIIIAIMIGMLFVQPVM 315
+T + ++A ++ + H+ G+ + ++ L +
Sbjct: 208 GMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMIT 267

Query: 316 GLLSDRFGRRPFVLLG----SVALFVLA--------IPAFILINSNVIGL 353
G ++ R G R ++LG +LA P +L+ S IG+
Sbjct: 268 GPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGM 317



Score = 41.0 bits (96), Expect = 8e-06
Identities = 39/164 (23%), Positives = 73/164 (44%), Gaps = 16/164 (9%)

Query: 286 LSHNLHYSEDHGVLI-IIAIMIGMLFVQPVMGLLSDRFGRRPFVLLGSVALFVLAIPAFI 344
L H+ + +G+L+ + A+M PV+G LSDRFGRRP +L+ L A+ I
Sbjct: 35 LVHSNDVTAHYGILLALYALM--QFACAPVLGALSDRFGRRPVLLVS---LAGAAVDYAI 89

Query: 345 LINSNVIGLIFAGLLMLAVILNCFMGVMASTLPAMFPTHIR---YSALAAAFNISVLVAG 401
+ + + +++ G ++A I V + + + R + ++A F +VAG
Sbjct: 90 MATAPFLWVLYIG-RIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFG-MVAG 147

Query: 402 LTPTLAAWLVESSQNLMMPAYYLMVVAVIGLITG-VTMKETANR 444
P L + S + P + + + +TG + E+
Sbjct: 148 --PVLGGLMGGFSPH--APFFAAAALNGLNFLTGCFLLPESHKG 187


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S3619     PF06580  37     1e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 1e-04
Identities = 40/182 (21%), Positives = 80/182 (43%), Gaps = 34/182 (18%)

Query: 184 ARLDQMMESVSQLLQLARAGQSFSSGNYQHVKLLEDV-ILPSYDELSTML--DQRQQTLL 240
+ +M+ S+S+L++ S N + V L +++ ++ SY +L+++ D+ Q
Sbjct: 191 TKAREMLTSLSELMR-----YSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 241 LPESAADITVQGDATLLRMLLRNLVENAHRY----SPQGSNIMIKLQEDGGAV-MAVEDE 295
+ + D+ V ML++ LVEN ++ PQG I++K +D G V + VE+
Sbjct: 246 INPAIMDVQV------PPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENT 299

Query: 296 GPGIDESKCGELSKAFVRMDSRYGGIGLGLSIV-SRITQLHHGQFFLQNRQETSGTRAWV 354
G + + G GL V R+ L+ + ++ ++ A V
Sbjct: 300 GSLA--------------LKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 355 RL 356
+
Sbjct: 346 LI 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S3620     HTHFIS  91     2e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 90.7 bits (225), Expect = 2e-23
Identities = 40/121 (33%), Positives = 59/121 (48%)

Query: 2 KILIVEDDTLLLQGLILAAQTEGYACDGVTTARMAEQSLEDGHYSLVVLDLGLPDEDGLH 61
IL+ +DD + L A GY + A + + G LVV D+ +PDE+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 FLARIRQKKYTLPVLILTARDTLTDKIAGLDVGADDYLVKPFALEELHARIRALLRRHNN 121
L RI++ + LPVL+++A++T I + GA DYL KPF L EL I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 Q 122
+
Sbjct: 125 R 125


84  S3631  S3637  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S3631     -1      14       -0.844629  DNA-binding transcriptional activator DcuR
S3632     -1      13       -0.748649  sensory histidine kinase DcuS
S3633     -1      16       -0.286209  insertion sequence 2 OrfA protein
S3634     0       15       -1.029625  insertion element IS2 transposase InsD
S3635     2       17       -1.229711  23S rRNA pseudouridine synthase F
S3636     0       17       -0.827948  sor-operon regulator
S3637     0       16       -0.020463  sorbitol-6-phosphate 2-dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S3631     HTHFIS  71     2e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 71.4 bits (175), Expect = 2e-16
Identities = 31/109 (28%), Positives = 50/109 (45%), Gaps = 4/109 (3%)

Query: 4 VLIIDDDAMVAELNRRYVAQIPGFQCCGTASTLEKAKEIIFNSDAPIDLILLDIYMQKEN 63
+L+ DDDA + + + +++ G+ S I + DL++ D+ M EN
Sbjct: 6 ILVADDDAAIRTVLNQALSRA-GYDVR-ITSNAATLWRWI--AAGDGDLVVTDVVMPDEN 61

Query: 64 GLDLLPVLHNARCKSDVIVISSAADAATIKDSLHYGVVDYLIKPFQASR 112
DLLP + AR V+V+S+ T + G DYL KPF +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTE 110


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S3632     PF06580  41     8e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.0 bits (96), Expect = 8e-06
Identities = 21/99 (21%), Positives = 38/99 (38%), Gaps = 18/99 (18%)

Query: 442 LIENALE-ALGP-EPGGEISVTLHYRHGWLHCEVNDDGPGIAPDKIDHIFDKGVSTKGSE 499
L+EN ++ + GG+I + +G + EV + G +
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN------------TKES 310

Query: 500 RGVGLALVKQQVENLGG---SIAVESEPGIFTQFFVQIP 535
G GL V+++++ L G I + + G V IP
Sbjct: 311 TGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S3636     HTHFIS  29     0.033    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 28.6 bits (64), Expect = 0.033
Identities = 6/21 (28%), Positives = 12/21 (57%)

Query: 24 QAQIARELGIYRTTISRLLKR 44
Q + A LG+ R T+ + ++
Sbjct: 452 QIKAADLLGLNRNTLRKKIRE 472


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3637     DHBDHDRGNASE  115    5e-33    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 115 bits (289), Expect = 5e-33
Identities = 79/272 (29%), Positives = 128/272 (47%), Gaps = 27/272 (9%)

Query: 7 LQDKIIIVTGGASGIGLAIVEELLAQGANVQMVDIHG-------GDGQYEGHKGYQFWPT 59
++ KI +TG A GIG A+ L +QGA++ VD + + E F P
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAF-PA 64

Query: 60 DISSTKEVNHTVAEIIQRFGRIDGLVNNAGVNFPRLLVDEKAPAGQYELNEAAFEKMVNI 119
D+ + ++ A I + G ID LVN AGV P L+ + L++ +E ++
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLI---------HSLSDEEWEATFSV 115

Query: 120 NQKGVFLMSQAVARQMVKQHDGVIVNVSSESGLEGSEGQSCYAATKAALNSFTRSWSKEL 179
N GVF S++V++ M+ + G IV V S + YA++KAA FT+ EL
Sbjct: 116 NSTGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLEL 175

Query: 180 GKHGIRVVGIAPGILEKTGLRTPEYEEALAWTRNITVEQLREGYT---KNAIPIGRAGRL 236
++ IR ++PG E + W EQ+ +G K IP+ + +
Sbjct: 176 AEYNIRCNIVSPGSTETDMQWS-------LWADENGAEQVIKGSLETFKTGIPLKKLAKP 228

Query: 237 AEVADFVCYLLSERASYITGVTTNIAGGKTRG 268
+++AD V +L+S +A +IT + GG T G
Sbjct: 229 SDIADAVLFLVSGQAGHITMHNLCVDGGATLG 260


85  S3645  S3663  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
S3645     -1      17       2.617400  B12-dependent methionine synthase
S3646     -1      16       2.752682  IclR family transcriptional regulator
S3649     0       18       3.153882  isocitrate lyase
S3651     0       16       3.083966  homoserine O-succinyltransferase
S3652     0       17       3.668129  hypothetical protein
S3657     1       16       3.611155  *bifunctional
S3658     1       15       3.150914  phosphoribosylamine--glycine ligase
S3659     1       13       1.529945  transcriptional regulatory protein ZraR
S3660     1       14       1.318742  sensor protein ZraS
S3661     0       15       1.802722  zinc resistance protein
S3662     0       15       1.680535  hypothetical protein
S3663     0       15       1.136053  transcriptional regulator HU subunit alpha
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3645     BCTERIALGSPD  34     0.004    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D

signature.
Length = 660

Score = 34.1 bits (78), Expect = 0.004
Identities = 20/87 (22%), Positives = 37/87 (42%), Gaps = 17/87 (19%)

Query: 343 SGLEPLNIGDDSLFVNVGERTN---VTGSA----KFKRLIKEEKYSEALDVARQQVENGA 395
+P+ D ++ + +TN VT + +R+I + LD+ R QV A
Sbjct: 298 QAAKPVAALDKNIIIKAHGQTNALIVTAAPDVMNDLERVIAQ------LDIRRPQVLVEA 351

Query: 396 QIIDINMDEGMLDAEAAMVRFLNLIAG 422
I ++ D L+ +++ N AG
Sbjct: 352 IIAEVQ-DADGLNLG---IQWANKNAG 374


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3649     BINARYTOXINB  32     0.004    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 32.3 bits (73), Expect = 0.004
Identities = 14/58 (24%), Positives = 23/58 (39%)

Query: 289 ETSTPDLELARRFAQAIHAKYPGKLLAYNCSPSFNWQKNLDDKTIASFQQQLSDMGYK 346
ET+ PD+ L A P L Y + N D +T + + QL+++
Sbjct: 544 ETTKPDMTLKEALKIAFGFNEPNGNLQYQGKDITEFDFNFDQQTSQNIKNQLAELNAT 601


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3652     SACTRNSFRASE  31     0.001    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 30.7 bits (69), Expect = 0.001
Identities = 15/54 (27%), Positives = 20/54 (37%), Gaps = 5/54 (9%)

Query: 78 IDPDVCGCGVGRMLVEHALSMAPE-----LTTNVNEQNEQAVGFYKKVGFKVTG 126
+ D GVG L+ A+ A E L + N A FY K F +
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S3659     HTHFIS  525    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 525 bits (1355), Expect = 0.0
Identities = 183/468 (39%), Positives = 253/468 (54%), Gaps = 35/468 (7%)

Query: 8 ILVVDDDISHCTILQALLRGWGYNVALANSGRQALEQVREQVFDLVLCDVRMAEMDGIAT 67
ILV DDD + T+L L GY+V + ++ + DLV+ DV M + +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 68 LKEIKALNPAIPVLIMTAYSSVETAVEALKTGAQDYLIKPLDFDNLQATLEKALAHTHSI 127
L IK P +PVL+M+A ++ TA++A + GA DYL KP D L + +ALA
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRR 125

Query: 128 DAETPAVTASQFGMVGKSPAMQHLLSEIALVAPSEATVLIHGDSGTGKELVARAIHASSA 187
++ + +VG+S AMQ + +A + ++ T++I G+SGTGKELVARA+H
Sbjct: 126 PSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGK 185

Query: 188 RSEKPLVTLNCAALNESLLESELFGHEKGAFTGADKRREGRFVEADGGTLFLDEIGDISP 247
R P V +N AA+ L+ESELFGHEKGAFTGA R GRF +A+GGTLFLDEIGD+
Sbjct: 186 RRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPM 245

Query: 248 MMQVRLLRAIQEREVQRVGSNQTISVDVRLIAATHRDLAAEVNAGRFRQDLYYRLNVVAI 307
Q RLLR +Q+ E VG I DVR++AAT++DL +N G FR+DLYYRLNVV +
Sbjct: 246 DAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPL 305

Query: 308 EVPSLRQRREDIPLLAGHFLQRFAERNRKAVKGFTPQAMDLLIHYDWPGNIRELENAVER 367
+P LR R EDIP L HF+Q+ + VK F +A++L+ + WPGN+RELEN V R
Sbjct: 306 RLPPLRDRAEDIPDLVRHFVQQAEKEGLD-VKRFDQEALELMKAHPWPGNVRELENLVRR 364

Query: 368 AVVLLTGEYISERELPLAIASTPIPLGQSQDIQP-------------------------- 401
L + I+ + + S +
Sbjct: 365 LTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALP 424

Query: 402 --------LVEVEKEVILAALEKTGGNKTEAARQLGITRKTLLAKLSR 441
L E+E +ILAAL T GN+ +AA LG+ R TL K+
Sbjct: 425 PSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRE 472


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S3660     PF06580  36     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.4 bits (84), Expect = 2e-04
Identities = 49/262 (18%), Positives = 104/262 (39%), Gaps = 43/262 (16%)

Query: 197 ILFALATVLLA-SVLSFFW-YRRYLRSRQLLQDEMKRKEKLVALGHLAAGV-AHEIRNPL 253
I+F + V S+L F W + + + ++ Q +M + L L A + H + N L
Sbjct: 120 IIFNVVVVTFMWSLLYFGWHFFKNYKQAEIDQWKMASMAQEAQLMALKAQINPHFMFNAL 179

Query: 254 SSIKGLAKYFAERAPAGGEAHQLAQVM---AKEADRLNRVVSELLELVKPTHLALQAVDL 310
++I+ L +A L+++M + ++ +++ L +V ++L L ++
Sbjct: 180 NNIRALILEDPTKAREM--LTSLSELMRYSLRYSNARQVSLADELTVVD-SYLQLASIQF 236

Query: 311 NTLINHSLQLVSQDANSREIQLRFTANDTLPEIQADPDRLTQVLL-NLYLNAIQAIGQHG 369
+ Q+ + ++Q+ P L Q L+ N + I + Q G
Sbjct: 237 EDRLQFENQI---NPAIMDVQV--------------PPMLVQTLVENGIKHGIAQLPQGG 279

Query: 370 VISVTASESGAGVKISVTDSGKGIAADQLEAIFTPYFTTKAEGTGLGLAVVHNIVEQHGG 429
I + ++ V + V ++G + E TG GL V ++ G
Sbjct: 280 KILLKGTKDNGTVTLEVENTGSLALKNT------------KESTGTGLQNVRERLQMLYG 327

Query: 430 ---TIQVASQEGKGSTFTLWLP 448
I+++ ++GK + +P
Sbjct: 328 TEAQIKLSEKQGKV-NAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S3663     DNABINDINGHU  120    2e-39    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 120 bits (302), Expect = 2e-39
Identities = 50/89 (56%), Positives = 66/89 (74%)

Query: 2 NKTQLIDVIAEKAELSKTQAKAALESTLAAITESLKEGDAVQLVGFGTFKVNHRAERTGR 61
NK LI +AE EL+K + AA+++ +A++ L +G+ VQL+GFG F+V RA R GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGKEIKIAAANVPAFVSGKALKDAVK 90
NPQTG+EIKI A+ VPAF +GKALKDAVK
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAVK 91
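
Each hit in this report carries two statistics lines ("Score = ..., Expect = ..." and "Identities = ..., Positives = ..., Gaps = ..."). For readers who want to post-process the report, the following is a minimal Python sketch of how those lines could be parsed and filtered. It is not part of PredictBias: the function names are hypothetical, the default threshold of 0.05 is taken from the "E-value (RPSBlast)" input parameter, and treating the threshold as inclusive is an assumption. The percentages printed in this report are consistent with truncation rather than rounding (for example 23/58 is shown as 39%), so the sketch truncates as well.

```python
import re

# Patterns for the two statistics lines that precede each alignment block.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
COUNT_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_score_line(line):
    """Return (bit_score, raw_score, e_value), or None if the line does not match."""
    m = SCORE_RE.search(line)
    if m is None:
        return None
    expect = m.group(3)
    # Assumption: some BLAST versions print tiny E-values as "e-100";
    # prefix a 1 so that float() accepts the string.
    if expect.startswith("e"):
        expect = "1" + expect
    return float(m.group(1)), int(m.group(2)), float(expect)


def parse_count_line(line):
    """Return {'Identities': (count, length), ...} for the fields present."""
    return {name: (int(a), int(b)) for name, a, b in COUNT_RE.findall(line)}


def percent(count, length):
    """Truncated percentage, matching the values printed in this report."""
    return 100 * count // length


def passes_threshold(e_value, threshold=0.05):
    """True if the hit would be kept under an inclusive E-value cutoff (assumed)."""
    return e_value <= threshold
```

Applied to the S3663 hit above, `parse_score_line("Score = 120 bits (302), Expect = 2e-39")` yields a bit score of 120.0, a raw score of 302 and an E-value of 2e-39, and `percent(50, 89)` gives 56, the identity percentage shown for that alignment.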


86  S3805  S3810  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S3805     2       21       1.561226   GTP-binding protein
S3806     0       16       1.505634   glutamine synthetase
S3807     0       14       0.994253   nitrogen regulation protein NR(II)
S3808     0       14       -0.127072  nitrogen regulation protein NR(I)
S3809     -1      11       -2.991067  coproporphyrinogen III oxidase
S3810     -1      14       -3.814178  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
S3805     TCRTETOQM  180    2e-51    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 180 bits (459), Expect = 2e-51
Identities = 97/445 (21%), Positives = 170/445 (38%), Gaps = 81/445 (18%)

Query: 4 KLRNIAIIAHVDHGKTTLVDKLLQQSGTFDSRAETQE--RVMDSNDLEKERGITILAKNT 61
K+ NI ++AHVD GKTTL + LL SG + D+ LE++RGITI T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 62 AIKWNDYRINIVDTPGHADFGGEVERVMSMVDSVLLVVDAFDGPMPQTRFVTKKAFAYGL 121
+ +W + ++NI+DTPGH DF EV R +S++D +L++ A DG QTR + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 122 KPIVVINKVDRPGARPDWVVDQVFD-------------LFVNLDATDEQLD--------- 159
I INK+D+ G V + + L+ N+ T+
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 160 --------------------------------FPIVYASALNGIAGLDHEDMAEDMTPLY 187
FP+ + SA N I G+D+ L
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNI-GIDN---------LI 231

Query: 188 QAIVDHVPAPDVDLDGPFQMQISQLDYNSYVGVIGIGRIKRGKVKPNQQVTIIDSEGKTR 247
+ I + + ++ +++Y+ + R+ G + V I + E
Sbjct: 232 EVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-- 289

Query: 248 NAKVGKVLGHLGLERIETDLAEAGDIVAITGLGELNISDTVCDTQNVEALPALSVDEPTV 307
K+ ++ + E + D A +G+IV + L ++ + DT+ + + P +
Sbjct: 290 --KITEMYTSINGELCKIDKAYSGEIVILQNEF-LKLNSVLGDTKLLPQRERIENPLPLL 346

Query: 308 SMFFCVNTSPFSGKEGKFVTSRQILDRLNKKLVHNVALRVEETEDADAFRVSGRGELHLS 367
+ + D L LR +S G++ +
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPL---------LRYYVDSATHEIILSFLGKVQME 397

Query: 368 VLIENMRRE-GFELAVSRPKVIFRE 391
V ++ + E+ + P VI+ E
Sbjct: 398 VTCALLQEKYHVEIEIKEPTVIYME 422



Score = 32.5 bits (74), Expect = 0.005
Identities = 13/75 (17%), Positives = 29/75 (38%), Gaps = 1/75 (1%)

Query: 398 EPYENVTLDVEEQHQGSVMQALGERKGDLKNMNPDGKGRVRLDYVIPSRGLIGFRSEFMT 457
EPY + + +++ + ++ + V L IP+R + +RS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKN-NEVILSGEIPARCIQEYRSDLTF 595

Query: 458 MTSGTGLLYSTFSHY 472
T+G + + Y
Sbjct: 596 FTNGRSVCLTELKGY 610


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S3807     PF06580  28     0.042    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.3 bits (63), Expect = 0.042
Identities = 34/190 (17%), Positives = 72/190 (37%), Gaps = 41/190 (21%)

Query: 171 IIEQADRLRNLVDRL---LGPQLPGTRVTE-SIHKVAERV---VTLVSMELPDNVRLIRD 223
I+E + R ++ L + L + + S+ V + L S++ D ++
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 224 YDPSLPELAHDPDQIEQVLLN-IVRNALQ---ALGPEGGEIILRTRTAFQLTLHGERYRL 279
+P++ ++ Q+ +L+ +V N ++ A P+GG+I+L+
Sbjct: 246 INPAIMDV-----QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGT------KDNGTVT- 293

Query: 280 AARIDVEDNGPGIPPHLQDTLFYPMVSGREGGTGLGLSIARNLIDQHSGK---IEFTSWP 336
++VE+ G + ++ TG GL R + G I+ +
Sbjct: 294 ---LEVENTGSLALKNTKE------------STGTGLQNVRERLQMLYGTEAQIKLSEKQ 338

Query: 337 GHTEFSVYLP 346
G V +P
Sbjct: 339 GKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S3808     HTHFIS  597    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 597 bits (1542), Expect = 0.0
Identities = 206/478 (43%), Positives = 300/478 (62%), Gaps = 11/478 (2%)

Query: 1 MQRGIVWVVDDDSSIRWVLERALAGAGLTCTTFENGAEVLEALASKTPDVLLSDIRMPGM 60
M + V DDD++IR VL +AL+ AG N A + +A+ D++++D+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGLALLKQIKQRHPMLPVIIMTAHSDLDAAVSAYQQGAFDYLPKPFDIDEAVALVERAIS 120
+ LL +IK+ P LPV++M+A + A+ A ++GA+DYLPKPFD+ E + ++ RA++
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 HYQEQQQPRNIQLNGPTTDIIGEAQAMQDVFRIIGRLSRSSISVLINGESGTGKELVAHA 180
+ + ++G + AMQ+++R++ RL ++ ++++I GESGTGKELVA A
Sbjct: 121 EPKRRPSKLEDDSQDGM-PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 181 LHRHSPRTKAPFIALNMAAIPKDLIESELFGHEKGAFTGANTIRQGRFEQADGGTLFLDE 240
LH + R PF+A+NMAAIP+DLIESELFGHEKGAFTGA T GRFEQA+GGTLFLDE
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 241 IGDMPLDVQTRLLRVLADGQFYRVGGYAPVKVDVRIIAATHQNLEQRVQEGKFREDLFHR 300
IGDMP+D QTRLLRVL G++ VGG P++ DVRI+AAT+++L+Q + +G FREDL++R
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 301 LNVIRVHLPPLRERREDIPRLARHFLQVAARELGVEAKLLHPETEAALTRLAWPGNVRQL 360
LNV+ + LPPLR+R EDIP L RHF+Q A +E G++ K E + WPGNVR+L
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 361 ENTCRWLTVMAAGQEVLIQDLPGELFESTVAESTSQMQPDSWATLLAQWADRALRS---- 416
EN R LT + + + + EL + S + ++Q + +R
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 417 -----GHQNLLSEAQPELERTLLTTALRHTQGHKQEAARLLGWGRNTLTRKLKELGME 469
L E+E L+ AL T+G++ +AA LLG RNTL +K++ELG+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
S3810     SECA  30     0.004    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 30.2 bits (68), Expect = 0.004
Identities = 11/71 (15%), Positives = 30/71 (42%)

Query: 14 AKARRKTREELDQEARDRKRQKKRRGHAPGSRAAGGNTTSGSKGQNAPKDPRIGSKTPIP 73
+K + + EE+++ + R+ + +R ++ + + + ++G P P
Sbjct: 827 SKVQVRMPEEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCP 886

Query: 74 LGVTEKVTKQH 84
G +K + H
Sbjct: 887 CGSGKKYKQCH 897


87  S3969  S3980  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S3969     0       15       0.332429   ribonucleoside transporter
S3970     -1      14       1.120476   hypothetical protein
S3971     -1      14       1.631370   hypothetical protein
S3972     -1      15       2.769660   cryptic adenine deaminase
S3973     0       16       3.121655   sugar phosphate antiporter
S3974     1       17       3.822097   regulatory protein UhpC
S3975     1       17       4.350523   sensory histidine kinase UhpB
S3976     2       18       3.898254   DNA-binding transcriptional activator UhpA
S3977     2       15       2.992422   acetolactate synthase 1 regulatory subunit
S3978     2       15       3.198085   acetolactate synthase catalytic subunit
S3979     1       16       1.649356   ilvB operon leader peptide
S3980     1       15       1.205090   multidrug resistance protein D
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S3969     TCRTETA  38     5e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 37.9 bits (88), Expect = 5e-05
Identities = 33/208 (15%), Positives = 71/208 (34%), Gaps = 13/208 (6%)

Query: 33 IIVEFLPVSLLTP----MAQDLGISEGVA---GQSVTVTAFVAMFASLFITQTIQATDRR 85
+ ++ + + L+ P + +DL S V G + + A + + + RR
Sbjct: 14 VALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRR 73

Query: 86 YVVILFAVLLTLSCLLVSFANSFSLLLIGRACLGLALGGFWAMSASLTMRLVPPRTVPKA 145
V+++ + +++ A +L IGR G+ G A++ + + +
Sbjct: 74 PVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDGDERARH 132

Query: 146 LSVIFGAVSIALVIAAPLGSFLGELIGWRNVFNAAAVMG----VLCIFWIIKSLPSLPGE 201
+ +V LG +G F AAA + + F + +S
Sbjct: 133 FGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRP 191

Query: 202 PSHQKQNTFRLLQRPGVMAGMIAIFMSF 229
+ N + M + A+ F
Sbjct: 192 LRREALNPLASFRWARGMTVVAALMAVF 219


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S3972     UREASE  40     3e-05    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 39.7 bits (93), Expect = 3e-05
Identities = 30/105 (28%), Positives = 43/105 (40%), Gaps = 17/105 (16%)

Query: 22 AVSRGDAVADYIIDNVSILDLINAGEISGPIVIKGRYIAGVG-AEYADT---------PA 71
V+R D +I N ILD + G + I +K IA +G A D P
Sbjct: 60 QVTREGGAVDTVITNALILD--HWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPG 117

Query: 72 LQRIDARGATAVPGFIDAHLHIESSMMTPVTFETATLPRGLTTVI 116
+ I G G +D+H+H + P E A L GLT ++
Sbjct: 118 TEVIAGEGKIVTAGGMDSHIH----FICPQQIEEA-LMSGLTCML 157


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S3973     TCRTETB  34     0.001    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.1 bits (78), Expect = 0.001
Identities = 28/168 (16%), Positives = 61/168 (36%), Gaps = 17/168 (10%)

Query: 49 FNIAQNDMISTYGLSMTQLGMIGLGFSITYGVGKTLVSYYADGKNTKQFLPFMLILSAIC 108
N++ D+ + + + F +T+ +G + +D K+ L F +I++ C
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIIN--C 90

Query: 109 MLGFSASMGSGSVSLFLMIAFYALSGFFQSTGGSCSYSTI----TKWTPRRKRGTFLGFW 164
+G SL +M + F Q G + + + ++ P+ RG G
Sbjct: 91 FGSVIGFVGHSFFSLLIM------ARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLI 144

Query: 165 NISHNLGGAGAAGVALFGANYLFDGHVIGMFIFPSIIALIVGFIGLRY 212
+G + A+Y+ + + + P + I+ L
Sbjct: 145 GSIVAMGEGVGPAIGGMIAHYIHWSY---LLLIP--MITIITVPFLMK 187


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S3974     TCRTETB  41     8e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 40.6 bits (95), Expect = 8e-06
Identities = 65/408 (15%), Positives = 137/408 (33%), Gaps = 60/408 (14%)

Query: 29 RHILLTIWLGYALFY--FTRKSFNAAVPEILANGVLSRSDIGLLATLFYITYGVSKFVSG 86
RH + IWL F+ N ++P+I + + + T F +T+ + V G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 87 IVSDRSNARYFMGIGLIATGIMNILFGFSTSLWAFAVLWVLNAFFQGWGS---PVCARLL 143
+SD+ + + G+I +++ S F L ++ F QG G+ P ++
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHS---FFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 144 TAWY-SRTERGGWWALWNTAHNVGGALIPIVMAASALHYGWRAGMMIAGCMAIVVGIFLC 202
A Y + RG + L + +G + P + A + W ++ M ++ +
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIPMITIITVPF- 184

Query: 203 WRLRDRPQALGLPAVGEWRHDALEIAQQQEGAGLTRKEILTKYVLLNPYIWLLSFCYVLV 262
+ L +I G L I+ + Y VL
Sbjct: 185 --------LMKLLKKEVRIKGHFDIK----GIILMSVGIVFFMLFTTSYSISFLIVSVLS 232

Query: 263 YVV-----RAAINDWGNLYMSETLGVDLVTANTAVTMFELGGFI-----------GALVA 306
+++ R + + + + + + + + + GF+ A
Sbjct: 233 FLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTA 292

Query: 307 GWGSDKLFNGNRGPMNLIFAAGILL-SVGSLWLMPFASYVMQATCFFTIGFFVFGPQMLI 365
GS +F G + + GIL+ G L+++ + + F T F + +
Sbjct: 293 EIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL-SVSFLTASFLLETTSWFM 351

Query: 366 ---------GMAAAECS---------HKEAAGAATGFVGLFAYLGASL 395
G++ + ++ AGA + ++L
Sbjct: 352 TIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGT 399


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S3975     PF06580  40     2e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.8 bits (93), Expect = 2e-05
Identities = 28/142 (19%), Positives = 56/142 (39%), Gaps = 11/142 (7%)

Query: 365 LRPRQLDDLTLEQAIRSLMREMELEGRGIVSHLEWRIDESALSENQRVTLFRVCQEGLNN 424
LR ++L + + ++L L++ + + +V + Q + N
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPM-LVQTLVEN 266

Query: 425 IVKHA-----DASAVTLQGWQQDERLMLVIEDDGSGLPPGSGQ-QGFGLTGMRERVTALG 478
+KH + L+G + + + L +E+ GS + + G GL +RER+ L
Sbjct: 267 GIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLY 326

Query: 479 G---TLHISCLHG-TRVSVSLP 496
G + +S G V +P
Sbjct: 327 GTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S3976     HTHFIS  61     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.4 bits (149), Expect = 2e-13
Identities = 29/174 (16%), Positives = 59/174 (33%), Gaps = 20/174 (11%)

Query: 2 ITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGRGVQVCICDISMPDIS 61
T+ + DD +R+ Q L V + + + + D+ MPD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLELLSQLPK---GMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHTVATG 118
+LL ++ K + +++S ++ +A GA +L K ELI +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR---- 117

Query: 119 GCYLTPDIAIKLASGRQDPLTKRERQVAEKLAQG---MAVKEIAAELGLSPKTV 169
A+ R L + + + + + A L + T+
Sbjct: 118 --------ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTL 163


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S3980     TCRTETB  57     4e-11    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 57.2 bits (138), Expect = 4e-11
Identities = 39/176 (22%), Positives = 76/176 (43%), Gaps = 1/176 (0%)

Query: 2 LVLLVAVGQMAQTIYIPAIADMARDLNVREGAVQSVMGAYLLTYGVSQLFYGPISDRVGR 61
L +L + + + ++ D+A D N + V A++LT+ + YG +SD++G
Sbjct: 19 LCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGI 78

Query: 62 RPVILVGMSIFMLATLVA-VTTSSLTVLIAASAMQGMGTGVGGVMARTLPRDLYERTQLR 120
+ ++L G+ I +++ V S ++LI A +QG G + + +
Sbjct: 79 KRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRG 138

Query: 121 HANSLLNMGILVSPLLAPLIGGLLDTMWNWRACYLFLLVLCAGVTFSMARWMPETR 176
A L+ + + + P IGG++ +W L ++ V F M E R
Sbjct: 139 KAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVR 194


88  S4183  S4188  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S4183     1       13       1.004118   outer membrane lipoprotein
S4184     -1      10       -0.244811  biotin sulfoxide reductase
S4185     -2      11       -1.222997  hypothetical protein
S4186     -2      12       -0.519689  3-methyladenine DNA glycosylase
S4187     -2      14       0.542717   lipase
S4188     -2      18       1.225654   resistance protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
S4183     OMPADOMAIN  112    9e-32    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 112 bits (281), Expect = 9e-32
Identities = 40/122 (32%), Positives = 61/122 (50%), Gaps = 11/122 (9%)

Query: 105 LNMPNNVTFDSSSAPLKPAGANTLTGVAMVLKEY--PKTAVNVIGYTDSTGGHDLNMRLS 162
+ ++V F+ + A LKP G L + L +V V+GYTD G N LS
Sbjct: 215 FTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLS 274

Query: 163 QQRADSVASALITQGVDASRIRTQGLGPANPIASNSTAEGK---------AQNRRVEITL 213
++RA SV LI++G+ A +I +G+G +NP+ N+ K A +RRVEI +
Sbjct: 275 ERRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334

Query: 214 SP 215

Sbjct: 335 KG 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4185     SACTRNSFRASE  35     5e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 34.9 bits (80), Expect = 5e-05
Identities = 16/52 (30%), Positives = 22/52 (42%), Gaps = 5/52 (9%)

Query: 76 VAPKAVRRGIGKALMQYV-----QQRYPHLMLEVYQKNQPAIDFYRAQGFHI 122
VA ++G+G AL+ + + LMLE N A FY F I
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFII 148


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4187     ECOLNEIPORIN  28     0.039    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 27.8 bits (62), Expect = 0.039
Identities = 19/90 (21%), Positives = 37/90 (41%), Gaps = 13/90 (14%)

Query: 119 SMYNEFGDSTTTLTDPLWHASVSSLGWRVDSRLGDLRPWAQISYNQQFGENIWKAQSGLS 178
S+ + D+ + H S + + + R G++ P ++SY F +
Sbjct: 228 SVAVQQQDAKLV-EENYSHNSQTEVAATLAYRFGNVTP--RVSYAHGFKGSF-------- 276

Query: 179 RMTATNQNGNWLDVTVGADMLLNQNIAAYA 208
ATN N ++ V VGA+ ++ +A
Sbjct: 277 --DATNYNNDYDQVVVGAEYDFSKRTSALV 304


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S4188     TCRTETA  43     1e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 42.9 bits (101), Expect = 1e-06
Identities = 47/275 (17%), Positives = 94/275 (34%), Gaps = 32/275 (11%)

Query: 44 PVSQVAFSFGLLSLGLAIS----SSVAGKLQERFGVKRVTVASGILLGLGFFLTAHSNNL 99
+ V +G+L A+ + V G L +RFG + V + S + + + A + L
Sbjct: 37 HSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFL 96

Query: 100 MMLWLS---AGVLVGLADGAGYLL----TLSNCVKWFPERKGLISAFAIGSYGLGSLGFK 152
+L++ AG+ AG + + F + LG
Sbjct: 97 WVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGG---- 152

Query: 153 FIDTQLLETVGLEKTFVIWGAIALVMIVFGATLMKDAPKQEVKTSNGVVEKDYTLAESMR 212
L+ F A+ + + G L+ ++ K E + R
Sbjct: 153 -----LMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWAR 207

Query: 213 --KPQYWMLAVMFLTACMSG----LYVIGVAKDIAQSLAHLDVVSAANAVTVISIAN-LS 265
++AV F+ + L+VI + H D + ++ I + L+
Sbjct: 208 GMTVVAALMAVFFIMQLVGQVPAALWVI-----FGEDRFHWDATTIGISLAAFGILHSLA 262

Query: 266 GRLVLGILSDKIARIRVITIGQVISLVGMAALLFA 300
++ G ++ ++ R + +G + G L FA
Sbjct: 263 QAMITGPVAARLGERRALMLGMIADGTGYILLAFA 297



Score = 36.0 bits (83), Expect = 2e-04
Identities = 37/155 (23%), Positives = 64/155 (41%), Gaps = 9/155 (5%)

Query: 241 AQSLAHLDVVSAANAVTVISIANLSGRLVLGILSDKIARIRVITIGQVISLVGMAALLFA 300
AH ++ A A+ + A + G L SD+ R V+ + + V A + A
Sbjct: 39 NDVTAHYGILLALYALMQFACAPVLGAL-----SDRFGRRPVLLVSLAGAAVDYAIMATA 93

Query: 301 PLNAVTFFAAIACVAFNFGGTITVFPSLVSEFFGLNNLAKNYGVIYLGFGIGSIFGSIIA 360
P V + I VA G T V + +++ + A+++G + FG G + G ++
Sbjct: 94 PFLWVLYIGRI--VAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLG 151

Query: 361 SLFGGF--YVTFYVIFALLILSLALSTTIRQPEQK 393
L GGF + F+ AL L+ + K
Sbjct: 152 GLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHK 186


89  S4260  S4265  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S4260     0       19       3.604318   hypothetical protein
S4261     0       19       3.412647   ATP-binding component of a transport system
S4262     1       20       3.668811   transporter
S4263     1       21       4.485417   hypothetical protein
S4264     0       22       4.412518   nickel responsive regulator
S4265     0       22       4.479200   nickel transporter ATP-binding protein NikE
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
S4260     RTXTOXIND  84     4e-20    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 84.5 bits (209), Expect = 4e-20
Identities = 71/408 (17%), Positives = 139/408 (34%), Gaps = 81/408 (19%)

Query: 6 RHLAWWVVGALAVAAVVAWWLLRPAGVP-EGFAVSNGRIEATEVDIASKIAGRIDTILVK 64
R +A++++G L +A +++ G +GR + I + I+VK
Sbjct: 58 RLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKE----IKPIENSIVKEIIVK 113

Query: 65 EGQFVREGEVLAKMDTRV----------------LQEQRLEAI----------------- 91
EG+ VR+G+VL K+ L++ R + +
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 92 -------------------AQIKEAQSAVAAAQALLEQRQSETRAAQSLVNQRQAELDSV 132
Q Q+ + L+++++E + +N+ +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 133 AKRHTRSRSLAQRGAISAQQLDDDRAAAESARAALESAKAQVSASKAAIEAARTNIIQ-- 190
R SL + AI+ + + A L K+Q+ ++ I +A+
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 191 -----------AQTRVEAAQATERRIAADID--DSELKAPRDGRV-QYRVAEPGEVLAAG 236
QT T + S ++AP +V Q +V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 237 GRVLNMVDLSDVY-MTFFLPTEQAGTLKLGGEARLILDAAPDLRIPATISFVASVAQFTP 295
++ +V D +T + + G + +G A + ++A P R V V
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYG---YLVGKVKNINL 410

Query: 296 KTVETSDERLKLMFRVKARIPPELLQQHLEYV--KTGLPGVAWVRVNE 341
+E D+RL L+F V I L + + +G+ A ++
Sbjct: 411 DAIE--DQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S4261     PF05272  30     0.045    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.045
Identities = 9/26 (34%), Positives = 14/26 (53%)

Query: 37 ARCMVGLIGPDGVGKSSLLSLISGAR 62
V L G G+GKS+L++ + G
Sbjct: 595 FDYSVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4262     ABC2TRNSPORT  50     5e-09    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 49.9 bits (119), Expect = 5e-09
Identities = 41/171 (23%), Positives = 73/171 (42%), Gaps = 7/171 (4%)

Query: 200 REREHGTVEHLLVMPITPFEIMMAKI-WSMGLVVLVVSGLSLVLMVKGVLGVPIEGSIPL 258
R T E +L + +I++ ++ W+ L +G+ +V G + L
Sbjct: 93 RMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGY----TQWLSLL 148

Query: 259 FMLGV-ALSLFATTSIGIFMGTIARSMPQLGLLVILVLLPLQMLSGGSTPRESMPQMVQD 317
+ L V AL+ A S+G+ + +A S LV+ P+ LSG P + +P + Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 318 IMLTMPTTHFVSLAQAILYRGAGFEIVWPQFLTLMAIGGAFF-TIALLRFR 367
+P +H + L + I+ ++ + I FF + ALLR R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S4265     HTHFIS  29     0.018    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.4 bits (66), Expect = 0.018
Identities = 10/34 (29%), Positives = 19/34 (55%)

Query: 25 QAVLNNVSLTLKSGETVALLGRSGCGKSTLARLL 58
Q + ++ +++ T+ + G SG GK +AR L
Sbjct: 147 QEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180


90  S4292  S4299  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S4292     -1      20       2.328036   glycerol-3-phosphate transporter periplasmic
S4293     -1      19       2.359071   glycerol-3-phosphate transporter permease
S4294     0       16       1.475126   glycerol-3-phosphate transporter membrane
S4296     -1      15       1.194248   glycerophosphodiester phosphodiesterase
S4297     1       15       1.079324   hypothetical protein
S4298     1       15       0.962184   gamma-glutamyltranspeptidase
S4299     0       16       0.987626   acetyltransferase YhhY
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
S4292     MALTOSEBP  40     2e-05    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 39.7 bits (92), Expect = 2e-05
Identities = 41/160 (25%), Positives = 68/160 (42%), Gaps = 14/160 (8%)

Query: 134 GHLLSQPFNSSTSVLYYNKDAFKKAGLDPEQPPKTWQDLADYSAKLKASGIKCGYASGWQ 193
G L++ P L YNKD L P PPKTW+++ +LKA G + +
Sbjct: 127 GKLIAYPIAVEALSLIYNKD------LLP-NPPKTWEEIPALDKELKAKGKSALMFNLQE 179

Query: 194 GWIQLENFSAWNGLPFASKNNGFDGTDAVLEF--NKPEQVKHIAMLEEMNKKGDFSYVGR 251
+ +A G F +N +D D ++ K + +++ + D Y
Sbjct: 180 PYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDY--- 236

Query: 252 KDESTEKFYNGDCAMTTASSGSLANIREYAKFNYGVGMMP 291
+ F G+ AMT + +NI + +K NYGV ++P
Sbjct: 237 -SIAEAAFNKGETAMTINGPWAWSNI-DTSKVNYGVTVLP 274


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S4296     PF04619  28     0.017    Dr-family adhesin
		>PF04619#Dr-family adhesin

Length = 160

Score = 28.4 bits (63), Expect = 0.017
Identities = 12/60 (20%), Positives = 22/60 (36%), Gaps = 4/60 (6%)

Query: 29 VGAKYGHKMIEFDAKLSKDGEIFLLHDDNLERTSNGWGVAGELNWQD----LLRVDAGSW 84
+G ++ D + G+ FL+ D+N ++ W + D GSW
Sbjct: 70 LGCDARQVALKADTDNFEQGKFFLISDNNRDKLYVNIRPTDNSAWTTDNGVFYKNDVGSW 129


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
S4298     NAFLGMOTY  32     0.007    Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 31.6 bits (71), Expect = 0.007
Identities = 27/82 (32%), Positives = 37/82 (45%), Gaps = 17/82 (20%)

Query: 275 RTPISGDYRGYQVYSMPPPSSGGIHIVQILNI--LENFDMKKYGF-GSADAMQIMAEAEK 331
R P+ G+ R + SMPPP G H +I N+ + FD G+ G A I++E EK
Sbjct: 77 RRPM-GETRNVSLISMPPPWRPGEHADRITNLKFFKQFD----GYVGGQTAWGILSELEK 131

Query: 332 YAYADRSEYLGDPDFVKVPWQA 353
Y P F WQ+
Sbjct: 132 GRY---------PTFSYQDWQS 144


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4299     SACTRNSFRASE  37     1e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 37.2 bits (86), Expect = 1e-05
Identities = 21/92 (22%), Positives = 33/92 (35%), Gaps = 16/92 (17%)

Query: 55 VACIDGDVVGHLTIDVQQRPRRSHVADFGICVDSRWKNRGVASALMREMIE------MCD 108
+ ++ + +G + I + + D + D R K GV +AL+ + IE C
Sbjct: 69 LYYLENNCIGRIKIR-SNWNGYALIEDIAVAKDYRKK--GVGTALLHKAIEWAKENHFCG 125

Query: 109 NWLRVDRIELTVFVDNAPAIKVYKKFGFEIEG 140
L I N A Y K F I
Sbjct: 126 LMLETQDI-------NISACHFYAKHHFIIGA 150


91  S4389  S4398  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S4389     -1      14       2.162143   phosphoribulokinase
S4390     -2      14       2.425626   hypothetical protein
S4391     -1      14       2.543056   hydrolase
S4392     1       12       1.760365   ABC transporter ATP-binding protein
S4393     -1      13       0.067845   glutathione-regulated potassium-efflux system
S4394     0       13       -0.108193  glutathione-regulated potassium-efflux system
S4395     3       23       -1.711725  FKBP-type peptidyl-prolyl cis-trans isomerase
S4396     3       20       -2.678807  hypothetical protein
S4397     3       23       -2.450467  FKBP-type peptidyl-prolyl cis-trans isomerase
S4398     2       25       -1.828832  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S4389     PF07299  32     0.002    Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 31.8 bits (72), Expect = 0.002
Identities = 10/46 (21%), Positives = 21/46 (45%), Gaps = 2/46 (4%)

Query: 71 PEANDFGLLEQTFIEYGQSGKGKSRKYLHTYDEAVPWNQVPGTFTP 116
P+ + + E ++ KG SRK++ ++ + + GTF
Sbjct: 112 PDMEELDMKELSY--LSWIDKGSSRKFIIAKNDKNKFVGLQGTFQS 155


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
S4392     GPOSANCHOR  33     0.005    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.7 bits (74), Expect = 0.005
Identities = 28/152 (18%), Positives = 54/152 (35%), Gaps = 22/152 (14%)

Query: 504 KVEPFDGDLEDYQQWLSDVQKQENQTDEAPKENANSAQARKDQKRREAELRAQTQPLRKE 563
+ D + ++ E + + ++ R+ +R R + L E
Sbjct: 272 AMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAE 331

Query: 564 IARLEKEME---------------------KLNAQLAQAEEKLGDSELYDQSRKAELTAC 602
+LE++ + +L A+ + EE+ SE QS + +L A
Sbjct: 332 HQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDAS 391

Query: 603 LQQQASAKSGLEECEMAWLEAQEQLEQMLLEG 634
+ + + LEE L A E+L + L E
Sbjct: 392 REAKKQVEKALEEANSK-LAALEKLNKELEES 422



Score = 32.0 bits (72), Expect = 0.008
Identities = 13/125 (10%), Positives = 39/125 (31%), Gaps = 7/125 (5%)

Query: 513 EDYQQWLSDVQKQENQTDEAPKENANSAQARKDQKRREAELRAQTQPLRKEIARLEKEME 572
+ + ++ + E A A + D ++ + +++
Sbjct: 127 KALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFST-------ADSAKIK 179

Query: 573 KLNAQLAQAEEKLGDSELYDQSRKAELTACLQQQASAKSGLEECEMAWLEAQEQLEQMLL 632
L A+ A E + + E + TA + + ++ + ++ LE +
Sbjct: 180 TLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMN 239

Query: 633 EGQSN 637
++
Sbjct: 240 FSTAD 244


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4393     ISCHRISMTASE  32     0.001    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 31.9 bits (72), Expect = 0.001
Identities = 32/135 (23%), Positives = 51/135 (37%), Gaps = 16/135 (11%)

Query: 11 YAHPESQDSVANRVLLKPATQLSNVTVHDLYAHYPDFFIDIPREQALLREHEVIVFQH-- 68
Y P + D N+V P + + +HD+ ++ D F L + +
Sbjct: 9 YQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCV 68

Query: 69 ----PLYTYSCPALLKEWLDRVLSRGFASGPGGNQLAGKYWRNVITTGEPESA------Y 118
P+ + P DR L F GPG N +G Y +IT PE +
Sbjct: 69 QLGIPVVYTAQPGSQNP-DDRALLTDFW-GPGLN--SGPYEEKIITELAPEDDDLVLTKW 124

Query: 119 RYDALNRYPMSDVLR 133
RY A R + +++R
Sbjct: 125 RYSAFKRTNLLEMMR 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
S4394     60KDINNERMP  31     0.019    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 30.7 bits (69), Expect = 0.019
Identities = 13/69 (18%), Positives = 29/69 (42%), Gaps = 6/69 (8%)

Query: 226 TAIDPFKGLLLG---LFFISVGMSLNLGVLYTHL-LWVVISVVVLVAVKILVLYLLARLY 281
A+ P L + L+FIS + L +++ + W +++ V+ ++ L
Sbjct: 318 AAVAPHLDLTVDYGWLWFISQPLFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKA-- 375

Query: 282 GVRSSERMQ 290
S +M+
Sbjct: 376 QYTSMAKMR 384


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4397     INFPOTNTIATR  133    2e-40    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 133 bits (337), Expect = 2e-40
Identities = 80/226 (35%), Positives = 124/226 (54%), Gaps = 9/226 (3%)

Query: 28 AAKPATTADSKASFKNDDQKSAYALGASLGRYMENSLKEQEKLGIKLDKDQLIAGVQDAF 87
A A A S D K +Y++GA LG K + GI ++ D L G+QD
Sbjct: 14 AMSTAMAATDATSLTTDKDKLSYSIGADLG-------KNFKNQGIDINPDVLAKGMQDGM 66

Query: 88 A-DKSKLSDQEIEQTLQAFEARVKSSAQAKMEKDAADNEAKGKEYREKFAKEKGVKTSST 146
+ + L++++++ L F+ + + A+ K A +N+AKG + + G+ +
Sbjct: 67 SGAQLILTEEQMKDVLSKFQKDLMAKRSAEFNKKAEENKAKGDAFLSANKSKPGIVVLPS 126

Query: 147 GLVYQVVEAGKGEAPKDSDTVVVNYKGTLIDGKEFDNSYTRGEPLSFRLDGVIPGWTEGL 206
GL Y++++AG G P SDTV V Y GTLIDG FD++ G+P +F++ VIPGWTE L
Sbjct: 127 GLQYKIIDAGTGAKPGKSDTVTVEYTGTLIDGTVFDSTEKAGKPATFQVSQVIPGWTEAL 186

Query: 207 KNIKKGGKIKLVIPPELAYGKAGVPG-IPPNSTLVFDVELLDVKPA 251
+ + G ++ +P +LAYG V G I PN TL+F + L+ VK A
Sbjct: 187 QLMPAGSTWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISVKKA 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4398     ACRIFLAVINRP  29     0.021    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 29.0 bits (65), Expect = 0.021
Identities = 14/62 (22%), Positives = 29/62 (46%), Gaps = 1/62 (1%)

Query: 160 ASSVEDLVTQTLEFTIEEVNADRNV-SNNAKNRQIVLNLYEKGIFDIKDAINQVADRLNI 218
A +V+D VTQ +E + ++ + S + + + L + D A QV ++L +
Sbjct: 54 AQTVQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQL 113

Query: 219 SK 220
+
Sbjct: 114 AT 115


92  S4698  S4704  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
S4698     -1      17       0.479451   phosphoglycerate mutase
S4699     -1      14       -0.123872  right origin-binding protein
S4700     0       16       -0.587857  hypothetical protein
S4701                                 DNA-binding response regulator CreB
S4702                                 sensory histidine kinase CreC
S4703                                 hypothetical protein
S4704                                 two-component response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
S4698     VACCYTOTOXIN  29     0.014    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 29.2 bits (65), Expect = 0.014
Identities = 14/45 (31%), Positives = 20/45 (44%), Gaps = 4/45 (8%)

Query: 145 PLLVSHGIALGCLVSTILGLPAWAERRLRLRNCSISRVDYQESLW 189
P +V GIA G V T+ GL W ++ N D + +W
Sbjct: 42 PAIVG-GIATGAAVGTVSGLLGWGLKQAEEAN---KTPDKPDKVW 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S4701     HTHFIS  87     6e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 86.8 bits (215), Expect = 6e-22
Identities = 33/139 (23%), Positives = 60/139 (43%)

Query: 1 MQRETVWLVEDEQGIADTLVYMLQQEGFAVEVFERGLPVLDKARQQVPDVMILDVGLPDI 60
M T+ + +D+ I L L + G+ V + + D+++ DV +PD
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 SGFELCRQLLALHPALPVLFLTARSEEVDRLLGLEIGADDYVAKPFSPREVCARVRTLLR 120
+ F+L ++ P LPVL ++A++ + + E GA DY+ KPF E+ + L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 RVKKFSTPSPVIRIGHFEL 139
K+ + L
Sbjct: 121 EPKRRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
S4702     PF06580  33     0.003    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.9 bits (75), Expect = 0.003
Identities = 47/207 (22%), Positives = 80/207 (38%), Gaps = 51/207 (24%)

Query: 298 LTQNARMQAL---------VETL--LRQARLENRQEVVLTAVDVAALFR---RVSEARTV 343
+ Q A++ AL L +R LE+ + ++ L R R S AR V
Sbjct: 157 MAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYSLRYSNARQV 216

Query: 344 QLAE--KNITLHVM--------PTEVNVAAEPALLDQALGNLL-----DNA----IDFTP 384
LA+ + ++ + PA++D + +L +N I P
Sbjct: 217 SLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLP 276

Query: 385 ESGCITLSAEVDQEHVTLKVLDTGSGIPDYALSRIFERFYSLPRANGQKSSGLGLAFVSE 444
+ G I L D VTL+V +TGS N ++S+G GL V E
Sbjct: 277 QGGKILLKGTKDNGTVTLEVENTGSLALK----------------NTKESTGTGLQNVRE 320

Query: 445 -VARLFNGEVTLR-NVQEGGVLASLRL 469
+ L+ E ++ + ++G V A + +
Sbjct: 321 RLQMLYGTEAQIKLSEKQGKVNAMVLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
S4704     HTHFIS  82     4e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 81.8 bits (202), Expect = 4e-20
Identities = 30/122 (24%), Positives = 60/122 (49%), Gaps = 1/122 (0%)

Query: 1 MQTPHILIVEDELVTRNTLKSIFEAEGYDVFEATDGAEMHQILSEYDINLVIMDINLPGK 60
M IL+ +D+ R L GYDV ++ A + + ++ D +LV+ D+ +P +
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 NGLLLARELRE-QANVALMFLTGRDNEVDKILGLEIGADDYITKPFNPRELTIRARNLLS 119
N L +++ + ++ ++ ++ ++ + I E GA DY+ KPF+ EL L+
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 RT 121

Sbjct: 121 EP 122



 
Contact Sachin Pundhir for Bugs/Comments.