PredictBias

identification of genomic and pathogenicity islands in prokaryotic genome
 
A) Input parameters
Genome: MaurumNCTC10437.gb
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic): (none given)
 
C) Potential GIs and PAIs in LR134356
S.No  Start            End              Bias  Virulence  Insertion elements  Prediction
1     NCTC10437_00022  NCTC10437_00035  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00022  2       13       0.692075   protein kinase
NCTC10437_00023  2       13       0.533865   penicillin-binding transpeptidase
NCTC10437_00024  1       13       0.552325   cell cycle protein
NCTC10437_00025  1       13       1.378944   protein phosphatase 2C domain-containing
NCTC10437_00026  2       13       0.240367   FHA domain-containing protein
NCTC10437_00027  2       12       0.991314   FHA domain-containing protein
NCTC10437_00029  1       12       1.074154   *TetR family transcriptional regulator
NCTC10437_00030  2       13       1.207065   short-chain dehydrogenase/reductase SDR
NCTC10437_00031  2       15       1.498350   lipoprotein LppJ
NCTC10437_00032  2       18       1.771947   Alpha/beta hydrolase of uncharacterised function
NCTC10437_00033  4       21       3.405307   Uncharacterised protein
NCTC10437_00034  4       21       1.947641   Uncharacterised protein
NCTC10437_00035  5       21       1.392635   Protein of uncharacterised function (DUF322)
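Rows of the LocusTag table above can be split back into columns programmatically. The following is a minimal Python sketch, not part of PredictBias. It assumes that a locus tag ends in an underscore plus five digits, that DNBias is a single digit (optionally negative), and that %GCBias has one integer digit and six decimals. These widths are inferred from the rows in this report, not from any documented format. The names `BiasRow` and `parse_bias_row` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class BiasRow(NamedTuple):
    locus_tag: str
    dn_bias: int      # dinucleotide bias
    cdn_bias: int     # codon bias
    gc_bias: float    # %GC bias
    product: str


# Whitespace between columns is optional, so the pattern accepts both
# fused rows and rows whose columns are separated by spaces.
ROW_RE = re.compile(
    r"^(?P<locus>[A-Za-z0-9]+_\d{5})\s*"   # assumed: tag ends in _ + 5 digits
    r"(?P<dn>-?\d)\s*"                     # assumed: single digit, may be negative
    r"(?P<cdn>\d+?)\s*"                    # lazy, so %GCBias keeps its integer digit
    r"(?P<gc>-?\d\.\d{6})\s*"              # assumed: d.dddddd
    r"(?P<product>.*)$"
)


def parse_bias_row(line: str) -> Optional[BiasRow]:
    """Return the parsed row, or None for headers and other lines."""
    match = ROW_RE.match(line.strip())
    if match is None:
        return None
    return BiasRow(
        locus_tag=match["locus"],
        dn_bias=int(match["dn"]),
        cdn_bias=int(match["cdn"]),
        gc_bias=float(match["gc"]),
        product=match["product"],
    )


if __name__ == "__main__":
    print(parse_bias_row("NCTC10437_000222130.692075protein kinase"))
```

Rows whose bias values fall outside the assumed widths would be split incorrectly or rejected, so the result is worth spot-checking against the page.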
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00022  YERSSTKINASE  33     0.002    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 33.2 bits (75), Expect = 0.002
Identities = 32/112 (28%), Positives = 52/112 (46%), Gaps = 15/112 (13%)

Query: 135 AGLVHRDVKPGNILIT-PTGQVKLTDFGIAKAVDAAPVTQTGMVMGTAQYIAPEQALGH- 192
AG+VH D+KPGN++ +G+ + D G+ P G T + APE +G+
Sbjct: 264 AGVVHNDIKPGNVVFDRASGEPVVIDLGLHSRSGEQP---KGF---TESFKAPELGVGNL 317

Query: 193 DATAASDVYAL------GVVGYEAVSGKRPFTGEGALTVAMKHI-KENPPPL 237
A+ SDV+ + + G+E +P G +T H+ EN P+
Sbjct: 318 GASEKSDVFLVVSTLLHCIEGFEKNPEIKPNQGLRFITSEPAHVMDENGYPI 369
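
The percentages printed beside each count in the alignment statistics above appear to be truncated rather than rounded: 32/112 is 28.57%, which the report shows as 28%. The Python sketch below recomputes them under that assumption. The function name `check_alignment_stats` is illustrative and is not part of BLAST or PredictBias.

```python
import re

# Matches the statistics line of an alignment; the Gaps field is optional
# because ungapped alignments omit it.
STATS_RE = re.compile(
    r"Identities = (\d+)/(\d+) \((\d+)%\), "
    r"Positives = (\d+)/(\d+) \((\d+)%\)"
    r"(?:, Gaps = (\d+)/(\d+) \((\d+)%\))?"
)


def check_alignment_stats(line):
    """Map each statistic to (count, length, shown_percent, consistent)."""
    match = STATS_RE.search(line)
    if match is None:
        raise ValueError("not an alignment statistics line")
    fields = match.groups()
    report = {}
    for index, name in enumerate(("identities", "positives", "gaps")):
        count, length, shown = fields[3 * index: 3 * index + 3]
        if count is None:          # Gaps field absent
            continue
        count, length, shown = int(count), int(length), int(shown)
        # Assumption: the percentage is floor(100 * count / length).
        report[name] = (count, length, shown, 100 * count // length == shown)
    return report


if __name__ == "__main__":
    line = "Identities = 32/112 (28%), Positives = 52/112 (46%), Gaps = 15/112 (13%)"
    for name, values in check_alignment_stats(line).items():
        print(name, values)
```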


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00025  PF03544  41     6e-06    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 40.7 bits (95), Expect = 6e-06
Identities = 21/79 (26%), Positives = 21/79 (26%), Gaps = 4/79 (5%)

Query: 422 APPAPATTPAPRPAPTSAPRSPAPSGAPTPAPPRTVTSSPAPSSPAPSPAPSGTAAPAPS 481
PP P P P P P P AP P P P P P P S
Sbjct: 68 PPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPK----PKPKPKPVKKVEQPKRDVKPVES 123

Query: 482 GPPSPAPTPTPTVTALPPP 500
P SP P
Sbjct: 124 RPASPFENTAPARPTSSTA 142



Score = 36.1 bits (83), Expect = 2e-04
Identities = 21/87 (24%), Positives = 25/87 (28%), Gaps = 1/87 (1%)

Query: 422 APPAPATTPAPRPAPTSAPRSPAPSGAPTPAPPRTVTSSPAPSSPAPSPAPSGTAAPAPS 481
AP P + PA P++ P P P P P AP P P
Sbjct: 45 APAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPK 104

Query: 482 GPPS-PAPTPTPTVTALPPPPPEPGTN 507
P P V + P P N
Sbjct: 105 PKPVKKVEQPKRDVKPVESRPASPFEN 131



Score = 35.7 bits (82), Expect = 3e-04
Identities = 18/84 (21%), Positives = 19/84 (22%), Gaps = 3/84 (3%)

Query: 423 PPAPATTPAPRPAPTSAPRSPAPSGAPTPAPPRTVTSSPAPSSPAPSPAPSGTAAPAPSG 482
P A P P P P P P P + P P P
Sbjct: 63 PQAVQPPPEPVVEP---EPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVK 119

Query: 483 PPSPAPTPTPTVTALPPPPPEPGT 506
P P TA P T
Sbjct: 120 PVESRPASPFENTAPARPTSSTAT 143



Score = 34.6 bits (79), Expect = 5e-04
Identities = 27/116 (23%), Positives = 34/116 (29%), Gaps = 2/116 (1%)

Query: 393 VIAGLPSGSLDEAIGQIEELSRSSVLPVCAPPAPATTPAPRPAPTSAPRSPAPSGAPTPA 452
V+AGL S+ + I SV V AP A +P P P P P P
Sbjct: 28 VVAGLLYTSVHQVIELPAPAQPISVTMV-APADLEPPQAVQPPPEPVVE-PEPEPEPIPE 85

Query: 453 PPRTVTSSPAPSSPAPSPAPSGTAAPAPSGPPSPAPTPTPTVTALPPPPPEPGTNC 508
PP+ P P P P P P P ++
Sbjct: 86 PPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSST 141



Score = 34.6 bits (79), Expect = 6e-04
Identities = 15/82 (18%), Positives = 20/82 (24%), Gaps = 1/82 (1%)

Query: 422 APPAPATTPAPRPAPTSAPRSPAPSGAPTPAPPRTVTSSPAPSSPAPSPAPSGTAAPAPS 481
P P P P P + P P P P V P + +
Sbjct: 74 VEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPK-PVKKVEQPKRDVKPVESRPASPFENT 132

Query: 482 GPPSPAPTPTPTVTALPPPPPE 503
P P + T+ P
Sbjct: 133 APARPTSSTATAATSKPVTSVA 154



Score = 34.6 bits (79), Expect = 6e-04
Identities = 24/124 (19%), Positives = 35/124 (28%), Gaps = 3/124 (2%)

Query: 380 LGVDDMRPSERAQVIAGLPSGSLDEAIGQIEELSRSSVLPVCAPPAPATTPAPRPAPTSA 439
+ V + P++ A P E + + E P P P P+P P
Sbjct: 50 ISVTMVAPADLEPPQAVQPP---PEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPK 106

Query: 440 PRSPAPSGAPTPAPPRTVTSSPAPSSPAPSPAPSGTAAPAPSGPPSPAPTPTPTVTALPP 499
P P + +SP ++ P S A S A P P
Sbjct: 107 PVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQ 166

Query: 500 PPPE 503
P
Sbjct: 167 YPAR 170


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00029  HTHTETR  68     2e-16    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 68.1 bits (166), Expect = 2e-16
Identities = 32/169 (18%), Positives = 60/169 (35%), Gaps = 15/169 (8%)

Query: 6 RPVRADAARNRALLLAAAEDEFRERG-ASASVADIARRAGVAKGTVFRHFPTKEDLIASI 64
R + +A R +L A F ++G +S S+ +IA+ AGV +G ++ HF K DL + I
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 65 VCEHVAVLAEAAARLAD------SPDPGAALLEFLTLAADQRQRHDLTFLQSASDGDPRV 118
+ + E L+ L + +R L +
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGE 122

Query: 119 TEVRDAL--------HEHLETLVDRARASGAIRADITEADVFLMMCAPI 159
V ++ +E + + + AD+ ++M I
Sbjct: 123 MAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYI 171


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00030  DHBDHDRGNASE  67     3e-15    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 66.6 bits (162), Expect = 3e-15
Identities = 52/231 (22%), Positives = 88/231 (38%), Gaps = 27/231 (11%)

Query: 3 LRGATVLVTGTNRGIGQHFAVQLLQRGAKVYATARRPELVDIPG---------AEVLRLD 53
+ G +TG +GIG+ A L +GA + A PE ++ AE D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 54 ITDQSSVDAV----AAVAGDVDVLINNAADTAGGNLVTGDLDAIRSTMDSNYYGTLAMIR 109
+ D +++D + G +D+L+N A G + + + +T N G R
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 110 AFAPILARNGGGAILNVLSAAAWTTVDGNTAYAAAKSAQWGLTNGVRLELAAQGTQVAAL 169
+ + + G+I+ V S A AYA++K+A T + LELA + +
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 170 VPGLIGT--------------QTLLDFAERHGIDLPDDAVMDPADLVRLAL 206
PG T Q + E +P + P+D+ L
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVL 236


2     NCTC10437_00072  NCTC10437_00085  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00072  3       10       -1.126790  cell division protein FtsK
NCTC10437_00073  3       12       -0.595446  cell division protein FtsK
NCTC10437_00074  3       12       -0.156808  PE-like protein
NCTC10437_00075  4       12       0.063757   PPE protein
NCTC10437_00076  1       15       -0.036231  WXG repeat protein
NCTC10437_00077  1       13       0.364283   early secretory antigenic target, 6 kDa
NCTC10437_00078  1       12       0.555180   chromosome partitioning ATPase
NCTC10437_00079  1       18       -0.489462  type VII secretion integral membrane protein
NCTC10437_00080  4       30       -2.327149  translation initiation factor IF-2
NCTC10437_00081  3       31       -3.210546  Protein of uncharacterised function (DUF2580)
NCTC10437_00082  4       26       -1.345656  Uncharacterised protein
NCTC10437_00083  4       21       -1.191016  putative alanine and proline rich protein
NCTC10437_00084  4       21       -0.801328  Uncharacterised protein
NCTC10437_00085  2       17       0.061249   Uncharacterised protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00078  PF03544  32     0.003    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 32.3 bits (73), Expect = 0.003
Identities = 24/130 (18%), Positives = 39/130 (30%), Gaps = 10/130 (7%)

Query: 45 PPMPAAPPRTQTAAQAPQSWQTEVMSQVTNAPAHQPMHQRLPNNGMMRTPQSAPSVGARH 104
+PA AP + Q P +P + P P+
Sbjct: 41 IELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEP------IPEPPKEAPVVI 94

Query: 105 EQPRPAPAPAPRPVPPPPPSQYYGENTDQRQAAPPTSAATMGNHRAIDALSHVGVRSAVK 164
E+P+P P P P+PV + + + R A+P + A + + S
Sbjct: 95 EKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAP----ARPTSSTATAATSKPV 150

Query: 165 MPPQRGWRHW 174
G R
Sbjct: 151 TSVASGPRAL 160


3     NCTC10437_00175  NCTC10437_00198  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00175  2       13       0.924797   PMT family glycosyltransferase,
NCTC10437_00176  4       15       0.952829   Uncharacterised protein
NCTC10437_00177  2       16       3.043807   dehydrogenase
NCTC10437_00178  3       16       2.875878   LysR family transcriptional regulator
NCTC10437_00179  3       13       0.980491   Major facilitator superfamily MFS_1
NCTC10437_00180  1       12       0.265670   Uncharacterised protein
NCTC10437_00181  1       12       0.288452   Uncharacterized protein conserved in bacteria
NCTC10437_00182  0       12       0.329071   Uncharacterised protein
NCTC10437_00183  0       10       -1.615853  Uncharacterised protein
NCTC10437_00184  0       9        -1.313515  Transport protein
NCTC10437_00185  2       12       -0.423149  mycobacterium membrane protein
NCTC10437_00186  2       11       -0.169382  PPOX class probable F420-dependent enzyme
NCTC10437_00187  3       11       1.480230   endopeptidase
NCTC10437_00188  2       11       1.383431   IclR family transcriptional regulator
NCTC10437_00189  2       12       2.672914   dihydroxy-acid dehydratase
NCTC10437_00190  3       13       3.998604   putative lipoprotein
NCTC10437_00191  0       11       3.697569   regulated in copper repressor
NCTC10437_00192  1       10       3.897435   Uncharacterised protein
NCTC10437_00193  0       9        2.514975   two-component regulator receiver
NCTC10437_00194  -1      9        2.566489   two-component regulator - sensor kinase
NCTC10437_00195  -1      8        1.977783   S-adenosyl-L-methionine-dependent methyl
NCTC10437_00196  0       10       2.281206   ErfK/YbiS/YcfS/YnhG family protein
NCTC10437_00197  2       12       2.748988   Uncharacterised protein
NCTC10437_00198  2       11       2.158600   Uncharacterised protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_00175  PERTACTIN  30     0.027    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 30.5 bits (68), Expect = 0.027
Identities = 15/37 (40%), Positives = 18/37 (48%), Gaps = 1/37 (2%)

Query: 266 RIFGGGAGGAGGLPG-PPPGNAGPGPEGPAMFGSAGI 301
I G A G +PG PG A PG GP + G G+
Sbjct: 257 TIRRGDAPAGGAVPGGAVPGGAVPGGFGPLLDGWYGV 293


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00177  DHBDHDRGNASE  75     2e-18    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 75.5 bits (185), Expect = 2e-18
Identities = 69/255 (27%), Positives = 106/255 (41%), Gaps = 27/255 (10%)

Query: 6 RVALITGASQGIGAGLEAGYRKLGYAVVA---NSRTIDGGDDPMVL------AVPGDVAQ 56
++A ITGA+QGIG + G + A N ++ + A P DV
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 57 PGVGGRIVDAAVQRFGRIDTVVNNAGLFIARPFTDYTDEEYDAITGVNLRGFFEISRAAV 116
I + G ID +VN AG+ +DEE++A VN G F SR+
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 117 ARMLDQGGGGHLVTISTTLVEHANSAVPSALASLTKGGLNAATRALAVEYATQGIRSNAV 176
M+D+ G +VT+ + +++ + +S K T+ L +E A IR N V
Sbjct: 129 KYMMDRRSGS-IVTVGSNPAGVPRTSMAAYASS--KAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 177 ALGIIRTPMHQP---SDYDALATLH----------PVGHVGEVSDVVDAVLYL--ENAPF 221
+ G T M + A + P+ + + SD+ DAVL+L A
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 222 VTGEILHVDGGQSAG 236
+T L VDGG + G
Sbjct: 246 ITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_00182  RTXTOXINA  40     3e-05    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 39.9 bits (93), Expect = 3e-05
Identities = 42/133 (31%), Positives = 52/133 (39%), Gaps = 9/133 (6%)

Query: 377 FGGQDLDDNADGGNG------GQGGEGGAGGVGPEGGAGGSGGDGGIGGGGGENGTGGDG 430
F G D DD +G +G +G + +GG G + GG G D IG G GGDG
Sbjct: 740 FHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGDGNDKLIGVAGNNYLNGGDG 799

Query: 431 GTG-GPGGLGGSGSLDEGGDGGDGGDGGTGGDSLEGGTGGSGGFGGGGGSGGFGGSGDGG 489
G + ++ GG G D G G D L+GG G GG G SG G
Sbjct: 800 DDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDDLLKGGYGNDIYRYLSGYGH 859

Query: 490 DPGTAGSAGSGND 502
G D
Sbjct: 860 H--IIDDDGGKED 870



Score = 38.8 bits (90), Expect = 6e-05
Identities = 34/113 (30%), Positives = 43/113 (38%), Gaps = 4/113 (3%)

Query: 253 GSGGLFAPGGTDGNGGAGGNGGDGITGDGSVGGNGADGGKGGDGVIGGTGGDGGLGGLGD 312
G+ G G DGN G+ G+ G G + GG G D +IG G + GG GD
Sbjct: 742 GADGDDLIEGNDGNDRLYGDKGNDTL-SGGNGDDQLYGGDGNDKLIGVAGNNYLNGGDGD 800

Query: 313 EGFGGGEGGSGGTGGSGEIGGVGGTGGAGGSGSAGFDGGLGGSGGDGGEGSTT 365
+ F GG G G G+ DGG G GG G+
Sbjct: 801 DEFQVQGNSLAKNVLF---GGKGNDKLYGSEGADLLDGGEGDDLLKGGYGNDI 850



Score = 35.7 bits (82), Expect = 4e-04
Identities = 29/79 (36%), Positives = 34/79 (43%), Gaps = 1/79 (1%)

Query: 243 GGDGGDGGTAGSGGLFAPGGTDGNGGAGGNGGDGIT-GDGSVGGNGADGGKGGDGVIGGT 301
GG+G D G G G N GG+G D S+ N GGKG D + G
Sbjct: 769 GGNGDDQLYGGDGNDKLIGVAGNNYLNGGDGDDEFQVQGNSLAKNVLFGGKGNDKLYGSE 828

Query: 302 GGDGGLGGLGDEGFGGGEG 320
G D GG GD+ GG G
Sbjct: 829 GADLLDGGEGDDLLKGGYG 847



Score = 35.3 bits (81), Expect = 7e-04
Identities = 26/79 (32%), Positives = 27/79 (34%), Gaps = 7/79 (8%)

Query: 447 GGDGGDGGDGGTGGDSLEGGTGGSGGFGGGGGSGGFGGSGD-------GGDPGTAGSAGS 499
G G D GG G D L GG G G G + GG GD G
Sbjct: 760 GDKGNDTLSGGNGDDQLYGGDGNDKLIGVAGNNYLNGGDGDDEFQVQGNSLAKNVLFGGK 819

Query: 500 GNDGTQGGGGGAGGTGGAG 518
GND G G GG G
Sbjct: 820 GNDKLYGSEGADLLDGGEG 838



Score = 32.2 bits (73), Expect = 0.005
Identities = 40/134 (29%), Positives = 45/134 (33%), Gaps = 8/134 (5%)

Query: 297 VIGGTGGDGGLGGLGDEGFGGGEGGSGGTGGSGEIGGVGGTGGAGGSGSAGFDGGLGGSG 356
+IG T D G + F G +G G G G G SG G D GG G
Sbjct: 722 LIGTTRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGDG 781

Query: 357 GDGGEGSTTGGAGGKGGLGGFGGQDLDDNADGGNGGQGGEGGAGGVGPEGGAGGSGGDGG 416
D G AG GG G DD GG G + G G D
Sbjct: 782 ND----KLIGVAGNNYLNGGDG----DDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLL 833

Query: 417 IGGGGGENGTGGDG 430
GG G + GG G
Sbjct: 834 DGGEGDDLLKGGYG 847



Score = 31.9 bits (72), Expect = 0.007
Identities = 24/78 (30%), Positives = 29/78 (37%), Gaps = 4/78 (5%)

Query: 447 GGDGGDGGDGGTGGDSLEGGTGGSGGFGGGGGSGGFGGSGD----GGDPGTAGSAGSGND 502
G DG D +G G D L G G GG G +GG G+ G + G G+D
Sbjct: 742 GADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGDGNDKLIGVAGNNYLNGGDGDD 801

Query: 503 GTQGGGGGAGGTGGAGGA 520
Q G GG
Sbjct: 802 EFQVQGNSLAKNVLFGGK 819



Score = 31.1 bits (70), Expect = 0.014
Identities = 38/133 (28%), Positives = 48/133 (36%), Gaps = 11/133 (8%)

Query: 387 DGGNGGQGGEGGAGGVGPEGGAGGSGGDGGIGGGGGENGTGGDGGTGGPGGLGGSGSLDE 446
D +G G + G G + G G D GG G + GGDG G G +
Sbjct: 738 DIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGDGNDKLIGVAG--NNYLN 795

Query: 447 GGDGGD---GGDGGTGGDSLEGGTGGSGGFGGGGGSGGFGGSGDGGDPGTAGS------A 497
GGDG D + L GG G +G G GG GD G G+ +
Sbjct: 796 GGDGDDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDDLLKGGYGNDIYRYLS 855

Query: 498 GSGNDGTQGGGGG 510
G G+ GG
Sbjct: 856 GYGHHIIDDDGGK 868



Score = 31.1 bits (70), Expect = 0.015
Identities = 38/125 (30%), Positives = 50/125 (40%), Gaps = 10/125 (8%)

Query: 283 VGGNGAD---GGKGGDGVIGGTGGDGGLGGLGDEGFGGGEGGSGGTGGSGEIGGVGGTGG 339
+G AD G K D G G D G G++ G +G +GG+G+ GG G
Sbjct: 723 IGTTRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGDGN 782

Query: 340 AGGSGSAGFDGGLGGSGGDGGEGSTTG-------GAGGKGGLGGFGGQDLDDNADGGNGG 392
G AG + GG G D + G G L G G DL D +G +
Sbjct: 783 DKLIGVAGNNYLNGGDGDDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDDLL 842

Query: 393 QGGEG 397
+GG G
Sbjct: 843 KGGYG 847



Score = 30.3 bits (68), Expect = 0.020
Identities = 35/106 (33%), Positives = 38/106 (35%), Gaps = 9/106 (8%)

Query: 134 GNGGKGFAGGNGGAGALYFGNGGDGGAGVAGGNGGRGGAGG--LFFGNGGNGGVGGTGAD 191
G G GN G LY G D +G G + GG G L G N GG G D
Sbjct: 742 GADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGDGNDKLIGVAGNNYLNGGDGDD 801

Query: 192 GTEPGQAGQAGG--KGGAG-----GSGGFFFGSGGSGGQGGKGGDG 230
+ A GG G GS G GG G KGG G
Sbjct: 802 EFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDDLLKGGYG 847


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00184  ACRIFLAVINRP  46     9e-07    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 45.6 bits (108), Expect = 9e-07
Identities = 40/236 (16%), Positives = 85/236 (36%), Gaps = 34/236 (14%)

Query: 184 HGTLKTTLITLGVIAVMLLWLYRRLTTVFLVLFTVMIELTASRGVVAVLANAGIIELSTY 243
H +KT + ++ +++ + + + V V +L I+ Y
Sbjct: 338 HEVVKTLFEAIMLVFLVMYLFLQNMRATLIPTIAV---------PVVLLGTFAILAAFGY 388

Query: 244 STNLLTL--LVIAAGT--DYAIFILGRFHEARYAGQDRVTAFTTMYHGTAHI---ILGSG 296
S N LT+ +V+A G D AI ++ R +D++ + I ++G
Sbjct: 389 SINTLTMFGMVLAIGLLVDDAIVVVENVE--RVMMEDKLPPKEATEKSMSQIQGALVGIA 446

Query: 297 LTIAGAVLCLSF---TRLPYFQSLGVPAGIGVLVAVVAALTLAPALL------------- 340
+ ++ + ++F + ++ + + ++V+ AL L PAL
Sbjct: 447 MVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHE 506

Query: 341 IIGRHFGLFEPSRPLRTRGWRRIGTAIVRWPGPILIATIAIALIGLLALPRYTTSY 396
G FG F + + I+ G L+ I ++ R +S+
Sbjct: 507 NKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSF 562



Score = 42.5 bits (100), Expect = 8e-06
Identities = 30/158 (18%), Positives = 60/158 (37%), Gaps = 7/158 (4%)

Query: 760 AGIAALCLILLVMMFITRSIIAAFVIVGTVALSLGASFGLSVLIWQDIFGIQLYWIVLAL 819
A+ L+ LVM +++ A + V + L +F + I + ++ +VLA+
Sbjct: 343 TLFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAI 402

Query: 820 AVILLLAVGSDYNLLLISRFKEEIGAGLNTGIIRAMAGSGAVVTAAGLVFAATMASFLF- 878
+++ A+ N + R E ++M+ + +V +A F
Sbjct: 403 GLLVDDAIVVVEN---VERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFF 459

Query: 879 --ADLRILGQIGTTIALGLLFDTLIVRSFMTPAIAALI 914
+ I Q TI + L+ TPA+ A +
Sbjct: 460 GGSTGAIYRQFSITIVSAMALSVLVALIL-TPALCATL 496


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_00193  HTHFIS  73     3e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 73.3 bits (180), Expect = 3e-17
Identities = 39/140 (27%), Positives = 74/140 (52%), Gaps = 5/140 (3%)

Query: 2 LVIEDSEAIREMVVEALGDAGFATSAYPDGEGLEALLDGHRPDAVILDVMIPGRDGFALI 61
LV +D AIR ++ +AL AG+ + L + D V+ DV++P + F L+
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLL 66

Query: 62 DVVRAWG-DVGIVMLTARDGLPDRLRGLDGGADDYVVKPFEMSELMSRVGAVL----RRR 116
++ D+ +++++A++ ++ + GA DY+ KPF+++EL+ +G L RR
Sbjct: 67 PRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRP 126

Query: 117 GTVTATVEVGDLLVDRTAAI 136
+ + G LV R+AA+
Sbjct: 127 SKLEDDSQDGMPLVGRSAAM 146


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00194  PF05616  34     0.002    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 33.6 bits (76), Expect = 0.002
Identities = 21/49 (42%), Positives = 22/49 (44%), Gaps = 7/49 (14%)

Query: 91 PAVVPGPGPGP---PG--PPPPPRPDGRPPRGPD--GGPPGRPPTPPPP 132
PA P P P PG P P P PD P PD G P RP +P P
Sbjct: 333 PAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQPGTRPDSPAVP 381


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00196  PF03544  38     3e-05    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 38.0 bits (88), Expect = 3e-05
Identities = 23/107 (21%), Positives = 28/107 (26%), Gaps = 8/107 (7%)

Query: 20 AGLTLAAPAALAQPL-----PAPAPVPAPAPPPANPFVFPNPFAPPAPAAPAGPTADDPF 74
+T+ APA L P P P P P P P P P P
Sbjct: 50 ISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVK 109

Query: 75 AVVPGQPMAIPEGTPAGQNPTPFVGQPPFVPPSFNPTNGSIAGAAKP 121
V + P + PF P P S T +
Sbjct: 110 KVEQPKRDVKPVESRPAS---PFENTAPARPTSSTATAATSKPVTSV 153


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00197  ACRIFLAVINRP  29     0.035    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.7 bits (64), Expect = 0.035
Identities = 15/77 (19%), Positives = 29/77 (37%), Gaps = 8/77 (10%)

Query: 48 TGGTGSLVA-VGVMLAVSIIV---IAISVFSQGRRRQVRSGQHDYRRERGRDAGRRRWRP 103
+ VG++ + + I I F++ + + E A R R RP
Sbjct: 918 FNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEK----EGKGVVEATLMAVRMRLRP 973

Query: 104 LLIAAAVLLAWALLLAL 120
+L+ + + L LA+
Sbjct: 974 ILMTSLAFILGVLPLAI 990


4     NCTC10437_00302  NCTC10437_00313  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00302  3       22       -2.745657  Uncharacterised protein
NCTC10437_00303  2       25       -4.287116  lysine exporter protein LysE/YggA
NCTC10437_00304  3       29       -5.244377  nucleoside-diphosphate-sugar epimerase
NCTC10437_00305  4       28       -5.658066  Emopamil binding protein
NCTC10437_00306  4       28       -5.373634  transcriptional regulator
NCTC10437_00307  5       28       -5.645096  virulence factor Mce family protein
NCTC10437_00308  5       28       -6.059566  virulence factor Mce family protein
NCTC10437_00309  5       29       -7.410698  MCE-family protein MCE4d
NCTC10437_00310  5       30       -8.007140  virulence factor Mce family protein
NCTC10437_00311  4       29       -7.445965  MCE-family protein MCE4b
NCTC10437_00312  4       25       -5.948875  Virulence factor Mce family protein
NCTC10437_00313  2       19       -4.406060  Uncharacterised protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00304  NUCEPIMERASE  34     7e-06    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 33.6 bits (77), Expect = 7e-06
Identities = 11/27 (40%), Positives = 14/27 (51%)

Query: 7 ITGGCGLVGSATVRRLAELGRTVVVTD 33
+TG G +G +RL E G VV D
Sbjct: 5 VTGAAGFIGFHVSKRLLEAGHQVVGID 31


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00306  HTHTETR  58     8e-13    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 58.1 bits (140), Expect = 8e-13
Identities = 26/117 (22%), Positives = 47/117 (40%), Gaps = 3/117 (2%)

Query: 14 RQERGDRTRELLIDETVRCIREEGFSAASARHIIERAGVSWGVIQHHFGDRDGLLTAVIE 73
++ TR+ ++D +R ++G S+ S I + AGV+ G I HF D+ L + + E
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 74 DALDRLVESLETLSDPAQAMSTD---ELVRATWEAFANPKAMAGLEILIATKQLRVG 127
+ + E E++ E+ + L +I K VG
Sbjct: 65 LSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVG 121


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00307  PF03544  37     1e-04    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 36.9 bits (85), Expect = 1e-04
Identities = 23/114 (20%), Positives = 29/114 (25%), Gaps = 1/114 (0%)

Query: 437 PVVNLPPGVAPGPGPALTPPYPLPVPPNTPGPQPFPLPYQAPPDQTLPPSGRPPASPQPP 496
P P V PA P PP P +P P P P P P+P
Sbjct: 44 PAPAQPISVTM-VAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPK 102

Query: 497 VQQSPPAPPALPAEAAPTGSVQPQAAELPSAAYDERSGAFLDPHGGISVYAAGG 550
+ P P +P + +A S A G
Sbjct: 103 PKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASG 156



Score = 30.7 bits (69), Expect = 0.014
Identities = 29/91 (31%), Positives = 37/91 (40%), Gaps = 2/91 (2%)

Query: 427 PPEADYDPGPPVVNLPPGVAPGPGPALTPPYPLPVPPNTPGPQPFPLPYQAPPDQTLPPS 486
PP+A P PVV P P P P P + P P P+P P+ P + + P
Sbjct: 62 PPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPV 121

Query: 487 GRPPASPQPPVQQSPPAPPALPAEAAPTGSV 517
PASP P P + A AA + V
Sbjct: 122 ESRPASPFENTA--PARPTSSTATAATSKPV 150



Score = 28.8 bits (64), Expect = 0.045
Identities = 18/107 (16%), Positives = 30/107 (28%), Gaps = 4/107 (3%)

Query: 418 LPQNKYPYIPPEADYDPGPP--VVNLPPGVAP--GPGPALTPPYPLPVPPNTPGPQPFPL 473
L + PPE +P P + PP AP P P P+
Sbjct: 60 LEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVK 119

Query: 474 PYQAPPDQTLPPSGRPPASPQPPVQQSPPAPPALPAEAAPTGSVQPQ 520
P ++ P + + + ++ + QPQ
Sbjct: 120 PVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQ 166


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00309  PF03544  33     0.002    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 33.0 bits (75), Expect = 0.002
Identities = 22/62 (35%), Positives = 23/62 (37%), Gaps = 1/62 (1%)

Query: 427 PPPADGGPPSAPAQSSPPPVGFPPPPTELSPPPPGFLPSPTELSPPPPGFPPPPATQAPP 486
PAD PP A Q P PV P P E P PP P E P P P P +
Sbjct: 55 VAPADLEPPQAV-QPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQ 113

Query: 487 VD 488

Sbjct: 114 PK 115


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_00310  RTXTOXIND  34     6e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 34.4 bits (79), Expect = 6e-04
Identities = 25/164 (15%), Positives = 55/164 (33%), Gaps = 7/164 (4%)

Query: 147 RNVGDLDKSRLTEALGVLTDAMRDATPQLRGTLDGVTA----LSRSINDRDDQIQQLLAR 202
+NV + + RLT + ++ Q LD A + IN ++ + +R
Sbjct: 177 QNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSR 236

Query: 203 AKAVSDVLAHRSKQMNQLIVEGNQLFAA---LSQKRRSLGILIGGIDDLARQITGFVADN 259
S +L ++ + ++ + N+ A L + L + I +
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLF 296

Query: 260 RKEFGPALTKLNLVLDNLETRRAQIDDSLRRLPNFANQLGEVVG 303
+ E L + + L A+ ++ + A +V
Sbjct: 297 KNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQ 340


5     NCTC10437_00329  NCTC10437_00334  Y     N          N                   Genomic Island
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00329  -1      9        3.099228   diguanylate cyclase
NCTC10437_00330  0       9        3.166030   chromosome replication initiation inhibitor
NCTC10437_00331  0       9        3.360349   precorrin-6A synthase, deacetylating
NCTC10437_00332  0       10       3.207287   cobyric acid synthase
NCTC10437_00333  5       13       4.033062   precorrin-6A reductase
NCTC10437_00334  4       13       2.590158   cobalbumin biosynthesis protein
6     NCTC10437_00384  NCTC10437_00396  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00384  2       8        0.024595   trans-aconitate methyltransferase
NCTC10437_00385  2       7        0.082656   sulfate adenylyltransferase subunit 1 /
NCTC10437_00386  1       9        0.149859   sulfate adenylyltransferase subunit 2
NCTC10437_00387  0       7        0.810480   putative sulfotransferase
NCTC10437_00388  1       7        1.140945   arylsulfatase A family protein
NCTC10437_00389  0       10       1.931235   Uncharacterised protein
NCTC10437_00390  0       11       1.727474   Uncharacterised protein
NCTC10437_00391  0       9        0.799346   PA-phosphatase-like phosphoesterase
NCTC10437_00392  0       9        1.927099   PA-phosphatase-like phosphoesterase
NCTC10437_00393  2       9        1.336193   Helix-turn-helix domain
NCTC10437_00394  3       10       1.009431   activator of Hsp90 ATPase 1 family protein
NCTC10437_00395  2       12       0.100063   Conserved exported protein of uncharacterised
NCTC10437_00396  2       11       -0.201233  bile acid 7-alpha dehydratase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_00385  TCRTETOQM  64     7e-13    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 64.5 bits (157), Expect = 7e-13
Identities = 65/292 (22%), Positives = 105/292 (35%), Gaps = 57/292 (19%)

Query: 7 QLLRITTAGSVDDGKSTLIGRLLHDTDSLPLDHLEAVTDEEGVADLA-ALSDGLRAEREQ 65
+++ I VD GK+TL LL+++ E G D +D ER++
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNS---------GAITELGSVDKGTTRTDNTLLERQR 52

Query: 66 GITIDVAYRFFSTENRSYILTDTPGHERYTRNMFTGASNAHVAVLLVDARAGVLRQTNRH 125
GITI F EN + DTPGH + ++ S A+LL+ A+ GV QT
Sbjct: 53 GITIQTGITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRIL 112

Query: 126 ARIAKLLGIKHFVAAVNKID----------------------------------LVDFDE 151
+ +GI + +NKID + +F E
Sbjct: 113 FHALRKMGIPT-IFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTE 171

Query: 152 GRFREVEVELQR--VAARLGDAEITVIPVAAKLGDNVVHRSDNTPWYQGPA--------L 201
+ +E + + + + + + + S P Y G A L
Sbjct: 172 SEQWDTVIEGNDDLLEKYMSGKSLEALELEQEESIRFHNCSLF-PVYHGSAKNNIGIDNL 230

Query: 202 LEYLESVELTAPHAESRSLRLPVQWVSRPSPHQRRRYTGRLAAGTLSVGDSV 253
+E + + ++ H L V + QR Y RL +G L + DSV
Sbjct: 231 IEVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYI-RLYSGVLHLRDSV 281


7     NCTC10437_00453  NCTC10437_00466  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00453  2       11       2.233452   Uncharacterised protein
NCTC10437_00454  2       11       2.262563   Uncharacterised protein
NCTC10437_00455  0       11       2.396554   dynamin family protein
NCTC10437_00456  2       12       3.203465   Isoniazid inducible protein IniC
NCTC10437_00457  3       10       1.893609   molecular chaperone
NCTC10437_00458  3       9        0.421048   Uncharacterised protein
NCTC10437_00459  3       7        -0.119156  Uncharacterised protein
NCTC10437_00460  2       8        0.026504   luciferase family protein
NCTC10437_00461  1       9        0.095748   TIGR03085 family protein
NCTC10437_00462  1       11       -1.049337  glycosyl hydrolase, glucoamylase
NCTC10437_00463  1       12       -1.175816  glucose-6-phosphate 1-dehydrogenase
NCTC10437_00464  1       15       -1.363647  mycothione reductase
NCTC10437_00465  3       13       -1.246882  putative lipoprotein LpqJ
NCTC10437_00466  2       13       -1.296169  Protein of uncharacterised function (DUF2910)
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_00457  TONBPROTEIN  53     7e-10    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 53.1 bits (127), Expect = 7e-10
Identities = 26/112 (23%), Positives = 35/112 (31%), Gaps = 2/112 (1%)

Query: 495 VVLPAPAAPPATSTPTPEPQPSPSPTPTPTPSPEPSPSPSPSPTPDPTPTPTPEPTPEPT 554
+ LPAPA P + + TP P P P P P+P
Sbjct: 36 IELPAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPK 95

Query: 555 PEPSPSPEPSPE--PSPSPAPSAEPAPEPSAEPAPSAEPAAPVTECVPAPDT 604
P+P P P + P P P AP+ ++ T P T
Sbjct: 96 PKPKPKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATAATSKPVT 147



Score = 52.3 bits (125), Expect = 1e-09
Identities = 33/107 (30%), Positives = 42/107 (39%), Gaps = 3/107 (2%)

Query: 490 PAAPPVVLPAPAAPPATSTPTPEPQPSPSPTPTPTPSPEPSPSPSPSPTPDPTPTPTPEP 549
P + +V PA PP P PEP P P P P P P +P P P P P+P
Sbjct: 44 PISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPK---EAPVVIEKPKPKPKPKP 100

Query: 550 TPEPTPEPSPSPEPSPEPSPSPAPSAEPAPEPSAEPAPSAEPAAPVT 596
P + P + P S +P AP +A + PVT
Sbjct: 101 KPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATAATSKPVT 147



Score = 46.5 bits (110), Expect = 9e-08
Identities = 28/111 (25%), Positives = 35/111 (31%), Gaps = 1/111 (0%)

Query: 484 PQQVSPPAAPPVVLPAPAAPPATSTPTPEPQPSPSPTPTPTPSPEPSPSPSPSPTPDPTP 543
P Q P PVV P P P P P P P P P P+P P D P
Sbjct: 57 PPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKP 116

Query: 544 TPTPEPTPEPTPEPSPSPEPSPEPSPS-PAPSAEPAPEPSAEPAPSAEPAA 593
+ +P P+ + + S P S P + P A
Sbjct: 117 VESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARA 167



Score = 40.7 bits (95), Expect = 7e-06
Identities = 25/121 (20%), Positives = 35/121 (28%), Gaps = 2/121 (1%)

Query: 468 VQAVPAPQQVVTKQSAPQQVSPPAAPPVVLPAPAAPPATSTPTPEPQPSPSPTPTPTPSP 527
V +PAP Q ++ P P P P P +P P P
Sbjct: 35 VIELPAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKP 94

Query: 528 EPSPSPSPSPTPDPTPTPTPEPTPEPTPEPSPSPEPS--PEPSPSPAPSAEPAPEPSAEP 585
+P P P P P +P P + P+ + + A S S
Sbjct: 95 KPKPKPKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPR 154

Query: 586 A 586
A
Sbjct: 155 A 155


8     NCTC10437_00494  NCTC10437_00499  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00494  2       12       -0.517867  transcriptional regulator
NCTC10437_00495  3       15       -0.996464  putative membrane-associated protein
NCTC10437_00496  2       16       -0.755400  alcohol dehydrogenase
NCTC10437_00497  3       21       -2.579144  Protein of uncharacterised function (DUF3558)
NCTC10437_00498  4       22       -3.091601  Uncharacterised protein
NCTC10437_00499  3       18       -2.954685  Protein of uncharacterised function (DUF732)
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00494  HTHTETR  63     1e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 63.5 bits (154), Expect = 1e-14
Identities = 30/146 (20%), Positives = 59/146 (40%), Gaps = 5/146 (3%)

Query: 1 MPRTSERGGPLTRRKISDVATTLFLDRGFDAVTVADVAREAGVSSVTVFKHFPRKEDLLF 60
M R +++ TR+ I DVA LF +G + ++ ++A+ AGV+ ++ HF K DL
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 DREEDAVEILRSAVRDRAS--GAGVLASLREVAFRLVDDRHALSGVKAGSIPFFRT---V 115
+ E + + + + L+ LRE+ +++ + F V
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 116 AASPALIARAREIASELERALADLLE 141
+ R + E + L+
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLK 146
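
The percentages on an Identities/Positives/Gaps line follow directly from the counts and the alignment length. In this report they appear to be truncated rather than rounded: 30/146 is about 20.5% and is shown as 20%. The sketch below re-derives them with integer division and compares them with the printed values. It is an illustration only, and `check_stats` is a hypothetical helper name.

```python
import re

# Matches lines such as:
# "Identities = 30/146 (20%), Positives = 59/146 (40%), Gaps = 5/146 (3%)"
# The Gaps field is optional because ungapped alignments omit it.
STATS_RE = re.compile(
    r"Identities = (\d+)/(\d+) \((\d+)%\), "
    r"Positives = (\d+)/(\d+) \((\d+)%\)"
    r"(?:, Gaps = (\d+)/(\d+) \((\d+)%\))?"
)


def check_stats(line: str) -> dict:
    """Map each statistic to (recomputed percent, percent printed in the line)."""
    m = STATS_RE.search(line)
    if m is None:
        raise ValueError("not an Identities/Positives line")
    groups = m.groups()
    result = {}
    for name, i in (("identities", 0), ("positives", 3), ("gaps", 6)):
        if groups[i] is None:
            continue  # Gaps field absent
        count, length, shown = (int(g) for g in groups[i:i + 3])
        # Integer division truncates, which is the assumption being checked.
        result[name] = (100 * count // length, shown)
    return result


if __name__ == "__main__":
    line = "Identities = 30/146 (20%), Positives = 59/146 (40%), Gaps = 5/146 (3%)"
    for name, (computed, shown) in check_stats(line).items():
        print(name, computed, shown, "ok" if computed == shown else "MISMATCH")
```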


9  NCTC10437_00512  NCTC10437_00523  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00512  2       12       1.407504   diguanylate cyclase
NCTC10437_00513  3       12       2.212292   Uncharacterised protein
NCTC10437_00514  2       11       2.962716   putative ferredoxin
NCTC10437_00515  2       11       2.963159   cytochrome P450
NCTC10437_00516  1       10       2.967591   MMPL domain-containing protein
NCTC10437_00517  2       10       2.238416   MarR family transcriptional regulator
NCTC10437_00518  3       11       2.018153   Na+ antiporter
NCTC10437_00519  2       13       0.958806   formate-dependent phosphoribosylglycinamide
NCTC10437_00520  1       14       -0.450824  Conserved membrane protein of uncharacterised
NCTC10437_00521  0       13       0.001272   Rhodanese domain-containing protein
NCTC10437_00522  1       14       0.241865   O-succinylhomoserine sulfhydrylase
NCTC10437_00523  2       13       0.626605   Uncharacterised protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00516  ACRIFLAVINRP  54     1e-09    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 54.5 bits (131), Expect = 1e-09
Identities = 53/274 (19%), Positives = 104/274 (37%), Gaps = 19/274 (6%)

Query: 63 LVVTRSDGAQLSPADLNA--VDSARGRMLAAADTGPAAASAPPPQVSEDGQAAVATVPLS 120
L V ++ P D++ V SA G M+ + + P++ +
Sbjct: 770 LYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEI-QG 828

Query: 121 ADLSGFALTDKVDELRAAATDGLPGDLAAQVTGGPAFGADIANSFSGANFTLLAVTAVVV 180
G + D + + A+ LP + TG N L+A++ VVV
Sbjct: 829 EAAPGTSSGDAMALMENLASK-LPAGIGYDWTGMSYQERLSGNQAP----ALVAISFVVV 883

Query: 181 ALLLIVTYRSPVLWLVPLLVIGFADRVAAVLGVAVAEALGMSPDGSTSGITSVLVFGAGT 240
L L Y S W +P+ V+ ++GV +A L + + + G
Sbjct: 884 FLCLAALYES---WSIPVSVM--LVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSA 938

Query: 241 NYALLLISRYREELGR-TESHREA-LDTAVRRAGPAIIASNATVVLALLTLLFAS---SP 295
A+L++ ++ + + + EA L R P I+ ++ +L +L L ++ S
Sbjct: 939 KNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRP-ILMTSLAFILGVLPLAISNGAGSG 997

Query: 296 STRSLGVQAAAGLVVAAVFVLLVLPTALGLFGRK 329
+ ++G+ G+V A + + +P + R
Sbjct: 998 AQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRC 1031



Score = 49.5 bits (118), Expect = 5e-08
Identities = 36/172 (20%), Positives = 64/172 (37%), Gaps = 19/172 (11%)

Query: 501 AGHDRAVVIPAILIVVLLVLYLLLRSALAPLVLVGVTVLSALAALGLGGWASVHLFGFPA 560
+G+ ++ +VV L L L S P+ V+ + +G + LF
Sbjct: 868 SGNQAPALVAISFVVVFLCLAALYESWSIPVS-----VMLVVPLGIVGVLLAATLFNQKN 922

Query: 561 LDNTAPLFAFLFLVALGVDYTIFLVTRAREETPEHGTTDGIVRAVSATGAVISSAGIVLA 620
+ + L + L I +V A++ + G V + + I++
Sbjct: 923 --DVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKG---VVEATLMAVRMRLRPILMT 977

Query: 621 AVFCVLGVLPLI--------VLTQVGIIVGLGILLDTFLVRTVIIPALFTLI 664
++ +LGVLPL VGI V G++ T L +P F +I
Sbjct: 978 SLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLA-IFFVPVFFVVI 1028



Score = 47.5 bits (113), Expect = 2e-07
Identities = 41/162 (25%), Positives = 65/162 (40%), Gaps = 20/162 (12%)

Query: 514 IVVLLVLYLLL---RSALAPLVLVGVTVLSALAALGLGGWASVHLFGFPALDNTAPLFAF 570
++V LV+YL L R+ L P + V V +L A L FG+ NT +F
Sbjct: 349 MLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAIL--------AAFGYSI--NTLTMFGM 398

Query: 571 LFLVALGVDYTIFLVTRAREETPEHGTT--DGIVRAVSATGAVISSAGIVLAAVF---CV 625
+ + L VD I +V E + +++S + +VL+AVF
Sbjct: 399 VLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAF 458

Query: 626 LGVLPLIVLTQVGIIVGLGILLDTFLVRTVIIPALF-TLIGP 666
G + Q I + + L + + PAL TL+ P
Sbjct: 459 FGGSTGAIYRQFSITIVSAMALSVLVALIL-TPALCATLLKP 499



Score = 34.8 bits (80), Expect = 0.001
Identities = 30/226 (13%), Positives = 76/226 (33%), Gaps = 13/226 (5%)

Query: 106 VSEDGQAAVA-TVPLSADLSGFALTDKVDELRAAATDGLPGDLAAQVTGGPAFGADIANS 164
+G+ A + L+ + + A P + + S
Sbjct: 279 ARINGKPAAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTT--PFVQLS 336

Query: 165 FSGANFTLLAVTAVVVALLLIVTYRSPVLWLVPLLVIGFADRVAAVLGVAVAEALGMSPD 224
TL ++V L++ + ++ L+P + + V + A+ A G S +
Sbjct: 337 IHEVVKTLF-EAIMLVFLVMYLFLQNMRATLIPTIAV----PVVLLGTFAILAAFGYSIN 391

Query: 225 GSTSGITSVLVFGAGTNYALLLISR-YREELGRTESHREALDTAVRRAGPAIIASNATVV 283
T VL G + A++++ R + +EA + ++ + A++ +
Sbjct: 392 TLTMF-GMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLS 450

Query: 284 LALLTLLFASSPST---RSLGVQAAAGLVVAAVFVLLVLPTALGLF 326
+ + F + R + + + ++ + L++ P
Sbjct: 451 AVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATL 496


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00520  PilS_PF08805  28     0.009    PilS N terminal
		>PilS_PF08805#PilS N terminal

Length = 185

Score = 28.4 bits (63), Expect = 0.009
Identities = 10/44 (22%), Positives = 19/44 (43%), Gaps = 3/44 (6%)

Query: 5 DEPADRPAAWPF-LAALAIIVIVVIGIWLL--DVFSGDDLTDEQ 45
+ D+ A L + +IV++ + L V S ++EQ
Sbjct: 21 KKEQDKGATLMEVLLVVGVIVVLAASAYKLYSMVQSNIQSSNEQ 64


10  NCTC10437_00545  NCTC10437_00556  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00545  2       13       3.445218   Conserved membrane protein of uncharacterised
NCTC10437_00546  3       13       4.014031   ADP-ribose pyrophosphatase
NCTC10437_00547  3       13       3.598962   thiamine-phosphate diphosphorylase
NCTC10437_00549  5       14       3.418666   thiamine biosynthesis protein ThiS
NCTC10437_00550  2       12       2.909372   thiazole synthase
NCTC10437_00551  4       14       2.427042   GDSL family lipase
NCTC10437_00552  3       12       1.350355   ABC transporter--like protein
NCTC10437_00553  3       13       1.078677   putative ABC transporter membrane protein
NCTC10437_00554  3       12       0.619791   alpha/beta hydrolase fold protein
NCTC10437_00555  2       11       -0.443197  aminopeptidase Y
NCTC10437_00556  2       10       -0.565839  aminopeptidase Y
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_00551  RTXTOXINA  29     0.017    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.2 bits (65), Expect = 0.017
Identities = 17/60 (28%), Positives = 25/60 (41%), Gaps = 3/60 (5%)

Query: 29 GRSQRNYPHLVAAALGLDLVDVTYSGATTAHVLSESQRSAPPQIDALDGTETLVTVTIGG 88
G +N P+L GLD V S + + +LS + + A G E L T +G
Sbjct: 226 GNKLQNLPNLDNIGAGLDTVSGILSAISASFILSNADADTRTKAAA--GVE-LTTKVLGN 282


11  NCTC10437_00592  NCTC10437_00605  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00592  3       12       -0.123573  putative monovalent cation/H+ antiporter subunit
NCTC10437_00593  2       11       -0.086552  putative monovalent cation/H+ antiporter subunit
NCTC10437_00594  2       12       0.115919   putative monovalent cation/H+ antiporter subunit
NCTC10437_00595  2       11       0.578547   putative monovalent cation/H+ antiporter subunit
NCTC10437_00596  1       10       1.837076   putative monovalent cation/H+ antiporter subunit
NCTC10437_00597  2       11       2.023933   NADH:ubiquinone oxidoreductase subunit 5 (chain
NCTC10437_00598  2       9        2.338942   TrkA domain-containing protein
NCTC10437_00599  1       10       3.051146   NhaP-type Na+(K+)/H+ antiporter
NCTC10437_00600  0       11       3.532775   protein of uncharacterised function (DUF1707)
NCTC10437_00601  0       12       3.389241   vesicle-fusing ATPase
NCTC10437_00602  0       13       3.162353   CDP-diacylglycerol--serine
NCTC10437_00603  2       13       3.599304   phosphatidylserine decarboxylase
NCTC10437_00604  3       15       3.229416   molybdenum cofactor synthesis domain-containing
NCTC10437_00605  2       13       2.368854   short chain dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00605  DHBDHDRGNASE  59     4e-12    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 58.5 bits (141), Expect = 4e-12
Identities = 54/206 (26%), Positives = 85/206 (41%), Gaps = 19/206 (9%)

Query: 14 GRVAIVTGANTGIGYEAALVLAGKGAKVVVAVRNLDKGAAAVAALKRVHPGADVTVQELD 73
G++A +TGA GIG A LA +GA + N +K V++LK A+ D
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAF--PAD 65

Query: 74 LSSLASVRAAAEDLGAALPRIDLLINNAGVMYPP--KQTTADGFELQFGTNHLGHFALTG 131
+ A++ + + ID+L+N AGV+ P + + +E F N G F +
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 132 LLLHNLLDVPGSRVVTVASLAHRQLADIHFDDLQWERKYNRVAAYGQSKLANLMFTYELQ 191
+ ++D +VTV S +AAY SK A +MFT L
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGV-------------PRTSMAAYASSKAAAVMFTKCLG 172

Query: 192 RRLGAAGASTIAVAAHPGISNTELMR 217
L V+ PG + T++
Sbjct: 173 LELAEYNIRCNIVS--PGSTETDMQW 196


12  NCTC10437_00644  NCTC10437_00668  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00644  0       10       3.143901   putative GAF sensor protein
NCTC10437_00645  1       11       2.899097   Uncharacterized protein conserved in bacteria
NCTC10437_00646  2       11       3.629750   hydrophobic protein
NCTC10437_00647  2       12       4.106364   transmembrane protein
NCTC10437_00648  -1      10       2.389587   extracellular solute-binding protein
NCTC10437_00649  1       10       2.058524   binding-protein-dependent transport system inner
NCTC10437_00650  1       11       1.525934   binding-protein-dependent transport system inner
NCTC10437_00651  0       10       1.706689   ABC transporter-like protein
NCTC10437_00652  -1      10       1.196193   ABC transporter-like protein
NCTC10437_00653  -1      11       0.656117   dihydrolipoamide dehydrogenase
NCTC10437_00654  2       10       2.626349   Uncharacterised protein
NCTC10437_00655  1       10       2.530667   diguanylate cyclase
NCTC10437_00656  0       12       1.549296   alpha/beta hydrolase fold protein
NCTC10437_00657  1       11       0.169983   Uncharacterised protein
NCTC10437_00658  1       12       -0.186892  2-polyprenyl-6-methoxyphenol hydroxylase-like
NCTC10437_00659  2       14       -0.498791  TetR family transcriptional regulator
NCTC10437_00660  2       12       -1.122745  carboxymuconolactone decarboxylase
NCTC10437_00661  1       11       -1.856061  transcriptional regulator
NCTC10437_00662  1       12       -2.256801  acyl-ACP thioesterase
NCTC10437_00663  1       12       -1.902760  isocitrate lyase
NCTC10437_00664  2       10       -2.509612  3-hydroxyacyl-CoA dehydrogenase
NCTC10437_00665  3       11       -2.133656  cullin, a subunit of E3 ubiquitin ligase
NCTC10437_00666  3       10       -2.091293  polyphosphate:nucleotide phosphotransferase
NCTC10437_00667  3       11       -0.959177  TetR family transcriptional regulator
NCTC10437_00668  3       12       -0.447522  transmembrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00648  FIMBRIALPAPF  29     0.021    Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.

Length = 167

Score = 29.3 bits (65), Expect = 0.021
Identities = 31/131 (23%), Positives = 53/131 (40%), Gaps = 11/131 (8%)

Query: 239 VDGTNLPPRLIDSVKGDQVQTVAAQSADWRGVAL---PAGNPFTADPQARLAMNLGVDRA 295
VD N+ P +D+ +G+ + ++ S ++ +L GN LA N+
Sbjct: 44 VDFGNINPEHVDNSRGEVTKNISI-SCPYKSGSLWIKVTGNTMGVGQNNVLATNITHFGI 102

Query: 296 AMVRDVLVGYGREASTPVAEAYGDAYNPDAQYRFDAARATTLLDEAGWRPGAGGIREKD- 354
A+ + G+ STP+ G D AR+T +R G+G + D
Sbjct: 103 ALYQ------GKGMSTPLTLGNGSGNGYRVTAGLDTARSTFTFTSVPFRNGSGILNGGDF 156

Query: 355 GARASFELLYN 365
AS ++YN
Sbjct: 157 RTTASMSMIYN 167


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00650  ACRIFLAVINRP  28     0.038    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.3 bits (63), Expect = 0.038
Identities = 13/62 (20%), Positives = 26/62 (41%), Gaps = 5/62 (8%)

Query: 81 AVAATALGLAIGVSAAMIGGWLDALVMRLVDGVNALPHLVVGIVIAAMWRGAPLAIIASI 140
A+ A + + AA+ W + + LV +P +VG+++AA + +
Sbjct: 874 ALVAISFVVVFLCLAALYESWSIPVSVMLV-----VPLGIVGVLLAATLFNQKNDVYFMV 928

Query: 141 AL 142
L
Sbjct: 929 GL 930


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_00651  HTHFIS  31     0.003    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.3 bits (71), Expect = 0.003
Identities = 15/46 (32%), Positives = 20/46 (43%)

Query: 13 DIDLYRGRDSAAVHVLDDVNLAVPAGRVTALIGESGCGKSLVASAL 58
D GR +A + + + + GESG GK LVA AL
Sbjct: 135 DGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00659  TETREPRESSOR  60     3e-13    Tetracycline repressor protein signature.
		>TETREPRESSOR#Tetracycline repressor protein signature.

Length = 218

Score = 60.3 bits (146), Expect = 3e-13
Identities = 60/215 (27%), Positives = 92/215 (42%), Gaps = 27/215 (12%)

Query: 15 LSKERIVAAAIEILDADGEKALTFRALTARLATGAGAIYWHVADKDALLAATTDHVIAGA 74
L++E ++ AA+E+L+ G LT R L +L +YWHV +K ALL A ++A
Sbjct: 4 LNRESVIDAALELLNETGIDGLTTRKLAQKLGIEQPTLYWHVKNKRALLDALAVEILARH 63

Query: 75 LVDGLSGAEPGDA----IRAIALGLFDAIDAHPWVGA--HLARQPWQLAMVQIFEGVGRA 128
S G++ +R A+ A+ + GA HL +P + + E R
Sbjct: 64 HD--YSLPAAGESWQSFLRNNAMSFRRALLRYR-DGAKVHLGTRPDEKQYDTV-ETQLRF 119

Query: 129 LHGLGVPDEALFDAGSALVNYILG-VAGQNAANARLMPPGTDRDAVLGDITRRWLSYDAD 187
+ G A SA+ ++ LG V Q A L TDR A +
Sbjct: 120 MTENGFSLRDGLYAISAVSHFTLGAVLEQQEHTAAL----TDRPAA-----------PDE 164

Query: 188 RYPFVHRVAPQLREHDDRDQ-FLAGIEIFLAGVAA 221
P + R A Q+ + DD +Q FL G+E + G
Sbjct: 165 NLPPLLREALQIMDSDDGEQAFLHGLESLIRGFEV 199


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00667  HTHTETR  49     3e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 48.9 bits (116), Expect = 3e-09
Identities = 29/145 (20%), Positives = 53/145 (36%), Gaps = 7/145 (4%)

Query: 17 RRWHKHKVERRNELVDGTLEAIRRLG-SNVSMDEIAAEIGVSKTVLYRYFVDKNDLTTAV 75
R+ + E R ++D L + G S+ S+ EIA GV++ +Y +F DK+DL + +
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 76 M---MRFAQVTLIPNMAAALSSNLDGYDLTREIIRVYVDTVANEPEPYRFVMANNSASKS 132
+ A L REI+ +++ E + +
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVL---REILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 133 KAVADSEQIIARMLAVMFRRRMVEV 157
Q R L + R+ +
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQT 144


13  NCTC10437_00715  NCTC10437_00748  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00715  3       12       0.902164   Uncharacterised protein
NCTC10437_00716  2       13       1.933406   pyridoxamine 5'-phosphate oxidase
NCTC10437_00717  2       13       2.232897   cytochrome P450
NCTC10437_00718  1       13       1.691980   Predicted membrane protein
NCTC10437_00719  3       15       1.879926   deazaflavin-dependent nitroreductase family
NCTC10437_00720  1       14       1.344018   purine-cytosine permease-like transporter
NCTC10437_00721  2       16       1.286920   glutamate-1-semialdehyde-2,1-aminomutase
NCTC10437_00722  1       15       0.976437   fructose-2,6-bisphosphatase
NCTC10437_00723  0       15       0.697558   alkyl hydroperoxide reductase/ Thiol specific
NCTC10437_00724  -1      15       0.898536   cytochrome c biogenesis protein
NCTC10437_00725  -3      12       0.720496   ResB family protein
NCTC10437_00726  0       11       1.993697   cytochrome C biogenesis protein CcsB
NCTC10437_00727  0       13       2.716286   chromosome partitioning ATPase
NCTC10437_00728  2       14       3.381380   Uncharacterised protein
NCTC10437_00729  3       13       3.507496   Conserved membrane protein of uncharacterised
NCTC10437_00730  3       12       2.603211   SnoaL-like polyketide cyclase
NCTC10437_00731  3       12       2.385522   Uncharacterised protein
NCTC10437_00732  4       12       2.051372   Uncharacterised protein
NCTC10437_00733  2       11       1.753543   Uncharacterised protein
NCTC10437_00734  2       9        2.779480   Uncharacterised protein
NCTC10437_00735  2       9        2.639026   lipoprotein
NCTC10437_00736  2       12       3.387262   ATPase
NCTC10437_00737  1       12       3.508018   1,4-dihydroxy-2-naphthoate prenyltransferase
NCTC10437_00738  0       12       3.427793   Uncharacterised protein
NCTC10437_00739  1       12       3.333165   EmbB
NCTC10437_00740  0       14       2.761258   methylthioadenosine phosphorylase
NCTC10437_00741  1       14       3.175879   nucleoside-diphosphate-sugar epimerase
NCTC10437_00742  1       14       3.017110   2-nitropropane dioxygenase
NCTC10437_00743  3       16       2.915717   signal transduction histidine kinase
NCTC10437_00744  4       17       2.990415   response regulator with CheY-like receiver
NCTC10437_00745  4       16       3.600265   glycosyl transferase family protein
NCTC10437_00746  3       14       2.597804   glycosyltransferase
NCTC10437_00747  2       11       2.793875   type 12 methyltransferase
NCTC10437_00748  2       11       2.695202   sulfite oxidase-like oxidoreductase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00721  FLGMRINGFLIF  30     0.028    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 29.5 bits (66), Expect = 0.028
Identities = 16/84 (19%), Positives = 31/84 (36%), Gaps = 1/84 (1%)

Query: 277 AAFGGRAEVMQRLAPLGPVYQAGTLSGNPVAMAAGLATLRAADDAVYATLDANADRL-AG 335
A+ + + ++ L L + + A+A +A + A Y TL +N G
Sbjct: 4 ASTATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGG 63

Query: 336 LITDALTEAGVAHQIPRAGNMFSV 359
I LT+ + ++ V
Sbjct: 64 AIVAQLTQMNIPYRFANGSGAIEV 87


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_00736  HTHFIS  32     0.003    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.1 bits (73), Expect = 0.003
Identities = 32/162 (19%), Positives = 60/162 (37%), Gaps = 16/162 (9%)

Query: 17 ETARRIVAALSEAF--SAKVVGQ----EHLRETLLIGLLTGGHILLESVPGLAKTTAARV 70
+R + L + +VG+ + + L + T +++ G K AR
Sbjct: 120 AEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 71 IAES---IDGQFRRIQC---TPDLLPSDIIGTQIYDATTTSFVTQLGPV---HANVVLLD 121
+ + +G F I DL+ S++ G + A T + G + LD
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHE-KGAFTGAQTRSTGRFEQAEGGTLFLD 238

Query: 122 EINRSSAKTQSAMLEAMEERQTTIAGTEYPIPEPFLVIATQN 163
EI Q+ +L +++ + T G PI ++A N
Sbjct: 239 EIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATN 280


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00741  NUCEPIMERASE  167    3e-51    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 167 bits (424), Expect = 3e-51
Identities = 92/363 (25%), Positives = 135/363 (37%), Gaps = 66/363 (18%)

Query: 1 MRVLLTGAAGFIGTRIRAQLTDAGHDVVAVDAM-------LPAAHGEGATVPD-DVRDLD 52
M+ L+TGAAGFIG + +L +AGH VV +D + L A E P +D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 53 VRDAAAVAD--ALTGVDVVCHQAAVVGAGVNAADSPSYASHNDYGTAVLLAEMFAAGCTR 110
+ D + D A + V + + + +YA N G +L
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 111 LVLASSMVVYGQGRFVCPDHGDVEPPPRTREDLDAGMFDHRCPLGGEQVDWALVGEDAPL 170
L+ ASS VYG P + +D DH
Sbjct: 121 LLYASSSSVYGLN----------RKMPFSTDD----SVDH-------------------- 146

Query: 171 RPRSLYAASKVAQEHYALAWSESTGGSVVALRYHNVYGPGMPRDTPYSGVAAIFRSSLEK 230
P SLYAA+K A E A +S G LR+ VYGP D F ++ +
Sbjct: 147 -PVSLYAATKKANELMAHTYSHLYGLPATGLRFFTVYGPWGRPDMALF----KFTKAMLE 201

Query: 231 GDAPKVYEDGGQMRDFVHVDDIAAANVASVEA-----------------GLDGFEAFNVC 273
G + VY G RDF ++DDIA A + + + + +N+
Sbjct: 202 GKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIG 261

Query: 274 SGKPVSILDVATALCETRGDVTPVVTGQYRSGDVRHIVADPARAADVLGFRAAVDPADGL 333
+ PV ++D AL + G + GDV AD +V+GF DG+
Sbjct: 262 NSSPVELMDYIQALEDALGIEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGV 321

Query: 334 REF 336
+ F
Sbjct: 322 KNF 324


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_00744  HTHFIS  105    2e-28    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 105 bits (264), Expect = 2e-28
Identities = 41/150 (27%), Positives = 63/150 (42%), Gaps = 6/150 (4%)

Query: 2 TRVLIADDDGVVRDVVRRYLERDGLDVSTAHDGTEALRLLGSQHIDVAVLDVMMPGPDGL 61
+L+ADDD +R V+ + L R G DV + R + + D+ V DV+MP +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 SLCRSLRQRGHSGGYGMPVILLTALGEEDDRIAGLEAGADDYLTKPFSPRELALRVKSVL 121
L +++ +PV++++A I E GA DYL KPF EL + L
Sbjct: 64 DLLPRIKKARP----DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 122 RRSPSAVGVQPVDITSGDLTV--STASRAV 149
D G V S A + +
Sbjct: 120 AEPKRRPSKLEDDSQDGMPLVGRSAAMQEI 149


14  NCTC10437_00866  NCTC10437_00897  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00866  2       10       -0.081818  transmembrane proteinm, MmpS5
NCTC10437_00867  1       10       0.820919   TetR family transcriptional regulator
NCTC10437_00868  0       14       0.545759   Transposase, IS30 family
NCTC10437_00869  -1      20       0.951960   putative dehydrogenase
NCTC10437_00870  0       17       1.592251   ABC-type spermidine/putrescine transport system,
NCTC10437_00871  4       11       -1.640039  ABC-type spermidine/putrescine transport system,
NCTC10437_00872  7       18       -4.730081  spermidine/putrescine ABC transporter
NCTC10437_00873  7       28       -6.979113  Spermidine/putrescine-binding periplasmic
NCTC10437_00874  7       37       -8.104046  sugar phosphate isomerase/epimerase
NCTC10437_00875  8       44       -8.593829  transcriptional regulator/sugar kinase
NCTC10437_00876  8       46       -9.854222  Uncharacterized protein conserved in bacteria
NCTC10437_00877  7       44       -9.721037  RNA-directed DNA polymerase
NCTC10437_00878  3       36       -7.622734  Domain of uncharacterised function DUF87
NCTC10437_00879  4       31       -6.441112  Uncharacterised protein
NCTC10437_00880  2       24       -5.243701  Site-specific DNA methylase
NCTC10437_00881  1       17       -4.408263  transposase ISMyma01_aa2-like protein
NCTC10437_00882  1       20       -4.602781  transposase IS3/IS911 family protein
NCTC10437_00883  1       22       -4.505638  fatty acid desaturase
NCTC10437_00884  1       22       -3.028287  Uncharacterised protein
NCTC10437_00885  1       17       -1.007289  Uncharacterised protein
NCTC10437_00886  1       15       -0.242601  Uncharacterised protein
NCTC10437_00887  0       12       0.114319   Uncharacterised protein
NCTC10437_00888  0       9        0.443143   ESX-1 secretion-associated protein EspE
NCTC10437_00889  2       9        1.820010   Uncharacterized conserved protein
NCTC10437_00890  0       9        1.347298   allophanate hydrolase subunit 2
NCTC10437_00891  0       9        1.061236   allophanate hydrolase subunit 1
NCTC10437_00892  1       9        1.287091   Mn2+/Fe2- transporter NRAMP family
NCTC10437_00893  2       10       1.678337   GntR family transcriptional regulator
NCTC10437_00894  1       11       1.109941   Transcriptional regulator, effector-binding
NCTC10437_00895  1       11       1.252829   prevent-host-death family protein
NCTC10437_00896  1       11       1.410440   ribonuclease H
NCTC10437_00897  2       16       0.972993   VanW family protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00867  HTHTETR  57     7e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 56.6 bits (136), Expect = 7e-12
Identities = 24/106 (22%), Positives = 47/106 (44%), Gaps = 3/106 (2%)

Query: 24 ASSLRKDAERNRRRIIEAARELCATRGLEA-TLNEVAHHANLGVGTVYRRFPTKDLLFEA 82
A +++A+ R+ I++ A L + +G+ + +L E+A A + G +Y F K LF
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 83 IFQDGMDQLVEIADAALE--MEDSWEAFASLVWQLCELTATDRGLR 126
I++ + E+ D ++ + E T T+ R
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRR 107


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00872  PF05272  28     0.047    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.047
Identities = 10/20 (50%), Positives = 12/20 (60%)

Query: 34 TLLGPSGSGKSTLLRIIAGL 53
L G G GKSTL+ + GL
Sbjct: 600 VLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00876  PYOCINKILLER  32     0.007    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 32.5 bits (73), Expect = 0.007
Identities = 42/194 (21%), Positives = 71/194 (36%), Gaps = 33/194 (17%)

Query: 48 EIASDGSSNEVLWEDEDGQQHTVRAGTGGPSPAVAVFTKAWVQRNLSQFLDGATASPIVT 107
+I S G+ N + E+ + VR G A F L + ++G TA+ V
Sbjct: 140 KITSLGAKNFLTRTAEEIGEQAVREGNINGPEAYMRF--------LDREMEGLTAAYNVK 191

Query: 108 LGEEAIEAKEREKELSRQIEDHRAKAQEADKKRQDHTSRARKLATAAQTAVADQLREFDY 167
L EAI + + + A + Q RK A+ A +
Sbjct: 192 LFTEAISSLQIRMNTLTAAKASIEAAAANKAREQAAAEAKRKAEEQARQQAAIR------ 245

Query: 168 QRFSKNRYSMPVVEGKLRDYNGEFPDESQHAEALKRLGEGAADRVRAIPDPPTGRAVDLE 227
+ N Y+MP + + L ++ +GAA +AI D A+ +
Sbjct: 246 ---AANTYAMPANGSVV---------ATAAGRGLIQVAQGAASLAQAISD-----AIAV- 287

Query: 228 FVGALLAETPTRVA 241
+G +LA P+ +A
Sbjct: 288 -LGRVLASAPSVMA 300


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits  Score  E-value  Comments
NCTC10437_00896  SECA  33     0.001    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 33.3 bits (76), Expect = 0.001
Identities = 17/71 (23%), Positives = 32/71 (45%), Gaps = 8/71 (11%)

Query: 165 TKLIG-PQVVLVAELRAIGAAVQKLRGRDITVLSDSRL---AVEMVQRWRDG---DDVLP 217
TK+ G + +R + + + ++ LSD L E R G ++++P
Sbjct: 7 TKVFGSRNDRTLRRMRKVVNIINAME-PEMEKLSDEELKGKTAEFRARLEKGEVLENLIP 65

Query: 218 EGYAIYRESGK 228
E +A+ RE+ K
Sbjct: 66 EAFAVVREASK 76


15  NCTC10437_01008  NCTC10437_01025  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_01008  0       13       -3.200345  beta-lactamase domain-containing protein
NCTC10437_01011  4       11       -3.579386  **50S ribosomal protein L33
NCTC10437_01012  4       12       -3.159270  (3R)-hydroxyacyl-ACP dehydratase subunit HadA
NCTC10437_01013  5       16       -2.138322  (3R)-hydroxyacyl-ACP dehydratase subunit HadB
NCTC10437_01014  5       15       -1.714738  (3R)-hydroxyacyl-ACP dehydratase subunit HadC
NCTC10437_01016  4       13       -1.446533  *preprotein translocase subunit SecE
NCTC10437_01017  4       14       -2.201755  transcription antitermination protein nusG
NCTC10437_01018  2       14       -2.399350  50S ribosomal protein L11
NCTC10437_01019  1       12       -2.082181  50S ribosomal protein L1
NCTC10437_01020  1       12       -2.052212  Uncharacterised protein
NCTC10437_01021  1       10       -1.973560  PE-PPE domain-containing protein
NCTC10437_01022  1       11       -1.439773  cyclopropane-fatty-acyl-phospholipid synthase
NCTC10437_01023  1       11       -0.614590  methyltransferase, cyclopropane fatty acid
NCTC10437_01024  1       11       -0.201334  alpha/beta hydrolase fold protein
NCTC10437_01025  2       12       -0.166426  putative unusual protein kinase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_01016  SECETRNLCASE  39     7e-07    Bacterial translocase SecE signature.
		>SECETRNLCASE#Bacterial translocase SecE signature.

Length = 127

Score = 39.5 bits (92), Expect = 7e-07
Identities = 18/57 (31%), Positives = 32/57 (56%)

Query: 94 VINYLQEVIGELRKVIWPNRKQMASYTTVVLFFLIFMVAMIGGVDLGLAKLVTWVFG 150
+ + +E E+RKVIWP R++ T +V M ++ G+D L +LV+++ G
Sbjct: 68 TVAFAREARTEVRKVIWPTRQETLHTTLIVAAVTAVMSLILWGLDGILVRLVSFITG 124


16  NCTC10437_01036  NCTC10437_01047  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_01036  2       17       -1.505026  LSU ribosomal protein L12P
NCTC10437_01037  1       15       -1.269135  ABC-type transport system involved in resistance
NCTC10437_01038  1       15       -1.103839  esterase/lipase
NCTC10437_01039  1       17       -1.405021  esterase/lipase
NCTC10437_01040  0       17       -1.784949  DNA-directed RNA polymerase subunit beta
NCTC10437_01041  1       15       -1.372045  DNA-directed RNA polymerase subunit beta'
NCTC10437_01042  3       14       2.808375   putative hydrolase
NCTC10437_01043  3       12       1.975053   Phytanoyl-CoA dioxygenase (PhyH)
NCTC10437_01044  2       11       1.268219   endonuclease IV
NCTC10437_01045  2       11       0.660836   PA-phosphatase-like phosphoesterase
NCTC10437_01046  3       14       0.894055   FkbM family methyltransferase
NCTC10437_01047  3       12       1.528282   Uncharacterised protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_01039  BINARYTOXINB  29     0.048    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 29.3 bits (65), Expect = 0.048
Identities = 23/103 (22%), Positives = 34/103 (33%), Gaps = 14/103 (13%)

Query: 72 SNDTDADDTDADDTDVDADAD---DAGDADGDTAIDTAPETAAGGKDEDTAPDADADQGE 128
SN T A T D D D D+ + +G T +D + +
Sbjct: 190 SNSRKKRSTSAGPTVPDRDNDGIPDSLEVEGYT-VDVKNKRTFLSPWISNIHEKKGLTKY 248

Query: 129 VSD-----TAADPGTDIDDSDAAESIQPS---EPSDPSLPAEP 163
S TA+DP +D + I + E P + A P
Sbjct: 249 KSSPEKWSTASDPYSDFE--KVTGRIDKNVSPEARHPLVAAYP 289


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_01047  cloacin  39     1e-04    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 38.5 bits (89), Expect = 1e-04
Identities = 28/85 (32%), Positives = 35/85 (41%)

Query: 662 GGAGSAGASGKAGSGGTVGTPGTSDGKGGAGGAGGKGGDGGSGGAGGSSGGSGSGGSGGS 721
GG G +G + G + T G GG G + GGS G GG G
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 722 GGSGGGGGAGGGGGAGGSGGSGGAA 746
G GG G +GGG G GG+ + A
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAP 87



Score = 36.2 bits (83), Expect = 5e-04
Identities = 29/85 (34%), Positives = 34/85 (40%)

Query: 382 TGGTGGTGGTGATGGKGGVGGAGATGGTGGTGGTGGTGGTGGTGGTGGTGGTGGAGGIGG 441
+GG G TGA G + G G GG G + GG+G GG G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 442 KGGTGGTGGTGGTGGTGGTGGAGGT 466
G GG G +GG GTGG A
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLSAVAA 86



Score = 35.8 bits (82), Expect = 6e-04
Identities = 30/100 (30%), Positives = 35/100 (35%)

Query: 373 TGGAGGAGGTGGTGGTGGTGATGGKGGVGGAGATGGTGGTGGTGGTGGTGGTGGTGGTGG 432
+GG G TG +G GVGG + G + GG+G GG G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 433 TGGAGGIGGKGGTGGTGGTGGTGGTGGTGGAGGTGAGGTG 472
G GG G GG GTGG G G G
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAG 101



Score = 35.8 bits (82), Expect = 6e-04
Identities = 32/78 (41%), Positives = 35/78 (44%), Gaps = 2/78 (2%)

Query: 659 GGKGGAGSAGASGKAGSGGTVGTPGTSDGKGGAGGAGGKGGDGGSGGAGGSSGGSGSGGS 718
G GA S + G G G SDG G + GG GSG GGSG G
Sbjct: 8 GHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGG--GSGSGIHWGGGSGHGNG 65

Query: 719 GGSGGSGGGGGAGGGGGA 736
GG+G SGGG G GG A
Sbjct: 66 GGNGNSGGGSGTGGNLSA 83



Score = 35.8 bits (82), Expect = 7e-04
Identities = 29/86 (33%), Positives = 33/86 (38%)

Query: 479 LGGVGGTGGTGGTGGTGGTGGAGGTGGTGGAGGQGGTGGTGGTGGTGGTGGAGGLGGDGG 538
+ G G G G T G G TG G G G+G + GG G+G G G
Sbjct: 1 MSGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGS 60

Query: 539 AGGAGGTGGNGGAGGKGGLGGSTGGA 564
G GG GN G G G S A
Sbjct: 61 GHGNGGGNGNSGGGSGTGGNLSAVAA 86



Score = 35.5 bits (81), Expect = 7e-04
Identities = 29/84 (34%), Positives = 33/84 (39%)

Query: 426 GTGGTGGTGGAGGIGGKGGTGGTGGTGGTGGTGGTGGAGGTGAGGTGFGGQGGLGGVGGT 485
G G G GA G G TG G G + G+G + G G G GG G
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 486 GGTGGTGGTGGTGGAGGTGGTGGA 509
G GG G +GG G GG A
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAA 86



Score = 35.1 bits (80), Expect = 0.001
Identities = 29/84 (34%), Positives = 34/84 (40%)

Query: 417 GTGGTGGTGGTGGTGGTGGAGGIGGKGGTGGTGGTGGTGGTGGTGGAGGTGAGGTGFGGQ 476
G G G G T G G G G G + G+G + GG G+G G G
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 477 GGLGGVGGTGGTGGTGGTGGTGGA 500
G GG G +GG GTGG A
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAA 86



Score = 34.3 bits (78), Expect = 0.002
Identities = 27/82 (32%), Positives = 34/82 (41%)

Query: 640 AGSAGTEGATGAAGSTGATGGKGGAGSAGASGKAGSGGTVGTPGTSDGKGGAGGAGGKGG 699
+G G TGA ++G G G GSG + G G GG G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 700 DGGSGGAGGSSGGSGSGGSGGS 721
G GG G S GGSG+GG+ +
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLSA 83



Score = 33.9 bits (77), Expect = 0.003
Identities = 26/80 (32%), Positives = 28/80 (35%)

Query: 454 TGGTGGTGGAGGTGAGGTGFGGQGGLGGVGGTGGTGGTGGTGGTGGAGGTGGTGGAGGQG 513
+GG G G G GG GLG GG G G G G GG G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 514 GTGGTGGTGGTGGTGGAGGL 533
G G GG+G G L
Sbjct: 62 HGNGGGNGNSGGGSGTGGNL 81



Score = 33.1 bits (75), Expect = 0.005
Identities = 32/104 (30%), Positives = 35/104 (33%)

Query: 439 IGGKGGTGGTGGTGGTGGTGGTGGAGGTGAGGTGFGGQGGLGGVGGTGGTGGTGGTGGTG 498
+ G G G G T G G G GG G GG+G GG
Sbjct: 1 MSGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGS 60

Query: 499 GAGGTGGTGGAGGQGGTGGTGGTGGTGGTGGAGGLGGDGGAGGA 542
G G GG G +GG GTGG G L G G A
Sbjct: 61 GHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA 104



Score = 32.8 bits (74), Expect = 0.006
Identities = 30/75 (40%), Positives = 32/75 (42%), Gaps = 13/75 (17%)

Query: 686 DGKGGAGGAGGKGGD--GGSGGAGGSSGGSG-----------SGGSGGSGGSGGGGGAGG 732
DG+G GA G+ GG G G G S GGSG GGG G G
Sbjct: 5 DGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGN 64

Query: 733 GGGAGGSGGSGGAAG 747
GGG G SGG G G
Sbjct: 65 GGGNGNSGGGSGTGG 79



Score = 32.8 bits (74), Expect = 0.006
Identities = 29/79 (36%), Positives = 33/79 (41%)

Query: 411 GTGGTGGTGGTGGTGGTGGTGGTGGAGGIGGKGGTGGTGGTGGTGGTGGTGGAGGTGAGG 470
G G G G T G G TG G G G+G + GG G+G G G+G
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 471 TGFGGQGGLGGVGGTGGTG 489
GG G GG GTGG
Sbjct: 63 GNGGGNGNSGGGSGTGGNL 81



Score = 32.0 bits (72), Expect = 0.009
Identities = 29/102 (28%), Positives = 36/102 (35%)

Query: 354 GSGGLFGKGGNAGNAGNGGTGGAGGAGGTGGTGGTGGTGATGGKGGVGGAGATGGTGGTG 413
G G G +GN G G G G + G+G + GG G+G G G
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 414 GTGGTGGTGGTGGTGGTGGTGGAGGIGGKGGTGGTGGTGGTG 455
G GG G G G G + A + T G GG
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA 104



Score = 31.2 bits (70), Expect = 0.015
Identities = 25/80 (31%), Positives = 30/80 (37%)

Query: 306 GGQGAKGGTGGAGTAGSAGGTGGVGGDAPQVMGGAGSVGGKGGNGGNGGSGGLFGKGGNA 365
GG G TG T+G+ G G G+G GG GSG +G G
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 366 GNAGNGGTGGAGGAGGTGGT 385
GN G G G G G +
Sbjct: 63 GNGGGNGNSGGGSGTGGNLS 82



Score = 30.5 bits (68), Expect = 0.026
Identities = 27/79 (34%), Positives = 29/79 (36%)

Query: 408 GTGGTGGTGGTGGTGGTGGTGGTGGTGGAGGIGGKGGTGGTGGTGGTGGTGGTGGAGGTG 467
G G G G T G G TG G G G G + GG G+G G G
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 468 AGGTGFGGQGGLGGVGGTG 486
G G G GG G GG
Sbjct: 63 GNGGGNGNSGGGSGTGGNL 81



Score = 29.7 bits (66), Expect = 0.047
Identities = 22/77 (28%), Positives = 29/77 (37%)

Query: 345 GKGGNGGNGGSGGLFGKGGNAGNAGNGGTGGAGGAGGTGGTGGTGGTGATGGKGGVGGAG 404
G+G N G + G G G G + G+G + GG G+G G G G G
Sbjct: 6 GRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNG 65

Query: 405 ATGGTGGTGGTGGTGGT 421
G G G G +
Sbjct: 66 GGNGNSGGGSGTGGNLS 82


17  NCTC10437_01305  NCTC10437_01315  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
NCTC10437_01305  1  11  3.723245  putative F420-dependent oxidoreductase, Rv2161c
NCTC10437_01306  0  8  3.593782  Uncharacterised protein
NCTC10437_01307  1  8  3.746544  NLP/P60 protein
NCTC10437_01308  2  7  3.189587  Protein of uncharacterised function (DUF2580)
NCTC10437_01309  2  8  2.921650  uracil phosphoribosyltransferase
NCTC10437_01310  2  11  2.973678  phosphomannomutase
NCTC10437_01311  3  13  2.272925  MarR family transcriptional regulator
NCTC10437_01312  3  10  0.434049  putative ammonia monooxygenase
NCTC10437_01313  2  11  0.359993  purine nucleoside phosphorylase
NCTC10437_01314  2  10  -0.199119  amidohydrolase
NCTC10437_01315  2  11  -0.311687  amidohydrolase
18  NCTC10437_01375  NCTC10437_01438  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NCTC10437_01375  2  18  -0.079398  extracellular solute-binding protein family 1
NCTC10437_01376  2  20  0.990749  sugar ABC transporter integral membrane protein
NCTC10437_01377  1  17  1.690420  sugar-transport integral membrane protein ABC
NCTC10437_01378  1  13  2.062123  sugar ABC transporter ATP-binding protein
NCTC10437_01379  0  12  2.115326  putative acyltransferase
NCTC10437_01380  0  11  1.203439  ArsR family transcriptional regulator
NCTC10437_01381  -1  10  1.056896  sulfate transporter
NCTC10437_01382  1  16  0.117573  Uncharacterised protein
NCTC10437_01383  1  14  -0.460745  diguanylate cyclase
NCTC10437_01384  2  14  -0.962314  Fe-S metabolism associated SufE
NCTC10437_01385  1  12  -1.002692  rhodanese domain-containing protein
NCTC10437_01386  2  14  -1.288010  Uncharacterised protein
NCTC10437_01387  0  17  -1.123312  acyltransferase 3
NCTC10437_01388  0  15  -2.002046  Uncharacterised protein
NCTC10437_01389  3  19  -4.045955  Uncharacterised protein
NCTC10437_01390  6  27  -6.244420  Maf-like protein
NCTC10437_01391  7  31  -7.538748  acetyl-/propionyl-coenzyme A carboxylase AccE5
NCTC10437_01392  8  35  -8.966936  carboxyl transferase
NCTC10437_01393  9  52  -11.110783  Uncharacterised protein
NCTC10437_01394  10  53  -11.583258  Uncharacterised protein
NCTC10437_01395  9  53  -11.861416  lipopolysaccharide biosynthesis
NCTC10437_01396  9  49  -10.128780  glucose-1-phosphate thymidylyltransferase
NCTC10437_01397  10  47  -9.764185  dTDP-glucose 4,6-dehydratase
NCTC10437_01398  11  49  -9.966763  dTDP-4-dehydrorhamnose reductase
NCTC10437_01399  11  46  -9.516436  Uncharacterised protein
NCTC10437_01400  9  42  -9.046378  Uncharacterised protein
NCTC10437_01401  9  42  -9.854003  polysaccharide biosynthesis protein
NCTC10437_01402  7  42  -11.299000  Uncharacterised protein
NCTC10437_01403  7  43  -11.628310  Lipid A core - O-antigen ligase and related
NCTC10437_01404  5  34  -8.638371  inositol monophosphatase
NCTC10437_01405  4  31  -7.751622  Spore coat polysaccharide biosynthesis protein
NCTC10437_01406  4  35  -7.897025  3-deoxy-manno-octulosonate cytidylyltransferase
NCTC10437_01407  4  29  -6.544606  type 11 methyltransferase
NCTC10437_01408  2  21  -4.805664  transposase
NCTC10437_01409  1  19  -4.513919  UDP-glucuronic acid decarboxylase 1
NCTC10437_01410  -1  14  -3.530198  exopolysaccharide biosynthesis polyprenyl
NCTC10437_01411  -1  11  -1.997309  protein tyrosine phosphatase
NCTC10437_01412  -2  10  -1.524906  undecaprenyl pyrophosphate synthase
NCTC10437_01413  -2  9  -0.635461  UDP-galactopyranose mutase
NCTC10437_01414  0  12  0.003727  membrane-flanked domain-containing protein
NCTC10437_01415  0  12  0.191762  DUF218 domain
NCTC10437_01416  0  10  -0.171695  diguanylate cyclase
NCTC10437_01417  1  12  -0.568648  GtrA family protein
NCTC10437_01418  3  14  -0.374273  phosphoribosylaminoimidazole carboxylase ATPase
NCTC10437_01419  3  15  -1.231617  phosphoribosylaminoimidazole carboxylase
NCTC10437_01420  1  14  -1.991871  Uncharacterised protein
NCTC10437_01421  1  13  -1.467574  putative outer membrane adhesin-like protein
NCTC10437_01422  -2  12  -1.893549  isoprenylcysteine carboxyl methyltransferase
NCTC10437_01423  -1  14  -2.489243  chalcone/stilbene synthase domain-containing
NCTC10437_01424  1  14  -2.787724  Uncharacterised protein
NCTC10437_01425  0  14  -3.116647  butyryl-CoA dehydrogenase
NCTC10437_01426  1  17  -3.296751  biotin--acetyl-CoA-carboxylase ligase
NCTC10437_01427  1  16  -3.339602  Conserved protein of uncharacterised function,
NCTC10437_01428  -1  12  -3.187302  Uncharacterised protein
NCTC10437_01429  -1  10  -2.109535  PE-PGRS family protein
NCTC10437_01430  -1  10  -1.528057  Conserved protein of uncharacterised function,
NCTC10437_01431  0  12  -0.031921  Uncharacterised protein
NCTC10437_01432  0  13  -0.188127  TIGR03089 family protein
NCTC10437_01433  0  13  -0.364715  cell envelope-related transcriptional
NCTC10437_01434  0  14  0.810065  dTDP-4-dehydrorhamnose reductase
NCTC10437_01435  1  13  0.408000  putative glycosyltransferase
NCTC10437_01436  2  15  -0.134340  Nucleoside-diphosphate-sugar pyrophosphorylase
NCTC10437_01437  2  16  -0.767382  cullin, a subunit of E3 ubiquitin ligase
NCTC10437_01438  2  15  0.346263  NUDIX hydrolase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_01375  MALTOSEBP  52  2e-09  Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 52.0 bits (124), Expect = 2e-09
Identities = 48/180 (26%), Positives = 76/180 (42%), Gaps = 21/180 (11%)

Query: 121 DDFYPAEREAATVGDKIVGVPALVDNLAIVYNKKLFADAGIAPPSPDWTWDDFRSAAAKL 180
D YP +A K++ P V+ L+++YNK L P+P TW++ + +L
Sbjct: 113 DKLYPFTWDAVRYNGKLIAYPIAVEALSLIYNKDLL-------PNPPKTWEEIPALDKEL 165

Query: 181 TNPDTGQYGFLIPADGSEDTVWHYVPMLWEAGGDILSPDNE----RAVFNSEAGVKA-LT 235
F + + P++ GG +N + V AG KA LT
Sbjct: 166 KAKGKSALMFNL------QEPYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLT 219

Query: 236 VLQQMAVTDKSLYLDTTNENGPKLMNSGKIGMLVTGPWDLSQL--SDIDYDVQVMPTFAG 293
L + + +K + DT N G+ M + GPW S + S ++Y V V+PTF G
Sbjct: 220 FLVDL-IKNKHMNADTDYSIAEAAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKG 278


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_01378  PF05272  36  2e-04  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 36.2 bits (83), Expect = 2e-04
Identities = 13/32 (40%), Positives = 17/32 (53%)

Query: 33 LILVGPSGCGKSTALRLLAGLDKPTSGEILIG 64
++L G G GKST + L GLD + IG
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIG 630


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_01386  PRTACTNFAMLY  26  0.032  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 25.8 bits (56), Expect = 0.032
Identities = 15/60 (25%), Positives = 26/60 (43%), Gaps = 4/60 (6%)

Query: 11 AVSSIAVLSGCSQVAEVTNEGGDTTCGDFAGFDEKKQNETITKMLTDEGKNEPANAELTG 70
A ++++VL +E+T +GG T G AG + + T + PA + G
Sbjct: 216 APAAVSVLGA----SELTLDGGHITGGRAAGVAAMQGAVVHLQRATIRRGDAPAGGAVPG 271


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_01397  NUCEPIMERASE  141  1e-41  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 141 bits (358), Expect = 1e-41
Identities = 75/334 (22%), Positives = 138/334 (41%), Gaps = 32/334 (9%)

Query: 3 RLLVTGGAGFIGSNFVRYVIENTDYEVVVLDKLT--YAGNL--MSLQNLPERRFRFEHGD 58
+ LVTG AGFIG + + ++E +VV +D L Y +L L+ L + F+F D
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGH-QVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 59 ITDSALVDDLARNA--DAIVHYAAESHNDNSLSDPRPFLHTNLVGTFTLLEAARKYG-SR 115
+ D + DL + + + SL +P + +NL G +LE R
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 116 FHHISTDEVYGDLEFDEPERFTETTPYN-PSSPYSSTKAGSDLLVRAWVRSFGLAATISN 174
+ S+ VYG + F+ + P S Y++TK ++L+ + +GL AT
Sbjct: 121 LLYASSSSVYGL---NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLR 177

Query: 175 CSNNYGPYQHVEKFIPRQITNVLLGIRPKLYGEGKNIRDWIHADDHSSAVLAILE----- 229
YGP+ + + + +L G +Y GK RD+ + DD + A++ + +
Sbjct: 178 FFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHA 237

Query: 230 -------------KGKLGETYLIGADGEKENKSVVELVLALMDQPVDAYDHVADRAGHDL 276
Y IG E ++ + + + + + G L
Sbjct: 238 DTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKK-NMLPLQPGDVL 296

Query: 277 RYAIDSSKLRNELGWSPKYHNFTHGLESTIDWYR 310
+ D+ L +G++P+ G+++ ++WYR
Sbjct: 297 ETSADTKALYEVIGFTPET-TVKDGVKNFVNWYR 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_01398  NUCEPIMERASE  41  8e-06  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 40.5 bits (95), Expect = 8e-06
Identities = 54/240 (22%), Positives = 86/240 (35%), Gaps = 52/240 (21%)

Query: 154 DGQLGQALRETYRFVSHVDFATRSDITLGVTDLCSARRWREYDAVINAAAYTAVDRAETS 213
+L + ++F +D A R +T ++ V + AV +
Sbjct: 43 QARLELLAQPGFQFH-KIDLADREGMTDLFAS-------GHFERVFISPHRLAVRYS--L 92

Query: 214 EGRMAAWAANVKGVSDLAKVAAQNGIT-LIHISSDYVFDGTSDRPYREDDPMA-PLGVYG 271
E A +N+ G ++ + N I L++ SS V+ P+ DD + P+ +Y
Sbjct: 93 ENPHAYADSNLTGFLNILEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYA 152

Query: 272 QTKAAGDQIV----------TTVPRHYVVRTSWVIG----AGHNFVRTMVSLAERGVDPG 317
TK A + + T R + V W G A F + M+ G
Sbjct: 153 ATKKANELMAHTYSHLYGLPATGLRFFTVYGPW--GRPDMALFKFTKAMLE----GKSID 206

Query: 318 VVDDQRGRLTFTV--DIAGAVRHLLD------------------SGAPYGTYNVTGSGPV 357
V + + + FT DIA A+ L D S APY YN+ S PV
Sbjct: 207 VYNYGKMKRDFTYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPV 266


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_01409  NUCEPIMERASE  181  3e-53  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 181 bits (461), Expect = 3e-53
Identities = 79/338 (23%), Positives = 136/338 (40%), Gaps = 39/338 (11%)

Query: 4 RAVLTGGAGFVGSHLAERLLERDIDVVCVDNFVTGTPDNV----AHLQVHDGFRLVKADV 59
+ ++TG AGF+G H+++RLLE VV +DN ++ L GF+ K D+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 60 SNF-----ISVPGPVDYVLHFASPASPVDYA-ELPIQTMKAGSLGTLHTLGLAKEKGARY 113
++ + G + V + V Y+ E P + G L+ L + ++
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLA-VRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 114 L-LASTSETYGDPLVHPQPETYWGNVN-PVGPRACYDEAKRFAEALTVSYRKAHGVNTAI 171
L AS+S YG P + +V+ PV Y K+ E + +Y +G+
Sbjct: 121 LLYASSSSVYGLN--RKMPFSTDDSVDHPVSL---YAATKKANELMAHTYSHLYGLPATG 175

Query: 172 MRIFNTYGPRMRPNDGRAIPNFITQALAGEPITVHGDGTHTRSVCYVDDLVEGALNLLFS 231
+R F YGP RP+ A+ F L G+ I V+ G R Y+DD+ E + L
Sbjct: 176 LRFFTVYGPWGRPD--MALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDV 233

Query: 232 DLAGP-------------------VNIGNPHELTILELAEFIRELAGSESPVEFIARPQD 272
NIGN + +++ + + + G E+ +
Sbjct: 234 IPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPG 293

Query: 273 DPSQRQPDITLARAELGWEPQVAPRDGLLKTIAWFRDL 310
D + D +G+ P+ +DG+ + W+RD
Sbjct: 294 DVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDF 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_01421  PF03544  33  0.005  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 32.6 bits (74), Expect = 0.005
Identities = 21/105 (20%), Positives = 30/105 (28%), Gaps = 10/105 (9%)

Query: 86 DDEIAASQEASPSPSSRHKKADDVEDLPTEVQSEVDEPEEMAAPEQDASEITVDTEPAAT 145
+ A + P + + VE EPE PE V +P
Sbjct: 51 SVTMVAPADLEPPQAVQPPPEPVVEP----------EPEPEPIPEPPKEAPVVIEKPKPK 100

Query: 146 PVTTAEPAADVAVTTDAAPKVTQRVLTTAVDTSPAAPPQHPGDAT 190
P +P V V R + +T+PA P A
Sbjct: 101 PKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAA 145


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_01434  NUCEPIMERASE  44  2e-07  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 44.4 bits (105), Expect = 2e-07
Identities = 30/159 (18%), Positives = 53/159 (33%), Gaps = 31/159 (19%)

Query: 2 VITGAGGLVGRVLAGQARKTGREVVAL------------------------TSSEWDITD 37
++TGA G +G ++ + + G +VV + + D+ D
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLAD 63

Query: 38 AAAAERYITAD--DVVVNCAAFTQVDAAEAEPDRAHAVNVGGAENVAHACARAGAS-LIH 94
+ + V V + P N+ G N+ C L++
Sbjct: 64 REGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHLLY 123

Query: 95 LSTDYVFSGIFDGEPRPYDIDD-TTGPLSVYGRTKLAGE 132
S+ V+ P+ DD P+S+Y TK A E
Sbjct: 124 ASSSSVYG---LNRKMPFSTDDSVDHPVSLYAATKKANE 159


19  NCTC10437_01523  NCTC10437_01582  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NCTC10437_01523  2  20  -3.888682  heat shock protein 90
NCTC10437_01525  1  25  -3.839615  *Predicted ATPase
NCTC10437_01526  0  22  -3.268764  acetyl-CoA acetyltransferase
NCTC10437_01527  -1  20  -3.400001  pyruvate ferredoxin/flavodoxin oxidoreductase
NCTC10437_01528  -1  22  -3.033347  Sugar diacid utilization regulator
NCTC10437_01529  1  25  -4.121699  Uncharacterised protein
NCTC10437_01530  2  25  -3.425804  aldehyde dehydrogenase
NCTC10437_01531  4  28  -3.352802  putative LysR family transcriptional regulator
NCTC10437_01532  4  30  -2.409178  putative esterase
NCTC10437_01533  4  29  -2.622190  Conserved protein of uncharacterised function,
NCTC10437_01534  5  27  -2.814526  Conserved protein of uncharacterised function,
NCTC10437_01535  5  31  -2.901313  Protein of uncharacterised function (DUF732)
NCTC10437_01536  5  32  -2.946192  Mce associated membrane protein
NCTC10437_01537  5  31  -3.122560  conserved Mce associated membrane protein
NCTC10437_01538  6  32  -2.377302  virulence factor Mce family protein
NCTC10437_01539  6  32  -2.220781  virulence factor Mce family protein
NCTC10437_01540  5  30  -2.178780  virulence factor Mce family protein
NCTC10437_01541  4  26  -2.143867  virulence factor Mce family protein
NCTC10437_01542  2  22  -2.061495  virulence factor Mce family protein
NCTC10437_01543  3  22  -1.952586  ABC-type transport system involved in resistance
NCTC10437_01544  2  24  -2.914609  ABC-type transport system involved in resistance
NCTC10437_01545  2  25  -3.427705  organic solvent resistance ABC transporter
NCTC10437_01546  2  27  -3.823513  transcriptional regulator
NCTC10437_01547  2  28  -3.661389  acyl-CoA synthetase (AMP-forming)/AMP-acid
NCTC10437_01548  2  29  -4.761059  Pigment production hydroxylase
NCTC10437_01549  2  30  -5.602465  HpcE protein
NCTC10437_01550  2  29  -5.200591  oxidoreductase
NCTC10437_01551  0  28  -5.098368  biphenyl-2,3-diol 1,2-dioxygenase 3
NCTC10437_01552  0  21  -4.323985  alpha/beta hydrolase domain-containing protein
NCTC10437_01553  1  22  -4.519039  F420-dependent methylene-tetrahydromethanopterin
NCTC10437_01554  2  23  -4.547769  GntR family transcriptional regulator
NCTC10437_01555  3  24  -4.834386  bifunctional 3,4-dihydroxy-2-butanone
NCTC10437_01556  2  23  -5.215314  Uncharacterised protein
NCTC10437_01557  3  24  -5.154796  acyl-CoA dehydrogenase
NCTC10437_01558  4  31  -5.578882  Regulator of polyketide synthase expression
NCTC10437_01559  4  31  -5.855018  flavin reductase domain-containing protein
NCTC10437_01560  3  29  -5.797010  monooxygenase
NCTC10437_01561  2  31  -6.173009  alpha/beta hydrolase
NCTC10437_01562  2  33  -6.425551  alcohol dehydrogenase, class IV
NCTC10437_01563  1  32  -6.967190  Citrate synthase
NCTC10437_01564  2  34  -7.590850  3-oxoacid CoA-transferase, A subunit
NCTC10437_01565  2  33  -7.798444  3-oxoacid CoA-transferase, B subunit
NCTC10437_01566  3  36  -8.074823  hydroxyquinol 1,2-dioxygenase
NCTC10437_01567  2  31  -6.882496  Probable tautomerase SA1195.1
NCTC10437_01568  2  30  -6.833597  medium chain fatty-acid-CoA ligase FadD14
NCTC10437_01569  1  30  -6.652728  NADH dehydrogenase/NAD(P)H nitroreductase
NCTC10437_01570  1  28  -5.760071  Uncharacterised protein
NCTC10437_01571  0  26  -5.459692  Uncharacterised protein
NCTC10437_01572  -1  25  -5.442224  Large subunit of N,N-dimethylformamidase
NCTC10437_01573  0  24  -5.499618  Uncharacterised protein
NCTC10437_01574  1  22  -5.336761  ribose ABC transporter periplasmic binding
NCTC10437_01575  0  21  -5.086021  ABC-type sugar transporter, permease protein
NCTC10437_01576  1  22  -5.163729  ABC-type sugar transporter, permease protein
NCTC10437_01577  0  23  -5.519291  sugar ABC transporter, ATP-binding protein
NCTC10437_01578  0  21  -5.494019  amino acid transporter
NCTC10437_01579  0  20  -5.471943  FAD dependent oxidoreductase
NCTC10437_01580  -1  20  -4.297487  Uncharacterised protein
NCTC10437_01581  -1  21  -4.041746  aldehyde dehydrogenase
NCTC10437_01582  -1  19  -3.362108  GntR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_01540  IGASERPTASE  29  0.042  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 29.3 bits (65), Expect = 0.042
Identities = 13/59 (22%), Positives = 24/59 (40%)

Query: 376 TARPNEVTYSEDWMRPDYIPPQPPAAAASPQTGDLTPGTLPAEAPLLATPQPQPTDPSA 434
P+ + +E+ R D P PPA A +T + E+ + + T+ +A
Sbjct: 1005 ADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTA 1063


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_01541  PERTACTIN  35  9e-04  Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 34.7 bits (79), Expect = 9e-04
Identities = 20/46 (43%), Positives = 22/46 (47%)

Query: 398 PYREPAPAPPPGGPPPGPPANPSPSPPPSPPIGPIDDPLTPPGAGQ 443
P P P P PG PP PP P P PP PP + P P AG+
Sbjct: 571 PKPAPQPGPQPGPQPPQPPQPPQPPQPPQPPQRQPEAPAPQPPAGR 616


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_01543  PF05616  35  4e-04  Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 35.5 bits (81), Expect = 4e-04
Identities = 39/120 (32%), Positives = 48/120 (40%), Gaps = 11/120 (9%)

Query: 343 GPGGKPGCGSLPDVAKNWPVRQLIANTGF----GTGLDWR--PNPGIGFPGYANYLPVTR 396
PG K G + D N PV Q++A G T +D + P P + PG A P +
Sbjct: 271 APGTKVNMGPVTDRNGN-PV-QVVATFGRDSQGNTTVDVQVIPRPDLT-PGSAE-APNAQ 326

Query: 397 AVPEPPSIRYPGG-PAPGPIPYPGAPPYGAPQYGPDGTPLYPGVPPAPPQSPPAPPPPDG 455
+PE P PAP P P P PD P G P P SP P P+G
Sbjct: 327 PLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQPGTRPDSPAVPDRPNG 386


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_01546  HTHTETR  68  2e-16  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 68.5 bits (167), Expect = 2e-16
Identities = 41/213 (19%), Positives = 72/213 (33%), Gaps = 26/213 (12%)

Query: 10 SRAAQAARTRQRLIDAAVQLFSANNYDDVAVADIAREAKVAHGLLFHYFGSKSQIYLAAI 69
+A TRQ ++D A++LFS ++ +IA+ A V G ++ +F KS ++
Sbjct: 4 KTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIW 63

Query: 70 QFAADEITGSFTVQEGLSPG---QQLREALVRHFRYLAS--HRGLALRLALAGPGAGQSA 124
+ + I + PG LRE L+ + R L + +
Sbjct: 64 ELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEM 123

Query: 125 WEIFESTRLHAVQWIALILDL-------------PPPSPAMQMMWRACIGAIDEAALFWL 171
+ ++ R ++ I A +M G I WL
Sbjct: 124 AVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIM----RGYISGLMENWL 179

Query: 172 RHDEPFAVEDIVDSVVDIMATAMRAAARLDPTL 204
+ F ++ V I+ L PTL
Sbjct: 180 FAPQSFDLKKEARDYVAILL----EMYLLCPTL 208


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_01572  BONTOXILYSIN  34  0.003  Bontoxilysin signature.
		>BONTOXILYSIN#Bontoxilysin signature.

Length = 1196

Score = 33.7 bits (77), Expect = 0.003
Identities = 15/46 (32%), Positives = 23/46 (50%), Gaps = 1/46 (2%)

Query: 121 STGWAVALEPDGLVVTLGDGDGNVVRVRTGMPLYKQVWYSVVITVD 166
+ GW + E +GLV + D +GN V + WY + I+VD
Sbjct: 926 NCGWEIYFEDNGLVFEIIDSNGNQESV-YLSNIINDNWYYISISVD 970


20  NCTC10437_01982  NCTC10437_01997  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NCTC10437_01982  2  9  -0.658587  major facilitator transporter
NCTC10437_01983  4  13  -0.910315  2-keto-4-pentenoate
NCTC10437_01984  3  13  -0.640621  glutamyl-tRNA synthetase
NCTC10437_01987  4  12  -1.406268  **transcriptional regulator
NCTC10437_01988  4  11  -1.562418  isopropylmalate isomerase large subunit
NCTC10437_01989  3  11  -0.874033  3-isopropylmalate dehydratase, small subunit
NCTC10437_01990  2  9  0.114066  bacterial nucleoid DNA-binding protein
NCTC10437_01991  -1  8  0.292706  NUDIX hydrolase
NCTC10437_01992  -1  9  0.902748  polyphosphate kinase 1
NCTC10437_01993  -1  8  2.357371  2-phospho-L-lactate guanylyltransferase
NCTC10437_01994  1  8  2.685422  NAD(P)H-dependent glycerol-3-phosphate
NCTC10437_01995  0  8  2.312100  cystathionine gamma-lyase
NCTC10437_01996  0  11  2.503763  D-alanyl-alanine synthetase A
NCTC10437_01997  2  10  3.691021  thiamine-monophosphate kinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_01982  TCRTETB  33  0.002  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 32.9 bits (75), Expect = 0.002
Identities = 28/152 (18%), Positives = 56/152 (36%), Gaps = 8/152 (5%)

Query: 2 LVIALAATLCANVFINGVAFLIP-SLHTERGLDLASAALMSSLP-SLGMVVTLIAWGYVV 59
+I + + G ++P + L A + P ++ +++ G +V
Sbjct: 258 FMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILV 317

Query: 60 DRVGERMVLTVGSALTA----AAAFAAASADSLFTVGVFLFLGGMAAASSNTASGRLVVG 115
DR G VL +G + A+F + T+ + LGG++ + T +V
Sbjct: 318 DRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSF--TKTVISTIVSS 375

Query: 116 WFDAEKRGLVMGIRQTAQPLGVGVGALVIPRL 147
++ G M + L G G ++ L
Sbjct: 376 SLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGL 407



Score = 31.0 bits (70), Expect = 0.009
Identities = 26/119 (21%), Positives = 48/119 (40%), Gaps = 1/119 (0%)

Query: 45 SLGMVVTLIAWGYVVDRVGERMVLTVGSALTAAAAFAAASADSLFTVGVFL-FLGGMAAA 103
L + +G + D++G + +L G + + S F++ + F+ G AA
Sbjct: 59 MLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAA 118

Query: 104 SSNTASGRLVVGWFDAEKRGLVMGIRQTAQPLGVGVGALVIPRLAEFSSVCAALLFPAI 162
+ +V + E RG G+ + +G GVG + +A + LL P I
Sbjct: 119 AFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMI 177


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_01990  DNABINDINGHU  77  7e-21  Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 77.4 bits (191), Expect = 7e-21
Identities = 38/88 (43%), Positives = 52/88 (59%)

Query: 2 NKAELIDVLTEKLGSDRRQATAAVENVVDTIVRAVHKGESVTITGFGVFEQRRRAARVAR 61
NK +LI + E ++ + AAV+ V + + KGE V + GFG FE R RAAR R
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPRTGETVKVKPTSVPAFRPGAQFKAVV 89
NP+TGE +K+K + VPAF+ G K V
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAV 90


21  NCTC10437_02012  NCTC10437_02019  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NCTC10437_02012  2  10  0.454419  putative methyltransferase
NCTC10437_02013  2  11  -0.376020  phosphopantetheine adenylyltransferase
NCTC10437_02014  2  11  0.331972  cell division initiation protein
NCTC10437_02015  3  9  1.369439  metal-binding protein
NCTC10437_02016  3  9  1.469802  ribonuclease III
NCTC10437_02017  1  9  1.991831  formamidopyrimidine-DNA glycosylase
NCTC10437_02018  2  10  1.205090  OsmC family protein
NCTC10437_02019  2  10  1.232085  acylphosphatase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_02013  LPSBIOSNTHSS  230  8e-81  Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 230 bits (588), Expect = 8e-81
Identities = 77/154 (50%), Positives = 106/154 (68%), Gaps = 1/154 (0%)

Query: 4 AVCPGSFDPVTRGHVDIFERSAAQFDELVIAVMVNPNKTGMFSLDERIALIEESTTHLPN 63
A+ PGSFDP+T GH+DI ER FD++ +AV+ NPNK MFS+ ER+ I ++ HLPN
Sbjct: 3 AIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLPN 62

Query: 64 VRVESGRGLIVDFARERGLTAIVKGLRTGTDFEYELQMAQMNKHIA-GVDTFFVATTPQY 122
+V+S GL V++AR+R AI++GLR +DFE ELQMA NK +A ++T F+ T+ +Y
Sbjct: 63 AQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTEY 122

Query: 123 SFVSSSLAKEVAMLGGDVSALLPDPVNARLKTKL 156
SF+SSSL KEVA GG+V +P V A L +
Sbjct: 123 SFLSSSLVKEVARFGGNVEHFVPSHVAAALYDQF 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_02014  IGASERPTASE  33  0.001  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.1 bits (75), Expect = 0.001
Identities = 23/168 (13%), Positives = 57/168 (33%), Gaps = 11/168 (6%)

Query: 39 DDIKDAIPGELDDAQDVLDARDTLLRDAKEHSESTVSTANAEADSMVNHARAEADRLLAD 98
++I+ +P + +++ + + + S + AE + + ++ +
Sbjct: 1001 NNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATE 1060

Query: 99 AKAQADRMVAEARQH-----SDRMVGEAREEAARIAATAKREYE--ATTGRAKSEADRLI 151
AQ + EA+ + V ++ E T +E +AK E ++
Sbjct: 1061 TTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQ 1120

Query: 152 ENGNLTYEKAVQEGIKEQQRLVSQTEVVATATAEATRMIDSAHAEADR 199
E + Q K++Q Q + + T I ++ +
Sbjct: 1121 EVP----KVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNT 1164


22  NCTC10437_02144  NCTC10437_02153  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NCTC10437_02144  2  15  1.588465  cobalamin (vitamin B12) biosynthesis protein
NCTC10437_02145  1  13  0.442565  cobalt transporter CbiM
NCTC10437_02146  -1  9  -0.771289  cobalt ABC transporter inner membrane subunit
NCTC10437_02147  0  10  -2.243466  cobalt ABC transporter ATPase
NCTC10437_02148  0  12  -3.580265  integral membrane sensor signal transduction
NCTC10437_02149  1  19  -5.836909  two component transcriptional regulator
NCTC10437_02150  1  21  -6.219588  GntR family transcriptional regulator
NCTC10437_02151  2  19  -5.400816  hydrophobic amino acid ABC transporter
NCTC10437_02152  1  16  -4.331947  ABC-type branched-chain amino acid transport
NCTC10437_02153  1  15  -3.787957  branched-chain amino acid ABC transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_02149  HTHFIS  93  8e-24  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 93.0 bits (231), Expect = 8e-24
Identities = 30/120 (25%), Positives = 61/120 (50%)

Query: 28 SPIRVLLVDDERALTNLVRMALKYEGWEIDVAHDAAEAVEKYRAQTPDLLVLDIMLPDMD 87
+ +L+ DD+ A+ ++ AL G+++ + +AA A DL+V D+++PD +
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 88 GLGVLAQLRDSGTYTPTLFLTARDSVADRVTGLTAGGDDYMTKPFSLEELVARLRGLLRR 147
+L +++ + P L ++A+++ + G DY+ KPF L EL+ + L
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


23  NCTC10437_02233  NCTC10437_02253  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NCTC10437_02233  3  10  0.732986  type 12 methyltransferase
NCTC10437_02235  5  13  -0.082747  *N-acetylglutamate synthase
NCTC10437_02236  4  11  1.572981  CDP-diacylglycerol--glycerol-3-phosphate
NCTC10437_02237  4  11  1.915882  XRE family transcriptional regulator
NCTC10437_02238  2  9  2.183639  phage shock protein A, PspA
NCTC10437_02239  -1  8  1.624963  putative membrane alanine rich protein
NCTC10437_02240  -1  10  1.474723  limonene-1,2-epoxide hydrolase
NCTC10437_02241  -1  11  2.002017  UDP-glucuronosyl/UDP-glucosyltransferase
NCTC10437_02242  1  13  0.232471  Protein of uncharacterised function (DUF3046)
NCTC10437_02243  0  10  -0.230898  X-Pro dipeptidyl-peptidase (S15 family)
NCTC10437_02244  1  11  -1.710227  recombinase A
NCTC10437_02245  2  12  -1.714432  recombination regulator RecX
NCTC10437_02246  1  11  -2.909766  Uncharacterised protein
NCTC10437_02247  1  9  -2.048716  Protein of uncharacterised function (DUF1622)
NCTC10437_02248  0  9  -1.322250  amino acid ABC transporter membrane protein 2,
NCTC10437_02249  3  10  -0.113541  polar amino acid ABC transporter inner membrane
NCTC10437_02250  2  9  0.741590  extracellular solute-binding protein
NCTC10437_02251  0  9  2.257260  ABC transporter--like protein
NCTC10437_02252  -1  10  3.251234  (dimethylallyl)adenosine tRNA
NCTC10437_02253  0  10  3.968598  putative transmembrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_02235  SACTRNSFRASE  39  2e-06  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 38.8 bits (90), Expect = 2e-06
Identities = 19/86 (22%), Positives = 37/86 (43%), Gaps = 5/86 (5%)

Query: 26 ESVQEFWVAELSGELIGCGALHVLWSDLGEVRTVAVHPKVRGRGVGHAIVDRLLAVAADL 85
E + ++ L IG + W+ + +AV R +GVG A++ + + A +
Sbjct: 62 EEGKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKEN 121

Query: 86 HLQRIFVLTFEQ-----EFFARHGFR 106
H + + T + F+A+H F
Sbjct: 122 HFCGLMLETQDINISACHFYAKHHFI 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_02238  RTXTOXIND  31  0.008  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.008
Identities = 19/168 (11%), Positives = 52/168 (30%), Gaps = 24/168 (14%)

Query: 14 ALFSSKVDEYADPKVQIQQAIEEAQRQHQGLTQQAAQVIGNQRQLEMRLNRQLADIEKLQ 73
+L + + + K Q + +++ + + + A++ + + +L D L
Sbjct: 189 SLIKEQFSTWQNQKYQKELNLDKKRAERLTVL---ARINRYENLSRV-EKSRLDDFSSL- 243

Query: 74 VNVRQALTLADQATAAGDAAKATEYTNAAEAFAAQLVTAEQSVEDLKGLHDQALQAAGQA 133
K +A + V A + K +Q A
Sbjct: 244 ------------------LHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSA 285

Query: 134 KKAVEQNAMMLQQKIA-ERTKLLSQLEQAKMQEQVSASLRSMSELAAP 180
K+ + + + +I + + + ++ + + S + AP
Sbjct: 286 KEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAP 333


24  NCTC10437_02349  NCTC10437_02354  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
NCTC10437_02349  2  13  -2.198883  Uncharacterised protein
NCTC10437_02350  3  13  -2.543547  anion transporter
NCTC10437_02351  4  16  -2.388621  anion transporter
NCTC10437_02352  2  17  -2.757968  tartrate dehydrogenase
NCTC10437_02353  1  20  -3.233058  LysR family transcriptional regulator
NCTC10437_02354  1  19  -3.288252  Uncharacterised protein
25  NCTC10437_02375  NCTC10437_02382  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NCTC10437_02375  1  13  3.349061  alpha amylase
NCTC10437_02376  1  16  3.773523  nitroreductase family protein
NCTC10437_02377  3  21  5.061276  response regulator containing a CheY-like
NCTC10437_02378  6  22  5.865085  UspA domain-containing protein
NCTC10437_02379  4  22  4.929456  nitroreductase
NCTC10437_02380  6  24  5.364254  Uncharacterised protein
NCTC10437_02381  6  18  3.166960  Uncharacterised protein
NCTC10437_02382  5  17  2.607049  Uncharacterised protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NCTC10437_02377  HTHFIS  65  1e-14  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 65.2 bits (159), Expect = 1e-14
Identities = 26/102 (25%), Positives = 43/102 (42%), Gaps = 2/102 (1%)

Query: 3 RVFLVDDHEVVRRGLIDLLGGDPDIDVVGEAGSVAEALARIPALQPDVAVLDVRLPDGNG 62
+ + DD +R L L DV + A I A D+ V DV +PD N
Sbjct: 5 TILVADDDAAIRTVLNQALS-RAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 63 IELCRDLLSSLPELRCLMLTSFTSDEAMLDAILAGASGYVVK 104
+L + + P+L L++++ + + A GA Y+ K
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPK 104


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_02380    GPOSANCHOR    44    2e-06    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 43.5 bits (102), Expect = 2e-06
Identities = 16/74 (21%), Positives = 18/74 (24%), Gaps = 7/74 (9%)

Query: 239 PAAPPTGAPPTPTPPPAPAPGAPPGTPGPATPAGGTAPASPTIGA---PGTAGAGGARRP 295
A + TP P G A A P T G P
Sbjct: 459 RAGKASD-SQTPDAKPGNKAVPGKGQAPQAGTKPNQNKAPMKETKRQLPST---GETANP 514

Query: 296 AGTAAGPPGIGGAG 309
TAA + AG
Sbjct: 515 FFTAAALTVMATAG 528


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_02382    IGASERPTASE    32    0.004    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 32.3 bits (73), Expect = 0.004
Identities = 31/263 (11%), Positives = 65/263 (24%), Gaps = 11/263 (4%)

Query: 92 PAQEASPLRLHRRSARRARAGRAERQRGGRSRQDRLDRPTRFRRRRLPAAHPRQRGRPCR 151
+ + + E R + T A + +Q +
Sbjct: 993 DTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVE 1052

Query: 152 RHHRHPRRPRRHRCRRRARPHRAGPPRRHRPPEHLAGLAGARRQQRCRSGRRQRAARAGS 211
++ + A+ + + A+ + + +
Sbjct: 1053 KNEQDATETT-------AQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETAT 1105

Query: 212 PARLRRRPVHRERPERAGWTDLPHSRPGHQAPLEQ-EASPHPRHHQQAPQAHPEQEASTH 270
+ + V E+ + S Q+ Q +A P + P+ + +T
Sbjct: 1106 VEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTT 1165

Query: 271 PPHSQPGHQAPLEQEASPHPRHHQQAPQAHPEQEASTHPPHSQPGHQAPLEQEASPHPRH 330
QP + E + E +T P +QP E P RH
Sbjct: 1166 ADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQP--TVNSESSNKPKNRH 1223

Query: 331 HQQ-APQAHPEQEASTHPPHSQP 352
+ H + A+T
Sbjct: 1224 RRSVRSVPHNVEPATTSSNDRST 1246


26    NCTC10437_02471    NCTC10437_02476    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
NCTC10437_02471    3    12    1.096063    transglutaminase domain-containing protein
NCTC10437_02472    4    11    1.705272    Uncharacterized protein conserved in bacteria
NCTC10437_02473    2    14    2.431083    dehydrogenase
NCTC10437_02474    3    16    2.323418    Uncharacterised protein
NCTC10437_02475    3    15    1.951943    short-chain dehydrogenase/reductase SDR
NCTC10437_02476    3    17    1.991070    Uncharacterised protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_02473    DHBDHDRGNASE    87    1e-22    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 87.4 bits (216), Expect = 1e-22
Identities = 64/254 (25%), Positives = 107/254 (42%), Gaps = 15/254 (5%)

Query: 5 LDGRVALVTGAGAGIGEGIARRFADEGARVVVAEHDADSGSAVADRIGGYF-----VATD 59
++G++A +TGA GIGE +AR A +GA + +++ + V + D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 60 VSDRGQVDNAVAAAVSEFGAIDILVNNAWGGGGIGRVELKTDQQIADGVAVGYYGPFWAM 119
V D +D A E G IDILVN A G G + +D++ +V G F A
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVA-GVLRPGLIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 120 RAAYPHMKTAGWGRVINLCSLNGVNAHVGSLEYNAAKEALRALTRTAAREWAPTGVTVNA 179
R+ +M G ++ + S Y ++K A T+ E A + N
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 180 LCPAAKSQAFFRAL--GDYPELEAMADAAN------PMGRMGDPYDDIAPVAVFLASDAS 231
+ P + +L + + + + P+ ++ P DIA +FL S +
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKP-SDIADAVLFLVSGQA 243

Query: 232 RYLTGNTLFVDGGA 245
++T + L VDGGA
Sbjct: 244 GHITMHNLCVDGGA 257


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_02475    DHBDHDRGNASE    57    1e-11    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 57.0 bits (137), Expect = 1e-11
Identities = 45/190 (23%), Positives = 80/190 (42%), Gaps = 4/190 (2%)

Query: 8 GPWAVVAGGSEGVGAEFAALLAQAGVNVALVARKPEPLERTAARCRALGVQTRTLAVDLV 67
G A + G ++G+G A LA G ++A V PE LE+ + +A D+
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVR 67

Query: 68 DAP--DEVIAMTADLEVGLLIYNAGANTCSEHFLDA-DLADFARVIDLNITAMMRLTQHY 124
D+ DE+ A + I A + + ++ +N T + ++
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 125 ARPMRERRRGGILLVGSMAGYLGSARHSVYGGVKAFGRIFAESLWLELRQHDVHVLELVL 184
++ M +RR G I+ VGS + + Y KA +F + L LEL ++++ +
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 185 GVTRTPAMER 194
G T T M+
Sbjct: 188 GSTET-DMQW 196


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_02476    cloacin    40    2e-05    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 40.5 bits (94), Expect = 2e-05
Identities = 31/98 (31%), Positives = 40/98 (40%)

Query: 431 AGGNGGTGGTGGTGGTGGAGDAGGNGGVGGAGGAGTGGGPTGVGGKGGLGGSGDSGGPGG 490
+GG+G TG +G GVGG G+G GG G GG G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 491 TGGTGGSGGSGGSGGSGAAGGSGGAGGGGGSAGIADPG 528
G GG+G SGG G+G + A G ++ PG
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPG 99



Score = 40.1 bits (93), Expect = 2e-05
Identities = 39/120 (32%), Positives = 49/120 (40%), Gaps = 6/120 (5%)

Query: 418 GVGGTGGVGGTGEAGGNGGTGGTGGTGGTGGAGDAGGNGGVGGAGGAGTGGGPTGVGGKG 477
G G G G GN GG G G GGA D G G G+G G G
Sbjct: 3 GGDGRGHNTGAHSTSGNI-NGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGI----HWG 57

Query: 478 GLGGSGDSGGPGGTGGTGGSGGSGGSGGSGAAGG-SGGAGGGGGSAGIADPGGTFSHAGA 536
G G G+ GG G +GG G+GG+ + + A G + G G ++ G S A A
Sbjct: 58 GGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIA 117



Score = 37.4 bits (86), Expect = 1e-04
Identities = 41/119 (34%), Positives = 52/119 (43%), Gaps = 8/119 (6%)

Query: 414 AGNGGVGGTGGVGGTGEAGGNGGTGGTGGTGGTGGAGDAGGNGGVGGAGGAGTGGGPTGV 473
+G G G G T G TG G G + G+G + N G GG+G+G G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWG--GGSGSGIHWGGG 59

Query: 474 GGKGGLGGSGDSGGPGGTGGTGGSGGSGGSGGSGAAGGSGGAG------GGGGSAGIAD 526
G G GG+G+SGG GTGG + + + G A G G G SA IAD
Sbjct: 60 SGHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIAD 118



Score = 37.0 bits (85), Expect = 2e-04
Identities = 35/114 (30%), Positives = 42/114 (36%), Gaps = 2/114 (1%)

Query: 443 TGGTGGAGDAGGNGGVGGAGGAGTGGGPTGVGGKGGLGGSGDSGGPGGTGGTGGSGGSGG 502
+GG G + G + G G TG G G G S ++ GG+G GG G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 503 SGGSGAAGGSGG--AGGGGGSAGIADPGGTFSHAGAAGTSGANGASGASGASGA 554
G G G SGG GG SA A F G G + A S A
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAA 115



Score = 35.8 bits (82), Expect = 5e-04
Identities = 33/107 (30%), Positives = 40/107 (37%), Gaps = 1/107 (0%)

Query: 394 GQAGNGGNGGKGGFYGGGGNAGNGGVGGTGGVGGTGEAGGNGGTGGTGGTGGTGGAGDAG 453
G+ N G G GG G G + G G + E GG G+G G G G
Sbjct: 6 GRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNG 65

Query: 454 GNGGVGGAGGAGTGGGPTGVGGKGGLGGSGDSGGPGGTGGTGGSGGS 500
G G G GG+GTGG + V G S G S G+
Sbjct: 66 GGNGNSG-GGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGA 111



Score = 34.3 bits (78), Expect = 0.001
Identities = 35/113 (30%), Positives = 43/113 (38%), Gaps = 3/113 (2%)

Query: 352 NGGNGGAGGIGAAGESGGVGAVGGTGGTGSNATGVGGTGTVGGQAGNGGNGGKGGFYGGG 411
+GG+G GA SG + GG G G G+G GG G G +GGG
Sbjct: 2 SGGDGRGHNTGAHSTSGNIN--GGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGG 59

Query: 412 GNAGNGGVGGTGGVGGTGEAGGNGGTGGTGGTGGTGGAGDAGGNGGVGGAGGA 464
GNGG G G GG+G G G + G V + GA
Sbjct: 60 SGHGNGGGNGNSG-GGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGA 111



Score = 33.1 bits (75), Expect = 0.003
Identities = 29/101 (28%), Positives = 41/101 (40%)

Query: 471 TGVGGKGGLGGSGDSGGPGGTGGTGGSGGSGGSGGSGAAGGSGGAGGGGGSAGIADPGGT 530
+G G+G G+ + G G TG G G S GSG + + GGG GS G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 531 FSHAGAAGTSGANGASGASGASGANGTNGGTGKAGASGSSG 571
+ G G SG +G + ++ A G G+ G
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGG 102



Score = 33.1 bits (75), Expect = 0.003
Identities = 28/74 (37%), Positives = 35/74 (47%), Gaps = 5/74 (6%)

Query: 345 TGAASGGNGGNGGAGGIGAAGESGGVGAVGGTGGTGSNATGVGGTGTVGGQAGNGGNGGK 404
TGA S NGG G+G G GA G+G + N GG+G+ G G+G
Sbjct: 11 TGAHSTSGNINGGPTGLGVGG-----GASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNG 65

Query: 405 GGFYGGGGNAGNGG 418
GG GG +G GG
Sbjct: 66 GGNGNSGGGSGTGG 79



Score = 32.8 bits (74), Expect = 0.004
Identities = 29/82 (35%), Positives = 38/82 (46%), Gaps = 2/82 (2%)

Query: 301 AGGAATTTGTNAVAVGGNGADGGAGGTGVTTGGAGGAGGAATSTT-GAASGGNGGNGGAG 359
+GG T A + GN +GG G GV G + G+G ++ + G SG GG
Sbjct: 2 SGGDGRGHNTGAHSTSGN-INGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGS 60

Query: 360 GIGAAGESGGVGAVGGTGGTGS 381
G G G +G G GTGG S
Sbjct: 61 GHGNGGGNGNSGGGSGTGGNLS 82



Score = 32.8 bits (74), Expect = 0.004
Identities = 27/82 (32%), Positives = 35/82 (42%), Gaps = 1/82 (1%)

Query: 334 AGGAGGAATSTTGAASGG-NGGNGGAGGIGAAGESGGVGAVGGTGGTGSNATGVGGTGTV 392
+GG G + + SG NGG G G G A + G + G GS + G G+
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 393 GGQAGNGGNGGKGGFYGGGGNA 414
G G GN G G GG +A
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLSA 83



Score = 32.8 bits (74), Expect = 0.005
Identities = 36/123 (29%), Positives = 48/123 (39%), Gaps = 5/123 (4%)

Query: 134 GYGGIGGSGAAGVAGGN--GGKGGLFFGNGGAGGAGGTSANGGNGGDAGM---FALVGHG 188
G G G + A GN GG GL G G + G+G +S N GG +G +
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 189 GNGGVGGNGALGTIGTTGTGTATDGTAGEIGAIGATGANGVNPTASGAAGAAGAAGAATT 248
GNGG GN G+ A A+ GA G+ + S A +A A
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIADIMAA 122

Query: 249 VTG 251
+ G
Sbjct: 123 LKG 125



Score = 31.6 bits (71), Expect = 0.009
Identities = 30/111 (27%), Positives = 37/111 (33%), Gaps = 2/111 (1%)

Query: 375 GTGGTGSNATGVGGTGTVGGQAGNGGNGGKGGFYGGGGNAGNGGVGGTGGVGGTGEAGGN 434
G G G N +G + G G G G GG G G + G G G GG+
Sbjct: 3 GGDGRGHNTGAHSTSGNING--GPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGS 60

Query: 435 GGTGGTGGTGGTGGAGDAGGNGGVGGAGGAGTGGGPTGVGGKGGLGGSGDS 485
G G G GG+G G V G T G + S +
Sbjct: 61 GHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGA 111



Score = 31.6 bits (71), Expect = 0.010
Identities = 36/116 (31%), Positives = 41/116 (35%), Gaps = 7/116 (6%)

Query: 234 SGAAGAAGAAGAATTVTGGNGAPGSPGNPGGAGGAGGAVAGTAPLATGGNGAIG-GAGTP 292
SG G GA +T NG P G GGA G + P G I G G+
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 293 GANGGAGGAGGAATTTGTNAVAVGGNGADGGAGGTGVTTGGAGGAGGAATSTTGAA 348
NGG G G + TG G A G GAGG A S + A
Sbjct: 62 HGNGGGNGNSGGGSGTG------GNLSAVAAPVAFGFPALSTPGAGGLAVSISAGA 111



Score = 30.1 bits (67), Expect = 0.027
Identities = 33/114 (28%), Positives = 41/114 (35%), Gaps = 2/114 (1%)

Query: 260 GNPGGAGGAGGAVAGTAPLATGGNGAIGGAGTPGANGGAGGAGGAATTTGTNAVAVGGNG 319
G+ GA G + G G GA G+G N GG G+ G + GNG
Sbjct: 8 GHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGG--SGHGNG 65

Query: 320 ADGGAGGTGVTTGGAGGAGGAATSTTGAASGGNGGNGGAGGIGAAGESGGVGAV 373
G G G TGG A A + A G G A I A S + +
Sbjct: 66 GGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIADI 119



Score = 30.1 bits (67), Expect = 0.032
Identities = 29/108 (26%), Positives = 36/108 (33%), Gaps = 4/108 (3%)

Query: 286 IGGAGTPGANGGAGGAGGAATTTGTNAVAVGGNGADGGAGGTGVTTGGAGGAGGAATSTT 345
+ G G N GA G T GG G GG G+G +
Sbjct: 1 MSGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGS 60

Query: 346 GAASGGNGGNGGAGGIGAAGESGGVGAVGGTGGTGSNATGVGGTGTVG 393
G +GG GN G G +G G + AV G A G G +
Sbjct: 61 GHGNGGGNGNSG----GGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA 104


27    NCTC10437_02559    NCTC10437_02575    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
NCTC10437_02559    4    11    2.236439    uncharacterized protein required for cytochrome
NCTC10437_02560    3    10    2.031820    TetR family transcriptional regulator
NCTC10437_02561    1    9    0.636992    Uncharacterised protein
NCTC10437_02562    -1    6    0.140542    ABC transporter efflux protein, DrrB family
NCTC10437_02563    -2    8    -0.492235    ABC transporter-like protein
NCTC10437_02564    -3    7    -0.608464    integral membrane protein
NCTC10437_02565    -1    9    -1.569609    putative transcriptional regulator
NCTC10437_02566    0    9    -2.109796    FeS assembly protein SufB
NCTC10437_02567    1    7    -1.316173    FeS assembly protein SufD
NCTC10437_02568    2    7    -0.955066    FeS assembly ATPase SufC
NCTC10437_02569    2    8    -0.675847    Cysteine desulfurase Csd
NCTC10437_02570    3    9    -0.324680    SUF system FeS assembly protein, NifU family
NCTC10437_02571    2    10    -0.567130    putative metal-sulfur cluster biosynthetic
NCTC10437_02572    2    9    -0.197248    putative taurine catabolism dioxygenase
NCTC10437_02573    2    10    -0.072917    ATP-grasp enzyme, D-alanine-D-alanine ligase
NCTC10437_02574    1    10    -0.496985    ATP-grasp domain protein
NCTC10437_02575    2    13    -0.461573    putative O-methyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_02560    HTHTETR    62    2e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 62.3 bits (151), Expect = 2e-14
Identities = 27/162 (16%), Positives = 56/162 (34%), Gaps = 10/162 (6%)

Query: 2 GRRRGEALDAAIQAAALRLLAEHGPEAVTMESVAVAAHTSKPVLYRRWPDRRALLRDTLL 61
++ + I ALRL ++ G + ++ +A AA ++ +Y + D+ L +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 62 GIATTSIPSPDT--GSFRGDMLAVLR-GWAALFTGDSAQPMRSVALAVAVDPELAAAFRT 118
+ F GD L+VLR + + R + + +
Sbjct: 65 LSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMA 124

Query: 119 DVLGMRK-------DQMNAILARAIERGEVRADVPVDLVREL 153
V ++ D++ L IE + AD+ +
Sbjct: 125 VVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAII 166


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_02568    INTIMIN    29    0.027    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 28.9 bits (64), Expect = 0.027
Identities = 21/105 (20%), Positives = 37/105 (35%), Gaps = 5/105 (4%)

Query: 45 GSGKSTLSYAIAGH-PKYTVTSGSITLDGEDVLEMSIDERARAGLFLAMQYPIE-VPGVS 102
GS + +AGH K T S +T + +++ A+ L Q + G
Sbjct: 128 GSAPLVAAGGVAGHTNKLTKMSPDVTKS-NMTDDKALNYAAQQAASLGSQLQSRSLNGDY 186

Query: 103 MSNFLRTAATAVRGEAPKLRHWVKEVKTAMSDLGIDPAFSERSVN 147
+ A L+ W++ TA +L F S++
Sbjct: 187 AKDTALGIAGNQASSQ--LQAWLQHYGTAEVNLQSGNNFDGSSLD 229


28    NCTC10437_02741    NCTC10437_02756    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
NCTC10437_02741    0    14    -3.064737    ferredoxin
NCTC10437_02742    1    15    -3.436879    cyclohexanone monooxygenase
NCTC10437_02743    1    18    -3.428574    NAD-dependent epimerase/dehydratase
NCTC10437_02744    1    16    -4.190830    alpha/beta hydrolase fold protein
NCTC10437_02745    0    18    -4.572659    acetaldehyde dehydrogenase
NCTC10437_02746    0    18    -4.720428    4-hydroxy-2-ketovalerate aldolase
NCTC10437_02747    0    17    -4.122245    TetR family transcriptional regulator
NCTC10437_02748    0    15    -3.434445    Uncharacterised protein
NCTC10437_02749    -1    14    -2.686343    RmlD substrate binding domain-containing
NCTC10437_02750    -1    14    -2.404835    major facilitator superfamily transporter
NCTC10437_02751    -1    13    -0.515539    GntR family transcriptional regulator
NCTC10437_02752    -1    12    0.012729    short-chain dehydrogenase
NCTC10437_02753    0    13    -0.146084    Uncharacterized protein conserved in bacteria
NCTC10437_02754    2    15    -0.383456    2-hydroxy-3-oxopropionate reductase
NCTC10437_02755    2    16    -0.830479    Uncharacterised protein
NCTC10437_02756    2    16    -0.767790    LuxR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_02743    DHBDHDRGNASE    54    1e-10    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 53.9 bits (129), Expect = 1e-10
Identities = 61/275 (22%), Positives = 102/275 (37%), Gaps = 60/275 (21%)

Query: 6 VSGSASGMGRAVAERLVAAGHSVIGVDIKDA--DVVADLSTAAARH--------AAADAV 55
++G+A G+G AVA L + G + VD + V A ARH + A+
Sbjct: 13 ITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSAAI 72

Query: 56 LDRCG------GVLDGAVLAAGL---GPIPGRDDTIVQ----VNYFGVVDLLIALRPALA 102
+ G +D V AG+ G I D + VN GV + ++ +
Sbjct: 73 DEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKYMM 132

Query: 103 RAGAAKVVVVASNSTTVTPAVPRRAVRALLNGDPAKALRAVRWFGKRSAPMSYAATKIAV 162
+ +V V SN VPR ++ A YA++K A
Sbjct: 133 DRRSGSIVTVGSNPA----GVPRTSMAA------------------------YASSKAAA 164

Query: 163 SRWVRRHAVTREWAGAGIRLNALAPGAVMTPL-LSEQLSNPAEAKAINAFP------VPV 215
+ + + E A IR N ++PG+ T + S + I +P+
Sbjct: 165 VMFTK--CLGLELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPL 222

Query: 216 GGYGDAGHLADWMVFMLSDAAEFLCGSVIFVDGGS 250
+AD ++F++S A + + VDGG+
Sbjct: 223 KKLAKPSDIADAVLFLVSGQAGHITMHNLCVDGGA 257


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_02747    HTHTETR    61    1e-13    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 60.8 bits (147), Expect = 1e-13
Identities = 24/98 (24%), Positives = 40/98 (40%), Gaps = 2/98 (2%)

Query: 21 RDLSAGRHQRRHRIVSAAVTLAAD-GYDAVQLRLVAEDAGVSTSTIYHYFSSKDDLLVAC 79
R + R I+ A+ L + G + L +A+ AGV+ IY +F K DL
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 80 FHR-WIVANDLAIRAELAPIADPTERLRRVIYRMTESL 116
+ +L + + DP LR ++ + ES
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILIHVLEST 100


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_02752    DHBDHDRGNASE    99    4e-27    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 99.4 bits (247), Expect = 4e-27
Identities = 81/255 (31%), Positives = 122/255 (47%), Gaps = 18/255 (7%)

Query: 9 RTAVITGGASPRGIGRATADRLARDGWSVAIVDIDEAAATR--TAEELADRHSVSTLGVG 66
+ A ITG A +GIG A A LA G +A VD + + ++ + RH+ +
Sbjct: 9 KIAFITGAA--QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEA---FP 63

Query: 67 ADVAEEDSVHAAVVRIESALPPIIGLANIAGVSSPTEFLDVTPAEWDRVFNVNMRGTFLV 126
ADV + ++ RIE + PI L N+AGV P ++ EW+ F+VN G F
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 127 TQRVLPAMIAAEVGRIVSVSSISAQRGGGTYSKVAYSASKAAVIGFSRALAREMGPHNIT 186
++ V M+ G IV+V S A G S AY++SKAA + F++ L E+ +NI
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPA--GVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIR 181

Query: 187 VNSVAPGPIDTDIMGGTLTDERK---------AQMSADIPLGRVGTVEEVAALMAFLMSE 237
N V+PG +TD+ DE IPL ++ ++A + FL+S
Sbjct: 182 CNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSG 241

Query: 238 DAGFITAATYDINGG 252
AG IT ++GG
Sbjct: 242 QAGHITMHNLCVDGG 256


29    NCTC10437_02884    NCTC10437_02899    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
NCTC10437_02884    1    15    -3.333873    glycosyltransferase
NCTC10437_02885    0    16    -4.671295    Uncharacterised protein
NCTC10437_02886    0    16    -4.710685    Protein of uncharacterised function (DUF2505)
NCTC10437_02887    1    18    -6.013642    MtfA protein
NCTC10437_02888    2    20    -5.262263    NAD-dependent epimerase/dehydratase
NCTC10437_02889    2    17    -5.055751    methyltransferase
NCTC10437_02890    1    18    -4.513275    nucleoside-diphosphate-sugar epimerase
NCTC10437_02891    1    19    -4.132193    nucleotide sugar dehydrogenase
NCTC10437_02892    1    20    -4.221319    methyltransferase
NCTC10437_02893    1    19    -3.180648    glycosyl transferase,
NCTC10437_02894    1    19    -3.578245    glucose-methanol-choline oxidoreductase
NCTC10437_02895    1    19    -2.863052    putative dehydrogenase
NCTC10437_02896    0    16    -2.501558    putative dehydrogenase
NCTC10437_02897    0    17    -2.419895    methyltransferase MtfC
NCTC10437_02898    -1    13    -2.902508    glycosyl transferase,
NCTC10437_02899    -1    14    -3.821057    Uncharacterised protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_02888    NUCEPIMERASE    121    5e-34    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 121 bits (304), Expect = 5e-34
Identities = 59/315 (18%), Positives = 109/315 (34%), Gaps = 61/315 (19%)

Query: 9 RLFTGDVTHAPDWDAVLRLFQPSQIVHLAAETGTAQSLSAATRHGSVNVVGTTQLVDALS 68
+ D+ + ++ SL + N+ G +++
Sbjct: 55 QFHKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCR 114

Query: 69 RAGYIPEHLILASSRAVYGEGAWGFGADVFYPPPRTHAQLSAGHWDPKGPDGVSAAPLAS 128
+HL+ ASS +VYG +
Sbjct: 115 HNK--IQHLLYASSSSVYG----------------------------------LNRKMPF 138

Query: 129 AASRTEPRPTNIYASTKLAQEHILAAWTAAHGAGLSI--LRLQNVFGPGQSLTNSYTGIV 186
+ + P ++YA+TK A E L A T +H GL LR V+GP +
Sbjct: 139 STDDSVDHPVSLYAATKKANE--LMAHTYSHLYGLPATGLRFFTVYGPWGRPDMALF--- 193

Query: 187 ALFARLSRARQSLEVYEDGRILRDFVYIDDVVDALFAAVRQPST-DARW----------- 234
F + +S++VY G++ RDF YIDD+ +A+ D +W
Sbjct: 194 -KFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASI 252

Query: 235 -----FDVGSGVSTTIHELAAMTARLCGAPDPRVVGKFRDGDVRAAKCDIRAAVDELGWR 289
+++G+ + + G + + + GDV D +A + +G+
Sbjct: 253 APYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGDVLETSADTKALYEVIGFT 312

Query: 290 PEWALEDGLLALLDW 304
PE ++DG+ ++W
Sbjct: 313 PETTVKDGVKNFVNW 327


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_02890    NUCEPIMERASE    215    5e-70    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 215 bits (549), Expect = 5e-70
Identities = 84/346 (24%), Positives = 145/346 (41%), Gaps = 39/346 (11%)

Query: 3 NVVVTGGYGFVGSHLVTALLDRGDTVTVFDYAKNTHDTSIDFDR-----YPGFRFVQGDV 57
+VTG GF+G H+ LL+ G V D + +D S+ R PGF+F + D+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 58 TDLEALEAAI-TPEVDSVFHLASVVGVNKYVEDPLRVVDVSVIGTRNVLTLCHRHG-ARL 115
D E + + + VF + V +E+P D ++ G N+L C + L
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 116 VFTSTSEVYGKNPGIPWKEDDDRVLGSTRTARWSYSTSKAMAEHMVFGMHDAHGLPVTVV 175
++ S+S VYG N +P+ DD S Y+ +K E M +GLP T +
Sbjct: 122 LYASSSSVYGLNRKMPFSTDD-----SVDHPVSLYAATKKANELMAHTYSHLYGLPATGL 176

Query: 176 RFFNVYGPRQNPIFVLSKSIHRILNGREPLLYDSGDQTRCFTYVDDAVAGTLLAA----- 230
RFF VYGP P L K +L G+ +Y+ G R FTY+DD +
Sbjct: 177 RFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPH 236

Query: 231 ------------GSEAAIGEVFNIGSMTETTMREAVDLAIKIAKVDAVSSTAAFDTAERY 278
+ A V+NIG+ + + + + ++A +
Sbjct: 237 ADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPG--- 293

Query: 279 GARYEDIPRRVPDSTKAQQMLGWQLKIDLEEGIRRTIDWARENPWY 324
D+ D+ +++G+ + +++G++ ++W R+ +Y
Sbjct: 294 -----DVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRD--FY 332


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_02891    ENTSNTHTASED    29    0.035    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 28.8 bits (64), Expect = 0.035
Identities = 9/35 (25%), Positives = 17/35 (48%)

Query: 191 VVGAVDECSARACATLWRHALGVDSVIVEDPRTAE 225
+ G++ C+ A A + R +G+D + TA
Sbjct: 84 LFGSISHCATTALAVISRQRIGIDIEKIMSQHTAT 118


30    NCTC10437_03015    NCTC10437_03027    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
NCTC10437_03015    1    14    3.264925    multidrug ABC transporter inner membrane
NCTC10437_03016    1    14    3.445592    multidrug ABC transporter inner membrane
NCTC10437_03017    1    14    3.327492    daunorubicin resistance ABC transporter ATPase
NCTC10437_03018    1    13    3.508521    beta-ketoacyl synthase
NCTC10437_03019    1    13    3.396028    beta-ketoacyl synthase
NCTC10437_03020    1    12    2.998227    polyketide synthase family protein
NCTC10437_03021    0    12    2.540198    beta-ketoacyl synthase
NCTC10437_03022    0    12    1.983587    acyl-CoA synthetase
NCTC10437_03023    0    12    2.018922    polyketide synthase family protein
NCTC10437_03024    0    11    0.670026    acyl-CoA synthetase
NCTC10437_03025    0    11    0.764809    thioesterase
NCTC10437_03026    0    11    0.832985    glycine dehydrogenase
NCTC10437_03027    2    13    -0.328204    MerR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_03015    ABC2TRNSPORT    30    0.011    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 29.5 bits (66), Expect = 0.011
Identities = 52/242 (21%), Positives = 91/242 (37%), Gaps = 16/242 (6%)

Query: 31 LPQVAVLTARILRR----WSRD-PATLVQSLVMPAVFLAALDIVLGDVIEQVTGNSGLY- 84
LP ++ + RR W + A+L+ L P ++L L LG ++ +V G S
Sbjct: 9 LPGGSLNWIAVWRRNYIAWKKAALASLLGHLAEPLIYLFGLGAGLGVMVGRVGGVSYTAF 68

Query: 85 ---GQVPLVALVGGMTGAIIGAVGIMRERDVGLLSRFWVVPVHRAAGLLARLTADFVRIV 141
G V A+ I A G M + + +L + +
Sbjct: 69 LAAGMVATSAMTAATFETIYAAFGRMEGQ--RTWEAMLYTQLRLGDIVLGEMAWAATKAA 126

Query: 142 VITIVVMGVGLALGFRFEQGIPAAIAWVFMPALFGVALSAAVLTLALFSSSTIAPQATEI 201
+ + V ALG+ + A+ + + L +L V LA I Q +
Sbjct: 127 LAGAGIGVVAAALGYTQWLSLLYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQT--L 184

Query: 202 VIAILMFFSIGFVPLDQYPEWLQPFVENQPVSTTIEAMKGLSLEGP---IAEPVLFSVLW 258
VI ++F S P+DQ P Q P+S +I+ ++ + L P + + V ++
Sbjct: 185 VITPILFLSGAVFPVDQLPIVFQTAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIY 244

Query: 259 AV 260
V
Sbjct: 245 IV 246


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_03019    DHBDHDRGNASE    34    0.004    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 34.3 bits (78), Expect = 0.004
Identities = 27/119 (22%), Positives = 48/119 (40%), Gaps = 2/119 (1%)

Query: 1423 VSGALGGVGLAVVRWLVEAGAGRVVLNGRSAASAEAQSVLDELSARAEIVTVLGDIAAPG 1482
++GA G+G AV R L GA ++ + S L + AE D+
Sbjct: 13 ITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPA--DVRDSA 70

Query: 1483 VAERLTAAAEETGKPLRGLIHSAAVLADELVVGLTRESLDTIWTPKALGAWRLHEVTAT 1541
+ +TA E P+ L++ A VL L+ L+ E + ++ + G + +
Sbjct: 71 AIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSK 129


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_03020    ISCHRISMTASE    38    4e-04    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 37.7 bits (87), Expect = 4e-04
Identities = 17/90 (18%), Positives = 37/90 (41%), Gaps = 2/90 (2%)

Query: 1662 VVDENGSASVEFIDWSSMTRAQTVGELRTRLRAILARELGMPDAAVDFDAAFPELGLDSM 1721
++D+ +A + S+ T + V +R +A L + + GLDS+
Sbjct: 206 LLDQLQNAPADVQKTSANTGKKNVFTCEN-IRKQIAELLQETPEDITDQEDLLDRGLDSV 264

Query: 1722 MAMNLLRDAKQLVRVDLSATMLWNHPSIAQ 1751
M L+ ++ +++ L P+I +
Sbjct: 265 RIMTLVEQWRR-EGAEVTFVELAERPTIEE 293


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_03023    ISCHRISMTASE    31    0.045    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 30.8 bits (69), Expect = 0.045
Identities = 16/79 (20%), Positives = 36/79 (45%), Gaps = 2/79 (2%)

Query: 1712 APAKAWTEMSSEEVLAELEAGLRAVLAIELRIADDEVATDRPFAEMGLNSVMAMSIRREV 1771
A + + + ++ + E +R +A L+ +++ + GL+SV M++ +
Sbjct: 215 ADVQKTSANTGKKNVFTCEN-IRKQIAELLQETPEDITDQEDLLDRGLDSVRIMTLVEQW 273

Query: 1772 EQLAGIELSATMLWNHPTI 1790
+ G E++ L PTI
Sbjct: 274 RR-EGAEVTFVELAERPTI 291


31    NCTC10437_03088    NCTC10437_03114    Y    Y    Y    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
NCTC10437_03088    2    11    1.789292    BadM/Rrf2 family transcriptional regulator
NCTC10437_03089    2    10    2.004877    Major facilitator superfamily MFS_1
NCTC10437_03090    1    11    1.917519    secreted protein
NCTC10437_03091    2    13    0.561343    putative transcriptional regulator
NCTC10437_03092    2    12    0.929855    Uncharacterised protein
NCTC10437_03093    3    13    1.001432    glycosyltransferase
NCTC10437_03094    3    14    0.599325    UDP-N-acetylglucosamine:LPS N-acetylglucosamine
NCTC10437_03095    2    16    -0.240898    sugar transferase
NCTC10437_03096    4    18    -0.793244    ErfK/YbiS/YcfS/YnhG family protein
NCTC10437_03097    2    18    -0.585480    Uncharacterised protein
NCTC10437_03098    3    18    -0.982310    Gcn5-related N-acetyltransferase
NCTC10437_03099    4    20    -0.651847    Allergen V5/Tpx-1 family protein
NCTC10437_03100    5    19    -0.567434    Uncharacterised protein
NCTC10437_03101    5    20    0.405304    TIGR02391 family protein
NCTC10437_03102    5    17    1.459257    Uncharacterised protein
NCTC10437_03103    5    20    1.873051    Uncharacterised protein
NCTC10437_03104    4    20    1.652899    Mu-like prophage I protein
NCTC10437_03105    4    18    1.769625    Uncharacterised protein
NCTC10437_03107    4    19    1.888510*    Uncharacterised protein
NCTC10437_03108    5    20    1.023436    Uncharacterised protein
NCTC10437_03109    5    23    -0.902237    Uncharacterised protein
NCTC10437_03110    7    22    -0.455930    Uncharacterised protein
NCTC10437_03111    5    22    -0.918590    Uncharacterised protein
NCTC10437_03112    3    15    -0.782789    Uncharacterised protein
NCTC10437_03113    2    11    -0.732880    Helix-turn-helix domain
NCTC10437_03114    2    11    -0.597587    Uncharacterised protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_03089    TCRTETB    31    0.011    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 30.6 bits (69), Expect = 0.011
Identities = 10/63 (15%), Positives = 26/63 (41%)

Query: 64 GLVTSAVMGIVVGRWLDRYGPRWIMTTGSVLGALAVLAVTAAPNYPMFVAAWVVAGLAMS 123
G ++ + G + G +DR GP +++ G +++ L + + ++ +
Sbjct: 302 GTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGG 361

Query: 124 AVF 126
F
Sbjct: 362 LSF 364



Score = 30.6 bits (69), Expect = 0.012
Identities = 21/115 (18%), Positives = 42/115 (36%), Gaps = 4/115 (3%)

Query: 257 AVALGLGGAGQVLGRLGYQTLVRRVRVVPRTVIIMGCVAATTALLGLFASYAALALIAIA 316
A L V G+L Q ++R+ + + G V A +
Sbjct: 57 AFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAG 116

Query: 317 AGMVRGIMTLLQATAVTERWGATHYGHLSGILSAPVMIATALGPFIGAALASLLH 371
A ++ ++ A + + + G G++ + V + +GP IG +A +H
Sbjct: 117 AAAFPALVMVVVARYIPKE----NRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIH 167


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_03097    PERTACTIN    34    2e-04    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 33.9 bits (77), Expect = 2e-04
Identities = 24/70 (34%), Positives = 27/70 (38%)

Query: 63 ALPGPLAPPPVVPPLAPPPIPPVVPPLAPPPIPPVVPPLAPPPVPPVIPPAAPLVPLAGA 122
+L G APP P P P P PP P P P PP P P P P A
Sbjct: 561 SLVGAKAPPAPKPAPQPGPQPGPQPPQPPQPPQPPQPPQPPQRQPEAPAPQPPAGRELSA 620

Query: 123 ADGLPLSTMG 132
A ++T G
Sbjct: 621 AANAAVNTGG 630


32    NCTC10437_03172    NCTC10437_03184    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
NCTC10437_03172    3    17    0.377219    short-chain dehydrogenase/reductase SDR
NCTC10437_03173    3    20    0.012485    activator of Hsp90 ATPase 1 family protein
NCTC10437_03174    3    21    0.157731    arsR family transcriptional regulator
NCTC10437_03175    2    11    0.465746    tRNA/rRNA methyltransferase SpoU
NCTC10437_03176    1    11    -0.014546    50S ribosomal protein L20
NCTC10437_03177    -1    10    0.123678    50S ribosomal protein L35
NCTC10437_03178    -2    8    0.981952    initiation factor 3
NCTC10437_03179    -2    8    1.220278    Uncharacterised protein
NCTC10437_03180    -1    9    1.354029    lysyl-tRNA synthetase
NCTC10437_03181    1    10    1.519910    putative esterase
NCTC10437_03182    1    11    1.646864    phage shock protein A (IM30)
NCTC10437_03183    2    12    1.851747    Uncharacterised protein
NCTC10437_03184    2    12    1.030170    Uncharacterised protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
NCTC10437_03172    DHBDHDRGNASE    76    2e-18    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 76.2 bits (187), Expect = 2e-18
Identities = 51/188 (27%), Positives = 86/188 (45%), Gaps = 8/188 (4%)

Query: 8 ALVTGASGGIGEEFAVQLAQRGANLILVARRADKLTAL--RETLLARHPGIVVDVLAADL 65
A +TGA+ GIGE A LA +GA++ V +KL + ARH + AD+
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHA----EAFPADV 66

Query: 66 AVPGSGAELQSGIRELGRTVDVLVNNAGVGLHGKFVTQEPEGNAAQI-QLNCGTLVDLTA 124
+ E+ + I +D+LVN AGV L + + +N + + +
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGV-LRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 125 RFLPGMVERSTGVVINVASTAAFQPTPGMAVYGATKAFVLSYSEALWQECRGTGVTILAL 184
M++R +G ++ V S A P MA Y ++KA + +++ L E + +
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 185 CPGATETE 192
PG+TET+
Sbjct: 186 SPGSTETD 193


33    NCTC10437_03338    NCTC10437_03352    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
NCTC10437_0333828-1.109417BadM/Rrf2 family transcriptional regulator
NCTC10437_0333928-0.832807integral membrane protein
NCTC10437_03340411-0.780983Uncharacterised protein
NCTC10437_03341311-0.187138Uncharacterised protein
NCTC10437_033422110.162519vesicle-fusing ATPase
NCTC10437_03343-1102.452812low molecular weight antigen CFP2
NCTC10437_03344-292.674831putative lipoprotein LppK
NCTC10437_03345-181.216651Protein of uncharacterised function (DUF503)
NCTC10437_03346-290.848048tRNA(1-methyladenosine) methyltransferase-like
NCTC10437_033470100.244116recombinase B
NCTC10437_03348212-0.359243putative thioesterase-like superfamily protein
NCTC10437_03349211-0.450733DinB superfamily
NCTC10437_03350112-0.822812ribonucleotide reductase, beta subunit
NCTC10437_03351211-0.569778TetR family transcriptional regulator
NCTC10437_03352211-0.216927mercuric reductase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_03344  PF06872  28     0.027    EspG protein
		>PF06872#EspG protein

Length = 398

Score = 27.8 bits (61), Expect = 0.027
Identities = 22/70 (31%), Positives = 29/70 (41%), Gaps = 6/70 (8%)

Query: 46 SSTGLPDPIPDSPGGPPLPPSTALVDVMARLSDPAVPGNEKITLIERGTPAEAAGLDRFA 105
S+T +P P P ST +D L GN K++L PA+A L R
Sbjct: 180 SATLIPHNNQTDPLSGLTPFSTVFMDTSRGL------GNSKLSLNGVDIPADAQKLLRNT 233

Query: 106 TALRDNGSLP 115
L+D S P
Sbjct: 234 LGLKDTNSSP 243


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_03351  HTHTETR  70     1e-16    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 69.7 bits (170), Expect = 1e-16
Identities = 32/174 (18%), Positives = 64/174 (36%), Gaps = 6/174 (3%)

Query: 10 GADRRAAVISAAEAEFAAHGFSAGSLNVIARRAGVAKGSLFQYFADKRDLYAFITDVGSQ 69
+ R ++ A F+ G S+ SL IA+ AGV +G+++ +F DK DL++ I ++
Sbjct: 9 AQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSES 68

Query: 70 RVRSYMEDQITRL--DPIRPFFEFLTDLLDVWVGYFAEHPRDRALHAAASFEVDTDARVS 127
+ + + DP+ E L +L+ V + F +
Sbjct: 69 NIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQ 128

Query: 128 VRAVVHRHYLQVLRPLVRDAQARGDLRPDGD----ADALLSLLLMLLPHLALAP 177
+ + + ++ L D A + + L+ + AP
Sbjct: 129 AQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAP 182


34  NCTC10437_03394  NCTC10437_03417  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
NCTC10437_03394  2       8        -1.409324   signal transduction histidine kinase regulating
NCTC10437_03395  1       8        -1.549615   response regulator receiver/uncharacterised
NCTC10437_03396  1       8        -1.583960   chitinase, cellulase
NCTC10437_03397  0       10       0.057277    beta-lactamase
NCTC10437_03398  2       12       0.503090    glutamine amidotransferase
NCTC10437_03399  2       14       0.751203    TIGR01777 family protein
NCTC10437_03400  3       15       0.182687    phosphoribosyltransferase
NCTC10437_03401  1       14       0.556072    transmembrane protein
NCTC10437_03402  0       11       1.115632    DivIVA domain protein
NCTC10437_03403  0       8        1.851225    transmembrane protein
NCTC10437_03404  0       7        2.460276    cell division protein sepF
NCTC10437_03405  0       6        2.778768    alanine racemase domain-containing protein
NCTC10437_03406  0       8        2.722906    uncharacterized protein, YfiH family
NCTC10437_03407  -1      7        3.055185    cell division protein ftsZ
NCTC10437_03408  -1      7        2.780055    polypeptide-transport-associated
NCTC10437_03409  0       7        2.616402    UDP-N-acetylmuramate--L-alanine ligase
NCTC10437_03410  0       9        3.057588    undecaprenyldiphospho-muramoylpentapeptide
NCTC10437_03411  -1      8        2.259912    cell division protein FtsW
NCTC10437_03412  -2      9        2.723081    UDP-N-acetylmuramoyl-L-alanyl-D-glutamate
NCTC10437_03413  0       8        1.902443    Phospho-N-acetylmuramoyl-pentapeptide-
NCTC10437_03414  0       9        1.959609    UDP-N-acetylmuramoylalanyl-D-glutamyl-2,
NCTC10437_03415  2       11       1.513480    UDP-N-acetylmuramoylalanyl-D-glutamate--2,
NCTC10437_03416  2       12       0.271647    peptidoglycan glycosyltransferase
NCTC10437_03417  3       10       0.961761    FHA domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_03394  PF06580  40     2e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.8 bits (93), Expect = 2e-05
Identities = 38/223 (17%), Positives = 78/223 (34%), Gaps = 52/223 (23%)

Query: 338 ELAALQGQLSSHKSVTDTLRAQTHEFA-NQLHTISGLVQLGEYEAVRDLVGTLTR-RRAE 395
+L AL+ Q++ H F N L+ I L+ +A R+++ +L+ R
Sbjct: 162 QLMALKAQINPH-------------FMFNALNNIRALILEDPTKA-REMLTSLSELMRYS 207

Query: 396 ISDAVTQHIS-----DPAVAALLIAKTSLAAESGVALHLDPASHLAALEPALATDVITLL 450
+ + + +S + L +A ++PA + P + L+
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQV-PPM------LV 260

Query: 451 GNLIDNAVD--VSVGAPDACVTIGIDDRDG-LTISVLDTGPGVPEHLREAIFARGVTSKS 507
L++N + ++ + + +G +T+ V +TG ++ +E
Sbjct: 261 QTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE----------- 309

Query: 508 DVPGGRGIGLALV--RLVTA---QHGGTIEVSDGPGGGARFLV 545
G GL V RL + + G A L+
Sbjct: 310 ----STGTGLQNVRERLQMLYGTEAQIKLSEKQG-KVNAMVLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_03395  HTHFIS  79     3e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.1 bits (195), Expect = 3e-19
Identities = 35/140 (25%), Positives = 59/140 (42%), Gaps = 11/140 (7%)

Query: 4 VLVVDDDFMVAEIHRRFVDRVNGFRAVGVARTGAEALAAIRELRPQLILLDVYLPDMTGL 63
+LV DDD + + + + R G+ + A I L++ DV +PD
Sbjct: 6 ILVADDDAAIRTVLNQALSRA-GYDVRITS-NAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 64 DVLQRLRSEGDRVDVIMITAARELDTVRGALDGGAADYLIKPFEFPQL---------ETK 114
D+L R++ + V++++A T A + GA DYL KPF+ +L E K
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 115 LQAYATRADALQSAGGVDQS 134
+ D+ V +S
Sbjct: 124 RRPSKLEDDSQDGMPLVGRS 143


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_03396  FLAGELLIN  35     0.006    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 34.6 bits (79), Expect = 0.006
Identities = 32/269 (11%), Positives = 57/269 (21%), Gaps = 3/269 (1%)

Query: 1589 SFGFQATPGGGSATATNFSVNGVQNPTPVLPKVTVADATVAEGNSGTKNIVFTVTLDKAA 1648
+ G + + + L V A + D A
Sbjct: 140 DNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYDTYA 199

Query: 1649 SAPVSVAYTTAGGTATAGSDFTAKSGVVTFAAGVLSQQISIAVAGDSVVEANETFTVTLS 1708
G + V A A +V T + +
Sbjct: 200 VGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTTKSTAGT 259

Query: 1709 NPTGVTIADGSAVGTITNDDVAPVLPKVTVADATVAEGNSGTKNIVFTVTLDKAATAPVS 1768
D V + G T VTL A +
Sbjct: 260 AEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGA 319

Query: 1769 VAYTTANGTAASGSDFTAKSGVVTFAAGVLS--QQISIAVAGDTAVETNETFTVTLSNPT 1826
A ++ + +G TF + ++S A + + TV + T
Sbjct: 320 ANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNAVKGES-KITVNGAEYT 378

Query: 1827 GVTIADGSAVGTITNDDVVTPTPGNSSAA 1855
D + T T + ++
Sbjct: 379 ANAAGDKVTLAGKTMFIDKTASGVSTLIN 407


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_03399  NUCEPIMERASE  43     2e-06    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 42.8 bits (101), Expect = 2e-06
Identities = 32/147 (21%), Positives = 50/147 (34%), Gaps = 25/147 (17%)

Query: 154 IALTGASGLVGSALSAFLTTGGHRVI-------------KLVRNAAGAD--------DER 192
+TGA+G +G +S L GH+V+ K R A D
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLA 62

Query: 193 QWDPMAPAPDLLEGVDAVVHLAGASIAGRFTAEHRAAIRDSRIEPTRRLAAVAAATENGP 252
+ M + V A R++ E+ A DS + L + N
Sbjct: 63 DREGMTDLFAS-GHFERVFISPHRL-AVRYSLENPHAYADSNLTGF--LNILEGCRHNKI 118

Query: 253 RTFVSASAVGFYGFDRGDTQLTEDSTR 279
+ + AS+ YG +R T+DS
Sbjct: 119 QHLLYASSSSVYGLNRKMPFSTDDSVD 145


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_03402  RTXTOXIND  34     4e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 34.4 bits (79), Expect = 4e-04
Identities = 15/93 (16%), Positives = 33/93 (35%), Gaps = 10/93 (10%)

Query: 136 ESEKMLADARAQADQLVTEARQTAETTVTEARQRADAMLADAQSRSETQLRQAQEKADAL 195
E E +A + ++ Q E+ + A++ + ++ +LRQ + L
Sbjct: 256 EQENKYVEAVNELRVYKSQLEQI-ESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLL 314

Query: 196 Q-----ADAERKHSEIM----GTINQQRTVLEG 219
+ ++ S I + Q + EG
Sbjct: 315 TLELAKNEERQQASVIRAPVSVKVQQLKVHTEG 347



Score = 31.0 bits (70), Expect = 0.006
Identities = 24/150 (16%), Positives = 57/150 (38%), Gaps = 14/150 (9%)

Query: 113 LRAAKILSLAQDTADRLTGSAKSESEKMLADARAQADQLVTEARQTAETTVTEARQRADA 172
R+ ++ L + E++L +Q T Q + + ++RA+
Sbjct: 157 SRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAER 216

Query: 173 MLADAQ-SRSETQLRQAQEKADALQA-------------DAERKHSEIMGTINQQRTVLE 218
+ A+ +R E R + + D + + E K+ E + + ++ LE
Sbjct: 217 LTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLE 276

Query: 219 GRLEQLRTFEREYRTRLKTYLESQLEELGQ 248
++ + + EY+ + + L++L Q
Sbjct: 277 QIESEILSAKEEYQLVTQLFKNEILDKLRQ 306


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_03405  ALARACEMASE  29     0.028    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 28.6 bits (64), Expect = 0.028
Identities = 27/122 (22%), Positives = 47/122 (38%), Gaps = 10/122 (8%)

Query: 123 AEALEDGRRTEPLRVYLQLSLDGDEQRGGVDVDAADRIDQLCAAVDAAPGLAFVGLMAIP 182
+AL++ R PL +YL+ ++ R G DR+ + + A + + LM+
Sbjct: 106 LKALQNARLKAPLDIYLK--VNSGMNRLGF---QPDRVLTVWQQLRAMANVGEMTLMSHF 160

Query: 183 PLGADPGESFARLQRERDRVQQ-DFQQRLGLSAGMSGDLEAAVEHGSTCVRVGTALMGSR 241
P + R + + ++ L SA EA + VR G L G+
Sbjct: 161 AEAEHPDGISGAMARIEQAAEGLECRRSLSNSAATLWHPEAHFDW----VRPGIILYGAS 216

Query: 242 PL 243
P
Sbjct: 217 PS 218


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_03408  PF05616  32     0.003    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 32.4 bits (73), Expect = 0.003
Identities = 19/46 (41%), Positives = 22/46 (47%), Gaps = 2/46 (4%)

Query: 5 PTDDTTAGQAPDDPAEPPPEPAPADTEAAEPAAEPAEKPAGAPADE 50
P D T G A A+P PE +PA+ A PA P E P P E
Sbjct: 311 PRPDLTPGSAEAPNAQPLPEVSPAENPANNPA--PNENPGTRPNPE 354


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_03417  PERTACTIN  35     4e-04    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 35.1 bits (80), Expect = 4e-04
Identities = 22/69 (31%), Positives = 27/69 (39%), Gaps = 2/69 (2%)

Query: 202 LVQDATGNWVVVGTPKPAEGLPPPPLNAPLPEPARPAPPAPRVVEPLEVPVRIPSAPAPP 261
L + G W +VG P P P P P+ +P + P R P APAP
Sbjct: 552 LAANGNGQWSLVGAKAPPAPKPAPQPGPQPGPQPPQPPQPPQPPQPPQPPQRQPEAPAPQ 611

Query: 262 LIPGAGPAL 270
P AG L
Sbjct: 612 --PPAGREL 618



Score = 28.5 bits (63), Expect = 0.048
Identities = 16/46 (34%), Positives = 17/46 (36%)

Query: 308 PAPGPAPEVPATATVPAAPAIPGPPPAAPLVADQFAPVPPAPGAPA 353
PAP PAP+ P P P Q P PAP PA
Sbjct: 569 PAPKPAPQPGPQPGPQPPQPPQPPQPPQPPQPPQRQPEAPAPQPPA 614


35  NCTC10437_03453  NCTC10437_03464  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
NCTC10437_03453  0       12       3.818743    membrane protein
NCTC10437_03454  -1      11       3.203726    Iron-sulfur cluster assembly accessory protein
NCTC10437_03455  -1      10       3.214721    glycerate kinase
NCTC10437_03456  -1      10       2.266009    Protein of uncharacterised function (DUF3043)
NCTC10437_03457  0       8        2.238511    nicotinate-nucleotide--dimethylbenzimidazole
NCTC10437_03458  0       8        1.027620    cobalamin synthase
NCTC10437_03459  1       8        0.466442    branched-chain amino acid aminotransferase,
NCTC10437_03460  2       9        0.572856    glycine cleavage system aminomethyltransferase
NCTC10437_03461  2       9        0.492887    leucyl aminopeptidase
NCTC10437_03462  2       10       0.546031    short-chain dehydrogenase of uncharacterised
NCTC10437_03463  2       11       -0.033609   alkaline phosphatase
NCTC10437_03464  2       11       0.305061    Uncharacterised protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_03462  DHBDHDRGNASE  108    1e-28    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 108 bits (270), Expect = 1e-28
Identities = 65/185 (35%), Positives = 88/185 (47%), Gaps = 1/185 (0%)

Query: 272 VAITGAASGIGRETALAFARDGADVVISDLNADGLAETARLVAAEGAEAHTYTVDVADPA 331
ITGAA GIG A A GA + D N + L + + AE A + DV D A
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 332 AVEAFAEQVCAQHGVPDVVVNNAGVGHAGFFLDTPAEEFDRVLDINFGGVVNGCRSFGKR 391
A++ ++ + G D++VN AGV G EE++ +N GV N RS K
Sbjct: 71 AIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKY 130

Query: 392 LADRGTGGYIVNVASMASYSPLSVMNAYCTSKAAVFMFSDCLRAELDAAGIGVTTICPGV 451
+ DR G IV V S + P + M AY +SKAA MF+ CL EL I + PG
Sbjct: 131 MMDR-RSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGS 189

Query: 452 IGTNI 456
T++
Sbjct: 190 TETDM 194


36  NCTC10437_03914  NCTC10437_03929  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
NCTC10437_03914  2       15       -2.650244   6-phosphogluconate dehydrogenase
NCTC10437_03915  2       14       -3.665764   6-phosphogluconate dehydrogenase
NCTC10437_03916  2       16       -4.082583   cytochrome P450
NCTC10437_03917  1       16       -2.968794   ferredoxin
NCTC10437_03918  1       14       -2.729478   Uncharacterized conserved protein
NCTC10437_03919  1       13       -3.348010   ferredoxin
NCTC10437_03920  0       13       -3.072577   cytochrome P450
NCTC10437_03921  0       11       -3.062507   carboxymuconolactone decarboxylase
NCTC10437_03922  0       10       -3.189316   taurine catabolism dioxygenase TauD/TfdA
NCTC10437_03923  -2      9        -2.713265   Xaa-Pro dipeptidase
NCTC10437_03924  -1      9        -2.862542   Rieske (2Fe-2S) domain-containing protein
NCTC10437_03925  -1      10       -3.096890   short-chain dehydrogenase/reductase SDR
NCTC10437_03926  -1      10       -2.183487   ferredoxin
NCTC10437_03927  0       11       -2.134413   MarR family transcriptional regulator
NCTC10437_03928  -1      10       -2.940240   amidohydrolase
NCTC10437_03929  1       11       -3.359181   Xaa-Pro dipeptidase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_03925  DHBDHDRGNASE  132    7e-40    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 132 bits (333), Expect = 7e-40
Identities = 85/251 (33%), Positives = 123/251 (49%), Gaps = 6/251 (2%)

Query: 3 RVAVVTGGASGMGEATCHELGRRGHAVAVLDLDEVAAQRVVDSLRADGVSALAVGVDVSD 62
++A +TG A G+GEA L +G +A +D + ++VV SL+A+ A A DV D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 63 HAAVGEAFAKVRTELGPVHILVTSAGMVAFAPSIEITPADWKRVIDVNLTGTFYCCQSAL 122
AA+ E A++ E+GP+ ILV AG++ ++ +W+ VN TG F +S
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 123 PDMIEAGWGRIVMISSSSAQRGSPGMAHYAASKGALISLTKSFAREYGPTGITVNNIPPS 182
M++ G IV + S+ A MA YA+SK A + TK E I N + P
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 183 GIETPMQHQSQAEGHLPPN------EQMAASIPIGHLGTGDDIAAAVGFLCSEEAGFITG 236
ET MQ A+ + E IP+ L DIA AV FL S +AG IT
Sbjct: 189 STETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHITM 248

Query: 237 QTLGVNGGSVM 247
L V+GG+ +
Sbjct: 249 HNLCVDGGATL 259


37  NCTC10437_04065  NCTC10437_04072  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
NCTC10437_04065  4       15       -1.227808   ATP:cob(I)alamin adenosyltransferase
NCTC10437_04066  6       16       -0.575532   Protein of uncharacterised function (DUF2550)
NCTC10437_04067  6       16       -0.901999   ATP synthase, F1 epsilon subunit
NCTC10437_04068  5       17       -0.896754   ATP synthase F1 subunit beta
NCTC10437_04069  5       14       -1.466347   ATP synthase F1 subcomplex gamma subunit
NCTC10437_04070  5       14       -2.057079   ATP synthase F1 subcomplex alpha subunit
NCTC10437_04071  3       9        -1.411329   ATP synthase subunit H
NCTC10437_04072  2       7        -1.707412   F0F1 ATP synthase subunit B
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_04072  IGASERPTASE  27     0.036    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 27.3 bits (60), Expect = 0.036
Identities = 21/102 (20%), Positives = 38/102 (37%), Gaps = 4/102 (3%)

Query: 41 IAKWVVPPVSKVIAAREAMLAKTAADNRKSAEQ-VAAAQADYDETMAGARSQASSIRDEA 99
IA+ PV A + +T A+N K + V + D ET A R A +
Sbjct: 1017 IARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNV 1076

Query: 100 RTAGRQVVDEKRAEASDEVAETVRQADAELSDQRASTQTQLQ 141
+ + + A++ E ET E + + +++
Sbjct: 1077 KANTQT---NEVAQSGSETKETQTTETKETATVEKEEKAKVE 1115


38  NCTC10437_04087  NCTC10437_04107  Y  N  Y  Genomic Island
LocusTag         DNBias  CDNBias  %GCBias     Product
NCTC10437_04087  2       15       -2.847512   threonine synthase
NCTC10437_04088  4       17       -3.544738   homoserine dehydrogenase
NCTC10437_04089  5       26       -6.547550   diaminopimelate decarboxylase
NCTC10437_04090  8       36       -9.304345   arginyl-tRNA synthetase
NCTC10437_04092  10      59       -14.546154  *Uncharacterised protein
NCTC10437_04093  9       58       -13.804592  Uncharacterised protein
NCTC10437_04094  9       57       -13.388310  Uncharacterised protein
NCTC10437_04095  9       57       -13.705935  Putative bacteriophage protein
NCTC10437_04096  9       56       -13.726782  Uncharacterised protein
NCTC10437_04097  9       55       -13.160430  Uncharacterised protein
NCTC10437_04098  8       54       -12.955722  bacteriophage protein
NCTC10437_04099  7       45       -10.937605  Uncharacterised protein
NCTC10437_04100  5       39       -9.731389   Uncharacterised protein
NCTC10437_04101  2       29       -6.119276   Uncharacterised protein
NCTC10437_04102  1       22       -3.934049   phage integrase family protein
NCTC10437_04103  1       14       -1.146032   Uncharacterised protein
NCTC10437_04104  0       13       0.241113    MspA protein
NCTC10437_04105  1       16       0.081589    Uncharacterised protein
NCTC10437_04106  0       13       0.030527    pyruvate/2-oxoglutarate dehydrogenase complex,
NCTC10437_04107  2       13       0.185582    Uncharacterised protein
39  NCTC10437_04122  NCTC10437_04145  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
NCTC10437_04122  2       10       0.266069    C4-dicarboxylate transporter/malic acid
NCTC10437_04123  0       8        0.634152    integral membrane sensor signal transduction
NCTC10437_04124  0       9        0.930567    two component LuxR family transcriptional
NCTC10437_04125  1       9        1.022609    Uncharacterised protein
NCTC10437_04126  -1      10       1.076879    transposase
NCTC10437_04127  -1      7        2.134926    transposase
NCTC10437_04128  -1      9        0.923500    amine oxidase
NCTC10437_04129  2       16       -0.369189   2-nitropropane dioxygenase
NCTC10437_04130  4       24       -4.593014   glyoxalase/bleomycin resistance
NCTC10437_04131  6       28       -5.296588   BadM/Rrf2 family transcriptional regulator
NCTC10437_04132  6       34       -6.685855   nicotinamidase-like amidase
NCTC10437_04133  6       35       -6.520543   Uncharacterised protein
NCTC10437_04134  4       34       -6.320915   Uncharacterised protein
NCTC10437_04135  5       33       -5.688529   Putative bacteriophage protein
NCTC10437_04136  6       33       -4.838566   Uncharacterised protein
NCTC10437_04137  5       34       -4.546279   Uncharacterised protein
NCTC10437_04138  5       32       -4.306281   Gp153
NCTC10437_04139  8       30       -4.968036   Gp153
NCTC10437_04140  11      34       -6.807959   Uncharacterised protein
NCTC10437_04141  10      34       -7.494808   Y4cG protein
NCTC10437_04142  7       30       -6.798270   Uncharacterised protein
NCTC10437_04143  7       25       -5.317386   Uncharacterised protein
NCTC10437_04144  4       16       -4.343296   phage integrase family protein
NCTC10437_04145  2       11       -3.755119   Uncharacterised protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_04124  HTHFIS  53     3e-10    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 52.5 bits (126), Expect = 3e-10
Identities = 24/84 (28%), Positives = 40/84 (47%), Gaps = 4/84 (4%)

Query: 2 RIVIAEDSALLRAGIERILTDAGHQVVAGVPDATNLLRLVNDEHPDLAILDVRMPPTFTD 61
I++A+D A +R + + L+ AG+ V +A L R + DL + DV MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPD---E 60

Query: 62 EGIRAAALLRSQNPESPVLVLSHY 85
++ P+ PVLV+S
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQ 84


40  NCTC10437_04234  NCTC10437_04239  Y  N  N  Genomic Island
LocusTag         DNBias  CDNBias  %GCBias     Product
NCTC10437_04234  3       12       -0.100184   Conserved exported protein of uncharacterised
NCTC10437_04235  3       12       0.036336    DNA-3-methyladenine glycosylase I
NCTC10437_04236  2       10       1.517509    DivIVA domain-containing protein
NCTC10437_04237  2       10       1.390035    glycosyl transferase
NCTC10437_04238  2       10       1.764009    dihydropteroate synthase
NCTC10437_04239  2       10       1.512077    long-chain-acyl-CoA synthetase
41  NCTC10437_04272  NCTC10437_04284  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
NCTC10437_04272  2       20       -1.078022   Uncharacterised protein
NCTC10437_04273  0       16       0.256685    Uncharacterised protein
NCTC10437_04274  0       17       1.880435    Uncharacterised protein
NCTC10437_04275  1       16       3.177530    Uncharacterised protein
NCTC10437_04276  2       16       2.989586    ADP-ribose pyrophosphatase
NCTC10437_04277  2       14       2.570333    pterin-4-alpha-carbinolamine dehydratase
NCTC10437_04278  3       13       2.801908    mannosyltransferase
NCTC10437_04279  3       15       2.812542    Uncharacterised protein
NCTC10437_04280  1       13       1.353015    conserved alanine and proline rich protein
NCTC10437_04281  2       14       -1.336577   Protein of uncharacterised function (DUF1059)
NCTC10437_04282  1       13       -0.460137   nitroreductase
NCTC10437_04283  1       14       -0.536928   site-specific recombinase, DNA invertase Pin
NCTC10437_04284  2       13       -0.308515   HhH-GPD family protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04276  SHAPEPROTEIN  26     0.039    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 26.3 bits (58), Expect = 0.039
Identities = 20/55 (36%), Positives = 22/55 (40%)

Query: 25 PPELAGLWELPGGKLAPGESDRAGLARELREELGIDVTVGDRLGEDVALSASVTL 79
PPELA G L G + L R L EE GI V V + VA L
Sbjct: 279 PPELASDISERGMVLTGGGALLRNLDRLLMEETGIPVVVAEDPLTCVARGGGKAL 333


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_04279  SALSPVBPROT  29     0.017    Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 28.6 bits (63), Expect = 0.017
Identities = 13/28 (46%), Positives = 15/28 (53%)

Query: 83 ATPVSGPGATASINLPQPINTMQGTVPG 110
A SGP ASI LP PI+ +G P
Sbjct: 26 ALSQSGPDGLASITLPLPISAERGFAPA 53


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04280  PF05616  42     2e-06    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 42.0 bits (98), Expect = 2e-06
Identities = 27/70 (38%), Positives = 29/70 (41%), Gaps = 2/70 (2%)

Query: 137 PMALPEVAHAPAAAAEPGVVPAAVPAPGPAP--VPGPAPAPGPAPAPGAEVATPVAAAPG 194
P P A AP A P V PA PA PAP PG P P P P + PG
Sbjct: 313 PDLTPGSAEAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQPG 372

Query: 195 FGPDAPVTQD 204
PD+P D
Sbjct: 373 TRPDSPAVPD 382


42  NCTC10437_04427  NCTC10437_04433  Y  N  N  Genomic Island
LocusTag         DNBias  CDNBias  %GCBias     Product
NCTC10437_04427  0       10       -3.236006   Cytochrome D1 heme domain protein
NCTC10437_04428  2       10       -4.504856   anti-sigma-factor antagonist
NCTC10437_04429  1       10       -3.671636   anti-sigma-factor antagonist
NCTC10437_04430  1       10       -3.275751   Protein of uncharacterised function (DUF732)
NCTC10437_04431  1       10       -3.576885   polar amino acid ABC transporter inner membrane
NCTC10437_04432  1       10       -1.728651   ABC transporter-like protein
NCTC10437_04433  2       13       -0.770217   Uncharacterised protein
43  NCTC10437_04462  NCTC10437_04468  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
NCTC10437_04462  1       16       -3.094690   universal stress protein UspA-like protein
NCTC10437_04463  1       17       -3.848021   amino acid permease-associated protein
NCTC10437_04464  2       23       -4.030028   protein of uncharacterised function DUF222
NCTC10437_04465  6       18       -3.359691   Protein of uncharacterised function (DUF3303)
NCTC10437_04466  1       13       -3.274350   deazaflavin-dependent nitroreductase family
NCTC10437_04468  0       13       -3.292340   *UsfY protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_04465  STREPKINASE  25     0.042    Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 25.4 bits (55), Expect = 0.042
Identities = 12/36 (33%), Positives = 20/36 (55%), Gaps = 1/36 (2%)

Query: 7 MDFRLNGSAEENEAAASRILDLYSKWSPP-ASATFH 41
MD+ L G E+N +RI+ +Y P +A++H
Sbjct: 372 MDYTLTGKVEDNHDDTNRIITVYMGKRPEGENASYH 407


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04466  FbpA_PF05833  27     0.031    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 26.8 bits (59), Expect = 0.031
Identities = 12/39 (30%), Positives = 19/39 (48%), Gaps = 1/39 (2%)

Query: 37 RKSGRRYRTPLLVFQTRDGYAILVGY-GLQTDWLKNALA 74
KS + + + F ++DG I VG +Q D+L A
Sbjct: 449 YKSKKSKTSKPMHFISKDGIDIYVGKNNIQNDYLTLKFA 487


44  NCTC10437_04767  NCTC10437_04801  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
NCTC10437_04767  2       8        0.113501    Protein of uncharacterised function (DUF3027)
NCTC10437_04768  1       10       -0.808607   major facilitator superfamily transporter
NCTC10437_04769  0       11       -0.616700   Protein of uncharacterised function (DUF2771)
NCTC10437_04770  -1      11       -0.491040   putative glutathione S-transferase
NCTC10437_04771  1       12       2.534865    cullin, a subunit of E3 ubiquitin ligase
NCTC10437_04772  0       11       3.059543    cold shock protein
NCTC10437_04773  -1      12       3.198157    cold-shock protein
NCTC10437_04774  -2      11       3.461811    molybdenum cofactor biosynthesis protein A
NCTC10437_04775  -1      15       3.695861    molybdopterin converting factor, small subunit
NCTC10437_04777  -1      11       3.800475    transglycosylase-like protein
NCTC10437_04778  1       11       1.207167    molybdopterin biosynthesis MoaE
NCTC10437_04779  0       8        0.684886    molybdopterin adenylyltransferase
NCTC10437_04780  0       9        0.484210    molybdenum cofactor biosynthesis protein MoaC
NCTC10437_04781  0       10       0.227178    Uncharacterised protein
NCTC10437_04782  0       9        0.121673    DNA-binding protein
NCTC10437_04783  0       9        -0.774962   WD40 domain-containing protein
NCTC10437_04784  2       8        -0.837124   DNA or RNA helicase of superfamily protein II
NCTC10437_04785  1       10       -0.699835   lactoylglutathione lyase family protein
NCTC10437_04786  0       10       -1.111348   luciferase family protein
NCTC10437_04787  0       12       -0.913880   pyridoxamine 5'-phosphate oxidase-like protein
NCTC10437_04788  0       12       -0.887405   AMP-dependent synthetase and ligase
NCTC10437_04789  -1      13       -1.071725   transcriptional regulator containing an amidase
NCTC10437_04790  1       12       -1.223568   Uncharacterised protein
NCTC10437_04791  2       10       -1.413486   3-hydroxyacyl-CoA dehydrogenase
NCTC10437_04792  1       8        -1.345624   acetyl-CoA acetyltransferase
NCTC10437_04793  1       12       -1.478373   Uncharacterised protein
NCTC10437_04794  1       11       -1.542931   NADH:flavin oxidoreductase
NCTC10437_04795  2       12       -1.981771   Putative dihydrolipoamide s-acetyltransferase
NCTC10437_04796  1       11       -2.417731   pyruvate/2-oxoglutarate dehydrogenase complex,
NCTC10437_04797  0       11       -2.758765   lactoylglutathione lyase-like lyase
NCTC10437_04798  0       11       -2.727279   GntR family transcriptional regulator
NCTC10437_04799  1       11       -3.012671   succinate-semialdehyde dehydrogenase [NADP+] 1
NCTC10437_04800  1       10       -2.917886   carboxymuconolactone decarboxylase
NCTC10437_04801  1       9        -3.233570   peptidase M24
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_04777  PERTACTIN  36     4e-04    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 35.8 bits (82), Expect = 4e-04
Identities = 25/67 (37%), Positives = 28/67 (41%), Gaps = 1/67 (1%)

Query: 139 GLNGLLPPPPPPLDPFAPPPPPPAPAPFDAMAAPAPAPLPPAPEALPPAPLPPAPVDQLS 198
L G PP P P P P P P P P PP + PAP PPA +LS
Sbjct: 561 SLVGAKAPPAPKPAPQPGPQPGPQPPQPPQPPQPPQPPQPPQRQPEAPAPQPPAG-RELS 619

Query: 199 APAPAPL 205
A A A +
Sbjct: 620 AAANAAV 626



Score = 31.6 bits (71), Expect = 0.008
Identities = 32/85 (37%), Positives = 33/85 (38%), Gaps = 15/85 (17%)

Query: 103 LASQGKGAWPTCGRPLSAATPRNVVADPPPAPVNPLGLNGLLPPPPPPLDPFAPPPPPPA 162
LA+ G G W G A PPAP P G P P PP P PP PP
Sbjct: 552 LAANGNGQWSLVG------------AKAPPAP-KPAPQPGPQPGPQPPQPP--QPPQPPQ 596

Query: 163 PAPFDAMAAPAPAPLPPAPEALPPA 187
P APAP PPA L A
Sbjct: 597 PPQPPQRQPEAPAPQPPAGRELSAA 621



Score = 31.2 bits (70), Expect = 0.010
Identities = 26/54 (48%), Positives = 27/54 (50%), Gaps = 6/54 (11%)

Query: 155 APPPPPPAPAPFDAMAAPAPAPLPPAPEALPPAPLPPAPVDQLSAPAPAPLPPS 208
APP P PAP P P P P PP P P P PP P Q APAP PP+
Sbjct: 567 APPAPKPAPQP-----GPQPGPQPPQPPQPPQPPQPPQP-PQRQPEAPAPQPPA 614


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04783  INTIMIN  38     3e-04    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 37.7 bits (87), Expect = 3e-04
Identities = 47/283 (16%), Positives = 82/283 (28%), Gaps = 34/283 (12%)

Query: 771 TVTITVTEADGDTSPAPGIVFANSIPALRPTVAVASLPA--VYTIGKPGTALAPVVIIED 828
T T TV + + P S A+ + A+ T+ VV+
Sbjct: 579 TYTATVKKNGVAQANVPVSFNIVSGTAV-LSANSANTNGSGKATVTLKSDKPGQVVVSAK 637

Query: 829 LDSAVLTGATVSVGVGKQTGDALTYTAPPGITITVTGNGTSTLTFSGTATKEQYEAALEA 888
+V QT ++T T NG +T++ K + +
Sbjct: 638 TAEMTSALNANAVIFVDQTKASITEIKADKTTAV--ANGQDAITYTVKVMKGDKPVSNQE 695

Query: 889 VTFSATGGALIPRTFAINVTDDSGLAALLPATASAVVKNPDAPTVVTVGLSGLALPTVGA 948
VTF+ T G L T D+ A + T++ K+ + V V + A
Sbjct: 696 VTFTTTLGKLSNST----EKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKA------ 745

Query: 949 TVKPITLATIVDTDSTVLTGASVRITGNRTSGDTLGYSPIAGNPVTATYNSGTGELELSG 1008
+ + + ++ I G G T G L+ SG
Sbjct: 746 -------PEVEFFTTLTIDDGNIEIVGTGVKGKLP----------TVWLQYGQVNLKASG 788

Query: 1009 TATIAQYKQALEAVTFKATQFGGGFLDVTHIRTLSVYVTDDSN 1051
+Y + G + + T ++ V N
Sbjct: 789 GNG--KYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDN 829


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_04791  adhesinmafb  30     0.042    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 29.6 bits (66), Expect = 0.042
Identities = 35/168 (20%), Positives = 56/168 (33%), Gaps = 37/168 (22%)

Query: 131 DVKGVVIGL--PEVTLGLLPGGGGVARTTRMFGIQKAFMEVLSQGTRFKPGKAKDIGLVD 188
+ GV G P ++ G G G + TR + I KA M + GK IG +
Sbjct: 227 FINGVAAGALNPFISAGEALGIGDILYGTR-YAIDKAAMRNI--APLPAEGKFAVIGGLG 283

Query: 189 ELVGSVEELIPAAKAWIKANPDSHTQPWDVKGYKMPGGTPSSPALAAILPSFPALLRKQL 248
+ G + A WI+ NP+ ++ + A+ A +L
Sbjct: 284 SVAGFEKNTREAVDRWIQENPN------------------AAETVEAVFNVAAAAKVAKL 325

Query: 249 KGAPMPAPRAILDAVVEGAQVDFDTASRIESRYFTTLVTGQTAKNMIQ 296
A P A+ DF + Y L +A+ + Q
Sbjct: 326 AKAAKPGKAAVSG--------DFADS------YKKKLALSDSARQLYQ 359


45  NCTC10437_04912  NCTC10437_04920  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
NCTC10437_04912  2       13       -1.638990   Uncharacterised protein
NCTC10437_04913  1       12       -1.656826   Uncharacterised protein
NCTC10437_04914  1       11       -1.677088   carboxymuconolactone decarboxylase
NCTC10437_04915  1       12       -1.191016   6-phosphogluconate dehydrogenase
NCTC10437_04916  2       11       -2.105797   dehydrogenase of uncharacterised specificity,
NCTC10437_04917  2       10       -2.001875   aldehyde dehydrogenase
NCTC10437_04918  2       12       -3.362638   TetR family transcriptional regulator
NCTC10437_04919  2       14       -3.202020   cytochrome P450
NCTC10437_04920  2       13       -3.103402   short chain dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04916  DHBDHDRGNASE  127    1e-37    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 127 bits (319), Expect = 1e-37
Identities = 81/251 (32%), Positives = 117/251 (46%), Gaps = 15/251 (5%)

Query: 11 KVAIVTGAGGGIGQAYAEALAREGAAVVVADLNLEGAQKVADGIKGEGGNALAVRVDVSD 70
K+A +TGA GIG+A A LA +GA + D N E +KV +K E +A A DV D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 71 IDSAKDMAAQTLSEFGGIDYLVNNAAIFGGMKLDFLITVDWDYYKKFMSVNLDGALVCTR 130
+ ++ A+ E G ID LVN A + + L +W + SVN G +R
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEW---EATFSVNSTGVFNASR 125

Query: 131 AVYRKMAKRGGGAIVNQSSTAA---WLYSNFYGLAKAGVNSLTQQLATELGGQNIRVNAI 187
+V + M R G+IV S A Y +KA T+ L EL NIR N +
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 188 APGPIDTEANRTT-TPQEMVADIVK--------GIPLSRMGEVDDLTGMCLFLLSDQAKW 238
+PG +T+ + + ++K GIPL ++ + D+ LFL+S QA
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 239 VTGQIFNVDGG 249
+T VDGG
Sbjct: 246 ITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04918  HTHTETR  52     2e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 51.9 bits (124), Expect = 2e-10
Identities = 18/135 (13%), Positives = 42/135 (31%), Gaps = 5/135 (3%)

Query: 20 RRQEETFRKVLAAGLEMLRETSYADLTVRAVAARAKVAPATAYTYFSSKNHLIAEIYLDL 79
+ +ET + +L L + + + ++ +A A V Y +F K+ L +EI+
Sbjct: 7 QEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELS 66

Query: 80 MQKV-----EYFTDVNDSKATRVEATLRTMALTLADDPEVAAACTTALLSSSDAAVRAVR 134
+ EY + + L + + + AV
Sbjct: 67 ESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVV 126

Query: 135 ERIGTEIHRRIRAAV 149
++ + +
Sbjct: 127 QQAQRNLCLESYDRI 141


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04920  DHBDHDRGNASE  117    8e-34    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 117 bits (294), Expect = 8e-34
Identities = 67/191 (35%), Positives = 105/191 (54%)

Query: 13 ALVAGASSGIGEATAIDLAARGFPVALGARRVEKLEQIVDKIRADGGEAIAVHLDVTDPD 72
A + GA+ GIGEA A LA++G +A EKLE++V ++A+ A A DV D
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 73 SVKAAVEQTTSELGDIEVLVAGAGDTYFGKLDAISTEQFESQIQIHLIGANRVATAVLPG 132
++ + E+G I++LV AG G + ++S E++E+ ++ G + +V
Sbjct: 71 AIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKY 130

Query: 133 MIERQRGDLIFVGSDVALRQRPHMGAYGAAKAALVAMVTNYQMELEGTGVRASIVHPGPT 192
M++R+ G ++ VGS+ A R M AY ++KAA V +EL +R +IV PG T
Sbjct: 131 MMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGST 190

Query: 193 KTSMGWSLPAD 203
+T M WSL AD
Sbjct: 191 ETDMQWSLWAD 201


46  NCTC10437_05016  NCTC10437_05034  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_05016  2       13       -2.681619  deazaflavin-dependent nitroreductase family
NCTC10437_05017  1       12       -2.464378  GAF domain-containing protein
NCTC10437_05018  0       13       -1.496302  short-chain dehydrogenase
NCTC10437_05019  0       13       -1.356579  dehydrogenase of uncharacterised specificity,
NCTC10437_05020  0       13       -1.416125  enoyl-CoA hydratase/isomerase
NCTC10437_05021  1       13       -1.796753  coenzyme A transferase
NCTC10437_05022  2       11       -1.883657  acyl CoA:acetate/3-ketoacid CoA transferase
NCTC10437_05023  2       13       -2.189216  2-nitropropane dioxygenase-like enzyme
NCTC10437_05024  2       15       -3.258200  oxidoreductase, Rxyl_3153 family
NCTC10437_05025  1       20       -4.111740  acetyl-CoA acetyltransferase
NCTC10437_05026  1       24       -4.624168  transcriptional regulator
NCTC10437_05027  2       25       -3.999527  dehydrogenase
NCTC10437_05028  2       24       -4.559499  putative dehydrogenase
NCTC10437_05029  2       23       -4.512217  membrane protein involved in the export of
NCTC10437_05030  2       23       -4.347163  putative glycosyltransferase
NCTC10437_05031  1       20       -4.078849  Lipid A core - O-antigen ligase and related
NCTC10437_05032  1       21       -3.674615  cellobiohydrolase A
NCTC10437_05033  2       21       -3.722261  Uncharacterised protein
NCTC10437_05034  1       21       -3.105761  glycosyl transferase family protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05018  DHBDHDRGNASE  73     3e-17    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 73.2 bits (179), Expect = 3e-17
Identities = 55/216 (25%), Positives = 96/216 (44%), Gaps = 21/216 (9%)

Query: 4 LDGRVVIVTGAGGGIGRAHAMAFAAEGARVVVNDIGVGLDGSPAGGGSAAQNVVDEIVAA 63
++G++ +TGA GIG A A A++GA + +D +P + VV + A
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIA------AVDYNP----EKLEKVVSSLKAE 55

Query: 64 GGEAVTSGANVADWTQAEGLIQTAVDSFGGLDVLVNNAGIVRDRMFANTSEEEFDAVTAV 123
A A+V D + + G +D+LVN AG++R + + S+EE++A +V
Sbjct: 56 ARHAEAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSV 115

Query: 124 HLKGHFATMKHAAAYWRAKVKSGEAGPDTLHARIINTSSGAGLQGSVGQANYSAAKAGIA 183
+ G F + + Y + +SG I+ S A Y+++KA
Sbjct: 116 NSTGVFNASRSVSKYMMDR-RSGS---------IVTVGSNPAGVPRTSMAAYASSKAAAV 165

Query: 184 ALTLVAAAEMGRYGVNVNAIAP-SARTRMTETVFAD 218
T E+ Y + N ++P S T M +++AD
Sbjct: 166 MFTKCLGLELAEYNIRCNIVSPGSTETDMQWSLWAD 201


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05019  DHBDHDRGNASE  100    2e-27    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 100 bits (249), Expect = 2e-27
Identities = 70/254 (27%), Positives = 108/254 (42%), Gaps = 23/254 (9%)

Query: 21 GLARRVVLVTGGVRGVGAGISKVFADQGATVITCARRP-----VEQS-------PYEFRA 68
G+ ++ +TG +G+G +++ A QGA + P V S F A
Sbjct: 5 GIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPA 64

Query: 69 CDIRDDDAVKALIDGVAADHGRLDVVVNNAGGSPYVPAAEASAKFSTKILQLNLLGALSV 128
D+RD A+ + + + G +D++VN AG S + +N G +
Sbjct: 65 -DVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 129 STHANAVMQNQDSGGSIVNISSVSGHRPTPGTAAYGAAKAGIDNLTATLAVEWAP-KVRV 187
S + M ++ GSIV + S P AAY ++KA T L +E A +R
Sbjct: 124 SRSVSKYMMDR-RSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 188 NSVVVGMVETEQAELFYGDADSIAAISKN--------VPLGRLAKPDDIGWATAFLASDA 239
N V G ET+ + D + + K +PL +LAKP DI A FL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 240 ASYISGASLEVHGG 253
A +I+ +L V GG
Sbjct: 243 AGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05026  HTHTETR       85     1e-22    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 85.1 bits (210), Expect = 1e-22
Identities = 33/191 (17%), Positives = 73/191 (38%), Gaps = 6/191 (3%)

Query: 8 RRDELLALAATMFAERGLRATTVRDIADSAGILSGSLYHHFSSKEEMVDEVL-RNFLDWL 66
R +L +A +F+++G+ +T++ +IA +AG+ G++Y HF K ++ E+ + +
Sbjct: 12 TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIG 71

Query: 67 FARYQEIVATEPNPLERLKGLFMSSFEAIET---RHAEVVIYQDEAKRLSSQERFAYVD- 122
+ +PL L+ + + E+ T R + I + + +
Sbjct: 72 ELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQR 131

Query: 123 ERNREQRKMWVDVLNQGIKEGYFRPDVDVDLVYRFIRDT-TWVSVRWYQPGGPLTAEEVG 181
E L I+ D+ +R + + W ++
Sbjct: 132 NLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDLKKEA 191

Query: 182 RQYLSIVLGGI 192
R Y++I+L
Sbjct: 192 RDYVAILLEMY 202


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05027  NUCEPIMERASE  44     6e-07    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 43.6 bits (103), Expect = 6e-07
Identities = 52/241 (21%), Positives = 89/241 (36%), Gaps = 39/241 (16%)

Query: 1 MKPIAIIGAASFVGARFVERAELLGDFSIV--------PIVRSPRSQGRLARFGTRLALG 52
MK + GAA F+G +R G + + LA+ G +
Sbjct: 1 MK-YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKI 59

Query: 53 DAGDPASLVPLLRGCGM--VVNL--------TLGD-----DRRIVGDVQAIHAACEEAEV 97
D D + L V +L + D + G + I C ++
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLN-ILEGCRHNKI 118

Query: 98 PLFVHMSSAEVFGRAEDPGLSDDSVADSPHWMEYAHGKRAAE--AWLRSQADGPVRVVIL 155
++ SS+ V+G S D D P + YA K+A E A S G + L
Sbjct: 119 QHLLYASSSSVYGLNRKMPFSTDDSVDHPVSL-YAATKKANELMAHTYSHLYG-LPATGL 176

Query: 156 RPGLIWGPGAGW------LVDPAKALVEGTAY-LFNEGRGICNLIHVDNLVEHVLQLATA 208
R ++GP W L KA++EG + ++N G+ + ++D++ E +++L
Sbjct: 177 RFFTVYGP---WGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDV 233

Query: 209 P 209

Sbjct: 234 I 234


47  NCTC10437_05155  NCTC10437_05178  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_05155  2       11       2.625919   DEAD/DEAH box helicase
NCTC10437_05156  2       14       1.027209   Uncharacterised protein
NCTC10437_05157  5       24       -0.543397  protein kinase
NCTC10437_05158  5       28       -1.721052  helicase/secretion neighborhood TadE-like
NCTC10437_05159  4       32       -4.286254  Conserved exported protein of uncharacterised
NCTC10437_05160  6       36       -5.832367  Uncharacterised protein
NCTC10437_05161  5       34       -5.313513  Uncharacterised protein
NCTC10437_05162  5       30       -5.261423  Uncharacterised protein
NCTC10437_05163  3       22       -2.732188  Uncharacterised protein
NCTC10437_05164  4       22       -2.829018  Uncharacterised protein
NCTC10437_05165  3       21       -2.625615  Uncharacterised protein
NCTC10437_05166  4       20       -2.422327  putative transcriptional regulator
NCTC10437_05167  6       23       -3.098147  Uncharacterised protein
NCTC10437_05168  5       23       -2.726104  phage capsid protein
NCTC10437_05169  5       26       -3.208162  Uncharacterised protein
NCTC10437_05170  7       29       -5.064256  Uncharacterised protein
NCTC10437_05171  9       26       -4.411142  Uncharacterised protein
NCTC10437_05172  9       26       -4.551939  Uncharacterised protein
NCTC10437_05173  8       24       -3.050715  Uncharacterised protein
NCTC10437_05174  5       21       -1.317759  Uncharacterised protein
NCTC10437_05175  4       18       0.145287   Uncharacterised protein
NCTC10437_05176  2       15       2.516766   phague integrase
NCTC10437_05177  1       12       3.416889   Uncharacterised protein
NCTC10437_05178  1       12       4.025556   type II secretion system protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05157  YERSSTKINASE  37     2e-04    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 36.6 bits (84), Expect = 2e-04
Identities = 45/176 (25%), Positives = 71/176 (40%), Gaps = 25/176 (14%)

Query: 68 HPNIVSVHDSGV----HDGRPFIIMERLPG-------RTLADVMADGPMPP----AQVRS 112
HPN+ +VH V + ++M+ + G RTLAD G + ++
Sbjct: 190 HPNLANVHGMAVVPYGNRKEEALLMDEVDGWRCSDTLRTLADSWKQGKINSEAYWGTIKF 249

Query: 113 VLDDVLAALSAAHAAGVLHRDIKPANILVSTTGDCMKVADFGIAKTGGAAHTTTG-QIVG 171
+ +L + AGV+H DIKP N++ V D G+ H+ +G Q G
Sbjct: 250 IAHRLLDVTNHLAKAGVVHNDIKPGNVVFDRASGEPVVIDLGL-------HSRSGEQPKG 302

Query: 172 -TLCYMSPERITG-APASVRDDLYAVGIIGYEALLGRRAFPQDNPAALARAIIDAP 225
T + +PE G AS + D++ V + G P+ P R I P
Sbjct: 303 FTESFKAPELGVGNLGASEKSDVFLVVSTLLHCIEGFEKNPEIKPNQGLRFITSEP 358


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05169  PF01206       26     0.026    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 25.9 bits (57), Expect = 0.026
Identities = 6/24 (25%), Positives = 14/24 (58%)

Query: 111 DVVAILAERPGLKRNVPAFDQSQG 134
+V+ ++A PG ++ +F + G
Sbjct: 33 EVLYVMATDPGSVKDFESFSKQTG 56


48  NCTC10437_05201  NCTC10437_05208  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_05201  2       16       -3.369969  glycosyl transferase family protein
NCTC10437_05202  2       22       -4.797056  metallophosphoesterase
NCTC10437_05203  -1      22       -4.543465  pyridoxal-5'-phosphate-dependent enzyme subunit
NCTC10437_05204  0       25       -4.981931  Uncharacterised protein
NCTC10437_05206  -1      24       -4.761449  *Uncharacterised protein
NCTC10437_05207  -1      22       -4.194649  Uncharacterised protein
NCTC10437_05208  -2      20       -3.042868  pyridoxamine 5'-phosphate oxidase-like protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05201  TONBPROTEIN   41     1e-05    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 40.7 bits (95), Expect = 1e-05
Identities = 15/62 (24%), Positives = 23/62 (37%), Gaps = 2/62 (3%)

Query: 756 PGSIVTIQISNGIAPPPPPPPAGAPPPGLPPPPVGETVIQIPGLPPITVPVLGPPPPPPP 815
P +++ + PP P P + P P E + + P V + P P P P
Sbjct: 41 PAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPP--KEAPVVIEKPKPKPKP 98

Query: 816 PP 817
P
Sbjct: 99 KP 100



Score = 33.4 bits (76), Expect = 0.002
Identities = 12/47 (25%), Positives = 16/47 (34%)

Query: 768 IAPPPPPPPAGAPPPGLPPPPVGETVIQIPGLPPITVPVLGPPPPPP 814
I PP P P P P + V ++ P V + P P
Sbjct: 78 IPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASP 124


49  NCTC10437_05436  NCTC10437_05442  Y  N  N  Genomic Island
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_05436  2       13       -0.625035  putative phage repressor
NCTC10437_05437  3       11       0.735911   superoxide dismutase, Ni
NCTC10437_05438  4       12       1.566777   Uncharacterised protein
NCTC10437_05439  3       13       2.068560   Helix-turn-helix protein
NCTC10437_05440  4       13       2.800580   Uncharacterised protein
NCTC10437_05441  3       14       2.300062   Uncharacterised protein
NCTC10437_05442  2       13       2.178605   Uncharacterised protein
50  NCTC10437_05458  NCTC10437_05472  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_05458  2       17       -0.781300  3-hydroxyacyl-CoA dehydrogenase
NCTC10437_05459  2       15       -0.530858  gluconate permease
NCTC10437_05460  1       13       -0.387803  Luciferase-like protein
NCTC10437_05461  2       12       -0.205334  response regulator containing a CheY-like
NCTC10437_05462  4       15       -0.282076  putative outer membrane adhesin like protein
NCTC10437_05463  -1      12       -0.675058  Uncharacterised protein
NCTC10437_05464  -1      11       -0.020082  NADH/NADPH-dependent glutamate synthase small
NCTC10437_05465  -1      10       0.039409   glutamate synthase (NADH) large subunit
NCTC10437_05466  0       9        1.784805   thioesterase superfamily protein
NCTC10437_05467  0       10       1.786761   MIP family channel protein
NCTC10437_05468  0       12       1.903480   Ferritin, Dps family protein
NCTC10437_05469  0       12       1.909759   3-methyladenine DNA glycosylase
NCTC10437_05470  2       11       1.472821   short-chain dehydrogenase of uncharacterised
NCTC10437_05471  1       10       1.270451   FAD dependent oxidoreductase
NCTC10437_05472  2       10       0.017965   Uncharacterised protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05462  TONBPROTEIN   35     0.002    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 34.6 bits (79), Expect = 0.002
Identities = 27/139 (19%), Positives = 38/139 (27%), Gaps = 4/139 (2%)

Query: 99 PDIELPADPDVDAELPSDPDLEVPAEVPTDTESE-ESPPEAETPTASGDGPVLGGDPDEN 157
P P + +P V E E E P E P + PV+ P
Sbjct: 39 PAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEA---PVVIEKPKPK 95

Query: 158 ATQPHLPPSSPTTTLTPDAEPSEPEVVSTLSTQRNATTLAASATTPPHLPQASVTVTAPP 217
P D +P E S A +++AT P SV
Sbjct: 96 PKPKPKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRA 155

Query: 218 PPPAPLRQIVRALVVGVLG 236
+ RA + + G
Sbjct: 156 LSRNQPQYPARAQALRIEG 174


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05468  HELNAPAPROT   111    1e-33    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 111 bits (278), Expect = 1e-33
Identities = 34/145 (23%), Positives = 58/145 (40%), Gaps = 2/145 (1%)

Query: 13 QGAEVAELLQKALSRYNDLHLTLKHVHWNVVGPNFIGVHEMIDPQVELVRGYADEAAERI 72
V L LS + L+ L HW V GP+F +HE + + D AER+
Sbjct: 9 NQTLVENSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTIAERL 68

Query: 73 AALGKSPLGTPGAIINDRTWDDYSVNRDTVQAHLAALDLVYVGVIEDTRKSIERIGE-LD 131
A+G P+ T + D + A ++ + +++ I E D
Sbjct: 69 LAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVNDYKQ-ISSESKFVIGLAEENQD 127

Query: 132 PVSEDMLIGHAAELEKFQWFVRAHL 156
+ D+ +G E+EK W + ++L
Sbjct: 128 NATADLFVGLIEEVEKQVWMLSSYL 152


51  NCTC10437_05693  NCTC10437_05708  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_05693  2       14       -0.555288  putative F420-dependent oxidoreductase,
NCTC10437_05694  1       18       -0.940703  Uncharacterised protein
NCTC10437_05695  1       17       -0.854528  Complex I intermediate-associated protein 30
NCTC10437_05696  2       20       -1.854466  Uncharacterised protein
NCTC10437_05697  2       19       -1.959710  Uncharacterised protein
NCTC10437_05698  0       17       -2.515138  major facilitator superfamily protein
NCTC10437_05699  -1      20       -3.088035  flavin-dependent oxidoreductase, F420-dependent
NCTC10437_05700  -2      17       -3.114985  Predicted flavin-nucleotide-binding protein
NCTC10437_05701  -2      17       -3.155978  alpha/beta hydrolase fold protein
NCTC10437_05702  -2      15       -2.127616  aldo/keto reductase
NCTC10437_05703  0       16       -2.210830  short-chain dehydrogenase/reductase SDR
NCTC10437_05704  -1      10       -1.699778  TetR family transcriptional regulator
NCTC10437_05705  -1      9        -1.666392  MerR family transcriptional regulator
NCTC10437_05706  0       10       -1.552872  Uncharacterised protein
NCTC10437_05707  2       12       -1.108661  short-chain dehydrogenase/reductase SDR
NCTC10437_05708  2       13       -1.608844  Uncharacterised protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05698  TCRTETB       43     2e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 42.6 bits (100), Expect = 2e-06
Identities = 27/173 (15%), Positives = 65/173 (37%), Gaps = 6/173 (3%)

Query: 27 VLALVGTLNYVDRFLPGVLAEPIKQELALSDTAIGVINGFGFLIVYAVLGIVIARIADRG 86
+L+ LN + + V I + + +N F++ +++ V +++D+
Sbjct: 21 ILSFFSVLNEM---VLNVSLPDIANDFNKPPASTNWVNT-AFMLTFSIGTAVYGKLSDQL 76

Query: 87 VFGAVIAGCLTLWGAMTMLGGAVQSGLQ-LALTRVGVAVGEAGSSPAAHAYVARNFVPEK 145
++ + + +++G S L + R G A VAR E
Sbjct: 77 GIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKEN 136

Query: 146 RSAPLAVITMSIPLASAASLLGGGLLAESLGWRAAFVVMGGISVLLAPLVLWV 198
R +I + + GG++A + W ++ I+++ P ++ +
Sbjct: 137 RGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIP-MITIITVPFLMKL 188



Score = 31.4 bits (71), Expect = 0.008
Identities = 27/136 (19%), Positives = 57/136 (41%), Gaps = 20/136 (14%)

Query: 278 LGLLIVGRIADRLATRDPRWLLWIVVILITGLLPASALAFVVESQTLCVWLLALAYVIGT 337
+G + G+++D+L + L + I+I S + FV + L+ ++ G
Sbjct: 64 IGTAVYGKLSDQLGIKR----LLLFGIIINCF--GSVIGFV--GHSFFSLLIMARFIQGA 115

Query: 338 ---AYLAPSIAAIQRLVLPEQRATASAIFLFFNATLGAVGPFLTGVISDALTAELGPQAL 394
A+ A + + R + E R A + A VGP + G+I+ +
Sbjct: 116 GAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIA---------HYI 166

Query: 395 GRALLILVPTMQLVAI 410
+ L+L+P + ++ +
Sbjct: 167 HWSYLLLIPMITIITV 182


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05702  INTIMIN       31     0.007    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 30.8 bits (69), Expect = 0.007
Identities = 11/36 (30%), Positives = 20/36 (55%), Gaps = 5/36 (13%)

Query: 175 SQITPEQLAEARQIADIVTVQSRYNVIDRSAQAVLE 210
QI P+ + E R ++ SRY+++ R+ +LE
Sbjct: 417 QQIEPQYVNELRTLS-----GSRYDLVQRNNNIILE 447


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05703  DHBDHDRGNASE  116    1e-33    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 116 bits (291), Expect = 1e-33
Identities = 75/253 (29%), Positives = 118/253 (46%), Gaps = 15/253 (5%)

Query: 4 LSGRRALVTGGSRGIGAGIVRRLTADGAAVAFTFSASKAEADELLADVTAHGAKAVAIHA 63
+ G+ A +TG ++GIG + R L + GA +A + + +++++ + A A A A
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIA-AVDYNPEKLEKVVSSLKAEARHAEAFPA 64

Query: 64 DVADPTQAAAAVDTAAAELGGMDIVVNNAGIAVKAPIEEFTQEQYDRLVAINIGGPFWTT 123
DV D E+G +DI+VN AG+ I + E+++ ++N G F +
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 124 HSSLKHLGD--GGRIINIGSINADRVPVPELSVYAMTKGAVSSFTRGLARELGPRGITVN 181
S K++ D G I+ +GS N VP ++ YA +K A FT+ L EL I N
Sbjct: 125 RSVSKYMMDRRSGSIVTVGS-NPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 182 NVQPGPINTDMN-----PDEGD------FAAALKNVTALGRYGQTSDVAAVVSFLAGPES 230
V PG TDM + G K L + + SD+A V FL ++
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQA 243

Query: 231 SYVTGANLNVDGG 243
++T NL VDGG
Sbjct: 244 GHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05704  HTHTETR       47     7e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 46.5 bits (110), Expect = 7e-09
Identities = 26/150 (17%), Positives = 46/150 (30%), Gaps = 16/150 (10%)

Query: 1 MKLFWQHGFDGVSISDVTAVTGVNRRSIYAEFGSKEKLFMRAAERYLAGPSGYVTDAL-- 58
++LF Q G S+ ++ GV R +IY F K LF E + +
Sbjct: 21 LRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELELEYQAK 80

Query: 59 ----TRPTAREVAEAMVHGAAN-----LVSGAIPGCLTTVGEAPGLAELR----EAIVRR 105
RE+ ++ L+ I VGE + + + R
Sbjct: 81 FPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNLCLESYDR 140

Query: 106 LAERFDAAVADGELF-GVDTVVLARWIVAV 134
+ + + L + T A +
Sbjct: 141 IEQTLKHCIEAKMLPADLMTRRAAIIMRGY 170


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05707  DHBDHDRGNASE  79     2e-19    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 78.9 bits (194), Expect = 2e-19
Identities = 53/183 (28%), Positives = 75/183 (40%), Gaps = 4/183 (2%)

Query: 6 FITGGTPGNFGMAFAETALEAGDRVALTSRRPQELDTWAQQYGDRVLV---VPLELTDAA 62
FITG G G A A T G +A P++L+ P ++ D+A
Sbjct: 12 FITGAAQG-IGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 63 QVQRAVRDAEEHFGGIDVLVNNAGRGWYGSIEGMDESSLRAMFELNFFAVLSVTRAVLPG 122
+ E G ID+LVN AG G I + + A F +N V + +R+V
Sbjct: 71 AIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKY 130

Query: 123 MRARGNGWIVNVSSVAGLVSAPGFGFYSATKYAIEAITDALRDEVAAQGISVLTVEPGAF 182
M R +G IV V S V Y+++K A T L E+A I V PG+
Sbjct: 131 MMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGST 190

Query: 183 RTN 185
T+
Sbjct: 191 ETD 193


52  NCTC10437_05718  NCTC10437_05728  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_05718  2       15       -1.525395  protein of uncharacterised function (DUF1707)
NCTC10437_05719  2       15       -1.573641  putative transcriptional regulator
NCTC10437_05720  1       12       -0.821079  inositol 1-phosphate synthase
NCTC10437_05721  0       11       0.002349   Uncharacterised protein
NCTC10437_05722  0       10       -0.058640  alphabeta hydrolase family protein
NCTC10437_05723  2       8        -0.923232  dehydrogenase of uncharacterised specificity,
NCTC10437_05724  2       7        -0.787709  TetR family transcriptional regulator
NCTC10437_05725  2       7        -0.794427  acetyltransferase
NCTC10437_05726  3       8        -1.169979  alpha/beta hydrolase fold protein
NCTC10437_05727  1       7        -1.306058  WD40 domain-containing protein
NCTC10437_05728  2       7        -1.148102  WD40 domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05723  DHBDHDRGNASE  119    1e-34    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 119 bits (298), Expect = 1e-34
Identities = 81/253 (32%), Positives = 125/253 (49%), Gaps = 11/253 (4%)

Query: 4 LTGKTALVTGATSGIGLAGARALAEEGAYVFLVGRRQDALEDAVAGIGAE--HASAIRAD 61
+ GK A +TGA GIG A AR LA +GA++ V + LE V+ + AE HA A AD
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 62 VTVQADLDRVAATIKATGRRLDVVFANAGINEFATLGNLTWEHHTTIFNTNVGGVIFAVQ 121
V A +D + A I+ +D++ AG+ + +L+ E F+ N GV A +
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 122 AALPLLND--GASIILCGSNGDVKAAPGASVYAASKAAIRSLARSWAAELVDRGIRVNVV 179
+ + D SI+ GSN + YA+SKAA + EL + IR N+V
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 180 APGLTQTPGLADLFSEADDA-------LADLTSTVPMKRRARPEEIGSVVAFLASDGSSF 232
+PG T+T L+++ + A L + +P+K+ A+P +I V FL S +
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 233 MTGAEVYVDGGVS 245
+T + VDGG +
Sbjct: 246 ITMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05724  HTHTETR       48     4e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 48.1 bits (114), Expect = 4e-09
Identities = 26/151 (17%), Positives = 48/151 (31%), Gaps = 8/151 (5%)

Query: 18 DQALDSAIDVFWRLGYEGASLAELTNAMGINRPSLYAVFGSKEELFIRALQRYGTTYHEH 77
LD A+ +F + G SL E+ A G+ R ++Y F K +LF + + E
Sbjct: 14 QHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGEL 73

Query: 78 LGQLLSRPGAY---QVLESYLRATANAVRAGSAPGCLSIQGGLSCGPNNARIPRLLA--- 131
+ ++ + E + + V + I + +
Sbjct: 74 ELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNL 133

Query: 132 --EYRHSIEAAVADALARTEDAAGIDTAALA 160
E IE + + A + T A
Sbjct: 134 CLESYDRIEQTLKHCIEAKMLPADLMTRRAA 164


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05726  NEISSPPORIN   28     0.042    Neisseria sp. porin signature.
		>NEISSPPORIN#Neisseria sp. porin signature.

Length = 348

Score = 28.0 bits (62), Expect = 0.042
Identities = 13/36 (36%), Positives = 20/36 (55%), Gaps = 6/36 (16%)

Query: 43 ALKFGDDAPRVVFLHG------GGQNAHTWDTVIVG 72
A +FG+ PRV + HG + +T+D V+VG
Sbjct: 273 AYRFGNVTPRVSYAHGFKGTVDSANHDNTYDQVVVG 308


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05727  IGASERPTASE   32     0.010    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 32.0 bits (72), Expect = 0.010
Identities = 33/154 (21%), Positives = 53/154 (34%), Gaps = 3/154 (1%)

Query: 73 GDAAPARSSGQSDAAGADEASEPEDSESEDADAADAADAADDADAADAADAADAAEEDAE 132
+ + AD S P S +E+ D A A A + AE +
Sbjct: 989 NQTVDTTNITTPNNIQADVPSVP--SNNEEIARVDEAPVPPPAPATPSETTETVAENSKQ 1046

Query: 133 EPGDADDAGEGATEIDAVDDAVVVEDLTGVEVELEVVTTAEREYTEPEPVTRETHSAAAV 192
E + + ATE A + V E + V+ + A+ E T ET A V
Sbjct: 1047 ESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATV 1106

Query: 193 PDSSE-RAAASATQPPPYVSEQSEYQKRVAAVLQ 225
+ + TQ P V+ Q ++ + +Q
Sbjct: 1107 EKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQ 1140


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05728  TONBPROTEIN   34     0.002    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 34.2 bits (78), Expect = 0.002
Identities = 10/36 (27%), Positives = 11/36 (30%)

Query: 604 MPESPPYAPADSEEEQSEEDEEAAAPPAPRPPAPAP 639
PP A E E + E P P AP
Sbjct: 53 ADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVV 88



Score = 32.7 bits (74), Expect = 0.005
Identities = 10/38 (26%), Positives = 12/38 (31%)

Query: 603 VMPESPPYAPADSEEEQSEEDEEAAAPPAPRPPAPAPA 640
V P A + + E P P PP AP
Sbjct: 50 VTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPV 87


53  NCTC10437_00022  NCTC10437_00030  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00022  2       13       0.692075   protein kinase
NCTC10437_00023  2       13       0.533865   penicillin-binding transpeptidase
NCTC10437_00024  1       13       0.552325   cell cycle protein
NCTC10437_00025  1       13       1.378944   protein phosphatase 2C domain-containing
NCTC10437_00026  2       13       0.240367   FHA domain-containing protein
NCTC10437_00027  2       12       0.991314   FHA domain-containing protein
NCTC10437_00029  1       12       1.074154   *TetR family transcriptional regulator
NCTC10437_00030  2       13       1.207065   short-chain dehydrogenase/reductase SDR
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00022  YERSSTKINASE  33     0.002    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 33.2 bits (75), Expect = 0.002
Identities = 32/112 (28%), Positives = 52/112 (46%), Gaps = 15/112 (13%)

Query: 135 AGLVHRDVKPGNILIT-PTGQVKLTDFGIAKAVDAAPVTQTGMVMGTAQYIAPEQALGH- 192
AG+VH D+KPGN++ +G+ + D G+ P G T + APE +G+
Sbjct: 264 AGVVHNDIKPGNVVFDRASGEPVVIDLGLHSRSGEQP---KGF---TESFKAPELGVGNL 317

Query: 193 DATAASDVYAL------GVVGYEAVSGKRPFTGEGALTVAMKHI-KENPPPL 237
A+ SDV+ + + G+E +P G +T H+ EN P+
Sbjct: 318 GASEKSDVFLVVSTLLHCIEGFEKNPEIKPNQGLRFITSEPAHVMDENGYPI 369


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00025  PF03544       41     6e-06    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 40.7 bits (95), Expect = 6e-06
Identities = 21/79 (26%), Positives = 21/79 (26%), Gaps = 4/79 (5%)

Query: 422 APPAPATTPAPRPAPTSAPRSPAPSGAPTPAPPRTVTSSPAPSSPAPSPAPSGTAAPAPS 481
PP P P P P P P AP P P P P P P S
Sbjct: 68 PPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPK----PKPKPKPVKKVEQPKRDVKPVES 123

Query: 482 GPPSPAPTPTPTVTALPPP 500
P SP P
Sbjct: 124 RPASPFENTAPARPTSSTA 142



Score = 36.1 bits (83), Expect = 2e-04
Identities = 21/87 (24%), Positives = 25/87 (28%), Gaps = 1/87 (1%)

Query: 422 APPAPATTPAPRPAPTSAPRSPAPSGAPTPAPPRTVTSSPAPSSPAPSPAPSGTAAPAPS 481
AP P + PA P++ P P P P P AP P P
Sbjct: 45 APAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPK 104

Query: 482 GPPS-PAPTPTPTVTALPPPPPEPGTN 507
P P V + P P N
Sbjct: 105 PKPVKKVEQPKRDVKPVESRPASPFEN 131



Score = 35.7 bits (82), Expect = 3e-04
Identities = 18/84 (21%), Positives = 19/84 (22%), Gaps = 3/84 (3%)

Query: 423 PPAPATTPAPRPAPTSAPRSPAPSGAPTPAPPRTVTSSPAPSSPAPSPAPSGTAAPAPSG 482
P A P P P P P P P + P P P
Sbjct: 63 PQAVQPPPEPVVEP---EPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVK 119

Query: 483 PPSPAPTPTPTVTALPPPPPEPGT 506
P P TA P T
Sbjct: 120 PVESRPASPFENTAPARPTSSTAT 143



Score = 34.6 bits (79), Expect = 5e-04
Identities = 27/116 (23%), Positives = 34/116 (29%), Gaps = 2/116 (1%)

Query: 393 VIAGLPSGSLDEAIGQIEELSRSSVLPVCAPPAPATTPAPRPAPTSAPRSPAPSGAPTPA 452
V+AGL S+ + I SV V AP A +P P P P P P
Sbjct: 28 VVAGLLYTSVHQVIELPAPAQPISVTMV-APADLEPPQAVQPPPEPVVE-PEPEPEPIPE 85

Query: 453 PPRTVTSSPAPSSPAPSPAPSGTAAPAPSGPPSPAPTPTPTVTALPPPPPEPGTNC 508
PP+ P P P P P P P ++
Sbjct: 86 PPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSST 141



Score = 34.6 bits (79), Expect = 6e-04
Identities = 15/82 (18%), Positives = 20/82 (24%), Gaps = 1/82 (1%)

Query: 422 APPAPATTPAPRPAPTSAPRSPAPSGAPTPAPPRTVTSSPAPSSPAPSPAPSGTAAPAPS 481
P P P P P + P P P P V P + +
Sbjct: 74 VEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPK-PVKKVEQPKRDVKPVESRPASPFENT 132

Query: 482 GPPSPAPTPTPTVTALPPPPPE 503
P P + T+ P
Sbjct: 133 APARPTSSTATAATSKPVTSVA 154



Score = 34.6 bits (79), Expect = 6e-04
Identities = 24/124 (19%), Positives = 35/124 (28%), Gaps = 3/124 (2%)

Query: 380 LGVDDMRPSERAQVIAGLPSGSLDEAIGQIEELSRSSVLPVCAPPAPATTPAPRPAPTSA 439
+ V + P++ A P E + + E P P P P+P P
Sbjct: 50 ISVTMVAPADLEPPQAVQPP---PEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPK 106

Query: 440 PRSPAPSGAPTPAPPRTVTSSPAPSSPAPSPAPSGTAAPAPSGPPSPAPTPTPTVTALPP 499
P P + +SP ++ P S A S A P P
Sbjct: 107 PVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQ 166

Query: 500 PPPE 503
P
Sbjct: 167 YPAR 170


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00029  HTHTETR       68     2e-16    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 68.1 bits (166), Expect = 2e-16
Identities = 32/169 (18%), Positives = 60/169 (35%), Gaps = 15/169 (8%)

Query: 6 RPVRADAARNRALLLAAAEDEFRERG-ASASVADIARRAGVAKGTVFRHFPTKEDLIASI 64
R + +A R +L A F ++G +S S+ +IA+ AGV +G ++ HF K DL + I
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 65 VCEHVAVLAEAAARLAD------SPDPGAALLEFLTLAADQRQRHDLTFLQSASDGDPRV 118
+ + E L+ L + +R L +
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGE 122

Query: 119 TEVRDAL--------HEHLETLVDRARASGAIRADITEADVFLMMCAPI 159
V ++ +E + + + AD+ ++M I
Sbjct: 123 MAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYI 171


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00030  DHBDHDRGNASE  67     3e-15    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 66.6 bits (162), Expect = 3e-15
Identities = 52/231 (22%), Positives = 88/231 (38%), Gaps = 27/231 (11%)

Query: 3 LRGATVLVTGTNRGIGQHFAVQLLQRGAKVYATARRPELVDIPG---------AEVLRLD 53
+ G +TG +GIG+ A L +GA + A PE ++ AE D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 54 ITDQSSVDAV----AAVAGDVDVLINNAADTAGGNLVTGDLDAIRSTMDSNYYGTLAMIR 109
+ D +++D + G +D+L+N A G + + + +T N G R
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 110 AFAPILARNGGGAILNVLSAAAWTTVDGNTAYAAAKSAQWGLTNGVRLELAAQGTQVAAL 169
+ + + G+I+ V S A AYA++K+A T + LELA + +
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 170 VPGLIGT--------------QTLLDFAERHGIDLPDDAVMDPADLVRLAL 206
PG T Q + E +P + P+D+ L
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVL 236


54  NCTC10437_00170  NCTC10437_00177  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00170  -2      8        1.204709   dihydroflavonol 4-reductase
NCTC10437_00171  0       9        1.202344   Zn-dependent oxidoreductase, NADPH:quinone
NCTC10437_00172  0       8        0.707974   transcriptional regulator
NCTC10437_00173  -1      10       0.682477   Acyltransferase-like protein
NCTC10437_00174  1       11       1.441948   TetR family transcriptional regulator
NCTC10437_00175  2       13       0.924797   PMT family glycosyltransferase,
NCTC10437_00176  4       15       0.952829   Uncharacterised protein
NCTC10437_00177  2       16       3.043807   dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00170  NUCEPIMERASE  62     7e-13    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 61.7 bits (150), Expect = 7e-13
Identities = 43/168 (25%), Positives = 64/168 (38%), Gaps = 15/168 (8%)

Query: 12 VLVTGAAGYVGSHVVNRLIHDGYRVRAAVRSEGRADEVRNNVSAAGLAPERVEFAVADLA 71
LVTGAAG++G HV RL+ G++V D LA +F DLA
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLA 62

Query: 72 ADDGWSA--AAEGARYVIHVAS----PFPAENPENDDDVIIPARDGTVRVLAAARDAGCS 125
+G + A+ V + ENP D + G + +L R
Sbjct: 63 DREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNL---TGFLNILEGCRHNKIQ 119

Query: 126 RVVLTSSFAAVGYSPKDGDEWSEDDWTN-PADENTAYVRSKAIAERAA 172
++ SS + G + K +S DD + P + Y +K E A
Sbjct: 120 HLLYASSSSVYGLNRK--MPFSTDDSVDHPV---SLYAATKKANELMA 162


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits  Score  E-value  Comments
NCTC10437_00172  SECA  28     0.030    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 27.5 bits (61), Expect = 0.030
Identities = 27/110 (24%), Positives = 36/110 (32%), Gaps = 32/110 (29%)

Query: 59 YLADR------VVYSRSGLTHQAGLLQRDGLITRDTSADDQRATVVRITAAGRARVADVL 112
YLA R ++ GLT G+ A +R A AD+
Sbjct: 134 YLAQRDAENNRPLFEFLGLTV--------GINLPGMPAPAKRE----------AYAADIT 175

Query: 113 PGHIQVVRELLFDSLSDRDVTLLGDVMTR-----VRDHVRSLPSRSARTP 157
G E FD L D + + R + D V S+ ARTP
Sbjct: 176 YG---TNNEYGFDYLRDNMAFSPEERVQRKLHYALVDEVDSILIDEARTP 222


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00174  HTHTETR  49     1e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 49.2 bits (117), Expect = 1e-09
Identities = 27/163 (16%), Positives = 57/163 (34%), Gaps = 16/163 (9%)

Query: 2 IIDAARAEFARHGLAGARIDRIARSARASKERLYAHFGDKEALFRQVVVTDTAEF----L 57
I+D A F++ G++ + IA++A ++ +Y HF DK LF ++ + L
Sbjct: 16 ILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELEL 75

Query: 58 GSVGVRPDAVAEFAGDIY----DLAVSRPEHLRMITWARLEGVALGEPEVDGR------- 106
P +I + V+ ++ + +GE V +
Sbjct: 76 EYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNLCL 135

Query: 107 PIRERDTAAIAAAQAAGHIDDRWEPEQLLVLLFG-IGMAWAHW 148
+R + A + + +++ G I +W
Sbjct: 136 ESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENW 178


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_00175  PERTACTIN  30     0.027    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 30.5 bits (68), Expect = 0.027
Identities = 15/37 (40%), Positives = 18/37 (48%), Gaps = 1/37 (2%)

Query: 266 RIFGGGAGGAGGLPG-PPPGNAGPGPEGPAMFGSAGI 301
I G A G +PG PG A PG GP + G G+
Sbjct: 257 TIRRGDAPAGGAVPGGAVPGGAVPGGFGPLLDGWYGV 293


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00177  DHBDHDRGNASE  75     2e-18    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 75.5 bits (185), Expect = 2e-18
Identities = 69/255 (27%), Positives = 106/255 (41%), Gaps = 27/255 (10%)

Query: 6 RVALITGASQGIGAGLEAGYRKLGYAVVA---NSRTIDGGDDPMVL------AVPGDVAQ 56
++A ITGA+QGIG + G + A N ++ + A P DV
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 57 PGVGGRIVDAAVQRFGRIDTVVNNAGLFIARPFTDYTDEEYDAITGVNLRGFFEISRAAV 116
I + G ID +VN AG+ +DEE++A VN G F SR+
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 117 ARMLDQGGGGHLVTISTTLVEHANSAVPSALASLTKGGLNAATRALAVEYATQGIRSNAV 176
M+D+ G +VT+ + +++ + +S K T+ L +E A IR N V
Sbjct: 129 KYMMDRRSGS-IVTVGSNPAGVPRTSMAAYASS--KAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 177 ALGIIRTPMHQP---SDYDALATLH----------PVGHVGEVSDVVDAVLYL--ENAPF 221
+ G T M + A + P+ + + SD+ DAVL+L A
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 222 VTGEILHVDGGQSAG 236
+T L VDGG + G
Sbjct: 246 ITMHNLCVDGGATLG 260


55  NCTC10437_00193  NCTC10437_00199  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00193  0       9        2.514975   two-component regulator receiver
NCTC10437_00194  -1      9        2.566489   two-component regulator - sensor kinase
NCTC10437_00195  -1      8        1.977783   S-adenosyl-L-methionine-dependent methyl
NCTC10437_00196  0       10       2.281206   ErfK/YbiS/YcfS/YnhG family protein
NCTC10437_00197  2       12       2.748988   Uncharacterised protein
NCTC10437_00198  2       11       2.158600   Uncharacterised protein
NCTC10437_00199  0       9        1.221945   MoxR-like ATPase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_00193  HTHFIS  73     3e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 73.3 bits (180), Expect = 3e-17
Identities = 39/140 (27%), Positives = 74/140 (52%), Gaps = 5/140 (3%)

Query: 2 LVIEDSEAIREMVVEALGDAGFATSAYPDGEGLEALLDGHRPDAVILDVMIPGRDGFALI 61
LV +D AIR ++ +AL AG+ + L + D V+ DV++P + F L+
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLL 66

Query: 62 DVVRAWG-DVGIVMLTARDGLPDRLRGLDGGADDYVVKPFEMSELMSRVGAVL----RRR 116
++ D+ +++++A++ ++ + GA DY+ KPF+++EL+ +G L RR
Sbjct: 67 PRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRP 126

Query: 117 GTVTATVEVGDLLVDRTAAI 136
+ + G LV R+AA+
Sbjct: 127 SKLEDDSQDGMPLVGRSAAM 146


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00194  PF05616  34     0.002    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 33.6 bits (76), Expect = 0.002
Identities = 21/49 (42%), Positives = 22/49 (44%), Gaps = 7/49 (14%)

Query: 91 PAVVPGPGPGP---PG--PPPPPRPDGRPPRGPD--GGPPGRPPTPPPP 132
PA P P P PG P P P PD P PD G P RP +P P
Sbjct: 333 PAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQPGTRPDSPAVP 381


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00196  PF03544  38     3e-05    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 38.0 bits (88), Expect = 3e-05
Identities = 23/107 (21%), Positives = 28/107 (26%), Gaps = 8/107 (7%)

Query: 20 AGLTLAAPAALAQPL-----PAPAPVPAPAPPPANPFVFPNPFAPPAPAAPAGPTADDPF 74
+T+ APA L P P P P P P P P P P
Sbjct: 50 ISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVK 109

Query: 75 AVVPGQPMAIPEGTPAGQNPTPFVGQPPFVPPSFNPTNGSIAGAAKP 121
V + P + PF P P S T +
Sbjct: 110 KVEQPKRDVKPVESRPAS---PFENTAPARPTSSTATAATSKPVTSV 153


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00197  ACRIFLAVINRP  29     0.035    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.7 bits (64), Expect = 0.035
Identities = 15/77 (19%), Positives = 29/77 (37%), Gaps = 8/77 (10%)

Query: 48 TGGTGSLVA-VGVMLAVSIIV---IAISVFSQGRRRQVRSGQHDYRRERGRDAGRRRWRP 103
+ VG++ + + I I F++ + + E A R R RP
Sbjct: 918 FNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEK----EGKGVVEATLMAVRMRLRP 973

Query: 104 LLIAAAVLLAWALLLAL 120
+L+ + + L LA+
Sbjct: 974 ILMTSLAFILGVLPLAI 990


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_00199  HTHFIS  35     3e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 35.2 bits (81), Expect = 3e-04
Identities = 34/149 (22%), Positives = 57/149 (38%), Gaps = 20/149 (13%)

Query: 22 GRAVVGKRAALSLIL--ITVLARG--HVLIEDLPGLGKTLTAKS---FAAALGLEFTRVQ 74
G +VG+ AA+ I + L + ++I G GK L A++ + F +
Sbjct: 136 GMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAIN 195

Query: 75 ---FTPDLLPADLLG------STIYDMQSGRFEFRRGPIFTNLLLGDEINRTPPKTQAAL 125
DL+ ++L G + +GRFE G L DEI P Q L
Sbjct: 196 MAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEG----GTLFLDEIGDMPMDAQTRL 251

Query: 126 LEAMAERQVSIDGATHRLPDPFLVLATDN 154
L + + + + G + ++A N
Sbjct: 252 LRVLQQGEYTTVGGRTPIRSDVRIVAATN 280


56  NCTC10437_00208  NCTC10437_00218  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00208  1       11       1.020599   MmpL11
NCTC10437_00209  0       9        1.262971   Fis family transcriptional regulator
NCTC10437_00210  1       10       0.932000   membrane protein
NCTC10437_00211  1       9        0.705447   response regulator with CheY-like receiver
NCTC10437_00212  0       9        0.339440   integral membrane sensor signal transduction
NCTC10437_00213  0       9        0.249765   Protein of uncharacterised function (DUF3054)
NCTC10437_00214  0       9        0.091940   metalloendopeptidase-like membrane protein
NCTC10437_00215  0       11       0.465447   putative integral membrane protein
NCTC10437_00216  -1      10       1.025459   permease
NCTC10437_00217  -2      10       1.919767   NAD-dependent aldehyde dehydrogenase
NCTC10437_00218  -1      10       1.862507   large membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00208  ACRIFLAVINRP  59     5e-11    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 59.5 bits (144), Expect = 5e-11
Identities = 39/228 (17%), Positives = 90/228 (39%), Gaps = 26/228 (11%)

Query: 188 IVLLILLAVFGSLAAAAMPLLLGICTVVVTMGLVFLLSMYTTMSVFVTSTVSMFGIAVAI 247
+V L++ ++ A +P + V V + F + S+ +T++MFG+ +AI
Sbjct: 350 LVFLVMYLFLQNMRATLIPTI----AVPVVLLGTFAILAAFGYSI---NTLTMFGMVLAI 402

Query: 248 ----DYSLFILMRFREELRAGRE-PEDAADAAMATSGLAVVLSGLTVIASVTGIYLINT- 301
D ++ ++ + + P++A + +M+ A+V + + A +
Sbjct: 403 GLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGS 462

Query: 302 --PVLQSMATGAILAVAVAVLTSTTLTPAVLATFGRRAAKRSS-----YLHWGRGVEATQ 354
+ + + + A+A++VL + LTPA+ AT + + + W
Sbjct: 463 TGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHS 522

Query: 355 SKFWSRWTGWVMRRPWLSALAAAAVLLTFAAPAFSMLLGNSMQRQFEP 402
++ G ++ L A ++ A ++L + F P
Sbjct: 523 VNHYTNSVGKILGSTGRYLLIYALIV------AGMVVLFLRLPSSFLP 564



Score = 51.4 bits (123), Expect = 2e-08
Identities = 32/164 (19%), Positives = 74/164 (45%), Gaps = 19/164 (11%)

Query: 538 ALIAFVMLLISIRSVFLAFKGVLMTVLSVAAAYGSLVVVFQWGWLSDLGFKQIESLDSTI 597
++ F+++ + +++ + L+ ++V +V++ + L+ G+ +T+
Sbjct: 348 IMLVFLVMYLFLQN----MRATLIPTIAV-----PVVLLGTFAILAAFGYSI-----NTL 393

Query: 598 PPLVLALTFGLSMDYEIFLLTRIRERFLQTG-NTRDAVAYGVSTSARTITSAALIMIAVF 656
+ L GL +D I ++ + ++ ++A +S + A+++ AVF
Sbjct: 394 TMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVF 453

Query: 657 IGFAFAGM---PLVAQLGVACAVAIAVDATIVRLVLVPALMAMF 697
I AF G + Q + A+A+ + +V L+L PAL A
Sbjct: 454 IPMAFFGGSTGAIYRQFSITIVSAMAL-SVLVALILTPALCATL 496


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_00210  RTXTOXINA  27     0.025    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 27.2 bits (60), Expect = 0.025
Identities = 17/53 (32%), Positives = 27/53 (50%), Gaps = 4/53 (7%)

Query: 50 SADFSGVASGVSAASSAYLFTHPDVNAFFTGLHGAPRAQLEAE---IAEYFAN 99
S + V+SG+SAA++ L P V+A + G LEA + E+ A+
Sbjct: 372 STVLASVSSGISAAATTSLVGAP-VSALVGAVTGIISGILEASKQAMFEHVAS 423


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_00211  HTHFIS  96     2e-25    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 96.4 bits (240), Expect = 2e-25
Identities = 38/114 (33%), Positives = 63/114 (55%)

Query: 1 MVDDDPDVRTSVARGLRHSGFDVRVAATGKEALRLLSSESHDALVLDVQMPELDGVAVVT 60
+ DDD +RT + + L +G+DVR+ + R +++ D +V DV MP+ + ++
Sbjct: 8 VADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLLP 67

Query: 61 ALRALGNDIPICVLSARDTVNDRIAGLEAGADDYLTKPFDLGELVARLHALLRR 114
++ D+P+ V+SA++T I E GA DYL KPFDL EL+ + L
Sbjct: 68 RIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00212  PF06580  39     3e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.7 bits (90), Expect = 3e-05
Identities = 32/176 (18%), Positives = 65/176 (36%), Gaps = 31/176 (17%)

Query: 273 RVEGIITALGQLASGELAQAEDRDVIDVTDMLDRVAR----ENMRVRRDVQIDIEVADDL 328
+ ++T+L +L L + R + + D L V +++ +Q + ++ +
Sbjct: 192 KAREMLTSLSELMRYSLRYSNARQ-VSLADELTVVDSYLQLASIQFEDRLQFENQINPAI 250

Query: 329 GTVMGWPSGLRLAVDNLVRNAVTHGGATR-----IVLAAHRTDIGVTIVVDDNGRGLPVE 383
V P + V LV N + HG A I+L + + VT+ V++ G
Sbjct: 251 MDVQ-VPP---MLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN 306

Query: 384 EHQTVLGRFSRGSNAVPGGSGLGLALVAQQ-ATLHGGAIALT-DSPLGGLRATLTV 437
+ +G GL V ++ L+G + G + A + +
Sbjct: 307 TKE---------------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00215  PF05616  38     4e-05    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 38.2 bits (88), Expect = 4e-05
Identities = 21/48 (43%), Positives = 30/48 (62%), Gaps = 5/48 (10%)

Query: 348 DIDPDSAPTAPNSSPSDEQAPAQFSPDPEAFHPPDENPVTQPD-KPDP 394
D+ P SA APN+ P E +PA+ +P P+ENP T+P+ +PDP
Sbjct: 314 DLTPGSA-EAPNAQPLPEVSPAE---NPANNPAPNENPGTRPNPEPDP 357


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00218  ACRIFLAVINRP  72     9e-15    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 71.8 bits (176), Expect = 9e-15
Identities = 41/261 (15%), Positives = 100/261 (38%), Gaps = 39/261 (14%)

Query: 184 EDQKRAEVAAIPLVAVVLFFVFGGVIAAALPAIIGGLTIAGALGIMRIFAEFMPVHFFAQ 243
+ + AI LV +V++ + A +P I + + G I+ F ++
Sbjct: 338 HEVVKTLFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFG-------YSI 390

Query: 244 PVVTLMGL----GIAIDYGLFMVSRF-REEIAEGYDTEAAVRRTVMTSGRTVMFSAVILV 298
+T+ G+ G+ +D + +V R + + + A +++ ++ A++L
Sbjct: 391 NTLTMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLS 450

Query: 299 ASSAPLLLFP--QG-FLKSITYAIIASVMLAAILSVTVLPAAFAILGRNVDALGVRTLLR 355
A P+ F G + + I++++ L+ ++++ + PA A TLL+
Sbjct: 451 AVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCA------------TLLK 498

Query: 356 VPFLRNWKPMRLWLEWLAEKTQKTKTRAEVEKGFWGRLVNVVMKRPIAFAAPILIVMILL 415
+ + + W + + V ++ + L++ L+
Sbjct: 499 PVSAEHHENKGGFFGWFNTTFDHSVNH-------YTNSVGKILGSTGRY----LLIYALI 547

Query: 416 VIPLGQLALGGISEKYLPPDN 436
V + L L + +LP ++
Sbjct: 548 VAGMVVLFL-RLPSSFLPEED 567



Score = 46.0 bits (109), Expect = 9e-07
Identities = 26/156 (16%), Positives = 62/156 (39%), Gaps = 7/156 (4%)

Query: 192 AAIPLVAVVLFFVFGGVIAAALPAIIGGLTIAGALGIMRIFAEFMPVHFFAQPVVTLMGL 251
+ +V + L ++ ++ L I G L +F + V+F V L +
Sbjct: 878 ISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFM---VGLLTTI 934

Query: 252 GIAIDYGLFMVSRFRE-EIAEGYDTEAAVRRTVMTSGRTVMFSAVILVASSAPLLLFP-- 308
G++ + +V ++ EG A V R ++ +++ + PL +
Sbjct: 935 GLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGA 994

Query: 309 -QGFLKSITYAIIASVMLAAILSVTVLPAAFAILGR 343
G ++ ++ ++ A +L++ +P F ++ R
Sbjct: 995 GSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRR 1030


57  NCTC10437_00267  NCTC10437_00274  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00267  -1      8        1.550465   Uncharacterised protein
NCTC10437_00268  -1      8        1.129432   acyl-CoA dehydrogenase
NCTC10437_00269  -2      6        1.193119   dihydrodipicolinate reductase
NCTC10437_00270  -2      6        1.110059   short-chain alcohol dehydrogenase
NCTC10437_00271  -3      6        1.078712   TspO and MBR like proteins
NCTC10437_00272  -2      6        1.126639   beta-ketoacyl synthase
NCTC10437_00273  -1      11       -0.217866  putative polyketide synthase component
NCTC10437_00274  -1      12       0.349773   transport protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00267  TCRTETB  26     0.034    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 25.6 bits (56), Expect = 0.034
Identities = 13/41 (31%), Positives = 18/41 (43%)

Query: 46 LAAGAAFFGSYAADTLGWPRYLLGAAALAFLACGAVVIAKN 86
+ F G L R++ GA A AF A VV+A+
Sbjct: 91 FGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARY 131


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00270  DHBDHDRGNASE  83     5e-21    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 83.2 bits (205), Expect = 5e-21
Identities = 53/189 (28%), Positives = 79/189 (41%), Gaps = 1/189 (0%)

Query: 3 KSVFITGAATGIGRATALLFAQRGYLVGAYDIDEAGLQVLRSDIAAVGGRSVVGHLDVTD 62
K FITGAA GIG A A A +G + A D + L+ + S + A + DV D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 63 ADEVAQRLGEFHEAAGGRLDVLVNNAGLLNAGRFEDIPLTVHHREIEVNVKGVVNGLHAA 122
+ + + E G +D+LVN AG+L G + VN GV N +
Sbjct: 69 SAAIDEITARI-EREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 123 FPFLKSTPGAVVVNLASASAIYGQAELANYSATKFFVRAITEALNLEWGRYGIRVIDMWP 182
++ +V + S A + +A Y+++K T+ L LE Y IR + P
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 183 LYVNTAMTR 191
T M
Sbjct: 188 GSTETDMQW 196


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00272  DHBDHDRGNASE  42     3e-05    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 41.6 bits (97), Expect = 3e-05
Identities = 34/164 (20%), Positives = 62/164 (37%), Gaps = 11/164 (6%)

Query: 1183 LVTGGLGSIGLEIAGYLASHGARHLVLTSRRAPGDAAQQRIDALGAQHGCEVRVVTADVA 1242
+TG IG +A LAS GA + + + A ADV
Sbjct: 12 FITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHA----EAFPADVR 67

Query: 1243 DAHAVARLLAVVRAELPPLAGIVHAAGEIGTTPLSTLDDAEVDRVFAGKVWGAWHLSEAA 1302
D+ A+ + A + E+ P+ +V+ AG + + +L D E + F+ G ++ S +
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 1303 L----DTSAGTLDFFLCTSSIASVWGGFGQTAYGAANAFLDGLA 1342
D +G++ + S + AY ++ A
Sbjct: 128 SKYMMDRRSGSI---VTVGSNPAGVPRTSMAAYASSKAAAVMFT 168



Score = 39.3 bits (91), Expect = 2e-04
Identities = 31/123 (25%), Positives = 50/123 (40%), Gaps = 8/123 (6%)

Query: 3316 LITGGLGAIGLHTAAYLAQLGAGDIVLTSRRAADADTQ-QTIDDITERFKCRI-HVFAAD 3373
ITG IG A LA GA A D + + + + + R F AD
Sbjct: 12 FITGAAQGIGEAVARTLASQGA------HIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 3374 VGDEPQVMTLLDRIRAELPPLAGVAHFAGVLDDALLSQQSLDRFRMTLASKAFGAYHLDR 3433
V D + + RI E+ P+ + + AGVL L+ S + + T + + G ++ R
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 3434 WTA 3436
+
Sbjct: 126 SVS 128


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00274  ACRIFLAVINRP  58     2e-10    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 57.9 bits (140), Expect = 2e-10
Identities = 61/391 (15%), Positives = 125/391 (31%), Gaps = 55/391 (14%)

Query: 196 AVLVLAVLLLVYRSIVTMLLPLVTIGSSVLIAQGLVAAYSHLTGAGVS-NQSIVFLSAIL 254
+LV V+ L +++ L+P + + +L ++AA+ G S N +F +
Sbjct: 348 IMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAF------GYSINTLTMFGMVLA 401

Query: 255 AGAGTDYAVFLISRYHDYLRS-GAGYDDAVKAAMMSIGKVITASALTVGITFLVMSFAK- 312
G D A+ ++ + +A + +M I + A+ + F+ M+F
Sbjct: 402 IGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGG 461

Query: 313 -MG-VFSTVGVSAGIGIGVAYLAGVTLLPAILV-LVGPR------------GWVKPRREL 357
G ++ ++ + ++ L + L PA+ L+ P GW +
Sbjct: 462 STGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDH 521

Query: 358 TARFWRRSGIRIVRRPVPHLVASVLVLILLAGAA---GFARFNYDDRKAVSASAPSSVGY 414
+ + S +I+ +L+ L++ + + +D+ G
Sbjct: 522 SVNHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAG- 580

Query: 415 VALERHFPISQSIPQYILVQ----------------SPRDLRSPQALADLEQMASRIAQL 458
ER + + Y L S + + A L+ R
Sbjct: 581 ATQERTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDE 640

Query: 459 PDVS-LVSGITRPLGEVPPEFRATFQAGIV---GDRLADGSAQIGERTDDLDTLTAGADT 514
++ LG++ F F + G I + D LT +
Sbjct: 641 NSAEAVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQ 700

Query: 515 LA-------DSLGDVRAQVNQIAPSLQGLID 538
L SL VR + + +D
Sbjct: 701 LLGMAAQHPASLVSVRPNGLEDTAQFKLEVD 731


58  NCTC10437_00304  NCTC10437_00310  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00304  3       29       -5.244377  nucleoside-diphosphate-sugar epimerase
NCTC10437_00305  4       28       -5.658066  Emopamil binding protein
NCTC10437_00306  4       28       -5.373634  transcriptional regulator
NCTC10437_00307  5       28       -5.645096  virulence factor Mce family protein
NCTC10437_00308  5       28       -6.059566  virulence factor Mce family protein
NCTC10437_00309  5       29       -7.410698  MCE-family protein MCE4d
NCTC10437_00310  5       30       -8.007140  virulence factor Mce family protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00304  NUCEPIMERASE  34     7e-06    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 33.6 bits (77), Expect = 7e-06
Identities = 11/27 (40%), Positives = 14/27 (51%)

Query: 7 ITGGCGLVGSATVRRLAELGRTVVVTD 33
+TG G +G +RL E G VV D
Sbjct: 5 VTGAAGFIGFHVSKRLLEAGHQVVGID 31


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00306  HTHTETR  58     8e-13    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 58.1 bits (140), Expect = 8e-13
Identities = 26/117 (22%), Positives = 47/117 (40%), Gaps = 3/117 (2%)

Query: 14 RQERGDRTRELLIDETVRCIREEGFSAASARHIIERAGVSWGVIQHHFGDRDGLLTAVIE 73
++ TR+ ++D +R ++G S+ S I + AGV+ G I HF D+ L + + E
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 74 DALDRLVESLETLSDPAQAMSTD---ELVRATWEAFANPKAMAGLEILIATKQLRVG 127
+ + E E++ E+ + L +I K VG
Sbjct: 65 LSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVG 121


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00307  PF03544  37     1e-04    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 36.9 bits (85), Expect = 1e-04
Identities = 23/114 (20%), Positives = 29/114 (25%), Gaps = 1/114 (0%)

Query: 437 PVVNLPPGVAPGPGPALTPPYPLPVPPNTPGPQPFPLPYQAPPDQTLPPSGRPPASPQPP 496
P P V PA P PP P +P P P P P P+P
Sbjct: 44 PAPAQPISVTM-VAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPK 102

Query: 497 VQQSPPAPPALPAEAAPTGSVQPQAAELPSAAYDERSGAFLDPHGGISVYAAGG 550
+ P P +P + +A S A G
Sbjct: 103 PKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASG 156



Score = 30.7 bits (69), Expect = 0.014
Identities = 29/91 (31%), Positives = 37/91 (40%), Gaps = 2/91 (2%)

Query: 427 PPEADYDPGPPVVNLPPGVAPGPGPALTPPYPLPVPPNTPGPQPFPLPYQAPPDQTLPPS 486
PP+A P PVV P P P P P + P P P+P P+ P + + P
Sbjct: 62 PPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPV 121

Query: 487 GRPPASPQPPVQQSPPAPPALPAEAAPTGSV 517
PASP P P + A AA + V
Sbjct: 122 ESRPASPFENTA--PARPTSSTATAATSKPV 150



Score = 28.8 bits (64), Expect = 0.045
Identities = 18/107 (16%), Positives = 30/107 (28%), Gaps = 4/107 (3%)

Query: 418 LPQNKYPYIPPEADYDPGPP--VVNLPPGVAP--GPGPALTPPYPLPVPPNTPGPQPFPL 473
L + PPE +P P + PP AP P P P+
Sbjct: 60 LEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVK 119

Query: 474 PYQAPPDQTLPPSGRPPASPQPPVQQSPPAPPALPAEAAPTGSVQPQ 520
P ++ P + + + ++ + QPQ
Sbjct: 120 PVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQ 166


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00309  PF03544  33     0.002    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 33.0 bits (75), Expect = 0.002
Identities = 22/62 (35%), Positives = 23/62 (37%), Gaps = 1/62 (1%)

Query: 427 PPPADGGPPSAPAQSSPPPVGFPPPPTELSPPPPGFLPSPTELSPPPPGFPPPPATQAPP 486
PAD PP A Q P PV P P E P PP P E P P P P +
Sbjct: 55 VAPADLEPPQAV-QPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQ 113

Query: 487 VD 488

Sbjct: 114 PK 115


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_00310  RTXTOXIND  34     6e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 34.4 bits (79), Expect = 6e-04
Identities = 25/164 (15%), Positives = 55/164 (33%), Gaps = 7/164 (4%)

Query: 147 RNVGDLDKSRLTEALGVLTDAMRDATPQLRGTLDGVTA----LSRSINDRDDQIQQLLAR 202
+NV + + RLT + ++ Q LD A + IN ++ + +R
Sbjct: 177 QNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSR 236

Query: 203 AKAVSDVLAHRSKQMNQLIVEGNQLFAA---LSQKRRSLGILIGGIDDLARQITGFVADN 259
S +L ++ + ++ + N+ A L + L + I +
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLF 296

Query: 260 RKEFGPALTKLNLVLDNLETRRAQIDDSLRRLPNFANQLGEVVG 303
+ E L + + L A+ ++ + A +V
Sbjct: 297 KNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQ 340


59  NCTC10437_00484  NCTC10437_00494  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00484  -1      11       0.989414   ATP-dependent chaperone ClpB
NCTC10437_00485  0       12       1.321170   secreted protein
NCTC10437_00486  0       11       1.457873   short-chain dehydrogenase/reductase SDR
NCTC10437_00487  -2      11       1.284748   Conserved protein of uncharacterised function,
NCTC10437_00488  -1      9        1.700917   tRNA/rRNA methyltransferase SpoU
NCTC10437_00489  -1      9        1.714182   major facilitator transporter
NCTC10437_00490  0       9        1.227440   transcriptional regulator
NCTC10437_00491  -1      10       1.739712   glycoside hydrolase family protein
NCTC10437_00492  0       11       1.459390   sodium/proton antiporter, NhaA family
NCTC10437_00493  0       10       0.913016   integral membrane transport protein
NCTC10437_00494  2       12       -0.517867  transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_00484  HTHFIS  43     4e-06    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 42.9 bits (101), Expect = 4e-06
Identities = 38/184 (20%), Positives = 67/184 (36%), Gaps = 30/184 (16%)

Query: 550 AGRMLEGETAKLLRMESEL--GKRVIGQTKAVQAVSDAVRRSRAGVADPNRPTGSFMFLG 607
GR L + ++E + G ++G++ A+Q + + R + + M G
Sbjct: 115 IGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLAR----LMQTDLTL---MITG 167

Query: 608 PTGVGKTELAKALAEFLFDDERAMVRIDMSEYGEKHSVARLVGAPPGYVGYDQGGQLTEA 667
+G GK +A+AL ++ V I+M+ + L G + G T A
Sbjct: 168 ESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGH--------EKGAFTGA 219

Query: 668 VRRRPYTV-------ILFDEIEKAHPDVFDVLLQVLDEG---RLTDGQGRTVDFRNTILI 717
R + DEI D LL+VL +G + D R ++
Sbjct: 220 QTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVR---IV 276

Query: 718 LTSN 721
+N
Sbjct: 277 AATN 280


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00485  PF05844  30     0.012    YopD protein
		>PF05844#YopD protein

Length = 295

Score = 29.6 bits (66), Expect = 0.012
Identities = 12/33 (36%), Positives = 15/33 (45%)

Query: 212 ARRSGSPSRPLTPAPAGRSTELPPGQAAGQPRP 244
A ++ PS P+ P AGRS P A P
Sbjct: 10 ATQAAIPSEPIAPGAAGRSVGTPQAAAELPQVP 42


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00486  DHBDHDRGNASE  54     1e-10    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 53.5 bits (128), Expect = 1e-10
Identities = 47/185 (25%), Positives = 80/185 (43%), Gaps = 5/185 (2%)

Query: 5 VALITGPTSGLGEGFARRYAVDGYDLVLVARDADRLNTLAAELHDEMGAEVEVICADLSD 64
+A ITG G+GE AR A G + V + ++L + + L E E AD+ D
Sbjct: 10 IAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE-ARHAEAFPADVRD 68

Query: 65 ATDRAVVADRLR---DGVQVLVNNAGFGTAGEFWSADYALLQSQLDVNVTAVMQLTHAAL 121
+ + R+ + +LVN AG G S ++ VN T V + +
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 122 PSMLAAGAGTVLNVASVAGLLPGRG-STYSASKAWVVSFSEGLANALGGTGVGVHALCPG 180
M+ +G+++ V S +P + Y++SKA V F++ L L + + + PG
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 181 FVHTE 185
T+
Sbjct: 189 STETD 193


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00489  TCRTETB  158    9e-45    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 158 bits (400), Expect = 9e-45
Identities = 92/406 (22%), Positives = 173/406 (42%), Gaps = 21/406 (5%)

Query: 15 VLLIGIDGTVLALATPFISADLGATGTQLLWIGDIYSFVLAALLISMGSLGDRIGHKKLL 74
++ VL ++ P I+ D W+ + + G L D++G K+LL
Sbjct: 23 SFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLL 82

Query: 75 LYGATAFAMISAITAYASSP-QMLIGTRALLGMAGATLAPATLALIRGLFPDQRERSIAV 133
L+G S I S +LI R + G AGA PA + ++ + + R A
Sbjct: 83 LFGIIINCFGSVIGFVGHSFFSLLIMARFIQG-AGAAAFPALVMVVVARYIPKENRGKAF 141

Query: 134 GIWASAFSAGAALGPVVGGVLLEHFWWGSVFLINIPVMVILVVGGLLLLPEHRNPDPGPW 193
G+ S + G +GP +GG++ + W +L+ IP++ I+ V L+ L + G +
Sbjct: 142 GLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKEVRIKGHF 199

Query: 194 DLPSVGLSMLGMLGVVYAIKEGFTGLVHGLGGDVVVAAAIGVGSLTLFVRRQLRLPQPLI 253
D+ + L +G++ + + + V S +FV+ ++ P +
Sbjct: 200 DIKGIILMSVGIVFFMLFTTSYSISFL-----------IVSVLSFLIFVKHIRKVTDPFV 248

Query: 254 DVRLFRNPRFSGVVAANLLSVLGLSGLVFFLSQYFQLVHGYGPLKAGLAEL-PAAVTATV 312
D L +N F V + ++G V + + VH + G + P ++ +
Sbjct: 249 DPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVII 308

Query: 313 FGVLAGVAVRYVSHRAALTTGLALVGVAMGALMTSLIVFTPMTPYLPLGISLFVVGVGLG 372
FG + G+ V L G+ +++ L S ++ T T + I +FV+G GL
Sbjct: 309 FGYIGGILVDRRGPLYVLNIGVTF--LSVSFLTASFLLET--TSWFMTIIIVFVLG-GLS 363

Query: 373 LAFTVASDVILASVPKERAGAAAAVSETAYELGMALGIATLGSIIT 418
TV S ++ +S+ ++ AGA ++ L GIA +G +++
Sbjct: 364 FTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00490  HTHTETR  55     9e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 55.0 bits (132), Expect = 9e-12
Identities = 27/192 (14%), Positives = 57/192 (29%), Gaps = 17/192 (8%)

Query: 2 APTRDQLLRSAADFLGRRP--NATQDEIATAVGVSRATLHRHFSGRVALMAALEELAIAE 59
TR +L A ++ + + EIA A GV+R ++ HF + L + + EL+ +
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 60 MRQVLEAV-----RLSEGSATDALRRLVTACEPISPYLALLYSQSQELDLDTTLEIWDEV 114
+ ++ + L ++ + L+ + + + + +
Sbjct: 70 IGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQA 129

Query: 115 DTA--------ITELFLRGQRDGEFRPDLTAAWLTEAFYALIGG--TAWSIRAGRAAARD 164
I + DL I G W +
Sbjct: 130 QRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDLKK 189

Query: 165 FDRLIIDLLLNG 176
R + +LL
Sbjct: 190 EARDYVAILLEM 201


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00493  TCRTETB  133    3e-36    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 133 bits (335), Expect = 3e-36
Identities = 85/405 (20%), Positives = 163/405 (40%), Gaps = 16/405 (3%)

Query: 23 LSAVFLFEMLDNSILSVALPTIGRDLHASTTALQWVTGSYAVVFGGLMLAFGAVADHVGR 82
L + F +L+ +L+V+LP I D + + WV ++ + F +G ++D +G
Sbjct: 19 LCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGI 78

Query: 83 RRVMLVGLTLLALASLATAFVVT-AEQLIAVRVAMGIAAAMTTPGSMALAFRLFTDDGLR 141
+R++L G+ + S+ + LI R G AA M + R + R
Sbjct: 79 KRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKEN-R 137

Query: 142 VRAITVISTVGLIGLAIGPPVGGLVVAIAPWQILLVVNVPIAVLAIIGVRTGIAADRAED 201
+A +I ++ +G +GP +GG++ W LL+ I ++ II V + + E
Sbjct: 138 GKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLL----IPMITIITVPFLMKLLKKEV 193

Query: 202 LHRDPLDLAGAVLGTAAIVLTLVAPTLFVTEGAASWAAWTTTSAAVASFALFILRQRSAR 261
+ D+ G +L + IV + LF T + + +V SF +F+ R
Sbjct: 194 RIKGHFDIKGIILMSVGIVFFM----LFTTS-----YSISFLIVSVLSFLIFVKHIRKVT 244

Query: 262 HPLISLDLVSRPLVSSGLAFKAAAGMAIAGLGYLVTLQLQLDWGWSPALA-AIGMLPQVV 320
P + L G+ +AG +V ++ S A ++ + P +
Sbjct: 245 DPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTM 304

Query: 321 VLVAGSVFVGPFVDWVGPGRAAWLSASAVVAGLAVYGSLGRIGYPWVALALVLVAAGMRV 380
++ G VD GP + + + L ++ + +V V G+
Sbjct: 305 SVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSF 364

Query: 381 VGVVAALTVLRGLPENRTTIGAALTDTATEIASGVGIAVGGTILA 425
V + V L + G +L + + ++ G GIA+ G +L+
Sbjct: 365 TKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00494  HTHTETR  63     1e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 63.5 bits (154), Expect = 1e-14
Identities = 30/146 (20%), Positives = 59/146 (40%), Gaps = 5/146 (3%)

Query: 1 MPRTSERGGPLTRRKISDVATTLFLDRGFDAVTVADVAREAGVSSVTVFKHFPRKEDLLF 60
M R +++ TR+ I DVA LF +G + ++ ++A+ AGV+ ++ HF K DL
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 DREEDAVEILRSAVRDRAS--GAGVLASLREVAFRLVDDRHALSGVKAGSIPFFRT---V 115
+ E + + + + L+ LRE+ +++ + F V
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 116 AASPALIARAREIASELERALADLLE 141
+ R + E + L+
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLK 146
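
One way to judge how much of a signature an alignment spans is to divide the aligned Sbjct range by the reported Length. This is an illustrative metric, not something the page itself reports, and `subject_coverage` is a hypothetical helper. A minimal sketch for the HTHTETR hit above (Sbjct 1 to 146, Length = 215):

```python
def subject_coverage(sbjct_start, sbjct_end, subject_length):
    """Fraction of the subject (signature) covered by one alignment.

    Coordinates are 1-based and inclusive, as printed in the alignment.
    """
    if not (1 <= sbjct_start <= sbjct_end <= subject_length):
        raise ValueError("coordinates must satisfy 1 <= start <= end <= length")
    return (sbjct_end - sbjct_start + 1) / subject_length


# Values copied from the alignment: first Sbjct position, last Sbjct
# position, and the "Length = 215" line of the HTHTETR signature.
cov = subject_coverage(1, 146, 215)
print(f"{cov:.1%}")
```

For hits reported as several segments, the ranges would need to be merged before summing, since segments can overlap on the subject.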


60  NCTC10437_00827  NCTC10437_00833  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00827  0       11       1.056736   protein kinase
NCTC10437_00828  1       11       0.996250   Uncharacterised protein
NCTC10437_00829  1       11       0.737856   putative anti-sigma regulatory factor
NCTC10437_00830  2       11       0.678282   putative membrane transport protein
NCTC10437_00831  2       15       0.132074   cytochrome P450
NCTC10437_00832  3       14       -0.322842  cytochrome P450
NCTC10437_00833  0       11       -0.375859  X-Pro dipeptidyl-peptidase (S15 family)
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00827  YERSSTKINASE  36     5e-04    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 35.9 bits (82), Expect = 5e-04
Identities = 28/91 (30%), Positives = 41/91 (45%), Gaps = 16/91 (17%)

Query: 126 KDGLIHRDIKPSNILI---TGRDFVYLIDFGLARSAGES--GLTTAGSTLGTLAYMAPER 180
K G++H DIKP N++ +G V ID GL +GE G T ++ APE
Sbjct: 263 KAGVVHNDIKPGNVVFDRASGEPVV--IDLGLHSRSGEQPKGFTE--------SFKAPEL 312

Query: 181 FEGG-DVDARSDIYALTCVLYECLTGSRPYP 210
G +SD++ + L C+ G P
Sbjct: 313 GVGNLGASEKSDVFLVVSTLLHCIEGFEKNP 343


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_00828  PERTACTIN  27     0.006    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 27.4 bits (60), Expect = 0.006
Identities = 20/67 (29%), Positives = 23/67 (34%)

Query: 3 ALPPNPDPASTPGLESGGGVSPGETPPDSEQTSNSMPDPTAGRNLTPRAVIAFVAIGLFV 62
A P P P P P P Q P P AGR L+ A A G+ +
Sbjct: 574 APQPGPQPGPQPPQPPQPPQPPQPPQPPQRQPEAPAPQPPAGRELSAAANAAVNTGGVGL 633

Query: 63 ALFLASA 69
A L A
Sbjct: 634 ASTLWYA 640


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00830  ACRIFLAVINRP  59     5e-11    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 59.1 bits (143), Expect = 5e-11
Identities = 52/234 (22%), Positives = 86/234 (36%), Gaps = 29/234 (12%)

Query: 468 RTAIIQVIPETGPNDTATADLVQRIRADRDAIEADTGATFLVGGNTASNIDTS-DKLATA 526
A + + TG N TA ++ A+ G L +T + S ++
Sbjct: 285 PAAGLGIKLATGANALDTAKAIKAKLAELQPFFPQ-GMKVLYPYDTTPFVQLSIHEVVKT 343

Query: 527 L--PIFLVVVVGLAFILLTIAFRAALVPITSIVGFLLSVFAALGAQVAVFQWGWGASLLG 584
L I LV +V F+ RA L+P ++ LL FA L A
Sbjct: 344 LFEAIMLVFLVMYLFLQ---NMRATLIPTIAVPVVLLGTFAILAAF-------------- 386

Query: 585 VTPGETISFLPIIALAIIFGLSSDYEVFVVSRIKEELATTGDAT-AAVRNGVGLSARVVT 643
G +I+ L + + + GL D + VV ++ + A + +
Sbjct: 387 ---GYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALV 443

Query: 644 AAALIMFGVF--IAFLAG-GDPIIKSVGLTLAFGVFLDAFVVRLTLIPAVMAML 694
A+++ VF +AF G I + +T+ + L V L L PA+ A L
Sbjct: 444 GIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVA-LILTPALCATL 496



Score = 58.3 bits (141), Expect = 8e-11
Identities = 31/152 (20%), Positives = 67/152 (44%), Gaps = 9/152 (5%)

Query: 179 IIGISVAFVILLITFGAIVAAGLPILTAAIGVGIGALGIFVVAAFVDMPTASLSL-ALML 237
I + F+++ + + A +P + V + LG F + A +L++ ++L
Sbjct: 345 FEAIMLVFLVMYLFLQNMRATLIPTIA----VPVVLLGTFAILAAFGYSINTLTMFGMVL 400

Query: 238 GLSCGIDYALFIL-NRYRNNLLLLMPRQEAAGLAVGTAGGAVVFAALTVIIALCGLAVVG 296
+ +D A+ ++ N R + +P +EA ++ GA+V A+ + +A G
Sbjct: 401 AIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFG 460

Query: 297 I---PFLTYMGVAAAVSVLIAMLIALTLLPAL 325
+ ++ +++L+AL L PAL
Sbjct: 461 GSTGAIYRQFSITIVSAMALSVLVALILTPAL 492



Score = 38.7 bits (90), Expect = 1e-04
Identities = 40/275 (14%), Positives = 88/275 (32%), Gaps = 53/275 (19%)

Query: 428 PLLVVADVSGATNPAAAQMIASNISREDG----VVAAAPAGGDSRTAIIQVIPETGPNDT 483
P + R +G + A G S + ++
Sbjct: 799 PFSAFTTSHWVYGS-------PRLERYNGLPSMEIQGEAAPGTSSGDAMALM-------- 843

Query: 484 ATADLVQRIRADRDAIEADTGATFLVGGNTASNIDTSDKLATALPIFLVVVVGLAFILLT 543
+ A + G + G + + ++ + I VVV F+ L
Sbjct: 844 -----------ENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVV----FLCLA 888

Query: 544 IAFRAALVPITSIVGFLLSVFAALGAQVAVFQWGWGASLLGVTPGETISFLPIIALAIIF 603
+ + +P++ ++ L + L A + ++ L
Sbjct: 889 ALYESWSIPVSVMLVVPLGIVGVLLAATLF--------------NQKNDVYFMVGLLTTI 934

Query: 604 GLSSDYEVFVVSRIKEELATTG-DATAAVRNGVGLSAR--VVTAAALIMFGVFIAFLAG- 659
GLS+ + +V K+ + G A V + R ++T+ A I+ + +A G
Sbjct: 935 GLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGA 994

Query: 660 GDPIIKSVGLTLAFGVFLDAFVVRLTLIPAVMAML 694
G +VG+ + G+ A ++ + +P ++
Sbjct: 995 GSGAQNAVGIGVMGGMVS-ATLLAIFFVPVFFVVI 1028


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_00833  IGASERPTASE  35     0.001    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.4 bits (81), Expect = 0.001
Identities = 33/194 (17%), Positives = 65/194 (33%), Gaps = 24/194 (12%)

Query: 40 SAGTSASSSAQGPA----SADRTKPVKVKTAKTETAKPDR-----AESDKSESETAE--S 88
+ T + A P+ + + + + A P AE+ K ES+T E
Sbjct: 996 NITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNE 1055

Query: 89 DNASKADAESADPAGKDRNRPADDTGGSDDASGADDDASDEGEAPAAEVTEHQAGEEEQP 148
+A++ A++ + A + ++ +T + S+ E E E E+E+
Sbjct: 1056 QDATETTAQNREVAKEAKSNVKANT----QTNEVAQSGSETKETQTTETKETATVEKEEK 1111

Query: 149 A-------APEQPVAEEQPPAAVEPSDAEVQAPVETTTTESTVTKALAATDGL--ATGEP 199
A V + P + + QA + K + T +P
Sbjct: 1112 AKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQP 1171

Query: 200 AADSGEPVVPALTS 213
A ++ V +T
Sbjct: 1172 AKETSSNVEQPVTE 1185


61  NCTC10437_00860  NCTC10437_00867  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00860  0       12       -0.650177  protein kinase
NCTC10437_00861  0       10       -1.416411  arabinose efflux permease family protein
NCTC10437_00862  0       9        -0.824716  protein of uncharacterised function (DUF1206)
NCTC10437_00863  -1      8        -1.395515  sugar transferase
NCTC10437_00864  -1      9        -0.948101  putative arginine and alanine rich protein
NCTC10437_00865  0       10       -1.358310  transporter
NCTC10437_00866  2       10       -0.081818  transmembrane protein, MmpS5
NCTC10437_00867  1       10       0.820919   TetR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_00860  PERTACTIN  32     0.008    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 31.6 bits (71), Expect = 0.008
Identities = 23/83 (27%), Positives = 34/83 (40%), Gaps = 1/83 (1%)

Query: 280 IPSEAPVAPVAPPQVPGPVAPPPPAPPSTRQFTSAPQVPMAPPSVRPPRRFGRGQLVLGA 339
+ ++AP AP PQ PGP P P P P P P + G+ + A
Sbjct: 563 VGAKAPPAPKPAPQ-PGPQPGPQPPQPPQPPQPPQPPQPPQRQPEAPAPQPPAGRELSAA 621

Query: 340 VTAAMLAVAVILAMVLVFSGGDS 362
AA+ V LA L ++ ++
Sbjct: 622 ANAAVNTGGVGLASTLWYAESNA 644


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00861  TCRTETA  36     3e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.0 bits (83), Expect = 3e-04
Identities = 64/337 (18%), Positives = 114/337 (33%), Gaps = 45/337 (13%)

Query: 41 VYTVFATYFEGQFFAESEKNSTVYVYAIFAIT-FVMRPVGSWFFGRYADRRGRRAALTFS 99
+ V + + A++A+ F PV G +DR GRR L S
Sbjct: 24 IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVL----GALSDRFGRRPVLLVS 79

Query: 100 VSLMALCSAVIALVPSQATIGIAAPIILVLARLVQGFATGGEYGTSATYMSEAATRERR- 158
++ A+ A++A P +L + R+V G TG + Y+++ + R
Sbjct: 80 LAGAAVDYAIMATAPFLW--------VLYIGRIVAGI-TGATGAVAGAYIADITDGDERA 130

Query: 159 ---GFFSSFQYVTLVGGHVLAQFTLLILQSFLTDDQMRDFGWRIAF----AIGGVAAIVV 211
GF S+ +V G VL M F F A+ G+ +
Sbjct: 131 RHFGFMSACFGFGMVAGPVLGGL-------------MGGFSPHAPFFAAAALNGLNFLTG 177

Query: 212 FWLRRTMDESLSEEVIEAARSGEDKGAG-SMRELFTTYWRPLLLCFLITMGGTVAFYTYS 270
+L + ES E R + A T + + F++ + G V +
Sbjct: 178 CFL---LPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWV 234

Query: 271 VNAPTIVKSTYDDAMTATWINLIGLIFLMLLQPVGGMISDKVGRKPMLLFFGVGGLFYTY 330
+ + +D + G++ + + G ++ ++G + L G+ Y
Sbjct: 235 IF--GEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGER-RALMLGMIADGTGY 291

Query: 331 ILITFLPETQSPLASFALVAGGYII---LTGYTSINA 364
IL+ F L+A G I L S
Sbjct: 292 ILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQV 328


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00865  ACRIFLAVINRP  49     7e-08    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 49.5 bits (118), Expect = 7e-08
Identities = 43/247 (17%), Positives = 92/247 (37%), Gaps = 17/247 (6%)

Query: 137 AQSNDNKAAYVQVYIAGDQGETLANESVEAVRTLVADTPA--PDGVQAYVTGPAATTTDQ 194
A+ N AA + + +A A ++ +A++ +A+ P G++
Sbjct: 279 ARINGKPAAGLGIKLATGAN---ALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQL 335

Query: 195 NAVGDASMRTIEMVTFAVIVVMLLFVYRSAVTTMVVTGMMFLGLMGARGIVSFLGYHGVF 254
+ ++T+ V +VM LF ++ T++ T + + L+G I++ G +
Sbjct: 336 SI--HEVVKTLFEAIMLVFLVMYLF-LQNMRATLIPTIAVPVVLLGTFAILAAFG----Y 388

Query: 255 GLTTFATNMVVTLAIAAATDYGIFLVGRYQEARRNGED--RESAYYTMFHGTAHVILASG 312
+ T T + LAI D I +V + + +E+ +M ++ +
Sbjct: 389 SINTL-TMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAM 447

Query: 313 LTIAGATACLHFTRL--PYFQTMGLPLAVGMTFVVAAALTLGPAVISVVTRFGKVLEPKR 370
+ A F ++ + + M V AL L PA+ + + + +
Sbjct: 448 VLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHEN 507

Query: 371 MARSRGW 377
GW
Sbjct: 508 KGGFFGW 514



Score = 36.4 bits (84), Expect = 6e-04
Identities = 30/158 (18%), Positives = 60/158 (37%), Gaps = 15/158 (9%)

Query: 781 ALILIFIIMVILTRAIVASAVIVGTVVLSLGTSFGLSVLLWQHIVGIPLHWMVLPMSVIV 840
A++L+F++M + + + A+ + V + L +F + I + + MVL + ++V
Sbjct: 347 AIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLV 406

Query: 841 LLAVGADYNLLLVSRMKEELGAGLHTGIIRSMAGTGSVVTAAGMVFAFTMISMTVSDMIV 900
A+ N V R+ E +SM+ + A + ++ +
Sbjct: 407 DDAIVVVEN---VERVMMEDKLPPKEATEKSMSQIQGALVGI----AMVLSAVFIPMAFF 459

Query: 901 IGQVGT-------TIGLGLLFDTFVVRSFMTPSIAALL 931
G G TI + V TP++ A L
Sbjct: 460 GGSTGAIYRQFSITIVSAMALSVLVALIL-TPALCATL 496


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00867  HTHTETR  57     7e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 56.6 bits (136), Expect = 7e-12
Identities = 24/106 (22%), Positives = 47/106 (44%), Gaps = 3/106 (2%)

Query: 24 ASSLRKDAERNRRRIIEAARELCATRGLEA-TLNEVAHHANLGVGTVYRRFPTKDLLFEA 82
A +++A+ R+ I++ A L + +G+ + +L E+A A + G +Y F K LF
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 83 IFQDGMDQLVEIADAALE--MEDSWEAFASLVWQLCELTATDRGLR 126
I++ + E+ D ++ + E T T+ R
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRR 107


62  NCTC10437_00907  NCTC10437_00923  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00907  0       12       1.119947   iron ABC transporter ATP-binding protein
NCTC10437_00908  -1      13       0.664160   Uncharacterised protein
NCTC10437_00909  -1      12       0.708425   secreted protein
NCTC10437_00910  0       9        0.014331   TetR family transcriptional regulator
NCTC10437_00911  -1      8        -0.070760  TetR family transcriptional regulator
NCTC10437_00912  0       8        -0.079164  NmrA family protein
NCTC10437_00913  1       9        -0.621621  Ketosteroid isomerase-related protein
NCTC10437_00914  0       11       -0.224166  TetR family transcriptional regulator
NCTC10437_00915  1       11       0.085260   sulfate transporter
NCTC10437_00916  3       12       0.054258   putative transcriptional regulator
NCTC10437_00917  2       11       0.575365   L-2-haloacid dehalogenase
NCTC10437_00918  2       10       0.737515   TetR family transcriptional regulator
NCTC10437_00919  2       10       1.226850   Putative S-adenosyl-L-methionine-dependent
NCTC10437_00920  1       9        0.517453   arabinose efflux permease family protein
NCTC10437_00921  1       8        0.332171   lipoprotein
NCTC10437_00922  1       8        0.571098   integral membrane sensor signal transduction
NCTC10437_00923  0       8        0.445371   two component transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00907  PF05272  30     0.012    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.012
Identities = 11/23 (47%), Positives = 14/23 (60%)

Query: 30 VLGLLGPNGSGKSSLLRLLAGLD 52
+ L G G GKS+L+ L GLD
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00910  HTHTETR  55     2e-11    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 54.6 bits (131), Expect = 2e-11
Identities = 31/142 (21%), Positives = 53/142 (37%), Gaps = 4/142 (2%)

Query: 8 PSLRERKKARTRLAIRQEAFRLFDQQGYANTTIDQIAHAADVSPRTLYRYFGVKEALLVS 67
+++ TR I A RLF QQG ++T++ +IA AA V+ +Y +F K L
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 68 --DDHTTPIVEAFAN--APRELSIVAAYRHALTEVFGALTPEERENSIAGQRMLYQVPEA 123
+ + I E A ++ R L V + EER +
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVG 121

Query: 124 RGLIYAEYIRLIDLLTAALARR 145
+ + R + L + +
Sbjct: 122 EMAVVQQAQRNLCLESYDRIEQ 143


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00911  HTHTETR  53     6e-11    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 53.1 bits (127), Expect = 6e-11
Identities = 29/159 (18%), Positives = 60/159 (37%), Gaps = 7/159 (4%)

Query: 22 KQRLFKALGAVMGVKGYSSTTVDDLIKNAGVSRATFYQHFESKQDCFMAGYARMQGHIID 81
+Q + + +G SST++ ++ K AGV+R Y HF+ K D F + + +I +
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGE 72

Query: 82 AIGR--APTTGTPMQRFSTMLDQYLGFMSMDPPTARLYLV---EVYSAGPEAMTRRFE-- 134
A G P+ +L L + L + + G A+ ++ +
Sbjct: 73 LELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRN 132

Query: 135 LQQEFVAGVAKLFNARSQADRFACKALVAAISSLVIGAL 173
L E + + +A + + ++ G +
Sbjct: 133 LCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYI 171


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00912  NUCEPIMERASE  37     7e-05    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 36.7 bits (85), Expect = 7e-05
Identities = 28/127 (22%), Positives = 35/127 (27%), Gaps = 29/127 (22%)

Query: 6 VLVTGATGKTGTHVVSGLRRLGHAV---------------RAATRRPAAGDGDAVPFDWA 50
LVTGA G G HV L GH V +A A D A
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLA 62

Query: 51 DHTTFPAALADV---RAIYLVAPAGV----ADPLPLVRP-------FLELAAAGGVRRVV 96
D A R V +P LE ++ ++
Sbjct: 63 DREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHLL 122

Query: 97 QLSSSAV 103
SSS+V
Sbjct: 123 YASSSSV 129


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00914  HTHTETR  58     2e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 57.7 bits (139), Expect = 2e-12
Identities = 24/177 (13%), Positives = 55/177 (31%), Gaps = 15/177 (8%)

Query: 1 MAEVTPSRSARAALRSDVTDGILGAVLDELAAVGYGKLSMDGVARRARAGKAALYRRWPS 60
MA T + IL L + G S+ +A+ A + A+Y +
Sbjct: 1 MARKTKQEAQETRQH------ILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKD 54

Query: 61 KQDMVLDAV-----TSISLPAGKSTSSGALPADI--ADLVHGVNDWLSDPLMSRILPDLL 113
K D+ + L P + L+H + +++ ++ +
Sbjct: 55 KSDLFSEIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIF 114

Query: 114 AEAKRNTDLA--AALTARIAVSRREYGHLVIGAAIDRGELSPDVDVEYALDLVAAPI 168
+ + ++A + + + + I+ L D+ A ++ I
Sbjct: 115 HKCEFVGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYI 171


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00918  HTHTETR  35     1e-04    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 34.6 bits (79), Expect = 1e-04
Identities = 25/164 (15%), Positives = 50/164 (30%), Gaps = 12/164 (7%)

Query: 9 DDDQHRVVAAVHDELARWGIDRFDIGAMAHRHGLDADAILRRWRDPELLILDALA----- 63
+ + ++ ++ G+ +G +A G+ AI ++D L +
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 64 --QRPGDTAPPDTGSLRSDL-FYLAVRMAAMVTSESGRKLHGGHLISDIRYGGVEIRQSA 120
+ + G S L L + + VT E R L G + + Q A
Sbjct: 70 IGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQA 129

Query: 121 WRARA----ATLAVVFDRARERGEMRDDVDYRTVLELLFAPINM 160
R + E + D+ R ++ I+
Sbjct: 130 QRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISG 173


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00920  TCRTETA  43     3e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 42.5 bits (100), Expect = 3e-06
Identities = 42/260 (16%), Positives = 87/260 (33%), Gaps = 9/260 (3%)

Query: 65 MLLALPSGVLADLIDRRRLLIATQGAMAAGVGLLASLTGAGLTTPAV-LLMLLFVIGCGQ 123
A G L+D RR +L+ + A ++A T P + +L + ++
Sbjct: 57 FACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMA-------TAPFLWVLYIGRIVAGIT 109

Query: 124 ALTAPAWQAIQPDLVPAEQIPAAAALGSMSMNGARAIGPAIAGALVSLTGPTLVFALNAV 183
T A D+ ++ S GP + G + + FA A+
Sbjct: 110 GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAAL 169

Query: 184 SFVGIVCVLIWWRRPAVEDNYPPERALAALNAGGRFIRSSPIVRRILLRTALFIAPASAL 243
+ + + + P R A R+ R +V ++ +
Sbjct: 170 NGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVP 229

Query: 244 WGLLPVIAKDQLGLSSSGYGLLLGALGV-GAVCGAFLLSRLRSRFGQNTLLTVGAAGFAV 302
L + +D+ ++ G+ L A G+ ++ A + + +R G+ L +G
Sbjct: 230 AALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGT 289

Query: 303 ATAVLALVHSFGIVLVALVI 322
+LA + +V+
Sbjct: 290 GYILLAFATRGWMAFPIMVL 309



Score = 40.6 bits (95), Expect = 1e-05
Identities = 37/155 (23%), Positives = 62/155 (40%), Gaps = 4/155 (2%)

Query: 247 LPVIAKDQL--GLSSSGYGLLLGALGVGAVCGAFLLSRLRSRFGQNTLLTVGAAGFAVAT 304
LP + +D + ++ YG+LL + A +L L RFG+ +L V AG AV
Sbjct: 28 LPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDY 87

Query: 305 AVLALVHSFGIVLVALVIGGAAWLLTLSTLNAAMQLSLPAWVRARGLSVYQLVFMGGQAL 364
A++A ++ + ++ G T + A + RAR F G
Sbjct: 88 AIMATAPFLWVLYIGRIVAGITG-ATGAVAGAYIADITDGDERARHFGFMSACFGFGMVA 146

Query: 365 GSLLWGLVAGATTSVTSLLVSMALLLVCAVSSLFW 399
G +L GL G + + AL + ++ F
Sbjct: 147 GPVLGGL-MGGFSPHAPFFAAAALNGLNFLTGCFL 180



Score = 29.4 bits (66), Expect = 0.032
Identities = 26/108 (24%), Positives = 44/108 (40%), Gaps = 6/108 (5%)

Query: 68 ALPSGVLADLIDRRRLLIATQGAMAAGVGLLASLTGAGLTTPAVLLMLLFVIGCGQALTA 127
A+ +G +A + RR L+ A G LLA T + P ++L+ IG
Sbjct: 264 AMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIG------M 317

Query: 128 PAWQAIQPDLVPAEQIPAAAALGSMSMNGARAIGPAIAGALVSLTGPT 175
PA QA+ V E+ + + +GP + A+ + + T
Sbjct: 318 PALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITT 365


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00922  PF06580  31     0.006    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.4 bits (71), Expect = 0.006
Identities = 13/122 (10%), Positives = 40/122 (32%), Gaps = 37/122 (30%)

Query: 231 RLIRRCAAVAQERYDAKDVTLDVRVPEGL-----PPLWADEQRLSQVLGNLLDNALRH-- 283
++ +A +++ + + + ++ + PP+ ++ L++N ++H
Sbjct: 223 TVVDSYLQLASIQFEDR-LQFENQINPAIMDVQVPPM---------LVQTLVENGIKHGI 272

Query: 284 --TASGDSVRVECHRDGDHLAIVVADSGDGIAAEHLPRVFERFYRADAARDRDHGGAGIG 341
G + ++ +D + + V ++G G G
Sbjct: 273 AQLPQGGKILLKGTKDNGTVTLEVENTGSLAL------------------KNTKESTGTG 314

Query: 342 LA 343
L
Sbjct: 315 LQ 316


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_00923  HTHFIS  84     6e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.5 bits (209), Expect = 6e-21
Identities = 32/134 (23%), Positives = 62/134 (46%), Gaps = 1/134 (0%)

Query: 19 RALVVDDEVPLAEVVASYLEREHFEAIVANNGVDAIAVARELDPDVVILDLGLPGIDGLE 78
LV DD+ + V+ L R ++ + +N D D+V+ D+ +P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 79 VCRQLRT-FSDAYVVMLTARDTEMDTVLGLTVGADDYVTKPFSPRELVARIRAMLRRPRL 137
+ +++ D V++++A++T M + GA DY+ KPF EL+ I L P+
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 138 APTPAASAAHETAP 151
P+ + + P
Sbjct: 125 RPSKLEDDSQDGMP 138


63  NCTC10437_00965  NCTC10437_00973  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias   Product
NCTC10437_00965  -1      10       0.924043  transcriptional regulator, TetR family
NCTC10437_00966  -3      8        0.769768  short-chain dehydrogenase/reductase SDR
NCTC10437_00967  0       11       2.204725  acetamidase/formamidase
NCTC10437_00968  1       10       2.508754  deazaflavin-dependent nitroreductase family
NCTC10437_00969  1       10       2.542035  Uncharacterised protein
NCTC10437_00970  2       11       2.442475  esterase
NCTC10437_00971  0       10       1.219797  mitomycin antibiotics/polyketide fumonisin
NCTC10437_00972  -1      9        1.222694  Uncharacterised protein
NCTC10437_00973  -1      10       0.432851  Bcr/CflA subfamily drug resistance transporter
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00965  TETREPRESSOR  85     7e-23    Tetracycline repressor protein signature.
		>TETREPRESSOR#Tetracycline repressor protein signature.

Length = 218

Score = 85.4 bits (211), Expect = 7e-23
Identities = 54/198 (27%), Positives = 85/198 (42%), Gaps = 11/198 (5%)

Query: 9 LSVDLIAQAALDLVRSTG--GFTMPGVARKLRVQPSSLYNHVSGRDAIVELLRERAMSE- 65
L+ + + AAL+L+ TG G T +A+KL ++ +LY HV + A+++ L ++
Sbjct: 4 LNRESVIDAALELLNETGIDGLTTRKLAQKLGIEQPTLYWHVKNKRALLDALAVEILARH 63

Query: 66 VQLPADDPDLPWRDVVAGIARSYRSSFARYPQLIPLLTEYAVNSPQALTMYNALAVTLRR 125
W+ + A S+R + RY + + Q T+ L +
Sbjct: 64 HDYSLPAAGESWQSFLRNNAMSFRRALLRYRDGAKVHLGTRPDEKQYDTVETQLRF-MTE 122

Query: 126 AGFSPADTLRSITLVDSFVLGAAL-----DVAAPDEPWRTRDEVGPELAAALATGAPKPQ 180
GFS D L +I+ V F LGA L A D P + + P L AL
Sbjct: 123 NGFSLRDGLYAISAVSHFTLGAVLEQQEHTAALTDRPAAPDENLPPLLREALQ--IMDSD 180

Query: 181 RADDAFEYGLAVLLRGLE 198
+ AF +GL L+RG E
Sbjct: 181 DGEQAFLHGLESLIRGFE 198


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00966  DHBDHDRGNASE  105    1e-29    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 105 bits (264), Expect = 1e-29
Identities = 75/257 (29%), Positives = 113/257 (43%), Gaps = 9/257 (3%)

Query: 4 GLTGKRFAVTGGTRGIGRAVVEGLLAEGAAVAYCARTAEAVTQAQTELTAGGAAALGTAV 63
G+ GK +TG +GIG AV L ++GA +A E + + + L A A
Sbjct: 5 GIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPA 64

Query: 64 DVGDADAVTAWVASAAESLGGLDGVVANVSALAIPET----AENWRTTFEVDLLGTVSLV 119
DV D+ A+ A +G +D +V L E W TF V+ G +
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 120 DAALPHLLAAGSGSIVTISSVSGREIDFAAGPYGTMKAAITHYTQGVAYKYAAQGVRANT 179
+ +++ SGSIVT+ S + Y + KAA +T+ + + A +R N
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 180 VSPGNTY--FPGGVWPS--IEENNPELFAQALALN-PTGRMATPQEVANAVVFLSSSAAS 234
VSPG+T +W E + + P ++A P ++A+AV+FL S A
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAG 244

Query: 235 FITGTNLLVDGALTRGV 251
IT NL VDG T GV
Sbjct: 245 HITMHNLCVDGGATLGV 261


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_00970  TONBPROTEIN  30     0.013    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 30.0 bits (67), Expect = 0.013
Identities = 8/26 (30%), Positives = 8/26 (30%)

Query: 323 TPPLPPPPPPPPVPGALPAPAPAPAG 348
P P P P P P P P
Sbjct: 69 VEPEPEPEPIPEPPKEAPVVIEKPKP 94



Score = 28.4 bits (63), Expect = 0.036
Identities = 10/23 (43%), Positives = 11/23 (47%)

Query: 324 PPLPPPPPPPPVPGALPAPAPAP 346
P+P PP PV P P P P
Sbjct: 76 EPIPEPPKEAPVVIEKPKPKPKP 98


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00972  cloacin  40     2e-05    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 40.1 bits (93), Expect = 2e-05
Identities = 30/83 (36%), Positives = 37/83 (44%), Gaps = 2/83 (2%)

Query: 359 NGGDGGSIGGNGGVGRNGGNASGGIARGGNGGTGGDGGSAVTEGTGIGGNGGNGGNATPG 418
+GGDG G N G GN +GG G GG DG +E GG G+G + G
Sbjct: 2 SGGDGR--GHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGG 59

Query: 419 AGKAGTGGNGGAGGRGGAADGNQ 441
+G GGNG +GG G
Sbjct: 60 SGHGNGGGNGNSGGGSGTGGNLS 82



Score = 35.8 bits (82), Expect = 4e-04
Identities = 37/107 (34%), Positives = 46/107 (42%), Gaps = 6/107 (5%)

Query: 331 LTGLGGRGGVGGSATSTEGGNTTGGRGGNGGDGGSIGGNGGVGRN---GGNASGGIARGG 387
++G GRG G+ ++ GN GG G G GG+ G+G N GG + GI GG
Sbjct: 1 MSGGDGRGHNTGAHST--SGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGG 58

Query: 388 NGGTGGDGGSAVTEGTGIGGNGGNGGNATPGAGKAGTGGNGGAGGRG 434
G G GG+ G G G G A P A GAGG
Sbjct: 59 GSGHGNGGGNG-NSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA 104



Score = 33.9 bits (77), Expect = 0.001
Identities = 33/104 (31%), Positives = 38/104 (36%), Gaps = 5/104 (4%)

Query: 289 GNGGNGGDRLASRAANGGGGGSATSEAGTATGGSAGSGGSAELTGLGGRGGVGGSATSTE 348
G G N G S NGG G G G S GSG S+E GG G G
Sbjct: 6 GRGHNTGAHSTSGNINGGPTG-----LGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGS 60

Query: 349 GGNTTGGRGGNGGDGGSIGGNGGVGRNGGNASGGIARGGNGGTG 392
G GG G +GG G+ G V ++ G GG
Sbjct: 61 GHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA 104



Score = 33.1 bits (75), Expect = 0.003
Identities = 26/77 (33%), Positives = 34/77 (44%), Gaps = 2/77 (2%)

Query: 380 SGGIARGGNGGTGGDGGSAVTEGTGIGGNGG--NGGNATPGAGKAGTGGNGGAGGRGGAA 437
SGG RG N G G+ TG+G GG +G + G G G GG+
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 438 DGNQGNDGNDGGTASSS 454
GN G +GN GG + +
Sbjct: 62 HGNGGGNGNSGGGSGTG 78



Score = 30.1 bits (67), Expect = 0.019
Identities = 29/111 (26%), Positives = 43/111 (38%), Gaps = 13/111 (11%)

Query: 127 NGGNAGLFGGSGGAGGAGGNDYNTSAAGANANALGAAGGRGGNGAGLFPSLAGNGGAGGA 186
NGG GL G G + G+G + N G + + + GG G G GNG +GG
Sbjct: 21 NGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGG------GNGNSGGG 74

Query: 187 ASSLFGNATGGAGGTGGTANIGLPLPGDDSGVALGPGAGGAGGAATVGAAD 237
+ + G + A + P + A G + GA + AD
Sbjct: 75 SGT-------GGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIAD 118



Score = 28.9 bits (64), Expect = 0.048
Identities = 35/108 (32%), Positives = 43/108 (39%), Gaps = 19/108 (17%)

Query: 221 GPGAGGAGGAATVGAADGTGGTSSQTATGGSGGSGGYGLVSGGATPDLSGNGGKGGNATA 280
GP G GG GA+DG+G +S GG GSG + G GNGG GN
Sbjct: 23 GPTGLGVGG----GASDGSGWSSENNPWGGGSGSGIHWGGGSG-----HGNGGGNGN--- 70

Query: 281 YQEGKAVGGNGGNGGDRLASRAANGGGGGSATSEAGTATGGSAGSGGS 328
G G G L++ AA G A S G + S G+
Sbjct: 71 -------SGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGA 111


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00973  TCRTETA  69     8e-15    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 68.7 bits (168), Expect = 8e-15
Identities = 78/307 (25%), Positives = 119/307 (38%), Gaps = 25/307 (8%)

Query: 38 LYLPAFPRMVDDLHTSPTNVQLTLTAFLLGLAAGQLVFGP----LSDRFGRVRPLVIGAT 93
L +P P ++ DL S +V L A Q P LSDRFGR L++
Sbjct: 23 LIMPVLPGLLRDLVHS-NDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLA 81

Query: 94 VCVAASVVAVLAPSIEVLVGARLAQGLTGAAGMVIGRAIISDLATGRAAARAFSLMMIVG 153
+ AP + VL R+ G+TGA G V G A I+D+ G AR F M
Sbjct: 82 GAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAG-AYIADITDGDERARHFGFMSACF 140

Query: 154 GVAPIAAPLAGGFLVGPLGWRGALAVILVLVVAMLVATLVVIRETHTEERR-ARLRAEKS 212
G +A P+ GG + G L + ++ E+H ERR R A
Sbjct: 141 GFGMVAGPVLGGLM-GGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNP 199

Query: 213 TTGSPLRDLLHRTYIGHLIAF-----GFAFAVMMAYISASPFIYQTMM-GLSSAQYGAMF 266
+ + F G A + F + G+S A +G +
Sbjct: 200 LASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILH 259

Query: 267 GVNASGLLLMSALSARLTSHREPRAVAGVGLAAILAAGVAVLALAVLGVPVGWLALPLFV 326
+ + ++ ++ARL E RA+ + ++A G + LA GW+A P+ V
Sbjct: 260 SL--AQAMITGPVAARL---GERRAL----MLGMIADGTGYILLAFAT--RGWMAFPIMV 308

Query: 327 AVAGMGL 333
+A G+
Sbjct: 309 LLASGGI 315


64  NCTC10437_00981  NCTC10437_00987  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_00981  -1      9        0.593837   MMPL domain-containing protein
NCTC10437_00982  0       11       -0.075338  Uncharacterized protein conserved in bacteria
NCTC10437_00983  0       8        0.084029   Uncharacterised protein
NCTC10437_00984  0       8        0.268950   sugar ABC transporter substrate-binding protein
NCTC10437_00985  0       9        0.649474   binding-protein-dependent transport system inner
NCTC10437_00986  0       10       0.761815   binding-protein-dependent transport system inner
NCTC10437_00987  0       10       0.805155   ABC transporter
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_00981  ACRIFLAVINRP  59     5e-11    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 59.1 bits (143), Expect = 5e-11
Identities = 41/222 (18%), Positives = 80/222 (36%), Gaps = 19/222 (8%)

Query: 163 AIPLSFLVLVWVFGGLLAAALPVAIGGMAILGTLAVLRLVTFATDVSIFALNLATAMGLA 222
AI L FLV+ + A +P + +LGT A+L ++ +N T G+
Sbjct: 347 AIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYS-------INTLTMFGMV 399

Query: 223 LAI----DYTLLILSRYRDELAAGADRP-DAIRRTMTTAGRTVIFSATT---VALSMAAM 274
LAI D ++++ + P +A ++M+ ++ A V + MA
Sbjct: 400 LAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFF 459

Query: 275 VLFPMHFLKSFAYAGVATVAFAAAAAIVVTPAALVLLGDRIDSFDVRKLARRLLRRPAPQ 334
+ F+ V+ +A + A+++TPA L + +
Sbjct: 460 GGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENK-GGFFGWFNTT 518

Query: 335 PVPVEQRFWYRCAKAVMRRALPLGLAVVVLLLVLGAPFLGVK 376
+ K + L ++ L+V G L ++
Sbjct: 519 FDHSVNHYTNSVGKILGSTGRYL---LIYALIVAGMVVLFLR 557



Score = 30.6 bits (69), Expect = 0.031
Identities = 26/146 (17%), Positives = 54/146 (36%), Gaps = 8/146 (5%)

Query: 132 AGIAIATGGTAMVNVQITEQSQRDLLVMESLAIPLSFLVLVWVFGGLLAAALPVAIGGMA 191
AGI G + S + +++ + FL L ++ + + +
Sbjct: 852 AGIGYDWTGMS----YQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLG 907

Query: 192 ILGTLAVLRLVTFATDVSIFALNLATAMGLALAIDYTLLILSRYRDELAA-GADRPDAIR 250
I+G L L DV F + L T +G L+ +LI+ +D + G +A
Sbjct: 908 IVGVLLAATLFNQKNDV-YFMVGLLTTIG--LSAKNAILIVEFAKDLMEKEGKGVVEATL 964

Query: 251 RTMTTAGRTVIFSATTVALSMAAMVL 276
+ R ++ ++ L + + +
Sbjct: 965 MAVRMRLRPILMTSLAFILGVLPLAI 990


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00982  PF05272  32     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.0 bits (72), Expect = 0.002
Identities = 24/56 (42%), Positives = 29/56 (51%), Gaps = 6/56 (10%)

Query: 30 RFAGLTALAAVTMVLVQCAPASSPAPPPPGTPIPPAPTPDVAPETFSPITPSPPPA 85
R GL ++A + M APA +PAP PP P PP P P V E + I P P A
Sbjct: 95 REEGLESVAGIVM----GAPAGAPAPKPP-RPEPP-PRPVVEKECWETIQPVPEHA 144


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00985  PF06580  29     0.029    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.7 bits (64), Expect = 0.029
Identities = 19/97 (19%), Positives = 36/97 (37%), Gaps = 4/97 (4%)

Query: 25 TVYLVALALTKSSLAHPLRTFTGGANVATALASPAFVPSLWKSTLFAVVVAVATTTLGLG 84
++ +A++L L H R+F + L + +V VA T++
Sbjct: 42 MIFNIAISLMGLVLTHAYRSFIKRQGWLKLNMGQIILRVLPACVVIGMVWFVANTSIWR- 100

Query: 85 LALLLRHRGNGFGVAGALFLLPLVTAPVLVGVAWKLL 121
L + + F + L ++ V+V W LL
Sbjct: 101 LLAFINTKPVAFTLP---LALSIIFNVVVVTFMWSLL 134


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_00987  PF05272  31     0.007    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.007
Identities = 15/45 (33%), Positives = 22/45 (48%), Gaps = 4/45 (8%)

Query: 17 TGKATVVDHVSLSLPKG----SMTVLVGPSGCGKSTTLRIVAGLE 57
GK ++ HV+ + G VL G G GKST + + GL+
Sbjct: 576 VGKYILMGHVARVMEPGCKFDYSVVLEGTGGIGKSTLINTLVGLD 620


65  NCTC10437_01060  NCTC10437_01070  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_01060  0       16       -2.899005  TetR family transcriptional regulator
NCTC10437_01061  1       16       -2.486535  30S ribosomal protein S12
NCTC10437_01062  0       14       -2.815448  30S ribosomal protein S7
NCTC10437_01063  0       13       -2.706864  translation elongation factor 2 (EF-2/EF-G)
NCTC10437_01064  0       11       -1.491317  elongation factor Tu
NCTC10437_01065  0       11       -0.796428  Uncharacterised protein
NCTC10437_01066  1       9        -1.131493  cutinase
NCTC10437_01067  -1      8        -1.283764  Conserved membrane protein of uncharacterised
NCTC10437_01068  -1      9        -0.948062  oxidoreductase, SDR family
NCTC10437_01069  -1      9        -0.373949  ornithine aminotransferase
NCTC10437_01070  -1      11       -0.258971  amidinotransferase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_01060  TETREPRESSOR  51     2e-10    Tetracycline repressor protein signature.
		>TETREPRESSOR#Tetracycline repressor protein signature.

Length = 218

Score = 51.5 bits (123), Expect = 2e-10
Identities = 21/50 (42%), Positives = 33/50 (66%)

Query: 9 AKLSRDAIVNAALTFLDREGWDALTINALANQLGTKGPSLYNHVHSLDDL 58
A+L+R+++++AAL L+ G D LT LA +LG + P+LY HV + L
Sbjct: 2 ARLNRESVIDAALELLNETGIDGLTTRKLAQKLGIEQPTLYWHVKNKRAL 51


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_01063  TCRTETOQM  585    0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 585 bits (1509), Expect = 0.0
Identities = 161/676 (23%), Positives = 306/676 (45%), Gaps = 71/676 (10%)

Query: 11 KVRNIGIMAHIDAGKTTTTERILFYTGISYKIGEVHDGAATMDWMEQEQERGITITSAAT 70
K+ NIG++AH+DAGKTT TE +L+ +G ++G V G D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 71 TCFWDGNQINIIDTPGHVDFTVEVERSLRVLDGAVAVFDGKEGVEPQSEQVWRQADKYDV 130
+ W+ ++NIIDTPGH+DF EV RSL VLDGA+ + K+GV+ Q+ ++ K +
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 131 PRICFVNKMDKIGADFYFSIRTMEERLGANVIPIQIPIGSEGDFEGIVDLVEMKAKVWRG 190
P I F+NK+D+ G D + ++E+L A ++ Q
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQ------------------------- 156

Query: 191 ETKLGETYETIDIPEDLQEKADEYRTKLLEAVAETDEALLEKYFGGEELTVEEIKGALRK 250
+ +L + E Q + V E ++ LLEKY G+ L E++
Sbjct: 157 KVELYPNMCVTNFTESEQ----------WDTVIEGNDDLLEKYMSGKSLEALELEQEESI 206

Query: 251 LTISSSAYLVLCGSAFKNKGVQPMLDAVIDYLPSPLDVESVTGHVPGKEDELVTRKPSTS 310
+ S + V GSA N G+ +++ + + S
Sbjct: 207 RFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH--------------------RGQ 246

Query: 311 EPFSALAFKVATHPFFGKLTYVRVYSGKVDSGSAVVNATKGKKERLGKLFQMHSNKENPV 370
FK+ +L Y+R+YSG + +V + K K ++ +++ + + +
Sbjct: 247 SELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-KITEMYTSINGELCKI 305

Query: 371 ETASAGHIYAVIG----LKDTTTGDTLSDPNDQVVLESMTFPDPVIEVAIEPKTKSDQEK 426
+ A +G I + L GDT P E + P P+++ +EP +E
Sbjct: 306 DKAYSGEIVILQNEFLKLNS-VLGDTKLLPQR----ERIENPLPLLQTTVEPSKPQQREM 360

Query: 427 LGTAIQKLAEEDPTFKVHLDQETGQTVIGGMGELHLDILVDRMRREFKVEANVGKPQVAY 486
L A+ ++++ DP + ++D T + ++ +G++ +++ ++ ++ VE + +P V Y
Sbjct: 361 LLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIY 420

Query: 487 KETIKRAVEKVEFTHKKQTGGSGQFAKVLVSIEPFTGEDGATYEFENKVTGGRIPREYIP 546
E + K E+T + + +A + +S+ P G+ ++E+ V+ G + + +
Sbjct: 421 MERPLK---KAEYTIHIEVPPNPFWASIGLSVSP--LPLGSGMQYESSVSLGYLNQSFQN 475

Query: 547 SVDAGAQDAMQYGVLAGYPLVNVKLTLLDGAYHEVDSSEMAFKVAGSQVMKKAAAQAQPV 606
+V G + + G L G+ + + K+ G Y+ S+ F++ V+++ +A
Sbjct: 476 AVMEGIRYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTE 534

Query: 607 ILEPVMAVEVTTPEDYMGEVIGDLNSRRGQIQAMEERSGARVVKAHVPLSEMFGYVGDLR 666
+LEP ++ ++ P++Y+ D I + ++ ++ +P + Y DL
Sbjct: 535 LLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQEYRSDLT 594

Query: 667 SKTQGRANYSMVFDSY 682
T GR+ Y
Sbjct: 595 FFTNGRSVCLTELKGY 610


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_01064  TCRTETOQM  79     5e-18    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 79.1 bits (195), Expect = 5e-18
Identities = 52/155 (33%), Positives = 82/155 (52%), Gaps = 13/155 (8%)

Query: 13 VNIGTIGHVDHGKTTLTAAITKVLH-----DKYPELNESRAFDQIDNAPEERQRGITINI 67
+NIG + HVD GKTTLT ++ L+ + +++ + DN ERQRGITI
Sbjct: 4 INIGVLAHVDAGKTTLTESL---LYNSGAITELGSVDKGTT--RTDNTLLERQRGITIQT 58

Query: 68 SHVEYQTEKRHYAHVDAPGHADYIKNMITGAAQMDGAILVVAATDGPMPQTREHVLLARQ 127
+Q E +D PGH D++ + + +DGAIL+++A DG QTR R+
Sbjct: 59 GITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRK 118

Query: 128 VGVPYILVALNKADMVDDEELIELVEMEVRELLAA 162
+G+P I +NK D + + V +++E L+A
Sbjct: 119 MGIPTI-FFINKIDQNGID--LSTVYQDIKEKLSA 150


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_01065  TONBPROTEIN  31     0.002    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 31.5 bits (71), Expect = 0.002
Identities = 24/81 (29%), Positives = 29/81 (35%)

Query: 16 LDGPRRTIDPWKILACVSTAVSAATIATLVWTVNTVSDEETPAERMSAAAGVTVPTVVPS 75
LD PRR P + C+ AV A + T V V + P P
Sbjct: 3 LDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADLEPPQAVQ 62

Query: 76 PPAPSVPVPEPESEPVAPAAP 96
PP V PEPE EP+
Sbjct: 63 PPPEPVVEPEPEPEPIPEPPK 83


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_01068  DHBDHDRGNASE  113    5e-32    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 113 bits (283), Expect = 5e-32
Identities = 83/280 (29%), Positives = 128/280 (45%), Gaps = 32/280 (11%)

Query: 4 GLEGRVALITGAARGQGRAHAVRLANEGADIVAIDVCKPVSASVTYPMPTPDDLAETVRL 63
G+EG++A ITGAA+G G A A LA++GA I A+D P+ L + V
Sbjct: 5 GIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDY-------------NPEKLEKVVSS 51

Query: 64 VEATGRKVLSREVDIRDLEAQQQLVADTIEQFGRLDIVVANAGVLSWGRIWEMSEEQWDT 123
++A R + D+RD A ++ A + G +DI+V AGVL G I +S+E+W+
Sbjct: 52 LKAEARHAEAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEA 111

Query: 124 VIDVNLNGTWRTIRAAVPAMIEAGNGGSIIIVSSSAGTKATPGNAHYAASKHGLVALTNA 183
VN G + R+ M GSI+ V S+ A YA+SK V T
Sbjct: 112 TFSVNSTGVFNASRSVSKYM-MDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKC 170

Query: 184 LAIEAGEFGIRVNSIHPYSIDTPM-----VEADAMMKIFSKYPAFLHSFSPMPYKPVAHD 238
L +E E+ IR N + P S +T M + + ++ L +F
Sbjct: 171 LGLELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIK---GSLETFKT--------- 218

Query: 239 GKPGLQEFMTPEEVSDVVAWLAGDGSATISGSQIAVDRGT 278
G P L++ P +++D V +L + I+ + VD G
Sbjct: 219 GIP-LKKLAKPSDIADAVLFLVSGQAGHITMHNLCVDGGA 257


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_01070  ARGDEIMINASE  31     0.004    Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 31.3 bits (71), Expect = 0.004
Identities = 35/188 (18%), Positives = 63/188 (33%), Gaps = 44/188 (23%)

Query: 134 EGQGDFLAVGHTILA-GHGFRTDRRAHDELA-----GHVGMQTV-SLELADPR-FYHLDT 185
EG GD L + +L G RT+ ++ ++LA T+ + ++ R + HLDT
Sbjct: 217 EG-GDELVLNKGLLVIGISERTEAKSVEKLAISLFKNKTSFDTILAFQIPKNRSYMHLDT 275

Query: 186 ALAVLDDATIAYYPPA--------------------------FADHSRTTLQQMFPDAIE 219
+D + + D L + D I+
Sbjct: 276 VFTQIDYSVFTSFTSDDMYFSIYVLTYNPSSSKIHIKKEKARIKDVLSFYLGRK-IDIIK 334

Query: 220 VSTADAYVLGLNVVSDGLNVVL--PSAAAGFA------DQLRCAGFRPVGVDLSELLKGG 271
+ D +DG NV+ P ++ G + + SEL +G
Sbjct: 335 CAGGDLIHGAREQWNDGANVLAIAPGEIIAYSRNHVTNKLFEENGIKVHRIPSSELSRGR 394

Query: 272 GSVKCCTL 279
G +C ++
Sbjct: 395 GGPRCMSM 402


66  NCTC10437_01253  NCTC10437_01266  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias   Product
NCTC10437_01253  1       8        2.193167  integral membrane sensor signal transduction
NCTC10437_01254  1       10       0.513529  roadblock/LC7 family protein
NCTC10437_01255  -1      6        0.708899  Protein of uncharacterised function (DUF742)
NCTC10437_01256  -1      7        1.011482  putative GTPase
NCTC10437_01257  -1      7        0.805705  pentapeptide repeat-containing protein
NCTC10437_01258  -1      7        1.015075  6,7-dihydropteridine reductase
NCTC10437_01259  0       9        1.173516  transcriptional regulator, TetR family
NCTC10437_01260  -1      8        1.363262  ABC-type multidrug transport system, ATPase
NCTC10437_01261  -1      8        1.710257  TetR family transcriptional regulator
NCTC10437_01262  -1      8        0.909145  NADH:flavin oxidoreductase
NCTC10437_01263  -1      8        1.273840  5,10-methylene-tetrahydrofolate
NCTC10437_01264  -2      10       0.940767  Protein of uncharacterised function (DUF3017)
NCTC10437_01265  -2      9        0.880737  methylase involved in ubiquinone/menaquinone
NCTC10437_01266  -1      8        0.529518  homoserine O-acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_01253  PF03544  30     0.039    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 29.9 bits (67), Expect = 0.039
Identities = 11/69 (15%), Positives = 20/69 (28%)

Query: 662 VPAVELLPQRRPGASGIADTPATPFEPIPVASAQPTPPAPRQDIAAFLSARSQPPAPAVD 721
AV+ P+ + P + PV +P P + + + V+
Sbjct: 63 PQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVE 122

Query: 722 PEPAPTANG 730
PA
Sbjct: 123 SRPASPFEN 131


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_01258  PERTACTIN  30     0.028    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 29.7 bits (66), Expect = 0.028
Identities = 35/121 (28%), Positives = 46/121 (38%), Gaps = 13/121 (10%)

Query: 155 AAGVA---DGDVYRRARVVARVDTPSGTVVLTVRPQGGEF-GDFLPGQYVSVGVTLPDGA 210
AAGVA V+ + + R D P+G V GG G F P GV + D
Sbjct: 240 AAGVAAMDGAIVHLQRATIRRGDAPAGGAVPGGAVPGGAVPGGFGPLLDGWYGVDVSDST 299

Query: 211 RQLRQYSLVNAAGGGELTFAVRPVAAVAGRPAGEVSSWIAANVWVGDMIDVTVPFGDLPA 270
L Q S+V A G A AGR A S + + G++I+ P
Sbjct: 300 VDLAQ-SIVEAPQLG--------AAIRAGRGARVTVSGGSLSAPHGNVIETGGGARRFPP 350

Query: 271 P 271
P
Sbjct: 351 P 351


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_01259  HTHTETR  44     6e-08    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 44.2 bits (104), Expect = 6e-08
Identities = 28/167 (16%), Positives = 54/167 (32%), Gaps = 10/167 (5%)

Query: 2 GSEADESQRAAVLDAALAEVQQWGVDRFSIEGVAQRSGLTADYIRQTWTSKRQLIIDTLR 61
+ + R +LD AL Q GV S+ +A+ +G+T I + K L +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 62 TYSESMVPQSDT------GSLHGDLTVLALSIGNHLNNEIGRRIARMLVV----DSKSLV 111
++ G L + + + E RR+ ++ +
Sbjct: 65 LSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMA 124

Query: 112 VDSDTRVQFWLLRRAAIEEIFERAAARGELRSDVKPLVALQLLTSPI 158
V + L IE+ + L +D+ A ++ I
Sbjct: 125 VVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYI 171


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_01260  TONBPROTEIN  33     0.003    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 33.4 bits (76), Expect = 0.003
Identities = 25/114 (21%), Positives = 36/114 (31%), Gaps = 4/114 (3%)

Query: 110 PPPTVAVPVTGRSGPTPAPGRPSATWPAPPPPPPSAPTGRPPVHRPVTGNQSQPYPAATS 169
P P + VT + P + P P P P P + +P P
Sbjct: 39 PAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKP 98

Query: 170 RPAPVPQPISAPAMESATVMGPAAAPRGGGDGNIATSMLRILRPGSAPKAPPGS 223
+P PV + P + V A+P N A + L +A P S
Sbjct: 99 KPKPVKKVQEQPKRDVKPVESRPASPF----ENTAPARLTSSTATAATSKPVTS 148



Score = 30.7 bits (69), Expect = 0.020
Identities = 29/125 (23%), Positives = 40/125 (32%), Gaps = 9/125 (7%)

Query: 110 PPPTVAVPVTGRSGPTPAPGRPSATWPAPPPPPPSAPTGRP-PVHRPVTGNQSQPYPAAT 168
PP V P P P P P P P P +P P +PV Q QP
Sbjct: 57 PPQAVQPPPEPVVEPEPEP-EPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKR--- 112

Query: 169 SRPAPVPQPISAPAMESATVMGPAAAPRGGGDGNIATSMLRILRPGSA--PKAPPGSIKI 226
PV ++P +A ++ TS+ R S P+ P + +
Sbjct: 113 -DVKPVESRPASPFENTAPA-RLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQAL 170

Query: 227 GRAGD 231
G
Sbjct: 171 RIEGQ 175


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_01261  HTHTETR  60     2e-13    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 60.0 bits (145), Expect = 2e-13
Identities = 20/100 (20%), Positives = 43/100 (43%), Gaps = 1/100 (1%)

Query: 1 MPRPARYTVDALLDAAAELLAADGPAAVTMSAVARSTGAPSGSVYHRFPTRAALCGELWV 60
+ A+ T +LD A L + G ++ ++ +A++ G G++Y F ++ L E+W
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 61 RTEERFQEQFVEAL-TSADDPQQRCVDAALRLVQWCRENP 99
+E E +E DP + + +++
Sbjct: 65 LSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEE 104


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_01266  RTXTOXINA  29     0.046    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 28.8 bits (64), Expect = 0.046
Identities = 20/60 (33%), Positives = 27/60 (45%), Gaps = 7/60 (11%)

Query: 268 LEYQGRKLLARF--DAGTYVALTDALSSHDVGRGRGGVAAA----LRGCPVPTVVGGITS 321
L Y G LLA F + G A +S+ G++AA L G PV +VG +T
Sbjct: 346 LGYDGDSLLAAFHKETGAIDASLTTISTVLASVS-SGISAAATTSLVGAPVSALVGAVTG 404


67  NCTC10437_01540  NCTC10437_01546  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_01540  5       30       -2.178780  virulence factor Mce family protein
NCTC10437_01541  4       26       -2.143867  virulence factor Mce family protein
NCTC10437_01542  2       22       -2.061495  virulence factor Mce family protein
NCTC10437_01543  3       22       -1.952586  ABC-type transport system involved in resistance
NCTC10437_01544  2       24       -2.914609  ABC-type transport system involved in resistance
NCTC10437_01545  2       25       -3.427705  organic solvent resistance ABC transporter
NCTC10437_01546  2       27       -3.823513  transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_01540  IGASERPTASE  29     0.042    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 29.3 bits (65), Expect = 0.042
Identities = 13/59 (22%), Positives = 24/59 (40%)

Query: 376 TARPNEVTYSEDWMRPDYIPPQPPAAAASPQTGDLTPGTLPAEAPLLATPQPQPTDPSA 434
P+ + +E+ R D P PPA A +T + E+ + + T+ +A
Sbjct: 1005 ADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTA 1063


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_01541  PERTACTIN  35     9e-04    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 34.7 bits (79), Expect = 9e-04
Identities = 20/46 (43%), Positives = 22/46 (47%)

Query: 398 PYREPAPAPPPGGPPPGPPANPSPSPPPSPPIGPIDDPLTPPGAGQ 443
P P P P PG PP PP P P PP PP + P P AG+
Sbjct: 571 PKPAPQPGPQPGPQPPQPPQPPQPPQPPQPPQRQPEAPAPQPPAGR 616


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_01543  PF05616  35     4e-04    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 35.5 bits (81), Expect = 4e-04
Identities = 39/120 (32%), Positives = 48/120 (40%), Gaps = 11/120 (9%)

Query: 343 GPGGKPGCGSLPDVAKNWPVRQLIANTGF----GTGLDWR--PNPGIGFPGYANYLPVTR 396
PG K G + D N PV Q++A G T +D + P P + PG A P +
Sbjct: 271 APGTKVNMGPVTDRNGN-PV-QVVATFGRDSQGNTTVDVQVIPRPDLT-PGSAE-APNAQ 326

Query: 397 AVPEPPSIRYPGG-PAPGPIPYPGAPPYGAPQYGPDGTPLYPGVPPAPPQSPPAPPPPDG 455
+PE P PAP P P P PD P G P P SP P P+G
Sbjct: 327 PLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQPGTRPDSPAVPDRPNG 386


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_01546  HTHTETR  68     2e-16    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 68.5 bits (167), Expect = 2e-16
Identities = 41/213 (19%), Positives = 72/213 (33%), Gaps = 26/213 (12%)

Query: 10 SRAAQAARTRQRLIDAAVQLFSANNYDDVAVADIAREAKVAHGLLFHYFGSKSQIYLAAI 69
+A TRQ ++D A++LFS ++ +IA+ A V G ++ +F KS ++
Sbjct: 4 KTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIW 63

Query: 70 QFAADEITGSFTVQEGLSPG---QQLREALVRHFRYLAS--HRGLALRLALAGPGAGQSA 124
+ + I + PG LRE L+ + R L + +
Sbjct: 64 ELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEM 123

Query: 125 WEIFESTRLHAVQWIALILDL-------------PPPSPAMQMMWRACIGAIDEAALFWL 171
+ ++ R ++ I A +M G I WL
Sbjct: 124 AVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIM----RGYISGLMENWL 179

Query: 172 RHDEPFAVEDIVDSVVDIMATAMRAAARLDPTL 204
+ F ++ V I+ L PTL
Sbjct: 180 FAPQSFDLKKEARDYVAILL----EMYLLCPTL 208


68  NCTC10437_01592  NCTC10437_01602  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_01592  0       11       -2.346572  transcriptional regulator
NCTC10437_01593  -1      12       -3.531405  DSBA oxidoreductase
NCTC10437_01594  -1      11       -2.869908  Transport protein
NCTC10437_01595  -1      13       -1.956448  mycobacterium membrane protein
NCTC10437_01596  0       13       -0.989809  TetR family transcriptional regulator
NCTC10437_01597  1       15       -1.348568  CsbD family protein
NCTC10437_01598  0       13       -0.137476  ABC sugar transporter, periplasmic ligand
NCTC10437_01599  0       13       0.785479   Phytanoyl-CoA dioxygenase (PhyH)
NCTC10437_01600  0       12       2.047079   Uncharacterised protein
NCTC10437_01601  0       10       2.577635   Uncharacterised protein
NCTC10437_01602  -1      10       2.912519   transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_01592  HTHTETR  49     3e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 49.2 bits (117), Expect = 3e-09
Identities = 17/81 (20%), Positives = 23/81 (28%), Gaps = 4/81 (4%)

Query: 73 RKVTPRRPSGREAIVGSLVSAATDEFAAVGFRGASIRSIARRANVNHGMISRYFGSKGQL 132
RK R+ I+ A F+ G S+ IA+ A V G I +F K L
Sbjct: 3 RKTKQEAQETRQHIL----DVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDL 58

Query: 133 LHAALSALAEENATHLRANDD 153

Sbjct: 59 FSEIWELSESNIGELELEYQA 79


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_01594  ACRIFLAVINRP  50     4e-08    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 50.2 bits (120), Expect = 4e-08
Identities = 46/240 (19%), Positives = 88/240 (36%), Gaps = 25/240 (10%)

Query: 136 AALVQVYLRGNQGEALSNESVDSIRDIVADTPA--PDGVKAYVTGAAPTITDNFEVGNEG 193
AA + + L A + ++ +I+ +A+ P G+K + T +
Sbjct: 286 AAGLGIKLATG---ANALDTAKAIKAKLAELQPFFPQGMK-VLYPYDTTPFVQLSIHEVV 341

Query: 194 THLVTAITFAVIALMLLIVYRSLVTMFIMLVVVAVELMAARGIVAVLAHTGVIGLSTYAT 253
L AI ++ L++ + +++ I + V V L+ A+LA G Y+
Sbjct: 342 KTLFEAI--MLVFLVMYLFLQNMRATLIPTIAVPVVLLGT---FAILAAFG------YSI 390

Query: 254 NLLTL----LAIAAGTDYAIFLVGR-YQEARNRGLDRDEAYYDMFRGTVHVIVGSGLTIV 308
N LT+ LAI D AI +V + L EA +VG + +
Sbjct: 391 NTLTMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLS 450

Query: 309 GAVACLYF---TRLPYFQTLGVPAALGVLVTLVAALTLGPAVVVIASRFGLLEPKRRKRA 365
+ F + ++ + + ++++ AL L PA+ + E K
Sbjct: 451 AVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGG 510



Score = 43.3 bits (102), Expect = 4e-06
Identities = 34/205 (16%), Positives = 72/205 (35%), Gaps = 24/205 (11%)

Query: 724 INALRDSAFDAIKATPLSDAKIYVAGTASTYKDIQEGSTYDLMIAALAAIALILLIMIFI 783
+ L+ +K D +V + S ++++ AI L+ L+M
Sbjct: 310 LAELQPFFPQGMKVLYPYDTTPFV-----------QLSIHEVVKTLFEAIMLVFLVMYLF 358

Query: 784 TRSFVAAVVIVGTVVLSLGASLGLSVLVWQYIFGIELYWIVPALAIILLLAVGADYNLLL 843
++ A ++ V + L + + FG + + ++L + + D +++
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAI-----LAAFGYSIN-TLTMFGMVLAIGLLVDDAIVV 412

Query: 844 ---ISRFKEEIGAGLNTGIIRAMGGTGAVVTAAGLVFAAT---MASFVFSDLVVLGQIGT 897
+ R E ++M + +V +A MA F S + Q
Sbjct: 413 VENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSI 472

Query: 898 TIAMGLVFDTLIVRSFMTPSIAALL 922
TI + L+ TP++ A L
Sbjct: 473 TIVSAMALSVLVALIL-TPALCATL 496


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_01596  HTHTETR  55     2e-11    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 54.6 bits (131), Expect = 2e-11
Identities = 25/166 (15%), Positives = 52/166 (31%), Gaps = 11/166 (6%)

Query: 11 KPRWTEREAELLAVTLRHLQEHGYERLTVETVATAARCSKATIYRRWPSKERLVLAAFIE 70
K E +L V LR + G ++ +A AA ++ IY + K L +
Sbjct: 6 KQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWEL 65

Query: 71 GTN------CAEVPPRTGSLRGDLIHIGSVICEQSSEHASTMRAVLMELDRSPELAEAF- 123
+ G L I + E + + + + + E
Sbjct: 66 SESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAV 125

Query: 124 ----ETEFVLQRRAVYDEVLAAAVDRGEIDADAINSEIQDLLAGYL 165
+ L+ ++ L ++ + AD + ++ GY+
Sbjct: 126 VQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYI 171


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_01597  IGASERPTASE  25     0.030    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 25.4 bits (55), Expect = 0.030
Identities = 12/48 (25%), Positives = 21/48 (43%)

Query: 32 NDRLQEEGKAQQDKAEAEKDVAKKEAEAETARAEAKTQEARQKAAQKS 79
N+ Q + ++ + K+ A E E + KTQE + +Q S
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVS 1130


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_01601  BACYPHPHTASE  29     0.020    Salmonella/Yersinia modular tyrosine phosphatase si...
		>BACYPHPHTASE#Salmonella/Yersinia modular tyrosine phosphatase signature.

Length = 468

Score = 29.4 bits (65), Expect = 0.020
Identities = 28/94 (29%), Positives = 38/94 (40%), Gaps = 17/94 (18%)

Query: 29 AHTHSVYEA-ASDQRPKLQSWLATALPPMLPGRDPVRVVGVGVGDGSVDAPLAAALAADG 87
+H+HS A + R L+S L PP+ P P G G+ AP
Sbjct: 133 SHSHSALHAPGTPVREGLRSHLDPRTPPLPPRERPHTSGHHGAGEARATAP--------- 183

Query: 88 RRVHYTGVEPHRPSA-AAFTTRLAALDVATLTPA 120
+ V P+ P A A ++RL L TL PA
Sbjct: 184 -----STVSPYGPEARAELSSRLTTLR-NTLAPA 211


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_01602  HTHTETR  63     8e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 63.5 bits (154), Expect = 8e-15
Identities = 33/168 (19%), Positives = 61/168 (36%), Gaps = 14/168 (8%)

Query: 11 RERKRTKTQRAIRRAALRLFLDQGYSNTTVEQIAERAEVSPRTFYRYFGVKEAVL--ICD 68
+++ +T++ I ALRLF QG S+T++ +IA+ A V+ Y +F K + I +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 69 QFTPIVTAFVN---APRELSPVAAYRFAVKEYFAGLTEEDRQDAIVS--QHLLYSAPE-- 121
+ A P++ R + E+R+ ++ H E
Sbjct: 65 LSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMA 124

Query: 122 ----ARGLLYSEYVTLIRLLTDALTQRLGEGVAFTDRQAIAGAIVGVL 165
A+ L E I + A + A + G +
Sbjct: 125 VVQQAQRNLCLESYDRIEQTLKHCIEA-KMLPADLMTRRAAIIMRGYI 171


69  NCTC10437_01663  NCTC10437_01670  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_01663  -1      15       -1.792267  putative integral membrane alanine and leucine
NCTC10437_01664  0       12       -1.704684  TetR family transcriptional regulator
NCTC10437_01665  0       9        -1.782096  diguanylate cyclase
NCTC10437_01666  -1      7        -0.432186  isochorismatase hydrolase
NCTC10437_01667  -1      7        0.175637   Uncharacterised protein
NCTC10437_01668  0       7        0.520239   short-chain dehydrogenase/reductase SDR
NCTC10437_01669  -1      6        1.195592   cyclohexanone monooxygenase
NCTC10437_01670  -2      9        1.055230   short-chain dehydrogenase/reductase SDR
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_01663  RTXTOXINA  30     0.013    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.5 bits (66), Expect = 0.013
Identities = 12/40 (30%), Positives = 20/40 (50%), Gaps = 1/40 (2%)

Query: 50 AVAGLNTAVALAVGSGAGIAALRLLRREPVQPAVSGLLGV 89
++ ++T +A +V SG AA L PV V + G+
Sbjct: 367 SLTTISTVLA-SVSSGISAAATTSLVGAPVSALVGAVTGI 405


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_01664  HTHTETR  45     5e-08    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 45.0 bits (106), Expect = 5e-08
Identities = 26/114 (22%), Positives = 45/114 (39%), Gaps = 1/114 (0%)

Query: 15 QKQAQARLEVSRHACALFWERGVAATSGDDIAAAAGLSTRTIWRYFRSKESCVEPVLALS 74
Q+ + R + A LF ++GV++TS +IA AAG++ I+ +F+ K + LS
Sbjct: 7 QEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELS 66

Query: 75 AQRFITLAHRWPAELSLADH-MAADIADHPLTEKELADEVGALRIATMSASEPA 127
L + A+ + +I H L + L E
Sbjct: 67 ESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_01666  ISCHRISMTASE  54     4e-11    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 53.9 bits (129), Expect = 4e-11
Identities = 39/157 (24%), Positives = 64/157 (40%), Gaps = 21/157 (13%)

Query: 5 ALLVIDMFNTYTHP------DAEELAANAAEIVTPISDL------IAQAAARDDVELVYV 52
LL+ DM N + EL+AN ++ L AQ +++ + +
Sbjct: 32 VLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCVQLGIPVVYTAQPGSQNPDDRALL 91

Query: 53 NDNYGDFTATHTGIVQAALDGARPDLVAPLTPEPDSRFLTKVRHSAFYATPLDYLLSRLG 112
D +G G+ + ++ L PE D LTK R+SAF T L ++ + G
Sbjct: 92 TDFWG------PGLNSGPYEEK---IITELAPEDDDLVLTKWRYSAFKRTNLLEMMRKEG 142

Query: 113 VRRIILTGQVTEQCVLYTALDGYVRHYDVVVPPDAVA 149
++I+TG L TA + ++ DAVA
Sbjct: 143 RDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVA 179


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_01668  DHBDHDRGNASE  135    1e-40    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 135 bits (340), Expect = 1e-40
Identities = 78/263 (29%), Positives = 120/263 (45%), Gaps = 17/263 (6%)

Query: 2 SSSLNGKGVLVTGAASGIGEATARRLLDDGAHVVGFDLTPEAPQTLSDA-------ADYL 54
+ + GK +TGAA GIGEA AR L GAH+ D PE + + + A+
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAF 62

Query: 55 SGDVSDTEAVERAVASVVEKAGRLDGVVHSAGVGGGGPIHLLPDAEWDRVVDINLKATFL 114
DV D+ A++ A + + G +D +V+ AGV G IH L D EW+ +N F
Sbjct: 63 PADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 115 VMRAALSQMVKQDRVDGERGAIVTLSSVEGLEGTAGGSAYNASKGGVVLLTKNAAIDYGP 174
R+ M+ + G+IVT+ S +AY +SK V+ TK ++
Sbjct: 123 ASRSVSKYMMDR-----RSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAE 177

Query: 175 SGIRANVICPGFIQTPMFDSVMGIAGME-----GPREELRHEHKLRRFGRPDEVAAVAAF 229
IR N++ PG +T M S+ G E + L++ +P ++A F
Sbjct: 178 YNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLF 237

Query: 230 LLSAGASFVSGQAIAVDGGYTAG 252
L+S A ++ + VDGG T G
Sbjct: 238 LVSGQAGHITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_01670  DHBDHDRGNASE  135    5e-41    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 135 bits (342), Expect = 5e-41
Identities = 78/256 (30%), Positives = 115/256 (44%), Gaps = 18/256 (7%)

Query: 5 LAGKVALISGAARGMGASHARMMVSHGAKVVCGDILDSDGELVAKELGDAARYVH---LD 61
+ GK+A I+GAA+G+G + AR + S GA + D E V L AR+ D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 62 VTRTEDWDAAVATAVAEFGGLDILVNNAGILNIGTIEDYELSEWHRILDINLNGVFLGIR 121
V + D A E G +DILVN AG+L G I EW +N GVF R
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 122 AAAPTMKTAGRGSIINISSIEGFAGTVACHGYTATKFAVRGLTKSTALELGPFGIRVNSV 181
+ + M GSI+ + S + Y ++K A TK LEL + IR N V
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 182 HPGLVKTPMAD--WVPEDIFQSA-------------LGRIAQPQEVSNLVVYLASDESSY 226
PG +T M W E+ + L ++A+P ++++ V++L S ++ +
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 227 STGAEFVVDGGTIAGL 242
T VDGG G+
Sbjct: 246 ITMHNLCVDGGATLGV 261


70  NCTC10437_01728  NCTC10437_01736  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_01728  5       13       -0.146630  short-chain dehydrogenase/reductase SDR
NCTC10437_01729  4       17       -0.828697  Domain of uncharacterised function (DUF309)
NCTC10437_01730  4       18       -1.422498  Phospholipase/Carboxylesterase
NCTC10437_01731  4       15       -1.219313  transglycosylase associated protein
NCTC10437_01732  4       12       -0.566167  CsbD-like protein
NCTC10437_01733  0       12       -0.206764  Uncharacterised protein
NCTC10437_01734  -1      12       0.247987   Uncharacterised protein
NCTC10437_01735  -2      11       0.916369   Protein of uncharacterised function (DUF998)
NCTC10437_01736  -1      11       1.070640   Cutinase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_01728  DHBDHDRGNASE  120    5e-35    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 120 bits (301), Expect = 5e-35
Identities = 74/248 (29%), Positives = 119/248 (47%), Gaps = 5/248 (2%)

Query: 11 KSAVITGAAFGIGRATAVRFANEGARLIVTDVQGEPLRALADELRSAGAQVETVVGDVSV 70
K A ITGAA GIG A A A++GA + D E L + L++ E DV
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 71 ESDARQMIAAAIERYGRLDVLVANAGIIPLGDVLEMTVSDWDEVMAIDGRGMFLTCKFAI 130
+ ++ A G +D+LV AG++ G + ++ +W+ +++ G+F +
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 131 EAMLTTGGGAIVCLSSISGLAGQKRQAAYGPAKFVATGLTQHLAVEWAEQGIRVNAVAPG 190
+ M+ G+IV + S + AAY +K A T+ L +E AE IR N V+PG
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 191 TIRTERVLRLPEEPGGSE-----YLAEIERMHPMGRIGEPDEVASAIVFLASDDASFITG 245
+ T+ L + G+E L + P+ ++ +P ++A A++FL S A IT
Sbjct: 189 STETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHITM 248

Query: 246 AVLPVDGG 253
L VDGG
Sbjct: 249 HNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_01731  ACRIFLAVINRP  25     0.049    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 25.2 bits (55), Expect = 0.049
Identities = 18/74 (24%), Positives = 31/74 (41%), Gaps = 9/74 (12%)

Query: 5 ILGAVVVGLIVGALARLIMPGKQSIGVIMTVLLGAIGSFLGAWVSYKLGYSNANGGFEFI 64
L A+ ++ LA L + V++ V LG +G L A + N +
Sbjct: 874 ALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLF---NQKND------V 924

Query: 65 PFLVGIIFAVVLIA 78
F+VG++ + L A
Sbjct: 925 YFMVGLLTTIGLSA 938


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_01732  IGASERPTASE  30     0.006    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.006
Identities = 15/91 (16%), Positives = 28/91 (30%), Gaps = 8/91 (8%)

Query: 98 QQDAQKRTAEAAEAQKAAAAKAQAEADAQREVEQARAQERVDVRAAADEVVEAAAEQQTA 157
+ + + +T + +A + E AR E A + + A
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPA--PATPSETTETVA 1041

Query: 158 E------RITENAHEQAEENRARAARLTNEA 182
E + E + A E A+ + EA
Sbjct: 1042 ENSKQESKTVEKNEQDATETTAQNREVAKEA 1072


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_01733  IGASERPTASE  45     1e-07    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 44.7 bits (105), Expect = 1e-07
Identities = 20/159 (12%), Positives = 53/159 (33%), Gaps = 7/159 (4%)

Query: 70 LAERSDALSRAAKLDAAATQKQQRADAQAKT-KVDGAINDQKQAKEAKERDVQDARTQAE 128
+ + A A KQ+ + ++ + + +V+ E
Sbjct: 1025 VPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNE 1084

Query: 129 -QRKKAAAEDAQKRADAAKRQVDEVAAQRKNSVEDAKRKEQARIRAEE--EKATSAAKSK 185
+ + ++ Q ++ V + K VE K +E ++ ++ ++ S
Sbjct: 1085 VAQSGSETKETQTT---ETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQP 1141

Query: 186 LDDAKAKSDAAESKRAQADRLEQLADAEKQKRRTARATE 224
+ ++D + + + AD E+ + T+ E
Sbjct: 1142 QAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVE 1180


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_01736  PF06057  30     0.008    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 29.8 bits (67), Expect = 0.008
Identities = 15/50 (30%), Positives = 20/50 (40%), Gaps = 14/50 (28%)

Query: 129 KLVLGGYSQGAAVMGFVTTELIPDGVTASEVPAPMPPDIANHVSAVTLLG 178
K++L GYS GA V+ FV E+ P +V LL
Sbjct: 118 KVILIGYSFGAEVIPFVLNEM--------------PARYRKNVLGAVLLS 153


71  NCTC10437_01901  NCTC10437_01914  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_01901  -1      9        0.032587   glyoxalase/bleomycin resistance
NCTC10437_01902  -2      9        -0.230797  Uncharacterised protein
NCTC10437_01903  -1      9        -0.480390  short-chain dehydrogenase/reductase SDR
NCTC10437_01904  -1      10       -0.895770  extracellular solute-binding protein
NCTC10437_01905  -1      11       -1.660956  TetR family transcriptional regulator
NCTC10437_01906  -1      11       -1.835959  agmatinase
NCTC10437_01907  0       10       -0.803085  geranylgeranyl reductase
NCTC10437_01908  0       9        -0.652856  TetR family transcriptional regulator
NCTC10437_01909  0       9        -0.539387  putative monooxygenase
NCTC10437_01910  -1      7        0.105531   ribonucleotide-diphosphate reductase subunit
NCTC10437_01911  -1      6        1.096723   Uncharacterised protein
NCTC10437_01912  -1      7        1.374207   multi-sensor signal transduction histidine
NCTC10437_01913  -1      9        -0.061343  response regulator receiver protein
NCTC10437_01914  -1      9        0.005081   two component LuxR family transcriptional
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_01901  MPTASEINHBTR  26     0.039    Metalloprotease inhibitor signature.
		>MPTASEINHBTR#Metalloprotease inhibitor signature.

Length = 122

Score = 26.1 bits (57), Expect = 0.039
Identities = 10/57 (17%), Positives = 17/57 (29%), Gaps = 2/57 (3%)

Query: 9 TLGVDDLSRARRFYEDGLGWTPKAAPEGVVFYQLPGIALALFGRADLAEDAHHPVDG 65
L D + + + W+P P+G+ G + R E G
Sbjct: 59 ALAGDVACAEQWLGDKPVSWSP--TPDGIWLMNAEGTGITHLNRQKEGEYTGRTPSG 113


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_01903  DHBDHDRGNASE  94     6e-25    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 94.0 bits (233), Expect = 6e-25
Identities = 73/267 (27%), Positives = 124/267 (46%), Gaps = 24/267 (8%)

Query: 7 SAYDLTGQVALVTGSSSDLGIGFASARLLGQFNAAVM-VTGTTDRAAERAAELRAEGIVA 65
+A + G++A +TG++ GIG A AR L A + V ++ + + L+AE A
Sbjct: 2 NAKGIEGKIAFITGAAQ--GIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHA 59

Query: 66 HSFVADLMDPDAAGDLVAATIAEFGKLDILVNNAGLASVNIPERPNSLMVMTDEEWALAL 125
+F AD+ D A ++ A E G +DILVN AG+ RP + ++DEEW
Sbjct: 60 EAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVL------RPGLIHSLSDEEWEATF 113

Query: 126 RRNVDSAFFVTRAALPGMVERGYGRIINVASTAGILTAYTGDVGYHTAKAAMLGMTRSVA 185
N F +R+ M++R G I+ V S T Y ++KAA + T+ +
Sbjct: 114 SVNSTGVFNASRSVSKYMMDRRSGSIVTVGSNPA-GVPRTSMAAYASSKAAAVMFTKCLG 172

Query: 186 VDYAKNGVTANVVIPGWIATAAQL--------PSEVAAGNAT------PIGRSASAAEVA 231
++ A+ + N+V PG T Q +V G+ P+ + A +++A
Sbjct: 173 LELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIA 232

Query: 232 AGIAFFATPGASYVTGTTLAIDGGNSI 258
+ F + A ++T L +DGG ++
Sbjct: 233 DAVLFLVSGQAGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_01905  HTHTETR  61     7e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 61.2 bits (148), Expect = 7e-14
Identities = 31/171 (18%), Positives = 64/171 (37%), Gaps = 7/171 (4%)

Query: 9 ARGVQRRTEIIDAAIEVMARVGLAGLSMRVVASQAGIPVGALSYYFDDKSDLVAQAFRQL 68
+ R I+D A+ + ++ G++ S+ +A AG+ GA+ ++F DKSDL ++ +
Sbjct: 7 QEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELS 66

Query: 69 SDREIERVVHTANQLQPSMSAEELADAVADMIIDGFTSP-----RGAIVTRYELVTEASR 123
E + + P L + + ++ T I + E V E +
Sbjct: 67 ESNIGELELEYQAK-FPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAV 125

Query: 124 DERLRPMFEAWYAAMVPALSRLFRELGSHQPELDSRTVMAVTAGLEIDNLY 174
++ + + + E +L +R + G I L
Sbjct: 126 VQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGY-ISGLM 175


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_01908  HTHTETR  50     2e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 50.0 bits (119), Expect = 2e-09
Identities = 30/145 (20%), Positives = 50/145 (34%), Gaps = 2/145 (1%)

Query: 43 RWREHRKKVRSEIVDAAFRAIDQLGPE-VSLREIAEEAGTAKPKIYRHFTDKSDLFQAIG 101
+ ++ ++ R I+D A R Q G SL EIA+ AG + IY HF DKSDLF I
Sbjct: 4 KTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIW 63

Query: 102 QRMRDMLWAAIFPSIDISTDPARTVIFRAVEQYVRLVDEHPNVIRFL-MQGRFAEQSESA 160
+ + +V+ + + + + E
Sbjct: 64 ELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEM 123

Query: 161 MRALNEGRDITLAMAEMFNNELRDM 185
R++ L + L+
Sbjct: 124 AVVQQAQRNLCLESYDRIEQTLKHC 148


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_01912  PF06580  34     0.002    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.7 bits (77), Expect = 0.002
Identities = 13/72 (18%), Positives = 31/72 (43%), Gaps = 9/72 (12%)

Query: 653 TVEVGLSETGTALNLSVTDDGVGGASTDG---GSGLVGLRDRVEALSGE---LTVTSRPG 706
+ + ++ + L V + G G+GL +R+R++ L G + ++ + G
Sbjct: 280 KILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQG 339

Query: 707 EGTRISATIPVP 718
+ A + +P
Sbjct: 340 KVN---AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_01913  HTHFIS  54     1e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 53.7 bits (129), Expect = 1e-11
Identities = 30/107 (28%), Positives = 46/107 (42%), Gaps = 9/107 (8%)

Query: 1 MNGPRCLIVDDSAAFRDAARAMLERGGIEVVGTACDAAEAMRCAEELRPDVALVDVDLGS 60
M G L+ DD AA R L R G +V +AA R D+ + DV +
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVR-ITSNAATLWRWIAAGDGDLVVTDVVMPD 59

Query: 61 DSGFDVAEQMT----GVPVILTSTHDEQDFADLIAA--SPALGFLPK 101
++ FD+ ++ +PV++ S + F I A A +LPK
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQN--TFMTAIKASEKGAYDYLPK 104


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_01914  HTHFIS  72     8e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 71.8 bits (176), Expect = 8e-17
Identities = 43/165 (26%), Positives = 72/165 (43%), Gaps = 11/165 (6%)

Query: 1 MCALRVVVADDDVLLREGLASLLERSGFEVAGQAGDGDALLELVRAERPDLVLVDIRMPP 60
M ++VADDD +R L L R+G++V + L + A DLV+ D+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPD 59

Query: 61 THTSEGLDAARVIREESPDIGIVVLSAHIDVDHAMELLAGGHAIGYLLKTRVTDVTDFVE 120
+ D I++ PD+ ++V+SA A++ G A YL K D+T+ +
Sbjct: 60 EN---AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKG-AYDYLPKPF--DLTELIG 113

Query: 121 TLQRIANGASVVDPALVAELVSARKRDDPLGALSTREHEVLTLMA 165
+ R A ++L + PL S E+ ++A
Sbjct: 114 IIGR----ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLA 154


72  NCTC10437_01921  NCTC10437_01928  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_01921  0       10       -0.232314  TetR family transcriptional regulator
NCTC10437_01922  -1      10       -0.243149  oligopeptide/dipeptide ABC transporter ATPase
NCTC10437_01923  -2      8        -0.213922  binding-protein-dependent transport systems
NCTC10437_01924  -2      9        0.082245   binding-protein-dependent transport systems
NCTC10437_01925  -1      9        0.828199   nicotinamidase-like amidase
NCTC10437_01926  0       10       1.134179   extracellular solute-binding protein
NCTC10437_01927  1       11       1.776397   beta-lactamase domain-containing protein
NCTC10437_01928  1       10       2.179391   allantoinase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_01921  HTHTETR  67     1e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 66.6 bits (162), Expect = 1e-15
Identities = 32/183 (17%), Positives = 58/183 (31%), Gaps = 15/183 (8%)

Query: 6 AGRPRLHAQRRPGTTARDEILDAAGELFTTLGYAATSTRTIAESVGIRQASLYHYFSTKD 65
A + + AQ R ILD A LF+ G ++TS IA++ G+ + ++Y +F K
Sbjct: 2 ARKTKQEAQET-----RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKS 56

Query: 66 DILCALLSQTVAPTLAFIPRLQTAYPGMTEPQQLHAL-AAFDGAQLLGGRWNLGALYLLP 124
D+ + + + Q +PG L + R L +
Sbjct: 57 DLFSEIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHK 116

Query: 125 ELRDARLEPFFSDRERLRLHYLQLSLDIVG--------HTGVD-DAAADLPFRLVESLVN 175
+ + L L + + AA + + L+
Sbjct: 117 CEFVGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLME 176

Query: 176 MWS 178
W
Sbjct: 177 NWL 179


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_01922  PYOCINKILLER  30     0.041    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 29.8 bits (66), Expect = 0.041
Identities = 19/65 (29%), Positives = 30/65 (46%), Gaps = 3/65 (4%)

Query: 566 SVARVIADRIAVMYLGRIVELGPAEQVIGAPAHPYTRALVGSIPDLGRESRILP-GEPAS 624
S+A+ I+D IAV LGR++ P+ +G + Y+ D +S G A+
Sbjct: 276 SLAQAISDAIAV--LGRVLASAPSVMAVGFASLTYSSRTAEQWQDQTPDSVRYALGMDAA 333

Query: 625 PLSPP 629
L P
Sbjct: 334 KLGLP 338


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_01925  ISCHRISMTASE  49     2e-09    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 49.2 bits (117), Expect = 2e-09
Identities = 42/191 (21%), Positives = 66/191 (34%), Gaps = 20/191 (10%)

Query: 12 VLVVVDIQEGGGMSADDAGIPVMPGHVERVGRAQRLVAAAREARIPVVFFQEVHRPSGID 71
VL++ D+Q + PV E ++L + IPVV+ + + D
Sbjct: 32 VLLIHDMQNYFVDAFTAGASPV----TELSANIRKLKNQCVQLGIPVVYTAQPGSQNPDD 87

Query: 72 FGRELDGTEGVHCVDGGPGTDLEPTLRPNLD--GPNH-EFHIVKRRYSGFIGTDFEIVLS 128
D GPG + P + P + + K RYS F T+ ++
Sbjct: 88 RALLTDFW--------GPGLNSGPYEEKIITELAPEDDDLVLTKWRYSAFKRTNLLEMMR 139

Query: 129 GLKASTLILIGGLTDVCVHYTFADAHQRDFYVRVVSDCVGGSSQYRHDAALDAMEYL--Q 186
LI+ G + T +A D V D V S +H A+EY +
Sbjct: 140 KEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVADFSLEKH---QMALEYAAGR 196

Query: 187 TGALRTTDEIL 197
TD +L
Sbjct: 197 CAFTVMTDSLL 207


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_01928  UREASE  32     0.004    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 32.0 bits (73), Expect = 0.004
Identities = 14/28 (50%), Positives = 19/28 (67%), Gaps = 1/28 (3%)

Query: 359 NPAAMAGLDDR-GSIAVGRRADLCVFDP 385
NPA GL GS+ VG+RADL +++P
Sbjct: 412 NPAIAHGLSHEIGSLEVGKRADLVLWNP 439



Score = 28.9 bits (65), Expect = 0.039
Identities = 18/99 (18%), Positives = 35/99 (35%), Gaps = 19/99 (19%)

Query: 1 MDLVLRGDRVLIDGEVRPGSVAVDGGMIVAVGSREADYGGVDVVDIPAPAA--------- 51
+D V+ +L + + + G I A+G + + V I
Sbjct: 68 VDTVITNALILDHWGIVKADIGLKDGRIAAIG-KAGNPDMQPGVTIIVGPGTEVIAGEGK 126

Query: 52 -LLPGFVDTHVHVNDPGTSWEGFATATAAAAAAGITTVV 89
+ G +D+H+H P E A +G+T ++
Sbjct: 127 IVTAGGMDSHIHFICPQQIEE--------ALMSGLTCML 157


73  NCTC10437_02413  NCTC10437_02420  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_02413  -1      8        -0.437021  thioester reductase-like protein
NCTC10437_02414  -2      10       -0.155913  cytochrome P450 133B1
NCTC10437_02415  -2      11       0.148890   gamma-aminobutyraldehyde dehydrogenase
NCTC10437_02416  -2      12       0.462819   4-aminobutyrate aminotransferase
NCTC10437_02417  -2      10       0.054624   protein translocase subunit yajC
NCTC10437_02418  -1      10       0.298605   protein-export membrane protein SecD
NCTC10437_02419  -1      10       0.240111   preprotein translocase subunit SecF
NCTC10437_02420  0       10       0.254879   extracellular solute-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02413  NUCEPIMERASE  39     8e-05    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 39.0 bits (91), Expect = 8e-05
Identities = 48/243 (19%), Positives = 76/243 (31%), Gaps = 59/243 (24%)

Query: 766 TVLLTGATGFLGRYLALEWLERLSMVGGTLICL--------VRAKDDAAAR-ARLDSTFD 816
L+TGA GF+G +++ LE G ++ + V K A+ F
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEA----GHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFH 57

Query: 817 TGDPTLLEHYRRLAADHLEVIAGDKGEPD--LGLDPQTWQRLADSVDLIVDPAALVNHVL 874
D E L A + + R + + +P A +
Sbjct: 58 KIDLADREGMTDLFASG---------HFERVFISPHRLAVRYS-----LENPHAYAD--- 100

Query: 875 PYSQLFGPNALGTAELIRVALTTRLKPFVYVSTIGVGAGIAPGSFTEDADIRAISATRSV 934
N G ++ +++ +Y S+ V F+ D
Sbjct: 101 -------SNLTGFLNILEGCRHNKIQHLLYASSSSVYGLNRKMPFSTD----------DS 143

Query: 935 DDSYANGYGNSKWAGEVLLREAHELSGLPVAVFRCDMILADTTYSGQLNLPDM----FTR 990
D + Y +K A E++ L GLP R T Y G PDM FT+
Sbjct: 144 VDHPVSLYAATKKANELMAHTYSHLYGLPATGLR-----FFTVY-GPWGRPDMALFKFTK 197

Query: 991 LML 993
ML
Sbjct: 198 AML 200


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02418  SECFTRNLCASE  61     3e-12    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 61.4 bits (149), Expect = 3e-12
Identities = 42/222 (18%), Positives = 84/222 (37%), Gaps = 28/222 (12%)

Query: 366 IPGGRTQITGQFTQDSARELANVLKYGSLPLSFDSSEAETVSASLGLSSLRAGLIAGAVG 425
G + G Q+ ++ L L S E +V + + + +
Sbjct: 105 EDGQGAEGQGAQGQELVNKVETALTAVDPALKITSFE--SVGPKVSGELVWTAVWSLLAA 162

Query: 426 LAAVLLYSLLYYR---ALGVLIALSLTASGAMVFAILVLLGRYINYTLDLAGIAGLIIGI 482
++ Y + + ALG ++AL + + +L + L +A L+
Sbjct: 163 TVVIMFYIWVRFEWQFALGAVVALVHDV--LLTVGLFAVLQLKFD----LTTVAALLTIT 216

Query: 483 GMTADSFVVFFERIKDEIREGRSFR------SAVPRGWARARKTIMSGNAVTFLAAAVLY 536
G + + VV F+R+++ + + ++ +V +R T+M+G T LA +
Sbjct: 217 GYSINDTVVVFDRLRENLIKYKTMPLRDVMNLSVNETLSR---TVMTG-MTTLLALVPML 272

Query: 537 FLAVGQVKGFAFTLGLTTILDVVIVFLVTWPLVYMASKSPLW 578
++GF F + V VF T+ VY+A L+
Sbjct: 273 IWGGDVIRGFVFAM-------VWGVFTGTYSSVYVAKNIVLF 307



Score = 33.3 bits (76), Expect = 0.003
Identities = 14/101 (13%), Positives = 34/101 (33%), Gaps = 5/101 (4%)

Query: 11 RYLALFLVLLVGVFALVFFTGDKKPDPKLGIDLQGGTRVTLTARTPDGSPPSRDALIQAQ 70
++ +++ + +++ GID +GGT + + T R AL +
Sbjct: 20 QWATFGAAIVMMIASVILPL---VIGLNFGIDFKGGTTIRTESTTAIDVGVYRAAL-EPL 75

Query: 71 QIISSRVDGLGVSGSEVIIDGQNLVITVPGDDTSEARSLGQ 111
++ + + S ++ +D A G
Sbjct: 76 ELGDVIISEVR-DPSFREDQHVAMIRIQMQEDGQGAEGQGA 115


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02419  SECFTRNLCASE  243    1e-79    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 243 bits (621), Expect = 1e-79
Identities = 76/312 (24%), Positives = 145/312 (46%), Gaps = 14/312 (4%)

Query: 43 FEVIGKRKLWYTVSGLIVAVAIGAMLIRGFTFGIDFEGGTKMAMPRG---DSGITTSQVE 99
F+ + + + +++ ++ L+ G FGIDF+GGT + D G+ + +E
Sbjct: 14 FDFFRWQWATFGAAIVMMIASVILPLVIGLNFGIDFKGGTTIRTESTTAIDVGVYRAALE 73

Query: 100 TV-FNDTIGTPPESVVVVGSGDSATFQIRSETLTNEQTEQLRTA--LFDAFAPVGADGQP 156
+ D I + A +I+ + Q L + P
Sbjct: 74 PLELGDVIISEVRDPSFREDQHVAMIRIQMQEDGQGAEGQGAQGQELVNKVETALTAVDP 133

Query: 157 SKQSISDSAVSETWGGQITQKALIALIVFLVLAAIYITVRYERYMALAALATLVFDLVVT 216
+ + S +V G++ A+ +L+ V+ YI VR+E AL A+ LV D+++T
Sbjct: 134 ALKITSFESVGPKVSGELVWTAVWSLLAATVVIMFYIWVRFEWQFALGAVVALVHDVLLT 193

Query: 217 AGVYALVGFEVTPATVIGLLTILGFSLYDTVIVFDKVEENTEGFEHATRRTFAEQANLAV 276
G++A++ + TV LLTI G+S+ DTV+VFD++ EN ++ + NL+V
Sbjct: 194 VGLFAVLQLKFDLTTVAALLTITGYSINDTVVVFDRLRENLIKYK---TMPLRDVMNLSV 250

Query: 277 NQTFMRSINTSLISVLPIIALMVVAVWLLGVGTLMDLALVQLVGVIVGTYSSIYFATPLL 336
N+T R++ T + ++L ++ +++ G + + GV GTYSS+Y A ++
Sbjct: 251 NETLSRTVMTGMTTLLALVPMLI-----WGGDVIRGFVFAMVWGVFTGTYSSVYVAKNIV 305

Query: 337 VTLRERTDVVRK 348
+ + + +K
Sbjct: 306 LFIGLDRNKEKK 317


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_02420  INTIMIN  30     0.032    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 30.0 bits (67), Expect = 0.032
Identities = 39/289 (13%), Positives = 82/289 (28%), Gaps = 21/289 (7%)

Query: 201 TAILNNDGPAVERIAQVWNTTWNLGADLDL--KKFPSSGPYKLDSVTDEGAVVLVTNDKW 258
TA + +G A + +N A L SG + +D+ V+V+
Sbjct: 581 TATVKKNGVAQANVPVSFNIVSG-TAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTA 639

Query: 259 WGTKPITGRITVWPRSPELQDRVNEGAYDVVDIAAGSSGTLNLPDDYVRTDSPSAGIEQL 318
T + ++ + + + E D A + ++ D P + E
Sbjct: 640 EMTSALNANAVIFV--DQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVT 697

Query: 319 IFAPQGPLSAVPARRALALCTPRDVIARNAEVPVANARL-NAATEDAYGAAEATPQVNDF 377
G LS + + + + +AR+ + A + P+V F
Sbjct: 698 FTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVK------APEVEFF 751

Query: 378 AVANPEAARAALNGQPLTVRIGYQTPNARLAATVGAIARTCAPAGITVE-DVAGENTGPL 436
+ + +G + +G + N
Sbjct: 752 TTLTIDDGNIEI--------VGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIA 803

Query: 437 ALRNNEIDVLIASTGGAAGSGSTGSSAMDAYTLHSANGNNLPRYSNERI 485
++ + V + G S + + YT+ + N +P S
Sbjct: 804 SVDASSGQVTLKEKGTTTISVISSDNQTATYTIATPNSLIVPNMSKRVT 852


74  NCTC10437_02473  NCTC10437_02478  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias   Product
NCTC10437_02473  2       14       2.431083  dehydrogenase
NCTC10437_02474  3       16       2.323418  Uncharacterised protein
NCTC10437_02475  3       15       1.951943  short-chain dehydrogenase/reductase SDR
NCTC10437_02476  3       17       1.991070  Uncharacterised protein
NCTC10437_02477  -1      15       0.366867  Protein of uncharacterised function (DUF3097)
NCTC10437_02478  -2      9        0.010993  recombination factor protein RarA
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02473  DHBDHDRGNASE  87     1e-22    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 87.4 bits (216), Expect = 1e-22
Identities = 64/254 (25%), Positives = 107/254 (42%), Gaps = 15/254 (5%)

Query: 5 LDGRVALVTGAGAGIGEGIARRFADEGARVVVAEHDADSGSAVADRIGGYF-----VATD 59
++G++A +TGA GIGE +AR A +GA + +++ + V + D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 60 VSDRGQVDNAVAAAVSEFGAIDILVNNAWGGGGIGRVELKTDQQIADGVAVGYYGPFWAM 119
V D +D A E G IDILVN A G G + +D++ +V G F A
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVA-GVLRPGLIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 120 RAAYPHMKTAGWGRVINLCSLNGVNAHVGSLEYNAAKEALRALTRTAAREWAPTGVTVNA 179
R+ +M G ++ + S Y ++K A T+ E A + N
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 180 LCPAAKSQAFFRAL--GDYPELEAMADAAN------PMGRMGDPYDDIAPVAVFLASDAS 231
+ P + +L + + + + P+ ++ P DIA +FL S +
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKP-SDIADAVLFLVSGQA 243

Query: 232 RYLTGNTLFVDGGA 245
++T + L VDGGA
Sbjct: 244 GHITMHNLCVDGGA 257


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02475  DHBDHDRGNASE  57     1e-11    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 57.0 bits (137), Expect = 1e-11
Identities = 45/190 (23%), Positives = 80/190 (42%), Gaps = 4/190 (2%)

Query: 8 GPWAVVAGGSEGVGAEFAALLAQAGVNVALVARKPEPLERTAARCRALGVQTRTLAVDLV 67
G A + G ++G+G A LA G ++A V PE LE+ + +A D+
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVR 67

Query: 68 DAP--DEVIAMTADLEVGLLIYNAGANTCSEHFLDA-DLADFARVIDLNITAMMRLTQHY 124
D+ DE+ A + I A + + ++ +N T + ++
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 125 ARPMRERRRGGILLVGSMAGYLGSARHSVYGGVKAFGRIFAESLWLELRQHDVHVLELVL 184
++ M +RR G I+ VGS + + Y KA +F + L LEL ++++ +
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 185 GVTRTPAMER 194
G T T M+
Sbjct: 188 GSTET-DMQW 196


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_02476  cloacin  40     2e-05    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 40.5 bits (94), Expect = 2e-05
Identities = 31/98 (31%), Positives = 40/98 (40%)

Query: 431 AGGNGGTGGTGGTGGTGGAGDAGGNGGVGGAGGAGTGGGPTGVGGKGGLGGSGDSGGPGG 490
+GG+G TG +G GVGG G+G GG G GG G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 491 TGGTGGSGGSGGSGGSGAAGGSGGAGGGGGSAGIADPG 528
G GG+G SGG G+G + A G ++ PG
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPG 99



Score = 40.1 bits (93), Expect = 2e-05
Identities = 39/120 (32%), Positives = 49/120 (40%), Gaps = 6/120 (5%)

Query: 418 GVGGTGGVGGTGEAGGNGGTGGTGGTGGTGGAGDAGGNGGVGGAGGAGTGGGPTGVGGKG 477
G G G G GN GG G G GGA D G G G+G G G
Sbjct: 3 GGDGRGHNTGAHSTSGNI-NGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGI----HWG 57

Query: 478 GLGGSGDSGGPGGTGGTGGSGGSGGSGGSGAAGG-SGGAGGGGGSAGIADPGGTFSHAGA 536
G G G+ GG G +GG G+GG+ + + A G + G G ++ G S A A
Sbjct: 58 GGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIA 117



Score = 37.4 bits (86), Expect = 1e-04
Identities = 41/119 (34%), Positives = 52/119 (43%), Gaps = 8/119 (6%)

Query: 414 AGNGGVGGTGGVGGTGEAGGNGGTGGTGGTGGTGGAGDAGGNGGVGGAGGAGTGGGPTGV 473
+G G G G T G TG G G + G+G + N G GG+G+G G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWG--GGSGSGIHWGGG 59

Query: 474 GGKGGLGGSGDSGGPGGTGGTGGSGGSGGSGGSGAAGGSGGAG------GGGGSAGIAD 526
G G GG+G+SGG GTGG + + + G A G G G SA IAD
Sbjct: 60 SGHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIAD 118



Score = 37.0 bits (85), Expect = 2e-04
Identities = 35/114 (30%), Positives = 42/114 (36%), Gaps = 2/114 (1%)

Query: 443 TGGTGGAGDAGGNGGVGGAGGAGTGGGPTGVGGKGGLGGSGDSGGPGGTGGTGGSGGSGG 502
+GG G + G + G G TG G G G S ++ GG+G GG G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 503 SGGSGAAGGSGG--AGGGGGSAGIADPGGTFSHAGAAGTSGANGASGASGASGA 554
G G G SGG GG SA A F G G + A S A
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAA 115



Score = 35.8 bits (82), Expect = 5e-04
Identities = 33/107 (30%), Positives = 40/107 (37%), Gaps = 1/107 (0%)

Query: 394 GQAGNGGNGGKGGFYGGGGNAGNGGVGGTGGVGGTGEAGGNGGTGGTGGTGGTGGAGDAG 453
G+ N G G GG G G + G G + E GG G+G G G G
Sbjct: 6 GRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNG 65

Query: 454 GNGGVGGAGGAGTGGGPTGVGGKGGLGGSGDSGGPGGTGGTGGSGGS 500
G G G GG+GTGG + V G S G S G+
Sbjct: 66 GGNGNSG-GGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGA 111



Score = 34.3 bits (78), Expect = 0.001
Identities = 35/113 (30%), Positives = 43/113 (38%), Gaps = 3/113 (2%)

Query: 352 NGGNGGAGGIGAAGESGGVGAVGGTGGTGSNATGVGGTGTVGGQAGNGGNGGKGGFYGGG 411
+GG+G GA SG + GG G G G+G GG G G +GGG
Sbjct: 2 SGGDGRGHNTGAHSTSGNIN--GGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGG 59

Query: 412 GNAGNGGVGGTGGVGGTGEAGGNGGTGGTGGTGGTGGAGDAGGNGGVGGAGGA 464
GNGG G G GG+G G G + G V + GA
Sbjct: 60 SGHGNGGGNGNSG-GGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGA 111



Score = 33.1 bits (75), Expect = 0.003
Identities = 29/101 (28%), Positives = 41/101 (40%)

Query: 471 TGVGGKGGLGGSGDSGGPGGTGGTGGSGGSGGSGGSGAAGGSGGAGGGGGSAGIADPGGT 530
+G G+G G+ + G G TG G G S GSG + + GGG GS G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 531 FSHAGAAGTSGANGASGASGASGANGTNGGTGKAGASGSSG 571
+ G G SG +G + ++ A G G+ G
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGG 102



Score = 33.1 bits (75), Expect = 0.003
Identities = 28/74 (37%), Positives = 35/74 (47%), Gaps = 5/74 (6%)

Query: 345 TGAASGGNGGNGGAGGIGAAGESGGVGAVGGTGGTGSNATGVGGTGTVGGQAGNGGNGGK 404
TGA S NGG G+G G GA G+G + N GG+G+ G G+G
Sbjct: 11 TGAHSTSGNINGGPTGLGVGG-----GASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNG 65

Query: 405 GGFYGGGGNAGNGG 418
GG GG +G GG
Sbjct: 66 GGNGNSGGGSGTGG 79



Score = 32.8 bits (74), Expect = 0.004
Identities = 29/82 (35%), Positives = 38/82 (46%), Gaps = 2/82 (2%)

Query: 301 AGGAATTTGTNAVAVGGNGADGGAGGTGVTTGGAGGAGGAATSTT-GAASGGNGGNGGAG 359
+GG T A + GN +GG G GV G + G+G ++ + G SG GG
Sbjct: 2 SGGDGRGHNTGAHSTSGN-INGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGS 60

Query: 360 GIGAAGESGGVGAVGGTGGTGS 381
G G G +G G GTGG S
Sbjct: 61 GHGNGGGNGNSGGGSGTGGNLS 82



Score = 32.8 bits (74), Expect = 0.004
Identities = 27/82 (32%), Positives = 35/82 (42%), Gaps = 1/82 (1%)

Query: 334 AGGAGGAATSTTGAASGG-NGGNGGAGGIGAAGESGGVGAVGGTGGTGSNATGVGGTGTV 392
+GG G + + SG NGG G G G A + G + G GS + G G+
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 393 GGQAGNGGNGGKGGFYGGGGNA 414
G G GN G G GG +A
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLSA 83



Score = 32.8 bits (74), Expect = 0.005
Identities = 36/123 (29%), Positives = 48/123 (39%), Gaps = 5/123 (4%)

Query: 134 GYGGIGGSGAAGVAGGN--GGKGGLFFGNGGAGGAGGTSANGGNGGDAGM---FALVGHG 188
G G G + A GN GG GL G G + G+G +S N GG +G +
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 189 GNGGVGGNGALGTIGTTGTGTATDGTAGEIGAIGATGANGVNPTASGAAGAAGAAGAATT 248
GNGG GN G+ A A+ GA G+ + S A +A A
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIADIMAA 122

Query: 249 VTG 251
+ G
Sbjct: 123 LKG 125



Score = 31.6 bits (71), Expect = 0.009
Identities = 30/111 (27%), Positives = 37/111 (33%), Gaps = 2/111 (1%)

Query: 375 GTGGTGSNATGVGGTGTVGGQAGNGGNGGKGGFYGGGGNAGNGGVGGTGGVGGTGEAGGN 434
G G G N +G + G G G G GG G G + G G G GG+
Sbjct: 3 GGDGRGHNTGAHSTSGNING--GPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGS 60

Query: 435 GGTGGTGGTGGTGGAGDAGGNGGVGGAGGAGTGGGPTGVGGKGGLGGSGDS 485
G G G GG+G G V G T G + S +
Sbjct: 61 GHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGA 111



Score = 31.6 bits (71), Expect = 0.010
Identities = 36/116 (31%), Positives = 41/116 (35%), Gaps = 7/116 (6%)

Query: 234 SGAAGAAGAAGAATTVTGGNGAPGSPGNPGGAGGAGGAVAGTAPLATGGNGAIG-GAGTP 292
SG G GA +T NG P G GGA G + P G I G G+
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 293 GANGGAGGAGGAATTTGTNAVAVGGNGADGGAGGTGVTTGGAGGAGGAATSTTGAA 348
NGG G G + TG G A G GAGG A S + A
Sbjct: 62 HGNGGGNGNSGGGSGTG------GNLSAVAAPVAFGFPALSTPGAGGLAVSISAGA 111



Score = 30.1 bits (67), Expect = 0.027
Identities = 33/114 (28%), Positives = 41/114 (35%), Gaps = 2/114 (1%)

Query: 260 GNPGGAGGAGGAVAGTAPLATGGNGAIGGAGTPGANGGAGGAGGAATTTGTNAVAVGGNG 319
G+ GA G + G G GA G+G N GG G+ G + GNG
Sbjct: 8 GHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGG--SGHGNG 65

Query: 320 ADGGAGGTGVTTGGAGGAGGAATSTTGAASGGNGGNGGAGGIGAAGESGGVGAV 373
G G G TGG A A + A G G A I A S + +
Sbjct: 66 GGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIADI 119



Score = 30.1 bits (67), Expect = 0.032
Identities = 29/108 (26%), Positives = 36/108 (33%), Gaps = 4/108 (3%)

Query: 286 IGGAGTPGANGGAGGAGGAATTTGTNAVAVGGNGADGGAGGTGVTTGGAGGAGGAATSTT 345
+ G G N GA G T GG G GG G+G +
Sbjct: 1 MSGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGS 60

Query: 346 GAASGGNGGNGGAGGIGAAGESGGVGAVGGTGGTGSNATGVGGTGTVG 393
G +GG GN G G +G G + AV G A G G +
Sbjct: 61 GHGNGGGNGNSG----GGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA 104


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_02478  HTHFIS  32     0.006    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.7 bits (72), Expect = 0.006
Identities = 48/225 (21%), Positives = 81/225 (36%), Gaps = 41/225 (18%)

Query: 61 ASVILYGPPGTGKTTLASMISQATGRRFEAL-----SALSAGVKE------VRAVIDVAR 109
++++ G GTGK +A + RR +A+ + E + A+
Sbjct: 161 LTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQ 220

Query: 110 QASM----RGEQTVLFIDEVHRFSKTQQDALLAAVENRV-------------VLLVAATT 152
S + E LF+DE+ Q LL ++ V +VAAT
Sbjct: 221 TRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATN 280

Query: 153 ENPSFSVVAPLLSRSLI-------LQLQPLA--PEDIGTVIRRAIDDERGFGGKV-AVTE 202
++ S+ L L L+L PL EDI ++R + G V +
Sbjct: 281 KDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLDVKRFDQ 340

Query: 203 DAIEQLVQLS-AGDARRALTALEVASETVTASGETVTVEVIEQSL 246
+A+E + G+ R + T + +T E+IE L
Sbjct: 341 EALELMKAHPWPGNVRELENLVRRL--TALYPQDVITREIIENEL 383


75  NCTC10437_02599  NCTC10437_02605  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_02599  0       9        0.827147   NLP/P60 protein
NCTC10437_02600  1       9        0.254179   NLP/P60 protein
NCTC10437_02601  0       9        -0.008107  ATPase
NCTC10437_02602  1       11       0.638024   transcriptional regulator moxR1
NCTC10437_02603  0       10       0.078435   Mg-chelatase subunit ChlD
NCTC10437_02604  -1      10       0.223244   3-oxoacyl-(acyl-carrier-protein) reductase
NCTC10437_02605  -1      10       0.237860   enoyl-(acyl-carrier-protein) reductase (NADH)
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02599  PYOCINKILLER  36     4e-04    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 35.5 bits (81), Expect = 4e-04
Identities = 31/113 (27%), Positives = 50/113 (44%), Gaps = 11/113 (9%)

Query: 143 EALSISSQKVMADLQRARTEQVNRESAARLAKQQADDAVKAAESSQADAVSALSSAQETF 202
EA + M L A ++ E+ + L Q + + AA++S A + + Q
Sbjct: 171 EAYMRFLDREMEGLTAAYNVKLFTEAISSL--QIRMNTLTAAKASIEAAAANKAREQ--- 225

Query: 203 KAQQVELDRLAAERAQAQARLKEAQTLAAPSSG---APAAA-PSAPAAEGTAA 251
E R A E+A+ QA ++ A T A P++G A AA A+G A+
Sbjct: 226 --AAAEAKRKAEEQARQQAAIRAANTYAMPANGSVVATAAGRGLIQVAQGAAS 276


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_02601  HTHFIS  32     0.003    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.5 bits (74), Expect = 0.003
Identities = 32/176 (18%), Positives = 61/176 (34%), Gaps = 21/176 (11%)

Query: 28 HAAPSAAQSNGGLQSEVHTLERAIFEVKRIIVGQD----QLVERMLVGLLAKGHVLLEGV 83
++ + LE + ++ G+ ++ + + +++ G
Sbjct: 110 ELIGIIGRALAEPKRRPSKLEDDSQDGMPLV-GRSAAMQEIYRVLARLMQTDLTLMITGE 168

Query: 84 PGVAKTL---AVETFAKVVGGTFARIQ---FTPDLVPTDIVG------TRIYRAGKEEFD 131
G K L A+ + K G F I DL+ +++ G T F+
Sbjct: 169 SGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFE 228

Query: 132 IELGPVVVNFLLADEINRAPAKVQSALLEVMAERKISIGGKTFPLPSPFLVMATQN 187
G L DEI P Q+ LL V+ + + + G P+ S ++A N
Sbjct: 229 QAEGGT----LFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATN 280


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02604  DHBDHDRGNASE  116    1e-33    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 116 bits (291), Expect = 1e-33
Identities = 71/252 (28%), Positives = 123/252 (48%), Gaps = 21/252 (8%)

Query: 23 RSVLVTGGNRGIGLAIAQRLAADGHKVAVTHRGSGAPDGLFGVE-----------CDVTD 71
+ +TG +GIG A+A+ LA+ G +A + + DV D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 72 SAAVDRAFTEVEEHQGPVEVLVSNAGISQDAFLMRMTEERFENVINANLTGAFRVAQRAS 131
SAA+D +E GP+++LV+ AG+ + + +++E +E + N TG F ++ S
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 132 RSMQRKRFGRIIFIGSVSGMWGIGNQANYAAAKAGLIGMARSISRELSKAGVTANVVAPG 191
+ M +R G I+ +GS + A YA++KA + + + EL++ + N+V+PG
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 192 YIDTEMTRSL-------DERIQAGALEF---IPAKRVGTADEVAGAVSFLASEDASYIAG 241
+T+M SL ++ I+ F IP K++ ++A AV FL S A +I
Sbjct: 189 STETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHITM 248

Query: 242 AVIPVDGGMGMG 253
+ VDGG +G
Sbjct: 249 HNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02605  DHBDHDRGNASE  51     9e-10    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 51.2 bits (122), Expect = 9e-10
Identities = 54/272 (19%), Positives = 106/272 (38%), Gaps = 30/272 (11%)

Query: 5 LEGKRILVTGIITDSSIAFHIAKVAQEAGAELVLTGFD----RMKLIQRIIDRLPNPAPL 60
+EGK +TG I +A+ GA + D +++ + + A
Sbjct: 6 IEGKIAFITG--AAQGIGEAVARTLASQGAHI--AAVDYNPEKLEKVVSSLKAEARHAEA 61

Query: 61 LELDVQNSEHLDSLAARVTEVIGEGNKLDGVVHSIGFMPQTGMGVNPFFDAPYEDVAKGI 120
DV++S +D + AR+ +G +D +V+ G + E+
Sbjct: 62 FPADVRDSAAIDEITARIEREMG---PIDILVNVAGVLR-----PGLIHSLSDEEWEATF 113

Query: 121 HISAYSYASLAKAVLPIM--NPGGGIVGMDFD----PTRAMPAYNWMTVAKSALESVNRF 174
+++ + +++V M G IV + + P +M AY +K+A +
Sbjct: 114 SVNSTGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYA---SSKAAAVMFTKC 170

Query: 175 VAREAGQFGVRSNLVAAGPIRTLAMAAIVGGALGDDAGAQ-IQLLEEGWDQRAPLGWNMK 233
+ E ++ +R N+V+ G T ++ ++ Q I+ E + PL +
Sbjct: 171 LGLELAEYNIRCNIVSPGSTETDMQWSL---WADENGAEQVIKGSLETFKTGIPLK-KLA 226

Query: 234 DPTPVAKTVCALMSDWLPATTGTVIYADGGAS 265
P+ +A V L+S T + DGGA+
Sbjct: 227 KPSDIADAVLFLVSGQAGHITMHNLCVDGGAT 258


76  NCTC10437_02673  NCTC10437_02689  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_02673  -1      8        0.144718   Uncharacterised protein
NCTC10437_02674  -1      9        -0.268638  ABC transporter, transmembrane region, type 1
NCTC10437_02675  -1      10       -1.668527  ABC transporter transmembrane protein
NCTC10437_02676  1       10       -2.906703  cytochrome d ubiquinol oxidase subunit II
NCTC10437_02677  1       9        -2.405661  cytochrome bd ubiquinol oxidase subunit I
NCTC10437_02678  2       9        -1.535704  integral membrane protein
NCTC10437_02679  0       9        -0.025968  extracellular solute-binding protein
NCTC10437_02680  -1      9        0.171288   polar amino acid ABC transporter inner membrane
NCTC10437_02681  -2      8        0.354453   ABC transporter-like protein
NCTC10437_02682  -1      9        0.902460   putative signal transduction histidine kinase
NCTC10437_02683  0       9        0.629712   two component LuxR family transcriptional
NCTC10437_02684  -2      7        0.180595   putative membrane protein (DUF2339)
NCTC10437_02685  -2      9        -1.503746  DNA-binding ferritin-like protein (oxidative
NCTC10437_02686  -2      10       -2.107397  family 3 adenylate cyclase
NCTC10437_02687  -1      12       -2.367074  putative amidohydrolase
NCTC10437_02689  1       11       -2.977994  *response regulator receiver/ANTAR
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02673  PRTACTNFAMLY  28     0.017    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 28.1 bits (62), Expect = 0.017
Identities = 18/55 (32%), Positives = 22/55 (40%), Gaps = 2/55 (3%)

Query: 2 SEPPPPPPPGAYPGPPGGYPPPLYGGYPA--IPPVGPKNGMGTAALVVAIVGLLS 54
++ PP P P PGP PP PA P + AA+ VGL S
Sbjct: 569 AKAPPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAGRELSAAANAAVNTGGVGLAS 623


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02674  PF05272       32     0.008    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.008
Identities = 18/80 (22%), Positives = 28/80 (35%), Gaps = 15/80 (18%)

Query: 306 ARLAARRLLTLTDTVDDDRRPAVTALDLRPGDRVAVV----GPSGCGKTTLLMQTPGVFF 361
+L + +L A + PG + G G GK+TL+ G+ F
Sbjct: 573 LQLVGKYILM-----------GHVARVMEPGCKFDYSVVLEGTGGIGKSTLINTLVGLDF 621

Query: 362 AEDAHLFSTTVRDNLLVARG 381
D H T +D+ G
Sbjct: 622 FSDTHFDIGTGKDSYEQIAG 641


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02675  PF05272       30     0.040    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.040
Identities = 12/20 (60%), Positives = 15/20 (75%)

Query: 338 VLTGPNGIGKSTLLQAILGL 357
VL G GIGKSTL+ ++GL
Sbjct: 600 VLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02679  MICOLLPTASE   29     0.028    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 28.9 bits (64), Expect = 0.028
Identities = 14/45 (31%), Positives = 23/45 (51%), Gaps = 1/45 (2%)

Query: 39 DAIANTVPEA-IKSTGKLVIGVNIPYAPNEFKDPEGKIVGFDVDL 82
D N P+A IKS +++ I + E KD +G+I ++ D
Sbjct: 768 DVHVNKEPKAVIKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDF 812


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02681  PF05272       31     0.005    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.8 bits (69), Expect = 0.005
Identities = 8/17 (47%), Positives = 10/17 (58%)

Query: 39 VLVLVGPSGSGKSTFLR 55
+VL G G GKST +
Sbjct: 598 SVVLEGTGGIGKSTLIN 614


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02683  HTHFIS        69     4e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 69.5 bits (170), Expect = 4e-16
Identities = 29/114 (25%), Positives = 52/114 (45%), Gaps = 1/114 (0%)

Query: 6 ITVMVVDDHPIWRDAVARDLADGGFEVVATADGVASAKRRAAVVLPDVVLMDMRLGDGDG 65
T++V DD R + + L+ G++V T++ A+ R A D+V+ D+ + D +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNA-ATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 66 AQATAEVLAVSPSTRILVLSASDERDDVLEAVKAGATGYLVKSASKQELEDAVR 119
+ P +LV+SA + ++A + GA YL K EL +
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02684  TONBPROTEIN   40     1e-05    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 40.0 bits (93), Expect = 1e-05
Identities = 16/58 (27%), Positives = 19/58 (32%)

Query: 39 EQPPQAVVPQPAAPPSGAPPQPIYPPQPAPRPAPSYRPQYPPPPPTPFQPWPNTVPPR 96
+PPQAV P P P P P P +P+ P P P R
Sbjct: 55 LEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKR 112



Score = 36.1 bits (83), Expect = 2e-04
Identities = 15/62 (24%), Positives = 19/62 (30%)

Query: 39 EQPPQAVVPQPAAPPSGAPPQPIYPPQPAPRPAPSYRPQYPPPPPTPFQPWPNTVPPRLK 98
QP + PA QP P P P P P+ P P + P+ K
Sbjct: 42 AQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPK 101

Query: 99 KE 100

Sbjct: 102 PV 103



Score = 33.0 bits (75), Expect = 0.003
Identities = 17/60 (28%), Positives = 23/60 (38%), Gaps = 1/60 (1%)

Query: 39 EQPPQAVVPQPAAPPSGAP-PQPIYPPQPAPRPAPSYRPQYPPPPPTPFQPWPNTVPPRL 97
E P P+ A P P+P P+P + + P P P+ NT P RL
Sbjct: 74 EPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASPFENTAPARL 133


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02685  HELNAPAPROT   107    1e-32    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 107 bits (269), Expect = 1e-32
Identities = 37/142 (26%), Positives = 63/142 (44%), Gaps = 2/142 (1%)

Query: 22 ELSAALQRVLVDLIELHLQGKQAHWNVVGTNFRDLHLQLDEVVDFAREASDTVAERLRAL 81
+ +L L + L+ + + HW V G +F LH + +E+ D A E DT+AERL A+
Sbjct: 12 LVENSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTIAERLLAI 71

Query: 82 DAVPDGRSDTVAATTSLPEFPAYEHSTGEVVDLITARIYAVVDTLRTVHDGVDAE-DPST 140
P S+ + ++ E+V + + + V + D +T
Sbjct: 72 GGQPVATVKEYTEHASITDGGNETSAS-EMVQALVNDYKQISSESKFVIGLAEENQDNAT 130

Query: 141 ADILHQLIDGLEKLAWLLKSEN 162
AD+ LI+ +EK W+L S
Sbjct: 131 ADLFVGLIEEVEKQVWMLSSYL 152


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02689  HTHFIS        78     4e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 78.3 bits (193), Expect = 4e-19
Identities = 35/118 (29%), Positives = 59/118 (50%), Gaps = 2/118 (1%)

Query: 17 RVLIAEDEALIRLDLAEMLREEGYEIVGEAGDGQEAVDLAESLNPDLVIMDVKMPRRDGI 76
+L+A+D+A IR L + L GY++ + + + DLV+ DV MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 77 DAAAEIASKRI-APIVILTAFSQRELVERARDAGAMAYLVKPFSITDLIPAIEVAVSR 133
D I R P+++++A + +A + GA YL KPF +T+LI I A++
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


77  NCTC10437_02787  NCTC10437_02794  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_02787  -1      10       0.345869   short-chain dehydrogenase
NCTC10437_02788  -2      11       -0.140881  threonine dehydratase
NCTC10437_02789  1       13       -1.018404  permease for cytosine/purines uracil thiamine
NCTC10437_02790  0       17       -0.771465  dehydrogenase of uncharacterised specificity,
NCTC10437_02791  -1      17       -0.476147  Predicted membrane protein
NCTC10437_02792  -1      11       -0.388832  XRE family transcriptional regulator
NCTC10437_02793  -2      9        -0.205539  short-chain dehydrogenase/reductase SDR
NCTC10437_02794  -3      9        -0.026234  Uncharacterised protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02787  DHBDHDRGNASE  105    9e-30    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 105 bits (264), Expect = 9e-30
Identities = 67/250 (26%), Positives = 109/250 (43%), Gaps = 9/250 (3%)

Query: 4 VLVAGGATGIGAAAVRAFRRRGDRVLLADRNEEAGKALVAEDLPGEASF---LRSDFTES 60
+ G A GIG A R +G + D N E V L EA +D +S
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEK-LEKVVSSLKAEARHAEAFPADVRDS 69

Query: 61 DAAQAAVEAAVSFNVGSLDAVFYNAAVLEARPLGEWTAQDWDRSAAVNLRAPFLMAQAAA 120
AA + A + +G +D + A VL + + ++W+ + +VN F +++ +
Sbjct: 70 -AAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 121 PHLRKSDTGRVILTSSTGAFRGHAGMPAYHATKAGLLGMVRALADELGPDGVTVNAVCPG 180
++ +G ++ S A M AY ++KA + + L EL + N V PG
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 181 WVDTPFNSAFW----GHQQDPARALEELTEQIPLRRQADPDDMTGLLLFLASSASSYITG 236
+T + W G +Q +LE IPL++ A P D+ +LFL S + +IT
Sbjct: 189 STETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHITM 248

Query: 237 QALVIDGGYT 246
L +DGG T
Sbjct: 249 HNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02790  DHBDHDRGNASE  129    2e-38    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 129 bits (325), Expect = 2e-38
Identities = 79/258 (30%), Positives = 127/258 (49%), Gaps = 3/258 (1%)

Query: 13 LEGRTAVVTGAARGIGLSIATRLAAHGARVALLDLDDEQTQAAARQVQESTGSRTLGLAA 72
+EG+ A +TGAA+GIG ++A LA+ GA +A +D + E+ ++ A
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEK-LEKVVSSLKAEARHAEAFPA 64

Query: 73 DVTDAARLREAADAIETDLGPVNVVVPNAGILVLKTALDIEPQEFDAVLRVNLFGAFLTA 132
DV D+A + E IE ++GP++++V AG+L + +E++A VN G F +
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 133 VEFARRMPRSGADGRIIFTSSLFGLRGGVGNAAYAASKFGILGMAQSMAAELAPSGIRVN 192
++ M G I+ S AAYA+SK + + + ELA IR N
Sbjct: 125 RSVSKYM-MDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 193 SVCPGQIESAMIRQLFEDRAAANGTTPEGERSAFARHIPLGGLGDPDDVANTYVYLASPL 252
V PG E+ M L+ D A +G F IPL L P D+A+ ++L S
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVI-KGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 253 SSYVTGQHIVVDGGWSVG 270
+ ++T ++ VDGG ++G
Sbjct: 243 AGHITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02793  DHBDHDRGNASE  102    4e-28    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 102 bits (255), Expect = 4e-28
Identities = 72/257 (28%), Positives = 112/257 (43%), Gaps = 18/257 (7%)

Query: 15 LSGKRGLVTGGTRGIGMMIARGLLQAGARVLISSRNAEACARAQEQLSEFGDVW-AVPAD 73
+ GK +TG +GIG +AR L GA + N E + L A PAD
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 74 LSRHDECERLAHLATADSGGLDILVNNAGAMWDEPLATFPDQAWDTVIDLNLKSPFWLVQ 133
+ + + + G +DILVN AG + + + D+ W+ +N F +
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 134 ALLPALRDAGTADDPARIINIGSIAAIHIPNRPNYSYSSSKAALHQLTRVLAKELGPQHI 193
++ + D + I+ +GS A +P +Y+SSKAA T+ L EL +I
Sbjct: 126 SVSKYMMDRRSGS----IVTVGSNPA-GVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 194 AVNAVAPGPFPSTMMAATL--DEFGEA--IAASA-------PLRRIGRDDDMAGVAVFLA 242
N V+PG T M +L DE G I S PL+++ + D+A +FL
Sbjct: 181 RCNIVSPGS-TETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLV 239

Query: 243 SRAGAYLTGAIVPVDGG 259
S ++T + VDGG
Sbjct: 240 SGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02794  RTXTOXIND     29     0.009    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.0 bits (65), Expect = 0.009
Identities = 18/114 (15%), Positives = 40/114 (35%), Gaps = 2/114 (1%)

Query: 21 LEVDHAALDGPRGEWQTAEETSARAQNQLSIVLERLETNAERRSADEAALQAAIEQQAEL 80
L++ + + Q++ + Q + I+ +E N + E+
Sbjct: 125 LKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEV 184

Query: 81 KRAIKTSAEQRETLRKARSKAQREVDAARKQAQAAEARYDEALLVEVLAAQKSK 134
R EQ T + K Q+E++ +K+A+ + +KS+
Sbjct: 185 LRLTSLIKEQFSTWQNQ--KYQKELNLDKKRAERLTVLARINRYENLSRVEKSR 236


78  NCTC10437_02901  NCTC10437_02915  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_02901  -1      11       -1.489754  efpA_1
NCTC10437_02902  -1      10       -1.344427  mycobacterium membrane protein
NCTC10437_02903  -1      11       -1.282671  Transport protein
NCTC10437_02904  -1      10       -0.588760  membrane protein mmpL4
NCTC10437_02905  -1      10       0.164044   Phenolphthiocerol synthesis polyketide synthase
NCTC10437_02906  -1      10       -0.660468  polyketide synthase, Pks8
NCTC10437_02907  0       9        -0.823940  Polyketide biosynthesis protein BaeE
NCTC10437_02908  1       8        -0.807875  Uncharacterised protein
NCTC10437_02909  1       8        -0.656094  Uncharacterised protein
NCTC10437_02910  2       8        -0.243858  glucose-1-phosphate thymidylyltransferase
NCTC10437_02911  1       9        0.513343   dTDP-glucose 4,6-dehydratase
NCTC10437_02912  1       9        1.328469   dTDP-4-dehydrorhamnose reductase
NCTC10437_02913  0       11       1.578519   ErfK/YbiS/YcfS/YnhG family protein
NCTC10437_02914  -2      10       0.580791   Uncharacterised protein
NCTC10437_02915  0       10       0.468205   FHA domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02901  TCRTETB       113    4e-29    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 113 bits (283), Expect = 4e-29
Identities = 77/341 (22%), Positives = 152/341 (44%), Gaps = 20/341 (5%)

Query: 34 VQLVAAMDGPVAVFALPRIQNELGLSDATRAWVITAYLLTFGGLILLGGRLGDAFGRKRV 93
+ + ++ V +LP I N+ A+ WV TA++LTF + G+L D G KR+
Sbjct: 22 LSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRL 81

Query: 94 FIAGVALFTFASVLCGIAWNGPS-LVVARLLHGVAAAIVTPTCTALLATTFPKGPARNAA 152
+ G+ + F SV+ + + S L++AR + G AA ++A PK R A
Sbjct: 82 LLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK-ENRGKA 140

Query: 153 IAVFGALASIGAVLGLVAGGLLTE-ISWRLAFLVNVP-IGILVICVARGMLRETQRERMR 210
+ G++ ++G +G GG++ I W ++L+ +P I I+ + +L++ R +
Sbjct: 141 FGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIPMITIITVPFLMKLLKKEVRIKGH 198

Query: 211 LDVAGAVLATLTITATVFGLSMGPEKGWLSATTLGLGFAALAALLAFIFVE--RTAENPI 268
D+ G +L ++ I + L T+ + F ++ L IFV+ R +P
Sbjct: 199 FDIKGIILMSVGIVFFM-----------LFTTSYSISFLIVSVLSFLIFVKHIRKVTDPF 247

Query: 269 LPFSLFRDRDRIACFVAVFLCTGTSFTLTVLVAFYVQNIMGLSPAQAGVGFI-PIAVAMA 327
+ L ++ + + + GT +V + ++++ LS A+ G I P +++
Sbjct: 248 VDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVI 307

Query: 328 LGTAVSSRVVMSLAPRLVVIGGCTLVLGAMVFGGLTLHGDM 368
+ + +V P V+ G T + + + L
Sbjct: 308 IFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTS 348


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02903  ACRIFLAVINRP  42     1e-05    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 42.1 bits (99), Expect = 1e-05
Identities = 38/282 (13%), Positives = 92/282 (32%), Gaps = 30/282 (10%)

Query: 149 ASTNASVTAIRHIVENTQA--PEGVKAYVTGPSAIATDVGESGNRTVILVTAVSVAVICV 206
A+ + AI+ + Q P+G+K + + V+ ++ ++ +
Sbjct: 297 ANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFV---QLSIHEVVKTLFEAIMLVFL 353

Query: 207 MLLLVYRSVTTVLLLLVVVGIQLQAARGIVALLGHHGMLGLTTFGVNLL----VALCIAA 262
++ L +++ L+ + V + L I+A G + +N L + L I
Sbjct: 354 VMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFG---------YSINTLTMFGMVLAIGL 404

Query: 263 GTDYGIFFIGRYQEARQ-AGEDTVSAYFTTYHGVAKVVLASGLTIAGALYCLSF---TRL 318
D I + + A + + ++ + ++ ++F +
Sbjct: 405 LVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTG 464

Query: 319 PLFNALALPTAIGVIVAVAVALTLFPA----VLAAGSRFGLFDPKR--NLQVRGWRRLGT 372
++ ++ + ++V VAL L PA +L S + +
Sbjct: 465 AIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVN 524

Query: 373 AIVRWPAPILVAT--MAITLLGLLALPGFKPSYNDQKYLPHD 412
IL +T + ++A +LP +
Sbjct: 525 HYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEE 566



Score = 38.3 bits (89), Expect = 2e-04
Identities = 39/210 (18%), Positives = 81/210 (38%), Gaps = 30/210 (14%)

Query: 723 ARVAEIKSAAEEALKGTPLEGSTIYLTGTAAITKDMVVGSEYDLMIAVVAAICLIFIVML 782
A++AE++ + +K +T ++ + L A++ L+F+VM
Sbjct: 308 AKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVV-------KTLFEAIM----LVFLVMY 356

Query: 783 IMTRSLVAATVIVGTVLISLGAAFGVSVFVWQYVIGVQIHWAVLVMTVITL-LAVGS--D 839
+ +++ A + V + L F + G I+ +T+ + LA+G D
Sbjct: 357 LFLQNMRATLIPTIAVPVVLLGTFAILAAF-----GYSIN----TLTMFGMVLAIGLLVD 407

Query: 840 YNLLLV---ARIKDELGAGINTAIIRAMGGTGKVVTTAGLVFAATMGSMVV---SDLRSI 893
+++V R+ E A ++M + +V +A M S
Sbjct: 408 DAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIY 467

Query: 894 GQVGTTIALGLMFDTLIVRAFMTPSIAALL 923
Q TI + +++V +TP++ A L
Sbjct: 468 RQFSITIVSAMAL-SVLVALILTPALCATL 496



Score = 32.5 bits (74), Expect = 0.009
Identities = 34/223 (15%), Positives = 73/223 (32%), Gaps = 30/223 (13%)

Query: 717 ASPEGIARVAEIKSAAEEALKGTPLEGSTIYLTGTAAITKDMVVGSEYDLMIAVVAAICL 776
+ + E P G TG + + S V + +
Sbjct: 828 GEAAPGTSSGDAMALMENLASKLP-AGIGYDWTGMSYQERL----SGNQAPALVAISFVV 882

Query: 777 IFIVMLIMTRSLVAATVIVGTVLISLGAAFGVSVFVWQYVIGVQIHWAVLVMTVITLLAV 836
+F+ + + S ++ V + + V V + + + +V ++T + +
Sbjct: 883 VFLCLAALYESWSIPVSVMLVVPLGI-----VGVLLAATLFNQKNDVYFMV-GLLTTIGL 936

Query: 837 GSDYNLLLVARIKDEL---GAGINTAIIRA---------MGGTGKVVTTAGLVFAATMGS 884
+ +L+V KD + G G+ A + A M ++ L + GS
Sbjct: 937 SAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGS 996

Query: 885 MVVSDLRSIGQVGTTIALGLMFDTLIVRAFMTPSIAALLGRWF 927
+ + G + G++ TL+ F P ++ R F
Sbjct: 997 GAQNAV------GIGVMGGMVSATLLA-IFFVPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02904  ACRIFLAVINRP  50     4e-08    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 50.2 bits (120), Expect = 4e-08
Identities = 47/268 (17%), Positives = 100/268 (37%), Gaps = 36/268 (13%)

Query: 152 VESVAAVRAIVDRLPP--PPGFSVYVTGAAPLLADMQHSGETSILKMTLIGALIIFVVLM 209
+++ A++A + L P P G V Q S ++K +++F+V+
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFV--QLSIHE-VVKTLFEAIMLVFLVMY 356

Query: 210 IVYRSVVTVVALLITVGIELFAARGIVAFLGHHEVFLLSTFAINL--LVALAMAAGT--D 265
+ +++ + I V + L I+A G ++IN + + +A G D
Sbjct: 357 LFLQNMRATLIPTIAVPVVLLGTFAILAAFG---------YSINTLTMFGMVLAIGLLVD 407

Query: 266 YGIFFFGR-YQEARNSGEDRETAYYTTFRGVAPVVLGSGLTIAGALMCLSF---TRMPIF 321
I + + A + + ++G + ++ + ++F + I+
Sbjct: 408 DAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIY 467

Query: 322 QTIGMPCAVGMLVAVAAALTLVPAVL-------------TIGGRVGLFDPTRAINVRRWR 368
+ + M ++V AL L PA+ GG G F+ T +V +
Sbjct: 468 RQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYT 527

Query: 369 RIGTAIVRWPGPILVATLAVTLIGLIAL 396
I+ G L+ A+ + G++ L
Sbjct: 528 NSVGKILGSTGRYLLI-YALIVAGMVVL 554



Score = 44.1 bits (104), Expect = 3e-06
Identities = 35/177 (19%), Positives = 72/177 (40%), Gaps = 23/177 (12%)

Query: 756 FEEGSRYDLLIAGVSALCLIFLIMLLITRGLVASVVIVGTVALSLGASFGLSVLIWQYIF 815
F + S ++++ A+ L+FL+M L + + A+++ V + L +F + F
Sbjct: 332 FVQLSIHEVVKTLFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAI-----LAAF 386

Query: 816 GIPLHWMVLPMAVIVLLGVGSDYNLLLV---SRMKEEIGAGINTGIIRAMGGTGKVVTNA 872
G ++ + + +++ +G+ D +++V R+ E ++M +
Sbjct: 387 GYSINTLTM-FGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGI 445

Query: 873 GLVFAFTMAAMVFSDLMVIG--------QVGTTIGLGLLFDTLVVRAFLTPAIAALL 921
+V + VF + G Q TI + LV LTPA+ A L
Sbjct: 446 AMVL-----SAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVAL-ILTPALCATL 496



Score = 38.7 bits (90), Expect = 1e-04
Identities = 37/205 (18%), Positives = 77/205 (37%), Gaps = 12/205 (5%)

Query: 147 GSSAGVESVAAVRAIVDRLPPPPGFSVYVTGAAPLLADMQHSGETSILKMTLIGALIIFV 206
G+S+G +++A + + +LP G TG ++ + + I +++F+
Sbjct: 833 GTSSG-DAMALMENLASKLPA--GIGYDWTG----MSYQERLSGNQAPALVAISFVVVFL 885

Query: 207 VLMIVYRSVVTVVALLITVGIELFAARGIVAFLGHHEVFLLSTFAINLLVALAMAAGTDY 266
L +Y S V++++ V G++ F + LL + ++A
Sbjct: 886 CLAALYESWSIPVSVMLVVP---LGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAI 942

Query: 267 GIFFFGRYQEARNSGEDRETAYYTTFRGVAPVVLGSGLTIAGAL-MCLSF-TRMPIFQTI 324
I F + + E + P+++ S I G L + +S +
Sbjct: 943 LIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAV 1002

Query: 325 GMPCAVGMLVAVAAALTLVPAVLTI 349
G+ GM+ A A+ VP +
Sbjct: 1003 GIGVMGGMVSATLLAIFFVPVFFVV 1027


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02905  PF03544       41     3e-05    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 40.7 bits (95), Expect = 3e-05
Identities = 16/82 (19%), Positives = 22/82 (26%), Gaps = 2/82 (2%)

Query: 1309 RSPRSAVAPPVPMARVSA--PVAPPPVVPAPRPVTPPKSPPPFDPPVQAPPAAPAAPTAP 1366
+ P V P P P P V+ P+P PK P P
Sbjct: 67 QPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPA 126

Query: 1367 TSREPLWSRAQLETLAGGNISE 1388
+ E + A S+
Sbjct: 127 SPFENTAPARPTSSTATAATSK 148



Score = 34.6 bits (79), Expect = 0.004
Identities = 16/66 (24%), Positives = 21/66 (31%), Gaps = 3/66 (4%)

Query: 1311 PRSAVAPPVPMARVSAPVAPPPVVPAPRPVTPPKSPPPFDPPVQAPPAAPAAPTAPTSRE 1370
P++ PP P+ P P P PV K P P P +
Sbjct: 63 PQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPK---PKPKPKPVKKVEQPKRDVK 119

Query: 1371 PLWSRA 1376
P+ SR
Sbjct: 120 PVESRP 125



Score = 33.8 bits (77), Expect = 0.006
Identities = 13/62 (20%), Positives = 18/62 (29%), Gaps = 1/62 (1%)

Query: 1311 PRSAVAPPVPMARVS-APVAPPPVVPAPRPVTPPKSPPPFDPPVQAPPAAPAAPTAPTSR 1369
A P + PV P P P P P ++P + P P P
Sbjct: 55 VAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQP 114

Query: 1370 EP 1371
+
Sbjct: 115 KR 116


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02906  IGASERPTASE   42     2e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 42.4 bits (99), Expect = 2e-05
Identities = 42/213 (19%), Positives = 69/213 (32%), Gaps = 20/213 (9%)

Query: 936 PVDGTTSITPKRVRAARPATPVAAAEGSRMTHVVQRMESPAPAPVEVSAPAVHVREAEPQ 995
VD T TP ++A P+ P E +R V P PAP S V E Q
Sbjct: 991 TVDTTNITTPNNIQADVPSVPSNNEEIAR----VDEAPVPPPAPATPSETTETVAENSKQ 1046

Query: 996 EAA--------PEVHVPAPAAMVSPDAWSVIDRIQQ-ETAEQHQRYLEVMTQSHQAFLDV 1046
E+ + +V Q E A+ E T + V
Sbjct: 1047 ESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATV 1106

Query: 1047 SVQMMAEIVGDR--DPAPQTPRLAIKEVVQAPPSVAPAVVSSPSPAASVTLVE-QVPAPA 1103
+ A++ ++ + T +++ K+ + +V P + +V + E Q
Sbjct: 1107 EKEEKAKVETEKTQEVPKVTSQVSPKQ--EQSETVQPQAEPARENDPTVNIKEPQSQTNT 1164

Query: 1104 PAAVMVPSP--SPVTEPPVSTSILSPSTPVPSV 1134
A P+ S E PV+ S +
Sbjct: 1165 TADTEQPAKETSSNVEQPVTESTTVNTGNSVVE 1197


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02911  NUCEPIMERASE  141    2e-41    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 141 bits (356), Expect = 2e-41
Identities = 78/335 (23%), Positives = 141/335 (42%), Gaps = 32/335 (9%)

Query: 3 RLLVTGGAGFIGSNFVHHVITHTDHHVTVLDKLT--YAGN--RASLTGLPDTRLTFLRGD 58
+ LVTG AGFIG + ++ H V +D L Y + +A L L F + D
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 59 VADAELVDQLV--GNADAVVHYAAESHNDNSLDHPDPFLHSNVIGTFTLLEAVRRHG-KR 115
+AD E + L G+ + V SL++P + SN+ G +LE R + +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 116 FHHVSTDEVYGDLALDDPGRFTESSPYN-PSSPYSSTKAGSDMLVRAWTRSFGVAATISN 174
+ S+ VYG L P F+ + P S Y++TK ++++ ++ +G+ AT
Sbjct: 121 LLYASSSSVYG-LNRKMP--FSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLR 177

Query: 175 CSNNYGPYQHVEKFIPRQITNILCGIRPRLYGEGRNVRDWIHADDHSSAVLLILERGRI- 233
YGP+ + + + +L G +Y G+ RD+ + DD + A++ + +
Sbjct: 178 FFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHA 237

Query: 234 -----------------GETYLIGANGEKDNRTVIELILTMMGQDADAYDRVPDRAGHDL 276
Y IG + + I+ + +G +A + +P + G L
Sbjct: 238 DTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKK-NMLPLQPGDVL 296

Query: 277 RYAIDPTKLRDELGWQPRYRDFAHGLAATIDWYRH 311
+ D L + +G+ P G+ ++WYR
Sbjct: 297 ETSADTKALYEVIGFTPE-TTVKDGVKNFVNWYRD 330


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02912  NUCEPIMERASE  37     1e-04    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 36.7 bits (85), Expect = 1e-04
Identities = 42/202 (20%), Positives = 68/202 (33%), Gaps = 48/202 (23%)

Query: 248 DYDTIINAAAYTAVDAAETAEGRVAAWAANVTGVAALARVASAHGIT-LVHVSSDYVFDG 306
++ + + AV + E A +N+TG + + I L++ SS V+
Sbjct: 75 HFERVFISPHRLAVRYS--LENPHAYADSNLTGFLNILEGCRHNKIQHLLYASSSSVYGL 132

Query: 307 TATRPYREDDPMA-PLGVYGQTKAAGDQIVST------VP----RHYIVRTSWVIGDGR- 354
P+ DD + P+ +Y TK A + + T +P R + V W GR
Sbjct: 133 NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRFFTVYGPW----GRP 188

Query: 355 -----NFVQTMLSLADNGVDPSVVDDQFGRLTFT--------------------PELARA 389
F + ML G V + + FT +
Sbjct: 189 DMALFKFTKAML----EGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHADTQWTVE 244

Query: 390 IRHLTESAAPYGTYNVTGSGPI 411
S APY YN+ S P+
Sbjct: 245 TGTPAASIAPYRVYNIGNSSPV 266


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02913  RTXTOXINA     28     0.045    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 28.0 bits (62), Expect = 0.045
Identities = 11/48 (22%), Positives = 16/48 (33%), Gaps = 4/48 (8%)

Query: 9 TRFTRGITRRTSSVIWCAALLAGLSLVGPVPSSLAGSTAATTVAGVSP 56
T+ + + S I GLS AG A+ +SP
Sbjct: 277 TKVLGNVGKGISQYIIAQRAAQGLSTSAAA----AGLIASAVTLAISP 320


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_02915  PF03544       36     2e-04    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 35.7 bits (82), Expect = 2e-04
Identities = 26/138 (18%), Positives = 34/138 (24%)

Query: 266 FVASVSVAAAMQLAPDPAAPRPTPAAETSAPFPAAAPRAVPAAPLPPAPPPPAEVVIPEV 325
+ VA + + P PA S A A P A PP P PE
Sbjct: 23 CIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEP 82

Query: 326 AVPVVEEAPVAEPPLAPEQAPMAETAPPDPVLPPAPVVAPPVPQERPGLLQRIRDRLSGG 385
+EAPV P+ P + P R S
Sbjct: 83 IPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTA 142

Query: 386 NDESPAPAPVVPPPVPPV 403
+ P V +
Sbjct: 143 TAATSKPVTSVASGPRAL 160


79  NCTC10437_03198  NCTC10437_03203  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_03198  -2      9        0.319887   TetR family transcriptional regulator
NCTC10437_03200  -1      9        -0.616068  *major facilitator superfamily transporter
NCTC10437_03201  1       10       -1.091810  excinuclease ABC subunit B uvrB
NCTC10437_03202  -1      11       -0.956275  Protein of uncharacterised function (DUF402)
NCTC10437_03203  -1      10       -0.933682  Uncharacterised protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_03198  HTHTETR       47     9e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 46.5 bits (110), Expect = 9e-09
Identities = 24/113 (21%), Positives = 46/113 (40%), Gaps = 4/113 (3%)

Query: 1 MPPTLPRGRPSSAAAILDTARALLDEG--RTVSLDSVAQASGLSKPGLMYHFPTKSALMD 58
M + + ILD A L + + SL +A+A+G+++ + +HF KS L
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 59 ALVD-HVVDSQERELSSFLAVPLD-QATARQRLGAYVRWALLSRHCRSDLVML 109
+ + + E EL P D + R+ L + + R + ++
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEII 113


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_03200  TCRTETB       62     2e-12    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 61.8 bits (150), Expect = 2e-12
Identities = 66/378 (17%), Positives = 134/378 (35%), Gaps = 25/378 (6%)

Query: 63 WVTTVYLVGSVMAATTVHSVLMRLGPRWAYLLGLSVFGLGSLGCAVAPSM-EALLAGRTV 121
WV T +++ + + +LG + L G+ + GS+ V S L+ R +
Sbjct: 53 WVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFI 112

Query: 122 QGAAGGLLAGLGYAVINTALPSHLWTKASALVSAMWGVGTLVGPAAGGLFAQYSSWRWAF 181
QGA L V+ +P KA L+ ++ +G VGPA GG+ A Y W +
Sbjct: 113 QGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL 172

Query: 182 GVLVVMTTAMSLLVPLALPGREAATPNVPRERVRIPLAGLLLLGAAALLVSAAGIPHDVR 241
+ ++ + L+ L R + + G++L+ + +
Sbjct: 173 LIPMITIITVPFLMKLLKKEV--------RIKGHFDIKGIILMSVGIVFF----MLFTTS 220

Query: 242 ATAGLVAAGFALVGVFLVVDRKVTAAVLPPSAWGSGPLKWIYLTLGVLMAATMV--DMYV 299
+ + +F+ RKVT + P + P I + G ++ T+ V
Sbjct: 221 YSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPF-MIGVLCGGIIFGTVAGFVSMV 279

Query: 300 PLFGQRLAHLSPVAAG--FLAAGLAIGWTVGEISSASLSRQRLIVRIVAVAPVVMAVGLA 357
P + + LS G + G G I + R+ + ++ + ++V
Sbjct: 280 PYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGP-LYVLNIGVTFLSVSFL 338

Query: 358 IGALTQRDDASPAVVAAWAAGLVVSGAGIGIAWPHLSAWAMGKVDDPAEGPAAAAAINTV 417
+ + + V+ G + + + E A + +N
Sbjct: 339 TASFL---LETTSWFMTIIIVFVLGGLS---FTKTVISTIVSSSLKQQEAGAGMSLLNFT 392

Query: 418 QVISAAFGAALAGVVVNV 435
+S G A+ G ++++
Sbjct: 393 SFLSEGTGIAIVGGLLSI 410


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_03201  FLGMOTORFLIM  30     0.034    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 29.9 bits (67), Expect = 0.034
Identities = 22/91 (24%), Positives = 34/91 (37%), Gaps = 7/91 (7%)

Query: 210 SFTRGSFRVRGDTVEIIPSYEELAVRIEYFGDEIEALYYMHPLTGDVVRKVDSLRVFPAT 269
+ R V +V+ + YEE I A+ M PL G+ V +VD F
Sbjct: 72 AQLRSMVHVHVASVDQLT-YEEFIRSIPTPS--TLAVITMDPLKGNAVLEVDPSITFSII 128

Query: 270 HYVAGPERMAAAI----SSIEKELEERLAEL 296
+ G AA + + IE + E +
Sbjct: 129 DRLFGGTGQAAKVQRDLTDIENSVMEGVIVR 159


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_03203  FLGPRINGFLGI  28     0.020    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 28.0 bits (62), Expect = 0.020
Identities = 16/37 (43%), Positives = 22/37 (59%)

Query: 7 TGAVVAGAAVALSAMAVFPGSASAQPDPQPEVETPAP 43
TG +V GA V +S +AV G+ + Q P+V PAP
Sbjct: 269 TGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPAP 305


80  NCTC10437_03221  NCTC10437_03232  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_03221  -1      8        0.320733   pyruvate, water dikinase
NCTC10437_03222  -1      9        0.089276   regulatory protein ArsR
NCTC10437_03223  0       9        0.467918   glyoxalase/bleomycin resistance
NCTC10437_03224  -1      9        0.460385   aminoglycoside phosphotransferase
NCTC10437_03225  0       10       0.066168   NAD-dependent epimerase/dehydratase
NCTC10437_03226  -1      11       -0.394720  TetR family transcriptional regulator
NCTC10437_03227  0       10       -0.423998  anaerobic dehydrogenase, typically
NCTC10437_03228  0       13       -0.909326  peptidoglycan binding domain-containing protein
NCTC10437_03229  -2      13       -0.662475  Polyketide cyclase / dehydrase and lipid
NCTC10437_03230  -2      13       0.155293   NADH:flavin oxidoreductase/nadh oxidase
NCTC10437_03231  -2      12       0.837845   histidinol-phosphatase
NCTC10437_03232  -2      14       1.196495   protein of uncharacterised function DUF222
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_03221  PHPHTRNFRASE  61     8e-12    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 61.3 bits (149), Expect = 8e-12
Identities = 21/52 (40%), Positives = 30/52 (57%)

Query: 735 IGALVTDIGSSVSHGAVVAREYGLPCVVNTLVATQVLKTGDHVRVDGDRGLV 786
+ TDIG SH A+++R +P VV T T+ ++ GD V VDG G+V
Sbjct: 177 VKGFATDIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIV 228


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_03225  NUCEPIMERASE  55     1e-10    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 55.2 bits (133), Expect = 1e-10
Identities = 31/130 (23%), Positives = 45/130 (34%), Gaps = 17/130 (13%)

Query: 1 MKVLITGGTGFVGAWTAKAAQDAGHEVRFL---------VRSPERLTTSAERIGVDIADH 51
MK L+TG GF+G +K +AGH+V + RL E +
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARL----ELLAQPGFQF 56

Query: 52 VVGDIADGEATAAALD--GCDAVIHCAAM--VSTDPSRADEMLHTNLEGARNVLGGAVRA 107
D+AD E + V V +NL G N+L G
Sbjct: 57 HKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHN 116

Query: 108 GIDPIVHVSS 117
I +++ SS
Sbjct: 117 KIQHLLYASS 126


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_03226  HTHTETR  57     3e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 56.6 bits (136), Expect = 3e-12
Identities = 27/157 (17%), Positives = 56/157 (35%), Gaps = 12/157 (7%)

Query: 20 DAVLDATRTLLVTRGYSATSIDLIAATAKVSRPAIYRRWRSKAHLIHEAA---FPDLGPA 76
+LD L +G S+TS+ IA A V+R AIY ++ K+ L E ++G
Sbjct: 14 QHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGEL 73

Query: 77 PCE---DDIAAEITRLCRGALLMYADPVVREAVPGLLHDLRLEPAM--RRLINDRLEAAA 131
E ++ L + + V E L+ + + + + +
Sbjct: 74 ELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNL 133

Query: 132 ----RRQLARQLADGVESGTVRPAVNADSVMDVVAGA 164
++ + L +E+ + + ++ G
Sbjct: 134 CLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGY 170


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_03227  RTXTOXINA  31     0.029    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 30.7 bits (69), Expect = 0.029
Identities = 35/141 (24%), Positives = 45/141 (31%), Gaps = 33/141 (23%)

Query: 176 TDLLVVMGANPAASQGSLLAAP------DVMGLLGAIRRRGRVIVVDPVKTATADRADEW 229
T L V AA+ SL+ AP V G++ I + + + V + AD EW
Sbjct: 373 TVLASVSSGISAAATTSLVGAPVSALVGAVTGIISGILEASKQAMFEHVASKMADVIAEW 432

Query: 230 LP------ITPGTDAALLLAVSHTLFDENLVNLGTVAPYLDGVDTLRDVVAEWPPERVSE 283
G DA H F E D L E+ ER
Sbjct: 433 EKKHGKNYFENGYDA------RHAAFLE------------DNFKILSQYNKEYSVERS-- 472

Query: 284 ATGIDAERIRALARELAGTPR 304
I + L ELAG R
Sbjct: 473 -VLITQQHWDTLIGELAGVTR 492


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_03232  PF05616  30     0.024    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 30.1 bits (67), Expect = 0.024
Identities = 22/72 (30%), Positives = 30/72 (41%), Gaps = 15/72 (20%)

Query: 225 EAAPGHN-----IPDEEPNTDTDVPQAEADTPEAEPDV-PDANMPQAEAAPADNKDRCPD 278
E +P N P+E P T + PE +PD+ PDAN P + P D
Sbjct: 330 EVSPAENPANNPAPNENPGTRPN--------PEPDPDLNPDAN-PDTDGQPGTRPDSPAV 380

Query: 279 PRRTDTRTAAQR 290
P R + R +R
Sbjct: 381 PDRPNGRHRKER 392


81  NCTC10437_03262  NCTC10437_03271  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_03262  3       13       0.534688   putative permease, DMT superfamily
NCTC10437_03263  1       17       -0.616304  Uncharacterised protein
NCTC10437_03264  -1      15       -2.286478  TetR family transcriptional regulator
NCTC10437_03265  -1      16       -2.570447  short chain dehydrogenase
NCTC10437_03266  -1      17       -2.533415  short-chain dehydrogenase/reductase SDR
NCTC10437_03267  0       16       -2.387064  4-carboxymuconolactone decarboxylase
NCTC10437_03268  0       15       -1.529942  betaine aldehyde dehydrogenase
NCTC10437_03269  -2      13       -0.647433  cytochrome P450
NCTC10437_03270  -2      12       0.673639   transcriptional regulator
NCTC10437_03271  -2      11       0.779634   oxidoreductase, SDR family
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_03262  TCRTETA  29     0.023    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 28.6 bits (64), Expect = 0.023
Identities = 11/38 (28%), Positives = 21/38 (55%)

Query: 185 ILLIGLGLAVLLPVVPFALELLALRRLTAASFGTMMAL 222
+ L +G+ +++PV+P L L A +G ++AL
Sbjct: 14 VALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLAL 51


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_03263  cloacin  28     0.024    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 28.1 bits (62), Expect = 0.024
Identities = 32/103 (31%), Positives = 40/103 (38%), Gaps = 3/103 (2%)

Query: 71 APGAAGAIPGGPAGVAGPGGAAGVIPGGPAGTAGPGGAAGVIPGGPGGAAGPAGATGAIP 130
A +G I GGP G+ G GG A G + GG +G GG+ G
Sbjct: 13 AHSTSGNINGGPTGL-GVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNS 71

Query: 131 GGPAGTAGPGG--AAGVIPGGPGGAAGPAGATGAIPGGPAGTA 171
GG +GT G AA V G P + AG A +A
Sbjct: 72 GGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSA 114


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_03264  HTHTETR  46     2e-08    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 45.8 bits (108), Expect = 2e-08
Identities = 19/101 (18%), Positives = 36/101 (35%), Gaps = 4/101 (3%)

Query: 7 ILTAVVELLENDGYDAVQLREVARRSRTSLATIYKRYANRDELILAALEAWTDENRYAAV 66
IL + L G + L E+A+ + + IY + ++ +L E +
Sbjct: 16 ILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELEL 75

Query: 67 VGQRRAPG---ESLHEALMRLFRTLFEPWERHPAMLAAYFR 104
Q + PG L E L+ + + R ++ F
Sbjct: 76 EYQAKFPGDPLSVLREILIHVLESTVTEERRR-LLMEIIFH 115


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_03265  DHBDHDRGNASE  85     7e-22    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 85.5 bits (211), Expect = 7e-22
Identities = 79/278 (28%), Positives = 121/278 (43%), Gaps = 37/278 (13%)

Query: 4 LDGKVAFITGVARGQGRSHAVRLASDGADIIGIDICADIDSNGYPMASPAELQETVTLVE 63
++GK+AFITG A+G G + A LAS GA I +D +P +L++ V+ ++
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVD------------YNPEKLEKVVSSLK 53

Query: 64 AHGGKMLASIADVRDFSAVKAAVDTGVAHFGRLDMVCANAGIAAMAFAELTDAQDLQMWT 123
A A ADVRD +A+ G +D++ AG+ L + + W
Sbjct: 54 AEARHAEAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPG---LIHSLSDEEWE 110

Query: 124 DVLEVNLVGSFHTAKAAIPHLIAGERGGSIVFTSSTAGLRGFGGKGGGGLGYAASKHGIV 183
VN G F+ +++ +++ R GSIV S G YA+SK V
Sbjct: 111 ATFSVNSTGVFNASRSVSKYMMD-RRSGSIVTVGSNPA----GVPRTSMAAYASSKAAAV 165

Query: 184 GLMRALSNALAPHSIRVNTVHPTAVNTMM---------AVNPAMTAFLEHYPDGGPHLQN 234
+ L LA ++IR N V P + T M + LE + G
Sbjct: 166 MFTKCLGLELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTG------ 219

Query: 235 PMPVG-LLEPEDISAAVAYLVSDAAKYVTGVTFPVDAG 271
+P+ L +P DI+ AV +LVS A ++T VD G
Sbjct: 220 -IPLKKLAKPSDIADAVLFLVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_03266  DHBDHDRGNASE  94     3e-25    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 94.3 bits (234), Expect = 3e-25
Identities = 75/261 (28%), Positives = 119/261 (45%), Gaps = 18/261 (6%)

Query: 2 LVTGAARGMGRSHALRLAEEGADVILIDICESLPDIEYPLASREDLAETARLVSEVGRRA 61
+TGAA+G+G + A LA +GA + +D + E L + + R A
Sbjct: 12 FITGAAQGIGEAVARTLASQGAHIAAVD------------YNPEKLEKVVSSLKAEARHA 59

Query: 62 ISHVVDVRDADALAAAVDDGVSKLGGLDASVANAGVLTAGTWETTTPAQWRTVVDVNLIG 121
+ DVRD+ A+ ++G +D V AGVL G + + +W VN G
Sbjct: 60 EAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTG 119

Query: 122 TWNTCAAALPHLVD-HGGSLVNI-SSAAGIKGTPLHTPYTASKHGVVGMSRALANELAAQ 179
+N + +++D GS+V + S+ AG+ T + Y +SK V ++ L ELA
Sbjct: 120 VFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSM-AAYASSKAAAVMFTKCLGLELAEY 178

Query: 180 SIRVNTVHPTGVATGMRPD---SLHGLIAHTRPDLGPLFLNAMPIMMAEAVDISNAVLFL 236
+IR N V P T M+ +G + L +A+ DI++AVLFL
Sbjct: 179 NIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFL 238

Query: 237 VSDESRYVTGLEFKVDAGVTL 257
VS ++ ++T VD G TL
Sbjct: 239 VSGQAGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_03270  HTHTETR  52     1e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 52.3 bits (125), Expect = 1e-10
Identities = 21/127 (16%), Positives = 46/127 (36%), Gaps = 4/127 (3%)

Query: 4 PRKGKHSDISARLRLIEATAKIMRDEGYASATSRRVAAEAGVKQALVYYYFPTMDDLFVE 63
RK K R +++ ++ +G +S + +A AGV + +Y++F DLF E
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 64 VLRTGAEVSLDRMRDALTDDDPLRALWAINSDSRLTGLNTEFMALANHRKAIRVELKAYA 123
+ + + ++ + + L E R+ + +
Sbjct: 62 IWELSESNIGELELE--YQAKFPGDPLSVLREILIHVL--ESTVTEERRRLLMEIIFHKC 117

Query: 124 ERVRDIE 130
E V ++
Sbjct: 118 EFVGEMA 124


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_03271  DHBDHDRGNASE  108    3e-30    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 108 bits (270), Expect = 3e-30
Identities = 80/280 (28%), Positives = 131/280 (46%), Gaps = 32/280 (11%)

Query: 5 VEGKVAFITGAARGQGRSHAIRLAQEGADIIAVDVCGPISENSQIAPSTPDDLAETADLI 64
+EGK+AFITGAA+G G + A LA +GA I AVD P E + S+ A A+
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVD-YNP--EKLEKVVSSLKAEARHAEAF 62

Query: 65 KGLDRRVVTAEVDVRDYAALAAAVDGGVEQLGRLDIVVANAGIGNGGQTLDKTSEADWDD 124
DVRD AA+ ++G +DI+V AG+ G + S+ +W+
Sbjct: 63 P----------ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPG-LIHSLSDEEWEA 111

Query: 125 MIGVNLSGVWKSVKAAVPHLLSGGRGGSIILTSSVGGLKAYPHTGHYIAAKHGVIGLMRT 184
VN +GV+ + ++ +++ R GSI+ S Y ++K + +
Sbjct: 112 TFSVNSTGVFNASRSVSKYMMDR-RSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKC 170

Query: 185 FAVELGQHSVRVNAVCPTNVNTPM-----FMNDGTMKLFRPDLAEPGPEDLKVAAQFMHV 239
+EL ++++R N V P + T M +G ++ + F
Sbjct: 171 LGLELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGS-----------LETFKTG 219

Query: 240 LPVGWV-EPADISNAVLFLASDEARYITGVPLPVDAGSML 278
+P+ + +P+DI++AVLFL S +A +IT L VD G+ L
Sbjct: 220 IPLKKLAKPSDIADAVLFLVSGQAGHITMHNLCVDGGATL 259


82  NCTC10437_03394  NCTC10437_03402  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_03394  2       8        -1.409324  signal transduction histidine kinase regulating
NCTC10437_03395  1       8        -1.549615  response regulator receiver/uncharacterised
NCTC10437_03396  1       8        -1.583960  chitinase, cellulase
NCTC10437_03397  0       10       0.057277   beta-lactamase
NCTC10437_03398  2       12       0.503090   glutamine amidotransferase
NCTC10437_03399  2       14       0.751203   TIGR01777 family protein
NCTC10437_03400  3       15       0.182687   phosphoribosyltransferase
NCTC10437_03401  1       14       0.556072   transmembrane protein
NCTC10437_03402  0       11       1.115632   DivIVA domain protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_03394  PF06580  40     2e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.8 bits (93), Expect = 2e-05
Identities = 38/223 (17%), Positives = 78/223 (34%), Gaps = 52/223 (23%)

Query: 338 ELAALQGQLSSHKSVTDTLRAQTHEFA-NQLHTISGLVQLGEYEAVRDLVGTLTR-RRAE 395
+L AL+ Q++ H F N L+ I L+ +A R+++ +L+ R
Sbjct: 162 QLMALKAQINPH-------------FMFNALNNIRALILEDPTKA-REMLTSLSELMRYS 207

Query: 396 ISDAVTQHIS-----DPAVAALLIAKTSLAAESGVALHLDPASHLAALEPALATDVITLL 450
+ + + +S + L +A ++PA + P + L+
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQV-PPM------LV 260

Query: 451 GNLIDNAVD--VSVGAPDACVTIGIDDRDG-LTISVLDTGPGVPEHLREAIFARGVTSKS 507
L++N + ++ + + +G +T+ V +TG ++ +E
Sbjct: 261 QTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE----------- 309

Query: 508 DVPGGRGIGLALV--RLVTA---QHGGTIEVSDGPGGGARFLV 545
G GL V RL + + G A L+
Sbjct: 310 ----STGTGLQNVRERLQMLYGTEAQIKLSEKQG-KVNAMVLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_03395  HTHFIS  79     3e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.1 bits (195), Expect = 3e-19
Identities = 35/140 (25%), Positives = 59/140 (42%), Gaps = 11/140 (7%)

Query: 4 VLVVDDDFMVAEIHRRFVDRVNGFRAVGVARTGAEALAAIRELRPQLILLDVYLPDMTGL 63
+LV DDD + + + + R G+ + A I L++ DV +PD
Sbjct: 6 ILVADDDAAIRTVLNQALSRA-GYDVRITS-NAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 64 DVLQRLRSEGDRVDVIMITAARELDTVRGALDGGAADYLIKPFEFPQL---------ETK 114
D+L R++ + V++++A T A + GA DYL KPF+ +L E K
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 115 LQAYATRADALQSAGGVDQS 134
+ D+ V +S
Sbjct: 124 RRPSKLEDDSQDGMPLVGRS 143


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_03396  FLAGELLIN  35     0.006    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 34.6 bits (79), Expect = 0.006
Identities = 32/269 (11%), Positives = 57/269 (21%), Gaps = 3/269 (1%)

Query: 1589 SFGFQATPGGGSATATNFSVNGVQNPTPVLPKVTVADATVAEGNSGTKNIVFTVTLDKAA 1648
+ G + + + L V A + D A
Sbjct: 140 DNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYDTYA 199

Query: 1649 SAPVSVAYTTAGGTATAGSDFTAKSGVVTFAAGVLSQQISIAVAGDSVVEANETFTVTLS 1708
G + V A A +V T + +
Sbjct: 200 VGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTTKSTAGT 259

Query: 1709 NPTGVTIADGSAVGTITNDDVAPVLPKVTVADATVAEGNSGTKNIVFTVTLDKAATAPVS 1768
D V + G T VTL A +
Sbjct: 260 AEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGA 319

Query: 1769 VAYTTANGTAASGSDFTAKSGVVTFAAGVLS--QQISIAVAGDTAVETNETFTVTLSNPT 1826
A ++ + +G TF + ++S A + + TV + T
Sbjct: 320 ANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNAVKGES-KITVNGAEYT 378

Query: 1827 GVTIADGSAVGTITNDDVVTPTPGNSSAA 1855
D + T T + ++
Sbjct: 379 ANAAGDKVTLAGKTMFIDKTASGVSTLIN 407


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_03399  NUCEPIMERASE  43     2e-06    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 42.8 bits (101), Expect = 2e-06
Identities = 32/147 (21%), Positives = 50/147 (34%), Gaps = 25/147 (17%)

Query: 154 IALTGASGLVGSALSAFLTTGGHRVI-------------KLVRNAAGAD--------DER 192
+TGA+G +G +S L GH+V+ K R A D
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLA 62

Query: 193 QWDPMAPAPDLLEGVDAVVHLAGASIAGRFTAEHRAAIRDSRIEPTRRLAAVAAATENGP 252
+ M + V A R++ E+ A DS + L + N
Sbjct: 63 DREGMTDLFAS-GHFERVFISPHRL-AVRYSLENPHAYADSNLTGF--LNILEGCRHNKI 118

Query: 253 RTFVSASAVGFYGFDRGDTQLTEDSTR 279
+ + AS+ YG +R T+DS
Sbjct: 119 QHLLYASSSSVYGLNRKMPFSTDDSVD 145


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_03402  RTXTOXIND  34     4e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 34.4 bits (79), Expect = 4e-04
Identities = 15/93 (16%), Positives = 33/93 (35%), Gaps = 10/93 (10%)

Query: 136 ESEKMLADARAQADQLVTEARQTAETTVTEARQRADAMLADAQSRSETQLRQAQEKADAL 195
E E +A + ++ Q E+ + A++ + ++ +LRQ + L
Sbjct: 256 EQENKYVEAVNELRVYKSQLEQI-ESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLL 314

Query: 196 Q-----ADAERKHSEIM----GTINQQRTVLEG 219
+ ++ S I + Q + EG
Sbjct: 315 TLELAKNEERQQASVIRAPVSVKVQQLKVHTEG 347



Score = 31.0 bits (70), Expect = 0.006
Identities = 24/150 (16%), Positives = 57/150 (38%), Gaps = 14/150 (9%)

Query: 113 LRAAKILSLAQDTADRLTGSAKSESEKMLADARAQADQLVTEARQTAETTVTEARQRADA 172
R+ ++ L + E++L +Q T Q + + ++RA+
Sbjct: 157 SRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAER 216

Query: 173 MLADAQ-SRSETQLRQAQEKADALQA-------------DAERKHSEIMGTINQQRTVLE 218
+ A+ +R E R + + D + + E K+ E + + ++ LE
Sbjct: 217 LTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLE 276

Query: 219 GRLEQLRTFEREYRTRLKTYLESQLEELGQ 248
++ + + EY+ + + L++L Q
Sbjct: 277 QIESEILSAKEEYQLVTQLFKNEILDKLRQ 306


83  NCTC10437_03837  NCTC10437_03846  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_03837  -1      10       -0.018720  carbamoyl-phosphate synthase ATP-binding subunit
NCTC10437_03838  0       9        -0.418452  acetyl-CoA carboxylase, carboxyltransferase
NCTC10437_03839  -1      12       0.147830   transcriptional regulator
NCTC10437_03840  1       10       0.435923   Uncharacterised protein
NCTC10437_03841  -1      10       0.697179   Uncharacterised protein
NCTC10437_03842  -1      9        0.628947   peptidase S9 prolyl oligopeptidase
NCTC10437_03843  -1      9        0.072437   mycobacterium membrane protein
NCTC10437_03844  0       9        -0.370924  major facilitator transporter
NCTC10437_03845  0       9        -0.946368  short-chain dehydrogenase of uncharacterised
NCTC10437_03846  0       11       -1.353845  ATPase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_03837  RTXTOXIND  37     3e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 36.7 bits (85), Expect = 3e-04
Identities = 11/53 (20%), Positives = 26/53 (49%), Gaps = 3/53 (5%)

Query: 606 DGATVTAGTVVVTVEAMKMEHALSAPVDGVVELLVGAGEQVKVGQLLARITAT 658
+ G + + + +++ ++ V E++V GE V+ G +L ++TA
Sbjct: 81 EIVATANGKLTHSGRSKEIKPIENSIVK---EIIVKEGESVRKGDVLLKLTAL 130


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_03839  HTHTETR  62     4e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 62.0 bits (150), Expect = 4e-14
Identities = 26/141 (18%), Positives = 52/141 (36%), Gaps = 2/141 (1%)

Query: 8 SRRSKAKSDRRSQLAAAAERLIAEHGYLAVRLEDIGAAAGVSGPAIYRHFPNKEALLVEL 67
+ + + R + A RL ++ G + L +I AAGV+ AIY HF +K L E+
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 68 LVGVSTRLLAGAEDVAGRAADADSA-LEGLIDFHLDFAFAEADLIRIQDRDLGNLPPAAK 126
+ + + + + L ++ L+ E + + +
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGE 122

Query: 127 RQ-VRRKQRQYVEIWVDVLRR 146
V++ QR D + +
Sbjct: 123 MAVVQQAQRNLCLESYDRIEQ 143


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_03840  PilS_PF08805  24     0.027    PilS N terminal
		>PilS_PF08805#PilS N terminal

Length = 185

Score = 24.1 bits (52), Expect = 0.027
Identities = 4/23 (17%), Positives = 13/23 (56%)

Query: 17 MSILTYIGVVLLVIGAVFWILGS 39
M +L +GV++++ + + +
Sbjct: 31 MEVLLVVGVIVVLAASAYKLYSM 53


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_03843  PF04335  29     0.012    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 29.4 bits (66), Expect = 0.012
Identities = 9/27 (33%), Positives = 16/27 (59%)

Query: 98 KSPRWLWVVAGVSVVVIIGLVIALVIV 124
+S + WVVAGV+ + V+A+ +
Sbjct: 30 RSKKLAWVVAGVAGALATAGVVAVAAL 56


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_03845  DHBDHDRGNASE  77     1e-18    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 77.0 bits (189), Expect = 1e-18
Identities = 53/182 (29%), Positives = 83/182 (45%), Gaps = 3/182 (1%)

Query: 21 AVVTGASQNIGEALAIELARRGHNLIITARREEVLEALANRLREQYGVTVEVRAVDLADP 80
A +TGA+Q IGEA+A LA +G ++ E LE + + L+ + E D+ D
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE-ARHAEAFPADVRDS 69

Query: 81 AARATFCDELAGR--DISILCANAGTATFGPVAALDPAEERKQVQLNVLGVHDLVLAVLP 138
AA + I IL AG G + +L E +N GV + +V
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSK 129

Query: 139 GMVRRRAGGILISGSAAGNSPIPNNATYAATKAFANTFSESLRGEVKSAGVHVTVLAPGP 198
M+ RR+G I+ GS P + A YA++KA A F++ L E+ + +++PG
Sbjct: 130 YMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGS 189

Query: 199 VR 200

Sbjct: 190 TE 191


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_03846  TONBPROTEIN  32     0.005    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 31.9 bits (72), Expect = 0.005
Identities = 13/47 (27%), Positives = 16/47 (34%)

Query: 444 AYERLAARMAPPPPVVPAPAPQTFPPFDIPPPFDVPPMPAPVEHVEP 490
+ + PP V P P P P PP APV +P
Sbjct: 46 SVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKP 92


84  NCTC10437_03877  NCTC10437_03884  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_03877  -1      10       -1.312082  TetR family transcriptional regulator
NCTC10437_03878  0       10       -1.438061  cytochrome P450
NCTC10437_03879  0       12       -1.072633  putative secreted protein
NCTC10437_03880  0       12       -1.358291  putative secreted protein
NCTC10437_03881  1       12       -1.971722  putative secreted protein
NCTC10437_03882  2       12       -1.878827  virulence factor Mce family protein
NCTC10437_03883  1       11       -2.254846  virulence factor Mce family protein
NCTC10437_03884  0       10       -2.325638  virulence factor Mce family protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_03877  HTHTETR  56     6e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 56.2 bits (135), Expect = 6e-12
Identities = 26/139 (18%), Positives = 52/139 (37%), Gaps = 8/139 (5%)

Query: 1 MASPRRIGAPDAKNRVVLLDAAEQLLIEEGYAAVTSRRVADRAGLKPQLVHYYFRTMEDL 60
MA + A + + +LD A +L ++G ++ + +A AG+ ++++F+ DL
Sbjct: 1 MARKTKQEAQETRQH--ILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDL 58

Query: 61 FLAVFHRRAEE-GLAVLATALQSPQPLWALWRFSTAPEATRLTMEFMGLANHRKALRAEI 119
F ++ G L + P ++ R E +E R+ L I
Sbjct: 59 FSEIWELSESNIGELELEYQAKFPGDPLSVLR-----EILIHVLESTVTEERRRLLMEII 113

Query: 120 VYYAERFRQEQNRAIADAL 138
+ E + A
Sbjct: 114 FHKCEFVGEMAVVQQAQRN 132


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_03879  56KDTSANTIGN  28     0.019    Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.

Length = 533

Score = 27.6 bits (61), Expect = 0.019
Identities = 24/88 (27%), Positives = 33/88 (37%), Gaps = 6/88 (6%)

Query: 29 AAAQPPLRHIQYTVGASQDIAN-AEIYWRQIDPPDWGAYSHNPY-----EFTPNVEANLG 82
AAQPPL + + N A I + DP + G NP + PN
Sbjct: 164 QAAQPPLNDQKRAAARIAWLKNCAGIDYMVKDPNNPGHMMVNPVLLNIPQGNPNPVGQPP 223

Query: 83 PNQAWVHETWLADPDQWAMVVVGLPAQS 110
+ + +QW +VVGL A S
Sbjct: 224 QRANQPANFAIHNHEQWRSLVVGLAALS 251


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_03882  PERTACTIN  29     0.044    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 29.3 bits (65), Expect = 0.044
Identities = 23/78 (29%), Positives = 30/78 (38%)

Query: 454 PPFPVGAPAPMVPGAPAPAQPPILPPAPLPPGQGPLMPGQAVPVAPSAFDGNESGGPSVA 513
PP P AP P P P QPP P P PP P P P+ + + + +V
Sbjct: 568 PPAPKPAPQPGPQPGPQPPQPPQPPQPPQPPQPPQRQPEAPAPQPPAGRELSAAANAAVN 627

Query: 514 TAQYDPQTGQYVAPDGAL 531
T + + A AL
Sbjct: 628 TGGVGLASTLWYAESNAL 645


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_03883  PF03544  34     8e-04    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 34.2 bits (78), Expect = 8e-04
Identities = 20/76 (26%), Positives = 29/76 (38%), Gaps = 1/76 (1%)

Query: 368 PPPAEAIPAPPEGAPPL-PAAQPLLPVVPQPPAAPWLPQSQPITGDQIFAGPYGAPAPAP 426
P E IP PP+ AP + +P P+P P+ + A P+ APA
Sbjct: 77 EPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPAR 136

Query: 427 VPPAAPAAAPAVPTPG 442
+ AA + P
Sbjct: 137 PTSSTATAATSKPVTS 152



Score = 30.3 bits (68), Expect = 0.014
Identities = 15/81 (18%), Positives = 19/81 (23%), Gaps = 3/81 (3%)

Query: 365 GVAPPPAEAIPAPPEGAPPLPAAQPLLPVVPQPPAAPWLPQSQPITGDQIFAGPYGAPAP 424
PP P P P P + V+ +P P P
Sbjct: 66 VQPPPEPVVEPEPEPEPIPEPPKEA-PVVIEKPKPKP--KPKPKPVKKVEQPKRDVKPVE 122

Query: 425 APVPPAAPAAAPAVPTPGGGG 445
+ APA PT
Sbjct: 123 SRPASPFENTAPARPTSSTAT 143



Score = 28.8 bits (64), Expect = 0.037
Identities = 15/81 (18%), Positives = 20/81 (24%), Gaps = 15/81 (18%)

Query: 364 IGVAPPPAEAIPAPPEG-APPLPAAQPLLPVV----PQPPAAPWLPQSQPITGDQIFAGP 418
I V + P PP P +P P A + + +P
Sbjct: 50 ISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKP-------- 101

Query: 419 YGAPAPAPVPPAAPAAAPAVP 439
P P PV P
Sbjct: 102 --KPKPKPVKKVEQPKRDVKP 120


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_03884  PF03544  35     5e-04    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 35.0 bits (80), Expect = 5e-04
Identities = 22/101 (21%), Positives = 29/101 (28%), Gaps = 5/101 (4%)

Query: 399 PETPPTVSAYVGSGDVPPPPGWQAPP----GPPGIYQPDPDAPATPSPALFPGAPIPGPP 454
P P +V V D+ PP Q PP P +P P+ P + P P P
Sbjct: 46 PAQPISV-TMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPK 104

Query: 455 NILSNIPAQPQPQSVDGLLLPPQGPAGPPTGTPGGPPPSAP 495
QP+ P P +A
Sbjct: 105 PKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAA 145


85  NCTC10437_04149  NCTC10437_04156  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias   Product
NCTC10437_04149  -1      7        0.186002  sulfate adenylyltransferase, large subunit
NCTC10437_04150  0       8        1.167127  sulfate adenylyltransferase, small subunit
NCTC10437_04151  0       8        1.483649  carbonic anhydrase
NCTC10437_04152  1       9        1.990689  glycosyl transferase family protein
NCTC10437_04153  0       10       1.401263  glycosyl transferase family protein
NCTC10437_04154  1       11       1.224011  glycosyl transferase family protein
NCTC10437_04155  0       11       1.097451  signal transduction histidine kinase
NCTC10437_04156  -1      9        0.101319  two component transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_04149  TCRTETOQM  61     7e-12    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 61.0 bits (148), Expect = 7e-12
Identities = 43/143 (30%), Positives = 67/143 (46%), Gaps = 14/143 (9%)

Query: 4 LLRIATAGSVDDGKSTLIGRLLFDSKAVMEDQLAAVERTSKERGHDYTDLALVTDGLRAE 63
++ I VD GK+TL LL++S A+ +L +V++ + TD E
Sbjct: 3 IINIGVLAHVDAGKTTLTESLLYNSGAI--TELGSVDKGTT-----------RTDNTLLE 49

Query: 64 REQGITIDVAYRYFATAKRKFIIADTPGHIQYTRNMVTGTSTAQLAIVLVDARHGLLEQS 123
R++GITI F K I DTPGH+ + + S AI+L+ A+ G+ Q+
Sbjct: 50 RQRGITIQTGITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQT 109

Query: 124 RRHAFLASLLGIQHIVLAVNKMD 146
R +GI I +NK+D
Sbjct: 110 RILFHALRKMGIPTIFF-INKID 131


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04152  cloacin  33     0.004    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 32.8 bits (74), Expect = 0.004
Identities = 32/100 (32%), Positives = 42/100 (42%), Gaps = 16/100 (16%)

Query: 249 WPADSRPYIGGSTDNSLLQLALGYNGLQRMTGGGGMPGGGFGGPGGPDGPGGPGGAAAPA 308
W +++ P+ GGS +G+ G G GGG G GG G GG A A
Sbjct: 39 WSSENNPWGGGSG-----------SGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAAP 87

Query: 309 GAHLFFG-GDPGITRLFGASMGAEASWLLPAALIGLVAGL 347
A F PG L S+ A A L AA+ ++A L
Sbjct: 88 VAFGFPALSTPGAGGL-AVSISAGA---LSAAIADIMAAL 123


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04155  PF06580  35     7e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.8 bits (80), Expect = 7e-04
Identities = 18/84 (21%), Positives = 29/84 (34%), Gaps = 22/84 (26%)

Query: 428 AVGDTGDVVLTVLDDGPGIPSWLQPEVFERFARGDSSRARRGGSSTGLGLAIVAAVVRAH 487
D G V L V + G + STG GL V ++
Sbjct: 285 GTKDNGTVTLEVENTGSLA-------------------LKNTKESTGTGLQNVRERLQML 325

Query: 488 HG---TIEVHSRPGRTEFVVRLPG 508
+G I++ + G+ +V +PG
Sbjct: 326 YGTEAQIKLSEKQGKVNAMVLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_04156  HTHFIS  101    7e-27    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 101 bits (254), Expect = 7e-27
Identities = 32/119 (26%), Positives = 64/119 (53%)

Query: 37 IHVLVVDDEAVLAELVSMALRYEGWEISTAGDGATAIALAKQTPPDVVVLDVMLPDMSGL 96
+LV DD+A + +++ AL G+++ + AT D+VV DV++PD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 97 EVLRKLREQIPGLPLLLLTAKDSVEDRIAGLTAGGDDYVTKPFSIEEVVLRLRALLRRT 155
++L ++++ P LP+L+++A+++ I G DY+ KPF + E++ + L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122


86  NCTC10437_04203  NCTC10437_04208  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_04203  0       13       -0.014814  EmrB/QacA family drug resistance transporter
NCTC10437_04204  0       13       -0.008430  Bcr/CflA subfamily drug resistance transporter
NCTC10437_04205  1       13       -0.420982  membrane protein
NCTC10437_04206  0       13       -0.892063  alpha-ketoglutarate decarboxylase
NCTC10437_04207  0       11       -0.183408  Protein of uncharacterised function (DUF732)
NCTC10437_04208  0       11       -0.343559  short-chain alcohol dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04203  TCRTETB  157    5e-44    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 157 bits (398), Expect = 5e-44
Identities = 83/412 (20%), Positives = 179/412 (43%), Gaps = 20/412 (4%)

Query: 18 LWALLIGFFMILVDVTIVSVANPAIMASLGADYDAVIWVTSAYLLAYAVPLLIAGRLGDR 77
+W ++ FF +L + +++V+ P I + WV +A++L +++ + G+L D+
Sbjct: 17 IWLCILSFFSVL-NEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 78 YGPKNLYILGLAVFTAASLGCGLSDT-IGMLVAARVVQGVGAALLTPQTMTVITRTFPPQ 136
G K L + G+ + S+ + + +L+ AR +QG GAA M V+ R P +
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 137 RRGVAMSVWGATAGVATLVGPLAGGLLVDHLGWQWIFIVNVPIGVVGIAMAIWLVPSLPT 196
RG A + G+ + VGP GG++ ++ W ++ ++ + I ++ + + L+
Sbjct: 136 NRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPM-ITIITVPFLMKLLKK-EV 193

Query: 197 QQNQRLDLPGVVLSGVAMFLIVFGLQEGQSHDWALWIWATIVVGAAVMGAFLYWQSVNTG 256
+ D+ G++L V + + + + ++V F+
Sbjct: 194 RIKGHFDIKGIILMSVGIVFFMLFTT--------SYSISFLIVSVLSFLIFVKHIR-KVT 244

Query: 257 QSLIPLIIFRDRNFSMSNL--GIATMGFVATSMILPLMFYAQSVCGLSPTQA-ALLTAPM 313
+ + ++ F + L GI ++P M + V LS + +++ P
Sbjct: 245 DPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMM--KDVHQLSTAEIGSVIIFPG 302

Query: 314 AVATGVLAPFVGRLVDRSHPSAVVGFGFSLMAIGLTWLSIEMTPATPIWRLVLPLTAMGV 373
++ + G LVDR P V+ G + +++ +L+ T W + + + +
Sbjct: 303 TMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVS--FLTASFLLETTSWFMTIIIVFVLG 360

Query: 374 AMAFIWSPLAATATRNLPPQLAGAGSGVYNTTRQVGSVLGSAGMAALMTSQL 425
++F + ++ + +L Q AGAG + N T + G A + L++ L
Sbjct: 361 GLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPL 412


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04204  TCRTETA  67     2e-14    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 67.2 bits (164), Expect = 2e-14
Identities = 79/323 (24%), Positives = 127/323 (39%), Gaps = 19/323 (5%)

Query: 2 IFVLGLLVALGPLTIDMYLPALPKIAEELHVSSSLAQLTLTGTLAGLAIGQLVIGP---- 57
+ V+ VAL + I + +P LP + +L S+ + LA A+ Q P
Sbjct: 7 LIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGI-LLALYALMQFACAPVLGA 65

Query: 58 LSDSLGRRKPLFAGIVLHMLASLLCLFAPNILVLGIARGLQGMGAAAGMVVAIAVVGDLY 117
LSD GRR L + + + AP + VL I R + G+ A G V A + D+
Sbjct: 66 LSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAG-AYIADIT 124

Query: 118 KDNAAATVMSRLMLVLGVAPVLAPSLGAAVLLHGSWHWVFAALVVLAGALLVMAVFLLPE 177
+ A + G V P LG ++ S H F A L G + FLLPE
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLG-GLMGGFSPHAPFFAAAALNGLNFLTGCFLLPE 183

Query: 178 TLPVGHRRPLKVRGIAA-TYLELVRDVRFVILVLIAALGMSGLFAYIAGASFVLQG--RF 234
+ G RRPL+ + R + V ++ M L + A +V+ G RF
Sbjct: 184 SHK-GERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQ-LVGQVPAALWVIFGEDRF 241

Query: 235 GLDQTAFALVF-GAGAVALIGTTQLNVVLLKRFRPQQIMMWALVAASVFGAVFIAVAAAR 293
D T + G + + + + R ++ +M + A G + +A A
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGM-IADGTGYILLAFA--- 297

Query: 294 IGGLYGFVLPVWAILAAMGLVIP 316
P+ +LA+ G+ +P
Sbjct: 298 --TRGWMAFPIMVLLASGGIGMP 318


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04206  PF03544  35     0.001    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 35.3 bits (81), Expect = 0.001
Identities = 19/102 (18%), Positives = 30/102 (29%), Gaps = 8/102 (7%)

Query: 21 VDYSPEPTNDAPSGGANRAAASGNGKPAKTPTAPPEPAPAPKPANTNGGSAAPAKSDKPS 80
V PEP + P P P+P P PKP + K
Sbjct: 66 VQPPPEPVVE-PEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKP-------VKKVEQPKRD 117

Query: 81 GTSAKTPEKTPAKSPEKAPEKASEKAEAKSAPAPAKSKAATP 122
++ +P ++ A +S A S P + +
Sbjct: 118 VKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRA 159



Score = 35.0 bits (80), Expect = 0.001
Identities = 19/106 (17%), Positives = 27/106 (25%), Gaps = 8/106 (7%)

Query: 45 GKPAKTPTAPPEPAPAPKPANTNGGSAAPAKSDKPSGTSAKTPEKTPAKSPEKAPEKASE 104
+P P PEP P P P A K K K +K + +
Sbjct: 66 VQPPPEPVVEPEPEPEPIPEPP-----KEAPVVIE---KPKPKPKPKPKPVKKVEQPKRD 117

Query: 105 KAEAKSAPAPAKSKAATPAADGDESQILRGAAAAVVKNMSASLDVP 150
+S PA A + V + +L
Sbjct: 118 VKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRN 163



Score = 33.4 bits (76), Expect = 0.005
Identities = 24/123 (19%), Positives = 36/123 (29%), Gaps = 2/123 (1%)

Query: 55 PEPAPAPKPANTNGGSAAPAKSDKPSGTSAKTPEKTPAKSPEKAPEKASEKAEAKSAPAP 114
P PA P ++ +P PE P PE E + K P P
Sbjct: 44 PAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKP 103

Query: 115 A-KSKAATPAADGDESQILRGAAAAVVKNMSASLDVPTATSVRAIPAKAM-IDNRIVINN 172
K D + A+ A TAT+ + P ++ R + N
Sbjct: 104 KPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRN 163

Query: 173 HLK 175
+
Sbjct: 164 QPQ 166



Score = 30.7 bits (69), Expect = 0.028
Identities = 20/109 (18%), Positives = 33/109 (30%), Gaps = 12/109 (11%)

Query: 31 APSGGANRAAASGNGKPAKTPTAPPEPAP-APKPANTNGGSAAPAKSDKPSGTSAKTPEK 89
AP+ A +P P PEP P PK A P K K
Sbjct: 56 APADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKP-----------KPKPK 104

Query: 90 TPAKSPEKAPEKASEKAEAKSAPAPAKSKAATPAADGDESQILRGAAAA 138
+ P++ + E++ A + A P + + + +
Sbjct: 105 PKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSV 153


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04208  DHBDHDRGNASE  79     3e-19    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 78.6 bits (193), Expect = 3e-19
Identities = 59/207 (28%), Positives = 95/207 (45%), Gaps = 4/207 (1%)

Query: 2 EGFAGKVAVVTGAGSGIGQALAIELGRSGAHVAISDVDTEGLAVTEERLKAIGAQVKADR 61
+G GK+A +TGA GIG+A+A L GAH+A D + E L LKA +A
Sbjct: 4 KGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 62 LNVTEREAFLLYADNVAEHFGKVNQIYNNAGIAFTGDVEITQFKDIERVMDVDFWGVVNG 121
+V + A + G ++ + N AG+ G + ++ E V+ GV N
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 122 TKAFLPHLIASGDGHVVNVSSVFGLFSVPGQAAYNSAKFAVRGFTEALRQEMGLAGHPVK 181
+++ +++ G +V V S AAY S+K A FT+ L E LA + ++
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLE--LAEYNIR 181

Query: 182 VSCVHPGGIKTAIARNAEAAEGIDAEE 208
+ V PG +T + + A E + E
Sbjct: 182 CNIVSPGSTETDMQWSLWADE--NGAE 206


87  NCTC10437_04215  NCTC10437_04228  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_04215   0      10        2.448355  Uncharacterised protein
NCTC10437_04216  -1      9         2.296541  citryl-CoA lyase
NCTC10437_04217   0      8         1.901932  MgtE integral membrane protein
NCTC10437_04218   0      9         1.955051  integral membrane protein
NCTC10437_04219   0      10        1.790594  membrane-bound lytic murein transglycosylase B
NCTC10437_04220   0      12        1.002768  ATPase involved in chromosome partitioning
NCTC10437_04221  -1      10        0.238832  twin arginine-targeting protein translocase
NCTC10437_04222   0      11       -0.223079  peptidase S1 and S6, chymotrypsin/Hap
NCTC10437_04223  -3      11        0.321457  RNA polymerase sigma-70 factor
NCTC10437_04224  -2      10        0.075278  RNA polymerase sigma factor SigE
NCTC10437_04225  -2      11        0.452448  O-methyltransferase family protein
NCTC10437_04226  -1      11       -0.003658  transcriptional regulator
NCTC10437_04227   0      10       -0.088956  ABC transporter--like protein
NCTC10437_04228   0      10        0.020269  ABC transporter
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_04215  TONBPROTEIN  28     0.024    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 28.4 bits (63), Expect = 0.024
Identities = 21/108 (19%), Positives = 25/108 (23%), Gaps = 1/108 (0%)

Query: 33 IEQPQDHSADSTEVFSPEAPAQPQNYTPPPAYTPPPAYPPPTPGYQPSGYSAPGYPPSGD 92
IE P S + +P PQ PPP P P P P
Sbjct: 36 IELPAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPK 95

Query: 93 FATPPYPPPPAGYPPAYPDPGQQPPYPAP-GYGGPAYPPPGPYGTAPG 139
P P P + +P PA A
Sbjct: 96 PKPKPKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATAATS 143


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04219  PF05616  35     6e-04    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 34.7 bits (79), Expect = 6e-04
Identities = 27/84 (32%), Positives = 32/84 (38%), Gaps = 12/84 (14%)

Query: 339 PGFVPGQTLGPLPGPMPAAQPAPQPAAPPPPAWVPPWMQPPPQQQPRCAVFCIQDVAPAA 398
P PG P P+P PA PA P P P +P P+ P D+ P A
Sbjct: 313 PDLTPGSAEAPNAQPLPEVSPAENPANNPAPN-ENPGTRPNPEPDP--------DLNPDA 363

Query: 399 PAPVPAPAGPLGLVPPAPAAPPLP 422
P G G P +PA P P
Sbjct: 364 N---PDTDGQPGTRPDSPAVPDRP 384


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_04221  TATBPROTEIN  82     9e-23    Bacterial sec-independent translocation TatB protein...
		>TATBPROTEIN#Bacterial sec-independent translocation TatB protein signature.

Length = 171

Score = 82.0 bits (202), Expect = 9e-23
Identities = 30/79 (37%), Positives = 48/79 (60%), Gaps = 3/79 (3%)

Query: 1 MFANVGWGEMLVLVIAGLVILGPERLPGAIRWTAGAVRQARDYVTGATSQLREDLG-PEF 59
MF ++G+ E+L++ I GLV+LGP+RLP A++ AG +R R T ++L ++L EF
Sbjct: 1 MF-DIGFSELLLVFIIGLVVLGPQRLPVAVKTVAGWIRALRSLATTVQNELTQELKLQEF 59

Query: 60 DDLREPLSELQKLRGMTPR 78
D + + E L +TP
Sbjct: 60 QDSLKKV-EKASLTNLTPE 77


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits        Score  E-value  Comments
NCTC10437_04222  V8PROTEASE  74     1e-16    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 74.3 bits (182), Expect = 1e-16
Identities = 42/181 (23%), Positives = 67/181 (37%), Gaps = 26/181 (14%)

Query: 205 DSVVTIEATSETEGSQGSGVVVDGRGYIVTNNHVISEAATNPSEFKMSVVFNDGTEVP-- 262
V I+ + T SGVVV G+ ++TN HV+ +P K + P
Sbjct: 88 APVTYIQVEAPTGTFIASGVVV-GKDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNG 146

Query: 263 ----ANLVGRDPKTDLAVLKVDNV-------DNLSVARMGDSEKLRVGEEVIAAGAPLGL 311
+ + DLA++K + + A M ++ + +V + + G P
Sbjct: 147 GFTAEQITKYSGEGDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDK 206

Query: 312 RSTVTHGIISALHRPVPLSGDGSDTDTVIDGVQTDASINHGNSGGPLINMNSEVIGINTA 371
G T + +Q D S GNSG P+ N +EVIGI+
Sbjct: 207 PVATMW------------ESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHWG 254

Query: 372 G 372
G
Sbjct: 255 G 255


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04226  HTHTETR  55     2e-11    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 55.0 bits (132), Expect = 2e-11
Identities = 28/149 (18%), Positives = 47/149 (31%), Gaps = 17/149 (11%)

Query: 17 ARIRDAAIDQFGRAGF-AVGLRAIATAAGVSPGLVIHHFGSKDGLRAACDDYIADTVLAA 75
I D A+ F + G + L IA AAGV+ G + HF K L + + +
Sbjct: 14 QHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGEL 73

Query: 76 KTESIQTSDPATW----------FAQLAEIEEYAPMMGYLVRSLQSGGELAK--RLWRT- 122
+ E E +M + + GE+A + R
Sbjct: 74 ELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNL 133

Query: 123 ---MIDNTERYVEEGVRAGTIKPSADPRA 148
D E+ ++ + A + R
Sbjct: 134 CLESYDRIEQTLKHCIEAKMLPADLMTRR 162


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04228  ACRIFLAVINRP  29     0.041    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 29.4 bits (66), Expect = 0.041
Identities = 22/75 (29%), Positives = 33/75 (44%), Gaps = 2/75 (2%)

Query: 385 EQSGLG--EAVLGGAVSRWRWLLTAVAAAVLGGAVLMFFAGLGNGLGAGITVGDPPTVLR 442
E+ G G EA L R R +L A +LG L G G+G + +G ++
Sbjct: 953 EKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVS 1012

Query: 443 LTVAGLAFVPALAVL 457
T+ + FVP V+
Sbjct: 1013 ATLLAIFFVPVFFVV 1027


88  NCTC10437_04303  NCTC10437_04318  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_04303  -1      10       -0.750002  TetR family transcriptional regulator
NCTC10437_04304  -1      10       -0.555605  MMPL domain-containing protein
NCTC10437_04305   1      11       -0.706587  DoxX family protein
NCTC10437_04306  -1      11       -0.245390  Uncharacterised protein
NCTC10437_04307  -1      10        0.005840  2-nitropropane dioxygenase-like enzyme
NCTC10437_04308  -2      9         0.217434  dehydrogenase of uncharacterised specificity,
NCTC10437_04309  -2      10        0.328246  L-carnitine dehydratase/bile acid-inducible
NCTC10437_04310  -2      8         0.357656  enoyl-CoA hydratase
NCTC10437_04311  -1      9         0.256603  H+ antiporter protein
NCTC10437_04312  -2      9         0.420519  abortive infection protein
NCTC10437_04313  -2      9        -0.004708  nucleoside-diphosphate sugar epimerase
NCTC10437_04314  -1      12        0.266406  carbonic anhydrase
NCTC10437_04315  -1      12        0.137887  short-chain alcohol dehydrogenase
NCTC10437_04316  -1      10        0.349309  TetR family transcriptional regulator
NCTC10437_04317   0      11        0.513625  beta-Ig-H3/fasciclin
NCTC10437_04318  -1      11       -0.644332  acyl-CoA dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04303  HTHTETR  63     2e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 63.1 bits (153), Expect = 2e-14
Identities = 32/168 (19%), Positives = 57/168 (33%), Gaps = 14/168 (8%)

Query: 1 MPRRRRAPRGSGDQLRDDILDATTELLLETGDAAAVSIRSVAQRVGVTPPSIYLHFADKT 60
M R+ + + R ILD L + G ++ S+ +A+ GVT +IY HF DK+
Sbjct: 1 MARKTKQ---EAQETRQHILDVALRLFSQQG-VSSTSLGEIAKAAGVTRGAIYWHFKDKS 56

Query: 61 ALLDEVCCRYFEKLDDEMQAVAGAYTGA-IDVLRAQGMAYLRFALKTPVLYRIAMM---- 115
L E+ + + + G + VLR + L + + +
Sbjct: 57 DLFSEIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHK 116

Query: 116 ----GHGQPGSDVDVALNSSAFGHLRATVQALIAEGTYPPG-DPTTLA 158
G L ++ + T++ I P A
Sbjct: 117 CEFVGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAA 164


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04304  ACRIFLAVINRP  49     6e-08    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 49.1 bits (117), Expect = 6e-08
Identities = 36/226 (15%), Positives = 87/226 (38%), Gaps = 17/226 (7%)

Query: 184 IAIPLSFVVLVWVFGGLLAAALPLAVGAFAILGSMAVLR--GFTMVTDVSIFALNLSIAM 241
AI L F+V+ + A +P +LG+ A+L G+++ T +++F + L+I +
Sbjct: 346 EAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINT-LTMFGMVLAIGL 404

Query: 242 GLALAIDYTLLIISRYRDELAAGH-DRNTALVRTMATAGRTVLFSALTVALSMIAMVLFP 300
+D ++++ + A ++M+ ++ A+ ++ I M F
Sbjct: 405 L----VDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFG 460

Query: 301 MY---FLKSFAYAGIAVVGFAALAAIVVAPA--AIVLAGDRLDSLDVRKLVRRMLGRPEP 355
+ F+ ++ + + L A+++ PA A +L + S + + G
Sbjct: 461 GSTGAIYRQFSITIVSAMALSVLVALILTPALCATLL---KPVSAEHHENKGGFFGWFNT 517

Query: 356 VDKPVEQMFWYRSTKFVMRRSIPIATVVVTLLLLLGAPFLGANWGF 401
+ S ++ + + ++ + FL F
Sbjct: 518 TFDHSVN-HYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSF 562



Score = 40.2 bits (94), Expect = 4e-05
Identities = 43/227 (18%), Positives = 82/227 (36%), Gaps = 15/227 (6%)

Query: 112 PATPALVSKDGKTGL-IVAGITGGESGAQKHAKALTDELVHDR-DGVTVRAGGVAMTYVQ 169
+P L +G + I G S A AL + L G+ G M+Y +
Sbjct: 810 YGSPRLERYNGLPSMEIQGEAAPGTSSGD--AMALMENLASKLPAGIGYDWTG--MSYQE 865

Query: 170 INTQSEKDLLMMEAIAIPLSFVVLVWVFGGLLAAALPLAVGAFAILGSMAVLRGFTMVTD 229
S + AI+ + F+ L ++ + V I+G +L
Sbjct: 866 R--LSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVG--VLLAATLFNQK 921

Query: 230 VSIFALNLSIAMGLALAIDYTLLIISRYRDELAA-GHDRNTALVRTMATAGRTVLFSALT 288
++ + + + + L+ +LI+ +D + G A + + R +L ++L
Sbjct: 922 NDVYFM-VGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLA 980

Query: 289 VALSMIAMVLFPMYFLKSFAYAGIAVVG---FAALAAIVVAPAAIVL 332
L ++ + + + GI V+G A L AI P V+
Sbjct: 981 FILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVV 1027


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04307  AUTOINDCRSYN  31     0.005    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 31.0 bits (70), Expect = 0.005
Identities = 24/148 (16%), Positives = 46/148 (31%), Gaps = 30/148 (20%)

Query: 126 SVRHAVKAQSLGVDGISIDGFE-----------CAGHPGEDDVPGLVLIPAAADKIEIPM 174
++R L DG E G + L I P
Sbjct: 22 TLRKETFKDRLNWAVQCTDGMEFDQYDNNNTTYLFGIKDNTVICSLRFIETKY-----PN 76

Query: 175 IASGGFADSRGLVAALALGADGINMGSRFMCTVESCIHQNVKEAIVAGDERGTELIFRSL 234
+ +G F + + SRF + ++ + I+ + + ++F S+
Sbjct: 77 MITGTFFP---YFKEINIPEGNYLESSRF------FVDKSRAKDILGNEYPISSMLFLSM 127

Query: 235 HNTARVAS-----NVVSREVVDILKAGG 257
N ++ +VS ++ ILK G
Sbjct: 128 INYSKDKGYDGIYTIVSHPMLTILKRSG 155


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04308  DHBDHDRGNASE  72     3e-17    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 72.4 bits (177), Expect = 3e-17
Identities = 53/202 (26%), Positives = 85/202 (42%), Gaps = 16/202 (7%)

Query: 3 IKDAVAVVTGGASGLGLATTKRLLDRGASVVVID------LKGEDAVKELGARAKFVEAN 56
I+ +A +TG A G+G A + L +GA + +D K ++K A+ A+
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 57 VTDPEQVTAALDA-AEEMGPLRIDVNCAGIGNAIKTLGKDGPFPLDGFRKVVEVNLIGTF 115
V D + EMGP+ I VN AG+ G + + VN G F
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLR----PGLIHSLSDEEWEATFSVNSTGVF 121

Query: 116 NVIRLAAERIAKQEPINGERGVIINTASVAAFEGQIGQAAYSASKGGVVGMTLPIARDLS 175
N R ++ + + G I+ S A + AAY++SK V T + +L+
Sbjct: 122 NASRSVSKYMMDRR-----SGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELA 176

Query: 176 REFIRVCTIAPGLFKTPLLGSL 197
IR ++PG +T + SL
Sbjct: 177 EYNIRCNIVSPGSTETDMQWSL 198


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04312  PHPHTRNFRASE  29     0.025    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 28.6 bits (64), Expect = 0.025
Identities = 19/44 (43%), Positives = 21/44 (47%), Gaps = 12/44 (27%)

Query: 190 NPMLGFAAI--------ILGTVC-ALLRRSTGG---VLAPMLTH 221
NP LGF AI I T ALLR ST G V+ PM+
Sbjct: 353 NPFLGFRAIRLCLEKQDIFRTQLRALLRASTYGNLKVMFPMIAT 396


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04313  NUCEPIMERASE  53     6e-10    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 53.2 bits (128), Expect = 6e-10
Identities = 29/127 (22%), Positives = 46/127 (36%), Gaps = 24/127 (18%)

Query: 5 RILVTGATGYIGSRLVTALLEDGHQVV-----AASRDV----ERLADFGWYDDVTAVTLD 55
+ LVTGA G+IG + LLE GHQVV DV RL +D
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLA-QPGFQFHKID 60

Query: 56 AHDEISAKQALTEAGPIDVVYYLVHGIGQPGFR----------EADNKAAANVASAAKAA 105
D L +G + V+ H + R +++ N+ +
Sbjct: 61 LADR-EGMTDLFASGHFERVFISPH---RLAVRYSLENPHAYADSNLTGFLNILEGCRHN 116

Query: 106 GVRRIVY 112
++ ++Y
Sbjct: 117 KIQHLLY 123


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04315  DHBDHDRGNASE  63     1e-13    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 63.1 bits (153), Expect = 1e-13
Identities = 55/222 (24%), Positives = 88/222 (39%), Gaps = 17/222 (7%)

Query: 17 LTGKTCVITGASAGLGRESARALAATGAHVILAARNADALAETETWVRAEVADARLSLVP 76
+ GK ITGA+ G+G AR LA+ GAH+ N + L + + ++AE A P
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHA--EAFP 63

Query: 77 LDLTSLASVAAAAEQISELTPVVHVLMNNAGV--MFTPFGRTAEGFETQFGTNHLGHFEF 134
D+ A++ +I + +L+N AGV + E +E F N G F
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 135 TRLLFPALVAADGARVVVLSSEGHRMSDVDFDDPNFEHRDYDKFAAYGASKTANVLHAVE 194
+R + ++ +V + S N AAY +SK A V+
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGS-------------NPAGVPRTSMAAYASSKAAAVMFTKC 170

Query: 195 LDRRLRDSGVRAFAVHPGIVATSLARHMTNDDFASLNRSAAS 236
L L + +R V PG T + + D+ + S
Sbjct: 171 LGLELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGS 212


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04316  HTHTETR  52     1e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 52.3 bits (125), Expect = 1e-10
Identities = 33/186 (17%), Positives = 63/186 (33%), Gaps = 19/186 (10%)

Query: 44 RAVLDATVDLLRDQSFDDLTTRQIADRAGVAHSDLCAYFRSKDAIVAEVY---------L 94
+ +LD + L Q + +IA AGV + +F+ K + +E++ L
Sbjct: 14 QHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGEL 73

Query: 95 GLLRDAPLAVDVEESARTRVVALFHQLVMLPADRPGLAAACSSA-MISQDKTVQPIRRRI 153
L A D R ++ + V R + + + VQ +R +
Sbjct: 74 ELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNL 133

Query: 154 QAELHRRV---------RTVLRSDAWPEVAQTLEFGLVGAMVQASSGAATFEGMADELAR 204
E + R+ +L +D A + G + +++ A + E
Sbjct: 134 CLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDLKKEARD 193

Query: 205 VVTTLL 210
V LL
Sbjct: 194 YVAILL 199


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04318  SECFTRNLCASE  29     0.024    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 29.4 bits (66), Expect = 0.024
Identities = 26/118 (22%), Positives = 44/118 (37%), Gaps = 4/118 (3%)

Query: 50 GLIGFNMPEEFGGGGTDDFRFNAIIDEEIARAGVPGPGLSLHNDVVAPYFKDLTTDEQKQ 109
+IG N +F GG T ID + RA + L DV+ +D + E +
Sbjct: 39 LVIGLNFGIDFKGGTTIRTESTTAIDVGVYRAALEPLEL---GDVIISEVRDPSFREDQH 95

Query: 110 RWLPGIASGETIIAVAMTEPGAGSDLAGIRTSAVRDGDDWILNGSKTFISSGINSDLV 167
+ I E A + G +L +A+ D + S + ++ +LV
Sbjct: 96 VAMIRIQMQEDGQG-AEGQGAQGQELVNKVETALTAVDPALKITSFESVGPKVSGELV 152


89  NCTC10437_04334  NCTC10437_04341  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_04334   0      10       -1.308093  virulence factor Mce family protein
NCTC10437_04335  -1      11       -1.267954  virulence factor Mce family protein
NCTC10437_04336   0      11       -0.892687  virulence factor Mce family protein
NCTC10437_04337  -2      8        -0.622136  ABC-type transport system involved in resistance
NCTC10437_04338  -1      9        -0.058887  organic solvent resistance ABC transporter
NCTC10437_04339  -1      9         0.304710  3-demethylubiquinone-9 3-methyltransferase
NCTC10437_04340  -1      10        0.506928  TetR family transcriptional regulator
NCTC10437_04341   0      11        0.049085  alpha/beta hydrolase fold protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_04334  TONBPROTEIN  31     0.009    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 30.7 bits (69), Expect = 0.009
Identities = 16/50 (32%), Positives = 17/50 (34%)

Query: 408 EPLPAPPPGGPPPGPPALAPPGIASVPDPDPGVVSVPAPGEAPPLVPVPG 457
EP A P P P P I P P V+ P P P PV
Sbjct: 56 EPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKK 105


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04336  PF05616  37     2e-04    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 36.6 bits (84), Expect = 2e-04
Identities = 37/118 (31%), Positives = 44/118 (37%), Gaps = 13/118 (11%)

Query: 379 IAFPTYANYFPVTRAVPEPPSIRNNFGGPAIGP-------IPYPGAPAYGAQLYAPDGTP 431
+A T N PVT P + FG + G IP P A+ AP+ P
Sbjct: 270 VAPGTKVNMGPVTDRNGNPVQVVATFGRDSQGNTTVDVQVIPRPDLTPGSAE--APNAQP 327

Query: 432 L---WPGLPPAPPPGAPRDPGPRPGSEPFIVPNPAQLQPTAVPYPFVPPPVPALPGPP 486
L P PA P +PG RP EP NP P P P PA+P P
Sbjct: 328 LPEVSPAENPANNPAPNENPGTRPNPEPDPDLNP-DANPDTDGQPGTRPDSPAVPDRP 384


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_04337  PERTACTIN  28     0.047    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 28.1 bits (62), Expect = 0.047
Identities = 30/100 (30%), Positives = 40/100 (40%), Gaps = 22/100 (22%)

Query: 42 GTAVTHYRVELMRLIAQM-GLGSGALIVIGGTVAIVGFLTVTTGALVAVQGYTDFAEIGV 100
G V+ V+L + I + LG+ G V TV+ G+L A G V
Sbjct: 292 GVDVSDSTVDLAQSIVEAPQLGAAIRAGRGARV------TVSGGSLSAPHG-------NV 338

Query: 101 EALTGFASAFFNVRLIAPATTAIALSATIGAGATAQLGAM 140
G A R PA+ LS T+ AGA AQ A+
Sbjct: 339 IETGGGAR-----RFPPPASP---LSITLQAGARAQGRAL 370


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04340  HTHTETR  73     3e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 73.1 bits (179), Expect = 3e-18
Identities = 33/171 (19%), Positives = 67/171 (39%), Gaps = 4/171 (2%)

Query: 5 ADRDDERRRIIEAAYRCLLAQRAGPVPMSAILSSAGLSSRAFYRHFVSKDDLFLALLHQA 64
+ + R+ I++ A R Q + I +AG++ A Y HF K DLF + +
Sbjct: 7 QEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELS 66

Query: 65 GAVLVAHVNGVAEAAVGTPVEQLADWIGEMFDSIVDPEQLRQLTVI---DCDQVRAAKGY 121
+ + G P+ L + + + +S V E+ R L I C+ V
Sbjct: 67 ESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVV 126

Query: 122 REMRERLHVERERSLTEILRRGHDDGSF-PLVDPEPDAVAINAVVSRVMAH 171
++ + L +E + + L+ + + A+ + +S +M +
Sbjct: 127 QQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMEN 177


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04341  PF06057  31     0.004    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 31.0 bits (70), Expect = 0.004
Identities = 27/93 (29%), Positives = 38/93 (40%), Gaps = 11/93 (11%)

Query: 25 EWNGGASDSSSPSVLMLHGGGQNRFSWKKTGQILAAQGLHVVALDSRGHGDSDRSPEATY 84
+ N +S + P V+ L G G K G IL QG VV S + + P+
Sbjct: 41 QVNAASSHTKPPLVIFLSGDGGWATLDKAVGGILQQQGWPVVGWSSLKYYWKQKDPKDV- 99

Query: 85 TVEALCADTLAVL----HQIGR-PTVLIGASMG 112
DTLA++ + G +LIG S G
Sbjct: 100 -----TQDTLAIIDKYQAEFGTQKVILIGYSFG 127


90  NCTC10437_04459  NCTC10437_04466  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_04459  -1      13       -1.023020  3-alpha-hydroxysteroid dehydrogenase
NCTC10437_04460   0      13       -1.645562  esterase/lipase
NCTC10437_04461   0      16       -2.685269  Uncharacterised protein
NCTC10437_04462   1      16       -3.094690  universal stress protein UspA-like protein
NCTC10437_04463   1      17       -3.848021  amino acid permease-associated protein
NCTC10437_04464   2      23       -4.030028  protein of uncharacterised function DUF222
NCTC10437_04465   6      18       -3.359691  Protein of uncharacterised function (DUF3303)
NCTC10437_04466   1      13       -3.274350  deazaflavin-dependent nitroreductase family
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04459  DHBDHDRGNASE  55     4e-11    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 55.4 bits (133), Expect = 4e-11
Identities = 60/276 (21%), Positives = 94/276 (34%), Gaps = 60/276 (21%)

Query: 6 VTGSASGMGRAVAEKLRRDGHTVVGVDVKDADIVADLS--------------DVQGRRAA 51
+TG+A G+G AVA L G + VD + +S DV+ A
Sbjct: 13 ITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSAAI 72

Query: 52 ADAVRR--ASKGSLDGAVLAAG-LGPGPGRDRPR----LIAQVNYFGVVDLLEAWQPLLA 104
+ R G +D V AG L PG VN GV + + +
Sbjct: 73 DEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKYMM 132

Query: 105 AADGAKVVVFSSNSTTTTPAVPARAVRAFLAHDAEKAVRAVRLFGGSASVMAYAASKIAV 164
+V SN S+ AYA+SK A
Sbjct: 133 DRRSGSIVTVGSNPAGVP----------------------------RTSMAAYASSKAAA 164

Query: 165 SRWVRREAVGSGWAGAGIRLNALAPGAIMTPLLQEQL--SDGRRAKLVRSFP------VP 216
+ + +G A IR N ++PG+ T +Q L + ++++ +P
Sbjct: 165 VMFTK--CLGLELAEYNIRCNIVSPGSTETD-MQWSLWADENGAEQVIKGSLETFKTGIP 221

Query: 217 TGAFGDADQLAEWVLFMLSDAADFLCGSVIFVDGGT 252
+A+ VLF++S A + + VDGG
Sbjct: 222 LKKLAKPSDIADAVLFLVSGQAGHITMHNLCVDGGA 257


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_04460  TONBPROTEIN  37     2e-04    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 36.5 bits (84), Expect = 2e-04
Identities = 21/110 (19%), Positives = 39/110 (35%), Gaps = 7/110 (6%)

Query: 121 ADVDDPDSDTAPEPGADNSSPNSEGVPAVDPDVTEPQEQIEVTPVAPGSAEAEPAAGP-- 178
AD++ P + P P E +P + E+ + P + P
Sbjct: 53 ADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKR 112

Query: 179 -----GANTDSPIMSTGTATKTGAAVDTAAAQQVSTTVTAPATLAAAQPR 223
+ SP +T A T + A ++ V++ + P L+ QP+
Sbjct: 113 DVKPVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQ 162


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_04465  STREPKINASE  25     0.042    Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 25.4 bits (55), Expect = 0.042
Identities = 12/36 (33%), Positives = 20/36 (55%), Gaps = 1/36 (2%)

Query: 7 MDFRLNGSAEENEAAASRILDLYSKWSPP-ASATFH 41
MD+ L G E+N +RI+ +Y P +A++H
Sbjct: 372 MDYTLTGKVEDNHDDTNRIITVYMGKRPEGENASYH 407


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04466  FbpA_PF05833  27     0.031    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 26.8 bits (59), Expect = 0.031
Identities = 12/39 (30%), Positives = 19/39 (48%), Gaps = 1/39 (2%)

Query: 37 RKSGRRYRTPLLVFQTRDGYAILVGY-GLQTDWLKNALA 74
KS + + + F ++DG I VG +Q D+L A
Sbjct: 449 YKSKKSKTSKPMHFISKDGIDIYVGKNNIQNDYLTLKFA 487


91  NCTC10437_04510  NCTC10437_04516  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_04510  -1      13        0.215234  short-chain dehydrogenase/reductase SDR
NCTC10437_04511   1      10       -0.693894  50S ribosomal protein L25/general stress protein
NCTC10437_04512   0      8        -0.708972  peptidyl-tRNA hydrolase
NCTC10437_04513   0      11       -0.710709  Uncharacterised protein
NCTC10437_04514   1      10       -0.111940  Uncharacterised protein
NCTC10437_04515   0      11       -0.156808  putative Fe-S protein
NCTC10437_04516   1      9        -0.925059  ABC-2 type transporter
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04510  DHBDHDRGNASE  54     1e-10    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 53.5 bits (128), Expect = 1e-10
Identities = 55/205 (26%), Positives = 87/205 (42%), Gaps = 32/205 (15%)

Query: 16 GRTVIVTGANSGLGLVTARELARVGATTILAVRNLDKGHAAAAGMSGDVEVRH-----LD 70
G+ +TGA G+G AR LA GA N +K + + E RH D
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLK--AEARHAEAFPAD 65

Query: 71 LQDLASIRV----FAEGVAHVDVLVNNAGI--MAVPYALTADGFESQIGTNHLGHFALTN 124
++D A+I + +D+LVN AG+ + ++L+ + +E+ N G F +
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 125 LLLPKITDR----VVTVSSMMHLLGYISLKDLNWKSRPYFAWPAYGQSKLANLLFTSELQ 180
+ + DR +VTV S N P + AY SK A ++FT L
Sbjct: 126 SVSKYMMDRRSGSIVTVGS-------------NPAGVPRTSMAAYASSKAAAVMFTKCLG 172

Query: 181 RKLAAAGSAVRAVAAHPGYSATNLQ 205
+L A +R PG + T++Q
Sbjct: 173 LEL--AEYNIRCNIVSPGSTETDMQ 195


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04512  DPTHRIATOXIN  29     0.016    Diphtheria toxin signature.
		>DPTHRIATOXIN#Diphtheria toxin signature.

Length = 567

Score = 28.6 bits (63), Expect = 0.016
Identities = 15/30 (50%), Positives = 18/30 (60%), Gaps = 8/30 (26%)

Query: 135 VGIGRPPGQKSGA--------SFVLENFSS 156
+GIG PP +GA SFV+ENFSS
Sbjct: 22 LGIGAPPSAHAGADDVVDSSKSFVMENFSS 51


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04513  SURFACELAYER  27     0.010    Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 26.6 bits (58), Expect = 0.010
Identities = 17/72 (23%), Positives = 30/72 (41%), Gaps = 4/72 (5%)

Query: 6 RYLRILLVPSVAAAALMAAPVAGAQATCQEAGSLTRCETSGSVSVKAVPGTRA-PNVADT 64
+ LRI+ S AAAAL+A A A A + +++ + + A P+++
Sbjct: 3 KNLRIV---SAAAAALLAVAPIAATAMPVNAATTINADSAINANTNAKYDVDVTPSISAI 59

Query: 65 IPRGNNGRGPGI 76
+ P I
Sbjct: 60 AAVAKSDTMPAI 71


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04516  ABC2TRNSPORT  37     6e-05    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 36.8 bits (85), Expect = 6e-05
Identities = 39/216 (18%), Positives = 85/216 (39%), Gaps = 9/216 (4%)

Query: 17 VPARPTNLAQQSWIMV-KRN-MIHTKRMPEMLSDVTAQPIMFVLLFAFVFGASITNTGGA 74
V A P +WI V +RN + K L A+P++++ G + GG
Sbjct: 6 VTALPGGSL--NWIAVWRRNYIAWKKAALASLLGHLAEPLIYLFGLGAGLGVMVGRVGGV 63

Query: 75 SYREFLLPGIQAQTIVFSAF--VVAAGITADVEKGIIDRFRSLPISRSSVLIGRSIASVI 132
SY FL G+ A + + +A + A + + + +++G +
Sbjct: 64 SYTAFLAAGMVATSAMTAATFETIYAAFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAAT 123

Query: 133 HSSLGVVVMALTGLAIGWRIRNGVGEAVLAFALLLVFGFGMIWFGILIGSLMRTVEAVNG 192
++L + + A+G+ + + A ++ + G G+++ +L + +
Sbjct: 124 KAALAGAGIGVVAAALGYTQWLSL---LYALPVIALTGLAFASLGMVVTALAPSYDYFIF 180

Query: 193 VMFTALFPVTFLANTFVPTEPMPHWLRVIAEWNPVS 228
+ P+ FL+ P + +P + A + P+S
Sbjct: 181 YQTLVITPILFLSGAVFPVDQLPIVFQTAARFLPLS 216


92  NCTC10437_04555  NCTC10437_04566  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_04555   1      13       -1.054803  large-conductance mechanosensitive channel
NCTC10437_04556   0      13       -0.764854  MspA protein
NCTC10437_04557  -1      13       -0.357683  MspA protein
NCTC10437_04558  -1      9         0.150265  molybdenum cofactor synthesis domain protein
NCTC10437_04559  -3      8         0.150760  trypsin-like serine protease with C-terminal PDZ
NCTC10437_04560  -2      10       -0.811565  integral membrane sensor signal transduction
NCTC10437_04561  -3      11        0.245298  two component response transcriptional
NCTC10437_04562  -2      12        0.184079  50S ribosomal protein L32
NCTC10437_04563  -2      11        0.300461  Protein of uncharacterised function (DUF1446)
NCTC10437_04564  -1      10        0.320388  acyl-CoA dehydrogenase
NCTC10437_04565   0      8         0.368583  propionyl-CoA carboxylase
NCTC10437_04566   0      7         1.131024  pyruvate carboxylase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_04555  MECHCHANNEL  110    3e-34    Bacterial mechano-sensitive ion channel signature.
		>MECHCHANNEL#Bacterial mechano-sensitive ion channel signature.

Length = 136

Score = 110 bits (277), Expect = 3e-34
Identities = 41/133 (30%), Positives = 71/133 (53%), Gaps = 6/133 (4%)

Query: 1 MLKGFKDFISRGNVIDLAVAVVVGAAFTGLVTAFTQNVVQPLVDRV--GAGPGREYGILR 58
++K F++F RGNV+DLAV V++GAAF +V++ +++ P + + G + LR
Sbjct: 3 IIKEFREFAMRGNVVDLAVGVIIGAAFGKIVSSLVADIIMPPLGLLIGGIDFKQFAVTLR 62

Query: 59 IPLGGDQFVDLN--AVLSAAINFVIVAAVIYFVIVVPFKKIKERDAKVA--STETELTLL 114
G V ++ + +F+IVA I+ I + K ++++ A + E LL
Sbjct: 63 DAQGDIPAVVMHYGVFIQNVFDFLIVAFAIFMAIKLINKLNRKKEEPAAAPAPTKEEVLL 122

Query: 115 TEIRDMLRENTTD 127
TEIRD+L+E
Sbjct: 123 TEIRDLLKEQNNR 135


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_04556  OMADHESIN  29     0.010    Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 29.5 bits (65), Expect = 0.010
Identities = 23/86 (26%), Positives = 35/86 (40%), Gaps = 9/86 (10%)

Query: 126 VDPDLGFGNAIVTPPLFPGVSISADLGNGPGIQEVATFSVDVAGPGGSVVVSNAHGTVTG 185
DP LG + P G ++ GI +A + A G +V V G++
Sbjct: 43 ADPALGLEYPVRPPVPGAGGLNAS----AKGIHSIAIGATAEAAKGAAVAV--GAGSIAT 96

Query: 186 AAGGVLLRPFARLISSTGDSVSTYGA 211
V + P ++ + GDS TYGA
Sbjct: 97 GVNSVAIGPLSKAL---GDSAVTYGA 119


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits        Score  E-value  Comments
NCTC10437_04559  V8PROTEASE  62     8e-13    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 62.3 bits (151), Expect = 8e-13
Identities = 44/182 (24%), Positives = 73/182 (40%), Gaps = 26/182 (14%)

Query: 157 PSVVKLEVNQGRASEEGSGVILSTDGLILTNNHVVATAAGTPEGRAGAPETKV--TFANG 214
V ++V + SGV++ D +LTN HVV G P P + NG
Sbjct: 88 APVTYIQVEAPTGTFIASGVVVGKD-TLLTNKHVVDATHGDPHALKAFPSAINQDNYPNG 146

Query: 215 KSTTFTVVGTDPGSDIAVVRAENV-------SGLTPITVGSSADLRVGQDVVAIGSPLGL 267
T + D+A+V+ + P T+ ++A+ +V Q++ G P
Sbjct: 147 GFTAEQITKYSGEGDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDK 206

Query: 268 EGTVTTGIVSALNRPVAAGGDAQNQNTVLD--AIQTDAAINPGNSGGALVNMNGELVGIN 325
PVA +++ + T L A+Q D + GNSG + N E++GI+
Sbjct: 207 --------------PVATMWESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIH 252

Query: 326 SA 327

Sbjct: 253 WG 254


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_04560  FLAGELLIN  29     0.037    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 29.2 bits (65), Expect = 0.037
Identities = 19/142 (13%), Positives = 43/142 (30%), Gaps = 7/142 (4%)

Query: 146 GGNTLLLSKSLAPTGKVLKRLGTVLLIVGGIGVAVAAIAGGMVASAGLRPVARLTRAAER 205
G K+ + K L + I ++T A +
Sbjct: 339 NGQFTFDDKTKNESAK------LSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKT 392

Query: 206 VARTDDLRPIPVFGNDELARLTEAFNMMLRALAESRERQARLVADAGHELRTPLTSMRTN 265
+ + N++ A ++ L ++ + + + + G ++ S TN
Sbjct: 393 MFIDKTASGVSTLINEDAAAAKKSTANPLASIDSALSKVDAVRSSLG-AIQNRFDSAITN 451

Query: 266 VELLMESMKPGARRIPDEDMAE 287
+ + ++ RI D D A
Sbjct: 452 LGNTVTNLNSARSRIEDADYAT 473


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_04561  HTHFIS  105    2e-28    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 105 bits (263), Expect = 2e-28
Identities = 42/135 (31%), Positives = 67/135 (49%)

Query: 2 RILVVDDDRAVRESLRRSLSFNGYSVELAQDGREALDLIATARPDALVLDVMMPRLDGLE 61
ILV DDD A+R L ++LS GY V + + IA D +V DV+MP + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VCRQLRSTGDDLPILVLTARDSVSERVAGLDAGADDYLPKPFALEELLARMRALLRRTNP 121
+ +++ DLP+LV++A+++ + + GA DYLPKPF L EL+ + L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 LEGDADSAAMTFSDL 136
+ + L
Sbjct: 125 RPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_04566  RTXTOXIND  31     0.023    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.023
Identities = 11/38 (28%), Positives = 20/38 (52%), Gaps = 1/38 (2%)

Query: 623 TITAPADGILVELNVEPGQQVEVGAILARVDNP-ADAD 659
I + I+ E+ V+ G+ V G +L ++ A+AD
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEAD 135


93  NCTC10437_04739  NCTC10437_04745  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias   Product
NCTC10437_04739  -1      12       1.582093  response regulator with CheY-like receiver
NCTC10437_04740  0       13       1.248008  integral membrane sensor signal transduction
NCTC10437_04741  1       17       0.366616  FKBP-type peptidyl-prolyl cis-trans isomerase
NCTC10437_04742  0       17       0.894386  Protein of uncharacterised function (DUF2630)
NCTC10437_04743  0       14       1.186638  Uncharacterised protein
NCTC10437_04744  -1      14       1.247462  Uncharacterised protein
NCTC10437_04745  0       13       0.282071  MspA protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_04739  HTHFIS  103    1e-27    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 103 bits (258), Expect = 1e-27
Identities = 36/119 (30%), Positives = 57/119 (47%)

Query: 8 PRVLVVDDDPDVLASLERGLRLSGFDVFTAVDGAEALRSATETRPDAIVLDINMPVLDGV 67
+LV DDD + L + L +G+DV + A R D +V D+ MP +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 68 SVVTALRAMDNDVPVCVLSARSSVDDRVAGLEAGADDYLVKPFVLAELVARVKALLRRR 126
++ ++ D+PV V+SA+++ + E GA DYL KPF L EL+ + L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04741  INFPOTNTIATR  49     3e-10    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 49.2 bits (117), Expect = 3e-10
Identities = 34/107 (31%), Positives = 51/107 (47%), Gaps = 2/107 (1%)

Query: 15 PTELVTEDIVVGDGPEAVPGANVEVHYVGVEYDTGEEFDSSWNRGESIEFPLRGLIQGWQ 74
P+ L + I G G + V V Y G D G FDS+ G+ F + +I GW
Sbjct: 125 PSGLQYKIIDAGTGAKPGKSDTVTVEYTGTLID-GTVFDSTEKAGKPATFQVSQVIPGWT 183

Query: 75 DGIPGMKVGGRRKLTIPPEQAYGPAG-GGHRLSGKTLIFVIDLLATR 120
+ + M G ++ +P + AYGP GG +TLIF I L++ +
Sbjct: 184 EALQLMPAGSTWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISVK 230


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04744  PF05616  31     0.003    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 31.3 bits (70), Expect = 0.003
Identities = 27/72 (37%), Positives = 32/72 (44%), Gaps = 4/72 (5%)

Query: 29 QPPLPPPPAPAP-PNPLPFESIPGLGPISSPGPVPPPGPGVAPPPATALFPLAQSDSPSQ 87
+P L P A AP PLP E P P ++P P PG P P L P A D+ Q
Sbjct: 312 RPDLTPGSAEAPNAQPLP-EVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQ 370

Query: 88 --IFGGIPAMPD 97
PA+PD
Sbjct: 371 PGTRPDSPAVPD 382


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04745  PF05272  29     0.017    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.017
Identities = 19/55 (34%), Positives = 23/55 (41%), Gaps = 3/55 (5%)

Query: 2 RSTCVRVATAMVAAALWTAPAPIASAQPAPAPPPTPGVQPVADVGAPPIPNGAVP 56
R + +V A APAP P P PPP P V+ P+P AVP
Sbjct: 95 REEGLESVAGIVMGAPAGAPAP---KPPRPEPPPRPVVEKECWETIQPVPEHAVP 146


94  NCTC10437_04829  NCTC10437_04845  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_04829  0       11       1.241891   *transcriptional regulator
NCTC10437_04830  -1      10       1.962452   beta-lactamase domain-containing protein
NCTC10437_04831  0       10       2.250620   TetR family transcriptional regulator
NCTC10437_04832  -1      10       1.811595   chorismate mutase
NCTC10437_04837  0       10       1.284554   ****phosphate/phosphite/phosphonate ABC
NCTC10437_04838  -1      11       1.297444   Conserved exported protein of uncharacterised
NCTC10437_04839  -2      10       0.641911   protein kinase
NCTC10437_04840  -2      10       -0.061930  putative secreted protein
NCTC10437_04841  -2      10       -0.644568  Uncharacterized protein conserved in bacteria
NCTC10437_04842  -1      9        -1.224669  transcriptional regulator
NCTC10437_04844  -1      9        -1.346780  nifR3 family TIM-barrel protein
NCTC10437_04845  0       10       -1.641887  cell envelope-related transcriptional
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04829  HTHTETR  57     2e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 56.9 bits (137), Expect = 2e-12
Identities = 33/185 (17%), Positives = 66/185 (35%), Gaps = 9/185 (4%)

Query: 1 MVRPAQTARSERTREALRRAAVVRFLAQGVDGTSAEQIAADAGVSLRTFYRHFASKHDLL 60
M R + ++ TR+ + A+ F QGV TS +IA AGV+ Y HF K DL
Sbjct: 1 MARKTK-QEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLF 59

Query: 61 FADYDAGLHWFRSA---LAAREPGESI--VESVQSAIMASPYDDWAVTKIAAMRSEELDA 115
++ A+ PG+ + + + ++ S + + + + +
Sbjct: 60 SEIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 116 DRIVRHIRQVEADFAEAIEEHMSAVDPAL---PGTDERMRITVTARCIAAAVFGAMEVWM 172
+ ++Q + + + + + A + + G ME W+
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWL 179

Query: 173 LGEQR 177
Q
Sbjct: 180 FAPQS 184


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04831  HTHTETR  47     5e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 47.3 bits (112), Expect = 5e-09
Identities = 21/119 (17%), Positives = 41/119 (34%), Gaps = 4/119 (3%)

Query: 5 RQRLLDALEASIAEDGYAKSTVAGIVRRARTSRRTFYEHFDSREACFVALLADANADQVR 64
RQ +LD ++ G + +++ I + A +R Y HF + F + + ++
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGE 72

Query: 65 QISE-AVDPSAPWQKQVRQAIEA---WISSGESRPALMLSWIRDVPSLGAVARELQRDA 119
E +R+ + + E R LM +G +A Q
Sbjct: 73 LELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQR 131


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04838  PF03544  30     0.008    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 29.6 bits (66), Expect = 0.008
Identities = 18/54 (33%), Positives = 19/54 (35%)

Query: 27 PAAVFMAATADTSPVPVAVVEPEPVAAPAAPVVEQPEVPCCVEVVAAPSLTPPP 80
P +V M A AD P PEPV P PE P VV P
Sbjct: 49 PISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPK 102


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04839  YERSSTKINASE  33     0.001    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 33.2 bits (75), Expect = 0.001
Identities = 28/118 (23%), Positives = 53/118 (44%), Gaps = 16/118 (13%)

Query: 65 ALNHPHVVAVHDSGV----HEGRHYIVLERLPGQSLADVLA------RNGPLPAEH---- 110
A HP++ VH V + ++++ + G +D L + G + +E
Sbjct: 187 AGKHPNLANVHGMAVVPYGNRKEEALLMDEVDGWRCSDTLRTLADSWKQGKINSEAYWGT 246

Query: 111 VRAIMRDVLSALGAAHASGVLHRDIKPANILLTRHGGVKIA-DFGV-AKSPDTPQTLT 166
++ I +L +GV+H DIKP N++ R G + D G+ ++S + P+ T
Sbjct: 247 IKFIAHRLLDVTNHLAKAGVVHNDIKPGNVVFDRASGEPVVIDLGLHSRSGEQPKGFT 304


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04842  HTHTETR  39     5e-06    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 38.8 bits (90), Expect = 5e-06
Identities = 13/107 (12%), Positives = 38/107 (35%), Gaps = 9/107 (8%)

Query: 14 PLQDRQALRRGELITAGIGLLGSPGGPKLTVRAVCRAAGLTERYFYESFTDRDEYVAAVY 73
+ R ++ + L G ++ + +AAG+T Y F D+ + + ++
Sbjct: 4 KTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIW 63

Query: 74 DDVCAAAMSTLMES---------QSMRDAVERFVALMIDDPARGRVL 111
+ + +E +R+ + + + + R ++
Sbjct: 64 ELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLM 110


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04845  PF05616  33     0.003    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 33.2 bits (75), Expect = 0.003
Identities = 32/100 (32%), Positives = 38/100 (38%), Gaps = 22/100 (22%)

Query: 14 PDPDEAATQA-APPATPSPPSAPVTVADLIAKINGGTPPP-EPTRHRAEPEPEPEPYPDV 71
P PD A AP A P P +P A+ P P E R PEP+P+ PD
Sbjct: 311 PRPDLTPGSAEAPNAQPLPEVSP-------AENPANNPAPNENPGTRPNPEPDPDLNPDA 363

Query: 72 DELDTAILPVLDEHPSELPDLAAPRPPVHTPSRPKRPHRR 111
+ P D P PD A P RP HR+
Sbjct: 364 N-------PDTDGQPGTRPDSPA------VPDRPNGRHRK 390


95  NCTC10437_04926  NCTC10437_04931  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_04926  0       12       0.000694   short-chain dehydrogenase/reductase SDR
NCTC10437_04927  1       12       -0.418816  AMP-dependent synthetase and ligase
NCTC10437_04928  -1      10       0.295309   HIT family protein
NCTC10437_04929  -1      10       -0.421562  integral membrane sensor signal transduction
NCTC10437_04930  -1      13       -0.828205  response regulator with CheY-like receiver
NCTC10437_04931  -1      12       -0.872342  Uncharacterised protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04926  DHBDHDRGNASE  77     1e-18    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 76.6 bits (188), Expect = 1e-18
Identities = 59/196 (30%), Positives = 84/196 (42%), Gaps = 5/196 (2%)

Query: 24 PGSPADPTVADALIDRCAREFGSIDILVNCAGTAEPVGSSILNVTTEQFRDLMDAHLGTA 83
P D D + R RE G IDILVN AG P I +++ E++ +
Sbjct: 63 PADVRDSAAIDEITARIEREMGPIDILVNVAGVLRP--GLIHSLSDEEWEATFSVNSTGV 120

Query: 84 FQTCRAAAPRMADQGSGVIVNT-SSFAFLGDYGGTGYPAGKGAVTSLTLAIAAELKEYGV 142
F R+ + M D+ SG IV S+ A + Y + K A T + EL EY +
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 143 RANVVCPGA-KTRLSTGPEYEAHIAELNRRGLLDEISTQGALDA-APPEYVAPTYGYLVS 200
R N+V PG+ +T + + + AE +G L+ T L A P +A +LVS
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVS 240

Query: 201 DLAKDITGQIFIAAGG 216
A IT GG
Sbjct: 241 GQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04929  PF06580  37     1e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 37.2 bits (86), Expect = 1e-04
Identities = 20/114 (17%), Positives = 38/114 (33%), Gaps = 28/114 (24%)

Query: 377 PRLRQVLSNLVANALQH----TPDTAGVTVRVGTDDDNTIVEVCDEGPGMSPEDAQRVFE 432
P + ++ LV N ++H P + ++ D+ +EV + G +
Sbjct: 256 PPM--LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE---- 309

Query: 433 RFYRADSSRTRASGGTGLGLSIVD---SLVYAHGGRVSVSTAPGSGCRFRVSLP 483
TG GL V ++Y ++ +S G V +P
Sbjct: 310 --------------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_04930  HTHFIS  107    3e-29    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 107 bits (270), Expect = 3e-29
Identities = 33/136 (24%), Positives = 59/136 (43%)

Query: 12 ARVLVVDDETNIVELLSVSLKFQGFEVHTATNGPAALDKAREVRPDAIILDVMMPGMDGF 71
A +LV DD+ I +L+ +L G++V +N D ++ DV+MP + F
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 72 GLLRRLRADGIEAPALFLTARDSLQDKIAGLTLGGDDYVTKPFSLEEVVARLRVVLRRSG 131
LL R++ + P L ++A+++ I G DY+ KPF L E++ + L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 132 GGVQEPRPSRLTFADI 147
+ +
Sbjct: 124 RRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_04931  RTXTOXINA  29     0.022    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 28.8 bits (64), Expect = 0.022
Identities = 21/80 (26%), Positives = 36/80 (45%), Gaps = 15/80 (18%)

Query: 2 NLGKSLTDLAFAP-ARAGLAVADAGLGVASGALAVARRSLGESRDGTQTPTSFASMLGVQ 60
N+GK ++ A A GL+ + A G+ + A+ +A +P SF L +
Sbjct: 282 NVGKGISQYIIAQRAAQGLSTSAAAAGLIASAVTLA-----------ISPLSF---LSIA 327

Query: 61 DAVDRANRLAKLMDEDQPLG 80
D RAN++ + + LG
Sbjct: 328 DKFKRANKIEEYSQRFKKLG 347


96  NCTC10437_04949  NCTC10437_04962  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_04949  1       10       1.183560   Putative transcriptional regulator, TetR family
NCTC10437_04950  -1      10       2.069347   major facilitator superfamily transporter
NCTC10437_04951  0       11       1.702695   short chain dehydrogenase
NCTC10437_04952  0       12       1.010054   enoyl-CoA hydratase/carnithine racemase
NCTC10437_04953  1       13       1.166749   Uncharacterised protein
NCTC10437_04954  0       11       0.547225   alpha,alpha-trehalose-phosphate synthase
NCTC10437_04955  0       10       0.392761   Uncharacterised protein
NCTC10437_04956  -1      10       -1.285400  MCE-associated protein
NCTC10437_04957  -1      11       -1.350645  Mce associated alanine and valine rich protein
NCTC10437_04958  -1      10       -1.230495  virulence factor Mce family protein
NCTC10437_04959  -1      11       -2.038474  virulence factor Mce family protein
NCTC10437_04960  -2      11       -1.923617  virulence factor Mce family protein
NCTC10437_04961  0       11       -1.743121  virulence factor Mce family protein
NCTC10437_04962  0       10       -1.819947  virulence factor Mce family protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04949  HTHTETR  57     2e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 57.3 bits (138), Expect = 2e-12
Identities = 35/202 (17%), Positives = 75/202 (37%), Gaps = 17/202 (8%)

Query: 15 ARRPGGRNARVRAQILAATAELVARDGIAGFRYEEVAEYAGVHKTSVYRNWPDREELVVD 74
AR+ R IL L ++ G++ E+A+ AGV + ++Y ++ D+ +L +
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 75 ALLRYADDLASIADT------GDIQRDLVDFLLALAGGLETPFGRTL-----EQAMQPAG 123
++ + GD L + L+ + T R L + G
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVG 121

Query: 124 QPSAVQALSRILDQRV-AAMQRRVHAAVDRGELPA-IDSAFLGEMISGPVHLIVNRGIRP 181
+ + VQ R L +++ + ++ LPA + + ++ G + ++ +
Sbjct: 122 EMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFA 181

Query: 182 ---FTRADAER-IVGVVLAGIR 199
F R V ++L
Sbjct: 182 PQSFDLKKEARDYVAILLEMYL 203


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04950  TCRTETB  62     2e-12    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 61.8 bits (150), Expect = 2e-12
Identities = 62/341 (18%), Positives = 135/341 (39%), Gaps = 29/341 (8%)

Query: 39 RLVLWAAVLTLANVLADVVIGSPMMVLPQLLDHFDTDQ--VGWLNTGAMLAGAIWSPLLA 96
++++W +L+ +VL ++V+ + LP + + F+ W+NT ML +I + +
Sbjct: 14 QILIWLCILSFFSVLNEMVLN---VSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 97 KSSDVFGKRRILVVTLLVAGAGALVCLIAPN-VWIFLAGRFLQGAAFAAVFITVTLVLQI 155
K SD G +R+L+ +++ G+++ + + + + RF+QGA AA V +V+
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVAR 130

Query: 156 CGPR----VAMAMVGLVTSGSSIVGIIEPFLMKPVIDAFGYRGVFIASALLAAAAALCVR 211
P+ A ++G + + VG P + + + + + + ++
Sbjct: 131 YIPKENRGKAFGLIGSIVAMGEGVG---PAIGGMIAHYIHWSYLLLIPMITIITVPFLMK 187

Query: 212 AVIPESPIRSPGRIDLGGAVLLGGGLGAVLGYISLGRDLGWLSGGMTMLLAGGVAALTAW 271
+ E I+ D+ G +L+ G+ + + + L V + +
Sbjct: 188 LLKKEVRIKGH--FDIKGIILMSVGIVFFMLFTTSYSIS---------FLIVSVLSFLIF 236

Query: 272 AALALRVDEPIIDLRALSRPILLTLLTLVLAAGSFRSMLQLTSIVAQVPPDLGLGYGLGH 331
+V +P +D + + + V+ VP + + L
Sbjct: 237 VKHIRKVTDPFVDPGLGKNIPFMIGVLC-----GGIIFGTVAGFVSMVPYMMKDVHQLST 291

Query: 332 GEAIAVLLAAPNVGIAIGGICAGWLAGRIGPALPLLGGIGL 372
E +V++ + + I G G L R GP L G+
Sbjct: 292 AEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTF 332


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_04951  DHBDHDRGNASE  103    2e-28    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 103 bits (258), Expect = 2e-28
Identities = 66/255 (25%), Positives = 101/255 (39%), Gaps = 17/255 (6%)

Query: 11 LITGGGSGIGKGIAAAVVASGGNAVLAGRNADKLAAAAEEITAHGGDGAGTVIYEPADVT 70
ITG GIG+ +A + + G + N +KL + A A PADV
Sbjct: 12 FITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE----ARHAEAFPADVR 67

Query: 71 DEDQASRLVAAATAWHGRLDGVVHSAGGSETIGPLTQMDSAAWRNTVDLNLNGTMYVLKH 130
D + A G +D +V+ AG G + + W T +N G +
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLR-PGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 131 AARELVRGGGGSFVGISSIASSNTHRWFGAYGPTKAALDHLMMVAADELGPSSVRVNGIR 190
++ ++ GS V + S + AY +KAA EL ++R N +
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 191 PGLTRTDM----------VAPVIDTPALSEDYRLCTPLPRVGEVEDVANLAVFLLSDAAS 240
PG T TDM VI E ++ PL ++ + D+A+ +FL+S A
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSL--ETFKTGIPLKKLAKPSDIADAVLFLVSGQAG 244

Query: 241 FITGQVINVDGGHGL 255
IT + VDGG L
Sbjct: 245 HITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_04955  cloacin  42     6e-06    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 41.6 bits (97), Expect = 6e-06
Identities = 37/110 (33%), Positives = 46/110 (41%), Gaps = 1/110 (0%)

Query: 289 GAGGDGAPGGAGGNGGTVNNAFGNGANGVGGSGGAGGSSSSGAGGAGGKGGFVATDEGDA 348
G G G GA G +N G G S G+G SS + G GG G + G
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWG-GGSGSGIHWGGGSG 61

Query: 349 TGGAGGAGGRANGSATGGDGGDGASASSAGDNRFGEDGNGGIAVGGAGGA 398
G GG G GS TGG+ A+ + G G GG+AV + GA
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGA 111



Score = 35.5 bits (81), Expect = 5e-04
Identities = 25/76 (32%), Positives = 26/76 (34%)

Query: 383 GEDGNGGIAVGGAGGAGGSGGASGGDGGDGGTADTDGGPAASGGDGGGGGAGGFVGGDGG 442
G G GG G G GGAS G G GG + GGG G G G
Sbjct: 12 GAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNS 71

Query: 443 SGGDGGGALGGDTNTP 458
GG G G P
Sbjct: 72 GGGSGTGGNLSAVAAP 87



Score = 34.3 bits (78), Expect = 0.001
Identities = 35/118 (29%), Positives = 40/118 (33%), Gaps = 14/118 (11%)

Query: 246 GDADGGDGGRGGSASTGTGGTGGAGGSGSSDADSTHTVTGGNGGAGGDGAPGGAGGNGGT 305
G G G TG G G S S S + GG G+G G GNGG
Sbjct: 8 GHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGG- 66

Query: 306 VNNAFGNGANGVGGSGGAGGSSSSGAGGAGGKGGFVATDEGDATGGAGGAGGRANGSA 363
N GGSG G S+ A A G +T GAGG + A
Sbjct: 67 ------GNGNSGGGSGTGGNLSAVAAPVAFGFPAL-------STPGAGGLAVSISAGA 111



Score = 34.3 bits (78), Expect = 0.001
Identities = 25/75 (33%), Positives = 32/75 (42%), Gaps = 2/75 (2%)

Query: 350 GGAGGAGGRANGSATGGDGGDGASASSAGDNRFGEDGNGGIAVGGAGGAGGSGGASGGDG 409
GA G NG TG G GAS S + + GG + G GGSG +GG
Sbjct: 11 TGAHSTSGNINGGPTGLGVGGGASDGSGWSSE--NNPWGGGSGSGIHWGGGSGHGNGGGN 68

Query: 410 GDGGTADTDGGPAAS 424
G+ G GG ++
Sbjct: 69 GNSGGGSGTGGNLSA 83



Score = 33.5 bits (76), Expect = 0.002
Identities = 35/119 (29%), Positives = 46/119 (38%), Gaps = 12/119 (10%)

Query: 213 DAVGASGGAGGRGGYLGGNGGNGGIGGLADSVNGDADGGDGGRGGSASTGTGGTGGAGGS 272
D G + GA G + G G+GG A +G + + GGS S G G G+
Sbjct: 5 DGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGN 64

Query: 273 GSSDADSTHTVTGGNGGAGGDGAPGGAGGNGGTVNNAFGNGANGVGGSGGAGGSSSSGA 331
GG G G G+ G + AFG A G+GG S S+GA
Sbjct: 65 ------------GGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGA 111



Score = 33.5 bits (76), Expect = 0.002
Identities = 27/80 (33%), Positives = 32/80 (40%), Gaps = 7/80 (8%)

Query: 393 GGAGGAGGSGGASGGDGGDGGTADTDGGPAASGGDGGGGGAGGFVGGDGGSGGDGGGALG 452
G +G G G G GG +D G + + GGG G+G GG G G GG
Sbjct: 12 GAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNS 71

Query: 453 GDTNTPGDGGSGGSGGLGGS 472
G GGSG G L
Sbjct: 72 G-------GGSGTGGNLSAV 84



Score = 33.5 bits (76), Expect = 0.002
Identities = 28/85 (32%), Positives = 32/85 (37%), Gaps = 5/85 (5%)

Query: 405 SGGDGGDGGTADTDGGPAASGGDGGGGGAGGFVGGDGGSG-----GDGGGALGGDTNTPG 459
SGGDG T +GG G G GG G G S G G G+ G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 460 DGGSGGSGGLGGSDGDSGSDGTSGA 484
G GG+G GG G G+ A
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLSAVAA 86



Score = 30.8 bits (69), Expect = 0.013
Identities = 23/86 (26%), Positives = 35/86 (40%)

Query: 192 GDGGNGGNDYNSSNSSTPGANDAVGASGGAGGRGGYLGGNGGNGGIGGLADSVNGDADGG 251
GDG +S++ + G +G GGA G+ N GG G G + G
Sbjct: 4 GDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHG 63

Query: 252 DGGRGGSASTGTGGTGGAGGSGSSDA 277
+GG G++ G+G G + A
Sbjct: 64 NGGGNGNSGGGSGTGGNLSAVAAPVA 89


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_04958  TONBPROTEIN  40     1e-05    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 40.0 bits (93), Expect = 1e-05
Identities = 20/95 (21%), Positives = 30/95 (31%), Gaps = 4/95 (4%)

Query: 423 PMIPPQVDPDPGPPVVQLPPGVAPGPGPAPHAPFPLPVPPNEVGPPPPPWPYFAPPDQNV 482
P PP + P P P P P P+ + + P P P P +Q
Sbjct: 52 PADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPK 111

Query: 483 P--PYGRTPPGAP--APAPAPVPAAPPVAPAPGGP 513
+ P +P APA + ++ A
Sbjct: 112 RDVKPVESRPASPFENTAPARLTSSTATAATSKPV 146



Score = 33.0 bits (75), Expect = 0.002
Identities = 22/88 (25%), Positives = 25/88 (28%), Gaps = 3/88 (3%)

Query: 444 VAPGPGPAPHAPFPLPVPPNEVGPPPPPWPYFAPPDQNVPPYGRTPPGAPAPAPAPVPAA 503
V P P A PP V P P P + P P P P P PV
Sbjct: 50 VTPADLEPPQAV---QPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKV 106

Query: 504 PPVAPAPGGPLLPAEAIPQAAGGPAMAS 531
P+ A P PA +
Sbjct: 107 QEQPKRDVKPVESRPASPFENTAPARLT 134



Score = 29.2 bits (65), Expect = 0.035
Identities = 22/96 (22%), Positives = 30/96 (31%), Gaps = 8/96 (8%)

Query: 418 PPNKFPMIPPQVDPDPGPPVVQLPPGVAPGPGPAPHAPFPLPVPPNEVGPPPPPWPYFAP 477
P P P+ P+P + P P P P PV + P P +
Sbjct: 66 EPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPK-----PVKKVQEQPKRDVKPVESR 120

Query: 478 PDQNVPPYGRTPPGAPAPAPAPVPAAPPVAPAPGGP 513
P P+ T P + A + PV GP
Sbjct: 121 PAS---PFENTAPARLTSSTATAATSKPVTSVASGP 153


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_04959  adhesinmafb  29     0.041    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 28.9 bits (64), Expect = 0.041
Identities = 24/107 (22%), Positives = 44/107 (41%), Gaps = 10/107 (9%)

Query: 203 IAAAEGLNRFAGILANSKDSLGRTLDTLPAALDVLNRNRTTLVDAFAALRTFATVG--SR 260
A GL AG N+++++ R + P N V+A + A V ++
Sbjct: 276 FAVIGGLGSVAGFEKNTREAVDRWIQENP--------NAAETVEAVFNVAAAAKVAKLAK 327

Query: 261 VLQETKTDFAADFKDLYPIIKSLNDNADDFVKSLEFLPTFPFHYKYL 307
+ K + DF D Y +L+D+A ++ ++ HY+ L
Sbjct: 328 AAKPGKAAVSGDFADSYKKKLALSDSARQLYQNAKYREALDIHYEDL 374


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
NCTC10437_04962  RTXTOXINA  29     0.036    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 28.8 bits (64), Expect = 0.036
Identities = 24/137 (17%), Positives = 45/137 (32%), Gaps = 26/137 (18%)

Query: 162 ELLQGQGGALSAMLSSTSAFTQNLAARDQLIGDVITNLNTVLGTVDEKGAEFDASVDRLQ 221
+LLQ A + + NL ++ L T L ++ +
Sbjct: 116 KLLQKYQKAGNILGGGAENIGDNLGKAGGILSTFQNFLGTALSSMK------------ID 163

Query: 222 KLLTGLAEGRDPIAGAIPPLAQASTDLTQMLQESRRPIQAVIENARPLAQRLDERKADIN 281
+L+ G + + LA+AS +L L ++ L ++ +N
Sbjct: 164 ELIKKQKSGGNVSSSE---LAKASIELINQL----------VDTVASLNNNVNSFSQQLN 210

Query: 282 KVIEPLAENYLRLNALG 298
+ L N LN +G
Sbjct: 211 TLGSVL-SNTKHLNGVG 226


97  NCTC10437_05058  NCTC10437_05065  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_05058  1       11       0.210756   dehydrogenase of uncharacterised specificity,
NCTC10437_05059  1       8        0.178031   general substrate transporter
NCTC10437_05060  0       9        1.031206   transcriptional regulator
NCTC10437_05061  -1      11       -0.119766  lysophospholipase
NCTC10437_05062  -1      10       -0.877461  TetR family transcriptional regulator
NCTC10437_05063  -1      9        -2.013239  Protein of uncharacterised function (DUF2867)
NCTC10437_05064  -1      8        -2.017145  phospholipid/glycerol acyltransferase
NCTC10437_05065  1       8        -1.865174  TetR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05058  DHBDHDRGNASE  77     7e-19    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 76.6 bits (188), Expect = 7e-19
Identities = 59/257 (22%), Positives = 98/257 (38%), Gaps = 22/257 (8%)

Query: 4 LSGRTALVTGGAGGIGAACARALAERGATVTVADIDAAGATSLAEEIGGTAWTVDLF--D 61
+ G+ A +TG A GIG A AR LA +GA + D + + + A + F D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 62 VAALEDLRLET----------DILVNNAGIQSISEIAEFPPDKFRSMMALMVEAPFLLIR 111
V + T DILVN AG+ I +++ + ++ F R
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 112 AALPHMYRNGFGRVINISSIHGLRASPFKVAYVTAKHAMEGLSKVTALEGGPHGVTSNCI 171
+ +M G ++ + S AY ++K A +K LE + + N +
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 172 NPGYVRT----PLVTKQIADQARTHSISEDKVLSDVLLKESAVKRLVEPEEVGALATWLA 227
+PG T L + + E L K+L +P ++ +L
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPL------KKLAKPSDIADAVLFLV 239

Query: 228 SPVAGMVTGASYTMDGG 244
S AG +T + +DGG
Sbjct: 240 SGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_05059  TCRTETA  35     7e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.8 bits (80), Expect = 7e-04
Identities = 19/53 (35%), Positives = 28/53 (52%), Gaps = 2/53 (3%)

Query: 284 ADTNSILRYLLVAHAV-HFAIVPFVGYLADRFGRRPVYMIGAILGATWGFFAF 335
D + LL +A+ FA P +G L+DRFGRRPV ++ + GA +
Sbjct: 39 NDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVS-LAGAAVDYAIM 90



Score = 31.7 bits (72), Expect = 0.005
Identities = 21/91 (23%), Positives = 33/91 (36%), Gaps = 9/91 (9%)

Query: 83 VFGHYGDKYGRKKLLQFSLLLVGVATFLMGCLPTFGQIGYWAPGLLVTLRFIQGFAVGGE 142
V G D++GR+ +L SL V +M P W +L R + G G
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFL-----W---VLYIGRIVAGIT-GAT 112

Query: 143 WGGAVLLVAEHSPDRQRGFWASWPQAGVPVG 173
A +A+ + +R + A G
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFG 143


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_05062  HTHTETR  53     3e-11    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 53.5 bits (128), Expect = 3e-11
Identities = 30/147 (20%), Positives = 52/147 (35%), Gaps = 1/147 (0%)

Query: 6 RTPREVWIDTALEMLAAEGPDSVRVEVLAQRLGVTRGGFYRQFGGRDDLVDAMLHVWEQR 65
+ R+ +D AL + + +G S + +A+ GVTRG Y F + DL + + E
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 66 SIDDVRERVEAEGGDARRKVRKAGVLTF-SAQLLPLDLAVREWARRDPDVEARLRRVDNR 124
+ E GD +R+ + S + E + + V
Sbjct: 70 IGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQA 129

Query: 125 RMAYLRELISSICDGDAAEVEARALLA 151
+ E I +EA+ L A
Sbjct: 130 QRNLCLESYDRIEQTLKHCIEAKMLPA 156


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_05065  HTHTETR  58     1e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 58.5 bits (141), Expect = 1e-12
Identities = 30/197 (15%), Positives = 64/197 (32%), Gaps = 9/197 (4%)

Query: 9 PEQNSATAARVLTAANDLLLARGAKGFTVADVAARAHVGKGTVYLYWPTKEDLLIGLVGR 68
++ T +L A L +G ++ ++A A V +G +Y ++ K DL +
Sbjct: 6 KQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWEL 65

Query: 69 GFLRILDELITRFGSDVDLVRPS------RFCPAMLQVATGQPLIAALQRHDDDLLGILV 122
I + + + + + L+ + H + +G +
Sbjct: 66 SESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEE-RRRLLMEIIFHKCEFVGEMA 124

Query: 123 DHPRSIALTEALGPDAVLNAAIPLWREHGMARTDWDIADQTFAVQALISGVLLSLLADPV 182
++ L + + E M D ++ ISG++ + L P
Sbjct: 125 VVQQAQRNLC-LESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAP- 182

Query: 183 STDDRLRVFGAAVTALL 199
+ D + V LL
Sbjct: 183 QSFDLKKEARDYVAILL 199


98  NCTC10437_05287  NCTC10437_05292  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias   Product
NCTC10437_05287  -2      8        0.274362  D-amino acid aminohydrolase
NCTC10437_05288  -2      10       0.024734  Rieske (2Fe-2S) domain-containing protein
NCTC10437_05289  -1      11       0.834162  transcriptional regulator
NCTC10437_05290  0       10       0.720997  LysR family transcriptional regulator
NCTC10437_05291  0       10       0.450009  dehydrogenase of uncharacterised specificity,
NCTC10437_05292  1       11       0.957321  dehydrogenase of uncharacterised specificity,
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_05287  UREASE  59     2e-11    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 59.4 bits (144), Expect = 2e-11
Identities = 39/130 (30%), Positives = 55/130 (42%), Gaps = 18/130 (13%)

Query: 3 DLKITGGTVVDGTGADRFTADVAVKDGKIVEIRRRGPSDPP-----LEGNATETIDATGK 57
D IT ++D G AD+ +KDG+I I + G D + G TE I GK
Sbjct: 69 DTVITNALILDHWGI--VKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGK 126

Query: 58 IVAPGFVDIHTHYDGQVSWDSVLEPSSNHGVTTVVAGNCG----VGFAPVRPGSEEWLIA 113
IV G +D H H+ + + E + G+T ++ G G PG W IA
Sbjct: 127 IVTAGGMDSHIHF---ICPQQIEEALMS-GLTCMLGGGTGPAHGTLATTCTPGP--WHIA 180

Query: 114 LM-EGVEDIP 122
M E + P
Sbjct: 181 RMIEAADAFP 190


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05289  TETREPRESSOR  74     2e-18    Tetracycline repressor protein signature.
		>TETREPRESSOR#Tetracycline repressor protein signature.

Length = 218

Score = 74.2 bits (182), Expect = 2e-18
Identities = 41/200 (20%), Positives = 79/200 (39%), Gaps = 7/200 (3%)

Query: 14 VTAELILQTAAQLIEIDGVKALTMRNLADRLGVAVTSIYWHIGGRDQLLDSLVERLLGE- 72
+ E ++ A +L+ G+ LT R LA +LG+ ++YWH+ + LLD+L +L
Sbjct: 4 LNRESVIDAALELLNETGIDGLTTRKLAQKLGIEQPTLYWHVKNKRALLDALAVEILARH 63

Query: 73 -LANLPAEGDDPVERIASLARSQRRVLIERQHLLGIAHERDRTPQLFLPIQQALAAQLAE 131
+LPA G+ + + A S RR L+ + + + + ++ L + E
Sbjct: 64 HDYSLPAAGESWQSFLRNNAMSFRRALLRYRDGAKVHLGTRPDEKQYDTVETQLRF-MTE 122

Query: 132 LDVTGTEAALILRSVQVHVISSAVMQFSAVRGAKHDEEDPSLWADASPDR-ALVEALQAP 190
+ + + +V H AV++ A D + P +
Sbjct: 123 NGFSLRDGLYAISAVS-HFTLGAVLEQQEHTAALTDRPAAP--DENLPPLLREALQIMDS 179

Query: 191 TDYDAVFEFVLDALLATLRT 210
D + F L++L+
Sbjct: 180 DDGEQAFLHGLESLIRGFEV 199


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05291  DHBDHDRGNASE  106    4e-30    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 106 bits (266), Expect = 4e-30
Identities = 65/252 (25%), Positives = 108/252 (42%), Gaps = 12/252 (4%)

Query: 5 DGKNVVITGGSSGLGLAAARYLVDNGAHVL---ITGRHRDTLEAAGRRLGGNAITLASDA 61
+GK ITG + G+G A AR L GAH+ + + ++ + +A +D
Sbjct: 7 EGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADV 66

Query: 62 SSLADIDALAARAREEFGSLDALVVNAGIGSFDAFDDVTERTFDQVFAINTKGPFFTVQR 121
A ID + AR E G +D LV AG+ +++ ++ F++N+ G F +
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 122 LAPLLAP--GSGVVLTTSIANQTGWAALSVYSASKAALRSMARTLSRELLPRGIRVNAIS 179
++ + +V S +++ Y++SKAA + L EL IR N +S
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 180 PGSIDTGKL------EKEVPERAAQLKAEFTDSSPMRRWGHPDEFAPAVAFLAFD-ATYV 232
PGS +T E + F P+++ P + A AV FL A ++
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 233 AGIELVVDGGES 244
L VDGG +
Sbjct: 247 TMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05292  DHBDHDRGNASE  106    7e-30    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 106 bits (265), Expect = 7e-30
Identities = 77/253 (30%), Positives = 118/253 (46%), Gaps = 15/253 (5%)

Query: 5 EGKRAVITGGTSGIGLATAQLLVDGGARVLVTGRTPASLEKARAALGPRAI------VEA 58
EGK A ITG GIG A A+ L GA + P LEK ++L A +
Sbjct: 7 EGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADV 66

Query: 59 SDAVSGIDALVERAAAEFGAVDLLVLNAGATVTSTVDDTSEADYDALFELNTKAPFFTLQ 118
D+ + ID + R E G +D+LV AG + S+ +++A F +N+ F +
Sbjct: 67 RDS-AAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 119 KFLPLLPDGSAVVLTTSVSNTKGIAATSV--YSATKAALRSLTRTFARELIDRGIRVNAV 176
+ D + + T SN G+ TS+ Y+++KAA T+ EL + IR N V
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 177 SPGPVDTGILQRTMSPDAARE-----FLDQVKASNPMQRFGLPEEVAKAILFLGFD-ATY 230
SPG +T + + + E L+ K P+++ P ++A A+LFL A +
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 231 TTGTELLVDGGAS 243
T L VDGGA+
Sbjct: 246 ITMHNLCVDGGAT 258


99  NCTC10437_05591  NCTC10437_05600  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias   Product
NCTC10437_05591  -1      9        1.615483  HNH nuclease
NCTC10437_05592  -2      9        1.657809  luciferase family protein
NCTC10437_05593  -1      8        2.225999  amidase
NCTC10437_05594  0       10       1.193898  short-chain dehydrogenase
NCTC10437_05595  0       12       0.771349  TetR family transcriptional regulator
NCTC10437_05596  -1      14       0.047629  dehydrogenase
NCTC10437_05597  -1      12       0.881523  inner membrane protein yidH
NCTC10437_05598  0       11       0.863276  Uncharacterised protein
NCTC10437_05599  -1      10       0.722520  enoyl-CoA hydratase/isomerase
NCTC10437_05600  -1      9        0.525203  transcriptional regulator containing an amidase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits        Score  E-value  Comments
NCTC10437_05591  V8PROTEASE  33     0.002    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 33.4 bits (76), Expect = 0.002
Identities = 16/53 (30%), Positives = 24/53 (45%), Gaps = 1/53 (1%)

Query: 224 VHVVVRQDTLDAGEPDGPSNPGGPSDPDGIPQNPSDPQHTGVPGGGPEGAEDS 276
+H + PD P+NP P++PD P NP +P + P G D+
Sbjct: 281 IHFANDDQPNNPDNPDNPNNPDNPNNPDE-PNNPDNPNNPDNPDNGDNNNSDN 332


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05594  DHBDHDRGNASE  76     2e-18    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 76.2 bits (187), Expect = 2e-18
Identities = 50/183 (27%), Positives = 83/183 (45%), Gaps = 4/183 (2%)

Query: 6 FVTGSSRGFGRALVRAALAAGDRVAATARRPEQLADLVTEYGDRVL---PLALDVTDAGA 62
F+TG+++G G A+ R + G +AA PE+L +V+ DV D+ A
Sbjct: 12 FITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSAA 71

Query: 63 AVTSIATAREHFGRVDVVVNNAGYANVAPIETGDDTDFRAQFETNFWGVYHVSKAAIPVL 122
A G +D++VN AG I + D ++ A F N GV++ S++ +
Sbjct: 72 IDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKYM 131

Query: 123 REQGGGLIIQFSSMGGRVGGSAGIASYQAAKFAIDGFSRVLQTETAPFGITVLVVEPSGF 182
++ G I+ S V +A+Y ++K A F++ L E A + I +V P
Sbjct: 132 MDRRSGSIVTVGSNPAGV-PRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGST 190

Query: 183 ATD 185
TD
Sbjct: 191 ETD 193


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_05595  HTHTETR  62     3e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 61.6 bits (149), Expect = 3e-14
Identities = 32/160 (20%), Positives = 60/160 (37%), Gaps = 18/160 (11%)

Query: 1 MRRDAAVNRRRLLTAAEAVFAERGP-AATLDDVAAAAEVGPATLYRHFANKEELVQEVLR 59
+++A R+ +L A +F+++G + +L ++A AA V +Y HF +K +L E+
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 60 SFFQRLIDVAGRAADAPAAD-------GLEAFLSTVGVELAEKSGL-----SAPVWGELA 107
+ ++ D L L + E + + GE+A
Sbjct: 65 LSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMA 124

Query: 108 PV-----SLVTELRDLSTELLTRAQRAGAVRQDVTPDDIA 142
V +L E D + L A + D+ A
Sbjct: 125 VVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAA 164


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05596  DHBDHDRGNASE  129    1e-38    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 129 bits (324), Expect = 1e-38
Identities = 88/251 (35%), Positives = 122/251 (48%), Gaps = 18/251 (7%)

Query: 6 LTGRTALVTGGTAGIGLASARALAEAGASVIITGRDPEKGADAAAELGARVRF---VAAD 62
+ G+ A +TG GIG A AR LA GA + +PEK + L A R AD
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 63 MADRHSVDQLAHQ-----GVIDILVNNAASFPGAMTVDQDVESFERTFATNVRGVYFLVA 117
+ D ++D++ + G IDILVN A + E +E TF+ N GV+
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 118 ALAPGMIARRRGAIVNVTSMVAGKGVAGASTYSSSKAAVESLTRTWAAEFGPHGVRVNNV 177
+++ M+ RR G+IV V S AG + Y+SSKAA T+ E + +R N V
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 178 APGPTATEGVAAEWGEVN----------EELGRALPLGRTADPREIADAVLFLASPRASF 227
+PG T T+ + W + N E +PL + A P +IADAVLFL S +A
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 228 ITGTTLHVDGG 238
IT L VDGG
Sbjct: 246 ITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_05600  HTHTETR  28     0.029    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 28.4 bits (63), Expect = 0.029
Identities = 8/89 (8%), Positives = 25/89 (28%)

Query: 215 RDEHPALLLQAMDFIERNAAHDIGVGDVAAAVYLTPRTVQYMFRNHLGTTPTTYLREVRL 274
++ +L A+ + +G++A A +T + + F++ +
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 275 AHAREELLAGDRRMTTVAAVAARWKFAHT 303
E ++ +
Sbjct: 70 IGELELEYQAKFPGDPLSVLREILIHVLE 98


100  NCTC10437_05631  NCTC10437_05637  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias   Product
NCTC10437_05631  -2      10       0.531375  nucleoside-diphosphate-sugar epimerase
NCTC10437_05632  -1      9        0.511795  transcriptional regulator
NCTC10437_05633  -1      9        1.216146  histidine kinase with cyclic nucleotide-binding
NCTC10437_05634  -1      11       1.652378  dTDP-4-dehydrorhamnose reductase
NCTC10437_05635  -2      9        1.616402  esterase/lipase
NCTC10437_05636  -2      9        1.626541  Uncharacterised protein
NCTC10437_05637  -2      8        1.806317  MMPL domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05631  NUCEPIMERASE  56     5e-11    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 55.9 bits (135), Expect = 5e-11
Identities = 42/186 (22%), Positives = 68/186 (36%), Gaps = 26/186 (13%)

Query: 1 MHVFVTGASGWIGTAVVEELLASGHDVTGL---------ARSDTSAQSLERKGIRVRRGD 51
M VTGA+G+IG V + LL +GH V G+ + + L + G + + D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 52 LDDLAGLRAGAEDA--DATIHLAN----KHDWANMAASNAAERAATQTIGDTLAGTGRPF 105
L D G+ + + ++ N A + I +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 106 LI---ASGVAGLAEGRAATEDDPSPFHGPESPRGGS----ENLALDFV-TRGVHSVSLRF 157
L+ +S V GL + DD P S + E +A + G+ + LRF
Sbjct: 121 LLYASSSSVYGLNRKMPFSTDDSVD--HPVSLYAATKKANELMAHTYSHLYGLPATGLRF 178

Query: 158 APTVHG 163
TV+G
Sbjct: 179 F-TVYG 183


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_05632  HTHTETR  58     2e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 57.7 bits (139), Expect = 2e-12
Identities = 32/197 (16%), Positives = 50/197 (25%), Gaps = 20/197 (10%)

Query: 24 TPERLQEAALELFAAHGYEQTTATEIAAAVGLTERTFYRHFTDKREVL---FHGQQVLAD 80
T + + + AL LF+ G T+ EIA A G+T Y HF DK ++ + +
Sbjct: 12 TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIG 71

Query: 81 AFLAGVDGAPPTASPMEVGVQALRSASTFFPDDRRPHSRVRQSVIDKNPALQEREEHKLA 140
P P+ V + L R I + E +
Sbjct: 72 ELELEYQAKFP-GDPLSVLREILI---HVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQ 127

Query: 141 VLGATLADALRGTGAD-------------GHTAEVVARTIVMAFGITFGRWISAGQQASF 187
L A + W+ A Q
Sbjct: 128 QAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDL 187

Query: 188 DDIVTDVLRGVAEATRP 204
D + + E
Sbjct: 188 KKEARDYVAILLEMYLL 204


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05634  NUCEPIMERASE  35     2e-04    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 34.8 bits (80), Expect = 2e-04
Identities = 15/29 (51%), Positives = 15/29 (51%)

Query: 1 MNITVIGATGQIGSQVVTILRAAGHHVVG 29
M V GA G IG V L AGH VVG
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVG 29


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05637  ACRIFLAVINRP  48     1e-07    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 48.3 bits (115), Expect = 1e-07
Identities = 40/218 (18%), Positives = 77/218 (35%), Gaps = 22/218 (10%)

Query: 187 IAIPLSLLVLVWVFGGLAAAVLPIVVAAASIVASIAVLRAIALWTEVSIFALNLTTALGL 246
AI L LV+ + A ++P + ++ + A+L A +++N T G+
Sbjct: 346 EAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFG-------YSINTLTMFGM 398

Query: 247 ALAI-----DYTLLILSRFRDEVAGGAERTVALRTTMATAGRTMAFSGLVVT---LSMAT 298
LAI D +++ + R + A +M+ + +V++ + MA
Sbjct: 399 VLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAF 458

Query: 299 MLLFPVPFLRSFAYAGVATVVLCAAASLVLTPALIVLLGPRLDIRRGRHRRGGHQQPIQH 358
R F+ V+ + L +L+LTPAL L + ++ G
Sbjct: 459 FGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTT 518

Query: 359 NPW-----YRSTAFVIRHAVPLAVAGTALLILVGVPFL 391
S ++ LI+ G+ L
Sbjct: 519 FDHSVNHYTNSVGKILGS--TGRYLLIYALIVAGMVVL 554



Score = 31.7 bits (72), Expect = 0.011
Identities = 11/48 (22%), Positives = 24/48 (50%)

Query: 537 SRLPLVLAVIVAVIFVVLFALTRSVVLPLKALVLNMLSLTAAFGALVW 584
++ P ++A+ V+F+ L AL S +P+ +++ L + A
Sbjct: 870 NQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATL 917


101  NCTC10437_05654  NCTC10437_05659  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_05654  0       11       -1.941148  signal transduction histidine kinase
NCTC10437_05655  -1      9        -0.787139  response regulator receiver modulated
NCTC10437_05656  0       6        1.046338   polysaccharide deacetylase
NCTC10437_05657  -1      7        1.026550   putative agmatinase
NCTC10437_05658  -1      6        0.964316   short-chain dehydrogenase/reductase SDR
NCTC10437_05659  -2      7        1.118278   short-chain dehydrogenase/reductase SDR
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_05654  HTHFIS  75     7e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 74.9 bits (184), Expect = 7e-16
Identities = 36/127 (28%), Positives = 60/127 (47%), Gaps = 3/127 (2%)

Query: 1037 TVLLVDDDVRNVFSLGSALKRYGITVIPASNGQLGLDSLAEHPEIGLVLMDIMMPVMDGY 1096
T+L+ DDD L AL R G V SN +A LV+ D++MP + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD-GDLVVTDVVMPDENAF 63

Query: 1097 EAMRRIRADSRWRNLPIVALTAKAMKGDRQKCLDAGASDYVTKPVDMDQLTSVLRVWLAG 1156
+ + RI+ +LP++ ++A+ K + GA DY+ KP D+ +L ++ LA
Sbjct: 64 DLLPRIK--KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 1157 RPRAQPQ 1163
R +
Sbjct: 122 PKRRPSK 128



Score = 56.8 bits (137), Expect = 3e-10
Identities = 26/116 (22%), Positives = 47/116 (40%), Gaps = 7/116 (6%)

Query: 765 TKPVLLIVEDDATFARMLAELATEAGFESVVTPSGRMALALAKERQPAAITLDIGLPDMA 824
T +L+ +DDA +L + + AG++ +T + + D+ +PD
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 825 GWVVLDVLKHDLATR-HIPVNVVSGSNDDGR---GRRMGAIQTLNKPAEISDLRSM 876
D+L R +PV V+S N GA L KP ++++L +
Sbjct: 62 A---FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGI 114



Score = 55.2 bits (133), Expect = 1e-09
Identities = 22/82 (26%), Positives = 38/82 (46%), Gaps = 3/82 (3%)

Query: 890 HVLVAEDDANQRQVVKHMIESDDISITCVGTGERVLEELSGPTTYDCLVLDLGLPDIDGI 949
+LVA+DDA R V+ + + + ++ D +V D+ +PD +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAA-GDGDLVVTDVVMPDENAF 63

Query: 950 DLIERIKDQLKEKSLPVIVHTG 971
DL+ RIK + LPV+V +
Sbjct: 64 DLLPRIKKARPD--LPVLVMSA 83


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
NCTC10437_05655  HTHFIS  85     2e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 85.3 bits (211), Expect = 2e-19
Identities = 29/149 (19%), Positives = 62/149 (41%), Gaps = 3/149 (2%)

Query: 35 VLIVDDDARLRELLCTVLTPLDCEIAQAGSGEEALTALLQRKVAVIVLDINMPGMDGFET 94
+L+ DDDA +R +L L+ ++ + + ++V D+ MP + F+
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 95 AQLIRDVEELASTPIVFLTGQADDSDLHRGYDLGAVDFLVKP-VPRQVFYAKVKALLELD 153
I+ + P++ ++ Q + + GA D+L KP ++ +AL E
Sbjct: 66 LPRIK--KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 154 RSFARLRSEAARLHEQQMQDARAAEVRQR 182
R ++L ++ + A E+ +
Sbjct: 124 RRPSKLEDDSQDGMPLVGRSAAMQEIYRV 152


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05658  DHBDHDRGNASE  115    3e-33    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 115 bits (289), Expect = 3e-33
Identities = 73/253 (28%), Positives = 113/253 (44%), Gaps = 7/253 (2%)

Query: 5 LTGRVALVTGAAQGMGSAHARRLAAAGATVAVNDIRDSPALAALADQIAGVRVA----GD 60
+ G++A +TGAAQG+G A AR LA+ GA +A D ++ A R A D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 61 ICDPAECERVVADVVTATGRLDILVANHAYMTMAPLLEHDDADWWKVVDTNLGGTFFLVQ 120
+ D A + + A + G +DILV + + D +W N G F +
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 121 AVLPHMRAAGAGRVVVITSEWGVTGWPEATAYAAAKSGLISLVKTLGRELAPEHIIVNAV 180
+V +M +G +V + S AYA++K+ + K LG ELA +I N V
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 181 APGVTDTP---QLQVDARAAGLDLGTVHELYAESIPLARIGVPDEIAVAVELLSDFELEA 237
+PG T+T L D A + E + IPL ++ P +IA AV L +
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 238 VVGQVISCNGGST 250
+ + +GG+T
Sbjct: 246 ITMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05659  DHBDHDRGNASE  97     1e-26    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 97.4 bits (242), Expect = 1e-26
Identities = 64/232 (27%), Positives = 108/232 (46%), Gaps = 22/232 (9%)

Query: 6 VALVTGGASGIGAATVDMLLRRGFTVGCLDLK----ESVAGAEYTVA-------ADVSDA 54
+A +TG A GIG A L +G + +D E V + A ADV D+
Sbjct: 10 IAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDS 69

Query: 55 DAVREAVSELRDQLGPVSAVVTSAGYYEMAPVADITVDAWRRMLRVHLGGLVNVARATLP 114
A+ E + + ++GP+ +V AG + ++ + W V+ G+ N +R+
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSK 129

Query: 115 DLVDAR-GALVAVASELAVGGGDGDAHYAAAKGAILGFVRSLAAEVAPQGVTVNCVAPGP 173
++D R G++V V S A A YA++K A + F + L E+A + N V+PG
Sbjct: 130 YMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGS 189

Query: 174 TDTPL---LAADSPWRAK-------EYLTTLPLRKLATPREVARCIEYLLCD 215
T+T + L AD + + T +PL+KLA P ++A + +L+
Sbjct: 190 TETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSG 241


102  NCTC10437_05698  NCTC10437_05707  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_05698  0       17       -2.515138  major facilitator superfamily protein
NCTC10437_05699  -1      20       -3.088035  flavin-dependent oxidoreductase, F420-dependent
NCTC10437_05700  -2      17       -3.114985  Predicted flavin-nucleotide-binding protein
NCTC10437_05701  -2      17       -3.155978  alpha/beta hydrolase fold protein
NCTC10437_05702  -2      15       -2.127616  aldo/keto reductase
NCTC10437_05703  0       16       -2.210830  short-chain dehydrogenase/reductase SDR
NCTC10437_05704  -1      10       -1.699778  TetR family transcriptional regulator
NCTC10437_05705  -1      9        -1.666392  MerR family transcriptional regulator
NCTC10437_05706  0       10       -1.552872  Uncharacterised protein
NCTC10437_05707  2       12       -1.108661  short-chain dehydrogenase/reductase SDR
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_05698  TCRTETB  43     2e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 42.6 bits (100), Expect = 2e-06
Identities = 27/173 (15%), Positives = 65/173 (37%), Gaps = 6/173 (3%)

Query: 27 VLALVGTLNYVDRFLPGVLAEPIKQELALSDTAIGVINGFGFLIVYAVLGIVIARIADRG 86
+L+ LN + + V I + + +N F++ +++ V +++D+
Sbjct: 21 ILSFFSVLNEM---VLNVSLPDIANDFNKPPASTNWVNT-AFMLTFSIGTAVYGKLSDQL 76

Query: 87 VFGAVIAGCLTLWGAMTMLGGAVQSGLQ-LALTRVGVAVGEAGSSPAAHAYVARNFVPEK 145
++ + + +++G S L + R G A VAR E
Sbjct: 77 GIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKEN 136

Query: 146 RSAPLAVITMSIPLASAASLLGGGLLAESLGWRAAFVVMGGISVLLAPLVLWV 198
R +I + + GG++A + W ++ I+++ P ++ +
Sbjct: 137 RGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIP-MITIITVPFLMKL 188



Score = 31.4 bits (71), Expect = 0.008
Identities = 27/136 (19%), Positives = 57/136 (41%), Gaps = 20/136 (14%)

Query: 278 LGLLIVGRIADRLATRDPRWLLWIVVILITGLLPASALAFVVESQTLCVWLLALAYVIGT 337
+G + G+++D+L + L + I+I S + FV + L+ ++ G
Sbjct: 64 IGTAVYGKLSDQLGIKR----LLLFGIIINCF--GSVIGFV--GHSFFSLLIMARFIQGA 115

Query: 338 ---AYLAPSIAAIQRLVLPEQRATASAIFLFFNATLGAVGPFLTGVISDALTAELGPQAL 394
A+ A + + R + E R A + A VGP + G+I+ +
Sbjct: 116 GAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIA---------HYI 166

Query: 395 GRALLILVPTMQLVAI 410
+ L+L+P + ++ +
Sbjct: 167 HWSYLLLIPMITIITV 182


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_05702  INTIMIN  31     0.007    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 30.8 bits (69), Expect = 0.007
Identities = 11/36 (30%), Positives = 20/36 (55%), Gaps = 5/36 (13%)

Query: 175 SQITPEQLAEARQIADIVTVQSRYNVIDRSAQAVLE 210
QI P+ + E R ++ SRY+++ R+ +LE
Sbjct: 417 QQIEPQYVNELRTLS-----GSRYDLVQRNNNIILE 447


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05703  DHBDHDRGNASE  116    1e-33    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 116 bits (291), Expect = 1e-33
Identities = 75/253 (29%), Positives = 118/253 (46%), Gaps = 15/253 (5%)

Query: 4 LSGRRALVTGGSRGIGAGIVRRLTADGAAVAFTFSASKAEADELLADVTAHGAKAVAIHA 63
+ G+ A +TG ++GIG + R L + GA +A + + +++++ + A A A A
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIA-AVDYNPEKLEKVVSSLKAEARHAEAFPA 64

Query: 64 DVADPTQAAAAVDTAAAELGGMDIVVNNAGIAVKAPIEEFTQEQYDRLVAINIGGPFWTT 123
DV D E+G +DI+VN AG+ I + E+++ ++N G F +
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 124 HSSLKHLGD--GGRIINIGSINADRVPVPELSVYAMTKGAVSSFTRGLARELGPRGITVN 181
S K++ D G I+ +GS N VP ++ YA +K A FT+ L EL I N
Sbjct: 125 RSVSKYMMDRRSGSIVTVGS-NPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 182 NVQPGPINTDMN-----PDEGD------FAAALKNVTALGRYGQTSDVAAVVSFLAGPES 230
V PG TDM + G K L + + SD+A V FL ++
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQA 243

Query: 231 SYVTGANLNVDGG 243
++T NL VDGG
Sbjct: 244 GHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_05704  HTHTETR  47     7e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 46.5 bits (110), Expect = 7e-09
Identities = 26/150 (17%), Positives = 46/150 (30%), Gaps = 16/150 (10%)

Query: 1 MKLFWQHGFDGVSISDVTAVTGVNRRSIYAEFGSKEKLFMRAAERYLAGPSGYVTDAL-- 58
++LF Q G S+ ++ GV R +IY F K LF E + +
Sbjct: 21 LRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELELEYQAK 80

Query: 59 ----TRPTAREVAEAMVHGAAN-----LVSGAIPGCLTTVGEAPGLAELR----EAIVRR 105
RE+ ++ L+ I VGE + + + R
Sbjct: 81 FPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNLCLESYDR 140

Query: 106 LAERFDAAVADGELF-GVDTVVLARWIVAV 134
+ + + L + T A +
Sbjct: 141 IEQTLKHCIEAKMLPADLMTRRAAIIMRGY 170


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05707  DHBDHDRGNASE  79     2e-19    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 78.9 bits (194), Expect = 2e-19
Identities = 53/183 (28%), Positives = 75/183 (40%), Gaps = 4/183 (2%)

Query: 6 FITGGTPGNFGMAFAETALEAGDRVALTSRRPQELDTWAQQYGDRVLV---VPLELTDAA 62
FITG G G A A T G +A P++L+ P ++ D+A
Sbjct: 12 FITGAAQG-IGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 63 QVQRAVRDAEEHFGGIDVLVNNAGRGWYGSIEGMDESSLRAMFELNFFAVLSVTRAVLPG 122
+ E G ID+LVN AG G I + + A F +N V + +R+V
Sbjct: 71 AIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKY 130

Query: 123 MRARGNGWIVNVSSVAGLVSAPGFGFYSATKYAIEAITDALRDEVAAQGISVLTVEPGAF 182
M R +G IV V S V Y+++K A T L E+A I V PG+
Sbjct: 131 MMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGST 190

Query: 183 RTN 185
T+
Sbjct: 191 ETD 193


103  NCTC10437_05723  NCTC10437_05728  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
NCTC10437_05723  2       8        -0.923232  dehydrogenase of uncharacterised specificity,
NCTC10437_05724  2       7        -0.787709  TetR family transcriptional regulator
NCTC10437_05725  2       7        -0.794427  acetyltransferase
NCTC10437_05726  3       8        -1.169979  alpha/beta hydrolase fold protein
NCTC10437_05727  1       7        -1.306058  WD40 domain-containing protein
NCTC10437_05728  2       7        -1.148102  WD40 domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
NCTC10437_05723  DHBDHDRGNASE  119    1e-34    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 119 bits (298), Expect = 1e-34
Identities = 81/253 (32%), Positives = 125/253 (49%), Gaps = 11/253 (4%)

Query: 4 LTGKTALVTGATSGIGLAGARALAEEGAYVFLVGRRQDALEDAVAGIGAE--HASAIRAD 61
+ GK A +TGA GIG A AR LA +GA++ V + LE V+ + AE HA A AD
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 62 VTVQADLDRVAATIKATGRRLDVVFANAGINEFATLGNLTWEHHTTIFNTNVGGVIFAVQ 121
V A +D + A I+ +D++ AG+ + +L+ E F+ N GV A +
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 122 AALPLLND--GASIILCGSNGDVKAAPGASVYAASKAAIRSLARSWAAELVDRGIRVNVV 179
+ + D SI+ GSN + YA+SKAA + EL + IR N+V
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 180 APGLTQTPGLADLFSEADDA-------LADLTSTVPMKRRARPEEIGSVVAFLASDGSSF 232
+PG T+T L+++ + A L + +P+K+ A+P +I V FL S +
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 233 MTGAEVYVDGGVS 245
+T + VDGG +
Sbjct: 246 ITMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
NCTC10437_05724  HTHTETR  48     4e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 48.1 bits (114), Expect = 4e-09
Identities = 26/151 (17%), Positives = 48/151 (31%), Gaps = 8/151 (5%)

Query: 18 DQALDSAIDVFWRLGYEGASLAELTNAMGINRPSLYAVFGSKEELFIRALQRYGTTYHEH 77
LD A+ +F + G SL E+ A G+ R ++Y F K +LF + + E
Sbjct: 14 QHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGEL 73

Query: 78 LGQLLSRPGAY---QVLESYLRATANAVRAGSAPGCLSIQGGLSCGPNNARIPRLLA--- 131
+ ++ + E + + V + I + +
Sbjct: 74 ELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNL 133

Query: 132 --EYRHSIEAAVADALARTEDAAGIDTAALA 160
E IE + + A + T A
Sbjct: 134 CLESYDRIEQTLKHCIEAKMLPADLMTRRAA 164


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_05726  NEISSPPORIN  28     0.042    Neisseria sp. porin signature.
		>NEISSPPORIN#Neisseria sp. porin signature.

Length = 348

Score = 28.0 bits (62), Expect = 0.042
Identities = 13/36 (36%), Positives = 20/36 (55%), Gaps = 6/36 (16%)

Query: 43 ALKFGDDAPRVVFLHG------GGQNAHTWDTVIVG 72
A +FG+ PRV + HG + +T+D V+VG
Sbjct: 273 AYRFGNVTPRVSYAHGFKGTVDSANHDNTYDQVVVG 308


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_05727  IGASERPTASE  32     0.010    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 32.0 bits (72), Expect = 0.010
Identities = 33/154 (21%), Positives = 53/154 (34%), Gaps = 3/154 (1%)

Query: 73 GDAAPARSSGQSDAAGADEASEPEDSESEDADAADAADAADDADAADAADAADAAEEDAE 132
+ + AD S P S +E+ D A A A + AE +
Sbjct: 989 NQTVDTTNITTPNNIQADVPSVP--SNNEEIARVDEAPVPPPAPATPSETTETVAENSKQ 1046

Query: 133 EPGDADDAGEGATEIDAVDDAVVVEDLTGVEVELEVVTTAEREYTEPEPVTRETHSAAAV 192
E + + ATE A + V E + V+ + A+ E T ET A V
Sbjct: 1047 ESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATV 1106

Query: 193 PDSSE-RAAASATQPPPYVSEQSEYQKRVAAVLQ 225
+ + TQ P V+ Q ++ + +Q
Sbjct: 1107 EKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQ 1140


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
NCTC10437_05728  TONBPROTEIN  34     0.002    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 34.2 bits (78), Expect = 0.002
Identities = 10/36 (27%), Positives = 11/36 (30%)

Query: 604 MPESPPYAPADSEEEQSEEDEEAAAPPAPRPPAPAP 639
PP A E E + E P P AP
Sbjct: 53 ADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVV 88



Score = 32.7 bits (74), Expect = 0.005
Identities = 10/38 (26%), Positives = 12/38 (31%)

Query: 603 VMPESPPYAPADSEEEQSEEDEEAAAPPAPRPPAPAPA 640
V P A + + E P P PP AP
Sbjct: 50 VTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPV 87



 
Contact Sachin Pundhir for Bugs/Comments.