PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: Corineo.gb
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
 
C) Potential GIs and PAIs in NC_002935
S.No  Start        End          Bias  Virulence  Insertion elements  Prediction
1     DIP_RS22975  DIP_RS22900  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
DIP_RS22975  1       38       3.358599   thiol reductase thioredoxin
DIP_RS22970  1       34       3.430621   thioredoxin-disulfide reductase
DIP_RS22965  2       36       3.599434   RNA polymerase sigma factor
DIP_RS22960  1       36       3.523047   membrane protein
DIP_RS22955  1       28       3.995642   hypothetical protein
DIP_RS22950  0       23       3.096981   NUDIX hydrolase
DIP_RS22945  0       26       2.875719   CCA tRNA nucleotidyltransferase
DIP_RS22940  0       32       3.779392   hypothetical protein
DIP_RS22935  1       41       4.280538   branched-chain amino acid ABC transporter
DIP_RS22930  0       45       4.305148   branched-chain amino acid ABC transporter
DIP_RS22925  -1      37       4.331386   hypothetical protein
DIP_RS22920  -1      37       4.350532   membrane protein
DIP_RS22915  -2      32       4.159390   iron-sulfur protein
DIP_RS22910  -2      24       3.615265   tryptophan synthase subunit alpha
DIP_RS22905  -1      25       3.927276   tryptophan synthase subunit beta
DIP_RS22900  0       26       3.941351   3-methyl-2-oxobutanoate
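A row of the gene table above can be split into typed fields with a short parser. The sketch below is an illustration only and not part of PredictBias: the names `GeneRow` and `parse_gene_row` are hypothetical, and the field widths (locus tag of `DIP_RS` plus five digits, a one-digit DNBias, a two-digit CDNBias, six decimals for %GCBias) are assumptions inferred from the rows on this page. Whitespace between fields is optional, so rows are accepted with or without column spacing.

```python
import re
from typing import NamedTuple


class GeneRow(NamedTuple):
    """One gene-table row (hypothetical container, not a PredictBias type)."""
    locus_tag: str
    dn_bias: int      # dinucleotide bias
    cdn_bias: int     # codon bias
    gc_bias: float    # %GC bias
    product: str


# Assumed layout: tag, one-digit DNBias (optionally negative),
# two-digit CDNBias, %GCBias with exactly six decimals, then the product.
ROW_RE = re.compile(
    r"^(DIP_RS\d{5})\s*(-?\d)\s*(\d{2})\s*(-?\d+\.\d{6})\s*(.*)$"
)


def parse_gene_row(line: str) -> GeneRow:
    """Split one row into its five fields; raise if the layout does not match."""
    match = ROW_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised row layout: {line!r}")
    tag, dn, cdn, gc, product = match.groups()
    return GeneRow(tag, int(dn), int(cdn), float(gc), product)


if __name__ == "__main__":
    print(parse_gene_row("DIP_RS22915-2324.159390iron-sulfur protein"))
```

The six-decimal assumption is what separates a %GCBias value from a product name that starts with a digit, as in the last row of the table.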
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS22960  SECFTRNLCASE  30     0.045    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 30.2 bits (68), Expect = 0.045
Identities = 19/83 (22%), Positives = 39/83 (46%), Gaps = 14/83 (16%)

Query: 549 SREVTKTSVWALGSSLVGIVVALGLSMGMDRVAGGFFEF---FGSVGMLIHLAIVGVVFL 605
S E+ T+VW+L ++ V I+ + + FE+ G+V L+H ++ V
Sbjct: 148 SGELVWTAVWSLLAATVVIMFYIWVR----------FEWQFALGAVVALVHDVLLTVGLF 197

Query: 606 VVTALVLSRSGLEEVVS-LGYAL 627
V L + + +++ GY++
Sbjct: 198 AVLQLKFDLTTVAALLTITGYSI 220
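Each alignment on this page is preceded by two summary lines (Score/Expect, then Identities/Positives/Gaps). As a minimal sketch, assuming that line layout stays as shown, the values can be pulled out with regular expressions and compared against the E-value cut-off of 0.05 listed under the input parameters. The helper name `parse_hsp_summary` and the constant `EVALUE_CUTOFF` are hypothetical and not part of PredictBias or BLAST.

```python
import re

EVALUE_CUTOFF = 0.05  # the "E-value (RPSBlast)" input parameter on this page

SCORE_RE = re.compile(
    r"Score\s*=\s*(?P<bits>[\d.]+)\s*bits\s*\((?P<raw>\d+)\),\s*"
    r"Expect\s*=\s*(?P<evalue>[\d.eE+-]+)"
)
FRACTION_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_hsp_summary(text: str) -> dict:
    """Extract bit score, raw score, E-value and match fractions from the summary lines."""
    match = SCORE_RE.search(text)
    if match is None:
        raise ValueError("no Score/Expect line found")
    hsp = {
        "bits": float(match["bits"]),
        "raw_score": int(match["raw"]),
        "evalue": float(match["evalue"]),
    }
    # Gaps is omitted by BLAST for ungapped alignments, so only add what is present.
    for name, numerator, denominator in FRACTION_RE.findall(text):
        hsp[name.lower()] = int(numerator) / int(denominator)
    return hsp


example = """Score = 30.2 bits (68), Expect = 0.045
Identities = 19/83 (22%), Positives = 39/83 (46%), Gaps = 14/83 (16%)"""

hsp = parse_hsp_summary(example)
passes_cutoff = hsp["evalue"] <= EVALUE_CUTOFF

if __name__ == "__main__":
    print(hsp, passes_cutoff)
```

Fractions are kept as ratios rather than the rounded percentages printed in the report, so downstream filters are not affected by how the percentages were rounded.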


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS22910  UREASE        34     4e-04    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 34.3 bits (79), Expect = 4e-04
Identities = 30/116 (25%), Positives = 43/116 (37%), Gaps = 13/116 (11%)

Query: 122 GVDSILLPDVPVREGEPFIAAAKQAGIDPI--FIAPAQASEATLEGVAQHSSGYIYAISR 179
GV I+ P V GE I A G+D FI P Q EA + G+ G
Sbjct: 109 GVTIIVGPGTEVIAGEGKIVTA--GGMDSHIHFICPQQIEEALMSGLTCMLGGGTGP--- 163

Query: 180 DGVTGTERQSSTRGLDKVVANVKRFGGAPILLGF----GISTPEHVRDAIAAGASG 231
GT + T G + ++ P+ L F S P + + + GA+
Sbjct: 164 --AHGTLATTCTPGPWHIARMIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATS 217


2     DIP_RS22855  DIP_RS22790  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
DIP_RS22855  0       22       3.583571   sodium:proton antiporter
DIP_RS22845  2       26       3.929429   hypothetical protein
DIP_RS22840  4       23       5.299152   hypothetical protein
DIP_RS22835  4       22       4.734807   hypothetical protein
DIP_RS22830  4       24       4.935576   ABC transporter ATP-binding protein
DIP_RS22825  5       21       1.953043   cobalt ABC transporter permease
DIP_RS22820  2       18       -0.892675  glycosyl transferase family 9
DIP_RS22815  1       20       -0.259859  inosine-uridine preferring nucleoside hydrolase
DIP_RS22810  1       18       -0.136187  hypothetical protein
DIP_RS22805  1       28       -1.056190  hypothetical protein
DIP_RS22800  1       17       0.469891   hypothetical protein
DIP_RS22795  0       20       1.972232   phage resistance protein
DIP_RS22790  -1      22       4.888314   PTS fructose transporter subunit IIA
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS22810  PF06776       33     0.001    Invasion associated locus B
		>PF06776#Invasion associated locus B

Length = 214

Score = 33.4 bits (76), Expect = 0.001
Identities = 13/58 (22%), Positives = 20/58 (34%), Gaps = 8/58 (13%)

Query: 4 PSISSSVRRACATATAVIALAGVAAPS-ANAQSSNPIEAGSSALNYGANSLLAEWSKN 60
P ++S R A A + LAG A + + S G+ +G W
Sbjct: 36 PMLASCRRLARRN-GARLMLAGAMAIALSFGWSDRADAQGAVRSVHGD------WQIR 86


3     DIP_RS22670  DIP_RS22180  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
DIP_RS22670  -1      16       3.148800   hypothetical protein
DIP_RS22665  -1      18       3.881771   DNA-binding protein
DIP_RS22660  0       22       4.659964   hypothetical protein
DIP_RS22655  0       24       5.071102   membrane protein
DIP_RS22650  -1      26       4.628017   hydrolase
DIP_RS22645  -2      17       2.010394   carbohydrate kinase
DIP_RS22640  -2      14       0.228122   DNA glycosylase
DIP_RS22635  -3      17       0.301636   DNA starvation/stationary phase protection
DIP_RS22630  -2      17       0.824727   membrane protein
DIP_RS22625  -3      15       0.322957   hypothetical protein
DIP_RS22620  -1      17       0.388379   DNA-binding protein
DIP_RS22615  3       18       2.294340   membrane protein
DIP_RS22610  4       20       2.025066   short chain dehydrogenase
DIP_RS22605  4       20       1.466091   universal stress protein
DIP_RS22600  4       20       1.606478   MarR family transcriptional regulator
DIP_RS22595  4       20       2.006021   penicillin-binding protein
DIP_RS22590  1       33       2.252066   membrane protein
DIP_RS22585  0       27       4.444253   hypothetical protein
DIP_RS22580  -3      23       4.345135   30S ribosomal protein S6
DIP_RS22575  -3      22       4.319101   single-stranded DNA-binding protein
DIP_RS22570  -2      22       3.823552   50S ribosomal protein L9
DIP_RS22565  0       22       3.610158   replicative DNA helicase
DIP_RS22560  -2      25       5.059020   membrane protein
DIP_RS22555  -1      19       3.833885   carbonate dehydratase
DIP_RS22550  0       30       4.414438   transporter
DIP_RS22545  -1      29       3.313164   thiol reductase thioredoxin
DIP_RS22540  -2      31       3.861678   hypothetical protein
DIP_RS22535  -1      32       4.367476   NYN domain-containing protein
DIP_RS22530  -1      31       3.073856   GntR family transcriptional regulator
DIP_RS22525  -2      29       2.440555   ABC transporter ATP-binding protein
DIP_RS22520  -2      29       1.671175   membrane protein
DIP_RS22500  -2      26       2.485347   23S rRNA pseudouridylate synthase
DIP_RS22495  -2      32       2.210174   universal stress protein
DIP_RS22490  -1      31       2.393339   membrane protein
DIP_RS22485  -2      29       3.182794   membrane protein insertase YidC
DIP_RS22475  -3      26       3.828273   class E sortase
DIP_RS22465  -1      26       3.232048   antiporter
DIP_RS22460  0       23       4.310296   hypothetical protein
DIP_RS22455  2       23       3.977466   two-component sensor histidine kinase
DIP_RS22450  2       23       3.573502   DNA-binding response regulator
DIP_RS22445  3       23       2.904703   histone
DIP_RS22440  2       24       3.620470   hypothetical protein
DIP_RS22435  1       21       2.781598   recombinase RecB
DIP_RS22430  1       15       -0.184234  alpha-1,2-mannosyltransferase
DIP_RS22425  0       17       -2.405305  oxidoreductase
DIP_RS22420  1       24       -3.648113  superoxide dismutase [Mn]
DIP_RS22415  0       23       -2.785990  peptide-methionine (S)-S-oxide reductase
DIP_RS22410  0       24       -2.837950  nicotinate-nucleotide diphosphorylase
DIP_RS22405  0       19       -0.679407  aspartate oxidase
DIP_RS22400  -3      18       1.554521   quinolinate synthetase
DIP_RS22395  -3      29       3.080783   hypothetical protein
DIP_RS22390  -3      31       4.652924   L-lactate dehydrogenase
DIP_RS22385  -2      15       2.108097   glycerophosphoryl diester phosphodiesterase
DIP_RS22380  -2      15       2.066323   preprotein translocase subunit SecA
DIP_RS22375  0       13       2.218183   SGNH hydrolase
DIP_RS22370  1       15       2.649608   sodium:alanine symporter
DIP_RS22365  1       14       2.840769   LytR family transcriptional regulator
DIP_RS22360  0       13       0.968039   nitric-oxide reductase large subunit
DIP_RS22355  1       22       3.770201   membrane protein
DIP_RS22350  -1      19       4.151327   amidase
DIP_RS22345  -1      32       3.875051   prephenate dehydratase
DIP_RS22340  -2      23       2.924766   phosphoglycerate mutase
DIP_RS22335  -3      17       2.418862   ATPase AAA
DIP_RS22330  -1      24       4.426424   hypothetical protein
DIP_RS22325  -1      26       5.343609   hypothetical protein
DIP_RS22320  -1      27       5.415542   GntR family transcriptional regulator
DIP_RS22315  0       26       5.066646   serine--tRNA ligase
DIP_RS22310  1       23       4.588174   haloacid dehalogenase
DIP_RS22305  -1      24       4.228259   acyltransferase
DIP_RS22300  0       37       3.952747   glycerol-3-phosphate dehydrogenase
DIP_RS22295  1       28       1.836027   glycerol transporter
DIP_RS22290  1       23       0.530371   glycerol kinase
DIP_RS22285  0       29       -3.389034  hypothetical protein
DIP_RS22280  3       17       0.123521   transposase
DIP_RS22275  5       23       1.231114   transposase
DIP_RS22265  4       18       0.636074   transposase
DIP_RS22260  4       16       0.692301   hypothetical protein
DIP_RS22255  4       16       0.512715   collagen-binding protein
DIP_RS22250  4       17       -0.099731  surface-anchored fimbrial subunit
DIP_RS22245  1       19       -2.321583  class C sortase
DIP_RS22240  0       24       -2.954226  class C sortase
DIP_RS22235  -1      25       -2.585077  surface-anchored protein
DIP_RS22230  0       32       -4.985868  hypothetical protein
DIP_RS22225  -2      30       -5.834444  hypothetical protein
DIP_RS22220  0       32       -7.092415  hypothetical protein
DIP_RS22215  0       34       -7.492393  single-stranded DNA-binding protein
DIP_RS22210  2       35       -7.281603  type I-E CRISPR-associated endoribonuclease
DIP_RS22200  2       38       -7.954292  subtype I-E CRISPR-associated endonuclease Cas1
DIP_RS22190  3       39       -7.729872  CRISPR-associated helicase/endonuclease Cas3
DIP_RS22185  2       21       -3.402310  type I-E CRISPR-associated protein
DIP_RS22180  2       21       -2.566525  CRISPR-associated protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS22665  CHANLCOLICIN  30     0.009    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 30.4 bits (68), Expect = 0.009
Identities = 20/64 (31%), Positives = 31/64 (48%), Gaps = 3/64 (4%)

Query: 234 AHSENQAPRVEKEETTLAKITKNAPTIAHALEL---KPEQSFEELLRTTGLSQGQLRFAL 290
AH+ N A + E E LAK + A A A E + EQ +E+ R ++ QL+ A
Sbjct: 116 AHANNAAMQAEDERLRLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETERQLKLAE 175

Query: 291 DKLQ 294
+ +
Sbjct: 176 AEEK 179


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS22635  HELNAPAPROT   116    6e-36    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 116 bits (291), Expect = 6e-36
Identities = 33/145 (22%), Positives = 69/145 (47%), Gaps = 2/145 (1%)

Query: 13 DAKQLIDHLQERLTDYNDLHLILKHAHWNVHGRNFIAVHEMIDPQVDLVRGYADEVAERI 72
+ + + L +L+++ L+ L HW V G +F +HE + D D +AER+
Sbjct: 9 NQTLVENSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTIAERL 68

Query: 73 ATLGGTPIGTPVGHVENRTPLEYNVNSGSTEDHLKELNKVYTKVLEGVREAMANAGD-LD 131
+GG P+ T + E+ + + + S + ++ L Y ++ + + A + D
Sbjct: 69 LAIGGQPVATVKEYTEHASITDGGNET-SASEMVQALVNDYKQISSESKFVIGLAEENQD 127

Query: 132 SVTEDIYIGQAAELEKFQWFIREHI 156
+ T D+++G E+EK W + ++
Sbjct: 128 NATADLFVGLIEEVEKQVWMLSSYL 152


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS22610  DHBDHDRGNASE  69     3e-16    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 68.9 bits (168), Expect = 3e-16
Identities = 47/190 (24%), Positives = 73/190 (38%), Gaps = 13/190 (6%)

Query: 3 RVAVVTGGSRGVGLSVVRTLLEAGWRVHAHYRTQPADITHDRLTWWQADFTQGLPTAQAS 62
++A +TG ++G+G +V RTL G + A +A + P
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 63 G----------FPQLPRIDALIHCAGVATLGSCASVERDEWERHMSVNLHSPVQLTRQLL 112
++ ID L++ AGV G S+ +EWE SVN +R +
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 113 PQLRANHA-HVVYINSGAGKRANPQWGAYAASKFAAR--AWCDALRQEEPEITVTSIFPG 169
+ + +V + S AYA+SK AA C L E I + PG
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 170 RIATDMQKAI 179
TDMQ ++
Sbjct: 189 STETDMQWSL 198


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS22575  cloacin       34     3e-04    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 33.9 bits (77), Expect = 3e-04
Identities = 26/61 (42%), Positives = 30/61 (49%), Gaps = 4/61 (6%)

Query: 127 NGGGNFGGSQGGFGGNTGGNSGGGFGAPQGGFGGGPQNSQSAAPA--NDPWSSAPQAGGF 184
GGG+ GS +GG +G +GGG G GG G G S AAP P S P AGG
Sbjct: 46 WGGGS--GSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGL 103

Query: 185 G 185

Sbjct: 104 A 104



Score = 30.1 bits (67), Expect = 0.006
Identities = 11/35 (31%), Positives = 16/35 (45%)

Query: 127 NGGGNFGGSQGGFGGNTGGNSGGGFGAPQGGFGGG 161
+ GN G G G G + G G+ + +GGG
Sbjct: 15 STSGNINGGPTGLGVGGGASDGSGWSSENNPWGGG 49



Score = 27.4 bits (60), Expect = 0.038
Identities = 19/62 (30%), Positives = 23/62 (37%), Gaps = 3/62 (4%)

Query: 125 GGNGGGNFGGSQGGFGGNTGGNSGGGFGAPQGGFGGGPQNSQSAAPANDPWSSAPQAGGF 184
GG+G G+ G+ G GG +G G GG G S P S GG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLG---VGGGASDGSGWSSENNPWGGGSGSGIHWGGG 59

Query: 185 GG 186
G
Sbjct: 60 SG 61


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS22560  TCRTETA       55     3e-10    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 54.8 bits (132), Expect = 3e-10
Identities = 80/380 (21%), Positives = 135/380 (35%), Gaps = 41/380 (10%)

Query: 37 ALDAMDVGLISFVMAALIKHWGLTHGQTS---VLASAGFVGMAIGATFGGLLADKWGRRN 93
ALDA+ +GLI V+ L++ ++ T+ +L + + A G L+D++GRR
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 94 VFALTLLVYGLATGASALAGGLAVLIVLRFIVGLGLGAELPVASTLVSEFAPLRHRGRLV 153
V ++L + A A L VL + R + G+ GA VA +++ R R
Sbjct: 75 VLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDGDERARHF 133

Query: 154 VILEAFWAVGWILAAIIGAFVVSAS-DSGWRWALVLGCVPALYSAYVRSSLPESVRFLEA 212
+ A + G + ++G + S + + A L + L ++ LPES +
Sbjct: 134 GFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFL---LPESHK---- 186

Query: 213 RGRHDEAEAAVQQFEKASATIPDTPVIATEDPADQADSIFAPNMRRRTFGLWTVWFCINL 272
R A+ T V A L V+F + L
Sbjct: 187 GERRPLRREALNPLASFRWARGMTVVAA----------------------LMAVFFIMQL 224

Query: 273 -SYYGAFIWIPSLLVADGF---SLVKSFQFTLIITLAQFPGYALAAWLIEIWGRRTTLAV 328
A +W + D F + L + + G R L +
Sbjct: 225 VGQVPAALW--VIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALML 282

Query: 329 FLLGSAGSAALYGFADTTALIIAAGCCLSFFNLGAWGALYAISPELYPSQIRGTGTGSAA 388
++ L FA + L+ +G AL A+ + +G GS A
Sbjct: 283 GMIADGTGYILLAFATRGWMAFPIMVLLASGGIGM-PALQAMLSRQVDEERQGQLQGSLA 341

Query: 389 GFGRIASIIAPLIVPPIVAA 408
+ SI+ PL+ I AA
Sbjct: 342 ALTSLTSIVGPLLFTAIYAA 361


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS22535  IGASERPTASE   47     2e-07    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 46.6 bits (110), Expect = 2e-07
Identities = 47/268 (17%), Positives = 70/268 (26%), Gaps = 44/268 (16%)

Query: 192 PDEEGEEGEFPAAAAEATAAAAEAAADTTVTTDATDNEATPPVPTPEQVAEKSQREAPAA 251
DE PA +E T AE + + T + + +AT +VA++++ A
Sbjct: 1020 VDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKAN 1079

Query: 252 P---------------------IPCPQRPGEPALSEEDCMAEVPAKMPK--PVAPKPGAH 288
E A E + EVP + P +
Sbjct: 1080 TQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETV 1139

Query: 289 LPAETPAPAPTATPAPTPSSRTVRVTSTVEESLEVTVSNSAEPNVEAEEADTPATPRPAP 348
P PA T T+ E+ + T SN +P E
Sbjct: 1140 QPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQP--VTESTTVNTGNSVVE 1197

Query: 349 NP-----SMMAPRRKLRSRYVP-----------LPNEVWASAGYQTPFDVGQQYALWWFD 392
NP + P S P N A+ V
Sbjct: 1198 NPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNT 1257

Query: 393 NAATPEQRDSAH---LLSGGGLPPEIDR 417
NA + R A L G + I +
Sbjct: 1258 NAVLSDARAKAQFVALNVGKAVSQHISQ 1285


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS22525  PF05272       28     0.041    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.041
Identities = 9/20 (45%), Positives = 12/20 (60%)

Query: 37 LIGANGVGKTTLLRILAGQS 56
L G G+GK+TL+ L G
Sbjct: 601 LEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS22480  60KDINNERMP   78     2e-17    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 77.7 bits (191), Expect = 2e-17
Identities = 54/265 (20%), Positives = 97/265 (36%), Gaps = 71/265 (26%)

Query: 5 FIYPVSGVMRLWHYIFADLFGCSQSQAWVASLFALVVTVRSIIAPFSWMQFKSGRFAIMM 64
+++ +S + G W S+ + VR I+ P + Q+ S M+
Sbjct: 332 WLWFISQPLFKLLKWIHSFVG-----NWGFSIIIITFIVRGIMYPLTKAQYTSMAKMRML 386

Query: 65 RPKIKRLKEEYAERTDKESILEQERRQKEIQEEY---GYSMAAGCVPALIQVPVFLGLYQ 121
+PKI+ ++E + +++R +E+ Y + GC P LIQ+P+FL LY
Sbjct: 387 QPKIQAMRERLGD--------DKQRISQEMMALYKAEKVNPLGGCFPLLIQMPIFLALYY 438

Query: 122 VLLRMARPKEGLDATVHEPIGMLTSDNVRSFLDTRFFGVPLPAYNSMSPEQLAHLGTDQP 181
+L+ +V P + + L P
Sbjct: 439 MLMG----------SVE------------------LRQAPFALW-------IHDLSAQDP 463

Query: 182 TIHAFVLPLIIAACVFTTINMIVSTLRTRYSIDHDSEFAVGMYRFLIVMVLVAPIGLLQT 241
++LP+++ +F M T T D M F+ V+ V
Sbjct: 464 Y---YILPILMGVTMFFIQKM-SPTTVT------DPMQQKIMT-FMPVIFTV-------- 504

Query: 242 GIFGPIPAAICLYWVANNLWTLIQN 266
F P+ + LY++ +NL T+IQ
Sbjct: 505 -FFLWFPSGLVLYYIVSNLVTIIQQ 528


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS22465  TCRTETB       41     1e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 40.6 bits (95), Expect = 1e-05
Identities = 38/141 (26%), Positives = 59/141 (41%), Gaps = 23/141 (16%)

Query: 49 FKAAQPMLKEDFGLTTLQLGYIGLAFSITYGIGKTLVGYFVDGHNSKRIISTLLICASTM 108
+ P + DF ++ AF +T+ IG + G D KR+
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRL----------- 81

Query: 109 VLLMGLLLSYFGSVIGI----FIVLWGLNGLFQSAGGPASYSTI----SRWAPRTKRGKY 160
LL G++++ FGSVIG F L + Q AG A + + +R+ P+ RGK
Sbjct: 82 -LLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKA 140

Query: 161 LGLWNASHNVG---GALAGGL 178
GL + +G G GG+
Sbjct: 141 FGLIGSIVAMGEGVGPAIGGM 161


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS22455  TYPE3IMSPROT  32     0.004    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 32.0 bits (73), Expect = 0.004
Identities = 22/111 (19%), Positives = 48/111 (43%), Gaps = 13/111 (11%)

Query: 145 SIYMVFPLFF-LYLRVLPDIRGIILVVGATAIAITSQLAQLTIGAVMGPVVSALVVIAIH 203
SI V L +++ + ++ ++ + IT L Q+ ++ V +V+
Sbjct: 143 SILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLGQILRQLMVICTVGFVVISIAD 202

Query: 204 FAFEAIWKGARE----REELIDELLATRNQLAETERAAGIAAERQRIAHEI 250
+AFE ++ +E ++E+ E E E + I ++R++ EI
Sbjct: 203 YAFE-YYQYIKELKMSKDEIKRE-------YKEMEGSPEIKSKRRQFHQEI 245


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS22450  HTHFIS        57     6e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 57.1 bits (138), Expect = 6e-12
Identities = 30/126 (23%), Positives = 51/126 (40%), Gaps = 12/126 (9%)

Query: 2 IRVLLADDHEIVRLGLRAVLESA-EDIEVIGEVATAEAAIAAAQAGGIDVILMDLRFGPG 60
+L+ADD +R L L A D+ + + A AG D+++ D+
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRI---TSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 VQGTKLTSGADATAAIRRRMDNPPEVLVVTNYDTDADILGAIEAGALGYMLKDAPPEELL 120
+ D I++ + P VLV++ +T + A E GA Y+ K EL+
Sbjct: 61 -------NAFDLLPRIKKARPDLP-VLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELI 112

Query: 121 AAVRSA 126
+ A
Sbjct: 113 GIIGRA 118


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS22325  TRNSINTIMINR  30     0.010    Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 30.5 bits (68), Expect = 0.010
Identities = 16/45 (35%), Positives = 23/45 (51%)

Query: 3 EMSSKKVAGFAGVAALVVASGVGAYTYTASQHDNAPRPQTATSSA 47
E+ G+ +AL+VA G+GA TA N P QT T++
Sbjct: 357 ELQLSSGIGYGLSSALIVAGGIGAGVTTALHRRNQPAEQTTTTTT 401


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS22235  PYOCINKILLER  28     0.037    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 28.2 bits (62), Expect = 0.037
Identities = 29/115 (25%), Positives = 44/115 (38%), Gaps = 8/115 (6%)

Query: 111 RLSDEVWKAVSGRDGRAVVTGLPMGVYLVSETPPAKRPAEYRRTLDFLITVPAGMRTADG 170
RL++E R ++ + V + P + A T + +TVP+ A
Sbjct: 358 RLTNEA------RGNTTTLSVVSTDGVSVPKAVPVRMAAYNATTGLYEVTVPSTTAEAPP 411

Query: 171 NVASWSCDVQVFTKDTDDLPPTVPVFPPVESSVTLTPSSPVPGTPKTPGKPDLPE 225
+ +W+ ++ P VP PV TLTP P T PG LPE
Sbjct: 412 LILTWTPASPPGNQNPSSTTPVVPKPVPVYEGATLTPVKATPET--YPGVITLPE 464


4     DIP_RS21525  DIP_RS21185  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
DIP_RS21525  0       23       -3.634401  transposase
DIP_RS21520  0       13       -1.987826  hypothetical protein
DIP_RS21510  -1      14       0.504835   hypothetical protein
DIP_RS21505  -1      13       -2.783905  hypothetical protein
DIP_RS21495  -1      13       -0.795691  membrane protein
DIP_RS21490  0       11       -2.287262  hypothetical protein
DIP_RS21485  0       12       -3.038991  membrane protein
DIP_RS21480  0       14       -3.265837  hypothetical protein
DIP_RS21475  0       13       -3.803182  surface-anchored fimbrial associated protein
DIP_RS21470  0       18       -0.328934  adenylosuccinate synthetase
DIP_RS21465  2       14       -0.028170  surface anchored protein
DIP_RS21460  -1      22       2.059485   hypothetical protein
DIP_RS21455  -1      20       2.586306   hypothetical protein
DIP_RS21450  -1      25       3.831407   phosphoribosyltransferase
DIP_RS21445  -2      23       3.819320   phosphoribosylglycinamide formyltransferase
DIP_RS21440  -2      30       4.295339   pyridine nucleotide-disulfide oxidoreductase
DIP_RS21435  -2      25       4.113958   phosphate acetyltransferase
DIP_RS21430  -1      32       4.596480   acetate kinase
DIP_RS21425  -1      37       5.009188   serine/threonine protein kinase
DIP_RS21420  0       43       4.195973   ABC transporter
DIP_RS21415  0       47       4.439439   hypothetical protein
DIP_RS21410  0       47       4.439748   ABC transporter
DIP_RS21405  0       50       4.503669   multidrug ABC transporter permease
DIP_RS21400  1       48       4.797437   hypothetical protein
DIP_RS21395  -1      45       5.063073   cardiolipin synthase
DIP_RS21390  -2      33       5.225010   exonuclease
DIP_RS21385  -2      30       5.366104   acetyltransferase
DIP_RS21380  -2      29       5.157974   peptide deformylase
DIP_RS21375  -2      37       5.373306   hypothetical protein
DIP_RS21370  -3      35       5.020800   hypothetical protein
DIP_RS21365  -2      37       4.665217   carboxylate--amine ligase
DIP_RS21360  -1      36       3.854580   hypothetical protein
DIP_RS21355  0       23       0.414057   hypothetical protein
DIP_RS21350  0       17       -0.807780  dipeptidase
DIP_RS21345  2       20       -2.522537  hypothetical protein
DIP_RS21340  1       21       -3.064111  hypothetical protein
DIP_RS21335  1       20       -3.119690  glutamine amidotransferase
DIP_RS21330  0       13       -0.401611  excinuclease ABC subunit A
DIP_RS21325  0       20       1.771731   alkene reductase
DIP_RS21320  2       33       4.646015   transposase
DIP_RS21315  1       27       3.177092   hypothetical protein
DIP_RS23050  3       31       2.935752   hypothetical protein
DIP_RS21305  2       30       2.343458   RNA-splicing ligase RtcB
DIP_RS21300  1       23       -2.399206  hypothetical protein
DIP_RS21295  1       20       -3.110401  hypothetical protein
DIP_RS21290  1       21       -3.824845  molecular chaperone GroEL
DIP_RS21285  2       21       -4.151944  transposase
DIP_RS21280  1       19       -4.758394  hypothetical protein
DIP_RS21275  2       16       -3.174927  polyphosphate kinase 2
DIP_RS21270  1       20       -1.284384  lipase
DIP_RS21265  1       14       -4.251014  membrane protein
DIP_RS21260  3       14       -1.052412  surface-anchored fimbrial subunit
DIP_RS21255  3       16       -0.522411  class C sortase
DIP_RS21250  3       18       -0.476606  surface anchored protein
DIP_RS21240  2       17       -0.466444  surface-anchored membrane protein
DIP_RS21235  3       23       6.273715   amino acid adenylation protein
DIP_RS21230  0       34       4.131647   MarR family transcriptional regulator
DIP_RS21225  1       49       4.694780   sulfurtransferase
DIP_RS21220  1       54       4.786010   inorganic pyrophosphatase
DIP_RS21215  1       57       5.447070   D-alanyl-D-alanine carboxypeptidase
DIP_RS21210  1       55       5.806956   tRNA(Ile)-lysidine synthetase
DIP_RS21205  1       52       4.804419   hypoxanthine phosphoribosyltransferase
DIP_RS21200  -1      50       5.708465   cell division protein FtsH
DIP_RS21195  0       42       6.050081   GTP cyclohydrolase I FolE
DIP_RS21190  -2      30       4.941268   hypothetical protein
DIP_RS21185  -1      30       4.333592   dihydropteroate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS21510  NUCEPIMERASE  34     0.001    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 33.6 bits (77), Expect = 0.001
Identities = 13/63 (20%), Positives = 19/63 (30%), Gaps = 17/63 (26%)

Query: 119 RRRRSRQTDVQFHRADLADSDWVVHPASGLPITTVPRTILDLAQSGHEPDHLLHLVADAG 178
R Q QFH+ DLAD + + DL S + +
Sbjct: 45 RLELLAQPGFQFHKIDLADREGMT----------------DLFAS-GHFERVFISPHRLA 87

Query: 179 RKY 181
+Y
Sbjct: 88 VRY 90


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS21490  PF02370       28     0.033    M protein repeat
		>PF02370#M protein repeat

Length = 168

Score = 28.1 bits (62), Expect = 0.033
Identities = 16/74 (21%), Positives = 34/74 (45%), Gaps = 5/74 (6%)

Query: 268 DREERETMQRYWSGEELTPKREQAREDRANLDAREAEIEGQVSR-EADPQ----KKQELE 322
+R+E++ E + + +E + + ++E + + + Q +Q L
Sbjct: 77 ERKEKQERPERREKFERQHQDKHYQEQQKKHQQEQQQLEAEKQKLAKEKQISDASRQGLN 136

Query: 323 RDLENTRATKEVVE 336
RDLE +RA K+ +E
Sbjct: 137 RDLEASRAAKKELE 150


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS21430  ACETATEKNASE  510    0.0      Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 510 bits (1316), Expect = 0.0
Identities = 183/399 (45%), Positives = 251/399 (62%), Gaps = 8/399 (2%)

Query: 4 VLVLNSGSSSIKFQLVDPTAHATDDPFASGLVEQIGEPQGRVTLKHAGEKFVVEAPIPDH 63
+LV+N GSSS+K+QL++ + A GL E+IG +T GEK ++ + DH
Sbjct: 3 ILVINCGSSSLKYQLIESK---DGNVLAKGLAERIGINDSLLTHNANGEKIKIKKDMKDH 59

Query: 64 SAGLALAFDLMGEHKCG--PTDVEIIAVGHRVVHGGILFSQPQVITDEIMDMVRDLIPLA 121
+ L D + G EI AVGHRVVHGG F+ +ITD+++ + D I LA
Sbjct: 60 KDAIKLVLDALVNSDYGVIKDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDCIELA 119

Query: 122 PLHNPANIDGIEVARKILPDVAHVAVFDTGFFHDMPPAAAIYAIDAKTAADHGVRRYGFH 181
PLHNPANI+GI+ +I+PDV VAVFDT F MP A +Y I + + +R+YGFH
Sbjct: 120 PLHNPANIEGIKACTQIMPDVPMVAVFDTAFHQTMPDYAYLYPIPYEYYTKYKIRKYGFH 179

Query: 182 GTSHEYVSHKVAELLDLPEGAINQITLHLGNGASCAAIRGGKAIDTSMGMTPLSGLVMGT 241
GTSH+YVS + AE+L+ P ++ IT HLGNG+S AA++ GK+IDTSMG TPL GL MGT
Sbjct: 180 GTSHKYVSQRAAEILNKPIESLKIITCHLGNGSSIAAVKNGKSIDTSMGFTPLEGLAMGT 239

Query: 242 RSGDIDPGIVFHLHRQAGMSIDEIDELLNKKSGVKGISGV-NDFRELR-TMIDAGDQDAW 299
RSG IDP I+ +L + +S +E+ +LNKKSGV GISG+ +DFR+L GD+ A
Sbjct: 240 RSGSIDPSIISYLMEKENISAEEVVNILNKKSGVYGISGISSDFRDLEDAAFKNGDKRAQ 299

Query: 300 LAYNIYIHQLRRYIGSYMIALGRVNAITFTAGVGENDVAVRADALSHLEGFGIKIDPERN 359
LA N++ +++++ IGSY A+G V+ I FTAG+GEN +R L LE G K+D E+N
Sbjct: 300 LALNVFAYRVKKTIGSYAAAMGGVDVIVFTAGIGENGPEIREFILDGLEFLGFKLDKEKN 359

Query: 360 ALPNTGPREISTDDSAIKVFVVPTNEELAIARYAKALSG 398
+ IST DS + V VVPTNEE IA+ + +
Sbjct: 360 KVRGE-EAIISTADSKVNVMVVPTNEEYMIAKDTEKIVE 397


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS21425  YERSSTKINASE  37     3e-04    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 37.0 bits (85), Expect = 3e-04
Identities = 28/82 (34%), Positives = 47/82 (57%), Gaps = 9/82 (10%)

Query: 199 VLPALDYLHSRGVVYNDLKPDNIII--SEDQVKLIDLG--AVSGIGAFGYIYGTKGYQAP 254
+L ++L GVV+ND+KP N++ + + +IDLG + SG G+ T+ ++AP
Sbjct: 254 LLDVTNHLAKAGVVHNDIKPGNVVFDRASGEPVVIDLGLHSRSGEQPKGF---TESFKAP 310

Query: 255 E--VASDGPSIASDIYTIGRTL 274
E V + G S SD++ + TL
Sbjct: 311 ELGVGNLGASEKSDVFLVVSTL 332


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS21385  SACTRNSFRASE  45     9e-08    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 44.6 bits (105), Expect = 9e-08
Identities = 16/60 (26%), Positives = 27/60 (45%)

Query: 224 DGYLGYSAVEVAPEYRRQGLATELGAAMLAWGKAHGAHTAYLQVIESNAAGIGLYHKLGF 283
+GY + VA +YR++G+ T L + W K + L+ + N + Y K F
Sbjct: 87 NGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS21370  IGASERPTASE   28     0.019    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.5 bits (63), Expect = 0.019
Identities = 26/150 (17%), Positives = 52/150 (34%), Gaps = 15/150 (10%)

Query: 31 VTSDNSSSAPKTSTAETSEQTPAEKSAPQA-PAASDTP--------APSDAPGESAAPAE 81
V ++ + PK ++ + +Q +E PQA PA + P + ++ ++ PA+
Sbjct: 1114 VETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAK 1173

Query: 82 KAPAD--RPADKPAAERPVPVRVLNNSTVQGLAAQVADTLRSHGI----DVVEVGNLPDA 135
+ ++ +P + V N Q S V ++P
Sbjct: 1174 ETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHN 1233

Query: 136 VVPETTVFYPAGHQAQAQKIADQLHAAIAQ 165
V P TT A + +A ++
Sbjct: 1234 VEPATTSSNDRSTVALCDLTSTNTNAVLSD 1263


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS21345  V8PROTEASE    42     6e-07    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 42.3 bits (99), Expect = 6e-07
Identities = 21/42 (50%), Positives = 25/42 (59%)

Query: 131 NSDQPDDSKKPEGQGNPGGSNNPKLPNNPKLPNNPKLPNNPD 172
N DQP++ P+ NP NNP PNNP PNNP P+N D
Sbjct: 285 NDDQPNNPDNPDNPNNPDNPNNPDEPNNPDNPNNPDNPDNGD 326



Score = 36.1 bits (83), Expect = 7e-05
Identities = 20/38 (52%), Positives = 21/38 (55%), Gaps = 1/38 (2%)

Query: 147 PGGSNNPKLPNNPKLPNNPKLPNNPDVTPAPPSNDGAG 184
P +NP PNNP PNNP PNNPD P P N G
Sbjct: 289 PNNPDNPDNPNNPDNPNNPDEPNNPD-NPNNPDNPDNG 325



Score = 32.7 bits (74), Expect = 0.001
Identities = 24/67 (35%), Positives = 32/67 (47%), Gaps = 8/67 (11%)

Query: 115 ENFRDALAACIAGKYYNSDQPDDSKKPEGQGNPGGSNNPKLPNNPKLPNNPKLPNNPDVT 174
EN R+ L I ++ +D +P NP NNP PNNP PNN P+NP+
Sbjct: 268 ENVRNFLKQNIEDIHFANDD-----QPNNPDNPDNPNNPDNPNNPDEPNN---PDNPNNP 319

Query: 175 PAPPSND 181
P + D
Sbjct: 320 DNPDNGD 326


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS21230  PF05272       29     0.008    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.008
Identities = 18/95 (18%), Positives = 29/95 (30%), Gaps = 10/95 (10%)

Query: 20 PTFQAERIRKCLRDGSEEVLSRYSVRMREYWIL-----GFVAGVEPPTQVAIAAALGVDP 74
F+ E+ + + G + L R G+ T + ALG DP
Sbjct: 752 IYFRPEQELRLVETGVQGRLWALLTREGAPAAEGAAQKGYSVNTTFVTIADLVQALGADP 811

Query: 75 S-----DMVRLIDSVESRGWVRRELDPRDRRRHLV 104
++ D + GW RRR +
Sbjct: 812 GKSSPMLEGQVRDWLNENGWEYLRETSGQRRRGYM 846


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS21200  HTHFIS        32     0.010    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.1 bits (73), Expect = 0.010
Identities = 22/88 (25%), Positives = 33/88 (37%), Gaps = 18/88 (20%)

Query: 200 AKIPRGVLLYGPPGTGKTLLARAV---AGEAGVPFYS-----ISGSDFVEMFVGV----- 246
+ +++ G GTGK L+ARA+ PF + I G
Sbjct: 157 MQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAF 216

Query: 247 -GASRVRD-LFKQARENSPCIIFIDEID 272
GA F+QA + +F+DEI
Sbjct: 217 TGAQTRSTGRFEQAEGGT---LFLDEIG 241


5     DIP_RS21125  DIP_RS20940  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
DIP_RS21125  0       32       3.514328   transposase
DIP_RS21120  -1      32       3.578454   lysine--tRNA ligase
DIP_RS21115  -2      25       3.373599   hypothetical protein
DIP_RS21110  -1      29       3.570917   hypothetical protein
DIP_RS21105  -1      26       3.343261   hypothetical protein
DIP_RS21100  -2      23       3.467217   NDP-hexose 4-ketoreductase
DIP_RS21095  -2      14       2.649871   membrane protein
DIP_RS21090  -1      14       2.388628   adenine glycosylase
DIP_RS21085  -1      15       1.913824   beta-hydroxylase
DIP_RS21080  -1      14       2.139456   hypothetical protein
DIP_RS21075  -1      17       2.673783   DNA repair protein RadA
DIP_RS21070  -1      18       3.138359   hypothetical protein
DIP_RS21065  0       30       4.613728   hypothetical protein
DIP_RS21060  0       28       4.887693   CarD family transcriptional regulator
DIP_RS21055  -1      23       5.137546   2-C-methyl-D-erythritol 4-phosphate
DIP_RS21050  -1      24       5.065864   2-C-methyl-D-erythritol 2,4-cyclodiphosphate
DIP_RS21045  -2      19       3.837932   cysteine--tRNA ligase
DIP_RS21040  -1      23       3.301770   23S rRNA
DIP_RS21035  1       42       1.985595   LacI family transcriptional regulator
DIP_RS21030  2       47       -0.569998  trehalose-phosphatase
DIP_RS21025  2       48       -1.705175  hypothetical protein
DIP_RS21020  2       44       -2.767589  trehalose-6-phosphate synthase
DIP_RS21015  2       36       -4.079122  hypothetical protein
DIP_RS21010  2       39       -4.032792  amino acid export carrier protein
DIP_RS21000  1       27       -5.664707  *integrase
DIP_RS20995  3       32       -5.556311  hypothetical protein
DIP_RS20990  3       34       -4.079536  hypothetical protein
DIP_RS23045  2       27       -2.692081  hypothetical protein
DIP_RS20980  0       24       1.036691   hypothetical protein
DIP_RS23040  -1      23       1.331254   hypothetical protein
DIP_RS20970  -1      25       2.461271   hypothetical protein
DIP_RS20965  -1      27       3.422063   integrase
DIP_RS20960  1       30       4.175138   amino acid transporter
DIP_RS20955  1       32       5.365518   pyruvate dehydrogenase
DIP_RS20950  1       35       5.587296   transposase
DIP_RS20945  2       29       4.850193   acetyltransferase
DIP_RS20940  1       22       3.530043   sulfonamide-resistant dihydropteroate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS21100  HTHFIS        32     0.012    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.7 bits (72), Expect = 0.012
Identities = 33/173 (19%), Positives = 66/173 (38%), Gaps = 31/173 (17%)

Query: 531 IIGQDDAVKSVSRAIRRTRAGLKDPRRPSGSFIFAGPSGVGKTELSKSLANFLFGDDDAL 590
++G+ A++ + R + R + + G SG GK ++++L ++ +
Sbjct: 139 LVGRSAAMQEIYRVLARLMQT-------DLTLMITGESGTGKELVARALHDYGKRRNGPF 191

Query: 591 IQIDMGEFHDRFTASRLFGAPPGYVGYDEGGQLTEKVRRKP--FS-----VVLFDEIEKA 643
+ I+M S LFG E G T R F + DEI
Sbjct: 192 VAINMAAIPRDLIESELFGH--------EKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDM 243

Query: 644 HKEIYNTLLQVLEEG---RLTDGQGRVVDFKNTVLIFTSNLGTQDISKAVGMG 693
+ LL+VL++G + D + ++ +N +D+ +++ G
Sbjct: 244 PMDAQTRLLRVLQQGEYTTVGGRTPIRSDVR---IVAATN---KDLKQSINQG 290


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS20940  SACTRNSFRASE  38     3e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 38.4 bits (89), Expect = 3e-06
Identities = 17/70 (24%), Positives = 27/70 (38%), Gaps = 15/70 (21%)

Query: 90 AYLHKLAVRRTHAGRGVSSALIEACRHAARTQGCAKLRLD--------CHPNLRGLYERL 141
A + +AV + + +GV +AL+ A+ L L+ CH Y +
Sbjct: 90 ALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACH-----FYAKH 144

Query: 142 GFT--HVDTF 149
F VDT
Sbjct: 145 HFIIGAVDTM 154


6  DIP_RS20830  DIP_RS20805  Y  N  N  Genomic Island
LocusTag     DNBias  CDNBias  %GCBias     Product
DIP_RS20830  0       31       3.273938    PTS beta-glucoside transporter subunit IIABC
DIP_RS20825  -1      31       3.565190    ribonucleotide-diphosphate reductase subunit
DIP_RS20820  -1      28       3.917583    ribonucleotide reductase assembly protein NrdI
DIP_RS20815  -1      20       3.917583    membrane protein
DIP_RS20810  0       16       3.250524    hypothetical protein
DIP_RS20805  -1      20       3.112371    amidophosphoribosyltransferase
7  DIP_RS20455  DIP_RS20180  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
DIP_RS20455  0       23       3.276594    ATP-dependent Clp protease adaptor ClpS
DIP_RS20450  -1      17       3.730318    hypothetical protein
DIP_RS20445  0       17       4.265970    peptidase
DIP_RS20440  0       18       3.127004    rhomboid family intramembrane serine protease
DIP_RS20435  0       18       2.743708    glutamate racemase
DIP_RS20430  1       19       2.487862    hypothetical protein
DIP_RS20425  1       34       4.725859    ribonuclease PH
DIP_RS20420  1       35       4.845226    non-canonical purine NTP pyrophosphatase
DIP_RS20415  1       35       4.886972    membrane protein
DIP_RS20410  1       34       4.976216    hypothetical protein
DIP_RS23035  1       34       5.108885    *hypothetical protein
DIP_RS20395  2       35       5.232593    3-oxoacyl-ACP synthase
DIP_RS20390  0       32       3.886881    holo-ACP synthase
DIP_RS20385  -1      32       2.440972    TetR family transcriptional regulator
DIP_RS20380  1       23       2.053791    peroxiredoxin
DIP_RS20375  2       27       2.341739    hypothetical protein
DIP_RS20370  4       29       -0.279885   monooxygenase
DIP_RS20365  9       28       -0.423193   hypothetical protein
DIP_RS20360  9       28       -0.978218   hypothetical protein
DIP_RS20355  11      29       -0.492632   integrase
DIP_RS23030  12      29       -1.820808   triple helix repeat-containing collagen
DIP_RS20345  12      31       -2.768887   hypothetical protein
DIP_RS20340  12      31       -1.103049   hypothetical protein
DIP_RS20335  7       30       0.003653    hypothetical protein
DIP_RS20330  7       31       1.032996    hypothetical protein
DIP_RS20325  6       31       1.121240    hypothetical protein
DIP_RS20320  6       31       3.150623    hypothetical protein
DIP_RS20315  6       31       4.466439    hypothetical protein
DIP_RS20310  7       30       4.401304    hypothetical protein
DIP_RS20305  7       31       4.829063    hypothetical protein
DIP_RS20300  7       29       3.814447    hypothetical protein
DIP_RS20295  6       28       3.248939    hypothetical protein
DIP_RS20290  4       25       1.255858    terminase
DIP_RS20285  3       26       -3.823507   hypothetical protein
DIP_RS20275  0       23       -0.387417   hypothetical protein
DIP_RS20265  -1      19       -0.717351   hypothetical protein
DIP_RS20260  -1      23       -0.078296   hypothetical protein
DIP_RS20255  0       33       0.678309    hypothetical protein
DIP_RS20250  1       38       1.671175    hypothetical protein
DIP_RS20235  1       31       2.645577    *hypothetical protein
DIP_RS20230  3       37       1.328366    dihydrodipicolinate synthase family protein
DIP_RS20220  1       33       1.528390    *oligoribonuclease
DIP_RS20215  1       27       1.823410    short-chain dehydrogenase/reductase
DIP_RS20210  1       24       3.282048    NADPH:quinone dehydrogenase
DIP_RS20200  1       27       3.129003    *MFS transporter
DIP_RS20195  1       26       2.817403    transposase
DIP_RS20190  -1      30       3.258838    DNA-binding protein
DIP_RS20185  0       32       3.418356    hypothetical protein
DIP_RS20180  0       31       3.622615    carbon starvation protein CstA
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS20390  ENTSNTHTASED  27     0.030    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 26.5 bits (58), Expect = 0.030
Identities = 13/31 (41%), Positives = 16/31 (51%), Gaps = 6/31 (19%)

Query: 43 RRRAEHLAGRWAAKEAFVKAWSQSLYGKPPV 73
+R+AEHLAGR AA A + G V
Sbjct: 45 KRKAEHLAGRIAAVHALRE------VGVRTV 69


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS20385  HTHTETR       62     3e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 62.3 bits (151), Expect = 3e-14
Identities = 35/207 (16%), Positives = 60/207 (28%), Gaps = 21/207 (10%)

Query: 16 RPAQQRSREKFDRILAAARAVLVDVGFESFTFDEVAKRAEVPIGTIYQYFANKYVMICEL 75
R +Q ++E IL A + G S + E+AK A V G IY +F +K + E+
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 76 DRHDTAASIAEIQRFSQQVPALQWPDFLNEFIDHLSVM--WRADPSRRSVWH----AIQS 129
+ + + P I L + +
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGE 122

Query: 130 TPATRATAAATEQPMLDIIGDVMRP------LAPRTTPEQRLEIASLLVHTVSSLLNYAV 183
+ D I ++ L A ++ +S L+ +
Sbjct: 123 MAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTR---RAAIIMRGYISGLMENWL 179

Query: 184 HDPTSSDEVFESRVAEIKRMIVAYLFA 210
P S D + + R VA L
Sbjct: 180 FAPQSFD------LKKEARDYVAILLE 200


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS20365  ISCHRISMTASE  28     0.009    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 28.4 bits (63), Expect = 0.009
Identities = 11/43 (25%), Positives = 22/43 (51%), Gaps = 5/43 (11%)

Query: 70 SAFERTDATGTKLGDWLRANNIERLDVVGLGTELGIRSSVLDA 112
SAF+RT+ L + +R ++L + G+ +G + +A
Sbjct: 127 SAFKRTN-----LLEMMRKEGRDQLIITGIYAHIGCLVTACEA 164


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS20215  DHBDHDRGNASE  94     4e-25    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 94.3 bits (234), Expect = 4e-25
Identities = 58/188 (30%), Positives = 96/188 (51%), Gaps = 9/188 (4%)

Query: 12 AVVTGASQGIGRAMARDLARMGHNVLLVARREDVLRELADQLMTDHSVVAEVYPCDLADA 71
A +TGA+QGIG A+AR LA G ++ V + L E + + AE +P D+ D+
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKL-EKVVSSLKAEARHAEAFPADVRDS 69

Query: 72 DDLRGLVAEL--QGREVNIIVNSAGIASFGP---FMDQDWQYESKQFDLNARAVFELTHA 126
+ + A + + ++I+VN AG+ G D++W+ F +N+ VF + +
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWE---ATFSVNSTGVFNASRS 126

Query: 127 VLPGMVARKSGAICNVGSAAGNVPIPNNATYVLTKAGVNAFTEALHYELRKSGVACTLLA 186
V M+ R+SG+I VGS VP + A Y +KA FT+ L EL + + C +++
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 187 PGPVREAV 194
PG +
Sbjct: 187 PGSTETDM 194


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS20210  NUCEPIMERASE  29     0.029    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 29.0 bits (65), Expect = 0.029
Identities = 10/27 (37%), Positives = 14/27 (51%)

Query: 156 ILVTGATGGVGSIAVHLLAQRGYSIVA 182
LVTGA G +G L + G+ +V
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVG 29


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS20200  TCRTETA       31     0.013    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 30.6 bits (69), Expect = 0.013
Identities = 31/140 (22%), Positives = 54/140 (38%), Gaps = 7/140 (5%)

Query: 255 LGVFMLASGLSALVGGRISGVWSDYSSRAVMSYGALASSIVVLVIVACSWWAPSALNVWL 314
G+ + L + G SD R + +LA + V I+A +A +W+
Sbjct: 45 YGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMA------TAPFLWV 98

Query: 315 FPLSFFVVNVVHTGIRVARKTYIVDMAEGDQRTRYVGAANTLMGVILLIVGVISGTIAHW 374
+ V + VA YI D+ +GD+R R+ G + G ++ V+ G + +
Sbjct: 99 LYIGRIVAGITGATGAVA-GAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGF 157

Query: 375 GPQPALLFLAAIGLAGAATS 394
P AA+ T
Sbjct: 158 SPHAPFFAAAALNGLNFLTG 177


8  DIP_RS19435  DIP_RS19385  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
DIP_RS19435  -1      17       -3.660494   GntR family transcriptional regulator
DIP_RS19430  4       17       -4.674799   sodium:proton antiporter
DIP_RS19425  5       14       -4.136336   hypothetical protein
DIP_RS19420  4       17       -4.116934   hypothetical protein
DIP_RS19415  3       25       -3.767305   hypothetical protein
DIP_RS19410  3       26       -4.079863   hypothetical protein
DIP_RS19405  3       25       -3.880279   hypothetical protein
DIP_RS19400  1       27       -3.080349   hypothetical protein
DIP_RS19395  3       28       -2.974545   hypothetical protein
DIP_RS19390  4       26       -3.036803   hypothetical protein
DIP_RS19385  1       18       -3.577241   transposase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS19435  FLGMOTORFLIG  29     0.021    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 28.6 bits (64), Expect = 0.021
Identities = 13/93 (13%), Positives = 31/93 (33%), Gaps = 1/93 (1%)

Query: 100 YLIEAHSAQKVAQLPSAQRTAIVAALEATIEKQEEALSRERLDLYTDFDAEFHQHIMKAA 159
YL ++ ++ LP+ +T + + E + L + + A
Sbjct: 146 YLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLEKKLASLSSEDYTSAG 205

Query: 160 GNEILAHLYNSLRDKQ-ARITSRIVTRNADNAQ 191
G + + + N K I + + + A+
Sbjct: 206 GVDNVVEIINMADRKTEKFIIESLEEEDPELAE 238


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS19390  RTXTOXINA     38     3e-04    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 38.0 bits (88), Expect = 3e-04
Identities = 48/270 (17%), Positives = 97/270 (35%), Gaps = 33/270 (12%)

Query: 303 NAAAAAGRLASQSANMADQAVAASHEARAALRETAAALARAAAASARAQSAAAHAASAAN 362
+A +A+ L++ A+ +A A L + + + AQ AA +++A
Sbjct: 250 SAISASFILSNADADTRTKAAAGVELTTKVL----GNVGKGISQYIIAQRAAQGLSTSAA 305

Query: 363 AAAYDANAAHGARIAAEQARNAAQAAEQSQQAFRYAEEAAGFAQSAGQAAGSAARNADAA 422
AA A+A A I+ + A +++ + Y++ + + A
Sbjct: 306 AAGLIASAVTLA-ISPLSFLSIADKFKRANKIEEYSQRFKKLGYDGDSLLAAFHKETGAI 364

Query: 423 AAAATEAANAAGESQGAADEARAGAARARAAAGRARAAASEVDRLVANIRSLVEQTRQAA 482
A+ T A + +AA + V LV + ++ +A+
Sbjct: 365 DASLT-----------TISTVLASVSSGISAAATTSLVGAPVSALVGAVTGIISGILEAS 413

Query: 483 KEAA-EH-ANNSAQSADDAAREAGNAHYAAQVAGRHAQFSSQSAERARNNIQLAKDIHEQ 540
K+A EH A+ A + ++ G ++ RHA F +N ++ +++
Sbjct: 414 KQAMFEHVASKMADVIAEWEKKHGKNYFENGYDARHAAF-------LEDNFKILSQYNKE 466

Query: 541 ATNSAQERLQAERAFLKEQARLAREIQDVA 570
ER+ L Q I ++A
Sbjct: 467 --------YSVERSVLITQQHWDTLIGELA 488


9  DIP_RS19210  DIP_RS19135  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
DIP_RS19210  0       22       -3.412098   transcriptional regulator
DIP_RS19205  0       24       -3.338000   alpha 1,6 mannopyranosyltransferase
DIP_RS19200  1       25       -3.020926   geranylgeranyl pyrophosphate synthase
DIP_RS19195  2       26       -3.483066   methylenetetrahydrofolate reductase [NAD(P)H]
DIP_RS19190  2       29       -3.421829   acetyltransferase
DIP_RS19185  1       28       -4.207075   hypothetical protein
DIP_RS19180  0       26       -3.486995   membrane protein
DIP_RS19175  -1      27       -3.181461   division/cell wall cluster transcriptional
DIP_RS19170  -2      25       -3.214827   ribosomal RNA small subunit methyltransferase H
DIP_RS19165  -3      24       -2.728315   hypothetical protein
DIP_RS19160  -2      24       -3.129932   cell division protein FtsI
DIP_RS19155  -1      23       -2.570548   UDP-N-acetylmuramoyl-L-alanyl-D-glutamate--2,
DIP_RS19150  -1      23       -2.698952   UDP-N-acetylmuramoyl-tripeptide--D-alanyl-D-
DIP_RS19145  -1      25       -3.201506   phospho-N-acetylmuramoyl-pentapeptide-
DIP_RS19140  -1      24       -2.830732   UDP-N-acetylmuramoylalanine--D-glutamate ligase
DIP_RS19135  -1      24       -3.087315   cell division protein FtsW
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS19190  SACTRNSFRASE  39     4e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 38.8 bits (90), Expect = 4e-06
Identities = 14/58 (24%), Positives = 24/58 (41%)

Query: 109 YFELAELHVSPRAQGQGIGRSLLHALIKHCGHELILLSTPEVPDEKNAAFHLYRSCGF 166
Y + ++ V+ + +G+G +LLH I+ E D +A H Y F
Sbjct: 89 YALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


10  DIP_RS19000  DIP_RS18970  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
DIP_RS19000  -1      20       -3.644110   TetR family transcriptional regulator
DIP_RS18995  -1      21       -3.630830   hypothetical protein
DIP_RS18990  -2      23       -4.419756   hypothetical protein
DIP_RS18985  -1      24       -4.048819   histidinol dehydrogenase
DIP_RS18980  -1      25       -3.347137   histidinol-phosphate aminotransferase
DIP_RS18975  -2      28       -3.537575   imidazoleglycerol-phosphate dehydratase
DIP_RS18970  0       29       -3.143693   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS19000  TETREPRESSOR  86     5e-23    Tetracycline repressor protein signature.
		>TETREPRESSOR#Tetracycline repressor protein signature.

Length = 218

Score = 86.1 bits (213), Expect = 5e-23
Identities = 57/200 (28%), Positives = 91/200 (45%), Gaps = 25/200 (12%)

Query: 38 QLSKESIISASLDILDSYGLADMTMRRLATHLGVAPGALYWHFKNKQALIDAIARHIMAP 97
+L++ES+I A+L++L+ G+ +T R+LA LG+ LYWH KNK+AL+DA+A I+A
Sbjct: 3 RLNRESVIDAALELLNETGIDGLTTRKLAQKLGIEQPTLYWHVKNKRALLDALAVEILAR 62

Query: 98 LIDASPSAYHHD----IPSLAFDVRSLMLRHRDGAELLSAALTDAS----LRNDIISVVS 149
D S A + + A R +LR+RDGA++ D + + +
Sbjct: 63 HHDYSLPAAGESWQSFLRNNAMSFRRALLRYRDGAKVHLGTRPDEKQYDTVETQLRFMTE 122

Query: 150 AILGGCDKAYVGASTLLNFILGSCLIEQSEVQRLEIESGTTTAVR--------------- 194
D Y S + +F LG+ L EQ E +
Sbjct: 123 NGFSLRDGLYAI-SAVSHFTLGAVL-EQQEHTAALTDRPAAPDENLPPLLREALQIMDSD 180

Query: 195 NHDQLFMSSLEIIIAGLKSQ 214
+ +Q F+ LE +I G + Q
Sbjct: 181 DGEQAFLHGLESLIRGFEVQ 200


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS18995  PF05616       30     0.007    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 30.1 bits (67), Expect = 0.007
Identities = 22/76 (28%), Positives = 32/76 (42%), Gaps = 6/76 (7%)

Query: 48 TSKSASAKPSEP-SDAPASESAAPAPAPENGQPAPGHPAPGPIDNPEISFIPQAPVADGA 106
T SA A ++P + +E+ A PAP +P P P NP+ + P DG
Sbjct: 316 TPGSAEAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDAN-----PDTDGQ 370

Query: 107 PASAEDQQAIEGLVRG 122
P + D A+ G
Sbjct: 371 PGTRPDSPAVPDRPNG 386


11  DIP_RS16505  DIP_RS16445  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
DIP_RS16505  2       23       -1.846059   TIGR02611 family protein
DIP_RS16500  1       23       -1.655398   ABC transporter substrate-binding protein
DIP_RS16495  2       23       -1.264904   iron-siderophore uptake system protein
DIP_RS16490  0       22       -2.222595   membrane protein
DIP_RS16485  -1      21       -3.208997   ABC transporter ATP-binding protein
DIP_RS16480  2       30       -0.326697   hypothetical protein
DIP_RS16475  3       31       0.102205    methylmalonyl-CoA epimerase
DIP_RS16470  3       35       0.976630    endonuclease NucS
DIP_RS16465  5       37       1.961707    hypothetical protein
DIP_RS16460  6       41       2.572262    ATP synthase epsilon chain
DIP_RS16455  6       43       2.551302    ATP synthase subunit beta
DIP_RS16450  5       37       1.414446    ATP synthase subunit gamma
DIP_RS16445  3       33       1.313154    ATP synthase subunit alpha
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS16500  FERRIBNDNGPP  88     4e-22    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 88.5 bits (219), Expect = 4e-22
Identities = 58/263 (22%), Positives = 104/263 (39%), Gaps = 27/263 (10%)

Query: 90 PERVVVLDTGELDSVLSLGITPVGMTTTKGAN------PVPSYLADKVKDVERVGTINEL 143
P R+V L+ ++ +L+LGI P G+ T P+P V VG E
Sbjct: 35 PNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPPLP-------DSVIDVGLRTEP 87

Query: 144 NIEAIAALKPDLIIGSQLRADKLYPQLSDIAPT-VFSIRPGSP----WKENFLLVGEALG 198
N+E + +KP ++ S L+ IAP F+ G +++ + + L
Sbjct: 88 NLELLTEMKPSFMVWSA-GYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLN 146

Query: 199 MEKEAEEKLNEYADRVAELKNNVPQNTE--VSLVRFM-PNKLRLYGNKSLIGVVLADAGL 255
++ AE L +Y D + +K + + L + P + ++G SL +L + G+
Sbjct: 147 LQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGI 206

Query: 256 ARPEKQNVDDLAVE-ISPENIDQAAGSVIFYTSYGKPDATGETAVVEGPAWKNLEAVAAG 314
+ + +S + + + + A++ P W+ + V AG
Sbjct: 207 PNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDM--DALMATPLWQAMPFVRAG 264

Query: 315 KAHRVNDDVWFLGLGPTGAMEIV 337
+ RV VWF G AM V
Sbjct: 265 RFQRV-PAVWFYG-ATLSAMHFV 285


12  DIP_RS16345  DIP_RS16270  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
DIP_RS16345  1       20       -3.504756   hypothetical protein
DIP_RS16335  2       21       -3.926582   *hypothetical protein
DIP_RS16330  1       19       -3.387248   MarR family transcriptional regulator
DIP_RS16325  -1      16       -3.358711   polyisoprenoid-binding protein
DIP_RS16320  -1      18       -3.230026   ATP-binding protein
DIP_RS16315  -2      14       -2.298777   DNA repair exonuclease
DIP_RS16310  -2      14       -2.121443   hypothetical protein
DIP_RS16305  -1      9        -0.880614   ATP-dependent helicase
DIP_RS16300  -2      17       -0.501139   sodium:proline symporter
DIP_RS16295  0       15       -3.859739   hypothetical protein
DIP_RS16290  0       16       -2.935465   membrane protein
DIP_RS16285  0       17       -2.692919   hypothetical protein
DIP_RS16280  -1      17       -2.920374   DEAD/DEAH box helicase
DIP_RS16275  -1      18       -4.228999   DNA mismatch repair protein MutT
DIP_RS16270  -1      17       -3.742090   helicase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS16320  RTXTOXIND     38     1e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 38.3 bits (89), Expect = 1e-04
Identities = 26/215 (12%), Positives = 66/215 (30%), Gaps = 20/215 (9%)

Query: 190 KKGGELGIAETAVAEAEQAVLAATQQKETLDRRVESVAGFQRDRDQAQQ-DLPNARDTLT 248
+KG L AEA+ ++ + L++ + + ++ + LP+
Sbjct: 119 RKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQN 178

Query: 249 TREQELKTA-ELAAQQLENASAQLKIAQDQERILQSDIDAREELRKEQSRLSEALQIRI- 306
E+E+ L +Q Q + +++ LS + R+
Sbjct: 179 VSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLD 238

Query: 307 -------------QEELDLKEKTDSLSEAREKLRSQHEKARADAEKMRQRLEKLKESLRL 353
L+ + K +SQ E+ ++ ++ + + + +
Sbjct: 239 DFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKN 298

Query: 354 DTLRTSYQELLTVSKQLEELRTVPEVTGAQLSAAE 388
+ L +L + + L + A+
Sbjct: 299 EIL----DKLRQTTDNIGLLTLELAKNEERQQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS16280  SECETRNLCASE  38     1e-05    Bacterial translocase SecE signature.
		>SECETRNLCASE#Bacterial translocase SecE signature.

Length = 127

Score = 38.3 bits (89), Expect = 1e-05
Identities = 27/100 (27%), Positives = 44/100 (44%), Gaps = 14/100 (14%)

Query: 56 SAQQEADSAEKDGFEALG-LPDAILKAVAKVG----FETPSPIQAQTIPVLMQGHDVMGL 110
SA EA + + G EA+ + L VA VG + P++A + +L+ G+
Sbjct: 2 SANTEAQGSGR-GLEAMKWVVVVALLLVAIVGNYLYRDIMLPLRALAVVILIAA--AGGV 58

Query: 111 AQTGTGKTAAFALPILSRIDVKKRHPQALILAPTRELALQ 150
A T A A +R +V+K ++ PTR+ L
Sbjct: 59 ALLTTKGKATVAFAREARTEVRK------VIWPTRQETLH 92


13  DIP_RS16200  DIP_RS16010  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
DIP_RS16200  -1      36       3.393037    alpha-ketoglutarate decarboxylase
DIP_RS16195  0       30       2.707134    hypothetical protein
DIP_RS16190  1       36       3.337767    membrane protein
DIP_RS16185  3       41       3.183483    magnesium transporter
DIP_RS16180  1       31       2.075336    membrane protein
DIP_RS16175  0       19       1.241723    sodium:proton antiporter
DIP_RS16170  0       17       0.749380    Sec-independent protein translocase TatB
DIP_RS16165  0       22       1.365545    anti-sigma factor
DIP_RS16160  3       32       2.077535    RNA polymerase sigma factor SigE
DIP_RS16155  2       27       1.962301    methyltransferase
DIP_RS16150  2       26       1.781584    glucose-1-phosphate adenylyltransferase
DIP_RS16145  3       31       3.093510    glycosyl transferase family 1
DIP_RS16140  3       23       3.219016    decarboxylase
DIP_RS16135  2       18       2.899728    beta-fructosidase
DIP_RS16130  0       17       2.262351    SAM-dependent methyltransferase
DIP_RS16125  -1      16       1.883271    hypothetical protein
DIP_RS16120  0       17       2.405326    glucosyl-3-phosphoglycerate synthase
DIP_RS16115  0       25       2.871760    dihydropteroate synthase
DIP_RS16110  2       39       3.282293    Rossman fold protein, TIGR00730 family
DIP_RS16105  3       42       3.976248    succinyl-diaminopimelate desuccinylase
DIP_RS16100  2       42       3.595230    succinyltransferase
DIP_RS16095  3       44       4.171749    amino acid transporter
DIP_RS16090  1       34       4.433661    2,3,4,5-tetrahydropyridine-2,6-dicarboxylate
DIP_RS16085  1       26       3.537997    amino acid transporter
DIP_RS16080  1       21       2.361300    hypothetical protein
DIP_RS16075  1       20       2.411691    membrane protein
DIP_RS16070  0       19       2.709077    RNA pseudouridine synthase
DIP_RS16065  -2      20       2.777800    succinyldiaminopimelate transaminase
DIP_RS16060  0       21       2.651211    ferredoxin
DIP_RS16055  0       24       2.691964    membrane protein
DIP_RS16050  0       26       2.954242    1D-myo-inositol
DIP_RS16045  0       26       3.273192    hypothetical protein
DIP_RS16040  -1      26       1.989662    GTP-binding protein
DIP_RS16035  1       30       -1.249102   hypothetical protein
DIP_RS16030  1       27       -3.717717   hypothetical protein
DIP_RS16025  1       28       -4.888687   arsenate reductase (glutaredoxin)
DIP_RS16020  -1      20       -3.615660   peptidase
DIP_RS16015  -2      18       -4.538657   hypothetical protein
DIP_RS16010  -3      15       -3.208428   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS16200  GPOSANCHOR    34     0.005    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 33.9 bits (77), Expect = 0.005
Identities = 15/61 (24%), Positives = 20/61 (32%), Gaps = 1/61 (1%)

Query: 48 EAKNTSPQPAAPAKKQPAPAKKPAPTTASAPASTEAKAAPKPENKKPAKKQQPSPLERTG 107
+A+ + A A P KP +A KP K K+ L TG
Sbjct: 451 QAEELAKLRAGKASDSQTPDAKPGNKAVP-GKGQAPQAGTKPNQNKAPMKETKRQLPSTG 509

Query: 108 E 108
E
Sbjct: 510 E 510


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS16185  FLGMOTORFLIG  32     0.003    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 32.5 bits (74), Expect = 0.003
Identities = 31/146 (21%), Positives = 57/146 (39%), Gaps = 19/146 (13%)

Query: 189 IATVLYELPDAQRANVAKELDDDRLADVLQEMS-------EDRQAELIETLDIERAADVL 241
A +L + + V K L + + + E++ E + L+E ++ A + +
Sbjct: 21 AAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFKELMMAQEFI 80

Query: 242 EEMDPDDAADLLGE-LDDDKAD-VLLELMDPEESAPVRRLMSFSPDTVGALMTPEPLILS 299
++ D A +LL + L KA ++ L +S P + P + + E
Sbjct: 81 QKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRADPANILNFIQQE----H 136

Query: 300 PQTTVAEALALARNPDLPTSLASIVF 325
PQT AL L+ L AS +
Sbjct: 137 PQTI---ALILSY---LDPQKASFIL 156


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS16170  TATBPROTEIN   55     3e-12    Bacterial sec-independent translocation TatB protein...
		>TATBPROTEIN#Bacterial sec-independent translocation TatB protein signature.

Length = 171

Score = 55.0 bits (132), Expect = 3e-12
Identities = 26/80 (32%), Positives = 44/80 (55%), Gaps = 5/80 (6%)

Query: 1 MFSSIGWPEIFTVLILGLIIIGPERLPKVIEDVRAAIYAAKKAINNAKEELNGNLGAEFD 60
MF IG+ E+ V I+GL+++GP+RLP ++ V I A + + EL L +
Sbjct: 1 MFD-IGFSELLLVFIIGLVVLGPQRLPVAVKTVAGWIRALRSLATTVQNELTQELKLQ-- 57

Query: 61 EFREPINK--IASIQRMGPK 78
EF++ + K AS+ + P+
Sbjct: 58 EFQDSLKKVEKASLTNLTPE 77


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS16040  TCRTETOQM     179    1e-50    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 179 bits (455), Expect = 1e-50
Identities = 98/443 (22%), Positives = 181/443 (40%), Gaps = 60/443 (13%)

Query: 7 RNVAIVAHVDHGKTTLVDAMLRQSGAFDEHAELVDR---VMDSGDLEKEKGITILAKNTA 63
N+ ++AHVD GKTTL +++L SGA E VD+ D+ LE+++GITI T+
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGS-VDKGTTRTDNTLLERQRGITIQTGITS 62

Query: 64 VRRRGAGKDGNDVVINVIDTPGHADFGGEVERALSMVDGVVLLVDASEGPLPQTRFVLGK 123
+ + +N+IDTPGH DF EV R+LS++DG +LL+ A +G QTR +
Sbjct: 63 FQW-------ENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHA 115

Query: 124 ALAAKMPVVILVNKTDRPDARIDEVVEE-----SQDLLLELASAL------DDPEAAEAA 172
+P + +NK D+ + V ++ S +++++ L + +E
Sbjct: 116 LRKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQW 175

Query: 173 EQLL----DLPVLYASGREGKAS-------------TVNPG-NGNVPDAEDLQALFDVIY 214
+ ++ DL Y SG+ +A ++ P +G+ + + L +VI
Sbjct: 176 DTVIEGNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVIT 235

Query: 215 EVMPEPTVNVDGPLQAHVTNLDSSSFLGRIGLIRVHSGSLKKGQQVAWIHYDEEGNQHTK 274
T L V ++ S R+ IR++SG L V +
Sbjct: 236 NKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVR--------ISEKE 287

Query: 275 TAKIAELLRTVGVTRVPADEVVAGDIAAISGIDSIMIGDTLADLEHPVALPRITVDEPAI 334
KI E+ ++ D+ +G+I + + + + L D + RI P +
Sbjct: 288 KIKITEMYTSINGELCKIDKAYSGEIVILQN-EFLKLNSVLGDTKLLPQRERIENPLPLL 346

Query: 335 SMTIGVNTSPLAGRGGGDKLTARVVKARLEQELIGNVSLRVLPTERPDAWEVQGRGEMAL 394
T+ + + L + + LR + G++ +
Sbjct: 347 QTTVEPSKPQQREM----------LLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQM 396

Query: 395 SVLVETMRRE-GFELTVGKPQVV 416
V ++ + E+ + +P V+
Sbjct: 397 EVTCALLQEKYHVEIEIKEPTVI 419



Score = 37.5 bits (87), Expect = 1e-04
Identities = 16/81 (19%), Positives = 29/81 (35%), Gaps = 11/81 (13%)

Query: 424 LFEPYEHMVIDIPSEYQGNVTQLMAARKGQMLSMDNISDEWVRMEYKVPAR--------- 474
L EPY I P EY ++ + + V + ++PAR
Sbjct: 535 LLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCIQEYRSDL 593

Query: 475 -GLISFRTMFMTETRGTGIAN 494
+ R++ +TE +G +
Sbjct: 594 TFFTNGRSVCLTELKGYHVTT 614


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS16020  V8PROTEASE    38     1e-05    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 38.4 bits (89), Expect = 1e-05
Identities = 33/163 (20%), Positives = 60/163 (36%), Gaps = 42/163 (25%)

Query: 48 CTGTLVSPTTVLTARHCLNGGLG---------HVRLGADH----FTAVRAVAHP-QADLA 93
+G +V T+LT +H ++ G ++ FTA + + + DLA
Sbjct: 104 ASGVVVGKDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLA 163

Query: 94 VLHLD------------RPAPIAPSAISGRHTQPGNRFGVAGYGSTFPGIPMAAAATMQR 141
++ +PA ++ +A + Q V GY P ATM
Sbjct: 164 IVKFSPNEQNKHIGEVVKPATMSNNAET----QVNQNITVTGYPGDKP------VATMWE 213

Query: 142 RVTDVPSPDRQAVMIENHISQGVLRPGDSGGPLL-EGNHVVGV 183
+ +A+ + + G+SG P+ E N V+G+
Sbjct: 214 SKGKITYLKGEAMQYDLSTT-----GGNSGSPVFNEKNEVIGI 251


14  DIP_RS15740  DIP_RS15710  Y  N  N  Genomic Island
LocusTag     DNBias  CDNBias  %GCBias     Product
DIP_RS15740  1       14       3.136604    ABC transporter
DIP_RS15735  2       28       3.701618    reductase
DIP_RS15730  3       34       3.969351    C4-dicarboxylate ABC transporter
DIP_RS15725  4       42       3.513249    LysR family transcriptional regulator
DIP_RS15720  3       34       2.515359    bifunctional N-acetylglucosamine-1-phosphate
DIP_RS15715  3       27       2.188842    ribose-phosphate pyrophosphokinase
DIP_RS15710  4       20       1.388701    type I pullulanase
15  DIP_RS15355  DIP_RS15260  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
DIP_RS15355  2       24       4.214937    thymidylate synthase
DIP_RS15350  3       29       5.719772    diacylglycerol kinase
DIP_RS15345  3       35       6.584387    NrdH-redoxin
DIP_RS15340  3       37       7.475508    precorrin-6A synthase (deacetylating)
DIP_RS15335  5       42       8.728935    hypothetical protein
DIP_RS15325  6       45       9.434169    *hypothetical protein
DIP_RS15320  7       46       9.888352    hypothetical protein
DIP_RS15315  8       49       10.889097   integrase
DIP_RS15310  11      45       11.102797   hypothetical protein
DIP_RS15305  8       41       7.683931    hypothetical protein
DIP_RS15300  2       33       0.700302    hypothetical protein
DIP_RS15295  1       26       1.818829    hypothetical protein
DIP_RS15290  2       35       3.918765    hypothetical protein
DIP_RS15285  1       35       3.965227    hypothetical protein
DIP_RS15280  1       32       3.539985    hypothetical protein
DIP_RS15275  2       34       4.244468    hypothetical protein
DIP_RS15270  3       38       6.052217    DNA methylase
DIP_RS15265  3       38       6.221837    DNA restriction-modification system, restriction
DIP_RS15260  3       27       2.837950    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS15340  BCTERIALGSPD  30     0.013    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 29.5 bits (66), Expect = 0.013
Identities = 18/103 (17%), Positives = 36/103 (34%), Gaps = 5/103 (4%)

Query: 98 TPDDGAVAFLVWGDPSLYDSTLRIIEHMRNLEDLHADVKVIP-----GITAVQVLTAEHG 152
D+ A LV G+P+ + +I+ + + + KVI V+VLT
Sbjct: 232 VADERTNAVLVSGEPNSRQRIIAMIKQLDRQQATQGNTKVIYLKYAKASDLVEVLTGISS 291

Query: 153 ILINRIGEAIHITTGRNLPETSAKDRRNCVVMLDGKTAWQDVA 195
+ + A + A + N +++ D+
Sbjct: 292 TMQSEKQAAKPVAALDKNIIIKAHGQTNALIVTAAPDVMNDLE 334


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS15265  TONBPROTEIN   33     0.003    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 33.0 bits (75), Expect = 0.003
Identities = 12/88 (13%), Positives = 19/88 (21%), Gaps = 11/88 (12%)

Query: 435 PAPPQPVHPMDVTPPTPVPTPDLDPVPGGNTTA---------PTPVLPPADPQPVAP--D 483
P+P PP P P P P + P + +P +P +
Sbjct: 68 VVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASPFEN 127

Query: 484 LDKGTDTRPPLGLKELGGDGASPEDLYP 511
T +
Sbjct: 128 TAPARLTSSTATAATSKPVTSVASGPRA 155



Score = 30.3 bits (68), Expect = 0.023
Identities = 20/70 (28%), Positives = 27/70 (38%), Gaps = 2/70 (2%)

Query: 437 PPQPVHPMDVTPPTPVPTPDLDPVPGGNTTAPTPVLPPADPQPVAPDLDKGTDTRPPLGL 496
PPQ V P P P P+ +P+P AP + P P K +P +
Sbjct: 57 PPQAVQP--PPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDV 114

Query: 497 KELGGDGASP 506
K + ASP
Sbjct: 115 KPVESRPASP 124


16  DIP_RS15145  DIP_RS15040  Y  N  Y  Genomic Island
LocusTag     DNBias  CDNBias  %GCBias     Product
DIP_RS15145  -1      26       3.269163    cold-shock protein
DIP_RS15135  -2      31       2.972697    hypothetical protein
DIP_RS15130  -1      28       2.221160    hypothetical protein
DIP_RS15125  -1      25       3.070520    DNA-binding protein
DIP_RS15120  1       26       2.405945    helicase
DIP_RS15115  0       21       0.799231    membrane protein
DIP_RS15110  0       18       2.059451    aminotransferase
DIP_RS15090  1       19       2.391039    transposase
DIP_RS15085  2       23       3.695239    transposase
DIP_RS15080  0       17       -0.864745   methyltransferase
DIP_RS15075  1       21       -2.371028   cupin
DIP_RS15070  1       25       -4.352638   integrase
DIP_RS15065  2       27       -6.439755   transposase
DIP_RS15060  1       25       -6.443564   transposase
DIP_RS15055  0       25       -6.968045   lantibiotic ABC transporter permease
DIP_RS15050  0       22       -6.171002   lantibiotic ABC transporter ATP-binding protein
DIP_RS15045  0       19       -5.718725   lantibiotic ABC transporter permease
DIP_RS15040  -2      14       -4.488882   lantibiotic-modifying protein
17  DIP_RS14450  DIP_RS14405  Y  N  N  Genomic Island
LocusTag     DNBias  CDNBias  %GCBias     Product
DIP_RS14450  0       20       -3.319813   O-acetylhomoserine
DIP_RS14445  -2      27       -4.268669   hypothetical protein
DIP_RS14440  0       22       -2.874531   hemin ABC transporter ATP-binding protein
DIP_RS14435  1       21       -3.292660   iron ABC transporter permease
DIP_RS14430  2       20       -3.110022   ABC transporter substrate-binding protein
DIP_RS14425  1       22       -3.142429   hemin receptor
DIP_RS14420  0       22       -3.124765   hypothetical protein
DIP_RS14415  0       21       -3.498625   homoserine O-acetyltransferase
DIP_RS14410  0       24       -4.277419   hypothetical protein
DIP_RS14405  -1      28       -3.211605   membrane protein
18  DIP_RS14280  DIP_RS14240  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
DIP_RS14280  -1      37       3.671593    hypothetical protein
DIP_RS14275  -2      34       3.506905    GMP synthetase
DIP_RS14270  -1      19       2.708842    hypothetical protein
DIP_RS14265  0       25       3.923775    hypothetical protein
DIP_RS14260  1       22       3.778653    DMT transporter permease
DIP_RS14255  1       21       4.043981    membrane protein
DIP_RS14250  2       20       3.902047    MFS transporter
DIP_RS14245  0       17       3.449091    siderophore biosynthesis protein
DIP_RS14240  0       17       3.158955    iron-enterobactin transporter ATP-binding
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS14240  PF04183       183    5e-51    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 183 bits (466), Expect = 5e-51
Identities = 104/488 (21%), Positives = 175/488 (35%), Gaps = 35/488 (7%)

Query: 18 GRLLAALIDEHLITPETH---HKLTNTTTPPADLLRQLAQDNQLTITPTNLERACAEIAD 74
++L+ L E + E+ N + + L I L A +
Sbjct: 15 AKMLSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIWGWLWIDAQTLRCADEPVLA 74

Query: 75 SVVGLHRARTAITQRWEQALNAQTATNTSQTPYSDLIQALRQRCRLEDRGSSAMLARCEQ 134
+ + + A A+ + T DL Q L+ R L + A Q
Sbjct: 75 QTLLMQLKQV---LSMSDATVAEHMQDLYATLLGDL-QLLKARRGLSASDLINLNADRLQ 130

Query: 135 LVCDGHPAHPAAKTSLGIG-DSFLHVLPEQTETIQLRFVAVDTDHAVV--VGGHPVETI- 190
+ GHP K G G ++ PE T +L ++AV +H + + +
Sbjct: 131 CLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEMDIHQLL 190

Query: 191 -SEAMPLLGARLNAELERCELHHHSV-IPVHPFQWDNVISSEFAEEIASGTIVLL-ETTA 247
+ P AR + + L H+ + +PVHP+QW I+++F + A G +V L E
Sbjct: 191 TAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEGRMVSLGEFGD 250

Query: 248 TAEPLMSVRTLRVSDATGSMHIKVALEIQLTGAVRGVS-AGAVAAPAIASIIDDACTLDA 306
S+RTL + G + IK+ L I T RG+ A P + + DA
Sbjct: 251 QWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWLQQVFATDA 310

Query: 307 GFIPRTDT--DQPAFSVAYDRSAIRWNADSGIRAHCFGAVLRDDP-TGNADDEIAMPVAT 363
+ +PA G + R++P DE + +AT
Sbjct: 311 TLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLKPDESPVLMAT 370

Query: 364 LLARNPLTGATIAADLIDELSHRHNRHRDEIATDWFTALGKFLFVPAVALIARWGIALEP 423
L+ + +A ID A W T L + + VP L+ R+G+AL
Sbjct: 371 LMECDE-NNQPLAGAYIDRS--------GLDAETWLTQLFRVVVVPLYHLLCRYGVALIA 421

Query: 424 HPQNTVIILRDGMPHRIVVRDLGGCRLWANGPLAAH--------PIVDKLRATALIENDL 475
H QN + +++G+P R++++D G + +L A LI +
Sbjct: 422 HGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVTSRLSADYLIHDLQ 481

Query: 476 IRLIDKVF 483
V
Sbjct: 482 TGHFVTVL 489



Score = 177 bits (450), Expect = 7e-49
Identities = 99/519 (19%), Positives = 183/519 (35%), Gaps = 46/519 (8%)

Query: 569 ISPLENLELLTPES-------LRAACAPYSEWAHETLATRLHDAAVRERIDDS------- 614
+S LE ++ ES + A + A + L A R D
Sbjct: 18 LSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIWGWLWIDAQTLRCADEPVLAQTL 77

Query: 615 FPTLRDDIANAEENLALVRAQVTSRVNTPESYWDLLKGLP---PHAAMIAADSYAISGHN 671
L+ ++ ++ +A + + + +GL +SGH
Sbjct: 78 LMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASDLINLNADRLQCLLSGHP 137

Query: 672 VHPLAKLRRGFSIEESAAYGPEAGMSTDLRLVGVDKRMIDTSTTADC---VRLIAHHFPQ 728
K RRG+ E Y PE + L + V + + + L A PQ
Sbjct: 138 KFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEMDIHQLLTAAMDPQ 197

Query: 729 HIAYARTHLHEHGLDADSYAIIPVHPWQLEHVIREAFAEDIADHTMVPIPNIAIAAHPTI 788
A E+GLD ++ +PVHPWQ + I F D A+ MV +
Sbjct: 198 EFARFSQVWQENGLD-HNWLPLPVHPWQWQQKIATDFIADFAEGRMVSLGEFGDQWLAQQ 256

Query: 789 SLRTLVPHAPTPSGTRPFIKCAVDVTLTSTRRSISQDSALGTPRVAGLVATALEQLRRET 848
SLRTL +A IK + + TS R I P + + T
Sbjct: 257 SLRTLT-NASRRG--GLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWLQQVFATD--AT 311

Query: 849 NVQPRAVVVPELSGLALSRDERSEGIDDSFRKTRQRGLSVLLRDDATAYLAPGEIAMSAC 908
VQ AV++ E + +S + + +R Q L V+ R++ +L P E +
Sbjct: 312 LVQSGAVILGEPAAGYVSHEGYAALARAPYR--YQEMLGVIWRENPCRWLKPDESPVLMA 369

Query: 909 AL----RGHEGVVPSPLRDIN---EEFFDDYVYDLMSTVLGLMMVKGIALEQHLQNTLVR 961
L ++ + + + E + ++ + L+ G+AL H QN +
Sbjct: 370 TLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALIAHGQNITL- 428

Query: 962 IDLSGKTPVYRGIMLRDFSG-LRAWAPRLQQWASDQVFEPGAIT-LTDDH-EEFVNKGFY 1018
+ P ++L+DF G +R + S + L+ D+ + G +
Sbjct: 429 -AMKEGVPQ--RVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVTSRLSADYLIHDLQTGHF 485

Query: 1019 ASVFGNLDGIVDEYSQARGVDAQSLWERVHVQINRFVQE 1057
+V + ++ GV + ++ + ++ ++++
Sbjct: 486 VTVLRFISPLMVRL----GVPERRFYQLLAAVLSDYMKK 520
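
The `Score` and `Identities` lines that open each alignment block carry the numbers needed to rank or filter hits. The following Python sketch extracts them, assuming the line layout seen on this page. The names `Hsp`, `parse_hsp` and `passes_cutoff` are hypothetical, and the default cutoff of 0.05 simply mirrors the E-value (RPSBlast) input parameter listed at the top of this page.

```python
import re
from typing import NamedTuple, Optional


class Hsp(NamedTuple):
    bits: float       # bit score from the "Score =" line
    evalue: float     # value after "Expect ="
    identities: int   # identical positions
    positives: int    # conservative substitutions plus identities
    gaps: int         # 0 when the line has no "Gaps" field
    length: int       # alignment length (denominator of Identities)


SCORE_RE = re.compile(r"Score = ([\d.]+) bits \(\d+\), Expect = (\S+)")
STATS_RE = re.compile(
    r"Identities = (\d+)/(\d+) \(\d+%\), Positives = (\d+)/\d+ \(\d+%\)"
    r"(?:, Gaps = (\d+)/\d+ \(\d+%\))?"
)


def parse_hsp(score_line: str, stats_line: str) -> Optional[Hsp]:
    """Parse one Score line and its Identities line; None if either fails."""
    score = SCORE_RE.search(score_line)
    stats = STATS_RE.search(stats_line)
    if score is None or stats is None:
        return None
    ident, length, pos, gaps = stats.groups()
    return Hsp(
        bits=float(score.group(1)),
        evalue=float(score.group(2)),
        identities=int(ident),
        positives=int(pos),
        gaps=int(gaps or 0),
        length=int(length),
    )


def passes_cutoff(hsp: Hsp, max_evalue: float = 0.05) -> bool:
    """True when the hit's E-value is at or below the chosen cutoff."""
    return hsp.evalue <= max_evalue
```

For the first PF04183 alignment of DIP_RS14240, `parse_hsp` yields 104 identities over a length of 488, so `100 * hsp.identities / hsp.length` reproduces the reported 21% to the nearest integer.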


19  DIP_RS14185  DIP_RS14125  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
DIP_RS14185  1  29  3.529893  tRNA
DIP_RS14180  1  26  3.423240  ribosomal-protein-alanine N-acetyltransferase
DIP_RS14175  -1  14  2.682018  tRNA threonylcarbamoyladenosine biosynthesis
DIP_RS14170  -3  17  2.508239  hypothetical protein
DIP_RS14165  -3  18  2.453719  transporter
DIP_RS14160  -3  15  2.510795  tRNA threonylcarbamoyladenosine biosynthesis
DIP_RS14155  -1  16  1.492401  alanine racemase
DIP_RS14150  0  15  1.550056  alpha/beta hydrolase
DIP_RS14140  3  21  2.336483  hypothetical protein
DIP_RS14135  3  21  2.203423  hypothetical protein
DIP_RS14130  3  25  2.063523  hypothetical protein
DIP_RS14125  2  20  0.901267  phosphoglucosamine mutase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS14175  SACTRNSFRASE  34  1e-04  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 34.1 bits (78), Expect = 1e-04
Identities = 12/65 (18%), Positives = 23/65 (35%), Gaps = 1/65 (1%)

Query: 74 VHTVGVDPRWQRRGFGRLVMDNFVHVADTAG-GPIFLEVRTTNAPAIALYESLGFEHQGV 132
+ + V ++++G G ++ + A + LE + N A Y F V
Sbjct: 92 IEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAV 151

Query: 133 RKNYY 137
Y
Sbjct: 152 DTMLY 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS14155  PF05272  28  0.022  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 27.7 bits (61), Expect = 0.022
Identities = 7/20 (35%), Positives = 11/20 (55%)

Query: 35 VIILDGPLGAGKTTFTQGLA 54
++L+G G GK+T L
Sbjct: 598 SVVLEGTGGIGKSTLINTLV 617


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS14150  ALARACEMASE  284  7e-96  Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 284 bits (728), Expect = 7e-96
Identities = 97/369 (26%), Positives = 154/369 (41%), Gaps = 24/369 (6%)

Query: 5 ETRISCDAIAANTRRLKDMVAPAQLMCVVKADGYNHGAPEVATVMARNGADQFGVATLAE 64
+ + A+ N ++ A++ VVKA+ Y HG + + A D F + L E
Sbjct: 6 QASLDLQALKQNLSIVRQAATHARVWSVVKANAYGHGIERIWS--AIGATDGFALLNLEE 63

Query: 65 AHQLRDAGITLPIL-CWIWSPEQDFSAAIDRDIDLAAVSMDHVRALIAEAMRRPAGTRVR 123
A LR+ G PIL + QD + S ++AL ++ P +
Sbjct: 64 AITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAP----LD 119

Query: 124 VTVKIDTELHRSGIDEANWTEAFELLHACPQVNVTGVFSHLACADDLESDYTDHQAEVFR 183
+ +K+++ ++R G ++ L A V + SH A A+ D
Sbjct: 120 IYLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTLMSHFAEAE--HPDGISGAMARIE 177

Query: 184 RAISAGRNVGLDLPVNHLAASPATLTRPDLHFDMVRPGLALYGHEPIAG----LDHGLRE 239
+A GL+ L+ S ATL P+ HFD VRPG+ LYG P + GLR
Sbjct: 178 QAAE-----GLECRR-SLSNSAATLWHPEAHFDWVRPGIILYGASPSGQWRDIANTGLRP 231

Query: 240 AMTWIGSVTVVKPIAAGQGTSYNMTWHAPADGYLCVVPVGYADGLPRNVQGHLEVTIAGT 299
MT + V+ + AG+ Y + A + + +V GYADG PR+ V + G
Sbjct: 232 VMTLSSEIIGVQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAPTGTPVLVDGV 291

Query: 300 RYPQVGRVCMDQIVVFLGDNSRGVAPGDEAIIFGPRDTQAMTVTELACATGTINYEILCR 359
R VG V MD + V L + G ++G + + ++A A GT+ YE++C
Sbjct: 292 RTMTVGTVSMDMLAVDLTPCPQ-AGIGTPVELWGKE----IKIDDVAAAAGTVGYELMCA 346

Query: 360 PTGRSHRTY 368
R
Sbjct: 347 LALRVPVVT 355


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS14135  PF03544  41  4e-06  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 41.1 bits (96), Expect = 4e-06
Identities = 17/56 (30%), Positives = 23/56 (41%), Gaps = 3/56 (5%)

Query: 370 VPPPPTPDVPPPPAPAPPPVIDNVPEPPPPPKQIPGMVPAAAPAPPPAPEPPAPAP 425
+ PP PP P P P + +PEPP P ++ P P P P+P
Sbjct: 60 LEPPQAVQPPPEPVVEPEPEPEPIPEPPKEA---PVVIEKPKPKPKPKPKPVKKVE 112



Score = 35.7 bits (82), Expect = 2e-04
Identities = 15/59 (25%), Positives = 19/59 (32%)

Query: 370 VPPPPTPDVPPPPAPAPPPVIDNVPEPPPPPKQIPGMVPAAAPAPPPAPEPPAPAPLSS 428
V P P P P PV+ P+P P PK P E +P +
Sbjct: 73 VVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFEN 131



Score = 33.4 bits (76), Expect = 0.001
Identities = 10/55 (18%), Positives = 13/55 (23%)

Query: 371 PPPPTPDVPPPPAPAPPPVIDNVPEPPPPPKQIPGMVPAAAPAPPPAPEPPAPAP 425
P + P P P P A+P AP P +
Sbjct: 87 PKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSST 141



Score = 33.4 bits (76), Expect = 0.001
Identities = 17/55 (30%), Positives = 19/55 (34%), Gaps = 4/55 (7%)

Query: 371 PPPPTPDVPPPPAPAPPPVIDNVPEPPPPPKQIPGMVPAAAPAPPPAPEPPAPAP 425
PP P V P P P P P+P +Q V P E APA
Sbjct: 86 PPKEAPVVIEKPKPKPKP----KPKPVKKVEQPKRDVKPVESRPASPFENTAPAR 136



Score = 32.6 bits (74), Expect = 0.002
Identities = 11/55 (20%), Positives = 12/55 (21%)

Query: 371 PPPPTPDVPPPPAPAPPPVIDNVPEPPPPPKQIPGMVPAAAPAPPPAPEPPAPAP 425
P P P P P D P P PA + P
Sbjct: 97 PKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVT 151



Score = 30.3 bits (68), Expect = 0.011
Identities = 21/53 (39%), Positives = 23/53 (43%), Gaps = 4/53 (7%)

Query: 377 DVPPPPAPAPPPVIDNVP----EPPPPPKQIPGMVPAAAPAPPPAPEPPAPAP 425
V PAPA P + V EPP + P V P P P PEPP AP
Sbjct: 39 QVIELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAP 91


20  DIP_RS13945  DIP_RS13810  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
DIP_RS13945  0  20  -3.045697  50S ribosomal protein L18
DIP_RS13940  0  22  -2.047653  50S ribosomal protein L6
DIP_RS13935  -1  25  -1.349674  30S ribosomal protein S8
DIP_RS13930  -2  30  -0.376335  hypothetical protein
DIP_RS13925  -1  35  0.813754  hypothetical protein
DIP_RS13920  1  43  3.216410  glucosamine-6-phosphate deaminase
DIP_RS13915  1  42  3.585319  N-acetylglucosamine-6-phosphate deacetylase
DIP_RS13910  0  38  3.101431  N-acetylmannosamine-6-phosphate 2-epimerase
DIP_RS13905  1  36  2.790626  N-acetylglucosamine kinase
DIP_RS13900  2  38  2.530146  GntR family transcriptional regulator
DIP_RS13895  1  32  1.838973  ABC transporter substrate-binding protein
DIP_RS13890  1  21  1.487752  ABC transporter permease
DIP_RS13885  1  19  1.165938  ABC transporter
DIP_RS13880  2  17  0.369303  ABC transporter
DIP_RS13875  5  20  -0.604121  dihydrodipicolinate synthase family protein
DIP_RS13870  6  18  -1.503503  cyclic pyranopterin phosphate synthase MoaA
DIP_RS13865  5  16  -0.874589  molybdopterin molybdenumtransferase MoeA
DIP_RS13860  3  16  -0.931783  cyclic pyranopterin monophosphate synthase
DIP_RS13855  2  16  -0.902520  molybdenum cofactor guanylyltransferase
DIP_RS13850  2  17  -0.962935  molybdenum cofactor biosynthesis protein
DIP_RS13845  2  18  -0.999522  nitrate/nitrite transporter integral membrane
DIP_RS13840  2  22  -1.365808  MFS transporter
DIP_RS13835  2  26  -1.319074  nitrate reductase subunit alpha
DIP_RS13830  2  34  -1.864487  nitrate reductase subunit beta
DIP_RS13825  3  37  -1.460836  nitrate reductase molybdenum cofactor assembly
DIP_RS13820  4  30  -1.262168  nitrate reductase subunit gamma
DIP_RS13815  3  19  0.407531  molybdenum ABC transporter substrate-binding
DIP_RS13810  2  12  1.125090  molybdenum ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS13915  UREASE  29  0.027  Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 29.3 bits (66), Expect = 0.027
Identities = 13/21 (61%), Positives = 17/21 (80%)

Query: 346 HDRGAIEVGKRADLVLCSGDF 366
H+ G++EVGKRADLVL + F
Sbjct: 421 HEIGSLEVGKRADLVLWNPAF 441


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS13845  TCRTETB  47  1e-07  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 46.8 bits (111), Expect = 1e-07
Identities = 62/326 (19%), Positives = 118/326 (36%), Gaps = 46/326 (14%)

Query: 51 IQKEFGLSDTQLSWILAVAILNGSMWRLPAGILADKIGGRKV-MLGITLFSAVASLGVAF 109
I +F +W+ +L S+ G L+D++G +++ + GI + S+
Sbjct: 40 IANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIIN-CFGSVIGFV 98

Query: 110 SQNYTMTLVLA-FLVGFCGNSFTAGAAWASAWF-PKHKQGFALGVFGA-GNVGASVTKFI 166
++ L++A F+ G +F A A + PK +G A G+ G+ +G V I
Sbjct: 99 GHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAI 158

Query: 167 GPVIIASTAGATYFGFIP-----AGWRLIPI------------IYSVTLVIVAILMMIYT 209
G +IA +Y IP L+ + I + L+ V I+ +
Sbjct: 159 GG-MIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLF 217

Query: 210 PKVDRFGSSGRSLAEMLLPLKQVR----------------VWRFSLYYVVVFGAYVALSA 253
S+ L+ +K +R L ++FG +
Sbjct: 218 TTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVS 277

Query: 254 TLPKYYISNYDVSLAAAGLLTAIFIFPASL----LRPVGGWFSDRYGARRAMYGTFFVMG 309
+P + +S A G ++ IFP ++ +GG DR G + +
Sbjct: 278 MVPYMMKDVHQLSTAEIG---SVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLS 334

Query: 310 TAAGLLCLPSNIINIFVFTLLIFSLG 335
+ + F+ +++F LG
Sbjct: 335 VSFLTASFLLETTSWFMTIIIVFVLG 360


21  DIP_RS13345  DIP_RS13210  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
DIP_RS13345  2  42  1.924112  metal-transporting ATPase
DIP_RS13340  2  43  2.504515  hypothetical protein
DIP_RS13335  1  31  3.146153  delta-aminolevulinic acid dehydratase
DIP_RS13330  0  30  3.418635  bifunctional uroporphyrinogen-III
DIP_RS13325  0  26  2.928418  hydroxymethylbilane synthase
DIP_RS13320  -1  24  2.439934  glutamyl-tRNA reductase
DIP_RS13315  -1  21  1.905079  glutaredoxin
DIP_RS13310  -2  20  2.642584  phosphoserine phosphatase
DIP_RS13305  -3  22  2.171915  hypothetical protein
DIP_RS13300  -2  22  1.908886  excisionase
DIP_RS13295  -3  21  2.260116  pyrroline-5-carboxylate reductase
DIP_RS13290  -1  23  3.164151  hypothetical protein
DIP_RS13285  1  22  3.680635  hypothetical protein
DIP_RS13280  0  25  3.960571  DNA-binding response regulator
DIP_RS13275  0  24  3.657336  histidine kinase
DIP_RS13270  0  27  4.591525  2,3-bisphosphoglycerate-dependent
DIP_RS13265  1  28  4.495523  D-inositol-3-phosphate glycosyltransferase
DIP_RS13260  0  28  4.268509  long-chain-fatty-acid--CoA ligase
DIP_RS13255  -1  34  4.083579  long-chain-fatty-acid--CoA ligase
DIP_RS13250  0  30  3.312643  hypothetical protein
DIP_RS13245  0  32  3.636175  UDP-N-acetylenolpyruvoylglucosamine reductase
DIP_RS13240  1  34  2.992926  hypothetical protein
DIP_RS13235  0  31  3.611583  hypothetical protein
DIP_RS13230  1  37  3.269487  formate acetyltransferase
DIP_RS13225  -2  22  2.289110  formate acetyltransferase
DIP_RS13220  -2  21  2.469712  pyruvate formate-lyase 1-activating enzyme
DIP_RS13215  -2  22  2.814992  hypothetical protein
DIP_RS13210  0  29  3.284413  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS13310  RTXTOXINA  29  0.027  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.2 bits (65), Expect = 0.027
Identities = 17/75 (22%), Positives = 31/75 (41%), Gaps = 1/75 (1%)

Query: 64 KLDEYTSGLDSVSGSFEAAGAQHVTTANPEIPQDIGAAAFFDIDNTLIQGSSLVVFAMGL 123
LD +GLD+VSG A A + +N + AAA ++ ++ + +
Sbjct: 234 NLDNIGAGLDTVSGILSAISASFIL-SNADADTRTKAAAGVELTTKVLGNVGKGISQYII 292

Query: 124 AKKKYFKLNEILPVA 138
A++ L+ A
Sbjct: 293 AQRAAQGLSTSAAAA 307


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS13305  PRPHPHLPASEC  28  0.032  Prokaryotic zinc-dependent phospholipase C signature.
		>PRPHPHLPASEC#Prokaryotic zinc-dependent phospholipase C signature.

Length = 398

Score = 28.4 bits (63), Expect = 0.032
Identities = 17/97 (17%), Positives = 34/97 (35%), Gaps = 11/97 (11%)

Query: 120 EFSDFECPFCARWSNQTEPTLMEEYVSKGLVRIEWNDLPVNGEHALAAAKAGRAAAAQGK 179
+F+ + + ++ + Y S + W+D + LA ++ G A G
Sbjct: 212 DFNAWSKEYARGFAKTGK----SIYYSHASMSHSWDDWDYAAKVTLANSQKGTA----GY 263

Query: 180 FDEFRKALFEASRNVSGHPNNTLKDFERFARNAGVKD 216
F L + S +K+ + +G KD
Sbjct: 264 IYRF---LHDVSEGNDPSVGKNVKELVAYISTSGEKD 297


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS13280  HTHFIS  87  5e-22  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 87.2 bits (216), Expect = 5e-22
Identities = 30/118 (25%), Positives = 61/118 (51%), Gaps = 1/118 (0%)

Query: 2 TSILLVEDEESLADPLAFLLRKEGFDVVIAGDGPSALVEFDRNAIDIVLLDLMLPGMSGT 61
+IL+ +D+ ++ L L + G+DV I + + D+V+ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 DVCKRLRAV-SSVPVIMVTARDSEIDKVVGLELGADDYVTKPYSSRELIARIRAVLRR 118
D+ R++ +PV++++A+++ + + E GA DY+ KP+ ELI I L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS13275  PF06580  33  0.002  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.9 bits (75), Expect = 0.002
Identities = 25/110 (22%), Positives = 43/110 (39%), Gaps = 23/110 (20%)

Query: 264 LLVTAVSNLISNAINYSPQGMPISVATKISRDGRVLIRVIDNGIGISVENQKRVFERFFR 323
L+ T V N I + I PQG I + +G V + V + G ++++N K
Sbjct: 259 LVQTLVENGIKHGIAQLPQGGKILLKGT-KDNGTVTLEVENTG-SLALKNTKE------- 309

Query: 324 VDKARSRSTGGTGLGLAIVKH-VAANHGGD--VTLWSRPGTGSTFTIELP 370
TG GL V+ + +G + + L + G + +P
Sbjct: 310 ----------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKV-NAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS13240  THERMOLYSIN  30  0.006  Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 29.6 bits (66), Expect = 0.006
Identities = 14/83 (16%), Positives = 24/83 (28%), Gaps = 5/83 (6%)

Query: 35 LSPEPGEVHEFTEANGGAVATLFEVLPQELLPEAIRAMISQALKVKRVVTVGGLNGTSAP 94
L+P PG +A G V + + EA V G+ G
Sbjct: 190 LTPVPGNWIYMIDAADGKV-----LNKWNQMDEAKPGGAQPVAGTSTVGVGRGVLGDQKY 244

Query: 95 LSYTADVKGTPVDFKGEISMNGL 117
++ T + +G+
Sbjct: 245 INTTYSSYYGYYYLQDNTRGSGI 267


22  DIP_RS13110  DIP_RS12145  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
DIP_RS13110  2  24  0.141380  protease
DIP_RS13105  2  23  -0.673015  alkylated DNA repair protein
DIP_RS13100  2  23  -3.078307  hypothetical protein
DIP_RS13095  0  24  -5.328964  hypothetical protein
DIP_RS13090  0  31  -5.785548  hypothetical protein
DIP_RS13085  0  30  -4.477225  DNA-binding protein
DIP_RS13080  2  37  -3.515556  DNA-binding protein
DIP_RS13075  3  55  2.676493  hypothetical protein
DIP_RS13070  3  29  3.454105  integrase
DIP_RS13065  2  33  4.407319  integrase
DIP_RS13060  2  37  5.628230  hypothetical protein
DIP_RS13055  2  40  5.573330  *DNA polymerase III subunit delta'
DIP_RS13045  2  34  5.933317  adenylate cyclase
DIP_RS13040  -1  33  5.472156  hypothetical protein
DIP_RS13035  -3  32  4.432939  hypothetical protein
DIP_RS13030  -2  29  4.585042  oligopeptide transporter, OPT family
DIP_RS13025  -2  28  4.398856  hypothetical protein
DIP_RS23025  -1  25  4.252226  DNA topoisomerase I
DIP_RS13015  -1  27  4.383184  NAD-dependent dehydratase
DIP_RS13010  -1  24  3.968962  membrane protein
DIP_RS13005  -1  27  5.812437  ABC transporter substrate-binding protein
DIP_RS13000  -1  25  5.553417  ABC transporter
DIP_RS12995  2  20  5.307021  ABC transporter ATP-binding protein
DIP_RS12990  3  20  5.786858  ABC transporter ATP-binding protein
DIP_RS12985  3  20  5.898715  cold-shock protein
DIP_RS12975  6  21  5.262514  DEAD/DEAH box helicase
DIP_RS12970  6  27  4.162281  hypothetical protein
DIP_RS12965  5  24  4.613146  hypothetical protein
DIP_RS12960  3  22  3.999645  hypothetical protein
DIP_RS12955  2  20  4.362728  hypothetical protein
DIP_RS12950  0  17  3.864401  hypothetical protein
DIP_RS12945  -1  18  4.467897  general secretion pathway protein GspE
DIP_RS12940  -2  22  4.623353  morphological differentiation
DIP_RS12935  -2  23  4.083513  phosphoserine phosphatase
DIP_RS12930  -1  22  4.210233  hypothetical protein
DIP_RS12925  0  23  3.581173  membrane protein
DIP_RS12920  0  25  3.959919  hydrolase
DIP_RS12915  -1  24  3.269903  serine protease (mycosin)
DIP_RS12910  -1  21  3.073436  coenzyme A pyrophosphatase
DIP_RS12905  -1  21  2.795267  membrane protein
DIP_RS12900  -1  20  2.678635  endonuclease III
DIP_RS12895  -1  29  4.191199  Crp/Fnr family transcriptional regulator
DIP_RS12890  -2  26  4.098416  MBL fold metallo-hydrolase
DIP_RS12885  0  31  4.284591  LysR family transcriptional regulator
DIP_RS12880  0  27  3.074641  hypothetical protein
DIP_RS12875  0  35  4.009148  WhiB family transcriptional regulator
DIP_RS12870  2  23  3.232941  penicillin-binding protein
DIP_RS12865  2  21  2.463939  hypothetical protein
DIP_RS12860  3  22  2.724230  metallophosphoesterase
DIP_RS12850  3  27  2.392739  *hypothetical protein
DIP_RS12845  4  31  2.791272  membrane protein
DIP_RS12840  5  48  1.741083  cation:proton antiporter
DIP_RS12835  6  44  2.635728  cation:proton antiporter
DIP_RS12830  6  51  2.293478  cation:proton antiporter
DIP_RS12825  4  58  -0.265202  cation:proton antiporter
DIP_RS12820  4  66  0.321402  cation:proton antiporter
DIP_RS12815  7  68  1.378645  Na+/H+ antiporter subunit G
DIP_RS12810  6  67  1.541638  two-component sensor histidine kinase
DIP_RS12805  5  43  2.298328  DNA-binding response regulator
DIP_RS12800  4  35  3.081893  MFS transporter
DIP_RS12795  3  30  4.621110  ABC transporter
DIP_RS12790  1  16  1.239169  transporter
DIP_RS12785  0  17  1.255037  ABC transporter ATP-binding protein
DIP_RS12780  1  20  1.595813  catalase
DIP_RS12775  0  16  1.445913  RNA polymerase sigma factor SigC
DIP_RS12770  0  21  2.167322  aspartate-semialdehyde dehydrogenase
DIP_RS12765  -1  22  2.075389  membrane protein
DIP_RS12760  -2  37  5.508457  aspartate kinase
DIP_RS12755  -1  40  5.193433  membrane protein
DIP_RS12750  0  41  4.373681  hypothetical protein
DIP_RS12745  0  32  3.361402  phosphomannomutase
DIP_RS12740  0  25  1.514904  2-deoxyribose-5-phosphate aldolase
DIP_RS12735  0  22  1.270984  antibiotic MFS transporter
DIP_RS12730  0  19  1.821708  purine-nucleoside phosphorylase
DIP_RS12725  0  22  3.462763  hypothetical protein
DIP_RS12720  -2  23  4.296064  DNA-directed RNA polymerase sigma-70 factor
DIP_RS12715  -2  26  4.873074  2-isopropylmalate synthase
DIP_RS12710  -1  27  5.939867  NADPH-dependent oxidoreductase
DIP_RS12705  -2  28  7.060943  DNA polymerase III subunit epsilon
DIP_RS12700  -3  30  7.380196  UDP-N-acetylmuramyl peptide synthase
DIP_RS12695  -2  24  5.242283  glutamine amidotransferase
DIP_RS12690  -2  24  4.702831  recombination protein RecR
DIP_RS12685  -3  24  4.458361  nucleoid-associated protein
DIP_RS12680  -3  25  5.676824  DNA polymerase III subunit gamma/tau
DIP_RS12675  -2  26  5.058624  hypothetical protein
DIP_RS12670  -3  27  5.304120  aminotransferase
DIP_RS12660  -1  28  5.427914  *hypothetical protein
DIP_RS12655  -2  25  4.771669  gluconate kinase
DIP_RS12650  -2  28  5.418391  gluconate permease
DIP_RS12645  -1  27  4.483052  tRNA glutamyl-Q synthetase
DIP_RS12640  -1  24  3.850463  tRNA-guanine(34) transglycosylase
DIP_RS12635  0  25  2.906291  multidrug RND transporter
DIP_RS12630  -1  21  2.017262  excisionase
DIP_RS12620  0  22  1.908249  *tRNA-specific adenosine deaminase
DIP_RS12615  0  26  1.617498  hypothetical protein
DIP_RS12610  -1  32  1.541173  prephenate dehydrogenase
DIP_RS12605  1  22  0.258331  hypothetical protein
DIP_RS12600  1  25  2.263776  hypothetical protein
DIP_RS12595  2  27  3.197860  transposase
DIP_RS12590  3  31  3.565239  hypothetical protein
DIP_RS12585  2  29  2.703617  hypothetical protein
DIP_RS12580  2  32  3.610930  surface-anchored fimbrial subunit
DIP_RS12575  2  38  5.255665  surface anchored protein
DIP_RS12570  2  34  3.568698  class C sortase
DIP_RS12565  0  30  3.562456  fimbrial protein
DIP_RS12560  -1  25  4.122314  hypothetical protein
DIP_RS12555  2  30  5.820089  class C sortase
DIP_RS12550  3  22  4.298715  pyridoxal 5'-phosphate synthase subunit PdxT
DIP_RS12545  2  26  4.693403  pyridoxal biosynthesis lyase PdxS
DIP_RS12540  1  22  0.593322  GntR family transcriptional regulator
DIP_RS12535  1  24  -0.134635  polysaccharide deacetylase
DIP_RS12530  2  27  -0.473353  hypothetical protein
DIP_RS12525  1  44  -3.078583  toxin
DIP_RS12520  2  31  -4.978478  hypothetical protein
DIP_RS12515  3  32  -5.215524  hypothetical protein
DIP_RS12510  2  31  -0.283920  hypothetical protein
DIP_RS12505  1  29  -0.496269  peptidoglycan-binding protein
DIP_RS12500  1  27  -0.964825  hypothetical protein
DIP_RS12495  1  27  -1.356187  phage tail fiber protein
DIP_RS12485  3  25  -1.056230  hypothetical protein
DIP_RS23020  4  24  -0.929774  hypothetical protein
DIP_RS12475  5  25  -1.261200  hypothetical protein
DIP_RS12470  5  23  -1.297919  hypothetical protein
DIP_RS12465  5  22  -1.261333  hypothetical protein
DIP_RS12460  6  21  -0.918770  hypothetical protein
DIP_RS12455  2  26  -1.664178  phage tail protein
DIP_RS12450  4  30  -1.172638  hypothetical protein
DIP_RS12445  4  30  -1.260376  hypothetical protein
DIP_RS12440  5  32  -0.754360  hypothetical protein
DIP_RS12435  4  27  -1.455123  hypothetical protein
DIP_RS12430  5  27  -1.564816  hypothetical protein
DIP_RS12425  6  28  -1.443535  phage capsid protein
DIP_RS12420  5  26  -1.554745  primosomal replication protein N
DIP_RS12415  6  24  -1.694605  phage portal protein
DIP_RS12410  5  24  -0.969073  hypothetical protein
DIP_RS12400  6  27  -1.036151  hypothetical protein
DIP_RS12385  7  30  -1.393930  hypothetical protein
DIP_RS12380  4  33  -1.850623  hypothetical protein
DIP_RS12375  4  32  -4.276314  hypothetical protein
DIP_RS12370  3  32  -3.224092  hypothetical protein
DIP_RS12365  4  34  -2.339077  hypothetical protein
DIP_RS12360  5  33  -1.047117  antirepressor
DIP_RS12355  3  28  -0.506979  hypothetical protein
DIP_RS12350  3  29  1.241181  hypothetical protein
DIP_RS12345  4  34  1.169281  transcriptional regulator
DIP_RS12340  2  39  0.878493  hypothetical protein
DIP_RS12335  2  32  -1.219929  hypothetical protein
DIP_RS12330  -1  19  -1.465172  XRE family transcriptional regulator
DIP_RS12325  0  19  -1.588318  integrase
DIP_RS12320  1  18  -0.587593  *carboxylate transporter
DIP_RS12315  -1  19  0.550460  hypothetical protein
DIP_RS12310  -1  28  1.827502  **aminotransferase
DIP_RS12300  0  29  4.398630  *NADPH:quinone oxidoreductase
DIP_RS12295  0  33  6.238770  cysteine desulfurase
DIP_RS12280  -1  31  5.769923  sugar ABC transporter permease
DIP_RS12270  -1  29  5.290111  ABC transporter ATP-binding protein
DIP_RS12265  1  29  4.487780  iron ABC transporter substrate-binding protein
DIP_RS12260  1  29  3.373288  hypothetical protein
DIP_RS12255  -1  35  1.979155  hypothetical protein
DIP_RS12250  0  34  2.785385  manganese ABC transporter ATP-binding protein
DIP_RS12245  2  42  2.703219  hypothetical protein
DIP_RS12240  1  35  2.605644  glycosyl transferase
DIP_RS12235  -1  26  3.316297  hypothetical protein
DIP_RS12230  -1  25  4.084556  hypothetical protein
DIP_RS12225  0  28  5.254131  hypothetical protein
DIP_RS12220  0  26  4.257689  hypothetical protein
DIP_RS12215  0  35  4.785887  hypothetical protein
DIP_RS12210  1  27  4.090185  decaprenylphosphoryl-beta-D-ribose oxidase
DIP_RS12200  1  35  4.546683  short-chain dehydrogenase
DIP_RS12195  1  35  4.534725  membrane protein
DIP_RS12190  0  33  4.252741  arabinosyltransferase
DIP_RS12185  0  32  4.015910  hypothetical protein
DIP_RS12180  0  33  4.346204  glycine/betaine ABC transporter permease
DIP_RS12175  -1  25  3.476119  hypothetical protein
DIP_RS12170  -1  26  3.874073  hypothetical protein
DIP_RS12165  -1  25  4.273692  peptidase M13
DIP_RS12155  -2  23  4.614496  hypothetical protein
DIP_RS12150  -1  23  3.561208  alpha/beta hydrolase
DIP_RS12145  -1  25  3.026920  methylated-DNA--protein-cysteine
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS13105  V8PROTEASE  42  2e-06  V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 41.5 bits (97), Expect = 2e-06
Identities = 33/241 (13%), Positives = 68/241 (28%), Gaps = 42/241 (17%)

Query: 31 TNGTPVSPADDTAAEGVVQVAS--------CTGTVVASQWVLTAQHCVE----VPNLQRP 78
+ ++ + V + +G VV +LT +H V+ P+ +
Sbjct: 74 NDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHVVDATHGDPHALKA 133

Query: 79 VYVGTTREQQRREENTFTSDYAVWAPHGDVALVHVTDALPQRLVRKV-------RRAPVS 131
++ ++ GD+A+V + + + +V A
Sbjct: 134 FPSAINQD-NYPNGGFTAEQITKYSGEGDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQ 192

Query: 132 FGEQGRVYGWGAGTGETLQYARAAVGKTSSGVRPQGNQHGAFIVQYLDEAKAGRGDSGGP 191
+ V G+ + G Q+ G+SG P
Sbjct: 193 VNQNITVTGYPGDKPVATMWESKGKITYLKGE---AMQYDLSTTG---------GNSGSP 240

Query: 192 LF-VDGEVAGVTSFKAPQGGGRFSLFASLHGLGDWIAQ-------TTAAPSPSPRNPNSK 243
+F EV G+ P + +++ Q +P NP++
Sbjct: 241 VFNEKNEVIGIHWGGVPNEFNGAVFINE--NVRNFLKQNIEDIHFANDDQPNNPDNPDNP 298

Query: 244 N 244
N
Sbjct: 299 N 299


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS13100  BORPETOXINA  28  0.034  Bordetella pertussis toxin A subunit signature.
		>BORPETOXINA#Bordetella pertussis toxin A subunit signature.

Length = 269

Score = 27.8 bits (61), Expect = 0.034
Identities = 21/56 (37%), Positives = 26/56 (46%), Gaps = 5/56 (8%)

Query: 74 YPSYRYIDKIAGTTVPPVPDSLAALAPEALRAAAEVAEELAPWVETFVPEMVLVNY 129
Y S R + I GT V P A +A +A E +E +A W E MVLV Y
Sbjct: 212 YTSRRSVASIVGTLVRMAPVIGACMARQA-----ESSEAMAAWSERAGEAMVLVYY 262


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS13010  NUCEPIMERASE  40  2e-05  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 39.8 bits (93), Expect = 2e-05
Identities = 26/130 (20%), Positives = 50/130 (38%), Gaps = 28/130 (21%)

Query: 19 RVVVTGASGYVGARLVVELLSTGFNVRATSRSLSSLKRFPWYDQ--------------VE 64
+ +VTGA+G++G + LL G V + +L +YD +
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVG----IDNLN--DYYDVSLKQARLELLAQPGFQ 55

Query: 65 AVEADLSISEDVNRLFED--VDTVFYLVHSMSGTQNFEELESKIASKVA------QAAHS 116
+ DL+ E + LF + VF H ++ + E + S + +
Sbjct: 56 FHKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRH 115

Query: 117 NGVRQIIYLS 126
N ++ ++Y S
Sbjct: 116 NKIQHLLYAS 125


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS12995  TCRTETA  56  7e-11  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 56.4 bits (136), Expect = 7e-11
Identities = 83/384 (21%), Positives = 148/384 (38%), Gaps = 38/384 (9%)

Query: 10 ITFLLVTQMLFSIGFYLVVPFLAVQLSENLGASGTIV---GLVLGIRTFSQQGLFFFGGG 66
+ +L T L ++G L++P L L + L S + G++L + Q G
Sbjct: 7 LIVILSTVALDAVGIGLIMPVLPGLLRD-LVHSNDVTAHYGILLALYALMQFACAPVLGA 65

Query: 67 LADKFGAGPILLIGVAIRVIGFLTVGMAETVTLMTLGVVLIGFAAALFSPAVESIFAAEG 126
L+D+FG P+LL+ +A + + + A + ++ +G ++ G A + A
Sbjct: 66 LSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATG-AVAGAYIA--- 121

Query: 127 HRLEKLGVITRARLFALDAAYSRIGTLTGPVIGAVLIPLGFATVSIAGAAIFGSIFVI-H 185
+ RAR F +A G + GPV+G ++ A AA+ G F+
Sbjct: 122 ---DITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGC 178

Query: 186 LALRRTWTASAQPQRTHAPSMMAVWGEVIRNKKFVVFALLYSTYLLA--YNQQYLSLPVE 243
L + +P R A + +A + V A L + + + Q +L V
Sbjct: 179 FLLPESHKGERRPLRREALNPLAS---FRWARGMTVVAALMAVFFIMQLVGQVPAALWVI 235

Query: 244 L--RRATGSDEALGWFFAVSAVFVIALQSRVT-----RWAERQAAAAALGGGFGLMALSF 296
R +G A + Q+ +T R ER+A G +
Sbjct: 236 FGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALML----GMIADGTGY 291

Query: 297 AVVAVASISLELEGWIAYVPAAAMMILLHTGIMIAVPIARDLVGDLAGNNNLGSYYGFLN 356
++A A+ GW+A+ +M+LL +G I +P + ++ G G L
Sbjct: 292 ILLAFAT-----RGWMAFP----IMVLLASG-GIGMPALQAMLSRQVDEERQGQLQGSLA 341

Query: 357 SFGGLAVLLGSLTVGATLDHAETT 380
+ L ++G L A + TT
Sbjct: 342 ALTSLTSIVGPLLFTAIYAASITT 365


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS12945  PF05272  32  0.005  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.0 bits (72), Expect = 0.005
Identities = 11/45 (24%), Positives = 23/45 (51%)

Query: 168 LIAHNTFTAEVAQILENLVRERRPLLIVGGTGSGKTTLLSALLAT 212
L+ VA+++E + +++ G G GK+TL++ L+
Sbjct: 575 LVGKYILMGHVARVMEPGCKFDYSVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS12915  V8PROTEASE  50  6e-09  V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 50.0 bits (119), Expect = 6e-09
Identities = 56/298 (18%), Positives = 105/298 (35%), Gaps = 49/298 (16%)

Query: 98 AIGSVFQTLATLLVAWLVSIPLATGLSGGISQGIKNSEILGFVDHGAPSQLSALPSKISA 157
+ S+F +ATL A LVS P A LS S+ + N Q + + +
Sbjct: 7 KVSSLF--VATLTTATLVSSPAANALS---SKAMDN-----------HPQQTQSSKQQTP 50

Query: 158 MLNESGLPPLVSPFMEKQHSSQVEAP--AIKVADTALVERLRPSVIHVLGESEECSRRLM 215
+ + G + P +++H++ + ++ DT V ++ + E + +
Sbjct: 51 KIQKGGN---LKPLEQREHANVILPNNDRHQITDTTNGHY--APVTYI--QVEAPTGTFI 103

Query: 216 GSGFVVDATHVMTNAHVVAGTQRVS--LDTVVGMVDATVVYYNPQLDIAVLEAEGLNLPA 273
SG VV ++TN HVV T L ++ + + G A
Sbjct: 104 ASGVVVGKDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLA 163

Query: 274 LQWAPEPAETGADAIVMGFPES-GPFE---AAPARISDRIIIAG----PDIYANGRVERE 325
+ P E E P A +++ I + G + + +
Sbjct: 164 I-VKFSPNEQNKH-----IGEVVKPATMSNNAETQVNQNITVTGYPGDKPVATMWESKGK 217

Query: 326 SYTARGSIRQ-------GNSGGPMVDAEGNVIGVVFGASVDATDIGYALTAKEVLNMV 376
+G Q GNSG P+ + + VIG+ +G + + + + V N +
Sbjct: 218 ITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHWGGVPNEFNGAVFIN-ENVRNFL 274


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS12805  HTHFIS  100  3e-26  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 99.5 bits (248), Expect = 3e-26
Identities = 41/158 (25%), Positives = 75/158 (47%), Gaps = 4/158 (2%)

Query: 1 MHRLRVLVVDDEPQMVQIISYALELEGWEVLSASSAQRGWQLLNEYRCDLVILDVMLPDA 60
M +LV DD+ + +++ AL G++V S+A W+ + DLV+ DV++PD
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 SGYALCERIRAAGDVAGTPVIMLTALGDTDNRVEGLEAGADDYVAKPFSPKELV-LRAQA 119
+ + L RI+ A PV++++A ++ E GA DY+ KPF EL+ + +A
Sbjct: 61 NAFDLLPRIKKAR--PDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118

Query: 120 VVRRSGGVVQPEL-REIIVGEVSVNPATHAVVIGGRRV 156
+ + E + + V + A + R+
Sbjct: 119 LAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARL 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
DIP_RS12765  TONBPROTEIN  38  2e-04  Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 37.7 bits (87), Expect = 2e-04
Identities = 13/39 (33%), Positives = 19/39 (48%)

Query: 1011 PHPTPAPQPAPQPTPQKPSEQQPSPKAPEKKPQKEKKVL 1049
P P P+P P+P + P + P+ KP+ KKV
Sbjct: 69 VEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQ 107



Score = 34.6 bits (79), Expect = 0.002
Identities = 12/47 (25%), Positives = 21/47 (44%)

Query: 1008 PLFPHPTPAPQPAPQPTPQKPSEQQPSPKAPEKKPQKEKKVLARTGA 1054
P+ P AP +P P+ + +P K E+ + K V +R +
Sbjct: 77 PIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPAS 123



Score = 31.9 bits (72), Expect = 0.010
Identities = 15/51 (29%), Positives = 24/51 (47%)

Query: 997 PLIPLVIIPFLPLFPHPTPAPQPAPQPTPQKPSEQQPSPKAPEKKPQKEKK 1047
P P+ + P P A QP P+P + E +P P+ P++ P +K
Sbjct: 41 PAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEK 91



Score = 31.5 bits (71), Expect = 0.013
Identities = 14/39 (35%), Positives = 21/39 (53%), Gaps = 2/39 (5%)

Query: 1011 PHPTPAPQPAPQ-PTPQKPSEQQPSPKA-PEKKPQKEKK 1047
P P P P+P + P + + +P PK P KK Q++ K
Sbjct: 73 PEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPK 111



Score = 31.1 bits (70), Expect = 0.019
Identities = 10/37 (27%), Positives = 15/37 (40%)

Query: 1011 PHPTPAPQPAPQPTPQKPSEQQPSPKAPEKKPQKEKK 1047
P P P P+P P+ E +KP+ + K
Sbjct: 61 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPK 97


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS12760  CARBMTKINASE  32     0.005    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 31.7 bits (72), Expect = 0.005
Identities = 22/100 (22%), Positives = 41/100 (41%), Gaps = 7/100 (7%)

Query: 112 RIVDVTPGRVREALDEGKICLVAGFQG--VNRESKDVTTL-GRGGSDTTAVALAAALNAD 168
V+ +++ ++ G I + +G G V E ++ + D LA +NAD
Sbjct: 172 GHVEAET--IKKLVERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNAD 229

Query: 169 VCEIYSDVDGVYTADPRIVPNAQKLEKLCFEEMLELAASG 208
+ I +DV+G Q L ++ EE+ + G
Sbjct: 230 IFMILTDVNGAALYYGT--EKEQWLREVKVEELRKYYEEG 267


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
DIP_RS12735  TCRTETB  149    2e-42    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 149 bits (378), Expect = 2e-42
Identities = 96/376 (25%), Positives = 183/376 (48%), Gaps = 7/376 (1%)

Query: 41 MAEGLGIDPNTASLQASLAGVIIGIGAVVYAALADAISIRKLMLIGIGLVVVGSVIGFVF 100
+A P + + + + IG VY L+D + I++L+L GI + GSVIGFV
Sbjct: 40 IANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVG 99

Query: 101 SGSWPLVLAGRLIQTGGLAAAETLYVIYVTKHLAAEDQKTYLGFSTAAFQSGLLVGALTS 160
+ L++ R IQ G AA L ++ V +++ E++ G + G VG
Sbjct: 100 HSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIG 159

Query: 161 GAISTYIGWRVMFLVPLILIVAVPFILKTVPEEEASSSHLDVVGLFLIAIFATSVIQYMQ 220
G I+ YI W + L+P+I I+ VPF++K + +E H D+ G+ L+++ + +
Sbjct: 160 GMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTT 219

Query: 221 AFKLFWLAFMLVSIVIFVWYVRNAKNPVVNPEFFKNGRYVWAILLVLIVYSTQLGYIVLL 280
++ + +L ++S +IFV ++R +P V+P KN ++ +L I++ T G++ ++
Sbjct: 220 SYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMV 279

Query: 281 PFAAKEFHGLDQAQASYLMIPGYICAVLIGIFSGKIGKLMTSRRTIFTALGMIIVALVVG 340
P+ K+ H L A+ ++I + I G IG ++ RR L + + L V
Sbjct: 280 PYMMKDVHQLSTAEIGSVII---FPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVS 336

Query: 341 ALAIQVHV---AVAIASIILFASGFALLYAPLVNTALANIL-PEKSGVAIGFYNLTINIG 396
L + + + II+F G +++T +++ L +++G + N T +
Sbjct: 337 FLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLS 396

Query: 397 VPLGIAYTFKLMNLSI 412
GIA L+++ +
Sbjct: 397 EGTGIAIVGGLLSIPL 412


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
DIP_RS12680  GPOSANCHOR  41     2e-05    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 40.8 bits (95), Expect = 2e-05
Identities = 19/90 (21%), Positives = 22/90 (24%), Gaps = 16/90 (17%)

Query: 520 EPENKKVWNPEPKPEP--------KPEPKPEPKPEPEPAPASDNVWGAPAPLGGGQPATP 571
E E K + K E K + K E A A P
Sbjct: 418 ELEESKKLTEKEKAELQAKLEAEAKALKEKLAKQAEELAKL-----RAGKASDSQTPDAK 472

Query: 572 PPPPVERFSPAKATAPQPPAAPQPVQEAPK 601
P P K APQ P + K
Sbjct: 473 PGNKA---VPGKGQAPQAGTKPNQNKAPMK 499


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS12670  BCTERIALGSPD  32     0.006    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 31.8 bits (72), Expect = 0.006
Identities = 26/132 (19%), Positives = 46/132 (34%), Gaps = 15/132 (11%)

Query: 267 ASSKENIEWYASH------------ANVRGIGPNKLNQLAHAQFFGDVAGLKAHMLKHAA 314
+E I + + + + LN+ + QFF V L +
Sbjct: 37 TDIQEFINTVSKNLNKTVIIDPSVRGTITVRSYDMLNEEQYYQFFLSV--LDVYGFAVIN 94

Query: 315 SLAPKFERVLEILDSRLSEYGVAKWTSPTGGYFISVDVVPGTASRVVELAKEAGIALTGA 374
+ +V+ D++ + VA +P G + VVP T +LA A
Sbjct: 95 -MNNGVLKVVRSKDAKTAAVPVASDAAPGIGDEVVTRVVPLTNVAARDLAPLLRQLNDNA 153

Query: 375 GSSFPLHNDPNN 386
G +H +P+N
Sbjct: 154 GVGSVVHYEPSN 165


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS12635  ACRIFLAVINRP  53     6e-09    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 52.5 bits (126), Expect = 6e-09
Identities = 34/205 (16%), Positives = 79/205 (38%), Gaps = 9/205 (4%)

Query: 216 IIGVAIAAIILIFTFGSLVAAGLPLLIAVIGVGIGSLSITLATAWVSLNNVTPVLAVMLG 275
+ + +++ ++ A +P IAV V +G+ +I A S+N +T + ++L
Sbjct: 345 FEAIMLVFLVMYLFLQNMRATLIPT-IAVPVVLLGTFAILAAFG-YSINTLT-MFGMVLA 401

Query: 276 LAVGIDYSLFIMFRYRRELL--HLDKEQAAGMAVGTAGSAVVFAGLTVIIALVALAV--- 330
+ + +D ++ ++ R ++ L ++A ++ A+V + + + +A
Sbjct: 402 IGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGG 461

Query: 331 ANIPFLTYMGLAAAFTVFIAVLIALTMVPALLGALGDKAFAVRLRRKRRT-SPARTLGRK 389
+ + + ++VL+AL + PAL L A K T
Sbjct: 462 STGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDH 521

Query: 390 WGELVHRAPGVVIAVSVVTLGALTL 414
+ G ++ + L L
Sbjct: 522 SVNHYTNSVGKILGSTGRYLLIYAL 546


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS12580  PRTACTNFAMLY  35     0.002    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 34.6 bits (79), Expect = 0.002
Identities = 16/59 (27%), Positives = 22/59 (37%), Gaps = 7/59 (11%)

Query: 150 PNTVISAKFDVVGGTTQPEPEPDPAPEPQP-------APEPEPGTSVNPAAPTAVDPAT 201
N S +P P+P P P P AP+P G ++ AA AV+
Sbjct: 560 GNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAGRELSAAANAAVNTGG 618


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS12535  PHPHTRNFRASE  32     0.003    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 32.1 bits (73), Expect = 0.003
Identities = 46/218 (21%), Positives = 89/218 (40%), Gaps = 30/218 (13%)

Query: 36 QARIAEDAGASAVMALERVPADIRAQGGVARMSDPDLIEGIVNAVSIPVMAKARIGHFVE 95
+ +I + +A AL+ V + M + + E A I ++K +GH +
Sbjct: 91 KGKIENE-QMNAEYALKEVSDMFVSM--FESMDNEYMKE---RAADIRDVSKRVLGHLIG 144

Query: 96 AQI--LESLGVDFIDESEVLSPADYVNHIDKWNFDVPFVCG-ATNLGEALRRITEGAAMI 152
+ L ++ + + +E L+P+D + FV G AT++G R + A M
Sbjct: 145 VETGSLATIAEETVIIAEDLTPSDTAQ------LNKQFVKGFATDIGG---RTSHSAIMS 195

Query: 153 RSKGE---AGTGDVSEAVKHLRTI-----RGEINRLRSMDEDQLYVAAKEI----QAPYD 200
RS GT +V+E ++H + G + + +E + Y + + +
Sbjct: 196 RSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYEEKRAAFEKQKQEWA 255

Query: 201 LVREVAATGKLPVTLFVAGGVATPADAALVMQMGAEGV 238
+ +T K + +A + TP D V+ G EG+
Sbjct: 256 KLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGI 293


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS12515  DPTHRIATOXIN  1114   0.0      Diphtheria toxin signature.
		>DPTHRIATOXIN#Diphtheria toxin signature.

Length = 567

Score = 1114 bits (2881), Expect = 0.0
Identities = 559/560 (99%), Positives = 560/560 (100%)

Query: 1 MSRKLFASILIGALLGIGAPPSAHAGADDVVDSSKSFVMENFSSYHGTKPGYVDSIQKGI 60
+SRKLFASILIGALLGIGAPPSAHAGADDVVDSSKSFVMENFSSYHGTKPGYVDSIQKGI
Sbjct: 8 VSRKLFASILIGALLGIGAPPSAHAGADDVVDSSKSFVMENFSSYHGTKPGYVDSIQKGI 67

Query: 61 QKPKSGTQGNYDDDWKGFYSTDNKYDAAGYSVDNENPLSGKAGGVVKVTYPGLTKVLALK 120
QKPKSGTQGNYDDDWKGFYSTDNKYDAAGYSVDNENPLSGKAGGVVKVTYPGLTKVLALK
Sbjct: 68 QKPKSGTQGNYDDDWKGFYSTDNKYDAAGYSVDNENPLSGKAGGVVKVTYPGLTKVLALK 127

Query: 121 VDNAETIKKELGLSLTEPLMEQVGTEEFIKRFGDGASRVVLSLPFAEGSSSVEYINNWEQ 180
VDNAETIKKELGLSLTEPLMEQVGTEEFIKRFGDGASRVVLSLPFAEGSSSVEYINNWEQ
Sbjct: 128 VDNAETIKKELGLSLTEPLMEQVGTEEFIKRFGDGASRVVLSLPFAEGSSSVEYINNWEQ 187

Query: 181 AKALSVELEINFETRGKRGQDAMYEYMAQACAGNRVRRSVGSSLSCINLDWDVIRDKTKT 240
AKALSVELEINFETRGKRGQDAMYEYMAQACAGNRVRRSVGSSLSCINLDWDVIRDKTKT
Sbjct: 188 AKALSVELEINFETRGKRGQDAMYEYMAQACAGNRVRRSVGSSLSCINLDWDVIRDKTKT 247

Query: 241 KIESLKEHGPIKNKMSESPNKTVSEEKAKQYLEEFHQTALEHPELSELKTVTGTNPVFAG 300
KIESLKEHGPIKNKMSESPNKTVSEEKAKQYLEEFHQTALEHPELSELKTVTGTNPVFAG
Sbjct: 248 KIESLKEHGPIKNKMSESPNKTVSEEKAKQYLEEFHQTALEHPELSELKTVTGTNPVFAG 307

Query: 301 ANYAAWAVNVAQVIDSETADNLEKTTAALSILPGIGSVMGIADGAVHHNTEEIVAQSIAL 360
ANYAAWAVNVAQVIDSETADNLEKTTAALSILPGIGSVMGIADGAVHHNTEEIVAQSIAL
Sbjct: 308 ANYAAWAVNVAQVIDSETADNLEKTTAALSILPGIGSVMGIADGAVHHNTEEIVAQSIAL 367

Query: 361 SSLMVAQAIPLVGELVDIGFAAYNFVESIINLFQVVHNSYNRPAYSPGHKTQPFLHDGYA 420
SSLMVAQAIPLVGELVDIGFAAYNFVESIINLFQVVHNSYNRPAYSPGHKTQPFLHDGYA
Sbjct: 368 SSLMVAQAIPLVGELVDIGFAAYNFVESIINLFQVVHNSYNRPAYSPGHKTQPFLHDGYA 427

Query: 421 VSWNTVEDSIIRTGFQGESGHDIKITAENTPLPIAGVLLPTIPGKLDVNKSKTHISVNGR 480
VSWNTVEDSIIRTGFQGESGHDIKITAENTPLPIAGVLLPTIPGKLDVNKSKTHISVNGR
Sbjct: 428 VSWNTVEDSIIRTGFQGESGHDIKITAENTPLPIAGVLLPTIPGKLDVNKSKTHISVNGR 487

Query: 481 KIRMRCRAIDGDVTFCRPKSPVYVGNGVHANLHVAFHRSSSEKIHSNEISSDSIGVLGYQ 540
KIRMRCRAIDGDVTFCRPKSPVYVGNGVHANLHVAFHRSSSEKIHSNEISSDSIGVLGYQ
Sbjct: 488 KIRMRCRAIDGDVTFCRPKSPVYVGNGVHANLHVAFHRSSSEKIHSNEISSDSIGVLGYQ 547

Query: 541 KTVDHTKVNSKLSLFFEIKS 560
KTVDHTKVNSKLSLFFEIKS
Sbjct: 548 KTVDHTKVNSKLSLFFEIKS 567


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
DIP_RS12460  RTXTOXINA  32     0.025    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 32.2 bits (73), Expect = 0.025
Identities = 17/61 (27%), Positives = 26/61 (42%), Gaps = 8/61 (13%)

Query: 1600 ILKGLGGAAGGAAAGFALGGPVGALLGGIAGLLFGGAGQVMNGIAEIKQGRMQESTYKKY 1659
+L + AA +G PV AL+G + G++ +GI E + M E K
Sbjct: 374 VLASVSSGISAAATTSLVGAPVSALVGAVTGII--------SGILEASKQAMFEHVASKM 425

Query: 1660 A 1660
A
Sbjct: 426 A 426


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS12385  ARGREPRESSOR  27     0.030    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 27.1 bits (60), Expect = 0.030
Identities = 11/47 (23%), Positives = 20/47 (42%)

Query: 135 GTCREIASACRRLGYEISKTTVHRWAREETISSKTMEDGRVIVSLQE 181
T E+ ++ GY +++ TV R +E + +G SL
Sbjct: 20 ETQDELVDILKKDGYNVTQATVSRDIKELHLVKVPTNNGSYKYSLPA 66


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS12250  ADHESNFAMILY  199    6e-64    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 199 bits (508), Expect = 6e-64
Identities = 78/333 (23%), Positives = 148/333 (44%), Gaps = 34/333 (10%)

Query: 12 AVAATGLVAAAGCSTTDSGTSASGTSSAAKSDTLKIFATTSYIGDAVKNIAPD-ADLTVM 70
V + C++ T++ LK+ AT S I D KNIA D DL +
Sbjct: 8 LVLFLSAIILVACASGKKDTTSG--------QKLKVVATNSIIADITKNIAGDKIDLHSI 59

Query: 71 VGPGGDPHTYQPSTADLEAMQNADAVIWSGLGMEANMIDQLRGLGDKQIAVAEQLPESQL 130
V G DPH Y+P D++ AD + ++G+ +E L + A++
Sbjct: 60 VPIGQDPHEYEPLPEDVKKTSEADLIFYNGINLETGGNAWFTKLVEN----AKKTENKDY 115

Query: 131 LPWVEEDEHDHDHGDAHEHGHEGEDAHGHHHESQWDPHVWNSTDNWKLVVDQIVKKLSAA 190
A G + G + + + DPH W + +N + I K+LSA
Sbjct: 116 F--------------AVSDGVDVIYLEGQNEKGKEDPHAWLNLENGIIFAKNIAKQLSAK 161

Query: 191 DSANADTYKANGEKYNKQIDEAKAYVQAKIDTIPQDQRTLVSGHDAFRYFGKQFGLEVKA 250
D N + Y+ N ++Y ++D+ + K + IP +++ +V+ AF+YF K +G+
Sbjct: 162 DPNNKEFYEKNLKEYTDKLDKLDKESKDKFNKIPAEKKLIVTSEGAFKYFSKAYGVPSAY 221

Query: 251 TDFVTSDAERSANELEDLATFIVEHHVPVIFQDASANPQAVKSLEENVAKKGGKVKVVDK 310
+ ++ E + +++ L + + VP +F ++S + + +K++ ++
Sbjct: 222 IWEINTEEEGTPEQIKTLVEKLRQTKVPSLFVESSVDDRPMKTVSQDTNIPIY------A 275

Query: 311 ELYSDSLGADA-PADTYIGALKYNADTIAEAFS 342
++++DS+ D+Y +KYN D IAE +
Sbjct: 276 QIFTDSIAEQGKEGDSYYSMMKYNLDKIAEGLA 308


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits      Score  E-value  Comments
DIP_RS12230  adhesinb  215    2e-70    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 215 bits (549), Expect = 2e-70
Identities = 88/318 (27%), Positives = 158/318 (49%), Gaps = 25/318 (7%)

Query: 14 VAALALTGSLVACSTDSASTSATTSAANKALTVFATTGYIGDAVKNIAPD-ADVTIMVGP 72
V L L ACS+ S+T + ++K L V AT I D KNIA D ++ +V
Sbjct: 8 VLLLLAFVGLAACSS---QKSSTETGSSK-LNVVATNSIIADITKNIAGDKINLHSIVPV 63

Query: 73 GGDPHTYQPTTQDISKIESSDVVLWSGLHMEAKMLDQL-------KAQGDRQ-AAVAEAI 124
G DPH Y+P +D+ K +D++ ++G+++E K + ++ AV+E +
Sbjct: 64 GQDPHEYEPLPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENAKKKENKDYYAVSEGV 123

Query: 125 PEDKRLDWPEPGDNGEKLYDPHVWNSTENWKYVVDAIAKKLSEVDKDNAATYKDNAEKYK 184
G + + DPH W + EN IAK+LSE D N TY+ N + Y
Sbjct: 124 D-----VIYLEGQSEKGKEDPHAWLNLENGIIYAQNIAKRLSEKDPANKETYEKNLKAYV 178

Query: 185 KEIDETAAYVKEQIDQIPEQKRILITGHDAFSYFGKQFGVEIHATDFVTSESEMSPAELA 244
+++ KE+ + IP +K++++T F YF K + V + +E E +P ++
Sbjct: 179 EKLSALDKEAKEKFNNIPGEKKMIVTSEGCFKYFSKAYNVPSAYIWEINTEEEGTPDQIK 238

Query: 245 ELGKFIAEKKIPTIFQDNLANPQAINSLKETVKAKGWNVEISDKELYADSLGESA-PTDT 303
L + + + K+P++F ++ + + + ++ +K N+ I +++ DS+ E D+
Sbjct: 239 TLVEKLRKTKVPSLFVESSVDDRPMKTV-----SKDTNIPIYA-KIFTDSVAEKGEEGDS 292

Query: 304 YLGVLKYNADAIREALAK 321
Y ++KYN + I E L+K
Sbjct: 293 YYSMMKYNLEKIAEGLSK 310


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
DIP_RS12200  TACYTOLYSIN  26     0.014    Bacterial thiol-activated pore-forming cytolysin sig...
		>TACYTOLYSIN#Bacterial thiol-activated pore-forming cytolysin signature.

Length = 574

Score = 26.1 bits (57), Expect = 0.014
Identities = 12/37 (32%), Positives = 20/37 (54%)

Query: 9 QPVPQPQEATSDTIATQHTSAPLSNDMMLLMAEEIPM 45
QP P+ E T++ + SNDM+ L +E+P+
Sbjct: 52 QPKPESSELTTEKAGQKMDDMLNSNDMIKLAPKEMPL 88


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS12190  DHBDHDRGNASE  50     2e-09    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 49.7 bits (118), Expect = 2e-09
Identities = 49/199 (24%), Positives = 72/199 (36%), Gaps = 9/199 (4%)

Query: 2 LNAVG-QAQNILLLGGTSEIGLSIVAEFLAKGPAHVTLAARQDSPRIDAAVAQMKAAGAS 60
+NA G + + + G IG + VA LA AH+ A +P V A A
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEA-VARTLASQGAHIA--AVDYNPEKLEKVVSSLKAEAR 57

Query: 61 EVDVIDFDATDFDKHAEVID-LAWAQGDVDLAIVAFGTL--GDQEQLWQDQKAAVTSAQT 117
+ D D E+ + G +D+ + G L G L ++ A S +
Sbjct: 58 HAEAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNS 117

Query: 118 NYTAPVSVGVLLGEKFKEQGHGTIVALSSVAGQRVRRSNFVYGASKAGMDGFYVNLGEAL 177
S V + ++ G+IV + S R S Y +SKA F LG L
Sbjct: 118 TGVFNASRSVS--KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLEL 175

Query: 178 RPSGANVVVVRPGQVRTKM 196
+V PG T M
Sbjct: 176 AEYNIRCNIVSPGSTETDM 194


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
DIP_RS12150  TCRTETB  28     0.025    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 28.3 bits (63), Expect = 0.025
Identities = 15/44 (34%), Positives = 25/44 (56%), Gaps = 7/44 (15%)

Query: 154 IDGTLRGFVLRVPHLLVPENSPRLGHNISPMRLNGSVVMSPKTV 197
I GT+ GFV VP+++ + H +S + GSV++ P T+
Sbjct: 268 IFGTVAGFVSMVPYMM------KDVHQLSTAEI-GSVIIFPGTM 304


23  DIP_RS22485  DIP_RS22450  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
DIP_RS22485  -2      29       3.182794   membrane protein insertase YidC
DIP_RS22475  -3      26       3.828273   class E sortase
DIP_RS22465  -1      26       3.232048   antiporter
DIP_RS22460  0       23       4.310296   hypothetical protein
DIP_RS22455  2       23       3.977466   two-component sensor histidine kinase
DIP_RS22450  2       23       3.573502   DNA-binding response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
DIP_RS22480  60KDINNERMP  78     2e-17    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 77.7 bits (191), Expect = 2e-17
Identities = 54/265 (20%), Positives = 97/265 (36%), Gaps = 71/265 (26%)

Query: 5 FIYPVSGVMRLWHYIFADLFGCSQSQAWVASLFALVVTVRSIIAPFSWMQFKSGRFAIMM 64
+++ +S + G W S+ + VR I+ P + Q+ S M+
Sbjct: 332 WLWFISQPLFKLLKWIHSFVG-----NWGFSIIIITFIVRGIMYPLTKAQYTSMAKMRML 386

Query: 65 RPKIKRLKEEYAERTDKESILEQERRQKEIQEEY---GYSMAAGCVPALIQVPVFLGLYQ 121
+PKI+ ++E + +++R +E+ Y + GC P LIQ+P+FL LY
Sbjct: 387 QPKIQAMRERLGD--------DKQRISQEMMALYKAEKVNPLGGCFPLLIQMPIFLALYY 438

Query: 122 VLLRMARPKEGLDATVHEPIGMLTSDNVRSFLDTRFFGVPLPAYNSMSPEQLAHLGTDQP 181
+L+ +V P + + L P
Sbjct: 439 MLMG----------SVE------------------LRQAPFALW-------IHDLSAQDP 463

Query: 182 TIHAFVLPLIIAACVFTTINMIVSTLRTRYSIDHDSEFAVGMYRFLIVMVLVAPIGLLQT 241
++LP+++ +F M T T D M F+ V+ V
Sbjct: 464 Y---YILPILMGVTMFFIQKM-SPTTVT------DPMQQKIMT-FMPVIFTV-------- 504

Query: 242 GIFGPIPAAICLYWVANNLWTLIQN 266
F P+ + LY++ +NL T+IQ
Sbjct: 505 -FFLWFPSGLVLYYIVSNLVTIIQQ 528


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
DIP_RS22465  TCRTETB  41     1e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 40.6 bits (95), Expect = 1e-05
Identities = 38/141 (26%), Positives = 59/141 (41%), Gaps = 23/141 (16%)

Query: 49 FKAAQPMLKEDFGLTTLQLGYIGLAFSITYGIGKTLVGYFVDGHNSKRIISTLLICASTM 108
+ P + DF ++ AF +T+ IG + G D KR+
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRL----------- 81

Query: 109 VLLMGLLLSYFGSVIGI----FIVLWGLNGLFQSAGGPASYSTI----SRWAPRTKRGKY 160
LL G++++ FGSVIG F L + Q AG A + + +R+ P+ RGK
Sbjct: 82 -LLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKA 140

Query: 161 LGLWNASHNVG---GALAGGL 178
GL + +G G GG+
Sbjct: 141 FGLIGSIVAMGEGVGPAIGGM 161


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS22455  TYPE3IMSPROT  32     0.004    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 32.0 bits (73), Expect = 0.004
Identities = 22/111 (19%), Positives = 48/111 (43%), Gaps = 13/111 (11%)

Query: 145 SIYMVFPLFF-LYLRVLPDIRGIILVVGATAIAITSQLAQLTIGAVMGPVVSALVVIAIH 203
SI V L +++ + ++ ++ + IT L Q+ ++ V +V+
Sbjct: 143 SILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLGQILRQLMVICTVGFVVISIAD 202

Query: 204 FAFEAIWKGARE----REELIDELLATRNQLAETERAAGIAAERQRIAHEI 250
+AFE ++ +E ++E+ E E E + I ++R++ EI
Sbjct: 203 YAFE-YYQYIKELKMSKDEIKRE-------YKEMEGSPEIKSKRRQFHQEI 245


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
DIP_RS22450  HTHFIS  57     6e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 57.1 bits (138), Expect = 6e-12
Identities = 30/126 (23%), Positives = 51/126 (40%), Gaps = 12/126 (9%)

Query: 2 IRVLLADDHEIVRLGLRAVLESA-EDIEVIGEVATAEAAIAAAQAGGIDVILMDLRFGPG 60
+L+ADD +R L L A D+ + + A AG D+++ D+
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRI---TSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 VQGTKLTSGADATAAIRRRMDNPPEVLVVTNYDTDADILGAIEAGALGYMLKDAPPEELL 120
+ D I++ + P VLV++ +T + A E GA Y+ K EL+
Sbjct: 61 -------NAFDLLPRIKKARPDLP-VLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELI 112

Query: 121 AAVRSA 126
+ A
Sbjct: 113 GIIGRA 118


24  DIP_RS20945  DIP_RS20900  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
DIP_RS20945  2       29       4.850193   acetyltransferase
DIP_RS20940  1       22       3.530043   sulfonamide-resistant dihydropteroate synthase
DIP_RS20935  0       16       1.464965   transposase
DIP_RS20930  -2      16       0.233664   uracil-xanthine permease
DIP_RS20925  -1      16       -0.470000  TetR family transcriptional regulator
DIP_RS20920  -2      15       -2.345143  membrane protein
DIP_RS20910  -1      18       -2.346561  hypothetical protein
DIP_RS20905  -1      19       -2.034164  DNA-binding response regulator
DIP_RS20900  -2      17       -0.416064  two-component sensor histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS20940  SACTRNSFRASE  38     3e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 38.4 bits (89), Expect = 3e-06
Identities = 17/70 (24%), Positives = 27/70 (38%), Gaps = 15/70 (21%)

Query: 90 AYLHKLAVRRTHAGRGVSSALIEACRHAARTQGCAKLRLD--------CHPNLRGLYERL 141
A + +AV + + +GV +AL+ A+ L L+ CH Y +
Sbjct: 90 ALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACH-----FYAKH 144

Query: 142 GFT--HVDTF 149
F VDT
Sbjct: 145 HFIIGAVDTM 154


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
DIP_RS20915  HTHTETR  46     2e-08    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 46.2 bits (109), Expect = 2e-08
Identities = 15/82 (18%), Positives = 32/82 (39%)

Query: 2 RTNKKQQLLATAMDIVESDGLQGLTYDSLSTATGTSKSGLIYHFPSRRALEVELNKYSAS 61
+Q +L A+ + G+ + ++ A G ++ + +HF + L E+ + S S
Sbjct: 9 AQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSES 68

Query: 62 LWTQALEDIAGAPPQQLDLPLR 83
+ + P LR
Sbjct: 69 NIGELELEYQAKFPGDPLSVLR 90


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
DIP_RS20910  TCRTETB  136    2e-37    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 136 bits (344), Expect = 2e-37
Identities = 78/414 (18%), Positives = 159/414 (38%), Gaps = 18/414 (4%)

Query: 18 HDQRWIFLGIISLGLFVVGADNSVLYTALPALHVHLHTSELEGLWIINAYSLVLAGLLLG 77
H+Q I+L I+S F + VL +LP + + W+ A+ L +
Sbjct: 12 HNQILIWLCILS---FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAV 68

Query: 78 TGTLGDKIGHRRMFEIGLSVFAGASILAAVSPNAAT-LITARALLGIGASTMMPASLALL 136
G L D++G +R+ G+ + S++ V + + LI AR + G GA+ PA + ++
Sbjct: 69 YGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAA-AFPALVMVV 127

Query: 137 RITFTDIRERNTAIGIWGAVATLGAALGPVLGGFLLEHYYWGSIFLINVPVVIIALVGTY 196
+ R A G+ G++ +G +GP +GG + + +W +L+ +P++ I V
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITV--- 182

Query: 197 FMAPKNRPNPQRSWDFLSSMFAMFFMVGLVLLIKECAHTPISFAMLALGFAFMAVGAALF 256
K R + VG+V + ISF ++++ +F
Sbjct: 183 PFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFL------IF 236

Query: 257 WWRHSHLDEPLLRFGVFRNSLFSAGVLAATCAMLILSGTELLTTQRFQLAEGFTPLQAGL 316
+ +P + G+ +N F GVL ++G + + + + G
Sbjct: 237 VKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGS 296

Query: 317 LSAATA-LAAFPSSIGGGAIVHKVGFRPLISGGFLVMCTGGALTAFSVPHNIFPLLITGL 375
+ ++ GG +V + G +++ G + +F + + + I +
Sbjct: 297 VIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIV 356

Query: 376 LLCGFGAGMVMSVSSTAVIGSAPSRDSGMASAMEAVSYEFGSLISVALFGSLFS 429
+ G + +S+ S + S+ +A+ G L S
Sbjct: 357 FVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSF-LSEGTGIAIVGGLLS 409



Score = 33.7 bits (77), Expect = 0.002
Identities = 21/100 (21%), Positives = 40/100 (40%), Gaps = 2/100 (2%)

Query: 333 GAIVHKVGFRPLISGGFLVMCTGGALTAFSVPHNIFPLLITGLLLCGFGAGMVMSVSSTA 392
G + ++G + L+ G ++ C G + H+ F LLI + G GA ++
Sbjct: 70 GKLSDQLGIKRLLLFGIIINCFGSVIGFVG--HSFFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 393 VIGSAPSRDSGMASAMEAVSYEFGSLISVALFGSLFSFFY 432
V P + G A + G + A+ G + + +
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIH 167


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
DIP_RS20895  HTHFIS  102    1e-27    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 102 bits (257), Expect = 1e-27
Identities = 33/119 (27%), Positives = 57/119 (47%)

Query: 8 ATRVLVVDDEPNIVELLKVSLKFQGFEVETAQSGIEALEKARSFQPDAFILDVMMPGMDG 67
+LV DD+ I +L +L G++V + + D + DV+MP +
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 68 YELLPKLRADGFEGPVLYLTAKDAVEHRIHGLTIGADDYVTKPFSLEEVITRLRVILRR 126
++LLP+++ + PVL ++A++ I GA DY+ KPF L E+I + L
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
DIP_RS20890  PF06580  38     6e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.3 bits (89), Expect = 6e-05
Identities = 25/139 (17%), Positives = 46/139 (33%), Gaps = 36/139 (25%)

Query: 381 YPDR---DIDVRSECTDVPVVEGDAARLHQVLTNLVANALKHG----GDSARVTIKLADA 433
+ DR + + DV V ++ LV N +KHG ++ +K
Sbjct: 236 FEDRLQFENQINPAIMDVQV-------PPMLVQTLVENGIKHGIAQLPQGGKILLKGTKD 288

Query: 434 GSNFAVKVIDDGIGLSEEDASHIFERFYRADSSRARSTGGSGLGLAIVKS-LVESHGGE- 491
++V + G + +G GL V+ L +G E
Sbjct: 289 NGTVTLEVENTGSLALKNTKE------------------STGTGLQNVRERLQMLYGTEA 330

Query: 492 -VSVESEQGHGTTFIVELP 509
+ + +QG +V +P
Sbjct: 331 QIKLSEKQG-KVNAMVLIP 348


25  DIP_RS14040  DIP_RS14015  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
DIP_RS14040  -2      17       -2.425983  peptidase
DIP_RS14035  -3      14       -1.253939  sialidase
DIP_RS14030  -2      14       -1.144631  type I methionyl aminopeptidase
DIP_RS14025  -3      18       0.935060   adenylate kinase
DIP_RS14020  -2      19       0.026999   preprotein translocase subunit SecY
DIP_RS14015  -2      17       -1.125095  sugar ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
DIP_RS14035  RTXTOXINA  32     0.002    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 32.2 bits (73), Expect = 0.002
Identities = 25/92 (27%), Positives = 36/92 (39%), Gaps = 14/92 (15%)

Query: 24 TATSAALLTAQPATASAAPLSSGSSATSSINDQLQTIGQQTRDGAWNL------------ 71
T + A + + +A + + SA S D L+ +QTR+ L
Sbjct: 2 TTITTAQIKSTLQSAKQSAANKLHSAGQSTKDALKKAAEQTRNAGNRLILLIPKDYKGQG 61

Query: 72 --RNALISQADALPIEVAAPIKNGIDATVNLF 101
N L+ AD L IEV KNG T +F
Sbjct: 62 SSLNDLVRTADELGIEVQYDEKNGTAITKQVF 93


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
DIP_RS14030  GPOSANCHOR  45     6e-07    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 45.4 bits (107), Expect = 6e-07
Identities = 36/189 (19%), Positives = 69/189 (36%), Gaps = 28/189 (14%)

Query: 513 IDKEIVAAQEKAAAATKEAQEAAEKVEKLTEELAAARKENDELKTQVKESEEAVGDLADK 572
+ ++ A + + K++ L E AA EL+ ++ + + K
Sbjct: 223 LAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAK 282

Query: 573 VFKL--ENAVTEAKKKATVAEKAVSDA-LAKLQEAESIAEEQKAKAESAAAEAQ------ 623
+ L E A EA+K + V +A L+ + E K + E+ + +
Sbjct: 283 IKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKIS 342

Query: 624 ------------ALREKLEKLEGSILTAKENSEAEE------SADLSSTARDAADTARRA 665
A RE ++LE +E ++ E DL + +R+A +A
Sbjct: 343 EASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDA-SREAKKQVEKA 401

Query: 666 ATDANGALS 674
+AN L+
Sbjct: 402 LEEANSKLA 410



Score = 45.4 bits (107), Expect = 6e-07
Identities = 27/138 (19%), Positives = 50/138 (36%), Gaps = 3/138 (2%)

Query: 512 DIDKEIVAAQEKAAAATKEAQEAAEKVEKLTEELAAARKENDELKTQVKESEEAVGDLAD 571
+++K + A + A + + + + L A K + + L
Sbjct: 194 ELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEA 253

Query: 572 KVFKLENAVTEAKKKATVAEKAVSDALAKLQEAES---IAEEQKAKAESAAAEAQALREK 628
+ LE E +K A + AK++ E+ E +KA E + A R+
Sbjct: 254 EKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQS 313

Query: 629 LEKLEGSILTAKENSEAE 646
L + + AK+ EAE
Sbjct: 314 LRRDLDASREAKKQLEAE 331



Score = 45.4 bits (107), Expect = 7e-07
Identities = 40/180 (22%), Positives = 71/180 (39%), Gaps = 34/180 (18%)

Query: 511 GDIDKEIVAAQEKAAAATKEAQEAAEKVEKLTEELAAARKEN-------DELKTQVKESE 563
+++K + A + A + + + + L E A ++ L+ + S
Sbjct: 263 AELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASR 322

Query: 564 EAVGDLADKVFKLENA--VTEAKKK--------ATVAEKAVSDALAKLQEAESIAE---- 609
EA L + KLE ++EA ++ + A+K + KL+E I+E
Sbjct: 323 EAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQ 382

Query: 610 ----------EQKAKAESAAAEAQALREKLEKLEGSILTAKENSE---AEESADLSSTAR 656
E K + E A EA + LEKL + +K+ +E AE A L + A+
Sbjct: 383 SLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEESKKLTEKEKAELQAKLEAEAK 442



Score = 41.2 bits (96), Expect = 1e-05
Identities = 48/206 (23%), Positives = 78/206 (37%), Gaps = 37/206 (17%)

Query: 514 DKEIVAAQEKAAAATKEAQEAAEKVEKLTEELAAARKENDELKTQVKESEEAVGDLADKV 573
+ + + A+ + ++ + +KL E+ + L+ + S EA L +
Sbjct: 308 NANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEH 367

Query: 574 FKLEN-----------------AVTEAKKK---ATVAEKAVSDALAKL--QEAESIAEEQ 611
KLE A EAKK+ A + AL KL + ES +
Sbjct: 368 QKLEEQNKISEASRQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEESKKLTE 427

Query: 612 KAKAESAA---AEAQALREKL-------EKLEGSILTAKENSEAEESADLSSTARDAADT 661
K KAE A AEA+AL+EKL KL + + +A+ A
Sbjct: 428 KEKAELQAKLEAEAKALKEKLAKQAEELAKLRAGKASDSQTPDAKPGNK----AVPGKGQ 483

Query: 662 ARRAATDANGALSGQKQNEEK-PAMG 686
A +A T N + K+ + + P+ G
Sbjct: 484 APQAGTKPNQNKAPMKETKRQLPSTG 509


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS14015  SECYTRNLCASE  495    e-177    Preprotein translocase SecY subunit signature.
		>SECYTRNLCASE#Preprotein translocase SecY subunit signature.

Length = 437

Score = 495 bits (1277), Expect = e-177
Identities = 249/446 (55%), Positives = 327/446 (73%), Gaps = 16/446 (3%)

Query: 1 MSAIFQAFRDADLRKKILITIALIILYRIGAQIPSPGVDYASISGRLRELTSDSSSVYSL 60
++A +AFR DLRKK+L T+A+I++YR+G IP PGVDY ++ +RE + + ++ L
Sbjct: 2 LTAFARAFRTPDLRKKLLFTLAIIVVYRVGTHIPIPGVDYKNVQQCVREASG-NQGLFGL 60

Query: 61 INLFSGGALLQLSIFAIGVMPYITASIIVQLLTVVIPKFEELKKEGQSGQAKMDQYTRYL 120
+N+FSGGALLQ++IFA+G+MPYITASII+QLLTVVIP+ E LKKEGQ+G AK+ QYTRYL
Sbjct: 61 VNMFSGGALLQITIFALGIMPYITASIILQLLTVVIPRLEALKKEGQAGTAKITQYTRYL 120

Query: 121 TLGLALLQSSGIVALADREQLLGSGVS--VLKEDRNLWTLVMLVLVMSAGAILVMWLGEI 178
T+ LA+LQ +G+VA A L G + D++++T + +V+ M+AG +VMWLGE+
Sbjct: 121 TVALAILQGTGLVATARSAPLFGRCSVGGQIVPDQSIFTTITMVICMTAGTCVVMWLGEL 180

Query: 179 ITERGIGNGMSLLIFAGIATRIPSDGANILS----SSGGVVFAVVLVAVIVLVVGVVFVE 234
IT+RGIGNGMS+L+F IA PS I + G + F V+ +++V VVFVE
Sbjct: 181 ITDRGIGNGMSILMFISIAATFPSALWAIKKQGTLAGGWIEFGTVIAVGLIMVALVVFVE 240

Query: 235 QGQRRIPVQYAKRMVGRRQYGGSSTYLPLKVNQAGVIPVIFASSLIYMPVLITQIINSGS 294
Q QRRIPVQYAKRM+GRR YGG+STY+PLKVNQAGVIPVIFASSL+Y+P L+ Q S
Sbjct: 241 QAQRRIPVQYAKRMIGRRSYGGTSTYIPLKVNQAGVIPVIFASSLLYIPALVAQFAGGNS 300

Query: 295 HEVSDNWWQRNVIQYLQTPSSWQYIVLYFALIIFFSYFYVSVQYDPNEQADNMKKYGGFI 354
W+ V Q L YIV YF LI+FF++FYV++ ++P E ADNMKKYGGFI
Sbjct: 301 ------GWKSWVEQNLTKGDHPIYIVTYFLLIVFFAFFYVAISFNPEEVADNMKKYGGFI 354

Query: 355 PGIRPGRPTAEYLGYVMNRLLFVGALYLGIIAVLPNIALDLGVGASSAGSTPFGGTAILI 414
PGIR GRPTAEYL YV+NR+ + G+LYLG+IA++P +AL VG ++ + PFGGT+ILI
Sbjct: 355 PGIRAGRPTAEYLSYVLNRITWPGSLYLGLIALVPTMAL---VGFGASQNFPFGGTSILI 411

Query: 415 MVSVALTTVKQIESQLLQSNYEGLLK 440
+V V L TVKQIESQL Q NYEG L+
Sbjct: 412 IVGVGLETVKQIESQLQQRNYEGFLR 437


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
DIP_RS14010  PF05272  36     2e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 36.2 bits (83), Expect = 2e-04
Identities = 14/33 (42%), Positives = 18/33 (54%)

Query: 34 LVLVGPSGCGKSTTLRMLAGLEEVNDGRILIGE 66
+VL G G GKST + L GL+ +D IG
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGT 631


26  DIP_RS13310  DIP_RS13275  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
DIP_RS13310  -2      20       2.642584   phosphoserine phosphatase
DIP_RS13305  -3      22       2.171915   hypothetical protein
DIP_RS13300  -2      22       1.908886   excisionase
DIP_RS13295  -3      21       2.260116   pyrroline-5-carboxylate reductase
DIP_RS13290  -1      23       3.164151   hypothetical protein
DIP_RS13285  1       22       3.680635   hypothetical protein
DIP_RS13280  0       25       3.960571   DNA-binding response regulator
DIP_RS13275  0       24       3.657336   histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
DIP_RS13310  RTXTOXINA  29     0.027    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.2 bits (65), Expect = 0.027
Identities = 17/75 (22%), Positives = 31/75 (41%), Gaps = 1/75 (1%)

Query: 64 KLDEYTSGLDSVSGSFEAAGAQHVTTANPEIPQDIGAAAFFDIDNTLIQGSSLVVFAMGL 123
LD +GLD+VSG A A + +N + AAA ++ ++ + +
Sbjct: 234 NLDNIGAGLDTVSGILSAISASFIL-SNADADTRTKAAAGVELTTKVLGNVGKGISQYII 292

Query: 124 AKKKYFKLNEILPVA 138
A++ L+ A
Sbjct: 293 AQRAAQGLSTSAAAA 307


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS13305  PRPHPHLPASEC  28     0.032    Prokaryotic zinc-dependent phospholipase C signature.
		>PRPHPHLPASEC#Prokaryotic zinc-dependent phospholipase C signature.

Length = 398

Score = 28.4 bits (63), Expect = 0.032
Identities = 17/97 (17%), Positives = 34/97 (35%), Gaps = 11/97 (11%)

Query: 120 EFSDFECPFCARWSNQTEPTLMEEYVSKGLVRIEWNDLPVNGEHALAAAKAGRAAAAQGK 179
+F+ + + ++ + Y S + W+D + LA ++ G A G
Sbjct: 212 DFNAWSKEYARGFAKTGK----SIYYSHASMSHSWDDWDYAAKVTLANSQKGTA----GY 263

Query: 180 FDEFRKALFEASRNVSGHPNNTLKDFERFARNAGVKD 216
F L + S +K+ + +G KD
Sbjct: 264 IYRF---LHDVSEGNDPSVGKNVKELVAYISTSGEKD 297


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
DIP_RS13280  HTHFIS  87     5e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 87.2 bits (216), Expect = 5e-22
Identities = 30/118 (25%), Positives = 61/118 (51%), Gaps = 1/118 (0%)

Query: 2 TSILLVEDEESLADPLAFLLRKEGFDVVIAGDGPSALVEFDRNAIDIVLLDLMLPGMSGT 61
+IL+ +D+ ++ L L + G+DV I + + D+V+ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 DVCKRLRAV-SSVPVIMVTARDSEIDKVVGLELGADDYVTKPYSSRELIARIRAVLRR 118
D+ R++ +PV++++A+++ + + E GA DY+ KP+ ELI I L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
DIP_RS13275  PF06580  33     0.002    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.9 bits (75), Expect = 0.002
Identities = 25/110 (22%), Positives = 43/110 (39%), Gaps = 23/110 (20%)

Query: 264 LLVTAVSNLISNAINYSPQGMPISVATKISRDGRVLIRVIDNGIGISVENQKRVFERFFR 323
L+ T V N I + I PQG I + +G V + V + G ++++N K
Sbjct: 259 LVQTLVENGIKHGIAQLPQGGKILLKGT-KDNGTVTLEVENTG-SLALKNTKE------- 309

Query: 324 VDKARSRSTGGTGLGLAIVKH-VAANHGGD--VTLWSRPGTGSTFTIELP 370
TG GL V+ + +G + + L + G + +P
Sbjct: 310 ----------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKV-NAMVLIP 348


27  DIP_RS13150  DIP_RS13105  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
DIP_RS13150  -2      13       0.259527   dTDP-glucose 4,6-dehydratase
DIP_RS13145  -2      12       0.173143   dTDP-4-dehydrorhamnose reductase
DIP_RS13140  -1      13       -0.395669  glucose-1-phosphate thymidylyltransferase
DIP_RS13135  -1      12       -0.494402  hydrolase
DIP_RS13130  -1      12       -0.298910  acyl-CoA synthetase
DIP_RS13125  -1      10       -1.066370  alpha-amylase
DIP_RS13120  -1      20       0.854062   hypothetical protein
DIP_RS13115  1       21       0.836496   plasmid replication protein
DIP_RS13110  2       24       0.141380   protease
DIP_RS13105  2       23       -0.673015  alkylated DNA repair protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS13150  NUCEPIMERASE  149    1e-44    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 149 bits (377), Expect = 1e-44
Identities = 76/338 (22%), Positives = 138/338 (40%), Gaps = 43/338 (12%)

Query: 2 LVTGGAGFIGSNFVRRVLATRPEYRVTVLDKLT--Y-----AGNAANLDGCDATLVVGDI 54
LVTG AGFIG + +R L ++V +D L Y L D+
Sbjct: 4 LVTGAAGFIGFHVSKR-LLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 55 CDAQLVDRLVADS--DVVVHFAAESHNDNSLVDPSPFVQTNVVGTFTLLEAARRHDVRFH 112
D + + L A + V SL +P + +N+ G +LE R + ++ H
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ-H 120

Query: 113 HV--STDEVFGDLELDDPNRFTEHTPYN-PSSPYSATKAGSDHLVRAWVRSFGLRATISH 169
+ S+ V+G P F+ + P S Y+ATK ++ + + +GL AT
Sbjct: 121 LLYASSSSVYGLNR-KMP--FSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLR 177

Query: 170 CSNNYGPYQHIEKFIPRQITNILSGLTPKLYGTGEQVRDWIHVDDHNDAVIRILE----- 224
YGP+ + + + +L G + +Y G+ RD+ ++DD +A+IR+ +
Sbjct: 178 FFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHA 237

Query: 225 -------------SGRIGQTYIIGADNDHVNNKTVITLICELMGADGFEHVADRPGHDMR 271
S + Y IG ++ V I + + +G + +++ D+
Sbjct: 238 DTQWTVETGTPAASIAPYRVYNIG-NSSPVELMDYIQALEDALGIEAKKNMLPLQPGDVL 296

Query: 272 YAM-DSSTLRAELGWQPRFTDTDTGMREGLLQTIEWYR 308
D+ L +G+ P T +++G+ + WYR
Sbjct: 297 ETSADTKALYEVIGFTPE-----TTVKDGVKNFVNWYR 329


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
DIP_RS13145  NUCEPIMERASE  42     4e-06    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 41.7 bits (98), Expect = 4e-06
Identities = 45/269 (16%), Positives = 86/269 (31%), Gaps = 71/269 (26%)

Query: 176 RILITGAHGQLGRALAELLPD-----------------------AELCSHVDFDVVHPPQ 212
+ L+TGA G +G +++ L + EL + F
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 213 RSW---------RQYEAIINCAAYNNVDAAEDDRARAWEVNAVAPARLAQIATENNLT-L 262
+E + V + ++ + N + + N + L
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 263 VHVSTDFIFDGTTSV--HEETEAPSPLSVYGASKAAGDIAAAVTPKHYVVRTS-----WV 315
++ S+ ++ + + P+S+Y A+K A ++ A Y + + V
Sbjct: 122 LYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRFFTV 181

Query: 316 FGQGG-------NFVETMRSLAQRGIRPNVISDQRGRP--THAADLAAGIVHLL------ 360
+G G F + M G +V + + + T+ D+A I+ L
Sbjct: 182 YGPWGRPDMALFKFTKAMLE----GKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHA 237

Query: 361 ------------RSSADYGVYNLSNSGDV 377
S A Y VYN+ NS V
Sbjct: 238 DTQWTVETGTPAASIAPYRVYNIGNSSPV 266


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
DIP_RS13125  INTIMIN  35     0.006    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 34.7 bits (79), Expect = 0.006
Identities = 46/222 (20%), Positives = 71/222 (31%), Gaps = 15/222 (6%)

Query: 2185 IKVKVTYSDKSTDTAVAHVTTVAQTEQSVEVTPGTQTLVDVPGNGDVTVTVPQNSGLKAE 2244
+V VT +A A T +V+ Q V V N V +
Sbjct: 556 DQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSANTN 615

Query: 2245 VAGNL-VTLTSDGSVEGHVTVTYQVKDADGKTTTGKINVQVAKLAEASWPKSASVAAGDS 2303
+G VTL SD V A T +N + + + A D
Sbjct: 616 GSGKATVTLKSD-------KPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKA-DK 667

Query: 2304 TVITPEGVDTLPYGTTLSVGSLP---EWATFDQTTGQLTLTADEDATDESTTATITVAYP 2360
T G D + Y + G P + TF T G+L+ + ++ T+ T+T P
Sbjct: 668 TTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTP 727

Query: 2361 -DGLT--KQYSVALSVGAATETTADQHDPQWESVVVNGDSVT 2399
L + VA+ V A ++ + G V
Sbjct: 728 GKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVK 769


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
DIP_RS13110  HTHTETR  26     0.032    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 26.1 bits (57), Expect = 0.032
Identities = 12/55 (21%), Positives = 24/55 (43%), Gaps = 1/55 (1%)

Query: 47 KRGARQGTGTRGRVLAVYSQTLVDTG-EVPTARQIAGEIGITKRMVNIHLKELRD 100
++ ++ TR +L V + G + +IA G+T+ + H K+ D
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSD 57


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
DIP_RS13105  V8PROTEASE  42     2e-06    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 41.5 bits (97), Expect = 2e-06
Identities = 33/241 (13%), Positives = 68/241 (28%), Gaps = 42/241 (17%)

Query: 31 TNGTPVSPADDTAAEGVVQVAS--------CTGTVVASQWVLTAQHCVE----VPNLQRP 78
+ ++ + V + +G VV +LT +H V+ P+ +
Sbjct: 74 NDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHVVDATHGDPHALKA 133

Query: 79 VYVGTTREQQRREENTFTSDYAVWAPHGDVALVHVTDALPQRLVRKV-------RRAPVS 131
++ ++ GD+A+V + + + +V A
Sbjct: 134 FPSAINQD-NYPNGGFTAEQITKYSGEGDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQ 192

Query: 132 FGEQGRVYGWGAGTGETLQYARAAVGKTSSGVRPQGNQHGAFIVQYLDEAKAGRGDSGGP 191
+ V G+ + G Q+ G+SG P
Sbjct: 193 VNQNITVTGYPGDKPVATMWESKGKITYLKGE---AMQYDLSTTG---------GNSGSP 240

Query: 192 LF-VDGEVAGVTSFKAPQGGGRFSLFASLHGLGDWIAQ-------TTAAPSPSPRNPNSK 243
+F EV G+ P + +++ Q +P NP++
Sbjct: 241 VFNEKNEVIGIHWGGVPNEFNGAVFINE--NVRNFLKQNIEDIHFANDDQPNNPDNPDNP 298

Query: 244 N 244
N
Sbjct: 299 N 299


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
DIP_RS13100  BORPETOXINA  28     0.034    Bordetella pertussis toxin A subunit signature.
		>BORPETOXINA#Bordetella pertussis toxin A subunit signature.

Length = 269

Score = 27.8 bits (61), Expect = 0.034
Identities = 21/56 (37%), Positives = 26/56 (46%), Gaps = 5/56 (8%)

Query: 74 YPSYRYIDKIAGTTVPPVPDSLAALAPEALRAAAEVAEELAPWVETFVPEMVLVNY 129
Y S R + I GT V P A +A +A E +E +A W E MVLV Y
Sbjct: 212 YTSRRSVASIVGTLVRMAPVIGACMARQA-----ESSEAMAAWSERAGEAMVLVYY 262



 
Contact Sachin Pundhir for Bugs/Comments.