PredictBias

identification of genomic and pathogenicity islands in prokaryotic genome
 
A) Input parameters
Genome: NC_009074.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
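
As a rough illustration of how the three bias thresholds listed above could be compared against per-ORF values, the sketch below flags each metric whose absolute value meets its threshold. The comparison rule (absolute value, greater-or-equal) and all names such as `Thresholds` and `exceeded` are assumptions made for illustration; the page does not state how PredictBias combines the metrics into a prediction. The example values are the first row of the first island, read as DNBias 0, CDNBias 13, %GCBias 3.235126.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Values taken from the "Input parameters" section of this page.
    dinucleotide: float = 2.0
    codon: float = 4.0
    gc: float = 3.0


def exceeded(dn_bias, cdn_bias, gc_bias, thresholds=Thresholds()):
    """Return the names of metrics whose |value| >= threshold.

    ASSUMPTION: the absolute-value, greater-or-equal rule is hypothetical
    and is not taken from the PredictBias documentation.
    """
    checks = [
        ("dinucleotide", dn_bias, thresholds.dinucleotide),
        ("codon", cdn_bias, thresholds.codon),
        ("gc", gc_bias, thresholds.gc),
    ]
    return [name for name, value, limit in checks if abs(value) >= limit]


# BURPS668_0012 read as DNBias 0, CDNBias 13, %GCBias 3.235126
print(exceeded(0, 13, 3.235126))
```

Under this assumed rule the row is flagged on codon and %GC bias but not on dinucleotide bias.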
 
 
C) Potential GIs and PAIs in NC_009074
S.No  Start          End            Bias  Virulence  Insertion elements  Prediction
1     BURPS668_0012  BURPS668_0022  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias     %GCBias  Product
BURPS668_0012       0       13    3.235126  general secretion pathway protein G
BURPS668_0013      -1       13    3.996175  general secretion pathway protein H
BURPS668_0014       0       13    3.704503  general secretion pathway protein I
BURPS668_0015      -1       10    4.151808  general secretion pathway protein J
BURPS668_0016      -2        9    4.319671  general secretion pathway protein K
BURPS668_0017      -1       10    4.282237  general secretion pathway protein L
BURPS668_0018      -1       11    2.284823  general secretory pathway protein M
BURPS668_0019       0       10    2.293919  general secretory pathway protein N
BURPS668_0020       0       11    2.179641  NodT family efflux transporter outer membrane
BURPS668_0021       1       12    1.432729  hypothetical protein
BURPS668_0022       2       13    1.001406  MarR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0012  BCTERIALGSPG  188    6e-65    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 188 bits (480), Expect = 6e-65
Identities = 67/140 (47%), Positives = 94/140 (67%), Gaps = 3/140 (2%)

Query: 10 QAARRQRGFTLIEIMVVVAILGILAALIVPKIMSRPDEARRIAAKQDIGTIMQALKLYRL 69
+A +QRGFTL+EIMVV+ I+G+LA+L+VP +M ++A + A DI + AL +Y+L
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKL 61

Query: 70 DNGRYPTQDQGLNALIQKPTTDPIPNNWKDGGYLERLPNDPWGNSYKYLNPGVHGEIDVF 129
DN YPT +QGL +L++ PT P+ N+ GY++RLP DPWGN Y +NPG HG D+
Sbjct: 62 DNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLL 121

Query: 130 SYGADGKEGGESNDSDIGSW 149
S G DG+ G E DI +W
Sbjct: 122 SAGPDGEMGTE---DDITNW 138
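
The percentages on the `Identities` / `Positives` / `Gaps` line follow from the counts and the alignment length. A minimal sketch, assuming the line format shown above: the helper names `parse_stats` and `truncated_percent` are illustrative, and the truncation rule (rounding down rather than to nearest) is inferred from the figures on this page, not taken from BLAST documentation.

```python
import re

# Matches e.g. "Identities = 67/140 (47%)"
STAT = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_stats(line):
    """Map each statistic name to (count, alignment_length, reported_percent)."""
    return {
        m.group(1): tuple(int(g) for g in m.groups()[1:])
        for m in STAT.finditer(line)
    }


def truncated_percent(count, length):
    """Integer percentage, rounded down (assumed from the values on this page)."""
    return (100 * count) // length


line = "Identities = 67/140 (47%), Positives = 94/140 (67%), Gaps = 3/140 (2%)"
for name, (count, length, reported) in parse_stats(line).items():
    print(name, truncated_percent(count, length), reported)
```

For this alignment, 67/140 is about 47.9%, which the report shows as 47%; the same rounding-down reproduces the other two figures.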


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0013  BCTERIALGSPH  51     1e-10    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.
Length = 170

Score = 51.5 bits (123), Expect = 1e-10
Identities = 20/101 (19%), Positives = 33/101 (32%), Gaps = 15/101 (14%)

Query: 51 RARGFTLLEMLVVLVIAGILVSVASLTLRRNPRTDLREEAQRIALLFETAGDEAQVRARP 110
R RGFTLLEM+++L++ G+ + L + + R +
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQF 61

Query: 111 IAWRATEHGFRF---------------DIRTGDGWRPLRDD 136
++F D +G W PLR
Sbjct: 62 FGVSVHPDRWQFLVLEARDGADPAPADDGWSGYRWLPLRAG 102


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0014  BCTERIALGSPG  30     0.001    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 30.2 bits (68), Expect = 0.001
Identities = 10/26 (38%), Positives = 18/26 (69%)

Query: 10 RSPARSRGFTMIEVLVALAIIAVALA 35
R+ + RGFT++E++V + II V +
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLAS 27


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0015  BCTERIALGSPG  33     3e-04    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 33.3 bits (76), Expect = 3e-04
Identities = 17/72 (23%), Positives = 34/72 (47%), Gaps = 3/72 (4%)

Query: 33 RGFTLIEMMIAITILAVIA-ILSWRGLDQIIRGREKVAAAMEDERVFAQMFDQMRIDARR 91
RGFTL+E+M+ I I+ V+A ++ + + ++ A+ D D ++D
Sbjct: 8 RGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQK--AVSDIVALENALDMYKLDNHH 65

Query: 92 AATDDEAGQPAV 103
T ++ + V
Sbjct: 66 YPTTNQGLESLV 77


2     BURPS668_0069  BURPS668_0111  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias     %GCBias  Product
BURPS668_0069      -2        9    3.113951  hypothetical protein
BURPS668_0070      -2        9    2.801941  AraC family transcriptional regulator
BURPS668_0071      -1       10    2.848709  acyl-CoA dehydrogenase
BURPS668_0072      -1        9    3.325711  hypothetical protein
BURPS668_0073      -1       10    2.797397  hypothetical protein
BURPS668_0074      -2        9    2.207536  acetyl-CoA acetyltransferase
BURPS668_0075      -1       11    1.743427  3-hydroxyacyl-CoA dehydrogenase
BURPS668_0076       1       13    1.167005  alpha-methylacyl-CoA racemase
BURPS668_0077       4       17    0.417657  lipoprotein
BURPS668_0078       4       18    2.290950  hypothetical protein
BURPS668_0079       0       15    2.211532  hypothetical protein
BURPS668_0080      -2       15    2.563838  hypothetical protein
BURPS668_0081       1       21    2.148748  lipoprotein
BURPS668_0082      -2       14   -0.426251  hypothetical protein
BURPS668_0083      -2       11   -1.389495  transmembrane regulator PrtR
BURPS668_0084      -2       10   -2.557848  sigma-70 family RNA polymerase sigma factor
BURPS668_0085      -2       10   -2.794563  chaperone protein
BURPS668_0086      -1       11   -3.524343  [Ni] hydrogenase, b-type cytochrome subunit
BURPS668_0087      -1       13   -3.520826  DNA gyrase subunit B
BURPS668_0088       0       13   -3.514555  DNA polymerase III subunit beta
BURPS668_0089       0       13   -3.594957  chromosomal replication initiation protein
BURPS668_0090       2       17   -4.701283  50S ribosomal protein L34
BURPS668_0091      -1       16   -2.950098  ribonuclease P protein component
BURPS668_0092      -2       17   -4.424939  hypothetical protein
BURPS668_0093       0       15   -3.083748  inner membrane protein translocase component
BURPS668_0094       0       19   -2.833547  hypothetical protein
BURPS668_0095       0       18   -2.688292  hypothetical protein
BURPS668_0096      -1       17   -2.194866  tRNA modification GTPase TrmE
BURPS668_0097       1       18   -2.829095  phage integrase site specific recombinase
BURPS668_0098       1       17   -1.317661  hypothetical protein
BURPS668_0099       3       29   -4.604425  hypothetical protein
BURPS668_0100       2       36   -6.051476  hypothetical protein
BURPS668_0101       6       51  -12.436119  hypothetical protein
BURPS668_0102       6       52  -12.742463  hypothetical protein
BURPS668_0103       4       44  -11.687203  hypothetical protein
BURPS668_0104       4       44  -11.773714  hypothetical protein
BURPS668_0105       4       43  -11.577412  resolvase, N-terminal:resolvase helix-turn-helix
BURPS668_0106       3       39  -11.775182  hypothetical protein
BURPS668_0107       0       26   -7.155015  hypothetical protein
BURPS668_0108       0       18   -6.254840  lipoprotein
BURPS668_0109       0       25   -6.912075  hypothetical protein
BURPS668_0110      -1       24   -5.783273  hypothetical protein
BURPS668_0111       0       19   -3.082383  diamine N-acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0076  SHAPEPROTEIN  32     0.004    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.
Length = 347

Score = 32.0 bits (73), Expect = 0.004
Identities = 19/67 (28%), Positives = 30/67 (44%), Gaps = 3/67 (4%)

Query: 144 AGQPGDAPFAPPTLVGDLGGGALYLAMGVLAGIVDAR-LRGKGQIVDAAIVDGSANLMNL 202
AG P ++V D+GGG +A+ L G+V + +R G D AI++
Sbjct: 151 AGLPVSEATG--SMVVDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDEAIINYVRRNYGS 208

Query: 203 LLSIHAA 209
L+ A
Sbjct: 209 LIGEATA 215


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
BURPS668_0080  UREASE  30     0.003    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 30.1 bits (68), Expect = 0.003
Identities = 16/44 (36%), Positives = 21/44 (47%), Gaps = 5/44 (11%)

Query: 106 GGILVYDQFVTP----PTPQPVQQRRLRWGAHGRSNNGDNFYVV 145
GG + P PTPQPV R + +GA+GRS + V
Sbjct: 452 GGTIAAAPMGDPNASIPTPQPVHYRPM-FGAYGRSRTNSSVTFV 494


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
BURPS668_0089  PERTACTIN  33     0.003    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 33.2 bits (75), Expect = 0.003
Identities = 24/93 (25%), Positives = 31/93 (33%)

Query: 81 PKAGQRSPAGATPLAPRAPLPSANPAPVAPGPASAPAVDAHAPAPAGMNAATAAAVAAAQ 140
P A + +P P+ P P P P P +A AP P +AAA AA
Sbjct: 568 PPAPKPAPQPGPQPGPQPPQPPQPPQPPQPPQPPQRQPEAPAPQPPAGRELSAAANAAVN 627

Query: 141 AAQAAQANAAALNADEAADLDLPSLTAHEAAAG 173
A+ A L L + A G
Sbjct: 628 TGGVGLASTLWYAESNALSKRLGELRLNPDAGG 660


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
BURPS668_0093  60KDINNERMP  490    e-171    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 490 bits (1263), Expect = e-171
Identities = 204/576 (35%), Positives = 320/576 (55%), Gaps = 46/576 (7%)

Query: 1 MDIKRTVLWVIFFMSAVMLFDNWQRSHGRPSMFFPNVTQTNTASNATNGNGASGANAAAA 60
MD +R +L + + M++ W++ Q T + T
Sbjct: 1 MDSQRNLLVIALLFVSFMIWQAWEQDKNPQ-----PQAQQTTQTTTT------------- 42

Query: 61 ANALPAAATGAAPATTAPAAQAQLVRFSTDVYNGEIDTRGGTLAKLTLTK---AGDGKQP 117
AA AA + Q +L+ TDV + I+TRGG + + L + QP
Sbjct: 43 ------AAGSAADQGVPASGQGKLISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQP 96

Query: 118 DLSVTLFDHTANHTYLARTGLLGGDFPN-----HNDVYAQVAGPTSLAADQNTLKLSFES 172
L + + Y A++GL G D P+ +Y LA QN L++
Sbjct: 97 ---FQLLETSPQFIYQAQSGLTGRDGPDNPANGPRPLYNVEKDAYVLAEGQNELQVPMTY 153

Query: 173 PVKGGVKVVKTYTFTRGSYVIGVDTKIENVGAAPVTPSVYMELVRD-----NSSVETPMF 227
G KT+ RG Y + V+ ++N G P+ S + +L + + + F
Sbjct: 154 TDAAGNTFTKTFVLKRGDYAVNVNYNVQNAGEKPLEISSFGQLKQSITLPPHLDTGSSNF 213

Query: 228 S-HTFLGPAVYTDQKHFQKITFGDIDKNKADYVTSADNGWIAMVQHYFASAWIPQSGAKR 286
+ HTF G A T + ++K F I N+ ++S GW+AM+Q YFA+AWIP +
Sbjct: 214 ALHTFRGAAYSTPDEKYEKYKFDTIADNENLNISS-KGGWVAMLQQYFATAWIPHNDGTN 272

Query: 287 DIYVEKIDPTLYRVGVKQPVAAIAPGQSADVSARLFAGPEEERMLEGIAPGLELVKDYGW 346
+ Y + + +G K + PGQ+ +++ L+ GPE + + +AP L+L DYGW
Sbjct: 273 NFYTANLGNGIAAIGYKSQPVLVQPGQTGAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGW 332

Query: 347 VTIIAKPLFWLLEKIHGFVGNWGWAIVLLTLLIKAVFFPLSAASYKSMARMKEITPRMQA 406
+ I++PLF LL+ IH FVGNWG++I+++T +++ + +PL+ A Y SMA+M+ + P++QA
Sbjct: 333 LWFISQPLFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQA 392

Query: 407 LRERFKSDPQKMNAALMELYKTEKVNPFGGCLPVVIQIPVFISLYWVLLASVEMRGAPWV 466
+RER D Q+++ +M LYK EKVNP GGC P++IQ+P+F++LY++L+ SVE+R AP+
Sbjct: 393 MRERLGDDKQRISQEMMALYKAEKVNPLGGCFPLLIQMPIFLALYYMLMGSVELRQAPFA 452

Query: 467 LWIHDLSQRDPYFILPVLMAVSMFVQTKLNPTP-PDPVQAKMMMFMPIAFSVMFFFFPAG 525
LWIHDLS +DPY+ILP+LM V+MF K++PT DP+Q K+M FMP+ F+V F +FP+G
Sbjct: 453 LWIHDLSAQDPYYILPILMGVTMFFIQKMSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSG 512

Query: 526 LVLYYVVNNVLSIAQQYYITRTL---GGAAAKKKAS 558
LVLYY+V+N+++I QQ I R L G + +KK S
Sbjct: 513 LVLYYIVSNLVTIIQQQLIYRGLEKRGLHSREKKKS 548
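
Some `Expect` values in this report use the shorthand `e-171` (no leading mantissa), which has to be normalized before it can be compared numerically with the E-value cutoff of 0.05 given in the input parameters. A minimal sketch: `parse_evalue` is an illustrative helper, reading a bare `e-171` as `1e-171` is an assumption about the shorthand, and the `<=` comparison against the cutoff is likewise assumed.

```python
def parse_evalue(text):
    """Convert a BLAST-style E-value string to float.

    ASSUMPTION: a bare exponent such as "e-171" is read as "1e-171".
    """
    text = text.strip()
    if text.startswith("e"):
        text = "1" + text
    return float(text)


E_VALUE_CUTOFF = 0.05  # "E-value (RPSBlast)" from the input parameters

# A few Expect values copied from the hits on this page.
examples = ["6e-65", "0.001", "e-171", "0.029"]
passing = [e for e in examples if parse_evalue(e) <= E_VALUE_CUTOFF]
print(passing)
```

All four sample values fall under the cutoff, consistent with their being reported as hits.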


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_0096  PF05272  37     2e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 36.6 bits (84), Expect = 2e-04
Identities = 25/123 (20%), Positives = 40/123 (32%), Gaps = 9/123 (7%)

Query: 191 IDFLEAADARGKLAHIR--ERLAHVLGDARQGALLREGLSV----VLAGQPNVGKSSLLN 244
+ L K +R + + + ++ G VL G +GKS+L+N
Sbjct: 555 VHVLGKTPDDYKPRRLRYLQLVGKYILMGHVARVMEPGCKFDYSVVLEGTGGIGKSTLIN 614

Query: 245 ALAGAELAIVTPI-AGTTRDKVAQTIQIEGIPLHIIDTAGLRETEDEVEKIGIARTWGEI 303
L G + T GT +D Q I L + R + E K +
Sbjct: 615 TLVGLDFFSDTHFDIGTGKDSYEQIAGIVAYELSEMT--AFRRADAEAVKAFFSSRKDRY 672

Query: 304 ERA 306
A
Sbjct: 673 RGA 675


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0111  SACTRNSFRASE  41     8e-07    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 40.7 bits (95), Expect = 8e-07
Identities = 23/113 (20%), Positives = 45/113 (39%), Gaps = 11/113 (9%)

Query: 37 FEEPYETFTELSQLYDQHVHDQRERRFVAFDSDGELVGLVELI----ELDYIHRRGEFQI 92
F +PY E + +V ++ + F+ + + +G +++ I I
Sbjct: 42 FSKPYFKQYEDDDMDVSYVEEEGKAAFLYY-LENNCIGRIKIRSNWNGYALIE-----DI 95

Query: 93 IIAPNRQGRGFATRATRLAVEYAFKVLNLRKLYLIVDKSNVAAIRVYEKCGFK 145
+A + + +G T A+E+A K + L L N++A Y K F
Sbjct: 96 AVAKDYRKKGVGTALLHKAIEWA-KENHFCGLMLETQDINISACHFYAKHHFI 147


3     BURPS668_0128  BURPS668_0139  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias     %GCBias  Product
BURPS668_0128      -1       11   -3.677179  hypothetical protein
BURPS668_0129      -2       14   -3.473775  spermidine/putrescine-binding periplasmic
BURPS668_0130      -2       13   -2.937293  ABC-type spermidine/putrescine transport system,
BURPS668_0131      -3       15   -3.711287  ABC-type spermidine/putrescine transport system,
BURPS668_0132      -1       17   -4.254481  AraC family transcriptional regulator
BURPS668_0133       4       23   -6.031927  acetyltransferase
BURPS668_0134       7       48  -12.964343  hypothetical protein
BURPS668_0135       6       47  -13.547096  PBSX family phage portal protein
BURPS668_0136       6       49  -12.953157  hypothetical protein
BURPS668_0137       7       50  -12.571957  hypothetical protein
BURPS668_0138       6       39  -10.632779  hypothetical protein
BURPS668_0139       3       30   -7.957355  cytidine/deoxycytidylate deaminase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
BURPS668_0129  MALTOSEBP  32     0.002    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 32.4 bits (73), Expect = 0.002
Identities = 26/94 (27%), Positives = 44/94 (46%), Gaps = 8/94 (8%)

Query: 5 LTLLLTMTFGFVQAAELHLANWPDWMP--------PELLKKFEKETGIKTTLDIYDSDAT 56
L+ L TM F A++ W+ E+ KKFEK+TGIK T++ D
Sbjct: 12 LSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIKVTVEHPDKLEE 71

Query: 57 LLSKLQAGGGGYDVVVAGDYYVQVFAKNGLVRKL 90
++ A G G D++ +A++GL+ ++
Sbjct: 72 KFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEI 105


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0133  SACTRNSFRASE  39     2e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 38.8 bits (90), Expect = 2e-06
Identities = 18/67 (26%), Positives = 29/67 (43%), Gaps = 1/67 (1%)

Query: 67 RHAFINALFVRPAWHGAGVGREMLERALAWLREQGSESVSL-VTDPGSRADGFYQHLGWI 125
+A I + V + GVG +L +A+ W +E + L D A FY +I
Sbjct: 88 GYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFI 147

Query: 126 RGELDSY 132
G +D+
Sbjct: 148 IGAVDTM 154


4     BURPS668_0215  BURPS668_0232  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias     %GCBias  Product
BURPS668_0215      -1       10    3.495511  acyl-CoA dehydrogenase
BURPS668_0216       1       10    2.949171  GMC family oxidoreductase
BURPS668_0217       0       12    3.210358  flagellar hook-length control protein FliK
BURPS668_0218       2       14    1.653303  flagellar export protein FliJ
BURPS668_0219       2       12    1.285758  flagellar protein export ATPase FliI
BURPS668_0220       1       12    0.297878  flagellar assembly protein H
BURPS668_0221       2       10    2.359422  flagellar motor switch protein G
BURPS668_0222       0        9    3.767198  flagellar MS-ring protein
BURPS668_0223       2       11    4.525182  flagellar hook-basal body complex protein FliE
BURPS668_0224       1        9    4.841104  flagellar protein FliS
BURPS668_0225      -1        8    3.950569  hypothetical protein
BURPS668_0226      -1        9    3.282096  hypothetical protein
BURPS668_0227       0       12    2.151817  flagellar biosynthesis protein
BURPS668_0228       1       13    0.985353  hypothetical protein
BURPS668_0229       1       12    1.143199  xanthine dehydrogenase accessory factor
BURPS668_0230       0       14    1.142958  amino acid permease
BURPS668_0231      -2       14    2.067586  LuxR family transcriptional regulator
BURPS668_0232      -2       14    3.074611  sensor histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
BURPS668_0217  FLGHOOKFLIK  71     1e-15    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 71.4 bits (174), Expect = 1e-15
Identities = 65/198 (32%), Positives = 90/198 (45%), Gaps = 6/198 (3%)

Query: 259 ALAALRDAADSARATLAASSAPAALQQAA-PAALAANAGAAAASAAPSLAPPVGTPDWTD 317
L A++ S P+ + AA P AAP L+ P+G+ +W
Sbjct: 183 PAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEWQQ 242

Query: 318 ALSQKVVFLSNAHQQSAELTLNPPDLGPLQVVLRVADNHAHALFVSQHAQVRDAVEAALP 377
+LSQ + + QQSAEL L+P DLG +Q+ L+V DN A VS H VR A+EAALP
Sbjct: 243 SLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAALP 302

Query: 378 KLREAMEAGGLGLGSASVSDGGFASAQQQQTPQRQSSDGSATRRAFGASTADAALDELAA 437
LR + G+ LG +++S F+ QQ + Q+QS +A D L
Sbjct: 303 VLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQR-TANHEPLAGEDDDT----LPV 357

Query: 438 ASSSGATRRTVGMVDTFA 455
S VD FA
Sbjct: 358 PVSLQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_0218  FLGFLIJ  60     2e-14    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 59.8 bits (144), Expect = 2e-14
Identities = 43/140 (30%), Positives = 74/140 (52%)

Query: 1 MAQSFPLQLLLERAQDDLDTAAKQLGRAQRERTDAQAQLDALMRYRDEYRVRFAESAQSG 60
MA+ L L + A+ +++ AA+ LG +R A+ QL L+ Y++EYR +G
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 MPAGNWRNFQAFLDTLDAAIEQQRRVLAAAQTRIDAARPEWQAKKRTLGSYEILQARGAR 120
+ + W N+Q F+ TL+ AI Q R+ L ++D A W+ KK+ L +++ LQ R +
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 QDAQRAAKREQRDADEHAAK 140
+ +Q+ DE A +
Sbjct: 121 AALLAENRLDQKKMDEFAQR 140


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_0220  FLGFLIH  109    1e-31    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 109 bits (273), Expect = 1e-31
Identities = 64/184 (34%), Positives = 106/184 (57%), Gaps = 4/184 (2%)

Query: 37 AAAALAAELQRVRDAAHAEGLAAGHVEGQALGYQAGYEQGRAKGFDEGQAEAHTHAAQLA 96
A +L +L +++ AH +G AG EG+ G++ GY++G A+G ++G AEA + A +
Sbjct: 36 AEPSLEQQLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIH 95

Query: 97 A----LAASFRDALAGVERDLADDIATLALEIAQQVVRQHVQHDPAALIAAAREVLAAEP 152
A L + F+ L ++ +A + +ALE A+QV+ Q D +ALI +++L EP
Sbjct: 96 ARMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEP 155

Query: 153 ALAGAPHLIVNPADLPVVEAYLKDELDTLGWSVRTDTSIERGGCRAHASTGEIDATLTTR 212
+G P L V+P DL V+ L L GW +R D ++ GGC+ A G++DA++ TR
Sbjct: 156 LFSGKPQLRVHPDDLQRVDDMLGATLSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATR 215

Query: 213 WERV 216
W+ +
Sbjct: 216 WQEL 219


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0221  FLGMOTORFLIG  298    e-102    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 298 bits (765), Expect = e-102
Identities = 114/324 (35%), Positives = 191/324 (58%)

Query: 5 GLNKSALLLMSIGEEEAAQVFKFLAPREVQKIGAAMAALKNVTREQVEDVLNDFVQEAEK 64
G K+A+LL+SIG E +++VFK+L+ E++ + +A L+ +T E ++VL +F +
Sbjct: 17 GKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFKELMMA 76

Query: 65 HTALSLDSSEYIRTVLTKALGEDKAGVLIDRILQGSDTSGIEGLKWMDSAAVAELIKNEH 124
+ +Y R +L K+LG KA +I+ + + E ++ D A + I+ EH
Sbjct: 77 QEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRADPANILNFIQQEH 136

Query: 125 PQIIATILVHLDRDQASEIASCFTERLRNDVLLRIATLDGIQPTALRELDDVLTGLLSGS 184
PQ IA IL +LD +AS I S ++ +V RIA +D P +RE++ VL L+
Sbjct: 137 PQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLEKKLASL 196

Query: 185 DNLKRAPMGGIRTAAEILNFMTSVHEEAVIENVKQYDPDLAQKIIDQMFVFENLLDLEDR 244
+ GG+ EI+N E+ +IE++++ DP+LA++I +MFVFE+++ L+DR
Sbjct: 197 SSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDIVLLDDR 256

Query: 245 AIQLLLKEVESEALIIALKGAPPALRQKFLSNMSQRAAELLAEDLDARGPVRVSEVETQQ 304
+IQ +L+E++ + L ALK +++K NMS+RAA +L ED++ GP R +VE Q
Sbjct: 257 SIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRKDVEESQ 316

Query: 305 RKILQVVRNLAESGQIVIGGKAED 328
+KI+ ++R L E G+IVI E+
Sbjct: 317 QKIVSLIRKLEEQGEIVISRGGEE 340


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0222  FLGMRINGFLIF  468    e-162    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 468 bits (1205), Expect = e-162
Identities = 254/562 (45%), Positives = 360/562 (64%), Gaps = 37/562 (6%)

Query: 53 LSRMKTNPRLPFLIGAALAIAAIVALVLWSRAPDYRVLYSNLSDRDGGAIIAALQQANVP 112
L+R++ NPR+P ++ + A+A +VA+VLW++ PDYR L+SNLSD+DGGAI+A L Q N+P
Sbjct: 16 LNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIP 75

Query: 113 YKFADAGGAILVPANQVHETRLKLAAMGLPKGGSVGFELMDNQKFGISQFAEQVNYQRAL 172
Y+FA+ GAI VPA++VHE RL+LA GLPKGG+VGFEL+D +KFGISQF+EQVNYQRAL
Sbjct: 76 YRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQVNYQRAL 135

Query: 173 EGELQRTVESINAVRAARVHLAIPKPSVFVRDREAPSASVLVDLYPGRVLDEGQVLAVTR 232
EGEL RT+E++ V++ARVHLA+PKPS+FVR++++PSASV V L PGR LDEGQ+ AV
Sbjct: 136 EGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALDEGQISAVVH 195

Query: 233 MVSSSVPDMPAKNVTIVDQDGNLLTQT-ASATGLDASQLKYVQQIERNTQKRIDAILAPI 291
+VSS+V +P NVT+VDQ G+LLTQ+ S L+ +QLK+ +E Q+RI+AIL+PI
Sbjct: 196 LVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRIQRRIEAILSPI 255

Query: 292 FGAGNARSQVSADVDFSKIEQTSESYGPNGTPQQSAIRSQQTSSSTELAQSGASGVPGAL 351
G GN +QV+A +DF+ EQT E Y PNG ++ +RS+Q + S ++ GVPGAL
Sbjct: 256 VGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPGGVPGAL 315

Query: 352 SNTPPQPASAPIVA-------------SNGQPAGPAATPVSDRKDSTTNYELDKTVRHVE 398
SN P P API ++ +A P S +++ T+NYE+D+T+RH +
Sbjct: 316 SNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSNYEVDRTIRHTK 375

Query: 399 QSMGTIKRLSVAVVVNYQPSTDAKGRVTMQPLAADKLAQVQQLVKDAMGYDEKRGDSVNV 458
++G I+RLSVAVVVNY+ D K PL AD++ Q++ L ++AMG+ +KRGD++NV
Sbjct: 376 MNVGDIERLSVAVVVNYKTLADGKP----LPLTADQMKQIEDLTREAMGFSDKRGDTLNV 431

Query: 459 VNSAFSAAADPFANLPWWRQPDMIELGKDIAKWLGVAAAAAALYFMFVRPALRR---AFP 515
VNS FSA + LP+W+Q I+ +WL V A L+ VRP L R
Sbjct: 432 VNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLTRRVEEAK 491

Query: 516 PPAEPAAAAVPALDGPDDVLALDGLPSPDKKQLAEEDEEHPALLAFENERNRYERNLDYA 575
E A + + L+ D + N+R E
Sbjct: 492 AAQEQAQVRQETEEAVEVRLSKDEQLQQRR----------------ANQRLGAEVMSQRI 535

Query: 576 RTIARQDPKIVATVVKNWVSDE 597
R ++ DP++VA V++ W+S++
Sbjct: 536 REMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
BURPS668_0223  FLGHOOKFLIE  61     9e-16    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.
Length = 103

Score = 61.2 bits (148), Expect = 9e-16
Identities = 47/111 (42%), Positives = 62/111 (55%), Gaps = 8/111 (7%)

Query: 3 APVNGIASALQQMQAMAAQAAGGASPATSLAGSGAASAGSFASAMKASLDKISGDQQKAL 62
+ + GI + Q+QA A A S SFA + A+LD+IS Q A
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQES--------LPQPTISFAGQLHAALDRISDTQTAAR 52

Query: 63 GEAHAFEIGAQNVSLNDVMVDMQKANIGFQFGLQVRNKLVSAYNEIMQMSV 113
+A F +G V+LNDVM DMQKA++ Q G+QVRNKLV+AY E+M M V
Sbjct: 53 TQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0227  TYPE3IMSPROT  63     4e-15    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.
Length = 354

Score = 62.9 bits (153), Expect = 4e-15
Identities = 17/81 (20%), Positives = 32/81 (39%), Gaps = 1/81 (1%)

Query: 10 AVLAYDAKGGDTAPRVVAKGYGLVAERIIERARDAGLYVHTAPEMV-SLLMQVDLDARIP 68
A+ +G P V K + + + A + G+ + + +L +D IP
Sbjct: 268 AIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEEEGVPILQRIPLARALYWDALVDHYIP 327

Query: 69 PQLYQAVAELLAWLYALERDA 89
+ +A AE+L WL +
Sbjct: 328 AEQIEATAEVLRWLERQNIEK 348


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
BURPS668_0231  HTHFIS  85     3e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.9 bits (210), Expect = 3e-21
Identities = 31/114 (27%), Positives = 55/114 (48%), Gaps = 2/114 (1%)

Query: 5 ILLVDDHAIVRQGIRQLLIDRGIAREVKEAECGGDALVIAEKSEFDVILLDISLPDMNGI 64
IL+ DD A +R + Q L G +V+ + D+++ D+ +PD N
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGY--DVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 65 EVLKRLKRRLPSTPVLMFSMYREDQFAVRALKAGAAGYLSKTVNAAQMVSAISQ 118
++L R+K+ P PVL+ S A++A + GA YL K + +++ I +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


5     BURPS668_0260  BURPS668_0285  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias     %GCBias  Product
BURPS668_0260       1       12    4.305669  multifunctional tRNA nucleotidyl
BURPS668_0262       2       15    3.805025  hypothetical protein
BURPS668_0261       0       13    3.232239  RebB protein
BURPS668_0263       0       12    2.737667  FlgN family flagellar protein
BURPS668_0264       2       11    1.406984  flagellar biosynthesis regulator protein FlgM
BURPS668_0265       2       11    1.145929  flagellar basal body P-ring biosynthesis protein
BURPS668_0266       5       17   -1.887634  flagellar basal-body rod protein FlgB
BURPS668_0267       4       19   -1.772540  flagellar basal body rod protein FlgC
BURPS668_0268       4       20   -0.715152  flagellar basal body rod modification protein
BURPS668_0269       0       20   -0.766190  flagellar hook protein FlgE
BURPS668_0270      -2       17   -0.063251  flagellar basal body rod protein FlgF
BURPS668_0271       0       15   -0.066216  flagellar basal body rod protein FlgG
BURPS668_0272       1       15    0.132211  flagellar basal body L-ring protein
BURPS668_0273       2       13    0.167387  flagellar basal body P-ring biosynthesis protein
BURPS668_0274       0       10   -0.183504  flagellar rod assembly protein/muramidase FlgJ
BURPS668_0275       1        9    0.229814  hypothetical protein
BURPS668_0276       0       11    0.663429  flagellar hook-associated protein FlgK
BURPS668_0277       0       13    1.441236  flagellar hook-associated protein FlgL
BURPS668_0278       1       13    2.090124  hypothetical protein
BURPS668_0279       2       14    2.355880  uracil-xanthine permease
BURPS668_0280       1       14    2.392493  hypothetical protein
BURPS668_0281      -2       11    1.098060  DNA-binding transcriptional activator GcvA
BURPS668_0282       0       12    0.229275  chromate transporter
BURPS668_0283       0       13   -0.808707  chromate transporter
BURPS668_0284       0       13   -2.343242  hypothetical protein
BURPS668_0285       0       12   -3.106068  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
BURPS668_0267  FLGHOOKAP1  27     0.029    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 26.8 bits (59), Expect = 0.029
Identities = 10/38 (26%), Positives = 17/38 (44%)

Query: 102 NVDPVQEMVNMISASRSYQANVETLNTAKQLMLKTLTI 139
V+ +E N+ + Y AN + L TA + + I
Sbjct: 508 GVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
BURPS668_0269  FLGHOOKAP1  34     0.001    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 34.2 bits (78), Expect = 0.001
Identities = 17/58 (29%), Positives = 24/58 (41%)

Query: 356 ISAPGSTNHGTLQGSALENSNVDLTSQLVKLITAQRNYQANAQTIKTQQTVDQTLINL 413
SA L S V+L + L Q+ Y ANAQ ++T + LIN+
Sbjct: 488 SSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545



Score = 29.9 bits (67), Expect = 0.018
Identities = 11/31 (35%), Positives = 17/31 (54%)

Query: 6 GLSGLAGASSDLDVIGNNIANANTVGFKGST 36
+SGL A + L+ NNI++ N G+ T
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQT 37


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
BURPS668_0270  FLGHOOKAP1  29     0.019    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 29.2 bits (65), Expect = 0.019
Identities = 9/34 (26%), Positives = 18/34 (52%)

Query: 4 LIYTAMTGATQSLEQQSVVANNLANASTTGFRAQ 37
LI AM+G + + +NN+++ + G+ Q
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQ 36


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
BURPS668_0271  FLGHOOKAP1  42     1e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 42.3 bits (99), Expect = 1e-06
Identities = 10/48 (20%), Positives = 23/48 (47%)

Query: 213 TLKQGYVESSNVNVVQELVNMIQTQRAYEINSKAVTTSDQMLQTVTQM 260
L S VN+ +E N+ + Q+ Y N++ + T++ + + +
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545



Score = 40.3 bits (94), Expect = 5e-06
Identities = 19/80 (23%), Positives = 34/80 (42%), Gaps = 14/80 (17%)

Query: 4 SLYIAATGMNAQQAQMDVISNNLANVSTNGFKGSRAVFEDLLYQTVRQPGANSTQQTELP 63
+ A +G+NA QA ++ SNN+++ + G+ RQ + + L
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYT--------------RQTTIMAQANSTLG 48

Query: 64 SGLQLGTGVQQVATERLYTQ 83
+G +G GV +R Y
Sbjct: 49 AGGWVGNGVYVSGVQREYDA 68


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0272  FLGLRINGFLGH  205    1e-68    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 205 bits (522), Expect = 1e-68
Identities = 128/222 (57%), Positives = 156/222 (70%), Gaps = 7/222 (3%)

Query: 25 AALAAAALALAGCAQIPREPITQQPMSAMPPMPPAMQAPGSIY---NPGYAG-RPLFEDQ 80
A + L+L GCA IP P+ Q SA P P A GSI+ P G +PLFED+
Sbjct: 10 AISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINYGYQPLFEDR 69

Query: 81 RPRNVGDILTIVIAENINATKSSGANTNRQGNTSFDVPTAG-FLGGLF--NKANLSAQGA 137
RPRN+GD LTIV+ EN++A+KSS AN +R G T+F T +L GLF +A++ A G
Sbjct: 70 RPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNARADVEASGG 129

Query: 138 NKFAATGGASAANTFNGTITVTVTNVLPNGNLVVSGEKQMLINQGNEFVRFSGIVNPNTI 197
N F GGA+A+NTF+GT+TVTV VL NGNL V GEKQ+ INQG EF+RFSG+VNP TI
Sbjct: 130 NTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVNPRTI 189

Query: 198 SGQNSVYSTQVADARIEYSAKGYINEAETMGWLQRFFLNIAP 239
SG N+V STQVADARIEY GYINEA+ MGWLQRFFLN++P
Sbjct: 190 SGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSP 231


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0273  FLGPRINGFLGI  371    e-129    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 371 bits (954), Expect = e-129
Identities = 164/392 (41%), Positives = 225/392 (57%), Gaps = 27/392 (6%)

Query: 10 RVVRPLVAARRRAAACCALAACMLALAFAPAAARAERLKDLAQIQGVRDNPLIGYGLVVG 69
RV+R + AA +A L+ PA A R+KD+A +Q RDN LIGYGLVVG
Sbjct: 2 RVLRIIAAALVFSALPF--------LSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVG 53

Query: 70 LDGTGDQTMQTPFTTQTLANMLANLGISINNGSANGGGSSAMTNMQLKNVAAVMVTATLP 129
L GTGD +PFT Q++ ML NLGI+ G +N KN+AAVMVTA LP
Sbjct: 54 LQGTGDSLRSSPFTEQSMRAMLQNLGITTQGGQSN-----------AKNIAAVMVTANLP 102

Query: 130 PFARPGEAIDVTVSSLGNAKSLRGGTLLLTPLKGADGQVYALAQGNMAVGGAGASANGSR 189
PFA PG +DVTVSSLG+A SLRGG L++T L GADGQ+YA+AQG + V G A + +
Sbjct: 103 PFASPGSRVDVTVSSLGDATSLRGGNLIMTSLSGADGQIYAVAQGALIVNGFSAQGDAAT 162

Query: 190 VQVNQLAAGRIAGGAIVERSVPNAVAQMNGVLQLQLNDMDYGTAQRIVSAVNS----SFG 245
+ + R+ GAI+ER +P+ L LQL + D+ TA R+ VN+ +G
Sbjct: 163 LTQGVTTSARVPNGAIIERELPSKFKDSV-NLVLQLRNPDFSTAVRVADVVNAFARARYG 221

Query: 246 AGTATALDGRTIQLTAPADSAQQVAFMARLQNLEVSPERAAAKVILNARTGSIVMNQMVT 305
A D + I + P + MA ++NL V + AKV++N RTG+IV+ V
Sbjct: 222 DPIAEPRDSQEIAVQKPRVA-DLTRLMAEIENLTVETD-TPAKVVINERTGTIVIGADVR 279

Query: 306 LQNCAVAHGNLSVVVNTQPVVSQPGPFSNGQTVVAQQSQIQLKQDNGSLRMVTAGANLAE 365
+ AV++G L+V V P V QP PFS GQT V Q+ I Q+ + + G +L
Sbjct: 280 ISRVAVSYGTLTVQVTESPQVIQPAPFSRGQTAVQPQTDIMAMQEGSKV-AIVEGPDLRT 338

Query: 366 VVKALNSLGATPADLMSILQAMKAAGALRADL 397
+V LNS+G +++ILQ +K+AGAL+A+L
Sbjct: 339 LVAGLNSIGLKADGIIAILQGIKSAGALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_0274  FLGFLGJ  226    5e-75    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 226 bits (578), Expect = 5e-75
Identities = 124/297 (41%), Positives = 173/297 (58%), Gaps = 15/297 (5%)

Query: 15 ALDVQGFDALRSKATAAAPREGVKMVAGQFDAMFTQMMLKSMRDATPSDGLLDSSSSKMY 74
A D Q + L++KA P ++ VA Q + MF QMMLKSMRDA P DGL S +++Y
Sbjct: 12 AWDAQSLNELKAKA-GEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDGLFSSEHTRLY 70

Query: 75 TSMLDQQLAQQMSS-KGIGVADALTKQLLRNANVAPDAQGEGGLAAMNALAKAYANSNGA 133
TSM DQQ+AQQM++ KG+G+A+ + KQ+ + ++ + Y N +
Sbjct: 71 TSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLETVVRYQNQALS 130

Query: 134 PGNGALAGTRGYSAASALTPPLKGNGNSAQADAFVEKMALAAQAASAATGIPARFIVGQA 193
P + + AF+ +++L AQ AS +G+P I+ QA
Sbjct: 131 ------------QLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQA 178

Query: 194 ALESGWGKREIRGANGESSYNVFGIKATKGWTGRTVSAVTTEYVNGKPHRVVAQFRAYDS 253
ALESGWG+R+IR NGE SYN+FG+KA+ W G TTEY NG+ +V A+FR Y S
Sbjct: 179 ALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSS 238

Query: 254 YEHAMTDYANLLKNNPRYASVLNAGHNAEGFAHGMQKAGYATDPHYAKKLISIMQQI 310
Y A++DY LL NPRYA+V A +AE A +Q AGYATDPHYA+KL +++QQ+
Sbjct: 239 YLEALSDYVGLLTRNPRYAAVTTAA-SAEQGAQALQDAGYATDPHYARKLTNMIQQM 294


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
BURPS668_0276  FLGHOOKAP1  236    2e-71    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 236 bits (603), Expect = 2e-71
Identities = 162/444 (36%), Positives = 253/444 (56%), Gaps = 12/444 (2%)

Query: 50 NTLMNLGVSGLNAALWGLTTTGQNISNAATPGYSVERPVYAEASGQYTSSGYLPQGVSTV 109
++L+N +SGLNAA L T NIS+ GY+ + + A+A+ + G++ GV
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 110 TVERQYNQYLSNQLNAAQTQGSSLSTYYTLVAQLNNYVGSPTAGIATAITNYFTGLQTVA 169
V+R+Y+ +++NQL AAQTQ S L+ Y +++++N + + T+ +AT + ++FT LQT+
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 170 NNAADPSARQTAMSNAQTLASQLVAAGQQYSQLRQSVNSQLTDTVTQINSYTSQIAQLNE 229
+NA DP+ARQ + ++ L +Q Q + VN + +V QIN+Y QIA LN+
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 230 QIA--SASSQGQPPNQLLDQRDLAVSKLSQLAGVQV-VQSNGNYSVFLSGGQPLVVGNAS 286
QI+ + G PN LLDQRD VS+L+Q+ GV+V VQ G Y++ ++ G LV G+ +
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 287 YQLATVASPSDPSELTI-VSKGVAGSAQPGPTQYLPDVSLTGGALGGLLAFRSQTLDPAQ 345
QLA V S +DPS T+ G AG+ + +P+ L G+LGG+L FRSQ LD +
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIE------IPEKLLNTGSLGGILTFRSQDLDQTR 294

Query: 346 AQLGALAVSFASQVNAQNALGVDMSGNPGGSLFAVGAPAVYANQNNTGSATLSVSFVDGT 405
LG LA++FA N Q+ G D +G+ G FA+G PAV N N G + + D +
Sbjct: 295 NTLGQLALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDAS 354

Query: 406 QPTTSDYALSYDGAKYTLTDRATGSVVGTATPSSTPPTMTIGGLKLSLSSTPNAGDSFTV 465
+DY +S+D ++ +T R + T TP + + GL+L+ + TP DSFT+
Sbjct: 355 AVLATDYKISFDNNQWQVT-RLASNTTFTVTPDAN-GKVAFDGLELTFTGTPAVNDSFTL 412

Query: 466 LPTRGALDGFSLATANGSAIAAAS 489
P A+ + + + IA AS
Sbjct: 413 KPVSDAIVNMDVLITDEAKIAMAS 436



Score = 84.6 bits (209), Expect = 4e-19
Identities = 46/105 (43%), Positives = 66/105 (62%)

Query: 608 GTNDGRNALALSQLVNSKTMNNGTTTLTGAYAGYVNAIGNAASQLKASSAAQTALVGQIT 667
G +D RN AL L ++ G + AYA V+ IGN + LK SSA Q +V Q++
Sbjct: 441 GDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGNKTATLKTSSATQGNVVTQLS 500

Query: 668 QAQQSVSGVNQNEEAANLMQYQQLYQANAKVIQTANSVFQTVLGL 712
QQS+SGVN +EE NL ++QQ Y ANA+V+QTAN++F ++ +
Sbjct: 501 NQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0277    FLAGELLIN    41    6e-06    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 41.2 bits (96), Expect = 6e-06
Identities = 55/369 (14%), Positives = 113/369 (30%), Gaps = 10/369 (2%)

Query: 16 MNDQQAQIAQLYQQVSSGISLTTPADNPLAAAQAVQLSATSATLAQYTQNQTIVQTALQT 75
+N Q+ ++ +++SSG+ + + D+ A A + ++ L Q ++N + QT
Sbjct: 17 LNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNANDGISIAQT 76

Query: 76 EDTTLTSVNDVLNAAYQALMHAGDGGLSDSDRAALAAQIQGSRDHLLTLANTADGAGNYL 135
+ L +N+ L + + A +G SDSD ++ +IQ + + ++N G +
Sbjct: 77 TEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSNQTQFNGVKV 136

Query: 136 FAGFQPTTQPFSNKPGGGVTY------AGDYGARAVQIADTRTVSQGDNGANVFMSVPFL 189
+ G +T G + + + GD ++ +
Sbjct: 137 LSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYD 196

Query: 190 GSLPVPAAGASNTGTGTIGAVSITNPSDPTNTHQFTITFGGTAAAPTYTVTDNSVTPPTT 249
+ +G + + T A T D T +T
Sbjct: 197 TYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTTKST 256

Query: 250 TAAQAYSSGQGINLGGQTVAVSGKPAVGDTFTVTPAPQAGTDVFATLD----TVIAALKS 305
+ G GG+ V T V T++ T+ A +
Sbjct: 257 AGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADIT 316

Query: 306 PVGNSQTASTALTNTMATASTKLMNTMTNVLTVQASVGGRLQEVKAMQAVTTTNTLQTTN 365
+ A+T ++ S + T S E + T+
Sbjct: 317 AGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAE 376

Query: 366 SLSNLTDTN 374
+N
Sbjct: 377 YTANAAGDK 385


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0283    ACRIFLAVINRP    28    0.021    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.3 bits (63), Expect = 0.021
Identities = 17/63 (26%), Positives = 30/63 (47%), Gaps = 2/63 (3%)

Query: 110 YVQQGMMPVTAGLVVASAVLISEASNRSALQWGITAAVAAL-AYRTRVHPLWLLAGGALA 168
Y G++ T GL +A+LI E + + G A L A R R+ P+ + + +
Sbjct: 925 YFMVGLL-TTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFIL 983

Query: 169 GLV 171
G++
Sbjct: 984 GVL 986


6    BURPS668_0325    BURPS668_0338    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
BURPS668_0325-3133.020221hypothetical protein
BURPS668_0324-2123.455663UDP-N-acetylglucosamine pyrophosphorylase
BURPS668_0326-2113.626961hypothetical protein
BURPS668_0327-2103.060271C32 tRNA thiolase
BURPS668_0328-1144.285503dihydroneopterin aldolase
BURPS668_03290135.155880hypothetical protein
BURPS668_03300142.564378hypothetical protein
BURPS668_0331-1142.670024hypothetical protein
BURPS668_0332-1153.596091hypothetical protein
BURPS668_0333-1153.200255hypothetical protein
BURPS668_0334-1143.299522fructokinase
BURPS668_0335-2143.528082N-acylglucosamine 2-epimerase
BURPS668_0336-1113.759332LacI family transcriptional regulator
BURPS668_03372133.720527methyl-accepting chemotaxis protein
BURPS668_03382130.730708hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0324    SSBTLNINHBTR    29    0.021    Streptomyces subtilisin inhibitor signature.
		>SSBTLNINHBTR#Streptomyces subtilisin inhibitor signature.

Length = 144

Score = 29.0 bits (64), Expect = 0.021
Identities = 21/44 (47%), Positives = 23/44 (52%), Gaps = 3/44 (6%)

Query: 21 VLHPLAGRPLLSHVIDTARALAPSRLVVVIGHGAEQVRAAVAAP 64
V PLAG L S A APS LV+ +GHG AA AAP
Sbjct: 18 VCGPLAGASLASPATAPASLYAPSALVLTVGHGES---AATAAP 58


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0335    cloacin    29    0.042    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 29.3 bits (65), Expect = 0.042
Identities = 23/86 (26%), Positives = 27/86 (31%), Gaps = 23/86 (26%)

Query: 424 GRGANAGGGAGDDDRAPRAAHDSG---RGGDVKGGGKGGSKGGVK--------GGGTDDH 472
GRG N G AH + GG G GG+ G GGG+
Sbjct: 6 GRGHNTG------------AHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSG 53

Query: 473 GHWVDKGYKGGEGNAADGAGRRRNGG 498
HW G G + G GG
Sbjct: 54 IHWGGGSGHGNGGGNGNSGGGSGTGG 79


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0336    HTHTETR    28    0.041    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 28.1 bits (62), Expect = 0.041
Identities = 17/136 (12%), Positives = 34/136 (25%), Gaps = 8/136 (5%)

Query: 2 GTTIRDVAQAANVSIGTVSRALKNQPGLSEATRARIVE-----IAHRMNYDPTQLRPRIK 56
T++ ++A+AA V+ G + K++ L P ++
Sbjct: 31 STSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELELEYQAKFPGDPLSVLR 90

Query: 57 -RLTFLLHRQHNNFATTPFFSHVLHGVEDACRERGIVPSLLTTGPTDDVIRQMRPHAPDA 115
L +L + H E E +V + ++
Sbjct: 91 EILIHVLESTVTEERRRLLMEIIFHKCEFV-GEMAVVQQAQRNLCL-ESYDRIEQTLKHC 148

Query: 116 IAVAGFMEPETLEALA 131
I A
Sbjct: 149 IEAKMLPADLMTRRAA 164


7    BURPS668_0360    BURPS668_0367    Y    N    N    Genomic Island
LocusTag    DNBias    CDNBias    %GCBias    Product
BURPS668_03602141.907030lipoprotein
BURPS668_03612170.988958leucine-responsive regulatory protein
BURPS668_03622161.629316AzlC family protein
BURPS668_03635191.305522AzlD family protein
BURPS668_03647212.146768alpha/beta hydrolase
BURPS668_03652190.345843hypothetical protein
BURPS668_03661190.031880hypothetical protein
BURPS668_0367419-1.668046dihydrodipicolinate synthase
8    BURPS668_0416    BURPS668_0426    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
BURPS668_04160113.282644hypothetical protein
BURPS668_04170103.093858IclR family transcriptional regulator
BURPS668_04181103.171811fumarylacetoacetate hydrolase
BURPS668_04191102.098156hypothetical protein
BURPS668_04201101.580638enoyl-CoA hydratase
BURPS668_04211122.416674hypothetical protein
BURPS668_04221142.089252patatin family phospholipase
BURPS668_0423-2153.426448hypothetical protein
BURPS668_0424-1134.179508hypothetical protein
BURPS668_0425-3133.639467aut protein
BURPS668_0426-2123.576484hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0416    PF03544    35    3e-04    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 35.3 bits (81), Expect = 3e-04
Identities = 18/99 (18%), Positives = 34/99 (34%), Gaps = 3/99 (3%)

Query: 55 PVQVELLKPQPIERAPAPEKPAADRPRAAPKRAARASAPPAHAPRASAPVSSAAESSTES 114
P Q P+P+ +P + P+ AP + P P+ V +
Sbjct: 62 PPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKR---DV 118

Query: 115 SAESPAAASGAEPASAADGQAAGATSGAAAGASGASAPP 153
AS E + A ++ AT+ + + ++ P
Sbjct: 119 KPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGP 157


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0426    GPOSANCHOR    30    0.003    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 30.0 bits (67), Expect = 0.003
Identities = 18/64 (28%), Positives = 24/64 (37%), Gaps = 4/64 (6%)

Query: 68 PALETAPLNAPGAAPAAASDSAPGSPAASAPASAVAPASMPASVAAPAAPA----PSSPP 123
A E A L A A+ + D+ PG+ A A + P AP PS+
Sbjct: 451 QAEELAKLRAGKASDSQTPDAKPGNKAVPGKGQAPQAGTKPNQNKAPMKETKRQLPSTGE 510

Query: 124 AAQP 127
A P
Sbjct: 511 TANP 514


9    BURPS668_0482    BURPS668_0491    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
BURPS668_04821113.568610NAD(P)H-dependent glycerol-3-phosphate
BURPS668_04842111.984128hypothetical protein
BURPS668_0483-280.512639hypothetical protein
BURPS668_0485-210-0.657435RNA methyltransferase
BURPS668_0486-210-0.945764competence protein
BURPS668_0487012-1.984267biotin biosynthesis protein BioC
BURPS668_0488214-3.076802hypothetical protein
BURPS668_0489215-3.309688cytochrome c oxidase subunit II
BURPS668_0490318-4.204578cytochrome c oxidase subunit I
BURPS668_0491014-3.314589hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0489    OMPADOMAIN    68    2e-14    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 67.6 bits (165), Expect = 2e-14
Identities = 27/103 (26%), Positives = 46/103 (44%), Gaps = 2/103 (1%)

Query: 399 QADGGAAANAASGAAAQTQAQAPALPAAIYFETGKSELPADAKDAIAAAAEYVRAH--PD 456
Q + A A + Q + L + + F K+ L + + A+ + D
Sbjct: 193 QGEAAPVVAPAPAPAPEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKD 252

Query: 457 AKLALSGFTDKTGSADANAELAKRRAQVVRDALKTAGVAEDRI 499
+ + G+TD+ GS N L++RRAQ V D L + G+ D+I
Sbjct: 253 GSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYLISKGIPADKI 295


10    BURPS668_0513    BURPS668_0552    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
BURPS668_0513-1123.173575fatty acid desaturase
BURPS668_0514-2133.9946252,4-diaminobutyrate 4-transaminase
BURPS668_0515-2154.559455hypothetical protein
BURPS668_0516-2153.466854ABC-2 type transporter permease
BURPS668_0517-2164.148116ABC transporter ATP-binding protein
BURPS668_0518-1144.358012syringomycin biosynthesis enzyme
BURPS668_0519-2144.361032UbiE/COQ5 family methyltransferase
BURPS668_0520-1144.091851citrate synthase-like protein
BURPS668_05210133.017414acyl-CoA dehydrogenase
BURPS668_05220143.633196hypothetical protein
BURPS668_05230143.473798AMP-binding protein
BURPS668_0524-1152.365533pyridoxal-dependent decarboxylase
BURPS668_05252162.793075hypothetical protein
BURPS668_05261132.064866hypothetical protein
BURPS668_05271122.458769hypothetical protein
BURPS668_05280102.143656hypothetical protein
BURPS668_0529091.944046hypothetical protein
BURPS668_0530192.198150hypothetical protein
BURPS668_0531192.385314AMP-binding protein
BURPS668_05324124.047986LysR family transcriptional regulator
BURPS668_05333133.738458hypothetical protein
BURPS668_05342133.624471GntR family transcriptional regulator
BURPS668_05351133.377473N-acetylglucosamine-6-phosphate deacetylase
BURPS668_05360123.734419SIS domain-containing protein
BURPS668_0537-293.290954phosphoryl transfer system,
BURPS668_0538-1102.027192pts system, N-acetylglucosamine-specific IIBC
BURPS668_0539-1101.336709hypothetical protein
BURPS668_0540-2100.024919glycosyl hydrolase
BURPS668_0541-211-2.262823hypothetical protein
BURPS668_0542-19-2.900983hypothetical protein
BURPS668_0543011-3.770359cyd operon protein YbgT
BURPS668_054409-3.923201cytochrome d ubiquinol oxidase, subunit II
BURPS668_054508-4.100202cytochrome d ubiquinol oxidase, subunit I
BURPS668_0546-28-2.250177hypothetical protein
BURPS668_0547-28-2.442640RNA polymerase factor sigma-32
BURPS668_0548-110-1.375415hypothetical protein
BURPS668_0549-19-0.228046hypothetical protein
BURPS668_0550191.509970hypothetical protein
BURPS668_0551-183.0058342-isopropylmalate synthase
BURPS668_0552-293.814410hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0516    ABC2TRNSPORT    32    0.003    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 31.8 bits (72), Expect = 0.003
Identities = 33/155 (21%), Positives = 60/155 (38%), Gaps = 7/155 (4%)

Query: 163 YGEFFATGILIMAFMSIGVVSTA-TTIATLRERNTFKMYVCFPVSRF-VFLASLIVSRVI 220
Y F A G++ + M+ T + + T++ + + + L + +
Sbjct: 65 YTAFLAAGMVATSAMTAATFETIYAAFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATK 124

Query: 221 LMLAASVTLMLAARYLFQVPLPLWSLRALRAIPVVLLGAAMLLSLGTLLASRARSLAAAE 280
LA + ++AA + SL L A+PV+ L SLG ++ + A S
Sbjct: 125 AALAGAGIGVVAAALGY---TQWLSL--LYALPVIALTGLAFASLGMVVTALAPSYDYFI 179

Query: 281 AWCNLIYFPLLFFSDLTIPLRAAPHWLRVVLLVLP 315
+ L+ P+LF S P+ P + LP
Sbjct: 180 FYQTLVITPILFLSGAVFPVDQLPIVFQTAARFLP 214


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0537    PHPHTRNFRASE    513    e-175    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 513 bits (1323), Expect = e-175
Identities = 194/567 (34%), Positives = 312/567 (55%), Gaps = 7/567 (1%)

Query: 303 PNTLAGVCAAPGIAVGTLVRWDDAQIVPPELASGTPAAESRLLDRALAEVDAQLETTVRE 362
+ + G+ A+ G+A+ + + + + + E L AL + +L +
Sbjct: 2 HHKITGIAASSGVAIAKAFIHLEPNVDIEKTSITDVSTEIEKLTAALEKSKEELRAIKDQ 61

Query: 363 ASRRGAIGEAGIFAVHRVLLEDPALVDAARDLI-SLGKSAGYAWRETIRAQTAVLADVDD 421
+A IFA H ++L+DP LVD + I + +A YA +E ++ +D+
Sbjct: 62 TEASMGADKAEIFAAHLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDN 121

Query: 422 TLLAERAADLRDIDKRVLRAL-GYASASARELPAEAVLAAEEFTPSDLASLDRERVAALV 480
+ ERAAD+RD+ KRVL L G + S + E V+ AE+ TPSD A L+++ V
Sbjct: 122 EYMKERAADIRDVSKRVLGHLIGVETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFA 181

Query: 481 MARGGATSHAAIIARQLGIPALVAVGDALYAIAQRTQVVVDASAGRLEYAPSALDVERAR 540
GG TSH+AI++R L IPA+V + I V+VD G + P+ +V+
Sbjct: 182 TDIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYE 241

Query: 541 HERQRLAGVREANRRMSGEAALTRDGHRIEVAANIATLDDARVALDNGADAVGLLRTELM 600
+R ++ ++ GE + T+DG +E+AANI T D L NG + +GL RTE +
Sbjct: 242 EKRAAFEKQKQEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFL 301

Query: 601 FIHRQAAPTASEHQQSYQSIVDALQGRAAIIRTLDVGADKEVDYLTLPPEPNPALGLRGI 660
++ R PT E ++Y+ +V + G+ +IRTLD+G DKE+ YL LP E NP LG R I
Sbjct: 302 YMDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYLQLPKELNPFLGFRAI 361

Query: 661 RLAQVRPDLLDDQLRGLLAVKPYGSVRILLPMVTDVGELVRIRKRIDD-----FARAMGR 715
RL + D+ QLR LL YG+++++ PM+ + EL + + + + + +
Sbjct: 362 RLCLEKQDIFRTQLRALLRASTYGNLKVMFPMIATLEELRQAKAIMQEEKDKLLSEGVDV 421

Query: 716 AQAVEVGVMIEVPSAALLADQLAQHADFLSIGTNDLTQYTLAMDRCQADLAAQADGLHPA 775
+ ++EVG+M+E+PS A+ A+ A+ DF SIGTNDL QYT+A DR ++ HPA
Sbjct: 422 SDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPYHPA 481

Query: 776 VLRLVDATVRGAEKHGKWVGVCGALGGDPVAVPVLVGLGVTELSVDPVSVPGIKAQVRRL 835
+LRLVD ++ A GKWVG+CG + GD VA+P+L+GLG+ E S+ S+ ++Q+ +L
Sbjct: 482 ILRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQLLKL 541

Query: 836 DYQLCRQRAQDLLALESAQAVRAASRE 862
+ + AQ L L++A+ V ++
Sbjct: 542 SKEELKPFAQKALMLDTAEEVEQLVKK 568


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0540    cloacin    31    0.023    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 30.8 bits (69), Expect = 0.023
Identities = 31/120 (25%), Positives = 48/120 (40%), Gaps = 8/120 (6%)

Query: 176 VVVDGAAPAVLRYDDTDDELRYVETLPADAQNNSPGNAPP--AAAQPVANRALPSVKRQR 233
V + G P+ + DD + + V +LPAD SP ++ P A V R + VK +R
Sbjct: 134 VALYGVLPSQIAKDDPNMMSKIVTSLPADDITESPVSSLPLDKATVNVNVRVVDDVKDER 193

Query: 234 ALPGALDLRGVELTLPELPSAQVAALRERAGTLGLDGARVPVWGVVAPRRLPADIAVPGG 293
+ GV +++P + A ER G PV + PA + G
Sbjct: 194 QNISVVS--GVPMSVPVVD----AKPTERPGVFTASIPGAPVLNISVNNSTPAVQTLSPG 247


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0550    TYPE3IMSPROT    34    9e-05    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 34.0 bits (78), Expect = 9e-05
Identities = 14/42 (33%), Positives = 21/42 (50%), Gaps = 1/42 (2%)

Query: 44 KRETKQQFIDAITAGRRRYRQIEIQSQDVL-PVGDATYVVAG 84
KRE K+ +RR EIQS+++ V ++ VVA
Sbjct: 222 KREYKEMEGSPEIKSKRRQFHQEIQSRNMRENVKRSSVVVAN 263


11    BURPS668_0576    BURPS668_0619    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
BURPS668_0576211-2.954087ATP-dependent protease La
BURPS668_0577211-2.916788hypothetical protein
BURPS668_0578013-2.723567HPr kinase/phosphorylase
BURPS668_0579-114-2.808817PTS transporter subunit IIA-like
BURPS668_0580-214-1.735206ribosomal subunit interface protein
BURPS668_0581-113-0.489496RNA polymerase factor sigma-54
BURPS668_05820110.835530ABC transporter ATP-binding protein
BURPS668_05830100.929831hypothetical protein
BURPS668_05841111.028168hypothetical protein
BURPS668_05853101.4779423-deoxy-D-manno-octulosonate 8-phosphate
BURPS668_05863100.815287KpsF/GutQ family sugar isomerase
BURPS668_05872100.788296monovalent cation:proton antiporter-2 (CPA2)
BURPS668_0588213-0.081856adenine phosphoribosyltransferase
BURPS668_0589014-0.807038LysE family translocator protein
BURPS668_0590-211-1.341666nudix hydrolase
BURPS668_0591-111-1.653185formyltetrahydrofolate deformylase
BURPS668_0593-110-1.424848hypothetical protein
BURPS668_0592-110-1.652784hypothetical protein
BURPS668_0594-27-1.128810excinuclease ABC subunit A
BURPS668_0595416-0.945862major facilitator family transporter
BURPS668_0596520-0.443199single-stranded DNA-binding protein
BURPS668_0597524-1.356228dienelactone hydrolase
BURPS668_0598747-8.135362hypothetical protein
BURPS668_0599746-8.027842Zn-dependent hydrolases, including glyoxylases
BURPS668_0600862-12.024962transcriptional regulator
BURPS668_0601216-1.044577hypothetical protein
BURPS668_0602010-0.336698hypothetical protein
BURPS668_0603011-0.200279hypothetical protein
BURPS668_06040102.4414094-carboxymuconolactone decarboxylase
BURPS668_06051102.573292carboxymuconolactone decarboxylase
BURPS668_06061102.860558Rhs family protein
BURPS668_0607-1100.323052hypothetical protein
BURPS668_0608529-3.126380hypothetical protein
BURPS668_0609625-3.325153hypothetical protein
BURPS668_0610623-2.753615hypothetical protein
BURPS668_0611825-3.388094hypothetical protein
BURPS668_0612825-4.698349hypothetical protein
BURPS668_0613218-1.387598hypothetical protein
BURPS668_06142113.708298hypothetical protein
BURPS668_06151123.627600hypothetical protein
BURPS668_06161113.484178hypothetical protein
BURPS668_0617294.129568hypothetical protein
BURPS668_06182104.078692FHA domain-containing protein
BURPS668_06191103.672722protein kinase domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0595    TCRTETA    85    3e-20    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 84.9 bits (210), Expect = 3e-20
Identities = 77/368 (20%), Positives = 143/368 (38%), Gaps = 31/368 (8%)

Query: 7 RATTSLAAIFALRMLGLFMIMPVFSVYAKTIPGGENVVL-VGIALGAYGVTQSLLYIFYG 65
R + + AL +G+ +IMPV + + +V GI L Y + Q G
Sbjct: 5 RPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLG 64

Query: 66 WASDKFGRKPVIAAGLLIFALGSFVAAFAHDITWIIVGRVIQGM-GAVSSAVLAFIADLT 124
SD+FGR+PV+ L A+ + A A + + +GR++ G+ GA + A+IAD+T
Sbjct: 65 ALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADIT 124

Query: 125 SEHNRTKAMAMVGGSIGMSFAVAIVGAPI--VFHWVGMSGLFAIVGALSVAAIGVVLWVV 182
R + + G + G + + F AL+ +++
Sbjct: 125 DGDERARHFGFMSACFGFGM---VAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLL 181

Query: 183 PDAPRPVHVPAPFAEVLHNVELLRLNFGVLVLHATQTALFLVVPRLLVDGGLPVA----- 237
P++ + P E L+ + R G+ V+ A F+ + + G +P A
Sbjct: 182 PESHKGERRPLR-REALNPLASFRWARGMTVVAALMAVFFI----MQLVGQVPAALWVIF 236

Query: 238 ----SHWQ-----VYLPVMGL--AFVMMVPAIIVAEKQGRMKPVLLGGIAAILIGQLLLG 286
HW + L G+ + + VA + G + ++L G+ A G +LL
Sbjct: 237 GEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALML-GMIADGTGYILLA 295

Query: 287 VATHTILIVAAILFVYFLGFNILEASQPSLVSKLAPGSRKGAATGVYNTTQSIGLALGGV 346
AT + + V I + +++S+ R+G G S+ +G +
Sbjct: 296 FATRGWMAF--PIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPL 353

Query: 347 VGGVLLKH 354
+ +
Sbjct: 354 LFTAIYAA 361


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0596    cloacin    43    3e-07    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 42.8 bits (100), Expect = 3e-07
Identities = 30/71 (42%), Positives = 35/71 (49%), Gaps = 9/71 (12%)

Query: 109 GGRGGSGGGGGGGDDGGY-------GGGGGRDMERGGGGGRASGGGG--AGARSGGGGGA 159
GG G G GGG D G+ GGG G + GGG G +GGG +G SG GG
Sbjct: 22 GGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNL 81

Query: 160 SRPSAPAGGGF 170
S +AP GF
Sbjct: 82 SAVAAPVAFGF 92



Score = 33.1 bits (75), Expect = 5e-04
Identities = 23/61 (37%), Positives = 26/61 (42%), Gaps = 5/61 (8%)

Query: 109 GGRGGSGGGGGGGDDGGYGGGGGRDMERGGGGGRASGGGGAGARSGGGGGASRPSAPAGG 168
GG GSG GGG G GGG G GGG +GG + + G S P G
Sbjct: 47 GGGSGSGIHWGGGSGHGNGGGNG-----NSGGGSGTGGNLSAVAAPVAFGFPALSTPGAG 101

Query: 169 G 169
G
Sbjct: 102 G 102


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0603    BACINVASINC    29    0.049    Salmonella/Shigella invasin protein C signature.
		>BACINVASINC#Salmonella/Shigella invasin protein C signature.

Length = 409

Score = 29.5 bits (65), Expect = 0.049
Identities = 15/54 (27%), Positives = 26/54 (48%)

Query: 79 QLRVLPRGITGLLNISKLTPDALAIIRATALALTHLEQAEKKVDDESPLPILDA 132
QLR + +IS ++ A+A++ A + L QA+ K+ + L DA
Sbjct: 105 QLREQQAEVGKFFDISGMSSSAVALLAAANTLMLTLNQADSKLSGKLSLVSFDA 158


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0606    SALSPVBPROT    60    7e-11    Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 60.1 bits (145), Expect = 7e-11
Identities = 55/208 (26%), Positives = 81/208 (38%), Gaps = 41/208 (19%)

Query: 12 LNLPSGGGSVSGDGGDFSVDLNTGTATLKFDLTVPAGPNGITPPHTLQYSAGAGDGAFGI 71
LP GG ++S G D G A++ L + A G P L YS+G G+G FG+
Sbjct: 18 PFLPKGGKALSQSGPD-------GLASITLPLPISAE-RGFAPALALHYSSGGGNGPFGV 69

Query: 72 GWSLGLMTIRRR-----------------------ITPATGAAEPAPPGACSLVGVGELV 108
GWS M+I R T +TG A P P + V
Sbjct: 70 GWSCATMSIARSTSHGVPQYNDSDEFLGPDGEVLVQTLSTGDA-PNPVTCFAYGDVSFPQ 128

Query: 109 DMGARRFRPIVDATGLLIEFTGAS------WTATDKTDTQYTLGTSANARIG---GGALP 159
R++P +++ +E+ + W D + LG +A AR+ +
Sbjct: 129 SYTVTRYQPRTESSFYRLEYWVGNSNGDDFWLLHDSNGILHLLGKTAAARLSDPQAASHT 188

Query: 160 AAWLVDRCADSAGNAIAYTWLDVGGARV 187
A WLV+ AG I Y++L G V
Sbjct: 189 AQWLVEESVTPAGEHIYYSYLAENGDNV 216


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0607    PF05616    35    0.002    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 34.7 bits (79), Expect = 0.002
Identities = 28/80 (35%), Positives = 32/80 (40%), Gaps = 9/80 (11%)

Query: 164 SQGNSAASSVAVTRAKAVDAAVVGAFSPPQMPNPSALPAANP--NAAPSTTPGFHPAPGV 221
SQGN+ + R D A +P P P PA NP N AP+ PG P P
Sbjct: 299 SQGNTTVDVQVIPRP---DLTPGSAEAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEP 355

Query: 222 MPPRGIDLAPAALTPLKIQP 241
P DL P A QP
Sbjct: 356 DP----DLNPDANPDTDGQP 371


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0619    YERSSTKINASE    34    0.004    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 33.9 bits (77), Expect = 0.004
Identities = 17/42 (40%), Positives = 24/42 (57%), Gaps = 2/42 (4%)

Query: 149 QVLDGLAHAHANGVVHRDLKPQNVMVTTRDGEPCAKILDFGI 190
++LD H GVVH D+KP NV+ GEP ++D G+
Sbjct: 253 RLLDVTNHLAKAGVVHNDIKPGNVVFDRASGEPV--VIDLGL 292


12    BURPS668_0750    BURPS668_0785    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
BURPS668_07502112.311710hypothetical protein
BURPS668_07512112.657653FAD binding domain-containing protein
BURPS668_07530113.487922LysR family transcriptional regulator
BURPS668_07541102.995274tRNA 2-selenouridine synthase
BURPS668_07551113.343519hypothetical protein
BURPS668_07561133.121903hypothetical protein
BURPS668_07570123.291217hypothetical protein
BURPS668_07580102.954496ABC transporter ATP-binding protein
BURPS668_07590133.002283ABC transporter permease
BURPS668_07600153.203402hypothetical protein
BURPS668_0761092.541051protein-L-isoaspartate O-methyltransferase
BURPS668_0762-192.768067hypothetical protein
BURPS668_0763-1103.755068nicotinate phosphoribosyltransferase
BURPS668_0764-1104.246182hypothetical protein
BURPS668_0765-293.281055phosphoribosyl transferase domain-containing
BURPS668_0766-2103.532533transglycosylase
BURPS668_0767-193.774131hypothetical protein
BURPS668_0768-1103.865420cytochrome c family protein
BURPS668_0769-1103.508589hypothetical protein
BURPS668_0770-1103.378661cytochrome c oxidase subunit I/subunit III
BURPS668_07710103.977647cytochrome c oxidase subunit II
BURPS668_0772-1113.578531thiamine pyrophosphate protein
BURPS668_0773094.355870mandelate racemase/muconate lactonizing enzyme
BURPS668_0774184.375910hypothetical protein
BURPS668_0775084.244412hypothetical protein
BURPS668_0776093.338291GMC oxidoreductase
BURPS668_07770102.215449hypothetical protein
BURPS668_07780131.175485hypothetical protein
BURPS668_0779-1120.300670LuxR family transcriptional regulator
BURPS668_0780-211-0.385465peptidase
BURPS668_0781-214-3.124166transcriptional regulator
BURPS668_0782-115-3.733846Signal transduction histidine kinase
BURPS668_0783021-5.145914oxidoreductase molybdopterin binding subunit
BURPS668_0784227-2.912491response regulator receiver domain-containing
BURPS668_0785228-2.581288hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0769    V8PROTEASE    30    0.004    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 30.4 bits (68), Expect = 0.004
Identities = 10/30 (33%), Positives = 13/30 (43%)

Query: 144 NREPNAPGEPDELGELGELGEPDLPDEPDD 173
P+ P PD E PD P+ PD+
Sbjct: 292 PDNPDNPNNPDNPNNPDEPNNPDNPNNPDN 321


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0782    HTHFIS    58    9e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 57.5 bits (139), Expect = 9e-11
Identities = 34/122 (27%), Positives = 52/122 (42%), Gaps = 15/122 (12%)

Query: 493 RALVVDDNENARETLGAMLATLGIRVDLRGTGKEGLRCFGECQHDIVVLDLELPDISGFE 552
LV DD+ R L L+ G V + R D+VV D+ +PD + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 553 VAEQIRWATSSDAARKTTILGVSAYES------ALLKGDHAIFDAFIPKPIHLDTLGGIV 606
+ +I+ A +L +SA + A KG +D ++PKP L L GI+
Sbjct: 65 LLPRIK-----KARPDLPVLVMSAQNTFMTAIKASEKG---AYD-YLPKPFDLTELIGII 115

Query: 607 SR 608
R
Sbjct: 116 GR 117


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0784    HTHFIS    38    5e-05    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 37.5 bits (87), Expect = 5e-05
Identities = 22/125 (17%), Positives = 47/125 (37%), Gaps = 12/125 (9%)

Query: 188 ARIAVVDDSPDVAETICEYFAEKGVAAIAYYDSVSFRKALEVEDFDGYILDWLLGEETAA 247
A I V DD + + + + G ++ + + + D D + D ++ +E A
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 248 PLVRGIRASENADAPIFLLTGKISTGEASEDEIADIVSSFNARCEE---KPVRLPILFAE 304
L+ I+ D P+ +++ ++ + + + KP L L
Sbjct: 64 DLLPRIK-KARPDLPVLVMSA--------QNTFMTAIKASEKGAYDYLPKPFDLTELIGI 114

Query: 305 VARAL 309
+ RAL
Sbjct: 115 IGRAL 119


13    BURPS668_0795    BURPS668_0823    Y    Y    Y    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
BURPS668_0795217-2.855815LysR family transcriptional regulator
BURPS668_0796015-3.723150hypothetical protein
BURPS668_0797015-3.064443hypothetical protein
BURPS668_0798014-4.281585multidrug ABC transporter permease
BURPS668_0799016-5.079505mandelate racemase
BURPS668_0800123-5.567287LuxR family transcriptional regulator
BURPS668_0801016-4.8140605-enolpyruvylshikimate-3-phosphate synthase
BURPS668_0802-114-2.419926hypothetical protein
BURPS668_0803-113-2.108222hypothetical protein
BURPS668_0804012-0.934826hypothetical protein
BURPS668_0806011-0.323699*hypothetical protein
BURPS668_0807-110-0.493701major facilitator family transporter
BURPS668_0808110-0.562355sensor histidine kinase
BURPS668_0809114-2.012999DNA-binding response regulator
BURPS668_0810215-2.124417recombinase A
BURPS668_0811215-0.998132recombination regulator RecX
BURPS668_0812-213-1.214294hypothetical protein
BURPS668_0813-111-1.685891succinyl-CoA synthetase subunit beta
BURPS668_0814-110-0.731374succinyl-CoA synthetase subunit alpha
BURPS668_0815-1100.264667TerC family integral membrane protein
BURPS668_08160120.465703type IV pilin protein PilA
BURPS668_0817-111-0.044585O-antigen polymerase
BURPS668_0818423-0.091783hypothetical protein
BURPS668_08191103.051718hypothetical protein
BURPS668_08200113.981671hypothetical protein
BURPS668_0821-1103.761507TonB domain-containing protein
BURPS668_0822-1123.415175molybdenum cofactor biosynthesis protein MoaC
BURPS668_0823-1103.019454hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0807    TCRTETA    35    8e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.8 bits (80), Expect = 8e-04
Identities = 47/264 (17%), Positives = 93/264 (35%), Gaps = 28/264 (10%)

Query: 77 AIVFGRLGDLVGRKHTFLITIVIMGISTFVVGFLPGYASIGIAAPVIFIAMRLLQGLALG 136
A V G L D GR+ L+++ + ++ P V++I R++ G+ G
Sbjct: 60 APVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAP-------FLWVLYIG-RIVAGIT-G 110

Query: 137 GEYGGAATYVAEHAPSHRRGFYTSWIQTTATLGLFLSLLVILGVRTAIGEEAFGSWGWRV 196
A Y+A+ R + ++ G+ + G +G G +
Sbjct: 111 ATGAVAGAYIADITDGDERARHFGFMSACFGFGM------VAG--PVLGGLM-GGFSPHA 161

Query: 197 PFVASILLLAVSVWIRLQLNESPVFLRIKAEGKTSKAPLTEAFGQWKNLKIVILALIGLT 256
PF A+ L ++ L + + + PL + + L
Sbjct: 162 PFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASF-----RWARGMTVVAALM 216

Query: 257 AGQAVVWYTGQFYA---LFFLTQTLKVDGASANILIALALLIGTPF-FVFFGSLSDRIGR 312
A ++ GQ A + F D + I +A ++ + + G ++ R+G
Sbjct: 217 AVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGE 276

Query: 313 KPIILAGCLIAALTYFPLFKALTH 336
+ ++ G +IA T + L T
Sbjct: 277 RRALMLG-MIADGTGYILLAFATR 299



Score = 34.8 bits (80), Expect = 8e-04
Identities = 17/42 (40%), Positives = 24/42 (57%)

Query: 287 ILIALALLIGTPFFVFFGSLSDRIGRKPIILAGCLIAALTYF 328
IL+AL L+ G+LSDR GR+P++L AA+ Y
Sbjct: 47 ILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYA 88


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
BURPS668_0808    PF06580    48    7e-08    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 47.6 bits (113), Expect = 7e-08
Identities = 50/230 (21%), Positives = 85/230 (36%), Gaps = 55/230 (23%)

Query: 305 LAGLRTQAEF-ALRHEVNADVAH----SLEQIA----TSSEQAARLVTQLLALARAENRA 355
+A + +A+ AL+ ++N H +L I +A ++T L L R
Sbjct: 154 MASMAQEAQLMALKAQINP---HFMFNALNNIRALILEDPTKAREMLTSLSELMRY---- 206

Query: 356 TGLTFEPVEIASLARQ--AVRDWV---QAALAKQMDLGYEGPDTDAPLRVDGQPVMLREM 410
L + SLA + V ++ ++ + ++V P ML
Sbjct: 207 -SLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQV---PPML--- 259

Query: 411 LGNLIDNAIRY----TPAGGRITVRVRAERAAGAVHLEVEDTGPGIPPNERERVVERFYR 466
+ L++N I++ P GG+I ++ + G V LEVE+TG N +E
Sbjct: 260 VQTLVENGIKHGIAQLPQGGKILLKGTKDN--GTVTLEVENTGSLALKNTKE-------- 309

Query: 467 ILGREGDGSGLGLAIVRE-IVAQHGGTLTIDDNVYQTSPRLAGTLVRVSI 515
+G GL VRE + +G I S + V I
Sbjct: 310 -------STGTGLQNVRERLQMLYGTEAQIK-----LSEKQGKVNAMVLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0809  HTHFIS        100    3e-26    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 99.5 bits (248), Expect = 3e-26
Identities = 35/118 (29%), Positives = 64/118 (54%), Gaps = 1/118 (0%)

Query: 2 RILIAEDDSILADGLTRSLRQSGYAVDHVRNGVEADTALSMQTFDLLILDLGLPKMSGLE 61
IL+A+DD+ + L ++L ++GY V N ++ DL++ D+ +P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLRRLRARNSNLPVLILTAADSVDERVKGLDLGADDYMAKPFALNE-LEARVRALTRR 118
+L R++ +LPVL+++A ++ +K + GA DY+ KPF L E + RAL
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0816  BCTERIALGSPH  41     4e-07    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.
Length = 170

Score = 41.1 bits (96), Expect = 4e-07
Identities = 29/124 (23%), Positives = 44/124 (35%), Gaps = 23/124 (18%)

Query: 1 MRARGFTLIELMIVLAIVGVVAAYAIPAYQDYLARSRVGEGLALAAS--ARLAVAENAAS 58
MR RGFTL+E+M++L ++GV A + A+ SR A A+L +
Sbjct: 1 MRQRGFTLLEMMLILLLMGVSAGMVLLAFPA----SRDDSAAQTLARFEAQLRFVQQRGL 56

Query: 59 GNGFSGGYVSPPATRNVDSIRVDDDSGQIVV-----AFTTRVAAAGANTLVLVPSAPDQA 113
G G + V D Q +V A G + +P +
Sbjct: 57 QTGQFFG------------VSVHPDRWQFLVLEARDGADPAPADDGWSGYRWLPLRAGRV 104

Query: 114 DTPT 117
T
Sbjct: 105 ATSG 108


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0817  PF06580       29     0.048    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.1 bits (65), Expect = 0.048
Identities = 17/107 (15%), Positives = 41/107 (38%), Gaps = 14/107 (13%)

Query: 205 AALSALLSVGLALTVSRGPWLQVG-----------VMVVAGFWMAFA-QARRDPA--ASR 250
+ + +L+ + R WL++ +V+ W R A ++
Sbjct: 49 SLMGLVLTHAYRSFIKRQGWLKLNMGQIILRVLPACVVIGMVWFVANTSIWRLLAFINTK 108

Query: 251 ARAWAIPVVLGVLFVAVNVAVRWANVHYHLGLAESAADRMRDAGQIA 297
A+ +P+ L ++F V V W+ +++ ++ D ++A
Sbjct: 109 PVAFTLPLALSIIFNVVVVTFMWSLLYFGWHFFKNYKQAEIDQWKMA 155


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0821  PF03544       39     1e-06    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 39.2 bits (91), Expect = 1e-06
Identities = 18/97 (18%), Positives = 28/97 (28%), Gaps = 2/97 (2%)

Query: 18 AGCAAFAPRDAAKLECTMPVAAYPENAKPLERRATVLVRAMITASGNAENVTVTTSSRNA 77
+ L P YP A+ L V V+ +T G +NV + ++
Sbjct: 147 SKPVTSVASGPRALSRNQPQ--YPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPAN 204

Query: 78 AADRAAVDAMSRIACSQTPARGGEPYPFTLTRPFVFE 114
+R +AM R G E
Sbjct: 205 MFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTE 241


14  BURPS668_0876  BURPS668_0890  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_0876  0       12       4.268027   hypothetical protein
BURPS668_0878  0       11       3.924452   hypothetical protein
BURPS668_0879  -1      10       5.461813   hypothetical protein
BURPS668_0880  0       11       5.081780   LysR family transcriptional regulator
BURPS668_0881  0       11       5.055787   esterase
BURPS668_0882  -1      13       4.869920   major facilitator family transporter
BURPS668_0883  -1      12       4.411854   transcriptional regulator
BURPS668_0884  1       12       4.666908   xylulokinase
BURPS668_0885  1       12       4.234422   mannitol dehydrogenase
BURPS668_0886  1       12       4.804300   LysR family transcriptional regulator
BURPS668_0887  1       12       4.369176   benzoylformate decarboxylase
BURPS668_0888  2       11       3.764646   vanillin dehydrogenase
BURPS668_0889  2       12       3.334673   2-dehydropantoate 2-reductase
BURPS668_0890  2       11       2.340642   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0878  PF06776       30     0.020    Invasion associated locus B
		>PF06776#Invasion associated locus B

Length = 214

Score = 29.5 bits (66), Expect = 0.020
Identities = 11/49 (22%), Positives = 15/49 (30%), Gaps = 2/49 (4%)

Query: 1 MKTGRRHFVRSVASASAALAAAAWSPARAAIDAPASPATALSLTPGRWS 49
+ + RR R+ A A A A A A+ G W
Sbjct: 38 LASCRRLARRNGARLMLAGAMAI--ALSFGWSDRADAQGAVRSVHGDWQ 84



Score = 28.7 bits (64), Expect = 0.039
Identities = 7/37 (18%), Positives = 13/37 (35%)

Query: 10 RSVASASAALAAAAWSPARAAIDAPASPATALSLTPG 46
+++ A L+ S R A A A ++
Sbjct: 25 KAIQMGPAELSPMLASCRRLARRNGARLMLAGAMAIA 61


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0881  BLACTAMASEA   30     0.018    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 29.8 bits (67), Expect = 0.018
Identities = 11/35 (31%), Positives = 15/35 (42%)

Query: 57 REDALFRFASVSKPIVSAAAMRAVAAGKLDLDASI 91
R D F S K ++ A + V AG L+ I
Sbjct: 57 RADERFPMMSTFKVVLCGAVLARVDAGDEQLERKI 91


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0882  TCRTETB       35     4e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 35.2 bits (81), Expect = 4e-04
Identities = 31/155 (20%), Positives = 59/155 (38%), Gaps = 5/155 (3%)

Query: 26 LLALATAGFITIVTEALPAGLLPLMGRDLRVSDALVGQLVTVYAAGSIVAAIPLVAATRG 85
L+ L F +++ E + LP + D A + T + + +
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 86 MRRRPLLLAALAGFVVANTATAASPYYAPVLV-ARCVAGVSAGLLWALLAGYASRMVDAR 144
+ + LLL + + + +L+ AR + G A AL+ +R +
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 145 QRGRAIAIAMLGAPVAMSVGI-PL-GTALGAALGW 177
RG+A ++G+ VAM G+ P G + + W
Sbjct: 136 NRGKAFG--LIGSIVAMGEGVGPAIGGMIAHYIHW 168


15  BURPS668_0932  BURPS668_0956  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_0932  -1      11       4.263060   hypothetical protein
BURPS668_0933  -2      11       4.178253   chromate transporter
BURPS668_0934  -1      10       2.815113   cyclic nucleotide-binding protein
BURPS668_0935  -1      8        3.217947   hypothetical protein
BURPS668_0936  3       10       1.987306   2-dehydropantoate 2-reductase
BURPS668_0937  2       11       1.273883   hypothetical protein
BURPS668_0938  1       11       3.002123   SAM-dependent methyltransferases
BURPS668_0939  2       13       2.871596   diguanylate phosphodiesterase
BURPS668_0940  2       13       3.658559   hypothetical protein
BURPS668_0941  1       13       3.501707   DHA2 family drug:H+ antiporter-1
BURPS668_0943  3       16       4.090952   hypothetical protein
BURPS668_0942  3       14       3.224427   citrate synthase
BURPS668_0944  3       23       1.302251   hypothetical protein
BURPS668_0945  6       36       2.243220   hypothetical protein
BURPS668_0947  7       36       2.235858   hypothetical protein
BURPS668_0946  3       24       0.584392   GntR family transcriptional regulator
BURPS668_0948  4       24       0.787945   aldo/keto reductase
BURPS668_0949  5       25       1.171350   hypothetical protein
BURPS668_0950  2       21       -0.281988  hypothetical protein
BURPS668_0951  0       11       -2.066226  hypothetical protein
BURPS668_0952  -1      9        -1.301746  elongation factor G
BURPS668_0953  0       9        -1.322981  hypothetical protein
BURPS668_0954  0       8        -2.077488  RNA pseudouridine synthase
BURPS668_0956  -1      8        -3.234270  isocitrate dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0941  TCRTETB       138    3e-38    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 138 bits (349), Expect = 3e-38
Identities = 92/408 (22%), Positives = 171/408 (41%), Gaps = 15/408 (3%)

Query: 17 VMLWLVATGFFMQTLDATIVNTALPSMAASLGESPLRMQSVVIAYSLTMAVMIPVSGWLA 76
+++WL FF L+ ++N +LP +A + P V A+ LT ++ V G L+
Sbjct: 15 ILIWLCILSFF-SVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLS 73

Query: 77 DTLGTRRVFFSAILIFTLGSLLCANAHT-LPLLVAFRVIQGVGGAMLLPVGRLAVLRTFP 135
D LG +R+ I+I GS++ H+ LL+ R IQG G A + + V R P
Sbjct: 74 DQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIP 133

Query: 136 AERYLPALSFVAIPGLIGPLIGPTLGGWLVKIASWHWIFLINVPVGIAGCIATFYSMPDS 195
E A + +G +GP +GG + HW +L+ +P+ + +
Sbjct: 134 KENRGKAFGLIGSIVAMGEGVGPAIGGMIAH--YIHWSYLLLIPMITIITVPFLMKLLKK 191

Query: 196 RNPAAGRFDLKGYLLLTIGMIAISLSLDGLADLGMQHAMVLVLLILSLACFVAYGLYAVR 255
G FD+KG +L+++G++ L L++ +LS FV + +
Sbjct: 192 EVRIKGHFDIKGIILMSVGIVFFMLFTT------SYSISFLIVSVLSFLIFVKHIR---K 242

Query: 256 APQPIFSLELFGIHTFSVGLLGNLFARIGSGAMPYLIPLLLQVSLGYGAFEAG-LMMLPV 314
P L F +G+L ++P +++ E G +++ P
Sbjct: 243 VTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPG 302

Query: 315 AAAGMFSKRIITVLITRHGYRKVLLANTIMVGLMMASFALVSDAMPTWLKIAQLALFGGF 374
+ + I +L+ R G VL + + + + + + ++ I + + GG
Sbjct: 303 TMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL 362

Query: 375 NSMQFTAMNTLTLKDLGTGGASSGNSLFSLVQMLSMSLGVTVAGALLA 422
+ + T ++T+ L A +G SL + LS G+ + G LL+
Sbjct: 363 SFTK-TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0942  PF03544       29     0.029    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 29.2 bits (65), Expect = 0.029
Identities = 16/84 (19%), Positives = 27/84 (32%), Gaps = 5/84 (5%)

Query: 154 PAAAPDVSDVSDLPDAPPDAGRAAPPADAAPARHRAPPDPPDPPAPAAELDARRRDAQLD 213
P A + P PP +A + P P P P +++ +RD +
Sbjct: 62 PPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPV 121

Query: 214 EAAMLLRVAAAALAARAPSAEPVH 237
E+ A+ AP+
Sbjct: 122 ESR-----PASPFENTAPARPTSS 140


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0949  RTXTOXINA     28     0.026    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.
Length = 1024

Score = 28.4 bits (63), Expect = 0.026
Identities = 12/52 (23%), Positives = 26/52 (50%)

Query: 11 AAAIAVVTVTAMAAAPVAAAAAATVTAAAMVAATATAVAVATAVVAATAPAM 62
A+ + TV A ++ ++AAA ++ A + A + + ++ A+ AM
Sbjct: 366 ASLTTISTVLASVSSGISAAATTSLVGAPVSALVGAVTGIISGILEASKQAM 417


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0952  TCRTETOQM     629    0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.
Length = 639

Score = 629 bits (1623), Expect = 0.0
Identities = 172/683 (25%), Positives = 295/683 (43%), Gaps = 75/683 (10%)

Query: 107 RYRNIGISAHIDAGKTTTTERILFYTGVSHKIGEVHDGAATMDWMEQEQERGITITSAAT 166
+ NIG+ AH+DAGKTT TE +L+ +G ++G V G D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 167 TAFWKGMAGNYPEHRINIIDTPGHVDFTIEVERSMRVLDGACMVYDSVGGVQPQSETVWR 226
+ W+ ++NIIDTPGH+DF EV RS+ VLDGA ++ + GVQ Q+ ++
Sbjct: 62 SFQWEN-------TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFH 114

Query: 227 QANKYKVPRIAFVNKMDRVGADFFRVQRQIGERLKGVAVPIQIPVGAEEHFQGVVDLVKM 286
K +P I F+NK+D+ G D V + I E+L V Q V M
Sbjct: 115 ALRKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQ----------KVELYPNM 164

Query: 287 KAIVWDDESQGVKFTYEDIPANLVELAHEWREKMVEAAAEASEELLEKYLTDHNSLTEDE 346
+ + Q + E +++LLEKY+ SL E
Sbjct: 165 CVTNFTESEQ------------------------WDTVIEGNDDLLEKYM-SGKSLEALE 199

Query: 347 IKAALRKRTIANEIVPMLCGSAFKNKGVQAMLDAVIDYLPSPADVPAILGHDLDDKEAER 406
++ R + P+ GSA N G+ +++ + + S
Sbjct: 200 LEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH---------------- 243

Query: 407 HPSDDEPFSALAFKIMTDPFVGQLIFFRVYSGVVESGDTLLNATKDKKERLGRILQMHAN 466
FKI +L + R+YSGV+ D++ + K+K ++ +
Sbjct: 244 --RGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-KITEMYTSING 300

Query: 467 ERKEIKEVRAGDIAAAVG--LK-EATTGDTLCDPGKPIILEKMEFPEPVISQAVEPKTKA 523
E +I + +G+I LK + GDT P + E++E P P++ VEP
Sbjct: 301 ELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQR----ERIENPLPLLQTTVEPSKPQ 356

Query: 524 DQEKMGLALNRLAQEDPSFRVQTDEESGQTIISGMGELHLEIIVDRMKREFGVEATVGKP 583
+E + AL ++ DP R D + + I+S +G++ +E+ ++ ++ VE + +P
Sbjct: 357 QREMLLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEP 416

Query: 584 QVAYRETVRTVAEDVEGKFVKQSGGRGQYGHAVIKLEPNP-GKGYEFLDEIKGGVIPREF 642
V Y E + E + + + + P P G G ++ + G + + F
Sbjct: 417 TVIYME---RPLKKAEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSF 473

Query: 643 IPAVNKGIEETLKSGVLAGYPVVDVKVHLTFGSYHDVDSNENAFRMAGSMAFKEAMRRAK 702
AV +GI + G L G+ V D K+ +G Y+ S FRM + ++ +++A
Sbjct: 474 QNAVMEGIRYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAG 532

Query: 703 PVLLEPMMAVEVETPEDFMGNVMGDLSSRRGIVQGMEDIAGGGGKLVRAEVPLAEMFGYS 762
LLEP ++ ++ P++++ D + + ++ E+P + Y
Sbjct: 533 TELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ--LKNNEVILSGEIPARCIQEYR 590

Query: 763 TSLRSATQGRATYTMEFKHYAET 785
+ L T GR+ E K Y T
Sbjct: 591 SDLTFFTNGRSVCLTELKGYHVT 613


16  BURPS668_1027  BURPS668_1037  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_1027  0       11       3.480469   hypothetical protein
BURPS668_1028  2       12       4.275843   TonB-dependent vitamin B12 receptor BtuB
BURPS668_1029  3       15       4.326451   cobalamin ABC transporter permease
BURPS668_1030  2       14       3.591966   cobalamin ABC transporter ATP-binding protein
BURPS668_1031  2       14       3.808008   nicotinate-nucleotide--dimethylbenzimidazole
BURPS668_1032  1       13       4.534411   cobalamin synthase
BURPS668_1033  0       14       4.498120   alpha-ribazole-5'-phosphate phosphatase
BURPS668_1034  0       12       4.067077   hypothetical protein
BURPS668_1035  -1      12       5.422636   cobalamin ABC transporter periplasmic
BURPS668_1036  -1      11       4.836479   threonine-phosphate decarboxylase
BURPS668_1037  -1      12       3.204744   cobalamin biosynthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1027  BACINVASINB   27     0.015    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 27.4 bits (60), Expect = 0.015
Identities = 22/89 (24%), Positives = 33/89 (37%), Gaps = 4/89 (4%)

Query: 23 QTERLALEEQVAQLRNEAQTLHAELEQLRDERNALAAERDTLSAKIDDAQVKLNAILEKL 82
Q + +E Q+ E QT E ++ D A + DT + D A KL KL
Sbjct: 112 QAMIESQKEMGIQVSKEFQTALGEAQEATDLYEASIKKTDTAKSVYDAATKKLTQAQNKL 171

Query: 83 ----PRTKNVPDAENQLDLLAPQANDEGE 107
P AE ++ +A + E
Sbjct: 172 QSLDPADPGYAQAEAAVEQAGKEATEAKE 200


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1028  SSBTLNINHBTR  30     0.015    Streptomyces subtilisin inhibitor signature.
		>SSBTLNINHBTR#Streptomyces subtilisin inhibitor signature.

Length = 144

Score = 29.8 bits (66), Expect = 0.015
Identities = 36/109 (33%), Positives = 49/109 (44%), Gaps = 8/109 (7%)

Query: 8 AALAALSGLPCIALAQGDASASSASFASSVS--YAPAAA--SPADADSALSTAPAAAAAS 63
A AA GL A+ A AS AS A++ + YAP+A + +SA + AP A
Sbjct: 5 ARWAATLGLTATAVCGPLAGASLASPATAPASLYAPSALVLTVGHGESAATAAPLRAVTL 64

Query: 64 PASGAARGAEAVSADAASAV--ASGASSASPARAASAAQL--APVVVTA 108
+ A G +A A + + A G SA A + APVVVT
Sbjct: 65 TCAPTASGTHPAAAAACAELRAAHGDPSALAAEDSVMCTREYAPVVVTV 113


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1035  FERRIBNDNGPP  40     7e-06    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 39.9 bits (93), Expect = 7e-06
Identities = 37/179 (20%), Positives = 66/179 (36%), Gaps = 9/179 (5%)

Query: 49 ARRVVSLAPHVTELIYAAG----GGAKLVGAVSYSDYPPAAKAIARVGSNKALDLERIAA 104
R+V+L EL+ A G G A + + PP ++ VG +LE +
Sbjct: 35 PNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTE 94

Query: 105 LKPDLIVVWRHGNAEHETERLRALGIPLYFSEPRH-LDDVAASLDKLGLLLGTHEIASAA 163
+KP +V E A G FS+ + L SL ++ LL A
Sbjct: 95 MKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNLQSAAETH 154

Query: 164 ADAYRRRIAQLRARYADK--PPVTVFFQAWDKPLITLNGDH-IVSDVIALCGGRNVFAR 219
Y I ++ R+ + P+ + D + + G + + +++ G N +
Sbjct: 155 LAQYEDFIRSMKPRFVKRGARPL-LLTTLIDPRHMLVFGPNSLFQEILDEYGIPNAWQG 212


17  BURPS668_1108  BURPS668_1135  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_1108  5       24       0.299753   hypothetical protein
BURPS668_1109  11      32       0.931342   hypothetical protein
BURPS668_1110  15      35       1.772321   hypothetical protein
BURPS668_1111  12      37       -0.133519  hypothetical protein
BURPS668_1112  8       31       -0.590733  hypothetical protein
BURPS668_1114  6       20       0.192795   hypothetical protein
BURPS668_1113  5       20       -0.145798  hypothetical protein
BURPS668_1115  5       19       -0.749773  lipoprotein
BURPS668_1116  3       23       -2.330737  hypothetical protein
BURPS668_1117  9       27       -5.702792  ecotin
BURPS668_1118  9       28       -6.319403  D-alanyl-D-alanine carboxypeptidase
BURPS668_1119  10      30       -7.048219  hypothetical protein
BURPS668_1120  10      31       -7.070046  hypothetical protein
BURPS668_1121  10      31       -7.255144  hypothetical protein
BURPS668_1122  10      30       -7.045181  cell surface protein
BURPS668_1123  3       32       -6.891683  hypothetical protein
BURPS668_1125  3       31       -5.583904  hemolysin activation/secretion protein
BURPS668_1127  1       31       -3.881415  *hypothetical protein
BURPS668_1128  2       26       -2.993483  hypothetical protein
BURPS668_1129  2       27       -3.408690  hypothetical protein
BURPS668_1130  2       27       -3.151970  hypothetical protein
BURPS668_1131  0       15       -1.891885  translation initiation factor IF-1
BURPS668_1132  0       13       -1.512613  alpha/beta hydrolase
BURPS668_1133  2       10       -0.747040  hypothetical protein
BURPS668_1134  3       12       -0.417691  rubredoxin
BURPS668_1135  2       12       -0.091283  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1115  cloacin       30     0.003    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 29.7 bits (66), Expect = 0.003
Identities = 19/59 (32%), Positives = 23/59 (38%), Gaps = 1/59 (1%)

Query: 49 GTVNVWGGDGWRDRDHWRGGDDRWHGGWRGGGNWRGGNDWHGGRGNGWQGGRGPAGGRN 107
G + G G D W ++ W GG G +W GG HG G G G G N
Sbjct: 23 GPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHW-GGGSGHGNGGGNGNSGGGSGTGGN 80


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1122  PF05860       68     4e-15    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 67.5 bits (165), Expect = 4e-15
Identities = 23/138 (16%), Positives = 51/138 (36%), Gaps = 23/138 (16%)

Query: 82 AQVVG-AGANAPSVIQTQNGLQQVNITKPSGAGVSLNTYSQFDVPKQGVIVNNSPTLTNT 140
AQ+ S I T+ + + +G+ + + + +F VP G N+PT
Sbjct: 1 AQITPDTTLPINSNITTEGNTRIIERGTQAGSNLFHS-FQEFSVPTSGTAFFNNPT---- 55

Query: 141 QQAGYINGNPNLGPNGSAKIIINQVNSNNPSQLKGFVEIAGQRAEMIISNPSGLVVDGGG 200
+ + II++V + S + G + A + + NP+G++
Sbjct: 56 ----------------NIQNIISRVTGGSVSNIDGLIRANAT-ANLFLINPNGIIFGQNA 98

Query: 201 FINTSRAILTTGTPSLNA 218
++ + + + L
Sbjct: 99 RLDIGGSFVGSTANRLKF 116


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1125  IGASERPTASE   31     0.017    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 31.2 bits (70), Expect = 0.017
Identities = 13/76 (17%), Positives = 28/76 (36%), Gaps = 2/76 (2%)

Query: 26 PSPADQAAAARANAEQDRQAQQQRDAQQRDAAVRAPSVRSEVPKVEAYPALPAEAPCFRI 85
P+PA + AE +Q + + ++DA + ++ EA + A +
Sbjct: 1028 PAPATPSETTETVAENSKQESKTVEKNEQDAT--ETTAQNREVAKEAKSNVKANTQTNEV 1085

Query: 86 DRFTLDVPNSLPDTTK 101
+ + + TK
Sbjct: 1086 AQSGSETKETQTTETK 1101


18  BURPS668_1180  BURPS668_1190  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_1180  0       13       3.942843   hypothetical protein
BURPS668_1181  0       13       4.075803   glycosyl hydrolase
BURPS668_1182  2       16       4.566336   hypothetical protein
BURPS668_1183  2       13       4.613919   hypothetical protein
BURPS668_1184  3       14       5.198040   hypothetical protein
BURPS668_1185  3       13       5.172066   DNA translocase FtsK
BURPS668_1186  3       15       3.817186   hypothetical protein
BURPS668_1187  3       17       4.263833   hypothetical protein
BURPS668_1188  1       11       2.826944   phosphoribosylglycinamide formyltransferase 2
BURPS668_1189  2       15       2.210602   lipoprotein
BURPS668_1190  1       11       3.162437   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1185  IGASERPTASE   44     6e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 44.3 bits (104), Expect = 6e-06
Identities = 51/355 (14%), Positives = 97/355 (27%), Gaps = 34/355 (9%)

Query: 379 RAQARPAAPDPRFAPRRPATQAAVSAARNRPMTFTPSRQTTGSTPPQPAPRAQTAAPTAE 438
+ R D QA V + + PP PA ++T AE
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARV-DEAPVPPPAPATPSETTETVAE 1042

Query: 439 TARKRAPANPARAPLYAWHEKPAEHIAPAASVHETLRSIEASAAQWTALAGATSTAATPV 498
+++ + + E A V + +S + Q +A + S
Sbjct: 1043 NSKQESKTVEKN------EQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQ 1096

Query: 499 TARESMAAPAAPSGGAAASAARDGRAPTSAETAAPDGHAPTS----AKTVAPDGHVPTSA 554
T A A + P +P + A+ +
Sbjct: 1097 TTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIK 1156

Query: 555 ETAAPDGHVPTS---AETVAPDGHAPTSAETAAPDGHAP------TSAETAAPNDHASTS 605
E + + A+ + + P + T G++ T+ T P ++ +S
Sbjct: 1157 EPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESS 1216

Query: 606 -------AETAAPDGHVPTSAETAAPDGHVSA---TVETSAVAAPAGITQAAPPIAADTC 655
+ H A T++ D A T+ A + A +A +
Sbjct: 1217 NKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVG 1276

Query: 656 PAGEHVIAAVEPAGTSDSAA----IGAGAIAHAEAGAAASTAETASPIGVDTHIA 706
A I+ +E + S+ T + +G D I+
Sbjct: 1277 KAVSQHISQLEMNNEGQYNVWVSNTSMNKNYSSSQYRRFSSKSTQTQLGWDQTIS 1331



Score = 40.8 bits (95), Expect = 7e-05
Identities = 52/315 (16%), Positives = 93/315 (29%), Gaps = 51/315 (16%)

Query: 690 ASTAETASPIGVDTHIAPSREADRTA-QTAPTAPSPAEATPHVDAPHALDVAARALVGNT 748
+ T + I D PS + AP P PA ATP +
Sbjct: 994 TTNITTPNNIQADVPSVPSNNEEIARVDEAPVPP-PAPATP----------SETTET-VA 1041

Query: 749 AATAHGAAAVDGSAQRADTASPAASTSGPPAPVAASAASSDRAAPQPVATAAPASIATSG 808
+ + V+ + Q A + VA A S+ +A Q +
Sbjct: 1042 ENSKQESKTVEKNEQDATETTAQNRE------VAKEAKSNVKANTQ-------TNEVAQS 1088

Query: 809 ALGTMKASGTAGPQPSTIAAQRASAIDDTGQPPSTGHSTHAAVSNELGRRPHAAPDAVTP 868
T + T + +T+ + + ++ ++ + E + V P
Sbjct: 1089 GSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQE-------QSETVQP 1141

Query: 869 VLPPAAAAVPTNASAVQRQALASESAEAAQGVARAAAAGDSRETTQVSPAGARPDGAAPS 928
PA PT + + Q+ + +A+ Q ++ + T S
Sbjct: 1142 QAEPARENDPT-VNIKEPQSQTNTTADTEQPAKETSSNVEQPVTES------TTVNTGNS 1194

Query: 929 AAVGNPIAPLPDASAITAHEDAPT--------SAAPDAATPVIAAMDSAMPNAVAPASAI 980
NP P + T + ++ S A S + VA
Sbjct: 1195 VVE-NPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLT 1253

Query: 981 A--SNAGMSPASASA 993
+ +NA +S A A A
Sbjct: 1254 STNTNAVLSDARAKA 1268



Score = 31.6 bits (71), Expect = 0.039
Identities = 40/283 (14%), Positives = 70/283 (24%), Gaps = 41/283 (14%)

Query: 300 PPPASAMPAPTIAAAKPAAATMPPSGLSKAERLAAPTGGAAAPLAAPAAAVTSPAAFAPA 359
PPPA A P+ T + + + T A A A
Sbjct: 1026 PPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNR--EVAKEAKSNVKAN--TQ 1081

Query: 360 ATGIAKPIGSTAAVAALGKR-----AQARPAAPDPRFAPRRPATQAAVSAARNRPMTFTP 414
+A+ T + + A + P + VS + + T P
Sbjct: 1082 TNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQP 1141

Query: 415 SRQTTGSTPPQPAPRAQTAAPTAETARKRAPANPARAPLYAWHEKPAEHIAPAASVHETL 474
+ + P P ++T PA+ ET
Sbjct: 1142 QAEPA----RENDPTVNIKEPQSQTNTTADTEQPAK---------------------ETS 1176

Query: 475 RSIEASAAQWTALAGATSTAATPVTARESMAAPAAPSGGAAASAARDGRA-------PTS 527
++E + T + S P + P S + R R+
Sbjct: 1177 SNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEP 1236

Query: 528 AETAAPDGHAPTSAKTVAPDGHVPTSAETAAPDGHVPTSAETV 570
A T++ D + + + S A + V
Sbjct: 1237 ATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAV 1279


19  BURPS668_1246  BURPS668_1259  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_1246  -1      11       3.715760   KDP operon transcriptional regulatory protein
BURPS668_1247  0       13       3.716017   hypothetical protein
BURPS668_1248  -1      12       3.420149   hypothetical protein
BURPS668_1249  -1      12       3.480050   SMR family multidrug efflux pump
BURPS668_1250  -1      11       3.476648   hypothetical protein
BURPS668_1251  0       12       3.876846   ABC transporter permease
BURPS668_1252  0       12       2.848271   hypothetical protein
BURPS668_1253  -1      10       3.282656   hypothetical protein
BURPS668_1254  -1      11       3.775952   hypothetical protein
BURPS668_1255  0       11       4.042445   threonyl/alanyl tRNA synthetase
BURPS668_1256  -2      9        3.952939   hypothetical protein
BURPS668_1257  -1      11       4.064814   amidase
BURPS668_1258  -2      10       4.052674   major facilitator family transporter
BURPS668_1259  -3      14       3.570049   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1246  HTHFIS        89     1e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.1 bits (221), Expect = 1e-22
Identities = 39/157 (24%), Positives = 71/157 (45%), Gaps = 3/157 (1%)

Query: 7 TVVLIEDEKQIRRFVRSALEEEGIAVFDAETGRQGLIEAATRKPDLAIVDLGLPDGDGLD 66
T+++ +D+ IR + AL G V A DL + D+ +PD + D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 67 VIRELR-GWSEMPVIVLSARTHEEEKVAALDAGADDYLTKPFGVSELLARIRAHL--RRR 123
++ ++ ++PV+V+SA+ + A + GA DYL KPF ++EL+ I L +R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 124 NQAGAAESPVVRFGDVSVDLALRRVWRGGEVVHLTPL 160
+ + V A++ ++R + T L
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDL 161


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1258  TCRTETA       35     5e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 35.2 bits (81), Expect = 5e-04
Identities = 30/101 (29%), Positives = 47/101 (46%), Gaps = 9/101 (8%)

Query: 130 ALVIGAYADRAGRKPAMTLTLAMMAVGTGAIAVLPGYETIGVAAPILLVVTRLIQGLAWG 189
A V+GA +DR GR+P + ++LA AV +A P +L + R++ G+ G
Sbjct: 60 APVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLW--------VLYIGRIVAGIT-G 110

Query: 190 GEAGPATTYILEAAPPERRAAYACWQVATQGFAAVAAGLAG 230
A YI + + RA + + A GF VA + G
Sbjct: 111 ATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLG 151


20  BURPS668_1287  BURPS668_1302  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_1287  0       12       -3.499832  triosephosphate isomerase
BURPS668_1288  -1      13       -5.515280  preprotein translocase subunit SecG
BURPS668_1290  -1      12       -4.983578  *NADH dehydrogenase subunit A
BURPS668_1291  -1      13       -2.704396  NADH dehydrogenase subunit B
BURPS668_1292  -1      15       -3.029748  NADH dehydrogenase subunit C
BURPS668_1293  -1      15       -3.109800  NADH dehydrogenase subunit D
BURPS668_1294  0       15       -2.551466  NADH dehydrogenase subunit E
BURPS668_1295  0       15       -2.609758  NADH-quinone oxidoreductase subunit F
BURPS668_1296  1       16       -3.308581  NADH dehydrogenase subunit G
BURPS668_1297  1       19       -5.206406  NADH dehydrogenase subunit H
BURPS668_1298  1       18       -4.657513  NADH dehydrogenase subunit I
BURPS668_1299  1       18       -4.585311  NADH dehydrogenase subunit J
BURPS668_1300  1       17       -4.777439  NADH dehydrogenase subunit K
BURPS668_1301  0       17       -4.166465  NADH dehydrogenase subunit L
BURPS668_1302  -2      14       -3.456041  NADH dehydrogenase subunit M
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1288  SECGEXPORT    83     8e-24    Protein-export SecG membrane protein signature.
		>SECGEXPORT#Protein-export SecG membrane protein signature.

Length = 110

Score = 82.7 bits (204), Expect = 8e-24
Identities = 46/102 (45%), Positives = 68/102 (66%), Gaps = 1/102 (0%)

Query: 8 IIVVQLLSALGVIGLVLLQHGKGADMGAAFGSGASGSLFGATGSANFLSRTTAVLATIFF 67
++VV L+ A+G++GL++LQ GKGADMGA+FG+GAS +LFG++GS NF++R TA+LAT+FF
Sbjct: 5 LLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLFGSSGSGNFMTRMTALLATLFF 64

Query: 68 VATLALTYLGSYKSAPSVGVLGAAPAPAASAPAASQTPAASA 109
+ +L L + S K+ APA + PA
Sbjct: 65 IISLVLGNINSNKTNKGSEWEN-LSAPAKTEQTQPAAPAKPT 105


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1297  OUTRMMBRANEA  30     0.013    Outer membrane protein A signature.
		>OUTRMMBRANEA#Outer membrane protein A signature.

Length = 346

Score = 29.9 bits (67), Expect = 0.013
Identities = 16/96 (16%), Positives = 28/96 (29%), Gaps = 10/96 (10%)

Query: 138 YAVILAGWASNSKYAFLGAMR-------AAAQMVSYEISMGFALVLVLMTAGSLNLSEIV 190
Y GW+ F+ A Y+++ + G +
Sbjct: 29 YTGAKLGWSQYHDTGFINNNGPTHENQLGAGAFGGYQVNPYVGFEMGYDWLGRMPY---K 85

Query: 191 GSQQHGFFAGHGVNFLSWNWLPLLPVFVIYFISGIA 226
GS ++G + GV + P+ IY G
Sbjct: 86 GSVENGAYKAQGVQLTAKLGYPITDDLDIYTRLGGM 121


21  BURPS668_1325  BURPS668_1339  Y  N  Y  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_1325  -1      16       -3.459367  DNA-binding response regulator
BURPS668_1327  -2      15       -3.601451  hypothetical protein
BURPS668_1326  -1      13       -3.173038  hypothetical protein
BURPS668_1333  1       15       -3.366560  **hypothetical protein
BURPS668_1334  2       15       -3.606858  Xaa-Pro aminopeptidase
BURPS668_1335  2       15       -3.024898  fatty acid desaturase
BURPS668_1336  2       18       -2.414315  AraC family transcriptional regulator
BURPS668_1338  4       21       -1.502894  D-3-phosphoglycerate dehydrogenase
BURPS668_1337  6       29       -1.061250  hypothetical protein
BURPS668_1339  6       31       -0.082423  hypothetical protein
22  BURPS668_1358  BURPS668_1387  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_1358  2       14       -0.051361  antioxidant, AhpC/TSA family protein
BURPS668_1359  2       14       0.425841   hypothetical protein
BURPS668_1360  2       13       0.511532   hypothetical protein
BURPS668_1361  2       12       0.477876   hypothetical protein
BURPS668_1362  2       13       -0.146748  multidrug resistance protein MdtC
BURPS668_1363  0       10       -0.809301  multidrug resistance protein MdtB
BURPS668_1364  0       11       -0.395279  membrane fusion protein MdtA
BURPS668_1365  4       14       -1.640106  hypothetical protein
BURPS668_1366  2       14       -0.602004  IclR family transcriptional regulator
BURPS668_1367  3       13       -0.547357  hypothetical protein
BURPS668_1368  2       13       -0.225151  hypothetical protein
BURPS668_1369  0       13       1.303110   hypothetical protein
BURPS668_1370  1       17       1.215350   hypothetical protein
BURPS668_1371  -2      14       2.414728   lipoprotein
BURPS668_1372  0       11       3.277122   transcriptional regulator
BURPS668_1373  -1      11       2.846418   hypothetical protein
BURPS668_1374  -1      11       3.307078   cysteine dioxygenase, type I
BURPS668_1375  -1      13       2.541754   AsnC family transcriptional regulator
BURPS668_1376  -1      14       2.693145   iron ABC transporter ATP-binding protein
BURPS668_1377  -1      14       2.158153   iron ABC transporter permease
BURPS668_1378  -1      13       -0.723923  iron ABC transporter substrate-binding protein
BURPS668_1379  -3      15       -0.541774  hypothetical protein
BURPS668_1380  -1      12       2.297779   hypothetical protein
BURPS668_1381  0       11       3.142500   amino acid transporter
BURPS668_1382  -1      9        4.307239   hypothetical protein
BURPS668_1383  -1      9        4.293951   hypothetical protein
BURPS668_1384  0       9        4.960075   hypothetical protein
BURPS668_1385  0       8        4.881419   exodeoxyribonuclease V subunit gamma
BURPS668_1386  0       9        4.244665   exodeoxyribonuclease V subunit beta
BURPS668_1387  1       9        3.716823   exodeoxyribonuclease V subunit alpha
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1359  PF01540       29     0.015    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 28.9 bits (64), Expect = 0.015
Identities = 26/84 (30%), Positives = 38/84 (45%), Gaps = 3/84 (3%)

Query: 13 RTGRALADLLLKQQDFEVTALVRRPDFA--LPGAKVVVADLTGDFSSAFN-GITHAIYAA 69
+ G+ AD LKQ + L + PD++ L +A+ T F A + G AI +
Sbjct: 35 KNGKEKADAALKQANALAEELKKNPDYSKILETLNKEIAEATKSFKEAGSYGDYPAIISK 94

Query: 70 GSAESEGATEEEQIDRDAVARAAD 93
SA E A E+Q A + AD
Sbjct: 95 LSAAVENAKSEQQKVDQANKKIAD 118


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1362  ACRIFLAVINRP  745    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 745 bits (1925), Expect = 0.0
Identities = 279/1104 (25%), Positives = 502/1104 (45%), Gaps = 100/1104 (9%)

Query: 3 LARPFITRPVATTLLALGIALAGLFAFVKLPVSPLPQVDFPTILVQASLPGASPETVATS 62
+A FI RP+ +LA+ + +AG A ++LPV+ P + P + V A+ PGA +TV +
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 63 VTSPLERHLGSIADVAEMTSMS-SVGNARIVLQFNLNRDIDGAARDVQAAINAARADLPA 121
VT +E+++ I ++ M+S S S G+ I L F D D A VQ + A LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 122 SLKSNPTYRKVNPADSPIMVVSLTS--KTASPAKLYDAASTVLQQSLSQIDGIGQVSLSG 179
++ + S +MV S + + D ++ ++ +LS+++G+G V L G
Sbjct: 121 EVQ-QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG 179

Query: 180 SANPAVRVELEPQALFHYGIGLEDVRAALASANANSPKGAIEAGP------HRYQLYTND 233
+ A+R+ L+ L Y + DV L N G + P +
Sbjct: 180 AQY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQT 238

Query: 234 QATKAAQYKDLVI-AYRNHAAVSLSDVSSVVDSVEDLRNLGLMNGERAVLVILYRSPGAN 292
+ ++ + + + + V L DV+ V E+ + +NG+ A + + + GAN
Sbjct: 239 RFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGAN 298

Query: 293 IIDTIERVKAALPQLTAALPADIQVTPVLDRSRTIRASLADTEHTLIIAVSLVVMVVFLF 352
+DT + +KA L +L P ++V D + ++ S+ + TL A+ LV +V++LF
Sbjct: 299 ALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLF 358

Query: 353 LRNWRATLIPSVAVPISIVGTFGAMYLLGFSLNNLSLMALIVATGFVVDDAIVVLENIAR 412
L+N RATLIP++AVP+ ++GTF + G+S+N L++ +++A G +VDDAIVV+EN+ R
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVER 418

Query: 413 HI-ENGTPRLQAAFDGAREVGFTVLSISLSLVAVFLPILLMGGIVGRLFREFALTLSLAI 471
+ E+ P +A ++ ++ I++ L AVF+P+ GG G ++R+F++T+ A+
Sbjct: 419 VMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAM 478

Query: 472 AVSLVVSLTLTPMMCARLLPEAHAPRDE--GRVARWLERGFEWMQRGYERTLSWALRHPF 529
A+S++V+L LTP +CA LL A E G W F+ Y ++ L
Sbjct: 479 ALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTG 538

Query: 530 TILMTLVATIALNIALYIVVPKGFFPQQDTGLMIGGIQADQTTSFQAMKLRFTEMMRIIR 589
L+ +A + L++ +P F P++D G+ + IQ + + + ++
Sbjct: 539 RYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYL 598

Query: 590 ANP-----NVANVAGFT-GGAQTNSGFMFVALKDKPQR---KLSADQVIQQLRPQLAEVA 640
N +V V GF+ G N+G FV+LK +R + SA+ VI + + +L ++
Sbjct: 599 KNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR 658

Query: 641 GARTFLQAAQDIRAGGRQSNAQYQFT-LLGDSTAELYKWGP-ILTEALQKRPELADVNSD 698
I G + ++ G L + +L A Q L V +
Sbjct: 659 DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 699 QQQGGLEAMVTIDRATAARLGIKPAQIDNTLYDAFGQRQVSTIYNPLNQYHVVMEVAPQY 758
+ + + +D+ A LG+ + I+ T+ A G V+ + + ++ ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 759 WQSPEMLKQIYISTSGGSASGVQTTNAAAGTYVATTARASTAGAAAQSAAAIAADSARNQ 818
PE + ++Y+ ++ G V + +V + R R
Sbjct: 779 RMLPEDVDKLYVRSANGEM--VPFSAFTTSHWVYGSPRLE-----------------RYN 819

Query: 819 ALNSIASSG--KSSASSGAAVSTSKSTMVPLSAIASFGPSTTPLAVNHQGLFVATTISFN 876
L S+ G SSG A++ ++
Sbjct: 820 GLPSMEIQGEAAPGTSSGDAMALMENLAS------------------------------K 849

Query: 877 LPPGVSLSKATQVIYQTMAEVGVPPTIQGSFQGTAQAFQESLKDQPILILAALAAVYIVL 936
LP G+ G + P L+ + V++ L
Sbjct: 850 LPAGIGY------------------DWTGMSYQERLSG----NQAPALVAISFVVVFLCL 887

Query: 937 GILYESYIHPVTILSTLPSAGVGALLGLLLFKTEFSIIALIGVILLIGIVKKNAIMMVDF 996
LYES+ PV+++ +P VG LL LF + + ++G++ IG+ KNAI++V+F
Sbjct: 888 AALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEF 947

Query: 997 AIDA-SRQGKSSFDAIHEACLLRFRPIMMTTMAALLGALPLAFGRGDGAEMRAPLGIAIA 1055
A D ++GK +A A +R RPI+MT++A +LG LPLA G G+ + +GI +
Sbjct: 948 AKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVM 1007

Query: 1056 GGLIVSQMLTLYTTPVVYLYMDRL 1079
GG++ + +L ++ PV ++ + R
Sbjct: 1008 GGMVSATLLAIFFVPVFFVVIRRC 1031



Score = 96.1 bits (239), Expect = 4e-22
Identities = 83/503 (16%), Positives = 167/503 (33%), Gaps = 25/503 (4%)

Query: 2 NLARPFITRPVATTLLALGIALAGLFAFVKLPVSPLPQVDFPTILVQASLP-GASPETVA 60
N + L+ I + F++LP S LP+ D L LP GA+ E
Sbjct: 528 NSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQ 587

Query: 61 TSVTSPLERHLGSIAD----VAEMTSMSSVGNAR----IVLQFNLNRDIDGAARDVQAAI 112
+ + +L + V + S G A+ + + +G +A I
Sbjct: 588 KVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVI 647

Query: 113 NAARADLPASLKSNPTYRKVNPADSPIMVVSLTSKT-----ASPAKLYDAASTVLQQSLS 167
+ A+ +L + + L A + +L +
Sbjct: 648 HRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQ 707

Query: 168 QIDGIGQVSLSGSAN-PAVRVELEPQALFHYGIGLEDVRAALASANANSPKGAIEAGPHR 226
+ V +G + ++E++ + G+ L D+ +++A +
Sbjct: 708 HPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRV 767

Query: 227 YQLYT---NDQATKAAQYKDLVIAYRNHAAVSLSDVSSVVDSVEDLRNLGLMNGERAVLV 283
+LY L + N V S ++ V L NG ++ +
Sbjct: 768 KKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHW-VYGSPRLERYNGLPSMEI 826

Query: 284 ILYRSPGANIIDTIERVKAALPQLTAALPADIQVTPVLDRSRTIRASLADTEHTLIIAVS 343
+PG + D A + L + LPA I S R S + I+
Sbjct: 827 QGEAAPGTSSGD----AMALMENLASKLPAGIGYD-WTGMSYQERLSGNQAPALVAISFV 881

Query: 344 LVVMVVFLFLRNWRATLIPSVAVPISIVGTFGAMYLLGFSLNNLSLMALIVATGFVVDDA 403
+V + + +W + + VP+ IVG A L + ++ L+ G +A
Sbjct: 882 VVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNA 941

Query: 404 IVVLENI-ARHIENGTPRLQAAFDGAREVGFTVLSISLSLVAVFLPILLMGGIVGRLFRE 462
I+++E + G ++A R +L SL+ + LP+ + G
Sbjct: 942 ILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNA 1001

Query: 463 FALTLSLAIAVSLVVSLTLTPMM 485
+ + + + ++++ P+
Sbjct: 1002 VGIGVMGGMVSATLLAIFFVPVF 1024



Score = 59.9 bits (145), Expect = 5e-11
Identities = 37/225 (16%), Positives = 84/225 (37%), Gaps = 4/225 (1%)

Query: 870 ATTISFNLPPGVSLSKATQVIYQTMAEV--GVPPTIQGS-FQGTAQAFQESLKDQPILIL 926
A + L G + + I +AE+ P ++ T Q S+ + +
Sbjct: 286 AAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLF 345

Query: 927 AALAAVYIVLGILYESYIHPVTILSTLPSAGVGALLGLLLFKTEFSIIALIGVILLIGIV 986
A+ V++V+ + ++ + +P +G L F + + + G++L IG++
Sbjct: 346 EAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLL 405

Query: 987 KKNAIMMVDFAIDASRQGKSSF-DAIHEACLLRFRPIMMTTMAALLGALPLAFGRGDGAE 1045
+AI++V+ + K +A ++ ++ M +P+AF G
Sbjct: 406 VDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGA 465

Query: 1046 MRAPLGIAIAGGLIVSQMLTLYTTPVVYLYMDRLRVWAEKRRDRR 1090
+ I I + +S ++ L TP + + +
Sbjct: 466 IYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGG 510


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1363  ACRIFLAVINRP  800    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 800 bits (2067), Expect = 0.0
Identities = 283/1035 (27%), Positives = 498/1035 (48%), Gaps = 31/1035 (2%)

Query: 4 SRVFILRPVGTALLMAAIMLAGLVALRFLPLAALPEVDYPTIQVQTFYPGASPEVMTSSV 63
+ FI RP+ +L +M+AG +A+ LP+A P + P + V YPGA + + +V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 64 TAPLERQFGQMPSLNQMSSQS-SAGASVITLQFSLDLPLDIAEQEVQAAINAAGNLLPSD 122
T +E+ + +L MSS S SAG+ ITL F DIA+ +VQ + A LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 123 LPAPPIYAKVNPADAPVITLAVTSKTLPLTQ--VQDLADTRLAMKISQVSGVGLVSLSGG 180
+ I + + ++ S TQ + D + + +S+++GVG V L G
Sbjct: 122 VQQQGIS-VEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 NRPAVRIQANPLALASYGLNLDDLRTTISNLNVNTPKGNFDGP------TRAYTINANDQ 234
A+RI + L Y L D+ + N G G +I A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 235 LTSADQYNDAVV-AYKNGRPVMLTDVAKIVAGSENTKLGAWVDAEPAIILNVQRQPGANV 293
+ +++ + +G V L DVA++ G EN + A ++ +PA L ++ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 294 IQTVDNVKAILPKLQESLPAALDVQIVTDRTTMIRAAVRDVQFELGLAVALVVLVMYLFL 353
+ T +KA L +LQ P + V D T ++ ++ +V L A+ LV LVMYLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 354 ANVYATIIPSLSVPLSLIGTLAVMYLSGFSLNNLSLMALTIATGFVVDDAIVMIENIARY 413
N+ AT+IP+++VP+ L+GT A++ G+S+N L++ + +A G +VDDAIV++EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 414 -VEEGDSALEAALKGSKQIGFTIISLTVSLIAVLIPLLFMGDVVGRLFHEFAITLAVTIV 472
+E+ EA K QI ++ + + L AV IP+ F G G ++ +F+IT+ +
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 473 ISAVVSLTLVPMMCAKLLRHTPPPESHRFEAKVHGLIERV----IERYGVALQWVLDRQR 528
+S +V+L L P +CA LL+ E H + G + Y ++ +L
Sbjct: 480 LSVLVALILTPALCATLLK-PVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTG 538

Query: 529 ATLVVAVLTLALTALLYVVIPKGFFPTQDTGVIQAITQAPQSVSYGAMAERQQALAAEIL 588
L++ L +A +L++ +P F P +D GV + Q P + + + L
Sbjct: 539 RYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYL 598

Query: 589 KH--PDVVSLTSFIGVDGANITLNSGRMLINLKPRDERS---ESASDVIRSLQRQVANVT 643
K+ +V S+ + G + N+G ++LKP +ER+ SA VI + ++ +
Sbjct: 599 KNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR 658

Query: 644 GISLYMQPVQDLTIDSTVSPTQYQFMLTS---PNPDEFATWVPKLVDRLKKEPS-LADVA 699
+ P I + T + F L D +L+ + P+ L V
Sbjct: 659 DGFVI--PFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVR 716

Query: 700 TDLQNSGKSVYIEIDRTSAARFGITPATVDNALYDAYGQRIVSTIFTQSNQYRVILESEP 759
+ +E+D+ A G++ + ++ + A G V+ + ++ ++++
Sbjct: 717 PNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADA 776

Query: 760 QMQHYTDSLNGIYLPSAGGGQVPLSAIATFRERPAPLLVSHLSQFPATTISFNLAAGASL 819
+ + + ++ +Y+ SA G VP SA T + + P+ I A G S
Sbjct: 777 KFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSS 836

Query: 820 GEAVKAIDAAERELGLPASFQTRFQGAALAFQASLSNQLFLILAAIVTMYIVLGVLYESY 879
G+A+ ++ +L PA + G + + S + L+ + V +++ L LYES+
Sbjct: 837 GDAMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESW 894

Query: 880 IHPITILSTLPSAGVGALLALMITGHDLDIIGIIGIVLLIGIVKKNAIMMIDFALEAERV 939
P++++ +P VG LLA + D+ ++G++ IG+ KNAI++++FA +
Sbjct: 895 SIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEK 954

Query: 940 EGKPPREAIYQACLLRFRPILMTTLAALLGAVPLIVGSGAGSELRQPLGIAIAGGLIVSQ 999
EGK EA A +R RPILMT+LA +LG +PL + +GAGS + +GI + GG++ +
Sbjct: 955 EGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSAT 1014

Query: 1000 VLTLFTTPVIYLGFD 1014
+L +F PV ++
Sbjct: 1015 LLAIFFVPVFFVVIR 1029


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
BURPS668_1364  RTXTOXIND  48     7e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 47.5 bits (113), Expect = 7e-08
Identities = 28/149 (18%), Positives = 58/149 (38%), Gaps = 16/149 (10%)

Query: 84 AVRGEMPVVLNALGTVTPLANV-TVRTQLSGYLQAVSFQEGQIVKKGDVLAQIDPRP--- 139
+V G++ +V A G +T ++ + ++ + +EG+ V+KGDVL ++
Sbjct: 75 SVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA 134

Query: 140 ----YQISLANAQGALARDEALLATARLDLKRYQTLVAQ---DSIAKQTADTQASLVKQY 192
Q SL A+ R + L + L+ L + +++++ SL+K+
Sbjct: 135 DTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKE- 193

Query: 193 EGTVQIDRAAIDSAKLNLAYARITAPVSG 221
Q + L + A
Sbjct: 194 ----QFSTWQNQKYQKELNLDKKRAERLT 218



Score = 38.7 bits (90), Expect = 4e-05
Identities = 33/182 (18%), Positives = 61/182 (33%), Gaps = 26/182 (14%)

Query: 141 QISLANAQGALARDEALLAT--ARLDLKRYQTLVAQDSIAKQTADTQASLVKQY-EGTVQ 197
+ ++ + L ++L+ + L A++ T + ++ + + T
Sbjct: 251 KHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDN 310

Query: 198 ID--RAAIDSAKLNLAYARITAPVSGRV-GLRQVDPGNYVTPSDT--------NGIVVIT 246
I + + + I APVS +V L+ G VT ++T + + V
Sbjct: 311 IGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTA 370

Query: 247 QLQPMSVIFTTSEDNLPAILKQVGAGGKLSVTAYNRNNTTPLETGV-LDTLDNQIDTATG 305
+Q + F AI+K V A+ L V LD D G
Sbjct: 371 LVQNKDIGFINVG--QNAIIK---------VEAFPYTRYGYLVGKVKNINLDAIEDQRLG 419

Query: 306 TV 307
V
Sbjct: 420 LV 421


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1373  NUCEPIMERASE  31     0.003    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 31.3 bits (71), Expect = 0.003
Identities = 23/123 (18%), Positives = 41/123 (33%), Gaps = 24/123 (19%)

Query: 33 LKIALFGATGMIGSRIAAEAARRGHQVTAL-------------SRNPAASGANVQAKAAD 79
+K + GA G IG ++ GHQV + +R + Q D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 80 LFD---PASIAAALEGQDVVASA------YGPKQEEASKVVAVAKAL--VDGARKAGVKR 128
L D + A+ + V S Y + A + L ++G R ++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 129 VVV 131
++
Sbjct: 121 LLY 123


23  BURPS668_1396  BURPS668_1404  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BURPS668_13963131.524361hypothetical protein
BURPS668_13970101.755115pentapeptide MXKDX repeat-containing protein
BURPS668_13982112.470233hypothetical protein
BURPS668_1399-1102.184318hypothetical protein
BURPS668_14001112.294311molybdopterin-binding oxidoreductase
BURPS668_14013123.712071hypothetical protein
BURPS668_14022113.563435hypothetical protein
BURPS668_14031113.571642RNA-binding protein
BURPS668_14043123.978346major facilitator family transporter
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
BURPS668_1399  GPOSANCHOR  30     0.014    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 30.4 bits (68), Expect = 0.014
Identities = 23/71 (32%), Positives = 27/71 (38%), Gaps = 2/71 (2%)

Query: 62 AAANATANATANATTDPTIDATADATTDAMANVTTDAATRVTANPPIHETANVPASTAAA 121
A+ + T +A P A T N T+ P ETAN P TAAA
Sbjct: 463 ASDSQTPDAKPGNKAVPGKGQAPQAGTKPNQNKAPMKETKRQL-PSTGETAN-PFFTAAA 520

Query: 122 ATTEAVAGARA 132
T A AG A
Sbjct: 521 LTVMATAGVAA 531


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_1404  TCRTETB  45     3e-07    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 45.3 bits (107), Expect = 3e-07
Identities = 56/337 (16%), Positives = 114/337 (33%), Gaps = 52/337 (15%)

Query: 49 PGSASWIGVVPTATQLGYATGMLVLAPLGDRFDRRTLILLQIAGLSAALVVAAAAPT--- 105
P S +W V TA L ++ G V L D+ + L+L I V+ +
Sbjct: 48 PASTNW---VNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFS 104

Query: 106 LGVLAAASLAIGILATIAQQAVPFAAEIAPPAARGQAVGTVMSGLLLGILLARTAAGFVA 165
L ++A G A A V A I P RG+A G + S + +G + G +A
Sbjct: 105 LLIMARFIQGAGAAAFPALVMVVVARYI-PKENRGKAFGLIGSIVAMGEGVGPAIGGMIA 163

Query: 166 EYFGWRAVFAVSVAALAALAAVIVARLPRSSPTSTLPYGK-------------------- 205
Y W + + + + + ++
Sbjct: 164 HYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSI 223

Query: 206 --LLASMWQLVRELRGLREAS------------------MTGGAIFAAFSAFWPVLTLLL 245
L+ S+ + ++ +R+ + + GG IF + F ++ ++
Sbjct: 224 SFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMM 283

Query: 246 AGAPFHLGPQAAG--LFGIVGAAGALAAPYAGRFADKRGPRAIISLAIALIAASFAIFA- 302
L G + + + G D+RGP ++++ + ++ SF +
Sbjct: 284 -KDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASF 342

Query: 303 LSGASLIGLVIGVIVLDVGVQAAQIS-NQSRIYALKP 338
L + + I ++ + G+ + + +LK
Sbjct: 343 LLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQ 379


24  BURPS668_1413  BURPS668_1440  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BURPS668_14132140.554176LacI family transcriptional regulator
BURPS668_1414017-0.390091hypothetical protein
BURPS668_1415-117-0.429401hypothetical protein
BURPS668_1416-313-0.795030hypothetical protein
BURPS668_1417-2110.239311hypothetical protein
BURPS668_1418-2100.260966diguanylate cyclase
BURPS668_1419-1103.420065hypothetical protein
BURPS668_1420-183.106914hypothetical protein
BURPS668_1421092.074892hypothetical protein
BURPS668_1422192.377261hypothetical protein
BURPS668_1423392.597046FMN-binding domain-containing protein
BURPS668_1424473.284056GntR family transcriptional regulator
BURPS668_1425392.192748AsnC family transcriptional regulator
BURPS668_14263102.041408hypothetical protein
BURPS668_1427183.148992glucosamine--fructose-6-phosphate
BURPS668_14280124.166196carotenoid 9,10-9',10' cleavage dioxygenase
BURPS668_14291134.011290hypothetical protein
BURPS668_1430-1113.039465LysR family transcriptional regulator
BURPS668_14310133.029013short chain dehydrogenase
BURPS668_1432-3133.091386lipoprotein
BURPS668_1433-1122.319817hypothetical protein
BURPS668_14341122.193472hypothetical protein
BURPS668_14351111.798620sulfite reductase subunit beta
BURPS668_1436-1100.802309hypothetical protein
BURPS668_14371101.297451GntR family transcriptional regulator
BURPS668_14382220.852962hypothetical protein
BURPS668_14395231.445726hypothetical protein
BURPS668_14405260.565631HSP20 family protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1431  DHBDHDRGNASE  67     3e-15    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 67.0 bits (163), Expect = 3e-15
Identities = 74/266 (27%), Positives = 119/266 (44%), Gaps = 19/266 (7%)

Query: 1 MADHSIKGKTVIIAGGAKNLGGLIARDLAAQGAQAVAIHYNSAASKGAAAETVAAIEAAG 60
M I+GK I G A+ +G +AR LA+QGA A+ YN + + V++++A
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLE----KVVSSLKAEA 56

Query: 61 ARAVALQADLTAAGAVEKLFVDTVAAIGRPDIAINTVGKVLKKPFVEITEAEYDEMAAVN 120
A A AD+ + A++++ +G DI +N G + +++ E++ +VN
Sbjct: 57 RHAEAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVN 116

Query: 121 SKTAFFFLKEAGRHVND--NGKIVTLVTSLLGAFTPFYAAYAGMKAPVEHFTRAAAKEFG 178
S F + +++ D +G IVT+ ++ G AAYA KA FT+ E
Sbjct: 117 STGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELA 176

Query: 179 ARGISVTAVGPGPMDTPFFYPAEGADAVAYHKTAAALSPFSKTGL--------TDIGDVV 230
I V PG +T + + A +L F KTG+ +DI D V
Sbjct: 177 EYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETF-KTGIPLKKLAKPSDIADAV 235

Query: 231 PFIRHLVSD-GWWITGQTILINGGYT 255
F LVS IT + ++GG T
Sbjct: 236 LF---LVSGQAGHITMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
BURPS668_1432  IGASERPTASE  49     4e-08    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 48.5 bits (115), Expect = 4e-08
Identities = 34/243 (13%), Positives = 67/243 (27%), Gaps = 1/243 (0%)

Query: 200 EPAETAEGAPMKLKTPAAPTPPAAPVPASSAAPGASASSAVAAPAAAGSGPAASAPAAPV 259
P+ + + A PPA P+ + A S + A A
Sbjct: 1007 VPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNR 1066

Query: 260 RHAAPAPASATAAASAPTAASAPAPTPASAPAPASTPAPASAPTPASAPTPTPASAPTPA 319
A A ++ A A + + T + A A T P
Sbjct: 1067 EVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVT 1126

Query: 320 SIPAPAPASAPASTPAPASAPAPAPTTSPASSIAPTAAPFASAIPPARAEKFAPAVTATT 379
S +P + P A PT + + T + P +
Sbjct: 1127 SQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTES 1186

Query: 380 AGSTSTPASAAAPSSPWLPPLLPPLLSPDAPSPPADTARTAPLAPAASPATAAAAATNAT 439
+ + P + P P ++ ++ + P + R + + + A ++ + +
Sbjct: 1187 TTVNTGNSVVENPENT-TPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRS 1245

Query: 440 ATA 442
A
Sbjct: 1246 TVA 1248



Score = 39.7 bits (92), Expect = 3e-05
Identities = 32/208 (15%), Positives = 57/208 (27%), Gaps = 10/208 (4%)

Query: 178 DPTRRDKAAVKAAEKERVAPLPEPAETAEGAPMKLKTPAAPTPPAAPVPASSAAPGASAS 237
+ T +++ K A K V + E A+ +T T A V A +
Sbjct: 1060 ETTAQNREVAKEA-KSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEK 1118

Query: 238 SAVAAPAAAGSGPAASAPAAPVRHAAPAPASATAAASAPTAASAPAPTPASAPAPASTPA 297
+ + P A PA + PT + + A PA
Sbjct: 1119 TQEVPKVTSQVSPKQEQSETVQPQAEPAR------ENDPTVNIKEPQSQTNTTADTEQPA 1172

Query: 298 PASAPTPASAPTPTPASAPTPASIPAPAPASAPASTPAPASAPAPAPTTSPASSIAPTAA 357
++ T + + + P + + P S + P S+
Sbjct: 1173 KETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVP- 1231

Query: 358 PFASAIPPARAEKFAPAVTATTAGSTST 385
+ P + V ST+T
Sbjct: 1232 --HNVEPATTSSNDRSTVALCDLTSTNT 1257


25  BURPS668_1449  BURPS668_1458  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BURPS668_14491123.385939malonate transporter MadL subunit
BURPS668_14502133.883950malonate transporter MadM subunit
BURPS668_14511134.217951hypothetical protein
BURPS668_14521124.694823malonate decarboxylase subunit alpha
BURPS668_14533136.152540malonate decarboxylase subunit delta
BURPS668_14543136.008721malonate decarboxylase subunit beta
BURPS668_14552135.415055malonate decarboxylase subunit gamma
BURPS668_14561144.654119phosphoribosyl-dephospho-CoA transferase
BURPS668_1457-173.910325triphosphoribosyl-dephospho-CoA synthase MdcB
BURPS668_1458-373.106259malonyl CoA-acyl carrier protein transacylase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1452  ADHESNFAMILY  30     0.019    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 30.2 bits (68), Expect = 0.019
Identities = 28/128 (21%), Positives = 42/128 (32%), Gaps = 19/128 (14%)

Query: 390 LKAGEEADARTPAA---LRRGRKLVVQIGE----------TFGEKNAPMFVEQLDALRLA 436
L+ E P A L G I + F EKN + ++LD L
Sbjct: 127 LEGQNEKGKEDPHAWLNLENGIIFAKNIAKQLSAKDPNNKEFYEKNLKEYTDKLDKLDKE 186

Query: 437 DKLALDLAPVMVYGDDVTHVVTEEGIANLLMCRDADEREHAIRGVAGYTEIGRGRDRRLV 496
K + P + +VT EG + I + E + + LV
Sbjct: 187 SKDKFNKIP-----AEKKLIVTSEGAFKYF-SKAYGVPSAYIWEINTEEEGTPEQIKTLV 240

Query: 497 ERLRERGV 504
E+LR+ V
Sbjct: 241 EKLRQTKV 248


26  BURPS668_1511  BURPS668_1549  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BURPS668_1511325-5.921423peptidase
BURPS668_1512529-7.401233aldose 1-epimerase
BURPS668_1513736-9.241859undecaprenyl pyrophosphate phosphatase
BURPS668_1515743-10.508517*hypothetical protein
BURPS668_1516843-10.758955hypothetical protein
BURPS668_1517844-10.982797phage HK97 tail length tape measure-related
BURPS668_1518648-11.079811prohead protease
BURPS668_1519736-8.459825hypothetical protein
BURPS668_1520734-7.947438hypothetical protein
BURPS668_1521734-7.799596hypothetical protein
BURPS668_1523633-8.156228hypothetical protein
BURPS668_1522633-7.935652phage integrase site specific recombinase
BURPS668_1524629-6.541498transposase
BURPS668_1525032-6.231756phage integrase site specific recombinase
BURPS668_1526031-5.138762hypothetical protein
BURPS668_1527025-2.565310hypothetical protein
BURPS668_1528123-0.949020hypothetical protein
BURPS668_1529021-0.796834hypothetical protein
BURPS668_1530121-0.375286hypothetical protein
BURPS668_1531221-0.699174PAAR motif-containing protein
BURPS668_1532325-0.921619MerR family transcriptional regulator
BURPS668_15333161.596468GntR family transcriptional regulator
BURPS668_15353115.169148hypothetical protein
BURPS668_15362116.067973hypothetical protein
BURPS668_15381105.362068tryptophan repressor binding protein
BURPS668_1537095.512643hypothetical protein
BURPS668_1539-1104.844344alpha-amylase
BURPS668_1540-1114.8697074-alpha-glucanotransferase
BURPS668_1541-2113.991470malto-oligosyltrehalose trehalohydrolase
BURPS668_1542-1103.751230glycogen debranching protein GlgX
BURPS668_1543-193.227960glycogen branching protein
BURPS668_1544-192.789639alpha-glucosidase
BURPS668_15452112.664268alpha amylase
BURPS668_1546827-1.370759hypothetical protein
BURPS668_1548628-1.668046deoxyribodipyrimidine photolyase
BURPS668_1547529-2.263876poly(3-hydroxybutyrate) depolymerase
BURPS668_1549330-0.638830hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
BURPS668_1511  RTXTOXIND  32     0.002    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 32.1 bits (73), Expect = 0.002
Identities = 11/64 (17%), Positives = 29/64 (45%), Gaps = 12/64 (18%)

Query: 149 VIAAAAGTVVYAGNGLRGYGNLLIVKHNADFLTTYAHNRALLVKEGQTVAQGQKIAEMGD 208
++A A G + ++G +K + + + ++VKEG++V +G + ++
Sbjct: 82 IVATANGKLTHSGR-------SKEIKPIENSIV-----KEIIVKEGESVRKGDVLLKLTA 129

Query: 209 TDND 212
+
Sbjct: 130 LGAE 133


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1517  PYOCINKILLER  34     0.003    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 34.0 bits (77), Expect = 0.003
Identities = 53/237 (22%), Positives = 87/237 (36%), Gaps = 23/237 (9%)

Query: 587 GLDSSLQDTLAGLSQRRDI--DVRRYVMGLDQMNTQQEDAYQYQDTTRNMTARAKAEFD- 643
GLDSS ++ D+ D RRYV L ++E Q++D + + A +AE D
Sbjct: 41 GLDSSTENGWQEFESYADVGVDPRRYV-PLQVKEKRREIELQFRDAEKKLEASVQAELDK 99

Query: 644 ARYALQQ-------QYQQRVRQLVEQYALDPTSDMKQYAEKLRAEQAYLA-ERSAGMERF 695
A AL R +V + +K+ + A R+A
Sbjct: 100 ADAALGPAKNLAPLDVINRSLTIVGNALQQKNQKLLLNQKKITSLGAKNFLTRTAEEIGE 159

Query: 696 FAREEARRNSFEAQMKDGLSSLGG-DAMTNAELAKTAFVT--AWQDSQSALEQFITSGEG 752
A E N EA M+ + G A N +L A + ++ +A + I +
Sbjct: 160 QAVREGNINGPEAYMRFLDREMEGLTAAYNVKLFTEAISSLQIRMNTLTAAKASIEAAAA 219

Query: 753 NFKKFTASI-------LADLAKIALRQAEVFAIQSIGSSFGFFSEGGPVGHFASGGA 802
N + A+ + A+R A +A+ + G S + G + A G A
Sbjct: 220 NKAREQAAAEAKRKAEEQARQQAAIRAANTYAMPANG-SVVATAAGRGLIQVAQGAA 275


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1543  PRTACTNFAMLY  30     0.031    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 30.4 bits (68), Expect = 0.031
Identities = 19/63 (30%), Positives = 26/63 (41%), Gaps = 2/63 (3%)

Query: 213 RTEAPPRTASIVADLDALERFGWHDDAWLRARASLDLAHAPVSIYEVHPESWLRVAAEGN 272
+ RT + A L+A RF D +L +A L + A Y + LRV EG
Sbjct: 754 AVKGKYRTHGVGASLEAGRRFTHADGWFLEPQAELAVFRAGGGAYRA--ANGLRVRDEGG 811

Query: 273 RSA 275
S
Sbjct: 812 SSV 814


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_1547  PF07675  31     0.011    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 31.2 bits (70), Expect = 0.011
Identities = 18/46 (39%), Positives = 25/46 (54%), Gaps = 3/46 (6%)

Query: 376 SYNVYRNGNKVGSS-TSTAYTDAGLIAGTAYSYTVTEIDPSLGESA 420
+Y +YRN ++ S T T Y D L G Y+Y V ++ GESA
Sbjct: 1260 TYTIYRNNTQIASGVTETTYRDPDLATGF-YTYGV-KVVYPNGESA 1303


27  BURPS668_1576  BURPS668_1611  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BURPS668_1576-193.308681hypothetical protein
BURPS668_15770131.280672PAAR motif-containing protein
BURPS668_15780160.342883hypothetical protein
BURPS668_1579522-3.062498hypothetical protein
BURPS668_1580218-2.340737Rhs element Vgr protein
BURPS668_1581326-4.914091hypothetical protein
BURPS668_1582428-5.796903hypothetical protein
BURPS668_1583427-5.434792hypothetical protein
BURPS668_1584326-4.750185hypothetical protein
BURPS668_1585-2150.155315Rhs element Vgr protein
BURPS668_1586-120-0.527695hypothetical protein
BURPS668_1587537-6.194624hypothetical protein
BURPS668_1588639-7.333592hypothetical protein
BURPS668_1589639-7.576007hypothetical protein
BURPS668_1590639-7.758237Rhs element Vgr protein
BURPS668_1591954-13.084255hypothetical protein
BURPS668_15921158-13.946766hypothetical protein
BURPS668_1593949-12.908356hypothetical protein
BURPS668_1594746-10.803446transposase
BURPS668_1595645-9.672259integrase core subunit
BURPS668_1596444-9.829755transposase
BURPS668_1597441-7.428415hypothetical protein
BURPS668_1598224-2.810903hypothetical protein
BURPS668_1599-113-1.186699hypothetical protein
BURPS668_1600015-1.515839hypothetical protein
BURPS668_1601114-1.357728DNA-binding protein
BURPS668_1602115-1.237939hypothetical protein
BURPS668_1603219-2.370074hydrolase
BURPS668_1604224-3.205501major facilitator family transporter
BURPS668_1605432-3.698951hypothetical protein
BURPS668_1606123-2.301193transcriptional regulator
BURPS668_1607722-1.408979hypothetical protein
BURPS668_16084160.053238hypothetical protein
BURPS668_16094102.473966hypothetical protein
BURPS668_16105112.229390hypothetical protein
BURPS668_16112101.248196spore coat protein U domain-contain protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_1604  TCRTETB  41     1e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 40.6 bits (95), Expect = 1e-05
Identities = 59/353 (16%), Positives = 120/353 (33%), Gaps = 55/353 (15%)

Query: 44 VDTQMFSLVIPALLTAWGIGKGQAGLIGGATLAAGAIGGLLAGMIADRFGRVRALQITVC 103
++ + ++ +P + + + A + +IG + G ++D+ G R L +
Sbjct: 28 LNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGII 87

Query: 104 WFSLFTFLSAFAQNFEQLLVL-KTLQGLGFGGEWTAGAVLLSETIRARHRGKAMGIVQSA 162
+ + +F LL++ + +QG G V+++ I +RGKA G++ S
Sbjct: 88 INCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSI 147

Query: 163 WGFGWGGAVLLYTLVFSWLPPEWAWRVLFAIGVLPALLVLYIRRAIPEPPRDDAR----- 217
G G + ++ ++ W L I ++ + V ++ + + + R
Sbjct: 148 VAMGEGVGPAIGGMIAHYI----HWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKG 203

Query: 218 ----------------------VAVSTSAAAAQTAPARASAKSIFDPSV------LRMTI 249
+ VS + R DP + + +
Sbjct: 204 IILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVL 263

Query: 250 VGGLIGVGAHGGYHAITTWLPTYLKTERHLSVLGTG------AYLAVIIVAFIIGCMTSA 303
GG+I G + +P +K LS G ++VII +I G
Sbjct: 264 CGGIIFGTVAG----FVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGG----- 314

Query: 304 YLQDRIGRRRNLMLFSACCVVTVNLYVMLPLDNVAMLLLGFPLGFFAAGIPAT 356
L DR G +L ++V+ L + + F G+ T
Sbjct: 315 ILVDRRGPL--YVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFT 365


28  BURPS668_1634  BURPS668_1651  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BURPS668_1634-1133.069249hypothetical protein
BURPS668_1635-2113.185268CAIB/BAIF family protein
BURPS668_1636-193.401715hypothetical protein
BURPS668_1637-282.448145alanyl-tRNA synthetase
BURPS668_1638-1103.310540transcriptional regulator
BURPS668_16390122.931058major facilitator superfamily permease
BURPS668_16410131.912740peptidase s1, chymotrypsin:pdz/dhr/glgf
BURPS668_16402131.536447hypothetical protein
BURPS668_16421151.376524hypothetical protein
BURPS668_16431132.760622NUDIX family hydrolase
BURPS668_16441113.508495hypothetical protein
BURPS668_16450134.713720hypothetical protein
BURPS668_16460134.861517NAD-dependent 4-hydroxybutyrate dehydrogenase
BURPS668_16470134.707181hypothetical protein
BURPS668_16481144.340184thioesterase
BURPS668_16492124.662908ABC transporter permease
BURPS668_16501134.437851ABC transporter permease/ATP-binding protein
BURPS668_16511143.169539ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
BURPS668_1639  FLAGELLIN  34     0.001    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 34.2 bits (78), Expect = 0.001
Identities = 45/282 (15%), Positives = 71/282 (25%), Gaps = 9/282 (3%)

Query: 66 WLASANYAGYFVGAMTCARIAVDPARMVRAGLAATVLLTFAMGLASPFWVWALVRFVGGA 125
L+ N VGA I +D ++ L A+ + + + V G
Sbjct: 136 VLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVGDLKSSFKNVTGY 195

Query: 126 VSAWTFVFASQWGLRRVVEHGAPAWGGVIYTGPGIGIVATGLIGFALAGRHAALGWIGFA 185
+ R V GA T P V F
Sbjct: 196 DTYAVGANK----YRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFK 251

Query: 186 VASAVLTAFVWRAFGAAGAGTDGGQSRGGAKRAGDGVGVADAAGRAGERLAAGAAEGAGA 245
+ +A A G G + G + G G + G
Sbjct: 252 TTKSTAGTAEAKAIAGAIKGGKEGDT---FDYKGVTFTIDTKTGNDGNGKVSTTINGEKV 308

Query: 246 VADAVSAANVADTADTADTADTADTANTANTANTANTANTANTA--NSANSANSANSANS 303
A D A + + + T N + S AN+A S
Sbjct: 309 TLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNAVKGES 368

Query: 304 ANSANSATATAERNSPVAGVAGQHGAMAAPAARAPHVESAAA 345
+ N A TA +AG+ + A+ + + A
Sbjct: 369 KITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDA 410


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
BURPS668_1641  V8PROTEASE  48     5e-08    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 47.7 bits (113), Expect = 5e-08
Identities = 32/154 (20%), Positives = 54/154 (35%), Gaps = 26/154 (16%)

Query: 119 GSGFIVGADGIILTTAYVVGQASEATVRLIDRR-----------EFKA-RVLAVDDSSDV 166
SG +VG +LT +VV L F A ++ D+
Sbjct: 104 ASGVVVG-KDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDL 162

Query: 167 AVLQIDATK--------LPTVRLGDSSRVRTGEPVLTIGTPDGSANTVTTGIVSATARML 218
A+++ + + + +++ + + + G P G T + ++
Sbjct: 163 AIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYP-GDKPVATMW--ESKGKIT 219

Query: 219 PDGGRFPFFQTDVTGNLDNSGGPVFNRAGEVIGI 252
G + TG NSG PVFN EVIGI
Sbjct: 220 YLKGEAMQYDLSTTGG--NSGSPVFNEKNEVIGI 251


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
BURPS668_1642  FLGHOOKAP1  28     0.040    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 27.6 bits (61), Expect = 0.040
Identities = 11/34 (32%), Positives = 16/34 (47%), Gaps = 2/34 (5%)

Query: 190 VDIREEALHELIDRLDDLASEFHSAF--LHEAGK 221
+ R + L + + L LA F AF H+AG
Sbjct: 283 LTFRSQDLDQTRNTLGQLALAFAEAFNTQHKAGF 316


29  BURPS668_1690  BURPS668_1729  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias      Product
BURPS668_1690  5       14       -2.568323    hypothetical protein
BURPS668_1691  5       13       -2.889048    electron transfer flavoprotein-ubiquinone
BURPS668_1692  5       16       -2.614344    short chain dehydrogenase
BURPS668_1693  5       16       -2.419361    thioesterase
BURPS668_1694  4       14       -2.616363    hypothetical protein
BURPS668_1695  1       12       -2.554122    hypothetical protein
BURPS668_1696  -1      8        -1.394636    Beta-xylosidase
BURPS668_1697  -2      8        0.833849     hypothetical protein
BURPS668_1698  -1      10       1.723767     hypothetical protein
BURPS668_1699  -1      9        2.323324     3-oxoacid CoA-transferase subunit A
BURPS668_1700  2       12       3.773197     succinyl-CoA:3-ketoacid-coenzyme A transferase
BURPS668_1701  0       13       3.816414     short chain dehydrogenase
BURPS668_1702  1       12       2.923977     polysaccharide deacetylase
BURPS668_1703  2       12       2.442994     hypothetical protein
BURPS668_1704  -1      9        1.394932     LysR family transcriptional regulator
BURPS668_1705  -3      7        0.649244     hypothetical protein
BURPS668_1706  -4      8        -1.554956    pyridine nucleotide-disulfide family
BURPS668_1707  -3      10       -2.596595    alpha/beta hydrolase
BURPS668_1708  -1      11       -3.843604    endoribonuclease L-PSP
BURPS668_1709  -1      12       -4.237963    GTP diphosphokinase
BURPS668_1711  1       17       -6.783073*   hypothetical protein
BURPS668_1712  0       12       -4.713969    threonyl-tRNA synthetase
BURPS668_1713  1       13       -4.117526    translation initiation factor IF-3
BURPS668_1714  0       12       -3.261260    50S ribosomal protein L35
BURPS668_1715  -3      10       -2.787504    50S ribosomal protein L20
BURPS668_1716  -2      12       -3.927043    phenylalanyl-tRNA synthetase subunit alpha
BURPS668_1717  -3      12       -3.475044    phenylalanyl-tRNA synthetase subunit beta
BURPS668_1718  -1      16       -3.715129    integration host factor subunit alpha
BURPS668_1719  0       17       -3.401737    MerR family transcriptional regulator
BURPS668_1720  -1      15       -3.631581    lipoprotein
BURPS668_1721  3       20       -4.418264    acyltransferase
BURPS668_1723  2       16       -2.678147*   lipoprotein
BURPS668_1724  1       16       -3.365577    hypothetical protein
BURPS668_1725  3       20       -4.525189    hypothetical protein
BURPS668_1726  2       17       -4.578844    lipoprotein
BURPS668_1727  3       24       -5.189383    transposase
BURPS668_1728  2       32       -4.818023    antibiotic biosynthesis monooxygenase
BURPS668_1729  2       29       -3.860440    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1692  DHBDHDRGNASE  120    5e-35    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 120 bits (301), Expect = 5e-35
Identities = 77/261 (29%), Positives = 125/261 (47%), Gaps = 16/261 (6%)

Query: 7 LEGKVALITGASSGLGQRFAQVLSQAGAKVVLASRRVERLKELRAEIEAAGGAAHVVSLD 66
+EGK+A ITGA+ G+G+ A+ L+ GA + E+L+++ + ++A A D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 67 VTDYQSIRAAVAHAETEAGTIDILVNNSGVSTMQKLVDVSPADFEYVFDTNTRGAFFVAQ 126
V D +I A E E G IDILVN +GV + +S ++E F N+ G F ++
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 127 EVAKRMMMRAGSGNAKPACRIINIASVAGLRPFSQIGLYAMSKAAVVHMTRAMALEWGRH 186
V+K MM R I+ + S P + + YA SKAA V T+ + LE +
Sbjct: 126 SVSKYMMDRRSGS-------IVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEY 178

Query: 187 GINVNAICPGYIDTEINHYLWETEQGQ---------KLQSMLPRRRVGKPQDLDGLLLLL 237
I N + PG +T++ LW E G ++ +P +++ KP D+ +L L
Sbjct: 179 NIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFL 238

Query: 238 AADESQFINGSIVSADDGLGL 258
+ ++ I + D G L
Sbjct: 239 VSGQAGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1701  DHBDHDRGNASE  74     8e-18    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 73.5 bits (180), Expect = 8e-18
Identities = 60/249 (24%), Positives = 98/249 (39%), Gaps = 19/249 (7%)

Query: 10 VLVIGGSSGIGAAAARAFAVLDADVTIASRDANKLAAAARAIDG-PRPVRQAVLDTTDAP 68
+ G + GIG A AR A A + + KL ++ R D D+
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 69 AVDA----FFAEAGPFDHVVMSAAHTPGGPVRKLPLADAQAAMDSKFWGAY----RVARA 120
A+D E GP D +V A G + L + +A G + V++
Sbjct: 71 AIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKY 130

Query: 121 ARIAPGGSLTFVSGFLSVRPSASAVLQGAINAALEALARGLALELAP--VRVNTVSPGLV 178
GS+ V + P S + AA + L LELA +R N VSPG
Sbjct: 131 MMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGST 190

Query: 179 ATPLWSKL--DDAAREAMYASAAAR----LPARRVGQPEDVANAIVYLAATR--YATGST 230
T + L D+ E + + +P +++ +P D+A+A+++L + + + T
Sbjct: 191 ETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHITMHN 250

Query: 231 ALVDGGGAI 239
VDGG +
Sbjct: 251 LCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1703  PYOCINKILLER  28     0.026    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 27.8 bits (61), Expect = 0.026
Identities = 19/68 (27%), Positives = 27/68 (39%), Gaps = 2/68 (2%)

Query: 22 AAALAPAHADTTGLIEPAHLSVDGSLPAAQRDAQILAARRYDTFWHNGDPALARAALADD 81
AA+LA A +D ++ S + A A + + R W + P R AL D
Sbjct: 274 AASLAQAISDAIAVLGRVLASAPSVM--AVGFASLTYSSRTAEQWQDQTPDSVRYALGMD 331

Query: 82 FADRTPPP 89
A PP
Sbjct: 332 AAKLGLPP 339


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1718  DNABINDINGHU  119    1e-38    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 119 bits (301), Expect = 1e-38
Identities = 35/89 (39%), Positives = 53/89 (59%)

Query: 37 TKAELAELLFDSVGLNKREAKDMVEAFFEVIRDALENGESVKLSGFGNFQLRDKPQRPGR 96
K +L + ++ L K+++ V+A F + L GE V+L GFGNF++R++ R GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 97 NPKTGEAIPIAARRVVTFHASQKLKALVE 125
NP+TGE I I A +V F A + LK V+
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAVK 91


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1720  PF00577       31     0.025    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 30.6 bits (69), Expect = 0.025
Identities = 32/179 (17%), Positives = 58/179 (32%), Gaps = 36/179 (20%)

Query: 491 APWDAMSDLFNRHLLDYSPRSLNDLKLSADGGALRVRGGIKLWNQVPPGVWLPADMKGSL 550
AP + FN L P+++ DL +G ++PPG + D+ +
Sbjct: 40 APLSSAELYFNPRFLADDPQAVADLSRFENG------------QELPPGTY-RVDIYLNN 86

Query: 551 TLLDERHLAFTPTQVSVLGIP--QAKLLRALGIELSSLAPLKRRGAELRGDSLVLDQYTV 608
+ R + F +P L ++G+ +S++ + + L
Sbjct: 87 GYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLA---DDACVPLTSM-- 141

Query: 609 FPPPVLIGHMSQATVEPDG----LRLTFRPAPNAPVLRPPANLPGSYLWLEGGDTKMFN 663
+ AT + D L LT P A + LW G + + N
Sbjct: 142 ---------IHDATAQLDVGQQRLNLTI---PQAFMSNRARGYIPPELWDPGINAGLLN 188


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1723  cloacin       32     0.004    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 32.0 bits (72), Expect = 0.004
Identities = 18/54 (33%), Positives = 23/54 (42%)

Query: 20 GGGGGGGGGGGGGGSTTSNPNGTGVTPSSAANVQPVTVSAGITGAANMLMTSVT 73
GGG G G GGG + +GTG S+ A A T A L S++
Sbjct: 55 HWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSIS 108



Score = 30.8 bits (69), Expect = 0.011
Identities = 12/24 (50%), Positives = 14/24 (58%)

Query: 20 GGGGGGGGGGGGGGSTTSNPNGTG 43
G G GGG G G S+ +NP G G
Sbjct: 26 GLGVGGGASDGSGWSSENNPWGGG 49


30  BURPS668_1746  BURPS668_1771  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias      Product
BURPS668_1746  1       15       -3.015222    MarR family transcriptional regulator
BURPS668_1747  1       16       -3.031078    hypothetical protein
BURPS668_1748  1       16       -2.930066    GTP-binding protein TypA
BURPS668_1749  -1      15       -3.470816    transposase
BURPS668_1750  -1      17       -3.009593    2-oxoglutarate dehydrogenase E1
BURPS668_1751  -2      16       -2.300268    dihydrolipoamide succinyltransferase
BURPS668_1752  -4      14       -3.329439    dihydrolipoamide dehydrogenase
BURPS668_1753  -1      12       -3.260898    AFG1 family ATPase
BURPS668_1754  1       12       -2.626819    hypothetical protein
BURPS668_1755  2       12       0.558965     hypothetical protein
BURPS668_1756  4       15       2.242568     hypothetical protein
BURPS668_1757  4       15       1.747031     lipoprotein
BURPS668_1758  3       16       1.665287     hypothetical protein
BURPS668_1759  4       15       1.325151     hypothetical protein
BURPS668_1760  4       23       2.216665     lipoprotein
BURPS668_1761  2       21       0.148996     hypothetical protein
BURPS668_1762  4       24       -1.467001    hypothetical protein
BURPS668_1763  -2      13       0.201113     hypothetical protein
BURPS668_1764  -2      10       1.997102     pilin family protein
BURPS668_1765  -1      10       1.719134     peptidase A24A, prepilin type IV
BURPS668_1766  0       11       2.514585     TadE family protein
BURPS668_1767  1       12       3.171373     pilus assembly protein CpaB
BURPS668_1768  0       12       3.131370     type II/III secretion system protein
BURPS668_1769  1       11       3.757522     hypothetical protein
BURPS668_1770  1       12       3.371251     type II/IV secretion system protein
BURPS668_1771  0       12       3.196570     type II secretion system protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1748  TCRTETOQM     171    5e-48    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 171 bits (435), Expect = 5e-48
Identities = 102/435 (23%), Positives = 172/435 (39%), Gaps = 62/435 (14%)

Query: 5 LRNIAIIAHVDHGKTTLVDQLLRQSGTFRENQQVAE--RVMDSNDIEKERGITILAKNCA 62
+ NI ++AHVD GKTTL + LL SG E V + D+ +E++RGITI +
Sbjct: 3 IINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITS 62

Query: 63 VEYEGTHINIVDTPGHADFGGEVERVLSMVDSVLLLVDAVEGPMPQTRFVTKKALALGLK 122
++E T +NI+DTPGH DF EV R LS++D +LL+ A +G QTR + +G+
Sbjct: 63 FQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIP 122

Query: 123 PIVVINKIDRPGARIDWV-------------INQTFDLFDKLGATE----EQLDFPIV-- 163
I INKID+ G + V I Q +L+ + T EQ D I
Sbjct: 123 TIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGN 182

Query: 164 -----------------YASGLNGY---ASLDP-----AARDGDMRPLFEAILQHVPVRP 198
+ SL P A + + L E I
Sbjct: 183 DDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSST 242

Query: 199 ADPDAPLQLQITSLDYSTYVGRIGVGRITRGRIKPGQPVVMRFGPEGDVLNRKINQVLSF 258
+ L ++ ++YS R+ R+ G + V R + + KI ++ +
Sbjct: 243 HRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSV--RISEKEKI---KITEMYTS 297

Query: 259 QGLERVQVDSAEAGDIVLINGIEDVGIGATICAVEAPEALPMITVDEPTLTMNFLVNSSP 318
E ++D A +G+IV++ E + + + + + I P L +
Sbjct: 298 INGELCKIDKAYSGEIVILQN-EFLKLNSVLGDTKLLPQRERIENPLPLLQTTVEPSKPQ 356

Query: 319 LAGREGKFVTSRQIRDRLMKELNHNVALRVKDTGDETVFEVSGRGELHLTILVENMRRE- 377
+ D L++ V E + +S G++ + + ++ +
Sbjct: 357 QREMLLDALLEISDSDPLLR-------YYVDSATHEII--LSFLGKVQMEVTCALLQEKY 407

Query: 378 GYELAVSRPRVVMQE 392
E+ + P V+ E
Sbjct: 408 HVEIEIKEPTVIYME 422



Score = 34.1 bits (78), Expect = 0.002
Identities = 17/100 (17%), Positives = 32/100 (32%), Gaps = 1/100 (1%)

Query: 387 RVVMQEVDGVKHEPYELLTVDLEDEHQGGVMEELGRRKGEMLDMVSDGRGRTRLEYRIPA 446
V+++ EPY + E+ + + ++D L IPA
Sbjct: 525 EQVLKKAGTELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQL-KNNEVILSGEIPA 583

Query: 447 RGLIGFQSEFLTLTRGTGLMSHIFDSYAPVKEGSVGERRN 486
R + ++S+ T G + Y V + R
Sbjct: 584 RCIQEYRSDLTFFTNGRSVCLTELKGYHVTTGEPVCQPRR 623


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1751  RTXTOXIND     29     0.028    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.4 bits (66), Expect = 0.028
Identities = 8/83 (9%), Positives = 26/83 (31%), Gaps = 3/83 (3%)

Query: 48 EVPAPAAGVLAQVLQNDGDTVVADQVIATID---TEAKAGAAAAAAGAADVQPAAAPVAA 104
E+ ++ +++ +G++V V+ + EA ++ A ++ + +
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 105 PAPAAQPAAAAASSTAAASPAAS 127
+ S
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVS 180


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1761  cloacin       45     7e-07    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 44.7 bits (105), Expect = 7e-07
Identities = 33/117 (28%), Positives = 51/117 (43%), Gaps = 1/117 (0%)

Query: 30 GGSGTISKGLDGSGSGSGGGNAISTTGGSGSGGTSGAGGSGSGGSGSSGSTGGLSGGGGS 89
G+ + S ++G +G G G S G S GGSGSG G +G +GGG
Sbjct: 11 TGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSG-IHWGGGSGHGNGGGNG 69

Query: 90 TSGGGSTSGGGSTSGGTSTSSSINALGTIAGNTGGIISGAGSTVSGLGTVVGSQTLP 146
SGGGS +GG ++ + AL T + AG+ + + ++ + P
Sbjct: 70 NSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIADIMAALKGP 126



Score = 40.5 bits (94), Expect = 2e-05
Identities = 33/123 (26%), Positives = 46/123 (37%), Gaps = 2/123 (1%)

Query: 38 GLDGSGSGSGGGNAI-STTGGSGSGGTSGAGGSGSGGSGSSGSTGGLSGGG-GSTSGGGS 95
G DG G +G + + GG G G GSG S + GG SG G G G
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 96 TSGGGSTSGGTSTSSSINALGTIAGNTGGIISGAGSTVSGLGTVVGSQTLPGVNPQTTQA 155
+GGG+ + G + + N A G + + GL + + L A
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIADIMAA 122

Query: 156 LGG 158
L G
Sbjct: 123 LKG 125



Score = 34.7 bits (79), Expect = 0.001
Identities = 34/125 (27%), Positives = 52/125 (41%), Gaps = 4/125 (3%)

Query: 55 TGGSGSGGTSGAGGSGSGGSGSSGSTGGLSGGGGSTSGGGSTSGGGSTSGGTSTSSSINA 114
+GG G G +GA S +G GL GGG++ G G +S GG+ +
Sbjct: 2 SGGDGRGHNTGA---HSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGG 58

Query: 115 LGTIAGNTGGIISGAGSTVSGLGTVVGSQTLPGVNPQTTQALGGVVQSL-GGAVSALGSG 173
G SG GS G + V + G +T GG+ S+ GA+SA +
Sbjct: 59 GSGHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIAD 118

Query: 174 VTSGI 178
+ + +
Sbjct: 119 IMAAL 123


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1765  PREPILNPTASE  53     4e-11    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 53.3 bits (128), Expect = 4e-11
Identities = 30/124 (24%), Positives = 52/124 (41%), Gaps = 10/124 (8%)

Query: 4 LFSIGFFFAWAAAVAIADCRDRRIPNELVLVGLAAVIIFTVCRQNPFGTTLSGALVGGAV 63
+ A+ D +P++L L L ++F + F +L A++G
Sbjct: 134 TLAALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNL--LGGF-VSLGDAVIGAMA 190

Query: 64 GLVSLFPFFAL-------RVMGAADVKVFAVLGAWCGLPALPRLWVVASIAAGVHALALM 116
G + L+ + MG D K+ A LGAW G ALP + +++S+ + L+
Sbjct: 191 GYLVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLI 250

Query: 117 LLTR 120
LL
Sbjct: 251 LLRN 254


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1768  BCTERIALGSPD  138    2e-37    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 138 bits (349), Expect = 2e-37
Identities = 58/249 (23%), Positives = 111/249 (44%), Gaps = 11/249 (4%)

Query: 160 VQVDVRVVEFSRSVLKQAGLNFFKQNNGFTFGSFAPAGLASVTGGG----TSSMSVSANI 215
V V+ + E + G+ + +N G T + + +++ G S+
Sbjct: 347 VLVEAIIAEVQDADGLNLGIQWANKNAGMTQFTNSGLPISTAIAGANQYNKDGTVSSSLA 406

Query: 216 PIASAFN-LVVGSATRGLFADLSILEANNLARVLAQPTLVALSGQSASFLAGGEIPVPVP 274
S+FN + G L+ L ++ +LA P++V L A+F G E+PV
Sbjct: 407 SALSSFNGIAAGFYQGNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTG 466

Query: 275 QSLGT-----ISIDWKPYGVGLTLTPTVLSPRRIALKVAPESSQLDFVHSITINGVTVPA 329
+ +++ K G+ L + P + + L++ E S + S + +
Sbjct: 467 SQTTSGDNIFNTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSSVADAAS-STSSDLGAT 525

Query: 330 LTTRRADTTVELGDGESFAIGGLIDRETTSNVDKVPFLGDLPIIGTFFKHLSYQQNDKEL 389
TR + V +G GE+ +GGL+D+ + DKVP LGD+P+IG F+ S + + + L
Sbjct: 526 FNTRTVNNAVLVGSGETVVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNL 585

Query: 390 VIIVTPHLV 398
++ + P ++
Sbjct: 586 MLFIRPTVI 594


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1769  HTHFIS        38     4e-05    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 38.3 bits (89), Expect = 4e-05
Identities = 11/63 (17%), Positives = 26/63 (41%)

Query: 72 AALRVSHPGLPIVALGSLGEPESALAALRAGVRDFIDFSAPAEDALRITRGLLDHVGDQP 131
++ + P LP++ + + +A+ A G D++ + + I L +P
Sbjct: 67 PRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRP 126

Query: 132 SRH 134
S+
Sbjct: 127 SKL 129


31  BURPS668_1906  BURPS668_1955  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias      Product
BURPS668_1906  3       13       1.629591     AraC family transcriptional regulator
BURPS668_1907  3       15       2.527758     carbohydrate ABC transporter periplasmic
BURPS668_1908  4       15       3.224479     carbohydrate ABC transporter ATP-binding
BURPS668_1909  4       14       4.027624     carbohydrate ABC transporter permease
BURPS668_1910  5       13       4.491106     hypothetical protein
BURPS668_1911  3       14       3.918808     zinc-binding dehydrogenase oxidoreductase
BURPS668_1912  2       16       4.823453     carbohydrate kinase
BURPS668_1913  3       18       5.261880     short chain dehydrogenase
BURPS668_1914  1       14       4.395746     hypothetical protein
BURPS668_1915  -1      14       4.284335     hypothetical protein
BURPS668_1916  1       13       6.110963     hypothetical protein
BURPS668_1917  2       13       6.167968     extracytoplasmic-function sigma-70 factor
BURPS668_1918  3       14       6.601430     hypothetical protein
BURPS668_1919  4       12       6.199166     syringomycin biosynthesis enzyme
BURPS668_1920  5       13       6.998967     ABC-type cobalamin/Fe3+-siderophores transport
BURPS668_1921  1       11       5.945780     iron-hydroxamate transporter permease subunit
BURPS668_1922  1       13       6.412127     ferric iron reductase FhuF
BURPS668_1923  2       13       6.806530     Fe3+-hydroxamate ABC transporter periplasmic
BURPS668_1924  2       14       6.475534     hypothetical protein
BURPS668_1925  1       14       5.859304     hypothetical protein
BURPS668_1926  2       15       5.587104     cyclic peptide ABC transporter ATP-binding
BURPS668_1927  2       13       5.911566     siderophore non-ribosomal peptide synthetase
BURPS668_1928  1       12       4.286127     siderophore non-ribosomal peptide synthetase
BURPS668_1929  -1      11       2.228058     L-ornithine 5-monooxygenase MbaA
BURPS668_1930  -1      11       2.269359     ferric malleobactin transporter
BURPS668_1931  1       11       2.527610     hypothetical protein
BURPS668_1932  2       11       2.832657     cobyrinic acid a,c-diamide synthase
BURPS668_1933  1       12       2.878038     cob(I)yrinic acid a,c-diamide
BURPS668_1934  1       10       3.532119     cobalamin biosynthesis protein CbiG
BURPS668_1935  0       9        3.780672     hypothetical protein
BURPS668_1936  0       9        4.135159     high-affinity nickel transport protein
BURPS668_1937  -1      9        4.695145     cobalamin biosynthesis protein CobW
BURPS668_1938  -1      11       5.118056     cobaltochelatase subunit CobN
BURPS668_1939  3       12       5.533436     hypothetical protein
BURPS668_1940  1       14       5.466353     magnesium chelatase subunit ChII
BURPS668_1941  0       13       3.164130     protoporphyrin IX magnesium chelatase
BURPS668_1942  1       12       4.375088     phospholipase/carboxylesterase
BURPS668_1943  1       11       4.419462     hypothetical protein
BURPS668_1944  0       14       4.464029     hypothetical protein
BURPS668_1945  0       14       5.702218     hypothetical protein
BURPS668_1946  0       14       5.593801     glycosyl hydrolase
BURPS668_1947  1       13       7.277059     cobalamin biosynthesis protein CbiG/precorrin-3B
BURPS668_1948  0       14       6.457426     precorrin-2 C(20)-methyltransferase
BURPS668_1949  0       16       7.610109     precorrin-8X methylmutase
BURPS668_1950  2       15       7.217133     precorrin-3B synthase
BURPS668_1951  2       13       5.249165     hypothetical protein
BURPS668_1952  1       13       5.739361     precorrin-6Y C5,15-methyltransferase
BURPS668_1953  1       13       4.898655     cobalt-precorrin-6A synthase
BURPS668_1954  0       10       3.985962     cobalt-precorrin-6x reductase
BURPS668_1955  1       11       3.218516     precorrin-4 C(11)-methyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1913  DHBDHDRGNASE  122    3e-36    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 122 bits (308), Expect = 3e-36
Identities = 81/252 (32%), Positives = 118/252 (46%), Gaps = 15/252 (5%)

Query: 9 GRSFLVTGASSGIGRAAAVALRGCGARVVAAARNARELERLAHETGC-----EPLELDVG 63
G+ +TGA+ GIG A A L GA + A N +LE++ E DV
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVR 67

Query: 64 CDASVRAALSG-ERMRDAFDGLINCAGVTSLAAAIDTTADEFDRVMAVNARGAMLVARHV 122
A++ + ER D L+N AGV + +E++ +VN+ G +R V
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 123 ARAMIRAGRGGSIVNVSSQAALVALPSHLAYCASKAALDAMTRVLCVELGPHGIRVNSVN 182
++ M R GSIV V S A V S AY +SKAA T+ L +EL + IR N V+
Sbjct: 128 SKYM-MDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 183 PTVTLTPMAERAWSDPHASGPMLA--------AIPLGRFARVADVVAPILFLSSDAAAMV 234
P T T M W+D + + ++ IPL + A+ +D+ +LFL S A +
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 235 SGVALPVDGGYT 246
+ L VDGG T
Sbjct: 247 TMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1920  PF05272       28     0.041    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.041
Identities = 12/23 (52%), Positives = 13/23 (56%)

Query: 36 VTALCGPNGCGKSTLLRTLAGLQ 58
L G G GKSTL+ TL GL
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1922  2FE2SRDCTASE  57     6e-12    Ferric iron reductase signature.
		>2FE2SRDCTASE#Ferric iron reductase signature.

Length = 262

Score = 57.4 bits (138), Expect = 6e-12
Identities = 51/186 (27%), Positives = 73/186 (39%), Gaps = 24/186 (12%)

Query: 74 RALVSQWSKYYFNLAASAGFAAALLLGRPLDMAPQRMRVALRGGMPVALLFEADALRPAQ 133
+ L+S W+++Y L A L + LD++P+ VA F D
Sbjct: 89 KPLISLWAQWYIGLMVPPLMLALLTQEKALDVSPEHFHAEFHETGRVA-CFWVDVCEDKN 147

Query: 134 AEPAS---RYAALVDH-LRATIDTLAALAKLSPRVLWANAGNLLD-YLFEQCAHAPRAGA 188
A P S R L+ L + L A +++ +++W+N G L++ YL E G
Sbjct: 148 ATPHSPQHRMETLISQALVPVVQALEATGEINGKLIWSNTGYLINWYLTEM---KQLLGE 204

Query: 189 DA------AWLFGPVDSRGEANPLRLPVRRVKPCSARLPDPFRARRVCCLRNEIPGEDQL 242
A F + GE NPL V L D RR CC R +P Q
Sbjct: 205 ATVESLRHALFFEKTLTNGEDNPLWRTV--------VLRDGLLVRRTCCQRYRLPDVQQ- 255

Query: 243 CGSCPL 248
CG C L
Sbjct: 256 CGDCTL 261


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1923  FERRIBNDNGPP  114    3e-31    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 114 bits (287), Expect = 3e-31
Identities = 78/264 (29%), Positives = 113/264 (42%), Gaps = 15/264 (5%)

Query: 115 PARIVVLEFMFAEDLAALDITPVGMADPAYYPIWIGYDDARFARVSDVGTRQEPSLEAIA 174
P RIV LE++ E L AL I P G+AD Y +W+ + V DVG R EP+LE +
Sbjct: 35 PNRIVALEWLPVELLLALGIVPYGVADTINYRLWVS-EPPLPDSVIDVGLRTEPNLELLT 93

Query: 175 AAKPDLILGVGLRHAPIFDALSRIAPTVLFKYSPNYIEDGRQVTQYDWARAILRTIGCLT 234
KP ++ + P + L+RIAP F +S DG+Q AR L + L
Sbjct: 94 EMKPSFMVW-SAGYGPSPEMLARIAPGRGFNFS-----DGKQ--PLAMARKSLTEMADLL 145

Query: 235 GRARDARAVQARVDAGLARDARRIAAAGRAGERVAWLQELGLPDRYWAFTGNSASAGIAR 294
A A+ + + R + G R L L P F NS I
Sbjct: 146 NLQSAAETHLAQYEDFIRSMKPRFV---KRGARPLLLTTLIDPRHMLVFGPNSLFQEILD 202

Query: 295 ALGLE-PWPGEPTREGTAYVTSEDLLKQPDLAVLFVSATEPGVPLDAKLDSSIWRFVPAR 353
G+ W GE G+ V+ + L D+ VL +DA + + +W+ +P
Sbjct: 203 EYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKD-MDALMATPLWQAMPFV 261

Query: 354 RAGRVALVERNIWGFGGPMSALRL 377
RAGR V +W +G +SA+
Sbjct: 262 RAGRFQRVP-AVWFYGATLSAMHF 284


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1940  HTHFIS        44     6e-07    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 44.4 bits (105), Expect = 6e-07
Identities = 39/176 (22%), Positives = 64/176 (36%), Gaps = 14/176 (7%)

Query: 17 DRALPAAYPFSALIGQ-AALQQALLLVA-VDPGLGGVLVSGPRGTAKSTAARALAELLP- 73
+ + L+G+ AA+Q+ ++A + ++++G GT K ARAL +
Sbjct: 127 SKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKR 186

Query: 74 -EGRFVTLPLSASDEQVTGSLDLASALADNT--VRFSPGLVARAHLGVLYVDEINLLPDA 130
G FV + ++A + S T S G +A G L++DEI +P
Sbjct: 187 RNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMD 246

Query: 131 LVDALLDAAASGVNTVERDGVSHSHAARFALVGTMNP------EEGELRPQLLDRF 180
LL G G + +V N +G R L R
Sbjct: 247 AQTRLLRVLQQG--EYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRL 300


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1952  OMADHESIN     29     0.027    Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 29.5 bits (65), Expect = 0.027
Identities = 25/63 (39%), Positives = 28/63 (44%)

Query: 147 ADGATPAAIAGALVARGFGPSAMSVFEHLGGPLERRLDARADAWRDARAAALNVVAIECR 206
A GAT A GA VA G G A V GPL + L A + A A + VAI R
Sbjct: 74 AIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAASTAQKDGVAIGAR 133

Query: 207 ACA 209
A
Sbjct: 134 AST 136


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1955  LCRVANTIGEN   30     0.008    Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 30.0 bits (67), Expect = 0.008
Identities = 16/58 (27%), Positives = 24/58 (41%), Gaps = 5/58 (8%)

Query: 46 RAELVVNTAELDLDEIVALLARAHGKGQDVARVHSG-----DPSLYGAIGEQIRRLAA 98
R EL TAEL + ++ H +H D +LYG E+I + +A
Sbjct: 154 REELAELTAELKIYSVIQAEINKHLSSSGTINIHDKSINLMDKNLYGYTDEEIFKASA 211


32  BURPS668_1970  BURPS668_2062  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias      Product
BURPS668_1970  4       16       3.771919     hypothetical protein
BURPS668_1971  3       12       3.505381     hypothetical protein
BURPS668_1972  1       13       4.327158     hypothetical protein
BURPS668_1973  0       13       3.904838     amine ABC transporter permease
BURPS668_1974  0       12       4.357167     amine ABC transporter ATP-binding protein
BURPS668_1975  -1      12       4.255461     amine ABC transporter permease
BURPS668_1976  -2      13       3.109731     amine ABC transporter periplasmic amine-binding
BURPS668_1977  -1      13       3.704831     hypothetical protein
BURPS668_1978  -1      11       3.207671     methyltransferase UbiE
BURPS668_1979  -1      11       3.034513     major facilitator family transporter
BURPS668_1980  -2      8        2.743983     AMP-binding protein
BURPS668_1981  -2      10       2.170199     hypothetical protein
BURPS668_1982  -2      9        2.310006     hypothetical protein
BURPS668_1983  -3      9        1.846567     methyl-accepting chemotaxis protein
BURPS668_1984  1       14       2.578867     chemotaxis protein CheW
BURPS668_1985  0       16       2.778848     hypothetical protein
BURPS668_1986  1       20       1.838447     AraC family transcriptional regulator
BURPS668_1987  3       20       -0.435053    hypothetical protein
BURPS668_1988  2       19       -1.053107    outer membrane porin
BURPS668_1989  0       19       -2.670274    hypothetical protein
BURPS668_1990  -2      16       -3.850177    syringopeptin synthetase C
BURPS668_1991  -1      10       -5.644655    hypothetical protein
BURPS668_1992  -1      8        -4.427635    JmjC domain-containing protein
BURPS668_1993  -1      8        -3.476590    histidinol-phosphate/aromatic aminotransferase
BURPS668_1994  0       11       -3.491434    hypothetical protein
BURPS668_1995  0       12       -2.071598    formyltransferase
BURPS668_1996  0       13       -0.760912    argininosuccinate synthase
BURPS668_1997  -1      13       1.272394     argininosuccinate lyase
BURPS668_1998  -2      12       1.803632     GHMP kinase ATP-binding subunit
BURPS668_1999  -3      11       1.921697     major facilitator family transporter
BURPS668_2000  -1      11       4.057557     hypothetical protein
BURPS668_2001  -1      11       4.028492     cysteine synthase
BURPS668_2002  -1      11       3.642147     argininosuccinate lyase
BURPS668_2003  -2      11       3.688736     L-allo-threonine aldolase
BURPS668_2004  -2      11       3.651192     acetyltransferase
BURPS668_2005  -1      11       3.730532     syringomycin synthetase
BURPS668_2006  -3      11       0.369887     hypothetical protein
BURPS668_2007  -4      11       0.263749     carbamoyltransferase
BURPS668_2008  1       19       1.385607     penicillin amidase
BURPS668_2009  7       52       -12.249336   transposase, mutator type
BURPS668_2010  6       54       -11.924457   ISPsy18, transposase
BURPS668_2011  7       52       -10.480731   bacteriophage gp31 protein
BURPS668_2012  7       52       -10.268869   gp30
BURPS668_2013  8       47       -7.864388    hypothetical protein
BURPS668_2014  5       41       -6.668046    lipase
BURPS668_2015  5       35       -1.306165    hypothetical protein
BURPS668_2016  -1      19       -1.060605    hypothetical protein
BURPS668_2018  -2      18       -1.650077    hypothetical protein
BURPS668_2017  -1      19       -2.277183    hypothetical protein
BURPS668_2019  -1      19       -2.884334    hypothetical protein
BURPS668_2020  -1      18       -2.897389    hypothetical protein
BURPS668_2021  -1      18       -2.908543    Hpt sensor hybrid histidine kinase
BURPS668_2022  5       23       -2.357999    DNA-binding response regulator
BURPS668_2023  5       25       -2.245247    hypothetical protein
BURPS668_2024  5       24       -1.251054    hypothetical protein
BURPS668_2025  5       22       -1.580327    hypothetical protein
BURPS668_2026  6       17       -1.261010    hypothetical protein
BURPS668_2027  6       17       -1.561037    outer membrane protein
BURPS668_2028  5       21       -2.312829    hypothetical protein
BURPS668_2029  5       22       -2.880517    hypothetical protein
BURPS668_2030  6       23       -2.745177    type-1 fimbrial protein, A subunit
BURPS668_2031  6       23       -2.441125    outer membrane usher protein YcbS
BURPS668_2032  5       28       -4.284080    fimbrial assembly chaperone protein
BURPS668_2033  5       31       -4.348113    fimbrial protein
BURPS668_2034  4       34       -2.945186    hypothetical protein
BURPS668_2035  3       29       -0.952163    hypothetical protein
BURPS668_2036  3       34       -3.993628    hypothetical protein
BURPS668_2037  3       34       -3.736045    hypothetical protein
BURPS668_2038  4       27       -3.074167    hypothetical protein
BURPS668_2039  4       27       -2.788495    PHB depolymerase esterase
BURPS668_2040  5       25       -2.879500    hypothetical protein
BURPS668_2041  8       30       -4.509349    hypothetical protein
BURPS668_2043  8       25       -4.679028    hypothetical protein
BURPS668_2042  1       17       -3.898438    EutG protein
BURPS668_2044  -1      16       -2.100012    hypothetical protein
BURPS668_2045  -1      14       -0.793469    hypothetical protein
BURPS668_2046  -1      11       -0.793561    hypothetical protein
BURPS668_2047  0       9        0.696992     hypothetical protein
BURPS668_2048  1       8        3.573595     tryptophan halogenase
BURPS668_2049  2       9        4.433820     monovalent cation:proton antiporter-2 (CPA2)
BURPS668_2050  1       9        4.311967     Rieske family iron-sulfur cluster-binding
BURPS668_2051  0       9        3.898721     hypothetical protein
BURPS668_2052  0       8        4.931110     hypothetical protein
BURPS668_2053  -1      7        4.564290     dihydroxyacetone kinase
BURPS668_2054  -1      9        3.763915     hypothetical protein
BURPS668_2055  -4      8        1.851354     methyl-accepting chemotaxis protein
BURPS668_2056  -3      10       1.243692     ABC-2 type transporter permease
BURPS668_2057  -4      11       1.368958     ABC transporter ATP-binding protein
BURPS668_2058  -3      12       0.772457     ApbE family protein
BURPS668_2059  -2      13       0.160715     hypothetical protein
BURPS668_2060  -1      15       0.009635     nitrous-oxide reductase
BURPS668_2061  -1      17       2.119832     copper ABC transporter periplasmic
BURPS668_2062  -1      18       3.205079     copper ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1979  TCRTETB       93     1e-22    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 93.4 bits (232), Expect = 1e-22
Identities = 74/403 (18%), Positives = 153/403 (37%), Gaps = 16/403 (3%)

Query: 18 FMQNLDSTVVATALPSMARELGVNVVFLSSAITSYLVALTVFIPVSGWIAERFGAKRVFI 77
F L+ V+ +LP +A + + T++++ ++ V G ++++ G KR+ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 78 AAIAIFTAASVMCAAANGLAT-LVAARILQGAGGALMVPVGRLILYRGVSRHEMLAATTW 136
I I SV+ + + L+ AR +QGAG A + +++ R + + A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 137 LTMPALVGPLLGPPLGGFLTDALSWRAVFWINVPVGVAGAALAARLVPASAGERRAPADA 196
+ +G +GP +GG + + W + + +P+ + + D
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 197 RGMLLVGAALAALMLGVETAGRGVLPAGAPALCLGAGVALGGLAIRHCRRVAHPAVDLSL 256
+G++L+ + ML + L + + ++H R+V P VD L
Sbjct: 202 KGIILMSVGIVFFMLFTTSYSISFLIVSVLSF---------LIFVKHIRKVTDPFVDPGL 252

Query: 257 L-GIPTFHAATIAGSLFRAGAGALPFLVPLTLQVGFGASASRSGAITLASA-LGSLVMRP 314
IP G +F AG + +VP ++ S + G++ + + ++
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFV-SMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGY 311

Query: 315 MTHAALHRAPMRTVLIAGSVSFAAVLAACATLSPAWPDAAVFALLLVGGLSRSLSFASLG 374
+ + R VL G + + L ++ V G S + +
Sbjct: 312 IGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLG-GLSFTKTVIS 370

Query: 375 ALVFSDVPSERLSAATSFQGTAQQLMRAVGVAVAAGALHLAML 417
+V S + + A S L G+A+ G L + +L
Sbjct: 371 TIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLL 413


ORFs having significant similarity with Known Virulence factors
LocusTagHitsScoreE-valueComments
BURPS668_1983IGASERPTASE300.029 IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.029
Identities = 22/139 (15%), Positives = 41/139 (29%), Gaps = 5/139 (3%)

Query: 436 SALAGEAGKTMTEVTQAVARVTDIMGEIAAASGEQSRGIEQVNQAIAQMDEVTQQNAALV 495
S + + ++ V + E A + E ++ + +A Q +EV Q +
Sbjct: 1034 SETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETK 1093

Query: 496 EEAAAASKSLEEQGRHLTQAVSFFRASAASAAPQARHAAPAKPKAKRGVAAPAPAPRAAH 555
E +K + + + + PK ++ A A
Sbjct: 1094 ETQTTETKETATVEKEEKA-----KVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARE 1148

Query: 556 AAPTFNKPAPALAAAATAS 574
PT N P TA
Sbjct: 1149 NDPTVNIKEPQSQTNTTAD 1167


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1988  ECOLNEIPORIN  92     9e-23    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 91.8 bits (228), Expect = 9e-23
Identities = 91/380 (23%), Positives = 145/380 (38%), Gaps = 64/380 (16%)

Query: 32 ASTAHAQSSVVLYGLIDTSITYANNQRTHGAGSPGGPGWAVTSGALNASRWGLRGREDLG 91
A A + V LYG I + + + +GA + T S+ G +G+EDLG
Sbjct: 12 ALPVAAMADVTLYGTIKAGVETSRSVAHNGAQAASVE--TGTGIVDLGSKIGFKGQEDLG 69

Query: 92 NGVSAIFALENGFSGASGALSQKGVDMFGRQAWIGLKSKEGGALTLGRQYDLILDF--VT 149
NG+ AI+ +E AS A + G RQ++IGLK G L +GR ++ D +
Sbjct: 70 NGLKAIWQVE---QKASIAGTDSG--WGNRQSFIGLK-GGFGKLRVGRLNSVLKDTGDIN 123

Query: 150 PLGASGPGWGGNLAVHPYDNDDSNRNIRINNAVKYTSPTYRGWTLGAMYGFSNTAGQFGN 209
P + G N P R I +V+Y SP + G + Y ++ AG N
Sbjct: 124 PWDSKSDYLGVNKIAEP-----EARLI----SVRYDSPEFAGLSGSVQYALNDNAG-RHN 173

Query: 210 NAAWSAGLSYANGPLKLGAGYLRINRNPNAANANGALSTTDGSATITGGSQQIWAVAGRY 269
+ ++ AG +Y NG + G + QI + Y
Sbjct: 174 SESYHAGFNYKNGGFFVQYGGAYKRH-------------HQVQENVNIEKYQIHRLVSGY 220

Query: 270 -AFGPHSIGAAWSHSATDRVSGVLQGGSIAKLDGNSLVFDNFTLDGRY-VVTPRLSLAAA 327
++ A A L + + + TL R+ VTPR+S A
Sbjct: 221 DNDALYASVAVQQQDAK------LVEENYSHNSQTEVA---ATLAYRFGNVTPRVSYAHG 271

Query: 328 YTYTMGRFDARSGETRPKWNHMVAQADYAFSIRTDAYLEAVYQRVSGGNGIPAFNATIWT 387
+ + + + ++ +V A+Y FS RT A + A + + G G F +T
Sbjct: 272 FKGSFDATNYNN-----DYDQVVVGAEYDFSKRTSALVSAGWLQ--EGKGESKFVSTA-- 322

Query: 388 LTPSANGNQVVVALGLRHRF 407
+GLRH+F
Sbjct: 323 -----------GGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2001  ARGDEIMINASE  29     0.040    Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 28.6 bits (64), Expect = 0.040
Identities = 11/48 (22%), Positives = 18/48 (37%), Gaps = 2/48 (4%)

Query: 267 PTSGAAFMVAEWLRAQRDDGRTIVFIAPDEGHRYADTVYDDAWLRGQG 314
+G + R Q +DG ++ IAP E Y+ + G
Sbjct: 334 KCAGGDLIHGA--REQWNDGANVLAIAPGEIIAYSRNHVTNKLFEENG 379


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2013  BACYPHPHTASE  26     0.007    Salmonella/Yersinia modular tyrosine phosphatase si...
		>BACYPHPHTASE#Salmonella/Yersinia modular tyrosine phosphatase signature.

Length = 468

Score = 25.9 bits (56), Expect = 0.007
Identities = 8/11 (72%), Positives = 9/11 (81%)

Query: 40 PKTPPFPPRRR 50
P+TPP PPR R
Sbjct: 156 PRTPPLPPRER 166


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
BURPS668_2021  HTHFIS  81     7e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 81.0 bits (200), Expect = 7e-18
Identities = 36/146 (24%), Positives = 60/146 (41%), Gaps = 1/146 (0%)

Query: 854 TVLIAEDNLLNRSLLLDQLTTLGVRVIEAKNGEEALALLLKEPVDVVMTDIDMPMMDGFQ 913
T+L+A+D+ R++L L+ G V N + D+V+TD+ MP + F
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 914 LLAEMRRLGMTMPVYAVSASARPEDVAEGRARGFTDYLAKPVSLERLETVVRACCSAP-A 972
LL +++ +PV +SA + +G DYL KP L L ++ + P
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 973 GARADEDAQDELPGLPDVPPAYASAF 998
ED + L A +
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAMQEIY 150


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
BURPS668_2022  HTHFIS  44     2e-07    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 44.4 bits (105), Expect = 2e-07
Identities = 19/84 (22%), Positives = 38/84 (45%), Gaps = 6/84 (7%)

Query: 10 KVVVADDHPIVLRAVTDYVNSLPGFHVVASVSSGDALLSAMREQEVNLVVTDFTMHQAND 69
++VADD + + ++ G+ V S+ L + + +LVVTD M
Sbjct: 5 TILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVM----P 58

Query: 70 DKDGLRLISHLMRAYERTPIIVFT 93
D++ L+ + +A P++V +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLVMS 82


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
BURPS668_2027  OMADHESIN  54     2e-09    Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 54.1 bits (129), Expect = 2e-09
Identities = 76/232 (32%), Positives = 106/232 (45%), Gaps = 14/232 (6%)

Query: 656 SGTNASATGENSTATGTASTASGSNSTANGANSTASGAGATATGENAAATGAGATATGNN 715
S A A + TA S + A G A G NA+A G + A G
Sbjct: 19 SSPYAFADDYDGIPNLTAVQISPNADPALGLEYPVRPPVPGAGGLNASAKGIHSIAIGAT 78

Query: 716 ASASGTSSTAGGANAIASGENSTANGANSTASGNGSSAFGESAAAAGDGSTALGANAVAS 775
A A+ ++ A GA +IA+G NS A G S A G+ + +G ++ A DG A+GA A S
Sbjct: 79 AEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAASTAQKDG-VAIGARASTS 137

Query: 776 GVGSVATGAGSVASGANSSAYGTGSNAAGAGSVAIGQGATASGSNSVALGTGSVASEDNT 835
G VA G S A NS A G S+ A A+ S+A+G S +N+
Sbjct: 138 DTG-VAVGFNSKADAKNSVAIGHSSHVA------------ANHGYSIAIGDRSKTDRENS 184

Query: 836 VSVGSAGSERRITNVAAGVNATDAVNVGQLNSAVSGIQHQMDGMQGQIDTLA 887
VS+G R++T++AAG TDAVNV QL + Q + ++ A
Sbjct: 185 VSIGHESLNRQLTHLAAGTKDTDAVNVAQLKKEIEKTQENTNKRSAELLANA 236



Score = 41.4 bits (96), Expect = 1e-05
Identities = 75/328 (22%), Positives = 118/328 (35%), Gaps = 5/328 (1%)

Query: 480 ASGTNSTANGTNSTASGDNSTASGTNASATGENSTATGTDSTASGSNSTASGTNASATGE 539
A + N T S + A G A G +++A G +S A G A A
Sbjct: 25 ADDYDGIPNLTAVQISPNADPALGLEYPVRPPVPGAGGLNASAKGIHSIAIGATAEAAKG 84

Query: 540 NSTATGTDSTASGTNSTANGTNSTASGDNSTASGTNASATGENSTATGTASTASGSNSTA 599
+ A G S A+G NS A G S A GD++ G ++A + AST+ +
Sbjct: 85 AAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAASTAQKDGVAIGARASTSDTGVAVG 144

Query: 600 NGTNSTASGDNSTASGTNASATGENSTATGTASTASGSNSTANGTNSTASGDNSTASGTN 659
+ + A + ++ +A S A G S NS + G S A+GT
Sbjct: 145 FNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKTDRENSVSIGHESLNRQLTHLAAGTK 204

Query: 660 AS-----ATGENSTATGTASTASGSNSTANGANSTASGAGATATGENAAATGAGATATGN 714
+ A + +T S AN+ A ++ G T + + T
Sbjct: 205 DTDAVNVAQLKKEIEKTQENTNKRSAELLANANAYADNKSSSVLGIANNYTDSKSAETLE 264

Query: 715 NASASGTSSTAGGANAIASGENSTANGANSTASGNGSSAFGESAAAAGDGSTALGANAVA 774
NA + + N + NS A TA + +S + A + + A A+A
Sbjct: 265 NARKEAFAQSKDVLNMAKAHSNSVARTTLETAEEHANSVARTTLETAEEHANKKSAEALA 324

Query: 775 SGVGSVATGAGSVASGANSSAYGTGSNA 802
S + + ANS T SN+
Sbjct: 325 SANVYADSKSSHTLKTANSYTDVTVSNS 352



Score = 40.7 bits (94), Expect = 3e-05
Identities = 68/247 (27%), Positives = 105/247 (42%), Gaps = 6/247 (2%)

Query: 417 ASGTNASASGENSTATGTDSTASGSNSTANGTNSTASGDNSTASGTNASATGENSTATGT 476
A G NASA G +S A G + A+ + A G S A+G NS A G + A G+++ G
Sbjct: 60 AGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGA 119

Query: 477 DSTASGTNSTANGTNSTASGDNSTASGTNASATGENSTATGTDSTASGSN--STASGTNA 534
STA ST+ D A G N+ A +NS A G S + ++ S A G +
Sbjct: 120 ASTAQKDGVAIGARASTS--DTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRS 177

Query: 535 SATGENSTATGTDSTASGTNSTANGTNSTASGDNSTASGTNASATGENSTATGTASTASG 594
ENS + G +S A GT T + + A + +T +A +
Sbjct: 178 KTDRENSVSIGHESLNRQLTHLAAGTKDTDAVN--VAQLKKEIEKTQENTNKRSAELLAN 235

Query: 595 SNSTANGTNSTASGDNSTASGTNASATGENSTATGTASTASGSNSTANGTNSTASGDNST 654
+N+ A+ +S+ G + + + ++ T EN+ A + N +NS A T
Sbjct: 236 ANAYADNKSSSVLGIANNYTDSKSAETLENARKEAFAQSKDVLNMAKAHSNSVARTTLET 295

Query: 655 ASGTNAS 661
A S
Sbjct: 296 AEEHANS 302



Score = 40.3 bits (93), Expect = 3e-05
Identities = 83/344 (24%), Positives = 131/344 (38%), Gaps = 12/344 (3%)

Query: 392 LSTSVSGLQGSVSANTGTASGDNSTASGTNASASGENSTATGTDSTASGSNSTANGTNST 451
+S S + + S+ A + + T S A G + A G N++
Sbjct: 7 ISVSAALISALFSSPYAFADDYDGIPNLTAVQISPNADPALGLEYPVRPPVPGAGGLNAS 66

Query: 452 ASGDNSTASGTNASATGENSTATGTDSTASGTNSTANGTNSTASGDNSTASGTNASATGE 511
A G +S A G A A + A G S A+G NS A G S A GD++ G ++A +
Sbjct: 67 AKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAASTAQ-K 125

Query: 512 NSTATGTDSTASGSNSTASGTNASATGENSTATGTDSTASGTNSTANGTNSTASGDNSTA 571
+ A G ++ S + A G N+ A +NS A G S + AN S A GD S
Sbjct: 126 DGVAIGARASTSDT-GVAVGFNSKADAKNSVAIGHSSHVA-----ANHGYSIAIGDRSKT 179

Query: 572 SGTNASATGENSTATGTASTASGSNST-----ANGTNSTASGDNSTASGTNASATGENST 626
N+ + G S A+G+ T A +T + N+
Sbjct: 180 DRENSVSIGHESLNRQLTHLAAGTKDTDAVNVAQLKKEIEKTQENTNKRSAELLANANAY 239

Query: 627 ATGTASTASGSNSTANGTNSTASGDNSTASGTNASATGENSTATGTASTASGSNSTANGA 686
A +S+ G + + S + +N+ S N + S A + TA
Sbjct: 240 ADNKSSSVLGIANNYTDSKSAETLENARKEAFAQSKDVLNMAKAHSNSVARTTLETAEEH 299

Query: 687 NSTASGAGATATGENAAATGAGATATGNNASASGTSSTAGGANA 730
++ + E+A A A A+ N + S +S T AN+
Sbjct: 300 ANSVARTTLETAEEHANKKSAEALASANVYADSKSSHTLKTANS 343



Score = 39.9 bits (92), Expect = 5e-05
Identities = 106/459 (23%), Positives = 170/459 (37%), Gaps = 39/459 (8%)

Query: 502 SGTNASATGENSTATGTDSTASGSNSTASGTNASATGENSTATGTDSTASGTNSTANGTN 561
S A A + T S + A G A G +++A G +S A G
Sbjct: 19 SSPYAFADDYDGIPNLTAVQISPNADPALGLEYPVRPPVPGAGGLNASAKGIHSIAIGAT 78

Query: 562 STASGDNSTASGTNASATGENSTATGTASTASGSNSTANGTNSTASGDN-----STASGT 616
+ A+ + A G + ATG NS A G S A G ++ G STA D ++
Sbjct: 79 AEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAASTAQKDGVAIGARASTSD 138

Query: 617 NASATGENSTATGTASTASGSNS--TANGTNSTASGDNSTASGTNASATGENSTATGTAS 674
A G NS A S A G +S AN S A GD S N+ + G S
Sbjct: 139 TGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKTDRENSVSIGHESLNRQLTH 198

Query: 675 TASGSNSTANGANSTASGAGATATGENAAATGAGATATGNNASASGTSSTAGGANAIASG 734
A+G+ T + N T EN A A N + + +SS G AN
Sbjct: 199 LAAGTKDT-DAVNVAQLKKEIEKTQENTNKRSAELLANANAYADNKSSSVLGIAN----- 252

Query: 735 ENSTANGANSTASGNGSSAFGESAAAAGDGSTALGANAVASGVGSVATGAGSVASGANSS 794
N +S ++ +A E+ A + D A++ + ++ T S A ++
Sbjct: 253 -----NYTDSKSAETLENARKEAFAQSKDVLNMAKAHSNSVARTTLETAEEHANSVARTT 307

Query: 795 AYGTGSNAAGAGSVAIGQGATASGSNSVALGTGSVASEDNTVSVGSAGSERRITNVAAGV 854
+A + A+ + S S + + D TVS + + R
Sbjct: 308 LETAEEHANKKSAEALASANVYADSKSSHTLKTANSYTDVTVSNSTKKAIRESNQYT--- 364

Query: 855 NATDAVNVGQLNSAVSGIQHQMDGMQGQIDTLARDAYSGIAAATALTMIPDVDPGKTLAV 914
H+ + ++D L G+A++ AL + +
Sbjct: 365 ------------------DHKFRQLDNRLDKLDTRVDKGLASSAALNSLFQPYGVGKVNF 406

Query: 915 GIGTANFKGYQASALGATARITQNLKVKTGVSYSGSNYV 953
G ++ QA A+G+ R+ +N+ +K GV+Y+GS+ V
Sbjct: 407 TAGVGGYRSSQALAIGSGYRVNENVALKAGVAYAGSSDV 445


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2030  FIMBRIALPAPE  35     1e-04    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 35.4 bits (81), Expect = 1e-04
Identities = 32/135 (23%), Positives = 54/135 (40%), Gaps = 18/135 (13%)

Query: 198 VPLGDVRVDRFSGIGSTFADRNFSIGMTCTQPAGTYDIALTFSATADSSGAPGVLAITQG 257
V GD+ + G ++F++ M C GT + +T S+G G +
Sbjct: 46 VNWGDIEIQNLVQSGGN--QKDFTVDMNCPYSLGTMKVTIT------SNGQTGNSILVPN 97

Query: 258 ASSASGVGIQLLMN-------GSPVTFGTVLDAGSATA-GATLTIPMTAR--YYQTGRVV 307
S+ASG G+ + + G+ VT G+ + G T I + A+ Y + +
Sbjct: 98 TSTASGDGLLIYLYNSNNSGIGNAVTLGSQVTPGKITGTAPARKITLYAKLGYKGNMQSL 157

Query: 308 TPGAANGIATFAVSY 322
G + AT SY
Sbjct: 158 QAGTFSATATLVASY 172


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_2031  PF00577  779    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 779 bits (2014), Expect = 0.0
Identities = 292/861 (33%), Positives = 449/861 (52%), Gaps = 43/861 (4%)

Query: 2 LAAALAALSATARGQQALEFDPAFLELGGGQGGADLSVYATSNRVLPGVYPVSVFVNGEA 61
L A A + L F+P FL Q ADLS + + PG Y V +++N
Sbjct: 30 LFVACAFAAQAPLSSAELYFNPRFLA-DDPQAVADLSRFENGQELPPGTYRVDIYLNNGY 88

Query: 62 IERRDITFVSESARDGREDAIPCLSARMFDEWGVDIAAFAKLAQAGEDACVDIADSVPHA 121
+ RD+TF D + +PCL+ G++ A+ + + +DACV + + A
Sbjct: 89 MATRDVTFN---TGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDA 145

Query: 122 RTEFDSHQLRLNVTVPQAALKRRARGAVDPARWDQGIDAALLDYQLSAAQYAGGNFASAR 181
+ D Q RLN+T+PQA + RARG + P WD GI+A LL+Y S
Sbjct: 146 TAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNR---IGG 202

Query: 182 SRTTLYAGLRGAVNLGAWRLSHTSSFLRGL-----DGRNRFQIVNTFVQRDIAGWNSRLT 236
+ Y L+ +N+GAWRL +++ +N++Q +NT+++RDI SRLT
Sbjct: 203 NSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLT 262

Query: 237 AGEGTTPANIFDGFQFLGVQLNTDETMLPDSLQGYAPTVHGVAQTNAQVTIRQNGFVIYS 296
G+G T +IFDG F G QL +D+ MLPDS +G+AP +HG+A+ AQVTI+QNG+ IY+
Sbjct: 263 LGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYN 322

Query: 297 TYVPPGPFTIDDLYPTSSSGNLEVTITEADGHVTTFTQPYSAVPMLLRDGSWRYNVTAGQ 356
+ VPPGPFTI+D+Y +SG+L+VTI EADG FT PYS+VP+L R+G RY++TAG+
Sbjct: 323 STVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGE 382

Query: 357 YR-DGISGSHPSFAMATLARGLAGEFSLYGGFIGAGMYQSVLVGIGKNLGSIGAVSLDVS 415
YR P F +TL GL +++YGG A Y++ GIGKN+G++GA+S+D++
Sbjct: 383 YRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMT 442

Query: 416 HARSAVDLADSRTVSGHAFRVLYAKAVGSWGTDFRLLAYRYSTAGYRSFADAVQLRDGSE 475
A S L D G + R LY K++ GT+ +L+ YRYST+GY +FAD R
Sbjct: 443 QANS--TLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGY 500

Query: 476 PAAL------------------GAKRQRLEGTVNQRLGRLGSMYATVAVQTYWGSAARST 517
KR +L+ TV Q+LGR ++Y + + QTYWG++
Sbjct: 501 NIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDE 560

Query: 518 VYQLGHSGNWGRASYGLYAAYSKGSGVPSSWN-VSLSLSMPLEVFFGGARVRAPAGGSAN 576
+Q G + + ++ L + +K + ++L++++P + A+
Sbjct: 561 QFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQW--RHAS 618

Query: 577 VSYFVSRNNENHVNQQMTASGSSSEQ-RLNYSVGVAHS----SESDVSGSVSTSYLAPFG 631
SY +S + + G+ E L+YSV ++ S +G + +Y +G
Sbjct: 619 ASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYG 678

Query: 632 RYDASIDSGRGYTQAAFTAAGGMLWHGTGVLFTQPLGETVAVVDVPNVRGVRFEMHPGVS 691
+ Q + +GG+L H GV QPL +TV +V P + + E GV
Sbjct: 679 NANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVR 738

Query: 692 TDRAGEAVIPRLNPYRVNRIAVDQRRMPQDVEIRNPVSEVVPTRAAVVQTHFDSVVGLRA 751
TD G AV+P YR NR+A+D + +V++ N V+ VVPTR A+V+ F + VG++
Sbjct: 739 TDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKL 798

Query: 752 LFTLTRADGSFPPQGATAENDEGQVLGVVGMDGETFVAGLPAAEGHFVVRWGAARQNRCR 811
L TLT + P GA ++ Q G+V +G+ +++G+P A G V+WG C
Sbjct: 799 LMTLTH-NNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLA-GKVQVKWGEEENAHCV 856

Query: 812 VNYALPGKAANGAYTAVEATC 832
NY LP ++ T + A C
Sbjct: 857 ANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_2050  PF07675  31     0.008    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 30.8 bits (69), Expect = 0.008
Identities = 21/59 (35%), Positives = 28/59 (47%), Gaps = 3/59 (5%)

Query: 181 DYVIADPEPRGGRLAME-RGVTWAARRHDHRF--GAHYPWTLRLTPPQDGAPASVEIDT 236
DY I +PEP G++ + G AR D F G Y +T+R DG VE D+
Sbjct: 472 DYCITNPEPASGKMWIAGDGGNQPARYDDFAFEAGKKYTFTMRRAGMGDGTDMEVEDDS 530


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2052  INFPOTNTIATR  30     0.001    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 30.0 bits (67), Expect = 0.001
Identities = 12/27 (44%), Positives = 16/27 (59%)

Query: 48 AAAAVAMAMAMAMAANDVARRRTDADR 74
AA + +AM+ AMAA D TD D+
Sbjct: 7 TAAIMGLAMSTAMAATDATSLTTDKDK 33


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2056  ABC2TRNSPORT  56     2e-11    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 56.1 bits (135), Expect = 2e-11
Identities = 62/201 (30%), Positives = 100/201 (49%), Gaps = 10/201 (4%)

Query: 17 ASPLRILFGLTQPLLYLFVLGAALRSGTYAEIGG--YQAYIFPGVVGLSLM----FTAIS 70
A+ +L L +PL+YLF LGA L +GG Y A++ G+V S M F I
Sbjct: 30 AALASLLGHLAEPLIYLFGLGAGL-GVMVGRVGGVSYTAFLAAGMVATSAMTAATFETIY 88

Query: 71 AAVGIVHDRQTGLLNALLVSPVRRVDIALGKIGAGALLAWLQALLLLPFSPAIGIGLTAP 130
AA G + ++T A+L + +R DI LG++ A A L + + A+G
Sbjct: 89 AAFGRMEGQRT--WEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGY-TQWL 145

Query: 131 RLALLVAAMAFAALAFSALGLALALPFRSVIVFPVVSNTLLLPMFFLSGGLYPLDLAPDW 190
L + +A LAF++LG+ + S F ++ P+ FLSG ++P+D P
Sbjct: 146 SLLYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIV 205

Query: 191 IRAAAAFDPAAYGVDLMRGVL 211
+ AA F P ++ +DL+R ++
Sbjct: 206 FQTAARFLPLSHSIDLIRPIM 226


33  BURPS668_2085  BURPS668_2097  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias   Product
BURPS668_2085  2       12       2.477121  AsnC family transcriptional regulator
BURPS668_2086  1       11       2.341178  linear gramicidin synthetase subunit D
BURPS668_2087  0       9        1.931602  hypothetical protein
BURPS668_2088  -2      8        1.547084  hypothetical protein
BURPS668_2089  1       9        2.146958  polyhydroxybutyrate depolymerase
BURPS668_2090  1       11       1.547973  hypothetical protein
BURPS668_2091  1       11       1.723639  hypothetical protein
BURPS668_2092  1       12       2.780961  hypothetical protein
BURPS668_2093  1       12       3.021158  LacI family transcriptional regulator
BURPS668_2094  1       13       3.741392  gluconate 2-dehydrogenase
BURPS668_2095  0       14       3.612482  major facilitator transporter
BURPS668_2096  0       14       3.943854  2-keto-gluconokinase
BURPS668_2097  -2      14       3.469866  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2086  NUCEPIMERASE  53     2e-09    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 53.2 bits (128), Expect = 2e-09
Identities = 45/203 (22%), Positives = 78/203 (38%), Gaps = 37/203 (18%)

Query: 628 VLLTGATGFVGVHLLAQLLATTEAVIHCVVRARDAHDAERRVADKLRTYRLGVSERDRAR 687
L+TGA GF+G H+ +LL V+ + D +D L+ RL + + +
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVG-IDNLNDYYDV------SLKQARLELLAQPGFQ 55

Query: 688 IRCHAGDIAHDRLGMASADYDALSRC-----VDVVHHSA--SAVNF-IKPYAAMKRDNVD 739
H D+ AD + ++ + V S AV + ++ A N+
Sbjct: 56 F--HKIDL---------ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLT 104

Query: 740 GLVNVIRFAAAARVKALSLLSTISVYSWGHRITGKTVMREDDDLDQNLDAVCADIGYVKS 799
G +N++ +++ L S+ SVY + K DD +D + Y +
Sbjct: 105 GFLNILEGCRHNKIQHLLYASSSSVYG----LNRKMPFSTDDSVDHPVSL------YAAT 154

Query: 800 KWVMEKLADA-ARARGLPLITFR 821
K E +A + GLP R
Sbjct: 155 KKANELMAHTYSHLYGLPATGLR 177


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_2093  HTHTETR  35     3e-04    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 34.6 bits (79), Expect = 3e-04
Identities = 15/107 (14%), Positives = 40/107 (37%), Gaps = 1/107 (0%)

Query: 12 ATISDVAREAGTGKTSVSRYLNGETNVLSADLRQRIETAIERLNYRPNQMARGL-KRGRN 70
++ ++A+ AG + ++ + ++++ S E + R
Sbjct: 32 TSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELELEYQAKFPGDPLSVLRE 91

Query: 71 RLLGLLAADLTNPYTVEVLRGVEAACHALGYMPLICHAANELEMERR 117
L+ +L + +T ++ + C +G M ++ A L +E
Sbjct: 92 ILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNLCLESY 138


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_2095  TCRTETB  32     0.005    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 31.8 bits (72), Expect = 0.005
Identities = 32/139 (23%), Positives = 53/139 (38%), Gaps = 2/139 (1%)

Query: 244 IGVYGFVLWLPSIVKNGSALGMVATGWLSALP-YLAATIAMLAASWASDRFGSRKGFVWP 302
V GFV +P ++K+ L G + P ++ I DR G
Sbjct: 270 GTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIG 329

Query: 303 FLLIGAAAFAASYTLGSTHFWLSYALLVVAGAAMYAPYGPFFAIVPELLPKNVAGGAMAL 362
+ + AS+ L +T ++++ ++ V G + IV L + AG M+L
Sbjct: 330 VTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKT-VISTIVSSSLKQQEAGAGMSL 388

Query: 363 INSMGALGSFVGSYAVGYL 381
+N L G VG L
Sbjct: 389 LNFTSFLSEGTGIAIVGGL 407



Score = 30.6 bits (69), Expect = 0.014
Identities = 24/140 (17%), Positives = 55/140 (39%), Gaps = 5/140 (3%)

Query: 35 AAAGINQDLGISKGLSSLIGALFFLGYFFFQIPGAIYAERRSVKTLVFWSLVLWGACASL 94
+ I D ++ + F L + +++ +K L+ + +++ S+
Sbjct: 36 SLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCF-GSV 94

Query: 95 TGVV--SNIPSLMAIRFLLGVVEAAVMPAML-IFISNWFTKRERSRANTFLILGNPVTVL 151
G V S L+ RF+ G AA PA++ + ++ + K R +A + +
Sbjct: 95 IGFVGHSFFSLLIMARFIQGA-GAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEG 153

Query: 152 WMSVVSGYLVHEFGWRHMFV 171
+ G + H W ++ +
Sbjct: 154 VGPAIGGMIAHYIHWSYLLL 173



Score = 29.8 bits (67), Expect = 0.021
Identities = 26/109 (23%), Positives = 45/109 (41%), Gaps = 6/109 (5%)

Query: 268 TGWLSALPYLAATIAMLAASWASDRFGSRKGFVWPFLLIGAAAFAASYTLGSTHF-WLSY 326
T W++ L +I SD+ G ++ ++ ++ + +G + F L
Sbjct: 51 TNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGF--VGHSFFSLLIM 108

Query: 327 ALLVVA-GAAMYAPYGPFFAIVPELLPKNVAGGAMALINSMGALGSFVG 374
A + GAA + +V +PK G A LI S+ A+G VG
Sbjct: 109 ARFIQGAGAAAFPAL--VMVVVARYIPKENRGKAFGLIGSIVAMGEGVG 155


34  BURPS668_2113  BURPS668_2145  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
BURPS668_2113  -2      11       3.071749    hypothetical protein
BURPS668_2115  -1      11       2.029870    sensory box sigma-54 dependent transcriptional
BURPS668_2116  0       12       0.507512    metallo-beta-lactamase
BURPS668_2117  0       11       0.033927    pyridine nucleotide-disulfide family
BURPS668_2118  3       14       -2.112760   hypothetical protein
BURPS668_2119  2       12       -1.466137   hypothetical protein
BURPS668_2120  2       12       -1.413871   putrescine ABC transporter permease PotI
BURPS668_2121  2       12       -1.197987   putrescine ABC transporter permease PotH
BURPS668_2122  0       9        -0.973602   putrescine ABC transporter ATP-binding protein
BURPS668_2123  -2      9        -0.460629   putrescine ABC transporter periplasmic
BURPS668_2124  -1      8        0.962476    manganese/iron transporter
BURPS668_2125  -1      12       -1.598408   hypothetical protein
BURPS668_2126  -2      12       -2.234218   hypothetical protein
BURPS668_2127  -1      13       -2.438759   OmpW family outer membrane protein
BURPS668_2128  -1      14       -2.681638   2-nitropropane dioxygenase
BURPS668_2129  2       19       -4.479997   aldehyde dehydrogenase
BURPS668_2130  8       35       -8.614370   hypothetical protein
BURPS668_2131  9       40       -9.705582   ABC transporter ATP-binding protein
BURPS668_2132  9       44       -10.254519  hypothetical protein
BURPS668_2133  10      43       -10.198230  transposase
BURPS668_2134  9       47       -11.234121  transposase subfamily protein
BURPS668_2135  9       47       -10.898816  DNA-binding protein
BURPS668_2136  9       49       -10.833254  hypothetical protein
BURPS668_2137  8       44       -10.067767  hypothetical protein
BURPS668_2138  8       43       -9.959643   XRE family transcriptional regulator
BURPS668_2139  8       43       -9.815258   hypothetical protein
BURPS668_2140  7       43       -9.640053   hypothetical protein
BURPS668_2141  6       42       -9.620945   hypothetical protein
BURPS668_2142  7       45       -11.007542  Rhs element Vgr protein
BURPS668_2143  4       26       -9.630062   hypothetical protein
BURPS668_2144  2       16       -5.093237   Type II secretory pathway, pseudopilin PulG
BURPS668_2145  1       14       -3.503883   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
BURPS668_2115  HTHFIS  334    e-112    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 334 bits (857), Expect = e-112
Identities = 127/357 (35%), Positives = 181/357 (50%), Gaps = 42/357 (11%)

Query: 145 ERLTTVRSASAKPSGEGLVGGSDAFNAALSALQRVAPSMLPVLLLGESGTGKELFARALH 204
+ + G LVG S A L R+ + L +++ GESGTGKEL ARALH
Sbjct: 122 PKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALH 181

Query: 205 EASARAMGPFVVVDCSGIAETLFESELFGYEKGAFTGASARKPGLVETAQGGTLFLDEIG 264
+ R GPFV ++ + I L ESELFG+EKGAFTGA R G E A+GGTLFLDEIG
Sbjct: 182 DYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIG 241

Query: 265 DVPLSMQVKLLRLIESGTFRRVGGVEALCADFRLVAATHKPLKAMIGDGRFRPDLYYRIS 324
D+P+ Q +LLR+++ G + VGG + +D R+VAAT+K LK I G FR DLYYR++
Sbjct: 242 DMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLN 301

Query: 325 AYPISLPAVRERPGDMPLLVDSILRRIAALGPVAGQRFVVAPDALARLEAYAWPGNIREL 384
P+ LP +R+R D+P LV +++ G +AL ++A+ WPGN+REL
Sbjct: 302 VVPLRLPPLRDRAEDIPDLVRHFVQQAEKEG---LDVKRFDQEALELMKAHPWPGNVREL 358

Query: 385 RNVLDRACLLTDDGVIRVEHLPDEVAGGARIEPGAPA----------------------- 421
N++ R L VI E + +E+ P A
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 422 -------------KLSDDELARIARA---FGGTRRALAERVGMSERTLYRRLRALGI 462
L++ E I A G + A+ +G++ TL +++R LG+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2144  BCTERIALGSPG  114    6e-35    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 114 bits (288), Expect = 6e-35
Identities = 41/98 (41%), Positives = 59/98 (60%)

Query: 89 RRIAAAQDVAVITVALKLYRLDNGSYPTQAQGLQALIEKPTIDPIPNNWKETGYLTRLPN 148
+ A D+ + AL +Y+LDN YPT QGL++L+E PT+ P+ N+ + GY+ RLP
Sbjct: 41 DKQKAVSDIVALENALDMYKLDNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPA 100

Query: 149 DPWGHAYQFLNPGRHGEIDVYSVGPATDFKYSLSIGSW 186
DPWG+ Y +NPG HG D+ S GP + I +W
Sbjct: 101 DPWGNDYVLVNPGEHGAYDLLSAGPDGEMGTEDDITNW 138


35  BURPS668_2218  BURPS668_2223  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_2218  2       8        -0.776456  resuscitation promoting factor
BURPS668_2219  2       8        -1.117782  4-carboxymuconolactone decarboxylase
BURPS668_2220  3       9        -1.108345  MerR family transcriptional regulator
BURPS668_2221  4       8        -1.011232  comE protein
BURPS668_2222  3       8        -1.496557  hypothetical protein
BURPS668_2223  3       8        -1.201380  ATPase AAA
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_2221  PF06776  30     0.005    Invasion associated locus B
		>PF06776#Invasion associated locus B

Length = 214

Score = 30.3 bits (68), Expect = 0.005
Identities = 14/65 (21%), Positives = 22/65 (33%), Gaps = 1/65 (1%)

Query: 148 PTTSSATPAPSAATPAPATATPSTAASAPAA-KKSRSSKKQDKAAAASAAAQASAPAAAS 206
P T+ A PA A PA +P A+ A + A A + + A
Sbjct: 15 PVTNHAVPALKAIQMGPAELSPMLASCRRLARRNGARLMLAGAMAIALSFGWSDRADAQG 74

Query: 207 TTKAK 211
++
Sbjct: 75 AVRSV 79


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
BURPS668_2223  HTHFIS  35     0.002    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 34.8 bits (80), Expect = 0.002
Identities = 34/160 (21%), Positives = 58/160 (36%), Gaps = 28/160 (17%)

Query: 576 VVGQNEAIDAVADAIRRSRAGLADPNRPYGSFLFLGPTGVGKTELCKALAGFLFDSEEHL 635
+VG++ A+ + + R L + + G +G GK + +AL +
Sbjct: 139 LVGRSAAMQEIYRVLAR----LMQTDLT---LMITGESGTGKELVARALHDYGKRRNGPF 191

Query: 636 IRIDMSEFMEKHSVARLIGAPPGYVGYEEGGYLTEAVRRKPYSV-------ILLDEIEKA 688
+ I+M+ + L G+E+G + T A R + LDEI
Sbjct: 192 VAINMAAIPRDLIESEL-------FGHEKGAF-TGAQTRSTGRFEQAEGGTLFLDEIGDM 243

Query: 689 HPDVFNVLLQVLDDG---RMTDGQGRTVDFKNTVIVMTSN 725
D LL+VL G + D + IV +N
Sbjct: 244 PMDAQTRLLRVLQQGEYTTVGGRTPIRSDVR---IVAATN 280



Score = 32.9 bits (75), Expect = 0.005
Identities = 34/166 (20%), Positives = 63/166 (37%), Gaps = 27/166 (16%)

Query: 136 LESAIAAVRGGSQ-------VHSQDAESQREALKKYTVDLTERARAG-KLDPVIGRDDEI 187
+AI A G+ ++ AL + ++ P++GR +
Sbjct: 87 FMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAM 146

Query: 188 RRSIQILQRRTKNN-PVLI-GEPGVGKTAIVEGLAQRIVNGEVPETLKNKRVLSLDMAAL 245
+ ++L R + + ++I GE G GK + L +N ++++MAA+
Sbjct: 147 QEIYRVLARLMQTDLTLMITGESGTGKELVARALHDY-------GKRRNGPFVAINMAAI 199

Query: 246 --------LAGAKYRGEFEERLKAVLNDIAKDEGRTIVFIDEIHTM 283
L G + +G F + EG T+ F+DEI M
Sbjct: 200 PRDLIESELFGHE-KGAFTGAQTRSTGRFEQAEGGTL-FLDEIGDM 243


36  BURPS668_2245  BURPS668_2253  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_2245  -1      12       -4.220251  phosphate transporter family protein
BURPS668_2246  1       14       -4.698349  hypothetical protein
BURPS668_2247  2       17       -4.651585  replicative DNA helicase
BURPS668_2248  4       16       -1.221119  hypothetical protein
BURPS668_2249  2       13       -0.013635  50S ribosomal protein L9
BURPS668_2250  2       15       0.644879   30S ribosomal protein S18
BURPS668_2251  1       13       1.790810   primosomal replication protein N
BURPS668_2252  1       11       2.562619   30S ribosomal protein S6
BURPS668_2253  1       11       3.087133   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
BURPS668_2249  UREASE  29     0.009    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 28.9 bits (65), Expect = 0.009
Identities = 12/33 (36%), Positives = 17/33 (51%), Gaps = 5/33 (15%)

Query: 121 LKMIGEHGVQVALHTDVL-----VDVTVNVIGD 148
L + E+ VQV +HTD L V+ T+ I
Sbjct: 235 LSVADEYDVQVMIHTDTLNESGFVEDTIAAIKG 267


37  BURPS668_2301  BURPS668_2341  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_2301  -1      12       3.451001   peptidyl-prolyl cis-trans isomerase
BURPS668_2302  -1      12       4.196151   acetyltransferase
BURPS668_2303  -1      13       3.159929   phosphoribosylformylglycinamidine synthase
BURPS668_2304  -1      12       3.793757   hypothetical protein
BURPS668_2305  -2      12       3.744918   D-amino-acid dehydrogenase
BURPS668_2307  -2      12       2.926875   nitric-oxide reductase subunit C
BURPS668_2306  -2      10       -0.234355  hypothetical protein
BURPS668_2308  -2      13       -0.086651  glucose-6-phosphate isomerase
BURPS668_2309  -1      19       -0.209213  ABC transporter ATP-binding protein
BURPS668_2311  -1      21       -0.622459  acyl-CoA thioesterase
BURPS668_2310  -1      24       -1.449896  hypothetical protein
BURPS668_2312  -1      24       -1.362436  peptidyl-prolyl cis-trans isomerase D
BURPS668_2314  2       43       -1.202597  *hypothetical protein
BURPS668_2315  2       14       -3.767381  hypothetical protein
BURPS668_2316  3       11       -4.567466  hypothetical protein
BURPS668_2317  3       11       -4.846341  ribosomal subunit interface protein
BURPS668_2319  4       9        -4.687791  *hypothetical protein
BURPS668_2320  4       10       -4.388794  hypothetical protein
BURPS668_2321  4       10       -4.123663  ATP-dependent protease La
BURPS668_2322  1       8        -1.082150  ATP-dependent protease ATP-binding subunit ClpX
BURPS668_2323  1       9        0.707491   ATP-dependent Clp protease proteolytic subunit
BURPS668_2324  3       12       2.129768   trigger factor
BURPS668_2325  3       12       5.762760   hypothetical protein
BURPS668_2326  1       10       4.396768   hypothetical protein
BURPS668_2327  -2      8        3.749821   glycerate kinase
BURPS668_2328  -1      9        3.074734   MarR family transcriptional regulator
BURPS668_2329  -2      7        2.865830   hypothetical protein
BURPS668_2330  -2      7        2.181856   2-dehydropantoate 2-reductase
BURPS668_2331  -2      8        1.280310   LuxR family transcriptional regulator
BURPS668_2332  -1      12       -0.116657  outer membrane porin
BURPS668_2333  -2      16       -1.090163  major facilitator superfamily permease
BURPS668_2334  0       25       -4.337451  histone deacetylase
BURPS668_2335  3       33       -5.197984  endonuclease Nuc
BURPS668_2336  1       31       -4.496901  hypothetical protein
BURPS668_2337  1       32       -3.906852  exported avidin family protein
BURPS668_2338  2       28       -3.468813  hypothetical protein
BURPS668_2339  2       32       -4.730587  hypothetical protein
BURPS668_2340  1       30       -4.752623  PAAR motif-containing protein
BURPS668_2341  0       27       -4.028186  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2307  PYOCINKILLER  30     0.023    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.1 bits (67), Expect = 0.023
Identities = 53/222 (23%), Positives = 73/222 (32%), Gaps = 21/222 (9%)

Query: 84 DALVAAAELRRLGFAADAWMPIEVKPDDARWALERARAANVPIDEAAPESFDGYGWLVDG 143
+ L AA AA A E +A+ E I A + G +V
Sbjct: 205 NTLTAAKASIE---AAAANKAREQAAAEAKRKAEEQARQQAAIRAANTYAMPANGSVVAT 261

Query: 144 LFGIGLARPLDGAFAAIAQRIAARARHTGRVLALDVPSGLDSDTGARVGGGTAVTATCTL 203
G GL + GA A++AQ I+ GRVLA S G ++T +
Sbjct: 262 AAGRGLIQVAQGA-ASLAQAISDAIAVLGRVLA--------SAPSVMAVGFASLTYSSRT 312

Query: 204 SFIAAKPGLYTGDGRDLAGEIHVAPLDLGEPPAPAIRLNAPELFEAR--LPERAFASHKG 261
+ T D A + A L P++ LNA LP R +G
Sbjct: 313 AEQWQD---QTPDSVRYALGMDAAKLG----LPPSVNLNAVAKASGTVDLPMRLTNEARG 365

Query: 262 TYGSLGIVGGDTGMCGAPILAARAALFAGAGKVHVGFVGTGA 303
+L +V D + AA A G V T A
Sbjct: 366 NTTTLSVVSTDGVSVPKAVPVRMAAYNATTGLYEVTVPSTTA 407


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
BURPS668_2321  GPOSANCHOR  40     3e-05    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 40.4 bits (94), Expect = 3e-05
Identities = 35/192 (18%), Positives = 73/192 (38%), Gaps = 15/192 (7%)

Query: 92 KVLVEGLQRAQALSIEEQETQFSCEVMPLEPDHADSAETEALRRAIVSQFDQYVKLNKKI 151
L + L+ A S + + E A A+ E ++ K +
Sbjct: 193 AELEKALEGAMNFSTADSAKIKTLEAE-KAALAARKADLEKALEGAMNFSTADSAKIKTL 251

Query: 152 PPEILTSLSGIDEAGRLADTIAAHLPLKLDQKQHILEMFPVIERLEHLLAQLEAEIDILQ 211
E + E + + + + + LE A LE + +L
Sbjct: 252 EAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAE---KAALEAEKADLEHQSQVLN 308

Query: 212 VEKRIRGRVKRQMEKSQREYYLNEQVKAIQKELGEGEEGAD--LEELEKRINAARMPKEA 269
R ++R ++ S+ +Q++A ++L E + ++ + L + ++A+R EA
Sbjct: 309 AN---RQSLRRDLDASREAK---KQLEAEHQKLEEQNKISEASRQSLRRDLDASR---EA 359

Query: 270 KKKADAELKKLK 281
KK+ +AE +KL+
Sbjct: 360 KKQLEAEHQKLE 371


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2322  HTHFIS        31     0.009    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.0 bits (70), Expect = 0.009
Identities = 19/112 (16%), Positives = 40/112 (35%), Gaps = 12/112 (10%)

Query: 51 EAAAAGVEASLSKSDLPSPQEIRDILDQYVIGQERAKKILAVAVYNHYKRL-------KH 103
+A+ G L K E+ I+ + + +R L + + +
Sbjct: 92 KASEKGAYDYLPKP--FDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEI 149

Query: 104 LDKKDDVELSKSNILLIGPTGSGKTLLAQTLARL---LNVPFVIADATTLTE 152
+ + +++ G +G+GK L+A+ L N PFV + +
Sbjct: 150 YRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPR 201


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2332  ECOLNEIPORIN  69     3e-15    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 69.1 bits (169), Expect = 3e-15
Identities = 72/323 (22%), Positives = 118/323 (36%), Gaps = 37/323 (11%)

Query: 20 AATLAALSGPAHAQSTLTLYGVADAGVQYLSRADGQHAAWRLQN-----YGILPSQLGIK 74
A TLAAL P A + +TLYG AGV SR+ + A L S++G K
Sbjct: 7 ALTLAAL--PVAAMADVTLYGTIKAGV-ETSRSVAHNGAQAASVETGTGIVDLGSKIGFK 63

Query: 75 GEEDLGGGWRARFQLEQGINLNDGTATVPGYAFFRGAYVGMGGPAGTVTLGRQFSTLFDK 134
G+EDLG G +A +Q+EQ ++ + R +++G+ G G + +GR S L D
Sbjct: 64 GQEDLGNGLKAIWQVEQKASIAGTDSGW----GNRQSFIGLKGGFGKLRVGRLNSVLKDT 119

Query: 135 TLFYDPLWYASYSGQGVLVPLSANFVDHSIKFQSATFAGFDVEALAAMAGIAGNTRAGRV 194
+P S + + S+++ S FAG ++ A N AGR
Sbjct: 120 GDI-NPWDSKSDYLGVNKIAEPEARLI-SVRYDSPEFAGL-SGSVQ----YALNDNAGRH 172

Query: 195 ------LELGGQFTSRGLSASAVLHRSH-GTAQGGADRSAQRRDIGTFAARYAFASLPLT 247
+ + R H ++ R + + +AS+ +
Sbjct: 173 NSESYHAGFNYKNGGFFVQYGGAYKRHHQVQENVNIEKYQIHRLVSGYDNDALYASVAVQ 232

Query: 248 VHAGVQRLTGELDPARTIV-------WGGARYQASGRLGFAGGIYHTDSPTPQVGHPTLF 300
++T V +G + S GF G T+
Sbjct: 233 QQDAKLVEENYSHNSQTEVAATLAYRFGNVTPRVSYAHGFKGSFDATNYNN----DYDQV 288

Query: 301 IASTTCSLSKRTIAYLNLGYAKN 323
+ SKRT A ++ G+ +
Sbjct: 289 VVGAEYDFSKRTSALVSAGWLQE 311


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2333  TCRTETA       31     0.009    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 30.9 bits (70), Expect = 0.009
Identities = 38/135 (28%), Positives = 57/135 (42%), Gaps = 4/135 (2%)

Query: 254 AQTSGNVLAIASLMGIAGAALASYLGGRAARRAMLLAGYGILAASLVALAAAPNANGYTL 313
G +LA+ +LM A A + L R RR +LL A +A AP +
Sbjct: 42 TAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYI 101

Query: 314 A--IFGFKFAWTFVLPFMLASVAAVDATGRLIATLNLVIGSGLAAGPLAAGLMLDGGGTL 371
+ G A V +A + D R ++ G G+ AGP+ GLM GG +
Sbjct: 102 GRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLM--GGFSP 159

Query: 372 RALFSIAAAVSLVSL 386
A F AAA++ ++
Sbjct: 160 HAPFFAAAALNGLNF 174


38  BURPS668_2519  BURPS668_2536  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
BURPS668_2519  0       10       3.166452    hypothetical protein
BURPS668_2520  0       9        3.580935    dioxygenase, TauD/TfdA
BURPS668_2521  1       8        3.808460    hypothetical protein
BURPS668_2522  0       8        3.797193    non-ribosomal peptide synthase
BURPS668_2523  1       10       3.695029    hypothetical protein
BURPS668_2524  -1      11       3.755157    major facilitator family transporter
BURPS668_2525  0       12       3.528032    amino acid adenylation protein
BURPS668_2526  1       13       1.153073    adenylylsulfate kinase
BURPS668_2527  -1      12       0.579329    hypothetical protein
BURPS668_2528  -1      12       0.943321    hypothetical protein
BURPS668_2529  -2      11       1.058185    hypothetical protein
BURPS668_2530  -2      11       0.479952    RND family efflux transporter MFP subunit
BURPS668_2531  -3      11       -0.093546   AcrB/AcrD/AcrF family protein
BURPS668_2532  2       16       0.197625    hypothetical protein
BURPS668_2533  4       22       -0.710051   GDSL-like lipase/acylhydrolase domain/outer
BURPS668_2534  4       20       -2.986476   hypothetical protein
BURPS668_2535  5       29       -5.168977   hypothetical protein
BURPS668_2536  3       18       -1.241380   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2524  TCRTETA       39     2e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.4 bits (92), Expect = 2e-05
Identities = 57/271 (21%), Positives = 97/271 (35%), Gaps = 13/271 (4%)

Query: 74 AFTLPIALFALLSGVAADAWDRRTVMLLSQALMFSVALCLVALAAAGAMTPARLLVCMFV 133
+ L A + G +D + RR V+L+S +V ++A A + L + V
Sbjct: 51 LYALMQFACAPVLGALSDRFGRRPVLLVS-LAGAAVDYAIMATAPFLWV----LYIGRIV 105

Query: 134 GGCAGAMFQPAWQSAVTEQVPARELSAAIALDSFSMNFARTAGPALGGFIVASVSPNAAF 193
G GA + + + E + S F AGP LGG + SP+A F
Sbjct: 106 AGITGATG-AVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLM-GGFSPHAPF 163

Query: 194 V---LSGLSYAGLIYVLSRSIRGAAARPPVRERLATMLVQGVRYCGRARGIRGTLIRSSL 250
L RP R A + R+ + + +
Sbjct: 164 FAAAALNGLNFLTGCFLLPESHKGERRP--LRREALNPLASFRWARGMTVVAALMAVFFI 221

Query: 251 FGFLGSPVWALLPLFAKTQFGGEARTYGVLLASFGA-GAASGALGGAAGRARLGREALVR 309
+G AL +F + +F +A T G+ LA+FG + + A+ ARLG +
Sbjct: 222 MQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALM 281

Query: 310 LCTLTFAAGMLATAWSPCQAVAMLGLAVAGG 340
L + G + A++ +A + +
Sbjct: 282 LGMIADGTGYILLAFATRGWMAFPIMVLLAS 312



Score = 35.2 bits (81), Expect = 5e-04
Identities = 31/167 (18%), Positives = 58/167 (34%), Gaps = 8/167 (4%)

Query: 21 LAALRGPFAYRTFAAIWVAS-LVGNIGGSIQTVAASWLMTSMAPSPTMVSLVQTAFTLPI 79
LA+ R AA+ ++ +G + + T + + AF +
Sbjct: 200 LASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILH 259

Query: 80 ALF-ALLSGVAADAWDRRTVMLLSQALMFSVALCLVALAAAGAMTPARLLVCMFVGGCAG 138
+L A+++G A R ++L M + + LA A A ++ + G
Sbjct: 260 SLAQAMITGPVAARLGERRALMLG---MIADGTGYILLAFATRGWMAFPIMVLLASG--- 313

Query: 139 AMFQPAWQSAVTEQVPARELSAAIALDSFSMNFARTAGPALGGFIVA 185
+ PA Q+ ++ QV + + GP L I A
Sbjct: 314 GIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYA 360


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2530  RTXTOXIND     43     2e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 42.5 bits (100), Expect = 2e-06
Identities = 17/126 (13%), Positives = 42/126 (33%), Gaps = 21/126 (16%)

Query: 74 TVRSQVDGQITHVRFREGQQVRAGDVLVEIDRRALQAAADQATAKLEQDKATLANARLEL 133
++ + + + +EG+ VR GDVL+++ +A + + L Q + ++
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 134 ----------------ARHQRLAEMNAAPVQML-----DTWKARVNELHAQIRGDQAAVQ 172
Q ++E + L TW+ + + + +A
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERL 217

Query: 173 NARVAV 178
+
Sbjct: 218 TVLARI 223


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2531  ACRIFLAVINRP  757    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 757 bits (1956), Expect = 0.0
Identities = 273/1033 (26%), Positives = 495/1033 (47%), Gaps = 26/1033 (2%)

Query: 9 FIRYPVATCLMTAGILFAGVAAYFHLPVAPLPQVEFPTIQVSAVLPGADPVSVASTLAQP 68
FIR P+ ++ ++ AG A LPVA P + P + VSA PGAD +V T+ Q
Sbjct: 5 FIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQV 64

Query: 69 LETQFSKIPYVTQMTSQSTLS-STSIVLQFSLERSIDAAANDVQSAIDAAAAQLPADLPS 127
+E + I + M+S S + S +I L F D A VQ+ + A LP ++
Sbjct: 65 IEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQQ 124

Query: 128 PPTFQKVNPADSPIMLLSAISSTLPLTTID--DYVETRLTKSLSQIDGVGSVSIGGQQKP 185
+ S +M+ +S T D DYV + + +LS+++GVG V + G Q
Sbjct: 125 QGIS-VEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY- 182

Query: 186 SIRIQLDPVKLASRGLSSEDVRRALSGLSGVNPKGVFNGT------TRSYTIYTNGQLTE 239
++RI LD L L+ DV L + G GT + +I +
Sbjct: 183 AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKN 242

Query: 240 PAQW-NGAIVAYRDGTPVRIRDIGQAVLGPEDNTLAAWIDGRRAISVGIYKKPGANTVST 298
P ++ + DG+ VR++D+ + LG E+ + A I+G+ A +GI GAN + T
Sbjct: 243 PEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDT 302

Query: 299 VDKIRARLPELEASLPPSLKIAVLADRTQTIRASLLDIELTLLLNVVLVVVVIYAFLGSV 358
I+A+L EL+ P +K+ D T ++ S+ ++ TL ++LV +V+Y FL ++
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 359 RTTIIPAVTVPVSLFGACALMWVCGYSLDNISLMAMTIAVGFVVDDAIVMVENIARH-VE 417
R T+IP + VPV L G A++ GYS++ +++ M +A+G +VDDAIV+VEN+ R +E
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 418 AGERPLQAALKGLSETSFTIASISLSLVAVLLPLLLMSGIIGRMFREFAVTLSMTIIVSA 477
P +A K +S+ + I++ L AV +P+ G G ++R+F++T+ + +S
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 478 FVSLTLTPMMASYLLRAHRHDAGRPPRP--GLFERAFARTAAAYERALDVALRHRFVTLC 535
V+L LTP + + LL+ + G F F + Y ++ L L
Sbjct: 483 LVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYLL 542

Query: 536 AFFASVAASVFLYVGIPKGFFPQQDTGVITGISEAAQTISVEDMARHSMALAAIIRADPA 595
+ VA V L++ +P F P++D GV + + + E + + +
Sbjct: 543 IYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNEK 602

Query: 596 --VEHCQMAVGGSAYAGTTVNNGRWYITLKPRDQRDA---TADEVIRRLRPQFAKVPGVR 650
VE V G +++G N G +++LKP ++R+ +A+ VI R + + K+
Sbjct: 603 ANVES-VFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 651 MYLQAAQDVIIGARLARTQYQLTLQSA-DVGALTTWAPRLLARLSGLP-QLRDVASDQQV 708
+ ++ ++L Q+ ALT +LL + P L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLE 721

Query: 709 NGSALSVAIDRDQAARYGLTPEAIDGTLYDAFGSRQVAQYFTQLSTYKVIMETLPSLQRD 768
+ + + +D+++A G++ I+ T+ A G V + + K+ ++ +
Sbjct: 722 DTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRML 781

Query: 769 PGTLDRIYMKAPSGALVPLSSVARWTTDTVQPLSVNHQSHFPSVTISFNLAPGVSLGEAT 828
P +D++Y+++ +G +VP S+ + + + PS+ I APG S G+A
Sbjct: 782 PEDVDKLYVRSANGEMVPFSAFTT-SHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 829 AAIEAAQASLRMPPAVVGSFQGTAQAFQSTLATMPMLILSALIVAYLVLGALYGSFIHPW 888
A +E + ++P + + G + + + P L+ + +V +L L ALY S+ P
Sbjct: 841 ALMENLAS--KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPV 898

Query: 889 TILSTLPSAGVGAIATLWLFKYDFNLIALIGVILLIGIVKKNGIMMVDFAIAATRERNMT 948
+++ +P VG + LF ++ ++G++ IG+ KN I++V+FA +
Sbjct: 899 SVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKG 958

Query: 949 SLDAIRSACLLRLRPIMMTTMTALFGALPLMFTPGMGSELRQPLGYAMVGGLLVSQVLTL 1008
++A A +RLRPI+MT++ + G LPL + G GS + +G ++GG++ + +L +
Sbjct: 959 VVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAI 1018

Query: 1009 FTTPVIYLYLDTL 1021
F PV ++ +
Sbjct: 1019 FFVPVFFVVIRRC 1031



Score = 89.9 bits (223), Expect = 3e-20
Identities = 78/509 (15%), Positives = 163/509 (32%), Gaps = 37/509 (7%)

Query: 4 NLFAVFIRYPVATCLMTAGILFAGVAAYFHLPVAPLPQVEFPTIQVSAVLP-GADPVSVA 62
N + L+ A I+ V + LP + LP+ + LP GA
Sbjct: 528 NSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQ 587

Query: 63 STLAQPLETQF---SKIPYVTQMTSQSTLSSTS-------IVLQFSLERSIDAAANDVQS 112
L Q + + + S + + L+ ER + N ++
Sbjct: 588 KVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEER--NGDENSAEA 645

Query: 113 AIDAAAAQLPADLPSPPTFQKVNPADSPIMLLSAIS---------STLPLTTIDDYVETR 163
I A + L + I+ L + + L +
Sbjct: 646 VIHRAKME----LGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQL 701

Query: 164 LTKSLSQIDGVGSVSIGGQQ-KPSIRIQLDPVKLASRGLSSEDVRRALSGLSGVNPKGVF 222
L + + SV G + ++++D K + G+S D+ + +S G F
Sbjct: 702 LGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDF 761

Query: 223 NGTTRSYTIYTNGQ---LTEPAQWNGAIVAYRDGTPVRIRDIGQAVLGPEDNTLAAWIDG 279
R +Y P + V +G V + L +G
Sbjct: 762 IDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLER-YNG 820

Query: 280 RRAISVGIYKKPGANTVSTVDKIRARLPELEASLPPSLKIAVLADRTQTIRASLLDIELT 339
++ + PG ++ A + L + LP + + R S
Sbjct: 821 LPSMEIQGEAAPG----TSSGDAMALMENLASKLPAGIGYDW-TGMSYQERLSGNQAPAL 875

Query: 340 LLLNVVLVVVVIYAFLGSVRTTIIPAVTVPVSLFGACALMWVCGYSLDNISLMAMTIAVG 399
+ ++ V+V + + A S + + VP+ + G + D ++ + +G
Sbjct: 876 VAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIG 935

Query: 400 FVVDDAIVMVENI-ARHVEAGERPLQAALKGLSETSFTIASISLSLVAVLLPLLLMSGII 458
+AI++VE + G+ ++A L + I SL+ + +LPL + +G
Sbjct: 936 LSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAG 995

Query: 459 GRMFREFAVTLSMTIIVSAFVSLTLTPMM 487
+ + ++ + +++ P+
Sbjct: 996 SGAQNAVGIGVMGGMVSATLLAIFFVPVF 1024


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2533  IGASERPTASE   31     0.019    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.8 bits (69), Expect = 0.019
Identities = 40/211 (18%), Positives = 59/211 (27%), Gaps = 23/211 (10%)

Query: 335 SDRLSLFADVGYTRNFHG--AAGGMNAFDSDVEMFSIGADYKLSEASRAGALLSSGNANG 392
S+ + L Y RN + A N + + Y G L G
Sbjct: 1331 SNNVQLGGVFTYVRNSNNFDKATSKNTL----AQVNFYSKYYADNHWYLGIDLGYGKFQS 1386

Query: 393 SLAGGQGR-IGLHAYRLGVY--HAFERAGLFVRAYAGAGWSR-----YRL--DRAAVLPG 442
L H + G+ AF + G +S + L R V P
Sbjct: 1387 KLQTNHNAKFARHTAQFGLTAGKAFNLGNFGITPIVGVRYSYLSNADFALDQARIKVNPI 1446

Query: 443 AVRASTSGFDFGALVKAGYLFALGGVRLGPVADVGYTQLVARGYTEDGDPILAQNVGVQR 502
+V+ + + D Y + LG + P+ Y G A NV Q+
Sbjct: 1447 SVKTAFAQVDLS------YTYHLGEFSVTPILSARYDANQGSGKINVNGYDFAYNVENQQ 1500

Query: 503 LKGVSAGAGVRFAAPLAAIGRRAGELSAEAQ 533
L+ IG AE Q
Sbjct: 1501 QYNAGLKLKYH-NVKLSLIGGLTKAKQAEKQ 1530


39  BURPS668_2657  BURPS668_2672  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias     Product
BURPS668_2657  2       18       1.780230    hypothetical protein
BURPS668_2656  3       13       2.232663    hypothetical protein
BURPS668_2658  2       12       2.680912    hypothetical protein
BURPS668_2659  2       11       3.119915    hypothetical protein
BURPS668_2660  0       10       2.426999    N-formylglutamate amidohydrolase
BURPS668_2661  0       10       2.262609    hypothetical protein
BURPS668_2662  0       12       2.150237    N-formimino-L-glutamate deiminase
BURPS668_2663  -1      10       2.305885    imidazolonepropionase
BURPS668_2664  0       10       2.735435    hypothetical protein
BURPS668_2665  -1      8        2.609034    urocanate hydratase
BURPS668_2666  -1      8        3.738433    histidine utilization repressor
BURPS668_2667  -2      9        3.785421    histidine ammonia-lyase
BURPS668_2668  2       15       3.361861    hypothetical protein
BURPS668_2669  2       17       1.876183    hypothetical protein
BURPS668_2670  3       20       0.520137    hypothetical protein
BURPS668_2671  4       23       -1.193888   hypothetical protein
BURPS668_2672  2       14       1.541157    LuxR family transcriptional regulator
40  BURPS668_2704  BURPS668_2712  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias     Product
BURPS668_2704  2       10       2.234705    FeS assembly ATPase SufC
BURPS668_2705  2       13       1.871777    sufB/SufD domain-containing protein
BURPS668_2706  1       11       -0.005536   cysteine desulfurase SufS
BURPS668_2707  1       13       -1.826966   NifU family SUF system FeS assembly protein
BURPS668_2708  -1      14       -2.551356   phenylacetic acid degradation protein paaD
BURPS668_2709  -1      13       -3.954448   hypothetical protein
BURPS668_2710  0       8        -4.082092   transcriptional regulator
BURPS668_2711  0       7        -3.801759   cation-binding hemerythrin HHE family protein
BURPS668_2712  0       10       -3.588785   hypothetical protein
41  BURPS668_2810  BURPS668_2818  Y  N  Y  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias     Product
BURPS668_2810  0       14       -3.313932   membrane protein
BURPS668_2811  2       16       -3.633550   hypothetical protein
BURPS668_2812  3       17       -4.423602   nicotinate phosphoribosyltransferase
BURPS668_2813  5       19       -3.404901   hypothetical protein
BURPS668_2814  4       15       -2.028163   ferredoxin
BURPS668_2817  5       16       -2.065150   **transposase
BURPS668_2818  3       17       -0.140432   CreA protein
42  BURPS668_2827  BURPS668_2837  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
BURPS668_2827  3       11       3.304329    thioesterase
BURPS668_2828  2       11       2.532411    hypothetical protein
BURPS668_2829  -1      7        2.015469    hypothetical protein
BURPS668_2830  -1      8        2.046713    ArsR family transcriptional regulator
BURPS668_2831  0       8        1.481702    hypothetical protein
BURPS668_2832  1       9        1.746335    translation initiation factor 2
BURPS668_2833  -1      16       -1.550468   thymidylate synthase
BURPS668_2834  3       21       0.197759    sigma-54 dependent trancsriptional regulator
BURPS668_2835  10      34       2.422222    hypothetical protein
BURPS668_2836  5       19       2.900634    hypothetical protein
BURPS668_2837  4       16       3.006411    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2830  adhesinmafb   32     0.002    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 32.0 bits (72), Expect = 0.002
Identities = 17/67 (25%), Positives = 28/67 (41%), Gaps = 3/67 (4%)

Query: 36 SARPAGELTMIAGLSPSAASAHLARLTDGGLLAL---DVRGRHRYYRIATPDIAAAIEAL 92
R A + + ++P A A + G +A + R + P+ A +EA+
Sbjct: 254 GTRYAIDKAAMRNIAPLPAEGKFAVIGGLGSVAGFEKNTREAVDRWIQENPNAAETVEAV 313

Query: 93 ANVAQAA 99
NVA AA
Sbjct: 314 FNVAAAA 320


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2832  IGASERPTASE   43     6e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 42.7 bits (100), Expect = 6e-06
Identities = 42/263 (15%), Positives = 63/263 (23%), Gaps = 12/263 (4%)

Query: 583 NPAARAGERPQPNMPQPNAAQPNAAQPNIARPGQPQPGVAQPTAPHAPGTPPNAMRPDAA 642
NP + + N PN Q P P AP PP P
Sbjct: 982 NPEVEKRNQT---VDTTNITTPNNIQ--ADVPSVPSNNEEIARVDEAPVPPPAPATPSET 1036

Query: 643 RPNEARPAPAPSARNGVPRPPAAVENPGMRDEARAPGEAPRPQPSWTQPHPPIQQQRANE 702
A + S A R+ A+ EA + TQ + Q +
Sbjct: 1037 TETVAENSKQESKTVEKNEQDATETTAQNREVAK---EAKSNVKANTQTNEVAQSGSETK 1093

Query: 703 GGPRASGEPNAPLNYRSPTQNALPPIRSTPTPTHSAPPAPPPAERAQPQPQPGPTPRNAM 762
+ A + + + P T P +E QPQ +P +
Sbjct: 1094 ETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTV 1153

Query: 763 RAPEAPRQEVAPPAPRNEYRAPAPAPRPQIE--APRMEAPRMP-APRAEAP-RMEPRPAP 818
E Q + + + + P P +P
Sbjct: 1154 NIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNS 1213

Query: 819 PPPAVPHNPPPAPRQEPPHQARP 841
P N + PH P
Sbjct: 1214 ESSNKPKNRHRRSVRSVPHNVEP 1236


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2834  HTHFIS        376    e-129    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 376 bits (966), Expect = e-129
Identities = 133/388 (34%), Positives = 202/388 (52%), Gaps = 40/388 (10%)

Query: 101 FDYVTVPYECDRIVESVGHAYGMVTLSEGLAPAAATVRNEGEMVGTCEAMLALFKMIRKV 160
+DY+ P++ ++ +G A + + ++ +VG AM +++++ ++
Sbjct: 99 YDYLPKPFDLTELIGIIGRA--LAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARL 156

Query: 161 ASTDAPVFISGESGTGKELTAVAIHERSSRAGAPFVAINCGAIPPTLLQAELFGYERGAF 220
TD + I+GESGTGKEL A A+H+ R PFVAIN AIP L+++ELFG+E+GAF
Sbjct: 157 MQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAF 216

Query: 221 TGANQRKIGRIEAANGGTLFLDEIGDLPFESQASLLRFLQEHKVERVGGHQSIPVDVRII 280
TGA R GR E A GGTLFLDEIGD+P ++Q LLR LQ+ + VGG I DVRI+
Sbjct: 217 TGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIV 276

Query: 281 SATHVDMQIALRNGRFREDLYHRLCVLKLEEPPLRERGKDIEILARHMLERFKGDAHRRL 340
+AT+ D++ ++ G FREDLY+RL V+ L PPLR+R +DI L RH +++ + + +
Sbjct: 277 AATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEG-LDV 335

Query: 341 RGFTPDAIAALHNYAWPGNVRELINRVRRAIVMSEGRMISAADLELSGYAEVA------- 393
+ F +A+ + + WPGNVREL N VRR + +I+ +E +E+
Sbjct: 336 KRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKA 395

Query: 394 ------------------------------PMSLEEARESAERHAIEVALLRHRGRLADA 423
+ E I AL RG A
Sbjct: 396 AARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKA 455

Query: 424 ARELGVSRVTLYRLLCAYGMRDDGGARA 451
A LG++R TL + + G+ +R+
Sbjct: 456 ADLLGLNRNTLRKKIRELGVSVYRSSRS 483


43  BURPS668_2877  BURPS668_2896  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
BURPS668_2877  1       11       -3.185423   rfaE protein
BURPS668_2878  1       10       -3.696583   UDP-glucose 6-dehydrogenase
BURPS668_2879  1       10       -1.862715   tetratricopeptide repeat protein
BURPS668_2880  1       10       -1.485980   hypothetical protein
BURPS668_2881  0       9        -1.682028   integration host factor subunit beta
BURPS668_2882  0       11       -1.775790   30S ribosomal protein S1
BURPS668_2883  -1      9        -1.271078   cytidylate kinase
BURPS668_2884  -2      9        -1.487518   bifunctional prephenate
BURPS668_2885  -2      10       -3.547479   chorismate mutase/prephenate dehydratase
BURPS668_2886  -1      12       -4.439453   phosphoserine aminotransferase
BURPS668_2887  -1      13       -4.140967   hypothetical protein
BURPS668_2888  -2      12       -2.703490   DNA gyrase subunit A
BURPS668_2890  5       22       -2.541243   hypothetical protein
BURPS668_2889  2       17       -1.160110   hypothetical protein
BURPS668_2891  1       15       -0.535509   ompA family protein
BURPS668_2892  -1      10       1.790559    3-demethylubiquinone-9 3-methyltransferase
BURPS668_2893  1       14       1.508116    phosphoglycolate phosphatase
BURPS668_2895  0       17       1.044302    hypothetical protein
BURPS668_2896  3       12       2.279005    phospholipid N-methyltransferase PmtA
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2881  DNABINDINGHU  109    8e-35    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 109 bits (273), Expect = 8e-35
Identities = 35/89 (39%), Positives = 58/89 (65%), Gaps = 1/89 (1%)

Query: 2 TKSELVAQLASRFPQLVLKDADFAVKTMLDAMSDALSKGHRIEIRGFGSFGLNRRPARVG 61
K +L+A++A +L KD+ AV + A+S L+KG ++++ GFG+F + R AR G
Sbjct: 3 NKQDLIAKVAEA-TELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKG 61

Query: 62 RNPKSGEKVQVPEKHVPHFKPGKELRERV 90
RNP++GE++++ VP FK GK L++ V
Sbjct: 62 RNPQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2891  OMPADOMAIN    168    4e-53    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 168 bits (426), Expect = 4e-53
Identities = 75/146 (51%), Positives = 99/146 (67%), Gaps = 3/146 (2%)

Query: 74 AQAPAPAPVAPVAPAITSQKITYQADTLFDFDKAVLKPAGKQKLDELAAKIQGMNVE--V 131
AP AP AP + ++ T ++D LF+F+KA LKP G+ LD+L +++ ++ +
Sbjct: 195 EAAPVVAPAPAPAPEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGS 254

Query: 132 VVATGYTDRIGSDKYNDRLSLRRAQAVKSYLVSKGVPANKVYTEGKGKRNPVTGNTC-KQ 190
VV GYTDRIGSD YN LS RRAQ+V YL+SKG+PA+K+ G G+ NPVTGNTC
Sbjct: 255 VVVLGYTDRIGSDAYNQGLSERRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNV 314

Query: 191 KNRKQLIACLAPDRRVEVEVVGTQEV 216
K R LI CLAPDRRVE+EV G ++V
Sbjct: 315 KQRAALIDCLAPDRRVEIEVKGIKDV 340


44  BURPS668_2926  BURPS668_2943  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
BURPS668_2926  5       35       -1.730352   hypothetical protein
BURPS668_2928  3       28       -0.155187   *hypothetical protein
BURPS668_2929  2       18       0.868574    hypothetical protein
BURPS668_2930  0       9        0.709818    hypothetical protein
BURPS668_2931  0       9        0.557056    hypothetical protein
BURPS668_2932  -1      9        1.268902    hypothetical protein
BURPS668_2933  0       9        1.396605    hypothetical protein
BURPS668_2934  1       9        1.537082    hypothetical protein
BURPS668_2935  1       11       1.312296    TonB-dependent receptor
BURPS668_2936  2       19       0.602931    hypothetical protein
BURPS668_2937  1       18       0.316518    hypothetical protein
BURPS668_2938  5       15       -1.725667   hypothetical protein
BURPS668_2939  4       13       -1.950968   chorismate mutase
BURPS668_2940  -1      8        -3.303067   exonuclease
BURPS668_2941  0       10       -4.660040   cold-shock domain-contain protein
BURPS668_2942  -2      9        -3.849447   hypothetical protein
BURPS668_2943  -4      9        -3.112151   outer membrane porin
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2935  SURFACELAYER  30     0.029    Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 30.4 bits (68), Expect = 0.029
Identities = 18/49 (36%), Positives = 25/49 (51%), Gaps = 1/49 (2%)

Query: 13 RLAAACAAALAWPAAHAASTAAAVPADSTPAAAAEMTASGKTLDTVKVT 61
R+ +A AAAL A A+TA V A +T A + + A+ V VT
Sbjct: 6 RIVSAAAAALL-AVAPIAATAMPVNAATTINADSAINANTNAKYDVDVT 53


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2943  ECOLNEIPORIN  127    5e-36    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 127 bits (321), Expect = 5e-36
Identities = 89/386 (23%), Positives = 143/386 (37%), Gaps = 62/386 (16%)

Query: 1 MKKTLIVAALSGVFATAAHAQSSVTLYGLIDAGITYTNNQGGHSAWS-----QSTGSVNG 55
MKK+LI L+ + A + VTLYG I AG+ + + + A + + G
Sbjct: 1 MKKSLIALTLAALPVAAM---ADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLG 57

Query: 56 SRWGLRGAEDLGGGLKAIFVLENGFGINNGTLKQNGREFGRQAFVGLSHEQYGALTLGRQ 115
S+ G +G EDLG GLKAI+ +E I + RQ+F+GL +G L +GR
Sbjct: 58 SKIGFKGQEDLGNGLKAIWQVEQKASIAGT----DSGWGNRQSFIGLK-GGFGKLRVGRL 112

Query: 116 YDSVVDYLG--PLSLTGTQFGGTQFAHPFDNDNLNNSFRVNNAVKYTSVNWAGLKFGALY 173
+ D P G + A P + S V+Y S +AGL Y
Sbjct: 113 NSVLKDTGDINPWDSKSDYLGVNKIAEP---EARLIS------VRYDSPEFAGLSGSVQY 163

Query: 174 GFSNNNQFANNRAYSAGVSYSYAGFNIGAGYLQLNNNFGPTVSNASGAVALDNTFVGKRQ 233
++N N+ +Y AG +Y GF + G ++ ++
Sbjct: 164 ALNDNAGRHNSESYHAGFNYKNGGFFVQYGGAYKRHHQV------------QENVNIEKY 211

Query: 234 RVFGGGLNYTFGPATAGFVFTQSRVNRATAIGAGASGVSSGIALDGTFMRFNNYEVNARY 293
++ Y A + + A + S S RF N
Sbjct: 212 QIHRLVSGYD---NDALYASVAVQQQDAKLVEENYSHNSQTEVAATLAYRFG----NVTP 264

Query: 294 AITPAWTVAGSYTYTAGFIENHHPGWNQFNLQTAYALSKRTDVYLQGVYQKVNNDGTGLG 353
++ A GS+ T N++ ++Q + Y SKRT + + + +G G
Sbjct: 265 RVSYAHGFKGSFDAT-----NYNNDYDQVVVGAEYDFSKRTSALVSAGWLQ---EGKGES 316

Query: 354 AYINGIGGMSSTEKQIAVTAGLRHRF 379
++ A GLRH+F
Sbjct: 317 KFV-----------STAGGVGLRHKF 331


45  BURPS668_2965  BURPS668_2993  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
BURPS668_2965  -1      21       -3.709433   hypothetical protein
BURPS668_2966  -1      22       -4.886234   cell division topological specificity factor
BURPS668_2967  -2      22       -4.411040   septum site-determining protein MinD
BURPS668_2968  1       30       -6.293115   septum formation inhibitor
BURPS668_2969  3       35       -8.468173   acetyltransferase
BURPS668_2970  4       40       -10.247512  hypothetical protein
BURPS668_2971  7       35       -9.968091   hypothetical protein
BURPS668_2972  6       34       -9.751607   hypothetical protein
BURPS668_2973  6       48       -10.739608  phage replication protein
BURPS668_2974  7       55       -10.997294  hypothetical protein
BURPS668_2975  7       61       -11.506009  hypothetical protein
BURPS668_2976  7       61       -11.654551  hypothetical protein
BURPS668_2977  6       61       -11.826776  hypothetical protein
BURPS668_2978  6       59       -11.409982  hypothetical protein
BURPS668_2979  5       55       -11.544590  phage-related membrane protein
BURPS668_2980  3       36       -8.259006   phage-related membrane protein
BURPS668_2981  1       27       -6.896805   Type II secretory pathway, component PulD
BURPS668_2982  -1      13       -3.081302   hypothetical protein
BURPS668_2983  0       11       -2.544115   hypothetical protein
BURPS668_2984  -1      9        -1.697768   site-specific recombinases, DNA invertase Pin
BURPS668_2986  0       13       -0.848837   *seryl-tRNA synthetase
BURPS668_2987  0       11       -0.087322   hypothetical protein
BURPS668_2988  0       12       0.163455    recombination factor protein RarA
BURPS668_2989  1       12       -0.296118   outer-membrane lipoprotein carrier protein
BURPS668_2990  1       12       -0.796297   DNA translocase FtsK
BURPS668_2991  2       14       -1.305312   thioredoxin-disulfide reductase
BURPS668_2992  1       12       -1.080874   Smr domain-containing protein
BURPS668_2993  2       14       -2.397086   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2969  SACTRNSFRASE  35     4e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 35.3 bits (81), Expect = 4e-05
Identities = 15/59 (25%), Positives = 21/59 (35%)

Query: 81 RRAFVSELWVAHAVRHLRGGVLLVNTARAWLATRGAGEIYAWIADQNRSAIRFYEHVGF 139
A + ++ VA R G L++ A W + D N SA FY F
Sbjct: 88 GYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2978  PF05616       66     1e-13    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 66.3 bits (161), Expect = 1e-13
Identities = 35/104 (33%), Positives = 46/104 (44%), Gaps = 21/104 (20%)

Query: 396 PGTDVVIAPDVRPNPNPA---------------------PYPLPAPNPSPVPNPEPVPVP 434
PGT V + P N NP P P P + PN +P+P
Sbjct: 272 PGTKVNMGPVTDRNGNPVQVVATFGRDSQGNTTVDVQVIPRPDLTPGSAEAPNAQPLPEV 331

Query: 435 NPEPNPSPNPEPNPNPNPNPNPNPNPNPNPNPNPNPNPNPNPDP 478
+P NP+ NP PN NP PNP P+P+ NP+ NP+ + P P
Sbjct: 332 SPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQPGTRP 375



Score = 66.3 bits (161), Expect = 1e-13
Identities = 35/100 (35%), Positives = 46/100 (46%), Gaps = 1/100 (1%)

Query: 377 ADPFAVIPTLADLLSRAATPGTDVVIAPDVRPNPNPAPYPLPAPNPSPVPNPEPVPVPNP 436
+P V+ T T V+ PD+ P AP P P SP NP P PN
Sbjct: 286 GNPVQVVATFGRDSQGNTTVDVQVIPRPDLTPGSAEAPNAQPLPEVSPAENPANNPAPNE 345

Query: 437 EPNPSPNPEPNPNPNPNPNPNPNPNPNPNPNPNPNPN-PN 475
P PNPEP+P+ NP+ NP+ + P P+ P+ PN
Sbjct: 346 NPGTRPNPEPDPDLNPDANPDTDGQPGTRPDSPAVPDRPN 385



Score = 60.5 bits (146), Expect = 9e-12
Identities = 27/73 (36%), Positives = 36/73 (49%)

Query: 406 VRPNPNPAPYPLPAPNPSPVPNPEPVPVPNPEPNPSPNPEPNPNPNPNPNPNPNPNPNPN 465
V P P+ P APN P+P P P P P+ NP PNP P+P+ NP+ NP+ +
Sbjct: 309 VIPRPDLTPGSAEAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTD 368

Query: 466 PNPNPNPNPNPDP 478
P P+ P
Sbjct: 369 GQPGTRPDSPAVP 381



Score = 36.6 bits (84), Expect = 3e-04
Identities = 21/74 (28%), Positives = 29/74 (39%)

Query: 444 PEPNPNPNPNPNPNPNPNPNPNPNPNPNPNPNPDPSIAQNVNVVNVPSVNVINRVAVDVG 503
P P+ P PN P P +P NP NP P+ + N P +N D
Sbjct: 311 PRPDLTPGSAEAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQ 370

Query: 504 PDPEIKTPSLEDTP 517
P +P++ D P
Sbjct: 371 PGTRPDSPAVPDRP 384


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2981  PHAGEIV       110    6e-29    Gene IV protein signature.
		>PHAGEIV#Gene IV protein signature.

Length = 426

Score = 110 bits (276), Expect = 6e-29
Identities = 82/381 (21%), Positives = 153/381 (40%), Gaps = 33/381 (8%)

Query: 10 VDLRFVPVAQVVDLIYADMLKVPYVIAPEVLADQRLISFRFDARDGDVRNFMRGFLASLG 69
+++ + V Y+ +++P+V + + D + ++R+F L +
Sbjct: 24 IEMNNSSLRDFVT-WYSKQTGESVIVSPDVKG--TVTVYSSDVKPENLRDFFISVLRANN 80

Query: 70 FTVDTRDGMDYVSKRDAISSEPQDSYVYHPRYRDAEYLSRLVQPLFEGRFTTNRDIATSA 129
F + + K + QD P + EY G F D T
Sbjct: 81 FDM-VGSIPSIIQK---YNPNNQDYIDELPSSDNQEYDD--NSAPSGGFFVPQNDNVTQT 134

Query: 130 ----NARMRAASP----------PTSAAAILDQATDALVFIGSAGEISALKKLLPQVDTP 175
N R + S+ + ++ LV + L + L VD P
Sbjct: 135 FKINNVRAKDLIRVVELFVKSNTSKSSNVLSVDGSNLLVVSAPKDILDNLPQFLSTVDLP 194

Query: 176 VGQVAVRAWVYEVSKQEDNNSAFQLALRLLGGRVGASIGSGTIGDDSNAIRLRAGGFEAA 235
Q+ + ++EV Q+ + F A G V + + + ++ G F
Sbjct: 195 TDQILIEGLIFEV--QQGDALDFSFAAGSQRGTVAGGVNTDRLTSVLSSAGGSFGIFNGD 252

Query: 236 I-----AALNSDSRFRVVTSPNLRVRSGQLARLNVGQSVPVV-GSVSYPSA-SGAPVQSV 288
+ AL ++S ++++ P + SGQ ++VGQ+VP + G V+ SA P Q+V
Sbjct: 253 VLGLSVRALKTNSHSKILSVPRILTLSGQKGSISVGQNVPFITGRVTGESANVNNPFQTV 312

Query: 289 QYQDAGVIFQVQPTVKASAIDLNVIEEISDFVRTTTGVNNSPTKNTRKLESSFSVEDGDA 348
+ Q+ G+ V P A + I +D + ++T ++ T N R + ++ ++ DG
Sbjct: 313 ERQNVGISMSVFPVAMAGGNIVLDITSKADSLSSSTQASDVIT-NQRSIATTVNLRDGQT 371

Query: 349 VLIGGLTQDKESHLDSGLSFL 369
+L+GGLT K + DSG+ FL
Sbjct: 372 LLLGGLTDYKNTSQDSGVPFL 392


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2984  FbpA_PF05833  29     0.011    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 29.1 bits (65), Expect = 0.011
Identities = 12/86 (13%), Positives = 32/86 (37%), Gaps = 8/86 (9%)

Query: 72 KLDRIARSLKHLLSIIDRLQERGAGFESLTEHIDTNSPAGRLMLQMLGAFAEFEREMIRE 131
K +++ +S + + + +E S+ +I+ + E ++E+I
Sbjct: 389 KYNKLKKSEEAANEQLLQNEEELNYLYSVLTNINNADNYDEI--------EEIKKELIET 440

Query: 132 RTKSGMQAAKRRGIRLGRPRSLESED 157
+ K + + +P S+D
Sbjct: 441 GYIKFKKIYKSKKSKTSKPMHFISKD 466


46  BURPS668_3032  BURPS668_3054  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
BURPS668_3032  3       34       -6.637895   lipoprotein
BURPS668_3033  5       43       -8.875254   hypothetical protein
BURPS668_3034  7       61       -12.223602  hypothetical protein
BURPS668_3035  7       61       -12.584814  NAD-dependent deacetylase
BURPS668_3036  11      72       -16.422532  Sir2 family transcriptional regulator
BURPS668_3037  11      71       -16.838687  hypothetical protein
BURPS668_3038  10      59       -13.614096  hypothetical protein
BURPS668_3039  10      62       -14.676176  hypothetical protein
BURPS668_3040  6       41       -12.561719  bacteriophage/transposase fusion protein
BURPS668_3041  5       38       -12.482112  hypothetical protein
BURPS668_3042  4       36       -11.959813  hypothetical protein
BURPS668_3043  4       37       -11.869032  hypothetical protein
BURPS668_3044  4       39       -12.246776  hypothetical protein
BURPS668_3045  4       38       -11.531616  Signal recognition particle GTPase
BURPS668_3046  4       37       -11.738968  hypothetical protein
BURPS668_3047  3       39       -11.422568  phage integrase
BURPS668_3048  4       43       -11.067404  beta-lactamase superfamily hydrolase
BURPS668_3049  9       68       -15.522357  hypothetical protein
BURPS668_3050  8       57       -12.660285  hypothetical protein
BURPS668_3051  6       49       -11.232766  hypothetical protein
BURPS668_3052  5       45       -10.444788  hypothetical protein
BURPS668_3053  5       44       -9.604554   hypothetical protein
BURPS668_3054  6       44       -9.525593   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3032  SUBTILISIN  54  3e-10  Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 54.1 bits (130), Expect = 3e-10
Identities = 26/108 (24%), Positives = 47/108 (43%), Gaps = 10/108 (9%)

Query: 149 VHAIAPKAKI----VLVEAASNSFNDLMTAVDVAVGAGASVVSMSFGGSEFSSETSFDSH 204
V +AP+A + VL + S ++ ++ + A+ ++SMS GG E E
Sbjct: 103 VVGVAPEADLLIIKVLNKQGSGQYDWIIQGIYYAIEQKVDIISMSLGGPEDVPELHEAVK 162

Query: 205 FGAPSNVTFVASSGDSGNGTE------YPAASPYVVAVGGTTLSADAS 246
S + + ++G+ G+G + YP V++VG AS
Sbjct: 163 KAVASQILVMCAAGNEGDGDDRTDELGYPGCYNEVISVGAINFDRHAS 210


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3041  ACRIFLAVINRP  27  0.014  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 26.7 bits (59), Expect = 0.014
Identities = 9/61 (14%), Positives = 19/61 (31%), Gaps = 4/61 (6%)

Query: 1 MLEINANRRREKVQRATATAAELKAFNAMCTELAKNLGFKSLKELHEKTGDAEVSMKILM 60
+ + + V+ + A + F SLK E+ GD + ++
Sbjct: 593 VTDYYLKNEKANVESVFTVNGFSFSGQA----QNAGMAFVSLKPWEERNGDENSAEAVIH 648

Query: 61 A 61

Sbjct: 649 R 649


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3049  CHANLCOLICIN  27  0.018  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 27.3 bits (60), Expect = 0.018
Identities = 11/75 (14%), Positives = 35/75 (46%), Gaps = 3/75 (4%)

Query: 21 ARWHRTAEDRARQRRFMEGIQTMIDQQLEIMQMEKDEPITLTPEDQSWDLVQVLRKSLPD 80
+ ++ ++R+ +E + ++QL++ + E+ L+ E ++ V++ +K L
Sbjct: 144 EAAEKAFQEAEQRRKEIEREKAETERQLKLAEAEEKRLAALSEEAKA---VEIAQKKLSA 200

Query: 81 LRNELAHGSSTLSNQ 95
++E+ +
Sbjct: 201 AQSEVVKMDGEIKTL 215


47  BURPS668_3088  BURPS668_3102  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BURPS668_3088223-4.550567glycoside hydrolase family protein
BURPS668_3089234-5.666853hypothetical protein
BURPS668_3090135-6.541976capsular polysaccharide biosynthesis protein
BURPS668_3091242-7.742426glycoside hydrolase family protein
BURPS668_3092139-7.678975NAD-dependent epimerase/dehydratase
BURPS668_3093241-9.069600glycosyl transferase family protein
BURPS668_3094341-8.898438glycosyl transferase family protein
BURPS668_3095340-9.693773glycosyl transferase family protein
BURPS668_3096440-9.282580glycosyl transferase family protein
BURPS668_3097542-9.605348NAD-dependent epimerase/dehydratase
BURPS668_3098643-10.273330O-antigen acetylase WbiA
BURPS668_3099436-8.871877lipopolysaccharide ABC transporter ATP-binding
BURPS668_3100128-7.510894lipopolysaccharide ABC transporter permease
BURPS668_3101020-4.724717dTDP-4-dehydrorhamnose reductase
BURPS668_3102-313-3.106572dTDP-4-dehydrorhamnose 3,5-epimerase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3090  NUCEPIMERASE  72  8e-16  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 72.1 bits (177), Expect = 8e-16
Identities = 53/301 (17%), Positives = 108/301 (35%), Gaps = 50/301 (16%)

Query: 288 VMVTGAGGSIGSELCRQILKFQPAQLIAFD-LSEYAMYRLTEELRERFPDLPVVPIIGDA 346
+VTGA G IG + +++L+ Q++ D L++Y L + E D
Sbjct: 3 YLVTGAAGFIGFHVSKRLLE-AGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 347 KDSLLLDQVMSRYAPHIVFHAAAYKHVPLMEELNAWQALRNNVLGTYRVARAAIRHDVRH 406
D + + + VF + V E N +N+ G + + ++H
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLE-NPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 407 FVLIST---------------DKAVNPTNVMGASKRLAE-MACQALQQTSARTQFETV-- 448
+ S+ D +P ++ A+K+ E MA S
Sbjct: 121 LLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTY----SHLYGLPATGL 176

Query: 449 RFGNVLGSAGS---VIPKFQQQIAKGGPVTV-THPEITRFFMTIPEASQLVLQA------ 498
RF V G G + KF + + +G + V + ++ R F I + ++ +++
Sbjct: 177 RFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPH 236

Query: 499 ------------SSMGQGGEIFILDMGEPVKIVDLARDLIRLYGFTEEQIRIEFSGLRPG 546
++ ++ + PV+++D + L G + + L+PG
Sbjct: 237 ADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGI---EAKKNMLPLQPG 293

Query: 547 E 547
+
Sbjct: 294 D 294


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3092  NUCEPIMERASE  107  6e-29  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 107 bits (268), Expect = 6e-29
Identities = 68/344 (19%), Positives = 130/344 (37%), Gaps = 42/344 (12%)

Query: 3 RVIVTGANGFVGRALCRALLAAGHEVTGL-------------VRRRGVCTEGVSEWVHEA 49
+ +VTGA GF+G + + LL AGH+V G+ R + G H+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQ--FHKI 59

Query: 50 D--DFDGVADRWPAGLQVDAVVHLAARVHMMRDRSPDPDAAFRASNVAATMRVARAARQQ 107
D D +G+ D + +G + V R+ + S + A+ SN+ + + R
Sbjct: 60 DLADREGMTDLFASG-HFERVFISPHRLAV--RYSLENPHAYADSNLTGFLNILEGCRHN 116

Query: 108 GARRFVFLS--SVKAIAESDGGTPLCE-NSTPAPQDAYGRSKLEAERALEQLRDELSFDT 164
+ ++ S SV + P +S P Y +K E
Sbjct: 117 KIQHLLYASSSSVYGLNRKM---PFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPA 173

Query: 165 VIVRPPLVYGPGVRAN--FLSLMRAVSRGVPLPL-GAVRARRSMVYVDNLADAVMRCVTE 221
+R VYGP R + +A+ G + + + +R Y+D++A+A++R
Sbjct: 174 TGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDV 233

Query: 222 PAATNGCFHVADSDMPPTIAEL-LDDIGHHLGRPARLLPVPERLLRVAGALTGRAAQ--- 277
+ + V +IA + +IG+ P L+ ++ G A+
Sbjct: 234 IPHADTQWTVETGTPAASIAPYRVYNIGN--SSPVELM----DYIQALEDALGIEAKKNM 287

Query: 278 IDRLTSDLR---LDTTHIRTVLDWRPPRSSEEGLAETACWFKSL 318
+ D+ DT + V+ + P + ++G+ W++
Sbjct: 288 LPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDF 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3097  NUCEPIMERASE  168  2e-51  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 168 bits (426), Expect = 2e-51
Identities = 83/363 (22%), Positives = 136/363 (37%), Gaps = 58/363 (15%)

Query: 6 KILVTGGAGFIGCAISERLAARASRYVVMDNLHPQIHASAVRPGALHEKAE----LVVAD 61
K LVTG AGFIG +S+RL + V +DNL+ + +++ L A+ D
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDY-YDVSLKQARLELLAQPGFQFHKID 60

Query: 62 VTDAGAWDALLSDFQPEIIIHLAAETGTGQSLTEASRHALVNVVGTTRLTDALVKHGIVV 121
+ D L + E + SL +A N+ G + + + +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNK--I 118

Query: 122 EHILLTSSRAVYGEGAWQKDDGTIVYPGQRGRAQLEAAQWDFPGMTMLPSRADRTEPRPT 181
+H+L SS +VYG +P D + P
Sbjct: 119 QHLLYASSSSVYGLN------------------------------RKMPFSTDDSVDHPV 148

Query: 182 SVYGATKLAQEHVLRAWSLATKTPLSILRLQNVYGPGQSLTNSYTGIVALFSRLAREKKV 241
S+Y ATK A E + +S P + LR VYGP + F++ E K
Sbjct: 149 SLYAATKKANELMAHTYSHLYGLPATGLRFFTVYGPWGRPDMALF----KFTKAMLEGKS 204

Query: 242 IPLYEDGNVTRDFVSIDDVADAIVATLVRTPEA-----------------LSLFDIGSGQ 284
I +Y G + RDF IDD+A+AI+ P A +++IG+
Sbjct: 205 IDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSS 264

Query: 285 ATSILDMARIIAAHYGAPEPQINGAFRDGDVRHAACDLSESLANLGWKPQWSLKRGIGEL 344
++D + + G + + GDV + D +G+ P+ ++K G+
Sbjct: 265 PVELMDYIQALEDALGIEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNF 324

Query: 345 QTW 347
W
Sbjct: 325 VNW 327


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3100  ABC2TRNSPORT  30  0.007  ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein

signature.
Length = 262

Score = 30.3 bits (68), Expect = 0.007
Identities = 16/59 (27%), Positives = 24/59 (40%)

Query: 195 LFTMVLMFLSPVFYPASALPEKYRFWLELNPLTLFIEQSRGILLEGRVPDFHPLGLAFL 253
L ++FLS +P LP ++ PL+ I+ R I+L V D A
Sbjct: 184 LVITPILFLSGAVFPVDQLPIVFQTAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALC 242


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3101  NUCEPIMERASE  58  8e-12  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 58.3 bits (141), Expect = 8e-12
Identities = 32/160 (20%), Positives = 56/160 (35%), Gaps = 27/160 (16%)

Query: 1 MKILVTGANGQVGWELARSLAVLGQVV-----------PLTRE--------------QAD 35
MK LVTGA G +G+ +++ L G V ++ + D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 36 LGRPETLARIVEDAKPDVVVNAAAYTAVDAAETDGAAANVINGEA-VGVLAAATKRVGGL 94
L E + + + V + AV + + A N + +L
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 95 FVHYSTDYVFDGTKPSPYIETDPT-CPVNAYGASKLLGEL 133
++ S+ V+ + P+ D PV+ Y A+K EL
Sbjct: 121 LLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANEL 160


48  BURPS668_3114  BURPS668_3131  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BURPS668_3114222-1.344197hypothetical protein
BURPS668_3115127-4.113954rubredoxin
BURPS668_3116324-1.934004phosphomethylpyrimidine kinase
BURPS668_3117729-2.729123molecular chaperone GroEL
BURPS668_31181034-3.011148co-chaperonin GroES
BURPS668_31201237-3.515125hypothetical protein
BURPS668_31191137-3.711469hypothetical protein
BURPS668_31211136-3.655298hypothetical protein
BURPS668_3122434-5.055750hypothetical protein
BURPS668_3124429-2.850742hypothetical protein
BURPS668_3123327-1.002537hypothetical protein
BURPS668_3125123-0.617350hypothetical protein
BURPS668_3126019-0.699310transcriptional regulator
BURPS668_3127014-0.828651zinc-binding dehydrogenase oxidoreductase
BURPS668_3128112-1.319736hypothetical protein
BURPS668_3130313-2.513756hypothetical protein
BURPS668_3129312-3.183198hypothetical protein
BURPS668_3131111-3.335619OmpW family outer membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3121  PYOCINKILLER  31  0.017  Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.5 bits (68), Expect = 0.017
Identities = 50/197 (25%), Positives = 75/197 (38%), Gaps = 9/197 (4%)

Query: 306 ANALSVANPAALTAAANTVAGTLARAANGTPVAGAIGGLVAALPVANPAGALTSAANNAA 365
A A A A AA A T A ANG+ VA A G + VA A +L A ++A
Sbjct: 228 AEAKRKAEEQARQQAAIRAANTYAMPANGSVVATAAGR--GLIQVAQGAASLAQAISDAI 285

Query: 366 STIATVAGTNPAAAIGGVAGALTGAAGTGVATASQLGSVGSALMGSGAASAGKVLTSGSA 425
+ + V + P+ G A +LT ++ T Q +G AA G +
Sbjct: 286 AVLGRVLASAPSVMAVGFA-SLTYSSRTAEQWQDQTPDSVRYALGMDAAKLGLPPSVNLN 344

Query: 426 AFGSAAASAGL-----LLTTGAAAASSVVNSLGSSV-GAVVASLPNLSVSSSKSTAAASN 479
A A+ + L G SVV++ G SV AV + + ++ +
Sbjct: 345 AVAKASGTVDLPMRLTNEARGNTTTLSVVSTDGVSVPKAVPVRMAAYNATTGLYEVTVPS 404

Query: 480 PLAPVSSMVATLVGALP 496
A ++ T A P
Sbjct: 405 TTAEAPPLILTWTPASP 421


49  BURPS668_3147  BURPS668_3153  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
BURPS668_3147-283.047105lipoprotein
BURPS668_3148-173.316108phospholipid-binding protein
BURPS668_3149183.567304hypothetical protein
BURPS668_3150384.529449LysR family transcriptional regulator
BURPS668_3151284.454049iron ABC transporter permease
BURPS668_3152083.890577iron chelate uptake ABC transporter periplasmic
BURPS668_3153-193.771919iron ABC transporter ATP-binding protein
50  BURPS668_3236  BURPS668_3264  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BURPS668_3236540-7.029922sulfatase
BURPS668_3237438-6.932536short chain dehydrogenase/reductase
BURPS668_3238539-9.162693capsule polysaccharide biosynthesis protein
BURPS668_3239535-9.116975hypothetical protein
BURPS668_3240437-9.537245D,D-heptose 1,7-bisphosphate phosphatase
BURPS668_3241439-10.565212D-glycero-D-manno-heptose 1-phosphate
BURPS668_3242439-11.033045phosphoheptose isomerase
BURPS668_3243340-11.313779D-glycero-D-manno-heptose 7-phosphate kinase
BURPS668_3244343-11.709887GDP-6-deoxy-D-lyxo-4-hexulose reductase
BURPS668_3245448-12.467989NAD-dependent epimerase/dehydratase
BURPS668_3246451-13.148052capsular polysaccharide biosynthesis protein
BURPS668_3247551-13.122302glycoside hydrolase family protein
BURPS668_3248553-13.265519capsular polysaccharide biosynthesis protein
BURPS668_3249652-12.645058capsular polysaccharide biosynthesis protein
BURPS668_3250750-11.822572glycoside hydrolase family protein
BURPS668_3251748-11.263444capsular polysaccharide export ABC transporter
BURPS668_3252643-8.393464capsular polysaccharide export inner-membrane
BURPS668_3253438-7.080671capsule polysaccharide exporter
BURPS668_3254335-5.547281capsular polysaccharide biosynthesis/export
BURPS668_3255131-4.573074glycoside hydrolase family protein
BURPS668_3256026-3.676316hypothetical protein
BURPS668_3257025-3.420360capsular polysaccharide biosynthesis protein
BURPS668_3258-118-3.927933mannose-1-phosphate guanylyltransferase
BURPS668_3259-280.080727hypothetical protein
BURPS668_3260-170.971144glutamine amidotransferase
BURPS668_3261-171.628010small conductance mechanosensitive ion channel
BURPS668_3262-171.881249hypothetical protein
BURPS668_3263-272.218683DedA family membrane protein
BURPS668_3264-1103.202324DNA mismatch repair protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3237  DHBDHDRGNASE  70  4e-16  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 69.7 bits (170), Expect = 4e-16
Identities = 65/249 (26%), Positives = 100/249 (40%), Gaps = 26/249 (10%)

Query: 12 ITGASAGLGRALARAYARPGVVLSLGGRDAVRLEESAADCRACGATVFVASIDVRDADAM 71
ITGA+ G+G A+AR A G ++ + +LE+ + +A DVRD+ A+
Sbjct: 13 ITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSAAI 72

Query: 72 R----RWLEQFDDAHPIHLLIANAGVASTLAHGGDWEARERTAAIVDTNFYGAMNAVLPV 127
R + PI +L+ AGV + E A N G NA V
Sbjct: 73 DEITARIEREMG---PIDILVNVAGVLRPGLI--HSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 128 IDRMRARGSGQVALISSLAALRGMAISPAYCASKAALKAWGDSVRPVLKRDGIRLSVVLP 187
M R SG + + S A AY +SKAA + + L IR ++V P
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 188 GFVKTAMSDVFPADKPLLWSPDKAAQYIQRGIAARRAEIAFPALLALGMRLLPLL-PAVM 246
G +T M + LW+ + A+ + +G G+ L L P+ +
Sbjct: 188 GSTETDM-------QWSLWADENGAEQVIKGSLET---------FKTGIPLKKLAKPSDI 231

Query: 247 ADTILGRLS 255
AD +L +S
Sbjct: 232 ADAVLFLVS 240


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3244  NUCEPIMERASE  129  5e-37  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 129 bits (326), Expect = 5e-37
Identities = 80/352 (22%), Positives = 137/352 (38%), Gaps = 52/352 (14%)

Query: 4 RVLITGITGMVGSHLADFLLENTDWEIYGLCRWRSPLDNV-SHLLPRINEKNRIRL---- 58
+ L+TG G +G H++ LLE ++ G+ DN+ + + + L
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGH-QVVGI-------DNLNDYYDVSLKQARLELLAQPG 53

Query: 59 ---VYGDLRDYLSIHEAVKQSTPDFVFHLAAQSYPKTSFDSPLDTLETNVQGTANVLEAL 115
DL D + + + VF + + S ++P ++N+ G N+LE
Sbjct: 54 FQFHKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGC 113

Query: 116 RKNNIDAVTHVCASSEVFGRVPREKLPIDEE-CTFHPASPYAISKVGTDLIGRYYAEAYN 174
R N I + +SS V+G K+P + HP S YA +K +L+ Y+ Y
Sbjct: 114 RHNKIQHLL-YASSSSVYGL--NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYG 170

Query: 175 MTVMTTRMFTHTGPR-RGDVFAESTFAKQIAMIERELIPPVVKTGNLDSLRTFADVRDAV 233
+ R FT GP R D+ A F K AM+E + + R F + D
Sbjct: 171 LPATGLRFFTVYGPWGRPDM-ALFKFTK--AMLEGK---SIDVYNYGKMKRDFTYIDDIA 224

Query: 234 RAYYMLVTINPI-----------------PGAYYNIGGTYSCTVGQMLDTLISMSTSKDV 276
A L + P P YNIG + + + L +D
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQAL------EDA 278

Query: 277 IRVETDPE--RLRPIDADLQVPNTRKFEAVTGWKPEISFEKTMEDLLNYWRA 326
+ +E L+P D +T+ V G+ PE + + +++ +N++R
Sbjct: 279 LGIEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRD 330


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3245  NUCEPIMERASE  45  1e-07  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 45.2 bits (107), Expect = 1e-07
Identities = 59/332 (17%), Positives = 106/332 (31%), Gaps = 82/332 (24%)

Query: 1 MKVFLVGSTGYIGKTLFDACSQR-WRTLGT-STRDGADIVFSLARAEAFPYEQVSA--GD 56
MK + G+ G+IG + + + +G + D D+ AR E D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 57 ------------------VVAVAA------AISSPDACAKDYETAFQVNVTGTLTLIRGV 92
V ++ +P A Y N+TG L ++ G
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHA----Y---ADSNLTGFLNILEG- 112

Query: 93 VARGA---RVIFFSSDTVYGASEQLLSEEAELT--PAGAYGAMKRRVEA---ELGENAAV 144
R +++ SS +VYG + ++ + P Y A K+ E +
Sbjct: 113 -CRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGL 171

Query: 145 KVIRLSY--VFSLRDR-------FTQYLLGCAKEGKRADIFK--PFSRCVVYLSDVVEGV 193
L + V+ R FT+ +L EGK D++ R Y+ D+ E +
Sbjct: 172 PATGLRFFTVYGPWGRPDMALFKFTKAML----EGKSIDVYNYGKMKRDFTYIDDIAEAI 227

Query: 194 VSLIE-------RWD---------AIDERVINFVGPELVAREDFVEKIRNLAAPELDYGF 237
+ L + +W RV N V D+++ + + E
Sbjct: 228 IRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNM 287

Query: 238 SEP-EGDFFVNRPRIINVSSARFEKLLGRRPR 268
GD + + +++G P
Sbjct: 288 LPLQPGDVLETSA---DTKALY--EVIGFTPE 314


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3246  PF05043  30  0.007  Transcriptional activator
		>PF05043#Transcriptional activator

Length = 493

Score = 29.9 bits (67), Expect = 0.007
Identities = 19/76 (25%), Positives = 33/76 (43%), Gaps = 7/76 (9%)

Query: 48 RHLEEIGASLRIDIDE---IESWCVDELKSREVGENDGGKQIDISVTDFILANCRQKRLF 104
H + + +L +E W EL+ + D DI +++FI+ KRL
Sbjct: 414 YHAKFVAETLSYYCSNNFELEVW--TELELSKESLED--SPYDIIISNFIIPPIENKRLI 469

Query: 105 YTMNHPTAALMREIAA 120
Y+ N T +L+ + A
Sbjct: 470 YSNNINTVSLIYLLNA 485


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3252  ABC2TRNSPORT  38  2e-05  ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein

signature.
Length = 262

Score = 38.0 bits (88), Expect = 2e-05
Identities = 32/139 (23%), Positives = 58/139 (41%), Gaps = 7/139 (5%)

Query: 88 MAVTPNLALMYHRNVKVIDIFIARILLEVVGNTASFFVLMITFHALGLVDYPEDILEVMF 147
M M + +++ DI + + + + + ALG + +++
Sbjct: 94 MEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGYTQWLS----LLY 149

Query: 148 AWVMIIWFG---ASLGFIIGALSEKTELVEKLWHPVTYLMFPLSGAIFMVDWLSPAFQKI 204
A +I G ASLG ++ AL+ + V + LSGA+F VD L FQ
Sbjct: 150 ALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQTA 209

Query: 205 VLWLPMVHGVEMLREGYFG 223
+LP+ H ++++R G
Sbjct: 210 ARFLPLSHSIDLIRPIMLG 228


51  BURPS668_3276  BURPS668_3286  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BURPS668_3276413-1.554436molecular chaperone DnaJ
BURPS668_3277612-1.314593hypothetical protein
BURPS668_3278412-0.845001molecular chaperone DnaK
BURPS668_32790100.203394thioredoxin family protein
BURPS668_3280-111-0.489595heat shock protein GrpE
BURPS668_3282-1100.579573hypothetical protein
BURPS668_32810131.980568heat shock protein 15
BURPS668_32830142.846200ferrochelatase
BURPS668_32841143.008466heat-inducible transcription repressor
BURPS668_32851153.201152NAD(+)/NADH kinase
BURPS668_32860143.220484DNA repair protein RecN
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3278  SHAPEPROTEIN  135  3e-37  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein

signature.
Length = 347

Score = 135 bits (342), Expect = 3e-37
Identities = 81/382 (21%), Positives = 138/382 (36%), Gaps = 71/382 (18%)

Query: 5 IGIDLGTTNSCVAIMEGNQVKVIENSEGARTTPSIIAYMDDNEVL-VGAPAKRQSVTNPK 63
+ IDLGT N+ + + V + R V VG AK+ P
Sbjct: 13 LSIDLGTANTLIYVKGQGIVLNEPSVVAIRQD----RAGSPKSVAAVGHDAKQMLGRTPG 68

Query: 64 NTLFAVKRLIGRRFEEKEVQKDIGLMPYAIIKADNGDAWVEAHGEKLAPPQVSAEVLRK- 122
N + A++ + + V D V+ ++L+
Sbjct: 69 N-IAAIRPM------KDGVIADF---------------------------FVTEKMLQHF 94

Query: 123 MKKTAEDYLGEPVTEAVITVPAYFNDSQRQATKDAGRIAGLEVKRIINEPTAAALAFGLD 182
+K+ + P ++ VP +R+A +++ + AG +I EP AAA+ GL
Sbjct: 95 IKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLP 154

Query: 183 KAEKGDRKIAVYDLGGGTFDVSIIEIADVDGEMQFEVLSTNGDTFLGGEDFDQRIIDYII 242
+E V D+GGGT +V++I + V + +GG+ FD+ II+Y+
Sbjct: 155 VSE--ATGSMVVDIGGGTTEVAVISLNGV---------VYSSSVRIGGDRFDEAIINYVR 203

Query: 243 GEFKKEQGVDLSKDVLALQRLKEAAEKAKIELSSS----QQTEINLPYITADASGPKHLN 298
+ G + AE+ K E+ S+ + EI + P+
Sbjct: 204 RNYGSLIG-------------EATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGFT 250

Query: 299 LKVTRAKLEALVEDLVERTIEPCRTAIKDAGVKVSDIDD--VILVGGQTRMPKVQEKVKE 356
L + LEAL E L + SDI + ++L GG + + + E
Sbjct: 251 LN-SNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLLME 309

Query: 357 FFGKEPRRDVNPDEAVAVGAAI 378
G +P VA G
Sbjct: 310 ETGIPVVVAEDPLTCVARGGGK 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3280  IGASERPTASE  31  0.002  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 31.2 bits (70), Expect = 0.002
Identities = 16/77 (20%), Positives = 24/77 (31%), Gaps = 8/77 (10%)

Query: 2 ENTQENPTDQTTEETGREAQAAEPAAQAAENAAPAAEAA--------LAEAQAKIAELQE 53
T E TE T + + A+ A + E A + K E
Sbjct: 1048 SKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVE 1107

Query: 54 SFLRAKAETENVRRRAQ 70
+AK ETE + +
Sbjct: 1108 KEEKAKVETEKTQEVPK 1124


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3285  TCRTETB  29  0.035  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 28.7 bits (64), Expect = 0.035
Identities = 19/89 (21%), Positives = 35/89 (39%), Gaps = 10/89 (11%)

Query: 25 LASLAACIAKRGFEVVFEADTAQAIGSAGYPALTP---AEIGARADVAVVLGGDGTMLGM 81
S+ + F ++ A Q G+A +PAL A + + G G+++ M
Sbjct: 91 FGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAM 150

Query: 82 GRQLAPYKTPLIG---INHGRLGFITDIP 107
G + P IG ++ ++ IP
Sbjct: 151 GEGVG----PAIGGMIAHYIHWSYLLLIP 175


52  BURPS668_3301  BURPS668_3318  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BURPS668_33012112.145429hypothetical protein
BURPS668_33022141.733314phosphonate ABC transporter permease
BURPS668_33032141.511661phosphonate ABC transporter periplasmic
BURPS668_33042142.518000phosphonate ABC transporter ATP-binding protein
BURPS668_33052172.723171phosphonate metabolism protein PhnM
BURPS668_33061152.785803phosphonate C-P lyase system protein PhnL
BURPS668_33071151.830904phosphonate C-P lyase system protein PhnK
BURPS668_3308-1122.035657phosphonate metabolism protein PhnJ
BURPS668_33091123.895643phosphonate metabolism protein PhnI
BURPS668_33103144.071084carbon-phosphorus lyase complex subunit
BURPS668_33112122.213604phosphonate metabolism protein PhnG
BURPS668_33121132.119832hypothetical protein
BURPS668_33131133.178962phosphonates metabolism transcriptional
BURPS668_33140122.531660hypothetical protein
BURPS668_33150120.250159phosphonate metabolism
BURPS668_33160120.4757124-hydroxybenzoate octaprenyltransferase
BURPS668_33171111.438383transcriptional regulatory protein
BURPS668_33182120.587816transcriptional regulatory protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3306  PF05272  30  0.008  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.008
Identities = 21/68 (30%), Positives = 28/68 (41%), Gaps = 1/68 (1%)

Query: 61 CVALTGPSGAGKSTLLRCLYGNYLANRGTIAVRAGARAAEHVV-LTASEPHEVIALRRDV 119
V L G G GKSTL+ L G + + G + E + + A E E+ A RR
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIVAYELSEMTAFRRAD 657

Query: 120 IGYVSQFL 127
V F
Sbjct: 658 AEAVKAFF 665


53  BURPS668_3385  BURPS668_3395  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BURPS668_3385232-0.461906hypothetical protein
BURPS668_3386231-1.564847hypothetical protein
BURPS668_3387129-2.079569hypothetical protein
BURPS668_3388321-4.762986HSP20 family protein
BURPS668_3389116-5.443056HSP20 family protein
BURPS668_3390312-5.499854hypothetical protein
BURPS668_3391212-6.580327chaperonin, 10 kDa
BURPS668_3392112-5.546989hypothetical protein
BURPS668_3393013-5.301226glutamate/aspartate ABC transporter ATP-binding
BURPS668_3394-112-3.545393glutamate/aspartate ABC transporter permease
BURPS668_3395-213-3.083475glutamate/aspartate ABC transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3392  ACRIFLAVINRP  25  0.039  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 25.2 bits (55), Expect = 0.039
Identities = 9/38 (23%), Positives = 21/38 (55%), Gaps = 1/38 (2%)

Query: 14 IEIDDVIVGLLAI-RLNLPENADPRDAISRHLSEAGGP 50
+ +DD IV + + R+ + + P++A + +S+ G
Sbjct: 404 LLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGA 441


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3393  PF05272  32  0.002  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.4 bits (73), Expect = 0.002
Identities = 17/53 (32%), Positives = 24/53 (45%), Gaps = 5/53 (9%)

Query: 29 VVVVCGPSGSGKSTLIKTVNGLEPFQQGEILVNGQSVGDKKTNLSKLRSKVGM 81
VV+ G G GKSTLI T+ GL+ F +G K + ++ V
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHF-----DIGTGKDSYEQIAGIVAY 645


54  BURPS668_3440  BURPS668_3453  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BURPS668_34402132.659076thiamine monophosphate kinase
BURPS668_34414131.125250phosphatidylglycerophosphatase A
BURPS668_34423131.230504cinA family protein
BURPS668_34433120.855258orotidine 5'-phosphate decarboxylase
BURPS668_34442131.279799aldose 1-epimerase
BURPS668_34452131.078068NAD dependent epimerase/dehydratase
BURPS668_34462131.850380L-arabinose transporter permease
BURPS668_34471142.488146L-arabinose transporter ATP-binding protein
BURPS668_3448-1122.799732carbohydrate ABC transporter periplasmic
BURPS668_3449-1121.334137short chain dehydrogenase
BURPS668_3450-2120.1482592-dehydro-3-deoxy-6-phosphogalactonate aldolase
BURPS668_3451-114-1.3181512-dehydro-3-deoxygalactonokinase
BURPS668_3452014-3.077542IclR family transcriptional regulator
BURPS668_3453213-2.706547hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3441  PF05616  29  0.012  Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 29.3 bits (65), Expect = 0.012
Identities = 14/38 (36%), Positives = 15/38 (39%)

Query: 4 DPTPRPADSADSASQPGATPAPASSPAPRRDSPQDPQR 41
+P RP D P A P P R DSP P R
Sbjct: 346 NPGTRPNPEPDPDLNPDANPDTDGQPGTRPDSPAVPDR 383



Score = 27.4 bits (60), Expect = 0.040
Identities = 12/24 (50%), Positives = 13/24 (54%)

Query: 7 PRPADSADSASQPGATPAPASSPA 30
PRP + SA P A P P SPA
Sbjct: 311 PRPDLTPGSAEAPNAQPLPEVSPA 334



Score = 27.4 bits (60), Expect = 0.049
Identities = 11/35 (31%), Positives = 14/35 (40%)

Query: 5 PTPRPADSADSASQPGATPAPASSPAPRRDSPQDP 39
P +P A P PAP +P R + DP
Sbjct: 323 PNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDP 357


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3445  DHBDHDRGNASE  123  3e-36  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 123 bits (310), Expect = 3e-36
Identities = 76/249 (30%), Positives = 113/249 (45%), Gaps = 8/249 (3%)

Query: 26 GRAVLITGGATGIGASFVEHFARQGARVAFVDLDEKAGRALVARLADAAHEPVFVVCDLT 85
G+ ITG A GIG + A QGA +A VD + + +V+ L A D+
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVR 67

Query: 86 DIGALRGAIDAIRVRIGPIAVLVNNAANDVRHAVADVTPESFDASIAVNLRHQFFAAQAV 145
D A+ I +GPI +LVN A + ++ E ++A+ +VN F A+++V
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 146 IDDMKRLGGGAIVNLGSIGWMLKNAGYPVYATAKAAVQGLTRALARELGPFGIRVNTLVP 205
M G+IV +GS + YA++KAA T+ L EL + IR N + P
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 206 GWVMTDKQRRLWLDDAGRAAIKAGQCIDAEL--------LPGDLARMALFLAADDSRLIT 257
G TD Q LW D+ G + G + P D+A LFL + + IT
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHIT 247

Query: 258 AQDVVVDGG 266
++ VDGG
Sbjct: 248 MHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3447  PF05272  29  0.040  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.040
Identities = 14/38 (36%), Positives = 17/38 (44%), Gaps = 5/38 (13%)

Query: 18 RALD-GISFDVHAGQVHGLMGENGAGKSTLLKILGGEY 54
R ++ G FD L G G GKSTL+ L G
Sbjct: 587 RVMEPGCKFDY----SVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3449  DHBDHDRGNASE  135  9e-41  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 135 bits (340), Expect = 9e-41
Identities = 84/259 (32%), Positives = 129/259 (49%), Gaps = 12/259 (4%)

Query: 4 LAGKVAIVTGAGRGIGAAIARAFVREGAAVAIAELDAA---LAEESADAIARDTAGARVL 60
+ GK+A +TGA +GIG A+AR +GA +A + + S A AR
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE----- 60

Query: 61 AVPTDVARAESVAAALARTERAFGPLDVLVNNAGVNVFGDPLALTDEDWRRCFAIDLDGV 120
A P DV + ++ AR ER GP+D+LVN AGV G +L+DE+W F+++ GV
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 121 WNGCRAALPGMVERGRGSIVNIASTHAFKIIPGCFPYPVAKHGVLGLTRALGIEYAPRNV 180
+N R+ M++R GSIV + S A Y +K + T+ LG+E A N+
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 181 RVNAIAPGYIETQLTHDWW---NAQPDPQAARRETLALQ-PMKRIGRPDEVAMTAVFLAS 236
R N ++PG ET + W N ET P+K++ +P ++A +FL S
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVS 240

Query: 237 DEAPFINASCITIDGGRSV 255
+A I + +DGG ++
Sbjct: 241 GQAGHITMHNLCVDGGATL 259


55  BURPS668_3480  BURPS668_3503  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BURPS668_348008-3.312707hypoxanthine-guanine phosphoribosyltransferase
BURPS668_3481-19-2.384672MarC membrane protein
BURPS668_3482-110-2.316194prolyl-tRNA synthetase
BURPS668_3483-112-2.642243dinucleoside polyphosphate hydrolase
BURPS668_3485-113-2.283702lipoprotein
BURPS668_3484-113-2.552722hypothetical protein
BURPS668_3486-313-3.139381gamma-glutamyl kinase
BURPS668_3487117-3.364455GTPase ObgE
BURPS668_3488526-3.838464hypothetical protein
BURPS668_3489525-4.28007650S ribosomal protein L27
BURPS668_3490627-4.79770450S ribosomal protein L21
BURPS668_3491626-4.565502octaprenyl-diphosphate synthase
BURPS668_3493931-5.179899*integrase
BURPS668_34941244-9.037838hypothetical protein
BURPS668_34951245-9.364568hypothetical protein
BURPS668_34961250-10.633902hypothetical protein
BURPS668_34971149-10.352503hypothetical protein
BURPS668_34981253-11.289183hypothetical protein
BURPS668_34991354-11.392684XRE family transcriptional regulator
BURPS668_3500944-10.830998hypothetical protein
BURPS668_3501638-8.401771hypothetical protein
BURPS668_3502432-5.678886hypothetical protein
BURPS668_3503329-4.601480ATPase AAA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BURPS668_3486  CARBMTKINASE  36  1e-04  Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 36.3 bits (84), Expect = 1e-04
Identities = 34/129 (26%), Positives = 48/129 (37%), Gaps = 10/129 (7%)

Query: 132 GVVPIINENDTVVTDEIKFGDNDTLGALVANLIEGDTLVILTDQPGLFTADPRKDPGATL 191
G VP+I E+ + E D D G +A + D +ILTD G +
Sbjct: 195 GGVPVILEDGEIKGVEAVI-DKDLAGEKLAEEVNADIFMILTDVNGAALYYGTEKEQWLR 253

Query: 192 VAEASAGAPELEAMAGGAGSSIGRGGMLTKILAAKRAAHSGANTVIASGRERDVLVRLAA 251
+ E AGS M K+LAA R G I + E+ V
Sbjct: 254 EVKVEELRKYYEEGHFKAGS------MGPKVLAAIRFIEWGGERAIIAHLEK--AVEALE 305

Query: 252 GEAIGTQLI 260
G+ GTQ++
Sbjct: 306 GKT-GTQVL 313


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3487  TCRTETOQM     29     0.035    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 29.1 bits (65), Expect = 0.035
Identities = 18/100 (18%), Positives = 34/100 (34%), Gaps = 22/100 (22%)

Query: 249 PFDERVDPVAEARAIVGELRKYDESLYEKPRWLVLNKLDMVPED---ERRARVADFIERF 305
E+ D V E ++ L EK ++ + + E R +
Sbjct: 170 TESEQWDTVIEG----------NDDLLEK----YMSGKSLEALELEQEESIRFHN----- 210

Query: 306 GWTGPVFEISALTGQGCESLVYAIHDYLVEHSDAHRAELA 345
PV+ SA G ++L+ I + + ++EL
Sbjct: 211 CSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTHRGQSELC 250


56  BURPS668_3531  BURPS668_3537  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
BURPS668_3531  -1      11       -3.074121   penicillin-binding protein
BURPS668_3532  0       14       -3.372482   cell division protein FtsL
BURPS668_3533  0       17       -4.027288   S-adenosyl-methyltransferase MraW
BURPS668_3534  0       17       -4.562307   cell division protein MraZ
BURPS668_3535  0       15       -4.607850   ubiquinone biosynthesis protein
BURPS668_3536  0       14       -4.238295   outer membrane porin
BURPS668_3537  -2      12       -3.016183   long-chain-fatty-acid--CoA ligase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3532  56KDTSANTIGN  27     0.017    Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein

signature.
Length = 533

Score = 26.8 bits (59), Expect = 0.017
Identities = 18/58 (31%), Positives = 28/58 (48%), Gaps = 7/58 (12%)

Query: 24 NQQRQIFIQLQRAQSQEHQLQQDYAQLQYQQSA-------LSKTSRIEQLATSSLKMQ 74
NQ F+ +AQ Q+ Q QQ AQ Q++ L+ + +I QL +K+Q
Sbjct: 328 NQIHLNFVMPPQAQQQQGQGQQQQAQATAQEAVAAAAVRLLNGSDQIAQLYKDLVKLQ 385


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3536  ECOLNEIPORIN  88     1e-21    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 88.3 bits (219), Expect = 1e-21
Identities = 90/394 (22%), Positives = 139/394 (35%), Gaps = 71/394 (18%)

Query: 1 MKKSLLALVALSAFAGAAHAQSSVTLYGIIDEGFNINTNAGGKHL-----YNLSSGVMQG 55
MKKSL+AL L+A AA A VTLYG I G + + + V G
Sbjct: 1 MKKSLIALT-LAALPVAAMAD--VTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLG 57

Query: 56 SRWGLRGTEDLGGGLKALFVLENGFDVNSGKLNQGGLEFGRQAYVGLSSGFGTVTLGRQY 115
S+ G +G EDLG GLKA++ +E + G RQ+++GL GFG + +GR
Sbjct: 58 SKIGFKGQEDLGNGLKAIWQVEQKASIAGTDSGWGN----RQSFIGLKGGFGKLRVGRLN 113

Query: 116 DSVVDF--VGPLEA-GDQWGGYIAAHPGDLDNFNNAYRVNNAVKFTSANYGGFTFGGLYS 172
+ D + P ++ D G A P + + V++ S + G + Y+
Sbjct: 114 SVLKDTGDINPWDSKSDYLGVNKIAEP---EARLIS------VRYDSPEFAGLSGSVQYA 164

Query: 173 FGGVAGDFSRNQTWSLGAGYTNGPLVLGVGYLNARTPSTAGGLFGNNTTSSTPAAVTTPV 232
AG ++++ G Y NG + G R + +
Sbjct: 165 LNDNAG-RHNSESYHAGFNYKNGGFFVQYGGAYKRHHQVQENVNIEKYQIHR-------L 216

Query: 233 YAGYASAHTYQVIGAGGAYSFGAATVGITYSNIKFMNFASTVFPNQTATFNNAEINFKYQ 292
+GY + Y A A A + + + T + N +
Sbjct: 217 VSGYDNDALY----ASVAVQQQDAKL-VEENYSHNSQTEVAA----TLAYRFG--NVTPR 265

Query: 293 LTPTLLAGAAYDYTQGSKIAGSSAAKYHQGSVGVDYFLSKRTDVYAIGVYQHASGNVIEA 352
++ ++D T + Y Q VG +Y SKRT + E
Sbjct: 266 VSYAHGFKGSFDATNYNND-------YDQVVVGAEYDFSKRTSALVSAGWLQ------EG 312

Query: 353 DGNTVGPATAAINGLTPSSNRNQFAARVGIRHKF 386
G S A VG+RHKF
Sbjct: 313 KG---------------ESKFVSTAGGVGLRHKF 331


57  BURPS668_3606  BURPS668_3619  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
BURPS668_3606  2       15       0.180740    methyltransferase
BURPS668_3607  1       14       -2.699639   LigA protein
BURPS668_3608  1       13       -2.847397   acyltransferase
BURPS668_3610  1       18       -2.789841   hypothetical protein
BURPS668_3609  1       19       -3.788487   lipoprotein
BURPS668_3611  1       21       -4.578586   hypothetical protein
BURPS668_3613  1       20       -3.435939   M1 family peptidase
BURPS668_3612  -1      20       -1.706361   hypothetical protein
BURPS668_3614  2       13       -1.837359   hypothetical protein
BURPS668_3615  2       9        -1.099571   hypothetical protein
BURPS668_3616  2       9        -1.194860   hypothetical protein
BURPS668_3617  0       9        -0.174115   hypothetical protein
BURPS668_3618  1       11       -1.383754   secretion protein
BURPS668_3619  2       14       -1.531992   toxin secretion ABC transporter ATP-binding
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3618  RTXTOXIND     132    2e-36    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 132 bits (334), Expect = 2e-36
Identities = 81/429 (18%), Positives = 159/429 (37%), Gaps = 53/429 (12%)

Query: 28 RPVSFAVLASAAASMALGVI--LLFTFGTYTRRTTVDGVLTPDTGLVKVYAQQTGVVLKK 85
PVS A M VI +L G T +G LT ++ + +V +
Sbjct: 51 TPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEI 110

Query: 86 NVVEGQHVTRGQVLYTVSTDLQSAAAGQTQAAL----IEQAQQRKTSLQQELDKTRRLQ- 140
V EG+ V +G VL ++ A +TQ++L +EQ + + S EL+K L+
Sbjct: 111 IVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKL 170

Query: 141 ----------------------------QDERDTLQSKIASLRTELAGIDDQIAAQRTRA 172
Q+++ + + R E + +I +
Sbjct: 171 PDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLS 230

Query: 173 SIAADAASRYAGLLAQDYISKDQAQQRQADLLDQRSKLNSLMRDRASTAQSLKEALNDLS 232
+ ++ LL + I+K +++ ++ ++L + A +
Sbjct: 231 RVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQ 290

Query: 233 GLSLKQQNQLSQIDRSVIDVDRTLIESEAKREF-----VVTAPETGT-ATAVIAEPGQTA 286
++ +N++ R D L AK E V+ AP + + G
Sbjct: 291 LVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVV 350

Query: 287 DTSHPLASIVPTGAHWQAYLFVPSAAVGFVHVGDRVLVRYQAYPYQKFGQYEASVVSIAR 346
T+ L IVP + V + +GF++VG +++ +A+PY ++G V +I
Sbjct: 351 TTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINL 410

Query: 347 TALSAAELATSGGPAAQTASGTYYRITVALNSQNVMAYGRAQPLQAGMALQADVLQERRR 406
A+ G + + +++ + + PL +GMA+ A++ R
Sbjct: 411 DAIE------------DQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRS 458

Query: 407 LYEWVLEPL 415
+ ++L PL
Sbjct: 459 VISYLLSPL 467


58  BURPS668_3630  BURPS668_3649  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
BURPS668_3630  2       11       -2.931311   hypothetical protein
BURPS668_3631  2       11       -3.038660   hypothetical protein
BURPS668_3632  3       13       -4.293345   hypothetical protein
BURPS668_3633  3       15       -4.153314   hypothetical protein
BURPS668_3634  2       12       -3.582320   hypothetical protein
BURPS668_3635  2       11       -3.009183   hypothetical protein
BURPS668_3636  2       14       -3.529816   hypothetical protein
BURPS668_3637  2       17       -5.500112   hypothetical protein
BURPS668_3638  3       17       -5.366515   lipoprotein
BURPS668_3639  6       30       -8.970245   hypothetical protein
BURPS668_3640  8       37       -10.010732  hypothetical protein
BURPS668_3641  8       37       -10.794110  Phage integrase
BURPS668_3642  7       35       -10.359033  hypothetical protein
BURPS668_3643  6       27       -9.356915   TnpB
BURPS668_3644  5       25       -9.101103   type I restriction enzyme R protein N terminus
BURPS668_3646  2       13       -5.888546   *ClpXP protease specificity-enhancing factor
BURPS668_3647  0       9        -5.157073   stringent starvation protein A
BURPS668_3648  1       8        -4.666956   ubiquinol-cytochrome c reductase, cytochrome c1
BURPS668_3649  0       8        -3.337559   ubiquinol-cytochrome c reductase, cytochrome b
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3636  IGASERPTASE   29     0.012    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 29.3 bits (65), Expect = 0.012
Identities = 27/188 (14%), Positives = 58/188 (30%), Gaps = 10/188 (5%)

Query: 11 ASQPTPPTTEAFNKSLADADAVAKTGDQERAIGLYQQLAKSDPTREEPWSRIAQIQFQQG 70
Q P+ + N+ +A D A ++ +++E + Q
Sbjct: 1002 NIQADVPSVPSNNEEIARVDEAPVP-PPAPATPSETTETVAENSKQESKTVEKNEQDATE 1060

Query: 71 HYGQAIVAAQEALQRDKTDRQAKSVLAVAGLRIATESLGELRQDSSLAGDAKSDAQALAK 130
Q A+EA K + Q VA T+ + + + A+ +
Sbjct: 1061 TTAQNREVAKEAKSNVKANTQT---NEVAQSGSETKETQTTETKETATVEKEEKAKVETE 1117

Query: 131 QLRDTLGEAALFPPEQQATKPVVKKRRIVRRAKPVHEAPRAAESETAAAPATPPAAPAQP 190
+ ++ + P+Q+ + + +A+P E + + A QP
Sbjct: 1118 KTQEVPKVTSQVSPKQE------QSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQP 1171

Query: 191 AATPAPAP 198
A +
Sbjct: 1172 AKETSSNV 1179



Score = 28.1 bits (62), Expect = 0.032
Identities = 32/191 (16%), Positives = 60/191 (31%), Gaps = 32/191 (16%)

Query: 11 ASQPTPPTTEAFNKSLADADAVAKTGDQERAIGLYQQLAKSDPTREEPWSRIAQIQFQQG 70
+ T P N AD +V ++ I + P P +
Sbjct: 994 TTNITTP-----NNIQADVPSVPSNNEE---IARVDEAPVPPPAPATPSETTETV----- 1040

Query: 71 HYGQAIVAAQEALQRDKTDRQAKSVLAVAGLRIATESLGELRQDSSLAGDAKSDAQALAK 130
A + QE+ +K ++ A A +A E+ ++ ++ A+S ++
Sbjct: 1041 ----AENSKQESKTVEKNEQDATETTAQNR-EVAKEAKSNVKANTQTNEVAQSGSETKET 1095

Query: 131 QLRDTLGEAALFPPEQQATKPVVKKRRIVRRAKPVHEAPRAAESETAAAPATPPAAPAQP 190
Q E + T V K+ + + E P+ + +P + QP
Sbjct: 1096 Q-----------TTETKETATVEKEEKAKVETEKTQEVPKVT---SQVSPKQEQSETVQP 1141

Query: 191 AATPAPAPAKA 201
A PA
Sbjct: 1142 QAEPARENDPT 1152



Score = 27.7 bits (61), Expect = 0.039
Identities = 22/130 (16%), Positives = 41/130 (31%), Gaps = 22/130 (16%)

Query: 75 AIVAAQEALQRDKTDRQAKSVLAVAGLRIATESLGELRQDSSLAGDAKSDAQALAKQLRD 134
I A ++ + + V AT S ++A ++K +++ + K
Sbjct: 1002 NIQADVPSVPSNNEEIARVDEAPVPPPAPATPS----ETTETVAENSKQESKTVEKN--- 1054

Query: 135 TLGEAALFPPEQQATKPVVKKRRIVRRAKP-VHEAPRAAESETAAAPATPPAAPAQPAAT 193
EQ AT+ + R + + AK V + E A + Q T
Sbjct: 1055 ----------EQDATETTAQNREVAKEAKSNVKANTQTNE----VAQSGSETKETQTTET 1100

Query: 194 PAPAPAKAAG 203
A +
Sbjct: 1101 KETATVEKEE 1110


59  BURPS668_3791  BURPS668_3812  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias     Product
BURPS668_3791  2       11       -0.969376   hypothetical protein
BURPS668_3792  2       11       -1.060917   tryptophan repressor binding protein
BURPS668_3793  1       10       -0.406390   N-acetyl-gamma-glutamyl-phosphate reductase
BURPS668_3794  0       9        1.267072    hypothetical protein
BURPS668_3795  0       8        1.740688    hypothetical protein
BURPS668_3796  -3      9        2.016703    lipoprotein
BURPS668_3797  -3      9        2.524240    ompW family protein
BURPS668_3798  -1      9        3.051903    LysR family transcriptional regulator
BURPS668_3799  -1      11       1.703005    major facilitator transporter
BURPS668_3800  1       17       -1.057899   LysE family translocator protein
BURPS668_3801  3       21       -1.862978   dehydrogenase
BURPS668_3802  7       34       -4.299625   hypothetical protein
BURPS668_3803  7       41       -5.140269   hypothetical protein
BURPS668_3804  7       43       -5.976436   hypothetical protein
BURPS668_3805  7       41       -4.469979   hypothetical protein
BURPS668_3806  8       38       -3.788487   hypothetical protein
BURPS668_3807  6       38       -3.799334   hypothetical protein
BURPS668_3808  5       32       -2.307535   lipoprotein
BURPS668_3809  4       22       -4.161485   hypothetical protein
BURPS668_3810  4       19       -2.968283   hypothetical protein
BURPS668_3811  1       17       -1.885911   hypothetical protein
BURPS668_3812  2       15       -1.322623   hypothetical protein
60  BURPS668_3853  BURPS668_3867  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
BURPS668_3853  2       13       0.881968    methyl-accepting chemotaxis protein II
BURPS668_3854  2       12       -0.638582   chemotaxis protein CheW
BURPS668_3855  2       12       -0.433478   chemotaxis protein CheA
BURPS668_3856  1       13       -2.619747   chemotaxis protein CheY
BURPS668_3857  0       14       -2.736267   flagellar motor protein MotB
BURPS668_3858  0       16       -4.317461   flagellar motor protein MotA
BURPS668_3859  1       18       -3.545069   transcriptional activator FlhC
BURPS668_3860  0       18       -3.423999   transcriptional activator FlhD
BURPS668_3861  1       15       -1.215216   glycoside hydrolase family protein
BURPS668_3862  1       14       -0.986034   H-NS histone family protein
BURPS668_3863  0       13       -0.771185   hypothetical protein
BURPS668_3864  0       10       -0.367092   aquaporin Z
BURPS668_3865  -1      10       -1.259987   HAD family hydrolase
BURPS668_3866  2       10       -0.758955   BadF/BadG/BcrA/BcrD ATPase
BURPS668_3867  3       16       -2.493253   DNA-3-methyladenine glycosidase I
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3855  PF06580       46     3e-07    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 46.0 bits (109), Expect = 3e-07
Identities = 21/151 (13%), Positives = 50/151 (33%), Gaps = 52/151 (34%)

Query: 464 ELDKSLIERIIDPLT--HLVRNSLDHGIETVEARRAAGKDAVGQLVLSAAHHGGNIVIEV 521
+++ ++++ + P+ LV N + HGI G+++L G + +EV
Sbjct: 245 QINPAIMDVQVPPMLVQTLVENGIKHGIA--------QLPQGGKILLKGTKDNGTVTLEV 296

Query: 522 SDDGAGLNRERILAKAAKQGMQISENISDDEVWNLIFAPGFSTAEVVTDVSGRGVGMDVV 581
+ G+ + G G+ V
Sbjct: 297 ENTGSLALKNTK--------------------------------------ESTGTGLQNV 318

Query: 582 KRNIQSMGG---HVEISSQAGRGTTTRIVLP 609
+ +Q + G +++S + G +++P
Sbjct: 319 RERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3856  HTHFIS        71     8e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 71.4 bits (175), Expect = 8e-18
Identities = 38/114 (33%), Positives = 58/114 (50%), Gaps = 2/114 (1%)

Query: 4 TILAIDDSATMRTLLSATLGEAGYDVTVASDGEVGLDVALATRFDLVLTDHHMPRKNGLE 63
TIL DD A +RT+L+ L AGYDV + S+ A DLV+TD MP +N +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 LIVALRRQLGYEATPILVLTTENGDAFKDAARAAGATGWIEKPIDPDALIELVA 117
L+ +++ P+LV++ +N A GA ++ KP D LI ++
Sbjct: 65 LLPRIKKA--RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3857  OMPADOMAIN    40     1e-05    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 39.5 bits (92), Expect = 1e-05
Identities = 25/117 (21%), Positives = 51/117 (43%), Gaps = 9/117 (7%)

Query: 182 FAMSSDAVEPYMRDILREIGKTLNDV---PNRIIVQGHTDAVPYAGGEKGYSNWELSADR 238
F + ++P + L ++ L+++ ++V G+TD + G Y N LS R
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRI----GSDAY-NQGLSERR 277

Query: 239 ANASRRELIAGGMDEAKVLRV-LGLASTQNLNKADPLDPENRRISIIVLNRKSELAL 294
A + LI+ G+ K+ +G ++ N D + I + +R+ E+ +
Sbjct: 278 AQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


61  BURPS668_3878  BURPS668_3920  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
BURPS668_3878  1       17       -3.041963   short chain dehydrogenase/reductase
BURPS668_3879  0       13       -3.016999   hexapeptide repeat-containing transferase
BURPS668_3880  0       14       -2.995225   Rieske family iron-sulfur cluster-binding
BURPS668_3881  2       17       -2.968859   hypothetical protein
BURPS668_3882  0       14       -0.748669   hypothetical protein
BURPS668_3883  -1      9        0.410291    LysE family translocator protein
BURPS668_3884  -1      9        -0.302141   acid phosphatase AcpA
BURPS668_3885  2       14       -0.607038   hypothetical protein
BURPS668_3886  2       13       -0.452113   hypothetical protein
BURPS668_3887  0       14       -0.795228   lipoprotein
BURPS668_3888  -1      15       -1.447296   lipoprotein
BURPS668_3889  3       14       -0.521984   csgG family protein
BURPS668_3890  4       14       -0.459502   hypothetical protein
BURPS668_3891  3       14       -0.515714   lipoprotein
BURPS668_3892  3       12       -1.627222   hypothetical protein
BURPS668_3893  2       12       -2.000825   3-oxoadipate enol-lactone hydrolase
BURPS668_3894  0       12       -2.051433   methyl-accepting chemotaxis protein
BURPS668_3895  2       24       -4.716673   hypothetical protein
BURPS668_3896  2       24       -4.679942   hypothetical protein
BURPS668_3897  3       33       -8.313814   FAD/FMN-containing dehydrogenases
BURPS668_3898  5       47       -11.104213  chitin-binding domain-containing protein
BURPS668_3899  5       49       -11.376784  hypothetical protein
BURPS668_3900  6       53       -12.536171  gp30
BURPS668_3901  8       62       -15.176307  hypothetical protein
BURPS668_3902  9       56       -14.138070  hypothetical protein
BURPS668_3903  5       40       -9.308442   phage integrase site specific recombinase
BURPS668_3904  6       38       -8.661346   phage integrase site specific recombinase
BURPS668_3905  6       37       -8.812427   hypothetical protein
BURPS668_3906  6       35       -6.555904   hypothetical protein
BURPS668_3907  7       29       -4.457670   bacteriophage-like protein
BURPS668_3908  6       28       -3.606721   Type II secretory pathway, component PulD
BURPS668_3909  7       31       -3.664054   toxin
BURPS668_3910  6       31       -3.214907   ABC-type Co2+ transport system, permease
BURPS668_3911  7       28       -2.178902   hypothetical protein
BURPS668_3912  8       29       -2.057341   hypothetical protein
BURPS668_3913  1       15       -2.659954   hypothetical protein
BURPS668_3914  2       16       -1.056633   hypothetical protein
BURPS668_3915  1       17       0.764442    hypothetical protein
BURPS668_3918  1       17       1.428493    *cytochrome c5
BURPS668_3917  0       10       1.629951    hypothetical protein
BURPS668_3919  -1      10       1.723372    ATP-dependent DNA helicase Rep
BURPS668_3920  0       20       3.896470    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3878  DHBDHDRGNASE  99     8e-27    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 98.6 bits (245), Expect = 8e-27
Identities = 66/251 (26%), Positives = 111/251 (44%), Gaps = 9/251 (3%)

Query: 7 LAGGTYLVTGASSGIGRATAIAIAQLGGRLVLGGRDPARLADTLAALPGDGHASHAAALD 66
+ G +TGA+ GIG A A +A G + +P +L +++L + + A D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 67 DADAAA--DWVGALAETHGPLAGVFHAAGVELIRPARMTAQAQLEQVFGASLYAAFGIAR 124
D+AA + + GP+ + + AGV + + E F + F +R
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 125 AAAKKNVIADGGSVVYMSSVAGSTGQVGMTAYSAAKAGIEGLVRSLACELAPRRIRANAI 184
+ +K + GS+V + S + M AY+++KA + L ELA IR N +
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 185 AAGAVKTEMHARL--TRGTPEDALAAYEASHLLG-----FGEPGDVAAAAIFLLSGASRW 237
+ G+ +T+M L E + + G +P D+A A +FL+SG +
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 238 ITGTSLVVDGG 248
IT +L VDGG
Sbjct: 246 ITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3888  IGASERPTASE   30     0.003    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.003
Identities = 18/93 (19%), Positives = 28/93 (30%)

Query: 48 QKSPEDQIDALEKALQQIRAKGNRPPPGFEAHLGMLYASVGKEQQAEQSFQAEKASFPES 107
+ + I A ++ + R S E AE S Q K
Sbjct: 996 NITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNE 1055

Query: 108 SPFMDFLLKKKAAAPQAKPSAPAQPQTQTQAQQ 140
+ + + A +AK + A QT AQ
Sbjct: 1056 QDATETTAQNREVAKEAKSNVKANTQTNEVAQS 1088


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3908  BCTERIALGSPD  101    3e-25    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D

signature.
Length = 660

Score = 101 bits (254), Expect = 3e-25
Identities = 63/289 (21%), Positives = 118/289 (40%), Gaps = 36/289 (12%)

Query: 190 AGDVPAMPVETSVQARGDD--LVIVGSHDEVALLRKVVPELDTVPSEVVVRGWVYEVANT 247
A V A+ ++A G L++ + D + L +V+ +LD +V+V + EV +
Sbjct: 300 AKPVAALDKNIIIKAHGQTNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDA 359

Query: 248 DS-----------------TNTAWSIAVRMLSGQLRVSSGDTSSDAS---------AVRF 281
D TN+ I+ + G SS + A F
Sbjct: 360 DGLNLGIQWANKNAGMTQFTNSGLPISTAIAGANQYNKDGTVSSSLASALSSFNGIAAGF 419

Query: 282 TGPGVDAAISALNADSRFKVVSAPHVRIVSGERVRLNVGQQVPTQSSVSYQGSTGTPVQS 341
++AL++ ++ +++ P + + NVGQ+VP + S S +
Sbjct: 420 YQGNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTG-SQTTSGDNIFNT 478

Query: 342 ITYQDAGLIFDVEPTVMR-EVIELKVREEISDF--VATKTGVDTSPTKNTRQLQTVTRLK 398
+ + G+ V+P + + + L++ +E+S A+ T D T NTR + +
Sbjct: 479 VERKTVGIKLKVKPQINEGDSVLLEIEQEVSSVADAASSTSSDLGATFNTRTVNNAVLVG 538

Query: 399 DGEVVVLGGLIQDRNATARSGYAWLPSF-LDG---RSSSKQRTEVLLVL 443
GE VV+GGL+ + L + G RS+SK+ ++ L+L
Sbjct: 539 SGETVVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNLML 587


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3911  PF05616       59     4e-11    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 58.6 bits (141), Expect = 4e-11
Identities = 49/183 (26%), Positives = 66/183 (36%), Gaps = 13/183 (7%)

Query: 480 QPVPQPVPVPLPQPVPHPAPEPAPSPVPQPVPVPVPEPVPGPVPVPVPSPVPEPIPQPIP 539
Q +P+P P P+ P P SP P P P PG P P P P P P
Sbjct: 308 QVIPRPDLTPGSAEAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDT 367

Query: 540 QPLPQPVPIPTPAPGTNPGTGRPSRDD------ICALHPEASACAPL---GSASDVAVNS 590
P P P G R R + +C P+ AC L A D+ + S
Sbjct: 368 DGQPGTRPDSPAVPDRPNGRHRKERKEGEDGGLLCKFFPDILACDRLPEPNPAEDLNLPS 427

Query: 591 DAKRISLSPLSIGLTKGVCPEPKRVVVF----GGELSFSYEPLCEFALKLRPLVLLLGAL 646
+ + I CP P V + +FS+E C A +LR ++L L
Sbjct: 428 ETVNVEFQKSGIFQDSAQCPAPVTFTVTVLDSSRQFAFSFENACTIAERLRYMLLALAWA 487

Query: 647 LAG 649
+A
Sbjct: 488 VAA 490



Score = 45.1 bits (106), Expect = 6e-07
Identities = 30/86 (34%), Positives = 34/86 (39%), Gaps = 2/86 (2%)

Query: 408 GKDVLITPFVPPQPIPVP--VPKPAPTPAPRPASEAEPEPRPAPAPVPGAPPQPRPVPEP 465
G+D V Q IP P P A P +P E P PA P P P RP PEP
Sbjct: 296 GRDSQGNTTVDVQVIPRPDLTPGSAEAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEP 355

Query: 466 QPQPQPMPVPRPVPQPVPQPVPVPLP 491
P P P QP +P +P
Sbjct: 356 DPDLNPDANPDTDGQPGTRPDSPAVP 381



Score = 41.7 bits (97), Expect = 9e-06
Identities = 25/72 (34%), Positives = 30/72 (41%), Gaps = 2/72 (2%)

Query: 426 VPKPAPTPAPRPASEAEPEPRPAPAPVPGAPPQPRPVP--EPQPQPQPMPVPRPVPQPVP 483
+P+P TP A A+P P +PA P P P P P P+P P P P
Sbjct: 310 IPRPDLTPGSAEAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDG 369

Query: 484 QPVPVPLPQPVP 495
QP P VP
Sbjct: 370 QPGTRPDSPAVP 381



Score = 30.9 bits (69), Expect = 0.018
Identities = 22/89 (24%), Positives = 29/89 (32%), Gaps = 14/89 (15%)

Query: 376 QPISEPDIAEWAKENPGQVPTVGDLFTSPARKGKDVLITPFVPPQPIPVPVPKPAPTPAP 435
Q I PD+ + E P P ++P P P P P P P
Sbjct: 308 QVIPRPDLTPGSAEAPNAQPLPE--------------VSPAENPANNPAPNENPGTRPNP 353

Query: 436 RPASEAEPEPRPAPAPVPGAPPQPRPVPE 464
P + P+ P PG P VP+
Sbjct: 354 EPDPDLNPDANPDTDGQPGTRPDSPAVPD 382


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3914  PF09025       25     0.045    YopR Core
		>PF09025#YopR Core

Length = 143

Score = 25.0 bits (54), Expect = 0.045
Identities = 12/41 (29%), Positives = 22/41 (53%)

Query: 33 VLEQESSEGKQILVGTINLPAALKDSPTGDYLAEFALQQSM 73
+L E G+Q + L A++ +P G+YLA+ A ++
Sbjct: 80 MLRAELPLGRQQQTFLLQLLGAVEHAPGGEYLAQLARRELQ 120


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3920  OMADHESIN     28     0.050    Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 27.9 bits (61), Expect = 0.050
Identities = 32/107 (29%), Positives = 45/107 (42%), Gaps = 2/107 (1%)

Query: 60 LSLSAIAASEAFSFAYAWTCRRHRWPLALAAGLAAWAAAASALARLPATPPAATAVAFAA 119
+S+SA S FS YA+ P A ++ A A L P PP A A
Sbjct: 7 ISVSAALISALFSSPYAFADDYDGIPNLTAVQISPNADPALGLE-YPVRPPVPGAGGLNA 65

Query: 120 TCFGQSCLPRGATLAPRAPLSHADLAGRLAAGAALALAVTSLAGALG 166
+ G + GAT + A AG +A G ++A+ L+ ALG
Sbjct: 66 SAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVN-SVAIGPLSKALG 111


62  BURPS668_3964  BURPS668_3989  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
BURPS668_3964  -1      14       -3.665887   cyclohexadienyl dehydratase
BURPS668_3965  1       16       -3.838515   CoA-binding protein
BURPS668_3966  1       17       -3.787862   AMP-binding protein
BURPS668_3967  2       23       -4.658523   ATP synthase F0F1 subunit epsilon
BURPS668_3968  2       25       -4.907151   ATP synthase F0F1 subunit beta
BURPS668_3969  1       20       -5.016700   ATP synthase F0F1 subunit gamma
BURPS668_3970  2       21       -4.705083   ATP synthase F0F1 subunit alpha
BURPS668_3971  5       24       -4.779157   ATP synthase F0F1 subunit delta
BURPS668_3972  4       25       -6.080453   ATP synthase F0F1 subunit B
BURPS668_3973  1       19       -4.479699   ATP synthase F0F1 subunit C
BURPS668_3974  0       16       -3.656842   ATP synthase F0F1 subunit A
BURPS668_3975  0       19       -3.511895   ATP synthase F0F1 subunit I
BURPS668_3976  -1      21       -3.877844   lipoprotein
BURPS668_3978  0       24       -5.098888   hypothetical protein
BURPS668_3977  0       25       -5.090593   transporter
BURPS668_3979  2       26       -5.394572   ParB family protein
BURPS668_3980  1       26       -5.276294   CobQ/CobB/MinD/ParA nucleotide binding
BURPS668_3981  1       24       -4.444733   16S rRNA methyltransferase GidB
BURPS668_3982  2       24       -3.449592   tRNA uridine 5-carboxymethylaminomethyl
BURPS668_3983  0       19       -2.214265   hydrophobic amino acid ABC transporter
BURPS668_3984  -1      16       -1.970701   hydrophobic amino acid ABC transporter
BURPS668_3985  -2      15       -1.275680   amino acid uptake ABC transporter permease
BURPS668_3986  -3      13       -1.807225   amino acid uptake ABC transporter periplasmic
BURPS668_3987  -2      11       -1.628692   hypothetical protein
BURPS668_3988  0       12       -2.919107   amino acid uptake ABC transporter permease
BURPS668_3989  -1      11       -3.447858   amino acid uptake ABC transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3971  FLGMOTORFLIN  27     0.034    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 26.8 bits (59), Expect = 0.034
Identities = 24/87 (27%), Positives = 45/87 (51%), Gaps = 9/87 (10%)

Query: 5 ATIARPYAEALFRVAEGGDISAWSTLVQELAQVAQLPEVLSVASSPKVSRTQ--VAELLL 62
AT + A+A+F+ GGD+S +Q++ + +P L+V ++ RT+ + ELL
Sbjct: 28 ATTTKSAADAVFQQLGGGDVSG---AMQDIDLIMDIPVKLTV----ELGRTRMTIKELLR 80

Query: 63 AALKSPLASGAQAKNFVQMLVDNHRIA 89
S +A A + +L++ + IA
Sbjct: 81 LTQGSVVALDGLAGEPLDILINGYLIA 107


63  BURPS668_0007  BURPS668_0015  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
BURPS668_0007  -1      11       -0.598170   general secretion pathway protein D
BURPS668_0008  0       12       0.560366    general secretory pathway protein E
BURPS668_0009  -1      14       0.626527    general secretion pathway protein F
BURPS668_0010  0       13       1.365579    general secretion pathway protein C
BURPS668_0011  0       12       1.883408    hypothetical protein
BURPS668_0012  0       13       3.235126    general secretion pathway protein G
BURPS668_0013  -1      13       3.996175    general secretion pathway protein H
BURPS668_0014  0       13       3.704503    general secretion pathway protein I
BURPS668_0015  -1      10       4.151808    general secretion pathway protein J
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0007  BCTERIALGSPD  404    e-133    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D

signature.
Length = 660

Score = 404 bits (1040), Expect = e-133
Identities = 216/691 (31%), Positives = 325/691 (47%), Gaps = 88/691 (12%)

Query: 13 TALVVAGIVAAQAAHAQVTLNFVNADIDQVAKAIGAATGKTIIVDPRVKGQLNLVAERPV 72
T L+ A ++ AA + + +F DI + + KT+I+DP V+G + + + +
Sbjct: 13 TLLIFAALLFRPAAAEEFSASFKGTDIQEFINTVSKNLNKTVIIDPSVRGTITVRSYDML 72

Query: 73 PEDQALKTLQSALRMQGFALV-QDHGVLKVVPEADAKLQGVPTYIGNAPQARGDQVVTQV 131
E+Q + S L + GFA++ ++GVLKVV DAK VP AP GD+VVT+V
Sbjct: 73 NEEQYYQFFLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAPGI-GDEVVTRV 131

Query: 132 FELRNESANNLLPVLRPLI--SPNNTITAYPANNTIVVTDYADNVRRIAQIIAGVDSAAG 189
L N +A +L P+LR L + ++ Y +N +++T A ++R+ I+ VD+A
Sbjct: 132 VPLTNVAARDLAPLLRQLNDNAGVGSVVHYEPSNVLLMTGRAAVIKRLLTIVERVDNAGD 191

Query: 190 SQVAVVPLKNANAIDIAAQLTKLLDPGAIGNTDATLKVTVQADPRTNALLLRASNAQRLA 249
V VPL A+A D+ +T+L + ++ V AD RTNA+L+ R
Sbjct: 192 RSVVTVPLSWASAADVVKLVTELNKDTSKSALPGSMVANVVADERTNAVLVSGEPNSR-Q 250

Query: 250 AAKKIAQQLDAPSGVPGNMHVVPLRNAEAVKLAKTLRGMLGKGGGESGSSASSNDANAFN 309
+ +QLD GN V+ L+ A+A L + L G+
Sbjct: 251 RIIAMIKQLDRQQATQGNTKVIYLKYAKASDLVEVLTGIS-------------------- 290

Query: 310 QGGSQSGSNFSTGASGTPPLPSGLSSNSSGGAGGTTGGGGLGNAGLLGGDKDKGDDNQPG 369
S + S +
Sbjct: 291 ---------------------STMQSEKQAAKPVAALDKNI------------------- 310

Query: 370 GMIQADAASNSLIITASDPVYRNLRAVIDQLDARRAQVYIEALVVELQATTSANLGIQWQ 429
+I+A +N+LI+TA+ V +L VI QLD RR QV +EA++ E+Q NLGIQW
Sbjct: 311 -IIKAHGQTNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLNLGIQWA 369

Query: 430 VANNALYAGTNLVTGQTNLGNSIVNLTAGAVT--NPGGTLGSLG---SITNGLNIGWLHN 484
N +T TN G I AGA G SL S NG+ G
Sbjct: 370 NKNAG-------MTQFTNSGLPISTAIAGANQYNKDGTVSSSLASALSSFNGIAAG---- 418

Query: 485 MFGVQGLGALLQFFAGSSDANVLSTPNLVTLDNEEAKIVVGQNVPIPTGSYSNLTSGTTA 544
F LL + S+ ++L+TP++VTLDN EA VGQ VP+ TGS + +
Sbjct: 419 -FYQGNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTGS----QTTSGD 473

Query: 545 NAFNTYDRRDVGLTLHVKPQITEGGILKLQLYTEDSAVVPGTNTTSANSPGPTFTKRSIQ 604
N FNT +R+ VG+ L VKPQI EG + L++ E S+V +++++ G TF R++
Sbjct: 474 NIFNTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSSV-ADAASSTSSDLGATFNTRTVN 532

Query: 605 STVLADNGEIIVLGGLMQDNYQVSNTKVPLLGDIPWIGQLFRSEGKTRQKTNLMVFLRPV 664
+ VL +GE +V+GGL+ + + KVPLLGDIP IG LFRS K K NLM+F+RP
Sbjct: 533 NAVLVGSGETVVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPT 592

Query: 665 IINDRETAQAVTSNRYDYIQGVTGAYKSDNN 695
+I DR+ + +S +Y + N
Sbjct: 593 VIRDRDEYRQASSGQYTAFNDAQSKQRGKEN 623


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0009  BCTERIALGSPF  383    e-133    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F

signature.
Length = 408

Score = 383 bits (985), Expect = e-133
Identities = 174/406 (42%), Positives = 266/406 (65%), Gaps = 2/406 (0%)

Query: 1 MPAFRFEAIDASGRAQKGVIEADSARNARGQLRTQGLTPLVVEPAASAQRGARSQRLALG 60
M + ++A+DA G+ +G EADSAR AR LR +GL PL V+ Q+ + S L+L
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 61 R--KLSQREQAILTRQLASLLVAGLPLDEALAVLTEQAERDYIRELMAAIRAEVLGGHSL 118
R +LS + A+LTRQLA+L+ A +PL+EAL + +Q+E+ ++ +LMAA+R++V+ GHSL
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 119 ANALTQHPRDFPEIYRALVAAGEHTGKLGIVLSRLADYIEERNALKQKILLAFTYPAIVT 178
A+A+ P F +Y A+VAAGE +G L VL+RLADY E+R ++ +I A YP ++T
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 179 VIAFGIVTFLLSYVVPQVVNVFASTKQQLPVLTIVMMALSDFVRHWWWAILIGIAAVVYL 238
V+A +V+ LLS VVP+VV F KQ LP+ T V+M +SD VR + +L+ + A
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 239 VKATLSRDGPRLAFDRWLLTAPLAGKLVRGYNTVRFASTLGILTAAGVPILRALQAAGET 298
+ L ++ R++F R LL PL G++ RG NT R+A TL IL A+ VP+L+A++ +G+
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 299 LSNRAMRGNIEDAIVRVREGSALSRALNNVKTFPPVLVHLIRSGEATGDVTTMLDRAAEG 358
+SN R + A VREG +L +AL FPP++ H+I SGE +G++ +ML+RAA+
Sbjct: 301 MSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADN 360

Query: 359 ESRELERRTMFLTSLLEPLLILAMGGIVLVIVLAVMLPIIELNNMV 404
+ RE + L EPLL+++M +VL IVLA++ PI++LN ++
Sbjct: 361 QDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0012  BCTERIALGSPG  188    6e-65    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G

signature.
Length = 145

Score = 188 bits (480), Expect = 6e-65
Identities = 67/140 (47%), Positives = 94/140 (67%), Gaps = 3/140 (2%)

Query: 10 QAARRQRGFTLIEIMVVVAILGILAALIVPKIMSRPDEARRIAAKQDIGTIMQALKLYRL 69
+A +QRGFTL+EIMVV+ I+G+LA+L+VP +M ++A + A DI + AL +Y+L
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKL 61

Query: 70 DNGRYPTQDQGLNALIQKPTTDPIPNNWKDGGYLERLPNDPWGNSYKYLNPGVHGEIDVF 129
DN YPT +QGL +L++ PT P+ N+ GY++RLP DPWGN Y +NPG HG D+
Sbjct: 62 DNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLL 121

Query: 130 SYGADGKEGGESNDSDIGSW 149
S G DG+ G E DI +W
Sbjct: 122 SAGPDGEMGTE---DDITNW 138


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0013  BCTERIALGSPH  51     1e-10    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H

signature.
Length = 170

Score = 51.5 bits (123), Expect = 1e-10
Identities = 20/101 (19%), Positives = 33/101 (32%), Gaps = 15/101 (14%)

Query: 51 RARGFTLLEMLVVLVIAGILVSVASLTLRRNPRTDLREEAQRIALLFETAGDEAQVRARP 110
R RGFTLLEM+++L++ G+ + L + + R +
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQF 61

Query: 111 IAWRATEHGFRF---------------DIRTGDGWRPLRDD 136
++F D +G W PLR
Sbjct: 62 FGVSVHPDRWQFLVLEARDGADPAPADDGWSGYRWLPLRAG 102


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0014  BCTERIALGSPG  30     0.001    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G

signature.
Length = 145

Score = 30.2 bits (68), Expect = 0.001
Identities = 10/26 (38%), Positives = 18/26 (69%)

Query: 10 RSPARSRGFTMIEVLVALAIIAVALA 35
R+ + RGFT++E++V + II V +
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLAS 27


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0015  BCTERIALGSPG  33     3e-04    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G

signature.
Length = 145

Score = 33.3 bits (76), Expect = 3e-04
Identities = 17/72 (23%), Positives = 34/72 (47%), Gaps = 3/72 (4%)

Query: 33 RGFTLIEMMIAITILAVIA-ILSWRGLDQIIRGREKVAAAMEDERVFAQMFDQMRIDARR 91
RGFTL+E+M+ I I+ V+A ++ + + ++ A+ D D ++D
Sbjct: 8 RGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQK--AVSDIVALENALDMYKLDNHH 65

Query: 92 AATDDEAGQPAV 103
T ++ + V
Sbjct: 66 YPTTNQGLESLV 77


64  BURPS668_0029  BURPS668_0034  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_0029  0       9        -0.532244  flagellar motor switch protein FliM
BURPS668_0030  0       12       1.521609   flagellar motor switch protein FliN
BURPS668_0031  -1      10       2.349579   flagellar biosynthesis protein FliO
BURPS668_0032  -1      11       2.289950   flagellar biosynthesis protein FliP
BURPS668_0033  -3      9        2.129747   flagellar biosynthesis protein FliQ
BURPS668_0034  -2      8        1.475328   flagellar biosynthetic protein FliR
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0029  FLGMOTORFLIM  276    2e-93    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 276 bits (706), Expect = 2e-93
Identities = 82/324 (25%), Positives = 159/324 (49%), Gaps = 10/324 (3%)

Query: 5 EFMSQEEVDALLKGVTGEDDSADEPAEASG---IRPYNIATQERIVRGRMPGLEIINDRF 61
E +SQ+E+D LL ++ D S ++ S I Y+ ++ + +M L ++++ F
Sbjct: 3 EVLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETF 62

Query: 62 ARLLRIGIFNFMRRTAEISVSQVKVQKYSEFTRNLPIPTNLNLVHVKPLRGTSLFVFDPN 121
ARL + +R + V+ V Y EF R++P P+ L ++ + PL+G ++ DP+
Sbjct: 63 ARLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPS 122

Query: 122 LVFFVVDNLFGGDGRFHTRVEGRDFTATEQRIIGKLLNLVFEHYASAWKSVRPLQFEFVR 181
+ F ++D LFGG G+ RD T E ++ ++ + + +W V L+ +
Sbjct: 123 ITFSIIDRLFGGTGQAAKVQ--RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQ 180

Query: 182 SEMHTQFANVATPNEIVIVTQFSIEFGPTGGTLHICMPYSMIEPIRDVLSSPIQGEAL-- 239
E + QFA + P+E+V++ + G G ++ C+PY IEPI LSS ++
Sbjct: 181 IETNPQFAQIVPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 240 EVDRRWVRVLSQQVQSAEVELVADLAEVPTTFEKILNLRTGDVLPLD---ITDSITAKVD 296
+++ VL ++ + ++++VA++ + + IL LR GD++ L + D +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 297 GVPVMECGYGIFNGQYALRVQRMI 320
C G+ + A ++ I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0030  FLGMOTORFLIN  134    3e-43    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 134 bits (338), Expect = 3e-43
Identities = 78/126 (61%), Positives = 97/126 (76%), Gaps = 3/126 (2%)

Query: 41 AMDD-WAAALAEQNQQPIETGATGAGVFRPLSKATASSTHNDIDLILDIPVKMTVELGRT 99
A+DD WA AL EQ ++ A VF+ L S DIDLI+DIPVK+TVELGRT
Sbjct: 14 ALDDLWADALNEQKATTTKSAADA--VFQQLGGGDVSGAMQDIDLIMDIPVKLTVELGRT 71

Query: 100 KIAIRNLLQLAQGSVVELDGLAGEPMDVLVNGCLIAQGEVVVVNDKFGIRLTDIITPSER 159
++ I+ LL+L QGSVV LDGLAGEP+D+L+NG LIAQGEVVVV DK+G+R+TDIITPSER
Sbjct: 72 RMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGVRITDIITPSER 131

Query: 160 IRKLNR 165
+R+L+R
Sbjct: 132 MRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0032  FLGBIOSNFLIP  297    e-104    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP

signature.
Length = 245

Score = 297 bits (761), Expect = e-104
Identities = 154/242 (63%), Positives = 192/242 (79%), Gaps = 1/242 (0%)

Query: 34 RWLPAILIGLAPALACAQAAGLPAFNSAPGPNGGTTYSLSVQTMLLLTMLSFLPAMLLMM 93
R L + L + A LP S P P GG ++SL VQT++ +T L+F+PA+LLMM
Sbjct: 3 RLLSVAPVLLW-LITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLMM 61

Query: 94 TSFTRIIIVLSLLRQAIGTASTPPNQVLVGLALFLTLFVMSPVLDRAYNDAYKPFSEGTL 153
TSFTRIIIV LLR A+GT S PPNQVL+GLALFLT F+MSPV+D+ Y DAY+PFSE +
Sbjct: 62 TSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEKI 121

Query: 154 QMDQAVQRGTAPFKAFMLKQTRETDLALFAKISKAAPMQGPEDVPLSLLVPAFVTSELKT 213
M +A+++G P + FML+QTRE DL LFA+++ P+QGPE VP+ +L+PA+VTSELKT
Sbjct: 122 SMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELKT 181

Query: 214 GFQIGFTIFIPFLIIDMVVASVLMSMGMMMVSPATVSLPFKLMLFVLVDGWQLLIGSLAQ 273
FQIGFTIFIPFLIID+V+ASVLM++GMMMV PAT++LPFKLMLFVLVDGWQLL+GSLAQ
Sbjct: 182 AFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLAQ 241

Query: 274 SF 275
SF
Sbjct: 242 SF 243


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0033  TYPE3IMQPROT  69     4e-19    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein

family signature.
Length = 86

Score = 68.6 bits (168), Expect = 4e-19
Identities = 26/85 (30%), Positives = 46/85 (54%)

Query: 4 ENVMTLAHQAMYIGLLLAAPLLLVALAVGLVVSLFQAATQINEATLSFIPKLLAVAATMV 63
++++ ++A+Y+ L+L+ +VA +GL+V LFQ TQ+ E TL F KLL V +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLSTMIDYLRETLLRVATLG 88
+ W ++ Y R+ + G
Sbjct: 62 LLSGWYGEVLLSYGRQVIFLALAKG 86


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0034  TYPE3IMRPROT  161    5e-51    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein

family signature.
Length = 261

Score = 161 bits (410), Expect = 5e-51
Identities = 117/250 (46%), Positives = 159/250 (63%), Gaps = 1/250 (0%)

Query: 1 MFSVTYAQLNGWLTAFLWPFVRMLALVAIAPVTGHRSTPVRVKIGLAGFMALVVAPTLPP 60
M VT Q WL + WP +R+LAL++ AP+ RS P RVK+GLA + +AP+LP
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 MPVATVFSAQGVWIIVNQFLIGAALGFTMQIVFAAIEAAGDIIGLSMGLGFATFFDPHSS 120
V VFS +W+ V Q LIG ALGFTMQ FAA+ AG+IIGL MGL FATF DP S
Sbjct: 61 NDVP-VFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASH 119

Query: 121 GATPVMGRFLNAVAILAFLAFDGHLQVFAALVDSFRLVPVSADLLRAAGWQTLVAFGAAI 180
PV+ R ++ +A+L FL F+GHL + + LVD+F +P+ + L + + L G+ I
Sbjct: 120 LNMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLI 179

Query: 181 FEMGLLLALPVVAALLIANLALGILNRAAPQIGIFQVGFPVTMLVGLLLVQLMAPNLIPF 240
F GL+LALP++ LL NLALG+LNR APQ+ IF +GFP+T+ VG+ L+ + P + PF
Sbjct: 180 FLNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPF 239

Query: 241 VGRLFDTGVD 250
LF +
Sbjct: 240 CEHLFSEIFN 249


65  BURPS668_0041  BURPS668_0049  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_0041  -3      11       0.157723   sensor histidine kinase
BURPS668_0042  -2      10       -0.560088  DNA-binding response regulator
BURPS668_0043  -2      10       -0.442663  hypothetical protein
BURPS668_0044  -2      10       -0.550943  porin
BURPS668_0045  -3      9        -0.233410  hypothetical protein
BURPS668_0046  -2      10       -0.714130  type III DNA modification methyltransferase
BURPS668_0047  -2      11       -0.531498  type III restriction enzyme, res subunit
BURPS668_0048  0       14       0.130717   hypothetical protein
BURPS668_0049  0       12       0.380640   porin
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_0041  PF06580  54     3e-10    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 54.5 bits (131), Expect = 3e-10
Identities = 24/128 (18%), Positives = 45/128 (35%), Gaps = 22/128 (17%)

Query: 334 RIDLGAELDDDLQVAGSESLLSALLMNLVDNAVRYAHE----GGRVTVSARRDGDAVVLE 389
R+ +++ + + L+ LV+N +++ GG++ + +D V LE
Sbjct: 239 RLQFENQINPAIM---DVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLE 295

Query: 390 VVDDGPGIPAEARPHVFKRFYRVARDEEGTGLGLAIVEE-IAQSHGGAVSLATGPGNRGV 448
V + G +E TG GL V E + +G + V
Sbjct: 296 VENTGSLALKN--------------TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKV 341

Query: 449 RMTVRLPA 456
V +P
Sbjct: 342 NAMVLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
BURPS668_0042  HTHFIS  96     3e-25    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 96.1 bits (239), Expect = 3e-25
Identities = 30/119 (25%), Positives = 60/119 (50%), Gaps = 1/119 (0%)

Query: 2 KLLLVEDNAELAHWIVDLLRGEGFGVDSAPDGESADTVLKAQRYDALLLDMRLPGMSGKE 61
+L+ +D+A + + L G+ V + + + A D ++ D+ +P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 LLARLRRRGDNVPVLMLTAHGSVDDKVDCFSAGADDYVVKPFESRELVARI-RALIRRQ 119
LL R+++ ++PVL+++A + + GA DY+ KPF+ EL+ I RAL +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0044  ECOLNEIPORIN  61     1e-12    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 61.4 bits (149), Expect = 1e-12
Identities = 54/228 (23%), Positives = 94/228 (41%), Gaps = 28/228 (12%)

Query: 12 LKRQYLALSIATAACAAPQAHAQSSVQLYGLIDLSIPTYRSHANAKGDHVIGMGLGGEPW 71
+K+ +AL++A AA + V LYG I + T RS A+ G + G
Sbjct: 1 MKKSLIALTLAALPVAA-----MADVTLYGTIKAGVETSRSVAH-NGAQAASVETGTGIV 54

Query: 72 FSGSRWGLKGAEDIGGGTKVIFRLESEYTVADGNMEDPGQIFDRDAWVGVENDTFGKLTA 131
GS+ G KG ED+G G K I+++E + ++A + +R +++G++ FGKL
Sbjct: 55 DLGSKIGFKGQEDLGNGLKAIWQVEQKASIAGTD----SGWGNRQSFIGLKGG-FGKLRV 109

Query: 132 GFQNTIARDAAAIYGDPYGSAKLTTEEGGWTNANNFKQMIFYAAGATGTRYNNGLAWKKL 191
G N++ +D I +P+ S RY++ +
Sbjct: 110 GRLNSVLKDTGDI--NPWDSKSDYLGVNKIAEPEARL---------ISVRYDS----PEF 154

Query: 192 FGNGIFASAGYAFSNSTSFGQNSTYQVALGYNGGPFNVSGFFSHVNHA 239
G+ S YA +++ + +Y Y G F V ++ H
Sbjct: 155 A--GLSGSVQYALNDNAGRHNSESYHAGFNYKNGGFFVQYGGAYKRHH 200


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
BURPS668_0048  RTXTOXIND  25     0.009    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 25.2 bits (55), Expect = 0.009
Identities = 13/39 (33%), Positives = 20/39 (51%), Gaps = 5/39 (12%)

Query: 10 WRDAARMRQDSHGPVRDRDA----PAH-EPSRPPPTRRA 43
W + ++R+ PVR++D PAH E P +RR
Sbjct: 19 WSETWKIRKQLDTPVREKDENEFLPAHLELIETPVSRRP 57


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0049  ECOLNEIPORIN  72     3e-16    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 72.1 bits (177), Expect = 3e-16
Identities = 80/356 (22%), Positives = 118/356 (33%), Gaps = 75/356 (21%)

Query: 1 MKK--FAVAAAGLAVATGAHASDGSVTLFGLIDAGVSYVSNEGGKRNVYFDDGIAVPNLW 58
MKK A+ A L VA A VTL+G I AGV + +
Sbjct: 1 MKKSLIALTLAALPVAAMA-----DVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVD 55

Query: 59 -----GLRGTEDLGGGAKAIFELTSQYALGNGAALPTPGSIFSRTALVGLWSERLGSMTL 113
G +G EDLG G KAI+++ + T +R + +GL G + +
Sbjct: 56 LGSKIGFKGQEDLGNGLKAIWQVEQ-----KASIAGTDSGWGNRQSFIGLKGG-FGKLRV 109

Query: 114 GQQYDFMTDSLTFGSFDGAFRYGGLYNFRQGPFSKLGIPDNPTGSFDFDRLAGSSRVPNS 173
G+ + D+ +D Y G+ + P+ S
Sbjct: 110 GRLNSVLKDTGDINPWDSKSDYLGVNKIAE--------PEA---------------RLIS 146

Query: 174 VKYTSANLNGLVFGLMYGFGNQAGGGLAANSTVSAGLKYETGSFAL--GAAYVEVKYPQM 231
V+Y S GL + Y + AG + + AG Y+ G F + G AY Q
Sbjct: 147 VRYDSPEFAGLSGSVQYALNDNAGRH--NSESYHAGFNYKNGGFFVQYGGAYKRHHQVQE 204

Query: 232 NNGHDGLRNWGLGARYALSAFDLNL-LYTNTRNT--LTGAAIDVIQAGVRYVGAPWTIGA 288
N + + L + Y L + ++ + Q V A
Sbjct: 205 NVNIEKYQIHRLVSGY--DNDALYASVAVQQQDAKLVEENYSHNSQTEV---------AA 253

Query: 289 NYEYMKGNAQLDRNYAH----------------QVTAAAQYALSKRTSAYVETVYQ 328
Y GN +YAH QV A+Y SKRTSA V +
Sbjct: 254 TLAYRFGNVTPRVSYAHGFKGSFDATNYNNDYDQVVVGAEYDFSKRTSALVSAGWL 309


66  BURPS668_0164  BURPS668_0173  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_0164  -3      9        1.469209   HpcH/HpaI aldolase
BURPS668_0165  -3      9        0.297802   hypothetical protein
BURPS668_0166  -1      11       -0.420357  rod shape-determining protein RodA
BURPS668_0167  -1      13       0.512229   penicillin-binding protein
BURPS668_0168  0       12       -0.344891  rod shape-determining protein MreD
BURPS668_0169  0       12       -0.655510  rod shape-determining protein MreC
BURPS668_0170  0       11       -1.972152  rod shape-determining protein MreB
BURPS668_0171  0       12       -2.344466  aspartyl/glutamyl-tRNA amidotransferase subunit
BURPS668_0172  -1      11       -2.384137  aspartyl/glutamyl-tRNA amidotransferase subunit
BURPS668_0173  -1      13       -2.747730  aspartyl/glutamyl-tRNA amidotransferase subunit
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0164  PHPHTRNFRASE  45     2e-07    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase

signature.
Length = 572

Score = 45.2 bits (107), Expect = 2e-07
Identities = 36/178 (20%), Positives = 60/178 (33%), Gaps = 34/178 (19%)

Query: 145 RALDAGARTLMFPGVETADEAAHAVRLTRFQAPDAPDGLRGVAGIVRAAAYGMRRDYVQT 204
RA G +MFP + T +E LR I++ + + V
Sbjct: 380 RASTYGNLKVMFPMIATLEE------------------LRQAKAIMQEEKDKLLSEGVDV 421

Query: 205 ANAQIATIVQIESARGVDEAERIAATPGVDCVFVGPADL----------SASLGHLGDTK 254
++ I + +E A A VD +G DL + + +L
Sbjct: 422 SD-SIEVGIMVEIPSTAVAANLFA--KEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPY 478

Query: 255 HPDVAAALEHVLAAGRRAGVPVGI---FAADTAGARQSLEAGFRVVALSADVVWLLRA 309
HP + ++ V+ A G VG+ A D L G ++SA + R+
Sbjct: 479 HPAILRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARS 536


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_0167  cloacin  32     0.014    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 31.6 bits (71), Expect = 0.014
Identities = 19/69 (27%), Positives = 26/69 (37%)

Query: 681 SGADGASGASGAGGEPTEHANAGGNPAGGGIAGGAAGTANNGSGAAAPGGMPGANGAATG 740
+G GAS G +E+ GG G GG +G N G + GG +
Sbjct: 25 TGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAV 84

Query: 741 APPASRRLP 749
A P + P
Sbjct: 85 AAPVAFGFP 93



Score = 30.8 bits (69), Expect = 0.027
Identities = 19/68 (27%), Positives = 22/68 (32%), Gaps = 2/68 (2%)

Query: 677 PASASGADGASGASGAGGEPTEHANAGGNPAGGGIAGGAAGTANNGSGAAAPGGMPGANG 736
P GAS SG E G+ G G NG+ G G N
Sbjct: 24 PTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGT--GGNL 81

Query: 737 AATGAPPA 744
+A AP A
Sbjct: 82 SAVAAPVA 89


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
BURPS668_0169  GPOSANCHOR  28     0.046    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 28.5 bits (63), Expect = 0.046
Identities = 16/64 (25%), Positives = 22/64 (34%), Gaps = 3/64 (4%)

Query: 293 KAAKGKKATKGADKSAKAADKGADKDKGAKPAAAPPVPARSRPAGPAQPAAPLKPATAPS 352
K + +KA A A+A A K+K AK A + + P A P
Sbjct: 424 KLTEKEKAELQAKLEAEAK---ALKEKLAKQAEELAKLRAGKASDSQTPDAKPGNKAVPG 480

Query: 353 PGAP 356
G
Sbjct: 481 KGQA 484


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0170  SHAPEPROTEIN  504    0.0      Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein

signature.
Length = 347

Score = 504 bits (1300), Expect = 0.0
Identities = 247/348 (70%), Positives = 294/348 (84%), Gaps = 2/348 (0%)

Query: 1 MFGFLRSYFSNDLAIDLGTANTLIYMRGKGIVLDEPSVVSIRQEGGPNGKKTIQAVGKEA 60
M R FSNDL+IDLGTANTLIY++G+GIVL+EPSVV+IRQ+ K++ AVG +A
Sbjct: 1 MLKKFRGMFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRA-GSPKSVAAVGHDA 59

Query: 61 KQMLGKVPGNIEAIRPMKDGVIADFTVTEQMIKQFIKTAHESRMFSPSPRIIICVPCGST 120
KQMLG+ PGNI AIRPMKDGVIADF VTE+M++ FIK H + PSPR+++CVP G+T
Sbjct: 60 KQMLGRTPGNIAAIRPMKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGAT 119

Query: 121 QVERRAIKEAAHGAGASQVYLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVGVISLG 180
QVERRAI+E+A GAGA +V+LIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEV VISL
Sbjct: 120 QVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLN 179

Query: 181 GIVYKGSVRVGGDKFDEAIVNYIRRNYGMLIGEQTAEAIKKEIGSAFPGSEVKEMEVKGR 240
G+VY SVR+GGD+FDEAI+NY+RRNYG LIGE TAE IK EIGSA+PG EV+E+EV+GR
Sbjct: 180 GVVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGR 239

Query: 241 NLSEGIPRSFTISSNEILEALTDPLNQIVSSVKIALEQTPPELGADIAERGMMLTGGGAL 300
NL+EG+PR FT++SNEILEAL +PL IVS+V +ALEQ PPEL +DI+ERGM+LTGGGAL
Sbjct: 240 NLAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGAL 299

Query: 301 LRDLDRLLAEETGLPVLVAEDPLTCVVRGSGMALERMDKL-GSIFSYE 347
LR+LDRLL EETG+PV+VAEDPLTCV RG G ALE +D G +FS E
Sbjct: 300 LRNLDRLLMEETGIPVVVAEDPLTCVARGGGKALEMIDMHGGDLFSEE 347


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
BURPS668_0173  TYPE4SSCAGA  31     0.013    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 30.8 bits (69), Expect = 0.013
Identities = 29/89 (32%), Positives = 43/89 (48%), Gaps = 5/89 (5%)

Query: 395 SNKIAKEIFVTIWDEKAADEGAADRIIEAKGLK-QISDTGALEAIIDEVLAANAKSVEEF 453
+N EIF I E D A KG+K ++SD LE + ++ L KS +EF
Sbjct: 648 ANSQKDEIFALINKEANRDARAIAYAQNLKGIKRELSDK--LENV-NKNLKDFDKSFDEF 704

Query: 454 RAGKDKAFNALVGQAMKATKGKANPQQVN 482
+ GK+K F+ + +KA KG +N
Sbjct: 705 KNGKNKDFSK-AEETLKALKGSVKDLGIN 732


67  BURPS668_0183  BURPS668_0190  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_0183  0       10       0.091345   nucleoid occlusion protein
BURPS668_0184  0       11       -0.140577  HAD-superfamily hydrolase
BURPS668_0185  0       11       -1.624559  acetylglutamate kinase
BURPS668_0186  -1      10       -0.957262  hypothetical protein
BURPS668_0187  0       10       -1.576597  hypothetical protein
BURPS668_0188  2       11       -3.129866  sensor histidine kinase
BURPS668_0189  2       13       -2.747668  Fis family transcriptional regulator
BURPS668_0190  2       12       -2.098385  ATP-dependent protease ATP-binding subunit HslU
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_0183  HTHTETR  58     1e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 58.1 bits (140), Expect = 1e-12
Identities = 31/183 (16%), Positives = 62/183 (33%), Gaps = 15/183 (8%)

Query: 24 ASRTRPKPGERRVHILQTLASMLEAPKSEKITTAALAARLDVSEAALYRHFSSKAQMFEG 83
A +T+ + E R HIL + + +A V+ A+Y HF K+ +F
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 84 LIEFIEETFFGLVNQIAANEPNGVLQA-RSIALMLLNFSAKNPGMTRVLTGEALVGEHER 142
+ E E L + A P L R I + +L + ++ + H+
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLME----IIFHKC 117

Query: 143 LAERVNQMLERVEASIKQCLR---VALLEAQAHAAGGAPPPVPLPDDYDPALRASLVISY 199
++++ + ++ L+ A P + A ++ Y
Sbjct: 118 EFVGEMAVVQQAQRNLCLESYDRIEQTLKH-CIEAKMLPADL------MTRRAAIIMRGY 170

Query: 200 VLG 202
+ G
Sbjct: 171 ISG 173


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0185  CARBMTKINASE  44     5e-07    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 43.7 bits (103), Expect = 5e-07
Identities = 27/99 (27%), Positives = 48/99 (48%), Gaps = 6/99 (6%)

Query: 232 IPVISPIGFGEDGLSYNINADLVAGKLATVLNAEKLVMMTNIPGVMDKEG----NLLTDL 287
+PVI G G+ I+ DL KLA +NA+ +++T++ G G L ++
Sbjct: 197 VPVILEDG-EIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGAALYYGTEKEQWLREV 255

Query: 288 SAREIDALFEDGT-ISGGMLPKISSALDAAKSGVKSVHI 325
E+ +E+G +G M PK+ +A+ + G + I
Sbjct: 256 KVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAII 294



Score = 36.7 bits (85), Expect = 9e-05
Identities = 21/60 (35%), Positives = 27/60 (45%), Gaps = 10/60 (16%)

Query: 83 GKTVVIKYGGNAMTEERLKQGF----------ARDVILLKLVGINPVIVHGGGPQIDQAL 132
GK VVI GGNA+ + K + AR + + G VI HG GPQ+ L
Sbjct: 2 GKRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLL 61


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
BURPS668_0189  HTHFIS  88     9e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 87.6 bits (217), Expect = 9e-23
Identities = 30/127 (23%), Positives = 60/127 (47%)

Query: 1 MSDKNFLVIDDNEVFAGTLARGLERRGYAVRQAHNKDEALKLAGAEKFEFITVDLHLGND 60
M+ LV DD+ L + L R GY VR N + A + + D+ + ++
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 SGLSLIAPLCDLQPDARILVLTGYASIATAVQAVKDGADNYLAKPANVESILAALQTNAS 120
+ L+ + +PD +LV++ + TA++A + GA +YL KP ++ ++ + +
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 EVQAEEA 127
E + +
Sbjct: 121 EPKRRPS 127



Score = 45.2 bits (107), Expect = 4e-08
Identities = 16/101 (15%), Positives = 32/101 (31%), Gaps = 3/101 (2%)

Query: 75 DARILVLTGYASIATAVQAVKDGADNYLAKPANVESILAALQTNASEVQAEEALENPVVL 134
I+ + I + L+ VE + + + L
Sbjct: 375 TREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGL---YDR 431

Query: 135 SVDRLEWEHIQRVLAENNNNISATARALNMHRRTLQRKLAK 175
+ +E+ I L N A L ++R TL++K+ +
Sbjct: 432 VLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRE 472


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
BURPS668_0190  HTHFIS  31     0.016    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.6 bits (69), Expect = 0.016
Identities = 13/68 (19%), Positives = 29/68 (42%), Gaps = 15/68 (22%)

Query: 17 IIGQAKAKKAVAVALRNRWRRQQVAEPLRQEITPKNILMIGPTGVGKTEIAR---RLAKL 73
++G++ A + + ++ + T +++ G +G GK +AR K
Sbjct: 139 LVGRSAAMQEI------YRVLARLMQ------TDLTLMITGESGTGKELVARALHDYGKR 186

Query: 74 ADAPFIKI 81
+ PF+ I
Sbjct: 187 RNGPFVAI 194


68  BURPS668_0217  BURPS668_0227  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias   Product
BURPS668_0217  0       12       3.210358  flagellar hook-length control protein FliK
BURPS668_0218  2       14       1.653303  flagellar export protein FliJ
BURPS668_0219  2       12       1.285758  flagellar protein export ATPase FliI
BURPS668_0220  1       12       0.297878  flagellar assembly protein H
BURPS668_0221  2       10       2.359422  flagellar motor switch protein G
BURPS668_0222  0       9        3.767198  flagellar MS-ring protein
BURPS668_0223  2       11       4.525182  flagellar hook-basal body complex protein FliE
BURPS668_0224  1       9        4.841104  flagellar protein FliS
BURPS668_0225  -1      8        3.950569  hypothetical protein
BURPS668_0226  -1      9        3.282096  hypothetical protein
BURPS668_0227  0       12       2.151817  flagellar biosynthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
BURPS668_0217  FLGHOOKFLIK  71     1e-15    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 71.4 bits (174), Expect = 1e-15
Identities = 65/198 (32%), Positives = 90/198 (45%), Gaps = 6/198 (3%)

Query: 259 ALAALRDAADSARATLAASSAPAALQQAA-PAALAANAGAAAASAAPSLAPPVGTPDWTD 317
L A++ S P+ + AA P AAP L+ P+G+ +W
Sbjct: 183 PAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEWQQ 242

Query: 318 ALSQKVVFLSNAHQQSAELTLNPPDLGPLQVVLRVADNHAHALFVSQHAQVRDAVEAALP 377
+LSQ + + QQSAEL L+P DLG +Q+ L+V DN A VS H VR A+EAALP
Sbjct: 243 SLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAALP 302

Query: 378 KLREAMEAGGLGLGSASVSDGGFASAQQQQTPQRQSSDGSATRRAFGASTADAALDELAA 437
LR + G+ LG +++S F+ QQ + Q+QS +A D L
Sbjct: 303 VLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQR-TANHEPLAGEDDDT----LPV 357

Query: 438 ASSSGATRRTVGMVDTFA 455
S VD FA
Sbjct: 358 PVSLQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_0218  FLGFLIJ  60     2e-14    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 59.8 bits (144), Expect = 2e-14
Identities = 43/140 (30%), Positives = 74/140 (52%)

Query: 1 MAQSFPLQLLLERAQDDLDTAAKQLGRAQRERTDAQAQLDALMRYRDEYRVRFAESAQSG 60
MA+ L L + A+ +++ AA+ LG +R A+ QL L+ Y++EYR +G
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 MPAGNWRNFQAFLDTLDAAIEQQRRVLAAAQTRIDAARPEWQAKKRTLGSYEILQARGAR 120
+ + W N+Q F+ TL+ AI Q R+ L ++D A W+ KK+ L +++ LQ R +
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 QDAQRAAKREQRDADEHAAK 140
+ +Q+ DE A +
Sbjct: 121 AALLAENRLDQKKMDEFAQR 140


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_0220  FLGFLIH  109    1e-31    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 109 bits (273), Expect = 1e-31
Identities = 64/184 (34%), Positives = 106/184 (57%), Gaps = 4/184 (2%)

Query: 37 AAAALAAELQRVRDAAHAEGLAAGHVEGQALGYQAGYEQGRAKGFDEGQAEAHTHAAQLA 96
A +L +L +++ AH +G AG EG+ G++ GY++G A+G ++G AEA + A +
Sbjct: 36 AEPSLEQQLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIH 95

Query: 97 A----LAASFRDALAGVERDLADDIATLALEIAQQVVRQHVQHDPAALIAAAREVLAAEP 152
A L + F+ L ++ +A + +ALE A+QV+ Q D +ALI +++L EP
Sbjct: 96 ARMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEP 155

Query: 153 ALAGAPHLIVNPADLPVVEAYLKDELDTLGWSVRTDTSIERGGCRAHASTGEIDATLTTR 212
+G P L V+P DL V+ L L GW +R D ++ GGC+ A G++DA++ TR
Sbjct: 156 LFSGKPQLRVHPDDLQRVDDMLGATLSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATR 215

Query: 213 WERV 216
W+ +
Sbjct: 216 WQEL 219


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0221  FLGMOTORFLIG  298    e-102    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 298 bits (765), Expect = e-102
Identities = 114/324 (35%), Positives = 191/324 (58%)

Query: 5 GLNKSALLLMSIGEEEAAQVFKFLAPREVQKIGAAMAALKNVTREQVEDVLNDFVQEAEK 64
G K+A+LL+SIG E +++VFK+L+ E++ + +A L+ +T E ++VL +F +
Sbjct: 17 GKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFKELMMA 76

Query: 65 HTALSLDSSEYIRTVLTKALGEDKAGVLIDRILQGSDTSGIEGLKWMDSAAVAELIKNEH 124
+ +Y R +L K+LG KA +I+ + + E ++ D A + I+ EH
Sbjct: 77 QEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRADPANILNFIQQEH 136

Query: 125 PQIIATILVHLDRDQASEIASCFTERLRNDVLLRIATLDGIQPTALRELDDVLTGLLSGS 184
PQ IA IL +LD +AS I S ++ +V RIA +D P +RE++ VL L+
Sbjct: 137 PQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLEKKLASL 196

Query: 185 DNLKRAPMGGIRTAAEILNFMTSVHEEAVIENVKQYDPDLAQKIIDQMFVFENLLDLEDR 244
+ GG+ EI+N E+ +IE++++ DP+LA++I +MFVFE+++ L+DR
Sbjct: 197 SSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDIVLLDDR 256

Query: 245 AIQLLLKEVESEALIIALKGAPPALRQKFLSNMSQRAAELLAEDLDARGPVRVSEVETQQ 304
+IQ +L+E++ + L ALK +++K NMS+RAA +L ED++ GP R +VE Q
Sbjct: 257 SIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRKDVEESQ 316

Query: 305 RKILQVVRNLAESGQIVIGGKAED 328
+KI+ ++R L E G+IVI E+
Sbjct: 317 QKIVSLIRKLEEQGEIVISRGGEE 340


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0222  FLGMRINGFLIF  468    e-162    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 468 bits (1205), Expect = e-162
Identities = 254/562 (45%), Positives = 360/562 (64%), Gaps = 37/562 (6%)

Query: 53 LSRMKTNPRLPFLIGAALAIAAIVALVLWSRAPDYRVLYSNLSDRDGGAIIAALQQANVP 112
L+R++ NPR+P ++ + A+A +VA+VLW++ PDYR L+SNLSD+DGGAI+A L Q N+P
Sbjct: 16 LNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIP 75

Query: 113 YKFADAGGAILVPANQVHETRLKLAAMGLPKGGSVGFELMDNQKFGISQFAEQVNYQRAL 172
Y+FA+ GAI VPA++VHE RL+LA GLPKGG+VGFEL+D +KFGISQF+EQVNYQRAL
Sbjct: 76 YRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQVNYQRAL 135

Query: 173 EGELQRTVESINAVRAARVHLAIPKPSVFVRDREAPSASVLVDLYPGRVLDEGQVLAVTR 232
EGEL RT+E++ V++ARVHLA+PKPS+FVR++++PSASV V L PGR LDEGQ+ AV
Sbjct: 136 EGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALDEGQISAVVH 195

Query: 233 MVSSSVPDMPAKNVTIVDQDGNLLTQT-ASATGLDASQLKYVQQIERNTQKRIDAILAPI 291
+VSS+V +P NVT+VDQ G+LLTQ+ S L+ +QLK+ +E Q+RI+AIL+PI
Sbjct: 196 LVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRIQRRIEAILSPI 255

Query: 292 FGAGNARSQVSADVDFSKIEQTSESYGPNGTPQQSAIRSQQTSSSTELAQSGASGVPGAL 351
G GN +QV+A +DF+ EQT E Y PNG ++ +RS+Q + S ++ GVPGAL
Sbjct: 256 VGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPGGVPGAL 315

Query: 352 SNTPPQPASAPIVA-------------SNGQPAGPAATPVSDRKDSTTNYELDKTVRHVE 398
SN P P API ++ +A P S +++ T+NYE+D+T+RH +
Sbjct: 316 SNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSNYEVDRTIRHTK 375

Query: 399 QSMGTIKRLSVAVVVNYQPSTDAKGRVTMQPLAADKLAQVQQLVKDAMGYDEKRGDSVNV 458
++G I+RLSVAVVVNY+ D K PL AD++ Q++ L ++AMG+ +KRGD++NV
Sbjct: 376 MNVGDIERLSVAVVVNYKTLADGKP----LPLTADQMKQIEDLTREAMGFSDKRGDTLNV 431

Query: 459 VNSAFSAAADPFANLPWWRQPDMIELGKDIAKWLGVAAAAAALYFMFVRPALRR---AFP 515
VNS FSA + LP+W+Q I+ +WL V A L+ VRP L R
Sbjct: 432 VNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLTRRVEEAK 491

Query: 516 PPAEPAAAAVPALDGPDDVLALDGLPSPDKKQLAEEDEEHPALLAFENERNRYERNLDYA 575
E A + + L+ D + N+R E
Sbjct: 492 AAQEQAQVRQETEEAVEVRLSKDEQLQQRR----------------ANQRLGAEVMSQRI 535

Query: 576 RTIARQDPKIVATVVKNWVSDE 597
R ++ DP++VA V++ W+S++
Sbjct: 536 REMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
BURPS668_0223  FLGHOOKFLIE  61     9e-16    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE

signature.
Length = 103

Score = 61.2 bits (148), Expect = 9e-16
Identities = 47/111 (42%), Positives = 62/111 (55%), Gaps = 8/111 (7%)

Query: 3 APVNGIASALQQMQAMAAQAAGGASPATSLAGSGAASAGSFASAMKASLDKISGDQQKAL 62
+ + GI + Q+QA A A S SFA + A+LD+IS Q A
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQES--------LPQPTISFAGQLHAALDRISDTQTAAR 52

Query: 63 GEAHAFEIGAQNVSLNDVMVDMQKANIGFQFGLQVRNKLVSAYNEIMQMSV 113
+A F +G V+LNDVM DMQKA++ Q G+QVRNKLV+AY E+M M V
Sbjct: 53 TQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTagHitsScoreE-valueComments
BURPS668_0227TYPE3IMSPROT634e-15 Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein

family signature.
Length = 354

Score = 62.9 bits (153), Expect = 4e-15
Identities = 17/81 (20%), Positives = 32/81 (39%), Gaps = 1/81 (1%)

Query: 10 AVLAYDAKGGDTAPRVVAKGYGLVAERIIERARDAGLYVHTAPEMV-SLLMQVDLDARIP 68
A+ +G P V K + + + A + G+ + + +L +D IP
Sbjct: 268 AIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEEEGVPILQRIPLARALYWDALVDHYIP 327

Query: 69 PQLYQAVAELLAWLYALERDA 89
+ +A AE+L WL +
Sbjct: 328 AEQIEATAEVLRWLERQNIEK 348


69  BURPS668_0267  BURPS668_0277  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_0267  4       19       -1.772540  flagellar basal body rod protein FlgC
BURPS668_0268  4       20       -0.715152  flagellar basal body rod modification protein
BURPS668_0269  0       20       -0.766190  flagellar hook protein FlgE
BURPS668_0270  -2      17       -0.063251  flagellar basal body rod protein FlgF
BURPS668_0271  0       15       -0.066216  flagellar basal body rod protein FlgG
BURPS668_0272  1       15       0.132211   flagellar basal body L-ring protein
BURPS668_0273  2       13       0.167387   flagellar basal body P-ring biosynthesis protein
BURPS668_0274  0       10       -0.183504  flagellar rod assembly protein/muramidase FlgJ
BURPS668_0275  1       9        0.229814   hypothetical protein
BURPS668_0276  0       11       0.663429   flagellar hook-associated protein FlgK
BURPS668_0277  0       13       1.441236   flagellar hook-associated protein FlgL
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
BURPS668_0267  FLGHOOKAP1  27     0.029    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 26.8 bits (59), Expect = 0.029
Identities = 10/38 (26%), Positives = 17/38 (44%)

Query: 102 NVDPVQEMVNMISASRSYQANVETLNTAKQLMLKTLTI 139
V+ +E N+ + Y AN + L TA + + I
Sbjct: 508 GVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
BURPS668_0269  FLGHOOKAP1  34     0.001    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 34.2 bits (78), Expect = 0.001
Identities = 17/58 (29%), Positives = 24/58 (41%)

Query: 356 ISAPGSTNHGTLQGSALENSNVDLTSQLVKLITAQRNYQANAQTIKTQQTVDQTLINL 413
SA L S V+L + L Q+ Y ANAQ ++T + LIN+
Sbjct: 488 SSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545



Score = 29.9 bits (67), Expect = 0.018
Identities = 11/31 (35%), Positives = 17/31 (54%)

Query: 6 GLSGLAGASSDLDVIGNNIANANTVGFKGST 36
+SGL A + L+ NNI++ N G+ T
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQT 37


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
BURPS668_0270  FLGHOOKAP1  29     0.019    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 29.2 bits (65), Expect = 0.019
Identities = 9/34 (26%), Positives = 18/34 (52%)

Query: 4 LIYTAMTGATQSLEQQSVVANNLANASTTGFRAQ 37
LI AM+G + + +NN+++ + G+ Q
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQ 36


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
BURPS668_0271  FLGHOOKAP1  42     1e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 42.3 bits (99), Expect = 1e-06
Identities = 10/48 (20%), Positives = 23/48 (47%)

Query: 213 TLKQGYVESSNVNVVQELVNMIQTQRAYEINSKAVTTSDQMLQTVTQM 260
L S VN+ +E N+ + Q+ Y N++ + T++ + + +
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545



Score = 40.3 bits (94), Expect = 5e-06
Identities = 19/80 (23%), Positives = 34/80 (42%), Gaps = 14/80 (17%)

Query: 4 SLYIAATGMNAQQAQMDVISNNLANVSTNGFKGSRAVFEDLLYQTVRQPGANSTQQTELP 63
+ A +G+NA QA ++ SNN+++ + G+ RQ + + L
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYT--------------RQTTIMAQANSTLG 48

Query: 64 SGLQLGTGVQQVATERLYTQ 83
+G +G GV +R Y
Sbjct: 49 AGGWVGNGVYVSGVQREYDA 68


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0272  FLGLRINGFLGH  205    1e-68    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 205 bits (522), Expect = 1e-68
Identities = 128/222 (57%), Positives = 156/222 (70%), Gaps = 7/222 (3%)

Query: 25 AALAAAALALAGCAQIPREPITQQPMSAMPPMPPAMQAPGSIY---NPGYAG-RPLFEDQ 80
A + L+L GCA IP P+ Q SA P P A GSI+ P G +PLFED+
Sbjct: 10 AISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINYGYQPLFEDR 69

Query: 81 RPRNVGDILTIVIAENINATKSSGANTNRQGNTSFDVPTAG-FLGGLF--NKANLSAQGA 137
RPRN+GD LTIV+ EN++A+KSS AN +R G T+F T +L GLF +A++ A G
Sbjct: 70 RPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNARADVEASGG 129

Query: 138 NKFAATGGASAANTFNGTITVTVTNVLPNGNLVVSGEKQMLINQGNEFVRFSGIVNPNTI 197
N F GGA+A+NTF+GT+TVTV VL NGNL V GEKQ+ INQG EF+RFSG+VNP TI
Sbjct: 130 NTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVNPRTI 189

Query: 198 SGQNSVYSTQVADARIEYSAKGYINEAETMGWLQRFFLNIAP 239
SG N+V STQVADARIEY GYINEA+ MGWLQRFFLN++P
Sbjct: 190 SGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSP 231


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0273  FLGPRINGFLGI  371    e-129    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 371 bits (954), Expect = e-129
Identities = 164/392 (41%), Positives = 225/392 (57%), Gaps = 27/392 (6%)

Query: 10 RVVRPLVAARRRAAACCALAACMLALAFAPAAARAERLKDLAQIQGVRDNPLIGYGLVVG 69
RV+R + AA +A L+ PA A R+KD+A +Q RDN LIGYGLVVG
Sbjct: 2 RVLRIIAAALVFSALPF--------LSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVG 53

Query: 70 LDGTGDQTMQTPFTTQTLANMLANLGISINNGSANGGGSSAMTNMQLKNVAAVMVTATLP 129
L GTGD +PFT Q++ ML NLGI+ G +N KN+AAVMVTA LP
Sbjct: 54 LQGTGDSLRSSPFTEQSMRAMLQNLGITTQGGQSN-----------AKNIAAVMVTANLP 102

Query: 130 PFARPGEAIDVTVSSLGNAKSLRGGTLLLTPLKGADGQVYALAQGNMAVGGAGASANGSR 189
PFA PG +DVTVSSLG+A SLRGG L++T L GADGQ+YA+AQG + V G A + +
Sbjct: 103 PFASPGSRVDVTVSSLGDATSLRGGNLIMTSLSGADGQIYAVAQGALIVNGFSAQGDAAT 162

Query: 190 VQVNQLAAGRIAGGAIVERSVPNAVAQMNGVLQLQLNDMDYGTAQRIVSAVNS----SFG 245
+ + R+ GAI+ER +P+ L LQL + D+ TA R+ VN+ +G
Sbjct: 163 LTQGVTTSARVPNGAIIERELPSKFKDSV-NLVLQLRNPDFSTAVRVADVVNAFARARYG 221

Query: 246 AGTATALDGRTIQLTAPADSAQQVAFMARLQNLEVSPERAAAKVILNARTGSIVMNQMVT 305
A D + I + P + MA ++NL V + AKV++N RTG+IV+ V
Sbjct: 222 DPIAEPRDSQEIAVQKPRVA-DLTRLMAEIENLTVETD-TPAKVVINERTGTIVIGADVR 279

Query: 306 LQNCAVAHGNLSVVVNTQPVVSQPGPFSNGQTVVAQQSQIQLKQDNGSLRMVTAGANLAE 365
+ AV++G L+V V P V QP PFS GQT V Q+ I Q+ + + G +L
Sbjct: 280 ISRVAVSYGTLTVQVTESPQVIQPAPFSRGQTAVQPQTDIMAMQEGSKV-AIVEGPDLRT 338

Query: 366 VVKALNSLGATPADLMSILQAMKAAGALRADL 397
+V LNS+G +++ILQ +K+AGAL+A+L
Sbjct: 339 LVAGLNSIGLKADGIIAILQGIKSAGALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_0274  FLGFLGJ  226    5e-75    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 226 bits (578), Expect = 5e-75
Identities = 124/297 (41%), Positives = 173/297 (58%), Gaps = 15/297 (5%)

Query: 15 ALDVQGFDALRSKATAAAPREGVKMVAGQFDAMFTQMMLKSMRDATPSDGLLDSSSSKMY 74
A D Q + L++KA P ++ VA Q + MF QMMLKSMRDA P DGL S +++Y
Sbjct: 12 AWDAQSLNELKAKA-GEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDGLFSSEHTRLY 70

Query: 75 TSMLDQQLAQQMSS-KGIGVADALTKQLLRNANVAPDAQGEGGLAAMNALAKAYANSNGA 133
TSM DQQ+AQQM++ KG+G+A+ + KQ+ + ++ + Y N +
Sbjct: 71 TSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLETVVRYQNQALS 130

Query: 134 PGNGALAGTRGYSAASALTPPLKGNGNSAQADAFVEKMALAAQAASAATGIPARFIVGQA 193
P + + AF+ +++L AQ AS +G+P I+ QA
Sbjct: 131 ------------QLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQA 178

Query: 194 ALESGWGKREIRGANGESSYNVFGIKATKGWTGRTVSAVTTEYVNGKPHRVVAQFRAYDS 253
ALESGWG+R+IR NGE SYN+FG+KA+ W G TTEY NG+ +V A+FR Y S
Sbjct: 179 ALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSS 238

Query: 254 YEHAMTDYANLLKNNPRYASVLNAGHNAEGFAHGMQKAGYATDPHYAKKLISIMQQI 310
Y A++DY LL NPRYA+V A +AE A +Q AGYATDPHYA+KL +++QQ+
Sbjct: 239 YLEALSDYVGLLTRNPRYAAVTTAA-SAEQGAQALQDAGYATDPHYARKLTNMIQQM 294


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
BURPS668_0276  FLGHOOKAP1  236    2e-71    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 236 bits (603), Expect = 2e-71
Identities = 162/444 (36%), Positives = 253/444 (56%), Gaps = 12/444 (2%)

Query: 50 NTLMNLGVSGLNAALWGLTTTGQNISNAATPGYSVERPVYAEASGQYTSSGYLPQGVSTV 109
++L+N +SGLNAA L T NIS+ GY+ + + A+A+ + G++ GV
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 110 TVERQYNQYLSNQLNAAQTQGSSLSTYYTLVAQLNNYVGSPTAGIATAITNYFTGLQTVA 169
V+R+Y+ +++NQL AAQTQ S L+ Y +++++N + + T+ +AT + ++FT LQT+
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 170 NNAADPSARQTAMSNAQTLASQLVAAGQQYSQLRQSVNSQLTDTVTQINSYTSQIAQLNE 229
+NA DP+ARQ + ++ L +Q Q + VN + +V QIN+Y QIA LN+
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 230 QIA--SASSQGQPPNQLLDQRDLAVSKLSQLAGVQV-VQSNGNYSVFLSGGQPLVVGNAS 286
QI+ + G PN LLDQRD VS+L+Q+ GV+V VQ G Y++ ++ G LV G+ +
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 287 YQLATVASPSDPSELTI-VSKGVAGSAQPGPTQYLPDVSLTGGALGGLLAFRSQTLDPAQ 345
QLA V S +DPS T+ G AG+ + +P+ L G+LGG+L FRSQ LD +
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIE------IPEKLLNTGSLGGILTFRSQDLDQTR 294

Query: 346 AQLGALAVSFASQVNAQNALGVDMSGNPGGSLFAVGAPAVYANQNNTGSATLSVSFVDGT 405
LG LA++FA N Q+ G D +G+ G FA+G PAV N N G + + D +
Sbjct: 295 NTLGQLALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDAS 354

Query: 406 QPTTSDYALSYDGAKYTLTDRATGSVVGTATPSSTPPTMTIGGLKLSLSSTPNAGDSFTV 465
+DY +S+D ++ +T R + T TP + + GL+L+ + TP DSFT+
Sbjct: 355 AVLATDYKISFDNNQWQVT-RLASNTTFTVTPDAN-GKVAFDGLELTFTGTPAVNDSFTL 412

Query: 466 LPTRGALDGFSLATANGSAIAAAS 489
P A+ + + + IA AS
Sbjct: 413 KPVSDAIVNMDVLITDEAKIAMAS 436



Score = 84.6 bits (209), Expect = 4e-19
Identities = 46/105 (43%), Positives = 66/105 (62%)

Query: 608 GTNDGRNALALSQLVNSKTMNNGTTTLTGAYAGYVNAIGNAASQLKASSAAQTALVGQIT 667
G +D RN AL L ++ G + AYA V+ IGN + LK SSA Q +V Q++
Sbjct: 441 GDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGNKTATLKTSSATQGNVVTQLS 500

Query: 668 QAQQSVSGVNQNEEAANLMQYQQLYQANAKVIQTANSVFQTVLGL 712
QQS+SGVN +EE NL ++QQ Y ANA+V+QTAN++F ++ +
Sbjct: 501 NQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
BURPS668_0277  FLAGELLIN  41     6e-06    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 41.2 bits (96), Expect = 6e-06
Identities = 55/369 (14%), Positives = 113/369 (30%), Gaps = 10/369 (2%)

Query: 16 MNDQQAQIAQLYQQVSSGISLTTPADNPLAAAQAVQLSATSATLAQYTQNQTIVQTALQT 75
+N Q+ ++ +++SSG+ + + D+ A A + ++ L Q ++N + QT
Sbjct: 17 LNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNANDGISIAQT 76

Query: 76 EDTTLTSVNDVLNAAYQALMHAGDGGLSDSDRAALAAQIQGSRDHLLTLANTADGAGNYL 135
+ L +N+ L + + A +G SDSD ++ +IQ + + ++N G +
Sbjct: 77 TEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSNQTQFNGVKV 136

Query: 136 FAGFQPTTQPFSNKPGGGVTY------AGDYGARAVQIADTRTVSQGDNGANVFMSVPFL 189
+ G +T G + + + GD ++ +
Sbjct: 137 LSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYD 196

Query: 190 GSLPVPAAGASNTGTGTIGAVSITNPSDPTNTHQFTITFGGTAAAPTYTVTDNSVTPPTT 249
+ +G + + T A T D T +T
Sbjct: 197 TYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTTKST 256

Query: 250 TAAQAYSSGQGINLGGQTVAVSGKPAVGDTFTVTPAPQAGTDVFATLD----TVIAALKS 305
+ G GG+ V T V T++ T+ A +
Sbjct: 257 AGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADIT 316

Query: 306 PVGNSQTASTALTNTMATASTKLMNTMTNVLTVQASVGGRLQEVKAMQAVTTTNTLQTTN 365
+ A+T ++ S + T S E + T+
Sbjct: 317 AGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAE 376

Query: 366 SLSNLTDTN 374
+N
Sbjct: 377 YTANAAGDK 385


70  BURPS668_0403  BURPS668_0409  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_0403  -2      9        1.002066   hypothetical protein
BURPS668_0404  0       8        0.299167   MarR family transcriptional regulator
BURPS668_0405  1       9        0.201484   hypothetical protein
BURPS668_0406  2       10       0.355252   short chain dehydrogenase
BURPS668_0407  2       11       -0.268979  short chain dehydrogenase
BURPS668_0408  1       14       -1.621470  thiol:disulfide interchange protein DsbA
BURPS668_0409  0       14       -0.854329  cell division protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0403  PYOCINKILLER  29     0.018    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 29.0 bits (64), Expect = 0.018
Identities = 23/88 (26%), Positives = 41/88 (46%), Gaps = 3/88 (3%)

Query: 1 MQRIMSAEPIVTLSVARTLSAAIGAAFRRSDRASARRDSRGRRACASARRASTRRRPAAR 60
+Q M+ +T + A +AA A ++ + R+ R A+ R A+T PA
Sbjct: 200 LQIRMNT---LTAAKASIEAAAANKAREQAAAEAKRKAEEQARQQAAIRAANTYAMPANG 256

Query: 61 CAPSAGLGRALARQAAPGASLSRSVASA 88
+ GR L + A ASL+++++ A
Sbjct: 257 SVVATAAGRGLIQVAQGAASLAQAISDA 284


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0406  DHBDHDRGNASE  58     3e-12    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.
Length = 261

Score = 57.8 bits (139), Expect = 3e-12
Identities = 49/186 (26%), Positives = 78/186 (41%), Gaps = 13/186 (6%)

Query: 7 VVLVTGANRGLGLAFVEGLKAAGAK------------KIYAAARDPARVTTPGVQPVRLD 54
+ +TGA +G+G A L + GA K+ ++ + AR VR
Sbjct: 10 IAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDS 69

Query: 55 VTRAQDIAAAARELRDVNLLVNNAGIFRMGSLLAEADGGGLQAQLDTNFFGPLAMARAFA 114
+ A RE+ +++LVN AG+ R G L+ +A N G +R+ +
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGVLRPG-LIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 115 PVLRENGGGAIVNVLSVLSWLGLPNTGAYGISKAAAWAATNAIRNELREQRTRVLALHSA 174
+ + G+IV V S + + + AY SKAAA T + EL E R +
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 175 YIDTDM 180
+TDM
Sbjct: 189 STETDM 194


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0407  DHBDHDRGNASE  73     2e-17    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.
Length = 261

Score = 73.2 bits (179), Expect = 2e-17
Identities = 50/188 (26%), Positives = 81/188 (43%), Gaps = 10/188 (5%)

Query: 9 VFITGASSGLGLALAAEYARHGATLGLVARRADALAEFAP------RFPKASVSIYPADV 62
FITGA+ G+G A+A A GA + V + L + R +A +PADV
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEA----FPADV 66

Query: 63 RDADALALAASRFVAAHGCPDVVIANAGISKGAITGEGDLAAFREIMDVNYYGMIATFEP 122
RD+ A+ +R G D+++ AG+ + + + VN G+
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 123 FIAPMTAARRGTLVGIASVAGVRGLPGSGAYSASKAAAIKYLEALRVELRPAQVAVVTIA 182
M R G++V + S AY++SKAAA+ + + L +EL + ++
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 183 PGYIRTPM 190
PG T M
Sbjct: 187 PGSTETDM 194


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
BURPS668_0409  IGASERPTASE  34     8e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.9 bits (77), Expect = 8e-04
Identities = 22/173 (12%), Positives = 50/173 (28%), Gaps = 6/173 (3%)

Query: 62 ASQPQQFDPNRALQGKTPGQPVTPQAAQPAPPNTAPGQAANPSQPPLLPEPQIVEVPSSN 121
A ++ DP ++ T QPA ++ + + +VE P +
Sbjct: 1143 AEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENT 1202

Query: 122 NNGNGSPSASNNAAD-----NGVAVAPKPAEPAPPPAKKPQAAANGSSAPHVANNNAQAS 176
P+ ++ +++ + +V P P + N NA S
Sbjct: 1203 TPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLS 1262

Query: 177 AAATPPKAAQAPKGASSATTTAAKPTSGADANTGYFLQVGAYKTEADAEQQRA 229
A + G + + + + ++ + + Q R
Sbjct: 1263 DARAKAQFVALNVGKAVSQHISQLEMNNEGQYN-VWVSNTSMNKNYSSSQYRR 1314


71  BURPS668_0848  BURPS668_0855  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_0848  -3      10       0.628074   serine protease
BURPS668_0849  -2      12       1.278082   hypothetical protein
BURPS668_0850  1       13       -0.911187  hypothetical protein
BURPS668_0851  1       14       -0.019329  carbon monoxide dehydrogenase
BURPS668_0852  0       12       -0.369524  multidrug efflux pump repressor protein BpeR
BURPS668_0853  0       12       -0.615019  hypothetical protein
BURPS668_0854  -2      11       0.860317   multidrug efflux periplasmic linker protein
BURPS668_0855  1       17       0.736249   inner membrane multidrug efflux protein BpeB
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
BURPS668_0848  V8PROTEASE  79     4e-18    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 78.5 bits (193), Expect = 4e-18
Identities = 38/207 (18%), Positives = 71/207 (34%), Gaps = 40/207 (19%)

Query: 81 QRRAAPQLPIDPDDP-----FYQFFRHFYGQIPGMGGGRQPQPDDQPSTSLGSGFIISAD 135
++R + + +D I Q + T + SG ++
Sbjct: 62 EQREHANVILPNNDRHQITDTTNGHYAPVTYI---------QVEAPTGTFIASGVVV-GK 111

Query: 136 GYILTNAHVIDGANVVTVKLTDKR-----------EYKA-KVVGADKQSDVAVLKIDA-- 181
+LTN HV+D + L + A ++ + D+A++K
Sbjct: 112 DTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSPNE 171

Query: 182 ------SGLPIVKIGDPAQSKVGQWVVAIGSPYGFDNTVTSGIISAKSRALPDENYTPFI 235
+ + + A+++V Q + G P +K + + +
Sbjct: 172 QNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKPVATMW---ESKGKITYLKGE--AM 226

Query: 236 QTDVPVNPGNSGGPLFNLNGEVIGINS 262
Q D+ GNSG P+FN EVIGI+
Sbjct: 227 QYDLSTTGGNSGSPVFNEKNEVIGIHW 253


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_0852  HTHTETR  126    2e-38    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 126 bits (317), Expect = 2e-38
Identities = 81/209 (38%), Positives = 115/209 (55%), Gaps = 1/209 (0%)

Query: 1 MARRTKEEALATRDRILDAAEHVFFEKGVSHTSLADIAQHAGVTRGAIYWHFASKSELFD 60
MAR+TK+EA TR ILD A +F ++GVS TSL +IA+ AGVTRGAIYWHF KS+LF
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 AMFDRVLLPIDELKAGT-GEPHADPLGRIREILIWCLLGAARDPQLRRVFSILFMKCEYV 119
+++ I EL+ + DPL +REILI L + + R + I+F KCE+V
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 120 ADMGPLLQRNREGMRDALRNIEADLAQGVANGQLPADLDTWRATLMLHTLVSGFVRDMLM 179
+M + Q R ++ IE L + LPADL T RA +++ +SG + + L
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 180 LPGEIDAERHAEKLVDGCFDMLRTSPAMR 208
P D ++ A V +M P +R
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLR 209


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
BURPS668_0854  RTXTOXIND  42     4e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 41.7 bits (98), Expect = 4e-06
Identities = 42/266 (15%), Positives = 80/266 (30%), Gaps = 75/266 (28%)

Query: 92 KIDPAPYIAQLNSAKATLAKAQANLATQNALVARYKVLVAANAVSKQQYDDAVAAQGQAA 151
+++ A+ + A + + + + + + + L+ A++K + +A
Sbjct: 206 ELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAV 265

Query: 152 ADVGAGKAAV-------------------------------------------ETAQINL 168
++ K+ + +
Sbjct: 266 NELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQ 325

Query: 169 GYTDVVSPITGRV-GISQVTPGAYVQASQATLMSTVQQLDPVYVDLTQSSLDGLKLRQDI 227
+ + +P++ +V + T G V ++ TLM V + D + V + D +
Sbjct: 326 QASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTALVQNKDIGFINVG- 383

Query: 228 QSGRIK-------TEGPGAAKVTLILEDGKPYPERGKLQFSDVTVDQTTGSVT--IRAI- 277
Q+ IK G KV I D DQ G V I +I
Sbjct: 384 QNAIIKVEAFPYTRYGYLVGKVKNI--------------NLDAIEDQRLGLVFNVIISIE 429

Query: 278 -----FPNKQRVLLPGMFVRARIEEG 298
NK L GM V A I+ G
Sbjct: 430 ENCLSTGNKNIPLSSGMAVTAEIKTG 455



Score = 30.6 bits (69), Expect = 0.011
Identities = 20/122 (16%), Positives = 35/122 (28%), Gaps = 20/122 (16%)

Query: 1 MRVERVPYRLITVATAAVFLAACGKKESAPPPQTPEVGVVTVQPQPVPVVSELPGRTSAY 60
R V Y ++ A L+ G+ E G +T + +
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVLGQVEIVATAN----GKLTHSGRSKEIKPIENSIVKEI 110

Query: 61 LVAQVRARVDGIVLRREFTEGSDVKAGQRLYKIDPAPYIAQLNSAKATLAKAQANLATQN 120
+V EG V+ G L K+ A +++L +A+
Sbjct: 111 IVK----------------EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQ 154

Query: 121 AL 122
L
Sbjct: 155 IL 156


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_0855  ACRIFLAVINRP  1272   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1272 bits (3293), Expect = 0.0
Identities = 674/1035 (65%), Positives = 822/1035 (79%), Gaps = 2/1035 (0%)

Query: 1 MAKFFIDRPIFAWVIAIILMLAGVAAIFTLPIAQYPTIAPPSIQITANYPGASAKTVEDT 60
MA FFI RPIFAWV+AIILM+AG AI LP+AQYPTIAPP++ ++ANYPGA A+TV+DT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQQMSGLDNFLYMSSTSDDSGNATITITFAPGTNPDIAQVQVQNKLSLATPILPQ 120
VTQVIEQ M+G+DN +YMSSTSD +G+ TIT+TF GT+PDIAQVQVQNKL LATP+LPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 VVQQLGLSVTKSSSSFLLVLAFNSEDGSMNKYDLANYVASHVKDPISRINGVGTVTLFGS 180
VQQ G+SV KSSSS+L+V F S++ + D+++YVAS+VKD +SR+NGVG V LFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWLDPTKLTNYGLTPVDVTSAISAQNVQIAGGQLGGTPAVPGTVLQATITEATLL 240
QYAMRIWLD L Y LTPVDV + + QN QIA GQLGGTPA+PG L A+I T
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 QTPEQFGNILLKVNQDGSQVRLKDVAQIGLGGETYNFDTKYNGQPTAALGIQLATNANAL 300
+ PE+FG + L+VN DGS VRLKDVA++ LGGE YN + NG+P A LGI+LAT ANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 ATAKAVRAKIDEMSAYFPHGLVVKYPYDTTPFVRLSIEEVVKTLLEGIVLVFLVMYLFLQ 360
TAKA++AK+ E+ +FP G+ V YPYDTTPFV+LSI EVVKTL E I+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NLRATIIPTIAVPVVLLGTFAIMSMVGFSINVLSMFGLVLAIGLLVDDAIVVVENVERVM 420
N+RAT+IPTIAVPVVLLGTFAI++ G+SIN L+MFG+VLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 AEEGLPPKEATRKAMGQITGALVGVALVLSAVFVPVAFSGGSVGAIYRQFSLTIVSAMVL 480
E+ LPPKEAT K+M QI GALVG+A+VLSAVF+P+AF GGS GAIYRQFS+TIVSAM L
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATILKPIPQGHHEEKKGFFGWFNRTFNSSRDKYHVGVHHVIKRSGRW 540
SVLVALILTPALCAT+LKP+ HHE K GFFGWFN TF+ S + Y V ++ +GR+
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 541 LIIYLAVIVAVGLLFVRLPKSFLPDEDQGLMFVIVQTPSGSTQETTARTLANISDYLLTQ 600
L+IY ++ + +LF+RLP SFLP+EDQG+ ++Q P+G+TQE T + L ++DY L
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 601 EKDIVESAFTVNGFSFAGRGQNSGLVFVKLKDYSQRQSSDQKVQALIGRMFGRYAGYKDA 660
EK VES FTVNGFSF+G+ QN+G+ FV LK + +R + +A+I R +D
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 661 LVIPFNPPSIPELGTAAGFDFELTDNAGLGHDALMAARNQLLGMAAKDP-TLRGVRPNGL 719
VIPFN P+I ELGTA GFDFEL D AGLGHDAL ARNQLLGMAA+ P +L VRPNGL
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 720 NDTPQYKVDIDREKANALGVTADAIDQTFSIAWASKYVNNFLDTDGRIKKVYVQSDAPFR 779
DT Q+K+++D+EKA ALGV+ I+QT S A YVN+F+D GR+KK+YVQ+DA FR
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFID-RGRVKKLYVQADAKFR 779

Query: 780 MTPEDMNIWYVRNGSGGMVPFSAFATGHWTYGSPKLERYNGISAMEIQGQAAPGKSTGQA 839
M PED++ YVR+ +G MVPFSAF T HW YGSP+LERYNG+ +MEIQG+AAPG S+G A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 840 MTAMETLAKKLPTGIGYSWTGLSFQEIQSGSQAPILYAISILVVFLCLAALYESWSIPFS 899
M ME LA KLP GIGY WTG+S+QE SG+QAP L AIS +VVFLCLAALYESWSIP S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 900 VIMVVPLGVIGALLAATLRGLENDVFFQVGLLTTVGLSAKNAILIVEFARELQQTEKMGP 959
V++VVPLG++G LLAATL +NDV+F VGLLTT+GLSAKNAILIVEFA++L + E G
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 960 IEAALEAARLRLRPILMTSLAFILGVMPLAISNGAGSASQHAIGTGVIGGMITATFLAIF 1019
+EA L A R+RLRPILMTSLAFILGV+PLAISNGAGS +Q+A+G GV+GGM++AT LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1020 MIPMFFVKVRAVFSG 1034
+P+FFV +R F G
Sbjct: 1020 FVPVFFVVIRRCFKG 1034


72  BURPS668_0877  BURPS668_0882  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias   Product
BURPS668_0877  -1      10       2.382176  carbohydrate ABC transporter ATP-binding
BURPS668_0876  0       12       4.268027  hypothetical protein
BURPS668_0878  0       11       3.924452  hypothetical protein
BURPS668_0879  -1      10       5.461813  hypothetical protein
BURPS668_0880  0       11       5.081780  LysR family transcriptional regulator
BURPS668_0881  0       11       5.055787  esterase
BURPS668_0882  -1      13       4.869920  major facilitator family transporter
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_0877  PF05272  30     0.017    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.017
Identities = 14/35 (40%), Positives = 17/35 (48%)

Query: 32 VVFVGPSGCGKSTLMRMIAGLEEISGGELLIDGAK 66
VV G G GKSTL+ + GL+ S I K
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGK 633


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_0878  PF06776  30     0.020    Invasion associated locus B
		>PF06776#Invasion associated locus B

Length = 214

Score = 29.5 bits (66), Expect = 0.020
Identities = 11/49 (22%), Positives = 15/49 (30%), Gaps = 2/49 (4%)

Query: 1 MKTGRRHFVRSVASASAALAAAAWSPARAAIDAPASPATALSLTPGRWS 49
+ + RR R+ A A A A A A+ G W
Sbjct: 38 LASCRRLARRNGARLMLAGAMAI--ALSFGWSDRADAQGAVRSVHGDWQ 84



Score = 28.7 bits (64), Expect = 0.039
Identities = 7/37 (18%), Positives = 13/37 (35%)

Query: 10 RSVASASAALAAAAWSPARAAIDAPASPATALSLTPG 46
+++ A L+ S R A A A ++
Sbjct: 25 KAIQMGPAELSPMLASCRRLARRNGARLMLAGAMAIA 61


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
BURPS668_0881  BLACTAMASEA  30     0.018    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 29.8 bits (67), Expect = 0.018
Identities = 11/35 (31%), Positives = 15/35 (42%)

Query: 57 REDALFRFASVSKPIVSAAAMRAVAAGKLDLDASI 91
R D F S K ++ A + V AG L+ I
Sbjct: 57 RADERFPMMSTFKVVLCGAVLARVDAGDEQLERKI 91


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_0882  TCRTETB  35     4e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 35.2 bits (81), Expect = 4e-04
Identities = 31/155 (20%), Positives = 59/155 (38%), Gaps = 5/155 (3%)

Query: 26 LLALATAGFITIVTEALPAGLLPLMGRDLRVSDALVGQLVTVYAAGSIVAAIPLVAATRG 85
L+ L F +++ E + LP + D A + T + + +
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 86 MRRRPLLLAALAGFVVANTATAASPYYAPVLV-ARCVAGVSAGLLWALLAGYASRMVDAR 144
+ + LLL + + + +L+ AR + G A AL+ +R +
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 145 QRGRAIAIAMLGAPVAMSVGI-PL-GTALGAALGW 177
RG+A ++G+ VAM G+ P G + + W
Sbjct: 136 NRGKAFG--LIGSIVAMGEGVGPAIGGMIAHYIHW 168


73  BURPS668_1359  BURPS668_1364  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_1359  2       14       0.425841   hypothetical protein
BURPS668_1360  2       13       0.511532   hypothetical protein
BURPS668_1361  2       12       0.477876   hypothetical protein
BURPS668_1362  2       13       -0.146748  multidrug resistance protein MdtC
BURPS668_1363  0       10       -0.809301  multidrug resistance protein MdtB
BURPS668_1364  0       11       -0.395279  membrane fusion protein MdtA
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
BURPS668_1359  PF01540  29     0.015    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 28.9 bits (64), Expect = 0.015
Identities = 26/84 (30%), Positives = 38/84 (45%), Gaps = 3/84 (3%)

Query: 13 RTGRALADLLLKQQDFEVTALVRRPDFA--LPGAKVVVADLTGDFSSAFN-GITHAIYAA 69
+ G+ AD LKQ + L + PD++ L +A+ T F A + G AI +
Sbjct: 35 KNGKEKADAALKQANALAEELKKNPDYSKILETLNKEIAEATKSFKEAGSYGDYPAIISK 94

Query: 70 GSAESEGATEEEQIDRDAVARAAD 93
SA E A E+Q A + AD
Sbjct: 95 LSAAVENAKSEQQKVDQANKKIAD 118


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1362  ACRIFLAVINRP  745    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 745 bits (1925), Expect = 0.0
Identities = 279/1104 (25%), Positives = 502/1104 (45%), Gaps = 100/1104 (9%)

Query: 3 LARPFITRPVATTLLALGIALAGLFAFVKLPVSPLPQVDFPTILVQASLPGASPETVATS 62
+A FI RP+ +LA+ + +AG A ++LPV+ P + P + V A+ PGA +TV +
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 63 VTSPLERHLGSIADVAEMTSMS-SVGNARIVLQFNLNRDIDGAARDVQAAINAARADLPA 121
VT +E+++ I ++ M+S S S G+ I L F D D A VQ + A LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 122 SLKSNPTYRKVNPADSPIMVVSLTS--KTASPAKLYDAASTVLQQSLSQIDGIGQVSLSG 179
++ + S +MV S + + D ++ ++ +LS+++G+G V L G
Sbjct: 121 EVQ-QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG 179

Query: 180 SANPAVRVELEPQALFHYGIGLEDVRAALASANANSPKGAIEAGP------HRYQLYTND 233
+ A+R+ L+ L Y + DV L N G + P +
Sbjct: 180 AQY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQT 238

Query: 234 QATKAAQYKDLVI-AYRNHAAVSLSDVSSVVDSVEDLRNLGLMNGERAVLVILYRSPGAN 292
+ ++ + + + + V L DV+ V E+ + +NG+ A + + + GAN
Sbjct: 239 RFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGAN 298

Query: 293 IIDTIERVKAALPQLTAALPADIQVTPVLDRSRTIRASLADTEHTLIIAVSLVVMVVFLF 352
+DT + +KA L +L P ++V D + ++ S+ + TL A+ LV +V++LF
Sbjct: 299 ALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLF 358

Query: 353 LRNWRATLIPSVAVPISIVGTFGAMYLLGFSLNNLSLMALIVATGFVVDDAIVVLENIAR 412
L+N RATLIP++AVP+ ++GTF + G+S+N L++ +++A G +VDDAIVV+EN+ R
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVER 418

Query: 413 HI-ENGTPRLQAAFDGAREVGFTVLSISLSLVAVFLPILLMGGIVGRLFREFALTLSLAI 471
+ E+ P +A ++ ++ I++ L AVF+P+ GG G ++R+F++T+ A+
Sbjct: 419 VMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAM 478

Query: 472 AVSLVVSLTLTPMMCARLLPEAHAPRDE--GRVARWLERGFEWMQRGYERTLSWALRHPF 529
A+S++V+L LTP +CA LL A E G W F+ Y ++ L
Sbjct: 479 ALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTG 538

Query: 530 TILMTLVATIALNIALYIVVPKGFFPQQDTGLMIGGIQADQTTSFQAMKLRFTEMMRIIR 589
L+ +A + L++ +P F P++D G+ + IQ + + + ++
Sbjct: 539 RYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYL 598

Query: 590 ANP-----NVANVAGFT-GGAQTNSGFMFVALKDKPQR---KLSADQVIQQLRPQLAEVA 640
N +V V GF+ G N+G FV+LK +R + SA+ VI + + +L ++
Sbjct: 599 KNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR 658

Query: 641 GARTFLQAAQDIRAGGRQSNAQYQFT-LLGDSTAELYKWGP-ILTEALQKRPELADVNSD 698
I G + ++ G L + +L A Q L V +
Sbjct: 659 DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 699 QQQGGLEAMVTIDRATAARLGIKPAQIDNTLYDAFGQRQVSTIYNPLNQYHVVMEVAPQY 758
+ + + +D+ A LG+ + I+ T+ A G V+ + + ++ ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 759 WQSPEMLKQIYISTSGGSASGVQTTNAAAGTYVATTARASTAGAAAQSAAAIAADSARNQ 818
PE + ++Y+ ++ G V + +V + R R
Sbjct: 779 RMLPEDVDKLYVRSANGEM--VPFSAFTTSHWVYGSPRLE-----------------RYN 819

Query: 819 ALNSIASSG--KSSASSGAAVSTSKSTMVPLSAIASFGPSTTPLAVNHQGLFVATTISFN 876
L S+ G SSG A++ ++
Sbjct: 820 GLPSMEIQGEAAPGTSSGDAMALMENLAS------------------------------K 849

Query: 877 LPPGVSLSKATQVIYQTMAEVGVPPTIQGSFQGTAQAFQESLKDQPILILAALAAVYIVL 936
LP G+ G + P L+ + V++ L
Sbjct: 850 LPAGIGY------------------DWTGMSYQERLSG----NQAPALVAISFVVVFLCL 887

Query: 937 GILYESYIHPVTILSTLPSAGVGALLGLLLFKTEFSIIALIGVILLIGIVKKNAIMMVDF 996
LYES+ PV+++ +P VG LL LF + + ++G++ IG+ KNAI++V+F
Sbjct: 888 AALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEF 947

Query: 997 AIDA-SRQGKSSFDAIHEACLLRFRPIMMTTMAALLGALPLAFGRGDGAEMRAPLGIAIA 1055
A D ++GK +A A +R RPI+MT++A +LG LPLA G G+ + +GI +
Sbjct: 948 AKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVM 1007

Query: 1056 GGLIVSQMLTLYTTPVVYLYMDRL 1079
GG++ + +L ++ PV ++ + R
Sbjct: 1008 GGMVSATLLAIFFVPVFFVVIRRC 1031



Score = 96.1 bits (239), Expect = 4e-22
Identities = 83/503 (16%), Positives = 167/503 (33%), Gaps = 25/503 (4%)

Query: 2 NLARPFITRPVATTLLALGIALAGLFAFVKLPVSPLPQVDFPTILVQASLP-GASPETVA 60
N + L+ I + F++LP S LP+ D L LP GA+ E
Sbjct: 528 NSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQ 587

Query: 61 TSVTSPLERHLGSIAD----VAEMTSMSSVGNAR----IVLQFNLNRDIDGAARDVQAAI 112
+ + +L + V + S G A+ + + +G +A I
Sbjct: 588 KVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVI 647

Query: 113 NAARADLPASLKSNPTYRKVNPADSPIMVVSLTSKT-----ASPAKLYDAASTVLQQSLS 167
+ A+ +L + + L A + +L +
Sbjct: 648 HRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQ 707

Query: 168 QIDGIGQVSLSGSAN-PAVRVELEPQALFHYGIGLEDVRAALASANANSPKGAIEAGPHR 226
+ V +G + ++E++ + G+ L D+ +++A +
Sbjct: 708 HPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRV 767

Query: 227 YQLYT---NDQATKAAQYKDLVIAYRNHAAVSLSDVSSVVDSVEDLRNLGLMNGERAVLV 283
+LY L + N V S ++ V L NG ++ +
Sbjct: 768 KKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHW-VYGSPRLERYNGLPSMEI 826

Query: 284 ILYRSPGANIIDTIERVKAALPQLTAALPADIQVTPVLDRSRTIRASLADTEHTLIIAVS 343
+PG + D A + L + LPA I S R S + I+
Sbjct: 827 QGEAAPGTSSGD----AMALMENLASKLPAGIGYD-WTGMSYQERLSGNQAPALVAISFV 881

Query: 344 LVVMVVFLFLRNWRATLIPSVAVPISIVGTFGAMYLLGFSLNNLSLMALIVATGFVVDDA 403
+V + + +W + + VP+ IVG A L + ++ L+ G +A
Sbjct: 882 VVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNA 941

Query: 404 IVVLENI-ARHIENGTPRLQAAFDGAREVGFTVLSISLSLVAVFLPILLMGGIVGRLFRE 462
I+++E + G ++A R +L SL+ + LP+ + G
Sbjct: 942 ILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNA 1001

Query: 463 FALTLSLAIAVSLVVSLTLTPMM 485
+ + + + ++++ P+
Sbjct: 1002 VGIGVMGGMVSATLLAIFFVPVF 1024



Score = 59.9 bits (145), Expect = 5e-11
Identities = 37/225 (16%), Positives = 84/225 (37%), Gaps = 4/225 (1%)

Query: 870 ATTISFNLPPGVSLSKATQVIYQTMAEV--GVPPTIQGS-FQGTAQAFQESLKDQPILIL 926
A + L G + + I +AE+ P ++ T Q S+ + +
Sbjct: 286 AAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLF 345

Query: 927 AALAAVYIVLGILYESYIHPVTILSTLPSAGVGALLGLLLFKTEFSIIALIGVILLIGIV 986
A+ V++V+ + ++ + +P +G L F + + + G++L IG++
Sbjct: 346 EAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLL 405

Query: 987 KKNAIMMVDFAIDASRQGKSSF-DAIHEACLLRFRPIMMTTMAALLGALPLAFGRGDGAE 1045
+AI++V+ + K +A ++ ++ M +P+AF G
Sbjct: 406 VDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGA 465

Query: 1046 MRAPLGIAIAGGLIVSQMLTLYTTPVVYLYMDRLRVWAEKRRDRR 1090
+ I I + +S ++ L TP + + +
Sbjct: 466 IYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGG 510


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1363  ACRIFLAVINRP  800    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 800 bits (2067), Expect = 0.0
Identities = 283/1035 (27%), Positives = 498/1035 (48%), Gaps = 31/1035 (2%)

Query: 4 SRVFILRPVGTALLMAAIMLAGLVALRFLPLAALPEVDYPTIQVQTFYPGASPEVMTSSV 63
+ FI RP+ +L +M+AG +A+ LP+A P + P + V YPGA + + +V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 64 TAPLERQFGQMPSLNQMSSQS-SAGASVITLQFSLDLPLDIAEQEVQAAINAAGNLLPSD 122
T +E+ + +L MSS S SAG+ ITL F DIA+ +VQ + A LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 123 LPAPPIYAKVNPADAPVITLAVTSKTLPLTQ--VQDLADTRLAMKISQVSGVGLVSLSGG 180
+ I + + ++ S TQ + D + + +S+++GVG V L G
Sbjct: 122 VQQQGIS-VEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 NRPAVRIQANPLALASYGLNLDDLRTTISNLNVNTPKGNFDGP------TRAYTINANDQ 234
A+RI + L Y L D+ + N G G +I A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 235 LTSADQYNDAVV-AYKNGRPVMLTDVAKIVAGSENTKLGAWVDAEPAIILNVQRQPGANV 293
+ +++ + +G V L DVA++ G EN + A ++ +PA L ++ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 294 IQTVDNVKAILPKLQESLPAALDVQIVTDRTTMIRAAVRDVQFELGLAVALVVLVMYLFL 353
+ T +KA L +LQ P + V D T ++ ++ +V L A+ LV LVMYLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 354 ANVYATIIPSLSVPLSLIGTLAVMYLSGFSLNNLSLMALTIATGFVVDDAIVMIENIARY 413
N+ AT+IP+++VP+ L+GT A++ G+S+N L++ + +A G +VDDAIV++EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 414 -VEEGDSALEAALKGSKQIGFTIISLTVSLIAVLIPLLFMGDVVGRLFHEFAITLAVTIV 472
+E+ EA K QI ++ + + L AV IP+ F G G ++ +F+IT+ +
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 473 ISAVVSLTLVPMMCAKLLRHTPPPESHRFEAKVHGLIERV----IERYGVALQWVLDRQR 528
+S +V+L L P +CA LL+ E H + G + Y ++ +L
Sbjct: 480 LSVLVALILTPALCATLLK-PVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTG 538

Query: 529 ATLVVAVLTLALTALLYVVIPKGFFPTQDTGVIQAITQAPQSVSYGAMAERQQALAAEIL 588
L++ L +A +L++ +P F P +D GV + Q P + + + L
Sbjct: 539 RYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYL 598

Query: 589 KH--PDVVSLTSFIGVDGANITLNSGRMLINLKPRDERS---ESASDVIRSLQRQVANVT 643
K+ +V S+ + G + N+G ++LKP +ER+ SA VI + ++ +
Sbjct: 599 KNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR 658

Query: 644 GISLYMQPVQDLTIDSTVSPTQYQFMLTS---PNPDEFATWVPKLVDRLKKEPS-LADVA 699
+ P I + T + F L D +L+ + P+ L V
Sbjct: 659 DGFVI--PFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVR 716

Query: 700 TDLQNSGKSVYIEIDRTSAARFGITPATVDNALYDAYGQRIVSTIFTQSNQYRVILESEP 759
+ +E+D+ A G++ + ++ + A G V+ + ++ ++++
Sbjct: 717 PNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADA 776

Query: 760 QMQHYTDSLNGIYLPSAGGGQVPLSAIATFRERPAPLLVSHLSQFPATTISFNLAAGASL 819
+ + + ++ +Y+ SA G VP SA T + + P+ I A G S
Sbjct: 777 KFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSS 836

Query: 820 GEAVKAIDAAERELGLPASFQTRFQGAALAFQASLSNQLFLILAAIVTMYIVLGVLYESY 879
G+A+ ++ +L PA + G + + S + L+ + V +++ L LYES+
Sbjct: 837 GDAMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESW 894

Query: 880 IHPITILSTLPSAGVGALLALMITGHDLDIIGIIGIVLLIGIVKKNAIMMIDFALEAERV 939
P++++ +P VG LLA + D+ ++G++ IG+ KNAI++++FA +
Sbjct: 895 SIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEK 954

Query: 940 EGKPPREAIYQACLLRFRPILMTTLAALLGAVPLIVGSGAGSELRQPLGIAIAGGLIVSQ 999
EGK EA A +R RPILMT+LA +LG +PL + +GAGS + +GI + GG++ +
Sbjct: 955 EGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSAT 1014

Query: 1000 VLTLFTTPVIYLGFD 1014
+L +F PV ++
Sbjct: 1015 LLAIFFVPVFFVVIR 1029


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1364  RTXTOXIND     48     7e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 47.5 bits (113), Expect = 7e-08
Identities = 28/149 (18%), Positives = 58/149 (38%), Gaps = 16/149 (10%)

Query: 84 AVRGEMPVVLNALGTVTPLANV-TVRTQLSGYLQAVSFQEGQIVKKGDVLAQIDPRP--- 139
+V G++ +V A G +T ++ + ++ + +EG+ V+KGDVL ++
Sbjct: 75 SVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA 134

Query: 140 ----YQISLANAQGALARDEALLATARLDLKRYQTLVAQ---DSIAKQTADTQASLVKQY 192
Q SL A+ R + L + L+ L + +++++ SL+K+
Sbjct: 135 DTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKE- 193

Query: 193 EGTVQIDRAAIDSAKLNLAYARITAPVSG 221
Q + L + A
Sbjct: 194 ----QFSTWQNQKYQKELNLDKKRAERLT 218



Score = 38.7 bits (90), Expect = 4e-05
Identities = 33/182 (18%), Positives = 61/182 (33%), Gaps = 26/182 (14%)

Query: 141 QISLANAQGALARDEALLAT--ARLDLKRYQTLVAQDSIAKQTADTQASLVKQY-EGTVQ 197
+ ++ + L ++L+ + L A++ T + ++ + + T
Sbjct: 251 KHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDN 310

Query: 198 ID--RAAIDSAKLNLAYARITAPVSGRV-GLRQVDPGNYVTPSDT--------NGIVVIT 246
I + + + I APVS +V L+ G VT ++T + + V
Sbjct: 311 IGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTA 370

Query: 247 QLQPMSVIFTTSEDNLPAILKQVGAGGKLSVTAYNRNNTTPLETGV-LDTLDNQIDTATG 305
+Q + F AI+K V A+ L V LD D G
Sbjct: 371 LVQNKDIGFINVG--QNAIIK---------VEAFPYTRYGYLVGKVKNINLDAIEDQRLG 419

Query: 306 TV 307
V
Sbjct: 420 LV 421


74  BURPS668_1617  BURPS668_1624  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_1617  0       10       2.031056   Signal transduction histidine kinase
BURPS668_1618  -1      10       1.368869   DNA-binding response regulator
BURPS668_1619  -1      8        1.652073   hypothetical protein
BURPS668_1621  -1      9        2.083400   trans-aconitate methyltransferase
BURPS668_1620  0       10       2.406028   Outer membrane protein
BURPS668_1622  -1      10       0.415287   EmrB/QacA family drug resistance transporter
BURPS668_1623  -2      11       1.680585   hypothetical protein
BURPS668_1624  -1      10       1.016927   HlyD family secretion protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1617  HTHFIS        63     2e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.5 bits (152), Expect = 2e-12
Identities = 30/122 (24%), Positives = 50/122 (40%), Gaps = 10/122 (8%)

Query: 391 RVLVVDDQEMNRIVLRYQLDALGHHARLCASGDEALRALGTAAYDVVLTDCRMPGMDGIA 450
+LV DD R VL L G+ R+ ++ R + D+V+TD MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 451 LTAAIRAH-PDARVRATPIVGVTALVSDAEHARCVDAGMTLCIGKP----TTLDALERAL 505
L I+ PD P++ ++A + + + G + KP + + RAL
Sbjct: 65 LLPRIKKARPD-----LPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 506 VE 507
E
Sbjct: 120 AE 121


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1618  HTHFIS        55     4e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 54.8 bits (132), Expect = 4e-11
Identities = 35/143 (24%), Positives = 55/143 (38%), Gaps = 22/143 (15%)

Query: 5 VLIADDHPLVLLGVRHMLAGMG-DVSIVGEAHDPAGLLALLAATPCDIVITDFAMPEQPA 63
+L+ADD + + L+ G DV I A L +AA D+V+TD MP+
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAAT---LWRWIAAGDGDLVVTDVVMPD--- 59

Query: 64 ADGLAMLTAIRDGYPSVRVIVLTMLDNPVLMHTMRQAGALAVLSKRGDLDE--------- 114
+ +L I+ P + V+V++ + + + GA L K DL E
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 115 ------LPRALAAVYQGRPFVGT 131
+ G P VG
Sbjct: 120 AEPKRRPSKLEDDSQDGMPLVGR 142


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1622  TCRTETB       102    2e-25    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 102 bits (256), Expect = 2e-25
Identities = 71/331 (21%), Positives = 140/331 (42%), Gaps = 20/331 (6%)

Query: 41 AFMEVLDTTIVNVALPHIAGTMSASYDEATWTLTSYLVANGIVLPISGFLGRLLGRKRYF 100
+F VL+ ++NV+LP IA + W T++++ I + G L LG KR
Sbjct: 23 SFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLL 82

Query: 101 VLCIVAFTICSFLCGIATDLGQLIVF-RVLQGLFGGGLQPNQQSIILDTF-PPEQRNRAF 158
+ I+ S + + L++ R +QG G P +++ + P E R +AF
Sbjct: 83 LFGIIINCFGSVIGFVGHSFFSLLIMARFIQGA-GAAAFPALVMVVVARYIPKENRGKAF 141

Query: 159 SISAVAIVVAPVLGPTLGGWITDNFSWRWVFLLNVPIGVLTSLAVIQLVEDPPWKRGRAR 218
+ + + +GP +GG I W +LL +P+ +T + V L++ K R +
Sbjct: 142 GLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPM--ITIITVPFLMKLLK-KEVRIK 196

Query: 219 GLSIDYIGITLIAIGLGCLQVMLDRGEDEDWFASTFIRTFAVLTVAGLVGATFWLLYAKK 278
G D GI L+++G+ + F +++ +F +++V + +
Sbjct: 197 G-HFDIKGIILMSVGIVFFML----------FTTSYSISFLIVSVLSFLIFVKHIRKVTD 245

Query: 279 PVVDLSCLKDRNFALGCVTIATFAVVLYGSAVLVPQLAQQRLGYTAMLAG-LVLSPGALL 337
P VD K+ F +G + + G +VP + + + G +++ PG +
Sbjct: 246 PFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMS 305

Query: 338 ITLEIPIVSKLMPYVQTRFLVCFGFLLLAAS 368
+ + I L+ +++ G L+ S
Sbjct: 306 VIIFGYIGGILVDRRGPLYVLNIGVTFLSVS 336


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1624  RTXTOXIND     101    1e-25    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 101 bits (253), Expect = 1e-25
Identities = 63/414 (15%), Positives = 134/414 (32%), Gaps = 91/414 (21%)

Query: 51 KRPGKKPLVVLAIIVVLLLVGAFVW-WFATRNQVSTDDA--YTDGNAITIAPKVSGYVVA 107
+ P + ++A ++ LV AF+ V+T + G + I P + V
Sbjct: 50 ETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKE 109

Query: 108 LAIDDNAYVHRGDLLLVIDQRDYQAQVDAARAQLGLAQAQLDAAQVQLDIA------HVQ 161
+ + + V +GD+LL + +A ++ L A+ + Q+ ++
Sbjct: 110 IIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELK 169

Query: 162 FPAQYRQAQA---QIEAAQASFRQALAAYERQHAVDARATSQQAIDVADAQRLTADANVA 218
P + ++ + ++ + ++ Q + + +D A+RLT A +
Sbjct: 170 LPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQ-----KYQKELNLDKKRAERLTVLARIN 224

Query: 219 TARAQA----------------------------RTASLVPQQIRQAQTAVEQRRQQVLQ 250
+ ++R ++ +EQ ++L
Sbjct: 225 RYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILS 284

Query: 251 AQA-----------------------------QLEAAQLALSYCEVRAPSDGWITRRNVQ 281
A+ +L + +RAP + + V
Sbjct: 285 AKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVH 344

Query: 282 -LGSFLQAGAALFAIVTPQ---LWVTANFKESQLERMRAGDRVSVSVDAYP---NLELHG 334
G + L IV P+ L VTA + + + G + V+A+P L G
Sbjct: 345 TEGGVVTTAETLMVIV-PEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVG 403

Query: 335 HVDSIQLGSGSRFSAFPPENATGNFVKIVQRVPVKIAIDGGLPRDPPLGIGLSV 388
V +I L + + G ++ + G ++ PL G++V
Sbjct: 404 KVKNINLDA-------IEDQRLGLVFNVIISIEENCLSTGN--KNIPLSSGMAV 448


75  BURPS668_1773  BURPS668_1778  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_1773  -2      9        2.256495   hypothetical protein
BURPS668_1774  -2      10       1.455179   hypothetical protein
BURPS668_1775  -1      11       1.142488   hypothetical protein
BURPS668_1776  -1      10       -0.242146  sigma-54 interaction domain/Fis family
BURPS668_1777  -1      10       -0.680710  hypothetical protein
BURPS668_1778  0       13       -0.805765  RNA chaperone Hfq
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1773  PYOCINKILLER  30     0.013    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.1 bits (67), Expect = 0.013
Identities = 29/86 (33%), Positives = 37/86 (43%), Gaps = 3/86 (3%)

Query: 214 LMNQLKLAPAVRAEIRNDATRIAAAARARQRA-LARPGAPGAAASAGATLAASAAGSNGG 272
MN L A A + R AAA A+++A AA A T A A GS
Sbjct: 203 RMNTLTAAKASIEAAAANKAREQAAAEAKRKAEEQARQQ--AAIRAANTYAMPANGSVVA 260

Query: 273 AAAGKGAVAGAGASAPGAAATAAAAA 298
AAG+G + A +A A A + A A
Sbjct: 261 TAAGRGLIQVAQGAASLAQAISDAIA 286


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1776  HTHFIS        297    3e-98    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 297 bits (763), Expect = 3e-98
Identities = 130/475 (27%), Positives = 204/475 (42%), Gaps = 53/475 (11%)

Query: 19 ADIVDRVARCMSSFDVEVIRADN-EELSAERTAMRPSLAIISVSMIE-SGAAFLRTWQAE 76
A I + + +S +V N L A L + V M + + L +
Sbjct: 13 AAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLLPRIKKA 72

Query: 77 -IGMPVVWVGA--------------ARDHDPSLYPPEYSHILPLDFTCAELRGMISKLAV 121
+PV+ + A A D+ P P + + ++ + +++
Sbjct: 73 RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPK--PFDLTELIGIIGRA------LAEPKR 124

Query: 122 QLRAHAAKALEPSTLVAHSDCMQALLQEVDTFADCDTNVLLHGETGVGKERIAQLLHEKH 181
+ + + LV S MQ + + + D +++ GE+G GKE +A+ LH+ +
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHD-Y 183

Query: 182 SRYGMGEFVPVNCGAIPDGLFESLFFGHAKGSFTGAVGTHKGYFEQAAGGTLFLDEVGDL 241
+ G FV +N AIP L ES FGH KG+FTGA G FEQA GGTLFLDE+GD+
Sbjct: 184 GKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDM 243

Query: 242 PLYQQVKLLRVLEDGAVLRIGATAPVKVDFRLVAASNKKLPQLVKDGLFRADLYYRLAVI 301
P+ Q +LLRVL+ G +G P++ D R+VAA+NK L Q + GLFR DLYYRL V+
Sbjct: 244 PMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVV 303

Query: 302 ELSIPSLEERGPVDKIALFKSFVASIVGEDRLAALPELPYWLAEAVADSYFPGNVRELRN 361
L +P L +R D L + FV E + E + +PGNVREL N
Sbjct: 304 PLRLPPLRDR-AEDIPDLVRHFVQQAEKEGL--DVKRFDQEALELMKAHPWPGNVRELEN 360

Query: 362 LAERVGV------------------------TVRQTGGWDTARLQRLIAHARSAAQPAPA 397
L R+ + + + + + +
Sbjct: 361 LVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFG 420

Query: 398 ESAPDVFVDRSKWDMTERNRVIAALDANGWRRQDTAQHLGISRKVLWEKMRKYQI 452
++ P + E ++AAL A + A LG++R L +K+R+ +
Sbjct: 421 DALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1777  IGASERPTASE   28     0.044    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 27.7 bits (61), Expect = 0.044
Identities = 19/108 (17%), Positives = 32/108 (29%), Gaps = 9/108 (8%)

Query: 113 LFQQKAFWRVIRTASEARAEAVYRDFAKQSETLAVNELQAAKLESQKALTDRQIAVA--- 169
++A V + + T K E K T++ V
Sbjct: 1067 EVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVT 1126

Query: 170 ------QERASRLQADLSIAREQRAAVATRQKDKLDETVALREQKSER 211
QE++ +Q ARE V ++ T A EQ ++
Sbjct: 1127 SQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKE 1174


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1778  cloacin       29     0.028    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 28.5 bits (63), Expect = 0.028
Identities = 25/85 (29%), Positives = 26/85 (30%), Gaps = 7/85 (8%)

Query: 76 GRGPRAGGAHGGGGRPGGREGGGHGPYGSHG----GSREPRGDGGGYGAREPRGDGGYGS 131
GRG G G GG G G G S G P G G G G GG G
Sbjct: 6 GRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSG---IHWGGGSGH 62

Query: 132 RESRGDGGYGSREPRGDGGYGSREP 156
G+G G G P
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAP 87


76  BURPS668_1871  BURPS668_1882  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_1871  3       13       5.315776   peptidase A24A, prepilin type IV
BURPS668_1872  2       13       5.028382   pilus assembly protein CpaB
BURPS668_1873  4       15       4.321064   type II/III secretion system protein
BURPS668_1874  4       13       5.076822   lipoprotein
BURPS668_1875  3       13       4.426542   CpaE protein
BURPS668_1876  1       12       4.148057   type IV pilus assembly protein
BURPS668_1877  -2      10       4.581954   type II secretion system protein
BURPS668_1878  -2      10       2.989488   type II secretion system protein
BURPS668_1879  0       9        2.706618   hypothetical protein
BURPS668_1880  0       10       1.696923   TadE family protein
BURPS668_1882  -1      12       2.182165   TadE family protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1871  PREPILNPTASE  32     8e-04    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 32.1 bits (73), Expect = 8e-04
Identities = 31/148 (20%), Positives = 49/148 (33%), Gaps = 18/148 (12%)

Query: 20 LVASWTLASLALADLRTRRLATFAVALVGALYAALALAGAPGDGGFASHAALGAAA---- 75
L+ +W L +L DL L + L+ L G A +GA A
Sbjct: 138 LLLTWVLVALTFIDLDKMLLP--DQLTLPLLWGGLLFNLLGGFVSLGD-AVIGAMAGYLV 194

Query: 76 ----FALGAAMFRAGWIAGGDVKLAAVVFLWAGPAHAWPVAFAIGVGGLAVGAVCIAAGR 131
+ + + GD KL A + W G V + G +G I
Sbjct: 195 LWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLILLRN 254

Query: 132 APRVLAWFAPARGVPYGVALAAGGLLAV 159
++ +P+G LA G +A+
Sbjct: 255 H-------HQSKPIPFGPYLAIAGWIAL 275


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1873  BCTERIALGSPD  144    3e-39    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 144 bits (364), Expect = 3e-39
Identities = 68/283 (24%), Positives = 116/283 (40%), Gaps = 16/283 (5%)

Query: 170 VVQTLKPYLRQQEALVNRLTLARPIQVHLRVRITEVDRNITQQLGINWSALGA------- 222
+V + E ++ +L + RP QV + I EV LGI W+ A
Sbjct: 322 IVTAAPDVMNDLERVIAQLDIRRP-QVLVEAIIAEVQDADGLNLGIQWANKNAGMTQFTN 380

Query: 223 SGNFVGGLFNGRTLFDTASKAFDLSPSGAFSVVGGFHTSRYSIDG--VLDALDQEGLITM 280
SG + G ++ S A S G Y + +L AL +
Sbjct: 381 SGLPISTAIAGANQYNKDGTVSSSLAS-ALSSFNGIAAGFYQGNWAMLLTALSSSTKNDI 439

Query: 281 LAEPNLTAISGQTASFLAGGEFPIPVAQDTTGA----ITIQFKPYGVSLDFTPTVLADNR 336
LA P++ + A+F G E P+ TT T++ K G+ L P + +
Sbjct: 440 LATPSIVTLDNMEATFNVGQEVPVLTGSQTTSGDNIFNTVERKTVGIKLKVKPQINEGDS 499

Query: 337 ISLKVRPEVSEIDPTNSVTTGSIKVPALTVRRVDTTVELSSGQSFAIGGLLQSKSSDVLA 396
+ L++ EVS + S T+ + R V+ V + SG++ +GGLL SD
Sbjct: 500 VLLEIEQEVSSVADAASSTSSDLGA-TFNTRTVNNAVLVGSGETVVVGGLLDKSVSDTAD 558

Query: 397 ELPGLARLPVLGKLFSSRNYLNDKTEVVVIVTPYIVQPANPGE 439
++P L +PV+G LF S + K +++ + P +++ +
Sbjct: 559 KVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVIRDRDEYR 601


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1875  HTHFIS        34     0.001    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 33.7 bits (77), Expect = 0.001
Identities = 29/165 (17%), Positives = 52/165 (31%), Gaps = 20/165 (12%)

Query: 22 GARLVAIVADAASDEVIRNLIADQAMTGAQVARGGIDDAIALMRDLSHGPQHLLVDVSGA 81
GA ++ DAA V+ ++ + + ++ DV
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYD--VRITSNAATLWRWIA--AGDGDLVVTDVV-- 56

Query: 82 AMP----LSDLARLADVCDPSVNVIVIGERNDVGLFRSMLRIGVRDYLVKPL----TVEL 133
MP L R+ P + V+V+ +N G DYL KP + +
Sbjct: 57 -MPDENAFDLLPRIKKA-RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGI 114

Query: 134 VHRALSAADPNAAARAGKAIGFVGARGGVGVTSIAVALARHLADR 178
+ RAL+ + + + G S A+ + R
Sbjct: 115 IGRALAEPKRRPSKLEDDSQDGMPLVG----RSAAMQEIYRVLAR 155


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1876  PF05272       30     0.032    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.032
Identities = 18/50 (36%), Positives = 26/50 (52%), Gaps = 4/50 (8%)

Query: 302 IVISGGTGSGKTTLLNAL---SHFIDSHERIVTIEDAAELQLQQPHVVSL 348
+V+ G G GK+TL+N L F D+H I T +D+ E Q+ L
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYE-QIAGIVAYEL 647


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1879  SYCDCHAPRONE  31     0.004    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 30.7 bits (69), Expect = 0.004
Identities = 20/83 (24%), Positives = 32/83 (38%)

Query: 38 SVAESALAAGDAELAATLFERALKADPRSLPAQVGLGDAMYQTGELARAGVLYAQAAAAA 97
S+A + +G E A +F+ D +GLG G+ A Y+ A
Sbjct: 41 SLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMD 100

Query: 98 PDDPRAQLGLARVALRERHLDDA 120
+PR A L++ L +A
Sbjct: 101 IKEPRFPFHAAECLLQKGELAEA 123


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1882  PYOCINKILLER  30     0.017    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 29.8 bits (66), Expect = 0.017
Identities = 30/132 (22%), Positives = 50/132 (37%), Gaps = 4/132 (3%)

Query: 40 RVAAARNELQNAADAAALAGAASLEAGAGAPAWAAAASAAAAALSLNASDGAALSSGDVQ 99
A A+ + + A A AA+ A PA + + AA + + GAA + +
Sbjct: 226 AAAEAKRKAEEQARQQAAIRAANTYA---MPANGSVVATAAGRGLIQVAQGAASLAQAIS 282

Query: 100 TGYWNVTGVPAGLEPTTLAPGEYDVPAVQATVTRARNQNGGPLSLLMGGLLGLVGTPAAA 159
V G P+ +A G + T + ++Q + +G +G P +
Sbjct: 283 DAI-AVLGRVLASAPSVMAVGFASLTYSSRTAEQWQDQTPDSVRYALGMDAAKLGLPPSV 341

Query: 160 TAVAVAGAPATV 171
AVA A TV
Sbjct: 342 NLNAVAKASGTV 353


77  BURPS668_1888  BURPS668_1891  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_1888  -1      10       2.864369   multidrug efflux operon transcriptional regulator
BURPS668_1889  -1      10       2.303916   periplasmic multidrug efflux lipoprotein
BURPS668_1890  -1      11       0.871636   multidrug efflux protein
BURPS668_1891  4       15       0.853883   outer membrane efflux protein OprA
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1888  HTHTETR       117    5e-35    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 117 bits (295), Expect = 5e-35
Identities = 53/210 (25%), Positives = 100/210 (47%), Gaps = 4/210 (1%)

Query: 1 MARKTREESLNTKNRILDAAELVLLEKGVGQTAMADIAEAAGMSRGAVYGHFNGKIEVCV 60
MARKT++E+ T+ ILD A + ++GV T++ +IA+AAG++RGA+Y HF K ++
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 AVCDRAFSRAVEGFDLSDERPA---LATLRLAASHYLHQCGEPGSMQRVLEILYMKCEQS 117
+ + + S E + L+ LR H L + ++EI++ KCE
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 118 EENAPLMRRRALYELQTLRIAKALLRRAVAAGELDASLDVHLAGVYLLSLLEGIFGSMIW 177
E A + + + L++ + L+ + A L A L A + + + G+ + ++
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 178 TTRLRGDRWRDAEAMLDAGVDTLRASPALR 207
+ D ++A + ++ P LR
Sbjct: 181 APQSF-DLKKEARDYVAILLEMYLLCPTLR 209


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1889  RTXTOXIND     40     1e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 40.2 bits (94), Expect = 1e-05
Identities = 20/133 (15%), Positives = 41/133 (30%), Gaps = 5/133 (3%)

Query: 67 EVRARVAGIVTARTYEEGQEVKRGAVLFRIDPAPFKAARDAAAGALEKARAAHLAALDKR 126
E++ IV +EG+ V++G VL ++ +A +L +AR
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 127 RRYDELVRDRAVSERDHTEALADERQAKAAVASARAELA-----RAQLQLDYATVTAPID 181
R + + E + + + + + + Q +L+ A
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERL 217

Query: 182 GRARRALVTEGAL 194
R E
Sbjct: 218 TVLARINRYENLS 230



Score = 35.2 bits (81), Expect = 4e-04
Identities = 18/100 (18%), Positives = 38/100 (38%), Gaps = 10/100 (10%)

Query: 102 KAARDAAAGALEKARAAHLAALDKRRRYDELVRDRAVSERDHTEALADERQAKAAVASAR 161
LE+ + L+A ++ + +L + E L RQ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFK---------NEILDKLRQTTDNIGLLT 315

Query: 162 AELARAQLQLDYATVTAPIDGR-ARRALVTEGALVGQDQA 200
ELA+ + + + + AP+ + + + TEG +V +
Sbjct: 316 LELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAET 355


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1890  ACRIFLAVINRP  1079   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1079 bits (2791), Expect = 0.0
Identities = 516/1032 (50%), Positives = 701/1032 (67%), Gaps = 6/1032 (0%)

Query: 1 MARFFIDRPVFAWVISLFIMLGGIFAIRALPVAQYPDIAPPVVSLYATYPGASAQVVEES 60
MA FFI RP+FAWV+++ +M+ G AI LPVAQYP IAPP VS+ A YPGA AQ V+++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTAVIEREMNGVPGLLYTSATS-SAGQASLSLTFKQGVSADLAAVDVQNRLKIVEARLPE 119
VT VIE+ MNG+ L+Y S+TS SAG +++LTF+ G D+A V VQN+L++ LP+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 120 PVRRDGISIEKAADNAQIIVSLTSEDGRLSGVELGEYASANVLQALRRVEGVGKVQFWGA 179
V++ GIS+EK++ + ++ S++ + ++ +Y ++NV L R+ GVG VQ +GA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 180 EYAMRIWPDPVKMAALGLTASDIASAVRAHNARVTIGDVGRSAVPDSAPIAATVLADAPL 239
+YAMRIW D + LT D+ + ++ N ++ G +G + + A+++A
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 240 TTPDAFGAIALRARADGSTLYLRDVARIEFGGNDYNYPSFVNGKTATGMGIKLAPGSNAV 299
P+ FG + LR +DGS + L+DVAR+E GG +YN + +NGK A G+GIKLA G+NA+
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 300 ATEKRVRATMEELAKFFPPGVKYQIPYETASFVRVSMSKVVTTLVEAGVLVFAVMFLFMQ 359
T K ++A + EL FFP G+K PY+T FV++S+ +VV TL EA +LVF VM+LF+Q
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 360 NFRATLIPTLVVPVALLGTFGAMLAAGFSINVLTMFGMVLAIGILVDDAIVVVENVERLM 419
N RATLIPT+ VPV LLGTF + A G+SIN LTMFGMVLAIG+LVDDAIVVVENVER+M
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 420 VEEKLPPYEATVKAMKQISGAIVGITVVLTSVFVPMAFFGGAVGNIYRQFAFALAVSIGF 479
+E+KLPP EAT K+M QI GA+VGI +VL++VF+PMAFFGG+ G IYRQF+ + ++
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 480 SAFLALSLTPALCATLLKPVADDHHE-KDGFFGWFNRFVARSTHRYTRRVGRVLERPLRW 538
S +AL LTPALCATLLKPV+ +HHE K GFFGWFN S + YT VG++L R+
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 539 LVVYGALTAAAALLITKLPAAFLPDEDQGNFMVMVIRPQGTPLAETMQSVRRVEEYVRTH 598
L++Y + A +L +LP++FLP+EDQG F+ M+ P G T + + +V +Y +
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 599 SPSAY--TFALGGYNLYGEGPNGGMIFVTMKDWKERKRARDQVQAIIAEINAHFAGTPNT 656
+ F + G++ G+ N GM FV++K W+ER + +A+I +
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 657 MVFAINMPALPDLGLTGGFDFRLQDRGGLGYGAFVAAREKLLAEGRKDPV-LTDLMFAGT 715
V NMPA+ +LG GFDF L D+ GLG+ A AR +LL + P L + G
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 716 QDAPQLKLDIDRAKASALGVSMEEINATLAVMFGSDYIGDFMHGSQVRRVIVQADGRHRL 775
+D Q KL++D+ KA ALGVS+ +IN T++ G Y+ DF+ +V+++ VQAD + R+
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 776 DAADVTKLRVRNAKGEMVPLAAFATLHWTMGPPQLTRYNGFPSFTINGAASAGHSSGEAM 835
DV KL VR+A GEMVP +AF T HW G P+L RYNG PS I G A+ G SSG+AM
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 836 AAIERIASTLPAGTGYAWSGQSYEERLSGAQAPMLFALSVLVVFLALAALYESWSIPFAV 895
A +E +AS LPAG GY W+G SY+ERLSG QAP L A+S +VVFL LAALYESWSIP +V
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 896 MLVVPLGVIGAVAGVTLRGMPNDIYFKVGLIATIGLSAKNAILIVEVAKDLVAQR-MSLA 954
MLVVPLG++G + TL ND+YF VGL+ TIGLSAKNAILIVE AKDL+ + +
Sbjct: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960

Query: 955 DAALEAARLRLRPIVMTSLAFGVGVLPLAFATGAASGAQIAIGTGVLGGVISATLFAIFL 1014
+A L A R+RLRPI+MTSLAF +GVLPLA + GA SGAQ A+G GV+GG++SATL AIF
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 1015 VPLFFVCVGRVF 1026
VP+FFV + R F
Sbjct: 1021 VPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_1891  RTXTOXIND     34     0.001    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 34.4 bits (79), Expect = 0.001
Identities = 19/104 (18%), Positives = 35/104 (33%), Gaps = 2/104 (1%)

Query: 379 APRLTLPIFAGGRNRANLDVADARKHIAVAEYEKTIQTAFREV--ADALAARDQIDAQLA 436
P L LP +N + +V I Q +E+ A R + A++
Sbjct: 165 LPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARIN 224

Query: 437 AQQAVYGVDAERLRLAQRRYDSGVASYLELLDAQRSTFESGQEL 480
+ + V+ RL + +L+ + E+ EL
Sbjct: 225 RYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNEL 268


78  BURPS668_2067  BURPS668_2074  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_2067  -1      10       0.775844   cytochrome c family protein
BURPS668_2068  -1      10       0.372064   regulatory protein NosR
BURPS668_2069  -2      13       0.922374   phosphotransferase enzyme family protein
BURPS668_2070  -2      12       1.754573   mechanosensitive ion channel family protein
BURPS668_2071  5       26       2.610876   hypothetical protein
BURPS668_2072  -1      16       0.729214   hypothetical protein
BURPS668_2073  -1      14       0.727163   hypothetical protein
BURPS668_2074  -1      12       0.672974   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2067  PF05272       27     0.034    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 27.3 bits (60), Expect = 0.034
Identities = 10/36 (27%), Positives = 14/36 (38%), Gaps = 2/36 (5%)

Query: 53 ASPSAASAPAAS--SAAAPAATTAAAADTPNPGGEK 86
+SP+AA+ A + A D PGG
Sbjct: 388 SSPTAAAGGAGGGEPPKKRDPSAGAGTDPGGPGGGD 423


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2070  RTXTOXIND     37     3e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 36.7 bits (85), Expect = 3e-04
Identities = 21/139 (15%), Positives = 43/139 (30%), Gaps = 12/139 (8%)

Query: 38 PPAPAISRDEALAELKRVQAALDRIKQQASAATTYKQLDALDESTQALSADVDKLTAALV 97
P +S +E L ++ + Q LD + A +++
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKEL--NLDKKRAERLTVLARINRYENLS- 230

Query: 98 PTRAQLQAQLDVLGPPPAPGAAPETPAVARQR------ADLNARKTQLDAALKQAADEKE 151
+++LD A + + ++ +L K+QL+ + KE
Sbjct: 231 ---RVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKE 287

Query: 152 SLANLTQQYSKLRRSLLRD 170
+TQ + LR
Sbjct: 288 EYQLVTQLFKNEILDKLRQ 306



Score = 30.6 bits (69), Expect = 0.027
Identities = 32/188 (17%), Positives = 57/188 (30%), Gaps = 28/188 (14%)

Query: 7 FARRVALIGLLHLLCAALPAAAADLASGASAPPAPAISRDEALAELKRVQAA----LDRI 62
R VA + L+ A + + + A+A S K ++ + I
Sbjct: 56 RPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHS-----GRSKEIKPIENSIVKEI 110

Query: 63 K----QQASAATTYKQLDALDESTQALSADVDKLTAALVPTRAQL--------QAQLDVL 110
+ +L AL L L A L TR Q+ + L
Sbjct: 111 IVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKL 170

Query: 111 GPPPAPGAAPETPAVAR------QRADLNARKTQLDAALKQAADEKESLANLTQQYSKLR 164
P E + Q + +K Q + L + E+ ++ +Y L
Sbjct: 171 PDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLS 230

Query: 165 RSLLRDQL 172
R + + +L
Sbjct: 231 R-VEKSRL 237


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2071  INVEPROTEIN   25     0.025    Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 25.5 bits (55), Expect = 0.025
Identities = 16/52 (30%), Positives = 24/52 (46%), Gaps = 2/52 (3%)

Query: 6 IARAHARSAPGGEPAGRVETDARTAGDGRSFAQRRRSERRMGRKAN--ERVL 55
I +A S+PG E V++ + F RR E++ +N ERVL
Sbjct: 35 IQQAAEDSSPGAEVQKFVQSTDEMSAALAQFRNRRDYEKKSSNLSNSFERVL 86


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2074  AEROLYSIN     27     0.044    Aerolysin signature.
		>AEROLYSIN#Aerolysin signature.

Length = 493

Score = 26.5 bits (58), Expect = 0.044
Identities = 13/37 (35%), Positives = 20/37 (54%)

Query: 29 RDIDAAVRRALDAENRGALNSMLGQPLPVAADVASVR 65
R + A + AE++ A N +G P+P+AAD R
Sbjct: 417 RPVRAGITGDFSAESQFAGNIEIGAPVPLAADSKVRR 453


79  BURPS668_2413  BURPS668_2417  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_2413  -2      10       -0.568509  hypothetical protein
BURPS668_2414  -1      9        -0.827338  NfeD family protein
BURPS668_2415  -1      9        -0.911173  phosphoenolpyruvate synthase
BURPS668_2416  -1      12       1.068272   phytochelatin synthase
BURPS668_2417  -1      12       0.806976   serine protease
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2413  RTXTOXINA     30     0.021    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.5 bits (66), Expect = 0.021
Identities = 18/92 (19%), Positives = 35/92 (38%), Gaps = 8/92 (8%)

Query: 221 INQAQGEAAAILAVAEANSQAIQKIAQAIQSQGGMDAVNLKVAEQYVGAFGNLAKAGNTL 280
INQ A++ + SQ + + + + ++ V K+ NL G L
Sbjct: 188 INQLVDTVASLNNNVNSFSQQLNTLGSVLSNTKHLNGVGNKLQN-----LPNLDNIGAGL 242

Query: 281 IVPSNLSDLSTAIASALTIVNRSAPGALAPGA 312
+S + +AI+++ + N A A
Sbjct: 243 DT---VSGILSAISASFILSNADADTRTKAAA 271


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2415  PHPHTRNFRASE  265    3e-81    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 265 bits (679), Expect = 3e-81
Identities = 99/441 (22%), Positives = 171/441 (38%), Gaps = 73/441 (16%)

Query: 396 IHDPSEMERVQPGDVLVADMTDPNWEPVMK-RASAIVTNRGGRTCHAAIIARELGVPAVV 454
+ S + ++ D+T + + K T+ GGRT H+AI++R L +PAVV
Sbjct: 145 VETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFATDIGGRTSHSAIMSRSLEIPAVV 204

Query: 455 GCGDATDVLKDGALVTVSCAEGDEGKIYDGLLETEVSEVQRGE------------LPSVP 502
G + T+ ++ G +V V G EG + E EV + L P
Sbjct: 205 GTKEVTEKIQHGDMVIVD---GIEGIVIVNPTEEEVKAYEEKRAAFEKQKQEWAKLVGEP 261

Query: 503 --------VKIMMNVGNPQLAFDFSQLPNAGVGLARLEFIINNNIGVHPKAILEYPNVDA 554
V++ N+G P+ G+GL R EF+ + P
Sbjct: 262 STTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFLYMDR-DQLPT---------- 310

Query: 555 DLKKAVESVARGHASPRAFYVDKLTEGIATIAAAFYPKPVIVRLSDFKSNEYKKLIGGSR 614
++ E + KPV++R D ++ +
Sbjct: 311 --------------------EEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYL---- 346

Query: 615 YEPDEENPMLGFRGASRYIAEDFAQAFEMECMALKRVRDEMGLTNVEIMVPFVRTVKQAE 674
P E NP LGFR + + F + AL R N+++M P + T+++
Sbjct: 347 QLPKELNPFLGFRAIRLCL--EKQDIFRTQLRALLRAS---TYGNLKVMFPMIATLEELR 401

Query: 675 RVVGLLGKFGLKRGDNG------LRLIMMCEVPSNAILAEEFLQHFDGFSIGSNDLTQLT 728
+ ++ + K G + + +M E+PS A+ A F + D FSIG+NDL Q T
Sbjct: 402 QAKAIMQEEKDKLLSEGVDVSDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYT 461

Query: 729 LGLDRDSGMELLAVDFDERDPAVKFMLKRAIDTCRKLDKYVGICGQGPSDHPDFAKWLAD 788
+ DR + E ++ + PA+ ++ I K+VG+CG+ D L
Sbjct: 462 MAADRMN--ERVSYLYQPYHPAILRLVDMVIKAAHSEGKWVGMCGEMAGD-EVAIPLLLG 518

Query: 789 EGIASISLNPDTVIETWQALA 809
G+ S++ +++ L
Sbjct: 519 LGLDEFSMSATSILPARSQLL 539


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2416  cloacin       40     1e-05    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 39.7 bits (92), Expect = 1e-05
Identities = 26/64 (40%), Positives = 30/64 (46%), Gaps = 5/64 (7%)

Query: 13 VGVGVGVGVGVGISV-----GVGVGVGVGGGGGGGGGGGGGGDGDGDGDGDGGGPNARQA 67
+GVG G G G S G G G G+ GGG G G GGG G G G GG +A A
Sbjct: 27 LGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAA 86

Query: 68 ASVY 71
+
Sbjct: 87 PVAF 90



Score = 33.9 bits (77), Expect = 0.001
Identities = 26/73 (35%), Positives = 29/73 (39%), Gaps = 7/73 (9%)

Query: 14 GVGVGVGVGVGISVGVGV-------GVGVGGGGGGGGGGGGGGDGDGDGDGDGGGPNARQ 66
G G+GVG G S G G G G G G GGG G G G G G G
Sbjct: 22 GGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNL 81

Query: 67 AASVYRAAWHVPA 79
+A A+ PA
Sbjct: 82 SAVAAPVAFGFPA 94


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2417  SUBTILISIN    41     7e-06    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 41.4 bits (97), Expect = 7e-06
Identities = 21/122 (17%), Positives = 36/122 (29%), Gaps = 22/122 (18%)

Query: 342 VLYAAPSMLLSDITSAYNRAVVDNVAKVINVSLGVCEADARASGTQAADDRIFKSAVAQG 401
VL S I A+ + +I++SLG A + AVA
Sbjct: 117 VLNKQGSGQYDWIIQGIYYAI-EQKVDIISMSLG---GPEDVPELHEAVKK----AVASQ 168

Query: 402 QTFVVAAGDAGAYECSVSRVSGGQGVPARSNYSVSEPATSPYVVAVGGTTLSTDRTTLAY 461
+ AAG+ G + + P V++VG + +
Sbjct: 169 ILVMCAAGNEGDGDDRTD--------------ELGYPGCYNEVISVGAINFDRHASEFSN 214

Query: 462 AG 463
+
Sbjct: 215 SN 216


80  BURPS668_2457  BURPS668_2464  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_2457  -1      9        1.192585   DNA repair protein RadA
BURPS668_2458  -1      8        2.594707   alanine racemase
BURPS668_2459  -1      8        2.234109   lysophospholipid transporter LplT
BURPS668_2460  -1      10       4.863632   phosphomethylpyrimidine kinase
BURPS668_2461  -2      12       4.376859   hypothetical protein
BURPS668_2462  -2      13       4.556853   hypothetical protein
BURPS668_2463  -3      10       4.356580   uracil-DNA glycosylase
BURPS668_2464  -2      14       2.600800   ribosomal-protein-alanine acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2457  TCRTETOQM     31     0.011    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 31.0 bits (70), Expect = 0.011
Identities = 16/79 (20%), Positives = 27/79 (34%), Gaps = 17/79 (21%)

Query: 104 LLQSLAQIASERPALYISGEESGAQIALRAQRLALLEGGASAADLKLLAEIQLEKIQATI 163
LL +L +I+ P L + + +I L L ++Q+E A +
Sbjct: 361 LLDALLEISDSDPLLRYYVDSATHEIILS-----------------FLGKVQMEVTCALL 403

Query: 164 DAERPDVAVIDSIQTIYSE 182
+ I IY E
Sbjct: 404 QEKYHVEIEIKEPTVIYME 422


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2458  ALARACEMASE   438    e-156    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 438 bits (1127), Expect = e-156
Identities = 207/353 (58%), Positives = 270/353 (76%)

Query: 1 MPRPISATIHTAALANNLSVVRRHAAQSKVWAIVKANAYGHGLARVFPGLRGTDGFGLLD 60
M RPI A++ AL NLS+VR+ A ++VW++VKANAYGHG+ R++ + TDGF LL+
Sbjct: 1 MTRPIQASLDLQALKQNLSIVRQAATHARVWSVVKANAYGHGIERIWSAIGATDGFALLN 60

Query: 61 LDEAVKLRELGWAGPILLLEGFFRSTDIDVIDRYSLTTAVHNDEQMRMLETARLSKPVNV 120
L+EA+ LRE GW GPIL+LEGFF + D+++ D++ LTT VH++ Q++ L+ ARL P+++
Sbjct: 61 LEEAITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAPLDI 120

Query: 121 QLKMNSGMNRLGYTPEKYRAAWERARACPGIGQITLMTHFSDADGERGVAEQMATFERGA 180
LK+NSGMNRLG+ P++ W++ RA +G++TLM+HF++A+ G++ MA E+ A
Sbjct: 121 YLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTLMSHFAEAEHPDGISGAMARIEQAA 180

Query: 181 QGIAGARSFANSAAVLWHPSAHFDWVRPGIMLYGASPSGRAADIADRGLKPTMTLASELI 240
+G+ RS +NSAA LWHP AHFDWVRPGI+LYGASPSG+ DIA+ GL+P MTL+SE+I
Sbjct: 181 EGLECRRSLSNSAATLWHPEAHFDWVRPGIILYGASPSGQWRDIANTGLRPVMTLSSEII 240

Query: 241 AVQTLAKGQAVGYGSMFVAEDTMRIGVVACGYADGYPRIAPEGTPVVVDGVRTRIVGRVS 300
VQTL G+ VGYG + A D RIG+VA GYADGYPR AP GTPV+VDGVRT VG VS
Sbjct: 241 GVQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAPTGTPVLVDGVRTMTVGTVS 300

Query: 301 MDMLTVDLTPVPQAGVGARVELWGETLPIDDVAARCMTVGYELMCAVAPRVPV 353
MDML VDLTP PQAG+G VELWG+ + IDDVAA TVGYELMCA+A RVPV
Sbjct: 301 MDMLAVDLTPCPQAGIGTPVELWGKEIKIDDVAAAAGTVGYELMCALALRVPV 353


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2459  TCRTETB       29     0.049    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 28.7 bits (64), Expect = 0.049
Identities = 31/139 (22%), Positives = 54/139 (38%), Gaps = 4/139 (2%)

Query: 13 FFSSLADSALLIAAIALLKDLHAPNWMIPLLKLFFVLSYVVLAAFVGAFADSRPKGHVMF 72
FFS L + L ++ + D + P + F+L++ + A G +D ++
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 73 ITNSIKVVGCLIMLFGAHP----LIAYGIVGFGAAAYSPAKYGILTELLPPERLVAANGW 128
I G +I G ++A I G GAAA+ ++ +P E A G
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 129 IEGTTVGSIILGTVLGGAL 147
I +G +GG +
Sbjct: 144 IGSIVAMGEGVGPAIGGMI 162


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2463  PYOCINKILLER  29     0.034    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 29.4 bits (65), Expect = 0.034
Identities = 30/121 (24%), Positives = 39/121 (32%), Gaps = 13/121 (10%)

Query: 124 ARQAAAESGVRAAADAPAPAAAPESRTRDATIARGASPAEPEEPGVAGVVGVADEPSVAG 183
+ +AAA + R A A A A E + A I + A P V
Sbjct: 213 SIEAAAANKAREQAAAEAKRKAEEQARQQAAIRAANTYAMPANGSVV----------ATA 262

Query: 184 GARRARTSGGGAEVPASALDLTDAAEATRPAAAPAPAALTGGDAGAAASDEDMS-WFDLE 242
R GA A A ++DA A AP+ + G A S W D
Sbjct: 263 AGRGLIQVAQGAASLAQA--ISDAIAVLGRVLASAPSVMAVGFASLTYSSRTAEQWQDQT 320

Query: 243 P 243
P
Sbjct: 321 P 321


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2464  SACTRNSFRASE  44     3e-08    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 44.2 bits (104), Expect = 3e-08
Identities = 21/71 (29%), Positives = 32/71 (45%)

Query: 79 VAPVAQRSGVGLALLREAVRIARAERLDGVLLEVRPSNPRAIRLYERFGFVSVGRRRNYY 138
VA ++ GVG ALL +A+ A+ G++LE + N A Y + F+ Y
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAVDTMLY 156

Query: 139 PAKHRSREDAI 149
+ E AI
Sbjct: 157 SNFPTANEIAI 167


81  BURPS668_2608  BURPS668_2614  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_2608  -1      15       -1.765427  D-alanyl-D-alanine carboxypeptidase
BURPS668_2609  0       14       -2.004850  phasin family protein
BURPS668_2610  0       14       -1.608523  pyruvate dehydrogenase complex E3 component,
BURPS668_2611  0       13       -1.214769  dihydrolipoamide acetyltransferase
BURPS668_2612  0       11       -1.527873  pyruvate dehydrogenase subunit E1
BURPS668_2613  0       10       -0.540169  sensory box histidine kinase
BURPS668_2614  0       11       0.630149   LuxR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2608  SSBTLNINHBTR  28     0.025    Streptomyces subtilisin inhibitor signature.
		>SSBTLNINHBTR#Streptomyces subtilisin inhibitor signature.

Length = 144

Score = 28.3 bits (62), Expect = 0.025
Identities = 15/50 (30%), Positives = 23/50 (46%)

Query: 25 VATAAVAPADAFAATAKTAQSAKGKKSAAKKSLRAASSSAEPRAKGARKR 74
+A+ A APA +A +A G+ +A LRA + + P A G
Sbjct: 27 LASPATAPASLYAPSALVLTVGHGESAATAAPLRAVTLTCAPTASGTHPA 76


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2611  RTXTOXIND     36     5e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 35.6 bits (82), Expect = 5e-04
Identities = 12/58 (20%), Positives = 22/58 (37%)

Query: 49 VPSPSAGTVKEVKVKVGDAVSQGSLIVLLDGAQAAAQPAQANGAATSAAQPAAAPAAA 106
+ VKE+ VK G++V +G +++ L A A + + A
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQIL 156



Score = 33.6 bits (77), Expect = 0.002
Identities = 20/91 (21%), Positives = 32/91 (35%), Gaps = 9/91 (9%)

Query: 162 VPSPAAGVVKDIKVKVGDAVSQGSLIVVLEASGGAA------ASAPQAAAPAAPAPAPAP 215
+ +VK+I VK G++V +G +++ L A G A +S QA +
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSR 158

Query: 216 QAAPAAAPA---PAQAPAPAASGEYRASHAS 243
P P + S E S
Sbjct: 159 SIELNKLPELKLPDEPYFQNVSEEEVLRLTS 189


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2613  PF06580       32     0.012    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.8 bits (72), Expect = 0.012
Identities = 18/85 (21%), Positives = 31/85 (36%), Gaps = 18/85 (21%)

Query: 700 PVLIEQVLV-NLMKNAAEAMQEARPQAENGVIRVVADLEAGFVDIRVIDQGPGVDEATAE 758
P ++ Q LV N +K+ + G I + + G V + V + G + T E
Sbjct: 256 PPMLVQTLVENGIKHGIA------QLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE 309

Query: 759 RLFEPFYSTKSDGMGMGLNICRSII 783
S G G+ N+ +
Sbjct: 310 ----------STGTGL-QNVRERLQ 323


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_2614  HTHFIS        113    2e-31    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 113 bits (283), Expect = 2e-31
Identities = 39/153 (25%), Positives = 67/153 (43%), Gaps = 4/153 (2%)

Query: 11 TVFVVDDDEAVRDSLRWLLEANGYRVQCFSSAEQFLDAYQPAQQAGQIACLILDVRMSGM 70
T+ V DDD A+R L L GY V+ S+A AG ++ DV M
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIA----AGDGDLVVTDVVMPDE 60

Query: 71 SGLELQERLIAENAALPIIFVTGHGDVPMAVSTMKKGAMDFIEKPFDEAELRKLVERMLE 130
+ +L R+ LP++ ++ A+ +KGA D++ KPFD EL ++ R L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 131 KARNESKSVQEQRAASERLSKLTAREQQVLERI 163
+ + +++ L +A Q++ +
Sbjct: 121 EPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVL 153


82  BURPS668_3168  BURPS668_3175  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_3168  0       10       2.160660   D-isomer specific 2-hydroxyacid dehydrogenase
BURPS668_3169  -2      12       1.756025   cob(II)yrinic acid a,c-diamide reductase
BURPS668_3170  -3      11       1.880982   inner membrane transport protein YdhP
BURPS668_3171  -3      11       1.690296   hypothetical protein
BURPS668_3172  -2      12       1.851098   major facilitator transporter
BURPS668_3173  -3      13       1.306848   fumarylacetoacetase
BURPS668_3174  -3      13       1.072974   homogentisate 1,2-dioxygenase
BURPS668_3175  -2      14       1.696223   4-hydroxybenzoate transporter
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3168  SECA          34     0.001    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 33.7 bits (77), Expect = 0.001
Identities = 27/87 (31%), Positives = 42/87 (48%), Gaps = 9/87 (10%)

Query: 116 LNRRLPRAVARTREGDFSLNGLLGFDLFGKTVGVIGTGLI--GSVFARIMTGFGMRVLAH 173
L +P A A RE + G+ FD V ++G G++ A + TG G + L
Sbjct: 60 LENLIPEAFAVVREASKRVFGMRHFD-----VQLLG-GMVLNERCIAEMRTGEG-KTLTA 112

Query: 174 SLPPHDDALIALGVRYVPLDALLAEAD 200
+LP + +AL GV V ++ LA+ D
Sbjct: 113 TLPAYLNALTGKGVHVVTVNDYLAQRD 139


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3170  TCRTETA       50     1e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 49.8 bits (119), Expect = 1e-08
Identities = 95/399 (23%), Positives = 149/399 (37%), Gaps = 37/399 (9%)

Query: 5 LFALAVAAFGIGTTEFVIMGLLPNVARDLGVSIPAA---GMLVSGYALGVTIGAPILAVV 61
L +A+ A GIG +IM +LP + RDL S G+L++ YAL AP+L +
Sbjct: 11 LSTVALDAVGIG----LIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 62 TAKMPRKAALLALIGVFIVGNLFCAIAPGYATLMVARVVTAFCHGAFFGIGSVVASNLVA 121
+ + R+ LL + V A AP L + R+V G+ +A ++
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIA-DITD 125

Query: 122 PNKRAQAIALMFTGLTLANVLGVPLGTALGQAFGWRATFWAVTGIGALAAAALAFCVPKR 181
++RA+ M V G LG +G F A F+A + L F +P+
Sbjct: 126 GDERARHFGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 182 LEMPAAGIAREFGVLRNPQVLMVLGISVLASASLFTVFTYIAPI-----------LEDVT 230
+ + RE NP + A+L VF + + ED
Sbjct: 185 HKGERRPLRRE---ALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRF 241

Query: 231 GFTPHDVTLVLLLFG-LGLTVGGTVGGKLADW---RRMPSLVATLASIGVVLAAFAGTMR 286
+ + + L FG L + G +A RR L G +L AFA
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 287 TPLPALVTIFVWGVLAFAIVPPLQILIVDRAS-HAPNLASTLNQGAFNLGNALGAWLGGT 345
P +V + G+ +P LQ ++ + +L + +G L
Sbjct: 302 MAFPIMVLLASGGI----GMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTA 357

Query: 346 AIHAGVPLAK-LPW-AGAAL---AMAALALTLWSASLER 379
A + W AGAAL + AL LWS + +R
Sbjct: 358 IYAASITTWNGWAWIAGAALYLLCLPALRRGLWSGAGQR 396


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3172  TCRTETA       47     9e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 46.7 bits (111), Expect = 9e-08
Identities = 95/395 (24%), Positives = 152/395 (38%), Gaps = 25/395 (6%)

Query: 12 LILSVAVVGLGTGATLPLTALALTEAGHGTRIV---GILTAAQAGGGLAVVPFVTAITKR 68
++ +VA+ +G G +P+ L + H + GIL A A A P + A++ R
Sbjct: 10 ILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDR 69

Query: 69 LGARQVIVASVVVLAAATALMQFTSNLVVWGVLRVVCGAALMLLFTIGEAWVNQLADDAT 128
G R V++ S+ A A+M L V + R+V G G A++ + D
Sbjct: 70 FGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAG-AYIADITDGDE 128

Query: 129 RGRVVAIYATNFTLFQMAGPVLVSQIAGMT-HVRFALSGALFLLAL--------PSLASI 179
R R + F +AGPVL + G + H F + AL L S
Sbjct: 129 RARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGE 188

Query: 180 RKTPIADEPHHDAHDRWTRVIPKMPALVVGTAFFALFDTLALSLLPIFAMAR--GVASEA 237
R+ + + A RW R + + AL+ L + +L IF R A+
Sbjct: 189 RRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTI 248

Query: 238 AVLFAAILLFGDTAMQFPIGWLADKLGRERVHLGAGCVVLALLPLLPAVVTTPWLCWPLL 297
+ AA + A G +A +LG ER L G + +L A T W+ +P++
Sbjct: 249 GISLAAFGILHSLAQAMITGPVAARLG-ERRALMLGMIADGTGYILLAFATRGWMAFPIM 307

Query: 298 FVLGAAAGSVYTL----SLVACGERFRGSALVTASSLVSASWSAASFGGPLVAGALMEQF 353
+L + + L S ER +G + ++L S S GPL+ A+
Sbjct: 308 VLLASGGIGMPALQAMLSRQVDEER-QGQLQGSLAALT----SLTSIVGPLLFTAIYAAS 362

Query: 354 GGDALIGVLIVSAIAFVGAALWERRALPMQAARRG 388
I A ++ RR L A +R
Sbjct: 363 ITTWNGWAWIAGAALYLLCLPALRRGLWSGAGQRA 397


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3175  TCRTETA       44     1e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 43.7 bits (103), Expect = 1e-06
Identities = 78/398 (19%), Positives = 125/398 (31%), Gaps = 59/398 (14%)

Query: 50 VAPSVIAEWGVKKQA---LGPVFSASLFGMLLGALGLSVLADRIGRRPVLIGATLFFALA 106
V P ++ + G + + A L L+DR GRRPVL+ + A+
Sbjct: 27 VLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVD 86

Query: 107 MLATPFATSIPILIALRFVTGLGLGCIMPNAMALVGECSPGAHRVKRM----MIVSCGFT 162
A + +L R V G+ G A A + + + G R + G
Sbjct: 87 YAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDGDERARHFGFMSACFGFGMV 145

Query: 163 LGAALGGFVSAALIPAFGWRAVFFVGGAVPLALAAAMAASLPESPQSLVLRGRHDAARAW 222
G LGG + F A FF A+ LPES H R
Sbjct: 146 AGPVLGGLMG-----GFSPHAPFFAAAALNGLNFLTGCFLLPES---------HKGERRP 191

Query: 223 LAKFAPRLAVPPDTRLVVREAGPRGAPVAELFRSGRARVTLLLWAINF-MNLIDLYFLSN 281
L + A P+A + V L A+ F M L+ +
Sbjct: 192 LRREALN-------------------PLASFRWARGMTVVAALMAVFFIMQLVGQVPAAL 232

Query: 282 WLPTVMRDAGYASGTAVIVGTVLQTGGVIGTLS----LGWFIERHGFARVLFACFACATI 337
W+ + A +G L G++ +L+ G R G R L
Sbjct: 233 WVIFGEDRFHW---DATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGT 289

Query: 338 AIGLIGPVAHAFVWLLAAVFVGGFCVVGGQPAVNALAGHYYPTSLRSTGIGWSLGVGRVG 397
L+ ++ V + + G PA+ A+ + G + +
Sbjct: 290 GYILLAFATRGWMAFPIMVLLASGGI--GMPALQAMLSRQVDEERQGQLQGSLAALTSLT 347

Query: 398 SVLGPLVGGQLIA--------LGWSNDALFHAAAVPVL 427
S++GPL+ + A W A + +P L
Sbjct: 348 SIVGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPAL 385


83  BURPS668_3670  BURPS668_3674  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_3670  1       11       -1.476062  ABC-2 type transporter permease
BURPS668_3671  0       11       -1.189463  ABC transporter ATP-binding protein
BURPS668_3672  -1      13       -0.906563  hypothetical protein
BURPS668_3673  -3      11       0.431852   toluene tolerance protein
BURPS668_3674  -3      11       1.162142   VacJ family lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3670  ABC2TRNSPORT  74     1e-17    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 73.8 bits (181), Expect = 1e-17
Identities = 60/243 (24%), Positives = 100/243 (41%), Gaps = 6/243 (2%)

Query: 7 LFYKEILRFWKVSFQTVLAPVVTALLYLTIFGHALTGRVNVYPGVEYVSFLVPGLVMMSV 66
++ + + + K + ++L + L+YL G L V GV Y +FL G+V S
Sbjct: 19 VWRRNYIAWKKAALASLLGHLAEPLIYLFGLGAGLGVMVGRVGGVSYTAFLAAGMVATSA 78

Query: 67 LQNA-FANSSSSLIQSKITGNLVFMLLPPLSSADIFGAYVLASVVRGLAVGAGVFVVTVW 125
+ A F ++ + + ML L DI + + + GAG+ VV
Sbjct: 79 MTAATFETIYAAFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAA 138

Query: 126 FIPMSFAAPLYIVAFALFGSAILGTLGLIAGIWAEKFDQLAAFQNFLIMPLTFLSGVFYS 185
+ + LY + +LG++ A +D +Q +I P+ FLSG +
Sbjct: 139 LGYTQWLSLLYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFP 198

Query: 186 THSLPPVWREVSRLNPFFYMIDGFRYGFFG--IADVNPLASLS---VVAGFFVLLALIAM 240
LP V++ +R P + ID R G + DV +V FF+ AL+
Sbjct: 199 VDQLPIVFQTAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRR 258

Query: 241 RLL 243
RLL
Sbjct: 259 RLL 261


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3671  PF05272       28     0.037    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.037
Identities = 11/19 (57%), Positives = 13/19 (68%)

Query: 34 LLGPNGAGKTTLISILAGL 52
L G G GK+TLI+ L GL
Sbjct: 601 LEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3673  FLGMOTORFLIG  28     0.026    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 28.2 bits (63), Expect = 0.026
Identities = 12/73 (16%), Positives = 22/73 (30%)

Query: 74 RTTQLAMGRNWRTATPAQQQQVIEQFKQLLIRTYSGALAQLKPDQQIQYPPFRADADATD 133
R + A ++ +Q +T + L+ L P + T+
Sbjct: 107 NLGSALQSRPFEFVRRADPANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTN 166

Query: 134 VVVRTVAMNNGQP 146
V R M+ P
Sbjct: 167 VARRIALMDRTSP 179


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3674  VACJLIPOPROT  224    2e-74    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 224 bits (571), Expect = 2e-74
Identities = 85/220 (38%), Positives = 114/220 (51%), Gaps = 8/220 (3%)

Query: 32 AAAALSGCATVQTPTKG--DPFEGFNRTMYTFNDKV-DQYALKPVARGYQWAVPQPMRDS 88
L GCA+ T +G DP EGFNRTMY FN V D Y ++PVA ++ VPQP R+
Sbjct: 11 GTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWRDYVPQPARNG 70

Query: 89 VTNFFSNIGDVYIAANNLVQLKIADGVGDIMRVVINTVFGVGGLFDVATLAKLPKHAND- 147
++NF N+ + + N +Q G+ R +NT+ G+GG DVA +A +
Sbjct: 71 LSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGMANPKLQRTEP 130

Query: 148 --FGVTLGHYGVPSGPYLVLPLLGPSTVRDTAGLAVDYAGNPLTYVRPDGVSWGLFGLNL 205
FG TLGHYGV GPY+ LP G T+RD G D A P+ +S G + L
Sbjct: 131 HRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMAD-ALYPVLSWLTWPMSVGKWTLEG 189

Query: 206 VNTRANLLGAGDVLEAAAIDKYSFVRNAYLQRRQALIGGA 245
+ TRA LL + +L + D Y VR AY QR + G
Sbjct: 190 IETRAQLLDSDGLLR-QSSDPYIMVREAYFQRHDFIANGG 228


84  BURPS668_3844  BURPS668_3857  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
BURPS668_3844  -2      9        0.084063   flagellar biosynthesis protein FlhB
BURPS668_3845  -1      11       0.038514   3-demethylubiquinone-9 3-methyltransferase
BURPS668_3846  -2      11       0.197942   lipoprotein
BURPS668_3847  -2      12       -0.199820  hypothetical protein
BURPS668_3848  -1      12       0.733589   chemotaxis regulator CheZ
BURPS668_3849  -1      11       0.406497   chemotaxis protein CheY
BURPS668_3850  0       11       1.379045   chemotaxis-specific methylesterase
BURPS668_3851  1       12       1.390058   chemoreceptor glutamine deamidase CheD
BURPS668_3852  1       12       1.171202   chemotaxis protein methyltransferase
BURPS668_3853  2       13       0.881968   methyl-accepting chemotaxis protein II
BURPS668_3854  2       12       -0.638582  chemotaxis protein CheW
BURPS668_3855  2       12       -0.433478  chemotaxis protein CheA
BURPS668_3856  1       13       -2.619747  chemotaxis protein CheY
BURPS668_3857  0       14       -2.736267  flagellar motor protein MotB
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3844  TYPE3IMSPROT  359    e-125    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 359 bits (923), Expect = e-125
Identities = 108/344 (31%), Positives = 181/344 (52%), Gaps = 2/344 (0%)

Query: 8 DRTEAATPKRREKAREEGQVARSRELASFALLSAGFYGAWMLSGPIGEHLRTMLHTAFSF 67
++TE TPK+ AR++GQVA+S+E+ S AL+ A LS EH ++
Sbjct: 4 EKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLM--LIPA 61

Query: 68 DRAAAFDTNRMLSHAGTLSLEGLYALAPVLALTGVAALATPMAMGGWLVSTKTFELKFER 127
+++ + + + LE Y P+L + + A+A+ + G+L+S + + ++
Sbjct: 62 EQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIKK 121

Query: 128 LNPITGLGRIFSIQGPIQLGMSIAKTLVVGGIGGIAIWRSKDELLGLATQPLHAALADAL 187
+NPI G RIFSI+ ++ SI K +++ + I I + LL L T +
Sbjct: 122 INPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLG 181

Query: 188 HLVAVCCGMTVAGMLVVAGLDVPYQLWQYNKKLRMTKEEVKREHRENEGDPHVKGRIRQQ 247
++ + G +V++ D ++ +QY K+L+M+K+E+KRE++E EG P +K + RQ
Sbjct: 182 QILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQF 241

Query: 248 QRAMARRRMMANVPTADVVVTNPTHFAVALKYTDGEMRAPKVVAKGVNLVAARIRELAAE 307
+ + R M NV + VVV NPTH A+ + Y GE P V K + +R++A E
Sbjct: 242 HQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEE 301

Query: 308 HHVPLLEAPPLARALYHNVELEREIPGTLYSAVAEVLAWVYQLK 351
VP+L+ PLARALY + ++ IP A AEVL W+ +
Sbjct: 302 EGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQN 345


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3846  cloacin       32     0.004    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 32.0 bits (72), Expect = 0.004
Identities = 14/43 (32%), Positives = 20/43 (46%)

Query: 28 GGGGDGGSNASVNTGTGGGDTSAGGGSNGGTGGTGGSGSTPLA 70
GGG G + +G G G + G GTGG + + P+A
Sbjct: 47 GGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVA 89



Score = 30.1 bits (67), Expect = 0.022
Identities = 17/56 (30%), Positives = 22/56 (39%)

Query: 17 AAATAALVAACGGGGDGGSNASVNTGTGGGDTSAGGGSNGGTGGTGGSGSTPLASN 72
A +T+ + G G AS +G + GGGS G GGSG N
Sbjct: 13 AHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGN 68



Score = 29.7 bits (66), Expect = 0.025
Identities = 21/61 (34%), Positives = 27/61 (44%), Gaps = 7/61 (11%)

Query: 28 GGGGDGGSNASVNTGTGGGDTS-------AGGGSNGGTGGTGGSGSTPLASNQAAITVST 80
GG DG +S N GGG S +G G+ GG G +GG T + A V+
Sbjct: 31 GGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVAF 90

Query: 81 G 81
G
Sbjct: 91 G 91


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3849  HTHFIS        86     5e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 86.4 bits (214), Expect = 5e-23
Identities = 32/110 (29%), Positives = 52/110 (47%), Gaps = 4/110 (3%)

Query: 1 MDKSMKILVVDDFPTMRRIVRNLLKELGYSNVDEAEDGLAGLARLRGGGYDFVISDWNMP 60
M + ILV DD +R ++ L GY V + + G D V++D MP
Sbjct: 1 MTGA-TILVADDDAAIRTVLNQALSRAGYD-VRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 NLDGLAMLKEIRADASLTHLPVLMVTAESKKENIIAAAQAGASGYVVKPF 110
+ + +L I+ LPVL+++A++ I A++ GA Y+ KPF
Sbjct: 59 DENAFDLLPRIKKARP--DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3850  HTHFIS        66     4e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 66.4 bits (162), Expect = 4e-14
Identities = 32/146 (21%), Positives = 62/146 (42%), Gaps = 14/146 (9%)

Query: 1 MQKKIKVLCVDDSALIRSLMTEIINSQPDMEVCATAPDPLVARELIKQHNPDVLTLDVEM 60
M +L DD A IR+++ + ++ + I + D++ DV M
Sbjct: 1 MTG-ATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVM 57

Query: 61 PRMDGLDFLEKLMRLRP-MPVVMVSSLTERGSEITLRALELGAVDFVTKPRVGIRDGMLD 119
P + D L ++ + RP +PV+++S+ ++A E GA D++ KP D
Sbjct: 58 PDENAFDLLPRIKKARPDLPVLVMSAQNT--FMTAIKASEKGAYDYLPKP--------FD 107

Query: 120 YSEKLADKVRAASRARVRQNPQPHAA 145
+E + RA + + R + +
Sbjct: 108 LTELIGIIGRALAEPKRRPSKLEDDS 133


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3855  PF06580       46     3e-07    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 46.0 bits (109), Expect = 3e-07
Identities = 21/151 (13%), Positives = 50/151 (33%), Gaps = 52/151 (34%)

Query: 464 ELDKSLIERIIDPLT--HLVRNSLDHGIETVEARRAAGKDAVGQLVLSAAHHGGNIVIEV 521
+++ ++++ + P+ LV N + HGI G+++L G + +EV
Sbjct: 245 QINPAIMDVQVPPMLVQTLVENGIKHGIA--------QLPQGGKILLKGTKDNGTVTLEV 296

Query: 522 SDDGAGLNRERILAKAAKQGMQISENISDDEVWNLIFAPGFSTAEVVTDVSGRGVGMDVV 581
+ G+ + G G+ V
Sbjct: 297 ENTGSLALKNTK--------------------------------------ESTGTGLQNV 318

Query: 582 KRNIQSMGG---HVEISSQAGRGTTTRIVLP 609
+ +Q + G +++S + G +++P
Sbjct: 319 RERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3856  HTHFIS        71     8e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 71.4 bits (175), Expect = 8e-18
Identities = 38/114 (33%), Positives = 58/114 (50%), Gaps = 2/114 (1%)

Query: 4 TILAIDDSATMRTLLSATLGEAGYDVTVASDGEVGLDVALATRFDLVLTDHHMPRKNGLE 63
TIL DD A +RT+L+ L AGYDV + S+ A DLV+TD MP +N +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 LIVALRRQLGYEATPILVLTTENGDAFKDAARAAGATGWIEKPIDPDALIELVA 117
L+ +++ P+LV++ +N A GA ++ KP D LI ++
Sbjct: 65 LLPRIKKA--RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
BURPS668_3857  OMPADOMAIN    40     1e-05    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 39.5 bits (92), Expect = 1e-05
Identities = 25/117 (21%), Positives = 51/117 (43%), Gaps = 9/117 (7%)

Query: 182 FAMSSDAVEPYMRDILREIGKTLNDV---PNRIIVQGHTDAVPYAGGEKGYSNWELSADR 238
F + ++P + L ++ L+++ ++V G+TD + G Y N LS R
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRI----GSDAY-NQGLSERR 277

Query: 239 ANASRRELIAGGMDEAKVLRV-LGLASTQNLNKADPLDPENRRISIIVLNRKSELAL 294
A + LI+ G+ K+ +G ++ N D + I + +R+ E+ +
Sbjct: 278 AQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334



 
Contact Sachin Pundhir for Bugs/Comments.