PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: NC_005945.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
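The E-value (RPSBlast) setting of 0.05 in the input parameters can be checked against the "Score = ..., Expect = ..." summary lines that accompany each alignment in the results. The following is a minimal sketch, not part of PredictBias: the names `parse_summary` and `passes_cutoff` are hypothetical, and it assumes a hit is kept when its Expect value is less than or equal to the cutoff.

```python
import re

# Matches BLAST summary lines such as "Score = 29.3 bits (65), Expect = 0.009".
SUMMARY_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*([\d.eE+-]+)"
)

# "E-value (RPSBlast)" value taken from the input parameters of this report.
E_VALUE_CUTOFF = 0.05


def parse_summary(line):
    """Return (bit_score, raw_score, e_value), or None if the line does not match."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2)), float(match.group(3))


def passes_cutoff(line, cutoff=E_VALUE_CUTOFF):
    """Assumption: a hit is reported when its E-value is <= the cutoff."""
    parsed = parse_summary(line)
    return parsed is not None and parsed[2] <= cutoff


if __name__ == "__main__":
    for text in (
        "Score = 29.3 bits (65), Expect = 0.009",
        "Score = 89.5 bits (222), Expect = 8e-23",
    ):
        print(parse_summary(text), passes_cutoff(text))
```

Scientific notation such as `8e-23` is handled by Python's built-in `float`, so the alignment summaries can be fed to the parser unchanged.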
 
C) Potential GIs and PAIs in NC_005945
S.No  Start    End       Bias  Virulence  Insertion elements  Prediction
1     BAS0249  BAS0269   Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS0249   2       32       3.046518   redox-sensing transcriptional repressor Rex
BAS0250   2       30       3.683514   lipoprotein
BAS0251   0       27       3.477022   CAAX amino terminal protease family protein
BAS0252   1       22       2.845708   co-chaperonin GroES
BAS0253   0       20       2.382290   molecular chaperone GroEL
BAS0254   -2      14       1.242290   GMP synthase
BAS0255   0       13       -0.038506  xanthine/uracil permease family protein
BAS0256   3       18       -1.380336  DNA-binding response regulator
BAS0257   2       18       -1.771418  sensor histidine kinase
BAS0259   2       19       -1.794675  hypothetical protein
BAS0260   2       19       -1.652715  hypothetical protein
BAS0261   3       20       -1.935091  hypothetical protein
BAS0262   3       21       -1.892718  hypothetical protein
BAS0264   3       23       -1.946131  hypothetical protein
BAS0266   3       23       -2.117623  glycoside hydrolase
BAS0269   2       21       -1.404692  undecaprenyl pyrophosphate phosphatase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0251   SSPAMPROTEIN  29     0.009    Salmonella surface presentation of antigen gene typ...
		>SSPAMPROTEIN#Salmonella surface presentation of antigen gene type M signature.

Length = 147

Score = 29.3 bits (65), Expect = 0.009
Identities = 14/30 (46%), Positives = 19/30 (63%)

Query: 17 LSSIAGLPLLLKTGLYDNRGFTREEKFQLI 46
+ IAGL LLL T +NR +REE + L+
Sbjct: 43 VEQIAGLKLLLDTLRAENRQLSREEIYALL 72


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0256   HTHFIS        90     8e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.5 bits (222), Expect = 8e-23
Identities = 38/122 (31%), Positives = 60/122 (49%), Gaps = 1/122 (0%)

Query: 1 MAHETILVVDDEKEIRNLITIYLKNEGYKVLQAGDGEEGLRLLEENEVHLVVLDIMMPKV 60
M TILV DD+ IR ++ L GY V + R + + LVV D++MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGIHMCMKIREE-KEMPIIMLSAKTQDMDKILGLTTGADDYVTKPFNPLELIARIKSQLR 119
+ + +I++ ++P++++SA+ M I GA DY+ KPF+ ELI I L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 RY 121

Sbjct: 121 EP 122


2     BAS0428  BAS0437   Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS0428   2       17       0.600030   hypothetical protein
BAS0429   2       14       0.770271   hypothetical protein
BAS0429a  4       15       0.477032   hypothetical protein
BAS0430   3       14       0.290735   prophage LambdaBa04 transactivating regulatory
BAS0431   3       15       -0.003155  hypothetical protein
BAS0432   2       19       -0.316744  hypothetical protein
BAS0433   1       19       -0.195262  hypothetical protein
BAS0434   2       23       -0.956537  hypothetical protein
BAS0435   2       25       -0.828363  hypothetical protein
BAS0436   4       23       -0.231914  hypothetical protein
BAS0436a  2       17       -0.224601  hypothetical protein
BAS0437   2       17       -0.104744  ArpU family phage transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0428   HTHFIS        27     0.028    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 27.1 bits (60), Expect = 0.028
Identities = 4/20 (20%), Positives = 9/20 (45%)

Query: 19 GISRRVLYMRMYRYGWELQE 38
G++R L ++ G +
Sbjct: 460 GLNRNTLRKKIRELGVSVYR 479


3     BAS0836  BAS0849a  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS0836   2       15       -0.823104  hypothetical protein
BAS0837   3       21       -0.502581  hypothetical protein
BAS0838   3       21       -0.475808  preprotein translocase subunit SecA
BAS0839   2       22       -0.594755  polysaccharide biosynthesis protein CsaA
BAS0840   3       22       -0.679352  CsaB protein
BAS0841   2       22       -0.814311  S-layer protein
BAS0842   0       17       -0.816560  S-layer protein
BAS0843   1       17       -0.636389  hypothetical protein
BAS0844   1       15       -0.466504  alginate O-acetyltransferase
BAS0845   1       13       -0.652111  alginate O-acetyltransferase
BAS0846   2       14       0.225020   hypothetical protein
BAS0847   0       13       0.972403   hypothetical protein
BAS0848   0       15       2.264860   enoyl-CoA hydratase
BAS0849   -1      14       2.512774   hypothetical protein
BAS0849a  -2      13       3.148127   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0838   SECA          900    0.0      SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 900 bits (2327), Expect = 0.0
Identities = 354/829 (42%), Positives = 506/829 (61%), Gaps = 52/829 (6%)

Query: 1 MLNSVKKLLGDSQKRKLKKYEQLVQEINNLEEKLSDLSDEELRHKTITFKDMLRDGKTVD 60
++ + K+ G R L++ ++V IN +E ++ LSDEEL+ KT F+ L G+ ++
Sbjct: 2 LIKLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGEVLE 61

Query: 61 DIKVEAFAVVREAAKRVLGLRHYDVQLIGGLVLLEGNIAEMPTGEGKTLVSSLPTYVRAL 120
++ EAFAVVREA+KRV G+RH+DVQL+GG+VL E IAEM TGEGKTL ++LP Y+ AL
Sbjct: 62 NLIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNAL 121

Query: 121 EGKGVHVITVNDYLAKRDKELIGQVHEFLGLKVGLNIPQIDPSEKKLAYEADITYGIGTE 180
GKGVHV+TVNDYLA+RD E + EFLGL VG+N+P + K+ AY ADITYG E
Sbjct: 122 TGKGVHVVTVNDYLAQRDAENNRPLFEFLGLTVGINLPGMPAPAKREAYAADITYGTNNE 181

Query: 181 FGFDYLRDNMAASKNEQVQRPYHFAIIDEIDSVLIDEAKTPLIIAGKKSSSSDLHYLCAK 240
+GFDYLRDNMA S E+VQR H+A++DE+DS+LIDEA+TPLII+G SS+++ K
Sbjct: 182 YGFDYLRDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYKRVNK 241

Query: 241 VIKS-----------FQDTLHYTYDAESKSASFTEDGITKIEDLFDI-------DNLYDL 282
+I FQ H++ D +S+ + TE G+ IE+L ++LY
Sbjct: 242 IIPHLIRQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYSP 301

Query: 283 EHQTLYHYMIQALRAHVAFQCDVDYIVHDEKILLVDIFTGRVMDGRSLSDGLHQALEAKE 342
+ L H++ ALRAH F DVDYIV D ++++VD TGR M GR SDGLHQA+EAKE
Sbjct: 302 ANIMLMHHVTAALRAHALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAKE 361

Query: 343 GLEITEENQTQASITIQNFFRMYPALSGMTGTAKTEEKEFNRVYNMEVMPIPTNRPIIRE 402
G++I ENQT ASIT QN+FR+Y L+GMTGTA TE EF+ +Y ++ + +PTNRP+IR+
Sbjct: 362 GVQIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMIRK 421

Query: 403 DKKDVVYVTADAKYKAVREDVLKHNKQGRPILIGTMSILQSETVARYLDEANITYQLLNA 462
D D+VY+T K +A+ ED+ + +G+P+L+GT+SI +SE V+ L +A I + +LNA
Sbjct: 422 DLPDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLNA 481

Query: 463 KSAEQEADLIATAGQKGQITIATNMAGRGTDILLG------------------------- 497
K EA ++A AG +TIATNMAGRGTDI+LG
Sbjct: 482 KFHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKADW 541

Query: 498 ----EGVHELGGLHVIGTERHESRRVDNQLKGRAGRQGDPGSSQFFLSLEDEMLKRFAQE 553
+ V E GGLH+IGTERHESRR+DNQL+GR+GRQGD GSS+F+LS+ED +++ FA +
Sbjct: 542 QVRHDAVLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMRIFASD 601

Query: 554 EVEKLTKSLKTDETGLILTSKVHDFVNRTQLICEGSHFSMREYNLKLDDVINDQRNVIYK 613
V + + L I V + Q E +F +R+ L+ DDV NDQR IY
Sbjct: 602 RVSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVANDQRRAIYS 661

Query: 614 LRNNLLQEDTNMIEIIIPMIDHAVEAISKQYLVEGMLPEEWDFASLTASLNEI--LSVEN 671
RN LL + +++ E I + + +A Y+ L E WD L L L +
Sbjct: 662 QRNELL-DVSDVSETINSIREDVFKATIDAYIPPQSLEEMWDIPGLQERLKNDFDLDLPI 720

Query: 672 MPSLSANNVHSPEDLQS-VLKETLSLYKERVNELDSNTDLQQSLRYVALHFLDQNWVNHL 730
L E L+ +L +++ +Y+ + + + ++ + V L LD W HL
Sbjct: 721 AEWLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEM-MRHFEKGVMLQTLDSLWKEHL 779

Query: 731 DAMTHLKEGIGLRQYQQEDPTRLYQKEALDIFLYTYGNFEKEMCRYVAR 779
AM +L++GI LR Y Q+DP + Y++E+ +F + + E+ +++
Sbjct: 780 AAMDYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSK 828


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0841   INTIMIN       52     1e-08    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 51.6 bits (123), Expect = 1e-08
Identities = 66/340 (19%), Positives = 116/340 (34%), Gaps = 32/340 (9%)

Query: 190 TVTKAEAAQFIAKTDKQFGTEAAKVESAKAVTTQKVEVKFSKAVEKLTKEDIKVT----- 244
T+T Q + + A SAKA T+ + + + + ++ V+
Sbjct: 545 TITVLSNGQVVDQV--GVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVS 602

Query: 245 -------NKANNDKVLVKEVTLSEDKKSATVELYSNLAAKQTYTVDVNKVGKTEVAVGSL 297
N AN + VTL DK V A+ T ++ N V + S+
Sbjct: 603 GTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAK--TAEMTSALNANAVIFVDQTKASI 660

Query: 298 EAKTIEMADQTVVADEPTALQFTVKDENGTEVVSPEGIEFVTPAAEKINAKGEITLAKGT 357
I+ T VA+ A+ +TVK G + VS + + F T K++ E T G
Sbjct: 661 --TEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTT-TLGKLSNSTEKTDTNGY 717

Query: 358 ST-TVKAVYKKDGKVVAESKEVKVSAEGAAVASISNWTVAEQNKADFTSKDFKQNNKVYE 416
+ T+ + V A +V V + V D + + +
Sbjct: 718 AKVTLTSTTPGKSLVSARVSDVAVDVKAPEV------EFFTTLTIDDGNIEIVGTGVKGK 771

Query: 417 GDNAYVQVELKDQFNAVTTG---KVEYESLNTEVAVVDKATGKVTVLSAGKAPVKVTVKD 473
++Q Q N +G K + S N +A VD ++G+VT+ G + V D
Sbjct: 772 LPTVWLQ---YGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSD 828

Query: 474 SKGKELVSKTVEIEAFAQKAMKEIKLEKTNVALSTKDVTD 513
++ T + + + N +
Sbjct: 829 NQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKLP 868


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0842   INTIMIN       35     0.002    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 34.7 bits (79), Expect = 0.002
Identities = 43/268 (16%), Positives = 77/268 (28%), Gaps = 34/268 (12%)

Query: 335 VKFVANNLDGSPANIFEGGEATSTTGKLAVGIK----QGDYKVEVQVTKRGGLTVSNTGI 390
+ AN+ S T L+ G V ++ K G + VS
Sbjct: 580 YTATVKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTA 639

Query: 391 ITVKNLDTPA-----------SAIKNVVFALDADNDGVVNYGSKLSGKDFALNSQNLVVG 439
L+ A + IK A+ + Y K+ D +++Q +
Sbjct: 640 EMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEV--- 696

Query: 440 EKASLNKLVATIAGEDKVVDPGSISIKSSNHGIISVVNNYITAEAAGEATLTIKVGDVTK 499
+ + + K+ +G V +T+ G++ ++ +V DV
Sbjct: 697 ------------TFTTTLGKLSNSTEKTDTNGYAKVT---LTSTTPGKSLVSARVSDVAV 741

Query: 500 DVKFKVTTDSRKLVSVKANPDKLQVVQNKTLPVTFVTTDQYGDPFGANTAAIKEVLPKTG 559
DVK L N + + LP ++ Q
Sbjct: 742 DVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPA 801

Query: 560 VVAEGGLDVVTTDSGSIGTKTIGVTGND 587
+ + T GT TI V +D
Sbjct: 802 IASVDASSGQVTLKEK-GTTTISVISSD 828


4     BAS0871  BAS0880   Y     N          N                   Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS0871   2       16       -0.065314  lipoprotein
BAS0872   3       18       -0.963021  tellurium resistance protein
BAS0873   3       17       -0.648902  hypothetical protein
BAS0874   3       27       1.917385   hypothetical protein
BAS0875   4       42       4.820677   hypothetical protein
BAS0876   5       44       5.203445   hypothetical protein
BAS0877   6       42       5.284429   merR family transcriptional regulator
BAS0878   4       39       6.313284   hypothetical protein
BAS0879   4       39       5.572815   DnaD domain-containing protein
BAS0880   2       27       2.947312   replicative DNA helicase
5     BAS0908  BAS0958   Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS0908   5       20       -5.118288  hypothetical protein
BAS0909   6       21       -6.364952  hypothetical protein
BAS0910   5       19       -6.495463  hypothetical protein
BAS0911   5       20       -7.126740  CAAX amino terminal protease family protein
BAS0912   -1      14       -3.134139  hypothetical protein
BAS0913   -1      14       -3.045808  hypothetical protein
BAS0914   -1      15       -2.555408  HD domain-containing protein
BAS0915   -1      14       -0.862212  hypothetical protein
BAS0914a  -1      14       -1.495787  hypothetical protein
BAS0916   -2      15       -1.253029  S-layer protein
BAS0917   2       19       -2.856579  hypothetical protein
BAS0918   4       20       -2.164285  hypothetical protein
BAS0920   3       21       -2.251510  hypothetical protein
BAS0921   4       20       -2.658396  lipoprotein
BAS0922   3       20       -2.122851  hypothetical protein
BAS0923   2       20       -1.566974  hypothetical protein
BAS0924   1       20       -1.642476  hypothetical protein
BAS0925   1       17       -2.589511  hypothetical protein
BAS0926   0       15       -2.890797  anti sigma b factor antagonist RsbV
BAS0927   -1      12       -3.204953  serine-protein kinase RsbW
BAS0928   0       12       -3.446585  RNA polymerase sigma factor SigB
BAS0929   0       13       -3.474759  hypothetical protein
BAS0930   -1      13       -3.632775  response regulator
BAS0931   -2      13       -3.313090  chemotaxis protein CheR
BAS0932   -2      14       -3.138376  sensor histidine kinase/response regulator
BAS0933   2       18       -1.310915  hypothetical protein
BAS0934   1       11       1.159036   hypothetical protein
BAS0935   2       14       0.862951   hypothetical protein
BAS0936   0       15       1.762941   hypothetical protein
BAS0937   -2      16       1.614237   hypothetical protein
BAS0938   -1      18       2.793646   hypothetical protein
BAS0939   1       20       3.397601   zinc-containing alcohol dehydrogenase
BAS0941   3       19       2.179263   hypothetical protein
BAS0940   6       19       2.383460   hypothetical protein
BAS0942   5       17       2.151820   hypothetical protein
BAS0943   5       15       1.390576   DNA repair exonuclease family protein
BAS0944   3       13       -0.155199  hypothetical protein
BAS0945   2       11       -0.225553  IS605 family transposase
BAS0946   2       12       0.031132   hypothetical protein
BAS0947   -1      11       -2.247763  3'-5' exoribonuclease
BAS0948   0       12       -1.751034  TetR family transcriptional regulator
BAS0949   -1      13       -1.098432  transporter
BAS0950   0       13       -0.309773  glyoxalase
BAS0951   0       13       -1.022382  glyoxylase
BAS0952   1       18       -1.943578  hypothetical protein
BAS0953   2       25       1.004070   alpha/beta hydrolase
BAS0954   0       20       0.849539   hypothetical protein
BAS0955   -1      16       1.994931   hypothetical protein
BAS0956   2       25       3.408798   DNA-binding protein
BAS0957   1       25       3.915748   hypothetical protein
BAS0958   1       23       3.279628   glycerol uptake operon antiterminator regulatory
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0918   ACRIFLAVINRP  28     0.027    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.3 bits (63), Expect = 0.027
Identities = 5/35 (14%), Positives = 13/35 (37%)

Query: 156 QGVSFLWSFLFETPFALMRGLAWLFIPAAIVMYLV 190
G+ + W+ + L + +V++L
Sbjct: 852 AGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLC 886


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0930   HTHFIS        83     2e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 83.0 bits (205), Expect = 2e-19
Identities = 34/125 (27%), Positives = 67/125 (53%), Gaps = 10/125 (8%)

Query: 2 SILIVDDNPVNIFVIKKILKQAGYQDLVSLNSAQELFEYIHFGKDSSRHNEIDLILLDIM 61
+IL+ DD+ V+ + L +AGY D+ ++A L+ +I + DL++ D++
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGY-DVRITSNAATLWRWI-------AAGDGDLVVTDVV 56

Query: 62 MPEIDGLEVCRRLQKEEKFKDIPIIFVTALEDANKLAEALDIGAMDYITKPINKVELLAR 121
MP+ + ++ R++K D+P++ ++A +A + GA DY+ KP + EL+
Sbjct: 57 MPDENAFDLLPRIKKARP--DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGI 114

Query: 122 MRVAL 126
+ AL
Sbjct: 115 IGRAL 119


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0932   HTHFIS        68     6e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 68.3 bits (167), Expect = 6e-14
Identities = 25/107 (23%), Positives = 50/107 (46%), Gaps = 3/107 (2%)

Query: 777 TIMIVDDDHRNIFALQNALKKQHANIITAQNGLECLEILKNNTNIDLILMDIMMPNMDGY 836
TI++ DDD L AL + ++ N + DL++ D++MP+ + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDG-DLVVTDVVMPDENAF 63

Query: 837 ETMEHIRMNLGLHEIPIIALTAKAMPNDKEKCLSAGASDYISKPLNL 883
+ + I+ ++P++ ++A+ K GA DY+ KP +L
Sbjct: 64 DLLPRIKK--ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDL 108


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0934   PF07132       29     0.010    Harpin protein (HrpN)
		>PF07132#Harpin protein (HrpN)

Length = 356

Score = 28.9 bits (64), Expect = 0.010
Identities = 22/79 (27%), Positives = 36/79 (45%), Gaps = 6/79 (7%)

Query: 51 SGEKVNSETAHKADIFSATGLVAGGVAGGLGGLLTGLGVLAVSGMGPIVAAGPIAAAIGG 110
G + ++ +DI + + + GGLGG L GLG G ++ G G
Sbjct: 40 FGGQRSNIAEQLSDIMTTMMFMGSMMGGGLGGGLGGLGSSLGGLGGGLLGGG------LG 93

Query: 111 AGIGGGAGSLIGAFIGLGI 129
G+G GS +G+ +G G+
Sbjct: 94 GGLGSSLGSGLGSALGGGL 112


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0947   MICOLLPTASE   31     0.006    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 31.2 bits (70), Expect = 0.006
Identities = 21/108 (19%), Positives = 40/108 (37%), Gaps = 7/108 (6%)

Query: 20 IKTATKGIASNGKPFLTVILQDPSGDIEAKLWDV-------SPEVEKQYVAETIVKVAGD 72
IK+ + I F +D G+I+A WD + +Y +V
Sbjct: 779 IKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDFGDGEKSNEAKATHKYNKTGEYEVKLT 838

Query: 73 ILNYKGRIQLRVKQIRVANENEVTDISDFVEKAPVKKEDMVEKITQYI 120
+ + G I K+I+V + V I++ +K + + K +
Sbjct: 839 VTDNNGGINTESKKIKVVEDKPVEVINESEPNNDFEKANQIAKSNMLV 886


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0948   HTHTETR       69     8e-17    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 68.9 bits (168), Expect = 8e-17
Identities = 21/166 (12%), Positives = 62/166 (37%), Gaps = 3/166 (1%)

Query: 1 MGRKISFNKERALNKAMHLFWEKGYDATYISDLIETMGISRSTLYDSFGDKDALFKLVLE 60
++ ++ L+ A+ LF ++G +T + ++ + G++R +Y F DK LF + E
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 61 QYKNYGSQKRNLLFSDT--NTKESLKSFFYQHIEKCYSDDIPKGCIITNSSLLIGQIDPS 118
++ + + + L+ +E +++ + + + +
Sbjct: 65 LSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMA 124

Query: 119 IEEILINDFN-ELEKAFKQVIEEGKKKGEISQEDDTELVAYSLLSL 163
+ + + E +Q ++ + + + T A +
Sbjct: 125 VVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGY 170


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0949   TCRTETB       33     0.002    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 33.3 bits (76), Expect = 0.002
Identities = 34/152 (22%), Positives = 63/152 (41%), Gaps = 3/152 (1%)

Query: 36 IAKDLNIASDLSGLLTTLTQIGYGLGLFFIVPMADLFKSKKIIGILIGLTIISLIGTLIS 95
IA D N + + T + + +G ++D K+++ I + + +
Sbjct: 40 IANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVG 99

Query: 96 TNGIVFLILTTVI-GIGACAAQMLVPLTM-RIVPIEEMGKYVGKVMSGLLIGIMIARPLS 153
+ LI+ I G GA A LV + + R +P E GK G + S + +G + +
Sbjct: 100 HSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIG 159

Query: 154 IGITEWFGWRMVFLFSLIILVAVLLLLIKFLP 185
I + W + L +I ++ V L+K L
Sbjct: 160 GMIAHYIHWSYLLLIPMITIITV-PFLMKLLK 190


6     BAS0971  BAS0997   Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS0971   2       22       1.676309   EmrB/QacA family drug resistance transporter
BAS0972   5       23       1.049033   hypothetical protein
BAS0973   7       24       1.321686   UvrD/REP helicase
BAS0974   6       26       -0.949453  peptidyl-prolyl isomerase
BAS09074a 3       17       -0.220413  hypothetical protein
BAS0975   3       18       0.303202   hypothetical protein
BAS0976   2       18       1.038996   transcriptional regulator Hpr
BAS0977   3       17       0.868041   hypothetical protein
BAS0978   3       18       0.882474   HIT family protein
BAS0979   2       20       1.428297   ABC transporter ATP-binding protein
BAS0980   1       19       0.571878   ABC transporter permease EscB
BAS0981   1       18       0.555161   EcsC protein
BAS0982   0       16       -0.718318  TetR family transcriptional regulator
BAS0983   1       17       -1.374862  hypothetical protein
BAS0984   -1      15       -2.798420  hypothetical protein
BAS0987   1       18       -3.792106  hypothetical protein
BAS0988   1       18       -2.603895  hypothetical protein
BAS0989   1       19       -1.225959  hypothetical protein
BAS0990   2       16       -0.743985  lipoprotein
BAS0993   0       15       1.308873   merR family transcriptional regulator
BAS0994   1       16       2.737613   hypothetical protein
BAS0995   1       14       2.878312   hypothetical protein
BAS0996   -1      14       3.504850   hypothetical protein
BAS0997   -2      12       3.086685   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0971   TCRTETB       138    5e-38    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 138 bits (348), Expect = 5e-38
Identities = 91/412 (22%), Positives = 191/412 (46%), Gaps = 13/412 (3%)

Query: 17 NVKRLPILISMIIGAFFTILNETLLNVAFPQLMIELNVTPSTLQWLSTGYMLVVAVLIPA 76
N++ ILI + I +FF++LNE +LNV+ P + + N P++ W++T +ML ++
Sbjct: 9 NLRHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAV 68

Query: 77 SALLVQWFTTRQVFIGAMVVFTFGTLVSAIA-PGFSILLMGRLLQAAGTGLMMPVLMNTI 135
L +++ + +++ FG+++ + FS+L+M R +Q AG ++M +
Sbjct: 69 YGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVV 128

Query: 136 LLLYPPEKRGAAMGSIGLVIMFAPAIGPTLSGIILETLNWRWLFYIVLPFAIFSIVFAFI 195
P E RG A G IG ++ +GP + G+I ++W +L ++P V +
Sbjct: 129 ARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL--LIPMITIITVPFLM 186

Query: 196 YLKNVSEPTKPKVDVLSILLSTIGFGGIVYGFSSSGEGWDSFQVYGIILIGLVALLFFVL 255
L K D+ I+L ++G + +S +++ +++ L FV
Sbjct: 187 KLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSY--------SISFLIVSVLSFLIFVK 238

Query: 256 RQLKLKEPLLDLSAFKYPMFTLTTILLTIMMMTMFSTMTLLPFLFQGALGLTVYATG-LI 314
K+ +P +D K F + + I+ T+ ++++P++ + L+ G +I
Sbjct: 239 HIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVI 298

Query: 315 MLPGSLLNGLLSPVSGKLFDKFGPRALIIPGTLLLASVMWFFTQVTADTSKITFILLHVT 374
+ PG++ + + G L D+ GP ++ G L SV + +T+ ++ V
Sbjct: 299 IFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL-SVSFLTASFLLETTSWFMTIIIVF 357

Query: 375 MMVSISMIMMPAQTNGLNQLPKRFYPHGTAILNTLSQVAGAVGVAFFISVMT 426
++ +S T + L ++ G ++LN S ++ G+A +++
Sbjct: 358 VLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0982   HTHTETR       76     2e-19    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 76.2 bits (187), Expect = 2e-19
Identities = 34/170 (20%), Positives = 67/170 (39%), Gaps = 3/170 (1%)

Query: 6 QTSQNIVEASFKLMAEHGIEKMSLSMIAKEVGISKPAIYYHFSSKEALVDFLFEEIFS-- 63
+T Q+I++ + +L ++ G+ SL IAK G+++ AIY+HF K L ++E S
Sbjct: 11 ETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNI 70

Query: 64 GYHFVSYFDKEQYTKENFVEKLIADGLHMLSEYEGQEGILRVINEFIVTAARNEKYQKRL 123
G + Y K + + +++ L E + ++ +I Q+
Sbjct: 71 GELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQ 130

Query: 124 FEIQEEFLNGFHELLKKGARLG-VVSQHATEENAHTLALVIDNMSNYMLM 172
+ E + + LK + + T A + I + L
Sbjct: 131 RNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180


7     BAS1020  BAS1034   Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS1020   2       11       -0.310613  hypothetical protein
BAS1021   2       12       -0.455196  S-layer protein
BAS1022   3       11       -0.334634  wall-associated protein
BAS1023   2       19       -3.320401  hypothetical protein
BAS1024   0       18       -0.975614  hypothetical protein
BAS1025   0       13       0.121915   wall-associated domain-containing protein
BAS1026   2       14       -0.306728  hypothetical protein
BAS1029   2       12       0.551162   hypothetical protein
BAS1030   2       11       0.461965   hypothetical protein
BAS1033   2       13       0.573729   HD domain-containing protein
BAS1034   2       13       -0.099480  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1022   PF03544       34     0.005    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 34.2 bits (78), Expect = 0.005
Identities = 16/62 (25%), Positives = 20/62 (32%)

Query: 11 IQLIVVALIVTSVPLNGLAETAPPFTPSPNSEQSPETEKKEEKELPAPHPDQSKKDKAKA 70
I + +VA P P P P E PE K+ + P P K K
Sbjct: 50 ISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVK 109

Query: 71 KA 72
K
Sbjct: 110 KV 111


8     BAS1051  BAS1065   Y     N          N                   Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS1051   2       12       2.258038   malate synthase
BAS1052   3       15       1.539811   isocitrate lyase
BAS1053   3       16       -1.177912  trifolitoxin immunity domain-containing protein
BAS1054   5       19       -1.033802  cold shock protein CspA
BAS1055   3       14       1.669415   hypothetical protein
BAS1056   3       15       2.686779   hypothetical protein
BAS1057   3       15       2.950394   competence transcription factor
BAS1058   3       16       3.331409   hypothetical protein
BAS1059   2       15       3.292587   signal peptidase I
BAS1060   3       16       3.537061   ATP-dependent nuclease subunit B
BAS1061   3       15       2.797795   ATP-dependent nuclease subunit A
BAS1062   3       24       0.377680   hypothetical protein
BAS1063   4       26       0.110334   spore germination protein GerPF
BAS1064   5       18       0.152158   spore germination protein GerPE
BAS1065   4       19       0.233437   spore germination protein GerPD
9     BAS1077  BAS1084   Y     N          N                   Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS1077   0       14       -4.404315  hypothetical protein
BAS1079   1       16       -3.989625  alpha-amylase
BAS1080   1       15       -3.359672  DNA-binding protein
BAS1081   2       16       -0.614609  nucleotide-binding protein
BAS1082   1       16       -0.064973  hypothetical protein
BAS1083   2       19       0.068565   hypothetical protein
BAS1084   2       16       1.483708   peptidyl-prolyl isomerase
10    BAS1121  BAS1143   Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS1121   4       22       3.636144   pseudouridine synthase
BAS1124   4       22       2.351628   bis(5'-nucleosyl)-tetraphosphatase
BAS1127   4       20       1.408692   glycosyl transferase
BAS1128   5       21       2.282960   hypothetical protein
BAS1129   6       24       2.443754   bacteriocin O-metyltransferase
BAS1130   5       25       2.749458   hypothetical protein
BAS1131   1       18       -0.814112  glycosyl transferase
BAS1132   2       19       -0.377737  hypothetical protein
BAS1133   2       21       0.137405   hypothetical protein
BAS1134   0       17       0.588289   streptomycin biosynthesis StrF domain-containing
BAS1135   -1      17       0.688311   glucose-1-phosphate thymidylyltransferase
BAS1136   -2      15       0.872458   dTDP-4-dehydrorhamnose 3,5-epimerase
BAS1137   -2      16       0.639537   dTDP-glucose 4,6-dehydratase
BAS1138   0       17       1.069504   dTDP-4-dehydrorhamnose reductase
BAS1139   4       18       1.722245   enoyl-ACP reductase
BAS1140   5       18       1.582572   hypothetical protein
BAS1141   2       17       0.618250   spore coat protein Z
BAS1142   1       14       0.474113   hypothetical protein
BAS1143   2       15       0.638302   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1137   NUCEPIMERASE  188    1e-59    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 188 bits (478), Expect = 1e-59
Identities = 75/332 (22%), Positives = 141/332 (42%), Gaps = 26/332 (7%)

Query: 1 MNILVTGGAGFIGSNFVHYMLQSYETYKIINFDALT--YSGNLNNVK-SIQDHPNYYFVK 57
M LVTG AGFIG + +L+ ++++ D L Y +L + + P + F K
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLE--AGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHK 58

Query: 58 GEIQNGELLEHVIKERDVQVIVNFAAESHVDRSIENPIPFYDTNVIGTVTLLELVKKYPH 117
++ + E + + + + V S+ENP + D+N+ G + +LE +
Sbjct: 59 IDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKI 118

Query: 118 IKLVQVSTDEVYGSLGKTGRFTEETPLA-PNSPYSSSKASADMIALAYYKTYQLPVIVTR 176
L+ S+ VYG L + F+ + + P S Y+++K + +++A Y Y LP R
Sbjct: 119 QHLLYASSSSVYG-LNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLR 177

Query: 177 CSNNYGPYQYPEKLIPLMVTNALEGKKLPLYGDGLNVRDWLHVTDHCSAIDVVLHKGRV- 235
YGP+ P+ + LEGK + +Y G RD+ ++ D AI +
Sbjct: 178 FFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHA 237

Query: 236 -----------------GEVYNIGGNNEKTNVEVVEQIITLLGKTKKDIEYVTDRLGHDR 278
VYNIG ++ ++ ++ + LG + + + G
Sbjct: 238 DTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGI-EAKKNMLPLQPGDVL 296

Query: 279 RYAINAEKMKNEFDWEPKYTFEQGLQETVQWY 310
+ + + + + P+ T + G++ V WY
Sbjct: 297 ETSADTKALYEVIGFTPETTVKDGVKNFVNWY 328


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1138   NUCEPIMERASE  44     4e-07    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 43.6 bits (103), Expect = 4e-07
Identities = 36/200 (18%), Positives = 70/200 (35%), Gaps = 38/200 (19%)

Query: 4 RVIITGANGQLGKQLQEEL--NPEE----------YDIYPFDKKL------------LDI 39
+ ++TGA G +G + + L + YD+ +L +D+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 40 TNISQVQQVVQEIRPHIIIHCAAYTKVDQAEKERDLAYV-INAIGARNVAVASQLVGAK- 97
+ + + + V + E AY N G N+ + +
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYS-LENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 98 LVYISTDYVFQGDRPEGYDEFHNPA-PINIYGASKYAGEQFVKELHNKYFIVRTSW---- 152
L+Y S+ V+ +R + + P+++Y A+K A E + Y + T
Sbjct: 121 LLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRFFT 180

Query: 153 LYGKYGN------NFVKTMI 166
+YG +G F K M+
Sbjct: 181 VYGPWGRPDMALFKFTKAML 200


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1139   DHBDHDRGNASE  57     7e-12    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 57.0 bits (137), Expect = 7e-12
Identities = 60/259 (23%), Positives = 105/259 (40%), Gaps = 19/259 (7%)

Query: 4 LQGKTFVVMGVANQRSIAWGIARSLHNAGAKLI-FTYAGERLERNVRELADTLEGQESLV 62
++GK + G A + I +AR+L + GA + Y E+LE+ V L E + +
Sbjct: 6 IEGKIAFITGAA--QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSL--KAEARHAEA 61

Query: 63 LPCDVTNDEELTACFETIKQEVGTIHGVAHCIAFANRDDLKGEFVDTSRDGFLLAQNISA 122
P DV + + I++E+G I + + G S + + ++++
Sbjct: 62 FPADVRDSAAIDEITARIEREMGPIDILVNVAGVLR----PGLIHSLSDEEWEATFSVNS 117

Query: 123 FSLTAVAREAKKVMT--EGGNILTLTYLGGERVVKNYNVMGVAKASLEASVKYLANDLGQ 180
+ +R K M G+I+T+ + +KA+ K L +L +
Sbjct: 118 TGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAE 177

Query: 181 HGIRVNAISAGPIRT-----LSAKGVGDFNSILREIEE---RAPLRRTTTQEEVGDTAVF 232
+ IR N +S G T L A G I +E PL++ ++ D +F
Sbjct: 178 YNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLF 237

Query: 233 LFSDLARGVTGENIHVDSG 251
L S A +T N+ VD G
Sbjct: 238 LVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1140   IGASERPTASE   30     0.009    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 29.6 bits (66), Expect = 0.009
Identities = 23/80 (28%), Positives = 33/80 (41%), Gaps = 6/80 (7%)

Query: 29 LELAAPKTKRIILTNFENEDRKEESNRNENVVSSAVEEVIEQEEQQQEQEQEQEE----- 83
+ A K TN E E+ + + V ++E+ + E E+ QE
Sbjct: 1069 AKEAKSNVKANTQTN-EVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTS 1127

Query: 84 QVEEKTEEEEQVQEQQEPVR 103
QV K E+ E VQ Q EP R
Sbjct: 1128 QVSPKQEQSETVQPQAEPAR 1147


11    BAS1172  BAS1191   Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS1172   0       19       3.214128   hypothetical protein
BAS1173   -1      19       3.450324   hypothetical protein
BAS1174   -1      20       3.442976   hypothetical protein
BAS1175   0       22       3.160256   hypothetical protein
BAS1176   2       26       3.169099   dihydrolipoamide succinyltransferase
BAS1177   0       22       2.410331   2-oxoglutarate dehydrogenase E1
BAS1178   -1      25       -3.389498  DNA-binding protein
BAS1179   2       19       -2.684567  hypothetical protein
BAS1180   3       18       -2.856579  hypothetical protein
BAS1181   5       19       -2.766974  hypothetical protein
BAS1182   2       17       -2.674509  hypothetical protein
BAS1183   2       18       -2.432152  hypothetical protein
BAS1184   3       18       -1.572231  hypothetical protein
BAS1185   -3      20       -0.933980  hypothetical protein
BAS1186   -1      18       -0.576755  hypothetical protein
BAS1187   -1      16       -1.078521  *hypothetical protein
BAS1188   2       16       -1.366214  hypothetical protein
BAS1189   1       15       -1.287684  hypothetical protein
BAS1190   2       12       -1.321887  D-alanyl-D-alanine carboxypeptidase
BAS1191   5       11       -1.702436  signal peptidase I
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1181   FIMBRIALPAPF  30     0.002    Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.

Length = 167

Score = 29.7 bits (66), Expect = 0.002
Identities = 33/127 (25%), Positives = 50/127 (39%), Gaps = 21/127 (16%)

Query: 6 IFFLLTCLLLVASTTYIICNKREQV--PPMLVWEGQEYYVTNEPAKAEEVGQRLGEVTKK 63
I LLT + ++A + N R V PP + GQ V E V GEVTK
Sbjct: 8 ISLLLTSVAVLAD---VQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGEVTKN 64

Query: 64 IETSEKPIKN--------------SESNIVQEKTEVFTM-IEEEKGPHSPLIIKEPDGEE 108
I S P K+ ++N++ F + + + KG +PL + G
Sbjct: 65 ISIS-CPYKSGSLWIKVTGNTMGVGQNNVLATNITHFGIALYQGKGMSTPLTLGNGSGNG 123

Query: 109 YRIVRAM 115
YR+ +
Sbjct: 124 YRVTAGL 130


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1183   HTHFIS        28     0.009    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 28.3 bits (63), Expect = 0.009
Identities = 10/36 (27%), Positives = 17/36 (47%), Gaps = 1/36 (2%)

Query: 69 EDRLKHLPEGSHQTVVIDVRGPDETG-EILKQIREE 103
+ + G VV DV PDE ++L +I++
Sbjct: 37 ATLWRWIAAGDGDLVVTDVVMPDENAFDLLPRIKKA 72


12    BAS1267  BAS1273   Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS1267   2       14       -1.114587  ribonucleoside-diphosphate reductase subunit
BAS1268   2       12       -2.766974  group I intron GIY-YIG endonuclease
BAS1269   2       15       -1.741333  ribonucleoside-diphosphate reductase subunit
BAS1270   2       14       -2.391200  ribonucleotide-diphosphate reductase subunit
BAS1271   0       15       -2.832634  GntR family transcriptional regulator
BAS1272   0       14       -3.064388  ABC transporter ATP-binding protein
BAS1273   1       13       -3.203337  ABC transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1272   PF05272       33     0.001    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.7 bits (74), Expect = 0.001
Identities = 15/56 (26%), Positives = 22/56 (39%)

Query: 12 KRYGLKAVIRELNIEITEGKIIGLVGDNGSGKTTLLKMIAGLQHPSEGSITIAGKK 67
K + V R + + L G G GK+TL+ + GL S+ I K
Sbjct: 578 KYILMGHVARVMEPGCKFDYSVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGK 633


13  BAS1459  BAS1464  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS1459   3       20       -3.526905  Holliday junction-specific endonuclease
BAS1460   6       21       -1.919828  hypothetical protein
BAS1461   7       21       -1.430075  hypothetical protein
BAS1462   6       19       -1.695851  hypothetical protein
BAS1463   3       19       0.460895   hypothetical protein
BAS1464   2       19       0.566360   hypothetical protein
14  BAS1487  BAS1502  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS1487   -3      16       -3.096138  hypothetical protein
BAS1488   -1      15       -3.229196  cation transporter
BAS1489   1       17       -4.778750  5'-3' exonuclease
BAS1490   0       19       -5.197529  acyltransferase
BAS1491   0       19       -5.069327  chain length determinant protein
BAS1494   0       19       -4.594883  capsular polysaccharide biosynthesis
BAS1495   3       14       -0.184681  polysaccharide biosynthesis protein
BAS1496   3       14       -0.178166  hypothetical protein
BAS1497   3       15       0.384414   glycosyl transferase
BAS1498   3       16       0.518829   group 1 glycosyl transferase
BAS1499   3       17       0.823971   hypothetical protein
BAS1502   3       19       1.136331   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1502   CHLAMIDIAOM6  39     4e-04    Chlamydia cysteine-rich outer membrane protein 6 si...
		>CHLAMIDIAOM6#Chlamydia cysteine-rich outer membrane protein 6 signature.

Length = 547

Score = 39.3 bits (91), Expect = 4e-04
Identities = 24/63 (38%), Positives = 37/63 (58%), Gaps = 7/63 (11%)

Query: 1130 SNTVSTQINLANVVIVKQVDLTIAD---VGQPITYTIALANPGNTPANNVVVTDILPPGT 1186
+ +V+T IN V QV + AD V +P+ Y I+++NPG+ +VVV D L PG
Sbjct: 305 TASVTTVINEPCV----QVSIAGADWSYVCKPVEYVISVSNPGDLVLRDVVVEDTLSPGV 360

Query: 1187 TLV 1189
T++
Sbjct: 361 TVL 363



Score = 37.4 bits (86), Expect = 0.002
Identities = 73/382 (19%), Positives = 135/382 (35%), Gaps = 101/382 (26%)

Query: 1956 YTVLLENIGNTTATNIIFTDPIPNHTVFIEDSVRVGGILLPGVNPANGIPIGDIIAGDFI 2015
Y + + N G TA N++ +P+P+ G +GD+ G+
Sbjct: 229 YKINIVNQGTATARNVVVENPVPD------------GYAHSSGQRVLTFTLGDMQPGE-- 274

Query: 2016 NITFRVQVVSIPNPIFTIGPGGPNSPVVNGASINYQFMTGPNLPLASRSTTSNPVSTQIN 2075
+ T V+ + N A+++Y G + AS +T N Q++
Sbjct: 275 HRTITVEFCPLKR-----------GRATNIATVSY---CGGHKNTASVTTVINEPCVQVS 320

Query: 2076 SGEIALVKSVDKTFVTIGDTLSYSISLSNPGNVTSQNIIFTDVLPEGITFISGTLTNDSG 2135
+ D ++V + Y IS+SNPG++ ++++ D L G+T +
Sbjct: 321 ------IAGADWSYVC--KPVEYVISVSNPGDLVLRDVVVEDTLSPGVTVLEA------- 365

Query: 2136 TQQIGNPATGIQIGNINPGSTATIVINALVTNIPSINPISNFSSVQFAHVVDPSQPSVSQ 2195
G A I N +V + +NP S+Q+ +V P
Sbjct: 366 -----------------AG--AQISCNKVVWTVKELNP---GESLQYKVLVRAQTPG--- 400

Query: 2196 TNLSNTVSTTIKSAILTTTKSADKSV------------------ISVGDTITYTTTITNT 2237
+N V S T T A+ + + VG+ Y +TN
Sbjct: 401 -QFTNNVVVKSCSDCGTCTSCAEATTYWKGVAATHMCVVDTCDPVCVGENTVYRICVTNR 459

Query: 2238 GNTAAANI----KFT-------SAIPANTTFIPNSVTINGVQQSGVQPAL--GVNIPNIA 2284
G+ N+ KF+ + P T N+V + + + G + + V + ++
Sbjct: 460 GSAEDTNVSLMLKFSKELQPVSFSGPTKGTITGNTVVFDSLPRLGSKETVEFSVTLKAVS 519

Query: 2285 PGETV-TVTFQVNVLSVPSSSS 2305
G+ + L+VP S +
Sbjct: 520 AGDARGEAILSSDTLTVPVSDT 541



Score = 35.8 bits (82), Expect = 0.005
Identities = 36/160 (22%), Positives = 62/160 (38%), Gaps = 30/160 (18%)

Query: 2884 IVYSVTITNSGNVNATNVIFTDVIPDGTSFEPNSFTLNGTIIENANIITGVPIGDIAPNE 2943
+VY + I N G A NV+ + +PDG + L T +GD+ P E
Sbjct: 227 VVYKINIVNQGTATARNVVVENPVPDGYAHSSGQRVLTFT------------LGDMQPGE 274

Query: 2944 SAI--VEFHITSNEIPAINPITNQASVSFQHIVNPANPPVSKNITSNSVTTTIESAILTT 3001
VEF TN A+VS+ + + SVTT I +
Sbjct: 275 HRTITVEFCPLKR-----GRATNIATVSY----------CGGHKNTASVTTVINEPCVQV 319

Query: 3002 TKIGDKAFATIGDTITYTTTITNIGNIPANNVIFSDPIPS 3041
+ I ++ + + Y +++N G++ +V+ D +
Sbjct: 320 S-IAGADWSYVCKPVEYVISVSNPGDLVLRDVVVEDTLSP 358



Score = 33.9 bits (77), Expect = 0.018
Identities = 35/180 (19%), Positives = 69/180 (38%), Gaps = 27/180 (15%)

Query: 4316 IAVKTTPIQYADLQTIIPYTISITNNGNIQVENIIVTDIIPANTNFIENSVIVNGNTRPN 4375
I VK + A L+ + Y I+I N G N++V + +P +G +
Sbjct: 211 ICVKQEGPENACLRCPVVYKINIVNQGTATARNVVVENPVP------------DGYAHSS 258

Query: 4376 DNPLSGIPIDNILPNTTATVLFQVRVTSIPQT-NPISNTSTIEYEYTVGDQPPITKTIIS 4434
+ + ++ P T+ V P +N +T+ Y + +T I
Sbjct: 259 GQRVLTFTLGDMQPGEHRTIT----VEFCPLKRGRATNIATVSYCGGHKNTASVTTVINE 314

Query: 4435 SAALTEINHANLNSNKAVDLAFAMVGDTLTYTITLNQTGNVAANDVIIQDMIPQGTTFIE 4494
I A+ ++ V + Y I+++ G++ DV+++D + G T +E
Sbjct: 315 PCVQVSIAGAD----------WSYVCKPVEYVISVSNPGDLVLRDVVVEDTLSPGVTVLE 364


15  BAS1525  BAS1555  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS1525   -2      12       -3.099476  hypothetical protein
BAS1526   -1      15       -1.974466  hypothetical protein
BAS1527   -2      15       -1.651732  hypothetical protein
BAS1528   -1      16       -1.996495  NAD(P)H-flavin oxidoreductase
BAS1529   -1      18       -2.445430  thiJ/pfpI family protein
BAS1530   -1      21       -2.046881  hypothetical protein
BAS1531   -1      22       -1.713788  hypothetical protein
BAS1532   2       19       -3.019499  hypothetical protein
BAS1533   1       20       -3.277178  permease
BAS1534   0       16       -2.807134  glyoxalase
BAS1535   -2      13       -2.396603  hypothetical protein
BAS1536   -2      14       -2.802135  hypothetical protein
BAS1537   -2      11       -1.524592  host factor-I protein
BAS1538   1       7        -1.199865  hypothetical protein
BAS1539   1       10       -1.227955  flagellar motor protein MotP
BAS1540   2       9        -1.597067  flagellar motor protein MotS
BAS1541   2       7        -1.947607  chemotaxis response regulator
BAS1544   3       10       -2.053472  flagellar motor switch protein
BAS1545   5       12       -3.374734  hypothetical protein
BAS1546   3       14       -3.074739  hypothetical protein
BAS1547   2       12       -3.528291  chemotaxis protein CheR
BAS1548   2       14       -3.534295  hypothetical protein
BAS1549   2       15       -3.098885  hypothetical protein
BAS1550   2       17       -2.983268  hypothetical protein
BAS1551   1       18       -2.014912  flagellar hook-associated protein FlgK
BAS1554   2       19       -2.638892  flagellar capping protein
BAS1555   2       19       -0.388287  flagellar protein FliS
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS1531   TCRTETA  50     9e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 49.8 bits (119), Expect = 9e-09
Identities = 49/261 (18%), Positives = 96/261 (36%), Gaps = 15/261 (5%)

Query: 56 WSGSIVDRLNKRSIMLITDIIRAALIGCIPLFDSIWAIYIFIFLTRIATSFFDPASFSYK 115
G++ DR +R ++L++ A + +W +YI + I + + +Y
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATG-AVAGAYI 120

Query: 116 TMLIRAEERAQFNAWSNFCTSGAFIIGPALAGILLTTHSAT---FVIYCNSLSFLLSTIF 172
+ +ERA+ + + C + GP L G++ N L+FL F
Sbjct: 121 ADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTG-CF 179

Query: 173 IYFLPNIALQTKQNEEVANTFVQTLRNDWKQVFSFARTETYIILIFVLFQATMLVAMALD 232
+ + + E N F +AR T + + +F LV
Sbjct: 180 LLPESHKGERRPLRREALNPLAS---------FRWARGMTVVAALMAVFFIMQLVGQVPA 230

Query: 233 SQEVVFTKQVLLLSNMEYSMLVSITGAAY-VFGSFLVSLFAKRLPIQYCIGFGMIFTAIG 291
+ V+F + + ++ G + + + + A RL + + GMI G
Sbjct: 231 ALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTG 290

Query: 292 YVIFAFSNSFIVAAGGFILLG 312
Y++ AF+ +A +LL
Sbjct: 291 YILLAFATRGWMAFPIMVLLA 311


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS1533   TCRTETA  47     9e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 46.7 bits (111), Expect = 9e-08
Identities = 56/292 (19%), Positives = 109/292 (37%), Gaps = 17/292 (5%)

Query: 59 LPQLLLSPFIGGVVDRFSKKNIMIFTDITRGILVLTYILASYK-IEIIFIANICLSVLSC 117
L Q +P +G + DRF ++ +++ + G V I+A+ + +++I I ++ ++
Sbjct: 54 LMQFACAPVLGALSDRFGRRPVLLVSLA--GAAVDYAIMATAPFLWVLYIGRI-VAGITG 110

Query: 118 LFEPAKQATLKNIVHENHFVTANSLSSTMNGFMSIMGASLGGIIAQ-SLHIEFAF--LVN 174
A + +I + S GF + G LGG++ S H F +N
Sbjct: 111 ATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALN 170

Query: 175 SLSYFISAYFIYSMCIPSHNTCNKKKAFLTDIKDGYTYILQTKIILTLILVGISWGLIGG 234
L++ + + ++ + + + ++ L+ V L+G
Sbjct: 171 GLNFLTGCFLLPESHKGERRPLRREA---LNPLASFRWARGMTVVAALMAVFFIMQLVGQ 227

Query: 235 AYQLLLTIYAEKIFH---TNIGILYTVQGA-GLMIGSLLVNLYISHNKEKIKKAFGWACF 290
L I+ E FH T IGI G + +++ + E+ G
Sbjct: 228 VPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIAD 287

Query: 291 LQGVFFLGFILSSQLIFGLTTLLCMRIAGGIIVPLDTTLLQTYTRENMIGKV 342
G L F + F + LL +GGI +P +L E G++
Sbjct: 288 GTGYILLAFATRGWMAFPIMVLL---ASGGIGMPALQAMLSRQVDEERQGQL 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
BAS1540   OMPADOMAIN  63     6e-14    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 63.4 bits (154), Expect = 6e-14
Identities = 30/127 (23%), Positives = 56/127 (44%), Gaps = 17/127 (13%)

Query: 110 SVVIVDNLIFDTGDANVKPEAKEIISQLVGFFQSVPNP---IVVEGHTDSRPIHNDKFPS 166
+ +++F+ A +KPE + + QL ++ +VV G+TD +D +
Sbjct: 214 HFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRI--GSDAY-- 269

Query: 167 NWELSSARAANMIHHLIEVYNVDDKRLAAVGYADTKPVVPN---------DSPQNWEKNR 217
N LS RA +++ +LI + +++A G ++ PV N +R
Sbjct: 270 NQGLSERRAQSVVDYLIS-KGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDR 328

Query: 218 RVVIYIK 224
RV I +K
Sbjct: 329 RVEIEVK 335


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS1541   HTHFIS  83     9e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.6 bits (204), Expect = 9e-22
Identities = 28/112 (25%), Positives = 46/112 (41%), Gaps = 2/112 (1%)

Query: 4 KILVVDDAMFMRTMIKNLLKSNSEFEVIGEAENGVEAIQKYKELQPDIVTLDITMPEMDG 63
ILV DD +RT++ L + + N + D+V D+ MP+ +
Sbjct: 5 TILVADDDAAIRTVLNQAL--SRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 64 LEALKEIIKIDASAKVVICSAMGQQGMVLDAIKGGAKDFIVKPFQADRVIEA 115
+ L I K V++ SA + A + GA D++ KPF +I
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGI 114


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1544   FLGMOTORFLIN  56     1e-11    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 55.7 bits (134), Expect = 1e-11
Identities = 23/71 (32%), Positives = 40/71 (56%)

Query: 473 DTSILQNVEMNVKFVFGSTVKTIQDILSLQENEAVVLDEDIDEPIRIYVNDVLVAYGELV 532
D ++ ++ + + G T TI+++L L + V LD EP+ I +N L+A GE+V
Sbjct: 53 DIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVV 112

Query: 533 NVDGFFGVKVT 543
V +GV++T
Sbjct: 113 VVADKYGVRIT 123


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS1545   IGASERPTASE  34     0.002    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.9 bits (77), Expect = 0.002
Identities = 18/126 (14%), Positives = 51/126 (40%), Gaps = 1/126 (0%)

Query: 301 EQKTEEDKKIEEPENEDKLENKLEDKKVTEKQEDSKVEISLPEEKTPVVQIPKKEEKVND 360
+ EE K+E + ++ + + E+ E + + E P V I + + + N
Sbjct: 1105 TVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNT 1164

Query: 361 LIKEPLKEKEKITYVIKEPLTDNKEVNKTKAQKDKDNNNQVISKKKEKKEEPEEKKEAKS 420
+ ++ + +++P+T++ VN + + N + + E K + +
Sbjct: 1165 TADTE-QPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRH 1223

Query: 421 EQGIQA 426
+ +++
Sbjct: 1224 RRSVRS 1229



Score = 29.3 bits (65), Expect = 0.045
Identities = 26/109 (23%), Positives = 44/109 (40%), Gaps = 7/109 (6%)

Query: 22 LQSKAEEQNVP-EQNINEV-NVQEENKEVQEQLEQVEMKQDKEEQQEAKNEQETEKKIET 79
+K + NV NEV E KE Q + + E++++AK ETEK E
Sbjct: 1067 EVAKEAKSNVKANTQTNEVAQSGSETKETQTT--ETKETATVEKEEKAK--VETEKTQEV 1122

Query: 80 DQGVITVNKPELKVGEEVLVTIEPKEKNVQSIKGILRLPKNGDQYEQER 128
+ V + P+ + E V EP +N ++ + + E+
Sbjct: 1123 PK-VTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQ 1170


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
BAS1551   FLGHOOKAP1  104    3e-26    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 104 bits (260), Expect = 3e-26
Identities = 72/249 (28%), Positives = 112/249 (44%), Gaps = 14/249 (5%)

Query: 4 SDYNTPLSGLLAAQMGLQTTKQNLSNIHTPGYVRQMVNYGSAGASQGYSPEQKIGYGVQT 63
S N +SGL AAQ L T N+S+ + GY RQ A ++ G +G GV
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLG--AGGWVGNGVYV 59

Query: 64 LGVDRITDEVKTKQFNDQLSQLSYYNYMNSTLSRVESMVGTTGKNSLSSLMDGFFNAFRE 123
GV R D T Q +Q S +S++++M+ T+ SL++ M FF + +
Sbjct: 60 SGVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTS-SLATQMQDFFTSLQT 118

Query: 124 VAKNPEQPNYYDTLISETGKFTSQVNRLAKSLDTAEAQTTEDIEAHVNEFNRLAGSLAEA 183
+ N E P LI ++ +Q + L + Q I A V++ N A +A
Sbjct: 119 LVSNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASL 178

Query: 184 NKKI----GQAGTQVPNQLLDERDRIITEMSKYANIEVS---YESMNPNIASVRMNGVLT 236
N +I G PN LLD+RD++++E+++ +EVS + N +A NG
Sbjct: 179 NDQISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMA----NGYSL 234

Query: 237 VNGQDTYPL 245
V G L
Sbjct: 235 VQGSTARQL 243



Score = 54.2 bits (130), Expect = 6e-10
Identities = 19/51 (37%), Positives = 35/51 (68%)

Query: 380 LLEGIQQEKMGIEGVNMEEEMVNLMAFQKYFVANSKAITTMNEVFDSLFSI 430
++ + ++ I GVN++EE NL FQ+Y++AN++ + T N +FD+L +I
Sbjct: 495 VVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


16  BAS1568  BAS1600  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS1568   0       13       -3.638951  flagellar hook protein FlgE
BAS1569   -1      15       -4.510633  hypothetical protein
BAS1573   0       17       -4.732841  hypothetical protein
BAS1574   3       21       -5.627670  glycosyl transferase
BAS1575   3       22       -5.749430  hypothetical protein
BAS1576   2       20       -5.228114  TPR/glycosyl transferase domain-containing
BAS1577   1       17       -3.478999  hypothetical protein
BAS1578   -1      12       -2.844015  glycosyl transferase
BAS1579   1       12       -2.334775  hypothetical protein
BAS1580   3       19       0.859247   hypothetical protein
BAS1581   2       17       0.956121   hypothetical protein
BAS1582   3       18       1.095177   flagellin
BAS1583   4       26       0.817585   Slt family transglycosylase
BAS1587   3       24       -0.383995  flagellar motor switch protein
BAS1588   4       20       -0.244038  hypothetical protein
BAS1589   4       17       -0.357123  flagellar biosynthesis protein FliP
BAS1590   2       14       -0.389823  flagellar biosynthesis protein FliQ
BAS1591   1       12       -0.282301  flagellar biosynthesis protein FliR
BAS1592   0       9        0.073013   flagellar biosynthesis protein FlhB
BAS1593   1       10       0.420791   flagellar biosynthesis protein FlhA
BAS1596   -1      10       0.173457   flagellar basal body rod protein FlgG
BAS1597   -1      11       -0.613730  alanyl-tRNA synthetase
BAS1598   1       11       -1.038365  hypothetical protein
BAS1599   3       14       -1.352832  AzlC family protein
BAS1600   0       13       -3.577159  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
BAS1568   FLGHOOKAP1  44     1e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 1e-06
Identities = 15/36 (41%), Positives = 24/36 (66%)

Query: 5 LYTSITGMNAAQNALSVTSNNIANAQTVGYKKQKAI 40
+ +++G+NAAQ AL+ SNNI++ GY +Q I
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI 39



Score = 37.6 bits (87), Expect = 9e-05
Identities = 10/39 (25%), Positives = 26/39 (66%)

Query: 397 SNVDLSVEFVDLMLYQRGFQGNAKVIKVSDEVLNEVVNL 435
S V+L E+ +L +Q+ + NA+V++ ++ + + ++N+
Sbjct: 507 SGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1576   SYCDCHAPRONE  41     2e-06    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 41.5 bits (97), Expect = 2e-06
Identities = 23/101 (22%), Positives = 34/101 (33%), Gaps = 11/101 (10%)

Query: 444 DNEQIQLALIREDIRQLINQGMISQAKYLISEYEKTFPITSEIYQMKGIVAFSENNYLDA 503
D ++ QLA+ + G IS T E + Y DA
Sbjct: 7 DTQEYQLAME-----SFLKGGGTIAMLNEISSD------TLEQLYSLAFNQYQSGKYEDA 55

Query: 504 ENFFKLALKLYHFDVDALFNLGYLYEVQEQYDRAVQNYNLA 544
F+ L H+D LG + QYD A+ +Y+
Sbjct: 56 HKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYG 96


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
BAS1582   FLAGELLIN  125    9e-35    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 125 bits (314), Expect = 9e-35
Identities = 76/282 (26%), Positives = 130/282 (46%), Gaps = 18/282 (6%)

Query: 1 MRINTNINSMRTQEYMRQNQDKMNVSMNRLSSGKRINSAADDAAGLAIATRMRARQSGLE 60
INTN S+ TQ + ++Q ++ ++ RLSSG RINSA DDAAG AIA R + GL
Sbjct: 2 QVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLT 61

Query: 61 KASQNTQDGMSLIRTAESAMNSVSNILTRMRDIAVQSSNGTNTAENQSALQKEFAELQEQ 120
+AS+N DG+S+ +T E A+N ++N L R+R+++VQ++NGTN+ + ++Q E + E+
Sbjct: 62 QASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEE 121

Query: 121 IDYIAKNTEFNDKNLLAGTGAVTIGSTSISGAEISIETLDSSATNQQITIKLANTTAEKL 180
ID ++ T+FN +L+ + I + G + ITI L + L
Sbjct: 122 IDRVSNQTQFNGVKVLSQDNQMKIQVGANDG--------------ETITIDLQKIDVKSL 167

Query: 181 GIDATTSN----ISISGAASALAAISALNTALNTVAGNRATLGATLNRLDRNVENLNNQA 236
G+D N ++ S+ ++ +T R + + D + ++
Sbjct: 168 GLDGFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKV 227

Query: 237 TNMASAASQIEDADMAKEMSEMTKFKILNEAGISMLSQANQT 278
A+ D ++ K + A
Sbjct: 228 YVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAI 269



Score = 86.3 bits (213), Expect = 4e-21
Identities = 62/259 (23%), Positives = 107/259 (41%), Gaps = 7/259 (2%)

Query: 36 INSAADDAAGLAIATRMRARQSGLEKASQNTQDGMSLIRTAESAMNSVSNILTRMRDIAV 95
+ AG A A + G ++ G++ ++ + + T + V
Sbjct: 249 LFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKV 308

Query: 96 QSSNGTNTAENQSALQKEFAELQEQID-YIAKNTEFNDKN------LLAGTGAVTIGSTS 148
+ TA + + + F+DK L + S
Sbjct: 309 TLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNAVKGES 368

Query: 149 ISGAEISIETLDSSATNQQITIKLANTTAEKLGIDATTSNISISGAASALAAISALNTAL 208
+ T +++ + K G+ + + + S ++++++AL
Sbjct: 369 KITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKSTANPLASIDSAL 428

Query: 209 NTVAGNRATLGATLNRLDRNVENLNNQATNMASAASQIEDADMAKEMSEMTKFKILNEAG 268
+ V R++LGA NR D + NL N TN+ SA S+IEDAD A E+S M+K +IL +AG
Sbjct: 429 SKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSNMSKAQILQQAG 488

Query: 269 ISMLSQANQTPQMVSKLLQ 287
S+L+QANQ PQ V LL+
Sbjct: 489 TSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS1583   PF06580  29     0.021    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.7 bits (64), Expect = 0.021
Identities = 8/42 (19%), Positives = 20/42 (47%), Gaps = 1/42 (2%)

Query: 122 LTKKY-NIQKIRSSNEGKYEDIIDRVSHTYGIPKTLIQKMIE 162
+ Y + I+ + ++E+ I+ +P L+Q ++E
Sbjct: 224 VVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVE 265


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1587   FLGMOTORFLIN  59     2e-14    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 58.8 bits (142), Expect = 2e-14
Identities = 22/94 (23%), Positives = 51/94 (54%)

Query: 13 LEDFAGKRNEASKAHIDTVSDISIELGVKLGKASITLGDVKQLKVGDVLEVEKNLGHKVD 72
+ G + ID + DI ++L V+LG+ +T+ ++ +L G V+ ++ G +D
Sbjct: 39 FQQLGGGDVSGAMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLD 98

Query: 73 VYLSNMKVGIGEAIVMDEKFGIIISEIEADKKQA 106
+ ++ + GE +V+ +K+G+ I++I ++
Sbjct: 99 ILINGYLIAQGEVVVVADKYGVRITDIITPSERM 132


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1589   FLGBIOSNFLIP  164    2e-52    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 164 bits (417), Expect = 2e-52
Identities = 75/239 (31%), Positives = 136/239 (56%), Gaps = 2/239 (0%)

Query: 14 FVFSIVFSIIFVNPAYAAQNGFINFENGKEFTSN--SSVQLFALVTLLSLSSSIVLLFTH 71
+ + + P AQ I + + VQ +T L+ +I+L+ T
Sbjct: 4 LLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLMMTS 63

Query: 72 FTYFMIVLGITRQGLGVMNLPPNQVLVGLALFLSLFTMQPVLGQLKSDVWDPMTKEKITV 131
FT +IV G+ R LG + PPNQVL+GLALFL+ F M PV+ ++ D + P ++EKI++
Sbjct: 64 FTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEKISM 123

Query: 132 SQAAETTAPIMKEYMSKHTYKHDLKMMLKVRGEELPKDLKDLSLFTLVPSFTLTQIQKGL 191
+A E A ++E+M + T + DL + ++ + + + + L+P++ ++++
Sbjct: 124 QEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELKTAF 183

Query: 192 LTGMFIYLAFVFIDLIISTLLMYLGMMMVPPMILSLPFKILIFVYLGGYTKIVDIMFKT 250
G I++ F+ IDL+I+++LM LGMMMVPP ++LPFK+++FV + G+ +V + ++
Sbjct: 184 QIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLAQS 242


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1590   TYPE3IMQPROT  42     1e-08    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 41.7 bits (98), Expect = 1e-08
Identities = 15/81 (18%), Positives = 35/81 (43%)

Query: 4 SPIIDIFQTFFYKGVMILMPVAGVSMIVVIIIAVIMAMMQIQEQTLTFLPKMASIVLVII 63
++ Y +++ V+ I+ +++ + + Q+QEQTL F K+ + L +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 ILGPWMFQELTTLILDLFDKI 84
+L W + L + +
Sbjct: 62 LLSGWYGEVLLSYGRQVIFLA 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1591   TYPE3IMRPROT  96     7e-26    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 96.0 bits (239), Expect = 7e-26
Identities = 51/233 (21%), Positives = 113/233 (48%), Gaps = 1/233 (0%)

Query: 10 FFAFCRITSFLYFLPFFSGRSIPAMAKVTFGLALSITVADQVDVSHIKTVWDVAA-YAGT 68
F+ R+ + + P S RS+P K+ + ++ +A + + + A A
Sbjct: 17 FWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDVPVFSFFALWLAVQ 76

Query: 69 QIVIGLSLSKIVEMLWNIPKMAGHILDFDIGLSQASLFDVNAGSQSTLLSTIFDIFFLII 128
QI+IG++L ++ + + AG I+ +GLS A+ D + +L+ I D+ L++
Sbjct: 77 QILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDMLALLL 136

Query: 129 FISLGGINYFVATILKSFQYTEAISKLLTTSFLDSLLATLLFAITSAVEIALPLMGSLFI 188
F++ G + ++ ++ +F + L ++ +L + + +ALPL+ L
Sbjct: 137 FLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFLNGLMLALPLITLLLT 196

Query: 189 INFVLILIAKNAPQLNVFMNAYVIKITCGILFIAMSVPMLGYVFKNMTDVLLE 241
+N L L+ + APQL++F+ + + +T GI +A +P++ +++ +
Sbjct: 197 LNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCEHLFSEIFN 249


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1592   TYPE3IMSPROT  289    2e-98    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 289 bits (742), Expect = 2e-98
Identities = 92/343 (26%), Positives = 186/343 (54%), Gaps = 2/343 (0%)

Query: 4 DNKTEKATPQKRKKSREEGNIARSKDLNNLFSILVLAVVVYFFGDWLGFEIANSVSVLFD 63
KTE+ TP+K + +R++G +A+SK++ + I+ L+ ++ D+ + + + +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAE 62

Query: 64 QIGKNTDS--TEYFYMMGILLLKVSAPILILVYAFHLFNYMIQVGFLFSSKVIKPKASRI 121
Q + + + + P+L + + ++++Q GFL S + IKP +I
Sbjct: 63 QSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIKKI 122

Query: 122 NPKNYFTRLFSRKSLVDILKSLFYMGLIGYVAYVLFKKNLEKIVSMIGFNWTASLTEIIR 181
NP R+FS KSLV+ LKS+ + L+ + +++ K NL ++ + + +
Sbjct: 123 NPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLGQ 182

Query: 182 QIKFIFLAILIILIVLSIIDFIYQKWEYEQDIKMKKEEVKQEHKDNEGDPQVKGKRKNFM 241
++ + + + +V+SI D+ ++ ++Y +++KM K+E+K+E+K+ EG P++K KR+ F
Sbjct: 183 ILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQFH 242

Query: 242 HAILQGTIAKKMDGATFIVNNPTHISVVLRYNKHVDAAPIVVAKGEDELALYIRTLAREQ 301
I + + + ++ +V NPTHI++ + Y + P+V K D +R +A E+
Sbjct: 243 QEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEEE 302

Query: 302 EIPMVENRPLARSLYYQVEEDETIPEDLYVAVIEVMRYLIQTN 344
+P+++ PLAR+LY+ D IP + A EV+R+L + N
Sbjct: 303 GVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQN 345


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
BAS1596   FLGHOOKAP1  28     0.033    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 28.4 bits (63), Expect = 0.033
Identities = 11/47 (23%), Positives = 24/47 (51%)

Query: 203 NGVGTVKNYMLENSNVDMTKEMADLMTDQRMISASQRVMTSFDKIYE 249
N V + N S V++ +E +L Q+ A+ +V+ + + I++
Sbjct: 494 NVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 540


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1597   DPTHRIATOXIN  28     0.039    Diphtheria toxin signature.
		>DPTHRIATOXIN#Diphtheria toxin signature.

Length = 567

Score = 27.8 bits (61), Expect = 0.039
Identities = 26/113 (23%), Positives = 49/113 (43%), Gaps = 16/113 (14%)

Query: 63 EQGEIVHYIKDGAQVKLGPVKLEINWERRHNLMRHHSLLHLIGAVVYEKYGALCTGNQIY 122
E V YI + Q K V+LEIN+E R + +YE C GN++
Sbjct: 174 EGSSSVEYINNWEQAKALSVELEINFETRGKRGQD---------AMYEYMAQACAGNRVR 224

Query: 123 PDKA------RIDFNELQELSSVEVEGIVKEVNKLIEQNKEISTRYMSREEAE 169
+D++ +++ + ++E + KE + + E + +S E+A+
Sbjct: 225 RSVGSSLSCINLDWDVIRDKTKTKIESL-KEHGPIKNKMSESPNKTVSEEKAK 276


17  BAS1701  BAS1713  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS1701   2       17       0.971137   sodium-dependent transporter
BAS1702   3       22       1.886797   polysaccharide deacetylase
BAS1703   2       22       3.044322   hypothetical protein
BAS1704   3       20       3.288182   hypothetical protein
BAS1705   0       20       4.598050   fibronectin-binding protein
BAS1707   -1      20       5.880053   dehydrogenase
BAS1708   -2      13       2.818251   hypothetical protein
BAS1709   -2      14       2.103756   hypothetical protein
BAS1711   -2      12       3.148574   peptide methionine sulfoxide reductase
BAS1712   -2      12       3.733982   short chain dehydrogenase
BAS1713   -2      12       3.191728   branched-chain amino acid aminotransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS1705   PF07299  334    e-120    Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 334 bits (859), Expect = e-120
Identities = 209/213 (98%), Positives = 212/213 (99%)

Query: 1 MEAFIRSDQYNFIKSQAYILANGHATANDRGVIQALKSLAIEKIIHVFENLTDEQKELID 60
MEAFIRSDQYNFIKSQAYILANGHATANDRGVIQALKSLAIEKIIHVFENLTDEQKELID
Sbjct: 7 MEAFIRSDQYNFIKSQAYILANGHATANDRGVIQALKSLAIEKIIHVFENLTDEQKELID 66

Query: 61 TVLTVQNREDAESFLTKINPYVIPFQEVTAQTLKKLFPKAKKLKLPDMEEMDMKEISYLS 120
TVLTVQNREDAESFL KINPYVIPFQEVTAQTLKKLFPKAKKLKLPDMEE+DMKE+SYLS
Sbjct: 67 TVLTVQNREDAESFLLKINPYVIPFQEVTAQTLKKLFPKAKKLKLPDMEELDMKELSYLS 126

Query: 121 WVDKGSSRKFIIAKNDKNKFVGLQGTFQSLNKKSICSLCHGHEEVGMFLVEIKGDIPGTF 180
W+DKGSSRKFIIAKNDKNKFVGLQGTFQSLNKKSICSLCHGHEEVGMFLVEIKGDIPGTF
Sbjct: 127 WIDKGSSRKFIIAKNDKNKFVGLQGTFQSLNKKSICSLCHGHEEVGMFLVEIKGDIPGTF 186

Query: 181 VKKGNYICKDGVACNQNMKSLDKLQDFIERLKK 213
VKKGNYICKDGVACNQNMKSLDKLQDFIERLKK
Sbjct: 187 VKKGNYICKDGVACNQNMKSLDKLQDFIERLKK 219


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1712   DHBDHDRGNASE  88     5e-23    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 88.2 bits (218), Expect = 5e-23
Identities = 68/263 (25%), Positives = 121/263 (46%), Gaps = 21/263 (7%)

Query: 2 LKGKVALVTGASRGIGRAIAKRLANDGALV-AIHYGNRKEEAEETVYEIQSNGGSAFSIG 60
++GK+A +TGA++GIG A+A+ LA+ GA + A+ Y K E + + ++ AF
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP-- 63

Query: 61 ANLESLHGVEALYSSLDNELQNRTGSTKFDILINNAGIGPGAFIEETTEQFFDRMVSVNA 120
A++ ++ + + ++ E+ DIL+N AG+ I +++ ++ SVN+
Sbjct: 64 ADVRDSAAIDEITARIEREMG------PIDILVNVAGVLRPGLIHSLSDEEWEATFSVNS 117

Query: 121 KAPFFIIQQALSRLRD--NSRIINISSAATRISLPDFIAYSMTKGAINTMTFTLAKQLGA 178
F + + D + I+ + S + AY+ +K A T L +L
Sbjct: 118 TGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAE 177

Query: 179 RGITVNAILPGFVKTDMNAELLSDP---------MMKQYATTISAFNRLGEVEDIADTAA 229
I N + PG +TDM L +D ++ + T I +L + DIAD
Sbjct: 178 YNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIP-LKKLAKPSDIADAVL 236

Query: 230 FLASPDSRWVTGQLIDVSGGSCL 252
FL S + +T + V GG+ L
Sbjct: 237 FLVSGQAGHITMHNLCVDGGATL 259


18  BAS1880  BAS1892  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS1880   7       20       3.366360   spermidine acetyltransferase
BAS1881   10      21       3.650139   hypothetical protein
BAS1882   8       19       2.556024   hypothetical protein
BAS1883   8       21       -0.297838  hypothetical protein
BAS1884   3       21       -0.655041  hypothetical protein
BAS1885   2       18       -0.827915  hypothetical protein
BAS1886   -2      17       -1.311947  hypothetical protein
BAS1887   -2      16       -0.971324  hypothetical protein
BAS1888   0       14       -0.621431  hypothetical protein
BAS1889   -1      11       -0.447832  adhesion lipoprotein
BAS1890   0       12       -0.364228  hypothetical protein
BAS1891   0       14       -0.076915  NADPH dehydrogenase NamA
BAS1892   2       18       -0.492314  methylated-DNA--protein-cysteine
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits      Score  E-value  Comments
BAS1889   adhesinb  214    4e-70    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 214 bits (547), Expect = 4e-70
Identities = 75/319 (23%), Positives = 137/319 (42%), Gaps = 20/319 (6%)

Query: 3 KRLTIFSFLLIFTLIFTGCSNTKEGNAKKDGKLTVYTTIFPLADFAKKIGGDYVTVEAIY 62
K+ LL+ + CS+ K KL V T +AD K I GD + + +I
Sbjct: 2 KKCRFLVLLLLAFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKNIAGDKINLHSIV 61

Query: 63 PPGADSHTFEPSQKQTVKVAKADLFVYNGAELE-----PFAEKMEKSLQKENVKIVNASK 117
P G D H +EP + K ++ADL YNG LE F + +E + +KEN S+
Sbjct: 62 PVGQDPHEYEPLPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENAKKKENKDYYAVSE 121

Query: 118 GIELRTSTEEEHHDHGDGHKEDEHHHDKDPHIWLDPTLAMKQAEKIKNALVALQPDHKQE 177
G+++ + DPH WL+ + A+ I L P +K+
Sbjct: 122 GVDVIYLEGQSEKGKE------------DPHAWLNLENGIIYAQNIAKRLSEKDPANKET 169

Query: 178 FEKNFAALQTKFTDLDDQFKAVVAN--AKTKDILVSHAAYGYWEQRYGLKQIAIAGISAS 235
+EKN A K + LD + K N + K I+ S + Y+ + Y + I I+
Sbjct: 170 YEKNLKAYVEKLSALDKEAKEKFNNIPGEKKMIVTSEGCFKYFSKAYNVPSAYIWEINTE 229

Query: 236 DEPSQKQLADITKTVKEHNLKYILFETFSTPKVASVIQKETGTKVLRLNHLATISEDDAK 295
+E + Q+ + + +++ + + E+ + + K+T + +++E +
Sbjct: 230 EEGTPDQIKTLVEKLRKTKVPSLFVESSVDDRPMKTVSKDTNIPIYAKIFTDSVAEKGEE 289

Query: 296 NNKDYFTLMEENVNTLKEA 314
+ Y+++M+ N+ + E
Sbjct: 290 GD-SYYSMMKYNLEKIAEG 307


19  BAS1936  BAS1950  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS1936   3       14       -1.297569  glycosyl transferase
BAS1937   2       15       -1.306432  hypothetical protein
BAS1938   3       15       -1.121538  GntR family transcriptional regulator
BAS1939   2       16       -2.181494  acetyltransferase
BAS1942   2       16       -2.379877  hypothetical protein
BAS1943   2       20       -2.330927  acetyltransferase
BAS1944   2       23       -3.061959  acetyltransferase
BAS1945   3       25       -3.837637  hypothetical protein
BAS1946   3       23       -5.969303  hypothetical protein
BAS1947   5       25       -6.426322  hypothetical protein
BAS1948   4       21       -4.143910  hypothetical protein
BAS1949   2       20       -4.104766  hypothetical protein
BAS1950   0       16       -3.810356  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1939   SACTRNSFRASE  37     6e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 37.2 bits (86), Expect = 6e-06
Identities = 25/103 (24%), Positives = 42/103 (40%), Gaps = 5/103 (4%)

Query: 27 SREEASSLFQKMKEENYKLFSLRNEENEVVSLAGVAICTNFYNEKHVFVYDLVTAEAHRS 86
E+ ++EE F E N + + I +N+ + + D+ A+ +R
Sbjct: 49 QYEDDDMDVSYVEEEGKAAFLYYLENNCI---GRIKIRSNW--NGYALIEDIAVAKDYRK 103

Query: 87 KGYGNVLLSYVEKWGKEKGCSSIVLTSAFPRIDAHRFYEREGF 129
KG G LL +W KE ++L + I A FY + F
Sbjct: 104 KGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS1943   PF05616  29     0.017    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 28.6 bits (63), Expect = 0.017
Identities = 16/42 (38%), Positives = 23/42 (54%)

Query: 43 YFSFSMQEYSVYKEKMQTRLKEEPLSNLIIENNGQVIGTVGF 84
+ SFS+Q S YKE+M + EE LS + N + I G+
Sbjct: 220 FISFSLQGNSKYKEEMDAKKLEEILSLKVDANPDKYIKATGY 261


20  BAS2035  BAS2057  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2035   2       12       -1.255316  DNA translocase FtsK
BAS2036   2       15       -1.100307  hypothetical protein
BAS2038   1       17       -1.522160  hypothetical protein
BAS2042   1       19       -0.764618  hypothetical protein
BAS2043   2       19       -0.834961  hypothetical protein
BAS2044   1       21       -2.915783  hypothetical protein
BAS2045   2       18       -2.488732  resolvase family site-specific recombinase
BAS2046   6       20       -4.016583  IS110 family transposase OrfA
BAS2047   3       19       -1.908723  IS110 family transposase OrfB
BAS2048   2       18       -2.985410  hypothetical protein
BAS2049   1       18       -2.586251  hypothetical protein
BAS2050   -1      18       -2.837896  hypothetical protein
BAS2051   -2      16       -3.166375  hypothetical protein
BAS2054   -2      15       -2.883582  hypothetical protein
BAS2055   -2      15       -3.562557  sodium/solute symporter family protein
BAS2056   0       17       -3.334954  DNA-binding response regulator
BAS2057   -1      16       -3.273669  sensor histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2038   SYCDCHAPRONE  29     0.019    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 29.1 bits (65), Expect = 0.019
Identities = 20/109 (18%), Positives = 39/109 (35%), Gaps = 6/109 (5%)

Query: 131 REIDKENNEAAYLLASANFRIGKYQEAVQNFEQALANNAKGIEPYKKDAMRDLAVSHMKM 190
EI + E Y LA ++ GKY++A + F+ ++ Y L M
Sbjct: 29 NEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCV-----LDHYDSRFFLGLGACRQAM 83

Query: 191 KEFEKAEDVIVKMSTKTNEDKAIVSYLKGQLSTATVQLEKAESFFKEAI 239
+++ A + ++ + + +L +AES A
Sbjct: 84 GQYDLAIHSYSYGAIMDIKE-PRFPFHAAECLLQKGELAEAESGLFLAQ 131


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2043   RTXTOXIND     39     2e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.4 bits (92), Expect = 2e-05
Identities = 27/175 (15%), Positives = 66/175 (37%), Gaps = 30/175 (17%)

Query: 5 LEIKVKPEQLEQIAKNISEMQTHSQNIQQNLN--QSMFSIQMQWQGATSQHFY----GEY 58
L + K + + I+ + S+ + L+ S+ + A ++H +Y
Sbjct: 207 LNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLH-----KQAIAKHAVLEQENKY 261

Query: 59 MRSMRLMESYIRNLQVTEKELRRIAQKFRQADEEYQKKQNEKLKEAHKK--EKKNEKSWW 116
+ ++ + Y L+ E E+ ++++ + ++ + +KL++ E +
Sbjct: 262 VEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELA-- 319

Query: 117 EKGIEGAAEFIGVNDAIRAVTGKDPITG--KELS--TKERLIAAGWTLLNFVPVG 167
E IRA P++ ++L T+ ++ TL+ VP
Sbjct: 320 ------KNEERQQASVIRA-----PVSVKVQQLKVHTEGGVVTTAETLMVIVPED 363


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2056   HTHFIS        103    6e-28    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 103 bits (259), Expect = 6e-28
Identities = 31/115 (26%), Positives = 62/115 (53%)

Query: 4 RILIVEDEEKIARVVQLELEFEGYESEIAKTGTEAMEKFGNGNWDLILLDVMLPNISGLE 63
IL+ +D+ I V+ L GY+ I G+ DL++ DV++P+ + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 VLRRIRLKNAVIPIILLTARDSVVDKVSGLDQGASDYITKPFQIEELLARIRACL 118
+L RI+ +P+++++A+++ + + ++GA DY+ KPF + EL+ I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119


21  BAS2145  BAS2180  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2145   3       16       -1.519026  L-lysine 2,3-aminomutase
BAS2146   2       15       -3.912807  hypothetical protein
BAS2147   0       17       -4.468067  hypothetical protein
BAS2148   0       16       -3.563786  hypothetical protein
BAS2149   -1      16       -3.368779  hypothetical protein
BAS2150   0       15       -1.987247  hypothetical protein
BAS2151   0       13       -1.639759  hypothetical protein
BAS2152   -1      13       -1.369294  protein kinase domain-containing protein
BAS2153   -3      14       -1.308823  sporulation-control protein Spo0M
BAS2154   -1      17       -1.063439  PAP2 family protein
BAS2155   1       15       -1.168954  cation efflux family protein
BAS2156   4       17       -3.364498  thioredoxin
BAS2157   4       18       -3.461857  hypothetical protein
BAS2158   4       17       -3.726206  hypothetical protein
BAS2159   3       16       -2.998777  hypothetical protein
BAS2160   3       15       -3.780689  S-layer protein
BAS2161   3       15       -2.000690  hypothetical protein
BAS2162   3       17       0.844137   Mrr restriction system protein
BAS2163   1       16       -0.259741  DNA-binding protein
BAS2164   2       16       -0.651531  hypothetical protein
BAS2165   3       15       -0.724983  hypothetical protein
BAS2166   2       18       -0.652643  DNA translocase FtsK
BAS2167   2       20       -2.286864  hypothetical protein
BAS2169   4       18       -2.284420  hypothetical protein
BAS2170   6       22       -1.183747  hypothetical protein
BAS2171   3       24       -0.827580  hypothetical protein
BAS2172   2       22       -1.298925  hypothetical protein
BAS2173   0       15       -2.408551  hypothetical protein
BAS2174   1       15       -2.603974  hypothetical protein
BAS2175   2       16       -3.375987  hypothetical protein
BAS2176   2       17       -3.944830  hypothetical protein
BAS2177   3       17       -3.744751  hypothetical protein
BAS2178   1       16       -4.230796  peptidyl-prolyl isomerase
BAS2179   0       22       -4.425348  hypothetical protein
BAS2180   1       19       -4.213830  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2177   TCRTETB       26     0.007    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 26.4 bits (58), Expect = 0.007
Identities = 11/30 (36%), Positives = 16/30 (53%), Gaps = 3/30 (10%)

Query: 18 VTFFGPYNEVITNVS---IINQLSTPKCQT 44
++FF NE++ NVS I N + P T
Sbjct: 22 LSFFSVLNEMVLNVSLPDIANDFNKPPAST 51


22  BAS2226  BAS2280  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2226   2       13       -2.364043  alpha/beta hydrolase
BAS2227   1       14       -2.001153  ABC transporter permease
BAS2230   3       15       -2.109943  hypothetical protein
BAS2231   2       14       -1.499462  zinc transporter family protein
BAS2232   3       16       -2.331394  hypothetical protein
BAS2233   1       15       -2.298878  hypothetical protein
BAS2234   1       15       -1.921754  lipoprotein
BAS2235   2       15       -2.111236  metallo-beta-lactamase family protein
BAS2236   3       16       -1.548523  inosine-uridine preferring nucleoside hydrolase
BAS2237   4       15       -3.563659  hypothetical protein
BAS2238   0       12       -1.885916  hypothetical protein
BAS2239   0       14       -1.803712  hypothetical protein
BAS2240   0       15       -2.304595  acetyltransferase
BAS2241   -1      15       -2.684922  hydrolase
BAS2242   -1      17       -3.563399  TetR family transcriptional regulator
BAS2243   -1      18       -3.065125  MmpL family membrane protein
BAS2244   0       19       -4.205627  hypothetical protein
BAS2245   1       18       -4.120986  chloramphenicol acetyltransferase
BAS2246   -1      17       -4.211265  acetyltransferase
BAS2247   0       17       -3.336400  acetyltransferase
BAS2248   -1      14       -2.747132  acetyltransferase
BAS2249   -1      13       -3.535295  hypothetical protein
BAS2250   1       14       -4.053080  DNA-binding protein
BAS2251   1       14       -3.820087  hypothetical protein
BAS2252   1       14       -2.684124  alpha/beta hydrolase
BAS2253   0       15       -3.559277  protoporphyrinogen oxidase
BAS2254   1       20       -4.996055  hypothetical protein
BAS2255   1       18       -4.581682  acetyltransferase
BAS2256   -1      15       -3.799838  hypothetical protein
BAS2257   -1      16       -3.404496  cold shock protein CspA
BAS2259   0       17       -4.234110  hypothetical protein
BAS2260   1       18       -4.209124  hypothetical protein
BAS2261   1       18       -3.053507  HAD superfamily hydrolase
BAS2262   0       18       -1.242494  hypothetical protein
BAS2264   0       18       -1.698271  hypothetical protein
BAS2265   0       16       -2.562318  hypothetical protein
BAS6001   0       15       -3.011744  hypothetical protein
BAS2267   0       16       -2.807823  LysR family transcriptional regulator
BAS2268   1       18       -2.868848  aspartate-semialdehyde dehydrogenase
BAS2269   2       18       -3.679301  hypothetical protein
BAS2270   -1      15       -3.001166  hypothetical protein
BAS2271   -2      13       -2.526625  LysR family transcriptional regulator
BAS2272   -2      13       -2.018844  hypothetical protein
BAS2273   -3      12       -1.662457  hypothetical protein
BAS2274   -3      11       -1.865135  hypothetical protein
BAS2275   -3      13       -1.994230  ABC transporter ATP-binding protein/permease
BAS2276   -1      13       -2.166173  ABC transporter ATP-binding protein/permease
BAS2276a  3       17       2.014009   hypothetical protein
BAS2277   4       19       2.107862   N-acetylmuramoyl-L-alanine amidase
BAS2278   4       20       1.926250   amino acid transporter LysE
BAS2279   3       18       2.498446   DNA-binding protein
BAS2280   3       18       2.543327   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2226   PF06057       31     0.005    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 31.3 bits (71), Expect = 0.005
Identities = 16/46 (34%), Positives = 24/46 (52%)

Query: 121 EDLLAMTDYISKRLGKEKAILIGHSYGTYIGMQAANKAPEKYEAYV 166
+D LA+ D G +K ILIG+S+G + N+ P +Y V
Sbjct: 101 QDTLAIIDKYQAEFGTQKVILIGYSFGAEVIPFVLNEMPARYRKNV 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2227   TCRTETA       33     0.002    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 33.3 bits (76), Expect = 0.002
Identities = 24/119 (20%), Positives = 45/119 (37%), Gaps = 8/119 (6%)

Query: 49 ATMTQIMIALPAL--IFF--LLVGTVVDRFDRQRICTVSNICCSLCNIGILISLYYGMII 104
AT I +A + ++ G V R +R + I I + + M
Sbjct: 245 ATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAF 304

Query: 105 LVFLFLFLENACIQFFSPSEQSMIQGVVESDQYGAAAGINQMVNSLYALFGVGIATMVY 163
+ + L A P+ Q+M+ V+ ++ G G + SL ++ G + T +Y
Sbjct: 305 PIMVLL----ASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIY 359



Score = 31.7 bits (72), Expect = 0.005
Identities = 28/198 (14%), Positives = 67/198 (33%), Gaps = 28/198 (14%)

Query: 173 LVNTLTFIMSGILIQTISIPEKVRLPNGRTKWKEVNLKMLITEFKEGIRYIYQNETLKKL 232
+N L F+ L+ K + L+ R+ + L
Sbjct: 168 ALNGLNFLTGCFLLPE------------SHKGERRPLRREALNPLASFRWARGMTVVAAL 215

Query: 233 LLGFIVFGLLNGILSVSTTYIL----KYKLAPATYESLAMVGGVVGGISLLIGSIVATSI 288
+ F + L+ + + +++ ++ T G++ ++ + + +
Sbjct: 216 MAVFFIMQLVGQVPA--ALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAM---ITGPV 270

Query: 289 GKKYAPKPIIVFGMAGSGIFFGMCYFVNYVWSFY---VCIAFATFFLPFINVAIMGWMYE 345
+ + ++ GM G + + F W + V +A +P A+ +
Sbjct: 271 AARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMP----ALQAMLSR 326

Query: 346 IVEESFMGRVQSLLSPLT 363
V+E G++Q L+ LT
Sbjct: 327 QVDEERQGQLQGSLAALT 344


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2242   HTHTETR       83     6e-22    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 83.1 bits (205), Expect = 6e-22
Identities = 35/164 (21%), Positives = 66/164 (40%), Gaps = 10/164 (6%)

Query: 20 KSTKETILEVATRLFLTQNYQVVSMDEVAKVCGVTKATVYYYFSTKADLFTATMIQMMIR 79
+ T++ IL+VA RLF Q S+ E+AK GVT+ +Y++F K+DLF+
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 80 IRENMSQILS-TNNTLEERLLNFAKVYLHATMDIDMKNFMKDAKLSLSEEQLKELKK--- 135
I E + + L L +T+ + + + + + E + E+
Sbjct: 70 IGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEI-IFHKCEFVGEMAVVQQ 128

Query: 136 ----AEDSMYEVLEKALDKAMQLGEIQKG-NPKFAAHAFVSLLS 174
Y+ +E+ L ++ + + AA +S
Sbjct: 129 AQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYIS 172


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2243   ACRIFLAVINRP  52     8e-09    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 51.8 bits (124), Expect = 8e-09
Identities = 37/232 (15%), Positives = 85/232 (36%), Gaps = 25/232 (10%)

Query: 203 LLVATVLLVLVLLILLYRSPILAILPLLVVGFAYGIISPTLGFLADHGWIKVDAQAISIM 262
L A +L+ LV+ + L ++ ++P + V + T LA G+ +I+ +
Sbjct: 344 LFEAIMLVFLVMYLFL-QNMRATLIPTIAVPVV---LLGTFAILAAFGY------SINTL 393

Query: 263 T----VLLFGAGTDYCLFLISRYREYLLEEESKYK-ALQLAIKASGGAIIMSALTVVLGL 317
T VL G D + ++ ++E++ K A + ++ GA++ A+ +
Sbjct: 394 TMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVF 453

Query: 318 GTLLL--AHYGAFHR-FAVPFSVAVFIMGIAALTILPAFLLIFGRTAFFPFIPRTTSMNE 374
+ GA +R F++ A+ + + AL + PA + P + +E
Sbjct: 454 IPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLK-------PVSAEHHE 506

Query: 375 ELARRKKKVVKVKKSKGAFSKKLGDVVVRRPWTIIMLTVFVLGGLASFVPRI 426
++ +++ ++ G+ R+
Sbjct: 507 NKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRL 558



Score = 38.3 bits (89), Expect = 1e-04
Identities = 28/161 (17%), Positives = 68/161 (42%), Gaps = 9/161 (5%)

Query: 203 LLVATVLLVLVLLILLYRSPILAILPLLVVGFAYGIISPTLGFLADHGWIKVDAQAISIM 262
L+ + ++V + L LY S + + +LVV I+ L + V +
Sbjct: 875 LVAISFVVVFLCLAALYESWSIPVSVMLVVPLG--IVGVLLAATLFNQKNDVYFM---VG 929

Query: 263 TVLLFGAGTDYCLFLISRYREYLLEE-ESKYKALQLAIKASGGAIIMSALTVVLGLGTLL 321
+ G + ++ ++ + +E + +A +A++ I+M++L +LG+ L
Sbjct: 930 LLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLA 989

Query: 322 LAH---YGAFHRFAVPFSVAVFIMGIAALTILPAFLLIFGR 359
+++ GA + + + + A+ +P F ++ R
Sbjct: 990 ISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRR 1030



Score = 31.7 bits (72), Expect = 0.012
Identities = 32/202 (15%), Positives = 69/202 (34%), Gaps = 21/202 (10%)

Query: 533 AGISNAEDQL--WIGGETASLYDTKQITERDEAVIIPVMISIIALLLLVYLRSIVAMIYL 590
A + N +L IG + + ++++ ++ + ++ L L S + +
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 591 IVTVVLSFFSALGAGWLLLHYGMGAPAIQGAIPLYAFVFLVALGEDYNIFMVSEIWKNRK 650
++ V L L A L + + G + + L I +V +
Sbjct: 901 MLVVPLGIVGVLLAAT-LFNQKNDVYFMVG------LLTTIGLSAKNAILIVEFAKDLME 953

Query: 651 TQNHLDAVKNGVIQTGSVITSAGLILAGTFAVLGTLPIQV------LVQFGIVTAI--GV 702
+ V + + L+ + F +LG LP+ + Q + + G+
Sbjct: 954 KEGK--GVVEATLMAVRMRLRPILMTSLAF-ILGVLPLAISNGAGSGAQNAVGIGVMGGM 1010

Query: 703 LLDTFIVRPLLVPAITVVLGRF 724
+ T + VP VV+ R
Sbjct: 1011 VSATLLAI-FFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2247   SACTRNSFRASE  41     1e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 41.1 bits (96), Expect = 1e-06
Identities = 26/98 (26%), Positives = 42/98 (42%), Gaps = 6/98 (6%)

Query: 49 YSSVEMMRYSIEELDS--YKVIMDEKIIGGIIVTISGKSYGRIDRIFVEPVYQGKGIGSN 106
Y +M +EE + ++ IG I + + Y I+ I V Y+ KG+G+
Sbjct: 50 YEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTA 109

Query: 107 VIKL-IE--AEYPSIRIWDLETSSRQINNHHFYKKMGY 141
++ IE E + LET I+ HFY K +
Sbjct: 110 LLHKAIEWAKENHFCGLM-LETQDINISACHFYAKHHF 146


23  BAS2290  BAS2363  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2290   2       18       -0.552209  PTS system cellobiose-specific transporter
BAS2291   1       18       -1.375185  PTS system cellobiose-specific transporter
BAS2292   1       16       -1.265967  hypothetical protein
BAS2293   0       17       -1.229357  anhydro-N-acetylmuramic acid kinase
BAS2294   0       16       -2.507026  hypothetical protein
BAS2295   0       15       -3.627453  glycerol-3-phosphate acyltransferase PlsY
BAS2296   0       16       -4.092484  acetyltransferase
BAS2297   -2      14       -2.904242  threonine dehydratase
BAS2297a  0       13       -3.731161  hypothetical protein
BAS2298   -2      13       -2.766974  metallo-beta-lactamase family protein
BAS2299   -2      12       -2.216659  hypothetical protein
BAS2300   -2      12       -1.504702  hypothetical protein
BAS2301   -2      12       -3.129293  DEAD/DEAH box helicase
BAS2302   0       15       -4.241133  hypothetical protein
BAS2303   0       15       -4.214850  hypothetical protein
BAS2304   -1      16       -4.515340  TetR family transcriptional regulator
BAS2305   0       15       -4.250937  ABC transporter permease
BAS2306   -1      13       -3.737268  ABC transporter permease
BAS2307   -1      11       -1.982174  ABC transporter ATP-binding protein
BAS2308   -1      12       -1.602490  hypothetical protein
BAS2309   -1      11       -1.692371  hypothetical protein
BAS2310   -1      12       -2.029126  hypothetical protein
BAS2311   -2      11       -2.377395  indolepyruvate decarboxylase
BAS2312   0       16       -3.100678  marR family transcriptional regulator
BAS2313   2       17       -3.770738  phosphoglyceromutase
BAS2314   -1      15       -3.951732  hypothetical protein
BAS2315   0       19       -4.239108  hypothetical protein
BAS2316   0       18       -4.321700  hypothetical protein
BAS2317   0       18       -4.415033  hypothetical protein
BAS2318a  0       16       -4.049025  hypothetical protein
BAS2319   -1      13       -2.800559  aminoacyl-histidine dipeptidase
BAS2320   0       15       -3.079413  hypothetical protein
BAS2323   0       12       -2.289267  ECF subfamily RNA polymerase sigma factor
BAS2324   1       12       -1.249689  hypothetical protein
BAS2325   0       12       -0.172527  penicillin-binding protein
BAS2328   0       13       1.626189   beta-lactamase
BAS2329   -1      13       1.992702   PAP2 family protein
BAS2330   0       12       2.721543   transcriptional regulator
BAS2331   -1      12       3.033583   alpha-ketoglutarate permease
BAS2332   0       12       3.265312   oxidoreductase, NAD-binding
BAS2333   0       14       3.431969   iolC protein
BAS2334   1       14       2.896547   methylmalonic acid semialdehyde dehydrogenase
BAS2335   1       15       0.241736   iolD protein
BAS2337   1       17       -1.324232  fructose-bisphosphate aldolase
BAS2338   2       17       -2.841545  iolB protein
BAS2339   1       20       -4.139971  hypothetical protein
BAS2345   1       19       -4.291796  lipoprotein
BAS2346   2       20       -3.075140  hypothetical protein
BAS2347   2       20       -1.881234  DNA-binding protein
BAS2348   2       18       -1.843434  hypothetical protein
BAS2349   3       18       -2.123473  D-alanyl-D-alanine carboxypeptidase
BAS2350   2       19       -3.156837  hypothetical protein
BAS2351   3       20       -3.210040  N-acetylmuramoyl-L-alanine amidase
BAS2354   4       20       -3.540603  TetR family transcriptional regulator
BAS2355   3       19       -3.057494  ABC transporter ATP-binding protein
BAS2356   4       20       -2.905095  hypothetical protein
BAS2357   4       20       -2.690381  sensory box/GGDEF family protein
BAS2358   6       22       -1.900851  acetyltransferase
BAS2359   4       19       -1.434592  spore coat protein
BAS2360   2       17       -1.811626  hypothetical protein
BAS2361   0       19       -2.181356  metallo-beta-lactamase/rhodanese-like
BAS2362   -1      22       -3.055991  hypothetical protein
BAS2363   -1      21       -3.284572  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2294   PRPHPHLPASEC  28     0.048    Prokaryotic zinc-dependent phospholipase C signature.
		>PRPHPHLPASEC#Prokaryotic zinc-dependent phospholipase C signature.

Length = 398

Score = 28.1 bits (62), Expect = 0.048
Identities = 13/72 (18%), Positives = 26/72 (36%), Gaps = 11/72 (15%)

Query: 112 GILTIGGTGAICLGRKGEVYEYSGGW-GHILGDEGSGYWIALQGLKRMANQFDQGVTLCP 170
++ ++ G +VY W G I G G+ I QG+ + N +
Sbjct: 8 ALICATLATSLWAGASTKVY----AWDGKIDG-TGTHAMIVTQGVSILENDLSKNEP--- 59

Query: 171 LSLRIQDEFQLL 182
++ ++L
Sbjct: 60 --ESVRKNLEIL 69


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2295   ACRIFLAVINRP  28     0.027    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 27.9 bits (62), Expect = 0.027
Identities = 15/62 (24%), Positives = 28/62 (45%), Gaps = 6/62 (9%)

Query: 84 VMLTLLAVIMGHIYPMLFKGKGGKGIS-----TFIGGLIAFDYLIALTLVAVFIIFYLIF 138
+++T LA I+G + P+ G G +GG+++ L + F++ F
Sbjct: 974 ILMTSLAFILGVL-PLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRCF 1032

Query: 139 KG 140
KG
Sbjct: 1033 KG 1034


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2296   AUTOINDCRSYN  29     0.044    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 28.7 bits (64), Expect = 0.044
Identities = 10/52 (19%), Positives = 23/52 (44%), Gaps = 1/52 (1%)

Query: 15 ESIHKLNYKTFVEEIPQHEETKDRVRIDRFHEENT-YLICLDDDKLVGMVAL 65
+ L +TF + + + D + D++ NT YL + D+ ++ +
Sbjct: 18 GELFTLRKETFKDRLNWAVQCTDGMEFDQYDNNNTTYLFGIKDNTVICSLRF 69


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2301   TONBPROTEIN   30     0.013    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 30.3 bits (68), Expect = 0.013
Identities = 20/113 (17%), Positives = 39/113 (34%), Gaps = 6/113 (5%)

Query: 338 AGGSGLAITFVAAKDEKH------LEEIEKTLGAPIQREIIEQPKIKRVDENGKPLPKPA 391
A +++T V D + E + + V E KP PKP
Sbjct: 40 APAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPK 99

Query: 392 PKKSGEYRQRDSREGSRSGSKGRTRNDSRNSSRNENNRSFNKPSNKKGSTKQG 444
PK + +++ R+ S+ + ++ +R ++ + S S G
Sbjct: 100 PKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATAATSKPVTSVASG 152


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2302   BACTRLTOXIN   28     0.005    Bacterial toxin signature.
		>BACTRLTOXIN#Bacterial toxin signature.

Length = 266

Score = 27.6 bits (61), Expect = 0.005
Identities = 8/23 (34%), Positives = 13/23 (56%)

Query: 31 KINWYNDMKTSFANKELADLVKG 53
K+ Y+ +KT N++LA K
Sbjct: 84 KLKNYDKVKTELLNEDLAKKYKD 106


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2304   HTHTETR       72     8e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 72.4 bits (177), Expect = 8e-18
Identities = 30/174 (17%), Positives = 72/174 (41%), Gaps = 13/174 (7%)

Query: 8 EERRKEILETAERLFLTKGYTKTTVNDILKEIGIAKGTFYHYFKSKEEVMDEIIMRIIKE 67
+E R+ IL+ A RLF +G + T++ +I K G+ +G Y +FK K ++ E I + +
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE-IWELSES 68

Query: 68 DVAKAKVIVSNPNIPVLEKLFRVLME---QSPKSGDIKDKMIE-QFHQPNNA---EMYQK 120
++ + ++ + R ++ +S + + + ++E FH+ + Q+
Sbjct: 69 NIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQ 128

Query: 121 SLVQSIIHLSPVLTEILEQGIEEGIFSTSY-PQETIELLLSSAQVIFDEGLFQW 173
+ + + + L+ IE + + ++ + W
Sbjct: 129 AQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGY----ISGLMENW 178


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2305   TCRTETA       39     3e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.0 bits (91), Expect = 3e-05
Identities = 58/342 (16%), Positives = 125/342 (36%), Gaps = 36/342 (10%)

Query: 49 IFAGLYAITSIPFLLAPLGGAIADRFNRRNLMVIFDFINTAIVLSFIVLLFTGSVSILLI 108
I LYA+ F AP+ GA++DRF RR +++ A+ + ++ + +L I
Sbjct: 47 ILLALYALMQ--FACAPVLGALSDRFGRR-PVLLVSLAGAAV--DYAIMATAPFLWVLYI 101

Query: 109 GTIMFLLAIVNAMYAPVVMASIPQLVPEKKLEQANGIVNGVQALSNIVAPVLGGILYGII 168
G I +A + V A I + + + G ++ + PVLGG++ G
Sbjct: 102 GRI---VAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLM-GGF 157

Query: 169 GLKMLVIISCLAFFLSAILEMFITIPFIKRVQESHIIPTIVKDMKGGFIYVLKQPFILKS 228
+ L+ + F+ +P + + + + +
Sbjct: 158 SPHAPFFAAAALNGLNFLTGCFL-LPESHKGERRPLRREALNPLASFRWARGMTVVAA-L 215

Query: 229 MLLAALLNLILTPLFVVGAPIIIRVTMESSH-TLYGIGMGLIDFATIIGALSMVFFAKKL 287
M + ++ L+ V A + + + H IG+ L F I+ +L+ +
Sbjct: 216 MAVFFIMQLVGQ----VPAALWVIFGEDRFHWDATTIGISLAAFG-ILHSLAQAMITGPV 270

Query: 288 QMQTLYYWMILIALLVIPMALSVTPFILNLGY------YPPFILFILSSILIAMIMTVVS 341
+ L++ M T +IL +P +L I + + ++S
Sbjct: 271 AA-----RLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLS 325

Query: 342 IYVITVVQKKTPNENLGKVMAIITAVSQCMAPIGQVIYGFMF 383
V E G++ + A++ + +G +++ ++
Sbjct: 326 RQV--------DEERQGQLQGSLAALTSLTSIVGPLLFTAIY 359



Score = 29.0 bits (65), Expect = 0.033
Identities = 16/79 (20%), Positives = 34/79 (43%), Gaps = 3/79 (3%)

Query: 88 TAIVLSFIVLLFTGSVSILLIGTIMFLLAIVNAMYAPVVMASIPQLVPEKKLEQANGIVN 147
A +I+L F + ++ + P + A + + V E++ Q G +
Sbjct: 285 IADGTGYILLAFATRGWMAFPIMVLLASG---GIGMPALQAMLSRQVDEERQGQLQGSLA 341

Query: 148 GVQALSNIVAPVLGGILYG 166
+ +L++IV P+L +Y
Sbjct: 342 ALTSLTSIVGPLLFTAIYA 360


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2320   SECA          27     0.035    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 27.1 bits (60), Expect = 0.035
Identities = 9/24 (37%), Positives = 14/24 (58%)

Query: 33 ESQPTQKESRFLDTWRWQNYFLLH 56
E Q E++ L + +QNYF L+
Sbjct: 361 EGVQIQNENQTLASITFQNYFRLY 384


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2328   BLACTAMASEA   386    e-138    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 386 bits (992), Expect = e-138
Identities = 99/262 (37%), Positives = 147/262 (56%), Gaps = 3/262 (1%)

Query: 39 HKNQATHKEFSQLEKKFDARLGVYAIDTGTNQTI-AYRPNERFAFASTYKALAAGVLLQQ 97
H + ++ E + R+G+ +D + +T+ A+R +ERF ST+K + G +L +
Sbjct: 20 HASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLAR 79

Query: 98 NSTKK--LDEVITYTKEDLVDYSPVTEKHVDTGMTLGEIAEAAVRYSDNTAGNILFHKIG 155
L+ I Y ++DLVDYSPV+EKH+ GMT+GE+ AA+ SDN+A N+L +G
Sbjct: 80 VDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCAAAITMSDNSAANLLLATVG 139

Query: 156 GPKGYEKALRQMGDRVTMSDRFETELNEAIPGDIRDTSTAKAIATNLKAFTAGNALPNHK 215
GP G LRQ+GD VT DR+ETELNEA+PGD RDT+T ++A L+ L
Sbjct: 140 GPAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPASMAATLRKLLTSQRLSARS 199

Query: 216 RNILTKWMKGNATGDKLIRAGVPTNWVVADKSGAGSYGTRNDIAIVWPPNRAPIIIAILS 275
+ L +WM + LIR+ +P W +ADK+GAG G R +A++ P N+A I+ I
Sbjct: 200 QRQLLQWMVDDRVAGPLIRSVLPAGWFIADKTGAGERGARGIVALLGPNNKAERIVVIYL 259

Query: 276 SKDEKGATYDNQLIAEAAEVIV 297
NQ IA ++
Sbjct: 260 RDTPASMAERNQQIAGIGAALI 281


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2331   TCRTETB       40     2e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 39.9 bits (93), Expect = 2e-05
Identities = 72/361 (19%), Positives = 140/361 (38%), Gaps = 58/361 (16%)

Query: 60 TIYGITVSASSWFSGVFVQMWGPRKVMTFGLVSFILGS-IGFIGIGIQHMNYPVILICYA 118
T + +T S + G G ++++ FG++ GS IGF+G H + ++++
Sbjct: 56 TAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVG----HSFFSLLIMARF 111

Query: 119 LRGFGYPLFAYSFLVWVSYSTPQQ-------------------------MLSRAVGWFWF 153
++G G F +V V+ P++ M++ + W +
Sbjct: 112 IQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYL 171

Query: 154 VFQLGLSVIGAFYSSYMVPKIGEI--------ATLWSALIFVVVGGLFSIVVNKDKFKAQ 205
+ +++I + ++ K I L S I + LF+ +
Sbjct: 172 LLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFM--LFTTSYSISFLIVS 229

Query: 206 TVSANKSSELLKGITIAFENPKVG------IGGIVKIINSAAQFGFVVFLPTYMMKYNFT 259
+S + ++ +T F +P +G IG + I GFV +P YMMK
Sbjct: 230 VLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVP-YMMKDVHQ 288

Query: 260 MTEWLQIWGTLFFVNMVFNIIFGIVG----DKFGWINTIKWFGGVGCGIVTLALYYVPQM 315
++ +I + F + IIFG +G D+ G + + +G ++++ +
Sbjct: 289 LSTA-EIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLN----IGVTFLSVSFLTASFL 343

Query: 316 VGHNYWAILF-VACCYGATLAGYVPLTALVP-SLSPENKGAAMSVLNLGSGLSAFVGPLV 373
+ W + + G ++ +V SL + GA MS+LN S LS G +
Sbjct: 344 LETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403

Query: 374 V 374
V
Sbjct: 404 V 404


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2334   HTHTETR       30     0.012    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 30.4 bits (68), Expect = 0.012
Identities = 37/201 (18%), Positives = 67/201 (33%), Gaps = 24/201 (11%)

Query: 187 AARLAELAEEAGLPKGVLNIVNGAHDVVNGLLEHKLVKAISFVGSQPVAEYVYKKGTENL 246
+ L E+A+ AG+ +G + L + S +G EY K + L
Sbjct: 31 STSLGEIAKAAGVTRGAIYWHFKDKS---DLFSEIWELSESNIGELE-LEYQAKFPGDPL 86

Query: 247 KRVQALAGAKNHSIVLNDANLELATKQIISAAFGSAGERCMAASVVTVEEEIADQLVERL 306
++ + S V + +II GE A V + + + +R+
Sbjct: 87 SVLREILIHVLESTVTEER--RRLLMEIIFHKCEFVGEM---AVVQQAQRNLCLESYDRI 141

Query: 307 VAEANKIVIGNGLDEDVFLGPVIRDNHKERTI--GYIDSGVEQGA------TLVRDGRED 358
+ L D+ + I GYI +E L ++ R+
Sbjct: 142 EQTLKHCIEAKMLPADL-------MTRRAAIIMRGYISGLMENWLFAPQSFDLKKEARDY 194

Query: 359 TAVKGAGYFVGPTIFDHVTKE 379
A+ Y + PT+ + T E
Sbjct: 195 VAILLEMYLLCPTLRNPATNE 215


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2354   HTHTETR       65     3e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 65.0 bits (158), Expect = 3e-15
Identities = 30/151 (19%), Positives = 66/151 (43%), Gaps = 8/151 (5%)

Query: 1 MEKSREQTMENILKAAKKKFGERGYEGTSIQEIAKEAKVNVAMASYYFNGKENLYYEVFK 60
++ ++T ++IL A + F ++G TS+ EIAK A V ++F K +L+ E+++
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 61 K-YGLANELPNFLEKNQF-NPINALREYLTVFTTHIKENPE-----IGTLAYEEIIKESA 113
EL + +P++ LRE L E + E A
Sbjct: 65 LSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMA 124

Query: 114 RLEK-IKPYFIGSFEQLKEILQEGEKQGVFH 143
+++ + + S++++++ L+ + +
Sbjct: 125 VVQQAQRNLCLESYDRIEQTLKHCIEAKMLP 155


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2355   PF05272       34     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 33.9 bits (77), Expect = 0.002
Identities = 11/34 (32%), Positives = 19/34 (55%)

Query: 338 IVLDGKNGSGKSSILKLILGQSIQYTGLVTLGTG 371
+VL+G G GKS+++ ++G +GTG
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTG 632


24  BAS2377  BAS2440  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2377   1       17       -3.865875  hypothetical protein
BAS2378   2       17       -4.247394  acetyltransferase
BAS2379   1       16       -4.218285  hypothetical protein
BAS2380   1       15       -4.329080  hypothetical protein
BAS2381   0       15       -3.876404  hypothetical protein
BAS2384   -1      16       -4.384159  sensor histidine kinase
BAS2385   0       19       -4.905732  DNA-binding response regulator
BAS2386   -1      20       -5.556612  hypothetical protein
BAS2387   -1      20       -4.960336  hypothetical protein
BAS2388   -1      20       -4.691531  S-adenosylhomocysteine nucleosidase
BAS2389   1       20       -5.762302  hypothetical protein
BAS2390   2       17       -5.035650  acetyltransferase
BAS2391   2       17       -5.026861  hypothetical protein
BAS2394   2       16       -5.000651  hypothetical protein
BAS2395   1       16       -5.496280  hypothetical protein
BAS2396   1       15       -5.319726  hypothetical protein
BAS2397   0       14       -4.561232  excinuclease ABC subunit A
BAS2398   1       18       -4.753997  hypothetical protein
BAS2399   1       17       -4.271152  hypothetical protein
BAS2400   1       16       -4.318163  penicillin-binding protein
BAS2401   2       13       -4.378695  merR family transcriptional regulator
BAS2402   1       13       -4.753729  permease
BAS2403   3       15       -5.587082  mutT/nudix family protein
BAS2404   2       13       -5.672878  hypothetical protein
BAS2405   2       14       -6.098616  hypothetical protein
BAS2406   1       16       -5.995899  araC family transcriptional regulator
BAS2407   0       17       -5.830037  hypothetical protein
BAS2408   2       20       -5.919783  lipoprotein
BAS2409   1       19       -5.275224  hypothetical protein
BAS2410   2       22       -5.012224  hypothetical protein
BAS2411   1       20       -5.083187  hypothetical protein
BAS2412   -1      19       -3.720717  zinc-containing alcohol dehydrogenase
BAS2413   -1      15       -2.312646  hypothetical protein
BAS2414   -2      15       -1.851481  Mur ligase family protein
BAS2415   -1      14       -1.435814  Mur ligase family protein
BAS2416   -1      14       -1.547158  hypothetical protein
BAS2417   -1      11       -0.837024  cell wall hydrolase
BAS2418   -1      11       -1.964589  acetamidase/formamidase family protein
BAS2419   2       14       -3.645488  DNA-binding response regulator
BAS2422   0       14       -3.508264  hypothetical protein
BAS2423   0       14       -4.277695  ABC transporter ATP-binding protein
BAS2424   1       13       -4.453403  ABC transporter ATP-binding protein
BAS2425   1       15       -3.544428  acetyltransferase
BAS2426   0       14       -3.358301  hypothetical protein
BAS2427   -1      13       -2.639190  ABC transporter ATP-binding protein
BAS2430   -1      12       -2.535739  hypothetical protein
BAS2431   -1      12       -1.754180  hypothetical protein
BAS2432   -2      12       -1.187935  lipase
BAS2433   -2      12       -1.846279  homoserine dehydrogenase
BAS2434   -2      11       -1.676157  GntR family transcriptional regulator
BAS2435   -3      13       -1.448515  D-alanine--D-alanine ligase
BAS2436   -2      18       -4.226008  transcriptional activator TenA
BAS2437   -1      19       -5.014702  hypothetical protein
BAS2438   0       21       -5.876046  hypothetical protein
BAS2438a  0       22       -6.391272  hypothetical protein
BAS2439   -2      20       -5.747662  hypothetical protein
BAS2440   -1      20       -5.147926  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2378   SACTRNSFRASE  28     0.011    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 28.4 bits (63), Expect = 0.011
Identities = 25/105 (23%), Positives = 38/105 (36%), Gaps = 6/105 (5%)

Query: 41 EQQLEKYIESENTLAFKVIDEETKEVIGHISLGQIDHINKSARIGKVLVGDTRMRGRSIG 100
+ Y+E E AF E IG I + + N A I + V R + +G
Sbjct: 53 DDMDVSYVEEEGKAAFLYYLEN--NCIGRIKIRS--NWNGYALIEDIAV-AKDYRKKGVG 107

Query: 101 KHMMKAVLHIAFDELKLHRVTLGVYDFNTSAISCYEKIGFVKEGL 145
++ + A E + L D N SA Y K F+ +
Sbjct: 108 TALLHKAIEWA-KENHFCGLMLETQDINISACHFYAKHHFIIGAV 151


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2385   HTHFIS        92     9e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 92.2 bits (229), Expect = 9e-24
Identities = 40/150 (26%), Positives = 74/150 (49%), Gaps = 3/150 (2%)

Query: 5 ILIIDDDKEIVELLAVYLRNEGYNIYKAYDGDEALQMISTYEVDLMILDIMMPKRNGLEV 64
IL+ DDD I +L L GY++ + + I+ + DL++ D++MP N ++
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 65 CQEVRE-NNTVPILMLSAKAEDMDKILGLMTGADDYMIKPFNPLELVARV-KALLRRSSF 122
+++ +P+L++SA+ M I GA DY+ KPF+ EL+ + +AL
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRR 125

Query: 123 QNASSPKNEDGM-IRIRSAEIHKHNHTVKV 151
+ ++DGM + RSA + + +
Sbjct: 126 PSKLEDDSQDGMPLVGRSAAMQEIYRVLAR 155


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2390   SACTRNSFRASE  42     7e-07    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 41.9 bits (98), Expect = 7e-07
Identities = 24/91 (26%), Positives = 43/91 (47%), Gaps = 8/91 (8%)

Query: 186 DMDYIEKTNHTFYGAYVDNDLKGSICI----NEQGKISFIFIDKEYRNRGIGSKLLQVAR 241
D+ Y+E+ + Y++N+ G I I N I I + K+YR +G+G+ LL A
Sbjct: 56 DVSYVEEEGKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAI 115

Query: 242 D---ELNLESLLISFPNNSLLE-GFVKKTGF 268
+ E + L++ + ++ F K F
Sbjct: 116 EWAKENHFCGLMLETQDINISACHFYAKHHF 146



Score = 39.2 bits (91), Expect = 5e-06
Identities = 18/52 (34%), Positives = 22/52 (42%)

Query: 83 LAVHPNYRGVGVSQKLFELHKEEALQNECKQLFLEVIVGNDRAIRFYNKLGY 134
+AV +YR GV L E A +N L LE N A FY K +
Sbjct: 95 IAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2402   TCRTETA       41     8e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 40.6 bits (95), Expect = 8e-06
Identities = 66/351 (18%), Positives = 119/351 (33%), Gaps = 27/351 (7%)

Query: 40 LPWIAYQLTGSAVIMSS---LFAINVLPIVLFGPLVGVIIDRYDRKKLLLVADITNIILV 96
LP + L S + + L A+ L P++G + DR+ R+ +LLV+ +
Sbjct: 28 LPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDY 87

Query: 97 SFVPILHSLHLLEIWHLYIITFMLAVMSMLFDVTTVTVIPKIAGASLTKANSFYQMVNQL 156
+ + L W LYI + + V + G + F
Sbjct: 88 AIMATAPFL-----WVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGF 142

Query: 157 ASLFGPMIAGVFISFIGGFQLLWINVLSFIATLVAVMLLPSMKTTNKKCEDKNTLQNVLS 216
+ GP++ G+ F L+ + L LLP + L+
Sbjct: 143 GMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGE-----RRPLRREAL 197

Query: 217 DLVNGFTWLKNDRLNLALSFQAMIGNFGASAVLGVFMYYLLSTLQLTPEQSGVNYSLIGI 276
+ + F W + + AL I +++ + G++ + GI
Sbjct: 198 NPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGI 257

Query: 277 -GGLLGSLIAIPLEKRLQRSILIPLLLFVGAIGLTFALWNT-YWFA-PGI----AFGVAM 329
L ++I P+ RL + L + G + T W A P + + G+ M
Sbjct: 258 LHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGM 317

Query: 330 TCNIAWNTIVATVRQETVPSNMQGRVLGFSRVLTRLAMPLGALVGGIISAY 380
A +++ V QG++ G LT L +G L+ I A
Sbjct: 318 P---ALQAMLSR----QVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAA 361


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2415   ACRIFLAVINRP  26     0.025    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 26.3 bits (58), Expect = 0.025
Identities = 8/33 (24%), Positives = 17/33 (51%)

Query: 37 QNNNLNYKVEVDRELAINHAINMASSNDIVLIA 69
+ +K+EVD+E A ++++ N + A
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTA 752


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2419   HTHFIS        67     6e-15    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 66.8 bits (163), Expect = 6e-15
Identities = 27/112 (24%), Positives = 54/112 (48%), Gaps = 1/112 (0%)

Query: 3 KIMIVEDDMKIAELLSTHVAKYGYEGIIVSDFQNVLNIFLEEQPELVLLDINLPSFDGYY 62
I++ +DD I +L+ +++ GY+ I S+ + +LV+ D+ +P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 63 WCRQIRGV-STCPILFISAREGTMDQVMALENGGDDFISKPFHYEVVMAKIR 113
+I+ P+L +SA+ M + A E G D++ KPF ++ I
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2426   PF06057       29     0.012    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 28.7 bits (64), Expect = 0.012
Identities = 10/50 (20%), Positives = 17/50 (34%), Gaps = 12/50 (24%)

Query: 6 WFRHLP-QISMDLSEWTPFIQNNWHRKHYMKFVYVLQIIIFLIPYYFGAD 54
W + P ++ D Q + + + LI Y FGA+
Sbjct: 91 WKQKDPKDVTQDTLAIIDKYQAEFGTQK-----------VILIGYSFGAE 129


25  BAS2451  BAS2556  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2451   -1      18       -3.526415  hypothetical protein
BAS2452   0       15       -3.228441  cytochrome P450 family protein
BAS2453   0       15       -3.664112  hypothetical protein
BAS2454   0       14       -3.680633  HAD superfamily hydrolase
BAS2455   1       13       -3.517293  hypothetical protein
BAS2456   1       13       -3.475098  sensor histidine kinase
BAS2457   0       13       -3.690531  penicillin-binding protein
BAS2458   1       15       -4.935258  glycosyl transferase
BAS2459   1       15       -4.897489  aspartate racemase
BAS2460   2       15       -4.959275  hypothetical protein
BAS2461   2       16       -4.483368  ABC transporter ATP-binding protein
BAS2462   3       14       -3.711218  cobalt transport protein
BAS2463   2       15       -3.174850  sporulation kinase B
BAS2464   3       14       -2.490167  hypothetical protein
BAS2465   3       15       -2.428408  hypothetical protein
BAS2466   3       15       -2.366481  zinc-containing alcohol dehydrogenase
BAS2467   1       14       -2.719423  penicillin-binding protein
BAS2468   0       16       -3.083229  permease
BAS2469   2       17       -3.102770  penicillin-binding protein
BAS2470   2       19       -2.889613  acetyltransferase
BAS2471   3       18       -3.191302  hypothetical protein
BAS2472   4       20       -3.459871  degV family protein
BAS2473   5       22       -1.767201  thiJ/pfpI family protein
BAS2474   6       20       -3.420180  hypothetical protein
BAS2475   6       17       -3.768086  thiJ/pfpI family protein
BAS2476   4       15       -3.920820  hypothetical protein
BAS2477   3       15       -3.657559  hypothetical protein
BAS2478   2       15       -3.757073  alkaline D-peptidase
BAS2480   2       16       -5.442559  hypothetical protein
BAS2481   3       17       -4.515747  permease
BAS2482   3       18       -3.930035  hypothetical protein
BAS2483   5       18       -3.734326  acetyltransferase
BAS2484   4       18       -3.701960  glycerophosphoryl diester phosphodiesterase
BAS2485   4       17       -3.508246  hypothetical protein
BAS2486   3       17       -2.590873  hypothetical protein
BAS2489   1       17       -2.565225  cysteine transporter
BAS2490   0       14       -2.178120  chitosanase
BAS2491   0       14       -1.880865  spermine/spermidine acetyltransferase
BAS2492   -2      13       -2.369623  mutT/nudix family protein
BAS2493   -1      15       -2.549393  hypothetical protein
BAS2494   -1      13       -3.794958  hypothetical protein
BAS2495   0       14       -3.871510  oxalate/formate antiporter
BAS2498   0       16       -4.135133  mutT/nudix family protein
BAS2499   -1      15       -4.465369  DNA polymerase III subunit beta
BAS2500   0       14       -5.194980  mutT/nudix family protein
BAS2501   -2      13       -4.999460  hypothetical protein
BAS2502   -2      14       -4.895891  alpha/beta hydrolase
BAS2503   0       14       -4.900476  hypothetical protein
BAS2504   0       14       -5.492971  intein homing endonuclease-like protein
BAS2505   2       13       -5.280111  hypothetical protein
BAS2506   2       16       -4.841776  endoribonuclease L-PSP
BAS2507   2       15       -4.659978  hypothetical protein
BAS2508   3       18       -5.037637  hypothetical protein
BAS2509   1       18       -4.778986  esterase
BAS2510   0       18       -5.266974  hypothetical protein
BAS2511   -1      16       -4.462831  hypothetical protein
BAS2513   -2      17       -4.107300  hypothetical protein
BAS2514   -1      17       -4.251690  acetyltransferase
BAS2515   0       14       -3.674775  metal-dependent hydrolase
BAS2516   0       15       -3.969435  acetyltransferase
BAS2520   5       16       -3.235632  hypothetical protein
BAS2521   3       17       -3.550263  endo/excinuclease amino terminal
BAS2522   5       18       -3.706319  acetyltransferase
BAS2523   5       19       -3.405863  hypothetical protein
BAS2524   0       16       -1.656667  hypothetical protein
BAS2525   -2      13       -2.053961  hypothetical protein
BAS2526   -3      12       -2.173578  hypothetical protein
BAS2527   -2      13       -2.238255  hypothetical protein
BAS2528   -3      14       -2.353537  mutT/nudix family protein
BAS2529   -2      13       -2.863873  DadA family oxidoreductase
BAS2530   -2      14       -4.654829  hypothetical protein
BAS2531   -1      16       -5.008689  N-acetyltransferase
BAS2532   0       16       -3.696589  hypothetical protein
BAS2533   2       13       -2.377793  hypothetical protein
BAS2534   2       12       -2.106363  HAD superfamily hydrolase
BAS2535   2       11       -1.842354  hypothetical protein
BAS2536   2       10       -1.697263  hypothetical protein
BAS2539   2       10       -1.721062  decarboxylase, pyridoxal-dependent
BAS2540   2       10       -1.846684  AMP-binding protein
BAS2541   1       11       -1.855623  hypothetical protein
BAS2542   0       9        -1.681887  pullulanase
BAS2543   -2      11       -2.534570  neutral protease
BAS2544   -2      14       -3.244255  hypothetical protein
BAS2545   -3      12       -2.995284  RNA polymerase sigma factor SigX
BAS2546   -1      15       -2.451726  hypothetical protein
BAS2547   -1      15       -2.288832  preprotein translocase subunit SecY
BAS2548   -1      17       -2.557220  hypothetical protein
BAS2549   0       16       -1.884621  cysteine transporter
BAS2550   0       16       -2.194819  GntR family transcriptional regulator
BAS2551   1       17       -2.766974  alpha/beta hydrolase
BAS2552a  0       18       -3.795316  hypothetical protein
BAS2554   -1      17       -4.179613  hypothetical protein
BAS2555   1       20       -4.702271  hypothetical protein
BAS2556   0       16       -3.653740  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2456   PF06580       38     7e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 37.9 bits (88), Expect = 7e-05
Identities = 17/102 (16%), Positives = 37/102 (36%), Gaps = 18/102 (17%)

Query: 388 QVFI-NILQNSIEAMPDGGRISIHIKEIGKDGIIISVIDKGIGIPAERIKRLGEPFYSTK 446
Q + N +++ I +P GG+I + + + + V + G
Sbjct: 261 QTLVENGIKHGIAQLPQGGKILLKGTKDNGT-VTLEVENT------------GSLALKNT 307

Query: 447 EKGTGIGLMLSYKIIESHQGN---ISIMSEVGVGTTVTIYLP 485
++ TG GL + ++ G I + + G + +P
Sbjct: 308 KESTGTGLQNVRERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2463   PF06580       36     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.0 bits (83), Expect = 2e-04
Identities = 18/94 (19%), Positives = 39/94 (41%), Gaps = 12/94 (12%)

Query: 320 NLIKNGIEAMPNGGTLNISSSISNNKVIIRIEDSGIGMSQEQVNRFGEPYFNTKTKGTGL 379
N IK+GI +P GG + + + N V + +E++G + ++ GTGL
Sbjct: 266 NGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNT----------KESTGTGL 315

Query: 380 G-TMVAVKIIETMQGSLRIRSVVNKGTTLTITFP 412
++++ + +++ K + P
Sbjct: 316 QNVRERLQMLYGTEAQIKLSEKQGKVNA-MVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2468   TCRTETA       45     4e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 44.8 bits (106), Expect = 4e-07
Identities = 64/318 (20%), Positives = 119/318 (37%), Gaps = 24/318 (7%)

Query: 13 LLLSGVGIANLGAWIYLIALNVLVYHMGGSALAVATLYVIKPLAAL---FTNAWSGSMID 69
++LS V + +G + + L L+ + S A ++ L AL G++ D
Sbjct: 9 VILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSD 68

Query: 70 RLNKRKLMIHLDIYRAVCIAILPLLPSLWMVYVFVFFISMANAIYEPTAMTYMTKLIPVE 129
R +R +++ AV AI+ P LW++Y+ + A A Y+ + +
Sbjct: 69 RFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATG-AVAGAYIADITDGD 127

Query: 130 QRQRFNSLRSLIGSGASVIGPAIAGALLIASTPE---FAIYMNAIAFLLSGVITLLLPNL 186
+R R S V GP + G + S A +N + FL LLP
Sbjct: 128 ERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTG---CFLLPES 184

Query: 187 DKKFDSHTSNDTLSLAVLKKDWNIVLNFSKKSLYIVFVYFLFQGMMVLAAANDSLELSFA 246
K L L + + + +F M ++ +L + F
Sbjct: 185 HK-----GERRPLRREALNPLASFRWARGMTVVA--ALMAVFFIMQLVGQVPAALWVIFG 237

Query: 247 KEVLLLTDSEYGFLVSIAGAGFILGAITNAI----LSKKLTPSLLIGIGSLFIAIGYIIY 302
++ + G S+A G IL ++ A+ ++ +L + +G + GYI+
Sbjct: 238 EDRFHWDATTIGI--SLAAFG-ILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILL 294

Query: 303 AFSNEFLIAAIGFFILSF 320
AF+ +A +L+
Sbjct: 295 AFATRGWMAFPIMVLLAS 312



Score = 29.0 bits (65), Expect = 0.032
Identities = 21/115 (18%), Positives = 48/115 (41%), Gaps = 1/115 (0%)

Query: 48 TLYVIKPLAALFTNAWSGSMIDRLNKRKLMIHLDIYRAVCIAILPLLPSLWMVYVFVFFI 107
+L L +L +G + RL +R+ ++ I +L WM + + +
Sbjct: 251 SLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLL 310

Query: 108 SMANAIYEPTAMTYMTKLIPVEQRQRFNSLRSLIGSGASVIGPAIAGALLIASTP 162
+ + I P +++ + E++ + + + S S++GP + A+ AS
Sbjct: 311 A-SGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASIT 364


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2470   SACTRNSFRASE  31     0.001    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 31.5 bits (71), Expect = 0.001
Identities = 22/65 (33%), Positives = 35/65 (53%), Gaps = 3/65 (4%)

Query: 81 IWHIAVHPDFRRMKIGNQLLNEGEKLAKERKLNRLEAWTRD-NLWVHGWYEKNGFV--KV 137
I IAV D+R+ +G LL++ + AKE L T+D N+ +Y K+ F+ V
Sbjct: 92 IEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAV 151

Query: 138 DSYLH 142
D+ L+
Sbjct: 152 DTMLY 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2478   BLACTAMASEA   34     9e-04    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 34.0 bits (78), Expect = 9e-04
Identities = 12/50 (24%), Positives = 18/50 (36%)

Query: 81 YAAGIADLRTKKQMKTDFRFRIGSTTKTFIATVLLQLAGENRLNLDDSIE 130
+A RT + D RF + ST K + +L L+ I
Sbjct: 43 IEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARVDAGDEQLERKIH 92


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2483   SACTRNSFRASE  29     0.011    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 29.1 bits (65), Expect = 0.011
Identities = 18/104 (17%), Positives = 40/104 (38%), Gaps = 4/104 (3%)

Query: 151 EGQYVQAFYNQTASAHLWNSENMKLYLGFYKDEVVSVGSLVCTLDSIG-IYDIATKEEMR 209
Y + + + E +L + ++ + + + I DIA ++ R
Sbjct: 43 SKPYFKQYEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYR 102

Query: 210 GKGFGSTMFNYLLQEAKELNVAQCVLQASPDGV---NIYKKAGF 250
KG G+ + + ++ AKE + +L+ + + Y K F
Sbjct: 103 KKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2491   SACTRNSFRASE  28     0.013    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 27.6 bits (61), Expect = 0.013
Identities = 25/123 (20%), Positives = 37/123 (30%), Gaps = 16/123 (13%)

Query: 30 AKVYIKPDGDA---VEYQP------FAIYNGDLMVGFVMHAVVKETTDMYWINGFIIDQK 80
+K Y K D V Y F Y + +G + + I + +
Sbjct: 43 SKPYFKQYEDDDMDVSYVEEEGKAAFLYYLENNCIGRI--KIRSNWNGYALIEDIAVAKD 100

Query: 81 QQGNGYGKAALQESIYLIKNTFKACKEIRLTVHKDNISAKKLYESYGFQPLGHD---YDG 137
+ G G A L ++I K + L NISA Y + F D Y
Sbjct: 101 YRKKGVGTALLHKAIEWAKE--NHFCGLMLETQDINISACHFYAKHHFIIGAVDTMLYSN 158

Query: 138 EEV 140

Sbjct: 159 FPT 161


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2495   TCRTETA       47     6e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 47.1 bits (112), Expect = 6e-08
Identities = 38/186 (20%), Positives = 79/186 (42%), Gaps = 8/186 (4%)

Query: 206 MMGTKQVYLLFFMLFTSCMGGLYLIGMVKDIGVQLVGLSTATAANAVAMIAIFNTVGRI- 264
M + + ++ + +G + LI V ++ + S A+ ++A++ +
Sbjct: 1 MKPNRPLIVILSTVALDAVG-IGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFAC 59

Query: 265 --VLGTLSDKIGRMKIVSATFIIIGLSVFTLSFIPLNYGIYFACVASFAFCFGGNITIFP 322
VLG LSD+ GR ++ + + ++ P + +Y + A G +
Sbjct: 60 APVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRI--VAGITGATGAVAG 117

Query: 323 AIVGDFFGLKNHSTNYGIVYQGFGFGALAGSFIGAILGGFQP--TFIIIGVLSVISFIIS 380
A + D + ++G + FGFG +AG +G ++GGF P F L+ ++F+
Sbjct: 118 AYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTG 177

Query: 381 ILIRPP 386
+ P
Sbjct: 178 CFLLPE 183



Score = 37.9 bits (88), Expect = 6e-05
Identities = 27/146 (18%), Positives = 58/146 (39%), Gaps = 13/146 (8%)

Query: 8 PLLIVLGTIIVQIGLGTIYTWSLFNQPLVSKFGWNLNSVAITFS-ITSFSLSFSTLFAGK 66
L+ + I+ +G W +F + +F W+ ++ I+ + + G
Sbjct: 213 AALMAVFFIMQLVGQVPAALWVIFGE---DRFHWDATTIGISLAAFGILHSLAQAMITGP 269

Query: 67 LQQKLGLRKLIATAGIVLGLGLILSSQVSS----LPLLYLLAGVVVGYADGTAYITSLSN 122
+ +LG R+ + I G G IL + + P++ LLA +G A ++ +
Sbjct: 270 VAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVD 329

Query: 123 LIKWFPNRKGLISGISVSAYGMGSLI 148
R+G + G + + S++
Sbjct: 330 -----EERQGQLQGSLAALTSLTSIV 350


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2516   BLACTAMASEA   34     2e-04    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 34.0 bits (78), Expect = 2e-04
Identities = 13/49 (26%), Positives = 25/49 (51%), Gaps = 4/49 (8%)

Query: 64 LIRNEEKEIVGRINLVDIDTETRISSLGYRVGEKF----TKKGVATAAV 108
I+ E ++ GR+ ++++D + + +R E+F T K V AV
Sbjct: 28 QIKLSESQLSGRVGMIEMDLASGRTLTAWRADERFPMMSTFKVVLCGAV 76


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2540   SALSPVBPROT   32     0.023    Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 32.4 bits (73), Expect = 0.023
Identities = 28/136 (20%), Positives = 54/136 (39%), Gaps = 19/136 (13%)

Query: 611 VARGYLGKSKLTDEKFVPSPFAAGEMMYHTGDLVRWLPNGELEYLGRMDHQVKIRGYRIE 670
++R YL ++ +DE+F M + D+ L + L DH+V RG +++
Sbjct: 428 LSRDYLSTNEPSDEEF------KNAMSVYINDIAEGLSS-----LPETDHRVVYRGLKLD 476

Query: 671 LREIETQLREYPEINQVIV--------VDQVYGNRKLLAAYYVSDNKVSFGEIRKYLSDK 722
+ L+EY I +I+ D+ + N +L Y +K + +
Sbjct: 477 KPALSDVLKEYTTIGNIIIDKAFMSTSPDKAWINDTILNIYLEKGHKGRILGDVAHFKGE 536

Query: 723 LPEFMIPEKMIQVEEI 738
P +++E I
Sbjct: 537 AEMLFPPNTKLKIESI 552


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2543   THERMOLYSIN   310    e-101    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 310 bits (795), Expect = e-101
Identities = 187/588 (31%), Positives = 267/588 (45%), Gaps = 67/588 (11%)

Query: 1 MKNKKTLTKVALTTGLALTAVAPYGVGHAEETDQLQVQIQEESFRSGELTQPSQKAPENV 60
M + L + L GL P+G ++ Q + SF SG L +
Sbjct: 1 MNKRAMLGAIGLAFGLM---AWPFGASAKGKSMVWNEQWKTPSFVSGSLLGRCSQ----- 52

Query: 61 VKDALKEKTEQALSPKQVNGETGVDYKVLQKRGSYDGTTLVRIQQTYEGKEVYGHQLTAH 120
+ + +Q + Q+ G+ ++ + G T++R +Q G L AH
Sbjct: 53 --ELVYRYLDQEKNTFQLGGQARERLSLIGNKLDELGHTVMRFEQAIAASLCMGAVLVAH 110

Query: 121 VDNSGVIKSVSGDSAQNLKQEELKKPINLSKDEATQYIYTKYGNDI---NFISEPEVKEV 177
V++ + S+SG NL + LK +S +A + + +E
Sbjct: 111 VNDGE-LSSLSGTLIPNLDKRTLKTEAAISIQQAEMIAKQDVADRVTKERPAAEEGKPTR 169

Query: 178 IFVDENNGQASNAYQVTFEAATPNYISGTYLINAQNGDMLK--NMVQESNLKASEKLVGA 235
+ + + AY+V TP + Y+I+A +G +L N + E+ ++ + G
Sbjct: 170 LVIYPDEETPRLAYEVNVRFLTPVPGNWIYMIDAADGKVLNKWNQMDEAKPGGAQPVAG- 228

Query: 236 LKKSKKSSLTSLTGTGKDDLGISRSFGISKQSN-GKYALADYTRGQGIETYDVNYRDITK 294
TS G G+ LG + + S G Y L D TRG GI TYD R
Sbjct: 229 ---------TSTVGVGRGVLGDQKYINTTYSSYYGYYYLQDNTRGSGIFTYDGRNR---- 275

Query: 295 EESYYPGTLATSTSTTF---NDPKAVSAHYLATKVFDFYKDKYNRNSFDNQGQKVVSVVH 351
+ PG+L F D AV AHY A V+D+YK+ + R S+D + S VH
Sbjct: 276 --TVLPGSLWADGDNQFFASYDAAAVDAHYYAGVVYDYYKNVHGRLSYDGSNAAIRSTVH 333

Query: 352 AWDSGDTNDPKNWQNALSANNGSMLVYGD-------PIVKAYDVAGHEFTHAVTSSESNL 404
+ + NA NGS +VYGD P DV GHE THAVT + L
Sbjct: 334 Y--------GRGYNNA--FWNGSQMVYGDGDGQTFLPFSGGIDVVGHELTHAVTDYTAGL 383

Query: 405 EYYGESGAINEALSDIMGTSIEKYVNNGSFNWTMGEQT------GSVFRDMENPTSVPSS 458
Y ESGAINEA+SDI GT +E Y N + +W +GE G R M +P
Sbjct: 384 VYQNESGAINEAMSDIFGTLVEFY-ANRNPDWEIGEDIYTPGVAGDALRSMSDPAKYGD- 441

Query: 459 LGVPYPDDYSEFNDFNGWDQGGVHFNSSIINKVAYLIAKGGTHNGVTVKGIGEDKMFDIF 518
PD YS+ D GGVH NS IINK AYL+++GG H GV+V GIG DKM IF
Sbjct: 442 -----PDHYSKRYTGTQ-DNGGVHTNSGIINKAAYLLSQGGVHYGVSVTGIGRDKMGKIF 495

Query: 519 YYANTDELNMTSNFKELRSACIRVATNKYGANSAEVQAVQKAFDAAKI 566
Y A L TSNF +LR+AC++ A + YG+ S EV +V++AF+A +
Sbjct: 496 YRALVYYLTPTSNFSQLRAACVQAAADLYGSTSQEVNSVKQAFNAVGV 543


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2547   SECYTRNLCASE  443    e-156    Preprotein translocase SecY subunit signature.
		>SECYTRNLCASE#Preprotein translocase SecY subunit signature.

Length = 437

Score = 443 bits (1141), Expect = e-156
Identities = 180/445 (40%), Positives = 270/445 (60%), Gaps = 22/445 (4%)

Query: 1 MFRTISNFMRVAEIRRKILFTLAMLIVFRIGTFIPVPHTNAEVLK-----IQDQANVLGM 55
M + R ++R+K+LFTLA+++V+R+GT IP+P + + ++ + G+
Sbjct: 1 MLTAFARAFRTPDLRKKLLFTLAIIVVYRVGTHIPIPGVDYKNVQQCVREASGNQGLFGL 60

Query: 56 LNVFGGGALQHFSIFAVGITPYITASIIVQLLQMDVIPKFSEWAKQGEMGRKKSAQFTRY 115
+N+F GGAL +IFA+GI PYITASII+QLL + VIP+ K+G+ G K Q+TRY
Sbjct: 61 VNMFSGGALLQITIFALGIMPYITASIILQLLTV-VIPRLEALKKEGQAGTAKITQYTRY 119

Query: 116 FTIILAFIQAIGMSYGFNNI-------AGGQLITDQSWTTYLFIATVLTAGTAFLLWLGE 168
T+ LA +Q G+ + GGQ++ DQS T + + +TAGT ++WLGE
Sbjct: 120 LTVALAILQGTGLVATARSAPLFGRCSVGGQIVPDQSIFTTITMVICMTAGTCVVMWLGE 179

Query: 169 QITANGVGNGISMLIFAGLVAAIPNVANQIYLQQFQNAGDQLFMHIIKMLLIGLVILAIV 228
IT G+GNG+S+L+F + A P+ I Q G F +I V L +V
Sbjct: 180 LITDRGIGNGMSILMFISIAATFPSALWAIKKQGTLAGGWIEFGTVI------AVGLIMV 233

Query: 229 VGVIYIQQAVRKIPIQYAKAVSGNNQYQGAKNTHLPLKVNSAGVIPVIFASAFLMTPRTI 288
V++++QA R+IP+QYAK + G Y G +T++PLKVN AGVIPVIFAS+ L P +
Sbjct: 234 ALVVFVEQAQRRIPVQYAKRMIGRRSY-GGTSTYIPLKVNQAGVIPVIFASSLLYIPALV 292

Query: 289 AQLFPDSSVSKWLVAN--LDFAHPIGMTLYVGLIVAFTYFYAFIQVNPEQMAENLKKQNG 346
AQ +S K V HPI + Y LIV F +FY I NPE++A+N+KK G
Sbjct: 293 AQFAGGNSGWKSWVEQNLTKGDHPIYIVTYFLLIVFFAFFYVAISFNPEEVADNMKKYGG 352

Query: 347 YVPCIRPGKSTEQYVTKILYRLTFIGAIFLGAISILPLVFTKIATLPPSAQIGGTSLLII 406
++P IR G+ T +Y++ +L R+T+ G+++LG I+++P + + GGTS+LII
Sbjct: 353 FIPGIRAGRPTAEYLSYVLNRITWPGSLYLGLIALVPTMALVGFGASQNFPFGGTSILII 412

Query: 407 VGVALETMKTLESQLVKRHYKGFIK 431
VGV LET+K +ESQL +R+Y+GF++
Sbjct: 413 VGVGLETVKQIESQLQQRNYEGFLR 437


26  BAS2568  BAS2579  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2568   1       16       -3.889423  TetR family transcriptional regulator
BAS2569   -1      16       -2.982336  mutT/nudix family protein
BAS2570   -1      16       -3.552181  hypothetical protein
BAS2571   -2      17       -3.255059  hypothetical protein
BAS2572   -1      19       -3.243565  hypothetical protein
BAS2573   0       18       -3.018926  hypothetical protein
BAS2574   0       17       -3.028318  acetoin operon transcriptional activator
BAS2575   3       22       -4.556909  hypothetical protein
BAS2576   3       20       -3.585304  hypothetical protein
BAS2577   2       18       -3.383761  acetyltransferase
BAS2578   1       17       -3.317220  DeoR family transcriptional regulator
BAS2579   1       18       -3.071945  lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2568   HTHTETR       59     4e-13    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 59.3 bits (143), Expect = 4e-13
Identities = 34/196 (17%), Positives = 74/196 (37%), Gaps = 11/196 (5%)

Query: 12 RSLETKKKLLHSGYTIFIRNGFQKTTITQIIKHAETGYGTAYVYFKNKDDLLIVLMEDVM 71
+ ET++ +L +F + G T++ +I K A G Y +FK+K DL + ++
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIW-ELS 66

Query: 72 NQFYNIAERSFSPQTTEEARTMIQNQVKAFLQLAEKE------RAILQVVEEAIGLSKEI 125
E + + + ++++ + L+ E I+ E +G +
Sbjct: 67 ESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVV 126

Query: 126 RQKWDEIRERFINSITQDITYSQESGLAHSKLNKEIVARAWFAMNEMFLWTIVQNDKKLE 185
+Q + + I Q + + E+ + + L A + + + +
Sbjct: 127 QQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFD 186

Query: 186 LEEI----VHTLTEMY 197
L++ V L EMY
Sbjct: 187 LKKEARDYVAILLEMY 202


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2574   HTHFIS        386    e-131    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 386 bits (994), Expect = e-131
Identities = 119/343 (34%), Positives = 188/343 (54%), Gaps = 32/343 (9%)

Query: 304 TFPGVIGTSDAFQHTLEEIKLVSPTDASVYVCGETGVGKEYVARAIHENSPRKNGPFIAV 363
++G S A Q + + TD ++ + GE+G GKE VARA+H+ R+NGPF+A+
Sbjct: 135 DGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAI 194

Query: 364 NCGALPKELMESELFGYAEGAFTGARRQGYKGKFEQADGGTIFLDEIGEVPPEMQVALLR 423
N A+P++L+ESELFG+ +GAFTGA+ + G+FEQA+GGT+FLDEIG++P + Q LLR
Sbjct: 195 NMAAIPRDLIESELFGHEKGAFTGAQTRS-TGRFEQAEGGTLFLDEIGDMPMDAQTRLLR 253

Query: 424 VLQERTVTPIGSSKEVPVNIRIITATHKDLLRLVEEGKFRQDLYYRLHVYPLYVPSLIER 483
VLQ+ T +G + ++RI+ AT+KDL + + +G FR+DLYYRL+V PL +P L +R
Sbjct: 254 VLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDR 313

Query: 484 KEDIPYFIKHFCKRKNWNVVFPKSI----CNQFSQHTWPGNIRELLNALERIYILSQGRE 539
EDIP ++HF ++ + K H WPGN+REL N + R+ L
Sbjct: 314 AEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDV 373

Query: 540 ICEKQISFLLQTMMRNQHQLELQTENKTEDTLN--------------------------F 573
I + I L++ + +E +++
Sbjct: 374 ITREIIENELRSEI-PDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRV 432

Query: 574 REKIQRDSMIEALEKTNGNVSSAAKLLDVPRSTFYKRMQKYKL 616
+++ ++ AL T GN AA LL + R+T K++++ +
Sbjct: 433 LAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2577   SACTRNSFRASE  44     4e-08    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 44.2 bits (104), Expect = 4e-08
Identities = 25/94 (26%), Positives = 42/94 (44%), Gaps = 4/94 (4%)

Query: 65 EESKNLFLVAEVHDRIVGFSRCEGSNLKRLSHKIEFGVCILKEFWGYGIGKSLLGQSIHW 124
EE K FL + + +G + SN + + V K++ G+G +LL ++I W
Sbjct: 62 EEGKAAFL-YYLENNCIGRIKIR-SNWNGYALIEDIAVA--KDYRKKGVGTALLHKAIEW 117

Query: 125 ADENEIKKISLQVLETNEKAIQLYKKLGFEVEGI 158
A EN + L+ + N A Y K F + +
Sbjct: 118 AKENHFCGLMLETQDINISACHFYAKHHFIIGAV 151


27  BAS2660  BAS2670  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2660   -2      21       -3.697600  cell division protein DivIC
BAS2661   0       16       -0.767500  hypothetical protein
BAS2662   1       15       -2.198074  hypothetical protein
BAS2663   1       14       -2.219528  hypothetical protein
BAS2664   2       14       -2.571470  hypothetical protein
BAS2665   1       14       -3.237009  hypothetical protein
BAS2667   0       15       -3.042456  x-prolyl-dipeptidyl aminopeptidase
BAS2668   0       14       -4.403878  sensor histidine kinase SrrB
BAS2670   1       17       -3.440374  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2668   PF06580       36     3e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 35.6 bits (82), Expect = 3e-04
Identities = 33/173 (19%), Positives = 60/173 (34%), Gaps = 38/173 (21%)

Query: 282 IIKQSDHISNLIEEL---LRFS---KLERDVLQKEEFSIKSLVQSILDKHKIELESKEIN 335
I++ ++ L +R+S R V +E + +V S L I+ E +
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELT---VVDSYLQLASIQFEDR--- 239

Query: 336 LQVNYNVGDAIVYADVNKMRMVFQNLISNAIKY-----TSNQNIKITLEDRNESVYFQIQ 390
LQ + AI+ V M + Q L+ N IK+ I + N +V +++
Sbjct: 240 LQFENQINPAIMDVQVPPM--LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVE 297

Query: 391 NGMNAEHMKDIDKIWEPFYVLESSRSKDRSGTGLGLAIVKS-IVERHGFDYGV 442
N + TG GL V+ + +G + +
Sbjct: 298 N--TGSLALK----------------NTKESTGTGLQNVRERLQMLYGTEAQI 332


28  BAS2686  BAS2705  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2686   -2      14       -3.222596  hypothetical protein
BAS2687   -2      13       -2.492786  solute-binding family 5 protein
BAS2688   1       12       -2.305613  major facilitator family transporter protein
BAS2689   2       15       -2.949622  lipoprotein
BAS2690   1       15       -3.176690  hypothetical protein
BAS2691   0       16       -3.039083  hypothetical protein
BAS2692   1       17       -3.469785  hypothetical protein
BAS2693   0       11       -3.425535  inosine-uridine preferring nucleoside hydrolase
BAS2694   -1      12       -3.711520  hypothetical protein
BAS2695   -1      11       -4.213733  UbiE/COQ5 family methlytransferase
BAS2696   0       18       -2.638174  hypothetical protein
BAS2697   0       18       -2.102352  hypothetical protein
BAS2698   0       19       -2.208704  transporter
BAS2698a  1       18       -3.491611  hypothetical protein
BAS2699   1       17       -4.060287  lipoprotein
BAS2700   1       18       -3.761618  aspartate aminotransferase
BAS2701   1       20       -5.823742  (3R)-hydroxymyristoyl-ACP dehydratase
BAS2702   0       21       -6.212123  pantothenate kinase
BAS2703   -1      16       -5.292226  CAAX amino terminal protease family protein
BAS2705   -2      12       -4.284546  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2686   PYOCINKILLER  31     0.004    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.9 bits (69), Expect = 0.004
Identities = 14/53 (26%), Positives = 20/53 (37%), Gaps = 5/53 (9%)

Query: 12 LELTGISYGQLYRWKRKNLIPEDWFVRKSTFTGQETFFPKEKILERINKIQTM 64
L+ + G KNL P D R T G +K+L KI ++
Sbjct: 97 LDKADAALGPA-----KNLAPLDVINRSLTIVGNALQQKNQKLLLNQKKITSL 144


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2688   TCRTETA       80     1e-18    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 80.3 bits (198), Expect = 1e-18
Identities = 53/318 (16%), Positives = 113/318 (35%), Gaps = 9/318 (2%)

Query: 50 LIFGLQPFSDIVFTLIAGGITDKYGRKKIMLLGLLLQGVAIGSFVFAQSVFIFALLYVIN 109
++ L + G ++D++GR+ ++L+ L V A +++ + ++
Sbjct: 47 ILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVA 106

Query: 110 GIGRSLYIPAQRAQIADLIKQGQQAEIFALLQTMGAIGTVIGPLIGAVFYNTHPEYLFIM 169
GI + A IAD+ ++A F + G V GP++G + P F
Sbjct: 107 GITGAT-GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFA 165

Query: 170 QSITLMVYAVVVWTQLPETAPAITMPKQKLEVSSPKQF--VRNHSAVIGLMVSTLPISFF 227
+ + + LPE+ P ++ ++ F R + V LM +
Sbjct: 166 AAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLV 225

Query: 228 YAQTETNYRIFAEDVFPNFIFILAFISTCRAIMEIILQIFLV-KWSERFSMAKIIIISYT 286
+ IF ED F + I+ + Q + + R + +++
Sbjct: 226 GQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLG-- 283

Query: 287 CYIVAAIGYGFSATIVS--LFFTLLFLVIGESIALNHLLRFVSEIAPSDKRGLYFSIYGL 344
I GY A + F ++ L+ I + L +S +++G
Sbjct: 284 -MIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAA 342

Query: 345 HWDVSRTCGPVIGAILLS 362
++ GP++ + +
Sbjct: 343 LTSLTSIVGPLLFTAIYA 360



Score = 47.5 bits (113), Expect = 5e-08
Identities = 20/121 (16%), Positives = 53/121 (43%), Gaps = 1/121 (0%)

Query: 45 IMITMLIFGLQPFSDIVFTLIAGGITDKYGRKKIMLLGLLLQGVAIGSFVFAQSVFIFAL 104
I + + + +I G + + G ++ ++LG++ G FA ++
Sbjct: 246 TTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFP 305

Query: 105 LYVINGIGRSLYIPAQRAQIADLIKQGQQAEIFALLQTMGAIGTVIGPLIGAVFYNTHPE 164
+ V+ G + +PA +A ++ + + +Q ++ L + ++ +++GPL+ Y
Sbjct: 306 IMVLLASG-GIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASIT 364

Query: 165 Y 165

Sbjct: 365 T 365


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2689   TYPE4SSCAGA   29     0.014    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 29.3 bits (65), Expect = 0.014
Identities = 35/130 (26%), Positives = 52/130 (40%), Gaps = 20/130 (15%)

Query: 20 LAACKGTDEKKETNP----TSENSKNEQNTSSEGK-----KEPEVKSNTDSNSKDIVINQ 70
L A KG+ + NP EN N GK K + KS+ +++ KD++INQ
Sbjct: 719 LKALKGSVKDLGINPEWISKVENLNAALNEFKNGKNKDFSKVTQAKSDLENSVKDVIINQ 778

Query: 71 KSINHVKNLFELAKEGKVPNVPFAAHTGDIEEIEKAWGKADKTEQAGNGMYATFTNKNVS 130
K + V NL + K TGD +E+A + A KN S
Sbjct: 779 KVTDKVDNLNQAVSVAKA--------TGDFSRVEQALADLKNFSKE---QLAQQAQKNES 827

Query: 131 FGFNKGSQVF 140
K S+++
Sbjct: 828 LNARKKSEIY 837


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2692   CHANLCOLICIN  35     9e-06    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 35.4 bits (81), Expect = 9e-06
Identities = 15/49 (30%), Positives = 24/49 (48%), Gaps = 3/49 (6%)

Query: 6 IVGGILGWLASLITGRDVPGGVIG-NIIAGIIGSWIGGKLLGSFGPVIG 53
V ++ L SL+ G G+ G I+ GI+ S+I L + V+G
Sbjct: 475 GVSYVVALLFSLLAG--TTLGIWGIAIVTGILCSYIDKNKLNTINEVLG 521


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2698   TCRTETA       45     4e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 44.8 bits (106), Expect = 4e-07
Identities = 65/342 (19%), Positives = 118/342 (34%), Gaps = 11/342 (3%)

Query: 1 MWRNKNVWIVLIGEFIAGLGLWLGILGNLEFMQKYVPSDFMKS---VILFIGLLAGVLVG 57
M N+ + ++L + +G+ L + ++ V S+ + + ++L + L
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 58 PMAGRIIDQYEKKKVHLYAGFGRVISVIFMFFAIQFESIAFMIAFMVALQISAAFYFPAL 117
P+ G + D++ ++ V L + G + M A + I +VA A
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVL--YIGRIVAGITGATGAVAG- 117

Query: 118 QSVIPLIVREHELLQMNGVHMNVGTIARIAGTSLGGILLVVMSLQYMYAFSMAAYALLFL 177
+ I I E + G +AG LGG L+ S + + A L FL
Sbjct: 118 -AYIADITDGDERARHFGFMSACFGFGMVAGPVLGG-LMGGFSPHAPFFAAAALNGLNFL 175

Query: 178 STFFLQFEDKKSTTPSKQAAKDNSFMEVFRILRGIPIAFTALILSIIPLLFIAGFNLMVI 237
+ FL E K + N +A + I+ L+ L VI
Sbjct: 176 TGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVI 235

Query: 238 -NISEMQHDPTIKGFIYTIEGIAFMLG-AFVIKRLSDHFKPEKLLYFFAVCTAFAHLSLF 295
D T G GI L A + ++ + L + ++ L
Sbjct: 236 FGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLA 295

Query: 296 FSDIKWMSLTSFGLFGFSVGCFFPIMSTIFQTKVEKSYHGRL 337
F+ WM+ L G P + + +V++ G+L
Sbjct: 296 FATRGWMAFPIMVLLAS-GGIGMPALQAMLSRQVDEERQGQL 336


29  BAS2790  BAS2796  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2790   2       17       -4.179154  hypothetical protein
BAS2791   1       21       -4.191241  DNA-binding protein
BAS2792   0       22       -4.185943  hypothetical protein
BAS2793   0       20       -3.967539  lipoprotein
BAS2794   0       20       -3.591473  CAAX amino terminal protease family protein
BAS2795   0       20       -4.032476  histidine kinase domain-containing protein
BAS2796   0       17       -3.145403  response regulator
30  BAS2807a  BAS2812  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2807a  3       20       -4.317869  hypothetical protein
BAS2808   3       19       -3.795251  glycosyl transferase
BAS2809   5       22       -4.152655  hypothetical protein
BAS2810   4       20       -3.982336  hypothetical protein
BAS2811   2       17       -2.794185  glycosyl transferase
BAS2812   2       17       -2.940503  hypothetical protein
31  BAS2829  BAS2853  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2829   1       12       -4.287655  hypothetical protein
BAS2830   1       13       -2.303253  mutT/nudix family protein
BAS2832   1       13       -2.586095  hypothetical protein
BAS2833   0       13       -3.820910  hypothetical protein
BAS2834   0       14       -3.973246  ABC transporter ATP-binding protein
BAS2835   0       16       -3.281515  GntR family transcriptional regulator
BAS2836   0       17       -2.647642  hypothetical protein
BAS2837   0       18       -3.624817  ABC transporter ATP-binding protein
BAS2838   0       16       -3.278156  ABC transporter permease
BAS2839   1       18       -2.454474  ABC transporter permease
BAS2841   2       17       -1.304245  hypothetical protein
BAS2842   1       17       -0.961215  hypothetical protein
BAS2843   2       13       0.041648   araC family transcriptional regulator
BAS2844   -1      12       0.708769   acetyltransferase
BAS2845   -1      13       0.292971   hypothetical protein
BAS2846   0       13       -0.508981  mutT/nudix family protein
BAS2848   1       13       -2.167812  cysteine transporter
BAS2847   2       14       -3.836829  GntR family transcriptional regulator
BAS2849   4       19       -6.334266  hypothetical protein
BAS2850   4       19       -5.979056  hypothetical protein
BAS2851   2       19       -5.396471  hypothetical protein
BAS2852   0       17       -4.111717  sensor histidine kinase
BAS2853   0       15       -3.015267  DNA-binding response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2836   NUCEPIMERASE  36     2e-04    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 35.5 bits (82), Expect = 2e-04
Identities = 28/135 (20%), Positives = 45/135 (33%), Gaps = 42/135 (31%)

Query: 1 MKILILGGTRFLGRAFVEEALQRGHEV-----------TLFNRGTNQEI------FLE-- 41
MK L+ G F+G + L+ GH+V + + + F +
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 42 ------VEQLIGDRNGDV-----------SSLENRKWDVVINTCGFSPHHIRNVGEVLKD 84
+ L + + SLEN N GF N+ E +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFL-----NILEGCRH 115

Query: 85 -NIEHYIFISSLSVY 98
I+H ++ SS SVY
Sbjct: 116 NKIQHLLYASSSSVY 130
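
The statistics lines that precede each alignment in this report (Score, Expect, Identities) follow a regular pattern, so they can be extracted programmatically. The following is a minimal sketch, assuming the report has been saved as plain text; the function name parse_hsp and the dictionary keys are hypothetical and not part of PredictBias or BLAST.

```python
import re

# Patterns for the two statistics lines shown before each alignment, e.g.
#   Score = 35.5 bits (82), Expect = 2e-04
#   Identities = 28/135 (20%), Positives = 45/135 (33%), Gaps = 42/135 (31%)
STAT_RE = re.compile(
    r"Score\s*=\s*(?P<bits>[\d.]+)\s*bits\s*\((?P<raw>\d+)\),\s*"
    r"Expect\s*=\s*(?P<evalue>[\d.eE+-]+)"
)
IDENT_RE = re.compile(r"Identities\s*=\s*(?P<match>\d+)/(?P<length>\d+)")


def parse_hsp(text):
    """Return bit score, raw score, E-value and identity fraction, or None."""
    stat = STAT_RE.search(text)
    ident = IDENT_RE.search(text)
    if stat is None or ident is None:
        return None
    matches = int(ident["match"])
    length = int(ident["length"])
    return {
        "bits": float(stat["bits"]),
        "raw_score": int(stat["raw"]),
        "evalue": float(stat["evalue"]),
        "identity": matches / length,  # fraction of identical positions
    }


# Statistics copied from the NUCEPIMERASE hit for BAS2836.
example = (
    "Score = 35.5 bits (82), Expect = 2e-04\n"
    "Identities = 28/135 (20%), Positives = 45/135 (33%), Gaps = 42/135 (31%)"
)
hsp = parse_hsp(example)
```

The identity fraction is recomputed from the raw counts rather than read from the rounded percentage in parentheses, which keeps more precision when hits are compared.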


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2844   SACTRNSFRASE  36     2e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 36.5 bits (84), Expect = 2e-05
Identities = 16/71 (22%), Positives = 29/71 (40%)

Query: 60 LSQTHKEEAYVHFIGVNPKYRRRGIASTLYSYFFDVARANKRKVVKAITSPVNKKSIQFH 119
+ A + I V YR++G+ + L + A+ N + T +N + F+
Sbjct: 82 IRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFY 141

Query: 120 REIGFRIEAGD 130
+ F I A D
Sbjct: 142 AKHHFIIGAVD 152


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS2852   PF06580  29     0.017    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.4 bits (66), Expect = 0.017
Identities = 16/78 (20%), Positives = 33/78 (42%), Gaps = 7/78 (8%)

Query: 250 IQRLFDNIFQNVLKHSKAK---KLKIIIDEDIVYF--RDNGIGFDINSK-GTGLGLKNI- 302
+Q L +N ++ + LK D V + G N+K TG GL+N+
Sbjct: 260 VQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVR 319

Query: 303 EDISKMFDIKYTLQSNSE 320
E + ++ + ++ + +
Sbjct: 320 ERLQMLYGTEAQIKLSEK 337


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS2853   HTHFIS  75     8e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 74.9 bits (184), Expect = 8e-18
Identities = 36/117 (30%), Positives = 61/117 (52%), Gaps = 3/117 (2%)

Query: 6 ILIVEDDLIIGDLLQKILQREKYNVYWEKEGRKVLDII--HEIDLVVMDVMLPGEDGYQI 63
IL+ +DD I +L + L R Y+V + I + DLVV DV++P E+ + +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 64 TKKIKNLGLNIPIIFLSARNDMDSKLKGLTIGE-EYMIKPFDPRELLLRIQKMLGNQ 119
+IK ++P++ +SA+N + +K G +Y+ KPFD EL+ I + L
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122


32  BAS2900  BAS2929  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2900   3       20       -0.552952  metallo-beta-lactamase family protein
BAS2901   3       20       -1.246506  hypothetical protein
BAS2902   2       20       -2.244610  hypothetical protein
BAS2903   2       19       -1.432338  spore coat protein CotF
BAS2904   2       20       -1.055744  hypothetical protein
BAS2905   2       16       -0.722949  hypothetical protein
BAS2908   1       17       -0.975191  small, acid-soluble spore protein alpha/beta
BAS2909   3       15       -1.018002  hypothetical protein
BAS2910   2       12       0.516793   hypothetical protein
BAS2911   2       12       0.046757   small, acid-soluble spore protein
BAS2912   2       12       -0.256057  zinc-containing alcohol dehydrogenase
BAS2913   1       12       -1.607901  hypothetical protein
BAS2914   2       11       -0.640056  catalase
BAS2915   2       13       -1.240869  aspartate ammonia-lyase
BAS2916   4       18       -2.535863  L-asparaginase
BAS2917   5       18       -3.048607  transcriptional regulator AnsR
BAS2918   6       16       -2.458649  hypothetical protein
BAS2919   6       18       -2.253412  amino acid permease
BAS2920   6       20       -2.732106  branched-chain amino acid transport system II
BAS2921   4       19       -1.906872  pyrroline-5-carboxylate reductase
BAS2922   3       19       -2.302338  hypothetical protein
BAS2923   4       20       -1.695711  malate dehydrogenase
BAS2924   4       20       -2.672456  hypothetical protein
BAS2927   2       20       -2.881973  spore germination protein
BAS2928   1       17       -2.854785  spore germination protein GerAA
BAS2929   2       17       -4.066935  response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS2919   BACINVASINB  30     0.030    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 29.7 bits (66), Expect = 0.030
Identities = 48/195 (24%), Positives = 87/195 (44%), Gaps = 35/195 (17%)

Query: 174 TRESKRINNIMVLIK--IGMILLFITVGIFYVKPMNWIPIAPYGLSGVFTGGAAILFAFT 231
TR+++ N IM I +G +L ++V ++ VFTGGA++ A
Sbjct: 304 TRKAEETNRIMGCIGKVLGALLTIVSV-----------------VAAVFTGGASLALAAV 346

Query: 232 GFDILATSAEEVKDPKRNLPIGIIASLIICTIIYVMVCLVMTGMVSYKE-LNVPEAMAYV 290
G ++ E VK I + I+ ++ ++ L+ + E L V + A
Sbjct: 347 GLAVMVAD-EIVKAATGVSFIQQALNPIMEHVLKPLMELIGKAITKALEGLGVDKKTA-- 403

Query: 291 MEVVGQ--GKVAGAIAVGAVIGLMAVIFSNMYAATRVFFAMSRDGLLPKSFAKVNKKTGA 348
E+ G G + AIA+ AVI ++AV+ AA ++ A+S+ ++ ++ K+
Sbjct: 404 -EMAGSIVGAIVAAIAMVAVIVVVAVVGKG--AAAKLGNALSK--MMGETIKKL-----V 453

Query: 349 PTFITGLAGIGSSII 363
P + LA GS +
Sbjct: 454 PNVLKQLAQNGSKLF 468


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS2929   HTHFIS  50     7e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.8 bits (119), Expect = 7e-09
Identities = 20/123 (16%), Positives = 54/123 (43%), Gaps = 8/123 (6%)

Query: 5 IVDDEKAVRSMLAQIIEDEDLGEVIGEAENGLSLEQQMLILKN--IDILFIDLLMPIQDG 62
+ DD+ A+R++L Q + +V + + + D++ D++MP ++
Sbjct: 8 VADDDAAIRTVLNQALSRAGY-DVRITS----NAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 63 IKTIRQIKPSFKG-KIIMVSQVESKELIAEAYSLGVEYYIIKPINRIEVLTVVRKVIERI 121
+ +IK + ++++S + +A G Y+ KP + E++ ++ + +
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122

Query: 122 RLE 124
+
Sbjct: 123 KRR 125


33  BAS2944  BAS2952  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2944   2       17       -1.642004  hypothetical protein
BAS2948   3       17       -2.378473  LysR family transcriptional regulator
BAS2949   4       17       -2.117096  magnesium and cobalt transport protein
BAS2950   2       19       -2.260116  hypothetical protein
BAS2951   2       18       -2.181712  hypothetical protein
BAS2952   2       19       -2.198498  hypothetical protein
34  BAS3034  BAS3070  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS3034   2       14       2.107654   major facilitator family transporter protein
BAS3035   2       13       1.748077   hypothetical protein
BAS3036   2       13       1.301029   (Fe-S)-binding protein
BAS3037   2       16       0.102428   hypothetical protein
BAS3038   3       18       -0.868436  LysR family transcriptional regulator
BAS3039   4       17       -2.682300  ATP synthase F0F1 subunit alpha
BAS3040   6       21       -3.425035  hypothetical protein
BAS3041   8       25       -5.333111  hypothetical protein
BAS3042   8       24       -5.821423  hypothetical protein
BAS3043   6       21       -5.362225  hypothetical protein
BAS3044   6       20       -4.725719  hypothetical protein
BAS3045   5       22       -4.247992  Gfo/Idh/MocA family oxidoreductase
BAS3046   3       19       -4.024459  hypothetical protein
BAS3049   2       19       -3.635377  LacI family transcriptional regulator
BAS3050   2       18       -2.586375  hypothetical protein
BAS3051   2       17       -2.363833  hypothetical protein
BAS3052   1       14       -2.101090  hypothetical protein
BAS3053   1       14       -2.468522  impB/mucB/samB family protein
BAS3054   2       15       -2.276510  hypothetical protein
BAS3055   1       12       -2.418975  hypothetical protein
BAS3056   1       13       -2.566019  methyl-accepting chemotaxis protein
BAS3057   1       13       -1.889414  CAAX amino terminal protease family protein
BAS3058   0       14       -1.651950  spermine/spermidine acetyltransferase
BAS3059   0       13       -1.289512  MATE efflux family protein
BAS3060   0       15       -0.978803  collagenase
BAS3061   1       19       1.144255   Fic family protein
BAS3062   2       18       1.245932   transporter
BAS3063   1       17       0.234606   TetR family transcriptional regulator
BAS3065   3       18       -1.505237  arsR family transcriptional regulator
BAS3066   3       19       -1.600638  serine/threonine transporter family protein
BAS3067   4       18       -1.914910  L-serine dehydratase, iron-sulfur-dependent
BAS3068   5       19       -2.735512  L-serine dehydratase, iron-sulfur-dependent
BAS3069   6       18       -2.910776  diaminobutyrate--2-oxoglutarate
BAS3070   5       18       -3.138601  hydrogenase maturation protein HypF
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS3034   TCRTETA  34     0.001    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.0 bits (78), Expect = 0.001
Identities = 61/379 (16%), Positives = 127/379 (33%), Gaps = 42/379 (11%)

Query: 45 WGAILGYFGYGYMIGSLLGGIFSDKKGPKFVWIVAATAWSIFEIATAFAGEIGIAVFGGS 104
+G +L + + + G SD+ G + V +V+ ++ A A + + G
Sbjct: 45 YGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIG-- 102

Query: 105 ALIGFAIFRVLFGLTEGPSFAVSNKTAANWAAPKERAFLTSLGFVGVPLGAVLTA-PVAV 163
R++ G+T G + AV+ A+ ERA GF+ G + A PV
Sbjct: 103 --------RIVAGIT-GATGAVAGAYIADITDGDERA--RHFGFMSACFGFGMVAGPVLG 151

Query: 164 LLLSFTSWKIMFFILGTIGIVWAIIWYFTFTNMPEDHPRVTKEELAEIRSTEGVLQSAKV 223
L+ S FF + + + F +PE H + E + +
Sbjct: 152 GLMGGFSPHAPFFAAAALNGLNFLTGCFL---LPESHKGERRPLRREALNPLASFR---- 204

Query: 224 EKEIPKEPWYSFFKVPTFVMVTIAYFCFQYINFLILTWTPKYLQDVFHFQLSSLWYLGMI 283
W V +M +F Q + + + +D FH+ ++ +G+
Sbjct: 205 --------WARGMTVVAALMAV--FFIMQLVGQVPAALWVIFGEDRFHWDATT---IGIS 251

Query: 284 PWLGACITLPLGAKLSDRILRKTGNLRLARTGLPIIALLLTAICFSFIPAMNNYVAVLAL 343
+ A ++ + + G R G ++ + + +
Sbjct: 252 LAAFGILHSLAQAMITGPVAARLGERRALMLG-----MIADGTGYILLAFATRGWMAFPI 306

Query: 344 MSLGNAFAFLPSSLFWAIIVDTAPAYSGTYSGIMHFIANIATILAPTLTGYL---VVSYG 400
M L + +L + G G + + ++ +I+ P L + ++
Sbjct: 307 MVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTW 366

Query: 401 YPSMFIVAAILAAIAMGAM 419
+I A L + + A+
Sbjct: 367 NGWAWIAGAALYLLCLPAL 385



Score = 29.0 bits (65), Expect = 0.037
Identities = 28/161 (17%), Positives = 45/161 (27%), Gaps = 12/161 (7%)

Query: 290 ITLPLGAKLSDRILRKTGNLRLARTGLPIIALLLTAICFSFIPAMNNYVAVLALMSLGNA 349
P+ LSDR R+ ++ L A I A ++ VL + +
Sbjct: 58 ACAPVLGALSDRFGRR----------PVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAG 107

Query: 350 FAFLPSSLFWAIIVDTAPAYSGT-YSGIMHFIANIATILAPTLTGYLVVSYGYPSMFIVA 408
++ A I D + G M + P L G + + + F A
Sbjct: 108 ITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMG-GFSPHAPFFAA 166

Query: 409 AILAAIAMGAMLFVKPGQQTKTESLFNWRGKKRLEEPRANF 449
A L + F+ P L R
Sbjct: 167 AALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWAR 207


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS3060   MICOLLPTASE  749    0.0      Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 749 bits (1935), Expect = 0.0
Identities = 417/885 (47%), Positives = 578/885 (65%), Gaps = 18/885 (2%)

Query: 93 YTLAELNKMPNSELIDTLSKISWNQITDLFQFNQDTKAFYQNKERMNVIINELGQRGRTF 152
YT ELN+M S+L++ + IS+ + DLF FN + F+ N++R+ II L GRT+
Sbjct: 93 YTFDELNRMNYSDLVELIKTISYENVPDLFNFNDGSYTFFSNRDRVQAIIYGLEDSGRTY 152

Query: 153 TKENSKGIETFVEVLRSAFYVGYYNNELSYLKERSFHEKCLPALKAIAKNPNFTLGTAEQ 212
T ++ KGI T VE LR+ +Y+G+YN +LSYL +CLPA+KAI N NF LGT Q
Sbjct: 153 TADDDKGIPTLVEFLRAGYYLGFYNKQLSYLNTPQLKNECLPAMKAIQYNSNFRLGTKAQ 212

Query: 213 DRVVAAYGKLIGNASSDTETVQYAVNVLKQYNDNLTTYVSDYAKGQAVYEIVKGIDYDIQ 272
D VV A G+LIGNAS+D E + + VL + DN+ Y S+Y+KG AV+ ++KGIDY
Sbjct: 213 DGVVEALGRLIGNASADPEVINNCIYVLSDFKDNIDKYGSNYSKGNAVFNLMKGIDYYTN 272

Query: 273 SYLQDT-NKQPNETMWYGKIDNFINEVNRIALVGN-ITNENSWLINNGIYYAGRLGKFHS 330
S + +T T +Y +ID ++ + + +G+ + N+N+WL+NN +YY GR+GKF
Sbjct: 273 SVIYNTKGYDAKNTEFYNRIDPYMERLESLCTIGDKLNNDNAWLVNNALYYTGRMGKFRE 332

Query: 331 NPYKGLEVITQAMSLYPRLSGPYFVAVEQIKTNYGGKDYSGKAVDLQKIREEGKRQYLPK 390
+P + +AM YP LS Y A + N+GGK+ SG +D KI+ + + +YLPK
Sbjct: 333 DPSISQRALERAMKEYPYLSYQYIEAANDLDLNFGGKNSSGNDIDFNKIKADAREKYLPK 392

Query: 391 TYTFDDGSIVFKTGDKVTEEKIKRLYWAAKEVKAQYHRVIGNDKALEPGNADDVLTIVIY 450
TYTFDDG V K GDKVTEEKIKRLYWA+KEVKAQ+ RV+ NDKALE GN DD+LT+VIY
Sbjct: 393 TYTFDDGKFVVKAGDKVTEEKIKRLYWASKEVKAQFMRVVQNDKALEEGNPDDILTVVIY 452

Query: 451 NNPDEYQLNRQLYGYETNNGGIYIEEKRTFFTYERTPKQSIYSLEELFRHEFTHYLQGRY 510
N+P+EY+LNR + G+ T+NGGIYIE TFFTYERTP++SIY+LEELFRHEFTHYLQGRY
Sbjct: 453 NSPEEYKLNRIINGFSTDNGGIYIENIGTFFTYERTPEESIYTLEELFRHEFTHYLQGRY 512

Query: 511 EVPGLFGSGEMYQNERLTWFQEGNAEFFAGSTRTNNVVPRKSMISGLSSDPASRYTAKQT 570
VPG++G GE YQ LTW++EG AEFFAGSTRT+ + PRKS+ GL+ D +R +
Sbjct: 513 VVPGMWGQGEFYQEGVLTWYEEGTAEFFAGSTRTDGIKPRKSVTQGLAYDRNNRMSLYGV 572

Query: 571 LFSKYGSWDFYKYSFALQSYLYNHQFDTFDKLQDLIRVNDVKNYDSYRESLSNNTQLNAE 630
L +KYGSWDFY Y FAL +Y+YN+ F+K+ + I+ NDV Y Y S+S++ LN +
Sbjct: 573 LHAKYGSWDFYNYGFALSNYMYNNNMGMFNKMTNYIKNNDVSGYKDYIASMSSDYGLNDK 632

Query: 631 YQAYMQQLIDNQDKYNVPQVTNDYLIQHAPKPLAEVKNEIVDVANIKDAKITKYESQFFN 690
YQ YM L++N D +VP V+++Y+ H K + E+ N+I +V+NIKD +SQFF
Sbjct: 633 YQDYMDSLLNNIDNLDVPLVSDEYVNGHEAKDINEITNDIKEVSNIKDLSSNVEKSQFFT 692

Query: 691 TFTVEGKYTGGTSKGESEDWKTMSKQVNRTLEQLSQKGWSGYKTVTAYFVNYRVNAANQF 750
T+ + G Y GG S+GE DWK M+ ++N L++LS+K W+GYKTVTAYFVN++V+ +
Sbjct: 693 TYDMRGTYVGGRSQGEENDWKDMNSKLNDILKELSKKSWNGYKTVTAYFVNHKVDGNGNY 752

Query: 751 EYDIVFHGVATE--EKEKTNTIVN--MNGPYSGIVNEEIQFHSDGTKSENGKVTSYLWNF 806
YD+VFHG+ T+ N + S IV EEI F +K E+G++ +Y W+F
Sbjct: 753 VYDVVFHGMNTDTNTDVHVNKEPKAVIKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDF 812

Query: 807 GDGTTSTEANPTHVYEEKGTYTVELTVKDRRGKESKEQTKVTVKQD----------PQTG 856
GDG S EA TH Y + G Y V+LTV D G + E K+ V +D P
Sbjct: 813 GDGEKSNEAKATHKYNKTGEYEVKLTVTDNNGGINTESKKIKVVEDKPVEVINESEPNND 872

Query: 857 EFHEEEKVLLFNTLVKGNLVTPDQTDVYTFDVTDTKEVDISVVNEQNIGMTWVLYHESDM 916
F + ++ N LVKG L D +D Y FDV V I++ N ++G+TW LY E D+
Sbjct: 873 -FEKANQIAKSNMLVKGTLSEEDYSDKYYFDVAKKGNVKITLNNLNSVGITWTLYKEGDL 931

Query: 917 QNYVA-CGEDEGNVIKGKFEAKPGKYYLNVYKFDDKNGEYSLLVK 960
NYV ++G V+KG+ +PG+YYL+VY +D+++G Y++ VK
Sbjct: 932 NNYVLYATGNDGTVLKGEKTLEPGRYYLSVYTYDNQSGTYTVNVK 976



Score = 89.8 bits (222), Expect = 2e-20
Identities = 102/514 (19%), Positives = 173/514 (33%), Gaps = 90/514 (17%)

Query: 488 KQSIYSLEELF--RHEFTHYLQGRYEVPGLFGSGEMYQNERLTWFQEGNAEFFAGSTRTN 545
K I S+ + ++ Y+ + A+
Sbjct: 617 KDYIASMSSDYGLNDKYQDYMDSLLNNIDNLD----VPLVSDEYVNGHEAKDIN---EIT 669

Query: 546 NVVPRKSMISGLSSDPASRYTAKQTLFSKYGSWDFYKYSFALQSYLYNH-QFDTFDKLQD 604
N + S I LSS K F+ Y D +S + D KL D
Sbjct: 670 NDIKEVSNIKDLSS-----NVEKSQFFTTY---DMRGTYVGGRSQGEENDWKDMNSKLND 721

Query: 605 LIRVNDVKNYDSYRESL----------SNNTQLNAEYQAYMQQLIDNQDKYNVPQ--VTN 652
+++ K+++ Y+ + N + + + P+ + +
Sbjct: 722 ILKELSKKSWNGYKTVTAYFVNHKVDGNGNYVYDVVFHGMNTDTNTDVHVNKEPKAVIKS 781

Query: 653 DYLIQHAPKPLAEVKNEI-VDVANIKDA--KITKYESQFFNTFTVEGKYTGGTSKGESED 709
D + V+ EI D KD +I YE F + G E++
Sbjct: 782 DSSVI--------VEEEINFDGTESKDEDGEIKAYEWDFGD----------GEKSNEAKA 823

Query: 710 WKTMSKQVNRTLEQLSQKGWSGYKTVTAYFVNYRVNAANQFEYDIVFHGVATEEKEKTNT 769
+K ++ G T + ++ +++ + EK N
Sbjct: 824 THKYNKTGEYEVKLTVTDNNGGINTESK-----KIKVVEDKPVEVINESEPNNDFEKANQ 878

Query: 770 IVNMNGPYSGIVNEE---IQFHSDGTKSENGKVTS----------YLWNFGDGTT-STEA 815
I N G ++EE +++ D K N K+T L+ GD A
Sbjct: 879 IAKSNMLVKGTLSEEDYSDKYYFDVAKKGNVKITLNNLNSVGITWTLYKEGDLNNYVLYA 938

Query: 816 NPTHVYEEKGTYTVE-------------------LTVKDRRGKESKEQTKVTVKQDPQTG 856
KG T+E + VK E KE K +K+
Sbjct: 939 TGNDGTVLKGEKTLEPGRYYLSVYTYDNQSGTYTVNVKGNLKNEVKETAKDAIKEVENNN 998

Query: 857 EFHEEEKVLLFNTLVKGNLVTPDQTDVYTFDVTDTKEVDISVVNEQNIGMTWVLYHESDM 916
+F + KV N+ + G L D D+Y+ D+ + +++I V N NI M W+LY D+
Sbjct: 999 DFDKAMKVDS-NSKIVGTLSNDDLKDIYSIDIQNPSDLNIVVENLDNIKMNWLLYSADDL 1057

Query: 917 QNYVACGEDEGNVIKGKFEAKPGKYYLNVYKFDD 950
NYV +GN + + PGKYYL VY+F++
Sbjct: 1058 SNYVDYANADGNKLSNTCKLNPGKYYLCVYQFEN 1091


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS3062   TCRTETB  34     8e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.1 bits (78), Expect = 8e-04
Identities = 36/152 (23%), Positives = 69/152 (45%), Gaps = 3/152 (1%)

Query: 42 ISNEIGLSNSSAGLIVTLTQIGYVVGLLFLVPLGDIVENKKLILILLFLSAFA-LISMVF 100
I+N+ +S + T + + +G L D + K+L+L + ++ F +I V
Sbjct: 40 IANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVG 99

Query: 101 VKSATLLLIASFFIGLGSVAAQVLVP-LVSYLSSENARGRVVGNVMSGLLLGIMLARPIS 159
+LL++A F G G+ A LV +V+ + RG+ G + S + +G + I
Sbjct: 100 HSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIG 159

Query: 160 SLVADMWGWNAIFALSATVIIVLAFVLSKVLP 191
++A W+ + L + I+ L K+L
Sbjct: 160 GMIAHYIHWSYLL-LIPMITIITVPFLMKLLK 190


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS3063   HTHTETR  86     4e-23    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 85.8 bits (212), Expect = 4e-23
Identities = 26/170 (15%), Positives = 58/170 (34%), Gaps = 12/170 (7%)

Query: 4 KRGRPRNIETQKAILSASYELLLESGFKAVTVDKIADRAKVSKATIYKWWPNKAAVV--- 60
++ + ET++ IL + L + G + ++ +IA A V++ IY + +K+ +
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 61 ---MDGFLSAAAARLPVPDTGS---ALNDILTHATSLANFLISREGTIINELVGEGQFDS 114
+ + G L +IL H R +
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGE 122

Query: 115 --KLAEEYRARYFQPRRLQAKQLLEKGMKRGELKENLDVELSIDLIYGPI 162
+ + R + +Q L+ ++ L +L + ++ G I
Sbjct: 123 MAVVQQAQRNLCLESYDR-IEQTLKHCIEAKMLPADLMTRRAAIIMRGYI 171


35  BAS3219  BAS3254  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS3219   2       15       0.230144   acetyltransferase
BAS3220   1       14       0.029447   AMP-binding protein
BAS3223   0       17       -0.529986  hypothetical protein
BAS3226   2       18       -0.780357  ankyrin repeat-containing protein
BAS3227   3       22       -1.251822  arsR family transcriptional regulator
BAS3230   4       22       -0.906855  hypothetical protein
BAS3231   3       22       -2.864440  RNA polymerase sigma factor SigI
BAS3233   4       20       -1.060049  CAAX amino terminal protease family protein
BAS3234   1       19       1.590325   TetR family transcriptional regulator
BAS3235   -1      21       2.293755   hypothetical protein
BAS3236   -2      22       3.278616   hypothetical protein
BAS3238   -2      19       3.231983   hypothetical protein
BAS3239   -2      16       2.061687   ABC transporter efflux permease
BAS3240   -2      12       0.798918   ABC transporter ATP-binding protein
BAS3241   -1      14       -0.740699  hypothetical protein
BAS3242   0       13       -1.232331  hypothetical protein
BAS3243   -1      12       -1.426369  hydroxylamine reductase
BAS3244   2       17       -2.678634  hypothetical protein
BAS3245   -2      20       -2.376196  beta-lactamase II
BAS3246   -2      18       -1.917717  lysozyme
BAS3247   -1      20       -2.381865  hypothetical protein
BAS3248   1       20       -1.927993  hypothetical protein
BAS3249   2       18       -2.387467  hypothetical protein
BAS3250   0       17       -2.906912  hypothetical protein
BAS3251   1       16       -2.898518  penicillin-binding protein
BAS3252   0       17       -3.820257  hypothetical protein
BAS3253   1       15       -4.154658  hypothetical protein
BAS3254   0       13       -3.492984  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS3230   cloacin  33     0.003    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 32.8 bits (74), Expect = 0.003
Identities = 23/88 (26%), Positives = 29/88 (32%), Gaps = 6/88 (6%)

Query: 327 GNNGRGSQGNNGHQQENNGRGSQGNNGNQQGNNGRGSQGNNGHQQENNGRGSQGNNGNQQ 386
G +GRG N G G ++G +G ENN G +G
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDG------SGWSSENNPWGGGSGSGIHW 56

Query: 387 GDNGRGSQQGNNGNQQGDNGRGSQKENV 414
G G NGN G +G G V
Sbjct: 57 GGGSGHGNGGGNGNSGGGSGTGGNLSAV 84



Score = 29.7 bits (66), Expect = 0.026
Identities = 21/73 (28%), Positives = 30/73 (41%), Gaps = 2/73 (2%)

Query: 296 NNGRESQQGN--NGNQQGNNGRESQQGNNGNQQGNNGRGSQGNNGHQQENNGRGSQGNNG 353
N G S GN G G + G+ + + N G G+ H +G G+ G NG
Sbjct: 10 NTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNG 69

Query: 354 NQQGNNGRGSQGN 366
N G +G G +
Sbjct: 70 NSGGGSGTGGNLS 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS3234   HTHTETR  71     2e-17    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 71.2 bits (174), Expect = 2e-17
Identities = 28/172 (16%), Positives = 68/172 (39%), Gaps = 3/172 (1%)

Query: 8 KEKIIETSLYLFNTNGITRTSIQDIMTATELPKGSIYRRFKNKEEIVLAAYDKSGEIMWS 67
++ I++ +L LF+ G++ TS+ +I A + +G+IY FK+K ++ ++ S +
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGE 72

Query: 68 HFHKAMENK-KTAIDKILAIFLVYQDAANNPPI-AGGCPLLNSAIESTGVFPELQKAAAK 125
+ + + I + ++ ++ E G +Q+A
Sbjct: 73 LELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRN 132

Query: 126 GYDDTVMLMASLIKEGIEKQELKEEINIISLASFLASSMEGAIMASRVSNDN 177
++ + +K IE + L ++ A + + G M + +
Sbjct: 133 LCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGL-MENWLFAPQ 183


36  BAS3328  BAS3338  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS3328   2       15       0.483504   hypothetical protein
BAS3329   2       13       0.494497   hypothetical protein
BAS3330   1       17       -0.376401  hypothetical protein
BAS3331   -3      13       -0.381596  hypothetical protein
BAS3332   -2      13       0.314012   exonuclease
BAS3333   -2      12       -0.322070  cold shock protein CspB
BAS3334   -1      12       -0.068415  BNR repeat-containing protein
BAS3335   4       15       2.763859   flavodoxin
BAS3336   4       13       2.579906   hypothetical protein
BAS3337   4       14       2.355051   mutT/nudix family protein
BAS3338   2       13       1.885240   hypothetical protein
37  BAS3436  BAS3452  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS3436   -2      12       3.224266   hypothetical protein
BAS3437   -1      14       3.552411   Oye family NADH-dependent flavin oxidoreductase
BAS3438   -1      17       3.811351   CarD family transcriptional regulator
BAS3439   -1      17       3.615694   formimidoylglutamase
BAS3440   0       16       3.187628   imidazolonepropionase
BAS3441   -1      13       2.328792   urocanate hydratase
BAS3442   -1      13       0.628730   histidine ammonia-lyase
BAS3443   -3      14       -1.290959  anti-terminator HutP
BAS3444   -1      14       -1.019351  hypothetical protein
BAS3445   2       17       -0.290783  thiJ/pfpI family protein
BAS3446   4       17       0.307292   hypothetical protein
BAS3447   11      20       2.654575   hypothetical protein
BAS3448   10      19       2.515664   hypothetical protein
BAS3449   10      19       2.492745   hypothetical protein
BAS3450   10      20       1.772720   hypothetical protein
BAS3451   8       17       1.411716   hypothetical protein
BAS3452   8       17       1.230900   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS3440   UREASE  37     1e-04    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 37.0 bits (86), Expect = 1e-04
Identities = 19/56 (33%), Positives = 27/56 (48%), Gaps = 8/56 (14%)

Query: 356 TVNSSYAINRGDVAGKIRVGRKADLVLWDAYNYAYVPYHYGVSHVNTVWKNGNIAY 411
T+N + A G + VG++ADLVLW+ P +GV + V G IA
Sbjct: 410 TINPAIAHGLSHEIGSLEVGKRADLVLWN-------PAFFGVK-PDMVLLGGTIAA 457


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS3446   PF05272  29     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.010
Identities = 21/121 (17%), Positives = 42/121 (34%), Gaps = 28/121 (23%)

Query: 3 YVISLQGPMASGKTTLAKKLELHGLSVIYENPYPIVEKR---KQL---------NLD-MN 49
Y + L+G GK+TL L GL + + I + +Q+ +
Sbjct: 597 YSVVLEGTGGIGKSTLINT--LVGLDFFSDTHFDIGTGKDSYEQIAGIVAYELSEMTAFR 654

Query: 50 SKEGFIANQKMFIEAKIKEFQNAKGSVVIFDRGPEDIEFYTIFYPTTIGKEWDIETELKD 109
+ K F ++ ++ A + R +D + + TT +++ L D
Sbjct: 655 RAD--AEAVKAFFSSRKDRYRGA------YGRYVQDHPRQVVIWCTTNKRQY-----LFD 701

Query: 110 E 110

Sbjct: 702 I 702


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3450   CHLAMIDIAOM6  29     0.002    Chlamydia cysteine-rich outer membrane protein 6 si...
		>CHLAMIDIAOM6#Chlamydia cysteine-rich outer membrane protein 6

signature.
Length = 547

Score = 28.5 bits (63), Expect = 0.002
Identities = 11/27 (40%), Positives = 15/27 (55%)

Query: 12 YTIVVQNTGTLPAQNVTFTDPLPAGTT 38
Y I + N GT A+NV +P+P G
Sbjct: 229 YKINIVNQGTATARNVVVENPVPDGYA 255



Score = 27.7 bits (61), Expect = 0.003
Identities = 11/27 (40%), Positives = 13/27 (48%)

Query: 12 YTIVVQNTGTLPAQNVTFTDPLPAGTT 38
Y I V N G L ++V D L G T
Sbjct: 335 YVISVSNPGDLVLRDVVVEDTLSPGVT 361


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3452   CHLAMIDIAOM6  47     6e-07    Chlamydia cysteine-rich outer membrane protein 6 si...
		>CHLAMIDIAOM6#Chlamydia cysteine-rich outer membrane protein 6

signature.
Length = 547

Score = 47.4 bits (112), Expect = 6e-07
Identities = 38/163 (23%), Positives = 68/163 (41%), Gaps = 26/163 (15%)

Query: 787 VTYTITFTNQGTIPATNVTITDALPPGTSFVTNSVTVNNVTQPGASPVTGILVGTVNPGE 846
V Y I NQGT A NV + + +P G + V +G + PGE
Sbjct: 227 VVYKINIVNQGTATARNVVVENPVPDGYA------------HSSGQRVLTFTLGDMQPGE 274

Query: 847 TVTVTFQIQINAIPPSGKIENTASVTYTSQPNPNEPPITTTETTPTVTIPVRTANLNPQK 906
T+T + G+ N A+V+Y + N +TT P V + + A
Sbjct: 275 HRTITVEF---CPLKRGRATNIATVSYCGG-HKNTASVTTVINEPCVQVSIAGA------ 324

Query: 907 TVDREFASIGDTLTYTITLQNTGNIPATNVIITDSIPTGTTFI 949
+++ + + Y I++ N G++ +V++ D++ G T +
Sbjct: 325 ----DWSYVCKPVEYVISVSNPGDLVLRDVVVEDTLSPGVTVL 363



Score = 36.2 bits (83), Expect = 0.001
Identities = 42/176 (23%), Positives = 65/176 (36%), Gaps = 26/176 (14%)

Query: 1698 KTATPETVTLGDIITYTISLQNTGTIPANNILVSDPIPTGTSFIQNSVTINNVSQPTANP 1757
K PE L + Y I++ N GT A N++V +P+P G + +
Sbjct: 214 KQEGPENACLRCPVVYKINIVNQGTATARNVVVENPVPDG------------YAHSSGQR 261

Query: 1758 ETGIQIPTLSPSESATISFHVLVTSIPPSGEIQNQGNVSFQYQPDATKPPVSVTTPTPTT 1817
+ + P E TI+ G N VS+ K SVTT
Sbjct: 262 VLTFTLGDMQPGEHRTITVEFCPLK---RGRATNIATVSY---CGGHKNTASVTTVINEP 315

Query: 1818 ITPVNVGTINPIKTADKSIVSVGDTITFTITFQNEGTIPVTDISVTDSLPAGTSFI 1873
V+ I AD S V + + I+ N G + + D+ V D+L G + +
Sbjct: 316 CVQVS------IAGADWSYVC--KPVEYVISVSNPGDLVLRDVVVEDTLSPGVTVL 363



Score = 36.2 bits (83), Expect = 0.001
Identities = 37/163 (22%), Positives = 65/163 (39%), Gaps = 26/163 (15%)

Query: 127 ITYTITFNNDGTVPATNVIFTDSIPAGTTFIPNSVVLNNNPVPNSNPALGITVGTLNPGE 186
+ Y I N GT A NV+ + +P G + L T+G + PGE
Sbjct: 227 VVYKINIVNQGTATARNVVVENPVPDGYAH------------SSGQRVLTFTLGDMQPGE 274

Query: 187 TKTLSFQVRVTQIPAGGTITNEASTTYTYQPDPTLPPVTTTEPTPPTSVTVNTATVNPTK 246
+T++ + + G TN A+ +Y T VTT P V++ A
Sbjct: 275 HRTITVEFCPLK---RGRATNIATVSYCGGHKNT-ASVTTVINEPCVQVSIAGAD----- 325

Query: 247 SADRAFADIGDIITYTISLQNNGTVPATNIILTDPIPNGTTFI 289
++ + + Y IS+ N G + ++++ D + G T +
Sbjct: 326 -----WSYVCKPVEYVISVSNPGDLVLRDVVVEDTLSPGVTVL 363



Score = 35.1 bits (80), Expect = 0.004
Identities = 48/221 (21%), Positives = 80/221 (36%), Gaps = 40/221 (18%)

Query: 1315 ITYTISLQNTGTVPATNVLVTDPIPAGTTFIPNSVTINDVTQPGIVPSSGILIGTLEPNT 1374
+ Y I++ N GT A NV+V +P+P G + +G ++P
Sbjct: 227 VVYKINIVNQGTATARNVVVENPVPDGYAHSSGQRVLT------------FTLGDMQPGE 274

Query: 1375 SAVVTFQVQVTSIPPTGFIENQGTVSFQYQPDPTRPPVSVTTPTPTTKTQVSEVTINPNK 1434
+T + G N TVS+ T + T ++E + +
Sbjct: 275 HRTITVEFCPLK---RGRATNIATVSY----------CGGHKNTASVTTVINEPCVQVSI 321

Query: 1435 QGNPQTINLGDTVTYTITFQNVGNINATDVIITDPTPAGTTFIPNSVTINGVSSPGANPN 1494
G + + V Y I+ N G++ DV++ D G T + + GA +
Sbjct: 322 AGADWSY-VCKPVEYVISVSNPGDLVLRDVVVEDTLSPGVTVL---------EAAGAQIS 371

Query: 1495 SGVNVGTV---TPGQIVTLTYQVTVTALPPDGIIKNTATVT 1532
V TV PG+ +L Y+V V A P N +
Sbjct: 372 CNKVVWTVKELNPGE--SLQYKVLVRAQTPGQFTNNVVVKS 410



Score = 33.9 bits (77), Expect = 0.009
Identities = 39/164 (23%), Positives = 67/164 (40%), Gaps = 28/164 (17%)

Query: 523 VTFTVTFQNKGTVPATNVTVQDSLPQGVSFVPGSVVINGISQLGENPEIGIPIGTVNPGQ 582
V + + N+GT A NV V++ +P G + G V+ +G + PG+
Sbjct: 227 VVYKINIVNQGTATARNVVVENPVPDGYAHSSGQRVLT------------FTLGDMQPGE 274

Query: 583 SITVTFQGIVNSIPPG-GVIRNKANITFTYEPSPNEPPVTTTITTPETETTVNTATLEPQ 641
T+T V P G N A +++ N VTT I P + ++ A
Sbjct: 275 HRTIT----VEFCPLKRGRATNIATVSYC-GGHKNTASVTTVINEPCVQVSIAGA----- 324

Query: 642 KTVNRSFVTLNDIITYTLSFQNVGTVSATNVTITDSIPAGTTFI 685
+ S+V + Y +S N G + +V + D++ G T +
Sbjct: 325 ---DWSYVC--KPVEYVISVSNPGDLVLRDVVVEDTLSPGVTVL 363



Score = 32.7 bits (74), Expect = 0.016
Identities = 36/166 (21%), Positives = 63/166 (37%), Gaps = 32/166 (19%)

Query: 259 ITYTISLQNNGTVPATNIILTDPIPNGTTFIPNSVTINGISQPNTNPSTGITVGTLDPTE 318
+ Y I++ N GT A N+++ +P+P +G + + T+G + P E
Sbjct: 227 VVYKINIVNQGTATARNVVVENPVP------------DGYAHSSGQRVLTFTLGDMQPGE 274

Query: 319 AATISFQVQVISVPPHGLVENQGTVSFTHIVNPNEPPVTKTSPTPKTETAVNTIISTPTK 378
TI+ + G N TVS+ K +V T+I+ P
Sbjct: 275 HRTITVE---FCPLKRGRATNIATVSYCG--------------GHKNTASVTTVINEPCV 317

Query: 379 TADKQLAD---IGDTITYTITFRNGGTVPATNVTLIDSTPSGTTFI 421
AD + + Y I+ N G + +V + D+ G T +
Sbjct: 318 QVSIAGADWSYVCKPVEYVISVSNPGDLVLRDVVVEDTLSPGVTVL 363



Score = 32.0 bits (72), Expect = 0.029
Identities = 40/174 (22%), Positives = 66/174 (37%), Gaps = 26/174 (14%)

Query: 1434 KQGNPQTINLGDTVTYTITFQNVGNINATDVIITDPTPAGTTFIPNSVTINGVSSPGANP 1493
KQ P+ L V Y I N G A +V++ +P P +G +
Sbjct: 214 KQEGPENACLRCPVVYKINIVNQGTATARNVVVENPVP------------DGYAHSSGQR 261

Query: 1494 NSGVNVGTVTPGQIVTLTYQVTVTALPPDGIIKNTATVTYTFQPNPGEPPITITDPTPTV 1553
+G + PG+ T+T + G N ATV+Y + +T P V
Sbjct: 262 VLTFTLGDMQPGEHRTITVEFCPLK---RGRATNIATVSYC-GGHKNTASVTTVINEPCV 317

Query: 1554 EVSVITPTPNPNKLADKQIVDINEIITYTVTFQNRGSVPATSVIVTDPLANGLT 1607
+VS+ A + + + Y ++ N G + V+V D L+ G+T
Sbjct: 318 QVSI----------AGADWSYVCKPVEYVISVSNPGDLVLRDVVVEDTLSPGVT 361


38  BAS3481  BAS3543  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS3481   2       18       -3.536204  hypothetical protein
BAS3482   3       17       -4.278689  hypothetical protein
BAS3483   3       17       -4.621167  hypothetical protein
BAS3484   5       20       -5.103834  prophage LambdaBa01 TPR domain-containing
BAS3485   6       17       -1.346519  hypothetical protein
BAS3486   5       16       -0.039701  hypothetical protein
BAS3487   5       16       0.115690   hypothetical protein
BAS3488   5       15       1.197109   hypothetical protein
BAS3489   3       13       1.631800   hypothetical protein
BAS3490   3       14       1.430427   prophage LambdaBa01, N-acetylmuramoyl-L-alanine
BAS3491   3       14       1.322662   prophage LambdaBa01, holin
BAS3492   3       13       0.845880   prophage LambdaBa01, AbrB family transcriptional
BAS3497   3       13       1.090704   prophage LambdaBa01, membrane protein
BAS3498   2       17       0.478969   hypothetical protein
BAS3499   1       18       -0.700487  hypothetical protein
BAS3501   2       19       -0.249491  prophage LambdaBa01, major tail protein
BAS3502   4       19       -1.035850  hypothetical protein
BAS3503   4       18       -0.376401  hypothetical protein
BAS3504   5       17       -0.414032  hypothetical protein
BAS3505   2       16       -0.599732  hypothetical protein
BAS3506   3       15       -0.602472  hypothetical protein
BAS3507   4       16       -0.661315  phage major capsid protein
BAS3508   4       18       -0.822893  prophage LambdaBa01, prohead protease
BAS3509   4       20       -1.680755  hypothetical protein
BAS3510   4       20       -2.480112  prophage LambdaBa01, terminase, large subunit
BAS3511   8       26       -3.414593  hypothetical protein
BAS3512   8       25       -4.064471  hypothetical protein
BAS3513   8       24       -4.661260  hypothetical protein
BAS3514   9       23       -4.660261  hypothetical protein
BAS3515   9       24       -4.477942  hypothetical protein
BAS3516   8       24       -4.145784  hypothetical protein
BAS3517   7       25       -3.318155  hypothetical protein
BAS3518   7       23       -4.241900  hypothetical protein
BAS3519   8       23       -4.999744  hypothetical protein
BAS3520   8       24       -5.049562  hypothetical protein
BAS3521   8       23       -4.056559  hypothetical protein
BAS3522   9       24       -4.527402  positive control sigma-like factor
BAS3523   9       25       -5.777471  hypothetical protein
BAS3524   9       26       -4.444123  prophage LambdaBa01, acyltransferase
BAS3525   6       24       -0.432610  hypothetical protein
BAS3526   6       21       1.356139   hypothetical protein
BAS3527   4       22       1.261224   hypothetical protein
BAS3528   3       19       1.358880   hypothetical protein
BAS3529   3       19       1.958046   hypothetical protein
BAS3530   4       19       1.143769   prophage LambdaBa01, thymidylate
BAS3531   5       19       0.371807   prophage LambdaBa01, C-5 cytosine-specific DNA
BAS3532   6       17       -1.085629  hypothetical protein
BAS3533   4       16       -0.934871  hypothetical protein
BAS3534   4       19       -0.831099  hypothetical protein
BAS3535   2       21       -1.066655  hypothetical protein
BAS3536   0       21       -1.211418  hypothetical protein
BAS3537   1       22       -1.245721  hypothetical protein
BAS3538   3       21       -2.025263  hypothetical protein
BAS3539   6       23       -2.717222  hypothetical protein
BAS3540   7       21       -2.720548  hypothetical protein
BAS3541   4       19       -3.035430  hypothetical protein
BAS3542   1       18       -0.757517  hypothetical protein
BAS3543   2       17       -0.942052  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
BAS3498   GPOSANCHOR  37     1e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 37.0 bits (85), Expect = 1e-04
Identities = 32/234 (13%), Positives = 77/234 (32%), Gaps = 9/234 (3%)

Query: 15 DGETTGLQNALKDVNKRSNDLTKELKDVERLLKFDPGNIEALAQKQQLLTQQIENTTQKL 74
E + + L+ +K ++ +++++E +E + +I+ +
Sbjct: 91 TEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEK 150

Query: 75 DKLKAAEQQVQAQFQNGKISEEQYRAFRREIEFTEGSLNGLKNKLGNMKAEQDSVASSTR 134
L A + ++ + A + +E + +L + +L + +++
Sbjct: 151 AALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADS 210

Query: 135 QLETLFSATGKSVDDFAGALGNRLVNAIRSGTATSKQLDQAIGIIGREALGTEADIEKLQ 194
A + A L A+ S I + E EA +L+
Sbjct: 211 AKIKTLEAEKAA----LAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELE 266

Query: 195 RALRSV-----DAGNTIQQVQNELRDLQQEAGKTEKKFEGLKIGLENVIGGLAA 243
+AL I+ ++ E L+ E E + + L +++ L A
Sbjct: 267 KALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDA 320


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS3506   PF07675  26     0.024    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 26.2 bits (57), Expect = 0.024
Identities = 17/58 (29%), Positives = 23/58 (39%), Gaps = 3/58 (5%)

Query: 23 HYSVADSYESNDAERVMYLQDEGFLNKERIIEKQEGSKGPVHVGGGYYE---LPNGEK 77
HY+V S NDA E L + ++ E +G G Y + LP G K
Sbjct: 1172 HYAVYASSTGNDASNFANALLEEVLTAKTVVTAPEAIRGTRAQGTWYQKTVQLPAGTK 1229


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS3509   PF05043  30     0.020    Transcriptional activator
		>PF05043#Transcriptional activator

Length = 493

Score = 29.9 bits (67), Expect = 0.020
Identities = 18/99 (18%), Positives = 34/99 (34%), Gaps = 11/99 (11%)

Query: 196 AIKNSAVVKWILKFKSVLKQEDIDS------QVKNFVNNYLNISNDGGAASSDPRYDLEQ 249
I+N + W L + L ++++ + Q N + N+ NI SD + +L
Sbjct: 308 EIENKDNLIWHLHNTAHLYRQELFTEFILFDQKGNTIRNFQNIFPK---FVSDVKKELSH 364

Query: 250 VKPEAFVPDSKQMQETVQRIYNFFNTNEKIIQSKYNEDE 288
V S M Y F + ++ +
Sbjct: 365 YLETLEVCSSSMM--VNHLSYTFITHTKHLVINLLQNQP 401


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS3515   UREASE  29     0.003    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 28.6 bits (64), Expect = 0.003
Identities = 12/30 (40%), Positives = 18/30 (60%), Gaps = 2/30 (6%)

Query: 27 DEVLTTPEVMDVLGISKARISKMIKDGKLV 56
D V+T ++D GI KA I +KDG++
Sbjct: 69 DTVITNALILDHWGIVKADIG--LKDGRIA 96


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS3522   HELNAPAPROT  32     5e-04    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 32.2 bits (73), Expect = 5e-04
Identities = 9/50 (18%), Positives = 18/50 (36%), Gaps = 3/50 (6%)

Query: 1 MQDLIKQYNTTLRQLREAQKDAKEEDVKVLTDMISDITYSLE---WMKKA 47
+Q L+ Y + + A+E D+ + +E WM +
Sbjct: 101 VQALVNDYKQISSESKFVIGLAEENQDNATADLFVGLIEEVEKQVWMLSS 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS3527   TCRTETB  28     0.003    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 27.9 bits (62), Expect = 0.003
Identities = 7/21 (33%), Positives = 12/21 (57%)

Query: 1 MITFVGVLLTIKFTREESRRE 21
MIT + V +K ++E R +
Sbjct: 176 MITIITVPFLMKLLKKEVRIK 196


39  BAS3588  BAS3594  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS3588   2       13       -1.011268  peptidase T
BAS3589   2       11       -3.404496  hypothetical protein
BAS3590   1       12       -3.996618  hypothetical protein
BAS3591   1       13       -4.194842  phosphoglycerate mutase
BAS3592   1       13       -4.180401  alpha/beta hydrolase
BAS3593   0       14       -4.097542  glyoxylase
BAS3594   -1      14       -4.248068  sensory box/GGDEF family protein
40  BAS3730  BAS3751  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS3730   -2      14       3.274294   hypothetical protein
BAS3731   1       27       2.895714   hypothetical protein
BAS3733   0       29       2.926398   orotate phosphoribosyltransferase
BAS3734   0       28       3.110252   orotidine 5'-phosphate decarboxylase
BAS3735   0       27       2.908078   dihydroorotate dehydrogenase 1B
BAS3736   0       28       2.908514   dihydroorotate dehydrogenase electron transfer
BAS3737   1       27       2.603596   carbamoyl phosphate synthase large subunit
BAS3738   2       22       2.654018   carbamoyl phosphate synthase small subunit
BAS3739   1       20       2.279067   dihydroorotase
BAS3740   2       19       1.368953   aspartate carbamoyltransferase
BAS3741   2       20       1.613165   uracil permease
BAS3742   2       19       0.787615   bifunctional pyrimidine regulatory protein
BAS3743   1       19       0.630831   RNA pseudouridylate synthase
BAS3744   0       20       0.491762   lipoprotein signal peptidase
BAS3745   1       19       1.012893   hypothetical protein
BAS3746   1       19       0.920423   isoleucyl-tRNA synthetase
BAS3747   2       12       -0.141403  cell-division initiation protein DivIVA
BAS3748   2       14       -0.207559  S4 domain-containing protein
BAS3749   2       15       -0.338316  hypothetical protein
BAS3750   2       16       -1.086974  hypothetical protein
BAS3751   2       17       -1.554420  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS3739   UREASE  33     0.003    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 32.8 bits (75), Expect = 0.003
Identities = 25/83 (30%), Positives = 36/83 (43%), Gaps = 20/83 (24%)

Query: 17 IVATDLLVQDGKIAKV--AEN---------ITADNAEVIDVNGKLIAPGLVDVHVHLREP 65
IV D+ ++DG+IA + A N I EVI GK++ G +D H+H P
Sbjct: 83 IVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIVTAGGMDSHIHFICP 142

Query: 66 GGEHKETIETGTLAAAKGGFTTI 88
+ IE A G T +
Sbjct: 143 -----QQIEE----ALMSGLTCM 156


41  BAS3775  BAS3853  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS3775   2       13       -2.029511  hypothetical protein
BAS3776   1       15       -2.605475  prophage LambdaBa02, lipoprotein
BAS3777   2       19       -2.058547  hypothetical protein
BAS3778   3       16       -2.463943  DNA translocase FtsK
BAS3779   5       16       -0.871458  hypothetical protein
BAS3780   4       17       -0.356334  hypothetical protein
BAS3781   0       13       -1.209038  prophage LambdaBa02, repressor protein
BAS3782   0       13       -1.147926  hypothetical protein
BAS3783   1       12       -0.504085  hypothetical protein
BAS3784   0       13       -0.268131  prophage LambdaBa02, N-acetylmuramoyl-L-alanine
BAS3785   0       13       -0.316196  prophage LambdaBa02, holin
BAS3786   1       14       -0.592185  prophage LambdaBa02, site-specific recombinase
BAS3787   2       14       0.235571   prophage LambdaBa02, AbrB family transcriptional
BAS3788   1       12       0.807185   hypothetical protein
BAS3789   1       12       0.848306   phage minor structural protein
BAS3790   2       11       1.612588   hypothetical protein
BAS3791   4       12       1.409656   hypothetical protein
BAS3792   5       13       1.840339   hypothetical protein
BAS3793   3       12       1.535375   prophage LambdaBa02, tape measure protein
BAS3795   2       17       0.196892   hypothetical protein
BAS3796   2       20       0.706188   prophage LambdaBa02, major tail protein
BAS3797   1       16       -0.462009  hypothetical protein
BAS3799   1       16       -0.310506  hypothetical protein
BAS3800   2       15       -0.788511  hypothetical protein
BAS3801   2       14       -1.177031  hypothetical protein
BAS3802   1       15       -0.772052  hypothetical protein
BAS3803   2       14       -0.811371  prophage LambdaBa02, major capsid protein
BAS3804   3       16       -0.896225  prophage LambdaBa02, Clp protease family
BAS3805   4       18       -1.425255  hypothetical protein
BAS3806   3       19       -2.230126  prophage LambdaBa02, terminase, large subunit
BAS3807   5       20       -2.552381  hypothetical protein
BAS3808   4       21       -3.021983  prophage LambdaBa02, HNH endonuclease family
BAS3809   4       21       -3.278109  hypothetical protein
BAS3810   4       21       -3.127724  hypothetical protein
BAS3811   2       19       -2.945945  prophage LambdaBa02, site-specific recombinase
BAS3812   3       17       -2.221271  hypothetical protein
BAS3813   2       17       -2.906638  hypothetical protein
BAS3814   4       16       -2.722110  hypothetical protein
BAS3815   4       16       -2.339014  hypothetical protein
BAS3816   4       19       -0.072493  hypothetical protein
BAS3817   4       19       0.656450   hypothetical protein
BAS3818   3       21       0.242808   fosfomycin resistance protein FosB
BAS3819   3       21       1.503489   hypothetical protein
BAS3820   2       23       1.699946   prophage LambdaBa02, membrane protein
BAS3821   0       21       1.554806   prophage LambdaBa02, deoxyuridine
BAS3822   0       17       -0.428444  hypothetical protein
BAS3823   1       19       -0.771339  prophage LambdaBa02, RNA polymerase sigma-F
BAS3824   0       18       -0.330920  hypothetical protein
BAS3825   0       19       -0.933640  hypothetical protein
BAS3826   0       17       -1.732765  hypothetical protein
BAS3827   0       16       -3.205955  hypothetical protein
BAS3828   0       16       -3.443106  prophage LambdaBa02, DNA replication protein
BAS3829   1       16       -3.260801  prophage LambdaBa02, DNA-binding protein
BAS3830   2       16       -2.807378  prophage LambdaBa02, DNA-binding protein
BAS3831   0       15       -1.730858  prophage LambdaBa02, repressor protein
BAS3832   -3      14       -1.239892  hypothetical protein
BAS3833   -1      15       0.853046   prophage LambdaBa02, repressor protein
BAS3834   0       12       0.914815   hypothetical protein
BAS3835   0       13       0.462862   hypothetical protein
BAS3836   -1      12       0.190291   prophage LambdaBa02, site-specific recombinase
BAS3837   0       13       0.108547   hypothetical protein
BAS3838   2       15       0.221847   PDZ domain-containing protein
BAS3839   0       14       0.275250   phospholipase
BAS3840   0       15       0.455249   hypothetical protein
BAS3841   -2      17       0.520488   phosphopantetheine adenylyltransferase
BAS3842   -2      17       0.902035   methyltransferase
BAS3844   -2      18       0.566360   hypothetical protein
BAS3845   -3      17       0.473321   ComK regulator
BAS3846   1       18       0.208729   phosphoglycerate mutase
BAS3847   2       19       -0.386021  hypothetical protein
BAS3848   3       18       -0.608524  hypothetical protein
BAS3849   4       19       -0.866903  hypothetical protein
BAS3850   4       13       0.532967   hypothetical protein
BAS3851   3       12       0.661598   formamidase
BAS3852   3       13       0.285658   hypothetical protein
BAS3853   3       10       0.479100   cytochrome c oxidase subunit IVB
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3791   SHAPEPROTEIN  29     0.013    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 29.0 bits (65), Expect = 0.013
Identities = 10/41 (24%), Positives = 23/41 (56%)

Query: 29 KMQFTGVQMANGIAEGIKTQYSVVRDALQETVSGAVNSIRS 69
+++ G +A G+ G + + +ALQE ++G V+++
Sbjct: 233 EIEVRGRNLAEGVPRGFTLNSNEILEALQEPLTGIVSAVMV 273


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3808   TYPE3IMPPROT  29     0.004    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 29.0 bits (65), Expect = 0.004
Identities = 12/58 (20%), Positives = 21/58 (36%), Gaps = 1/58 (1%)

Query: 10 RKFYDKYNRDKEAKKFYDSTAWRRCRELALIRDNYRCQECMKHDPLIPVPADMVHHIK 67
R + K D+E +F+++ +R E K +PA + IK
Sbjct: 100 RDYLIK-YSDRELVQFFENAQLKRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIK 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3841   LPSBIOSNTHSS  228    5e-80    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 228 bits (583), Expect = 5e-80
Identities = 88/155 (56%), Positives = 115/155 (74%)

Query: 4 IAISSGSFDPITLGHLDIIKRGAKVFDEVYVVVLNNSSKKPFFSVEERLDLIREATKDIP 63
AI GSFDPIT GHLDII+RG ++FD+VYV VL N +K+P FSV+ERL+ I +A +P
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 64 NVKVDSHSGLLVEYAKMRNANAILRGLRAVSDFEYEMQITSMNRKLDENIETFFIMTNNQ 123
N +VDS GL V YA+ R A AILRGLR +SDFE E+Q+ + N+ L ++ET F+ T+ +
Sbjct: 62 NAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTE 121

Query: 124 YSFLSSSIVKEVARYGGSVVDLVPPVVERALKEKF 158
YSFLSSS+VKEVAR+GG+V VP V AL ++F
Sbjct: 122 YSFLSSSLVKEVARFGGNVEHFVPSHVAAALYDQF 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3844   56KDTSANTIGN  26     0.019    Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.

Length = 533

Score = 26.1 bits (57), Expect = 0.019
Identities = 9/34 (26%), Positives = 19/34 (55%), Gaps = 2/34 (5%)

Query: 47 MEQIEHMMQKLNKLPFVKKIEQSYRPYLKTEFEN 80
+EQI+ +Q+L ++++ S+ Y+ F N
Sbjct: 297 IEQIQSKIQELGDT--LEELRDSFDGYINNAFVN 328


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3848   ANTHRAXTOXNA  27     0.030    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 27.0 bits (59), Expect = 0.030
Identities = 10/27 (37%), Positives = 17/27 (62%)

Query: 65 SSVKENKKEKDNRTEEEKTADVMGQML 91
S +K N K + N+TE+EK D + ++
Sbjct: 41 SDIKRNHKTEKNKTEKEKFKDSINNLV 67


42  BAS3863  BAS3880  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS3863   2       14       0.044818   PhoH family protein
BAS3864   6       21       0.224867   hypothetical protein
BAS3865   3       20       0.881860   hypothetical protein
BAS3866   3       19       0.838862   hypothetical protein
BAS3867   3       19       0.947886   hypothetical protein
BAS3868   3       18       0.891871   GTP-binding protein TypA
BAS3869   -1      11       0.705249   hypothetical protein
BAS3870   -1      11       0.968453   inositol monophosphatase
BAS3871   -1      13       0.502630   hypothetical protein
BAS3872   -1      14       0.434533   hypothetical protein
BAS3873   -1      15       -0.096862  hypothetical protein
BAS3874   0       21       1.545987   lysine decarboxylase
BAS3875   1       30       2.287671   transglutaminase
BAS3876   3       39       3.313661   hypothetical protein
BAS3877   3       43       3.591832   hypothetical protein
BAS3878   3       43       3.602230   hypothetical protein
BAS3880   2       39       3.515835   dihydrolipoamide dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
BAS3868   TCRTETOQM  181    2e-51    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 181 bits (461), Expect = 2e-51
Identities = 101/476 (21%), Positives = 195/476 (40%), Gaps = 96/476 (20%)

Query: 8 LRNIAIIAHVDHGKTTLVDQLLRQAGTFRANEHVEE--RAMDSNDLERERGITILAKNTA 65
+ NI ++AHVD GKTTL + LL +G V++ D+ LER+RGITI T+
Sbjct: 3 IINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITS 62

Query: 66 IHYEDKRINILDTPGHADFGGEVERIMKMVDGVLLVVDAYEGCMPQTRFVLKKALEQNLT 125
+E+ ++NI+DTPGH DF EV R + ++DG +L++ A +G QTR + + +
Sbjct: 63 FQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIP 122

Query: 126 PIVVVNKIDRDFARPDEVVDEVIDLF---------IELG-------------------AN 157
I +NKID++ V ++ + +EL N
Sbjct: 123 TIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGN 182

Query: 158 EDQLE--------------------------FPVVFASAMNGTASLDSNPANQEENMKSL 191
+D LE FPV SA N + +L
Sbjct: 183 DDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNN------------IGIDNL 230

Query: 192 FDTIIEHIPAPIDNSEEPLQFQVALLDYNDYVGRIGVGRVFRGTMKVGQQVALMKVDGSV 251
+ I + + L +V ++Y++ R+ R++ G + + V + + +
Sbjct: 231 IEVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKE--- 287

Query: 252 KQFRVTKLFGYMGLKRQEIEEAKAGDLVAVSGMEDINVGETVCPVEHQDALPLLRIDEPT 311
+ ++T+++ + + +I++A +G++V + E + + + + + P
Sbjct: 288 -KIKITEMYTSINGELCKIDKAYSGEIVILQN-EFLKLNSVLGDTKLLPQRERIENPLPL 345

Query: 312 LQMTFLVNNSPFAGREGKYITSRKIEER------LRSQLETDVSLRVDNTESPDAWIVSG 365
LQ T + K ++R L ++D LR + I+S
Sbjct: 346 LQTT---------------VEPSKPQQREMLLDALLEISDSDPLLRYYVDSATHEIILSF 390

Query: 366 RGELHLSILIENMRRE-GYELQVSKPEVIIKEVDGVRCEPVERVQIDVPEEYTGSI 420
G++ + + ++ + E+++ +P VI E + E +++ P + SI
Sbjct: 391 LGKVQMEVTCALLQEKYHVEIEIKEPTVIYMERPLKKAEYTIHIEVP-PNPFWASI 445



Score = 39.8 bits (93), Expect = 3e-05
Identities = 17/77 (22%), Positives = 28/77 (36%), Gaps = 1/77 (1%)

Query: 403 EPVERVQIDVPEEYTGSIMESMGARKGEMLDMVNNGNGQVRLTFMVPARGLIGYTTEFLT 462
EP +I P+EY ++D N +V L+ +PAR + Y ++
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNN-EVILSGEIPARCIQEYRSDLTF 595

Query: 463 LTRGYGILNHTFDCYQP 479
T G + Y
Sbjct: 596 FTNGRSVCLTELKGYHV 612


43  BAS3947  BAS3957  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS3947   2       14       -2.847684  2-hydroxy-3-keto-5-methylthiopentenyl-1-
BAS3948   1       14       -1.596906  methylthioribulose-1-phosphate dehydratase
BAS3949   2       15       -2.152534  5-methylthio-3-oxo-1-penten-1,2-diol
BAS3950   2       15       -1.832996  hypothetical protein
BAS3951   3       16       -0.631351  hypothetical protein
BAS3954   4       18       -0.218349  sensory box/GGDEF family protein
BAS3955   3       31       1.564261   nitroreductase family protein
BAS3956   2       26       0.613874   hypothetical protein
BAS3957   2       24       0.483851   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3950   CHANLCOLICIN  30     6e-04    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 30.4 bits (68), Expect = 6e-04
Identities = 14/47 (29%), Positives = 19/47 (40%), Gaps = 9/47 (19%)

Query: 21 GAIMEELEVGVLGFVASCVSALFF--------GLFG-AIPISILCAF 58
+ LE S V AL F G++G AI ILC++
Sbjct: 461 KPLFLTLEKKAADAGVSYVVALLFSLLAGTTLGIWGIAIVTGILCSY 507


44  BAS4082  BAS4094  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS4082   2       15       1.327019   geranyltranstransferase
BAS4083   0       13       1.960695   exodeoxyribonuclease VII small subunit
BAS4084   0       14       1.806574   exodeoxyribonuclease VII large subunit
BAS4085   2       14       1.168711   bifunctional 5,10-methylene-tetrahydrofolate
BAS4086   2       14       0.533606   transcription antitermination protein NusB
BAS4087   2       12       0.179094   hypothetical protein
BAS4088   1       13       0.139420   acetyl-CoA carboxylase biotin carboxylase
BAS4089   2       15       -0.758329  acetyl-CoA carboxylase biotin carboxyl carrier
BAS4090   2       15       -1.009697  stage III sporulation protein AH
BAS4091   2       19       -0.213904  stage III sporulation protein AG
BAS4092   1       18       1.473026   stage III sporulation protein AF
BAS4093   1       21       2.499281   stage III sporulation protein AE
BAS4094   2       20       2.438017   stage III sporulation protein AD
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
BAS4089   RTXTOXIND  27     0.025    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 27.5 bits (61), Expect = 0.025
Identities = 8/25 (32%), Positives = 12/25 (48%)

Query: 140 GEIVEILVNNGQLVEYGQPLFLVKA 164
+ EI+V G+ V G L + A
Sbjct: 105 SIVKEIIVKEGESVRKGDVLLKLTA 129


45  BAS4109  BAS4121  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS4109   3       14       1.276037   spore photoproduct lyase
BAS4110   3       18       1.023777   hypothetical protein
BAS4111   3       20       1.323006   lipoate-protein ligase A
BAS4112   2       27       1.804053   rhodanese-like domain-containing protein
BAS4113   2       23       1.196786   LacI family transcriptional regulator
BAS4114   1       19       -1.354170  TetR family transcriptional regulator
BAS4115   0       20       -2.161699  sugE protein
BAS4116   2       21       -3.264486  sugE protein
BAS4117   4       25       -4.182545  hypothetical protein
BAS4119   1       20       -3.771999  hypothetical protein
BAS4121   -1      20       -3.165063  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS4110   IGASERPTASE  35     4e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 34.7 bits (79), Expect = 4e-04
Identities = 28/114 (24%), Positives = 40/114 (35%), Gaps = 7/114 (6%)

Query: 104 KENKETAEQEETVVEATPKKEVVVEVPKAVTPAPKPVTRVETPAIASTPKPTPAPT--PK 161
E KETA E+ +A + E EVPK + + ET + P PT K
Sbjct: 1098 TETKETATVEKEE-KAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIK 1156

Query: 162 PVSVEAAVELSTPAPVKK---AVPTPVTKQETTPVAPVKPKQSALTETNSKLQE 212
+ T P K+ V PVT+ T ++ T + Q
Sbjct: 1157 EPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGN-SVVENPENTTPATTQP 1209



Score = 32.7 bits (74), Expect = 0.002
Identities = 21/96 (21%), Positives = 31/96 (32%), Gaps = 10/96 (10%)

Query: 105 ENKETAEQEETVVEATPKKEVVVEVPKAVTPAPKPVTRV----------ETPAIASTPKP 154
E ++T E + + +PK+E V PA + V T K
Sbjct: 1115 ETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKE 1174

Query: 155 TPAPTPKPVSVEAAVELSTPAPVKKAVPTPVTKQET 190
T + +PV+ V TP T Q T
Sbjct: 1175 TSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPT 1210


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS4111   DHBDHDRGNASE  30     0.008    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 30.0 bits (67), Expect = 0.008
Identities = 26/98 (26%), Positives = 41/98 (41%), Gaps = 8/98 (8%)

Query: 93 VIVSEDHPNMPKTVTEAYRVISQGLLDGFKALGLE-AYYAVPKTEADRENLKNPRSG-VC 150
V V + +P+T AY + K LGLE A Y + R N+ +P S
Sbjct: 140 VTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI------RCNIVSPGSTETD 193

Query: 151 FDAPSWYEIVVEGRKIAGSAQTRQKGVILQHGSIPLEI 188
W + + I GS +T + G+ L+ + P +I
Sbjct: 194 MQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDI 231


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS4114   HTHTETR  61     6e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 61.2 bits (148), Expect = 6e-14
Identities = 39/203 (19%), Positives = 71/203 (34%), Gaps = 25/203 (12%)

Query: 2 TANRIKAVALSHFARYGYEGTSLANIAQEVGIKKPSIYAHFKGKEELYFICLESALQKDL 61
T I VAL F++ G TSL IA+ G+ + +IY HFK K +L+ E +
Sbjct: 12 TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIG 71

Query: 62 QSFTDDIENFSNSSTEELLLQLLKGYAKRFGESEESMFWLRTSYFPPDAFRE-QIIEK-- 118
+ + F +L ++L + E + + + E ++++
Sbjct: 72 ELELEYQAKFP-GDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQ 130

Query: 119 ANAHIENVGKLLFPIFKQANEKSELH-NIEVKDALEAFLCLLDGLM-------------- 163
N +E+ ++ K E L ++ + A + GLM
Sbjct: 131 RNLCLESYDRIE-QTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDLKK 189

Query: 164 -----VELLFAGLNRFETRLNAS 181
V +L T N +
Sbjct: 190 EARDYVAILLEMYLLCPTLRNPA 212


46  BAS4290  BAS4316  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS4290   2       15       3.079647   hypothetical protein
BAS4291   3       16       3.111003   tRNA-specific 2-thiouridylase MnmA
BAS4292   3       20       3.843364   class V aminotransferase
BAS4293   4       24       3.830084   rrf2 family protein
BAS4294   4       24       3.542433   recombination factor protein RarA
BAS4295   4       26       3.473513   prespore-specific transcriptional regulator
BAS4296   2       24       2.793485   hesA/moeB/thiF family protein
BAS4297   2       25       2.903208   aspartyl-tRNA synthetase
BAS4298   0       16       1.932618   histidyl-tRNA synthetase
BAS4300   0       13       1.614587   hypothetical protein
BAS4301   0       16       1.550156   D-tyrosyl-tRNA(Tyr) deacylase
BAS4302   0       16       1.382013   GTP pyrophosphokinase
BAS4303   1       13       1.418777   adenine phosphoribosyltransferase
BAS4304   0       11       1.135465   single-stranded-DNA-specific exonuclease RecJ
BAS4305   1       16       0.743541   cation efflux family protein
BAS4306   2       17       0.490476   preprotein translocase subunit SecD/SecF
BAS4307   -1      19       1.089969   hypothetical protein
BAS4308   -1      19       1.573098   stage V sporulation protein B
BAS4309   -2      19       1.382742   hypothetical protein
BAS4310   0       22       2.912039   hypothetical protein
BAS4311   0       21       3.208750   preprotein translocase subunit YajC
BAS4312   0       20       2.446392   queuine tRNA-ribosyltransferase
BAS4313   0       15       1.729470   S-adenosylmethionine--tRNA
BAS4314   0       13       0.982682   hypothetical protein
BAS4315   1       14       0.566360   Holliday junction DNA helicase RuvB
BAS4316   2       14       -0.769257  holliday junction DNA helicase RuvA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS4290   SYCDCHAPRONE  33     4e-04    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 33.0 bits (75), Expect = 4e-04
Identities = 17/90 (18%), Positives = 32/90 (35%)

Query: 8 GIQYMQEGNWEEAAKNFTEAIEENPKDALGYINFANLLDVLGDSERAILFYKRALELDDK 67
Q G +E+A K F + D+ ++ +G + AI Y +D K
Sbjct: 43 AFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIK 102

Query: 68 SAAAYYGLGNVYYGQEQFAEAKAVFEQAMQ 97
+ + + AEA++ A +
Sbjct: 103 EPRFPFHAAECLLQKGELAEAESGLFLAQE 132



Score = 31.1 bits (70), Expect = 0.002
Identities = 17/96 (17%), Positives = 27/96 (28%)

Query: 109 LGITHVQLGNDRLALPFLQRATELDENDVEAVFQCGLCFARLEHIQEAKPYFEKVLEMDE 168
L Q G A Q LD D G C + A + MD
Sbjct: 42 LAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDI 101

Query: 169 EHADAYYNLGVAYVFEENNEKALALFKKATEIQPDH 204
+ ++ + + +A + A E+ D
Sbjct: 102 KEPRFPFHAAECLLQKGELAEAESGLFLAQELIADK 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
BAS4292   RTXTOXINA  30     0.028    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.5 bits (66), Expect = 0.028
Identities = 25/123 (20%), Positives = 46/123 (37%), Gaps = 8/123 (6%)

Query: 114 GFEVTYLPVDETGRVQVSDIQKAL-TEETILVSVMFGNNEVGTMQPIAEIGKLLKEHQAY 172
G++ + E +S K E ++L++ + +G + + G ++Y
Sbjct: 444 GYDARHAAFLEDNFKILSQYNKEYSVERSVLITQQHWDTLIGELAGVTRNGDKTLSGKSY 503

Query: 173 FHTDAVQAYGLVEINVKEFGIDLLSISAHKINGPKGVGFLYAGTNVKF-EPLLIGGEQER 231
D + +E EF + I+ T +KF PLL GE+ R
Sbjct: 504 --IDYYEEGKRLEKKXDEFQKQVFDPLKGNIDLSDSKS----STLLKFVTPLLTPGEEIR 557

Query: 232 KRR 234
+RR
Sbjct: 558 ERR 560


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS4300   PF05043  25     0.020    Transcriptional activator
		>PF05043#Transcriptional activator

Length = 493

Score = 25.3 bits (55), Expect = 0.020
Identities = 9/32 (28%), Positives = 14/32 (43%)

Query: 32 FISKEQNNTSMELASEFGISLQDVKRLKKQIE 63
FI + + + EF IS + R+ QI
Sbjct: 94 FIFFNEGCQAESICKEFYISSSSLYRIISQIN 125


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS4301   THERMOLYSIN  28     0.010    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 28.4 bits (63), Expect = 0.010
Identities = 24/118 (20%), Positives = 46/118 (38%), Gaps = 16/118 (13%)

Query: 16 DGEIVGQIPFGLTLLVGITHEDTEKDATYIAEKIANLRIFEDESGKMNHSVLDVEGQVLS 75
DG+ +PF + V + HE T + + A L ++++ESG +N ++ D+ G ++
Sbjct: 352 DGDGQTFLPFSGGIDV-VGHELTHA----VTDYTAGL-VYQNESGAINEAMSDIFGTLVE 405

Query: 76 ----------ISQFTLYGDCRKGRRPNFMDAAKPDYAEHLYDFFNEEVRKQGLHVETG 123
I + + D AK +H + G+H +G
Sbjct: 406 FYANRNPDWEIGEDIYTPGVAGDALRSMSDPAKYGDPDHYSKRYTGTQDNGGVHTNSG 463


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS4306   SECFTRNLCASE  270    2e-86    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 270 bits (691), Expect = 2e-86
Identities = 100/318 (31%), Positives = 165/318 (51%), Gaps = 21/318 (6%)

Query: 443 PTKFDRINFVNVGHKFLIFSIVVVIAGAIILPIFKLNLGIDFASGTRIDLQSKQSVTVSD 502
P K + +F +IV++IA I+ + LN GIDF GT I +S ++ V
Sbjct: 9 PEKTN-FDFFRWQWATFGAAIVMMIASVILPLVIGLNFGIDFKGGTTIRTESTTAIDVGV 67

Query: 503 VHKDFKELNID---VKEENIVPTGDDNKGFAVR-----------TLGVLSKDEIAKTKTF 548
+ L + + E +D +R G ++ + K +T
Sbjct: 68 YRAALEPLELGDVIISEVRDPSFREDQHVAMIRIQMQEDGQGAEGQGAQGQELVNKVETA 127

Query: 549 FH--DKYGTDPNVSTVSPTIGKEIARNAFIAVLIASAVIILYVSIRFRFTYALSAVLALL 606
D + +V P + E+ A ++L A+ VI+ Y+ +RF + +AL AV+AL+
Sbjct: 128 LTAVDPALKITSFESVGPKVSGELVWTAVWSLLAATVVIMFYIWVRFEWQFALGAVVALV 187

Query: 607 HDAFVMIVIFSIFQLEVDLTFIAAVLTIIGYSINDSIVTFDRNRELYKQKKRVRDIKDLE 666
HD + + +F++ QL+ DLT +AA+LTI GYSIND++V FDR RE + K L
Sbjct: 188 HDVLLTVGLFAVLQLKFDLTTVAALLTITGYSINDTVVVFDRLRENLIKYKT----MPLR 243

Query: 667 EIVNASIRQTLGRSINTVLTVLFPVIALLIFGSESLRNFSFALLVGLVVGTYSSVFVASQ 726
+++N S+ +TL R++ T +T L ++ +LI+G + +R F FA++ G+ GTYSSV+VA
Sbjct: 244 DVMNLSVNETLSRTVMTGMTTLLALVPMLIWGGDVIRGFVFAMVWGVFTGTYSSVYVAKN 303

Query: 727 IWLMLENRRLKKGKNKKK 744
I L + R K+ K+
Sbjct: 304 IVLFIGLDRNKEKKDPSD 321



Score = 66.0 bits (161), Expect = 1e-13
Identities = 38/180 (21%), Positives = 84/180 (46%), Gaps = 11/180 (6%)

Query: 249 SVGAKFGQQALEQTIFASAIGIALIFLFMLV-FYRLPGLVAVIMLGLYIFVTLLVFNWMH 307
SVG K + + +++ +I ++ V F L AV+ L + +T+ +F +
Sbjct: 142 SVGPKVSGELVWTAVWSLLAATVVIMFYIWVRFEWQFALGAVVALVHDVLLTVGLFAVLQ 201

Query: 308 AVLTLPGIAALVLGVGIAVDANIITYERLKEELKIGKSMM------SAFRAGNHRSLATI 361
L +AAL+ G +++ ++ ++RL+E L K+M + R++ T
Sbjct: 202 LKFDLTTVAALLTITGYSINDTVVVFDRLRENLIKYKTMPLRDVMNLSVNETLSRTVMTG 261

Query: 362 LDANITTLAAAGVLFVYGNSSVKGFATSLIVSILVGFITNVFGTRFLLSLLVKSRYFDKK 421
+ TTL A + ++G ++GF +++ + G ++V+ + ++ + R +KK
Sbjct: 262 M----TTLLALVPMLIWGGDVIRGFVFAMVWGVFTGTYSSVYVAKNIVLFIGLDRNKEKK 317


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS4311   PF06580  28     0.006    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 27.5 bits (61), Expect = 0.006
Identities = 8/39 (20%), Positives = 19/39 (48%)

Query: 7 NIVMIVAMFAIFYFLLIRPQQKRQKAVAQMQSELKKGDA 45
N+V++ M+++ YF + +Q + Q + +A
Sbjct: 123 NVVVVTFMWSLLYFGWHFFKNYKQAEIDQWKMASMAQEA 161


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS4314   ACRIFLAVINRP  26     0.011    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 26.0 bits (57), Expect = 0.011
Identities = 13/59 (22%), Positives = 30/59 (50%), Gaps = 5/59 (8%)

Query: 1 MTEMPKLLITAGILLIVVGLAWKFIGRLPGDIFVKKGNVTFYFPIITCIVLSIVLSFIM 59
M+++ L+ ++L V + F G G I+ + F I++ + LS++++ I+
Sbjct: 435 MSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQ-----FSITIVSAMALSVLVALIL 488


47  BAS4350  BAS4371  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS4350   0       17       3.106374   rod shape-determining protein MreB
BAS4351   -1      14       2.739910   DNA repair protein RadC
BAS4352   1       13       2.381752   Maf-like protein
BAS4353   0       15       2.835267   stage II sporulation protein B
BAS4354   -1      17       2.889739   folylpolyglutamate synthase
BAS4355   0       17       3.124560   valyl-tRNA synthetase
BAS4356   -1      13       2.690920   hypothetical protein
BAS4357   -1      12       1.972691   stage VI sporulation protein D
BAS4358   1       13       1.962315   glutamate-1-semialdehyde aminotransferase
BAS4359   1       12       0.859089   delta-aminolevulinic acid dehydratase
BAS4360   2       14       0.854373   uroporphyrinogen-III synthase
BAS4361   1       14       0.364870   porphobilinogen deaminase
BAS4362   0       14       0.903212   hemX protein
BAS4363   0       15       2.080977   glutamyl-tRNA reductase
BAS4364   -1      17       2.224114   marR family transcriptional regulator
BAS4365   1       19       2.777555   organic hydroperoxide resistance protein
BAS4366   2       16       2.086202   ribosome biogenesis GTP-binding protein YsxC
BAS4367   2       15       1.842564   ATP-dependent protease La 1
BAS4368   2       16       1.447395   ATP-dependent protease LA
BAS4369   4       20       0.856662   ATP-dependent protease ATP-binding subunit ClpX
BAS4370   6       20       0.744296   trigger factor
BAS4371   3       15       0.498332   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS4350   SHAPEPROTEIN  497    e-180    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 497 bits (1281), Expect = e-180
Identities = 194/336 (57%), Positives = 252/336 (75%), Gaps = 5/336 (1%)

Query: 4 FGGFTRDLGIDLGTANTLVYVKGKGVVLREPSVVALQTD----TKQIVAVGSDAKQMIGR 59
G F+ DL IDLGTANTL+YVKG+G+VL EPSVVA++ D K + AVG DAKQM+GR
Sbjct: 6 RGMFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKSVAAVGHDAKQMLGR 65

Query: 60 TPGNVVALRPMKDGVIADYETTATMMKYYIQQAQKSNGFFSRKPYVMVCVPSGITAVERR 119
TPGN+ A+RPMKDGVIAD+ T M++++I+Q SN F P V+VCVP G T VERR
Sbjct: 66 TPGNIAAIRPMKDGVIADFFVTEKMLQHFIKQV-HSNSFMRPSPRVLVCVPVGATQVERR 124

Query: 120 AVIDATRQAGARDAYPIEEPFAAAIGANLPVWEPTGSMVVDIGGGTTEVAIISLGGIVTS 179
A+ ++ + AGAR+ + IEEP AAAIGA LPV E TGSMVVDIGGGTTEVA+ISL G+V S
Sbjct: 125 AIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYS 184

Query: 180 QSVRVAGDDMDDSIIQYIKKSYNLMIGERTAEALKLEIGSAGEPEGIEPMEIRGRDLVSG 239
SVR+ GD D++II Y++++Y +IGE TAE +K EIGSA + + +E+RGR+L G
Sbjct: 185 SSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEG 244

Query: 240 LPKTVLIQPEEIADALKDTVDAIVESVKNTLEKTPPELAADIMDRGIVLTGGGALLRNLD 299
+P+ + EI +AL++ + IV +V LE+ PPELA+DI +RG+VLTGGGALLRNLD
Sbjct: 245 VPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLD 304

Query: 300 KVISEETNMPVLVAEDPLDCVAIGTGKALDNIDLFK 335
+++ EET +PV+VAEDPL CVA G GKAL+ ID+
Sbjct: 305 RLLMEETGIPVVVAEDPLTCVARGGGKALEMIDMHG 340


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS4359   ENTEROVIROMP  31     0.004    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 31.0 bits (70), Expect = 0.004
Identities = 32/157 (20%), Positives = 54/157 (34%), Gaps = 25/157 (15%)

Query: 146 AVLAKTAVSQAKAGADIIAPSNMMDGFVTAIRHALDENGFGHVPVMSYAVKYSSAFYGPF 205
+V A + V+ A +D N M GF R+ D + G + +Y K +A G +
Sbjct: 21 SVAATSTVTGGYAQSDAQGQMNKMGGFNLKYRYEEDNSPLGVIGSFTYTEKSRTASSGDY 80

Query: 206 RDAAHGAPQFGDRKTYQMDPANRME-----------AFREAESDVMEGADFLIVKPALSY 254
+ G PA R+ + + ++ SY
Sbjct: 81 NKNQYYGITAG--------PAYRINDWASIYGVVGVGYGKFQTTEYPTYKHDTSDYGFSY 132

Query: 255 LDIVRDVKNNFN-LPVVAYNVSGEYSMIKAAAQNGWI 290
++ FN + VA + S E S I++ WI
Sbjct: 133 GAGLQ-----FNPMENVALDFSYEQSRIRSVDVGTWI 164


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
BAS4366   TCRTETOQM  28     0.027    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 27.9 bits (62), Expect = 0.027
Identities = 18/90 (20%), Positives = 37/90 (41%), Gaps = 13/90 (14%)

Query: 58 KTQTLNFFLINEMMHFVDVPGYGYAKVSKTERAAWGKMIETYFTTREQLDAAVLVVDLRH 117
+T +F N ++ +D PG+ +++ R+ LD A+L++ +
Sbjct: 57 QTGITSFQWENTKVNIIDTPGH-MDFLAEVYRSL------------SVLDGAILLISAKD 103

Query: 118 KPTNDDVMMYDFLKHYDIPTIIIATKADKI 147
+++ L+ IPTI K D+
Sbjct: 104 GVQAQTRILFHALRKMGIPTIFFINKIDQN 133


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS4367   HTHFIS  38     2e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 37.5 bits (87), Expect = 2e-04
Identities = 29/101 (28%), Positives = 43/101 (42%), Gaps = 14/101 (13%)

Query: 352 LCLVGPPGVGKTSLARSI-ATSLNRN--FVRVSLGGVRD---ESEIRGHRRTYVGAMPGR 405
L + G G GK +AR++ RN FV +++ + ESE+ GH + GA G
Sbjct: 163 LMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEK---GAFTGA 219

Query: 406 IIQGMKKAKSVNP-VFLLDEIDKMSNDFRGDPSAALLEVLD 445
+ + + LDEI M D + LL VL
Sbjct: 220 QTRSTGRFEQAEGGTLFLDEIGDMPMDAQ----TRLLRVLQ 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS4368   HTHFIS  58     4e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 58.3 bits (141), Expect = 4e-11
Identities = 43/214 (20%), Positives = 76/214 (35%), Gaps = 41/214 (19%)

Query: 44 ELEQLRKMREISLTEPLAEKVR----PTSFLDIVGQEDGIKSLK--AALCGPNPQHVIIY 97
+L +L + +L EP + + +VG+ ++ + A ++I
Sbjct: 107 DLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMIT 166

Query: 98 GPPGVGKTAAARLVLEEAKRNPKSPFRTNATFIELDATTARFDERGIADPLIGSVHDPIY 157
G G GK AR + + KR F+ ++ A I L G
Sbjct: 167 GESGTGKELVARALHDYGKRRNGP-------FVAINM--AAIPRDLIESELFGHE----- 212

Query: 158 QGAGAMGQAGIPQPKKGAVTDAHGGILFIDEIGELHPIQMNKMLKVLEDRKVFLESAYYS 217
GA G G A GG LF+DEIG++ ++L+VL+ +
Sbjct: 213 --KGAF--TGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGG--- 265

Query: 218 EENTMIPTYIHDIFQKGLPADFRLVGATTRSPEE 251
+ + +D R+V AT + ++
Sbjct: 266 --------------RTPIRSDVRIVAATNKDLKQ 285


48  BAS4415  BAS4430  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS4415   2       13       2.312766   hypothetical protein
BAS4416   1       11       2.446751   excinuclease ABC subunit C
BAS4417   0       13       3.400083   thioredoxin
BAS4418   -2      14       4.632714   electron transfer flavoprotein subunit alpha
BAS4419   -2      11       3.437406   electron transfer flavoprotein subunit beta
BAS4420   -1      13       3.788534   enoyl-CoA hydratase
BAS4421   -1      14       3.623892   TetR family transcriptional regulator
BAS4422   0       14       3.304840   long-chain-fatty-acid--CoA ligase
BAS4423   0       18       2.840163   triple helix repeat-containing collagen
BAS4424   1       16       -0.163096  iron ABC transporter substrate-binding protein
BAS4425   1       17       0.012424   iron-hydroxamate transporter permease subunit
BAS4426   0       13       -3.089394  hypothetical protein
BAS4427   -1      15       -4.671096  spore coat protein C
BAS4428   -2      15       -4.114403  hypothetical protein
BAS4429   -1      13       -4.131767  hypothetical protein
BAS4430   -1      12       -3.067855  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS4421   HTHTETR  113    2e-33    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 113 bits (283), Expect = 2e-33
Identities = 36/192 (18%), Positives = 75/192 (39%), Gaps = 10/192 (5%)

Query: 5 RPKYNQIIDAAVIVIAENGYHQAQVSKIAKQAGVADGTIYLYFKNKEDILISLFQEKMGE 64
+ I+D A+ + ++ G + +IAK AGV G IY +FK+K D+ +++
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 65 FVETIRQKTAGIESAVSKLFMLVETHFLLLSQNDPL--AIVTQLELRQSNQDLRLKINEV 122
E + A + + H L + + ++ + + + +
Sbjct: 70 IGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQA 129

Query: 123 LKGY----LQVIDEILETGIKQGEFQADLNVRVARQMIFGTVDEVVTNWVMSDHKYDLVA 178
+ I++ L+ I+ ADL R A ++ G + ++ NW+ + +DL
Sbjct: 130 QRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDL-- 187

Query: 179 LSKTVHGLLIAA 190
K +A
Sbjct: 188 --KKEARDYVAI 197


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS4424   FERRIBNDNGPP  183    5e-58    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 183 bits (465), Expect = 5e-58
Identities = 62/258 (24%), Positives = 115/258 (44%), Gaps = 11/258 (4%)

Query: 52 AKKVVVLEWVYSEDLLALGVQPVGMADIKNYNKWVNTKTKPSKDVVDVGTRQQPNLEEIS 111
++V LEW+ E LLALG+ P G+AD NY WV+ P V+DVG R +PNLE ++
Sbjct: 35 PNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPPLP-DSVIDVGLRTEPNLELLT 93

Query: 112 RLKPDLIITASFRGKAIKNELEQIAPTVMFDPSTSNNDHFAEMTETFKQIAKAVGKEEEG 171
+KP ++ S L +IAP F+ S A ++ ++A + +
Sbjct: 94 EMKPSFMVW-SAGYGPSPEMLARIAPGRGFNFSDGKQP-LAMARKSLTEMADLLNLQSAA 151

Query: 172 KKVLADMDKAFADAKAKIEKADLKDKNIAMAQAFTAKNVPTFRILTDNSLALQVTKKLGL 231
+ LA + K + K + + + +++ + NSL ++ + G+
Sbjct: 152 ETHLAQYEDFIRSMKPRFVKRGARP--LLLTTLIDPRHM---LVFGPNSLFQEILDEYGI 206

Query: 232 TNTFEAGKSEPDGFKQTTVESLQSVQDSNFIYIVADEDNIFDTQLKGNPAWEELKFKKEN 291
N ++ G++ G +++ L + +D + + D D L P W+ + F +
Sbjct: 207 PNAWQ-GETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMD-ALMATPLWQAMPFVRAG 264

Query: 292 KMYKLKGDTWIFGGPESA 309
+ ++ W +G SA
Sbjct: 265 RFQRVP-AVWFYGATLSA 281


49  BAS4443  BAS4471  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS4443   2       14       0.864888   hypothetical protein
BAS4444   1       15       1.083366   cell wall anchor domain-containing protein
BAS4445   1       15       1.437621   branched-chain amino acid transport system II
BAS4446   2       16       2.040866   RNA pseudouridylate synthase
BAS4447   2       18       1.913111   recombination and DNA strand exchange inhibitor
BAS4448   0       20       1.106090   hypothetical protein
BAS4449   1       27       2.022142   colicin V production protein CvpA
BAS4450   4       33       3.514524   cell division protein ZapA
BAS4451   4       35       3.623765   ribonuclease HIII
BAS4452   4       35       3.508495   hypothetical protein
BAS4453   4       36       3.455249   hypothetical protein
BAS4454   4       35       3.611337   asparaginyl-tRNA synthetase
BAS4455   4       29       2.601447   phenylalanyl-tRNA synthetase subunit beta
BAS4456   0       20       0.227038   phenylalanyl-tRNA synthetase subunit alpha
BAS4457   0       17       -1.182519  RNA methyltransferase
BAS4458   1       21       -2.017910  small, acid-soluble spore protein SspI
BAS4459   1       14       -0.381288  HD domain-containing protein
BAS4460   1       13       -0.560614  CAAX amino terminal protease family protein
BAS4461   1       13       -0.162117  CAAX amino terminal protease family protein
BAS4462   2       15       1.530141   hypothetical protein
BAS4463   2       17       1.167518   hypothetical protein
BAS4464   3       20       1.387210   EmrB/QacA family drug resistance transporter
BAS4465   5       23       1.282322   hypothetical protein
BAS4466   5       23       1.691434   TetR family transcriptional regulator
BAS4467   3       32       2.187981   M42 family peptidase
BAS4468   3       27       0.463174   hypothetical protein
BAS4469   3       23       1.106625   50S ribosomal protein L20
BAS4470   2       17       1.414387   50S ribosomal protein L35
BAS4471   2       15       1.287080   translation initiation factor IF-3
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
BAS4447   GPOSANCHOR  37     2e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 37.4 bits (86), Expect = 2e-04
Identities = 35/118 (29%), Positives = 60/118 (50%), Gaps = 11/118 (9%)

Query: 518 KIENMIAKLEE-------SQKNAERDWNEAEALRKQSEKLHREL--QRQIIEFNEERDER 568
++E KLEE S+++ RD + + +KQ E H++L Q +I E + + R
Sbjct: 327 QLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRR 386

Query: 569 LLKAQKEGEEKVEAAKKEAEGIIQELRQLRKAQLANVK--DHELIEAKSRLEGAAPEL 624
L A +E +++VE A +EA + L +L K + K + E E +++LE A L
Sbjct: 387 DLDASREAKKQVEKALEEANSKLAALEKLNKELEESKKLTEKEKAELQAKLEAEAKAL 444


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS4458   DNABINDINGHU  24     0.033    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 24.3 bits (53), Expect = 0.033
Identities = 10/33 (30%), Positives = 15/33 (45%), Gaps = 1/33 (3%)

Query: 19 DQLQETIVDAIQSGEEKMLPGLGVLFEVIWKNA 51
D + + + GE+ L G G FEV + A
Sbjct: 27 DAVFSAVSSYLAKGEKVQLIGFGN-FEVRERAA 58


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS4464   TCRTETB  146    4e-40    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 146 bits (369), Expect = 4e-40
Identities = 86/400 (21%), Positives = 174/400 (43%), Gaps = 14/400 (3%)

Query: 108 FVSILNQTIINVALPPLMNEFNVSTSTAQWLITGFMLVNGILVPISAFLVSRFTYRKLFV 167
F S+LN+ ++NV+LP + N+FN ++ W+ T FML I + L + ++L +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 168 AAMLFFTVGSIICATSGN-FTMMMTGRVIQAVGAGILMPVGMNIFMTLFPPHKRGAAMGL 226
++ GS+I + F++++ R IQ GA + M + P RG A GL
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 227 LGVAMILAPAIGPTVTGWVIENYSWNLMFYAMFIIGLIITFLSLKFFTLAQPVSNTKLDI 286
+G + + +GP + G + W+ + I IIT L + DI
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMI--TIITVPFLMKLLKKEVRIKGHFDI 201

Query: 287 FGVVSSSIGLGSLLYGFSEAGNNSWTSAEVIISLVIGVIGLALFIWRELTTDNKMLDLQV 346
G++ S+G+ + + S + L++ V+ +F+ + +D +
Sbjct: 202 KGIILMSVGIVFFMLF---TTSYSIS------FLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 347 FKYPVFTFTLVINAIVTMALFGGMLLLPVYLQNIRGFTPIESG-LLLLPGSLIMGIMGPV 405
K F ++ I+ + G + ++P ++++ + E G +++ PG++ + I G +
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 406 AGKLFDKYGIRPLAIIGLAITTYATYEFTKLSMDTPYSVIMTDYIIRSIGMSFIMMPIMT 465
G L D+ G + IG+ + + + L T + + + G+SF I T
Sbjct: 313 GGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSW-FMTIIIVFVLGGLSFTKTVIST 371

Query: 466 AGMNALPMKLISHGTATQNTSRQVAGSIGTAILITLMTQQ 505
++L + G + N + ++ G AI+ L++
Sbjct: 372 IVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIP 411


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
BAS4465   RTXTOXIND  79     3e-19    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 79.1 bits (195), Expect = 3e-19
Identities = 29/135 (21%), Positives = 49/135 (36%), Gaps = 12/135 (8%)

Query: 87 QTVDVTIPQNATVVQSNATT-NAFVGAGSPI-AYAFDMNNLWVTANIEETDVDDVQKGQD 144
Q + P + V Q T V + + + L VTA ++ D+ + GQ+
Sbjct: 326 QASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQN 385

Query: 145 VDVYVDAYPDTT---LTGKVEQVGLTTANTFSMLPSSNATANYTKVTQVVPVKISLDHSK 201
+ V+A+P T L GKV+ + L V + +K
Sbjct: 386 AIIKVEAFPYTRYGYLVGKVKNINLDAI-------EDQRLGLVFNVIISIEENCLSTGNK 438

Query: 202 SVNIVPGMNVTVRIH 216
++ + GM VT I
Sbjct: 439 NIPLSSGMAVTAEIK 453


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS4466   HTHTETR  60     2e-13    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 60.0 bits (145), Expect = 2e-13
Identities = 24/100 (24%), Positives = 39/100 (39%), Gaps = 6/100 (6%)

Query: 9 PRVKRTRQLIQDAFVALVGEKGFENVTVQHIAERAPVNRATFYSHYHDKYDLLDKSIEEM 68
+ TRQ I D + L ++G + ++ IA+ A V R Y H+ DK DL + E
Sbjct: 7 QEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELS 66

Query: 69 LEKLTEVIKPKNRNKEDFQLAFDSPHPNFLALFEHIAENA 108
+ E+ E P + H+ E+
Sbjct: 67 ESNIGELE------LEYQAKFPGDPLSVLREILIHVLEST 100


50  BAS4503  BAS4526  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS4503   2       15       0.122603   hypothetical protein
BAS4504   1       13       -0.183387  hypothetical protein
BAS4505   0       13       0.227038   thioesterase
BAS4506   -1      14       -0.893038  hypothetical protein
BAS4507   -1      13       -0.968910  metal-dependent hydrolase
BAS4508   -1      14       -1.674562  proline dipeptidase
BAS4509   2       17       -3.374415  lipoprotein
BAS4510   0       16       -2.603093  lipoprotein
BAS4511   0       16       -2.933308  hypothetical protein
BAS4512   0       22       -2.958765  hypothetical protein
BAS4513   2       21       -3.444175  acetyltransferase
BAS4514   2       21       -3.185543  hypothetical protein
BAS4515   -2      15       -1.245171  acetyltransferase
BAS4516   -3      13       -0.515512  hypothetical protein
BAS4518   -1      15       -0.172501  hypothetical protein
BAS4519   -1      16       -0.459080  DNA-binding protein
BAS4520   -1      15       0.008559   hypothetical protein
BAS4521   1       16       0.320978   alanine dehydrogenase
BAS4522   2       13       0.015097   3-ketoacyl-ACP reductase
BAS4523   2       15       0.259097   universal stress protein
BAS4524   1       15       0.074637   hypothetical protein
BAS4525   1       15       0.122912   hypothetical protein
BAS4526   2       12       0.484534   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS4513   SACTRNSFRASE  40     1e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 39.5 bits (92), Expect = 1e-06
Identities = 21/74 (28%), Positives = 31/74 (41%), Gaps = 3/74 (4%)

Query: 70 DNILGCYIAYSKSISGK--IEVLFVDEKHRGNGFGLKLMNSAVEWFKAKKIDEIELTVVY 127
+N I + +G IE + V + +R G G L++ A+EW K + L
Sbjct: 73 ENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQD 132

Query: 128 GN-EAISFYEKLGF 140
N A FY K F
Sbjct: 133 INISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS4518   ADHESNFAMILY  28     0.020    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 27.5 bits (61), Expect = 0.020
Identities = 21/69 (30%), Positives = 32/69 (46%), Gaps = 2/69 (2%)

Query: 7 KQKQAKRQKYIKKKQEQNIPLSKKVVLMIEKTFRYICMALYVILCMYSFGVFYSLEITPN 66
K K K K K IP KK+++ E F+Y A Y + Y + + E TP
Sbjct: 177 TDKLDKLDKESKDKF-NKIPAEKKLIVTSEGAFKYFSKA-YGVPSAYIWEINTEEEGTPE 234

Query: 67 IIESILEFL 75
I++++E L
Sbjct: 235 QIKTLVEKL 243


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS4522   DHBDHDRGNASE  82     1e-20    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 82.0 bits (202), Expect = 1e-20
Identities = 65/264 (24%), Positives = 107/264 (40%), Gaps = 19/264 (7%)

Query: 1 MNYSNLKGGRFVRHALITAGTKGLGKQVTEKLLAKGYSVTVTYHSDITAMKKMKETYKNM 60
MN ++G + A IT +G+G+ V L ++G + + ++K+ + K
Sbjct: 1 MNAKGIEG----KIAFITGAAQGIGEAVARTLASQGAHIA-AVDYNPEKLEKVVSSLKAE 55

Query: 61 EERLQFVQADVTKKEDLHKIVEEAISRFGKIDFLINNAGPYVFERKKLVDYEEDEWNEMI 120
+ ADV + +I G ID L+N AG V + ++EW
Sbjct: 56 ARHAEAFPADVRDSAAIDEITARIEREMGPIDILVNVAG--VLRPGLIHSLSDEEWEATF 113

Query: 121 QGNLTAVFHLLKLVVPIMRKQNFGRIINYGFQGADSAPGWIYRSAFAAAKVGLVSLTKTV 180
N T VF+ + V M + G I+ G A +A+A++K V TK +
Sbjct: 114 SVNSTGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPR--TSMAAYASSKAAAVMFTKCL 171

Query: 181 AYEEAEYGITANMVCPGDIIGEMK----------EATIQEARQLKERNTPIGRSGTGEDI 230
E AEY I N+V PG +M+ E I+ + + + P+ + DI
Sbjct: 172 GLELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDI 231

Query: 231 ARTISFLCEEDSDMITGTIIEVTG 254
A + FL + IT + V G
Sbjct: 232 ADAVLFLVSGQAGHITMHNLCVDG 255


51  BAS4541  BAS4560  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS4541   -1      13       3.515578   hypothetical protein
BAS4542   0       15       2.703764   hypothetical protein
BAS4543   0       14       2.577108   acetyl-CoA synthetase
BAS4544   1       13       2.004365   small, acid-soluble spore protein B
BAS4545   1       13       2.060286   thiamine biosynthesis protein ThiI
BAS4546   1       13       2.407630   class V aminotransferase
BAS4547   1       14       2.667054   septation ring formation regulator EzrA
BAS4548   0       17       2.777313   LysR family transcriptional regulator
BAS4549   2       23       3.108104   cysteine transporter
BAS4552   2       23       3.302390   hypothetical protein
BAS4553   1       21       2.851803   methionine gamma-lyase
BAS4554   1       24       1.807069   30S ribosomal protein S4
BAS4555   0       17       2.304548   hypothetical protein
BAS4556   -1      17       2.874369   tyrosyl-tRNA synthetase
BAS4557   0       15       2.721837   hypothetical protein
BAS4558   -1      17       3.601590   ECF subfamily RNA polymerase sigma factor
BAS4559   0       17       3.163497   lipoprotein
BAS4560   -1      17       3.035207   acetyl-CoA synthetase
52  BAS4573  BAS4585  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS4573   3       13       -1.067175  hypothetical protein
BAS4574   3       20       0.589348   hypothetical protein
BAS4575   3       19       0.362278   catabolite control protein A
BAS4576   2       16       -0.408483  lipoprotein
BAS4577   2       20       1.347516   hypothetical protein
BAS4578   1       20       1.639024   hypothetical protein
BAS4579   0       20       1.786760   aminopeptidase
BAS4580   3       15       2.580275   lipoprotein
BAS4581   2       15       2.629976   hypothetical protein
BAS4582   3       16       2.778008   ribosomal-protein-serine acetyltransferase
BAS4583   3       17       2.663669   UDP-N-acetylmuramate--L-alanine ligase
BAS4584   3       16       2.622919   nicotinate phosphoribosyltransferase
BAS4585   4       15       2.956932   DNA translocase FtsK
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS4578   TYPE4SSCAGA  29     0.009    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 29.3 bits (65), Expect = 0.009
Identities = 20/71 (28%), Positives = 38/71 (53%), Gaps = 4/71 (5%)

Query: 104 VTDEIENNADKVAQVVQWSSAAIEVY---NHYRATRQEKKVEKEERKLERLEKKAEKK-E 159
+ D + +N + V + + ++ A + N+ + +K +EK RK E LEK+ EKK E
Sbjct: 574 IKDFLSSNKELVGKTLNFNKAVADAKNTGNYDEVKKAQKDLEKSLRKREHLEKEVEKKLE 633

Query: 160 KRSRLRMRGES 170
+S + + E+
Sbjct: 634 SKSGNKNKMEA 644


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS4585   IGASERPTASE  64     5e-12    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 63.5 bits (154), Expect = 5e-12
Identities = 56/328 (17%), Positives = 96/328 (29%), Gaps = 35/328 (10%)

Query: 553 PVVEGQSVVEEAPIAEEQPVAEETSVVEEQPVAEETSIVEEQPVAEEAPVVE-EQPVVQK 611
P VE ++ + P + V EE + V+E PV AP E
Sbjct: 983 PEVEKRNQTVDTTNIT-TPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVA 1041

Query: 612 EEPKREKKRHVPFNVVMLKQDRARLMERHASRTNGMQSSMSERVENKPVHQVEEQPQVEE 671
E K+E K E+ A+ T +++ E + V+
Sbjct: 1042 ENSKQESKT-------------VEKNEQDATETTAQNREVAK----------EAKSNVKA 1078

Query: 672 KPMQQVV--VEPQVEEKQMQQVVEPQVEEKPMQQVVVEPQVEEKPMQQVVVEPQVEEKPM 729
V + +E Q + E EK + V + +E P V P+ E+
Sbjct: 1079 NTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSET 1138

Query: 730 QQVVVEPQVEEKPM------QQVVVEPQVEEKPMQQVVVEPQVEEKPMQQVVVEPQVEEK 783
Q EP E P Q E+P ++ + V V E
Sbjct: 1139 VQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVEN 1198

Query: 784 PVQ-QVVEPQVEEVQPVQQVVAEQVQKPISSTEVEEKAYVVNQRENDVRNVLQTPPTYTI 842
P Q + ++ + S + + + + T T
Sbjct: 1199 PENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTN 1258

Query: 843 PSLT-LLSIPQQAALDNTEWLEEQKELL 869
L+ + Q AL+ + + + L
Sbjct: 1259 AVLSDARAKAQFVALNVGKAVSQHISQL 1286



Score = 63.2 bits (153), Expect = 7e-12
Identities = 61/283 (21%), Positives = 95/283 (33%), Gaps = 40/283 (14%)

Query: 311 EEIKRSTEIEQPTIEVEKQAPEESVIVKAEEKLE-ETIVVEIPEEVEVIAEAEEPEEVEV 369
E+ ++ + T QA SV EE + V P A A E E
Sbjct: 986 EKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPP------APATPSETTET 1039

Query: 370 IAETEESEEVEVIAETEESEEV-------------EVIAETEELEEVEVTAETEELEEVE 416
+AE + E V +++ E V A T+ E + +ET+E + E
Sbjct: 1040 VAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTE 1099

Query: 417 VVAETEELEEVEVIAETEKLEELEEV--EVIAETEESEEVEVIAETEAPEEVEPVALEEM 474
+E + ETEK +E+ +V +V + E+SE V+ AE A E V ++E
Sbjct: 1100 TKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEP-ARENDPTVNIKEP 1158

Query: 475 QQEMVLNEAIEQKNEFIHVAVADEQTKKDVQSFADVLIAEEQSVVEETPIVEEQPVAEEA 534
Q + EQ + V T E + V V E P
Sbjct: 1159 QSQTNTTADTEQPAKETSSNVEQPVT--------------ESTTVNTGNSVVENPENTTP 1204

Query: 535 PVVEEQSVVEETPIVEEAPVVEGQSV---VEEAPIAEEQPVAE 574
+ E + + +SV VE A +
Sbjct: 1205 ATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTV 1247



Score = 48.1 bits (114), Expect = 3e-07
Identities = 45/223 (20%), Positives = 75/223 (33%), Gaps = 21/223 (9%)

Query: 214 EQGERQYEESKKEEKSVVDQWLEKNGYEIERQEPIVEEKEVVQEMSAPQEVPAAELLHET 273
E E E SK+E K+V E++ E Q V ++ + Q A+ ET
Sbjct: 1035 ETTETVAENSKQESKTVEKN--EQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 274 IAERMEGAKQESDVVDKNILQEELVDSKVEHEDTILSEEIKRSTEIEQPTIEVEKQAPEE 333
+ K+ + E+ +KVE E T +E+ + T P KQ E
Sbjct: 1093 KETQTTETKETAT-------VEKEEKAKVETEKT---QEVPKVTSQVSP-----KQEQSE 1137

Query: 334 SVIVKAEEKLEETIVVEIPEEVEVIAEAEEPEEVEVIAETEESEEVEVIAETEESEEVEV 393
+V +AE E V I E ++ + E A+ S + + E+
Sbjct: 1138 TVQPQAEPARENDPTVNIKEPQ---SQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNS 1194

Query: 394 IAETEELEEVEVTAETEELEEVEVVAETEELEEVEVIAETEKL 436
+ E E T + E + V + +
Sbjct: 1195 VVENPE-NTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEP 1236



Score = 46.6 bits (110), Expect = 7e-07
Identities = 54/266 (20%), Positives = 83/266 (31%), Gaps = 35/266 (13%)

Query: 423 ELEEVEVIAETEKLEELEEVEVIAETEESEEVEVIAETEAPEEVEPVALEEMQQEMVLNE 482
E+E+ +T + ++ + S E+ EAP A E V E
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETV-AE 1042

Query: 483 AIEQKNEFIHVAVADEQTKKDVQS------FADVLIAEEQSVV----EETPIVEEQPVAE 532
+Q+++ + D ++V + + V ET + E
Sbjct: 1043 NSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE 1102

Query: 533 EAPVVEEQSVVEETPIVEEAPVVEGQSVVEEAPIAEEQPVAE------------------ 574
A V +E+ ET +E P V Q ++ QP AE
Sbjct: 1103 TATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQT 1162

Query: 575 ETSVVEEQPVAEETSIVEEQPVAEEAPV-----VEEQPVVQKEEPKREKKRHVPFNVVML 629
T+ EQP A+ETS EQPV E V V E P + N
Sbjct: 1163 NTTADTEQP-AKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKN 1221

Query: 630 KQDRARLMERHASRTNGMQSSMSERV 655
+ R+ H S+ V
Sbjct: 1222 RHRRSVRSVPHNVEPATTSSNDRSTV 1247


53  BAS4615  BAS4658  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS4615   -1      17       3.075723   molybdopterin converting factor subunit 1
BAS4616   6       24       7.714216   molybdopterin converting factor subunit 2
BAS4617   6       25       7.799944   molybdopterin-guanine dinucleotide biosynthesis
BAS4618   7       24       8.306280   molybdopterin biosynthesis protein MoeA
BAS4619   9       28       8.537981   molybdenum cofactor biosynthesis protein MoaC
BAS4620   9       31       8.644438   thiamine/molybdopterin biosynthesis MoeB-like
BAS4623   9       30       8.433416   triple helix repeat-containing collagen
BAS4624   -1      17       1.045143   hypothetical protein
BAS4625   -2      16       0.207148   hypothetical protein
BAS4626   -2      14       -0.794489  rhodanese-like domain-containing protein
BAS4627   -2      15       -0.961066  hypothetical protein
BAS4629   -2      15       -0.398766  homoserine O-acetyltransferase
BAS4630   0       17       1.024822   spore germination protein GerHA
BAS4631   0       24       2.713167   spore germination protein GerHB
BAS4632   1       22       3.513936   spore germination protein GerHC
BAS4633   -1      19       4.658448   hypothetical protein
BAS4634   -1      18       4.592334   hypothetical protein
BAS4636   -1      17       3.857346   vrrB protein
BAS4637   -3      17       1.266749   leucyl-tRNA synthetase
BAS4638   0       15       0.036919   permease
BAS4639   0       14       -1.052785  sodium/hydrogen exchanger family protein
BAS4640   1       17       -1.440074  TrkA domain-containing protein
BAS4641   1       19       -0.504281  phage integrase family site specific
BAS4642   2       20       0.503883   ABC transporter permease
BAS4643   1       19       1.535563   ABC transporter ATP-binding protein
BAS4644   1       19       0.600962   hypothetical protein
BAS4645   0       22       1.923270   hypothetical protein
BAS4646   -1      19       1.292728   hypothetical protein
BAS4647   -1      18       0.271830   hypothetical protein
BAS4648   0       14       0.118848   ABC transporter ATP-binding protein
BAS4649   2       15       0.631390   hypothetical protein
BAS4650   3       15       1.473079   aspartate racemase
BAS4651   2       14       1.180985   hypothetical protein
BAS4652   1       16       1.908532   hypothetical protein
BAS4653   1       16       2.308735   hypothetical protein
BAS4654   0       16       2.473201   hypothetical protein
BAS4655   -1      16       3.428896   transferase
BAS4656   0       21       3.168117   PAP2 family protein
BAS4657   0       18       3.203176   glycoside hydrolase
BAS4658   1       24       3.069689   molybdopterin-guanine dinucleotide biosynthesis
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS4623   THERMOLYSIN  31     0.017    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 31.1 bits (70), Expect = 0.017
Identities = 27/120 (22%), Positives = 39/120 (32%), Gaps = 13/120 (10%)

Query: 482 TGSTGPTGSTGTTG-NTGVTGDTGPTGATGVSTTATYAFANNTSGSVISVLLGGTNIPLP 540
G P T T G GV GD T S Y +NT GS I G
Sbjct: 220 PGGAQPVAGTSTVGVGRGVLGDQKYINTTYSSYYGYYYLQDNTRGSGIFTYDG------- 272

Query: 541 NNQNIGPGITVSGGNTVFTV-----ANAGNYYIAYTINLTAGLLVSSRITVNGSPLAGTI 595
N+ + PG + G+ F A +YY + + + + + T+
Sbjct: 273 RNRTVLPGSLWADGDNQFFASYDAAAVDAHYYAGVVYDYYKNVHGRLSYDGSNAAIRSTV 332


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS4625   PF07675  25     0.045    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 24.7 bits (53), Expect = 0.045
Identities = 13/41 (31%), Positives = 18/41 (43%)

Query: 33 VYAGAGGSSAAIFLNGKRQPEAVIRTSVFLPPLATSTRTLG 73
VYA + G+ A+ F N + +T V P TR G
Sbjct: 1175 VYASSTGNDASNFANALLEEVLTAKTVVTAPEAIRGTRAQG 1215


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS4630   IGASERPTASE  47     3e-07    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 47.0 bits (111), Expect = 3e-07
Identities = 38/253 (15%), Positives = 86/253 (33%), Gaps = 12/253 (4%)

Query: 9 KKKLNTTEKNETDNSEQKPNNQEDDNKEQTRSTKHNKSNNSEQKKEEHKESSQDKQQNQS 68
K++ T EKNE D +E N+E KE + K N N + + +Q + ++
Sbjct: 1045 KQESKTVEKNEQDATETTAQNREVA-KEAKSNVKANTQTNEVAQSGSETKETQTTETKET 1103

Query: 69 NQNQQQSAKQDESSQGQQNHSKQDDSDQGQQQHSKQGNSDQGQQQHSKQGDSNQGQQNHS 128
+++ + E+ + Q+ Q+Q + +++ + + Q +
Sbjct: 1104 ATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTN 1163

Query: 129 KQNDSDQGQQQHSKQDESSQEQQNHSKQDDS----DQGQQQHSKQDESSQEQQNHSKQDD 184
D++Q ++ S E + +S + + Q + E N K
Sbjct: 1164 TTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRH 1223

Query: 185 SDQGQQQHSKQDESSQEQQNHSKQDDSDQDDSFQDTQQSSKQD-------DLAQDKQQHS 237
+ + ++ + S D + + S + ++ + QH
Sbjct: 1224 RRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHI 1283

Query: 238 KQDNSDQDKQQNS 250
Q + + Q N
Sbjct: 1284 SQLEMNNEGQYNV 1296


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS4638   TCRTETA  55     2e-10    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 55.2 bits (133), Expect = 2e-10
Identities = 54/321 (16%), Positives = 118/321 (36%), Gaps = 10/321 (3%)

Query: 42 GMVLMINSLTGVIGNLLGGVLFDKWGGYKSTLVGIVITLVSILGLVFFHG-WPLYVVWLA 100
G++L + +L + G L D++G LV + V + W LY+
Sbjct: 46 GILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIG--R 103

Query: 101 LIGFGSGMVFPSMYAMVGTVWPEGGR-RAFNAMYVGQNVGIAIGTACGGLVASYRFDYIF 159
++ +G A + + R R F M G+ G GGL+ + F
Sbjct: 104 IVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPF 163

Query: 160 LANFILYFVFFLIAFIGFR-GMEDKKEPGVQKEVEAKKGWSLTPGFKALLIVCVAYALCW 218
A L + FL + ++ P ++ + + G + + + +
Sbjct: 164 FAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQ 223

Query: 219 VTYVQWQGAIATHMQE-LNISLRHYSLLWTINGAMIVCAQPLVSMLIRWMKR-SLKQQIM 276
+ ++ + + G + AQ +++ R ++ +M
Sbjct: 224 LVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITG--PVAARLGERRALM 281

Query: 277 IGILIFAVSFIVLSQAQQFTMFLVAMVTLTIGELFVWPAVPTIANILAPKDKLGFYQGVV 336
+G++ +I+L+ A + M MV L G + + PA+ + + +++ G QG +
Sbjct: 282 LGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGM-PALQAMLSRQVDEERQGQLQGSL 340

Query: 337 NSAATVGKMFGPVVGGAIVDL 357
+ ++ + GP++ AI
Sbjct: 341 AALTSLTSIVGPLLFTAIYAA 361


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS4640   SECA  29     0.006    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.5 bits (66), Expect = 0.006
Identities = 12/37 (32%), Positives = 23/37 (62%)

Query: 122 KNMKKFFNPGPDSIIEAGDMLVLSGARHEVKRIINEL 158
+ +K + D+++EAG + ++ RHE +RI N+L
Sbjct: 535 EKIKADWQVRHDAVLEAGGLHIIGTERHESRRIDNQL 571


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS4645   BACINVASINB  27     0.008    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 27.4 bits (60), Expect = 0.008
Identities = 18/65 (27%), Positives = 30/65 (46%), Gaps = 3/65 (4%)

Query: 8 ESYITQAEQAVEYAKEQLDQGMRQEHYNTMEYSDAQLQLEQAYNDLQTMQQHANDEQREQ 67
E+ + QA + AKE LD+ +DA+ + E+A N L Q AN + Q
Sbjct: 185 EAAVEQAGKEATEAKEALDKATDATV---KAGTDAKAKAEKADNILTKFQGTANAASQNQ 241

Query: 68 LNRAR 72
+++
Sbjct: 242 VSQGE 246


54  BAS4743  BAS4751  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS4743   0       16       3.125115   D-alanyl-D-alanine carboxypeptidase
BAS4744   1       17       4.189983   sensor histidine kinase
BAS4745   1       17       5.317941   DNA-binding response regulator
BAS4746   1       15       5.028066   N-acylamino acid racemase
BAS4747   0       14       4.679095   O-succinylbenzoic acid--CoA ligase
BAS4748   1       14       4.437726   naphthoate synthase
BAS4749   1       13       3.797129   alpha/beta hydrolase
BAS4750   0       14       4.019968   2-succinyl-5-enolpyruvyl-6-hydroxy-3-
BAS4751   -1      18       3.477805   menaquinone-specific isochorismate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS4745   HTHFIS  102    2e-27    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 102 bits (256), Expect = 2e-27
Identities = 37/144 (25%), Positives = 70/144 (48%), Gaps = 5/144 (3%)

Query: 1 MKRISILIADDEAEIADLIEIHLEKEGYHVVKAADGEEAIHIIETQPIDLVVLDIMMPKM 60
M +IL+ADD+A I ++ L + GY V ++ I DLVV D++MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGYEVTRQIRA-KHHMPIIFLSAKTSDFDKVTGLVLGADDYMTKPFTPIELVARVNAQLR 119
+ +++ +I+ + +P++ +SA+ + + GA DY+ KPF EL+ +
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGII----G 116

Query: 120 RFLTLNQPKVAENKSALQVGGVTI 143
R L + + ++ + Q G +
Sbjct: 117 RALAEPKRRPSKLEDDSQDGMPLV 140


55  BAS4769  BAS4775  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS4769   2       16       1.685581   hypothetical protein
BAS4770   3       18       1.472367   general stress protein 13
BAS4771   2       17       1.589663   hypothetical protein
BAS4772   4       18       1.668134   asnC family transcriptional regulator
BAS4773   3       18       2.002141   gluconate 2-dehydrogenase
BAS4774   4       18       1.586038   alpha/beta hydrolase
BAS4775   2       15       1.853804   hypothetical protein
56  BAS4827  BAS4862  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS4827   2       18       2.493074   phosphatase
BAS4828   3       16       1.771367   DeoR family transcriptional regulator
BAS4829   3       15       1.847477   hypothetical protein
BAS4830   1       22       2.471515   fructose 1,6-bisphosphatase II
BAS4831   2       18       0.578637   hypothetical protein
BAS4833   1       17       0.560720   hypothetical protein
BAS4834   0       18       0.727650   lipoprotein
BAS4836   0       15       1.316174   transcriptional activator tipA
BAS4835   -1      19       2.173875   hypothetical protein
BAS4838   0       16       1.964814   hypothetical protein
BAS4837   1       14       2.775984   phosphoglycerate mutase
BAS4839   0       16       3.141825   hypothetical protein
BAS4840   0       16       3.006085   lipoyl synthase
BAS4841   2       17       2.128931   M24/M37 family peptidase
BAS4842   1       19       1.994931   hypothetical protein
BAS4843   1       21       2.090873   hypothetical protein
BAS4844   1       20       2.116963   hypothetical protein
BAS4845   2       26       2.656380   PadR family transcriptional regulator
BAS4846   3       30       2.916198   hypothetical protein
BAS4847   2       30       2.926137   hypothetical protein
BAS4848   3       25       1.810754   NifU domain-containing protein
BAS4849   3       24       1.918291   class V aminotransferase
BAS4850   3       21       1.772585   hypothetical protein
BAS4851   1       16       0.379801   ABC transporter ATP-binding protein
BAS4852   0       15       -0.299703  ABC transporter substrate-binding protein
BAS4853   -1      15       0.195989   ABC transporter substrate-binding protein
BAS4854   0       15       1.095409   ABC transporter permease
BAS4855   1       14       0.368161   ABC transporter ATP-binding protein
BAS4856   2       15       -1.075431  hypothetical protein
BAS4857   1       19       -0.671165  thioredoxin
BAS4858   4       18       -1.665441  TOPRIM domain-containing protein
BAS4859   6       18       -1.500510  glycine cleavage system protein H
BAS4860   6       20       -2.295498  hypothetical protein
BAS4861   4       19       -1.851451  hypothetical protein
BAS4862   2       18       0.528337   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits      Score  E-value  Comments
BAS4853   adhesinb  28     0.044    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 27.9 bits (62), Expect = 0.044
Identities = 16/49 (32%), Positives = 24/49 (48%), Gaps = 4/49 (8%)

Query: 1 MKKLLLTALISTSIFGLAACGGKDNDEK----KLVVGASNVPHAEILEK 45
MKK L+ + GLAAC + + + KL V A+N A+I +
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKN 49


57  BAS4954  BAS4996  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS4954   1       21       3.049197   DNA-binding response regulator
BAS4955   1       24       3.534951   ssrA-binding protein
BAS4956   -1      21       3.305378   ribonuclease R
BAS4957   0       16       1.342615   carboxylesterase
BAS4958   1       18       0.463843   preprotein translocase subunit SecG
BAS4959   1       17       0.454836   murein hydrolase export regulator
BAS4960   1       19       -0.644403  holin-like protein
BAS4961   2       20       -0.548449  inosine-uridine preferring nucleoside hydrolase
BAS4962   2       20       -1.569369  hypothetical protein
BAS4963   2       20       -2.160913  hypothetical protein
BAS4964   2       21       -2.254716  hypothetical protein
BAS4965   2       20       -1.962880  prophage LambdaBa03, HNH endonuclease family
BAS4966   2       23       -1.938244  hypothetical protein
BAS4967   0       21       -2.483044  hypothetical protein
BAS4968   -1      20       -3.232523  hypothetical protein
BAS4969   1       21       -2.766974  hypothetical protein
BAS4970   1       20       -2.159335  hypothetical protein
BAS4971   0       22       -0.869284  hypothetical protein
BAS4972   0       25       -0.395228  hypothetical protein
BAS4973   1       25       0.029244   prophage LambdaBa03 transcriptional regulator
BAS4974   2       28       1.354776   hypothetical protein
BAS4975   4       32       1.922781   hypothetical protein
BAS4976   3       30       1.494438   prophage LambdaBa03, terminase, large subunit
BAS4978   3       27       0.715613   hypothetical protein
BAS4979   2       26       0.263329   hypothetical protein
BAS4980   2       34       1.373041   prophage LambdaBa03, prohead protease
BAS4981   2       34       1.887681   HK97 family phage major capsid protein
BAS4982   0       31       2.457142   hypothetical protein
BAS4983   1       37       3.685626   hypothetical protein
BAS4984   3       46       4.570553   prophage LambdaBa03, site-specific recombinase
BAS4985   4       45       5.231850   phosphopyruvate hydratase
BAS4986   4       37       4.588269   phosphoglyceromutase
BAS4987   4       26       3.925014   triosephosphate isomerase
BAS4988   3       23       3.337444   phosphoglycerate kinase
BAS4989   3       16       2.166175   glyceraldehyde-3-phosphate dehydrogenase
BAS4990   2       17       1.957436   gapA transcriptional regulator CggR
BAS4991   0       17       2.999219   glutaredoxin family protein
BAS4992   0       19       3.920190   RNA polymerase factor sigma-54
BAS4993   0       28       4.620607   *hypothetical protein
BAS4994   0       28       3.380212   lipoprotein
BAS4995   -2      29       4.282113   stage V sporulation protein AC
BAS4996   -2      26       4.345358   stage V sporulation protein AD
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS4954  HTHFIS  58  6e-12  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 57.5 bits (139), Expect = 6e-12
Identities = 23/134 (17%), Positives = 54/134 (40%), Gaps = 2/134 (1%)

Query: 4 VLVIKNERSLAKKIVSGLTEEGHFILKLHNENEGLNIIYEQDWDIIILDWDSLSISGPEI 63
+LV ++ ++ + L+ G+ + N I D D+++ D + ++
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 64 CRQIR-LVKMTPIIIVTDNISSKDCVAGLQAGADDYIRKPFAKEELVARV-QAILRRSGC 121
+I+ P+++++ + + + GA DY+ KPF EL+ + +A+
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRR 125

Query: 122 NQQHETTFFQFKDL 135
+ E L
Sbjct: 126 PSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS4958  SECGEXPORT  39  2e-07  Protein-export SecG membrane protein signature.
		>SECGEXPORT#Protein-export SecG membrane protein signature.

Length = 110

Score = 38.8 bits (90), Expect = 2e-07
Identities = 21/77 (27%), Positives = 43/77 (55%), Gaps = 4/77 (5%)

Query: 1 MHTLLSVLLIIVSILMIVMVLMQSSNSSGLSGAISGGAE-QLFGKQKARGIEAVLNRITI 59
M+ L V+ +IV+I ++ ++++Q + + + GA LFG + G + R+T
Sbjct: 1 MYEALLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLFG---SSGSGNFMTRMTA 57

Query: 60 VLAVLFFALTIGVTYLN 76
+LA LFF +++ + +N
Sbjct: 58 LLATLFFIISLVLGNIN 74


58  BAS5011  BAS5038  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS5011   1       22       3.264260   prolipoprotein diacylglyceryl transferase
BAS5012   1       19       2.353315   HPr kinase/phosphorylase
BAS5013   1       17       2.071514   hypothetical protein
BAS5014   1       17       2.316193   hypothetical protein
BAS5015   1       16       2.073384   excinuclease ABC subunit A
BAS5016   0       18       0.375563   excinuclease ABC subunit B
BAS5017   0       21       -1.987247  IS605 family transposase
BAS5018   0       25       -1.360664  lipoprotein
BAS5019   2       24       1.714303   hypothetical protein
BAS5020   1       21       2.444015   merR family transcriptional regulator
BAS5021   3       26       3.586794   hypothetical protein
BAS5022   2       27       4.173666   hypothetical protein
BAS5023   1       24       3.533580   DNA-binding protein
BAS5024   1       21       3.072795   hypothetical protein
BAS5025   -1      19       2.453152   LysR family transcriptional regulator
BAS5026   -1      16       2.100475   merR family transcriptional regulator
BAS5027   -2      15       1.673882   NADPH-dependent FMN reductase
BAS5028   -1      15       1.894939   macrolide efflux pump
BAS5030   -1      15       2.056537   ABC transporter ATP-binding protein/permease
BAS5031   -2      16       1.970759   hypothetical protein
BAS5032   0       18       1.871286   carboxyl-terminal protease
BAS5033   2       24       2.558470   cell division ABC transporter permease FtsX
BAS5034   3       24       2.705663   cell division ABC transporter ATP-binding
BAS5035   3       23       2.525036   cytochrome c-551
BAS5036   4       20       1.994931   hypothetical protein
BAS5037   3       17       2.600662   peptide chain release factor 2
BAS5038   2       16       2.031578   preprotein translocase subunit SecA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS5032  BINARYTOXINB  30  0.026  Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 30.0 bits (67), Expect = 0.026
Identities = 14/44 (31%), Positives = 22/44 (50%), Gaps = 1/44 (2%)

Query: 210 GKDIGYMQITSFAENTAKEFKDQLKELEKKNIKGLVIDVRGNPG 253
GKDI F + T++ K+QL EL NI ++ ++ N
Sbjct: 573 GKDITEFDFN-FDQQTSQNIKNQLAELNATNIYTVLDKIKLNAK 615


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS5038  SECA  1171  0.0  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 1171 bits (3030), Expect = 0.0
Identities = 446/897 (49%), Positives = 598/897 (66%), Gaps = 65/897 (7%)

Query: 1 MIGILKKVF-DVNQRQIKRMQKTVEQIDALESSIKPLTDEQLKGKTLEFKERLTKGETVD 59
+I +L KVF N R ++RM+K V I+A+E ++ L+DE+LKGKT EF+ RL KGE ++
Sbjct: 2 LIKLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGEVLE 61

Query: 60 DLLPEAFAVVREAATRVLGMRPYGVQLMGGIALHEGNISEMKTGEGKTLTSTLPVYLNAL 119
+L+PEAFAVVREA+ RV GMR + VQL+GG+ L+E I+EM+TGEGKTLT+TLP YLNAL
Sbjct: 62 NLIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNAL 121

Query: 120 TGKGVHVVTVNEYLAQRDANEMGQLHEFLGLTVGINLNSMSREEKQEAYAADITYSTNNE 179
TGKGVHVVTVN+YLAQRDA L EFLGLTVGINL M K+EAYAADITY TNNE
Sbjct: 122 TGKGVHVVTVNDYLAQRDAENNRPLFEFLGLTVGINLPGMPAPAKREAYAADITYGTNNE 181

Query: 180 LGFDYLRDNMVLYKEQCVQRPLHFAIIDEVDSILVDEARTPLIISGQAQKSTELYMFANA 239
GFDYLRDNM E+ VQR LH+A++DEVDSIL+DEARTPLIISG A+ S+E+Y N
Sbjct: 182 YGFDYLRDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYKRVNK 241

Query: 240 FVRTL-----------ENEKDYSFDVKTKNVMLTEDGITKAEKAFHI-------ENLFDL 281
+ L + E +S D K++ V LTE G+ E+ E+L+
Sbjct: 242 IIPHLIRQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYSP 301

Query: 282 KHVALLHHINQALRAHVVMHRDTDYVVQEGEIVIVDQFTGRLMKGRRYSEGLHQAIEAKE 341
++ L+HH+ ALRAH + RD DY+V++GE++IVD+ TGR M+GRR+S+GLHQA+EAKE
Sbjct: 302 ANIMLMHHVTAALRAHALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAKE 361

Query: 342 GVEIQNESMTLATITFQNYFRMYEKLSGMTGTAKTEEEEFRNIYNMNVIVIPTNKPIIRD 401
GV+IQNE+ TLA+ITFQNYFR+YEKL+GMTGTA TE EF +IY ++ +V+PTN+P+IR
Sbjct: 362 GVQIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMIRK 421

Query: 402 DRADLIFKSMKGKFNAVVEDIVNRHKQGQPVLVGTVAIETSELISKMLTRKGVRHNILNA 461
D DL++ + K A++EDI R +GQPVLVGT++IE SEL+S LT+ G++HN+LNA
Sbjct: 422 DLPDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLNA 481

Query: 462 KNHAREADIIAEAGMKGAVTIATNMAGRGTDIKLG------------------------- 496
K HA EA I+A+AG AVTIATNMAGRGTDI LG
Sbjct: 482 KFHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKADW 541

Query: 497 ----DDIKNIG-LAVIGTERHESRRIDNQLRGRAGRQGDPGVTQFYLSMEDELMRRFGSD 551
D + G L +IGTERHESRRIDNQLRGR+GRQGD G ++FYLSMED LMR F SD
Sbjct: 542 QVRHDAVLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMRIFASD 601

Query: 552 NMKAMMDRLGMDDSQPIESKMVSRAVESAQKRVEGNNYDARKQLLQYDDVLRQQREVIYK 611
+ MM +LGM + IE V++A+ +AQ++VE N+D RKQLL+YDDV QR IY
Sbjct: 602 RVSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVANDQRRAIYS 661

Query: 612 QRQEVMESENLRGIIEGMMKSTVERAV-ALHTQEEIEEDWNIKGLVDYLNTNLLQEGDVK 670
QR E+++ ++ I + + + + A + +EE W+I GL + L + + +
Sbjct: 662 QRNELLDVSDVSETINSIREDVFKATIDAYIPPQSLEEMWDIPGLQERLKNDFDLDLPIA 721

Query: 671 E--EELRRLAPEEMSEPIIAKLIERYNDKEKLMPEEQMREFEKVVVFRVVDTKWTEHIDA 728
E ++ L E + E I+A+ IE Y KE+++ E MR FEK V+ + +D+ W EH+ A
Sbjct: 722 EWLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAA 781

Query: 729 MDHLREGIHLRAYGQIDPLREYQMEGFAMFESMIASIEEEISRYIMKAEI---------- 778
MD+LR+GIHLR Y Q DP +EY+ E F+MF +M+ S++ E+ + K ++
Sbjct: 782 MDYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSKVQVRMPEEVEELE 841

Query: 779 -EQNLERQEVVQGEAVHPSSDGEEAKKKPVVKGDQ--VGRNDLCKCGSGKKYKNCCG 832
++ +E + + Q + + D A + + VGRND C CGSGKKYK C G
Sbjct: 842 QQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCPCGSGKKYKQCHG 898


59  BAS5117  BAS5166  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS5117   2       15       -3.387541  UDP-N-acetylglucosamine 2-epimerase
BAS5118   3       17       -4.261597  teichoic acids export protein ATP-binding
BAS5119   4       18       -4.178738  techoic acid ABC transporter efflux permease
BAS5120   4       17       -3.963555  UDP-N-acetyl-D-mannosamine dehydrogenase
BAS5121   3       16       -3.747366  hypothetical protein
BAS5124   1       16       -3.078085  hypothetical protein
BAS5125   0       16       -2.241004  hypothetical protein
BAS5126   -1      14       -1.095504  glycoside hydrolase
BAS5127   0       13       -0.868376  glycoside hydrolase
BAS5128   2       18       0.301959   rod shape-determining protein Mbl
BAS5129   3       18       -0.628816  stage III sporulation protein D
BAS5130   2       16       -0.984585  lipoprotein
BAS5131   2       13       1.052944   stage II sporulation protein
BAS5132   0       13       1.527377   ABC transporter permease
BAS5133   1       13       1.875958   ABC transporter ATP-binding protein
BAS5134   0       13       2.308080   ABC transporter ATP-binding protein
BAS5135   -2      16       2.870281   DNA-binding protein
BAS5136   -1      17       3.801092   stage II sporulation protein D
BAS5137   1       18       4.048340   UDP-N-acetylglucosamine
BAS5138   2       24       3.925747   hypothetical protein
BAS5139   3       25       4.302716   hypothetical protein
BAS5140   3       24       4.189690   NADH dehydrogenase subunit N
BAS5141   4       24       4.618682   NADH dehydrogenase subunit M
BAS5142   4       27       5.001874   NADH dehydrogenase subunit L
BAS5143   4       27       5.340508   NADH dehydrogenase subunit K
BAS5144   3       26       5.379217   NADH dehydrogenase subunit J
BAS5145   4       25       5.600559   NADH dehydrogenase subunit I
BAS5146   1       15       2.684042   NADH dehydrogenase subunit H
BAS5147   0       12       1.549346   NADH dehydrogenase subunit D
BAS5148   -1      11       -0.034733  NADH dehydrogenase subunit C
BAS5149   -2      9        -1.254307  NADH dehydrogenase subunit B
BAS5150   -2      12       -0.045564  NADH dehydrogenase subunit A
BAS5151   -1      14       0.415973   sensory box/GGDEF family protein
BAS5152   1       25       3.167639   hypothetical protein
BAS5153   3       27       3.613305   hypothetical protein
BAS5154   3       31       4.247552   ATP synthase F0F1 subunit epsilon
BAS5155   4       33       4.264122   ATP synthase F0F1 subunit beta
BAS5156   3       28       3.472990   ATP synthase F0F1 subunit gamma
BAS5157   1       28       3.401107   ATP synthase F0F1 subunit alpha
BAS5158   -2      20       2.124949   ATP synthase F0F1 subunit delta
BAS5159   -3      21       2.959899   ATP synthase F0F1 subunit B
BAS5160   1       26       3.558850   ATP synthase F0F1 subunit C
BAS5161   1       21       3.548540   ATP synthase F0F1 subunit A
BAS5162   1       21       3.588959   ATP synthase protein I
BAS5163   1       19       3.476059   hypothetical protein
BAS5164   2       21       3.001755   uracil phosphoribosyltransferase
BAS5165   2       23       3.002257   serine hydroxymethyltransferase
BAS5166   0       19       3.019828   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS5119  ABC2TRNSPORT  30  0.007  ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 30.3 bits (68), Expect = 0.007
Identities = 53/243 (21%), Positives = 99/243 (40%), Gaps = 19/243 (7%)

Query: 27 KQAYAGNLLGLLWVFLNPLSQIGVYWLVFGLGIRGGAPVHGVPYFVWLVCGLVTWFFVGT 86
K+A +LLG L PL I ++ L GLG+ G V GV Y +L G+V +
Sbjct: 28 KKAALASLLGHL---AEPL--IYLFGLGAGLGVMVGR-VGGVSYTAFLAAGMVATSAMTA 81

Query: 87 TITQSANSIYSRLN---TVSKMNFPLSIIPTYVVISQLY--THLILIIFALVIVIFNLGF 141
++ + + R+ T M + + V+ + T L + +V LG+
Sbjct: 82 ATFETIYAAFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGY 141

Query: 142 STINILELMYGLVASTLFLIALSFLTSTLSTMLRDIQLLI--QS--VTRMLFFLTPIFWE 197
+ L L+Y L L +A + L ++ + I Q+ +T +LF +F
Sbjct: 142 TQ--WLSLLYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVF-- 197

Query: 198 PKENMSNLLLFIIKINPLYYIVEVYRGALIYNDTSIVLSWYTLYFWGAVIILFIAGSMLH 257
P + + + + PL + +++ R ++ + V VI F++ ++L
Sbjct: 198 PVDQLPIVFQTAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLR 257

Query: 258 IRF 260
R
Sbjct: 258 RRL 260


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS5128  SHAPEPROTEIN  478  e-173  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 478 bits (1233), Expect = e-173
Identities = 179/330 (54%), Positives = 244/330 (73%), Gaps = 5/330 (1%)

Query: 1 MFARDIGIDLGTANVLIHVKGKGIVLNEPSVVAIDRNTG----KVLAVGEEARSMVGRTP 56
MF+ D+ IDLGTAN LI+VKG+GIVLNEPSVVAI ++ V AVG +A+ M+GRTP
Sbjct: 8 MFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKSVAAVGHDAKQMLGRTP 67

Query: 57 GNIVAIRPLKDGVIADFEITEAMLKYFINKLDVKSFFS-KPRILICCPTNITSVEQKAIR 115
GNI AIRP+KDGVIADF +TE ML++FI ++ SF PR+L+C P T VE++AIR
Sbjct: 68 GNIAAIRPMKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIR 127

Query: 116 EAAERSGGKTVFLEEEPKVAAVGAGMEIFQPSGNMVVDIGGGTTDIAVLSMGDIVTSSSI 175
E+A+ +G + VFL EEP AA+GAG+ + + +G+MVVDIGGGTT++AV+S+ +V SSS+
Sbjct: 128 ESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSV 187

Query: 176 KMAGDKFDMEILNYIKRKYKLLIGERTSEDIKIKVGTVFPGARSEELEIRGRDMVTGLPR 235
++ GD+FD I+NY++R Y LIGE T+E IK ++G+ +PG E+E+RGR++ G+PR
Sbjct: 188 RIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPR 247

Query: 236 TITVCSEEITEALKENAAVIVQAAKGVLERTPPELSADIIDRGVILTGGGALLHGIDMLL 295
T+ S EI EAL+E IV A LE+ PPEL++DI +RG++LTGGGALL +D LL
Sbjct: 248 GFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLL 307

Query: 296 AEELKVPVLIAENPMHCVAVGTGIMLENID 325
EE +PV++AE+P+ CVA G G LE ID
Sbjct: 308 MEETGIPVVVAEDPLTCVARGGGKALEMID 337
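
E-values in the hit tables appear in several notations ("0.007", "6e-05", "0.0", and the abbreviated "e-173"), and the last of these is not a valid number for most parsers. The following is a small Python sketch, not part of the PredictBias output, that normalises these strings and ranks the hits listed for this region. The function name `parse_evalue` is hypothetical, and the 0.05 cutoff is taken from the "E-value (RPSBlast)" input parameter at the top of the report; how PredictBias itself applies that cutoff is an assumption.

```python
def parse_evalue(text):
    """Convert a BLAST-style E-value string to a float.

    BLAST abbreviates values such as 1e-173 to "e-173"; prepend the
    missing mantissa so that float() accepts the string.
    """
    text = text.strip()
    if text.startswith("e"):
        text = "1" + text
    return float(text)


# Assumed cutoff, copied from the report's input parameters.
EVALUE_CUTOFF = 0.05

# (locus tag, hit name, E-value string) as listed for region 59.
hits = [
    ("BAS5119", "ABC2TRNSPORT", "0.007"),
    ("BAS5128", "SHAPEPROTEIN", "e-173"),
    ("BAS5148", "IGASERPTASE", "6e-05"),
    ("BAS5159", "IGASERPTASE", "0.005"),
]

# Keep hits at or below the cutoff, most significant first.
ranked = sorted(
    (h for h in hits if parse_evalue(h[2]) <= EVALUE_CUTOFF),
    key=lambda h: parse_evalue(h[2]),
)
for locus, name, evalue in ranked:
    print(locus, name, evalue)
```

Ranking this way makes it easier to separate strong matches such as SHAPEPROTEIN from marginal ones that sit just under the cutoff.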


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS5148  IGASERPTASE  38  6e-05  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 37.7 bits (87), Expect = 6e-05
Identities = 22/121 (18%), Positives = 42/121 (34%), Gaps = 6/121 (4%)

Query: 51 KNDDMTIEEAKRRAAAAAKA--KAAALAKQKREGIEEVTEEEKVKAKAAAAAKAKAAALA 108
+ + E +K+ + K A Q RE +E K + A++ +
Sbjct: 1035 ETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKE 1094

Query: 109 KQK--ASQGNGDSGDEKAKAIAAAKAKAAAAARAKTKGAEGKKEEELKQEEPSV-NEPYL 165
Q + +EKAK + ++ + + E Q EP+ N+P +
Sbjct: 1095 TQTTETKETATVEKEEKAKVETEKTQEVPKVT-SQVSPKQEQSETVQPQAEPARENDPTV 1153

Query: 166 N 166
N
Sbjct: 1154 N 1154



Score = 36.2 bits (83), Expect = 2e-04
Identities = 27/156 (17%), Positives = 47/156 (30%), Gaps = 11/156 (7%)

Query: 7 DLEDLKREAARRAKEEARKRLVAKHGVEISKLEEENREKEKA--LPKNDDMTIEEAKRRA 64
DL + + E + + ++ + N E + P ++
Sbjct: 979 DLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTE 1038

Query: 65 AAAAKAKAAALAKQKREGIEEVTEEEKVKAKAAAAAKAKAAALAKQKASQGNGDSGDEKA 124
A +K + +K E ++ TE + A AK+ A + +G E
Sbjct: 1039 TVAENSKQESKTVEKNE--QDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQ 1096

Query: 125 KAIAAAKAKAAAAARAKTKGAEGKKEEELKQEEPSV 160
A +AK E E QE P V
Sbjct: 1097 TTETKETATVEKEEKAKV-------ETEKTQEVPKV 1125



Score = 34.7 bits (79), Expect = 6e-04
Identities = 26/154 (16%), Positives = 48/154 (31%), Gaps = 17/154 (11%)

Query: 14 EAARRAKEEARKRLVAKHGVEISKLEEENREKEKALPKNDDMTIEEAKRRAAAAAKAKAA 73
+A + E A+ K E EKE+ + T E K + + K + +
Sbjct: 1077 KANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQS 1136

Query: 74 ALAKQKREGIEEVTEEEKVKAKAAAAAKAKAAALAKQKASQGNGDSGDEKAKAIAAAKAK 133
E V+ +A A + K+ SQ N + E+ ++ +
Sbjct: 1137 ----------------ETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVE 1180

Query: 134 AAAAARAKTKGAEGKKEEELKQEEPSVNEPYLNQ 167
T E + P+ +P +N
Sbjct: 1181 QPVTEST-TVNTGNSVVENPENTTPATTQPTVNS 1213



Score = 33.1 bits (75), Expect = 0.002
Identities = 19/152 (12%), Positives = 46/152 (30%), Gaps = 3/152 (1%)

Query: 13 REAARR-AKEEARKRLVAKHGVEISKLEEENREKEKALPKNDDMTIEEAKRRAAAAAKAK 71
+E KE A K VE K +E + + PK + E + +A A +
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQS--ETVQPQAEPAREND 1150

Query: 72 AAALAKQKREGIEEVTEEEKVKAKAAAAAKAKAAALAKQKASQGNGDSGDEKAKAIAAAK 131
K+ + + E+ + ++ + ++ + A
Sbjct: 1151 PTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPT 1210

Query: 132 AKAAAAARAKTKGAEGKKEEELKQEEPSVNEP 163
+ ++ + K + + E + +
Sbjct: 1211 VNSESSNKPKNRHRRSVRSVPHNVEPATTSSN 1242


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS5159  IGASERPTASE  30  0.005  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.005
Identities = 23/116 (19%), Positives = 49/116 (42%), Gaps = 6/116 (5%)

Query: 36 PLMGIMKEREEHVANEIDAAERNNAEAKKLVEEQREMLKQSRVEAQELIERAKKQAVDQK 95
P E E VA + +++ + +K ++ E Q+R A+E ++ +A Q
Sbjct: 1028 PAPATPSETTETVA---ENSKQESKTVEKNEQDATETTAQNREVAKE--AKSNVKANTQT 1082

Query: 96 DVIVAAAKEEAESIKASAVQEIQREKEQAIAALQEQVASLSVQIASKVIEKELKEE 151
+ + + E E+ + EKE+ E+ + ++ S+V K+ + E
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVP-KVTSQVSPKQEQSE 1137


60  BAS5175  BAS5191  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS5175   2       15       1.687260   hypothetical protein
BAS5176   1       16       2.336109   stage II sporulation protein R
BAS5177   -1      18       3.775643   HemK family modification methylase
BAS5178   -1      19       4.131481   peptide chain release factor 1
BAS5179   -1      23       4.421065   thymidine kinase
BAS5180   -1      23       4.051663   50S ribosomal protein L31
BAS5181   -1      21       3.558784   transcription termination factor Rho
BAS5182   0       27       4.106246   fructose 1,6-bisphosphatase II
BAS5183   2       26       3.528893   UDP-N-acetylglucosamine
BAS5184   2       28       2.368042   fructose-bisphosphate aldolase
BAS5185   -1      16       3.024693   stage 0 sporulation protein F
BAS5186   -1      17       3.849351   hypothetical protein
BAS5187   -1      18       4.195159   CTP synthetase
BAS5188   0       16       5.159074   DNA-directed RNA polymerase subunit delta
BAS5189   -1      16       5.279003   TetR family transcriptional regulator
BAS5190   -1      14       4.853576   acyl-CoA dehydrogenase
BAS5191   -1      13       3.529173   acyl-CoA dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS5176  IGASERPTASE  40  1e-05  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.0 bits (93), Expect = 1e-05
Identities = 27/99 (27%), Positives = 42/99 (42%), Gaps = 5/99 (5%)

Query: 177 AESPEEEQVKQIDDEEVVDTEEKKEDEVKEKKVVKQEVATKVTASEKKVVKNETKVEEQP 236
A E + + ++ T EK E + E +EVA + K VK T+ E
Sbjct: 1031 ATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKE----AKSNVKANTQTNEVA 1086

Query: 237 VSKEETKTVEKVEKPVEQKQEKQNEY-VKVEEEEEEPEV 274
S ETK + E EK+ + V+ E+ +E P+V
Sbjct: 1087 QSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKV 1125



Score = 33.1 bits (75), Expect = 0.001
Identities = 24/119 (20%), Positives = 44/119 (36%), Gaps = 6/119 (5%)

Query: 166 TAVRKEEHVVKAESPEEEQVKQIDDEEVVDTEEKKEDEVKEKKVVKQEVATKVTASEKKV 225
+ ++ + V K E E Q V E K + + + ++ ++
Sbjct: 1043 NSKQESKTVEKNEQDATETTAQ---NREVAKEAKSNVKANTQTNEVAQSGSETKETQTTE 1099

Query: 226 VKNETKVEEQPVSKEETKTVEKVEKPVEQ---KQEKQNEYVKVEEEEEEPEVKLFIVEA 281
K VE++ +K ET+ ++V K Q KQE+ E E + + I E
Sbjct: 1100 TKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEP 1158



Score = 29.3 bits (65), Expect = 0.020
Identities = 20/103 (19%), Positives = 39/103 (37%), Gaps = 5/103 (4%)

Query: 176 KAESPEEEQVKQIDDEEVVDTEEKKEDEVKEKKVV----KQEVATKVTASEKKVVKNETK 231
K+ Q ++ +T+E + E KE V K +V T+ T KV +
Sbjct: 1073 KSNVKANTQTNEVAQSGS-ETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSP 1131

Query: 232 VEEQPVSKEETKTVEKVEKPVEQKQEKQNEYVKVEEEEEEPEV 274
+EQ + + + P +E Q++ + E+ +
Sbjct: 1132 KQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKE 1174


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS5185  HTHFIS  112  2e-32  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 112 bits (281), Expect = 2e-32
Identities = 31/117 (26%), Positives = 56/117 (47%)

Query: 3 GKILIVDDQYGIRVLLHEVFQKEGYQTFQAANGFQALDIVKKDNPDLVVLDMKIPGMDGI 62
IL+ DD IR +L++ + GY +N + + DLVV D+ +P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 63 EILKHVKEIDESIKVILMTAYGELDMIQEAKDLGALMHFAKPFDIDEIRQAVRNELA 119
++L +K+ + V++M+A +A + GA + KPFD+ E+ + LA
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS5189  HTHTETR  65  4e-15  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 65.0 bits (158), Expect = 4e-15
Identities = 27/141 (19%), Positives = 61/141 (43%), Gaps = 6/141 (4%)

Query: 19 RREQMIKGAVQLFKQKGFPRTTTREIAKAAGFSIGTLYEYIRTKDDVLYLVCDSIYEHVK 78
R+ ++ A++LF Q+G T+ EIAKAAG + G +Y + + K D+ + + ++
Sbjct: 12 TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIG 71

Query: 79 ERLEEV-VCTEKGSVESLKIAITNYFKVMDELQEE---VLIMYQEVRFLPKESLPYVLEK 134
E E + L+ + + + + + I++ + F+ + ++ ++
Sbjct: 72 ELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQR 131

Query: 135 EF--QMVGMFENILEQCTENG 153
+ E L+ C E
Sbjct: 132 NLCLESYDRIEQTLKHCIEAK 152


61  BAS5266  BAS5279  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS5266   3       10       1.053325   pyridoxal kinase
BAS5267   3       13       0.790799   diguanylate cyclase
BAS5268   2       12       0.813884   hypothetical protein
BAS5269   2       11       -0.155033  carbon starvation protein A
BAS5270   1       11       -1.627668  response regulator
BAS5271   -1      9        -1.638093  major facilitator family transporter protein
BAS5272   -2      9        -3.090449  WecB/TagA/CpsF family glycosyl transferase
BAS5273   -2      11       -3.373615  glycoside hydrolase
BAS5274   -1      12       -4.172232  hypothetical protein
BAS5275   0       15       -3.959224  hypothetical protein
BAS5276   -1      14       -3.356054  methyl-accepting chemotaxis protein
BAS5277   -1      10       -3.471397  hypothetical protein
BAS5278   -1      12       -3.253355  cytosolic long-chain acyl-CoA thioester
BAS5279   -1      12       -3.023500  polysaccharide biosynthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS5270  HTHFIS  53  3e-10  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 52.9 bits (127), Expect = 3e-10
Identities = 21/137 (15%), Positives = 49/137 (35%), Gaps = 12/137 (8%)

Query: 2 KILLIMEEAEERRSLAEKFIENIKNVECFEASMGTEALFIMKKHTPDFVFLNSKLMDGTG 61
IL+ ++A R L + + + S + D V + + D
Sbjct: 5 TILVADDDAAIRTVLNQAL--SRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 62 FEYVNLLREVNCYAKFIFMGE--DIEESITAFRFQAFYYLLRPFREEDLQFLLYRMGKEQ 119
F+ + +++ + M +I A A+ YL +PF +L ++
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGII------- 115

Query: 120 GEKAKSYLRKLPIEGQE 136
+A + ++ P + ++
Sbjct: 116 -GRALAEPKRRPSKLED 131


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS5271  TCRTETA  59  8e-12  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 59.1 bits (143), Expect = 8e-12
Identities = 72/380 (18%), Positives = 142/380 (37%), Gaps = 35/380 (9%)

Query: 7 ISKRKLLGIAGLGWLFDAMDVGMLSFVMVALQKDWGLSTQEMGWIG---SINSIGMAVGA 63
+ + L + DA+ +G++ V+ L +D S G ++ ++ A
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 64 LVFGILSDKIGRKSVFIITLLLFSIGSGLTALTTTLAMFLVLRFLIGMGLGGELPVASTL 123
V G LSD+ GR+ V +++L ++ + A L + + R + G+ G VA
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAY 119

Query: 124 VSESVEAHERGKIVVLLESFWAGGWLIAALISYF---VIPKYGWEVAMILSAIPALYALY 180
+++ + ER + + + + G + ++ P + A L+ + L +
Sbjct: 120 IADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCF 179

Query: 181 LRWNLPDSPRFQKVEKRPSVIENIKSVWSGEYRKATIMLWILWFSV---------VFSYY 231
L LP+S + ++ R + + S L ++F + ++ +
Sbjct: 180 L---LPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIF 236

Query: 232 GM--FLWLPSV--MVLKGFSLIKSFQYVLIMTLAQLPGYFTAAWFIERLGRKFVLVTYLI 287
G F W + + L F ++ S +I RLG + L+ +I
Sbjct: 237 GEDRFHWDATTIGISLAAFGILHSLAQAMITGP-----------VAARLGERRALMLGMI 285

Query: 288 GTACSAYLFGVAESLTVLIVAGMLLSFFNLGAWGALYAYTPEQYPTVIRGTGAGMAAAFG 347
L A + +LL+ +G AL A Q +G G AA
Sbjct: 286 ADGTGYILLAFATRGWMAFPIMVLLASGGIGM-PALQAMLSRQVDEERQGQLQGSLAALT 344

Query: 348 RIGGILGPLLVGYLVASQAS 367
+ I+GPLL + A+ +
Sbjct: 345 SLTSIVGPLLFTAIYAASIT 364



Score = 33.6 bits (77), Expect = 0.001
Identities = 29/125 (23%), Positives = 45/125 (36%), Gaps = 5/125 (4%)

Query: 274 ERLGRKFVLVTYLIGTACSAYLFGVAESLTVLIVAGMLLSFFNLGAWGALYAYTPEQYPT 333
+R GR+ VL+ L G A + A L VL + G +++ AY +
Sbjct: 68 DRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYI-GRIVAGITGATGAVAGAYIADITDG 126

Query: 334 VIRGTGAG-MAAAFGRIGGILGPLLVGYLVASQASLSLIFTIFCGSILIGVFAVIILGQE 392
R G M+A FG G + GP+L G + + L E
Sbjct: 127 DERARHFGFMSACFG-FGMVAGPVLGGLMGGFSPHAPFFAAAALN--GLNFLTGCFLLPE 183

Query: 393 TKQRE 397
+ + E
Sbjct: 184 SHKGE 188


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS5274  GPOSANCHOR  35  5e-04  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 35.4 bits (81), Expect = 5e-04
Identities = 9/47 (19%), Positives = 18/47 (38%)

Query: 289 EQEQSAKKEEKKKEEAKEHKPPVTQQEKEKEKEKEKVAEKKEETQAL 335
Q + K E + + E EK + + A+ + ++Q L
Sbjct: 261 RQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVL 307



Score = 32.3 bits (73), Expect = 0.004
Identities = 19/63 (30%), Positives = 30/63 (47%), Gaps = 9/63 (14%)

Query: 291 EQSAKKEEKKKEEAKE-----HKPPVTQQEKEKEKEKEKVAEKKEETQALIFSGRQLFEQ 345
++ K+ EK EEA K +E +K EKEK AE + + +A + L E+
Sbjct: 392 REAKKQVEKALEEANSKLAALEKLNKELEESKKLTEKEK-AELQAKLEA---EAKALKEK 447

Query: 346 MYK 348
+ K
Sbjct: 448 LAK 450


62  BAS0064  BAS0071  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS0064   -2      16       3.269772   cell division protein FtsH
BAS0065   -2      16       2.317461   pantothenate kinase
BAS0066   -2      16       2.449311   heat shock protein 33
BAS0067   0       17       2.083698   cysteine synthase A
BAS0068   2       18       1.271446   para-aminobenzoate synthase component I
BAS0069   1       19       1.041707   para-aminobenzoate/anthranilate synthase
BAS0070   -1      16       1.234086   4-amino-4-deoxychorismate lyase
BAS0071   0       16       1.482061   dihydropteroate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS0064  HTHFIS  36  4e-04  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 36.3 bits (84), Expect = 4e-04
Identities = 38/179 (21%), Positives = 57/179 (31%), Gaps = 41/179 (22%)

Query: 185 RKFAEVGARIPKGVLLVGPPGTGKTLLARAV---AGEAGVPFFS-----ISGSDFVEMFV 236
+ + +++ G GTGK L+ARA+ PF + I
Sbjct: 150 YRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELF 209

Query: 237 GV------GASRVRD-LFENAKKNAPCIIFIDEIDAVGRQRGAGLGGGHDEREQTLNQLL 289
G GA FE A+ +F+DEI + L +
Sbjct: 210 GHEKGAFTGAQTRSTGRFEQAEGGT---LFLDEIGDMPMDAQTRLLRVLQQG-------- 258

Query: 290 VEMDGFGANEGII----IIAATNRPDILDPALLRPGRFDRQITVDRPDVNGREAVLKVH 344
E G I I+AATN+ L + G F R D+ R V+ +
Sbjct: 259 -EYTTVGGRTPIRSDVRIVAATNKD--L-KQSINQGLF-------REDLYYRLNVVPLR 306


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS0065  PF03309  379  e-136  Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 379 bits (975), Expect = e-136
Identities = 96/269 (35%), Positives = 163/269 (60%), Gaps = 12/269 (4%)

Query: 1 MIFVLDVGNTNAVLGVF----EEGELRQHWRMETDRHKTEDEYGMLVKQLLEHEGLSFED 56
M+ +DV NT+ V+G+ + ++ Q WR+ T+ T DE + + L+ G E
Sbjct: 1 MLLAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTEPEVTADELALTIDGLI---GDDAER 57

Query: 57 VKGIIVSSVVPPIMFALERMCEKYFKIKP-LVVGPGIKTGLNIKYENPREVGADRIVNAV 115
+ G S VP ++ + M E+Y+ P +++ PG++TG+ + +NP+EVGADRIVN +
Sbjct: 58 LTGASGLSTVPSVLHEVRVMLEQYWPNVPHVLIEPGVRTGIPLLVDNPKEVGADRIVNCL 117

Query: 116 AGIHLYGSPLIIVDFGTATTYCYINEEKHYMGGVITPGIMISAEALYSRAAKLPRIEITK 175
A H YG+ I+VDFG++ ++ + ++GG I PG+ +S++A +R+A L R+E+T+
Sbjct: 118 AAYHKYGTAAIVVDFGSSICVDVVSAKGEFLGGAIAPGVQVSSDAAAARSAALRRVELTR 177

Query: 176 PSSVVGKNTVSAMQSGILYGYVGQVEGIVKRMKEEA----KQEPKVIATGGLAKLISEES 231
P SV+GKNTV MQ+G ++G+ G V+G+V R++++ + V+ATG A L+ +
Sbjct: 178 PRSVIGKNTVECMQAGAVFGFAGLVDGLVNRIRDDVDGFSGADVAVVATGHTAPLVLPDL 237

Query: 232 NVIDVVDPFLTLKGLYMLYERNANLQHEK 260
++ D LTL GL +++ERN Q K
Sbjct: 238 RTVEHYDRHLTLDGLRLVFERNRANQRGK 266


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS0070  RTXTOXINA  28  0.045  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 28.4 bits (63), Expect = 0.045
Identities = 19/94 (20%), Positives = 42/94 (44%), Gaps = 21/94 (22%)

Query: 191 ILYTPSLETGILNGITRAFIIKVAEELGIKVKEGFFTKDELLSADEVFVTNSIQEIVPLN 250
IL P G + + +++ A+ELGI+V+ + K+ +VF + ++++ L
Sbjct: 50 ILLIPKDYKGQGSSLND--LVRTADELGIEVQ--YDEKNGTAITKQVF--GTAEKLIGL- 102

Query: 251 RIEERDFPGKVGMVTKRFINLYEMQREKLWSRNE 284
T+R + ++ Q +KL + +
Sbjct: 103 --------------TERGVTIFAPQLDKLLQKYQ 122


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS0071  PF07201  29  0.015  Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 29.4 bits (66), Expect = 0.015
Identities = 10/72 (13%), Positives = 26/72 (36%), Gaps = 4/72 (5%)

Query: 146 ILMHNRDNMNYRNLMADMIADLYDSIKIAKDAGVRDENIILDPGIGFAKTPEQNLEAMRN 205
L + + +L+ + + + G R I +++ L+ +R+
Sbjct: 145 ALKGRPELAHLSHLVEQALVSMAEEQGETIVLGAR----ITPEAYRESQSGVNPLQPLRD 200

Query: 206 LEQLNVLGYPVL 217
+ V+GY +
Sbjct: 201 TYRDAVMGYQGI 212


63  BAS0244  BAS0251  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS0244   -3      10       0.369381   *************hypothetical protein
BAS0245   -2      11       0.426881   hypothetical protein
BAS0246   -2      9        0.330548   ribosomal-protein-alanine acetyltransferase
BAS0247   -2      11       0.572544   DNA-binding/iron metalloprotein/AP endonuclease
BAS0248   0       21       1.438548   ABC transporter ATP-binding protein
BAS0249   2       32       3.046518   redox-sensing transcriptional repressor Rex
BAS0250   2       30       3.683514   lipoprotein
BAS0251   0       27       3.477022   CAAX amino terminal protease family protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS0244  PF05272  30  0.005  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.005
Identities = 8/25 (32%), Positives = 12/25 (48%)

Query: 25 VRAQDVIILEGDLGAGKTTFTKGLA 49
+ ++LEG G GK+T L
Sbjct: 593 CKFDYSVVLEGTGGIGKSTLINTLV 617


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS0246  SACTRNSFRASE  44  2e-08  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 44.2 bits (104), Expect = 2e-08
Identities = 21/72 (29%), Positives = 32/72 (44%)

Query: 67 ITNIAILPEYRGLKLGDALLKEVISEAKTLGVKTMTLEVRVSNEVAKQLYRKYGFQNGGI 126
I +IA+ +YR +G ALL + I AK + LE + N A Y K+ F G +
Sbjct: 92 IEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAV 151

Query: 127 RKRYYADNQEDG 138
Y++
Sbjct: 152 DTMLYSNFPTAN 163


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS0248  PF05272  30  0.024  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.024
Identities = 13/44 (29%), Positives = 18/44 (40%), Gaps = 2/44 (4%)

Query: 379 LVGPNGIGKSTLLKSIVNKLPLLHGDVSFGSNVSVGYYDQEQAN 422
L G GIGKSTL+ ++V G+ Y+Q
Sbjct: 601 LEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKD--SYEQIAGI 642


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BAS0251  SSPAMPROTEIN  29  0.009  Salmonella surface presentation of antigen gene typ...
		>SSPAMPROTEIN#Salmonella surface presentation of antigen gene type M signature.

Length = 147

Score = 29.3 bits (65), Expect = 0.009
Identities = 14/30 (46%), Positives = 19/30 (63%)

Query: 17 LSSIAGLPLLLKTGLYDNRGFTREEKFQLI 46
+ IAGL LLL T +NR +REE + L+
Sbjct: 43 VEQIAGLKLLLDTLRAENRQLSREEIYALL 72


64  BAS0370  BAS0379  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS0370   -1      14       -0.704510  ABC transporter ATP-binding protein
BAS0371   -1      14       -0.297838  chitinase
BAS0372   0       19       -2.001837  hypothetical protein
BAS0373   1       19       -1.461753  hypothetical protein
BAS0374   1       14       -0.871598  hypothetical protein
BAS0375   0       12       -0.595226  TetR family transcriptional regulator
BAS0376   0       12       -0.155898  major facilitator family transporter protein
BAS0377   -1      10       -0.268461  DNA-binding protein
BAS0378   -1      15       1.737048   hypothetical protein
BAS0379   -1      14       1.448713   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS0370   PF05272  32     0.008    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.008
Identities = 11/27 (40%), Positives = 14/27 (51%)

Query: 35 VGGNGIGKSTLLRILTGELIHDDGNIE 61
G GIGKSTL+ L G D + +
Sbjct: 602 EGTGGIGKSTLINTLVGLDFFSDTHFD 628


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS0374   cloacin  25     0.025    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 25.4 bits (55), Expect = 0.025
Identities = 13/29 (44%), Positives = 16/29 (55%)

Query: 54 DSSHGGSHDCGGSFGGDSGGSCDGGGGGG 82
S G H GGS G+ GG+ + GGG G
Sbjct: 48 GGSGSGIHWGGGSGHGNGGGNGNSGGGSG 76


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS0375   HTHTETR  84     3e-22    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 83.5 bits (206), Expect = 3e-22
Identities = 46/198 (23%), Positives = 83/198 (41%), Gaps = 8/198 (4%)

Query: 1 MRRSAEEIKKEIAYKAEILFSQKGYAATSMEEICEITERSKGSIYYHFKSKEELFLFVVK 60
++ A+E ++ I A LFSQ+G ++TS+ EI + ++G+IY+HFK K +LF + +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 61 QHTYDWLEKWNEK-EKLYSTSTEKLYALAEYHVEDIQQPISN----AIEEFSMSQVVSKE 115
+ E E K L + + +E I V
Sbjct: 65 LSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMA 124

Query: 116 ILDEMLALT-RESYVMFETLIEAGIQSGEFRED-NTRDLMYIVNGLLSGL-GVLYYELDY 172
++ + ESY E ++ I++ D TR I+ G +SGL +
Sbjct: 125 VVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQS 184

Query: 173 KELKRIYKKAIDVLLKGM 190
+LK+ + + +LL+
Sbjct: 185 FDLKKEARDYVAILLEMY 202


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS0376   TCRTETA  39     3e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.0 bits (91), Expect = 3e-05
Identities = 27/130 (20%), Positives = 50/130 (38%), Gaps = 5/130 (3%)

Query: 34 FIMERTNNDPVSVSL-LSVMEYAPIFIFSFIGGALADRWNPKRTMVAGDVLSVLSIIGIV 92
F +R + D ++ + L+ + I G +A R +R ++ G + I +
Sbjct: 236 FGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYI--L 293

Query: 93 LLLKLDYWQAIFFATLISAIVGQFSQPSSSRIFKRYVKEEQVANAIAFNQTLQSLFMIFG 152
L W + F ++ G P+ + R V EE+ L SL I G
Sbjct: 294 LAFATRGW--MAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVG 351

Query: 153 PVVGSLVYTQ 162
P++ + +Y
Sbjct: 352 PLLFTAIYAA 361



Score = 35.2 bits (81), Expect = 4e-04
Identities = 60/344 (17%), Positives = 124/344 (36%), Gaps = 26/344 (7%)

Query: 58 FIFSFIGGALADRWNPKRTMVAGDVLSVLS--IIGIVLLLKLDYWQAIFFATLISAIVGQ 115
F + + GAL+DR+ + ++ + + I+ L + ++ +++ I G
Sbjct: 57 FACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWV-----LYIGRIVAGITGA 111

Query: 116 FSQPSSSRIFKRYVKEEQVANAIAFNQTLQSLFMIFGPVVGSL---VYTQLGLFTSLYSL 172
+ + ++ A F M+ GPV+G L F +
Sbjct: 112 -TGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALN 170

Query: 173 IILFLLSAIALSFLPKWVEQEQVARDSLKNDIKEGWKYVLHTKNLRMITITFTIMGLAVG 232
+ FL LP+ + E+ + +++ + + F IM L
Sbjct: 171 GLNFLTGCF---LLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQ 227

Query: 233 LTNPLEVFLVIERLGMEKEAVQYLAAADGI-GMLIGGIVAAVFASKVNPKKMFVFGMSIL 291
+ L V +R + + AA GI L ++ A+++ ++ + GM
Sbjct: 228 VPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIAD 287

Query: 292 AMSFLVEGLSTSFWITSFMRFGTGICLACVNI---VVGTLMIQLVPENMVGRVNGTILPL 348
+++ +T W M F + LA I + ++ + V E G++ G++ L
Sbjct: 288 GTGYILLAFATRGW----MAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAAL 343

Query: 349 FMGAMLIGTALAGGLKEMTSLV---IVFCIAMALILLAIGPVLR 389
++G L + + + AL LL + P LR
Sbjct: 344 TSLTSIVGPLLFTAIYAASITTWNGWAWIAGAALYLLCL-PALR 386


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS0379   TCRTETB  46     1e-07    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 46.4 bits (110), Expect = 1e-07
Identities = 30/158 (18%), Positives = 56/158 (35%), Gaps = 3/158 (1%)

Query: 264 DLGISATNLLIILFVTQIVACPFALLYGKLSTTFTGKKMLYVGIIIYIIICIYAYFLKTT 323
D + + + +YGKLS K++L GIII + + +
Sbjct: 43 DFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSF 102

Query: 324 LDFWILAMLV-ATSQGGIQALSRSYFAKLVPKESANEFFGFYNIFGKFAAIMGPVLVGVT 382
I+A + AL A+ +PKE+ + FG +GP + G+
Sbjct: 103 FSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMI 162

Query: 383 TQLTGKTNAGVLSIIVLFIIGGFLLTRVPENNTSVTPP 420
+ +L I ++ II L ++ + +
Sbjct: 163 AHYIHWSY--LLLIPMITIITVPFLMKLLKKEVRIKGH 198


65  BAS0520  BAS0527  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS0520   0       8        -1.706837  internalin
BAS0521   -1      9        -0.899203  acetyltransferase
BAS0522   -1      9        -0.740811  glycine betaine transporter
BAS0523   -1      9        -0.670411  collagenase
BAS0524   0       10       -0.431167  hypothetical protein
BAS0525   1       11       -0.319856  hypothetical protein
BAS0526   1       12       -0.973045  methyl-accepting chemotaxis protein
BAS0527   0       14       -0.673859  sensor histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS0520   IGASERPTASE  45     1e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 45.4 bits (107), Expect = 1e-06
Identities = 43/214 (20%), Positives = 78/214 (36%), Gaps = 18/214 (8%)

Query: 833 TQNIVAKEEPKEPVEEVEGSKEEPIKEAEGSKEEPKEPAKEVEGSKEEPKEPAKEVEGSK 892
T N + + P P E ++ + EA P P++ E E K+ +K VE ++
Sbjct: 999 TPNNIQADVPSVPSNNEEIAR---VDEAPVPPPAPATPSETTETVAENSKQESKTVEKNE 1055

Query: 893 EEVKEPAKEVEGPKEEVKEPTK------EVEGPKEEVKEPTKEVEGPKEEVKEPMKEVEG 946
++ E + +E K K EV E KE V++ K
Sbjct: 1056 QDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVE 1115

Query: 947 SKEEVKGPTKEAEGSKEEVK--------EPTTEVEGSKEVKEPGKEVEGSKDAINQSAVA 998
+++ + P ++ S ++ + EP E + + +KEP + + Q A
Sbjct: 1116 TEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQ-TNTTADTEQPAKE 1174

Query: 999 QETNVNNQVGKEKVVENQNMKENKPAVTKQEESK 1032
+NV V + V N P T ++
Sbjct: 1175 TSSNVEQPVTESTTVNTGNSVVENPENTTPATTQ 1208



Score = 41.6 bits (97), Expect = 2e-05
Identities = 31/189 (16%), Positives = 59/189 (31%), Gaps = 8/189 (4%)

Query: 860 AEGSKEEPKEPAKEVEGSKEEPKEPAKEVEGSKEEVKEPAKEVEGPKEEVKEPTKEVEGP 919
+ S E E P P++ E E K+ +K VE +++ E T +
Sbjct: 1009 SVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREV 1068

Query: 920 KEEVKEPTKEVEGPKEEVKEPMKEVEGSKEEVKGPTKEAEGSKEEVKEPTTEVEGSKEVK 979
+E K K E + + E E K + K +V+ T+ +
Sbjct: 1069 AKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQ 1128

Query: 980 EPGKEVEGSKDAINQSAVAQETNVNNQVGKEKVVENQ------NMKENKPAVTKQEESKK 1033
K+ + + + A N KE + + + +Q ++
Sbjct: 1129 VSPKQEQ--SETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTES 1186

Query: 1034 SLGATGGQE 1042
+ TG
Sbjct: 1187 TTVNTGNSV 1195



Score = 39.7 bits (92), Expect = 9e-05
Identities = 38/191 (19%), Positives = 65/191 (34%), Gaps = 12/191 (6%)

Query: 859 EAEGSKEEPKEPAKEVEGSKEEPKEPAKEVEGSKEEVKEPAKEVEGPKEEVKEPTKEVEG 918
E S E KE E KE A + K +V E K E PK + K+ +
Sbjct: 1084 EVAQSGSETKETQ------TTETKETATVEKEEKAKV-ETEKTQEVPKVTSQVSPKQEQS 1136

Query: 919 PKEEVKEPTKEVEGPKEEVKEPMKEVEGSKEEVKGPTKEAEGSKEEVKEPTTEVEGSKEV 978
+ + P +KEP + + + + + + ++ V E TT G+ V
Sbjct: 1137 ETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVV 1196

Query: 979 KEPGKEVEGSKDAINQSAVAQETNVNNQVGKEKVVENQNMKENKPAVTKQEESKKSLGAT 1038
+ P + S + + ++ V N +PA T +
Sbjct: 1197 ENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNV-----EPATTSSNDRSTVALCD 1251

Query: 1039 GGQENTSTLLS 1049
NT+ +LS
Sbjct: 1252 LTSTNTNAVLS 1262



Score = 38.9 bits (90), Expect = 1e-04
Identities = 31/171 (18%), Positives = 51/171 (29%), Gaps = 8/171 (4%)

Query: 833 TQNIVAKEEPKEPVEEVEGSKEEPIKEAEGSKEEPKEPAKEVEGSKEEPKEPAKEVEGSK 892
TQ KE VE+ E +K E K E K + K+ + +P+
Sbjct: 1095 TQTTETKETAT--VEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPT 1152

Query: 893 EEVKEPAKEVEGPKEEVKEPTKEVEGPKEEVKEPTKEVEGPKEEVKEPMKEVEGSKEEVK 952
+KEP + + + + ++ V E T G E +
Sbjct: 1153 VNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENP-----ENTTPATT 1207

Query: 953 GPTKEAEGSKEEVKEPTTEVEGSKEVKEPGKEVEGSKDAINQS-AVAQETN 1002
PT +E S + V EP + + + TN
Sbjct: 1208 QPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTN 1258



Score = 35.8 bits (82), Expect = 0.001
Identities = 37/195 (18%), Positives = 69/195 (35%), Gaps = 16/195 (8%)

Query: 856 PIKEAEGSKEEPKEPAKEVEGSKEEPKEPAKEVEGSKEE--------VKEPAKEVEGPKE 907
P E + + P P+ E ++ + P++ E E
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAE 1042

Query: 908 EVKEPTKEVEGPKEEVKEPTKEVEGPKEEVKEPMKEVEGSKEEVKGPTKEAEGSKEEVKE 967
K+ +K VE +++ E T + +E K +K + E + ++ E E KE
Sbjct: 1043 NSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE 1102

Query: 968 PTTEVEGSKEVKEPGKEVEGSKDAINQSAVAQETNVNNQVGKEKVVENQNMKENKPAVT- 1026
T + K E K E K V + + + + + + +EN P V
Sbjct: 1103 TATVEKEEKAKVETEKTQEVPK-------VTSQVSPKQEQSETVQPQAEPARENDPTVNI 1155

Query: 1027 KQEESKKSLGATGGQ 1041
K+ +S+ + A Q
Sbjct: 1156 KEPQSQTNTTADTEQ 1170


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0521   SACTRNSFRASE  38     1e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 38.4 bits (89), Expect = 1e-05
Identities = 25/101 (24%), Positives = 41/101 (40%), Gaps = 5/101 (4%)

Query: 178 TYYEGNEIIGRLSDTNK-LFVSMKNEKLEGYVYVEVNPEFQE-ANIEFIATAENSRRKGV 235
Y + + + + + K F+ G + ++ + A IE IA A++ R+KGV
Sbjct: 49 QYEDDDMDVSYVEEEGKAAFLYYLENNCIGRI--KIRSNWNGYALIEDIAVAKDYRKKGV 106

Query: 236 GERLLQAAIQYIFSFQGMREIELCLNTNNDRAVKLYKKVGF 276
G LL AI++ + L N A Y K F
Sbjct: 107 GTALLHKAIEWAKE-NHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS0523   MICOLLPTASE  755    0.0      Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 755 bits (1950), Expect = 0.0
Identities = 412/886 (46%), Positives = 569/886 (64%), Gaps = 16/886 (1%)

Query: 94 YSMADLNKMNNQELVETLGSIKWHQITDLFQFNEDAKAFYKDKGKMQVVIDELAHRGSTF 153
Y+ +LN+MN +LVE + +I + + DLF FN+ + F+ ++ ++Q +I L G T+
Sbjct: 93 YTFDELNRMNYSDLVELIKTISYENVPDLFNFNDGSYTFFSNRDRVQAIIYGLEDSGRTY 152

Query: 154 TKDDSKGIQTFTEVLRSAFYLAFYNNELSELNERSFQDKCLPALKAIAKNPNFKLGTTEQ 213
T DD KGI T E LR+ +YL FYN +LS LN +++CLPA+KAI N NF+LGT Q
Sbjct: 153 TADDDKGIPTLVEFLRAGYYLGFYNKQLSYLNTPQLKNECLPAMKAIQYNSNFRLGTKAQ 212

Query: 214 DTVVSAYGKLISNASSDVETVQYASNILKQYNDNFTTYVNDRMKGQAIYDIMQGIDYDIQ 273
D VV A G+LI NAS+D E + +L + DN Y ++ KG A++++M+GIDY
Sbjct: 213 DGVVEALGRLIGNASADPEVINNCIYVLSDFKDNIDKYGSNYSKGNAVFNLMKGIDYYTN 272

Query: 274 SYLIEARKE-ANETMWYGKVDGFINEINRIALL-NEVTQENKWLVNNGIYFASRLGKFHS 331
S + + A T +Y ++D ++ + + + +++ +N WLVNN +Y+ R+GKF
Sbjct: 273 SVIYNTKGYDAKNTEFYNRIDPYMERLESLCTIGDKLNNDNAWLVNNALYYTGRMGKFRE 332

Query: 332 NPNKGLEVVTQAMHMYPRLSEPYFVAVEQITTNYNGKDYSGNTVDLEKIRKEGKEQYLPK 391
+P+ + +AM YP LS Y A + N+ GK+ SGN +D KI+ + +E+YLPK
Sbjct: 333 DPSISQRALERAMKEYPYLSYQYIEAANDLDLNFGGKNSSGNDIDFNKIKADAREKYLPK 392

Query: 392 TYTFDDGSIVFKTGDKVSEEKIKRLYWAAKEVKAQYHRVIGNDKALEPGNADDILTIVIY 451
TYTFDDG V K GDKV+EEKIKRLYWA+KEVKAQ+ RV+ NDKALE GN DDILT+VIY
Sbjct: 393 TYTFDDGKFVVKAGDKVTEEKIKRLYWASKEVKAQFMRVVQNDKALEEGNPDDILTVVIY 452

Query: 452 NSPEEYQLNRQLYGYETNNGGIYIEETGTFFTYERTPEQSIYSLEELFRHEFTHYLQGRY 511
NSPEEY+LNR + G+ T+NGGIYIE GTFFTYERTPE+SIY+LEELFRHEFTHYLQGRY
Sbjct: 453 NSPEEYKLNRIINGFSTDNGGIYIENIGTFFTYERTPEESIYTLEELFRHEFTHYLQGRY 512

Query: 512 EVPGLFGRGDMYQNERLTWFQEGNAEFFAGSTRTNNVVPRKSIISGLSSDPASRYTAERT 571
VPG++G+G+ YQ LTW++EG AEFFAGSTRT+ + PRKS+ GL+ D +R +
Sbjct: 513 VVPGMWGQGEFYQEGVLTWYEEGTAEFFAGSTRTDGIKPRKSVTQGLAYDRNNRMSLYGV 572

Query: 572 LFAKYGSWDFYNYSFALQSYLYTHQFETFDKIQDLIRANDVKNYDAYRENLSKDLKLNEE 631
L AKYGSWDFYNY FAL +Y+Y + F+K+ + I+ NDV Y Y ++S D LN++
Sbjct: 573 LHAKYGSWDFYNYGFALSNYMYNNNMGMFNKMTNYIKNNDVSGYKDYIASMSSDYGLNDK 632

Query: 632 YQEYMQHLIDNQDKYNVPEVADDYLAEHTPKSLTAVEKEITETLPMKDAKMTKHSSQFFN 691
YQ+YM L++N D +VP V+D+Y+ H K + + +I E +KD SQFF
Sbjct: 633 YQDYMDSLLNNIDNLDVPLVSDEYVNGHEAKDINEITNDIKEVSNIKDLSSNVEKSQFFT 692

Query: 692 TFTLEGTYTGSVTKGDSEDWNAMSKKVNEALEQLAQKEWSGYKTVTAYFVNYGVNSSNQF 751
T+ + GTY G ++G+ DW M+ K+N+ L++L++K W+GYKTVTAYFVN+ V+ + +
Sbjct: 693 TYDMRGTYVGGRSQGEENDWKDMNSKLNDILKELSKKSWNGYKTVTAYFVNHKVDGNGNY 752

Query: 752 EYDVVFHG----IAKDDEENKAPTVNINGPYNGLVKEGIQFKSDGSKDEDGKIVSYLWDF 807
YDVVFHG D NK P I + +V+E I F SKDEDG+I +Y WDF
Sbjct: 753 VYDVVFHGMNTDTNTDVHVNKEPKAVIKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDF 812

Query: 808 GDGSTSAEVNPVHVYESEGSYKVALIVKDDKGKESKSEITVTV----KGGSLTESEPNNR 863
GDG S E H Y G Y+V L V D+ G + + V + ESEPNN
Sbjct: 813 GDGEKSNEAKATHKYNKTGEYEVKLTVTDNNGGINTESKKIKVVEDKPVEVINESEPNND 872

Query: 864 PEEANRIG-LNTTIKGSLIGGDHTDVYTFNVASAKNIDISVLNEYGIGMTWVLHHESDMQ 922
E+AN+I N +KG+L D++D Y F+VA N+ I++ N +G+TW L+ E D+
Sbjct: 873 FEKANQIAKSNMLVKGTLSEEDYSDKYYFDVAKKGNVKITLNNLNSVGITWTLYKEGDLN 932

Query: 923 NYAAYGQANGNHI---EANFNAKPGKYYLYVYKYDNGDGTYELSVK 965
NY Y A GN + +PG+YYL VY YDN GTY ++VK
Sbjct: 933 NYVLY--ATGNDGTVLKGEKTLEPGRYYLSVYTYDNQSGTYTVNVK 976



Score = 97.9 bits (243), Expect = 8e-23
Identities = 60/251 (23%), Positives = 99/251 (39%), Gaps = 49/251 (19%)

Query: 762 KDDEENKAPTVNINGPYNGLVKEGIQFKSD----GSKDEDGKIVSYLWDF---------- 807
K E+ +N + P N K KS+ G+ E+ Y +D
Sbjct: 854 KVVEDKPVEVINESEPNNDFEKANQIAKSNMLVKGTLSEEDYSDKYYFDVAKKGNVKITL 913

Query: 808 ---------------GDGST-SAEVNPVHVYESEGSYKVA-----LIVKDDKGKESKSEI 846
GD + +G + L V +
Sbjct: 914 NNLNSVGITWTLYKEGDLNNYVLYATGNDGTVLKGEKTLEPGRYYLSVYTYDNQ--SGTY 971

Query: 847 TVTVKGG-----------SLTESEPNNRPEEANRIGLNTTIKGSLIGGDHTDVYTFNVAS 895
TV VKG ++ E E NN ++A ++ N+ I G+L D D+Y+ ++ +
Sbjct: 972 TVNVKGNLKNEVKETAKDAIKEVENNNDFDKAMKVDSNSKIVGTLSNDDLKDIYSIDIQN 1031

Query: 896 AKNIDISVLNEYGIGMTWVLHHESDMQNYAAYGQANGNHIEANFNAKPGKYYLYVYKYDN 955
+++I V N I M W+L+ D+ NY Y A+GN + PGKYYL VY+++N
Sbjct: 1032 PSDLNIVVENLDNIKMNWLLYSADDLSNYVDYANADGNKLSNTCKLNPGKYYLCVYQFEN 1091

Query: 956 -GDGTYELSVK 965
G G Y ++++
Sbjct: 1092 SGTGNYIVNLQ 1102


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS0525   IGASERPTASE  41     2e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.8 bits (95), Expect = 2e-05
Identities = 30/194 (15%), Positives = 67/194 (34%), Gaps = 12/194 (6%)

Query: 201 QPQIATVKRDATIANAEREKEARIEKARAEKEAKEAEYQRDAQIAEAEKHKELKVQSYKR 260
P + T AE K+ + E++A E Q EA+ + + Q+ +
Sbjct: 1026 PPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEV 1085

Query: 261 EQEQARADADLSYELQQAKAQQGVTEEQMRVKIIEREKQIELEEKEIARREKQYDAEVKK 320
Q + E Q + ++ T E+ E + ++E E+ + + + ++
Sbjct: 1086 AQSGSETK-----ETQTTETKETATVEK------EEKAKVETEKTQEVPKVTSQVSPKQE 1134

Query: 321 KADADRYAVEQSAEAEKVKQIKKADADQYKIEAEARARAEEVRVEGLAKAEIEKAQGQAK 380
+++ + E + E + IK+ + A+ A+E
Sbjct: 1135 QSETVQPQAEPARENDPTVNIKEPQSQTNT-TADTEQPAKETSSNVEQPVTESTTVNTGN 1193

Query: 381 AEVQKAQGTAEADV 394
+ V+ + T A
Sbjct: 1194 SVVENPENTTPATT 1207


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS0527   PF06580  31     0.008    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.4 bits (71), Expect = 0.008
Identities = 25/132 (18%), Positives = 51/132 (38%), Gaps = 27/132 (20%)

Query: 403 LKIEFMLDRESSLDKLSPPIESNYVVSILGNLITNAFE-AIERNEEHDKKVRMFVTDIGE 461
L+ E ++ + +D PP+ ++ L+ N + I + + K+ + T
Sbjct: 240 LQFENQIN-PAIMDVQVPPM-------LVQTLVENGIKHGIAQLPQ-GGKILLKGTKDNG 290

Query: 462 EIVIEVEDSGQGIHDEVITSIFYKGFSTKEGEKRGYGLAKVKELVEDLNG---SIAIEKG 518
+ +EVE++G E G GL V+E ++ L G I + +
Sbjct: 291 TVTLEVENTGSLALKNT-------------KESTGTGLQNVRERLQMLYGTEAQIKLSEK 337

Query: 519 DLGGALFIIALP 530
G ++ +P
Sbjct: 338 Q-GKVNAMVLIP 348


66  BAS0535  BAS0546  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS0535   -3      12       -0.745697  glycerol-3-phosphate ABC transporter ATP-binding
BAS0536   -2      14       -1.117668  glycerol-3-phosphate ABC transporter permease
BAS0537   -1      15       -0.925940  glycerol-3-phosphate ABC transporter permease
BAS0538   1       17       -0.795648  glycerol-3-phosphate ABC transporter
BAS0539   2       16       -1.134255  serine/threonine phosphatase
BAS0540   2       16       -1.440414  DNA-binding response regulator
BAS0541   2       17       -1.798145  sensor histidine kinase
BAS0542   0       14       -1.464674  hypothetical protein
BAS0543   -1      13       -0.700096  hypothetical protein
BAS0544   -2      13       -0.690426  methyl-accepting chemotaxis protein
BAS0545   -1      12       -0.517291  sensory histidine kinase DcuS
BAS0546   0       13       -0.032854  response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS0535   PF05272  33     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.7 bits (74), Expect = 0.002
Identities = 13/32 (40%), Positives = 16/32 (50%)

Query: 44 VLVGPSGCGKSTLLRMIAGLEEISSGDLIINE 75
VL G G GKSTL+ + GL+ S I
Sbjct: 600 VLEGTGGIGKSTLINTLVGLDFFSDTHFDIGT 631


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
BAS0538   MALTOSEBP  41     9e-06    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 40.9 bits (95), Expect = 9e-06
Identities = 72/327 (22%), Positives = 119/327 (36%), Gaps = 43/327 (13%)

Query: 131 IKKDKYDTSKLEKAITNYYSVDGKMYSMPFNSSTPVLIYNKDAFAKAGLDPEKAPKTYAE 190
I DK KL + +GK+ + P LIYNKD PKT+ E
Sbjct: 105 ITPDKAFQDKLYPFTWDAVRYNGKLIAYPIAVEALSLIYNKDLLP-------NPPKTWEE 157

Query: 191 LQEAAKKLTIKEGGNVKQYGFSMLNYGWFFEELLATQGALYVDNENGRKDAAKKAVFNGK 250
+ K+L K G + + + W L+A G ENG+ D V N
Sbjct: 158 IPALDKELKAK-GKSALMFNLQEPYFTW---PLIAADGGYAFKYENGKYDIKDVGVDNAG 213

Query: 251 EGQKVFGMLDELNKAGALGKYGASWDDIRAAFQSGQVAMYLDSSAGVRDLIDASKFNVGV 310
+ ++D + + AAF G+ AM ++ + ID SK N GV
Sbjct: 214 AKAGLTFLVDLIKNKHM--NADTDYSIAEAAFNKGETAMTINGPWAWSN-IDTSKVNYGV 270

Query: 311 SYIPYPEDSKQN---GVVIGGASLWMTNMVSEETQQGAWDFMKYLTKPDVQAKWHTATGY 367
+ +P + GV+ G + N K L K ++ T G
Sbjct: 271 TVLPTFKGQPSKPFVGVLSAGINAASPN--------------KELAKEFLENYLLTDEGL 316

Query: 368 FSINPD----AYNEPLVKEQYEKYPQLKVTVDQLQATKQSPATQGALISVFPESRDAVVK 423
++N D A +E+ K P++ T++ Q +G ++ P+
Sbjct: 317 EAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQ--------KGEIMPNIPQMSAFWYA 368

Query: 424 ALEAMYDGENSKEALDEAAKATDRAIS 450
A+ + + ++ +DEA K I+
Sbjct: 369 VRTAVINAASGRQTVDEALKDAQTRIT 395


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS0540   HTHFIS  92     6e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 92.2 bits (229), Expect = 6e-24
Identities = 35/140 (25%), Positives = 67/140 (47%), Gaps = 2/140 (1%)

Query: 2 RLLVVEDNASLLESIVQILCDE-FEVDTALNGEDGLFLALQNIYDAILLDVMMPEMDGFE 60
+LV +D+A++ + Q L ++V N D ++ DV+MP+ + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 61 VIQKIRDEKIETPVLFLTARDSLEDRVKGLDFGGDDYIVKPFQAPELKARI-RALLRRSG 119
++ +I+ + + PVL ++A+++ +K + G DY+ KPF EL I RAL
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 120 SLTTKQTIRYKGIELFGKDK 139
+ + G+ L G+
Sbjct: 125 RPSKLEDDSQDGMPLVGRSA 144


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS0541   PF06580  39     2e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.5 bits (92), Expect = 2e-05
Identities = 34/198 (17%), Positives = 70/198 (35%), Gaps = 53/198 (26%)

Query: 234 TISKECRRLSKLVANLLL---------LARSDSNQIEMDKKIFELDKLLEEIVEPYKEIA 284
I +L L S++ Q+ + ++ +V+ Y ++A
Sbjct: 181 NIRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADEL--------TVVDSYLQLA 232

Query: 285 SYQEKEMILKVEYDISFMGDRERIHQMMV------ILLDNAMKY----TNEGGHIQIDCT 334
S Q ++ L+ E I+ I + V L++N +K+ +GG I + T
Sbjct: 233 SIQFEDR-LQFENQIN-----PAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGT 286

Query: 335 QTNSSIRIRVKDDGIGVKGEDIPKLFDRFYQGDKARSASEGAGLGLSIANWIVEKHYGK- 393
+ N ++ + V++ G ++ E G GL ++ YG
Sbjct: 287 KDNGTVTLEVENTG-----------------SLALKNTKESTGTGLQNVRERLQMLYGTE 329

Query: 394 --ISVESQWGEGTCFEVI 409
I + + G+ +I
Sbjct: 330 AQIKLSEKQGKVNAMVLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS0545   PF06580  38     7e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 37.9 bits (88), Expect = 7e-05
Identities = 20/99 (20%), Positives = 42/99 (42%), Gaps = 19/99 (19%)

Query: 434 LIDNALE-AVTNCEKK-RVEVKIQHED-ILTITVQDTGKGIQEKEIEELFTKGYSTKGDN 490
L++N ++ + + ++ +K ++ +T+ V++TG + E +
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE------------S 310

Query: 491 RGYGLYLVKESIQRINGE---IHMHSLVGKGTTITIEIP 526
G GL V+E +Q + G I + GK + IP
Sbjct: 311 TGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNA-MVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS0546   HTHFIS  80     2e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.9 bits (197), Expect = 2e-19
Identities = 32/129 (24%), Positives = 59/129 (45%), Gaps = 5/129 (3%)

Query: 2 IKVLIVEDDPMVAMLNTHYLEQVGGFELVQAVNSIKSAIEVLEESRIDLVLLDIFMPEET 61
+L+ +DD + + L + G V+ ++ + + DLV+ D+ MP+E
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GFELLMYIRNQEKEIDIMMISAVHDMGSIKKALQYGVVDYLIKPFTFERFKEALTIYREK 121
F+LL I+ ++ ++++SA + + KA + G DYL KPF E + I
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT---ELIGIIGRA 118

Query: 122 LTFMKEQQK 130
L K +
Sbjct: 119 LAEPKRRPS 127


67  BAS0552  BAS0555  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS0552   -2      11       0.204041   acetyltransferase
BAS0553   -2      9        -0.357481  sensor histidine kinase
BAS0554   -1      11       0.587800   DNA-binding response regulator
BAS0555   0       13       0.795807   acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0552   SACTRNSFRASE  38     5e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 38.0 bits (88), Expect = 5e-06
Identities = 20/87 (22%), Positives = 31/87 (35%), Gaps = 4/87 (4%)

Query: 59 GAFKDGKLIGVATLETKPYVKQEHKAKIGSVYVSPKARGLGAGKALIKECLELAKSLEVE 118
+ + IG + + A I + V+ R G G AL+ + +E AK
Sbjct: 69 LYYLENNCIGRIKIRSN----WNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFC 124

Query: 119 QVMLDVVVGNDGAKKLYESLGFKTFGV 145
+ML+ N A Y F V
Sbjct: 125 GLMLETQDINISACHFYAKHHFIIGAV 151


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS0553   60KDINNERMP  31     0.013    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 30.7 bits (69), Expect = 0.013
Identities = 23/87 (26%), Positives = 35/87 (40%), Gaps = 9/87 (10%)

Query: 152 KKSKFITTVSP-IHTTEFQGKLYMLLKTSFLENMLLKLMKQFLIISVLTIILTTISVFIF 210
+ + V+P + T G L+ + + F LLK + F+ +II+ T V
Sbjct: 312 EIQDKMAAVAPHLDLTVDYGWLWFISQPLF---KLLKWIHSFVGNWGFSIIIITFIV--- 365

Query: 211 SRVITEPL-IKMKRATEKMSKLNKPIQ 236
R I PL + KM L IQ
Sbjct: 366 -RGIMYPLTKAQYTSMAKMRMLQPKIQ 391


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS0554   HTHFIS  94     1e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.1 bits (234), Expect = 1e-24
Identities = 34/121 (28%), Positives = 64/121 (52%), Gaps = 1/121 (0%)

Query: 3 KILLVDDEERMLRLLDLFLSPRGYFCMKATSGLEALKLIEQKDFDIILLDVMMPNMDGWD 62
IL+ DD+ + +L+ LS GY ++ + I D D+++ DV+MP+ + +D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 63 TCYQIRQI-SNVPIIMLTARNQNYDMVKGLTMGADDYITKPFDEHVLVARIEAILRRTKK 121
+I++ ++P+++++A+N +K GA DY+ KPFD L+ I L K+
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 D 122

Sbjct: 125 R 125


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0555   SACTRNSFRASE  43     1e-07    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 42.6 bits (100), Expect = 1e-07
Identities = 29/123 (23%), Positives = 42/123 (34%), Gaps = 7/123 (5%)

Query: 23 TKNPEAFSSSYEDVLKHEDPVAAMAKRLSNPDKYTLGVFKDKDLIGIATLETKPFIKQEH 82
T E FS Y K + + K + + + IG + +
Sbjct: 36 TYTEERFSKPY---FKQYEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRSN----WNG 88

Query: 83 KAKIGSVFVSPKARGLGAGRALIKAIIENADKLHVEQLMLDVVVGNDAAKKLYESLGFQT 142
A I + V+ R G G AL+ IE A + H LML+ N +A Y F
Sbjct: 89 YALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFII 148

Query: 143 YGV 145
V
Sbjct: 149 GAV 151


68  BAS0685  BAS0692  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS0685   -2      17       -1.062846  AcrB/AcrD/AcrF family transporter
BAS0687   2       16       -1.370697  ******************* hypothetical protein
BAS0688   2       13       -1.280906  hypothetical protein
BAS0689   1       11       -0.232362  hypothetical protein
BAS0690   -1      11       0.644626   M24/M37 family peptidase
BAS0691   0       14       1.369279   transcriptional activator TenA
BAS0692   -1      14       1.489783   ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS0685   ACRIFLAVINRP  565    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 565 bits (1458), Expect = 0.0
Identities = 222/1039 (21%), Positives = 447/1039 (43%), Gaps = 55/1039 (5%)

Query: 4 LTKFSLKNRAAVIIMVFLISILGVYSGSKLPMEFLPSIDNPAVTVTTLSPGLDAEAMTKE 63
+ F ++ ++ ++ + G + +LP+ P+I PAV+V+ PG DA+ +
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 64 VTDPLEKQFRNLEHIDNITS-STHEGLSRIDIAYTSKANMKDATREVEKAINTIK--LPK 120
VT +E+ ++++ ++S S G I + + S + A +V+ + LP+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 DATKPIVSQLNTTMIPLAQIAIQKQNGFSKADE--KQIEKEIVPQLESIDGVANVMFFGK 178
+ + +S ++ L N + D+ + + L ++GV +V FG
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG- 179

Query: 179 STSELSIILDPNQLKDKNVTTEQILKVLQGKETSTPAG------AVTVNKEEYNLRVIGD 232
+ + I LD + L +T ++ L+ + AG A+ + ++
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 233 IKNVDDIKNITVAP-----HVKLQDVAQIEL-KQHYDTISHINGEEGTGLIIMKEPSKNA 286
KN ++ +T+ V+L+DVA++EL ++Y+ I+ ING+ GL I NA
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 287 VAIGKEIDKKIKDISKQYKDQFSIKLLASTHEQVENAVTSMGKEVILGAIAATLIILIFL 346
+ K I K+ ++ + + T V+ ++ + K + + L++ +FL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 347 RNFRTTLIAVVSIPLSILLTLFLLHQSNITLNTLTLGGLAVAVGRLVDDSIVVIENIFRR 406
+N R TLI +++P+ +L T +L ++NTLT+ G+ +A+G LVDD+IVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 407 LQKEYFS-KDIILDATKEVAVAITSSTLTTVAVFLPIGLVSGVIGKLMLPMVLAVVYSIL 465
+ ++ K+ + ++ A+ + AVF+P+ G G + + +V ++
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 466 SSLIVALTVVPLMAFLLLKKIK---HKKPS------------SSPRYVATLKWALSHKFI 510
S++VAL + P + LLK + H+ S Y ++ L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 511 ILLTSFLLFAGSIAAYVLLPKANIKSEDDTMLSINMTFPADYALETQKQKAFDFEKKLLS 570
LL L+ AG + ++ LP + + ED + + PA E ++ L
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 571 NSDVTD-VILRMGSSAEDAQWGQTTKNNLASIFVVFK-----------KGSDIDQYIKEL 618
N + + + GQ N FV K + I + EL
Sbjct: 600 NEKANVESVFTVNGFSFS---GQA--QNAGMAFVSLKPWEERNGDENSAEAVIHRAKMEL 654

Query: 619 KKEHNAF-EPAELDYIKTSYSSSGGGNNLQFNVTATNETNLKKAATIVETKLKNMDDLSK 677
K + F P + I +++G L ++ + ++ ++ L
Sbjct: 655 GKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVS 714

Query: 678 VKTNLEDSKKEWQIHVDQTKAEQLGLTPELAAQQVAFLMKKSPIGEVSINNEKTTIMIEH 737
V+ N + ++++ VDQ KA+ LG++ Q ++ + + + + + ++
Sbjct: 715 VRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQA 774

Query: 738 KKESITKQEDILNTNILSPINGPIPLKDIATISEKQLQTEVFHKDGKETIQITAEASNED 797
+ ED+ + S +P T + +G +++I EA+
Sbjct: 775 DAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGT 834

Query: 798 LSKVSAEVNKAITDLDLPSGAKVNIAGATESMQENFTDLFKIMGIAIGIVYLIMVITFGQ 857
S + + + + LP+G + G + + + ++ I+ +V+L + +
Sbjct: 835 SSGDAMALMENLAS-KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYES 893

Query: 858 ARAPFAILFSLPLAAVGGILGLIISGTPVDVNSLIGALMLIGIVVTNAIVLIERVQQNRE 917
P +++ +PL VG +L + DV ++G L IG+ NAI+++E + E
Sbjct: 894 WSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLME 953

Query: 918 H-GMETREALLEAGSTRLRPIIMTAITTIVAMLPLLFGQSQAGSMVSKSLAVVVIGGLAV 976
G EA L A RLRPI+MT++ I+ +LPL + AGS ++ + V+GG+
Sbjct: 954 KEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAIS-NGAGSGAQNAVGIGVMGGMVS 1012

Query: 977 STVLTLVVVPVMYELLDKI 995
+T+L + VPV + ++ +
Sbjct: 1013 ATLLAIFFVPVFFVVIRRC 1031



Score = 127 bits (321), Expect = 5e-32
Identities = 96/518 (18%), Positives = 198/518 (38%), Gaps = 42/518 (8%)

Query: 509 FIILLTSFLLFAGSIAAYVLLPKANIKSEDDTMLSINMTFPADYALETQKQKAFDFEKKL 568
F +L L+ AG++A + LP A + +S++ +P A Q
Sbjct: 11 FAWVLAIILMMAGALA-ILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT--------- 60

Query: 569 LSNSDVTDVILR--MGSSAEDAQWGQTTKNNLASIFVVFKKGSDIDQYIKELKKEHNAFE 626
VT VI + G + +I + F+ G+D D +++ +
Sbjct: 61 -----VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLAT 115

Query: 627 P-----AELDYIKTSYSSSGGGNNLQFNVTATNETNLKK---AATIVETKLKNMDDLSKV 678
P + I SSS F T A+ V+ L ++ + V
Sbjct: 116 PLLPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDV 175

Query: 679 KTNLEDSKKEWQIHVDQTKAEQLGLTPE-----LAAQQVAFLMKKSPIGEVSINNEKTTI 733
L ++ +I +D + LTP L Q + G ++ ++
Sbjct: 176 --QLFGAQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQ-IAAGQLGGTPALPGQQLNA 232

Query: 734 MIEHKKESITKQEDILNTNILSPING-PIPLKDIATISE-KQLQTEVFHKDGKETIQIT- 790
I + E+ + +G + LKD+A + + + +GK +
Sbjct: 233 SIIAQTRFKNP-EEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGI 291

Query: 791 AEASNEDLSKVSAEVNKAITDL--DLPSGAKVNIA-GATESMQENFTDLFKIMGIAIGIV 847
A+ + + + + +L P G KV T +Q + ++ K + AI +V
Sbjct: 292 KLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLV 351

Query: 848 YLIMVITFGQARAPFAILFSLPLAAVGGILGLIISGTPVDVNSLIGALMLIGIVVTNAIV 907
+L+M + RA ++P+ +G L G ++ ++ G ++ IG++V +AIV
Sbjct: 352 FLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIV 411

Query: 908 LIERVQQ-NREHGMETREALLEAGSTRLRPIIMTAITTIVAMLPLLFGQSQAGSMVSKSL 966
++E V++ E + +EA ++ S ++ A+ +P+ F G++ +
Sbjct: 412 VVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIY-RQF 470

Query: 967 AVVVIGGLAVSTVLTLVVVPVMYELLDKIGRKRRSRRK 1004
++ ++ +A+S ++ L++ P + L K K
Sbjct: 471 SITIVSAMALSVLVALILTPALCATLLKPVSAEHHENK 508



Score = 94.1 bits (234), Expect = 1e-21
Identities = 74/516 (14%), Positives = 174/516 (33%), Gaps = 41/516 (7%)

Query: 3 RLTKFSLKNRAAVIIMVFLISILGVYSGSKLPMEFLPSIDNPAVTVT-TLSPGLDAE--- 58
L + +++ LI V +LP FLP D L G E
Sbjct: 528 NSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQ 587

Query: 59 -AMTKEVTDPLEKQFRNLEHIDNITSSTHEGLSRID-IAYTSKANMKDATR---EVEKAI 113
+ + L+ + N+E + + + G ++ +A+ S ++ E I
Sbjct: 588 KVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVI 647

Query: 114 NTIKLPKDATK---------PIVSQLNTTMIPLAQIAIQKQNGFSKADEKQIEKEIVPQL 164
+ K+ + P + +L T + Q G Q +++
Sbjct: 648 HRAKMELGKIRDGFVIPFNMPAIVELGTATG--FDFELIDQAGLGHDALTQARNQLLGMA 705

Query: 165 -ESIDGVANVMFFGKS-TSELSIILDPNQLKDKNVTTEQILKVLQGKETSTPAGAVTVNK 222
+ + +V G T++ + +D + + V+ I + + T
Sbjct: 706 AQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRG 765

Query: 223 EEYNLRVIGD---IKNVDDIKNITVAPH----VKLQDVAQIELKQHYDTISHINGEEGTG 275
L V D +D+ + V V + NG
Sbjct: 766 RVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSME 825

Query: 276 LIIMKEPSKNAVAIGKEIDKKIKDISKQYKDQFSIKLLASTHEQVENAVTSMGKEVILGA 335
+ P ++ + +++++ + ++++ S + L A
Sbjct: 826 IQGEAAPGTSS----GDAMALMENLASKLPAGIGYDWTGMSYQERL----SGNQAPALVA 877

Query: 336 IAATLIILIF---LRNFRTTLIAVVSIPLSILLTLFLLHQSNITLNTLTLGGLAVAVGRL 392
I+ ++ L ++ + ++ +PL I+ L N + + GL +G
Sbjct: 878 ISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLS 937

Query: 393 VDDSIVVIENIFRRLQKEYFS-KDIILDATKEVAVAITSSTLTTVAVFLPIGLVSGVIGK 451
++I+++E ++KE + L A + I ++L + LP+ + +G
Sbjct: 938 AKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSG 997

Query: 452 LMLPMVLAVVYSILSSLIVALTVVPLMAFLLLKKIK 487
+ + V+ ++S+ ++A+ VP+ ++ + K
Sbjct: 998 AQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRCFK 1033


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits           Score    E-value    Comments
BAS0688     MICOLLPTASE    32       0.004      Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 32.0 bits (72), Expect = 0.004
Identities = 14/68 (20%), Positives = 28/68 (41%), Gaps = 1/68 (1%)

Query: 50 KEFYKEENLAAFIVYGM-NKAKNLPQFHKDEIPTLVRILRLCQEIGWYEEANTFMVNQGL 108
F+ + I+YG+ + + IPTLV LR +G+Y + +++ L
Sbjct: 129 YTFFSNRDRVQAIIYGLEDSGRTYTADDDKGIPTLVEFLRAGYYLGFYNKQLSYLNTPQL 188

Query: 109 AEFVHTSL 116
++
Sbjct: 189 KNECLPAM 196


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score    E-value    Comments
BAS0690     RTXTOXIND    32       0.004      Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 31.7 bits (72), Expect = 0.004
Identities = 17/71 (23%), Positives = 27/71 (38%), Gaps = 14/71 (19%)

Query: 284 AAQGNVSIQAAAAGKVVKSYYSASYGNVVFIAHQINGKLYTTVYAHMKDRTVQAGDQVQA 343
+ G V I A A GK+ S G I N + K+ V+ G+ V+
Sbjct: 75 SVLGQVEIVATANGKLTHS------GRSKEIKPIENSIV--------KEIIVKEGESVRK 120

Query: 344 GQLVGHMGNTG 354
G ++ + G
Sbjct: 121 GDVLLKLTALG 131


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score    E-value    Comments
BAS0692     PF05272    30       0.014      Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.014
Identities = 11/32 (34%), Positives = 15/32 (46%)

Query: 39 GPSGCGKSTLFRLITGLEEASTGQIELTETKS 70
G G GKSTL + GL+ S ++ K
Sbjct: 603 GTGGIGKSTLINTLVGLDFFSDTHFDIGTGKD 634


69    BAS1137    BAS1140    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias      Product
BAS1137     -2        16         0.639537     dTDP-glucose 4,6-dehydratase
BAS1138     0         17         1.069504     dTDP-4-dehydrorhamnose reductase
BAS1139     4         18         1.722245     enoyl-ACP reductase
BAS1140     5         18         1.582572     hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits            Score    E-value    Comments
BAS1137     NUCEPIMERASE    188      1e-59      Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 188 bits (478), Expect = 1e-59
Identities = 75/332 (22%), Positives = 141/332 (42%), Gaps = 26/332 (7%)

Query: 1 MNILVTGGAGFIGSNFVHYMLQSYETYKIINFDALT--YSGNLNNVK-SIQDHPNYYFVK 57
M LVTG AGFIG + +L+ ++++ D L Y +L + + P + F K
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLE--AGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHK 58

Query: 58 GEIQNGELLEHVIKERDVQVIVNFAAESHVDRSIENPIPFYDTNVIGTVTLLELVKKYPH 117
++ + E + + + + V S+ENP + D+N+ G + +LE +
Sbjct: 59 IDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKI 118

Query: 118 IKLVQVSTDEVYGSLGKTGRFTEETPLA-PNSPYSSSKASADMIALAYYKTYQLPVIVTR 176
L+ S+ VYG L + F+ + + P S Y+++K + +++A Y Y LP R
Sbjct: 119 QHLLYASSSSVYG-LNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLR 177

Query: 177 CSNNYGPYQYPEKLIPLMVTNALEGKKLPLYGDGLNVRDWLHVTDHCSAIDVVLHKGRV- 235
YGP+ P+ + LEGK + +Y G RD+ ++ D AI +
Sbjct: 178 FFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHA 237

Query: 236 -----------------GEVYNIGGNNEKTNVEVVEQIITLLGKTKKDIEYVTDRLGHDR 278
VYNIG ++ ++ ++ + LG + + + G
Sbjct: 238 DTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGI-EAKKNMLPLQPGDVL 296

Query: 279 RYAINAEKMKNEFDWEPKYTFEQGLQETVQWY 310
+ + + + + P+ T + G++ V WY
Sbjct: 297 ETSADTKALYEVIGFTPETTVKDGVKNFVNWY 328


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits            Score    E-value    Comments
BAS1138     NUCEPIMERASE    44       4e-07      Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 43.6 bits (103), Expect = 4e-07
Identities = 36/200 (18%), Positives = 70/200 (35%), Gaps = 38/200 (19%)

Query: 4 RVIITGANGQLGKQLQEEL--NPEE----------YDIYPFDKKL------------LDI 39
+ ++TGA G +G + + L + YD+ +L +D+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 40 TNISQVQQVVQEIRPHIIIHCAAYTKVDQAEKERDLAYV-INAIGARNVAVASQLVGAK- 97
+ + + + V + E AY N G N+ + +
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYS-LENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 98 LVYISTDYVFQGDRPEGYDEFHNPA-PINIYGASKYAGEQFVKELHNKYFIVRTSW---- 152
L+Y S+ V+ +R + + P+++Y A+K A E + Y + T
Sbjct: 121 LLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRFFT 180

Query: 153 LYGKYGN------NFVKTMI 166
+YG +G F K M+
Sbjct: 181 VYGPWGRPDMALFKFTKAML 200


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits            Score    E-value    Comments
BAS1139     DHBDHDRGNASE    57       7e-12      2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 57.0 bits (137), Expect = 7e-12
Identities = 60/259 (23%), Positives = 105/259 (40%), Gaps = 19/259 (7%)

Query: 4 LQGKTFVVMGVANQRSIAWGIARSLHNAGAKLI-FTYAGERLERNVRELADTLEGQESLV 62
++GK + G A + I +AR+L + GA + Y E+LE+ V L E + +
Sbjct: 6 IEGKIAFITGAA--QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSL--KAEARHAEA 61

Query: 63 LPCDVTNDEELTACFETIKQEVGTIHGVAHCIAFANRDDLKGEFVDTSRDGFLLAQNISA 122
P DV + + I++E+G I + + G S + + ++++
Sbjct: 62 FPADVRDSAAIDEITARIEREMGPIDILVNVAGVLR----PGLIHSLSDEEWEATFSVNS 117

Query: 123 FSLTAVAREAKKVMT--EGGNILTLTYLGGERVVKNYNVMGVAKASLEASVKYLANDLGQ 180
+ +R K M G+I+T+ + +KA+ K L +L +
Sbjct: 118 TGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAE 177

Query: 181 HGIRVNAISAGPIRT-----LSAKGVGDFNSILREIEE---RAPLRRTTTQEEVGDTAVF 232
+ IR N +S G T L A G I +E PL++ ++ D +F
Sbjct: 178 YNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLF 237

Query: 233 LFSDLARGVTGENIHVDSG 251
L S A +T N+ VD G
Sbjct: 238 LVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits           Score    E-value    Comments
BAS1140     IGASERPTASE    30       0.009      IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 29.6 bits (66), Expect = 0.009
Identities = 23/80 (28%), Positives = 33/80 (41%), Gaps = 6/80 (7%)

Query: 29 LELAAPKTKRIILTNFENEDRKEESNRNENVVSSAVEEVIEQEEQQQEQEQEQEE----- 83
+ A K TN E E+ + + V ++E+ + E E+ QE
Sbjct: 1069 AKEAKSNVKANTQTN-EVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTS 1127

Query: 84 QVEEKTEEEEQVQEQQEPVR 103
QV K E+ E VQ Q EP R
Sbjct: 1128 QVSPKQEQSETVQPQAEPAR 1147


70    BAS1213    BAS1219    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias      Product
BAS1213     -2        12         0.815330     DNA-binding response regulator
BAS1214     -1        11         0.339168     sensor histidine kinase
BAS1215     0         10         0.674521     GntR family transcriptional regulator
BAS1216     0         13         0.790112     hypothetical protein
BAS1217     0         13         0.166634     (Fe-S)-binding protein
BAS1218     0         15         -0.567975    hypothetical protein
BAS1219     -1        17         -0.478485    late competence protein comC
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits      Score    E-value    Comments
BAS1213     HTHFIS    112      6e-31      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 112 bits (281), Expect = 6e-31
Identities = 33/130 (25%), Positives = 61/130 (46%), Gaps = 1/130 (0%)

Query: 1 MSKYRVLVVDDESDMRQLVGMYLDNFGYEWGEAENGKEALKKLETDHYDFVVLDIMMPEM 60
M+ +LV DD++ +R ++ L GY+ N + + D VV D++MP+
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGLSVCKEIRKT-SDVPIIFLTAKGEEWNRVNGLRMGADDYIVKPFSPGELIARMEAVLR 119
+ + I+K D+P++ ++A+ + GA DY+ KPF ELI + L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 RYTKQEQQEE 129
++ + E
Sbjct: 121 EPKRRPSKLE 130


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score    E-value    Comments
BAS1214     PF06580    39       2e-05      Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.1 bits (91), Expect = 2e-05
Identities = 30/188 (15%), Positives = 73/188 (38%), Gaps = 32/188 (17%)

Query: 275 EKVTQLIHKEADRMQRLVHDLLDL--AQLEGEHFPLQKQPIVFSQ---LIEDVLDTYEIK 329
+ LI ++ + + ++ L +L L + + + +++ L I+
Sbjct: 180 NNIRALILEDPTKAREMLTSLSELMRYSLRYS----NARQVSLADELTVVDSYLQLASIQ 235

Query: 330 FIEKKIRISTNLNPEII-VMIDEDRMQQVLHNVLDNAIRYTNQNGDIMITLRQIDDYCEL 388
F E +++ +NP I+ V + +Q ++ N + + I Q G I++ + + L
Sbjct: 236 F-EDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTL 294

Query: 389 SIKDTGIGIDTEHLENLGERFYRVDKARSRQHGGTGLGLAIVRQ-IVHIHDGQW--QIES 445
+++TG A TG GL VR+ + ++ + ++
Sbjct: 295 EVENTG------------------SLALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSE 336

Query: 446 EKGNGTTV 453
++G +
Sbjct: 337 KQGKVNAM 344


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits            Score    E-value    Comments
BAS1217     ANTHRAXTOXNA    32       0.007      Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 31.6 bits (71), Expect = 0.007
Identities = 22/87 (25%), Positives = 35/87 (40%), Gaps = 7/87 (8%)

Query: 88 KTKEEAAKYIQDVAKKKQAKKVVKSKSMVTEEISMNHALEEIGCEVLE--SDLGEYILQV 145
KT++E K + K + K T+++ L++I +VLE S+LG I
Sbjct: 53 KTEKEKFKDSINNLVKTEFTNETLDKIQQTQDL-----LKKIPKDVLEIYSELGGEIYFT 107

Query: 146 DNDPPSHIIAPALHKNRTQIRDVFKEK 172
D D H L + + EK
Sbjct: 108 DIDLVEHKELQDLSEEEKNSMNSRGEK 134


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits            Score    E-value    Comments
BAS1219     PREPILNPTASE    133      7e-40      Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family

signature.
Length = 290

Score = 133 bits (335), Expect = 7e-40
Identities = 64/264 (24%), Positives = 122/264 (46%), Gaps = 35/264 (13%)

Query: 4 YVYALLVGMVFGSFFMLIAMRIPL------------------------GESIIIPRSHCH 39
+ L ++ GSF ++ R+P+ ++++PRS C
Sbjct: 16 FSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDEPPYNLMVPRSCCP 75

Query: 40 YCKYVLKPKELIPIISFCIQRGRCTNCKRKISILYVIFELVTGIICLLTVYMIGVERELI 99
+C + + E IP++S+ RGRC C+ IS Y + EL+T ++ + + +
Sbjct: 76 HCNHPITALENIPLLSWLWLRGRCRGCQAPISARYPLVELLTALLSVAVAMTLAPGWGTL 135

Query: 100 IILSLFSLLLIISVTDYIYMLIPNRI---LAWFSCLLILECVFVPLVTWTESIVGSGVIF 156
L L +L+ ++ D ML+P+++ L W L L FV L ++++G+ +
Sbjct: 136 AALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLG---DAVIGAMAGY 192

Query: 157 ILLYCMQKIY-----PEGLGGGDIKLLSLLGFIAGLKGVFMILFLSSFFSLCFFGAGLVL 211
++L+ + + EG+G GD KLL+ LG G + + ++L LSS ++L
Sbjct: 193 LVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLILL 252

Query: 212 KRMKMRTQIPFGPFISLGAICYML 235
+ IPFGP++++ +L
Sbjct: 253 RNHHQSKPIPFGPYLAIAGWIALL 276


71    BAS1540    BAS1545    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias      Product
BAS1540     2         9          -1.597067    flagellar motor protein MotS
BAS1541     2         7          -1.947607    chemotaxis response regulator
BAS1544     3         10         -2.053472    flagellar motor switch protein
BAS1545     5         12         -3.374734    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score    E-value    Comments
BAS1540     OMPADOMAIN    63       6e-14      OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 63.4 bits (154), Expect = 6e-14
Identities = 30/127 (23%), Positives = 56/127 (44%), Gaps = 17/127 (13%)

Query: 110 SVVIVDNLIFDTGDANVKPEAKEIISQLVGFFQSVPNP---IVVEGHTDSRPIHNDKFPS 166
+ +++F+ A +KPE + + QL ++ +VV G+TD +D +
Sbjct: 214 HFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRI--GSDAY-- 269

Query: 167 NWELSSARAANMIHHLIEVYNVDDKRLAAVGYADTKPVVPN---------DSPQNWEKNR 217
N LS RA +++ +LI + +++A G ++ PV N +R
Sbjct: 270 NQGLSERRAQSVVDYLIS-KGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDR 328

Query: 218 RVVIYIK 224
RV I +K
Sbjct: 329 RVEIEVK 335


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits      Score    E-value    Comments
BAS1541     HTHFIS    83       9e-22      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.6 bits (204), Expect = 9e-22
Identities = 28/112 (25%), Positives = 46/112 (41%), Gaps = 2/112 (1%)

Query: 4 KILVVDDAMFMRTMIKNLLKSNSEFEVIGEAENGVEAIQKYKELQPDIVTLDITMPEMDG 63
ILV DD +RT++ L + + N + D+V D+ MP+ +
Sbjct: 5 TILVADDDAAIRTVLNQAL--SRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 64 LEALKEIIKIDASAKVVICSAMGQQGMVLDAIKGGAKDFIVKPFQADRVIEA 115
+ L I K V++ SA + A + GA D++ KPF +I
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGI 114


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits            Score    E-value    Comments
BAS1544     FLGMOTORFLIN    56       1e-11      Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 55.7 bits (134), Expect = 1e-11
Identities = 23/71 (32%), Positives = 40/71 (56%)

Query: 473 DTSILQNVEMNVKFVFGSTVKTIQDILSLQENEAVVLDEDIDEPIRIYVNDVLVAYGELV 532
D ++ ++ + + G T TI+++L L + V LD EP+ I +N L+A GE+V
Sbjct: 53 DIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVV 112

Query: 533 NVDGFFGVKVT 543
V +GV++T
Sbjct: 113 VVADKYGVRIT 123


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits           Score    E-value    Comments
BAS1545     IGASERPTASE    34       0.002      IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.9 bits (77), Expect = 0.002
Identities = 18/126 (14%), Positives = 51/126 (40%), Gaps = 1/126 (0%)

Query: 301 EQKTEEDKKIEEPENEDKLENKLEDKKVTEKQEDSKVEISLPEEKTPVVQIPKKEEKVND 360
+ EE K+E + ++ + + E+ E + + E P V I + + + N
Sbjct: 1105 TVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNT 1164

Query: 361 LIKEPLKEKEKITYVIKEPLTDNKEVNKTKAQKDKDNNNQVISKKKEKKEEPEEKKEAKS 420
+ ++ + +++P+T++ VN + + N + + E K + +
Sbjct: 1165 TADTE-QPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRH 1223

Query: 421 EQGIQA 426
+ +++
Sbjct: 1224 RRSVRS 1229



Score = 29.3 bits (65), Expect = 0.045
Identities = 26/109 (23%), Positives = 44/109 (40%), Gaps = 7/109 (6%)

Query: 22 LQSKAEEQNVP-EQNINEV-NVQEENKEVQEQLEQVEMKQDKEEQQEAKNEQETEKKIET 79
+K + NV NEV E KE Q + + E++++AK ETEK E
Sbjct: 1067 EVAKEAKSNVKANTQTNEVAQSGSETKETQTT--ETKETATVEKEEKAK--VETEKTQEV 1122

Query: 80 DQGVITVNKPELKVGEEVLVTIEPKEKNVQSIKGILRLPKNGDQYEQER 128
+ V + P+ + E V EP +N ++ + + E+
Sbjct: 1123 PK-VTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQ 1170


72    BAS1551    BAS1568    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias      Product
BAS1551     1         18         -2.014912    flagellar hook-associated protein FlgK
BAS1554     2         19         -2.638892    flagellar capping protein
BAS1555     2         19         -0.388287    flagellar protein FliS
BAS1556     1         15         -0.576422    hypothetical protein
BAS1557     1         14         -0.789371    flagellar basal-body rod protein FlgB
BAS1558     0         12         -0.352368    flagellar basal body rod protein FlgC
BAS1559     1         12         -0.101830    flagellar hook-basal body protein FliE
BAS1560     0         11         0.169040     flagellar MS-ring protein
BAS1561     0         11         -0.067376    flagellar motor switch protein G
BAS1562     0         11         -0.410852    flagellar assembly protein H
BAS1563     -1        12         -0.724093    flagellum-specific ATP synthase
BAS1567     -2        13         -1.719720    flagellar basal body rod modification protein
BAS1568     0         13         -3.638951    flagellar hook protein FlgE
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score    E-value    Comments
BAS1551     FLGHOOKAP1    104      3e-26      Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 104 bits (260), Expect = 3e-26
Identities = 72/249 (28%), Positives = 112/249 (44%), Gaps = 14/249 (5%)

Query: 4 SDYNTPLSGLLAAQMGLQTTKQNLSNIHTPGYVRQMVNYGSAGASQGYSPEQKIGYGVQT 63
S N +SGL AAQ L T N+S+ + GY RQ A ++ G +G GV
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLG--AGGWVGNGVYV 59

Query: 64 LGVDRITDEVKTKQFNDQLSQLSYYNYMNSTLSRVESMVGTTGKNSLSSLMDGFFNAFRE 123
GV R D T Q +Q S +S++++M+ T+ SL++ M FF + +
Sbjct: 60 SGVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTS-SLATQMQDFFTSLQT 118

Query: 124 VAKNPEQPNYYDTLISETGKFTSQVNRLAKSLDTAEAQTTEDIEAHVNEFNRLAGSLAEA 183
+ N E P LI ++ +Q + L + Q I A V++ N A +A
Sbjct: 119 LVSNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASL 178

Query: 184 NKKI----GQAGTQVPNQLLDERDRIITEMSKYANIEVS---YESMNPNIASVRMNGVLT 236
N +I G PN LLD+RD++++E+++ +EVS + N +A NG
Sbjct: 179 NDQISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMA----NGYSL 234

Query: 237 VNGQDTYPL 245
V G L
Sbjct: 235 VQGSTARQL 243



Score = 54.2 bits (130), Expect = 6e-10
Identities = 19/51 (37%), Positives = 35/51 (68%)

Query: 380 LLEGIQQEKMGIEGVNMEEEMVNLMAFQKYFVANSKAITTMNEVFDSLFSI 430
++ + ++ I GVN++EE NL FQ+Y++AN++ + T N +FD+L +I
Sbjct: 495 VVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score    E-value    Comments
BAS1557     FLGHOOKAP1    31       0.001      Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 31.1 bits (70), Expect = 0.001
Identities = 10/28 (35%), Positives = 15/28 (53%)

Query: 20 NTVSSNIANANTPGYKAQDVTFAEQMNK 47
NT S+NI++ N GY Q A+ +
Sbjct: 19 NTASNNISSYNVAGYTRQTTIMAQANST 46


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score    E-value    Comments
BAS1558     FLGHOOKAP1    33       3e-04      Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 33.0 bits (75), Expect = 3e-04
Identities = 19/75 (25%), Positives = 32/75 (42%), Gaps = 7/75 (9%)

Query: 5 INASGSGLTTARKWMEVTSNNIVNANTTAAPGADLYERRSVVLESNNSFANMLDGSPTNG 64
IN + SGL A+ + SNNI + N Y R++ ++ NS G NG
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAG------YTRQTTIMAQANSTLGA-GGWVGNG 56

Query: 65 VKIKSIEADKTENLV 79
V + ++ + +
Sbjct: 57 VYVSGVQREYDAFIT 71



Score = 28.0 bits (62), Expect = 0.013
Identities = 10/38 (26%), Positives = 17/38 (44%)

Query: 97 NIDVTAEMTNVMVAQKMYEANTSVLNANKKMLDKDLEI 134
+++ E N+ Q+ Y AN VL + D + I
Sbjct: 508 GVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits           Score    E-value    Comments
BAS1559     FLGHOOKFLIE    35       5e-06      Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE

signature.
Length = 103

Score = 35.4 bits (81), Expect = 5e-06
Identities = 18/77 (23%), Positives = 36/77 (46%), Gaps = 1/77 (1%)

Query: 24 SQTSVVEGKKFIDLLEDMNQTQNNAQTAVYDLLTKGVG-ETHDVLIQQKKAESQMKTAAL 82
Q ++ + L+ ++ TQ A+T G +DV+ +KA M+
Sbjct: 27 PQPTISFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQ 86

Query: 83 VRDNLIENYKSLINMQI 99
VR+ L+ Y+ +++MQ+
Sbjct: 87 VRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits            Score    E-value    Comments
BAS1560     FLGMRINGFLIF    162      3e-46      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 162 bits (411), Expect = 3e-46
Identities = 87/459 (18%), Positives = 185/459 (40%), Gaps = 42/459 (9%)

Query: 17 LVIGAALLAIVTGALLYFTLPDKYVVVYQNLNDADKQEITAELSKLGVDYQLAADG-SIR 75
+V G+A +AIV +L+ PD Y ++ NL+D D I A+L+++ + Y+ A +I
Sbjct: 28 IVAGSAAVAIVVAMVLWAKTPD-YRTLFSNLSDQDGGAIVAQLTQMNIPYRFANGSGAIE 86

Query: 76 VQKNDAPWVRKEMNGMGLPFNSKSGEEILLESSLGSSEQDKKMKQIVGTKKQLEQDIVRN 135
V + +R + GLP G E+L + G S+ +++ + +L + I
Sbjct: 87 VPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQVNYQRALEGELARTI-ET 145

Query: 136 FATVETANVQITLPEKETIFDEEKAKGTAAITVGVKRGQLLTADQVAGIQQMISAAVPGV 195
V++A V + +P+ ++F E+ +A++TV ++ G+ L Q++ + ++S+AV G+
Sbjct: 146 LGPVKSARVHLAMPKP-SLFVREQKSPSASVTVTLEPGRALDEGQISAVVHLVSSAVAGL 204

Query: 196 KAEEVSVIDSKKGVISKGADEAHSTSSSSYEKEVEMQHQIEGKLKQDIDATLMTMFKPNE 255
V+++D ++++ + + ++ + +E ++++ I+A L +
Sbjct: 205 PPGNVTLVDQSGHLLTQSNTSGRDLNDAQ----LKFANDVESRIQRRIEAILSPIVGNGN 260

Query: 256 YKVNTKVSVNYDEVTRQSEKYG-DKGVLRSKQEQEESSTAQEGAETKQGA--GITANG-- 310
+++ + E Y + ++ + + +++ G G +N
Sbjct: 261 VHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPGGVPGALSNQPA 320

Query: 311 -------EVPNYGTNNNQNGKIVYDNKNGNKI----------ENYEIDKTVETIKKHP-E 352
P N QN + N N NYE+D+T+ K + +
Sbjct: 321 PPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSNYEVDRTIRHTKMNVGD 380

Query: 353 LTKTNVVVWVDNDTLVKRKI------DMTTFKEAIGTAAGLQADPNGNFTNGQVNVVTVQ 406
+ + +V V V+ TL K M ++ A G +NVV
Sbjct: 381 IERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDK-----RGDTLNVVNSP 435

Query: 407 FDQPKEEKKKEPEESGINWWLFGGIPAGLLAIGGLVWFF 445
F + P ++ L + + W
Sbjct: 436 FSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWIL 474


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits            Score    E-value    Comments
BAS1561     FLGMOTORFLIG    200      4e-64      Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 200 bits (511), Expect = 4e-64
Identities = 116/336 (34%), Positives = 197/336 (58%), Gaps = 6/336 (1%)

Query: 2 LDEISSKEKAAILIRTLEEGVAAKVIEYMTAKEKEVLLREIAKFRVYKSETLENVLGEFL 61
+ ++ K+KAAIL+ ++ +++KV +Y++ +E E L EIAK SE +NVL EF
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 62 YELNVKELNLVTPDKEYIRRIF-KNMPEDELEKLLEDLWYN-KDNPFEFLNSLTDLEPLL 119
EL + + + +Y R + K++ + ++ +L + PFEF+ D +L
Sbjct: 72 -ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRA-DPANIL 129

Query: 120 TVLNDESPQTIAIIASYIKPQLASQLIERLPDHKRVETVMGIAKLEQVDGELINQIGDLL 179
+ E PQTIA+I SY+ PQ AS ++ LP + IA +++ E++ ++ +L
Sbjct: 130 NFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVL 189

Query: 180 KSKLNNMAFNAINKTDGLKTIVNILNNVSRGVEKTVFQKLDEMDYELSEKIKENMFVFED 239
+ KL +++ G+ +V I+N R EK + + L+E D EL+E+IK+ MFVFED
Sbjct: 190 EKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFED 249

Query: 240 LLGLEDLALRRVLEEITDNGVIAKALKIAKEEIKEKLFTCMSSNRKEMILEELDGLGPLK 299
++ L+D +++RVL EI D +AKALK ++EK+F MS M+ E+++ LGP +
Sbjct: 250 IVLLDDRSIQRVLREI-DGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTR 308

Query: 300 MTDAEKAQQTITDTVKKLEKEGRIIVQRG-EDDVLI 334
D E++QQ I ++KLE++G I++ RG E+DVL+
Sbjct: 309 RKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDVLV 344


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score    E-value    Comments
BAS1568     FLGHOOKAP1    44       1e-06      Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 1e-06
Identities = 15/36 (41%), Positives = 24/36 (66%)

Query: 5 LYTSITGMNAAQNALSVTSNNIANAQTVGYKKQKAI 40
+ +++G+NAAQ AL+ SNNI++ GY +Q I
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI 39



Score = 37.6 bits (87), Expect = 9e-05
Identities = 10/39 (25%), Positives = 26/39 (66%)

Query: 397 SNVDLSVEFVDLMLYQRGFQGNAKVIKVSDEVLNEVVNL 435
S V+L E+ +L +Q+ + NA+V++ ++ + + ++N+
Sbjct: 507 SGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


73    BAS1582    BAS1611    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias      Product
BAS1582     3         18         1.095177     flagellin
BAS1583     4         26         0.817585     Slt family transglycosylase
BAS1587     3         24         -0.383995    flagellar motor switch protein
BAS1588     4         20         -0.244038    hypothetical protein
BAS1589     4         17         -0.357123    flagellar biosynthesis protein FliP
BAS1590     2         14         -0.389823    flagellar biosynthesis protein FliQ
BAS1591     1         12         -0.282301    flagellar biosynthesis protein FliR
BAS1592     0         9          0.073013     flagellar biosynthesis protein FlhB
BAS1593     1         10         0.420791     flagellar biosynthesis protein FlhA
BAS1596     -1        10         0.173457     flagellar basal body rod protein FlgG
BAS1597     -1        11         -0.613730    alanyl-tRNA synthetase
BAS1598     1         11         -1.038365    hypothetical protein
BAS1599     3         14         -1.352832    AzlC family protein
BAS1600     0         13         -3.577159    hypothetical protein
BAS1601     -1        11         -2.789634    TetR family transcriptional regulator
BAS1602     0         11         -2.973585    hypothetical protein
BAS1604     0         11         -2.932092    hypothetical protein
BAS1605     -1        10         -1.958660    hypothetical protein
BAS1606     -1        10         -1.747606    DNA-binding protein
BAS1607     -2        12         -0.465949    permease
BAS1608     -3        13         -0.481080    lipoprotein
BAS1609     -3        14         -0.675236    LysR family transcriptional regulator
BAS1610     -2        15         -0.368776    ABC transporter ATP-binding protein
BAS1611     -2        15         -0.645453    ABC transporter substrate-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score    E-value    Comments
BAS1582     FLAGELLIN    125      9e-35      Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 125 bits (314), Expect = 9e-35
Identities = 76/282 (26%), Positives = 130/282 (46%), Gaps = 18/282 (6%)

Query: 1 MRINTNINSMRTQEYMRQNQDKMNVSMNRLSSGKRINSAADDAAGLAIATRMRARQSGLE 60
INTN S+ TQ + ++Q ++ ++ RLSSG RINSA DDAAG AIA R + GL
Sbjct: 2 QVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLT 61

Query: 61 KASQNTQDGMSLIRTAESAMNSVSNILTRMRDIAVQSSNGTNTAENQSALQKEFAELQEQ 120
+AS+N DG+S+ +T E A+N ++N L R+R+++VQ++NGTN+ + ++Q E + E+
Sbjct: 62 QASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEE 121

Query: 121 IDYIAKNTEFNDKNLLAGTGAVTIGSTSISGAEISIETLDSSATNQQITIKLANTTAEKL 180
ID ++ T+FN +L+ + I + G + ITI L + L
Sbjct: 122 IDRVSNQTQFNGVKVLSQDNQMKIQVGANDG--------------ETITIDLQKIDVKSL 167

Query: 181 GIDATTSN----ISISGAASALAAISALNTALNTVAGNRATLGATLNRLDRNVENLNNQA 236
G+D N ++ S+ ++ +T R + + D + ++
Sbjct: 168 GLDGFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKV 227

Query: 237 TNMASAASQIEDADMAKEMSEMTKFKILNEAGISMLSQANQT 278
A+ D ++ K + A
Sbjct: 228 YVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAI 269



Score = 86.3 bits (213), Expect = 4e-21
Identities = 62/259 (23%), Positives = 107/259 (41%), Gaps = 7/259 (2%)

Query: 36 INSAADDAAGLAIATRMRARQSGLEKASQNTQDGMSLIRTAESAMNSVSNILTRMRDIAV 95
+ AG A A + G ++ G++ ++ + + T + V
Sbjct: 249 LFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKV 308

Query: 96 QSSNGTNTAENQSALQKEFAELQEQID-YIAKNTEFNDKN------LLAGTGAVTIGSTS 148
+ TA + + + F+DK L + S
Sbjct: 309 TLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNAVKGES 368

Query: 149 ISGAEISIETLDSSATNQQITIKLANTTAEKLGIDATTSNISISGAASALAAISALNTAL 208
+ T +++ + K G+ + + + S ++++++AL
Sbjct: 369 KITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKSTANPLASIDSAL 428

Query: 209 NTVAGNRATLGATLNRLDRNVENLNNQATNMASAASQIEDADMAKEMSEMTKFKILNEAG 268
+ V R++LGA NR D + NL N TN+ SA S+IEDAD A E+S M+K +IL +AG
Sbjct: 429 SKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSNMSKAQILQQAG 488

Query: 269 ISMLSQANQTPQMVSKLLQ 287
S+L+QANQ PQ V LL+
Sbjct: 489 TSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score    E-value    Comments
BAS1583     PF06580    29       0.021      Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.7 bits (64), Expect = 0.021
Identities = 8/42 (19%), Positives = 20/42 (47%), Gaps = 1/42 (2%)

Query: 122 LTKKY-NIQKIRSSNEGKYEDIIDRVSHTYGIPKTLIQKMIE 162
+ Y + I+ + ++E+ I+ +P L+Q ++E
Sbjct: 224 VVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVE 265


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits            Score    E-value    Comments
BAS1587     FLGMOTORFLIN    59       2e-14      Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 58.8 bits (142), Expect = 2e-14
Identities = 22/94 (23%), Positives = 51/94 (54%)

Query: 13 LEDFAGKRNEASKAHIDTVSDISIELGVKLGKASITLGDVKQLKVGDVLEVEKNLGHKVD 72
+ G + ID + DI ++L V+LG+ +T+ ++ +L G V+ ++ G +D
Sbjct: 39 FQQLGGGDVSGAMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLD 98

Query: 73 VYLSNMKVGIGEAIVMDEKFGIIISEIEADKKQA 106
+ ++ + GE +V+ +K+G+ I++I ++
Sbjct: 99 ILINGYLIAQGEVVVVADKYGVRITDIITPSERM 132


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits            Score    E-value    Comments
BAS1589     FLGBIOSNFLIP    164      2e-52      Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP

signature.
Length = 245

Score = 164 bits (417), Expect = 2e-52
Identities = 75/239 (31%), Positives = 136/239 (56%), Gaps = 2/239 (0%)

Query: 14 FVFSIVFSIIFVNPAYAAQNGFINFENGKEFTSN--SSVQLFALVTLLSLSSSIVLLFTH 71
+ + + P AQ I + + VQ +T L+ +I+L+ T
Sbjct: 4 LLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLMMTS 63

Query: 72 FTYFMIVLGITRQGLGVMNLPPNQVLVGLALFLSLFTMQPVLGQLKSDVWDPMTKEKITV 131
FT +IV G+ R LG + PPNQVL+GLALFL+ F M PV+ ++ D + P ++EKI++
Sbjct: 64 FTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEKISM 123

Query: 132 SQAAETTAPIMKEYMSKHTYKHDLKMMLKVRGEELPKDLKDLSLFTLVPSFTLTQIQKGL 191
+A E A ++E+M + T + DL + ++ + + + + L+P++ ++++
Sbjct: 124 QEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELKTAF 183

Query: 192 LTGMFIYLAFVFIDLIISTLLMYLGMMMVPPMILSLPFKILIFVYLGGYTKIVDIMFKT 250
G I++ F+ IDL+I+++LM LGMMMVPP ++LPFK+++FV + G+ +V + ++
Sbjct: 184 QIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLAQS 242


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits            Score    E-value    Comments
BAS1590     TYPE3IMQPROT    42       1e-08      Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein

family signature.
Length = 86

Score = 41.7 bits (98), Expect = 1e-08
Identities = 15/81 (18%), Positives = 35/81 (43%)

Query: 4 SPIIDIFQTFFYKGVMILMPVAGVSMIVVIIIAVIMAMMQIQEQTLTFLPKMASIVLVII 63
++ Y +++ V+ I+ +++ + + Q+QEQTL F K+ + L +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 ILGPWMFQELTTLILDLFDKI 84
+L W + L + +
Sbjct: 62 LLSGWYGEVLLSYGRQVIFLA 82


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits            Score    E-value    Comments
BAS1591     TYPE3IMRPROT    96       7e-26      Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein

family signature.
Length = 261

Score = 96.0 bits (239), Expect = 7e-26
Identities = 51/233 (21%), Positives = 113/233 (48%), Gaps = 1/233 (0%)

Query: 10 FFAFCRITSFLYFLPFFSGRSIPAMAKVTFGLALSITVADQVDVSHIKTVWDVAA-YAGT 68
F+ R+ + + P S RS+P K+ + ++ +A + + + A A
Sbjct: 17 FWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDVPVFSFFALWLAVQ 76

Query: 69 QIVIGLSLSKIVEMLWNIPKMAGHILDFDIGLSQASLFDVNAGSQSTLLSTIFDIFFLII 128
QI+IG++L ++ + + AG I+ +GLS A+ D + +L+ I D+ L++
Sbjct: 77 QILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDMLALLL 136

Query: 129 FISLGGINYFVATILKSFQYTEAISKLLTTSFLDSLLATLLFAITSAVEIALPLMGSLFI 188
F++ G + ++ ++ +F + L ++ +L + + +ALPL+ L
Sbjct: 137 FLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFLNGLMLALPLITLLLT 196

Query: 189 INFVLILIAKNAPQLNVFMNAYVIKITCGILFIAMSVPMLGYVFKNMTDVLLE 241
+N L L+ + APQL++F+ + + +T GI +A +P++ +++ +
Sbjct: 197 LNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCEHLFSEIFN 249


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits            Score    E-value    Comments
BAS1592     TYPE3IMSPROT    289      2e-98      Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein

family signature.
Length = 354

Score = 289 bits (742), Expect = 2e-98
Identities = 92/343 (26%), Positives = 186/343 (54%), Gaps = 2/343 (0%)

Query: 4 DNKTEKATPQKRKKSREEGNIARSKDLNNLFSILVLAVVVYFFGDWLGFEIANSVSVLFD 63
KTE+ TP+K + +R++G +A+SK++ + I+ L+ ++ D+ + + + +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAE 62

Query: 64 QIGKNTDS--TEYFYMMGILLLKVSAPILILVYAFHLFNYMIQVGFLFSSKVIKPKASRI 121
Q + + + + P+L + + ++++Q GFL S + IKP +I
Sbjct: 63 QSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIKKI 122

Query: 122 NPKNYFTRLFSRKSLVDILKSLFYMGLIGYVAYVLFKKNLEKIVSMIGFNWTASLTEIIR 181
NP R+FS KSLV+ LKS+ + L+ + +++ K NL ++ + + +
Sbjct: 123 NPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLGQ 182

Query: 182 QIKFIFLAILIILIVLSIIDFIYQKWEYEQDIKMKKEEVKQEHKDNEGDPQVKGKRKNFM 241
++ + + + +V+SI D+ ++ ++Y +++KM K+E+K+E+K+ EG P++K KR+ F
Sbjct: 183 ILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQFH 242

Query: 242 HAILQGTIAKKMDGATFIVNNPTHISVVLRYNKHVDAAPIVVAKGEDELALYIRTLAREQ 301
I + + + ++ +V NPTHI++ + Y + P+V K D +R +A E+
Sbjct: 243 QEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEEE 302

Query: 302 EIPMVENRPLARSLYYQVEEDETIPEDLYVAVIEVMRYLIQTN 344
+P+++ PLAR+LY+ D IP + A EV+R+L + N
Sbjct: 303 GVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQN 345


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score    E-value    Comments
BAS1596     FLGHOOKAP1    28       0.033      Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 28.4 bits (63), Expect = 0.033
Identities = 11/47 (23%), Positives = 24/47 (51%)

Query: 203 NGVGTVKNYMLENSNVDMTKEMADLMTDQRMISASQRVMTSFDKIYE 249
N V + N S V++ +E +L Q+ A+ +V+ + + I++
Sbjct: 494 NVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 540


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1597   DPTHRIATOXIN  28     0.039    Diphtheria toxin signature.
		>DPTHRIATOXIN#Diphtheria toxin signature.

Length = 567

Score = 27.8 bits (61), Expect = 0.039
Identities = 26/113 (23%), Positives = 49/113 (43%), Gaps = 16/113 (14%)

Query: 63 EQGEIVHYIKDGAQVKLGPVKLEINWERRHNLMRHHSLLHLIGAVVYEKYGALCTGNQIY 122
E V YI + Q K V+LEIN+E R + +YE C GN++
Sbjct: 174 EGSSSVEYINNWEQAKALSVELEINFETRGKRGQD---------AMYEYMAQACAGNRVR 224

Query: 123 PDKA------RIDFNELQELSSVEVEGIVKEVNKLIEQNKEISTRYMSREEAE 169
+D++ +++ + ++E + KE + + E + +S E+A+
Sbjct: 225 RSVGSSLSCINLDWDVIRDKTKTKIESL-KEHGPIKNKMSESPNKTVSEEKAK 276


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS1601   HTHTETR  73     4e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 72.7 bits (178), Expect = 4e-18
Identities = 35/191 (18%), Positives = 70/191 (36%), Gaps = 33/191 (17%)

Query: 1 MAKPN----VVNKEKLLQAAKEIIAEHGMEKLTLKAVAESAQVTQGTVYYHFKTKDQLLL 56
MA+ ++ +L A + ++ G+ +L +A++A VT+G +Y+HFK K L
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 57 EVTEAFCKASWEQIGKDVQLEKALQSAESRCVKDSMYHHLFFQLVASGLQNDAMKDKIGG 116
E+ + S IG+ +A + V + H+ S + + + +
Sbjct: 61 EI----WELSESNIGELELEYQAKFPGDPLSVLREILIHVL----ESTVTEERRRLLMEI 112

Query: 117 LLHYENQQ--------------------LTRVLNKNI-GGTMTSQISTETWSVLCNALID 155
+ H + + L I + + + T +++ I
Sbjct: 113 IFHKCEFVGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYIS 172

Query: 156 GLELQALFNPS 166
GL LF P
Sbjct: 173 GLMENWLFAPQ 183


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS1607   TCRTETA  41     6e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 41.0 bits (96), Expect = 6e-06
Identities = 40/229 (17%), Positives = 86/229 (37%), Gaps = 11/229 (4%)

Query: 55 FATTLVCGSLPRMICGPIAGAVADRVSRRWLVIGTDLLSSLTMLIMFILATIFGPSLPFI 114
+ L +L + C P+ GA++DR RR +++ + +++ IM P L +
Sbjct: 45 YGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIM-----ATAPFLWVL 99

Query: 115 YISAALLSICASFYSVALTSSIPNLVDEGRIQKASALNQTAASLSNILGPIIGGVVFGFL 174
YI + I + +VA + I ++ D + + GP++GG++ G
Sbjct: 100 YIGRIVAGITGATGAVA-GAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLM-GGF 157

Query: 175 SIQSFFLLNSITFFLAVILQLFIVFDLYKKEVAESKEHFLTSIKEGFSYVKRQHEIYGLM 234
S + F + L + F++ + +K E + L + F + + + LM
Sbjct: 158 SPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREAL-NPLASFRWARGMTVVAALM 216

Query: 235 KIALWVNFFACGLTVALPYIIVHTLHLSSKQLGTVEGMLAVGMLMGAIT 283
+ + H + +G LA ++ ++
Sbjct: 217 AVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGI---SLAAFGILHSLA 262



Score = 33.6 bits (77), Expect = 0.001
Identities = 17/97 (17%), Positives = 35/97 (36%), Gaps = 2/97 (2%)

Query: 76 VADRVSRRWLVIGTDLLSSLTMLIMFILATIFGPSLPFIYISAALLSICASFYSVALTSS 135
+ V+ R +L + +IL ++ +L AL +
Sbjct: 266 ITGPVAARLGERRALMLGMIADGTGYILLAFATRG--WMAFPIMVLLASGGIGMPALQAM 323

Query: 136 IPNLVDEGRIQKASALNQTAASLSNILGPIIGGVVFG 172
+ VDE R + SL++I+GP++ ++
Sbjct: 324 LSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYA 360


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1608   CHANLCOLICIN  27     0.047    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 27.0 bits (59), Expect = 0.047
Identities = 33/151 (21%), Positives = 60/151 (39%), Gaps = 16/151 (10%)

Query: 40 ATMWFEKAEKEKSGNEAKSYKEMAEKMDHGATALKDGKYLEAKDIANEVLQMKKDDALET 99
AT + A+ +K+ E + + A + A A +D KDI NE L+
Sbjct: 53 ATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALTQRLKDIVNEALRHNASRTPSA 112

Query: 100 AVTSNAENM----------LQKAKDVEKKVNERVAK------RRKVEEEEGIDKLIKAVD 143
++A N L KA++ +K E K +R+ E E + + +
Sbjct: 113 TELAHANNAAMQAEDERLRLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETERQLK 172

Query: 144 SIDDVKEKEKKVSEALDKAEEAQAKIEAKKN 174
+ +++ +SE E AQ K+ A ++
Sbjct: 173 LAEAEEKRLAALSEEAKAVEIAQKKLSAAQS 203


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1609   VACCYTOTOXIN  30     0.016    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 29.6 bits (66), Expect = 0.016
Identities = 18/64 (28%), Positives = 25/64 (39%), Gaps = 13/64 (20%)

Query: 123 TLMVKT--APEIRTMLQNHEINLGVISAAPFDESLLKQTNVMPDTLVLAFSKEHHFSKKE 180
TL++ + A RTM+ N + KQ N TL S EH S +
Sbjct: 914 TLLIDSHDAGYARTMIDATSAN-----------EITKQLNTATTTLNNIASLEHKTSGLQ 962

Query: 181 NVSL 184
+SL
Sbjct: 963 TLSL 966


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS1610   PF05272  31     0.007    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.007
Identities = 14/41 (34%), Positives = 19/41 (46%), Gaps = 7/41 (17%)

Query: 32 TLLGPSGCGKTTLLRMIAGLEEPDKGEIYFGDTCMYSSTKK 72
L G G GK+TL+ + GL+ +F DT T K
Sbjct: 600 VLEGTGGIGKSTLINTLVGLD-------FFSDTHFDIGTGK 633


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
BAS1611   MALTOSEBP  29     0.045    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 28.5 bits (63), Expect = 0.045
Identities = 78/308 (25%), Positives = 116/308 (37%), Gaps = 69/308 (22%)

Query: 40 EKKIVVYSAGPKG---LAEKIQKDFEKKTGIKVEMFQGTTGKILARMEAEKKKPVVDV-- 94
E K+V++ G KG LAE + K FEK TGIKV + + E+K P V
Sbjct: 30 EGKLVIWINGDKGYNGLAE-VGKKFEKDTGIKVTVEHPD--------KLEEKFPQVAATG 80

Query: 95 ----VVLASLPAMEGLKKDGQTLAYKEAKQADKLRSEWSDDKGHYFG------YSASALG 144
++ + G + G K ++ D Y G + AL
Sbjct: 81 DGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTWDAVRYNGKLIAYPIAVEALS 140

Query: 145 IVYNTKNVKTAPEDWSDI--------TKGEWKGKVNLPDP--------ALSGSALDFVTG 188
++YN + P+ W +I KG+ NL +P A G A + G
Sbjct: 141 LIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEPYFTWPLIAADGGYAFKYENG 200

Query: 189 -YVKKN-------GKDGWDLFEQLKKNEVTVAGANQEALDPVVT-GAKDMVIAG------ 233
Y K+ K G L KN+ A + + G M I G
Sbjct: 201 KYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAEAAFNKGETAMTINGPWAWSN 260

Query: 234 -----VDY-MTYSAKAKGEPVDIVYPKSGTVISPRAAGIMKDSKNVEGAKEFID-YLLSD 286
V+Y +T KG+P S + +AGI S N E AKEF++ YLL+D
Sbjct: 261 IDTSKVNYGVTVLPTFKGQP-------SKPFVGVLSAGINAASPNKELAKEFLENYLLTD 313

Query: 287 DVQKQISK 294
+ + ++K
Sbjct: 314 EGLEAVNK 321


74  BAS1663  BAS1670  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS1663   -1      12       0.446838   hypothetical protein
BAS1664   -1      12       0.823971   cardiolipin synthetase domain-containing
BAS1665   -2      12       1.381114   uridylate kinase
BAS1666   -2      10       0.978459   proton/sodium-glutamate symport protein
BAS1667   -2      9        0.829248   aspartate ammonia-lyase
BAS1668   -1      10       0.461930   malate dehydrogenase
BAS1669   0       12       -0.772672  sensor histidine kinase
BAS1670   2       12       0.244197   response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1663   ABC2TRNSPORT  45     1e-07    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.
Length = 262

Score = 45.3 bits (107), Expect = 1e-07
Identities = 28/106 (26%), Positives = 47/106 (44%)

Query: 262 IVMIGVLMLFALIAIGISLVLVAFSKNSASANTMQNLVIVPTCLLAGCYFPYDIMPKAVQ 321
+ + V+ L L + +V+ A + + Q LVI P L+G FP D +P Q
Sbjct: 148 LYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQ 207

Query: 322 KVADFLPQRWLLDTIAKLQQGIPFSELYVNILILFAFAVAFFLIAI 367
A FLP +D I + G P ++ ++ L + V F ++
Sbjct: 208 TAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLST 253


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1665   CARBMTKINASE  29     0.024    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 28.6 bits (64), Expect = 0.024
Identities = 15/60 (25%), Positives = 24/60 (40%), Gaps = 14/60 (23%)

Query: 122 LDNGYIVIFGGGNGQPFVTT-------------DYPSVQRAIEMNSDAILVAKQGVDGVF 168
++ G IVI GG G P + D + A E+N+D ++ V+G
Sbjct: 183 VERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILT-DVNGAA 241


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS1669   PF06580  38     8e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 37.5 bits (87), Expect = 8e-05
Identities = 26/130 (20%), Positives = 46/130 (35%), Gaps = 20/130 (15%)

Query: 297 LGKDIRFSKHIEGEHAAYHV--YTVLSIFNNLVANAVEAIEDRGLIHIKLYKREQHVIFE 354
++F I V V ++ N + + + + G I +K K V E
Sbjct: 236 FEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLE 295

Query: 355 VIDDGPGIAQKYKKLVFKPGFTSKYDQTGTPSTGIGLSYIDEMVTEL-GGEVRLEDNENG 413
V + G + K+ STG GL + E + L G E +++ +E
Sbjct: 296 VENTGSLALKNTKE-----------------STGTGLQNVRERLQMLYGTEAQIKLSEKQ 338

Query: 414 NGCKFIVCLP 423
+V +P
Sbjct: 339 GKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS1670   HTHFIS  54     3e-10    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 54.1 bits (130), Expect = 3e-10
Identities = 29/177 (16%), Positives = 71/177 (40%), Gaps = 10/177 (5%)

Query: 5 IVDDDEVFRSMLSQIIEDGDLGEVIGESEDGAFIEAEQLNYKKVDILFIDLLMPMRDGIE 64
+ DDD R++L+Q + I + + + D++ D++MP + +
Sbjct: 8 VADDDAAIRTVLNQALSRAGYDVRITSNAATLW---RWIAAGDGDLVVTDVVMPDENAFD 64

Query: 65 TVRHIASSFTG-KIIMISQVESKQLIGEAYTLGVEYYITKPLNKIEVVSVVRKVIERIRL 123
+ I + ++++S + +A G Y+ KP + E++ ++ + + +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 124 ERSIYDIQKSLNNVFQWEKPQMRNETVQEGKKISDSGRFLLSELGIAGENGS-KDLL 179
S + M+ E + ++ + L+ I GE+G+ K+L+
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAMQ-EIYRVLARLMQTDLTLM----ITGESGTGKELV 176


75  BAS1832  BAS1839  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS1832   -1      16       -2.914747  DNA-binding response regulator
BAS1833   -1      14       -2.510826  sensor histidine kinase
BAS1834   -2      14       -1.913489  polysaccharide deacetylase
BAS1835   0       15       -1.862466  lipoprotein
BAS1836   1       15       -1.635432  sapB protein
BAS1837   1       14       -1.808603  hypothetical protein
BAS1838   0       14       -1.680639  siderophore biosynthesis protein
BAS1839   0       16       -1.526233  siderophore biosynthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS1832   HTHFIS  93     3e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 93.4 bits (232), Expect = 3e-24
Identities = 29/123 (23%), Positives = 58/123 (47%), Gaps = 3/123 (2%)

Query: 2 PTILVLEDEMPIRSFIVLNLKRAGFYVLEASTGEEALQILCEHTVDVALLDVMLPGMDGF 61
TILV +D+ IR+ + L RAG+ V S + + D+ + DV++P + F
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 QVCKAIREENKKIGIIMLTARVQNEDKVQGLGIGADDYIAKPFSP---VELTARIQSLLR 118
+ I++ + +++++A+ ++ GA DY+ KPF + + R + +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 119 RIE 121
R
Sbjct: 124 RRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS1833   PF06580  40     1e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.8 bits (93), Expect = 1e-05
Identities = 22/101 (21%), Positives = 44/101 (43%), Gaps = 22/101 (21%)

Query: 359 IVQNAIKY----SHENGKVYIEATKNEGQAVIKVKDDGIGIAKEHLPYIEQSFYQINNHA 414
+V+N IK+ + GK+ ++ TK+ G ++V++ G K N
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALK--------------NTK 308

Query: 415 TGAGLGLAIVKKMVELHGG---TINIISKEGIGTTILIKLP 452
G GL V++ +++ G I + K+G ++ +P
Sbjct: 309 ESTGTGLQNVRERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS1837   TYPE3OMBPROT  27     0.050    Type III secretion system outer membrane B protein ...
		>TYPE3OMBPROT#Type III secretion system outer membrane B protein family signature.
Length = 538

Score = 27.0 bits (59), Expect = 0.050
Identities = 18/85 (21%), Positives = 35/85 (41%), Gaps = 11/85 (12%)

Query: 80 GVERTGTYNCEGELYAIYLLQEYQGMKIG-------QKLFQAFLSDCKNNDMQSLLVWVV 132
G +RTG + E + I + Q ++ ++LF L + N ++Q +
Sbjct: 442 GKDRTGMQDAEIKREIIRKHETGQFSQLNSKLSSEEKRLFSTILMNSGNMEIQEM----N 497

Query: 133 TNNPSKKFYEKFNPEKMDTKFLERV 157
T P K +K ++ + ER+
Sbjct: 498 TGVPGNKVMKKLPLSSLELSYSERI 522


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS1838   PF04183  287    2e-91    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 287 bits (736), Expect = 2e-91
Identities = 101/543 (18%), Positives = 192/543 (35%), Gaps = 55/543 (10%)

Query: 82 QFYYQMGDSNSVMKADYVTVITFLIKEMSINYG-EGTNPAELMLRVIRSCQNIEEFTKER 140
+ + D+ ++ AD + L+ ++ AE M + + + K R
Sbjct: 54 IWGWLWIDAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKAR 113

Query: 141 KEDTSALYGFHTSFIEAEQSLLFGHLTHPTPKSRQGILEWKSAMYSPELKGECQLHYFRA 200
+ +++ + Q LL GH K R+G + Y+PE +LH+
Sbjct: 114 RGLSASDL--INLNADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAV 171

Query: 201 HKSIVNEKSLLLDSTTVILKEELRNDEM-VSKEFISKYCNEDEYSLLPIHPLQAEWLLHQ 259
+ + + +L + E + + + + LP+HP Q + +
Sbjct: 172 KREHMIWRCDNEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIAT 231

Query: 260 PYVQDWIEQGVLEYIGPTGKCYMATSSLRTLYHPDAKYMLKFSFPVKV--TNSMRINKLK 317
++ D +G + +G G ++A SLRTL + + L P+ + T+ R +
Sbjct: 232 DFIAD-FAEGRMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGR 290

Query: 318 ELESGLEGKAMLNTAI-GEVLEKFPGFDFICDPAFITL-----------NYGTQESGFEV 365
+ +G L + G + +PA + Y QE V
Sbjct: 291 YIAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEM-LGV 349

Query: 366 IIRENPFYSEHADDATLIAGLVQDAIPGERTRLSNIIHRLADLESRSCEEVSLEWFRRYM 425
I RENP D++ ++ + + + I R E W +
Sbjct: 350 IWRENPCRWLKPDESPVLMATLMECDENNQPLAGAYIDRS----GLDAET----WLTQLF 401

Query: 426 NISLKPMVWMYLQYGVALEAHQQNSVVQLKDGYPVKYYFRDNQG-FYFCNSMKEMLNNEL 484
+ + P+ + +YGVAL AH QN + +K+G P + +D QG +++
Sbjct: 402 RVVVVPLYHLLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLP 461

Query: 485 AGIGERTGNLYDDYIVDERFRYYL--IFNHMFGLINGFGTAGLIREEILLTELRTVLES- 541
+ + T L DY++ + + + + L+ G + E L VL
Sbjct: 462 QEVRDVTSRLSADYLIHDLQTGHFVTVLRFISPLMVRLG----VPERRFYQLLAAVLSDY 517

Query: 542 ----------FLPYNREPSTFLRELLEEDKLACKANLLTRFFDVDELSNPLEQAIYVQVQ 591
F ++ +R +L KL + D+D S L +Q
Sbjct: 518 MKKHPQMSERFALFSLFRPQIIRVVLNPVKLT--------WPDLDGGSRMLPN-YLEDLQ 568

Query: 592 NPL 594
NPL
Sbjct: 569 NPL 571


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS1839   PF04183  584    0.0      IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 584 bits (1507), Expect = 0.0
Identities = 146/602 (24%), Positives = 267/602 (44%), Gaps = 45/602 (7%)

Query: 12 IESEDYISVRRRVLRQLVESLIYEGIITPARIEKEEQILFLIQGLDEDNKSVTYECYGRE 71
+ +D+ V RR++ +++ L YE + + +++ + G + + E
Sbjct: 1 MNHKDWDLVNRRLVAKMLSELEYEQVFHA-ESQGDDRYCINLPG-------AQWR-FIAE 51

Query: 72 RITFGRISIDSLIVRVQDGKQEIQSVAQFLEEVFRVVNVEQTKLDSFIHELEQTIFKDTI 131
R +G + ID+ +R D Q L ++ +V+++ + + +L T+ D
Sbjct: 52 RGIWGWLWIDAQTLRCADEPVLAQ---TLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQ 108

Query: 132 AQYER--CNKLKYTQKSYDELENHLIDGHPYHPSYKARIGFQYRDNFRYGYEFMRPIKLI 189
R + + D L+ L+ GHP K R G+ RY E+ +L
Sbjct: 109 LLKARRGLSASDLINLNADRLQ-CLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLH 167

Query: 190 WIAAHKKNATVGYENEVIYDKILKSEVGERKLEAYKERIHSMGCDPKQYLFIPVHPWQWE 249
W+A +++ +NE+ ++L + + ++ + + G D +L +PVHPWQW+
Sbjct: 168 WLAVKREHMIWRCDNEMDIHQLLTAAMDPQEFARFSQVWQENGLD-HNWLPLPVHPWQWQ 226

Query: 250 NFIISNYAEDIQDKGIIYLGESADDYCAQQSMRTLRNVTNPKRPYVKVSLNILNTSTLRT 309
I +++ D + ++ LGE D + AQQS+RTL N + +K+ L I NTS R
Sbjct: 227 QKIATDFIADFAEGRMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRG 286

Query: 310 LKPYSVASAPAISNWLSNVVSQDSYLRDESRVILLKEFSSVM----YDTNKKATYG---S 362
+ +A+ P S WL V + D+ L VIL + + + Y +A Y
Sbjct: 287 IPGRYIAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEM 346

Query: 363 LGCIWRESVHHYLGEQEDAVPFNGLYAKEKDGTPIIDAWLNKYGI--ENWLRLLIQKAII 420
LG IWRE+ +L E V L +++ P+ A++++ G+ E WL L + ++
Sbjct: 347 LGVIWRENPCRWLKPDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVV 406

Query: 421 PVIHLVVEHGIALESHGQNMILVHKEGLPVRIALKDFHEGLEFYRPFLKEMNKCPDFTKM 480
P+ HL+ +G+AL +HGQN+ L KEG+P R+ LKDF + + EM+ P +
Sbjct: 407 PLYHLLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRD 466

Query: 481 HKTYANGKMNDFFEMDRIECLQEMVLDALFLFNVGELAFVLADKYEWKEESFWMIVVEEI 540
+ + D+ D L V L + E F+ ++ +
Sbjct: 467 VTSRLS---ADYLIHD---------LQTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVL 514

Query: 541 ENHFRKYPHLKDRFESIQLYTPTFYAEQLTKRRL-YIDVESLVHEVP-------NPLYRA 592
++ +K+P + +RF L+ P L +L + D++ +P NPL+
Sbjct: 515 SDYMKKHPQMSERFALFSLFRPQIIRVVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWLV 574

Query: 593 RQ 594
Q
Sbjct: 575 TQ 576


76  BAS2069  BAS2076  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2069   0       18       -1.964920  acetyltransferase
BAS2072   0       18       -2.240512  acetyltransferase
BAS2073   -1      15       -2.924100  ABC transporter ATP-binding protein
BAS2076   -1      12       -2.171049  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2069   SACTRNSFRASE  52     2e-11    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 52.3 bits (125), Expect = 2e-11
Identities = 24/92 (26%), Positives = 44/92 (47%), Gaps = 5/92 (5%)

Query: 42 MERKESVIFVAVEDGEYIGFTQLYPSFSSISMKELWILNDLFVQAAKRGAGTGKKLLEAA 101
+E + F+ + IG ++ +++ ++ D+ V R G G LL A
Sbjct: 60 VEEEGKAAFLYYLENNCIGRIKIRSNWN-----GYALIEDIAVAKDYRKKGVGTALLHKA 114

Query: 102 KEFALENGAKGVKLQTEIDNLSAQRLYAENGY 133
E+A EN G+ L+T+ N+SA YA++ +
Sbjct: 115 IEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2072   SACTRNSFRASE  50     1e-10    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 50.3 bits (120), Expect = 1e-10
Identities = 27/109 (24%), Positives = 46/109 (42%), Gaps = 6/109 (5%)

Query: 29 SYEDMNNRLQFVQMSPFDFLYVYEEEKTIFGLLGFRIRENLEDITRYGEISIISVDSTIR 88
YED + + +V+ ++Y E G + R N Y I I+V R
Sbjct: 49 QYEDDDMDVSYVEEEG-KAAFLYYLENNCIGRIKIRSNWN-----GYALIEDIAVAKDYR 102

Query: 89 RKGIGHILMDYAEQLAKKHNCIGTWLVSGTKRVEAHPFYKKLGYEVNGY 137
+KG+G L+ A + AK+++ G L + + A FY K + +
Sbjct: 103 KKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAV 151


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS2073   PF05272  32     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.0 bits (72), Expect = 0.002
Identities = 24/92 (26%), Positives = 37/92 (40%), Gaps = 19/92 (20%)

Query: 33 LIGANGAGKSTTIKTMLGLLVNVNGEISFGAKKNPYAYVPEHPTYYDYLTLWEHIELLMA 92
L G G GKST I T++GL + G K+ Y + + EL
Sbjct: 601 LEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQI---------AGIV-AYEL--- 647

Query: 93 ARGNEVGSWERKAEELLHLF---RMDKYKHEY 121
+E+ ++ R E + F R D+Y+ Y
Sbjct: 648 ---SEMTAFRRADAEAVKAFFSSRKDRYRGAY 676


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2076   TRNSINTIMINR  29     0.019    Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 28.5 bits (63), Expect = 0.019
Identities = 23/91 (25%), Positives = 41/91 (45%), Gaps = 9/91 (9%)

Query: 35 DKPTSTAGQQNLESTSYTYEETNDRLTTDTFITYAMQEAEKQSMQKFGTKIGPVIEDEFK 94
D PT+T Q + + T D+LT + F E +K ++ G I E K
Sbjct: 264 DDPTTTDPDQ---AANAAESATKDQLTQEAFKN---PENQKVNIDANGNAIP---SGELK 314

Query: 95 DVILPKIEEAIAELANDVPEESLQSLAISQK 125
D I+ +I + E +++++S A +Q+
Sbjct: 315 DDIVEQIAQQAKEAGEVARQQAVESNAQAQQ 345


77  BAS2199  BAS2216  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2199   0       9        -0.517122  exonuclease
BAS2201   0       12       0.469824   arsR family transcriptional regulator
BAS2202   0       12       0.758831   hypothetical protein
BAS2203   -2      10       1.102839   oxalate/formate antiporter
BAS2204   -2      10       1.143688   2,3-dihydroxybenzoate-2,3-dehydrogenase
BAS2205   -3      10       1.063527   isochorismate synthase DhbC
BAS2206   -3      12       0.908219   2,3-dihydroxybenzoate-AMP ligase
BAS2207   -2      12       0.421906   isochorismatase
BAS2208   -3      11       0.514397   nonribosomal peptide synthetase DhbF
BAS2209   -1      16       -1.942386  balhimycin biosynthetic protein MbtH
BAS2210   -2      16       -2.069624  EmrB/QacA family drug resistance transporter
BAS2211   0       16       -2.051454  4'-phosphopantetheinyl transferase
BAS2212   -1      15       -1.139913  hypothetical protein
BAS2213   -1      16       -1.066293  DNA-binding protein HU
BAS2214   -1      13       -1.442888  hypothetical protein
BAS2215   1       15       -0.966930  DinB family DNA polymerase
BAS2216   0       15       -0.830684  alkaline serine protease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
BAS2199   GPOSANCHOR  37     3e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 37.4 bits (86), Expect = 3e-04
Identities = 50/335 (14%), Positives = 102/335 (30%), Gaps = 35/335 (10%)

Query: 327 EQWHEEAMQNEQKAESLLKQIIAKKENIMNNFELAQEKYEVVKNKESERENVKKLVQRLE 386
E+ E A + E + +L + N + E E + N + + K +
Sbjct: 53 EKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKA 112

Query: 387 -ELQPIIASLAEKQLNLQNAEIQIGKLKESMQNLDRQLEEHTNQKQLMTGELQQLEQALE 445
++Q + A A+ + L+ A ++ L+ + +K + L+
Sbjct: 113 SKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFST 172

Query: 446 QYVDKVEELTNMREDAKVLKQAYDVWQEKQKFEKEKEAAYSKMQLAVNAYENMERRWLSE 505
K++ L EK E + ++ A+N + +
Sbjct: 173 ADSAKIKTLEA----------------EKAALEARQAELEKALEGAMNFSTADSAKIKTL 216

Query: 506 QAGILALHLHDGESCPVCGSTTHPKKATEQSGAIDENELNGLRDKKNIAEKLHVQLEEKW 565
+A AL K E++ N K E LE +
Sbjct: 217 EAEKAAL--------------AARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQ 262

Query: 566 NFYHHQYEQVIEEVKKRGYQSEELVETYSALVQKGKQLATEVNTLKAS-EETRKQIAVK- 623
E + + + L +AL + L + L A+ + R+ +
Sbjct: 263 AELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASR 322

Query: 624 --IKSVEEKVDALQKQKREVETEQHRIEMDCMQLR 656
K +E + L++Q + E + + D R
Sbjct: 323 EAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASR 357


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS2203   TCRTETA  46     1e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 46.4 bits (110), Expect = 1e-07
Identities = 40/195 (20%), Positives = 82/195 (42%), Gaps = 8/195 (4%)

Query: 206 MLGTKQVYLLFIMLFTSCMSGLYLIGMVKDIGVELVGLSAATAANAVAMVAIFNTLGRI- 264
M + + ++ + + G+ LI V + + S A+ ++A++ +
Sbjct: 1 MKPNRPLIVILSTVALDAV-GIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFAC 59

Query: 265 --ILGPLSDKIGRLKIVTGTFVVMASSVLVLSFVDLNYGIYFVCVASVAFCFGGNITIFP 322
+LG LSD+ GR ++ + A +++ + +Y + VA G +
Sbjct: 60 APVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRI--VAGITGATGAVAG 117

Query: 323 AIVGDFFGMKNHSKNYGIVYQGFGFGALAGSFIGALLGGFKP--TFMVIGLLCVVSFIIA 380
A + D ++++G + FGFG +AG +G L+GGF P F L ++F+
Sbjct: 118 AYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTG 177

Query: 381 MLIQAPNQKKEQEEE 395
+ + K E+
Sbjct: 178 CFLLPESHKGERRPL 192



Score = 36.0 bits (83), Expect = 2e-04
Identities = 24/146 (16%), Positives = 59/146 (40%), Gaps = 13/146 (8%)

Query: 8 PWLVVLGTVIVQMGLGTIYTWSLFNQPLVSKYGWSLNAVAITFSITSLSLA-FSTLFASK 66
L+ + ++ +G W +F + ++ W + I+ + + + +
Sbjct: 213 AALMAVFFIMQLVGQVPAALWVIFGE---DRFHWDATTIGISLAAFGILHSLAQAMITGP 269

Query: 67 LQEKWGLRKLIMIAGLALGLGLILSSQASS----LILLYVLAGVVVGYADGTAYITSLSN 122
+ + G R+ +M+ +A G G IL + A+ ++ +LA +G A ++ +
Sbjct: 270 VAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVD 329

Query: 123 LIKWFPERKGLIAGISVSAYGSGSLI 148
ER+G + G + S++
Sbjct: 330 -----EERQGQLQGSLAALTSLTSIV 350



Score = 32.1 bits (73), Expect = 0.004
Identities = 52/317 (16%), Positives = 108/317 (34%), Gaps = 38/317 (11%)

Query: 63 FASKLQEKWGLRKLIMIAGLALGLGLILSSQASSLILLYVLAGVVVGYADGT-------- 114
L +++G R +++++ + + + A L +LY + +V G T
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLY-IGRIVAGITGATGAVAGAYI 120

Query: 115 AYITSLSNLIKWFPERKGLIAGISVSAYGSGSLIFKYVNAQLIESVGVSQAFIYWGLIVT 174
A IT + F G + +G G ++ V L+ F +
Sbjct: 121 ADITDGDERARHF--------GFMSACFGFG-MVAGPVLGGLMGGFSPHAPFFAAAALNG 171

Query: 175 AMIVLGACLI---HQAADQSAVQETKTHEYTTKEMLGTKQVYLLFIMLFTSCMSGLYLIG 231
+ G L+ H+ + +E + + G V L + F + G
Sbjct: 172 LNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAA 231

Query: 232 MVKDIGVELVGLSAATAANAVAMVAIFNTLGRIIL-GPLSDKIGRLKIVTGTFVVMASSV 290
+ G + A T ++A I ++L + ++ GP++ ++G + + + +
Sbjct: 232 LWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGY 291

Query: 291 LVLSFVDLNYGIYFVCVASVAFCFGGNITIFPAIVGDFFGMKNHSKNYGIVYQGFGFGAL 350
++L+F + + PA+ S+ QG G+L
Sbjct: 292 ILLAFATRGW-----MAFPIMVLLASGGIGMPALQAML------SRQVDEERQGQLQGSL 340

Query: 351 A-----GSFIGALLGGF 362
A S +G LL
Sbjct: 341 AALTSLTSIVGPLLFTA 357


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2204   DHBDHDRGNASE  322    e-114    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.
Length = 261

Score = 322 bits (827), Expect = e-114
Identities = 163/261 (62%), Positives = 196/261 (75%), Gaps = 3/261 (1%)

Query: 1 MNVGEFDGKTVLVTGAAQGIGSVVAKMFLERGATVIAVDQNGEGLNVLLNQNETRMKI-- 58
MN +GK +TGAAQGIG VA+ +GA + AVD N E L +++ + +
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 59 -FHLDVSDSNAVEDTVKRIENDIAPIDILVNVAGVLRMGAIHSLSDEDWNKTFSVNSTGV 117
F DV DS A+++ RIE ++ PIDILVNVAGVLR G IHSLSDE+W TFSVNSTGV
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 118 FYMSRAVSKHMMQRKSGAIVTVGSNAANTPRVEMAAYAASKAATTMFMKCLGLELAAYNI 177
F SR+VSK+MM R+SG+IVTVGSN A PR MAAYA+SKAA MF KCLGLELA YNI
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 178 RCNLVSPGSTETEMQRLLWADENGAKNIIAGSQNTYRLGIPLQKIAQPSEITEAVLFLAS 237
RCN+VSPGSTET+MQ LWADENGA+ +I GS T++ GIPL+K+A+PS+I +AVLFL S
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVS 240

Query: 238 DKASHITMHNLCVDGGATLGV 258
+A HITMHNLCVDGGATLGV
Sbjct: 241 GQAGHITMHNLCVDGGATLGV 261


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2207   ISCHRISMTASE  389    e-139    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 389 bits (1000), Expect = e-139
Identities = 176/306 (57%), Positives = 232/306 (75%), Gaps = 11/306 (3%)

Query: 1 MAIPSISVYKMPIESELPKNKVNWTPDPKRAVLLIHDMQEYFLDAYSDKESPKVELISNI 60
MAIP+I Y+MP S++P+NKV+W PDP RAVLLIHDMQ YF+DA++ SP EL +NI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 KVIREKCKELGIPVVYTAQPGGQTLEQRGLLQDFWGDGIPAGPDKKKIVDELTPDEDDIF 120
+ ++ +C +LGIPVVYTAQPG Q + R LL DFWG G+ +GP ++KI+ EL P++DD+
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LTKWRYSAFKKTNLLEILNEQGRDQLIICGIYAHIGCLLTACEAFMDGIQPFFVADAVAD 180
LTKWRYSAFK+TNLLE++ ++GRDQLII GIYAHIGCL+TACEAFM+ I+ FFV DAVAD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSLEHHKQALEYASNRCAVTTSTNSLLTELQGLKDD-----------DEITLQKVHELVA 229
FSLE H+ ALEYA+ RCA T T+SLL +LQ D + T + + + +A
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 230 QLLREPVESVGTDEDLLNRGLDSVRIMSLVEKWRREGKEITFADLAENPTVVDWYRLLSP 289
+LL+E E + EDLL+RGLDSVRIM+LVE+WRREG E+TF +LAE PT+ +W +LL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLTT 300

Query: 290 QTEHVL 295
+++ VL
Sbjct: 301 RSQQVL 306


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS2210   TCRTETB  121    5e-32    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 121 bits (304), Expect = 5e-32
Identities = 90/398 (22%), Positives = 172/398 (43%), Gaps = 14/398 (3%)

Query: 20 FMAAMDATIVNVALQTISKELQVPPSAMGTVNVGYLVSLAVFLPISGWLGDRFGTKRIFL 79
F + ++ ++NV+L I+ + PP++ VN ++++ ++ + G L D+ G KR+ L
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 TALFVFTTASALCGIANDITSLNIF-RIIQGAGGGLLTPVGMAMLFRTFSPEERPKISRF 138
+ + S + + + SL I R IQGAG + M ++ R E R K
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 IVLPIAVAPAVGPIIGGFFVDQMSWRWAFYINLPFGIMALLFGLLFLKEHIEKSAGRFDS 198
I +A+ VGP IGG + W++ + +P + + L+ L + + G FD
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHYIH--WSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 199 LGFILSAPGFAMIIYALSQGPSRGWISTEIISTGIAGTVFITLFILVELKVKQPMLDLRL 258
G IL + G + ++ IS I + +F+ KV P +D L
Sbjct: 202 KGIILMSVGIVFFMLF---------TTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 259 LKEPVFRKMSLISLFSSAGLLGMLFVFPLMYQNVIGVSALESG-LTTFPEAIGLMISSQI 317
K F L + G + + P M ++V +S E G + FP + ++I I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 318 VPWSYKKLGARKVISIGLICTVIIFVLLSFVNHDTNPWQIRALLFGIGIFLGQSVGAVQF 377
+ G V++IG+ + F+ SF+ T+ + ++F +G L + +
Sbjct: 313 GGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLG-GLSFTKTVIST 371

Query: 378 SAFNNITPPSMGRATTIFNVQNRLGSAIGVAVLASILA 415
+++ G ++ N + L G+A++ +L+
Sbjct: 372 IVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2211   ENTSNTHTASED  39     1e-05    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 38.9 bits (90), Expect = 1e-05
Identities = 22/129 (17%), Positives = 48/129 (37%), Gaps = 23/129 (17%)

Query: 53 RARFIIGCVISRLVLGKILSMSPVQVPIDRMCPVCKLQHGRPQLPEGMPQLSVSHSGEWV 112
+A + G + + L + + + V D+ +P P+G+ S+SH
Sbjct: 47 KAEHLAGRIAAVHALRE-VGVRTVPGMGDK---------RQPLWPDGLFG-SISHCATTA 95

Query: 113 VVAFTKFAPVGVDVEQMNPNVDVMKMAEGVLTDIEKAQVMKLPNEQKIEGFLTYWTR--- 169
+ ++ +G+D+E++ ++A ++ E+ Q
Sbjct: 96 LAVISR-QRIGIDIEKIMSQHTATELAPSIIDSDER------QILQASLLPFPLALTLAF 148

Query: 170 --KEAVLKA 176
KE+V KA
Sbjct: 149 SAKESVYKA 157


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2213   DNABINDINGHU  124    3e-41    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 124 bits (313), Expect = 3e-41
Identities = 57/89 (64%), Positives = 74/89 (83%)

Query: 2 NKTELIKNVAQSADISQKDASAAVQSVFDTIATALQSGDKVQLIGFGTFEVRERSARTGR 61
NK +LI VA++ ++++KD++AAV +VF +++ L G+KVQLIGFG FEVRER+AR GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGEEIQIAAGKVPAFKAGKELKEAVK 90
NPQTGEEI+I A KVPAFKAGK LK+AVK
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAVK 91


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
BAS2216   SUBTILISIN  264    2e-88    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 264 bits (677), Expect = 2e-88
Identities = 101/304 (33%), Positives = 150/304 (49%), Gaps = 19/304 (6%)

Query: 110 TPNDPYYKN-QYGLQKIQAPLAWDSQRSDSSVKVAIIDTGVQGSHPDLSSKVIYGHDYVD 168
+ G++ IQAP W+ R VKVA++DTG HPDL +++I G ++ D
Sbjct: 13 IKQEQQVNEIPRGVEMIQAPAVWNQTRG-RGVKVAVLDTGCDADHPDLKARIIGGRNFTD 71

Query: 169 NDN----VSDDGNGHGTHCAGITGALTNNSVGIAGVAPHTSIYAVRVLDNQGSGTLDAVA 224
+D + D NGHGTH AG A T N G+ GVAP + ++VL+ QGSG D +
Sbjct: 72 DDEGDPEIFKDYNGHGTHVAGTIAA-TENENGVVGVAPEADLLIIKVLNKQGSGQYDWII 130

Query: 225 QGIREAADSGAKVISLSLGAPNGGTALQQAVQYAWNKGSVIVAAAGNAGNTKAN-----Y 279
QGI A + +IS+SLG P L +AV+ A +++ AAGN G+ Y
Sbjct: 131 QGIYYAIEQKVDIISMSLGGPEDVPELHEAVKKAVASQILVMCAAGNEGDGDDRTDELGY 190

Query: 280 PAYYSEVIAVASTDQSDRKSSFSTYGSWVDVAAPGSNIYSTYKGSTYQSLSGTSMATPHV 339
P Y+EVI+V + + S FS + VD+ APG +I ST G Y + SGTSMATPHV
Sbjct: 191 PGCYNEVISVGAINFDRHASEFSNSNNEVDLVAPGEDILSTVPGGKYATFSGTSMATPHV 250

Query: 340 AGVAAL-------LANQGYSNTQIRQIIESTSDKISGTGTYWKNGRVNAYKAVQYAKQLQ 392
AG AL + + ++ + + + + NG + + ++
Sbjct: 251 AGALALIKQLANASFERDLTEPELYAQLIKRTIPLGNSPKMEGNGLLYLTAVEELSRIFD 310

Query: 393 ENKA 396
+
Sbjct: 311 TQRV 314


78  BAS2301  BAS2305  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2301   -2      12       -3.129293  DEAD/DEAH box helicase
BAS2302    0      15       -4.241133  hypothetical protein
BAS2303    0      15       -4.214850  hypothetical protein
BAS2304   -1      16       -4.515340  TetR family transcriptional regulator
BAS2305    0      15       -4.250937  ABC transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS2301   TONBPROTEIN  30     0.013    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 30.3 bits (68), Expect = 0.013
Identities = 20/113 (17%), Positives = 39/113 (34%), Gaps = 6/113 (5%)

Query: 338 AGGSGLAITFVAAKDEKH------LEEIEKTLGAPIQREIIEQPKIKRVDENGKPLPKPA 391
A +++T V D + E + + V E KP PKP
Sbjct: 40 APAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPK 99

Query: 392 PKKSGEYRQRDSREGSRSGSKGRTRNDSRNSSRNENNRSFNKPSNKKGSTKQG 444
PK + +++ R+ S+ + ++ +R ++ + S S G
Sbjct: 100 PKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATAATSKPVTSVASG 152


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS2302   BACTRLTOXIN  28     0.005    Bacterial toxin signature.
		>BACTRLTOXIN#Bacterial toxin signature.

Length = 266

Score = 27.6 bits (61), Expect = 0.005
Identities = 8/23 (34%), Positives = 13/23 (56%)

Query: 31 KINWYNDMKTSFANKELADLVKG 53
K+ Y+ +KT N++LA K
Sbjct: 84 KLKNYDKVKTELLNEDLAKKYKD 106


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS2304   HTHTETR  72     8e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 72.4 bits (177), Expect = 8e-18
Identities = 30/174 (17%), Positives = 72/174 (41%), Gaps = 13/174 (7%)

Query: 8 EERRKEILETAERLFLTKGYTKTTVNDILKEIGIAKGTFYHYFKSKEEVMDEIIMRIIKE 67
+E R+ IL+ A RLF +G + T++ +I K G+ +G Y +FK K ++ E I + +
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE-IWELSES 68

Query: 68 DVAKAKVIVSNPNIPVLEKLFRVLME---QSPKSGDIKDKMIE-QFHQPNNA---EMYQK 120
++ + ++ + R ++ +S + + + ++E FH+ + Q+
Sbjct: 69 NIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQ 128

Query: 121 SLVQSIIHLSPVLTEILEQGIEEGIFSTSY-PQETIELLLSSAQVIFDEGLFQW 173
+ + + + L+ IE + + ++ + W
Sbjct: 129 AQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGY----ISGLMENW 178


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS2305   TCRTETA  39     3e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.0 bits (91), Expect = 3e-05
Identities = 58/342 (16%), Positives = 125/342 (36%), Gaps = 36/342 (10%)

Query: 49 IFAGLYAITSIPFLLAPLGGAIADRFNRRNLMVIFDFINTAIVLSFIVLLFTGSVSILLI 108
I LYA+ F AP+ GA++DRF RR +++ A+ + ++ + +L I
Sbjct: 47 ILLALYALMQ--FACAPVLGALSDRFGRR-PVLLVSLAGAAV--DYAIMATAPFLWVLYI 101

Query: 109 GTIMFLLAIVNAMYAPVVMASIPQLVPEKKLEQANGIVNGVQALSNIVAPVLGGILYGII 168
G I +A + V A I + + + G ++ + PVLGG++ G
Sbjct: 102 GRI---VAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLM-GGF 157

Query: 169 GLKMLVIISCLAFFLSAILEMFITIPFIKRVQESHIIPTIVKDMKGGFIYVLKQPFILKS 228
+ L+ + F+ +P + + + + +
Sbjct: 158 SPHAPFFAAAALNGLNFLTGCFL-LPESHKGERRPLRREALNPLASFRWARGMTVVAA-L 215

Query: 229 MLLAALLNLILTPLFVVGAPIIIRVTMESSH-TLYGIGMGLIDFATIIGALSMVFFAKKL 287
M + ++ L+ V A + + + H IG+ L F I+ +L+ +
Sbjct: 216 MAVFFIMQLVGQ----VPAALWVIFGEDRFHWDATTIGISLAAFG-ILHSLAQAMITGPV 270

Query: 288 QMQTLYYWMILIALLVIPMALSVTPFILNLGY------YPPFILFILSSILIAMIMTVVS 341
+ L++ M T +IL +P +L I + + ++S
Sbjct: 271 AA-----RLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLS 325

Query: 342 IYVITVVQKKTPNENLGKVMAIITAVSQCMAPIGQVIYGFMF 383
V E G++ + A++ + +G +++ ++
Sbjct: 326 RQV--------DEERQGQLQGSLAALTSLTSIVGPLLFTAIY 359



Score = 29.0 bits (65), Expect = 0.033
Identities = 16/79 (20%), Positives = 34/79 (43%), Gaps = 3/79 (3%)

Query: 88 TAIVLSFIVLLFTGSVSILLIGTIMFLLAIVNAMYAPVVMASIPQLVPEKKLEQANGIVN 147
A +I+L F + ++ + P + A + + V E++ Q G +
Sbjct: 285 IADGTGYILLAFATRGWMAFPIMVLLASG---GIGMPALQAMLSRQVDEERQGQLQGSLA 341

Query: 148 GVQALSNIVAPVLGGILYG 166
+ +L++IV P+L +Y
Sbjct: 342 ALTSLTSIVGPLLFTAIYA 360


79  BAS2364  BAS2371  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2364   -2      17       -1.695095  ABC transporter ATP-binding protein
BAS2365   -2      14       -1.090172  ABC transporter permease
BAS2366   -2      14       -0.625507  TetR family transcriptional regulator
BAS2367   -2      14       -0.001147  hypothetical protein
BAS2368   -1      13        0.510494  hypothetical protein
BAS2369   -1      14        1.624748  acyl-CoA dehydrogenase
BAS2370    0      13        1.863480  acetyl-CoA carboxylase biotin carboxylase
BAS2371    0      13        1.718192  acetyl-CoA carboxylase biotin carboxyl carrier
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS2364   PF05272  32     0.003    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.003
Identities = 16/65 (24%), Positives = 27/65 (41%), Gaps = 10/65 (15%)

Query: 17 LVGPSGSGKTTLIKLIAGINEATEGEVLVYNTNMPNLNEMKRIGYMAQADALYE--ELSA 74
L G G GK+TLI + G++ ++ ++ K YE E++A
Sbjct: 601 LEGTGGIGKSTLINTLVGLDFFSDTHF--------DIGTGKDSYEQIAGIVAYELSEMTA 652

Query: 75 YENAD 79
+ AD
Sbjct: 653 FRRAD 657


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2365   ABC2TRNSPORT  49     9e-09    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein

signature.
Length = 262

Score = 48.8 bits (116), Expect = 9e-09
Identities = 37/163 (22%), Positives = 72/163 (44%), Gaps = 9/163 (5%)

Query: 166 SFVRERLSGALERLLSTPIKRWEIVVGYIIGFGIFAFIQSIIIVSFSVYILDLYVAGSIW 225
+F R E +L T ++ +IV+G + A + I + + Y
Sbjct: 90 AFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALG--YTQW--L 145

Query: 226 LTLLITCMLSLTAL---TLGTFLSAYANNEFQMIQFIPLVIVPQIFFSG-LFPIESMNKW 281
L +++LT L +LG ++A A + I + LVI P +F SG +FP++ +
Sbjct: 146 SLLYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIV 205

Query: 282 LQMLGKLFPLTYGADAMRQVMIRNQGFTEIALDLTVLLFFSVL 324
Q + PL++ D +R +M+ + ++ + L + V+
Sbjct: 206 FQTAARFLPLSHSIDLIRPIMLGHPV-VDVCQHVGALCIYIVI 247


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS2366   HTHTETR  85     2e-22    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 84.7 bits (209), Expect = 2e-22
Identities = 37/206 (17%), Positives = 82/206 (39%), Gaps = 12/206 (5%)

Query: 16 DKRNERQMRILEAAVDMFGEKGYASTSTSEIAKRAGVAEGTIFRYYKTKKDLLLAVVMPT 75
+ E + IL+ A+ +F ++G +STS EIAK AGV G I+ ++K K DL
Sbjct: 7 QEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE----- 61

Query: 76 LMKFAAPFFVQAFAKEIFKSEYESYEGLLRVVIHNRFDFA---KKHFPMIKILIQEVPFH 132
+ + + + E +LR ++ + + ++ +++I+ + F
Sbjct: 62 IWELSESNIGELEL-EYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 133 PELK--NEIQQLVETELLLHFKKLIEKFQEKGKIIEMPPATVLRLTLSAVFGLLLTRFLL 190
E+ + Q+ + E ++ ++ E + + + L+ +L
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 191 LPEEKWDDETEIENTIQFILYGLTPR 216
P+ D + E + + +L
Sbjct: 181 APQSF-DLKKEARDYVAILLEMYLLC 205


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2370   PHPHTRNFRASE  34     0.001    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase

signature.
Length = 572

Score = 34.4 bits (79), Expect = 0.001
Identities = 22/80 (27%), Positives = 33/80 (41%), Gaps = 6/80 (7%)

Query: 97 EEGIVFIGPSEEIITKMGSKIESRIAMQA--ADVPVVPGITTNIETAEEAIEIAKQIGYP 154
EGIV + P+EE + K + + A + P T + +E+A IG P
Sbjct: 224 IEGIVIVNPTEEEVKAYEEKRAAFEKQKQEWAKLVGEPSTTKD----GAHVELAANIGTP 279

Query: 155 LMLKASAGGGGIGMQLMETE 174
+ GG G+ L TE
Sbjct: 280 KDVDGVLANGGEGIGLYRTE 299


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
BAS2371   RTXTOXIND  32     1e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 31.7 bits (72), Expect = 1e-04
Identities = 13/30 (43%), Positives = 19/30 (63%)

Query: 42 IVSEEAGTVMKINVQEGDFVNEGDVLLEIE 71
I E V +I V+EG+ V +GDVLL++
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLT 128


80  BAS2683  BAS2692  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2683   -1      13       -1.531276  EmrB/QacA family drug resistance transporter
BAS2684   -1      12       -2.413866  hypothetical protein
BAS2685   -1      13       -2.939328  hypothetical protein
BAS2686   -2      14       -3.222596  hypothetical protein
BAS2687   -2      13       -2.492786  solute-binding family 5 protein
BAS2688    1      12       -2.305613  major facilitator family transporter protein
BAS2689    2      15       -2.949622  lipoprotein
BAS2690    1      15       -3.176690  hypothetical protein
BAS2691    0      16       -3.039083  hypothetical protein
BAS2692    1      17       -3.469785  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS2683   TCRTETB  145    2e-40    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 145 bits (368), Expect = 2e-40
Identities = 91/406 (22%), Positives = 166/406 (40%), Gaps = 16/406 (3%)

Query: 19 ILMASMDNTIVVTAMGTIVGDLGGLENFV-WVVSAYMVAEMAGMPIFGKLSDMYGRKRFF 77
+ ++ ++ ++ I D WV +A+M+ G ++GKLSD G KR
Sbjct: 23 SFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLL 82

Query: 78 IFGLIVFMVGSALCGTAENITQLGIY-RAIQGIGGGALVPIAFTIVFDIFPPEKRGKMGG 136
+FG+I+ GS + + L I R IQG G A + +V P E RGK G
Sbjct: 83 LFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFG 142

Query: 137 LFGAVFGLSSIFGPLLGAYITDYISWHWVFYINLPLGVLALIFITFFYKESRVHRKQKID 196
L G++ + GP +G I YI HW + + +P+ + + + V K D
Sbjct: 143 LIGSIVAMGEGVGPAIGGMIAHYI--HWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFD 200

Query: 197 WSGAITLVGAVICLMFALELGGQKYDWDSTFILSLFVGFAILIISFIFIERKVEEPIISF 256
G I + ++ M L Y S + + + F+ RKV +P +
Sbjct: 201 IKGIILMSVGIVFFM----LFTTSYSI------SFLIVSVLSFLIFVKHIRKVTDPFVDP 250

Query: 257 EMFKQRLFGMSTIIALCYGAAFMSATVYIPLFIQGVYGGSATNSG-LLLLPMMLGSVVTA 315
+ K F + + +P ++ V+ S G +++ P + ++
Sbjct: 251 GLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFG 310

Query: 316 QLGGFLTTKLSYRNIMIISAVIMLIGLFLLSALTPETSRALLTVYMIIIGFGVGFSFSVL 375
+GG L + ++ I + + FL ++ ET+ +T+ ++ + G+ F+ +V+
Sbjct: 311 YIGGILVDRRGPLYVLNIGVTFLSVS-FLTASFLLETTSWFMTIIIVFVLGGLSFTKTVI 369

Query: 376 SMAAIHNFGMEQRGSATSTSNFIRSLGMTLGITIFGMIQRTGFQDQ 421
S + ++ G+ S NF L GI I G + DQ
Sbjct: 370 STIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQ 415


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2686   PYOCINKILLER  31     0.004    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.9 bits (69), Expect = 0.004
Identities = 14/53 (26%), Positives = 20/53 (37%), Gaps = 5/53 (9%)

Query: 12 LELTGISYGQLYRWKRKNLIPEDWFVRKSTFTGQETFFPKEKILERINKIQTM 64
L+ + G KNL P D R T G +K+L KI ++
Sbjct: 97 LDKADAALGPA-----KNLAPLDVINRSLTIVGNALQQKNQKLLLNQKKITSL 144


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS2688   TCRTETA  80     1e-18    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 80.3 bits (198), Expect = 1e-18
Identities = 53/318 (16%), Positives = 113/318 (35%), Gaps = 9/318 (2%)

Query: 50 LIFGLQPFSDIVFTLIAGGITDKYGRKKIMLLGLLLQGVAIGSFVFAQSVFIFALLYVIN 109
++ L + G ++D++GR+ ++L+ L V A +++ + ++
Sbjct: 47 ILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVA 106

Query: 110 GIGRSLYIPAQRAQIADLIKQGQQAEIFALLQTMGAIGTVIGPLIGAVFYNTHPEYLFIM 169
GI + A IAD+ ++A F + G V GP++G + P F
Sbjct: 107 GITGAT-GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFA 165

Query: 170 QSITLMVYAVVVWTQLPETAPAITMPKQKLEVSSPKQF--VRNHSAVIGLMVSTLPISFF 227
+ + + LPE+ P ++ ++ F R + V LM +
Sbjct: 166 AAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLV 225

Query: 228 YAQTETNYRIFAEDVFPNFIFILAFISTCRAIMEIILQIFLV-KWSERFSMAKIIIISYT 286
+ IF ED F + I+ + Q + + R + +++
Sbjct: 226 GQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLG-- 283

Query: 287 CYIVAAIGYGFSATIVS--LFFTLLFLVIGESIALNHLLRFVSEIAPSDKRGLYFSIYGL 344
I GY A + F ++ L+ I + L +S +++G
Sbjct: 284 -MIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAA 342

Query: 345 HWDVSRTCGPVIGAILLS 362
++ GP++ + +
Sbjct: 343 LTSLTSIVGPLLFTAIYA 360



Score = 47.5 bits (113), Expect = 5e-08
Identities = 20/121 (16%), Positives = 53/121 (43%), Gaps = 1/121 (0%)

Query: 45 IMITMLIFGLQPFSDIVFTLIAGGITDKYGRKKIMLLGLLLQGVAIGSFVFAQSVFIFAL 104
I + + + +I G + + G ++ ++LG++ G FA ++
Sbjct: 246 TTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFP 305

Query: 105 LYVINGIGRSLYIPAQRAQIADLIKQGQQAEIFALLQTMGAIGTVIGPLIGAVFYNTHPE 164
+ V+ G + +PA +A ++ + + +Q ++ L + ++ +++GPL+ Y
Sbjct: 306 IMVLLASG-GIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASIT 364

Query: 165 Y 165

Sbjct: 365 T 365


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS2689   TYPE4SSCAGA  29     0.014    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 29.3 bits (65), Expect = 0.014
Identities = 35/130 (26%), Positives = 52/130 (40%), Gaps = 20/130 (15%)

Query: 20 LAACKGTDEKKETNP----TSENSKNEQNTSSEGK-----KEPEVKSNTDSNSKDIVINQ 70
L A KG+ + NP EN N GK K + KS+ +++ KD++INQ
Sbjct: 719 LKALKGSVKDLGINPEWISKVENLNAALNEFKNGKNKDFSKVTQAKSDLENSVKDVIINQ 778

Query: 71 KSINHVKNLFELAKEGKVPNVPFAAHTGDIEEIEKAWGKADKTEQAGNGMYATFTNKNVS 130
K + V NL + K TGD +E+A + A KN S
Sbjct: 779 KVTDKVDNLNQAVSVAKA--------TGDFSRVEQALADLKNFSKE---QLAQQAQKNES 827

Query: 131 FGFNKGSQVF 140
K S+++
Sbjct: 828 LNARKKSEIY 837


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2692   CHANLCOLICIN  35     9e-06    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 35.4 bits (81), Expect = 9e-06
Identities = 15/49 (30%), Positives = 24/49 (48%), Gaps = 3/49 (6%)

Query: 6 IVGGILGWLASLITGRDVPGGVIG-NIIAGIIGSWIGGKLLGSFGPVIG 53
V ++ L SL+ G G+ G I+ GI+ S+I L + V+G
Sbjct: 475 GVSYVVALLFSLLAG--TTLGIWGIAIVTGILCSYIDKNKLNTINEVLG 521


81  BAS2726  BAS2733  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2726   -2      19       -0.721645  acetyltransferase
BAS2727   -2      17       -0.917975  hypothetical protein
BAS2728   -1      19       -1.834138  hypothetical protein
BAS2729    0      14       -0.949824  acetyltransferase
BAS2730    1      14       -1.545475  acetyltransferase
BAS2731    1      14       -0.745976  hypothetical protein
BAS2732   -3      14       -1.232145  hypothetical protein
BAS2733   -3      20       -0.476137  lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2726   SACTRNSFRASE  36     1e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 36.5 bits (84), Expect = 1e-05
Identities = 25/78 (32%), Positives = 33/78 (42%), Gaps = 9/78 (11%)

Query: 46 YEEQACIGIEIIGAN---KAKIRHIAVIPQYRHKGIALQMI---KEVVRIHQLTYLEAET 99
Y E CIG I +N A I IAV YR KG+ ++ E + + L ET
Sbjct: 71 YLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLET 130

Query: 100 DD---EAVEFYKRIGFQV 114
D A FY + F +
Sbjct: 131 QDINISACHFYAKHHFII 148


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS2727   60KDINNERMP  28     0.039    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 28.4 bits (63), Expect = 0.039
Identities = 10/45 (22%), Positives = 23/45 (51%), Gaps = 4/45 (8%)

Query: 12 FFFAFTFVLNRAMDLEGGSWI-WSASLRY---YFMVPMLLLIVMY 52
F A ++L +++L + W L Y+++P+L+ + M+
Sbjct: 432 IFLALYYMLMGSVELRQAPFALWIHDLSAQDPYYILPILMGVTMF 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2729   SACTRNSFRASE  43     9e-08    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 43.0 bits (101), Expect = 9e-08
Identities = 19/90 (21%), Positives = 35/90 (38%), Gaps = 5/90 (5%)

Query: 57 FGAFNEDHQLVGVVTLLTEEKEAYKHKGHIVAMYVDASNQRSGLARELICKAIERAKEMN 116
F + ++ +G + + + + I + V ++ G+ L+ KAIE AKE +
Sbjct: 68 FLYY-LENNCIGRIKI----RSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENH 122

Query: 117 LEQLTLGVVSTNEPAKRLYESMGFKTYGIE 146
L L N A Y F ++
Sbjct: 123 FCGLMLETQDINISACHFYAKHHFIIGAVD 152


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS2733   IGASERPTASE  25     0.046    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 25.0 bits (54), Expect = 0.046
Identities = 17/61 (27%), Positives = 28/61 (45%), Gaps = 2/61 (3%)

Query: 28 KDEKEPDPTEEPSEQRQEEKNEKQD-PAKEQNNELNK-KDEQEPDPTEEPSEEQKKKKEN 85
++ E D TE ++ R+ K K + A Q NE+ + E + T E E +KE
Sbjct: 1051 VEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEE 1110

Query: 86 E 86
+
Sbjct: 1111 K 1111


82  BAS2752  BAS2759  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2752    0      16       -0.506683  isochorismatase family protein
BAS2753    1      17       -1.261446  acetyltransferase
BAS2754    1      16       -1.662267  hypothetical protein
BAS2755    2      16       -1.888531  hypothetical protein
BAS2756    2      16       -1.670621  hypothetical protein
BAS2757    0      14       -1.690394  hypothetical protein
BAS2758    0      13       -0.572932  RNA polymerase sigma factor
BAS2759   -1      14        0.081713  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2752   ISCHRISMTASE  53     8e-11    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 52.7 bits (126), Expect = 8e-11
Identities = 43/158 (27%), Positives = 71/158 (44%), Gaps = 19/158 (12%)

Query: 2 KKALLVIDVQ---AGMYTAGMPVHNGEKFLETLQELIGECRSNDIPVIYVQHNGPKDHPL 58
+ LL+ D+Q +TAG + +++L +C IPV+Y G + +P
Sbjct: 30 RAVLLIHDMQNYFVDAFTAGASPV--TELSANIRKLKNQCVQLGIPVVYTAQPGSQ-NPD 86

Query: 59 EKG--TDGW-----------KIHAAIAPLEGECVVEKTTPDSFHKTNLKEVLQDKGIDHV 105
++ TD W KI +AP + + V+ K +F +TNL E+++ +G D +
Sbjct: 87 DRALLTDFWGPGLNSGPYEEKIITELAPEDDDLVLTKWRYSAFKRTNLLEMMRKEGRDQL 146

Query: 106 IISGMQTQYCVDTTTRRACSEGYKITLVSDAHSTFDTE 143
II+G+ T A E K V DA + F E
Sbjct: 147 IITGIYAHIGCLVTACEAFMEDIKAFFVGDAVADFSLE 184


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2753   SACTRNSFRASE  36     4e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 35.7 bits (82), Expect = 4e-05
Identities = 19/89 (21%), Positives = 30/89 (33%), Gaps = 14/89 (15%)

Query: 78 VDSESKTLYGYEESQNVWG-------------MDQFIGEPTYWGKGIGTKFVKAAITYIL 124
V+ E K + Y N G ++ Y KG+GT + AI +
Sbjct: 60 VEEEGKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAK 119

Query: 125 SEMGAEAIAMDPKVNNERAIKCYEKCGFK 153
E + ++ + N A Y K F
Sbjct: 120 -ENHFCGLMLETQDINISACHFYAKHHFI 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS2756   IGASERPTASE  54     1e-09    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 53.5 bits (128), Expect = 1e-09
Identities = 45/216 (20%), Positives = 73/216 (33%), Gaps = 4/216 (1%)

Query: 146 PVEKKADEKTKQVAKVQKSVKAKEEAKTQKITKAKETIKPKEEVKVQEVVKPKEEVKVQE 205
V+ + SV + E + P + E V E K +
Sbjct: 991 TVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVA--ENSKQES 1048

Query: 206 VVKPKEEVKVQEVAKPKEEVKVQEVAKPKEEVKVQEVAKPKEEVKVQEVVKPKEEVKVQE 265
K E E EV + + K + EVA+ E K + + KE V++
Sbjct: 1049 KTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEK 1108

Query: 266 VAKAKEEA-KAQEIAKAKEEAKAQEIAKAKEEAKAQEIAKAKEEAKAQEIAKAKEEAKAR 324
KAK E K QE+ K + ++ +++ E A+ + + +++ A
Sbjct: 1109 EEKAKVETEKTQEVPKVTSQVSPKQ-EQSETVQPQAEPARENDPTVNIKEPQSQTNTTAD 1167

Query: 325 EALKAKEESKNNAQSAKRELTVVATAYTADPSENGT 360
AKE S N Q TV + EN T
Sbjct: 1168 TEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTT 1203



Score = 44.7 bits (105), Expect = 7e-07
Identities = 36/223 (16%), Positives = 79/223 (35%), Gaps = 11/223 (4%)

Query: 132 KTAYVNVSFLSSKAPVEKKADEKTKQVAKVQKSVKAKEEAKTQKITKAKETIKPKEEVKV 191
+T ++ +K ++ + + V + ++ + T+ E + E K
Sbjct: 1035 ETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKE 1094

Query: 192 QEVVKPKEEVKVQEVVKPKEEVKVQEVAKPKEEVKVQEVAKPKEEVKVQEVAKPKEEVKV 251
+ + KE V++ K K E K +E KV PK+E + + V E +
Sbjct: 1095 TQTTETKETATVEKEEKAKV-----ETEKTQEVPKVTSQVSPKQE-QSETVQPQAEPARE 1148

Query: 252 QEVVKPKEEVKVQEVAKAKEEAKAQEIAKAKEEA----KAQEIAKAKEEAKAQEIAKAKE 307
+ +E + Q A E A+E + E+ + E +
Sbjct: 1149 NDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQ 1208

Query: 308 EAKAQEIAKAKEEAKAREALKAKEESKNNAQSAKRELTVVATA 350
E + K + + R ++++ + A ++ + + VA
Sbjct: 1209 PTVNSESSN-KPKNRHRRSVRSVPHNVEPATTSSNDRSTVALC 1250



Score = 42.0 bits (98), Expect = 5e-06
Identities = 47/286 (16%), Positives = 88/286 (30%), Gaps = 30/286 (10%)

Query: 128 EYKGKTAYVNVSFLSSKAPVEKKADEKTKQVAKVQKSVK------AKEEAKTQKITKAKE 181
E ++ +A KA+ +T +VA+ K KE A +K KAK
Sbjct: 1055 EQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKV 1114

Query: 182 TIKPKEEV-KVQEVVKPK----EEVKVQEVVKPKEEVKVQEVAKPKEEVKVQEVAKPKEE 236
+ +EV KV V PK E V+ Q + + V + + +P +E
Sbjct: 1115 ETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKE 1174

Query: 237 VKVQEVAKPKEEVKVQEVVKPKEEVKVQEVAKAKEEAKAQEIAKAKEEAKAQEIAK--AK 294
V +P E E + E + + + +
Sbjct: 1175 TS-SNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHN 1233

Query: 295 EEAKAQEIAKAKEEAKAQEIAKAKEEAKAREALKAKEESKNNAQSAKRELTVVATA---- 350
E A + + KA+ + N ++ + ++ +
Sbjct: 1234 VEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLEMNNEGQ 1293

Query: 351 ---YTADPSENGTYGG---------RVLTAMGHDLTANPNMRIIAV 384
+ ++ S N Y T +G D T + N+++ V
Sbjct: 1294 YNVWVSNTSMNKNYSSSQYRRFSSKSTQTQLGWDQTISNNVQLGGV 1339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
BAS2759   RTXTOXINA  27     0.038    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family

signature.
Length = 1024

Score = 26.9 bits (59), Expect = 0.038
Identities = 8/25 (32%), Positives = 13/25 (52%)

Query: 17 ISSGTIKIHFTNFHDSVDYDRQLYI 41
+S+G+ I+ HD V YD+
Sbjct: 625 LSAGSANIYAGKGHDVVYYDKTDTG 649


83  BAS2819  BAS2828  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2819   -2      15       -0.582614  alkaline D-peptidase
BAS2820   -1      15       -1.490932  hypothetical protein
BAS2823    0      13       -0.379253  hypothetical protein
BAS2824    1      13       -0.481022  ATPase AAA
BAS2825    1      11       -0.650572  hypothetical protein
BAS2826    1      11       -0.489541  nitroreductase family protein
BAS2827   -2      13       -1.607356  marR family transcriptional regulator
BAS2828   -3      13       -2.085437  tetracycline resistance protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS2819   BLACTAMASEA  32     0.004    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 31.7 bits (72), Expect = 0.004
Identities = 14/55 (25%), Positives = 21/55 (38%)

Query: 74 GKISSYTAGVADLSTKKPVKSDYRFRIGSVTKTFTATTVLQLVGENRVQLDDSIE 128
G++ +A T ++D RF + S K VL V QL+ I
Sbjct: 38 GRVGMIEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARVDAGDEQLERKIH 92


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2823   YERSSTKINASE  29     0.029    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 29.3 bits (65), Expect = 0.029
Identities = 15/37 (40%), Positives = 23/37 (62%), Gaps = 3/37 (8%)

Query: 310 KHLQNVLKILASISDQDTPVSSSYFYTAGFRRKELDA 346
KHL+ +L++L ++S Q PVSS T GF + +A
Sbjct: 567 KHLETLLEVLVTLSQQGQPVSSE---TYGFLNRLTEA 600


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS2824   HTHFIS  43     1e-06    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 42.9 bits (101), Expect = 1e-06
Identities = 33/156 (21%), Positives = 62/156 (39%), Gaps = 20/156 (12%)

Query: 15 IIGKDESI----ELAAIALIAKGHILLEDVPGTGKTTLAKSL---AKSVDAKFQRIQFTA 67
++G+ ++ + A + +++ GTGK +A++L K + F I A
Sbjct: 139 LVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAA 198

Query: 68 DTLPGDVIGLEYFNVKESDF----KTRLGPI-FAN--IVLVDEINRAVPRTQSSLLEVME 120
+P D+I E F ++ F G A + +DEI Q+ LL V++
Sbjct: 199 --IPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQ 256

Query: 121 ERTVTIAKQTHSLPEPFLVIATQN-PLESA---GTF 152
+ T + ++A N L+ + G F
Sbjct: 257 QGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLF 292


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
BAS2828   TCRTETOQM  635    0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 635 bits (1640), Expect = 0.0
Identities = 223/647 (34%), Positives = 343/647 (53%), Gaps = 13/647 (2%)

Query: 1 MTTINIEIVAHVDAGKTSLTERILYETNVIKEVGRVDSGSTQTDSMELERQRGITIKASV 60
M INI ++AHVDAGKT+LTE +LY + I E+G VD G+T+TD+ LERQRGITI+ +
Sbjct: 1 MKIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGI 60

Query: 61 VSFFIDDIKVNVIDTPGHADFIAEVERSFRVLDGAILVISAVEGVQAQTKILMQTLQKLN 120
SF ++ KVN+IDTPGH DF+AEV RS VLDGAIL+ISA +GVQAQT+IL L+K+
Sbjct: 61 TSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMG 120

Query: 121 IPTILFVNKIDRTGANTEKVVKQIKTILSNETFPFYSVQNEGTKEARIIEYKSYDDCIER 180
IPTI F+NKID+ G + V + IK LS E V E + + + + +
Sbjct: 121 IPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKV--ELYPNMCVTNF-TESEQWDT 177

Query: 181 LAPYNESLLESFVNNEIVTDTLLREELEKQIQQANLYPIFFGSALTGIGVTELLEDIPAL 240
+ N+ LLE +++ + + L +E + +L+P++ GSA IG+ L+E I
Sbjct: 178 VIEGNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNK 237

Query: 241 LPANNPSQDEELSGIVFKIEREPSGEKIAYVRVFSGTLHVRKYVHIQRDGSLPHKEKIKK 300
++ EL G VFKIE +++AY+R++SG LH+R V I K KI +
Sbjct: 238 FYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKE----KIKITE 293

Query: 301 MCIFHNGNAVQTSTVPSGDFCKVWGLNNIKIGDIIGERT--DYIKDIHFAEPQMEAAINA 358
M NG + SG+ + +K+ ++G+ + I P ++ +
Sbjct: 294 MYTSINGELCKIDKAYSGEIVILQN-EFLKLNSVLGDTKLLPQRERIENPLPLLQTTVEP 352

Query: 359 VPKERIHDLYAALMELCEADPLIKVWKDDIHNELYIRLFGEVQKEVIETTLYEKYNLQVT 418
++ L AL+E+ ++DPL++ + D +E+ + G+VQ EV L EKY++++
Sbjct: 353 SKPQQREMLLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIE 412

Query: 419 FSSTRVVCMEKPIGIGNSVEVMGEKANPFYATIGFKVERGELNSGITYKLGVELGSLPLA 478
V+ ME+P+ + NPF+A+IG V L SG+ Y+ V LG L +
Sbjct: 413 IKEPTVIYMERPLKKAEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQS 472

Query: 479 FHKASEDTVFQTLKQGLYGWEVTDISVTLTHTGYASPVTTASDFRNLTPLVLMDALKQAE 538
F A + + +QGLYGW VTD + + Y SPV+T +DFR L P+VL LK+A
Sbjct: 473 FQNAVMEGIRYGCEQGLYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAG 532

Query: 539 TYVYEPVNEFELTVPEHAISTAMYKLAAILATFAEPIFNNDSYQLTGSLPVAKTESFKRM 598
T + EP F++ P+ +S A A + N+ L+G +P + ++
Sbjct: 533 TELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQEYRSD 592

Query: 599 LHSFTEGEGVFTTKPAGFTKLMAPLPTRKRVDYNPLNRKDYLLHVLK 645
L FT G V T+ G+ + R P +R D + ++
Sbjct: 593 LTFFTNGRSVCLTELKGYHVTTGEPVCQPR---RPNSRIDKVRYMFN 636


84  BAS2989  BAS2999  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS2989   -1      14        2.877982  penicillin-binding protein
BAS2990    1      13        2.439885  hypothetical protein
BAS2991    0      14        2.573642  hypothetical protein
BAS2992    0      14        2.351856  marR family transcriptional regulator
BAS2993    0      12        2.161220  bifunctional P-450:NADPH-P450 reductase 1
BAS2994    0      15        1.167533  EmrB/QacA family drug resistance transporter
BAS2995   -2      17        0.017074  hypothetical protein
BAS2996   -1      19       -1.683640  hypothetical protein
BAS2997   -1      13       -1.281057  hypothetical protein
BAS2998   -1      14       -1.162116  lipoprotein
BAS2999   -2      14       -1.187195  DNA-binding response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS2989   BLACTAMASEA  34     0.001    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 34.0 bits (78), Expect = 0.001
Identities = 30/151 (19%), Positives = 52/151 (34%), Gaps = 46/151 (30%)

Query: 94 DTLYGIGSTSKVYTAAAVMKLVDEGKVDLDASVTRYIPEFKMKDERYKRITPRMLLNHSS 153
D + + ST KV AV+ VD G L+ + + L+++S
Sbjct: 59 DERFPMMSTFKVVLCGAVLARVDAGDEQLERKIH---------------YRQQDLVDYSP 103

Query: 154 GLQGSTLNNAFLFKDNDVYAHDILLQQLSNQNLKADPGAFSVYCNDGFTLAEILVERVSG 213
+ A + + +L A ++ +D + A +L+ V G
Sbjct: 104 VSE-------------KHLADGMTVGELC---------AAAITMSDN-SAANLLLATVGG 140

Query: 214 M-SFTEFLHQKFTEPLKLNHTITSQDKWEDE 243
T FL Q + +T D+WE E
Sbjct: 141 PAGLTAFLRQ-------IGDNVTRLDRWETE 164


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS2992   ARGREPRESSOR  27     0.021    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 26.8 bits (59), Expect = 0.021
Identities = 14/46 (30%), Positives = 22/46 (47%), Gaps = 8/46 (17%)

Query: 38 IISVLCSQRATTQKELAEAIDKD-----QTTVVRMIQSMERKGIVK 78
I ++ + TQ EL + + KD Q TV R I+ + +VK
Sbjct: 10 IREIITANEIETQDELVDILKKDGYNVTQATVSRDIKEL---HLVK 52


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS2993   MECHCHANNEL  33     0.002    Bacterial mechano-sensitive ion channel signature.
		>MECHCHANNEL#Bacterial mechano-sensitive ion channel signature.

Length = 136

Score = 32.9 bits (75), Expect = 0.002
Identities = 19/64 (29%), Positives = 30/64 (46%), Gaps = 13/64 (20%)

Query: 262 IITFLIAGHETTSGLLSFAIYFLLKNPDKLKKAYEEVDRVLTDSTPTYQQVMKLKYIRMI 321
+ FLI ++FAI+ +K +KL + EE PT ++V+ L IR +
Sbjct: 82 VFDFLI---------VAFAIFMAIKLINKLNRKKEEPA---AAPAPTKEEVL-LTEIRDL 128

Query: 322 LNES 325
L E
Sbjct: 129 LKEQ 132


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS2994   TCRTETB  128    2e-34    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 128 bits (323), Expect = 2e-34
Identities = 86/362 (23%), Positives = 165/362 (45%), Gaps = 19/362 (5%)

Query: 14 MLVILFIGAFVSFLNNSLLNVALPSIMKDLDIKDYSTIQWLSTGYMLVSGILIPASAFLI 73
+L+ L I +F S LN +LNV+LP I D + ST W++T +ML I L
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPAST-NWVNTAFMLTFSIGTAVYGKLS 73

Query: 74 TRFSNRSLFITSMMIFTLGTALAAVAPN-FGLLLTGRMVQAAGSSVMGPLLMNIMLVSFP 132
+ + L + ++I G+ + V + F LL+ R +Q AG++ L+M ++ P
Sbjct: 74 DQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIP 133

Query: 133 REKRGTAMGIFGLVMITAPAIGPTLSGYIVEYYDWRLLFEMILPLAIISLLLGIWKSENV 192
+E RG A G+ G ++ +GP + G I Y W L L + ++ + +
Sbjct: 134 KENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL-----LIPMITIITVPFLMKL 188

Query: 193 MRQNKNAK--LDYLSLLLSSIGFGGLLYGFSSASSDGWTNKVVVTTLILGAIALIAFIIR 250
+++ K D ++L S+G + +S S ++ LI+ ++ + F+
Sbjct: 189 LKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYS---------ISFLIVSVLSFLIFVKH 239

Query: 251 QLKMNEPLLDLRVYKYPMFALASVIAIVNAVAMFSGMILTPAYVQNVRGISPLSSG-LMM 309
K+ +P +D + K F + + + + + + P +++V +S G +++
Sbjct: 240 IRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVII 299

Query: 310 LPGAVIMGIMSPITGKLFDKYGPRILGIVGLSITAVSTYMLANLQLDSSHTHTILIYTLR 369
PG + + I I G L D+ GP + +G++ +VS + L +S TI+I +
Sbjct: 300 FPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVL 359

Query: 370 MF 371

Sbjct: 360 GG 361


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS2999   HTHFIS  99     3e-26    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 99.1 bits (247), Expect = 3e-26
Identities = 31/112 (27%), Positives = 57/112 (50%)

Query: 2 RVLIVEDEQDLQNILVKRLNAEHYSVDACGNGEDALDYINMATYDLIVLDIMIPGINGLQ 61
+L+ +D+ ++ +L + L+ Y V N +I DL+V D+++P N
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLQKLRADNHTTSVLLLTAKDTIDDRVKGLDLGADDYLVKPFAFDELLARIR 113
+L +++ VL+++A++T +K + GA DYL KPF EL+ I
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


85    BAS3027    BAS3034    N        Y        N        Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS3027   1       13       -2.601606  sensor histidine kinase
BAS3028   1       12       -2.550523  DNA-binding response regulator
BAS3029   1       13       -1.165509  hypothetical protein
BAS3030   1       14       0.211653   marR family transcriptional regulator
BAS3031   1       14       1.563000   sporulation transcriptional regulator
BAS3032   1       14       2.151639   protease synthase and sporulation negative
BAS3033   1       14       2.198286   hypothetical protein
BAS3034   2       14       2.107654   major facilitator family transporter protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS3027   PF06580  38     5e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.3 bits (89), Expect = 5e-05
Identities = 17/102 (16%), Positives = 34/102 (33%), Gaps = 22/102 (21%)

Query: 359 NIFTNSIKFSNEGGTIEFFVEELESSVIISISDNGIGMEKEEMDRIFDRFYKVDTARARN 418
N + I +GG I + +V + + + G K
Sbjct: 266 NGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTK----------------- 308

Query: 419 VEGSGLGLSIVQKIVELHNGN---VSVYSTKGEGTTVRVELP 457
E +G GL V++ +++ G + + +G V +P
Sbjct: 309 -ESTGTGLQNVRERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS3028   HTHFIS  82     2e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.2 bits (203), Expect = 2e-20
Identities = 34/123 (27%), Positives = 61/123 (49%), Gaps = 1/123 (0%)

Query: 1 MKMIHILLADDDKHIRELLHYHLQKEGFKVFEAEDGKVAQEVLEKENIHLAIVDIMMPFV 60
M IL+ADDD IR +L+ L + G+ V + + + L + D++MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGYTLCEEIRK-YHDIPVILLTAKDQLVDKEKGFISGTDDYIVKPFEPAEVIFRMKALLR 119
+ + L I+K D+PV++++A++ + K G DY+ KPF+ E+I + L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 RYQ 122
+
Sbjct: 121 EPK 123


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3031   SACTRNSFRASE  37     9e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 36.8 bits (85), Expect = 9e-06
Identities = 28/136 (20%), Positives = 48/136 (35%), Gaps = 15/136 (11%)

Query: 1 MTTHIEKCTLKDIHKLQEISYETFNETF-KHQNSPENMHHYLEKAFNLKQLEKE------ 53
M + +KD +K E + F +N KQ E +
Sbjct: 1 MIMKMTHLNMKDFNKPNE-PFVVFGRMIPAFENGVWTYTEERFSKPYFKQYEDDDMDVSY 59

Query: 54 LSNISSQFFFVYFNDEIAGYLKINTDDAQSEEIGDESLEVERIYIKSSFQKHGLGKYLLN 113
+ F Y + G +KI ++ + +E I + ++K G+G LL+
Sbjct: 60 VEEEGKAAFLYYLENNCIGRIKIRSN-------WNGYALIEDIAVAKDYRKKGVGTALLH 112

Query: 114 NAIEIAIANNKKNIWL 129
AIE A N+ + L
Sbjct: 113 KAIEWAKENHFCGLML 128


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS3034   TCRTETA  34     0.001    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.0 bits (78), Expect = 0.001
Identities = 61/379 (16%), Positives = 127/379 (33%), Gaps = 42/379 (11%)

Query: 45 WGAILGYFGYGYMIGSLLGGIFSDKKGPKFVWIVAATAWSIFEIATAFAGEIGIAVFGGS 104
+G +L + + + G SD+ G + V +V+ ++ A A + + G
Sbjct: 45 YGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIG-- 102

Query: 105 ALIGFAIFRVLFGLTEGPSFAVSNKTAANWAAPKERAFLTSLGFVGVPLGAVLTA-PVAV 163
R++ G+T G + AV+ A+ ERA GF+ G + A PV
Sbjct: 103 --------RIVAGIT-GATGAVAGAYIADITDGDERA--RHFGFMSACFGFGMVAGPVLG 151

Query: 164 LLLSFTSWKIMFFILGTIGIVWAIIWYFTFTNMPEDHPRVTKEELAEIRSTEGVLQSAKV 223
L+ S FF + + + F +PE H + E + +
Sbjct: 152 GLMGGFSPHAPFFAAAALNGLNFLTGCFL---LPESHKGERRPLRREALNPLASFR---- 204

Query: 224 EKEIPKEPWYSFFKVPTFVMVTIAYFCFQYINFLILTWTPKYLQDVFHFQLSSLWYLGMI 283
W V +M +F Q + + + +D FH+ ++ +G+
Sbjct: 205 --------WARGMTVVAALMAV--FFIMQLVGQVPAALWVIFGEDRFHWDATT---IGIS 251

Query: 284 PWLGACITLPLGAKLSDRILRKTGNLRLARTGLPIIALLLTAICFSFIPAMNNYVAVLAL 343
+ A ++ + + G R G ++ + + +
Sbjct: 252 LAAFGILHSLAQAMITGPVAARLGERRALMLG-----MIADGTGYILLAFATRGWMAFPI 306

Query: 344 MSLGNAFAFLPSSLFWAIIVDTAPAYSGTYSGIMHFIANIATILAPTLTGYL---VVSYG 400
M L + +L + G G + + ++ +I+ P L + ++
Sbjct: 307 MVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTW 366

Query: 401 YPSMFIVAAILAAIAMGAM 419
+I A L + + A+
Sbjct: 367 NGWAWIAGAALYLLCLPAL 385



Score = 29.0 bits (65), Expect = 0.037
Identities = 28/161 (17%), Positives = 45/161 (27%), Gaps = 12/161 (7%)

Query: 290 ITLPLGAKLSDRILRKTGNLRLARTGLPIIALLLTAICFSFIPAMNNYVAVLALMSLGNA 349
P+ LSDR R+ ++ L A I A ++ VL + +
Sbjct: 58 ACAPVLGALSDRFGRR----------PVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAG 107

Query: 350 FAFLPSSLFWAIIVDTAPAYSGT-YSGIMHFIANIATILAPTLTGYLVVSYGYPSMFIVA 408
++ A I D + G M + P L G + + + F A
Sbjct: 108 ITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMG-GFSPHAPFFAA 166

Query: 409 AILAAIAMGAMLFVKPGQQTKTESLFNWRGKKRLEEPRANF 449
A L + F+ P L R
Sbjct: 167 AALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWAR 207


86    BAS3163    BAS3169    N        Y        N        Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS3163   3       13       -2.414861  acetyltransferase
BAS3164   2       14       -2.246914  hypothetical protein
BAS3165   1       14       -2.584254  hypothetical protein
BAS3166   0       14       -2.442298  hypothetical protein
BAS3167   0       14       -1.569992  hypothetical protein
BAS3168   1       17       -0.938436  lipoprotein
BAS3169   0       17       -0.701614  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3163   SACTRNSFRASE  38     4e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 38.0 bits (88), Expect = 4e-06
Identities = 25/122 (20%), Positives = 48/122 (39%), Gaps = 13/122 (10%)

Query: 21 IPAYEIEAKYINSTAIPRLY--------DTIADIQSCDEIFYGYFYEDTLAGFISFKID- 71
IPA+E + Y ++ ++ + + Y+ E+ G I + +
Sbjct: 27 IPAFENGVWTYTEERFSKPYFKQYEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRSNW 86

Query: 72 KEEVDIHRLVVSPDHFHKGIATKLLLYIFDMFSSSKTY---IVQTGKENTPALSLYKKHG 128
I + V+ D+ KG+ T LL + ++ + +++T N A Y KH
Sbjct: 87 NGYALIEDIAVAKDYRKKGVGTALLHKAIE-WAKENHFCGLMLETQDINISACHFYAKHH 145

Query: 129 FI 130
FI
Sbjct: 146 FI 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS3166   TCRTETA  58     1e-11    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 58.3 bits (141), Expect = 1e-11
Identities = 41/314 (13%), Positives = 96/314 (30%), Gaps = 11/314 (3%)

Query: 7 PIRFMLISSFFMSFGYFAVYAFLAIYLLTFLHFSAVQ--VGTVLTVMTITSRIIPLFSGL 64
P+ +L + + G + L L +H + V G +L + + G
Sbjct: 6 PLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGA 65

Query: 65 IADKIGYIIMMIAGLFLRGIGFIALGICSDFYTISISSALIGFGTAFYEPAARAIFGSQP 124
++D+ G +++ L + + + + + I + G A A I
Sbjct: 66 LSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITD 125

Query: 125 AHTRKNLFTYLNLSFNCGAIMGPIAGGFLLLLDPIYAFSLTGSLMLIFAFIFYLLKDHFQ 184
R F +++ F G + GP+ GG + P F +L + L
Sbjct: 126 GDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESH 185

Query: 185 VTTENTSITLGIQAILQNKSFLLFSFIMIFFYIMFT-QLTVALPLHMKNISNSNQLA--- 240
+ + + + + + F QL +P + I ++
Sbjct: 186 KGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDA 245

Query: 241 ---TLVITINAITGVIFMVLFRKLFLKY-NTLSFIKYGVLLMSISFLLIPLFQHPYWLFI 296
+ + I + + + G++ ++L+ F W+
Sbjct: 246 TTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILL-AFATRGWMAF 304

Query: 297 CVIFFTIGETLVLP 310
++ + +P
Sbjct: 305 PIMVLLASGGIGMP 318


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3167   SYCDCHAPRONE  36     4e-05    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.
Length = 168

Score = 36.4 bits (84), Expect = 4e-05
Identities = 24/133 (18%), Positives = 49/133 (36%), Gaps = 21/133 (15%)

Query: 78 YMKQKKWEEAKEALQKSISIQPSDEAYHNV-AVAHYNLGELEEASEFFLRVA----GDSD 132
+ K+E+A + Q + D + +G+ + A + A +
Sbjct: 46 QYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPR 105

Query: 133 YIMYSYVKCLIDLGRTKEAKEKLDAFNRESDNFLGEMMVAD------LYVELNCYKEAIE 186
+ ++ CL+ G EA+ L L + ++AD L ++ EAI+
Sbjct: 106 FPFHAAE-CLLQKGELAEAESGLF---------LAQELIADKTEFKELSTRVSSMLEAIK 155

Query: 187 WFEKGYKECWKSP 199
++ EC +P
Sbjct: 156 LKKEMEHECVDNP 168



Score = 30.7 bits (69), Expect = 0.004
Identities = 16/96 (16%), Positives = 33/96 (34%), Gaps = 2/96 (2%)

Query: 30 SRDVQSLNNLAWMYFYEEENDEKALELIGEVVKLNPSSYFPYNILRDIYMKQKKWEEAKE 89
S ++ L +LA Y+ E A ++ + L+ + L +++ A
Sbjct: 33 SDTLEQLYSLA-FNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIH 91

Query: 90 ALQKSISIQPSDEAYH-NVAVAHYNLGELEEASEFF 124
+ + + + + A GEL EA
Sbjct: 92 SYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGL 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3169   TYPE3IMSPROT  27     0.005    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.
Length = 354

Score = 27.4 bits (61), Expect = 0.005
Identities = 13/69 (18%), Positives = 29/69 (42%), Gaps = 5/69 (7%)

Query: 10 IGNIFWIIVFGIWAAIIWL--RDVDGAGVIQTPEIKSISLIVI---LIAFIIPVFFQVIW 64
+ + WII+ G ++ L ++ + ++ + +I ++ I F+
Sbjct: 150 LSILIWIIIKGNLVTLLQLPTCGIECITPLLGQILRQLMVICTVGFVVISIADYAFEYYQ 209

Query: 65 LIINLRMSK 73
I L+MSK
Sbjct: 210 YIKELKMSK 218


87    BAS3389    BAS3396    N        Y        N        Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
BAS3389   -1      14       0.609379  TetR family transcriptional regulator
BAS3390   0       13       0.933865  Gfo/Idh/MocA family oxidoreductase
BAS3391   -1      12       1.160765  DNA topoisomerase IV subunit A
BAS3392   -3      12       1.142400  DNA topoisomerase IV subunit B
BAS3394   -3      14       0.304497  CoA-binding domain-containing protein
BAS3395   -3      14       0.418867  serine protease
BAS3396   -2      14       0.718229  DNA-binding response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS3389   HTHTETR  66     1e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 66.2 bits (161), Expect = 1e-15
Identities = 27/168 (16%), Positives = 57/168 (33%), Gaps = 22/168 (13%)

Query: 8 KEKKKRAIKEAAFLLFSERGFNEVKIEHIAKEANVSQVTIYNHFGSKDALFRELIQEFII 67
++ ++ I + A LFS++G + + IAK A V++ IY HF K LF E+ +
Sbjct: 9 AQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWEL--- 65

Query: 68 CEFQYYKELAEEKLP-------------FHDMMQKMIVRKMNTGGLFQPDMLLQMMQRDE 114
EL E +++ + + + + +
Sbjct: 66 -SESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMA 124

Query: 115 ILREFIYSYQNEKILPWYLEILERAQRNNEI----NPHLTKEMMLLYI 158
++++ + E + L+ + +M YI
Sbjct: 125 VVQQAQRNLCLE-SYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYI 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3392   ACRIFLAVINRP  31     0.015    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 31.3 bits (71), Expect = 0.015
Identities = 12/49 (24%), Positives = 22/49 (44%), Gaps = 1/49 (2%)

Query: 455 INTEKAKLADIFKNEEINTIIYAIGGGVGNEFDVEDINYDKVVIMTDAD 503
++ EKA+ + ++ TI A+GG N+F + + DA
Sbjct: 730 VDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKK-LYVQADAK 777


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
BAS3395   V8PROTEASE  66     4e-14    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 66.2 bits (161), Expect = 4e-14
Identities = 33/166 (19%), Positives = 62/166 (37%), Gaps = 38/166 (22%)

Query: 134 NKAYIVTNNHVVDGANKLAVKLS------------DGKKVDAKLVGKDPWLDLAVVEI-- 179
K ++TN HVVD + L +G ++ DLA+V+
Sbjct: 110 GKDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSP 169

Query: 180 --DGANVN---KVATLGDSSKIRAGEKAIAIGNPLGFDG---SVTEGIISSKEREIPVDI 231
++ K AT+ ++++ + + G P ++G I+ + E
Sbjct: 170 NEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKPVATMWESKGKITYLKGEA---- 225

Query: 232 DGDKRADWNAQVIQTDAAINPGNSGGALFNQNGEIIGINSSKIAQQ 277
+Q D + GNSG +FN+ E+IGI+ + +
Sbjct: 226 ------------MQYDLSTTGGNSGSPVFNEKNEVIGIHWGGVPNE 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS3396   HTHFIS  88     2e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.3 bits (219), Expect = 2e-22
Identities = 34/164 (20%), Positives = 76/164 (46%), Gaps = 16/164 (9%)

Query: 4 TVLLVEDERRLREIVSDYFRNEGFEVIEAEDGKKALELFAEHEIDLIMLDIMLPEIDGWS 63
T+L+ +D+ +R +++ G++V + A + DL++ D+++P+ + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 VCRRIRKESA-VPIIMLTARSDEDDTLLGFELGADEYVTKPFSPKVLVA---RAKTLLKR 119
+ RI+K +P+++++A++ + E GA +Y+ KPF L+ RA KR
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 120 ADGVVGVAEENAMSLAGIE------------VNRLSRTVLVDGE 151
+ ++ M L G + + T+++ GE
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGE 168


88    BAS3635    BAS3645    N        Y        N        Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS3635   -2      10       0.100098   3-ketoacyl-ACP reductase
BAS3636   -2      10       -0.136156  zinc protease
BAS3637   -2      10       0.827186   hypothetical protein
BAS3638   -3      10       1.237636   branched-chain amino acid ABC transporter
BAS3642   -3      13       0.988472   sugar ABC transporter ATP-binding protein
BAS3643   -2      13       0.923660   Bmp family lipoprotein
BAS3644   -2      13       0.554617   GntR family transcriptional regulator
BAS3645   -2      13       0.913105   stage III sporulation protein E
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3635   DHBDHDRGNASE  95     1e-25    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.
Length = 261

Score = 94.7 bits (235), Expect = 1e-25
Identities = 68/248 (27%), Positives = 113/248 (45%), Gaps = 16/248 (6%)

Query: 3 KYALVTGGSGGIGSAISKQLIQDGYTVYVHYNNSE-----EKVNELQKEWGEVIPVQ-AN 56
K A +TG + GIG A+++ L G + N E + + E P +
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 57 LASSDGAEQLWEQIEHPLDAIIYAAGKSIFGLVTDVTNDELNDMVELQVKSIYKLLSMAL 116
A+ D E+ P+D ++ AG GL+ ++++E + ++
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 117 PSMIQRRSGNIVLVSSIWGQIGASCEVLYSMVKGAQNSYVKALAKEVSLSGIRVNAVAPG 176
M+ RRSG+IV V S + + Y+ K A + K L E++ IR N V+PG
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 177 AIETEM-LNVFSEEDKNE-----IAEE----IPLGRLGLPEEVAKTVSFLVSPGASYITG 226
+ ET+M +++++E+ E E IPL +L P ++A V FLVS A +IT
Sbjct: 189 STETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHITM 248

Query: 227 QIIGVNGG 234
+ V+GG
Sbjct: 249 HNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3638   TYPE3OMGPROT  31     0.007    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.
Length = 607

Score = 31.0 bits (70), Expect = 0.007
Identities = 9/14 (64%), Positives = 10/14 (71%)

Query: 162 VPGLSDIPVIGKIF 175
VP L DIP IG +F
Sbjct: 475 VPLLGDIPYIGALF 488


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3643   LIPPROTEIN48  64     2e-13    Mycoplasma P48 major surface lipoprotein signature.
		>LIPPROTEIN48#Mycoplasma P48 major surface lipoprotein signature.

Length = 428

Score = 64.3 bits (156), Expect = 2e-13
Identities = 76/329 (23%), Positives = 130/329 (39%), Gaps = 55/329 (16%)

Query: 1 MKKKTGLLLSLTLAAS---AVLGACGNSDKASSDKKE----------------------- 34
MKK +LL L+ A+ AV +CGN+D+++ KE
Sbjct: 1 MKKSKKILLGLSPIAAILPAVAVSCGNNDESNISFKEKDISKYTTTNANGKQVVKNAELL 60

Query: 35 -FKVGMVTDVGGVDDKSFNQSAWEGLTKFGKDNNLKKNEGYRYLQSSKDADYIPNLTKFA 93
K ++TD G +DDKSFNQSA+E L K ++ N +++
Sbjct: 61 KLKPVLITDEGKIDDKSFNQSAFEALKAINKQTGIEIN------NVEPSSNFESAYNSAL 114

Query: 94 KDHYNTTFGIGYLMEKSIEKVAEQYPKE----QFAIV----DTVVEKPNVTSITFKDHEG 145
+ G+ ++SI++ + + +E Q I+ D E S+ F E
Sbjct: 115 SAGHKIWVLNGFKHQQSIKQYIDAHREELERNQIKIIGIDFDIETEYKWFYSLQFNIKES 174

Query: 146 SFLVG-AVAAMTTKSNK----VGFVGGVKSPLITKFESGFKAGAKAVN---PNIEIVSQY 197
+F G A+A+ ++ ++ V GG P +T F GF G N + +I
Sbjct: 175 AFTTGYAIASWLSEQDESKRVVASFGGGAFPGVTTFNEGFAKGILYYNQKHKSSKIYHTS 234

Query: 198 ADAFDK-----PEKGSVLASAMYGGGVDVIYHASGATGNGVFTEAKNRKKKGENVWVIGV 252
D + +V+ + + DV Y+ + + + +VIGV
Sbjct: 235 PVKLDSGFTAGEKMNTVINNVLSSTPADVKYNPHVILSVAGPATFETVRLANKGQYVIGV 294

Query: 253 DRDQNQEGMPENVTLTSMVKRVDVAVAKV 281
D DQ + + LTS++K + AV +
Sbjct: 295 DSDQGMIQDKDRI-LTSVLKHIKQAVYET 322


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS3645   IGASERPTASE  33     0.005    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.1 bits (75), Expect = 0.005
Identities = 30/168 (17%), Positives = 54/168 (32%), Gaps = 19/168 (11%)

Query: 210 RAKRTAEQTEKKKTTRSTRSKRATEQEEIIEPMEEISIDPPIISNFTENYPVNEQEDKRI 269
+ AE ++++ T + ATE + + + N N Q ++
Sbjct: 1036 TTETVAENSKQESKTVEKNEQDATETT---------AQNREVAKEAKSNVKANTQTNEVA 1086

Query: 270 EVEQEELITSPF-IEEAPPVEEPKKKRGEKIVESLEGETQAPPMQFSNVENKDYKLPALD 328
+ E T +E VE+ +K + E +TQ P S V K + +
Sbjct: 1087 QSGSETKETQTTETKETATVEKEEKAKVET------EKTQEVPKVTSQVSPKQEQSETVQ 1140

Query: 329 ILKFPKNKQVTNENAEIYENARKLERTFQSFGVKAKVTKVHRGPAVTK 376
P + N + + T AK T + VT+
Sbjct: 1141 PQAEPARENDPTVNI---KEPQSQTNTTADTEQPAKETSSNVEQPVTE 1185


89    BAS3899    BAS3906    N        Y        N        Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS3899   -2      11       -1.616022  diguanylate phosphodiesterase
BAS3900   -2      11       -0.772956  short chain dehydrogenase
BAS3901   -2      10       -2.327488  Ser/Thr protein phosphatase family protein
BAS3902   -1      15       -2.784582  hypothetical protein
BAS3903   0       16       -3.638085  polyphosphate kinase
BAS3904   1       18       -4.584250  ppx/GppA phosphatase
BAS3905   0       18       -2.617720  hypothetical protein
BAS3906   -1      18       -2.597030  lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3899   FbpA_PF05833  36     3e-04    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 36.0 bits (83), Expect = 3e-04
Identities = 18/151 (11%), Positives = 44/151 (29%), Gaps = 1/151 (0%)

Query: 133 EQFNHLLMYYRTYGIQISINKVGTGTSN-LERISVLAPDILKVDLTNLRQTALLQSYQDI 191
+ + +K+ TG S L +DL+ +++ +D+
Sbjct: 179 DMIENFTKENSLQLNDNIFSKIFTGVSKTLSSEICFRLKNNSIDLSLSNLKEIVEVCKDL 238

Query: 192 LYSLSLLARRIGATLLYEEIDAFYQLQYAWKNGGRYYQGNYLKECLPDFIETNVLKERLG 251
+ FY L K + Q + + L +F +RL
Sbjct: 239 FKEIQSNKFEFNCYTKNNSFVGFYCLNLMSKEDYKKIQYDSSSKLLENFYYAKDKSDRLK 298

Query: 252 NECHQFIQHEKKKLQKIYNLTEMLRDRIGDV 282
++ + + + ++L + +
Sbjct: 299 SKSSDLQKIVMNNINRCTKKDKILNNTLKKC 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3900   DHBDHDRGNASE  101    5e-28    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.
Length = 261

Score = 101 bits (253), Expect = 5e-28
Identities = 73/261 (27%), Positives = 122/261 (46%), Gaps = 25/261 (9%)

Query: 4 KVVIITGGSSGMGKGMATRFAKEGARVVITGRTKEKLEEAKLEI-------EQFPGQILT 56
K+ ITG + G+G+ +A A +GA + EKLE+ + E FP
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP----- 63

Query: 57 VQMDVRNTDDIQKMIEQIDEKFGRIDILINNAAGNFICPAEDLSVNGWNSVINIVLNGTF 116
DVR++ I ++ +I+ + G IDIL+N A LS W + ++ G F
Sbjct: 64 --ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVF 121

Query: 117 YCSQAIGKYWIEKGIKGNIINMVATYAWDAGPGVIHSAAAKAGVLAMTKTLAVEWGRKYG 176
S+++ KY +++ G+I+ + + A + A++KA + TK L +E +Y
Sbjct: 122 NASRSVSKYMMDRR-SGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELA-EYN 179

Query: 177 IRVNAIAPGPIERTGGADKLWISEEMAKRTIQ--------SVPLGRLGTPEEIAGLAYYL 228
IR N ++PG T LW E A++ I+ +PL +L P +IA +L
Sbjct: 180 IRCNIVSPGST-ETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFL 238

Query: 229 CSDEAAYINGTCMTMDGGQHL 249
S +A +I + +DGG L
Sbjct: 239 VSGQAGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS3902   TYPE3IMRPROT  26     0.022    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.
Length = 261

Score = 26.2 bits (58), Expect = 0.022
Identities = 6/39 (15%), Positives = 8/39 (20%)

Query: 40 FFPFFGVPFLAGIAGGLLGGALAFGPRPYYPPYPPPFPP 78
P + L + F P P P
Sbjct: 29 TAPILSERSVPKRVKLGLAMMITFAIAPSLPANDVPVFS 67


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS3906   cloacin  27     0.047    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 27.4 bits (60), Expect = 0.047
Identities = 16/109 (14%), Positives = 33/109 (30%)

Query: 33 AFENAAKQEKTMFEDAKKLETLEKEGQELYNQIVQEGKDNNQTVKEKLNQAVKNTDEREK 92
+ A K + K+ + +++ Q Q + +N D K
Sbjct: 357 ELDAANKTLADAIAEIKQFNRFAHDPMAGGHRMWQMAGLKAQRAQTDVNNKQAAFDAAAK 416

Query: 93 VLKKEKESLNKAQEEVKSADKYVKKIEDKKLKDQADKVKSTYEKRHDSF 141
+L+ A E K + + E+ ++ K + HD
Sbjct: 417 EKSDADAALSSAMESRKKKEDKKRSAENNLNDEKNKPRKGFKDYGHDYH 465


90    BAS4142    BAS4149    N        Y        N        Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS4142   -2      12       -1.547138  competence protein ComG
BAS4143   -1      10       -0.898608  competence protein ComG
BAS4144   -2      10       -1.085778  competence protein ComG
BAS4145   -3      12       -0.409068  competence protein ComG
BAS4146   -1      15       -1.705403  hypothetical protein
BAS4147   -2      15       -1.618648  hypothetical protein
BAS4148   1       17       -1.940527  sodium:dicarboxylate symporter family protein
BAS4149   3       24       -3.282679  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS4142   BCTERIALGSPH  42     2e-07    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.
Length = 170

Score = 41.9 bits (98), Expect = 2e-07
Identities = 19/75 (25%), Positives = 37/75 (49%), Gaps = 1/75 (1%)

Query: 1 MKQKGFTLLEMLLVLFAISVLSMVTYFNVHSLYEKQKIEQFLRQFSNDILYMQQLAINRQ 60
M+Q+GFTLLEM+L+L + V + + + + + R F + ++QQ +
Sbjct: 1 MRQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLAR-FEAQLRFVQQRGLQTG 59

Query: 61 KHYTLRWHKDRHMYY 75
+ + + H DR +
Sbjct: 60 QFFGVSVHPDRWQFL 74


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS4143   BCTERIALGSPG  50     2e-11    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 50.3 bits (120), Expect = 2e-11
Identities = 18/65 (27%), Positives = 41/65 (63%)

Query: 1 MQNEEGFTLLEMLLVMVVITVLLLLIIPDVVTQRSSVEGKGCKAYVKSIEAQVQVYQLQH 60
+ GFTLLE+++V+V+I VL L++P+++ + + + + + ++E + +Y+L +
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDN 63

Query: 61 NKIPT 65
+ PT
Sbjct: 64 HHYPT 68


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS4144   BCTERIALGSPF  91     9e-23    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.
Length = 408

Score = 91.4 bits (227), Expect = 9e-23
Identities = 61/350 (17%), Positives = 150/350 (42%), Gaps = 22/350 (6%)

Query: 7 SLSDQVILLKRLGELLEKGYSLLQALEFLRFQLPLEKKVQLQRMIDGLKD----GKSLHD 62
S SD +L ++L L+ L +AL+ + Q +K L +++ ++ G SL D
Sbjct: 66 STSDLALLTRQLATLVAASMPLEEALDAVAKQS---EKPHLSQLMAAVRSKVMEGHSLAD 122

Query: 63 SFHQLKFHQEMLSYLFYA-----EQHGDISFALQQGSALLYKKDKYRKDMIKIMQYPMFL 117
+ +K L+ A E G + L + + ++ + R + + M YP L
Sbjct: 123 A---MKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVL 179

Query: 118 AIFLIIMILIFNRILLPQVDMVYSSFGSTAPLFTEQILSTIKLL----PYLIISTLFIIM 173
+ I ++ I +++P+V + PL T ++ + P+++++ ++
Sbjct: 180 TVVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLA---LLA 236

Query: 174 IVFGVYIVYFRKLPHMKQVKIILRIPLVKTFLILKHSHYFATQLSGLLHGGLSVLEALTI 233
++ ++ + + +L +PL+ ++ +A LS L + +L+A+ I
Sbjct: 237 GFMAFRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRI 296

Query: 234 MMEQKYHPFFQYEAGRIERQLIAGEPLQSIIAKSEYYEEELSYIITHGQANGNLAIELGD 293
+ + + ++ + G L + ++ + + ++I G+ +G L L
Sbjct: 297 SGDVMSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLER 356

Query: 294 YSDLIMEKMERKIKRMLVIIQPILFTCIGGIVVLMYLAMIMPMFQMMNSI 343
+D + ++ L + +P+L + +V+ + LA++ P+ Q+ +
Sbjct: 357 AADNQDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS4149   LIPPROTEIN48  27     0.033    Mycoplasma P48 major surface lipoprotein signature.
		>LIPPROTEIN48#Mycoplasma P48 major surface lipoprotein signature.

Length = 428

Score = 26.9 bits (59), Expect = 0.033
Identities = 16/68 (23%), Positives = 31/68 (45%), Gaps = 2/68 (2%)

Query: 39 IEKNMELFIELIRD-KENPFETGYSSSISIAVLDEEGKMIEFYTVPIWECCSYFL-GVPL 96
IE + F L + KE+ F TGY+ + ++ DE +++ + + + F G
Sbjct: 157 IETEYKWFYSLQFNIKESAFTTGYAIASWLSEQDESKRVVASFGGGAFPGVTTFNEGFAK 216

Query: 97 QIRFWGSK 104
I ++ K
Sbjct: 217 GILYYNQK 224


91    BAS4245    BAS4252    N        Y        N        Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS4245   -1      15       -1.171948  acetyltransferase
BAS4246   -1      12       -0.277955  alpha/beta hydrolase
BAS4247   0       10       0.836177   hypothetical protein
BAS4248   -1      10       1.069060   hypothetical protein
BAS4250   -1      11       0.919267   phosphoglycerate mutase
BAS4249   -1      12       1.087101   hypothetical protein
BAS4251   -1      11       1.305490   TolB domain-containing protein
BAS4252   0       14       1.202065   minor extracellular protease VpR
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS4245   SACTRNSFRASE  32     0.001    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 0.001
Identities = 16/86 (18%), Positives = 29/86 (33%), Gaps = 13/86 (15%)

Query: 41 GTLVGYMHENKLIAAGGVFPFKDRFSTIGMLIVHPNFQGQGIGRTLLNHCLEHTHPK--- 97
Y EN I + + ++ I + V +++ +G+G LL+ +E
Sbjct: 65 KAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFC 124

Query: 98 ------QPIALIATKAGEPLYTSCGF 117
Q I + A Y F
Sbjct: 125 GLMLETQDINISACHF----YAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS4248   PF06917  28     0.034    Periplasmic pectate lyase
		>PF06917#Periplasmic pectate lyase

Length = 555

Score = 27.6 bits (61), Expect = 0.034
Identities = 7/31 (22%), Positives = 19/31 (61%)

Query: 71 EELGEDIYEQDEPRGYFGAAEIYKYAKLHNT 101
++G+D++++ RG F + ++Y ++ N
Sbjct: 470 WQIGDDLFKRHYHRGLFVESAQHRYFRIDNP 500


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS4249   PF05272  27     0.014    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 27.3 bits (60), Expect = 0.014
Identities = 14/39 (35%), Positives = 18/39 (46%)

Query: 48 WSGVFMFCTINLLNKWLGIETSNTSPYLPTSVATILKEN 86
+S F TI L + LG + +SP L V L EN
Sbjct: 791 YSVNTTFVTIADLVQALGADPGKSSPMLEGQVRDWLNEN 829


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
BAS4252   SUBTILISIN  170    6e-49    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 170 bits (431), Expect = 6e-49
Identities = 74/203 (36%), Positives = 102/203 (50%), Gaps = 19/203 (9%)

Query: 173 GGPTIGAPEAWNLKDPSGKSLDGKGMKVAIIDSGVDYTHPDLKANYIGGYDTVDEDA--- 229
G I AP W G+G+KVA++D+G D HPDLKA IGG + D+D
Sbjct: 25 GVEMIQAPAVW-------NQTRGRGVKVAVLDTGCDADHPDLKARIIGGRNFTDDDEGDP 77

Query: 230 -DPMDGNVHGTHVAGIIAGNG---KIKGVAPNASILAYRVMNDGGTGTTDDIIQGIERAI 285
D N HGTHVAG IA + GVAP A +L +V+N G+G D IIQGI AI
Sbjct: 78 EIFKDYNGHGTHVAGTIAATENENGVVGVAPEADLLIIKVLNKQGSGQYDWIIQGIYYAI 137

Query: 286 QDGADVLNLSLGQDLNVPDQPVTLTLERAAKLGITAVVSNGNDGPKPWSVDA---PGNAS 342
+ D++++SLG +VP + +++A I + + GN+G D PG +
Sbjct: 138 EQKVDIISMSLGGPEDVP--ELHEAVKKAVASQILVMCAAGNEGDGDDRTDELGYPGCYN 195

Query: 343 SVISVGASTVSIPFPTFQVAGSS 365
VISVGA F + +
Sbjct: 196 EVISVGAINFDRHASEFSNSNNE 218



Score = 84.9 bits (210), Expect = 8e-20
Identities = 42/129 (32%), Positives = 59/129 (45%), Gaps = 19/129 (14%)

Query: 492 FSSRGPSQGSWLIKPDIVAPGVQITSTVPRGGYESHNGTSMAAPQVAGAVALLRQ----- 546
FS+ D+VAPG I STVP G Y + +GTSMA P VAGA+AL++Q
Sbjct: 212 FSNSNNE-------VDLVAPGEDILSTVPGGKYATFSGTSMATPHVAGALALIKQLANAS 264

Query: 547 MHPDWTTQQLKASLANTAKTLKDVNENTYPIMTQGSGLINIPKAAQTDVLVKPNNVSFGL 606
D T +L A L L + + +G+GL+ A + + G+
Sbjct: 265 FERDLTEPELYAQLIKRTIPLGNSPKM------EGNGLLY-LTAVEELSRIFDTQRVAGI 317

Query: 607 IKPNSGKVK 615
+ S KVK
Sbjct: 318 LSTASLKVK 326


92    BAS4565    BAS4569    N        Y        N        Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS4565   0       12       -0.873803  DNA-binding response regulator
BAS4566   -1      12       -1.185555  sensor histidine kinase
BAS4567   -1      19       -0.893520  ankyrin repeat-containing protein
BAS4568   0       19       -1.894996  Gfo/Idh/MocA family oxidoreductase
BAS4569   1       17       -1.687976  large conductance mechanosensitive channel
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS4565   HTHFIS  91     3e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.0 bits (226), Expect = 3e-23
Identities = 35/152 (23%), Positives = 74/152 (48%), Gaps = 7/152 (4%)

Query: 7 RILLIEDEVSIAELQRDYLEINDFQVDVEHSGETGLQMALQEDYDLIILDIMLPKMNGFE 66
IL+ +D+ +I + L + V + + T + D DL++ D+++P N F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 67 ICKQIRAI-KDIPILLVSAKKEDIDKIRGLGLGADDYITKPFSPSELVARVKAHISRYER 125
+ +I+ D+P+L++SA+ + I+ GA DY+ KPF +EL+ + ++ +R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 126 LLGNVSKQ-RDTLYIHGIS-----IDQRARKV 151
+ +D + + G S I + ++
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAMQEIYRVLARL 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS4566   PF06580  34     0.001    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.7 bits (77), Expect = 0.001
Identities = 18/101 (17%), Positives = 37/101 (36%), Gaps = 24/101 (23%)

Query: 379 LIHNSVKY---MDKEEKKITVTVSSDNNKVIVKVMDNGSGIESDTLPYIFERFYRAEQSR 435
L+ N +K+ + KI + + DN V ++V + GS +T
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNT--------------- 307

Query: 436 NSSTGGSGLGLAIAKQIIEEHGGN---IWAESELGEGTSIF 473
+G GL ++ ++ G I + G+ ++
Sbjct: 308 ---KESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS4567   HTHFIS  29     0.011    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.4 bits (66), Expect = 0.011
Identities = 16/96 (16%), Positives = 33/96 (34%), Gaps = 14/96 (14%)

Query: 130 GGTALIPASEHGYVDVIKELLTRTNIDVNHVNNLGWTALMEAIVLSNGNETQQQVIRLLI 189
G T L+ + V+ + L+R DV +N L I L++
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYDVRITSNA--ATLWRWI--------AAGDGDLVV 52

Query: 190 EHGADINIPDNDGVTPLEHARAHHFEEIEKILLEGH 225
D+ +PD + L + ++ +++
Sbjct: 53 ---TDVVMPDENAFDLLPRIKKAR-PDLPVLVMSAQ 84


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS4569   MECHCHANNEL  145    2e-48    Bacterial mechano-sensitive ion channel signature.
		>MECHCHANNEL#Bacterial mechano-sensitive ion channel signature.

Length = 136

Score = 145 bits (368), Expect = 2e-48
Identities = 76/134 (56%), Positives = 96/134 (71%), Gaps = 9/134 (6%)

Query: 1 MWNEFKKFAFKGNVIDLAVGVVIGAAFGKIVSSLVKDIITPLLGMVLGGVDFTDLKITFG 60
+ EF++FA +GNV+DLAVGV+IGAAFGKIVSSLV DII P LG+++GG+DF +T
Sbjct: 3 IIKEFREFAMRGNVVDLAVGVIIGAAFGKIVSSLVADIIMPPLGLLIGGIDFKQFAVTLR 62

Query: 61 KS-------SIMYGNFIQTIFDFLIIAAAIFMFVKVFNKLTSKREEEKEEEIPEPTKEEE 113
+ + YG FIQ +FDFLI+A AIFM +K+ NKL K+EE P PTKEE
Sbjct: 63 DAQGDIPAVVMHYGVFIQNVFDFLIVAFAIFMAIKLINKLNRKKEEPAAA--PAPTKEEV 120

Query: 114 LLGEIRDLLKQQNS 127
LL EIRDLLK+QN+
Sbjct: 121 LLTEIRDLLKEQNN 134


93  BAS4926  BAS4936  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS4926   1       21       1.123080   hypothetical protein
BAS4930   2       22       2.033357   hypothetical protein
BAS4931   1       21       2.403481   aldo/keto reductase
BAS4932   1       18       2.477521   major facilitator family transporter protein
BAS4933   1       22       2.334170   hypothetical protein
BAS4934   1       15       1.561803   hypothetical protein
BAS4935   -1      12       1.312723   pyridine nucleotide-disulfide oxidoreductase
BAS4936   -1      14       0.309119   tyrosyl-tRNA synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS4926   NUCEPIMERASE  32     0.002    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 31.7 bits (72), Expect = 0.002
Identities = 19/71 (26%), Positives = 32/71 (45%), Gaps = 4/71 (5%)

Query: 1 MKIGIIGAAGKAGSRILKEALDRGHEVTAI-VRNT---AKITEENVKVLEKDVFALTSND 56
MK + GAAG G + K L+ GH+V I N + + +++L + F D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 57 LQAFDVVVNAF 67
L + + + F
Sbjct: 61 LADREGMTDLF 71


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS4932   TCRTETA  63     5e-13    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 62.9 bits (153), Expect = 5e-13
Identities = 77/342 (22%), Positives = 136/342 (39%), Gaps = 32/342 (9%)

Query: 11 VQTNRRSMFALLALAISAFGIGTTEFISVGLLPSISKDLNVSVTTA---GLTVSLYALGA 67
++ NR + L +A+ A GIG + + +LP + +DL S G+ ++LYAL
Sbjct: 1 MKPNRPLIVILSTVALDAVGIG----LIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQ 56

Query: 68 AVGAPVLTALTASMSRKTLLMWIMVIFIIGNGIAAVATSFTILIIARIVSAFAHGVFMSI 127
APVL AL+ R+ +L+ + + I A A +L I RIV+
Sbjct: 57 FACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVA 116

Query: 128 GSTIAAAIVPENKRASAIAIMFTGLTVATITGVPIGTFIGQQFGWRASFMAIVVIGIIAF 187
G+ I A I ++RA M + G +G +G F A F A + + F
Sbjct: 117 GAYI-ADITDGDERARHFGFMSACFGFGMVAGPVLGGLMG-GFSPHAPFFAAAALNGLNF 174

Query: 188 IANSILVPSNLK------NGVPVSFRDQFKLIKNGR-----LLLVFIITALGYGGT--FV 234
+ L+P + K ++ F+ + + + FI+ +G +V
Sbjct: 175 LTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWV 234

Query: 235 TFTYLSPLLQEVTGFEASTVTIILLVYGIAIAIGN-MVGGKLSNH-NPIRALFYMFLIQA 292
F ++ ++A+T+ I L +GI ++ M+ G ++ RAL +
Sbjct: 235 IFG------EDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADG 288

Query: 293 IILFVLTFTAPFKVAGLITIIFMGLFAFMNVPGLQVYVVILA 334
+L F +A I ++ M P LQ +
Sbjct: 289 TGYILLAFATRGWMAFPIMVLLASGGIGM--PALQAMLSRQV 328


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS4934   PF07472  28     0.017    Fucose-binding lectin II
		>PF07472#Fucose-binding lectin II

Length = 245

Score = 27.7 bits (61), Expect = 0.017
Identities = 21/66 (31%), Positives = 31/66 (46%), Gaps = 8/66 (12%)

Query: 15 VQISASQGQLDVLDQLLKPEVQESLTTLVEQLPKLTELVNILTKSYDFAQTVATDEVLKS 74
VQ + Q LD + Q + T LVE+LP+ V+I T Y F ++V K+
Sbjct: 24 VQANGDQAVLDRMRQFMT-------TQLVEKLPQYDVFVDIATIPYSFDVGSWQNKV-KA 75

Query: 75 DTVGAI 80
D G +
Sbjct: 76 DAAGQV 81


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS4936   TACYTOLYSIN  30     0.028    Bacterial thiol-activated pore-forming cytolysin sig...
		>TACYTOLYSIN#Bacterial thiol-activated pore-forming cytolysin signature.

Length = 574

Score = 29.6 bits (66), Expect = 0.028
Identities = 23/92 (25%), Positives = 36/92 (39%), Gaps = 18/92 (19%)

Query: 333 DEIEQGFKEMPTFQSSKETKNIVEWLVDLGIEPSRRQAREDINNGAISMN---------- 382
D I+ KEMP + KE K + + S E+IN+ S+N
Sbjct: 77 DMIKLAPKEMPLESAEKEEKKSED------NKKSEEDHTEEINDKIYSLNYNELEVLAKN 130

Query: 383 GEKVTDVGTDVTVENSFDGRFIIIRKGKKNYS 414
GE + + +FI+I + KKN +
Sbjct: 131 GETIENF--VPKEGVKKADKFIVIERKKKNIN 160


94  BAS5288  BAS5301  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS5288   -2      14       -0.813629  TetR family transcriptional regulator
BAS5289   -2      14       -0.733755  AcrB/AcrD/AcrF family transporter
BAS5290   -2      16       -0.939761  bifunctional methionine sulfoxide reductase A/B
BAS5291   -1      15       -0.823925  hypothetical protein
BAS5293   1       14       0.477184   antiholin-like protein LrgB
BAS5294   0       11       0.560547   murein hydrolase regulator LrgA
BAS5295   0       10       0.864509   response regulator LytR
BAS5296   -1      9        1.214161   sensor histidine kinase LytS
BAS5297   1       9        0.931202   major facilitator family transporter protein
BAS5298   2       12       0.583893   BCCT family osmoprotectant transporter
BAS5299   2       12       -0.387581  nitric-oxide synthase, oxygenase subunit
BAS5300   4       13       -1.179672  superoxide dismutase
BAS5301   2       13       -2.375967  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS5288   HTHTETR  63     5e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 63.1 bits (153), Expect = 5e-14
Identities = 21/62 (33%), Positives = 38/62 (61%)

Query: 2 KEKERLIIEMAMKLFATKGVNATSVQEIVTACGISKGAFYLYFKSKEELLLATLRYYYDK 61
+E + I+++A++LF+ +GV++TS+ EI A G+++GA Y +FK K +L
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 62 IQ 63
I
Sbjct: 70 IG 71


ORFs having significant similarity with Known Virulence factors
LocusTagHitsScoreE-valueComments
BAS5289ACRIFLAVINRP6690.0 Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 669 bits (1728), Expect = 0.0
Identities = 240/1066 (22%), Positives = 459/1066 (43%), Gaps = 68/1066 (6%)

Query: 4 IINFSLKNKFAVWLLTIIVTIAGIYSGLNMKLETIPDITTPVVTVTTVYPGATPEEVADK 63
+ NF ++ W+L II+ +AG + L + + P I P V+V+ YPGA + V D
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 64 VSKPMEEQLQNLSGVNVVSSSSFQNASS-IQVEYDFDKNMEKAETEIKDALANVK--LPE 120
V++ +E+ + + + +SS+S S I + + + + A+ ++++ L LP+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 GVKDPKVSRVNF--NAFPVISLSVASKNESLATLTENVEKNVVPGLKGLDGVASVQISGQ 178
V+ +S + V + + +++ V NV L L+GV VQ+ G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 179 QVDEVQLVFKKDKMKELGLSEDTVKNVIKGSDVSLPLGLYTFKDT------EKSVVVDGN 232
Q +++ D + + L+ V N +K + + G S++
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 233 ITTMKALKELKIPAVPSSASSQGSQTAGAGAQMPQMNPAAMNGIPTVTLSEIADIKEVGK 292
+ ++ + +G V L ++A ++ G+
Sbjct: 240 FKNPEEFGKVTLRVNS-------------------------DGSV-VRLKDVARVELGGE 273

Query: 293 A-ESISRTNGKEAIGIQIVKAADANTVDVVNAVKDKVKELEKKY-KDLEIISTFDQGAPI 350
I+R NGK A G+ I A AN +D A+K K+ EL+ + + ++++ +D +
Sbjct: 274 NYNVIARINGKPAAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFV 333

Query: 351 EKSVETMLSKAIFGAIFAIVIIMLFLRNIRTTLISVVSIPLSLLIAVLVIKQMDITLNIM 410
+ S+ ++ + +++ LFL+N+R TLI +++P+ LL ++ ++N +
Sbjct: 334 QLSIHEVVKTLFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTL 393

Query: 411 TLGAMTVAIGRVVDDSIVVIENIYRRMSLSEEKLRGKDLIREATKEMFIPIMSSTIVTIA 470
T+ M +AIG +VDD+IVV+EN+ R M E+KL K+ ++ ++ ++ +V A
Sbjct: 394 TMFGMVLAIGLLVDDAIVVVENVERVM--MEDKLPPKEATEKSMSQIQGALVGIAMVLSA 451

Query: 471 VFLPLGLVKGMIGEMFLPFALTIVFALLASLLVAVTIVPMLAHSLFKKESMREKEVHH-- 528
VF+P+ G G ++ F++TIV A+ S+LVA+ + P L +L K S E
Sbjct: 452 VFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGF 511

Query: 529 ----EEKPSKLANIYKRILAWALNHKIITSSIAVLLLVGSLALVPIIGVSFLPSEEEKMI 584
N Y + L I L++ G + L + SFLP E++ +
Sbjct: 512 FGWFNTTFDHSVNHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVF 571

Query: 585 IATYNPEPGQTLEDVEKIATKAEKHFQDNKDVKTIQ--FSLGGENPMSPGQSNQAMFFVQ 642
+ G T E +K+ + ++ + ++ F++ G + Q N M FV
Sbjct: 572 LTMIQLPAGATQERTQKVLDQVTDYYL-KNEKANVESVFTVNGFSFSGQAQ-NAGMAFVS 629

Query: 643 YD--NDTKNFEKEKEQVVKDLQKMSGKGEWKN---------QDFGASGGSNEIKLYVYGD 691
+ E E V+ + GK + G + G + + G
Sbjct: 630 LKPWEERNGDENSAEAVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGL 689

Query: 692 SSEDIKPVVKDIQNIMKKN-KDLKDIDSSIAKTYAEYTLVADQEKLSKMGLTAAQIGMGL 750
+ + + + ++ L + + + A++ L DQEK +G++ + I +
Sbjct: 690 GHDALTQARNQLLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTI 749

Query: 751 SNQHDRPVLTTIKKDGKDVNVYVEAEKQTYETIDDLTNRKITTPLGNEVAVKDVMTVKEG 810
S + G+ +YV+A+ + +D+ + + G V T
Sbjct: 750 STALGGTYVNDFIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWV 809

Query: 811 ETSNTVKHRDGRVYAEVSAKLTSDDVSK-ASAAVQKEVDKMDLPSGVDVSMGGVTKDIEE 869
S ++ +G E+ + S A A ++ K LP+G+ G++
Sbjct: 810 YGSPRLERYNGLPSMEIQGEAAPGTSSGDAMALMENLASK--LPAGIGYDWTGMSYQERL 867

Query: 870 SFKQLGLAMLAAIAIVYFVLVVTFGGALAPFAILFSLPFTIIGALVALLISGETLSVSAM 929
S Q + + +V+ L + P +++ +P I+G L+A + + V M
Sbjct: 868 SGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFM 927

Query: 930 IGALMLIGIVVTNAIVLIDRVIH-KENEGLSTREALLEAGATRLRPILMTAIATIGALIP 988
+G L IG+ NAI++++ E EG EA L A RLRPILMT++A I ++P
Sbjct: 928 VGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLP 987

Query: 989 LALGFEGSGLISKGLGVTVIGGLTSSTLLTLLIVPIVYEVLSKFKK 1034
LA+ +G+ V+GG+ S+TLL + VP+ + V+ + K
Sbjct: 988 LAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRCFK 1033



Score = 93.4 bits (232), Expect = 2e-21
Identities = 93/518 (17%), Positives = 198/518 (38%), Gaps = 46/518 (8%)

Query: 546 ALNHKIITSSIAVLLLVGSLALVPIIGVSFLPSEEEKM--IIATYNPEPGQTLEDVE-KI 602
+ I +A++L++ + + V+ P+ + A Y PG + V+ +
Sbjct: 5 FIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANY---PGADAQTVQDTV 61

Query: 603 ATKAEKHFQDNKDVKTIQFSLGGENPMSPGQSNQAMFFVQYDNDTKNFEKEKEQVVKDLQ 662
E++ ++ + S S ++ + + + QV LQ
Sbjct: 62 TQVIEQNMNGIDNLMYMS---------STSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQ 112

Query: 663 KMSGK--GEWKNQDFGASGGSNEIKLYVYGDSSEDIKPVVKDIQNIMKKN--KDLKDID- 717
+ E + Q S+ L V G S++ DI + + N L ++
Sbjct: 113 LATPLLPQEVQQQGISVEKSSSSY-LMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNG 171

Query: 718 ----SSIAKTYAEYTLVADQEKLSKMGLTAAQIGMGLSNQHDR----PVLTTIKKDGKDV 769
YA + D + L+K LT + L Q+D+ + T G+ +
Sbjct: 172 VGDVQLFGAQYA-MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQL 230

Query: 770 NVYVEAEKQTYETIDDLTNRKI-TTPLGNEVAVKDVMTVKEG--ETSNTVKHRDGRVYAE 826
N + A+ + ++ + G+ V +KDV V+ G + +
Sbjct: 231 NASIIAQTRFK-NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGL 289

Query: 827 VSAKLTSDDVSKASAAVQKEVDKM--DLPSGVDVSMGGVTKD----IEESFKQLGLAMLA 880
T + + A++ ++ ++ P G+ V D ++ S ++ +
Sbjct: 290 GIKLATGANALDTAKAIKAKLAELQPFFPQGMKVL---YPYDTTPFVQLSIHEVVKTLFE 346

Query: 881 AIAIVYFVLVVTFGGALAPFAILFSLPFTIIGALVALLISGETLSVSAMIGALMLIGIVV 940
AI +V+ V+ + A ++P ++G L G +++ M G ++ IG++V
Sbjct: 347 AIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLV 406

Query: 941 TNAIVLIDRVI-HKENEGLSTREALLEAGATRLRPILMTAIATIGALIPLALGFEGS-GL 998
+AIV+++ V + L +EA ++ + ++ A+ IP+A F GS G
Sbjct: 407 DDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAF-FGGSTGA 465

Query: 999 ISKGLGVTVIGGLTSSTLLTLLIVPIVYEVLSKFKKKK 1036
I + +T++ + S L+ L++ P + L K +
Sbjct: 466 IYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAE 503


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS5295   HTHFIS  65     3e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 65.2 bits (159), Expect = 3e-14
Identities = 30/126 (23%), Positives = 57/126 (45%), Gaps = 6/126 (4%)

Query: 3 KVLVVDDEMLARDELKYLLERTK-EVEIIGEADCVEDALEELMKNKPDIVFLDIQLSDDN 61
+LV DD+ R L L R +V I A + D+V D+ + D+N
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNA---ATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GFEIANILKKMKNPPAIVFATAYDQY--ALQAFEVDALDYILKPFDEERIVQTLKKYKKQ 119
F++ +KK + ++ +A + + A++A E A DY+ KPFD ++ + + +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 120 KQSQIE 125
+ +
Sbjct: 122 PKRRPS 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS5296   PF06580  229    3e-72    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 229 bits (586), Expect = 3e-72
Identities = 65/216 (30%), Positives = 111/216 (51%), Gaps = 13/216 (6%)

Query: 359 QLELGEAELQSKLLQDAEIKALQAQINPHFLFNAINTVSALCRTDVEKARKLLLQLSVYF 418
Q E+ + ++ + Q+A++ AL+AQINPHF+FNA+N + AL D KAR++L LS
Sbjct: 146 QAEIDQWKMA-SMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTSLSELM 204

Query: 419 RCNLQGARQLLIPLEQELNHVQAYLSLEQARFPNKYEVKMYIEDELKTTLVPPFVLQLLV 478
R +L+ + + L EL V +YL L +F ++ + + I + VPP ++Q LV
Sbjct: 205 RYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLV 264

Query: 479 ENALRHAFPKKQPVCEVEVHVFEKEGMVHFEVKDNGQGIEEERLEQLGKMVVSSKKGTGT 538
EN ++H + ++ + + G V EV++ G + +K+ TGT
Sbjct: 265 ENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN-----------TKESTGT 313

Query: 539 ALYNINERLIGLFGKETMLHIESEVNEGTEITFVIP 574
L N+ ERL L+G E + + + + +IP
Sbjct: 314 GLQNVRERLQMLYGTEAQIKLSEKQGKVNA-MVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS5297   TCRTETB  54     5e-10    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 54.1 bits (130), Expect = 5e-10
Identities = 81/411 (19%), Positives = 148/411 (36%), Gaps = 28/411 (6%)

Query: 34 LDMLLLSFVLVYILKEFHLSPVEGGNLTLATTIGMLIGSYLFGFIADLFGRIRTMAFTIL 93
L+ ++L+ L I +F+ P + A + IG+ ++G ++D G R + F I+
Sbjct: 28 LNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGII 87

Query: 94 LFSLATALIYFATDYWQLLIL-RFLVGMGVGGEFGIGMAIVTETWSKEMRAKATSVVALG 152
+ + + + ++ LLI+ RF+ G G + M +V KE R KA ++
Sbjct: 88 INCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSI 147

Query: 153 WQFGVLIASLLPAFIVPHFGWRAVFLFGLIPALLAVYVRKSLSEPKIWEQKQRYKKELLQ 212
G + + I + W + L +I + ++ K L + + K +L
Sbjct: 148 VAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILM 207

Query: 213 KEAEGN--LTTTEAA-----------------QLKQMKKFPLRKLFANKKVTITTIGLII 253
L TT + K F L N I ++
Sbjct: 208 SVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMI----GVL 263

Query: 254 MSFIQNFGYYGIFTWMPTILANKYNYTLAKA-SGWMFISTIGMLIGIATFGILADKIGRR 312
I G + +P ++ + + + A+ S +F T+ ++I GIL D+ G
Sbjct: 264 CGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPL 323

Query: 313 KTFTIYYVGGTIYCLIY-FFLFTDSTLLLWG-SALLGFFANGMMGGFGAVLAENYPAEAR 370
I ++ L F L T S + +LG + V + EA
Sbjct: 324 YVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAG 383

Query: 371 STAENFIFGTGRGLAGFGPVIIGLLAAGGNLMGALSLIFIIYPIGLVTMLL 421
+ F + G G I+G L + L L + + L + LL
Sbjct: 384 AGMSLLNFTSFLS-EGTGIAIVGGLLSIPLLDQRLLPMEVDQSTYLYSNLL 433


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
BAS5301   NUCEPIMERASE  36     1e-04    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 35.9 bits (83), Expect = 1e-04
Identities = 12/26 (46%), Positives = 15/26 (57%)

Query: 3 KVLVLGGTRFFGKHLVEALLKDGHDV 28
K LV G F G H+ + LL+ GH V
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQV 27


95  BAS5314  BAS5320  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
BAS5314   -3      11       -0.072766  serine protease
BAS5315   -2      10       0.029200   metallo-beta-lactamase family protein
BAS5316   -1      11       0.764675   hypothetical protein
BAS5317   0       12       1.152062   hypothetical protein
BAS5318   -1      11       1.339021   sensory box histidine kinase YycG
BAS5319   -1      11       1.236003   DNA-binding response regulator YycF
BAS5320   -1      12       1.123516   ****adenylosuccinate synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
BAS5314   V8PROTEASE  58     2e-11    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 58.1 bits (140), Expect = 2e-11
Identities = 34/179 (18%), Positives = 64/179 (35%), Gaps = 40/179 (22%)

Query: 95 SEADSEAGTGSG-VIYKKTNDQAYIVTNNHVVAGANRIEVSLS------------DGKKV 141
EA + SG V+ K T ++TN HVV + +L +G
Sbjct: 95 VEAPTGTFIASGVVVGKDT-----LLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFT 149

Query: 142 PGKVLGTDVVTDLAVLEIDA----KHVKKVIE---IGDSNAVRRGEPVIAIGNPLGLQFS 194
++ DLA+++ KH+ +V++ + ++ + + + G P +
Sbjct: 150 AEQITKYSGEGDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKPVA 209

Query: 195 GTVTQGIISANERIVPVDLDQDGHYDWQVEVLQTDAAINPGNSGGALVNAAGQLIGINS 253
T + E +Q D + GNSG + N ++IGI+
Sbjct: 210 ---TMWESKGKITYLKG------------EAMQYDLSTTGGNSGSPVFNEKNEVIGIHW 253


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
BAS5318   PF06580  38     1e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 37.5 bits (87), Expect = 1e-04
Identities = 17/104 (16%), Positives = 35/104 (33%), Gaps = 25/104 (24%)

Query: 502 VLYNIISNALKY----SPEGGTVTYRLRDRGELLEISVSDQGMGIPKENVDKIFERFYRV 557
++ ++ N +K+ P+GG + + + + V + G K
Sbjct: 259 LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN------------ 306

Query: 558 DKARSRQMGGTGLGLAIAKEMIEAHGG---SIWAKSEEGKGTTI 598
TG GL +E ++ G I ++GK +
Sbjct: 307 ------TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM 344


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
BAS5319   HTHFIS  95     1e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.9 bits (236), Expect = 1e-24
Identities = 31/140 (22%), Positives = 69/140 (49%), Gaps = 4/140 (2%)

Query: 1 MMGKKILVVDDEKPIADILKFNLEKEGFEIVMAHDGDEAIEKATEEQPDMVLLDIMLPGK 60
M G ILV DD+ I +L L + G+++ + + D+V+ D+++P +
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGLEVCREIRK-SSEMPIIMLTAKDSEIDKVLGLELGADDYVTKPFS---TRELLARVKA 116
+ ++ I+K ++P+++++A+++ + + E GA DY+ KPF ++ R A
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 117 NLRRHQQGGAAEKEENTEMV 136
+R + ++ +V
Sbjct: 121 EPKRRPSKLEDDSQDGMPLV 140


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
BAS5320   HELNAPAPROT  28     0.040    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 28.3 bits (63), Expect = 0.040
Identities = 9/48 (18%), Positives = 20/48 (41%), Gaps = 7/48 (14%)

Query: 152 DREAFKEKLEQNLAQKNRLFEK-------MYDTEGFSVDEIFEEYFEY 192
++ + L L+ L+ K + F++ E FEE +++
Sbjct: 9 NQTLVENSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDH 56



 
Contact Sachin Pundhir for Bugs/Comments.