PredictBias

identification of genomic and pathogenicity islands in prokaryotic genome
 
A) Input parameters
Genome                       NC_008600.gbk
Threshold dinucleotide bias  2
Threshold codon bias         4
Threshold %GC bias           3
E-value (RPSBlast)           0.05
Genome (non-pathogenic)
 
C) Potential GIs and PAIs in NC_008600 (download)
S.No  Start      End        Bias  Virulence  Insertion elements  Prediction
1     BALH_0472  BALH_0490  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0472  3       17       -3.034146  ***************hypothetical protein
BALH_0473  7       22       -5.302098  MutT/NUDIX family protein
BALH_0474  7       22       -7.811199  hypothetical protein
BALH_0475  7       21       -7.924766  hypothetical protein
BALH_0476  8       23       -8.174552  hypothetical protein
BALH_0477  5       21       -7.272662  hypothetical protein
BALH_0478  4       18       -7.342694  nitroreductase
BALH_0479  0       14       -6.654734  ABC transporter ATP-binding and permease
BALH_0480  -1      13       -3.401289  hypothetical protein
BALH_0481  -2      12       -2.417643  hypothetical protein
BALH_0482  -2      12       -1.913005  penicillin-binding protein 1A
BALH_0483  -2      13       -3.107909  thiol-disulfide oxidoreductase
BALH_0484  -1      12       -2.068477  *transposase
BALH_0485  0       14       -1.003666  iron-sulfur cluster-binding protein
BALH_0486  2       13       -2.544856  hypothetical protein
BALH_0487  2       13       -2.799944  RNA methyltransferase
BALH_0488  2       12       -2.341694  PAS/PAC sensor-containing diguanylate
BALH_0489  0       9        -1.446412  serine protein kinase
BALH_0490  2       11       -2.095103  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0482  TCRTETA  28     0.044    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 27.9 bits (62), Expect = 0.044
Identities = 11/51 (21%), Positives = 21/51 (41%), Gaps = 4/51 (7%)

Query: 1 MFKMKERVLSRVNYHQKVALNPMTKFFY----KAIILLLLLSFTLLFIGNV 47
F + E ++ ALNP+ F + + L+ + F + +G V
Sbjct: 178 CFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQV 228
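
The two statistics lines of each alignment follow the usual BLAST text layout, so they can be checked mechanically. The block below is a minimal Python sketch and is not part of PredictBias: the function name `parse_hsp` and the keys it returns are hypothetical. It assumes that the printed percentages are truncated integer ratios (which matches the values on this page) and that a hit passes when its Expect value is at or below the E-value (RPSBlast) input parameter of 0.05.

```python
import re

# Patterns for the "Score = ..." and "Identities = ..." lines of a BLAST text alignment.
SCORE_RE = re.compile(r"Score = ([\d.]+) bits \((\d+)\), Expect = (\S+)")
COUNT_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_hsp(score_line, stats_line, evalue_cutoff=0.05):
    """Parse one alignment's statistics (hypothetical helper, illustration only)."""
    match = SCORE_RE.search(score_line)
    if match is None:
        raise ValueError("not a BLAST score line: %r" % score_line)
    result = {
        "bits": float(match.group(1)),
        "raw_score": int(match.group(2)),
        "evalue": float(match.group(3)),
    }
    # Assumption: a hit is reported when Expect <= cutoff.
    result["passes_cutoff"] = result["evalue"] <= evalue_cutoff
    for name, num, den, pct in COUNT_RE.findall(stats_line):
        key = name.lower()
        num, den = int(num), int(den)
        result[key] = (num, den)
        # Assumption: the printed percentage is the truncated integer ratio.
        result[key + "_pct_ok"] = (100 * num // den) == int(pct)
    return result


# Usage with the BALH_0482 / TCRTETA alignment shown above.
hsp = parse_hsp(
    "Score = 27.9 bits (62), Expect = 0.044",
    "Identities = 11/51 (21%), Positives = 21/51 (41%), Gaps = 4/51 (7%)",
)
print(hsp["evalue"], hsp["passes_cutoff"], hsp["identities"])
```

The "Gaps" field is optional in the pattern because ungapped alignments on this page omit it; in that case the result simply has no `gaps` key.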


2     BALH_0794  BALH_0810  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0794  3       21       -0.393029  hypothetical protein
BALH_0795  3       21       -0.394893  preprotein translocase subunit SecA
BALH_0797  2       22       -0.460988  polysaccharide biosynthesis protein
BALH_0799  2       27       -0.461748  pyruvyl-transferase
BALH_0800  3       28       -0.170932  S-layer protein
BALH_0801  -1      24       0.252049   EA1 protein, S-layer protein
BALH_0802  0       23       0.147425   hypothetical protein
BALH_0803  1       18       0.693313   alginate O-acetyltransferase
BALH_0804  -1      13       1.150358   hypothetical protein
BALH_0805  0       15       2.409320   enoyl-CoA hydratase
BALH_0806  -1      16       2.348901   hypothetical protein
BALH_0808  -2      14       2.624969   hypothetical protein
BALH_0809  -2      14       3.268509   M42 family deblocking aminopeptidase
BALH_0810  -3      15       3.057236   N-acetylmuramoyl-L-alanine amidase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
BALH_0795  SECA  899    0.0      SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 899 bits (2325), Expect = 0.0
Identities = 354/829 (42%), Positives = 507/829 (61%), Gaps = 52/829 (6%)

Query: 1 MLNSVKKLLGDSQKRKLKKYEQLVQEINNLEEKLSDLSDEELRHKTITFKDMLRDGKTVD 60
++ + K+ G R L++ ++V IN +E ++ LSDEEL+ KT F+ L G+ ++
Sbjct: 2 LIKLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGEVLE 61

Query: 61 DIKVEAFAVVREAAKRVLGLRHYDVQLIGGLVLLEGNIAEMPTGEGKTLVSSLPTYVRAL 120
++ EAFAVVREA+KRV G+RH+DVQL+GG+VL E IAEM TGEGKTL ++LP Y+ AL
Sbjct: 62 NLIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNAL 121

Query: 121 EGKGVHVITVNDYLAKRDKELIGQVHEFLGLKVGLNIPQIDPFEKKLAYEADITYGIGTE 180
GKGVHV+TVNDYLA+RD E + EFLGL VG+N+P + K+ AY ADITYG E
Sbjct: 122 TGKGVHVVTVNDYLAQRDAENNRPLFEFLGLTVGINLPGMPAPAKREAYAADITYGTNNE 181

Query: 181 FGFDYLRDNMAASKNEQVQRPYHFAIIDEIDSVLIDEAKTPLIIAGKKSSSSDLHYLCAK 240
+GFDYLRDNMA S E+VQR H+A++DE+DS+LIDEA+TPLII+G SS+++ K
Sbjct: 182 YGFDYLRDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYKRVNK 241

Query: 241 VIKS-----------FQDTLHYTYDAESKSASFTEDGIIKIEDLFDI-------DNLYDL 282
+I FQ H++ D +S+ + TE G++ IE+L ++LY
Sbjct: 242 IIPHLIRQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYSP 301

Query: 283 EHQTLYHYMIQALRAHVAFQCDVDYIVHDEKILLVDIFTGRVMDGRSLSDGLHQALEAKE 342
+ L H++ ALRAH F DVDYIV D ++++VD TGR M GR SDGLHQA+EAKE
Sbjct: 302 ANIMLMHHVTAALRAHALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAKE 361

Query: 343 GLEITEENQTQASITIQNFFRMYPALSGMTGTAKTEEKEFNRVYNMEVIPIPTNRPIIRE 402
G++I ENQT ASIT QN+FR+Y L+GMTGTA TE EF+ +Y ++ + +PTNRP+IR+
Sbjct: 362 GVQIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMIRK 421

Query: 403 DKKDVVYVTADAKYKAVREDVLKHNKQGRPILIGTMSILQSETVARYLDEANITYQLLNA 462
D D+VY+T K +A+ ED+ + +G+P+L+GT+SI +SE V+ L +A I + +LNA
Sbjct: 422 DLPDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLNA 481

Query: 463 KSAEQEADLIATAGQKGQITIATNMAGRGTDILLG------------------------- 497
K EA ++A AG +TIATNMAGRGTDI+LG
Sbjct: 482 KFHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKADW 541

Query: 498 ----EGVHELGGLHVIGTERHESRRVDNQLKGRAGRQGDPGSSQFFLSLEDEMLKRFAQE 553
+ V E GGLH+IGTERHESRR+DNQL+GR+GRQGD GSS+F+LS+ED +++ FA +
Sbjct: 542 QVRHDAVLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMRIFASD 601

Query: 554 EVEKLTKSLKTDETGLILTAKVHDFVNRTQLICEGSHFSMREYNLKLDDVINDQRNVIYK 613
V + + L I V + Q E +F +R+ L+ DDV NDQR IY
Sbjct: 602 RVSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVANDQRRAIYS 661

Query: 614 LRNNLLQEDTNMIEIIIPMIDHAVEAISKQYLVEGMLPEEWDFASLTASLNEI--LSVEN 671
RN LL + +++ E I + + +A Y+ L E WD L L L +
Sbjct: 662 QRNELL-DVSDVSETINSIREDVFKATIDAYIPPQSLEEMWDIPGLQERLKNDFDLDLPI 720

Query: 672 MPSLSANNVHSPEDLQS-VLKETLSLYKERVNELDSHTDLQQSLRYVALHFLDQNWVNHL 730
L E L+ +L +++ +Y+ + + + ++ + V L LD W HL
Sbjct: 721 AEWLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEM-MRHFEKGVMLQTLDSLWKEHL 779

Query: 731 DAMTHLKEGIGLRQYQQEDPTRLYQKEALDIFLYTYGNFEKEMCRYVAR 779
AM +L++GI LR Y Q+DP + Y++E+ +F + + E+ +++
Sbjct: 780 AAMDYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSK 828


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0800  INTIMIN  40     5e-05    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 39.7 bits (92), Expect = 5e-05
Identities = 50/239 (20%), Positives = 79/239 (33%), Gaps = 27/239 (11%)

Query: 569 YKVAATKGFVKDAAGNESAAFTKEVKVVEKKEEGKKDEVAPKATKVERVADSKTKFTVTF 628
YKV A D GN S + V+ + + V A + +T+
Sbjct: 525 YKVTAR---AYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTS-AKADGTEAITY 580

Query: 629 DKEVKGGQGADSASNVNNYTLAGAKLPEGTLIVVNADGKSVTIELPETFTFEKSETVKFT 688
VK A + V+ ++G + N GK T+ L KS+
Sbjct: 581 TATVKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGK-ATVTL-------KSDKPGQV 632

Query: 689 VANVANKDGVKMGTTNLLVNVVDTKAP--EFKSAKITKVDAKEITLTFSEAVNIDTNDFV 746
V + + N ++ V TKA E K+ K T V + +T++ V + D
Sbjct: 633 VVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYT--VKVMKGDKP 690

Query: 747 I-------DLNGVALEVTKADATAKASKEVVLKVTAPADV----NLATGTVTVKAKEVE 794
+ L + +V L T P ++ V VKA EVE
Sbjct: 691 VSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVE 749


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0801  INTIMIN  33     0.008    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 32.7 bits (74), Expect = 0.008
Identities = 37/226 (16%), Positives = 71/226 (31%), Gaps = 19/226 (8%)

Query: 368 SHGDYKVEVQVTKRGGLTVSNTGIISVKNLDTPASAI----KSTAFAVDADKNGVVYGNK 423
G V ++ K G + VS L+ A K++ + ADK V +
Sbjct: 616 GSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQ 675

Query: 424 --LTGKDFKLNSQTLVVGEKAQIHNVVATIAGEDKVVDPNSISIKSSNHGIISVVNNYIT 481
+T + V ++ + ++ + K+ +G V +T
Sbjct: 676 DAITYTVKVMKGDKPVSNQEVTFTTTLGK---------LSNSTEKTDTNGYAKVT---LT 723

Query: 482 AEAAGEATLTIKVGDVTKDVKFKVTTDSRKLVSVKANPDKLQVVQNKELPVTFVTTDQYG 541
+ G++ ++ +V DV DVK L N + + +LP ++ Q
Sbjct: 724 STTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVN 783

Query: 542 DPFGANTAAIKEVLPKTGVVAEGGLDVVTTDSGSIGTKTLGVTGND 587
+ + T GT T+ V +D
Sbjct: 784 LKASGGNGKYTWRSANPAIASVDASSGQVTLKEK-GTTTISVISSD 828


3     BALH_0820  BALH_0856  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0820  2       15       -2.047387  oligopeptide ABC transporter ATP-binding
BALH_0821  1       13       -2.571723  hypothetical protein
BALH_0822  1       13       -2.726454  ATP/GTP binding protein
BALH_0823  1       13       -2.385246  GTPase
BALH_0824  -3      10       0.017767   hypothetical protein
BALH_0825  -2      9        0.305022   hypothetical protein
BALH_0827  -3      8        0.772346   hypothetical protein
BALH_0828  -1      20       3.176507   hypothetical protein
BALH_0829  -1      18       2.213594   hypothetical protein
BALH_0830  -2      13       1.133195   major facilitator superfamily sugar transporter
BALH_0831  0       14       -0.512320  sensor histidine kinase
BALH_0832  2       15       -0.801921  two-component response regulator
BALH_0833  2       15       -1.249774  hypothetical protein
BALH_0834  3       15       -1.678537  hypothetical protein
BALH_0835  2       14       -1.642443  hypothetical protein
BALH_0836  2       15       -1.951503  hypothetical protein
BALH_0837  2       14       -2.117135  hypothetical protein
BALH_0839  2       17       -1.191188  5-methylcytosine-specific restriction-like
BALH_0840  2       19       0.181824   hypothetical protein
BALH_0841  2       23       0.073742   hypothetical protein
BALH_0842  1       22       0.281505   hypothetical protein
BALH_0843  1       21       0.777266   type II restriction endonuclease
BALH_0844  -2      26       3.161976   DnaD domain-containing protein
BALH_0845  -1      18       1.881925   replicative DNA helicase
BALH_0846  0       13       0.007232   transcriptional regulator
BALH_0847  0       14       -0.206494  hypothetical protein
BALH_0848  -1      17       -0.729926  TetR family transcriptional regulator
BALH_0851  0       16       -0.005365  lincomycin resistance protein
BALH_0852  6       18       -1.160230  hypothetical protein
BALH_0853  6       22       -1.689417  PadR family transcriptional regulator
BALH_0854  3       20       -1.478114  transposase domain-containing protein
BALH_0855  4       20       -1.455681  hypothetical protein
BALH_0856  3       20       -0.765669  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0827  PF00577  36     0.001    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 35.6 bits (82), Expect = 0.001
Identities = 36/156 (23%), Positives = 57/156 (36%), Gaps = 22/156 (14%)

Query: 214 QVEEMQSGYEKLYSQLVPFAQCDFNFNINDSQAITDSLTLGFNETLNESLTQTQSFT--- 270
QV Q+GY +Y+ VP F IND A +S L T+ E+ TQ FT
Sbjct: 310 QVTIKQNGY-DIYNSTVPPG----PFTINDIYAAGNSGDLQV--TIKEADGSTQIFTVPY 362

Query: 271 --NGNSETKGSSE---TKGKTRSVAGV--APLVGA-----GIGAVFAGPMGAAFGGSIGS 318
+ +G + T G+ RS P G+ A + G +
Sbjct: 363 SSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRA 422

Query: 319 AAASMMGSTSESKASNESVTESHSKTEGTSSTTGKS 354
+ + A + +T+++S S G+S
Sbjct: 423 FNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQS 458


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0830  TCRTETA  36     4e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 35.6 bits (82), Expect = 4e-04
Identities = 30/129 (23%), Positives = 57/129 (44%), Gaps = 16/129 (12%)

Query: 42 EFFPKGDPTSQLLNTAAIFAVGFLMRPIGSLLMGRYADRHGRRAALTLSITVMAGGSFII 101
+ D T+ A++A LM+ + ++G +DR GRR L +S+ A I+
Sbjct: 34 DLVHSNDVTAHYGILLALYA---LMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIM 90

Query: 102 ACTPSYESIGIMAPIILVLARLLQGLSLGGEYGTSATYLSEMASSGRR----GFYSSFQY 157
A P +L + R++ G++ G + Y++++ R GF S+
Sbjct: 91 ATAPFLW--------VLYIGRIVAGIT-GATGAVAGAYIADITDGDERARHFGFMSACFG 141

Query: 158 VTLVAGQMV 166
+VAG ++
Sbjct: 142 FGMVAGPVL 150



Score = 29.0 bits (65), Expect = 0.044
Identities = 21/82 (25%), Positives = 39/82 (47%), Gaps = 11/82 (13%)

Query: 287 VVLQPIAGLLSDKIGRRPLLMAFGILGTLLTAPIFFFMEKTTEPIVAFLLMMVGLII--V 344
P+ G LSD+ GRRP+L+ +L A + + + T ++ +G I+ +
Sbjct: 57 FACAPVLGALSDRFGRRPVLLV-----SLAGAAVDYAIMATAP---FLWVLYIGRIVAGI 108

Query: 345 TGYT-SINAIVKAELFPTEIRA 365
TG T ++ A++ + RA
Sbjct: 109 TGATGAVAGAYIADITDGDERA 130


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0831  PF06580  37     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 2e-04
Identities = 22/131 (16%), Positives = 51/131 (38%), Gaps = 24/131 (18%)

Query: 397 EKKIDFHIEGDSALHPLPDHIKVSHLITILGNIIDNAFD-AVSGQEEK-SVSFFVTDIGH 454
E ++ F + + A+ ++V ++ + +++N ++ + + T
Sbjct: 237 EDRLQFENQINPAIM----DVQVPPML--VQTLVENGIKHGIAQLPQGGKILLKGTKDNG 290

Query: 455 DIVFEVIDSGAGIPAEKITTIFQKGFSTKGNDRGYGLANVKEMVDLL---EGTIEIQNEK 511
+ EV ++G+ G GL NV+E + +L E I++ +EK
Sbjct: 291 TVTLEVENTGSL------------ALKNTKESTGTGLQNVRERLQMLYGTEAQIKL-SEK 337

Query: 512 NGGAIFTIYLP 522
G + +P
Sbjct: 338 QGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_0832  HTHFIS  62     3e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.8 bits (150), Expect = 3e-13
Identities = 37/142 (26%), Positives = 63/142 (44%), Gaps = 4/142 (2%)

Query: 3 KVAIAEDDFRVAQIQEEFLSKIK-DVKVIGKALNAKETIELLQKEEIDLLLLDNYLPDGI 61
+ +A+DD + + + LS+ DV++ NA + + DL++ D +PD
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITS---NAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GTDLLPKIHADFPDVDVIMVTAANENHMLEKAIRNGVSNYLIKPVTLEKFVRTIEDYKRK 121
DLLP+I PD+ V++++A N KA G +YL KP L + + I +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 122 KQLLHSNNEVNQALIDNFFGIS 143
+ S E + G S
Sbjct: 122 PKRRPSKLEDDSQDGMPLVGRS 143


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_0844  FbpA_PF05833  28     0.041    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 28.3 bits (63), Expect = 0.041
Identities = 10/58 (17%), Positives = 22/58 (37%)

Query: 152 CEETEEEEVIEEETGSRVFSFYEQHFGSLSPHTVEELSAWMEDLSEELVLKALQIAFE 209
E +E + + + + F +S E+ +++ S +L L L+ E
Sbjct: 176 FSYDMIENFTKENSLQLNDNIFSKIFTGVSKTLSSEICFRLKNNSIDLSLSNLKEIVE 233


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0848  HTHTETR  42     3e-07    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 42.3 bits (99), Expect = 3e-07
Identities = 30/192 (15%), Positives = 58/192 (30%), Gaps = 28/192 (14%)

Query: 5 EKQLDLRIRRTHKLLWDSLFELMTQSKQKYSTITVNQICDRAMVHRTTFYKHFEDKDALL 64
++ + T + + D L S+Q S+ ++ +I A V R Y HF+DK L
Sbjct: 2 ARKTKQEAQETRQHILDVALRLF--SQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLF 59

Query: 65 AFGFKRYSKMIVEIP--------------VSDRLSKPFQVMEQFLHHEEIGKILETQM-- 108
+ ++ I E+ + + L + + +I+ +
Sbjct: 60 SEIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 109 -SDEQFINRTQYLSHEMRKQEIEAL-------HQLRKNHTMPNDLIIEFYSGAINSLSAW 160
+ + + Q IE L M I G I+ L
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPA-DLMTRRAAIIMR-GYISGLMEN 177

Query: 161 WFKNERKVSAAE 172
W + +
Sbjct: 178 WLFAPQSFDLKK 189


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0851  TCRTETB  140    1e-38    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 140 bits (353), Expect = 1e-38
Identities = 92/414 (22%), Positives = 187/414 (45%), Gaps = 15/414 (3%)

Query: 16 SNLKHTPILIALLLGAMVALLNETLLGNALTVLMKEFGVTASTIQWLSTAYMLVVGVLVP 75
SNL+H ILI L + + ++LNE +L +L + +F ++ W++TA+ML +
Sbjct: 8 SNLRHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTA 67

Query: 76 ITALLQQWLTTRQMFLIAMVTFLVGTLIAGFAPT-FSVLLVGRIVQAVATGLITPLLMNT 134
+ L L +++ L ++ G++I + FS+L++ R +Q L+M
Sbjct: 68 VYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 135 ILIICPPEKRGATMGLIALVMMAAPAIGPTLSGVIVDSLNWRWLFYIVIPVVIISIMIGM 194
+ P E RG GLI ++ +GP + G+I ++W +L +IP++ I + +
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL--LIPMITIITVPFL 185

Query: 195 KYIQNVSELTRPKVDYPSILLSTLGFGGLVYSFSASGDLGWSDAKVYGTLLVGLISLCIF 254
+ + D I+L ++G + + + L+V ++S IF
Sbjct: 186 MKLLKKEVRIKGHFDIKGIILMSVGIVFFMLF---------TTSYSISFLIVSVLSFLIF 236

Query: 255 VARQLKIENPILELRAFKVPMFTLSVGLIVIVMMSLFSTMTLLPMFLQTVLLVTAFKSG- 313
V K+ +P ++ K F + V I+ ++ ++++P ++ V ++ + G
Sbjct: 237 VKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGS 296

Query: 314 IIMLPGSVISAVMGPIAGKLFDKFSPKVIIVPGIVLVGIAMFLFKGITPDTSIVQIIVMH 373
+I+ PG++ + G I G L D+ P ++ G+ + ++ FL +T+ + ++
Sbjct: 297 VIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVS-FLTASFLLETTSWFMTIII 355

Query: 374 SVLMVGLMFVMT-AQTYGLNQLTPNLYPHGTALFNTLQQVAGAIGTAIFISKMS 426
++ GL F T T + L G +L N ++ G AI +S
Sbjct: 356 VFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


4     BALH_0870  BALH_0914  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0870  2       17       1.849129   DhaKLM operon coactivator DhaQ
BALH_0871  2       12       -0.019283  TetR family transcriptional regulator
BALH_0872  0       10       -1.079221  dihydroxyacetone kinase
BALH_0873  3       17       -4.201155  acyl carrier protein phosphodiesterase
BALH_0874  1       16       -2.230717  BadM/Rrf2 family transcriptional regulator
BALH_0875  0       16       -2.294088  dihydroxyacetone kinase
BALH_0876  0       16       -2.612946  response regulator
BALH_0877  -1      15       -2.311153  HD domain-containing protein
BALH_0878  -1      15       -0.530375  hypothetical protein
BALH_0879  -3      15       -0.594826  S-layer protein
BALH_0880  2       19       -1.572437  PadR family transcriptional regulator
BALH_0881  3       19       -1.598104  hypothetical protein
BALH_0882  4       20       -2.564641  lipoprotein
BALH_0883  4       20       -2.299787  hypothetical protein
BALH_0884  3       20       -1.621237  hypothetical protein
BALH_0885  2       19       -1.740260  hypothetical protein
BALH_0886  1       18       -2.872396  hypothetical protein
BALH_0887  1       16       -3.118322  anti-sigma B factor antagonist
BALH_0888  0       12       -3.161574  serine-protein kinase RsbW
BALH_0889  0       12       -3.387318  RNA polymerase sigma factor SigB
BALH_0890  0       13       -3.381036  hypothetical protein
BALH_0891  -1      12       -3.536455  sigma factor sigB regulation protein
BALH_0892  -2      13       -3.014697  chemotaxis protein methyltransferase
BALH_0893  -2      13       -1.264114  sensor histidine kinase
BALH_0894  0       12       -0.083657  hypothetical protein
BALH_0895  -1      13       -0.536487  hypothetical protein
BALH_0896  0       15       -0.696985  hypothetical protein
BALH_0897  0       16       0.984442   hypothetical protein
BALH_0898  4       19       1.804559   NADPH:quinone reductase (quinone
BALH_0899  5       17       1.306365   sensor histidine kinase
BALH_0900  6       16       2.114126   DNA-binding response regulator
BALH_0901  6       16       2.222276   hypothetical protein
BALH_0903  5       16       1.758500   DNA repair exonuclease
BALH_0904  4       13       0.849795   hypothetical protein
BALH_0905  0       9        -0.497477  3'-5' exoribonuclease YhaM
BALH_0906  2       12       -0.474201  hypothetical protein
BALH_0907  1       14       -2.014710  hypothetical protein
BALH_0908  -1      13       -0.662003  hypothetical protein
BALH_0909  -1      13       0.924980   hypothetical protein
BALH_0910  -1      11       0.824655   glyoxalase
BALH_0911  -2      13       0.710847   glyoxalase
BALH_0912  -1      17       1.223513   response regulator aspartate phosphatase
BALH_0913  -1      14       3.536296   hypothetical protein
BALH_0914  -1      18       3.623499   alpha/beta hydrolase fold family lipase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0871  HTHTETR  41     1e-06    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 40.8 bits (95), Expect = 1e-06
Identities = 26/179 (14%), Positives = 55/179 (30%), Gaps = 29/179 (16%)

Query: 11 KKIIANSLKYLMETESFHKISVSDIMLHCQMRRQTFYYHFKDKFELLSWIYREETK---E 67
+ I+ +L+ L + S+ +I + R Y+HFKDK +L S I+ E
Sbjct: 14 QHILDVALR-LFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGE 72

Query: 68 NIIDFLD------YETWENIFDLLFDYFYEN-------------QKFYRNAFKVIE-QNS 107
+++ I + + +F V + Q +
Sbjct: 73 LELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRN 132

Query: 108 FNHYLFEHTKNLYMKIIDELSVSCGFSLSDETKNTIASFYSHGFVGTIKDWIESKCEVD 166
++ + I+ +D A G +++W+ + D
Sbjct: 133 LCLESYDRIEQTLKHCIEA-----KMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFD 186


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_0891  HTHFIS  84     9e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 83.7 bits (207), Expect = 9e-20
Identities = 36/147 (24%), Positives = 75/147 (51%), Gaps = 12/147 (8%)

Query: 2 SILIVDDNPVNIFVIEKILKQAGYQDLVSLNSAQELFEYIQFGKDSSRHNEIDLILLDIM 61
+IL+ DD+ V+ + L +AGY D+ ++A L+ +I + DL++ D++
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGY-DVRITSNAATLWRWI-------AAGDGDLVVTDVV 56

Query: 62 MPEIDGLEVCRRLQKEEKFKDIPIIFVTALEDANKLAEALDIGAMDYITKPINKVELLAR 121
MP+ + ++ R++K D+P++ ++A +A + GA DY+ KP + EL+
Sbjct: 57 MPDENAFDLLPRIKKARP--DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGI 114

Query: 122 MRVALRLKSELNWHKEQEENLRNELDL 148
+ AL + E++ ++ + L
Sbjct: 115 IGRALAEPKRR--PSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_0893  HTHFIS  70     1e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 70.2 bits (172), Expect = 1e-14
Identities = 26/107 (24%), Positives = 50/107 (46%), Gaps = 3/107 (2%)

Query: 777 TILIVDDDHRNIFALQNALKKQHANIITAQNGLECLEILKNNTNIDLILMDIMMPNMDGY 836
TIL+ DDD L AL + ++ N + DL++ D++MP+ + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDG-DLVVTDVVMPDENAF 63

Query: 837 ETMEHIRMNLGLHEIPIIALTAKAMPNDKEKCLSAGASDYISKPLNL 883
+ + I+ ++P++ ++A+ K GA DY+ KP +L
Sbjct: 64 DLLPRIKK--ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDL 108


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0895  PF07132  29     0.012    Harpin protein (HrpN)
		>PF07132#Harpin protein (HrpN)

Length = 356

Score = 28.5 bits (63), Expect = 0.012
Identities = 21/59 (35%), Positives = 30/59 (50%), Gaps = 6/59 (10%)

Query: 70 GLVAGGVAGGLGGLLTGLGVLAVSGLGPIVAAGPIAAAIGGAGIGGGAGSLIGAFIGLG 128
++ GG+ GGLGGL + LG L LG + G G+ +G G GS +G +G
Sbjct: 63 SMMGGGLGGGLGGLGSSLGGLGGGLLGGGLGGG------LGSSLGSGLGSALGGGLGGA 115



Score = 28.5 bits (63), Expect = 0.016
Identities = 21/66 (31%), Positives = 31/66 (46%), Gaps = 6/66 (9%)

Query: 64 DIFSATGLVAGGVAGGLGGLLTGLGVLAVSGLGPIVAAGPIAAAIGGAGIGGGAGSLIGA 123
DI + + + GGLGG L GLG G ++ G G G+G GS +G+
Sbjct: 53 DIMTTMMFMGSMMGGGLGGGLGGLGSSLGGLGGGLLGGG------LGGGLGSSLGSGLGS 106

Query: 124 FIGLGI 129
+G G+
Sbjct: 107 ALGGGL 112


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0899  PF06580  34     9e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.1 bits (78), Expect = 9e-04
Identities = 53/354 (14%), Positives = 114/354 (32%), Gaps = 42/354 (11%)

Query: 62 IFHWYASSLKNRQLLYFFFVQLFIVFLAAFIVPNGSIAIFVGLTPILIAQSLYVYNNIFK 121
W +L + + + + + + Q N
Sbjct: 17 GIGWGVYTLTGFGFASLYGSPKLHSMIFNIAISLMGLVLTHAYRSFIKRQGWLKLNMGQI 76

Query: 122 VMAVFTLMYAIFCIAISMNYGVNKVAILISM----FLLVLAIIIPFSYINKQQYDARNRI 177
++ V I + N + ++ I+ F L LA+ I F+ + +
Sbjct: 77 ILRVLPACVVIGMVWFVANTSIWRLLAFINTKPVAFTLPLALSIIFNVVVVTFMWSLLYF 136

Query: 178 -QSYIQELESAYMRVEELTLANERQRMARDLHDTLAQGVASLIMQ---------LEAIDA 227
+ + + A + ++ MA++ AQ + +L Q L I A
Sbjct: 137 GWHFFKNYKQAEIDQWKM------ASMAQE-----AQ-LMALKAQINPHFMFNALNNIRA 184

Query: 228 HMQKGNTRRSQEIMKQTMIRARQTLHDARLVIDDLRHTTNSFNKAVEEEVQRFSEATSIH 287
+ + T+ + + + + R +L + L + ++ +F +
Sbjct: 185 LILEDPTKAREMLTSLSEL-MRYSLRYSNARQVSLADELTVVDSYLQLASIQFED----R 239

Query: 288 VRFTIQSPPHISS-LVKEHCLYVISECLTNIAKH---SQATDVHLKVEYSGSLERLTIEV 343
++F Q P I V + + E N KH + ++ + +T+EV
Sbjct: 240 LQFENQINPAIMDVQVPPMLVQTLVE---NGIKHGIAQLPQGGKILLKGTKDNGTVTLEV 296

Query: 344 EDNGIGFDTGYIGKNPGHYGLIGLNERVRLINGEIHIL--SEKVKGTKVCIQVP 395
E+ G K GL + ER++++ G + SEK + +P
Sbjct: 297 ENTGSLALKN--TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_0900  HTHFIS  79     2e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.5 bits (196), Expect = 2e-19
Identities = 24/116 (20%), Positives = 49/116 (42%), Gaps = 2/116 (1%)

Query: 10 VLIVDDHFVVREGLKLIIETSDSFQIIGEAENGEEALSFIEKKKPDVILMDLNMPKMSGL 69
+L+ DD +R L + + + + N +I D+++ D+ MP +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAG-YDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 70 ETIEALNKKQNHTPIIILTTYNEDELMLKGIELGAKGYLLKDTDRENLFRTLEAAI 125
+ + + K + P+++++ N +K E GA YL K D L + A+
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
BALH_0904  RTXTOXIND  31     0.019    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.3 bits (71), Expect = 0.019
Identities = 20/145 (13%), Positives = 46/145 (31%), Gaps = 5/145 (3%)

Query: 267 MARYEAIKAKMEPLQLQVDSLHKKIENVQSEIESIQIDEDFLQKESYVEELRMQHMS--Y 324
+ IK + Q Q ++ ++E ++ + + S VE+ R+ S
Sbjct: 185 LRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLL 244

Query: 325 EN---ARQEMRDLTGAITNIKEELAELEQQIGATFEKETVLSFDMSLATKELITQAVQKA 381
A+ + + EL + Q+ + + L T+ + + K
Sbjct: 245 HKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKL 304

Query: 382 RELETQKAQLDDRFKVAQEQLEEQE 406
R+ L +E+ +
Sbjct: 305 RQTTDNIGLLTLELAKNEERQQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_0905  MICOLLPTASE  31     0.005    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 31.2 bits (70), Expect = 0.005
Identities = 21/108 (19%), Positives = 40/108 (37%), Gaps = 7/108 (6%)

Query: 20 IKTATKGIASNGKPFLTVILQDPSGDIEAKLWDV-------SPEVEKQYVAETIVKVAGD 72
IK+ + I F +D G+I+A WD + +Y +V
Sbjct: 779 IKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDFGDGEKSNEAKATHKYNKTGEYEVKLT 838

Query: 73 ILNYKGRIQLRVKQIRVANENEVTDISDFVEKAPVKKEDMVEKITQYI 120
+ + G I K+I+V + V I++ +K + + K +
Sbjct: 839 VTDNNGGINTESKKIKVVEDKPVEVINESEPNNDFEKANQIAKSNMLV 886


5     BALH_0923  BALH_0947  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0923  -1      21       3.025510   cell cycle protein FtsW
BALH_0924  1       23       2.970031   MarR family transcriptional regulator
BALH_0925  1       23       2.725614   hypothetical protein
BALH_0927  2       22       2.356340   major facilitator family transporter
BALH_0929  4       23       1.955101   permease, general substrate transporter
BALH_0930  6       25       1.398473   ATP-dependent DNA helicase UvrD
BALH_0931  5       22       -0.215038  peptidyl-prolyl isomerase
BALH_0932  2       16       0.492653   hypothetical protein
BALH_0933  2       18       1.285892   transcriptional regulator Hpr
BALH_0934  3       18       1.369592   hypothetical protein
BALH_0935  2       20       1.471361   HIT family protein
BALH_0936  2       21       1.645789   ABC transporter ATP-binding protein
BALH_0937  2       21       1.542222   ABC transporter permease
BALH_0938  4       17       -0.825726  ecsC-like protein
BALH_0939  4       17       -1.534457  TetR family transcriptional regulator
BALH_0940  5       18       -1.652246  hypothetical protein
BALH_0941  6       20       -1.989262  hypothetical protein
BALH_0942  4       18       -1.403873  TetR family transcriptional regulator
BALH_0943  4       17       -1.226508  collagen adhesion protein
BALH_0944  2       15       1.525596   hypothetical protein
BALH_0945  1       16       2.729599   hypothetical protein
BALH_0946  1       14       2.751349   hypothetical protein
BALH_0947  0       14       3.479080   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0927  TCRTETB  234    4e-74    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 234 bits (598), Expect = 4e-74
Identities = 129/436 (29%), Positives = 237/436 (54%), Gaps = 2/436 (0%)

Query: 4 HSSQNADKLLGVLVVTLIFSVMNGTMFNVALPEIGKEFNLVPSEVSWIMTSYMVVYAVGS 63
S+ +++L L + FSV+N + NV+LP+I +FN P+ +W+ T++M+ +++G+
Sbjct: 7 QSNLRHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGT 66

Query: 64 IVMGKLADKYRLKDLLTYGLLIFALGSLLGLLA-TEYWVIILGRVIQAAGASVLPATAMI 122
V GKL+D+ +K LL +G++I GS++G + + + ++I+ R IQ AGA+ PA M+
Sbjct: 67 AVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMV 126

Query: 123 IPVRYFVPEKRGRALGTSAVGLALGNALGPVAAGLITSFGSWRLMFVLSLLPLLTLPFFR 182
+ RY E RG+A G +A+G +GP G+I + W + ++ ++ ++T+PF
Sbjct: 127 VVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLM 186

Query: 183 KYLDNAKGKAGRIDILGGGLLAVSVAFFLLAITQMQVLLFLGGFATLAFFILRIRKAKEP 242
K L G DI G L++V + FF+L T + + + F+ IRK +P
Sbjct: 187 KLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDP 246

Query: 243 FIKPILFKNKNFSIGLLLAFMTTAMSFSMTFMTPQFLSAVNGLTPSNIGF-VLVPAAIAS 301
F+ P L KN F IG+L + M P + V+ L+ + IG ++ P ++
Sbjct: 247 FVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSV 306

Query: 302 AIMGRKGGRVADTRGNFALVFIASVFIFLAFSLLSTFIGVSAFVIALILIFGNVGQTFMQ 361
I G GG + D RG ++ I F+ ++F S + +++ + +I++F G +F +
Sbjct: 307 IIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTK 366

Query: 362 ISMSNTISQTLSKEETGVGMGLLSMINFISGAMAMSVVGKLLDKGSTSLKLNPFVANEAA 421
+S +S +L ++E G GM LL+ +F+S +++VG LL +L P +++
Sbjct: 367 TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQST 426

Query: 422 NMYSNIFGVMSLLILL 437
+YSN+ + S +I++
Sbjct: 427 YLYSNLLLLFSGIIVI 442


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0929  TCRTETB  138    5e-38    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 138 bits (348), Expect = 5e-38
Identities = 91/412 (22%), Positives = 191/412 (46%), Gaps = 13/412 (3%)

Query: 17 NVKRLPILISMIIGAFFTILNETLLNVAFPQLMIELNVTPSTLQWLSTGYMLVVAVLIPA 76
N++ ILI + I +FF++LNE +LNV+ P + + N P++ W++T +ML ++
Sbjct: 9 NLRHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAV 68

Query: 77 SALLVQWFTTRQVFIGAMVVFTFGTLVSAIA-PGFSILLMGRLLQAAGTGLMMPVLMNTI 135
L +++ + +++ FG+++ + FS+L+M R +Q AG ++M +
Sbjct: 69 YGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVV 128

Query: 136 LLLYPPEKRGAAMGSIGLVIMFAPAIGPTLSGIILETLNWRWLFYIVLPFAIFSIVFAFI 195
P E RG A G IG ++ +GP + G+I ++W +L ++P V +
Sbjct: 129 ARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL--LIPMITIITVPFLM 186

Query: 196 YLKNVSEPTKPKVDVLSILLSTIGFGGIVYGFSSSGEGWDSFQVYGIILIGLVALLFFVL 255
L K D+ I+L ++G + +S +++ +++ L FV
Sbjct: 187 KLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSY--------SISFLIVSVLSFLIFVK 238

Query: 256 RQLKLKEPLLDLSAFKYPMFTLTTILLTIMMMTMFSTMTLLPFLFQGALGLTVYATG-LI 314
K+ +P +D K F + + I+ T+ ++++P++ + L+ G +I
Sbjct: 239 HIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVI 298

Query: 315 MLPGSLLNGLLSPVSGKLFDKFGPRALIIPGTILLASVMWFFTQVTADTSKITFILLHVT 374
+ PG++ + + G L D+ GP ++ G L SV + +T+ ++ V
Sbjct: 299 IFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL-SVSFLTASFLLETTSWFMTIIIVF 357

Query: 375 MMVSISMIMMPAQTNGLNQLPKRFYPHGTAILNTLSQVAGAVGVAFFISVMT 426
++ +S T + L ++ G ++LN S ++ G+A +++
Sbjct: 358 VLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0939  HTHTETR  75     7e-19    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 74.7 bits (183), Expect = 7e-19
Identities = 34/181 (18%), Positives = 68/181 (37%), Gaps = 7/181 (3%)

Query: 1 MRYSMTKNLQTSQNIVEASFKLMAEHGIEKMSLSMIAKEVGISKPAIYYHFSSKEALVDF 60
R + + +T Q+I++ + +L ++ G+ SL IAK G+++ AIY+HF K L
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 61 LFEEVFSGYHFVSYF-DKEQYTRENFAEKLIADGLHMLS---EYEGQEGILRVINEFIVT 116
++E S + + + + L +H+L E + ++ +I
Sbjct: 62 IWELSES--NIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 117 AARNEKYQKRLFEIQEEFLNGFHDLLKKGARLG-VVSQHATEENAHTLALVIDNMSNYML 175
Q+ + E + LK + + T A + I + L
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWL 179

Query: 176 M 176

Sbjct: 180 F 180


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0942  HTHTETR  61     7e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 60.8 bits (147), Expect = 7e-14
Identities = 36/163 (22%), Positives = 63/163 (38%), Gaps = 19/163 (11%)

Query: 6 TRQKILAAASQIVQCKGVAKLTLEAVAKEAGVSKGGLLYHFSNKEALIEGMIVRGVEDYE 65
TRQ IL A ++ +GV+ +L +AK AGV++G + +HF +K L + +
Sbjct: 12 TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIG 71

Query: 66 GAIYNKVAEDPERKGRWVRS----FVEERLNNERRTEELSSSMMAAFMLKP-ELLEPLQQ 120
A+ P +R +E + ERR + + +++ Q+
Sbjct: 72 ELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQR 131

Query: 121 SFQQ---------LQHKIENDEID-----SVCATIIRLAADGL 149
+ L+H IE + A I+R GL
Sbjct: 132 NLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGL 174


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_0943  TONBPROTEIN  35  0.003  Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 34.6 bits (79), Expect = 0.003
Identities = 19/85 (22%), Positives = 31/85 (36%), Gaps = 5/85 (5%)

Query: 1775 LTSLAPPGPEKPETTDPEKPETTDPEKPETTDPEKPGTTDPEKLETTDPEKPGTTNPEKP 1834
+T + P E P+ P +PE PE P E + KP KP
Sbjct: 47 VTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPP----KEAPVVIEKPKPKPKPKPKP 102

Query: 1835 ETMNPEKPEKELPKTGQKMPVEPYM 1859
E+P++++ + P P+
Sbjct: 103 VKKVQEQPKRDVKPVESR-PASPFE 126



Score = 33.4 bits (76), Expect = 0.007
Identities = 20/94 (21%), Positives = 28/94 (29%), Gaps = 6/94 (6%)

Query: 1742 VEAPKGYEKLTNPIPFEITKGMISPVQLQVLNKLTSLAPPGPEKPETTDPEKPETTDPEK 1801
+E P PI + V + P PE +P K EK
Sbjct: 36 IELPAP----AQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEK 91

Query: 1802 PETTDPEKPGTTDPEKLETTDPE-KPGTTNPEKP 1834
P+ KP + E + KP + P P
Sbjct: 92 PKPKPKPKP-KPVKKVQEQPKRDVKPVESRPASP 124


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_0944  IGASERPTASE  37  8e-05  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 37.4 bits (86), Expect = 8e-05
Identities = 25/92 (27%), Positives = 36/92 (39%)

Query: 139 AEAKETLNKLVLETKDNKNLEEYNKKAVGLVTKMNEEEKTEKGKATAKAQKTSAVTSQKV 198
A ET + +K E N++ T N E E +T+ V
Sbjct: 1031 ATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGS 1090

Query: 199 AIEKTQKTEVEKNQKVEAEKNQKVETGKNQNA 230
++TQ TE ++ VE E+ KVET K Q
Sbjct: 1091 ETKETQTTETKETATVEKEEKAKVETEKTQEV 1122


6  BALH_0995  BALH_1035  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
BALH_0995      2       15   2.355964  malate synthase
BALH_0996      2       17   1.529499  isocitrate lyase
BALH_0997      1       15  -1.076660  trifolitoxin immunity protein
BALH_0998      3       16  -1.101454  cold-shock DNA-binding protein family protein
BALH_0999      2       15   1.702818  hypothetical protein
BALH_1000      3       15   2.640038  hypothetical protein
BALH_1001      3       15   2.857850  competence transcription factor
BALH_1003      2       16   3.249553  hypothetical protein
BALH_1004      3       15   3.139019  signal peptidase I
BALH_1005      3       16   3.398441  DNA helicase/exodeoxyribonuclease V subunit B
BALH_1006      3       15   2.584094  DNA helicase/exodeoxyribonuclease V subunit A
BALH_1007      2       23   0.766099  hypothetical protein
BALH_1008      5       24   0.357962  spore germination protein PF
BALH_1009      4       18   0.117148  spore germination protein PE
BALH_1010      0       17   1.889853  spore germination protein PD
BALH_1011     -1       17   1.223763  spore germination protein PC
BALH_1012     -1       14   2.012307  spore germination protein PA
BALH_1013      0       12   1.667406  stage 0 sporulation regulatory protein
BALH_1014     -1       16   2.043686  fumarylacetoacetate hydrolase family protein
BALH_1015      0       17   3.207733  ornithine--oxo-acid transaminase
BALH_1016      1       12   1.958522  hypothetical protein
BALH_1017      1       12   0.573366  asparagine synthetase
BALH_1018      1       14  -0.011098  ferrochelatase
BALH_1019      0       15   0.081733  catalase
BALH_1020      1       13  -1.100915  ammonium transporter
BALH_1021      2       14  -3.529667  alpha-amylase family protein
BALH_1022      4       17  -3.628879  transcriptional regulator
BALH_1023      2       18  -0.227156  putative nucleotide-binding protein
BALH_1024      1       18   0.194826  hypothetical protein
BALH_1025      3       19   0.173453  hypothetical protein
BALH_1026      3       15   1.679469  peptidyl-prolyl isomerase
BALH_1027      2       13   1.687212  hypothetical protein
BALH_1029      1       11   1.417779  hypothetical protein
BALH_1030      1       12   1.343953  hypothetical protein
BALH_1031      0       11   1.749229  HAD superfamily hydrolase
BALH_1032      0       13   2.773697  ATP-dependent Clp protease, ATP-binding subunit
BALH_1033     -1       13   2.350533  hydrolase
BALH_1034      0       12   2.711487  glucose epimerase
BALH_1035     -1       14   3.468956  repressor of comG operon
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1011  RTXTOXIND  29  0.013  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.0 bits (65), Expect = 0.013
Identities = 12/78 (15%), Positives = 25/78 (32%), Gaps = 1/78 (1%)

Query: 9 LHQLQQALQVQQATILNLEDQVRQLQEELNELKN-RPSSSIGKVEYKFDQLKVENLNGTL 67
L + + A I E+ R + L++ + +I K + K L
Sbjct: 209 LDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNEL 268

Query: 68 NIGLNPFSTKEQQIEDFQ 85
+ + E +I +
Sbjct: 269 RVYKSQLEQIESEILSAK 286


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1032  GPOSANCHOR  36  6e-04  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 36.2 bits (83), Expect = 6e-04
Identities = 32/118 (27%), Positives = 57/118 (48%), Gaps = 5/118 (4%)

Query: 405 RTEIDSMPTELDEVTRRIMQLEIEEAALGKEKDFGSQERLKTLQRELSDLKEVASSMRAK 464
S+ +LD QLE E L ++ R ++L+R+L +E + A+
Sbjct: 308 NANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASR-QSLRRDLDASREAKKQLEAE 366

Query: 465 WEKEKEDIHKVRDLREHLERLRRELEEA-EGNYDLNRAAELRHGKIPAIEKELKEAEE 521
+K +E +K+ + + LRR+L+ + E + +A E + K+ A+EK KE EE
Sbjct: 367 HQKLEEQ-NKISEAS--RQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEE 421



Score = 31.2 bits (70), Expect = 0.021
Identities = 19/124 (15%), Positives = 41/124 (33%), Gaps = 2/124 (1%)

Query: 413 TELDEVTRRIMQLEIEEAALGKEKDFGSQERLKTLQRELSDLKEVASSMRAKWEKEKEDI 472
+ ++ + + K + L +DL++ + I
Sbjct: 189 EARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKI 248

Query: 473 HKVRDLREHLERLRRELEEAEGNYDLNRAAELRHGKIPAIEKELKEAEEMGANNKQENRL 532
+ + LE + ELE+A A+ KI +E E E A+ + ++++
Sbjct: 249 KTLEAEKAALEARQAELEKALEGAMNFSTADSA--KIKTLEAEKAALEAEKADLEHQSQV 306

Query: 533 LREE 536
L
Sbjct: 307 LNAN 310


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1034  NUCEPIMERASE  38  3e-05  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 37.8 bits (88), Expect = 3e-05
Identities = 45/236 (19%), Positives = 95/236 (40%), Gaps = 50/236 (21%)

Query: 16 RVLIIGALTFVGYHLVNKMIAEEVEVYGLD-FDEFDSMTKINEEKLLLIGRNALFTYYS- 73
+ L+ GA F+G+H+ +++ +V G+D +++ + + + +L L+ + F ++
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDV-SLKQARLELLAQ-PGFQFHKI 59

Query: 74 -IRDEDGWRSV-EEEKFDAVYFCLYEPNQ--QSGFR---------NERVILQYLKRIIRL 120
+ D +G + F+ V+ + R + + +L I+
Sbjct: 60 DLADREGMTDLFASGHFERVF------ISPHRLAVRYSLENPHAYADSNLTGFLN-ILEG 112

Query: 121 CEEHKVK-FNLISSIEVGNADE----SENKR------LFAKVEEGLKKGEV---QYS-VF 165
C +K++ SS V + S + L+A + K E+ YS ++
Sbjct: 113 CRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATK---KANELMAHTYSHLY 169

Query: 166 RVP-------TLYGPWQPSFMMYHQLILSEL-GEKECHYASEENGSDLLYVEDVCE 213
+P T+YGPW M + + L G+ Y + D Y++D+ E
Sbjct: 170 GLPATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAE 225


7  BALH_1059  BALH_1082  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
BALH_1059     -2       12   3.206797  hypothetical protein
BALH_1060     -3       13   2.749357  GTP pyrophosphokinase
BALH_1061     -3       16   2.513794  inorganic polyphosphate/ATP-NAD kinase
BALH_1062     -3       18   1.440213  ribosomal large subunit pseudouridylate synthase
BALH_1063     -2       18   0.878989  hypothetical protein
BALH_1064      3       21   2.053980  bis(5'-nucleosyl)-tetraphosphatase
BALH_1065      1       17   0.806014  cell division protein
BALH_1066      1       16   0.755572  glycosyltransferase
BALH_1067      3       17   0.726640  methyltransferase
BALH_1068      3       17   1.263016  O-methyltransferase
BALH_1070      3       19   1.683484  bclA protein
BALH_1071      1       17  -0.578803  glycosyltransferase, beta 1,4
BALH_1072      3       19  -0.249602  O-antigen biosynthesis protein
BALH_1073      3       22   0.244949  hypothetical protein
BALH_1074      1       18   1.045476  streptomycin biosynthesis StrF domain-containing
BALH_1075      0       18   1.100539  glucose-1-phosphate thymidylyltransferase
BALH_1076     -1       17   1.060329  dTDP-4-dehydrorhamnose 3,5-epimerase
BALH_1077     -1       13   1.358156  dTDP-glucose 4,6-dehydratase
BALH_1079      2       14   2.240434  dTDP-4-dehydrorhamnose reductase
BALH_1080      1       15   1.161057  enoyl-ACP reductase
BALH_1081      0       13   0.818240  hypothetical protein
BALH_1082      2       16   1.031283  exosporium protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1063  OMADHESIN  38  9e-05  Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 37.6 bits (86), Expect = 9e-05
Identities = 39/136 (28%), Positives = 61/136 (44%), Gaps = 11/136 (8%)

Query: 125 LAPSGDRQTEKTIIIHPPTPENSKVKKGGRYVHAEGLHSHAEGTASHAEGLLTHAKGSFS 184
++P+ D + PP P GG A+G+HS A G + A A G+ S
Sbjct: 39 ISPNADPALGLEYPVRPPVP-----GAGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGS 93

Query: 185 HAEGSKTKATGHSSHSEGSETTAGGSYSHAEGKHTIALGEAAHAEGTATIANGFSSHAEG 244
A G + A G S + G G+ S A+ K +A+G A T +A GF+S A+
Sbjct: 94 IATGVNSVAIGPLSKALGDSAVTYGAASTAQ-KDGVAIGARASTSDTG-VAVGFNSKADA 151

Query: 245 NHT----STAHFAGSH 256
++ ++H A +H
Sbjct: 152 KNSVAIGHSSHVAANH 167


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1077  NUCEPIMERASE  189  3e-60  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 189 bits (483), Expect = 3e-60
Identities = 78/332 (23%), Positives = 141/332 (42%), Gaps = 26/332 (7%)

Query: 1 MNILVTGGAGFIGSNFVHYMLQSYETYKIINFDALT--YSGNLNNVK-SIQDHPNYYFVK 57
M LVTG AGFIG + +L+ ++++ D L Y +L + + P + F K
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLE--AGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHK 58

Query: 58 GEIQNGELLEHVIKERDVQVIVNFAAESHVDRSIENPIPFYDTNVIGTVTLLELVKKYPH 117
++ + E + + + + V S+ENP + D+N+ G + +LE +
Sbjct: 59 IDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKI 118

Query: 118 IKLVQVSTDEVYGSLGKTGRFTEETPLA-PNSPYSSSKASADMIALAYYKTYQLPVIVTR 176
L+ S+ VYG L + F+ + + P S Y+++K + +++A Y Y LP R
Sbjct: 119 QHLLYASSSSVYG-LNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLR 177

Query: 177 CSNNYGPYQYPEKLIPLMVTNALEGKKLPLYGDGLNVRDWLHVTDHCSAIDVVLHKGRI- 235
YGP+ P+ + LEGK + +Y G RD+ ++ D AI +
Sbjct: 178 FFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHA 237

Query: 236 -----------------GEVYNIGGNNEKTNIDVVEQIITLLGKTEQDIEYVTDRLGHDR 278
VYNIG ++ +D ++ + LG E + + G
Sbjct: 238 DTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGI-EAKKNMLPLQPGDVL 296

Query: 279 RYAIDAEKMKNEFDWEPKYTFEQGLQETVQWY 310
+ D + + + P+ T + G++ V WY
Sbjct: 297 ETSADTKALYEVIGFTPETTVKDGVKNFVNWY 328


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1079  NUCEPIMERASE  44  4e-07  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 43.6 bits (103), Expect = 4e-07
Identities = 36/200 (18%), Positives = 70/200 (35%), Gaps = 38/200 (19%)

Query: 4 RVIITGANGQLGKQLQEEL--NPEE----------YDIYPFDKKL------------LDI 39
+ ++TGA G +G + + L + YD+ +L +D+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 40 TNISQVQQVVQEIRPHIIIHCAAYTKVDQAEKERDLAYV-INAIGARNVAVASQLVGAK- 97
+ + + + V + E AY N G N+ + +
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYS-LENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 98 LVYISTDYVFQGDRPEGYDEFHNPA-PINIYGASKYAGEQFVKELHNKYFIVRTSW---- 152
L+Y S+ V+ +R + + P+++Y A+K A E + Y + T
Sbjct: 121 LLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRFFT 180

Query: 153 LYGKYGN------NFVKTMI 166
+YG +G F K M+
Sbjct: 181 VYGPWGRPDMALFKFTKAML 200


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1080  DHBDHDRGNASE  57  7e-12  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 57.0 bits (137), Expect = 7e-12
Identities = 60/259 (23%), Positives = 105/259 (40%), Gaps = 19/259 (7%)

Query: 4 LQGKTFVVMGVANQRSIAWGIARSLHNAGAKLI-FTYAGERLERNVRELADTLEGQESLV 62
++GK + G A + I +AR+L + GA + Y E+LE+ V L E + +
Sbjct: 6 IEGKIAFITGAA--QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSL--KAEARHAEA 61

Query: 63 LPCDVTNDEELTACFETIKQEVGTIHGVAHCIAFANRDDLKGEFVDTSRDGFLLAQNISA 122
P DV + + I++E+G I + + G S + + ++++
Sbjct: 62 FPADVRDSAAIDEITARIEREMGPIDILVNVAGVLR----PGLIHSLSDEEWEATFSVNS 117

Query: 123 FSLTAVAREAKKVMT--EGGNILTLTYLGGERVVKNYNVMGVAKASLEASVKYLANDLGQ 180
+ +R K M G+I+T+ + +KA+ K L +L +
Sbjct: 118 TGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAE 177

Query: 181 HGIRVNAISAGPIRT-----LSAKGVGDFNSILREIEE---RAPLRRTTTQEEVGDTAVF 232
+ IR N +S G T L A G I +E PL++ ++ D +F
Sbjct: 178 YNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLF 237

Query: 233 LFSDLARGVTGENIHVDSG 251
L S A +T N+ VD G
Sbjct: 238 LVSGQAGHITMHNLCVDGG 256


8  BALH_1108  BALH_1139  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
BALH_1108      0       18   3.316023  hypothetical protein
BALH_1109     -1       19   3.541353  NADH oxidase (NADH dehydrogenase)
BALH_1110     -1       19   3.553907  hypothetical protein
BALH_1111      0       20   3.176180  hypothetical protein
BALH_1112      2       22   3.256020  dihydrolipoamide succinyltransferase
BALH_1113      1       18   2.586109  2-oxoglutarate dehydrogenase E1 component
BALH_1114      1       20   0.147110  transcriptional regulator
BALH_1115     -1       18  -2.422040  hypothetical protein
BALH_1116     -1       17  -3.807827  hypothetical protein
BALH_1117      2       16  -4.826910  hypothetical protein
BALH_1118      3       16  -6.274361  PadR family transcriptional regulator
BALH_1119      2       16  -6.675350  hypothetical protein
BALH_1120      2       16  -6.388521  hypothetical protein
BALH_1121      1       16  -5.545268  hypothetical protein
BALH_1122      1       16  -4.836576  aminotransferase
BALH_1123      1       17  -4.772362  hypothetical protein
BALH_1124      1       17  -4.140807  short chain dehydrogenase/reductase family
BALH_1126      2       16  -4.289768  sodium ABC transporter ATP-binding protein
BALH_1127      3       17  -3.922106  sodium ABC transporter permease
BALH_1128      7       22  -3.215850  hypothetical protein
BALH_1129      7       20  -4.144120  hypothetical protein
BALH_1130      8       20  -4.781384  hypothetical protein
BALH_1131      7       20  -4.577307  hypothetical protein
BALH_1132      6       21  -3.707915  resolvase
BALH_1133      5       20  -4.070859  hypothetical protein
BALH_1134      0       18  -3.990553  hypothetical protein
BALH_1135     -1       18  -2.601998  hypothetical protein
BALH_1136     -3       15  -1.742286  integrase
BALH_1137      2       17  -1.307519  *methyltransferase
BALH_1138      1       16  -1.181679  hypothetical protein
BALH_1139      2       12  -1.247580  D-alanyl-D-alanine carboxypeptidase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1115  FIMBRIALPAPF  33  1e-04  Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.

Length = 167

Score = 33.1 bits (75), Expect = 1e-04
Identities = 36/122 (29%), Positives = 50/122 (40%), Gaps = 19/122 (15%)

Query: 6 IFFLLTCLLLVASTTYIICNKREQV--PPMLVWEGQEYYVTHEPAKAEEVGQRLGEVTKK 63
I LLT + ++A + N R V PP + GQ V E V GEVTK
Sbjct: 8 ISLLLTSVAVLAD---VQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGEVTKN 64

Query: 64 IEIS---------KKSTKNS----ESNTLQEKTEVFTM-IEEEKGPHSPLIVKEPHSDEY 109
I IS K T N+ ++N L F + + + KG +PL + + Y
Sbjct: 65 ISISCPYKSGSLWIKVTGNTMGVGQNNVLATNITHFGIALYQGKGMSTPLTLGNGSGNGY 124

Query: 110 RV 111
RV
Sbjct: 125 RV 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1124  DHBDHDRGNASE  92  4e-24  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 91.7 bits (227), Expect = 4e-24
Identities = 55/204 (26%), Positives = 91/204 (44%), Gaps = 20/204 (9%)

Query: 72 FNLDVSQEEDIRFLSNYFEENNITIDGIINLIGVNTLSNFYNVTNEAWDKTFDINIKSFV 131
F DV I ++ E ID ++N+ GV +++++E W+ TF +N
Sbjct: 62 FPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVF 121

Query: 132 FLLKSIYSSLSHRV--SIVSVASQNGIVAHEDRI---AYGTSKAALIHLTKNLSIDFLKD 186
+S+ + R SIV+V S A R AY +SKAA + TK L ++
Sbjct: 122 NASRSVSKYMMDRRSGSIVTVGSN---PAGVPRTSMAAYASSKAAAVMFTKCLGLEL--- 175

Query: 187 QQRDIKVNCVSP---------SYIINDNNESYLKSFEGNKLLKKIPYRKFVEISDVTNII 237
+ +I+ N VSP S ++N + IP +K + SD+ + +
Sbjct: 176 AEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAV 235

Query: 238 LFLMSDKSDAIRGQNIVVDYGYTI 261
LFL+S ++ I N+ VD G T+
Sbjct: 236 LFLVSGQAGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1131  RTXTOXINA  28  0.034  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 28.4 bits (63), Expect = 0.034
Identities = 22/99 (22%), Positives = 36/99 (36%), Gaps = 3/99 (3%)

Query: 143 GTIADDIYEYIANNLHTLDTPIICNISDGIMFQLVKCNEKFLSISTIGNRLLFHFPNGNR 202
G +DIY Y++ H + I + L + + ++ GN L+ + GN
Sbjct: 844 GGYGNDIYRYLSGYGHHI---IDDDGGKEDKLSLADIDFRDVAFKREGNDLIMYKGEGNV 900

Query: 203 DSIYTKYKITIPELYRYKDKYSDNFHKDTYKDKEPNQID 241
SI K IT + + N + DK I
Sbjct: 901 LSIGHKNGITFRNWFEKESGDISNHEIEQIFDKSGRIIT 939


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1132  HTHTETR  29  0.011  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 29.2 bits (65), Expect = 0.011
Identities = 9/47 (19%), Positives = 20/47 (42%)

Query: 167 KGRPKKYTENNRGLQYALELFHNRATNKMTVKEIEEITKISKATLYR 213
K + + L AL LF + + ++ EI + +++ +Y
Sbjct: 4 KTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYW 50


9  BALH_1425  BALH_1444  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
BALH_1425      3       16   1.168749  hypothetical protein
BALH_1426      3       14   0.954624  glutathionylspermidine synthase
BALH_1427      3       16   0.965221  hypothetical protein
BALH_1428      3       16   1.001473  Mg(2+) P-type ATPase-like protein
BALH_1429      3       17   1.093553  5'-3' exonuclease
BALH_1430      3       18   1.114790  hypothetical protein
BALH_1431     -2       16  -1.533198  acetyltransferase
BALH_1432     -1       19  -0.973580  isochorismatase
BALH_1433      0       16  -0.938245  hypothetical protein
BALH_1435     -1       16  -0.876159  DMT family permease
BALH_1436     -2       16  -1.134207  ribonuclease H
BALH_1437      2       15  -1.055144  aldolase 1 epimerase LacX
BALH_1438     -1       15  -1.387897  hypothetical protein
BALH_1439      0       16  -0.912628  hypothetical protein
BALH_1440      1       14  -2.207781  hypothetical protein
BALH_1441      3       11  -0.413626  cold-shock DNA-binding protein family protein
BALH_1442      2       12   0.108455  hypothetical protein
BALH_1444      3       12  -0.240098  amino acid transporter LysE
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1430  CHLAMIDIAOM6  40  3e-04  Chlamydia cysteine-rich outer membrane protein 6 si...
		>CHLAMIDIAOM6#Chlamydia cysteine-rich outer membrane protein 6 signature.

Length = 547

Score = 40.1 bits (93), Expect = 3e-04
Identities = 46/219 (21%), Positives = 86/219 (39%), Gaps = 56/219 (25%)

Query: 2096 ISYTITLSNPGNVTSQNIIFTDILPEGTTFISGTLTNDSGTQQIGNPANGIQIGNINPNG 2155
+ Y I++SNPG++ ++++ D L G T + A G Q
Sbjct: 333 VEYVISVSNPGDLVLRDVVVEDTLSPGVTVLE---------------AAGAQ-------- 369

Query: 2156 TAVITLNILVTNIPSINPISNFSSIQFEHVIDPSQPSALQTTV---SNTVSTTINSAILA 2212
I+ N +V + +NP S+Q++ ++ P V S + T S A
Sbjct: 370 ---ISCNKVVWTVKELNP---GESLQYKVLVRAQTPGQFTNNVVVKSCSDCGTCTSCAEA 423

Query: 2213 TTK----------TVDKS-IISVGDTLTYTTTITNTGNTPATNITF-----------TSA 2250
TT VD + VG+ Y +TN G+ TN++ + +
Sbjct: 424 TTYWKGVAATHMCVVDTCDPVCVGENTVYRICVTNRGSAEDTNVSLMLKFSKELQPVSFS 483

Query: 2251 IPPSTTFVPDSVTINGIQQLGAQPAL--GVTIPNIAPGE 2287
P T ++V + + +LG++ + VT+ ++ G+
Sbjct: 484 GPTKGTITGNTVVFDSLPRLGSKETVEFSVTLKAVSAGD 522



Score = 39.3 bits (91), Expect = 5e-04
Identities = 24/63 (38%), Positives = 37/63 (58%), Gaps = 7/63 (11%)

Query: 1130 SNTVSTQINLANVVIVKQVDLTIAD---VGQPITYTIALANPGNTPANNVVVTDILPPGT 1186
+ +V+T IN V QV + AD V +P+ Y I+++NPG+ +VVV D L PG
Sbjct: 305 TASVTTVINEPCV----QVSIAGADWSYVCKPVEYVISVSNPGDLVLRDVVVEDTLSPGV 360

Query: 1187 TLV 1189
T++
Sbjct: 361 TVL 363



Score = 38.9 bits (90), Expect = 6e-04
Identities = 36/170 (21%), Positives = 69/170 (40%), Gaps = 27/170 (15%)

Query: 4326 ADLQTIIPYTISITNNGNIQVENIIVTDIIPANTSFIENSVIVNGTARPNDNPLSGIQID 4385
A L+ + Y I+I N G N++V + +P +G A + + +
Sbjct: 221 ACLRCPVVYKINIVNQGTATARNVVVENPVP------------DGYAHSSGQRVLTFTLG 268

Query: 4386 NIPPNTTATILFQVRVTSIPQT-NPISNTSTIEYQYTLPNRPPITETILSSAAVTTINHA 4444
++ P TI V P +N +T+ Y N +T I +I A
Sbjct: 269 DMQPGEHRTIT----VEFCPLKRGRATNIATVSYCGGHKNTASVTTVINEPCVQVSIAGA 324

Query: 4445 NLDSNKTVNLAFATVGDTLTYTITLNQTGNIAVNDVIIQDTIPQGTTFIE 4494
+ ++ V + Y I+++ G++ + DV+++DT+ G T +E
Sbjct: 325 D----------WSYVCKPVEYVISVSNPGDLVLRDVVVEDTLSPGVTVLE 364



Score = 36.2 bits (83), Expect = 0.003
Identities = 35/160 (21%), Positives = 65/160 (40%), Gaps = 30/160 (18%)

Query: 2884 IVYSVTIINSGNVSATNVIFTDLIPDGTSFEPNSFTLNGTTIPNADIITGVPIGDIAPNE 2943
+VY + I+N G +A NV+ + +PDG + L T +GD+ P E
Sbjct: 227 VVYKINIVNQGTATARNVVVENPVPDGYAHSSGQRVLTFT------------LGDMQPGE 274

Query: 2944 SAIVAFHISADEIPPIN--PISNQASVSFQHIVNPANPPVSKNITSNSVTTTIESAILTT 3001
+ E P+ +N A+VS+ + + SVTT I +
Sbjct: 275 HRTITV-----EFCPLKRGRATNIATVSY----------CGGHKNTASVTTVINEPCVQV 319

Query: 3002 TKIGDKAFATIGDTVTYTTTITNTGNIPANNVIFSDPIPS 3041
+ I ++ + V Y +++N G++ +V+ D +
Sbjct: 320 S-IAGADWSYVCKPVEYVISVSNPGDLVLRDVVVEDTLSP 358



Score = 33.1 bits (75), Expect = 0.037
Identities = 40/175 (22%), Positives = 63/175 (36%), Gaps = 26/175 (14%)

Query: 1539 IATKSVNTPNAAIGDIVTYTIAVTNTGNIPASATVLTDGLGPGASFISNSVTINNVSQPG 1598
I K NA + V Y I + N G A V+ + + G + S
Sbjct: 211 ICVKQEGPENACLRCPVVYKINIVNQGTATARNVVVENPVPDGYAHSSGQRV-------- 262

Query: 1599 LDPSLGIHLDDILPGGTTFITFQVKILAIPPSGTLTNNALVNYEYAVNPTETPAVGSTVT 1658
L L D+ PG IT + L G TN A V+Y G
Sbjct: 263 ----LTFTLGDMQPGEHRTITVEFCPLK---RGRATNIATVSY-----------CGGHKN 304

Query: 1659 NTTVTPIIDATLVINKNASTTFATIGDTITFTSSVTNTGNTTANNVIFTDSIPNG 1713
+VT +I+ V A ++ + + + SV+N G+ +V+ D++ G
Sbjct: 305 TASVTTVINEPCVQVSIAGADWSYVCKPVEYVISVSNPGDLVLRDVVVEDTLSPG 359


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1431  SACTRNSFRASE  30  0.005  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 30.3 bits (68), Expect = 0.005
Identities = 19/92 (20%), Positives = 31/92 (33%), Gaps = 5/92 (5%)

Query: 34 EVETIFNSGIVYGVWNERKKLIASAAIILYGEALASIGMVIVHPDYKGKGIGKMITNSCI 93
+V + G ++ I I A I + V DY+ KG+G + + I
Sbjct: 56 DVSYVEEEGKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAI 115

Query: 94 KSVSTQNP--IMLIATDEGKP---LYEKLGFR 120
+ + +ML D Y K F
Sbjct: 116 EWAKENHFCGLMLETQDINISACHFYAKHHFI 147


10  BALH_1473  BALH_1515  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
BALH_1473      3       12  -3.068451  hypothetical protein
BALH_1474      2       13  -2.959228  hypothetical protein
BALH_1475      3       14  -2.793976  chemotaxis protein methyltransferase
BALH_1476      1       13  -2.547242  hypothetical protein
BALH_1477      1       15  -2.436958  hypothetical protein
BALH_1478      1       14  -2.144682  hypothetical protein
BALH_1479      1       15  -1.510384  flagellar hook-associated protein FlgK
BALH_1480      2       17  -1.651185  flagellar hook-associated protein FlgL
BALH_1481      2       18  -1.627897  flagellar capping protein
BALH_1482      1       18  -0.185083  flagellar protein
BALH_1483      0       15  -0.278382  hypothetical protein
BALH_1484      0       13  -0.462387  flagellar basal body rod protein FlgB
BALH_1485      0       11  -0.034362  flagellar basal body rod protein FlgC
BALH_1486      2       11  -0.406962  flagellar hook-basal body protein FliE
BALH_1487      3       11  -0.661413  flagellar MS-ring protein
BALH_1488      3       11  -0.881077  flagellar motor switch protein G
BALH_1489      2        9  -0.186421  flagellar assembly protein H
BALH_1490      0       11  -0.127065  flagellum-specific ATP synthase
BALH_1491     -1       12  -0.762020  hypothetical protein
BALH_1492     -1       12  -1.129425  hypothetical protein
BALH_1493     -1       13  -0.473934  flagellar basal body rod modification protein
BALH_1494      0       14  -0.382169  flagellar hook protein FlgE
BALH_1495      2       14  -0.337439  hypothetical protein
BALH_1496      2       15  -0.467238  chemotaxis protein
BALH_1497      2       17   0.337559  hypothetical protein
BALH_1498      2       21   1.495474  flagellin
BALH_1499      2       19   0.917633  flagellin
BALH_1500      3       26   0.470247  soluble lytic murein transglycosylase
BALH_1501      4       25  -0.310995  flagellar motor switch protein
BALH_1502      3       20   0.098688  flagellar motor switch protein FliM
BALH_1503      4       16  -0.268477  flagellar motor switch protein
BALH_1504      4       15  -0.271097  flagellar biosynthesis protein FliP
BALH_1505      4       12  -0.253432  flagellar biosynthesis protein FliR
BALH_1506      2       11   0.000684  flagellar biosynthesis protein FlhB
BALH_1507      1       10   0.505870  flagellar biosynthesis protein FlhA
BALH_1508      0       11   0.221255  flagellar biosynthesis regulator FlhF
BALH_1509      0       12   0.105451  flagellar basal body rod protein FlgG
BALH_1510     -1       15  -1.530151  metal-dependent hydrolases related to
BALH_1511      2       16  -1.720969  transcriptional regulator
BALH_1512      4       18  -2.759230  branched-chain amino acid transport protein
BALH_1513      3       18  -3.895383  TetR family transcriptional regulator
BALH_1514     -2       20  -4.891748  hypothetical protein
BALH_1515     -1       13  -3.670937  VanZ family protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1473  PF03544  33  0.002  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 33.0 bits (75), Expect = 0.002
Identities = 15/107 (14%), Positives = 31/107 (28%), Gaps = 7/107 (6%)

Query: 339 TEKQEDSKVEIPLQEEKPP--VVQIPKKEEKVNDFIKEPLKEKERITYVIKEPLTDNKEV 396
+ +E P + PP VV+ + E + + KE E+ +P K
Sbjct: 52 VTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEK-----PKPKPKPKPK 106

Query: 397 NKATTQKDKDNKNNNQDVSKKKEKKEEPADQKEAKSDEGIQASNVFA 443
++ K + + + PA + +
Sbjct: 107 PVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSV 153


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1479  FLGHOOKAP1  104  3e-26  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 104 bits (260), Expect = 3e-26
Identities = 72/249 (28%), Positives = 112/249 (44%), Gaps = 14/249 (5%)

Query: 4 SDYNTPLSGLLAAQMGLQTTKQNLSNIHTPGYVRQMVNYGSAGASQGYSPEQKIGYGVQT 63
S N +SGL AAQ L T N+S+ + GY RQ A ++ G +G GV
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLG--AGGWVGNGVYV 59

Query: 64 LGVDRITDEVKTKQFNDQLSQLSYYNYMNSTLSRVESMVGTTGKNSLSSLMDGFFNAFRE 123
GV R D T Q +Q S +S++++M+ T+ SL++ M FF + +
Sbjct: 60 SGVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTS-SLATQMQDFFTSLQT 118

Query: 124 VAKNPEQPNYYDTLISETGKFTSQVNRLAKSLDTAEAQTTEDIEAHVNEFNRLAGSLAEA 183
+ N E P LI ++ +Q + L + Q I A V++ N A +A
Sbjct: 119 LVSNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASL 178

Query: 184 NKKI----GQAGTQVPNQLLDERDRIITEMSKYANIEVS---YESMNPNIASVRMNGVLT 236
N +I G PN LLD+RD++++E+++ +EVS + N +A NG
Sbjct: 179 NDQISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMA----NGYSL 234

Query: 237 VNGQDTYPL 245
V G L
Sbjct: 235 VQGSTARQL 243



Score = 53.0 bits (127), Expect = 1e-09
Identities = 18/51 (35%), Positives = 35/51 (68%)

Query: 380 LLEGIQQEKMGVEGVNMEEEMVNLMAFQKYFVANSKAITTMNEVFDSLFSI 430
++ + ++ + GVN++EE NL FQ+Y++AN++ + T N +FD+L +I
Sbjct: 495 VVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1480  FLAGELLIN  40  6e-06  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 40.4 bits (94), Expect = 6e-06
Identities = 30/127 (23%), Positives = 60/127 (47%)

Query: 1 MRVSTFQNASWAKNQLMDLNVQQQYHRNQVTSGKKNLLMSEDPLAASKSFAIQHSLANIE 60
++T + +N L +++SG + +D + + ++ +
Sbjct: 2 QVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLT 61

Query: 61 QMQKDLADSKNVLTQTENTLQGVFKSLTRADQLTVQALNGTNSEKELKAIGAEIDQILKQ 120
Q ++ D ++ TE L + +L R +L+VQA NGTNS+ +LK+I EI Q L++
Sbjct: 62 QASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEE 121

Query: 121 VVYLANT 127
+ ++N
Sbjct: 122 IDRVSNQ 128


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1484  FLGHOOKAP1  31  0.001  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 31.1 bits (70), Expect = 0.001
Identities = 10/27 (37%), Positives = 15/27 (55%)

Query: 23 NTVSSNIANANTPGYKAQDVTFAEQMN 49
NT S+NI++ N GY Q A+ +
Sbjct: 19 NTASNNISSYNVAGYTRQTTIMAQANS 45


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1485  FLGHOOKAP1  35  8e-05  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 34.5 bits (79), Expect = 8e-05
Identities = 19/75 (25%), Positives = 33/75 (44%), Gaps = 7/75 (9%)

Query: 5 INASGSGLTTARKWMEVTSNNIVNANTTAAPGADLYERRSVVLESNNSFANMLDGSLTNG 64
IN + SGL A+ + SNNI + N Y R++ ++ NS G + NG
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAG------YTRQTTIMAQANSTLGA-GGWVGNG 56

Query: 65 VKIKSIEADKTENLV 79
V + ++ + +
Sbjct: 57 VYVSGVQREYDAFIT 71



Score = 28.0 bits (62), Expect = 0.014
Identities = 10/38 (26%), Positives = 17/38 (44%)

Query: 97 NIDVTAEMTNVMVAQKMYEANTSVLNANKKMLDKDLEI 134
+++ E N+ Q+ Y AN VL + D + I
Sbjct: 508 GVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1486  FLGHOOKFLIE  36  4e-06  Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 35.8 bits (82), Expect = 4e-06
Identities = 18/77 (23%), Positives = 36/77 (46%), Gaps = 1/77 (1%)

Query: 34 SQTSVVEGKKFIDLLEDMNQTQNNAQTAVYDLLTKGVG-ETHDVLIQQKKAESQMKTAAL 92
Q ++ + L+ ++ TQ A+T G +DV+ +KA M+
Sbjct: 27 PQPTISFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQ 86

Query: 93 VRDNLIENYKSLINMQI 109
VR+ L+ Y+ +++MQ+
Sbjct: 87 VRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1487  FLGMRINGFLIF  166  3e-47  Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 166 bits (422), Expect = 3e-47
Identities = 99/540 (18%), Positives = 217/540 (40%), Gaps = 46/540 (8%)

Query: 17 LVIGAALLAIVTGALLYFTLPDKYVVVYQNLNDADKLEITAELSKLGVDYQLAADG-SIR 75
+V G+A +AIV +L+ PD Y ++ NL+D D I A+L+++ + Y+ A +I
Sbjct: 28 IVAGSAAVAIVVAMVLWAKTPD-YRTLFSNLSDQDGGAIVAQLTQMNIPYRFANGSGAIE 86

Query: 76 VQKNDAPWVRKEMNGMGLPFNSKSGEEILLESSLGSSEQDKKMKQIVGTKKQLEQDIVRN 135
V + +R + GLP G E+L + G S+ +++ + +L + I
Sbjct: 87 VPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQVNYQRALEGELARTI-ET 145

Query: 136 FATVETANVQITLPEKETIFDEEKAKGTAAITVGVKRGQLLTADQVAGIQQMISAAVPGV 195
V++A V + +P+ ++F E+ +A++TV ++ G+ L Q++ + ++S+AV G+
Sbjct: 146 LGPVKSARVHLAMPKP-SLFVREQKSPSASVTVTLEPGRALDEGQISAVVHLVSSAVAGL 204

Query: 196 KAEEVSVIDSKKGVISKGADEAHSTSSSSYEKEVEMQHQIEGKLKQDIDATLMTMFKPNE 255
V+++D ++++ + + ++ + +E ++++ I+A L +
Sbjct: 205 PPGNVTLVDQSGHLLTQSNTSGRDLNDAQ----LKFANDVESRIQRRIEAILSPIVGNGN 260

Query: 256 YKVNTKVSVNYDEVTRQSEKYG-DKGVLRSKQEQEESSTAQEGAETKQGA--GITANG-- 310
+++ + E Y + ++ + + +++ G G +N
Sbjct: 261 VHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPGGVPGALSNQPA 320

Query: 311 -------EVPNYGTNNNQNGKIVYDNKNGNKI----------ENYEIDKTVETIKKHP-E 352
P N QN + N N NYE+D+T+ K + +
Sbjct: 321 PPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSNYEVDRTIRHTKMNVGD 380

Query: 353 LTKTNVVVWVDNDTLVKRKI------DMTTFKEAIGTAAGLQADPNGNFTNGQVNVVTVQ 406
+ + +V V V+ TL K M ++ A G +NVV
Sbjct: 381 IERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDK-----RGDTLNVVNSP 435

Query: 407 FDQPKEEKKKEPEESGINWWLFGGIPAGLLAIGGLVWFFLARRKRKKEEEEYEEYLAEEE 466
F + P ++ L + + W + R + EE A +E
Sbjct: 436 FSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLTRRVEEAKAAQE 495

Query: 467 IAASNESIMEIPEEKI----VPEPKPEPEEPKEPTLDEQVQDATKEHVEGTAKVIKKWLN 522
A + E E ++ + + + + +++++ + A VI++W++
Sbjct: 496 QAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPRVVALVIRQWMS 555


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_1488  FLGMOTORFLIG  206  4e-66  Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 206 bits (525), Expect = 4e-66
Identities = 116/336 (34%), Positives = 195/336 (58%), Gaps = 6/336 (1%)

Query: 10 LDEISSKEKAAILIRTLEEGVAAKVIEYMTAEEKEVLLREIAKFRVYKPETLENVLGEFL 69
+ ++ K+KAAIL+ ++ +++KV +Y++ EE E L EIAK E +NVL EF
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 70 YELNVKELNLVTPDKEYIRRIF-KNMPEDELEKLLEDLWYN-KDNPFEFLNSLTDLEPLL 127
EL + + + +Y R + K++ + ++ +L + PFEF+ D +L
Sbjct: 72 -ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRA-DPANIL 129

Query: 128 TVLNDESPQTIAIIASYIKPQLASQLIERLPDHKRVETVMGIAKLEQVDGELINQIGDLL 187
+ E PQTIA+I SY+ PQ AS ++ LP + IA +++ E++ ++ +L
Sbjct: 130 NFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVL 189

Query: 188 KSKLNNMAFNAINKTDGLKTIVNILNNVSRGVEKTVFQKLDEMDYELSERIKENMFVFED 247
+ KL +++ G+ +V I+N R EK + + L+E D EL+E IK+ MFVFED
Sbjct: 190 EKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFED 249

Query: 248 LLGLEDLALRRVLEEITDNGVIAKALKIAKEEIKEKLFTCMSSNRKEMILEELDGLGPLK 307
++ L+D +++RVL EI D +AKALK ++EK+F MS M+ E+++ LGP +
Sbjct: 250 IVLLDDRSIQRVLREI-DGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTR 308

Query: 308 MTDAEKAQQTITGTVKKLEKEGRIIVQRG-EDDVLI 342
D E++QQ I ++KLE++G I++ RG E+DVL+
Sbjct: 309 RKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDVLV 344


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
BALH_1494  FLGHOOKAP  44     1e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 1e-06
Identities = 15/36 (41%), Positives = 24/36 (66%)

Query: 5 LYTSITGMNAAQNALSVTSNNIANAQTVGYKKQKAI 40
+ +++G+NAAQ AL+ SNNI++ GY +Q I
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI 39



Score = 37.6 bits (87), Expect = 8e-05
Identities = 10/39 (25%), Positives = 26/39 (66%)

Query: 397 SNVDLSVEFVDLMLYQRGFQGNAKVIKVSDEVLNEVVNL 435
S V+L E+ +L +Q+ + NA+V++ ++ + + ++N+
Sbjct: 507 SGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_1496  HTHFIS  48     2e-08    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 47.9 bits (114), Expect = 2e-08
Identities = 35/125 (28%), Positives = 55/125 (44%), Gaps = 14/125 (11%)

Query: 176 IYIAEDSAMLRQILEETLSSAGYTKMNFFSNGAEALAQIEKLAKEQGEKMFEHIHLLITD 235
I +A+D A +R +L + LS AGY + A I L++TD
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSN-AATLWRWIAA----------GDGDLVVTD 54

Query: 236 IEMPKMDGHHLTKVVKDSEVMNRLPVIIFSSLITNELFHKGEAVGANAQVSKP-DIQELI 294
+ MP + L +K + LPV++ S+ T K GA + KP D+ ELI
Sbjct: 55 VVMPDENAFDLLPRIK--KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELI 112

Query: 295 GLVDK 299
G++ +
Sbjct: 113 GIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
BALH_1498  FLAGELLIN  127    6e-36    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 127 bits (321), Expect = 6e-36
Identities = 51/140 (36%), Positives = 81/140 (57%)

Query: 4 MRIGTNVLSMNARQSLYENEKRMNVAMEHLATGKKLNHASDNPANVAIVTRMHARASGMR 63
I TN LS+ + +L +++ ++ A+E L++G ++N A D+ A AI R + G+
Sbjct: 2 QVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLT 61

Query: 64 VAIRNNEDAISMLRTAEAALQTVMNILQRMRDLAVQSANGTNSNKNRDSLNKEFQSLTEQ 123
A RN D IS+ +T E AL + N LQR+R+L+VQ+ NGTNS+ + S+ E Q E+
Sbjct: 62 QASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEE 121

Query: 124 IGYIGETTEFNDLSVFDGQN 143
I + T+FN + V N
Sbjct: 122 IDRVSNQTQFNGVKVLSQDN 141



Score = 88.6 bits (219), Expect = 4e-22
Identities = 35/100 (35%), Positives = 55/100 (55%)

Query: 169 DINISTEQEARAAIRKIEEALQNVSLHRADLGTMMNRLQFNIENLNSQSMALTDAASRIE 228
+ + ++ + I+ AL V R+ LG + NR I NL + L A SRIE
Sbjct: 408 EDAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIE 467

Query: 229 DADMAQEMSDFLKFKLLTEVALSMVSQANQIPQMISKLLQ 268
DAD A E+S+ K ++L + S+++QANQ+PQ + LL+
Sbjct: 468 DADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
BALH_1499  FLAGELLIN  191    3e-57    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 191 bits (485), Expect = 3e-57
Identities = 135/508 (26%), Positives = 224/508 (44%), Gaps = 59/508 (11%)

Query: 1 MRINTNINSLRTQEYMRQNQTKMSNAMDRLSSGKRINNASDDAAGLAIATRMRARESGLG 60
INTN SL TQ + ++Q+ +S+A++RLSSG RIN+A DDAAG AIA R + GL
Sbjct: 2 QVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLT 61

Query: 61 VAANNTQDGMSLIRTADSAMNSVSNILLRMRDIANQSANGTNTDSNKSALQKEFSELQKQ 120
A+ N DG+S+ +T + A+N ++N L R+R+++ Q+ NGTN+DS+ ++Q E + ++
Sbjct: 62 QASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEE 121

Query: 121 ITYIADNTQFNDKNLLNADSEVKIQTLDSSNGDQQIGIDLKAVTLEKLGINNISIGSATT 180
I +++ TQFN +L+ D+++KIQ +N + I IDL+ + ++ LG++ ++
Sbjct: 122 IDRVSNQTQFNGVKVLSQDNQMKIQV--GANDGETITIDLQKIDVKSLGLDGFNVNGPKE 179

Query: 181 A---------------DLKQTDIEAVSTKIAALDKDSVAKDITDIKAAIDKIKDGMKPED 225
A D + + + T +G D
Sbjct: 180 ATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTD 239

Query: 226 VTKLNAALDGFKTGEADDDAAGVTAIKTALS----------------------------- 256
+ N A+D FKT ++ A AI A+
Sbjct: 240 DAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKV 299

Query: 257 KVELPKGSFEVAQKDLDDVSTKIAALDKDSVA-KDITDIKAAIDKIKDGMKPEDVTKLNA 315
+ + D+ + + A S + +
Sbjct: 300 STTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLE 359

Query: 316 ALDGFK------------TGEADDDAAGVTAIKTALSKVELPKLGDTIKPTTNSKADSLA 363
A + K T A D + + K + +K +
Sbjct: 360 ANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKSTAN 419

Query: 364 AVAAIDKALTTVADNRATLGATLNRLDFNVNNLKSQSSAMAASASQIEDADMAKEMSEMT 423
+A+ID AL+ V R++LGA NR D + NL + + + ++ S+IEDAD A E+S M+
Sbjct: 420 PLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSNMS 479

Query: 424 KFKILNEAGISMLSQANQTPQMVSKLLQ 451
K +IL +AG S+L+QANQ PQ V LL+
Sbjct: 480 KAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_1500  PF06580  29     0.020    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.1 bits (65), Expect = 0.020
Identities = 8/42 (19%), Positives = 20/42 (47%), Gaps = 1/42 (2%)

Query: 122 LTKKY-NIQKIRSSNEGKYEDIIDRVSHTYGIPKTLIQKMIE 162
+ Y + I+ + ++E+ I+ +P L+Q ++E
Sbjct: 224 VVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVE 265


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1501  TYPE3OMOPROT  42     4e-08    Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 41.5 bits (97), Expect = 4e-08
Identities = 14/67 (20%), Positives = 31/67 (46%)

Query: 5 DDIPLTIYFEIGNTKKKIEDLLHITKGTLYRLENSTKNTVRLMLENEEIGTGKILTKNGK 64
+ +P+ + F + + +L + + L L + + V +M +G G+++ N
Sbjct: 228 NQLPVKLEFVLYRKNVTLAELEAMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDT 287

Query: 65 MYVEIVE 71
+ VEI E
Sbjct: 288 LGVEIHE 294


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1502  FLGMOTORFLIM  145    5e-43    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 145 bits (367), Expect = 5e-43
Identities = 91/329 (27%), Positives = 166/329 (50%), Gaps = 10/329 (3%)

Query: 4 EKLSQEQIDALLKAVNEGEEMPAFAQEAGKQEKFQEYDFNRPEKFGVEHLRSLQAIASTF 63
E LSQ++ID LL A++ G+ A+ K YDF RP+KF E +R+L + TF
Sbjct: 3 EVLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETF 62

Query: 64 GKQTSQTLSARMRIPIELEPSTVEQVPFTSEYVEKMPKDYYLYCVIDLGLPELGEIVIEI 123
+ T+ +LSA++R + + ++V+Q+ + E++ +P L VI + P G V+E+
Sbjct: 63 ARLTTTSLSAQLRSMVHVHVASVDQLTY-EEFIRSIPTPSTL-AVITMD-PLKGNAVLEV 119

Query: 124 DLAFVIYIHECWLGGDSKRNFTMRRPLTAFEFLTLDNIFMLLCKNLEQSFESVVAIEPKF 183
D + I + GG + ++R LT E ++ + + + N+ +S+ V+ + P+
Sbjct: 120 DPSITFSIIDRLFGGT-GQAAKVQRDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRL 178

Query: 184 VTTETDPNALKITTASDIISLLNVNMKTDFWNTTVRIGIPFLSVEEIMDKLTSENIVEHS 243
ET+P +I S+++ L+ + K + IP++++E I+ KL+S+ S
Sbjct: 179 GQIETNPQFAQIVPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFW--FS 236

Query: 244 SDKRKK---YTSEVEAKVNQVYKPVHVAIGEQKMTMGEIEQIEEGDIIPLH-TKVSDELL 299
S +R Y + K++ V V +G ++++ +I + GDII LH T V D +
Sbjct: 237 SVRRSSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFV 296

Query: 300 GYVDGKHKFNCFIGKDGTRKALLFKSFVE 328
+ + KF C G G + A +E
Sbjct: 297 LSIGNRKKFLCQPGVVGKKIAAQILERIE 325


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1503  FLGMOTORFLIN  58     4e-14    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 58.0 bits (140), Expect = 4e-14
Identities = 22/94 (23%), Positives = 50/94 (53%)

Query: 13 LEDFAGKRNEASKAHIDTVSDISIELGVKLGKASITLGDVKELKVGDVLEVEKNLGHKVD 72
+ G + ID + DI ++L V+LG+ +T+ ++ L G V+ ++ G +D
Sbjct: 39 FQQLGGGDVSGAMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLD 98

Query: 73 VYLSNMKVGIGEAIVMDEKFGIIISEIEADKKQA 106
+ ++ + GE +V+ +K+G+ I++I ++
Sbjct: 99 ILINGYLIAQGEVVVVADKYGVRITDIITPSERM 132


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1504  FLGBIOSNFLIP  163    4e-52    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 163 bits (415), Expect = 4e-52
Identities = 71/203 (34%), Positives = 127/203 (62%)

Query: 48 SSVQLFALVTLLSLSSSIVLLFTHFTYFMIVLGITRQGLGVMNLPPNQVLVGLALFLSLF 107
VQ +T L+ +I+L+ T FT +IV G+ R LG + PPNQVL+GLALFL+ F
Sbjct: 40 LPVQTLVFITSLTFIPAILLMMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFF 99

Query: 108 TMQPVLGQLKSDVWDPMTKEKITVSQAAETTAPIMKEYMSKHTYKHDLKMMLKVRGEELP 167
M PV+ ++ D + P ++EKI++ +A E A ++E+M + T + DL + ++
Sbjct: 100 IMSPVIDKIYVDAYQPFSEEKISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPL 159

Query: 168 KDLKDLSLFTLVPSFTLTQIQKGLLTGMFIYLAFVFIDLIISTLLMYLGMMMVPPMILSL 227
+ + + + L+P++ ++++ G I++ F+ IDL+I+++LM LGMMMVPP ++L
Sbjct: 160 QGPEAVPMRILLPAYVTSELKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIAL 219

Query: 228 PFKILVFVYLGGYTKIVDIMFKT 250
PFK+++FV + G+ +V + ++
Sbjct: 220 PFKLMLFVLVDGWQLLVGSLAQS 242


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1505  TYPE3IMRPROT  96     5e-26    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 96.4 bits (240), Expect = 5e-26
Identities = 53/233 (22%), Positives = 113/233 (48%), Gaps = 1/233 (0%)

Query: 10 FFAFCRITSFLYFLPFFSGRSIPAMAKVTFGLALSITVADQVDVSHIKTTWDVAA-YAGT 68
F+ R+ + + P S RS+P K+ + ++ +A + + + A A
Sbjct: 17 FWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDVPVFSFFALWLAVQ 76

Query: 69 QIVIGLSLSKIVEMLWNIPKMAGHILDFDIGLSQASLFDVNAGSQSTLLSTIFDIFFLII 128
QI+IG++L ++ + + AG I+ +GLS A+ D + +L+ I D+ L++
Sbjct: 77 QILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDMLALLL 136

Query: 129 FISLGGINYFVATILKSFQYTEAISKLLTTSFLDSLLATLLFAITSAVEIALPLMGSLFI 188
F++ G + ++ ++ +F + L ++ +L + + +ALPL+ L
Sbjct: 137 FLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFLNGLMLALPLITLLLT 196

Query: 189 INFVLILIAKNAPQLNIFMNAFVIKITCGILFIAMSVPMLGYVFKNMTDVLLE 241
+N L L+ + APQL+IF+ F + +T GI +A +P++ +++ +
Sbjct: 197 LNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCEHLFSEIFN 249


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1506  TYPE3IMSPROT  289    2e-98    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 289 bits (742), Expect = 2e-98
Identities = 92/343 (26%), Positives = 186/343 (54%), Gaps = 2/343 (0%)

Query: 4 DNKTEKATPQKRKKSREEGNIARSKDLNNLFSILVLAVVVYFFGDWLGFEIANSVSVLFD 63
KTE+ TP+K + +R++G +A+SK++ + I+ L+ ++ D+ + + + +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAE 62

Query: 64 QIGKNTDS--TEYFYMMGILLLKVSAPILILVYAFHLFNYMIQVGFLFSSKVIKPKASRI 121
Q + + + + P+L + + ++++Q GFL S + IKP +I
Sbjct: 63 QSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIKKI 122

Query: 122 NPKNYFTRLFSRKSLVDILKSLFYMGLIGYVAYVLFKKNLEKIVSMIGFNWTASLTEIIR 181
NP R+FS KSLV+ LKS+ + L+ + +++ K NL ++ + + +
Sbjct: 123 NPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLGQ 182

Query: 182 QIKFIFLAILIILIVLSIIDFIYQKWEYEQDIKMKKEEVKQEHKDNEGDPQVKGKRKNFM 241
++ + + + +V+SI D+ ++ ++Y +++KM K+E+K+E+K+ EG P++K KR+ F
Sbjct: 183 ILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQFH 242

Query: 242 HAILQGTIAKKMDGATFIVNNPTHISVVLRYNKHVDAAPIVVAKGEDELALYIRTLAREQ 301
I + + + ++ +V NPTHI++ + Y + P+V K D +R +A E+
Sbjct: 243 QEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEEE 302

Query: 302 EIPMVENRPLARSLYYQVEEDETIPEDLYVAVIEVMRYLIQTN 344
+P+++ PLAR+LY+ D IP + A EV+R+L + N
Sbjct: 303 GVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQN 345


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
BALH_1509  FLGHOOKAP  28     0.042    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 28.0 bits (62), Expect = 0.042
Identities = 9/57 (15%), Positives = 20/57 (35%), Gaps = 3/57 (5%)

Query: 2 NGLYIGSMGMMNYMQRINVHSNNVANAQTTGFKAENMTSKVFDVQDAYRRGDGAVTN 58
+ + G+ +N SNN+++ G+ + ++ G V N
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMA---QANSTLGAGGWVGN 55


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_1513  HTHTETR  74     8e-19    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 74.3 bits (182), Expect = 8e-19
Identities = 36/191 (18%), Positives = 70/191 (36%), Gaps = 33/191 (17%)

Query: 1 MAKPN----VVNKEKLLQAAKEIIAEHGMEKLTLKAVAKSAQVTQGTVYYHFKTKDQLLL 56
MA+ ++ +L A + ++ G+ +L +AK+A VT+G +Y+HFK K L
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 57 EVTEAFCKASWEQIGKDVQLEKALQSAESRCVKDSMYHHLFFQLVASGLQNDAMKDKIGG 116
E+ + S IG+ +A + V + H+ S + + + +
Sbjct: 61 EI----WELSESNIGELELEYQAKFPGDPLSVLREILIHVL----ESTVTEERRRLLMEI 112

Query: 117 LLHYENQQ--------------------LTRVLNKNI-GGTMTSQISTETWSVLCNALID 155
+ H + + L I + + + T +++ I
Sbjct: 113 IFHKCEFVGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYIS 172

Query: 156 GLALQALFNPS 166
GL LF P
Sbjct: 173 GLMENWLFAPQ 183


11  BALH_1614  BALH_1630  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_1614  2       17       0.928324   SNF family sodium-dependent transporter
BALH_1615  1       23       2.104879   polysaccharide deacetylase, chitooligosaccharide
BALH_1616  2       21       3.374342   hypothetical protein
BALH_1617  0       20       3.600707   hypothetical protein
BALH_1618  -2      18       4.735141   fibronectin-binding protein
BALH_1619  -1      17       5.927019   dehydrogenase, 3-hydroxyisobutyrate
BALH_1620  -3      13       3.021438   hypothetical protein
BALH_1621  -3      11       3.001516   peptide methionine sulfoxide reductase
BALH_1622  -3      12       3.867128   short chain dehydrogenase
BALH_1623  -2      12       3.275312   branched-chain amino acid aminotransferase
BALH_1624  -2      10       2.672004   acetolactate synthase 3 catalytic subunit
BALH_1625  -1      11       1.835651   acetolactate synthase 1 regulatory subunit
BALH_1626  -1      9        1.481691   ketol-acid reductoisomerase
BALH_1628  0       9        0.985369   dihydroxy-acid dehydratase
BALH_1629  1       9        -0.299760  threonine dehydratase
BALH_1630  2       11       -0.426988  capsule biosynthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_1618  PF07299  348    e-126    Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 348 bits (895), Expect = e-126
Identities = 219/219 (100%), Positives = 219/219 (100%)

Query: 1 MYGVIKMEAFIRSDQYNFIKSQAYILANGHATANDRGVIQALKSLAIEKIIHVFENLTDE 60
MYGVIKMEAFIRSDQYNFIKSQAYILANGHATANDRGVIQALKSLAIEKIIHVFENLTDE
Sbjct: 1 MYGVIKMEAFIRSDQYNFIKSQAYILANGHATANDRGVIQALKSLAIEKIIHVFENLTDE 60

Query: 61 QKELIDTVLTVQNREDAESFLLKINPYVIPFQEVTAQTLKKLFPKAKKLKLPDMEELDMK 120
QKELIDTVLTVQNREDAESFLLKINPYVIPFQEVTAQTLKKLFPKAKKLKLPDMEELDMK
Sbjct: 61 QKELIDTVLTVQNREDAESFLLKINPYVIPFQEVTAQTLKKLFPKAKKLKLPDMEELDMK 120

Query: 121 ELSYLSWIDKGSSRKFIIAKNDKNKFVGLQGTFQSLNKKSICSLCHGHEEVGMFLVEIKG 180
ELSYLSWIDKGSSRKFIIAKNDKNKFVGLQGTFQSLNKKSICSLCHGHEEVGMFLVEIKG
Sbjct: 121 ELSYLSWIDKGSSRKFIIAKNDKNKFVGLQGTFQSLNKKSICSLCHGHEEVGMFLVEIKG 180

Query: 181 DIPGTFVKKGNYICKDGVACNQNMKSLDKLQDFIERLKK 219
DIPGTFVKKGNYICKDGVACNQNMKSLDKLQDFIERLKK
Sbjct: 181 DIPGTFVKKGNYICKDGVACNQNMKSLDKLQDFIERLKK 219


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1622  DHBDHDRGNASE  88     5e-23    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 88.2 bits (218), Expect = 5e-23
Identities = 68/263 (25%), Positives = 121/263 (46%), Gaps = 21/263 (7%)

Query: 2 LKGKVALVTGASRGIGRAIAKRLANDGALV-AIHYGNRKEEAEETVYEIQSNGGSAFSIG 60
++GK+A +TGA++GIG A+A+ LA+ GA + A+ Y K E + + ++ AF
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP-- 63

Query: 61 ANLESLHGVEALYSSLDNELQNRTGSTKFDILINNAGIGPGAFIEETTEQFFDRMVSVNA 120
A++ ++ + + ++ E+ DIL+N AG+ I +++ ++ SVN+
Sbjct: 64 ADVRDSAAIDEITARIEREMG------PIDILVNVAGVLRPGLIHSLSDEEWEATFSVNS 117

Query: 121 KAPFFIIQQALSRLRD--NSRIINISSAATRISLPDFIAYSMTKGAINTMTFTLAKQLGA 178
F + + D + I+ + S + AY+ +K A T L +L
Sbjct: 118 TGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAE 177

Query: 179 RGITVNAILPGFIKTDMNAELLSDP---------MMKQYATTISAFNRLGEVEDIADTAA 229
I N + PG +TDM L +D ++ + T I +L + DIAD
Sbjct: 178 YNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIP-LKKLAKPSDIADAVL 236

Query: 230 FLASPDSRWVTGQLIDVSGGSCL 252
FL S + +T + V GG+ L
Sbjct: 237 FLVSGQAGHITMHNLCVDGGATL 259


12  BALH_1644  BALH_1651  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_1644  3       17       -3.014958  hypothetical protein
BALH_1645  2       18       -1.679881  hypothetical protein
BALH_1646  2       19       -1.261760  hypothetical protein
BALH_1647  2       18       -1.129106  oligopeptide ABC transporter substrate-binding
BALH_1648  3       18       -1.196285  hypothetical protein
BALH_1649  1       16       -1.039109  hypothetical protein
BALH_1650  2       18       -0.596185  hypothetical protein
BALH_1651  2       18       -1.844322  coenzyme PQQ synthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1644  PYOCINKILLER  34     4e-04    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 34.4 bits (78), Expect = 4e-04
Identities = 26/107 (24%), Positives = 43/107 (40%), Gaps = 15/107 (14%)

Query: 154 GELAGDKHPVTGIPYDL----DGFPIFESKGEVFLKEADFKKSRPTHFRKCNKAFYKQIM 209
G G PV+G +G PI + L+ FK ++R + F+ +
Sbjct: 491 GAATGKGQPVSGNWLGAASQGEGAPIPSQIADK-LRGKTFK-----NWRDFREQFWIAVA 544

Query: 210 EDPNLASKFTEEEIQMFKQGETPKNYTWHHHQDAGRMQLVDYQIHHD 256
DP L+ +F + + + G P Y Q GR++ +IHH
Sbjct: 545 NDPELSKQFNPGSLAVMRDGGAP--YVRESEQAGGRIK---IEIHHK 586


13  BALH_1854  BALH_1865  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_1854  2       14       -1.235696  glycosyltransferase
BALH_1855  2       16       -1.813212  hypothetical protein
BALH_1856  2       15       -1.781180  GntR family transcriptional regulator
BALH_1857  1       14       -2.559118  acetyltransferase
BALH_1858  2       15       -2.545130  hypothetical protein
BALH_1859  2       17       -2.281096  acetyltransferase
BALH_1860  2       19       -3.228578  hypothetical protein
BALH_1861  3       18       -1.463210  acetyltransferase
BALH_1862  4       18       -1.874483  hypothetical protein
BALH_1863  3       19       -2.406624  hypothetical protein
BALH_1864  1       19       -1.981112  hypothetical protein
BALH_1865  2       17       -0.897185  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1857  SACTRNSFRASE  38     5e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 37.6 bits (87), Expect = 5e-06
Identities = 24/103 (23%), Positives = 41/103 (39%), Gaps = 5/103 (4%)

Query: 27 SREEASCLFQKMKEESYKLFSLRNEEEAVVSLAGVAICTNFYNEKHVFVYDLVTAEAHRS 86
E+ ++EE F E + + I +N+ + + D+ A+ +R
Sbjct: 49 QYEDDDMDVSYVEEEGKAAFLYYLENNCI---GRIKIRSNW--NGYALIEDIAVAKDYRK 103

Query: 87 KGYGNVLLSYVEKWGKEKGCSSIVLTSAFPRIDAHRFYEREGF 129
KG G LL +W KE ++L + I A FY + F
Sbjct: 104 KGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_1859  PF05616  29     0.016    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 28.6 bits (63), Expect = 0.016
Identities = 16/42 (38%), Positives = 23/42 (54%)

Query: 43 YFSFSMQEYSVYKEKMQTRLKEEPLSNLIIENNGQVIGTVGF 84
+ SFS+Q S YKE+M + EE LS + N + I G+
Sbjct: 220 FISFSLQGNSKYKEEMDAKKLEEILSLKVDANPDKYIKATGY 261


14  BALH_1909  BALH_1920  Y  N  N  Genomic Island
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_1909  3       18       -1.808775  hypothetical protein
BALH_1910  3       16       -1.907648  N-acetyltransferase family protein
BALH_1911  3       16       -1.499063  hypothetical protein
BALH_1912  6       21       -2.461553  stage V sporulation protein S
BALH_1913  7       23       -2.771136  hypothetical protein
BALH_1914  7       25       -2.827311  truncated phage-like protein
BALH_1915  4       24       -2.141658  hypothetical protein
BALH_1916  3       23       -1.620370  hypothetical protein
BALH_1917  3       22       -2.771682  hypothetical protein
BALH_1918  1       20       -4.096317  hypothetical protein
BALH_1919  0       21       -1.987537  hypothetical protein
BALH_1920  2       16       -0.561952  N-acetylmuramoyl-L-alanine amidase
15  BALH_1949  BALH_1973  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_1949  2       15       -1.533228  hypothetical protein
BALH_1950  3       16       -1.997444  hypothetical protein
BALH_1951  2       17       -1.753338  hypothetical protein
BALH_1952  1       17       -1.801140  hypothetical protein
BALH_1953  1       17       -1.694601  hypothetical protein
BALH_1954  -1      20       -3.134736  hypothetical protein
BALH_1955  -1      18       -2.867936  hypothetical protein
BALH_1956  0       17       -1.316504  resolvase
BALH_1957  2       18       -2.668684  transposase
BALH_1958  8       22       -1.305999  hypothetical protein
BALH_1959  9       25       -1.418639  hypothetical protein
BALH_1960  8       25       -2.137228  hypothetical protein
BALH_1961  9       26       -2.094933  hypothetical protein
BALH_1962  9       24       -3.178533  hypothetical protein
BALH_1963  8       24       -3.486794  hypothetical protein
BALH_1964  4       22       -3.292865  hypothetical protein
BALH_1965  3       18       -3.179235  hypothetical protein
BALH_1966  1       18       -2.877088  hypothetical protein
BALH_1967  -1      17       -3.366074  hypothetical protein
BALH_1968  -3      16       -3.401018  hypothetical protein
BALH_1969  -3      15       -2.809704  glycerate kinase
BALH_1971  -3      15       -3.505422  transporter
BALH_1972  0       16       -3.311249  DNA-binding response regulator
BALH_1973  -1      15       -3.178125  sensor histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1950  SYCDCHAPRONE  30     0.018    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 30.3 bits (68), Expect = 0.018
Identities = 20/120 (16%), Positives = 41/120 (34%), Gaps = 6/120 (5%)

Query: 596 REIDKENNEAAYLLASANFRIGKYQEAVQNFEQALANNAKGIEPYKKDAMRDLAVSHMKM 655
EI + E Y LA ++ GKY++A + F+ ++ Y L M
Sbjct: 29 NEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCV-----LDHYDSRFFLGLGACRQAM 83

Query: 656 KEFEKAEDVIVKMSTKTNEDKAIVSYLKGQLSTATVQLEKAESFFKEAIMQDSKNPIYTI 715
+++ A + ++ + + +L +AES A + +
Sbjct: 84 GQYDLAIHSYSYGAIMDIKE-PRFPFHAAECLLQKGELAEAESGLFLAQELIADKTEFKE 142


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_1951  PF03544  40     2e-05    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 40.0 bits (93), Expect = 2e-05
Identities = 19/89 (21%), Positives = 28/89 (31%), Gaps = 2/89 (2%)

Query: 171 DPNTP--VDPNTPVDPKPPVDPNTSVDPKPPIDPNTPVDPKPPVDPNTPVDPKPPVVEPK 228
P P V P D +PP +P +P P+PP + ++ P +PK
Sbjct: 45 APAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPK 104

Query: 229 PPVVDPKPPVVDPKPPEPTHPTITVSPKT 257
P V P + P
Sbjct: 105 PKPVKKVEQPKRDVKPVESRPASPFENTA 133



Score = 39.6 bits (92), Expect = 3e-05
Identities = 25/108 (23%), Positives = 37/108 (34%), Gaps = 6/108 (5%)

Query: 150 VADIPKSPTEPNKPVDTKLQVDPNTPVDPNTPVDPKPPVDPNTSVDPKP-PIDPNTPVDP 208
V + + P P +P+ + + P+P V+P +P P P V
Sbjct: 37 VHQVIELPA-PAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIE 95

Query: 209 K----PPVDPNTPVDPKPPVVEPKPPVVDPKPPVVDPKPPEPTHPTIT 252
K P P + P + KP P P + P PT T T
Sbjct: 96 KPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTAT 143



Score = 35.0 bits (80), Expect = 0.001
Identities = 19/83 (22%), Positives = 29/83 (34%), Gaps = 5/83 (6%)

Query: 178 PNTPVDPKPP-VDPNTSVDPKPPIDPNTPVDPKPPVDPNTPVDPKPPVVEPKPPVVDPKP 236
P +P V D +PP P P+P V+P +P P + P V++
Sbjct: 41 IELPAPAQPISVTMVAPADLEPPQAVQPP--PEPVVEPEPEPEPIPEPPKEAPVVIEKPK 98

Query: 237 PVVDPKPPEPTHPTITVSPKTGD 259
P PKP + +
Sbjct: 99 P--KPKPKPKPVKKVEQPKRDVK 119


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
BALH_1953  RTXTOXIND  38     6e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 38.3 bits (89), Expect = 6e-05
Identities = 27/175 (15%), Positives = 66/175 (37%), Gaps = 30/175 (17%)

Query: 5 LEIKVKPEQLEQIAKNISEMQTHSQNIQQNLN--QSMFSIQMQWQGATSQHFY----GEY 58
L + K + + I+ + S+ + L+ S+ + A ++H +Y
Sbjct: 207 LNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLH-----KQAIAKHAVLEQENKY 261

Query: 59 MRSMRLMESYIRNLQVTEKELRRIAQKFRQADEEYQKKQNEKLKEAHKK--EKKNEKSWW 116
+ ++ + Y L+ E E+ ++++ + ++ + +KL++ E +
Sbjct: 262 VEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELA-- 319

Query: 117 EKGIEGAAEFIGVNDAIRAVTGKDPITG--KELS--TKERLIAAGWTLLNFVPVG 167
E IRA P++ ++L T+ ++ TL+ VP
Sbjct: 320 ------KNEERQQASVIRA-----PVSVKVQQLKVHTEGGVVTTAETLMVIVPED 363


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_1972  HTHFIS  102    1e-27    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 102 bits (257), Expect = 1e-27
Identities = 31/115 (26%), Positives = 62/115 (53%)

Query: 4 RILIVEDEEKIARVVQLELEFEGYESEIAKTGTEAMENFENGNWDLILLDVMLPNISGLE 63
IL+ +D+ I V+ L GY+ I G+ DL++ DV++P+ + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 VLRRIRLKNAVIPIILLTARDSVVDKVSGLDQGASDYITKPFQIEELLARIRACL 118
+L RI+ +P+++++A+++ + + ++GA DY+ KPF + EL+ I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119


16  BALH_2062  BALH_2079  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2062  3       14       -2.152683  L-lysine 2,3-aminomutase
BALH_2063  2       16       -3.594857  hypothetical protein
BALH_2064  2       17       -3.407330  hypothetical protein
BALH_2065  0       15       -1.977702  hypothetical protein
BALH_2066  0       15       -1.505357  hypothetical protein
BALH_2067  -1      13       -2.117412  serine/threonine protein kinase
BALH_2068  -2      13       -1.660411  sporulation-control protein
BALH_2069  -2      16       -1.552181  phosphatidylglycerophosphatase B
BALH_2070  -1      15       -1.661680  cation efflux protein
BALH_2071  3       19       -2.557077  thioredoxin
BALH_2072  3       18       -2.794136  hypothetical protein
BALH_2073  4       21       -2.332908  endoribonuclease L-PSP
BALH_2074  3       19       -2.726454  hypothetical protein
BALH_2075  2       16       -3.649176  hypothetical protein
BALH_2076  3       17       -3.493157  hypothetical protein
BALH_2077  2       18       -4.270710  complement component C1q domain-containing
BALH_2078  4       16       -4.083536  hypothetical protein
BALH_2079  2       17       -4.327094  peptidyl-prolyl isomerase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2078  TCRTETB  26     0.009    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 26.0 bits (57), Expect = 0.009
Identities = 10/30 (33%), Positives = 15/30 (50%), Gaps = 3/30 (10%)

Query: 18 VTFFGPCNEVITNVS---IINQLSTTKCQT 44
++FF NE++ NVS I N + T
Sbjct: 22 LSFFSVLNEMVLNVSLPDIANDFNKPPAST 51


17  BALH_2131  BALH_2160  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2131  2       13       -2.427164  ABC transporter permease
BALH_2132  2       15       -2.163391  oligopeptide transport system permease
BALH_2133  3       14       -1.940820  oligopeptide ABC transporter permease
BALH_2134  3       15       -1.315645  zinc uptake transporter
BALH_2135  3       17       -2.353598  methyltransferase
BALH_2136  2       17       -2.336590  hypothetical protein
BALH_2137  2       19       -1.509303  hypothetical protein
BALH_2138  3       19       -2.317456  Zn-dependent hydrolase
BALH_2139  1       20       -2.536880  inosine-uridine preferring nucleoside hydrolase
BALH_2140  0       14       -2.511815  hypothetical protein
BALH_2141  0       13       -1.974220  hypothetical protein
BALH_2142  -1      14       -2.418626  metallo-beta-lactamase family protein
BALH_2143  -2      15       -3.020638  hypothetical protein
BALH_2144  -2      16       -3.269405  TetR family transcriptional regulator
BALH_2146  -1      17       -2.726454  MmpL family membrane protein
BALH_2147  0       18       -4.005430  hypothetical protein
BALH_2148  0       18       -3.965521  chloramphenicol O-acetyltransferase
BALH_2149  -1      16       -3.900004  acetyltransferase
BALH_2150  0       16       -3.050078  acetyltransferase
BALH_2151  -2      13       -2.571505  ribosomal-protein-alanine acetyltransferase
BALH_2152  -2      12       -3.457785  DMT family permease
BALH_2153  1       14       -3.475745  MerR family transcriptional regulator
BALH_2154  1       16       -3.659706  hypothetical protein
BALH_2155  1       14       -2.846444  alpha/beta fold family hydrolase
BALH_2156  0       14       -3.095542  protoporphyrinogen oxidase
BALH_2157  1       18       -3.404420  hypothetical protein
BALH_2158  1       17       -3.639696  SAM-dependent methyltransferase
BALH_2159  -1      16       -3.585976  ribosomal-protein-alanine acetyltransferase
BALH_2160  -2      15       -3.095730  cold-shock DNA-binding protein family protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2131  TCRTETA  34     0.001    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 33.6 bits (77), Expect = 0.001
Identities = 24/119 (20%), Positives = 45/119 (37%), Gaps = 8/119 (6%)

Query: 57 ATMTQIMIALPAL--IFF--LLVGTVVDRFDRQRICTVSNICCSLCNIGILISLYYGMII 112
AT I +A + ++ G V R +R + I I + + M
Sbjct: 245 ATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAF 304

Query: 113 LVFLFLFLENACIQFFSPSEQSMIQGVVESDQYGAAAGINQMVNSLYALFGVGIATMVY 171
+ + L A P+ Q+M+ V+ ++ G G + SL ++ G + T +Y
Sbjct: 305 PIMVLL----ASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIY 359



Score = 31.7 bits (72), Expect = 0.005
Identities = 28/198 (14%), Positives = 67/198 (33%), Gaps = 28/198 (14%)

Query: 181 LVNTLTFILSGILIQTISIPEKVRLPNGRTKWKEVNLKMLITEFKEGIRYIYQNETLKKL 240
+N L F+ L+ K + L+ R+ + L
Sbjct: 168 ALNGLNFLTGCFLLPE------------SHKGERRPLRREALNPLASFRWARGMTVVAAL 215

Query: 241 LLGFIVFGLLNGILSVSTTYIL----KYKLAPATYESLAMVGGVVGGISLLIGSIVATSI 296
+ F + L+ + + +++ ++ T G++ ++ + + +
Sbjct: 216 MAVFFIMQLVGQVPA--ALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAM---ITGPV 270

Query: 297 GKKYAPKPIIVFGMAGSGIFFGMCYFVNYVWSFY---VCIAFATFFLPFINVAIMGWMYE 353
+ + ++ GM G + + F W + V +A +P A+ +
Sbjct: 271 AARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMP----ALQAMLSR 326

Query: 354 IVEESFMGRVQSLLSPLT 371
V+E G++Q L+ LT
Sbjct: 327 QVDEERQGQLQGSLAALT 344


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2144  HTHTETR  86     4e-23    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 86.2 bits (213), Expect = 4e-23
Identities = 37/181 (20%), Positives = 70/181 (38%), Gaps = 18/181 (9%)

Query: 1 MEQKQRPLGRPRQNKNTKSTKETILEVATRLFLTQNYQGVSMDEVAKVCGVTKATVYYYF 60
M +K + + T++ IL+VA RLF Q S+ E+AK GVT+ +Y++F
Sbjct: 1 MARKTKQEAQE--------TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHF 52

Query: 61 STKADLFTATMIQMMIRIRENMSQILS-TNNTLEERLLNFAKVYLHATMDIDMKNFMKDA 119
K+DLF+ I E + + L L +T+ + + + +
Sbjct: 53 KDKSDLFSEIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEI 112

Query: 120 KLSLSEEQLKELKK-------AEDSMYEVLEKALDKAMQLGEIQKG-NPKFAAHAFVSLL 171
+ E + E+ Y+ +E+ L ++ + + AA +
Sbjct: 113 -IFHKCEFVGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYI 171

Query: 172 S 172
S
Sbjct: 172 S 172


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2146  ACRIFLAVINRP  54     2e-09    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 54.5 bits (131), Expect = 2e-09
Identities = 38/232 (16%), Positives = 86/232 (37%), Gaps = 25/232 (10%)

Query: 203 LLIATVLLVLVLLILLYRSPILAILPLLVVGFAYGIISPTLGFLADHGWIKVDAQAISIM 262
L A +L+ LV+ + L ++ ++P + V + T LA G+ +I+ +
Sbjct: 344 LFEAIMLVFLVMYLFL-QNMRATLIPTIAVPVV---LLGTFAILAAFGY------SINTL 393

Query: 263 T----VLLFGAGTDYCLFLISRYREYLLEEESKYK-ALQLAIKASGGAIIMSALTVVLGL 317
T VL G D + ++ ++E++ K A + ++ GA++ A+ +
Sbjct: 394 TMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVF 453

Query: 318 GTLLL--AHYGAFHR-FAVPFSVAVFIMGIAALTILPALLLIFGRAAFFPFVPRTTSMNE 374
+ GA +R F++ A+ + + AL + PAL + P + +E
Sbjct: 454 IPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLK-------PVSAEHHE 506

Query: 375 ELARRKKKVVKVKKSKGAFSKKLGDVVVRRPWTIIMLTVFVLGGLASFVPRI 426
++ +++ ++ G+ R+
Sbjct: 507 NKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRL 558



Score = 37.9 bits (88), Expect = 2e-04
Identities = 27/161 (16%), Positives = 67/161 (41%), Gaps = 9/161 (5%)

Query: 203 LLIATVLLVLVLLILLYRSPILAILPLLVVGFAYGIISPTLGFLADHGWIKVDAQAISIM 262
L+ + ++V + L LY S + + +LVV I+ L + V +
Sbjct: 875 LVAISFVVVFLCLAALYESWSIPVSVMLVVPLG--IVGVLLAATLFNQKNDVYFM---VG 929

Query: 263 TVLLFGAGTDYCLFLISRYREYLLEE-ESKYKALQLAIKASGGAIIMSALTVVLGLGTLL 321
+ G + ++ ++ + +E + +A +A++ I+M++L +LG+ L
Sbjct: 930 LLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLA 989

Query: 322 LAH---YGAFHRFAVPFSVAVFIMGIAALTILPALLLIFGR 359
+++ GA + + + + A+ +P ++ R
Sbjct: 990 ISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRR 1030



Score = 33.7 bits (77), Expect = 0.003
Identities = 33/202 (16%), Positives = 70/202 (34%), Gaps = 21/202 (10%)

Query: 533 AGISNAEDQL--WIGGETASLYDTKQITERDEAVIIPVMISIIALLLLVYLRSIVAMIYL 590
A + N +L IG + + ++++ ++ + ++ L L S + +
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 591 IVTVVLSFFSALGAGWLLLHYGMGVPAIQGAIPLYAFVFLVALGEDYNIFMVSEIWKNRK 650
++ V L L A L + V + G + + L I +V +
Sbjct: 901 MLVVPLGIVGVLLAAT-LFNQKNDVYFMVG------LLTTIGLSAKNAILIVEFAKDLME 953

Query: 651 TQNHLDAVKNGVIQTGSVITSAGLILAGTFAVLGTLPIQV------LVQFGIVTAI--GV 702
+ V + + L+ + F +LG LP+ + Q + + G+
Sbjct: 954 KEGK--GVVEATLMAVRMRLRPILMTSLAF-ILGVLPLAISNGAGSGAQNAVGIGVMGGM 1010

Query: 703 LLDTFIVRPLLVPAITVVLGRF 724
+ T + VP VV+ R
Sbjct: 1011 VSATLLAI-FFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2150  SACTRNSFRASE  41     1e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 41.1 bits (96), Expect = 1e-06
Identities = 26/98 (26%), Positives = 42/98 (42%), Gaps = 6/98 (6%)

Query: 49 YSSVEMMRYSIEELDS--YKVIMDEKIIGGIIVTISGKSYGRIDRIFVEPVYQGKGIGSN 106
Y +M +EE + ++ IG I + + Y I+ I V Y+ KG+G+
Sbjct: 50 YEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTA 109

Query: 107 VIKL-IE--AEYPSIRIWDLETSSRQINNHHFYKKMGY 141
++ IE E + LET I+ HFY K +
Sbjct: 110 LLHKAIEWAKENHFCGLM-LETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2151  SACTRNSFRASE  38     6e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 38.0 bits (88), Expect = 6e-06
Identities = 25/91 (27%), Positives = 36/91 (39%), Gaps = 11/91 (12%)

Query: 72 FVAEYDGEVVGFVGLTQSPGRRSHSGDLFIGVDSEYHNKGIGKALLTKMLDLADNWLMLE 131
F+ + +G + + + + D I V +Y KG+G AL L A W E
Sbjct: 68 FLYYLENNCIGRIKIRSNWNGYALIED--IAVAKDYRKKGVGTAL----LHKAIEW-AKE 120

Query: 132 RVELGV-LET---NPKAKTLYEKFGFVEEGV 158
G+ LET N A Y K F+ V
Sbjct: 121 NHFCGLMLETQDINISACHFYAKHHFIIGAV 151


18  BALH_2173  BALH_2182  Y        Y        N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2173  3       16       -0.348286  multidrug resistance ABC transporter ATP-binding
BALH_2174  6       18       0.871073   N-acetylmuramoyl-L-alanine amidase
BALH_2175  7       18       3.206976   amino acid transporter LysE
BALH_2177  6       19       4.021281   DNA-binding protein
BALH_2178  7       19       1.782403   hypothetical protein
BALH_2180  6       19       2.192271   BclA protein
BALH_2181  4       17       0.674523   hypothetical protein
BALH_2182  3       15       0.052736   exosporium protein H
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2180  60KDINNERMP  30     0.010    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 29.9 bits (67), Expect = 0.010
Identities = 24/100 (24%), Positives = 38/100 (38%), Gaps = 8/100 (8%)

Query: 159 TGATGTPGVTGPTGVIGPITTTNLLFYTFADGEK-----LIYTDSDGLAQYGTTHILSPD 213
+G TG G P P+ Y A+G+ + YTD+ G + T +L
Sbjct: 112 SGLTGRDGPDNPANGPRPLYNVEKDAYVLAEGQNELQVPMTYTDAAGNT-FTKTFVLKRG 170

Query: 214 EVS-YINLFINGILQPQPLYQVSTGQLTLLDNQPPSQGSS 252
+ + +N + +PL S GQL PP +
Sbjct: 171 DYAVNVNYNVQN-AGEKPLEISSFGQLKQSITLPPHLDTG 209


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2182  cloacin  31     0.008    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 31.2 bits (70), Expect = 0.008
Identities = 21/66 (31%), Positives = 26/66 (39%), Gaps = 1/66 (1%)

Query: 151 GGPTGPTGPTGPGGGATGPTGPTGPGGGATGPTGPTGDTGAAGPTGPTGDTGLAGATGPT 210
GGPTG G G +G + P GG +G G G G G++G TG
Sbjct: 22 GGPTGL-GVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGN 80

Query: 211 GDTGAA 216
AA
Sbjct: 81 LSAVAA 86


19  BALH_2192  BALH_2249  Y        Y        Y  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2192  2       12       -0.407312  hypothetical protein
BALH_2193  0       18       -1.071852  PTS system diacetylchitobiose-specific
BALH_2194  1       16       -1.141351  PTS system diacetylchitobiose-specific
BALH_2195  1       17       -1.663189  anhydro-N-acetylmuramic acid kinase
BALH_2196  1       15       -1.892586  ATPase family protein
BALH_2197  0       15       -2.027851  putative glycerol-3-phosphate acyltransferase
BALH_2198  0       17       -2.726454  serine-pyruvate aminotransferase
BALH_2199  2       17       -3.656547  threonine dehydratase
BALH_2200  5       20       -5.917534  hypothetical protein
BALH_2201  -2      15       -3.066879  hypothetical protein
BALH_2202  -1      13       -3.338585  hypothetical protein
BALH_2203  -3      13       -2.572133  Zn-dependent hydrolase
BALH_2204  -3      12       -2.131216  hypothetical protein
BALH_2205  1       16       0.145259   hypothetical protein
BALH_2206  5       20       2.097805   DEAD/DEAH box helicase
BALH_2208  11      31       2.505107   hypothetical protein
BALH_2209  13      34       3.268750   hypothetical protein
BALH_2210  15      37       2.976026   TetR family transcriptional regulator
BALH_2211  19      46       4.027527   hypothetical protein
BALH_2212  20      50       5.023174   hypothetical protein
BALH_2213  12      38       2.292939   hypothetical protein
BALH_2214  4       23       -1.264606  hypothetical protein
BALH_2215  3       21       -2.140034  hypothetical protein
BALH_2216  2       18       -2.255519  hypothetical protein
BALH_2217  0       15       -2.679899  phage integrase
BALH_2219  -1      14       -3.906836  macrolide-efflux protein
BALH_2220  -2      13       -3.668606  ABC transporter permease
BALH_2221  -2      12       -1.764415  ABC transporter ATP-binding protein
BALH_2222  -1      15       -1.405132  methyltransferase
BALH_2223  0       18       -0.885963  PhnB protein
BALH_2224  0       19       -1.547556  indolepyruvate decarboxylase, C-terminal
BALH_2225  2       21       -0.373512  indolepyruvate decarboxylase, central region
BALH_2226  2       23       -0.431463  DNA integration/recombination/invertion protein
BALH_2227  2       24       0.289111   hypothetical protein
BALH_2228  2       22       -0.038282  hypothetical protein
BALH_2229  2       21       -0.314878  hypothetical protein
BALH_2230  1       20       0.309660   hypothetical protein
BALH_2231  0       17       -1.352213  hypothetical protein
BALH_2232  2       15       -2.726454  hypothetical protein
BALH_2233  2       14       -2.976246  phage terminase, large subunit
BALH_2234  1       15       -2.874339  N-acetylmuramoyl-L-alanine amidase
BALH_2235  1       17       -2.782009  hypothetical protein
BALH_2236  0       16       -2.897770  hypothetical protein
BALH_2237  0       17       -3.436836  hypothetical protein
BALH_2238  -1      14       -1.853606  indolepyruvate decarboxylase, N-terminal
BALH_2239  -2      14       -1.771344  MarR family transcriptional regulator
BALH_2240  1       13       -1.615343  phosphoglyceromutase
BALH_2241  -2      11       -2.029936  hypothetical protein
BALH_2242  -1      13       -2.069235  hypothetical protein
BALH_2243  0       10       -1.400436  aminoacyl-histidine dipeptidase
BALH_2244  1       12       -2.075775  hypothetical protein
BALH_2245  2       13       -2.396239  chloramphenicol acetyltransferase
BALH_2246  1       14       -3.174242  D-alanyl-D-alanine carboxypeptidase
BALH_2247  2       13       -3.376779  lipase/acylhydrolase
BALH_2248  1       12       -3.018386  penicillin-binding protein
BALH_2249  -1      12       -3.063929  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2196  PRPHPHLPASEC  29     0.025    Prokaryotic zinc-dependent phospholipase C signature.
		>PRPHPHLPASEC#Prokaryotic zinc-dependent phospholipase C signature.

Length = 398

Score = 28.8 bits (64), Expect = 0.025
Identities = 14/72 (19%), Positives = 26/72 (36%), Gaps = 11/72 (15%)

Query: 112 GILTIGGTGAICIGKKGKVYEYSGGW-GHILGDEGSGYWIALQGLKRMANQFDQGVTLCP 170
++ ++ G KVY W G I G G+ I QG+ + N +
Sbjct: 8 ALICATLATSLWAGASTKVY----AWDGKIDG-TGTHAMIVTQGVSILENDLSKNEP--- 59

Query: 171 LSLRIQDEFQLL 182
++ ++L
Sbjct: 60 --ESVRKNLEIL 69


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2198  AUTOINDCRSYN  29     0.034    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 29.0 bits (65), Expect = 0.034
Identities = 20/120 (16%), Positives = 42/120 (35%), Gaps = 10/120 (8%)

Query: 15 ESIHKLNYKTFVEEIPQHEETKDRVRIDRFHEENT-YLICLDDDKLVGMVALRRKRPFSL 73
+ L +TF + + + D + D++ NT YL + D+ ++ + R
Sbjct: 18 GELFTLRKETFKDRLNWAVQCTDGMEFDQYDNNNTTYLFGIKDNTVICSL---RFIETKY 74

Query: 74 DYKISNLDFYLQEHGE----NVYEIRLLSVEREYRNGRALLGLIRFLHRYLLLNGYELAL 129
I+ F + N E V++ + +LG + L L+ +
Sbjct: 75 PNMITGTFFPYFKEINIPEGNYLESSRFFVDKS--RAKDILGNEYPISSMLFLSMINYSK 132


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2203  PF06580  28     0.040    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 27.5 bits (61), Expect = 0.040
Identities = 9/30 (30%), Positives = 12/30 (40%), Gaps = 2/30 (6%)

Query: 42 PLIENAILKHGYELKNLKNII-ITHYDDDH 70
L+EN I KHG I + D+
Sbjct: 262 TLVENGI-KHGIAQLPQGGKILLKGTKDNG 290


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2206  TONBPROTEIN  30     0.013    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 30.3 bits (68), Expect = 0.013
Identities = 20/113 (17%), Positives = 39/113 (34%), Gaps = 6/113 (5%)

Query: 338 AGGSGLAITFVAAKDEKH------LEEIEKTLGAPIQREIIEQPKIKRVDENGKPLPKPA 391
A +++T V D + E + + V E KP PKP
Sbjct: 40 APAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPK 99

Query: 392 PKKSGEYRQRDSREGSRSGSKGRTRNDSRNSSRNENNRSFNKPSNKKGSTKQG 444
PK + +++ R+ S+ + ++ +R ++ + S S G
Sbjct: 100 PKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATAATSKPVTSVASG 152


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2208  BACTRLTOXIN  26     0.012    Bacterial toxin signature.
		>BACTRLTOXIN#Bacterial toxin signature.

Length = 266

Score = 26.4 bits (58), Expect = 0.012
Identities = 8/22 (36%), Positives = 13/22 (59%)

Query: 31 KINWYNDMKTSFANKELADLVK 52
K+ Y+ +KT N++LA K
Sbjct: 84 KLKNYDKVKTELLNEDLAKKYK 105


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2210  HTHTETR  72     4e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 72.0 bits (176), Expect = 4e-18
Identities = 32/161 (19%), Positives = 70/161 (43%), Gaps = 9/161 (5%)

Query: 8 EERRKEILETAERLFLTKGYTKTTVNDILKEIGIAKGTFYHYFKSKEEVMDEIIMRIIKE 67
+E R+ IL+ A RLF +G + T++ +I K G+ +G Y +FK K ++ E I + +
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE-IWELSES 68

Query: 68 DVAKAKVIVSNPNIPVLEKLFRVLME---QSPKSGDIKDKMIE-QFHQPNNA-EMHQKSL 122
++ + ++ + R ++ +S + + + ++E FH+ EM
Sbjct: 69 NIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQ 128

Query: 123 VQSIIHLSPV--LTEILEQGIEEGIFSTSY-PQETIELLLS 160
Q + L + + L+ IE + + ++
Sbjct: 129 AQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRG 169


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2219  TCRTETA  39     3e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 38.7 bits (90), Expect = 3e-05
Identities = 32/143 (22%), Positives = 59/143 (41%), Gaps = 9/143 (6%)

Query: 49 IFAGLYAITSIPFLLAPLGGAIADRFNRRNLMVIFDFINTAIVLSFIVLLFTGSVSILLI 108
I LYA+ F AP+ GA++DRF RR +++ A+ + ++ + +L I
Sbjct: 47 ILLALYALMQ--FACAPVLGALSDRFGRR-PVLLVSLAGAAV--DYAIMATAPFLWVLYI 101

Query: 109 GTIMFLLAIVNAMYAPVVMASIPQLVPEKKLEQANGIVNGVQALSNIVAPVLGGILYGII 168
G I +A + V A I + + + G ++ + PVLGG++ G
Sbjct: 102 GRI---VAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLM-GGF 157

Query: 169 GLKMLVIISCLAFFLSAILEMFI 191
+ L+ + F+
Sbjct: 158 SPHAPFFAAAALNGLNFLTGCFL 180



Score = 29.0 bits (65), Expect = 0.033
Identities = 16/79 (20%), Positives = 34/79 (43%), Gaps = 3/79 (3%)

Query: 88 TAIVLSFIVLLFTGSVSILLIGTIMFLLAIVNAMYAPVVMASIPQLVPEKKLEQANGIVN 147
A +I+L F + ++ + P + A + + V E++ Q G +
Sbjct: 285 IADGTGYILLAFATRGWMAFPIMVLLASG---GIGMPALQAMLSRQVDEERQGQLQGSLA 341

Query: 148 GVQALSNIVAPVLGGILYG 166
+ +L++IV P+L +Y
Sbjct: 342 ALTSLTSIVGPLLFTAIYA 360


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
BALH_2244  SECA  27     0.035    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 27.1 bits (60), Expect = 0.035
Identities = 9/24 (37%), Positives = 14/24 (58%)

Query: 33 ESQPTQKESRFLDTWRWQNYFLLH 56
E Q E++ L + +QNYF L+
Sbjct: 361 EGVQIQNENQTLASITFQNYFRLY 384


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2246  BLACTAMASEA  56     5e-11    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 56.3 bits (136), Expect = 5e-11
Identities = 32/158 (20%), Positives = 57/158 (36%), Gaps = 16/158 (10%)

Query: 35 FIAIITVLTLFCSITVISGSASAETVSAIEVEAGS---AVLVEANSGKILYEKNADELLA 91
+ II++L V + E + E + + ++ SG+ L ADE
Sbjct: 5 RLCIISLLATLPL-AVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADERFP 63

Query: 92 IASMTKMMSEYLVHEAVENGKLKWNQKVKISEYAYKISQDRSLSNVPLEN---GGSYTVK 148
+ S K++ V V+ G + +K+ Q + P+ TV
Sbjct: 64 MMSTFKVVLCGAVLARVDAGDEQLERKI-------HYRQQDLVDYSPVSEKHLADGMTVG 116

Query: 149 ELYEAMVIYSANGATIALAEEIAG-KEVN-FVKMMNDK 184
EL A + S N A L + G + F++ + D
Sbjct: 117 ELCAAAITMSDNSAANLLLATVGGPAGLTAFLRQIGDN 154


20  BALH_2263  BALH_2282  Y        Y        N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2263  3       14       -2.155957  myo-inositol catabolism protein
BALH_2264  3       14       -3.003846  hypothetical protein
BALH_2265  2       14       -3.229426  hypothetical protein
BALH_2266  1       14       -2.639308  hypothetical protein
BALH_2267  1       14       -2.895230  2',3'-cyclic-nucleotide 2'-phosphodiesterase
BALH_2268  2       18       -3.152519  response regulator aspartate phosphatase
BALH_2269  2       15       -2.096609  transcriptional regulator, DNA-binding protein
BALH_2270  1       14       -2.375114  hypothetical protein
BALH_2271  1       14       -2.293778  M15B family D-Ala-D-Ala carboxypeptidase VanY
BALH_2272  1       15       -3.493300  hypothetical protein
BALH_2274  1       16       -3.158630  N-acetylmuramoyl-L-alanine amidase
BALH_2275  2       17       -3.704107  hemolysin II
BALH_2276  1       18       -3.409285  TetR family transcriptional regulator
BALH_2277  1       17       -3.281522  ABC transporter ATP-binding protein
BALH_2278  2       18       -3.122373  sensory box/GGDEF family protein
BALH_2279  3       20       -1.774897  diguanylate cyclase/phosphodiesterase
BALH_2280  4       18       -2.129962  ribosomal-protein-serine acetyltransferase
BALH_2281  3       18       -2.125689  methyl-accepting chemotaxis protein
BALH_2282  3       18       -2.436210  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2265  INFPOTNTIATR  27     0.027    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 27.3 bits (60), Expect = 0.027
Identities = 13/44 (29%), Positives = 23/44 (52%), Gaps = 5/44 (11%)

Query: 101 QQMVDLLNKSRKELMRFLSTIEDESILAKKSVMHPALGELLLEQ 144
+QM D+L+K +K+LM + + KK+ + A G+ L
Sbjct: 76 EQMKDVLSKFQKDLM-----AKRSAEFNKKAEENKAKGDAFLSA 114


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2275  BICOMPNTOXIN  116    7e-32    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 116 bits (293), Expect = 7e-32
Identities = 68/324 (20%), Positives = 137/324 (42%), Gaps = 31/324 (9%)

Query: 17 VASVIMSGSLGLQATSAFADSKG--TVENLQNGGKV--YNSFKTTYDMKQNIKNSIKVSF 72
+ + + L A ++K E++ G + + K + +I+ F
Sbjct: 7 LTTTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQNIQFDF 66

Query: 73 IEDPYADKKIAIVTTDGSNIDAKYTI---NGGYYNAGLKWPSAYHTEAEITSGDSAQFHK 129
++D +K I+ G I ++ T + ++WP Y+ + T+
Sbjct: 67 VKDKKYNKDALILKMQG-FISSRTTYYNYKKTNHVKAMRWPFQYNIGLK-TNDKYVSLIN 124

Query: 130 AAPVNTMTSAKVTSEVGYTLGGSVKVGVNDKGPNADASITGSFAWKESVSYDQVDYKTVL 189
P N + S V+ +GY +GG+ + + G GSF + +S+SY Q +Y + +
Sbjct: 125 YLPKNKIESTNVSQTLGYNIGGNFQSAPSLGG-------NGSFNYSKSISYTQQNYVSEV 177

Query: 190 ETHTDKKLNWKVGFQSFNYPEWGIYNRDSFNTFYGNQLFMKSRSYN-EGTNNFVSKDTVP 248
E K + W V SF + + + LF+ + ++ + + FV +P
Sbjct: 178 EQQNSKSVLWGVKANSFA-------TESGQKSAFDSDLFVGYKPHSKDPRDYFVPDSELP 230

Query: 249 ALTGYGFSPNVVAVITADKTES-TSDLKITNRRISD-----QYNIEWVSSKWWGTNNKDT 302
L GF+P+ +A ++ +K S TS+ +IT R D + + + +S G +
Sbjct: 231 PLVQSGFNPSFIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHNA 290

Query: 303 Y-NEFFTNNYKLDWKNHQVTLDNQ 325
+ N +T Y+++WK H++ + Q
Sbjct: 291 FVNRNYTVKYEVNWKTHEIKVKGQ 314


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2276  HTHTETR  62     3e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 62.3 bits (151), Expect = 3e-14
Identities = 30/147 (20%), Positives = 63/147 (42%), Gaps = 8/147 (5%)

Query: 15 REQTMENILKAAKKKFGERGYEGTSIQEIAKEAKVNVAMASYYFNGKENLYYEVFKK-YG 73
++T ++IL A + F ++G TS+ EIAK A V ++F K +L+ E+++
Sbjct: 9 AQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSES 68

Query: 74 LANELPNFLEKNQF-NPINALREYLAVFTTHIKENPE-----IGTLAYEEIIKESARLEK 127
EL + +P++ LRE L E + E A +++
Sbjct: 69 NIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQ 128

Query: 128 -IKPYFIGSFEQLTEILQEGEKQGVFH 153
+ + S++++ + L+ + +
Sbjct: 129 AQRNLCLESYDRIEQTLKHCIEAKMLP 155


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2277  PF05272  34     0.001    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 33.9 bits (77), Expect = 0.001
Identities = 11/34 (32%), Positives = 19/34 (55%)

Query: 338 IVLDGKNGSGKSSILKLILGQSIQYTGLVTLGTG 371
+VL+G G GKS+++ ++G +GTG
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTG 632


21  BALH_2295  BALH_2466  Y        Y        N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2295  2       21       -4.344062  hypothetical protein
BALH_2296  0       18       -4.465962  aminoglycoside N(6')-acetyltransferase
BALH_2297  0       17       -4.544251  hypothetical protein
BALH_2298  -2      14       -4.184919  alpha/beta fold family hydrolase
BALH_2299  -1      15       -4.798148  hypothetical protein
BALH_2300  -1      14       -4.387656  D-alanyl-D-alanine carboxypeptidase
BALH_2301  -1      14       -4.300130  sensor histidine kinase
BALH_2302  0       17       -4.811396  DNA-binding response regulator
BALH_2303  -1      19       -5.345298  HAD superfamily hydrolase
BALH_2304  1       15       -3.694716  hypothetical protein
BALH_2305  1       14       -3.034620  S-adenosylhomocysteine nucleosidase
BALH_2306  2       15       -3.803895  hypothetical protein
BALH_2307  2       16       -3.953342  acetyltransferase
BALH_2308  1       14       -3.824000  hypothetical protein
BALH_2309  2       14       -3.617984  bacillolysin
BALH_2310  1       15       -4.993120  hypothetical protein
BALH_2311  1       16       -5.504231  hypothetical protein
BALH_2312  1       15       -5.120964  hypothetical protein
BALH_2313  0       14       -4.385877  excinuclease ABC, A subunit-related protein
BALH_2314  1       18       -4.598218  PadR family transcriptional regulator
BALH_2315  2       18       -4.078052  hypothetical protein
BALH_2316  2       16       -4.133937  penicillin-binding protein
BALH_2317  2       14       -4.258182  MerR family transcriptional regulator
BALH_2319  1       14       -4.925637  macrolide-efflux protein
BALH_2320  3       15       -5.691742  mutT/Nudix family protein
BALH_2321  1       14       -5.693858  bleomycin resistance protein
BALH_2322  2       14       -6.077948  hypothetical protein
BALH_2323  2       15       -6.070998  AraC family transcriptional regulator
BALH_2324  1       16       -5.963136  oligoendopeptidase F
BALH_2325  3       19       -5.708444  hypothetical protein
BALH_2326  0       17       -5.845195  GntR family transcriptional regulator
BALH_2327  3       18       -4.373357  hypothetical protein
BALH_2328  3       17       -4.518061  hypothetical protein
BALH_2329  1       17       -3.110374  hypothetical protein
BALH_2330  1       15       -2.278896  methyltransferase
BALH_2331  1       13       -1.806385  methyltransferase
BALH_2332  1       14       -1.567033  regucalcin-like protein
BALH_2333  0       13       -1.809548  lysyl-tRNA synthetase
BALH_2334  -1      10       -1.093571  cell wall hydrolase
BALH_2335  -1      10       -2.107484  acetamidase
BALH_2336  0       12       -3.390776  response regulator
BALH_2337  -2      13       -3.508300  two-component sensor histidine kinase
BALH_2338  -1      12       -4.136274  hypothetical protein
BALH_2339  -1      13       -4.637179  ABC transporter ATP-binding protein
BALH_2340  -1      14       -5.494203  acetyltransferase
BALH_2341  0       12       -4.323898  hypothetical protein
BALH_2342  -1      12       -3.731642  ABC transporter ATP-binding protein
BALH_2344  -1      11       -3.669989  ABC transporter permease
BALH_2345  1       12       -2.795574  hypothetical protein
BALH_2346  0       12       -2.075880  hypothetical protein
BALH_2347  0       13       -1.895331  lipase (triacylglycerol lipase)
BALH_2348  -1      14       -2.582952  homoserine dehydrogenase
BALH_2349  -1      15       -3.948515  GntR family transcriptional regulator
BALH_2350  -1      15       -4.151418  D-alanine--D-alanine ligase
BALH_2351  -1      17       -4.856313  transcriptional activator
BALH_2352  -1      18       -5.725059  hypothetical protein
BALH_2353  -1      18       -6.194796  exosporium protein D
BALH_2354  -1      18       -5.981501  hypothetical protein
BALH_2356  -2      17       -4.236815  hypothetical protein
BALH_2357  -1      16       -3.089101  hypothetical protein
BALH_2358  -3      14       -2.159946  hypothetical protein
BALH_2359  -2      13       -2.007177  hypothetical protein
BALH_2360  -2      12       -1.561631  hypothetical protein
BALH_2361  -2      12       -1.229167  hypothetical protein
BALH_2362  -1      14       -1.821149  cytochrome P450
BALH_2365  -1      17       -2.278896  macrolide efflux protein
BALH_2366  0       18       -3.566489  virginiamycin A acetyltransferase
BALH_2367  0       13       -3.250162  PadR family transcriptional regulator
BALH_2368  0       14       -3.346128  cytochrome P450
BALH_2369  0       13       -3.599998  hypothetical protein
BALH_2370  0       12       -3.297803  HAD superfamily hydrolase
BALH_2371  0       12       -3.434880  PAS/PAC sensor signal transduction histidine
BALH_2372  0       13       -3.428504  penicillin-binding protein transpeptidase
BALH_2373  1       14       -4.562341  glycosyl transferase family protein
BALH_2374  1       14       -4.562682  aspartate racemase
BALH_2375  3       15       -4.578306  hypothetical protein
BALH_2376  2       17       -4.291176  cobalt ABC transporter ATP-binding protein
BALH_2378  3       15       -4.436348  ABC transporter permease
BALH_2379  1       14       -3.096824  sporulation kinase B
BALH_2380  1       14       -2.347666  serine/threonine protein kinase
BALH_2381  2       15       -2.408993  hypothetical protein
BALH_2382  2       15       -2.173203  zinc-containing alcohol dehydrogenase
BALH_2383  2       15       -2.948183  undecaprenyl-diphosphatase
BALH_2384  2       16       -2.948790  penicillin-binding protein transpeptidase
BALH_2385  2       17       -3.656254  TetR family transcriptional regulator
BALH_2386  2       16       -3.486188  short chain dehydrogenase family protein
BALH_2387  2       16       -3.012372  major facilitator superfamily permease
BALH_2388  1       15       -3.040919  major facilitator superfamily permease
BALH_2389  1       18       -2.699808  penicillin-binding protein
BALH_2390  1       19       -3.052452  acetyltransferase
BALH_2391  2       18       -3.021914  hypothetical protein
BALH_2392  3       20       -3.490229  degV family protein
BALH_2393  7       20       -5.440190  ThiJ/PfpI family protein
BALH_2394  7       18       -5.994428  resolvase
BALH_2395  6       16       -5.929433  DeoR family transcriptional regulator
BALH_2396  5       15       -5.779449  thiJ/pfpI family protein
BALH_2397  3       14       -5.945733  spermine/spermidine acetyltransferase
BALH_2398  2       14       -5.801069  MerR family transcriptional regulator
BALH_2399  1       13       -4.752728  permease
BALH_2400  0       17       -4.313755  hypothetical protein
BALH_2401  2       17       -3.421365  acetyltransferase
BALH_2402  3       17       -3.251388  glycerophosphoryl diester phosphodiesterase
BALH_2403  3       17       -2.822446  hypothetical protein
BALH_2404  4       17       -2.726454  hypothetical protein
BALH_2405  3       18       -2.854070  hypothetical protein
BALH_2406  1       16       -2.794204  GntR family transcriptional regulator
BALH_2407  1       13       -2.582982  DMT family permease
BALH_2409  1       12       -2.484127  spermine/spermidine acetyltransferase
BALH_2410  -1      14       -2.931652  cutA1 divalent ion tolerance protein
BALH_2411  -1      15       -3.295506  mutT/nudix family protein
BALH_2412  -1      15       -3.244887  penicillin-binding protein
BALH_2415  0       14       -2.862826  oxalate/formate antiporter
BALH_2416  0       17       -4.190418  MutT/NUDIX family protein
BALH_2417  -1      17       -4.534733  DNA polymerase III subunit beta
BALH_2418  -1      16       -4.646039  MutT/Nudix family protein
BALH_2419  -1      16       -4.196687  alpha/beta fold family hydrolase
BALH_2420  0       15       -4.376887  hypothetical protein
BALH_2421  0       12       -4.336161  hypothetical protein
BALH_2422  0       15       -3.549499  broad-specificity phosphatase PhoE
BALH_2423  0       14       -3.401803  endoribonuclease L-PSP
BALH_2424  1       14       -3.281205  hypothetical protein
BALH_2425  1       16       -3.548146  hypothetical protein
BALH_2426  0       16       -3.071739  esterase
BALH_2427  1       19       -4.166990  hypothetical protein
BALH_2428  0       17       -3.563592  patatin-like phospholipase
BALH_2429  -2      15       -3.036531  hypothetical protein
BALH_2430  -2      15       -3.115559  acetyltransferase
BALH_2431  -1      13       -3.116888  metal-dependent hydrolase
BALH_2432  -1      13       -3.085134  ribosomal-protein-alanine acetyltransferase
BALH_2433  0       14       -2.195480  hypothetical protein
BALH_2434  2       15       -2.403071  undecaprenyl pyrophosphate phosphatase
BALH_2435  3       16       -2.650292  excinuclease ABC, C subunit, N-terminal region
BALH_2436  5       18       -2.476203  acetyltransferase
BALH_2437  1       13       -0.707582  hypothetical protein
BALH_2438  -1      13       -1.184651  hypothetical protein
BALH_2439  -3      11       -1.638122  hypothetical protein
BALH_2440  -3      12       -2.086340  serine/threonine protein kinase
BALH_2441  -3      13       -2.392092  dATP pyrophosphohydrolase
BALH_2442  -3      13       -2.919411  D-amino acid dehydrogenase small subunit
BALH_2443  -2      16       -4.576748  hypothetical protein
BALH_2444  -1      18       -3.866318  N-hydroxyarylamine O-acetyltransferase
BALH_2445  1       17       -3.769139  hypothetical protein
BALH_2446  0       12       -2.354779  hypothetical protein
BALH_2447  -1      10       -1.740169  HAD superfamily hydrolase
BALH_2448  0       10       -1.574482  hypothetical protein
BALH_2449  0       11       -1.705139  hypothetical protein
BALH_2450  1       10       -2.016617  undecaprenyl-diphosphatase
BALH_2451  1       10       -2.143703  pullulanase
BALH_2452  -1      11       -1.994442  extracellular neutral metalloprotease,
BALH_2453  -1      13       -3.330683  hypothetical protein
BALH_2454  -2      13       -3.317687  RNA polymerase sigma factor SigX
BALH_2455  -1      13       -2.984718  hypothetical protein
BALH_2456  -1      15       -2.419159  fosfomycin resistance protein
BALH_2457  -1      14       -2.189361  preprotein translocase subunit SecY
BALH_2458  0       16       -2.571894  hypothetical protein
BALH_2460  2       16       -1.847204  DMT family permease
BALH_2461  1       16       -2.204642  GntR family transcriptional regulator
BALH_2462  2       17       -2.974201  carboxylesterase
BALH_2463  1       17       -4.017877  hypothetical protein
BALH_2464  0       18       -4.245991  hypothetical protein
BALH_2465  1       19       -4.455882  hypothetical protein
BALH_2466  0       15       -3.478500  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2296  SACTRNSFRASE  28     0.016    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 28.4 bits (63), Expect = 0.016
Identities = 25/105 (23%), Positives = 38/105 (36%), Gaps = 6/105 (5%)

Query: 71 EQQLEKYIESENTLAFKVIDEETKEVIGHISLGQIDHINKSARIGKVLVGDTRMRGRSIG 130
+ Y+E E AF E IG I + + N A I + V R + +G
Sbjct: 53 DDMDVSYVEEEGKAAFLYYLEN--NCIGRIKIRS--NWNGYALIEDIAV-AKDYRKKGVG 107

Query: 131 KHMMKAVLHIAFDELKLHRVTLGVYDFNTSAISCYEKIGFVKEGL 175
++ + A E + L D N SA Y K F+ +
Sbjct: 108 TALLHKAIEWA-KENHFCGLMLETQDINISACHFYAKHHFIIGAV 151


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2300  BLACTAMASEA  37     1e-04    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 36.7 bits (85), Expect = 1e-04
Identities = 23/112 (20%), Positives = 46/112 (41%), Gaps = 7/112 (6%)

Query: 39 ILIDANSGEVV--YKKNEETSIQSATLSKLMTEYIVLEQLDKGNIQLDEVVKISNEVFRA 96
I +D SG + ++ +E + S K++ VL ++D G+ QL+ + +
Sbjct: 43 IEMDLASGRTLTAWRADERFPMMST--FKVVLCGAVLARVDAGDEQLERKIHYRQQDLV- 99

Query: 97 ETSPIQVTSKDKT-TVRDLLHALLLTGNNRSTLALAEHIAGNEDNFTQLMNE 147
+ SP+ TV +L A + +N + L + G T + +
Sbjct: 100 DYSPVSEKHLADGMTVGELCAAAITMSDNSAANLLLATVGGPA-GLTAFLRQ 150


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_2302  HTHFIS  92     2e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 92.2 bits (229), Expect = 2e-23
Identities = 40/150 (26%), Positives = 74/150 (49%), Gaps = 3/150 (2%)

Query: 22 ILIIDDDKEIVELLAVYLRNEGYNIYKAYDGDDALQMISTYEVDLMILDIMMPKRNGLEV 81
IL+ DDD I +L L GY++ + + I+ + DL++ D++MP N ++
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 82 CQEVRE-NNTVPILMLSAKAEDMDKILGLMTGADDYMIKPFNPLELVARV-KALLRRSSF 139
+++ +P+L++SA+ M I GA DY+ KPF+ EL+ + +AL
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRR 125

Query: 140 QNASSPKNEDGM-IRIRSAEIHKHNHTVKV 168
+ ++DGM + RSA + + +
Sbjct: 126 PSKLEDDSQDGMPLVGRSAAMQEIYRVLAR 155


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2307  SACTRNSFRASE  42     6e-07    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 41.9 bits (98), Expect = 6e-07
Identities = 24/91 (26%), Positives = 43/91 (47%), Gaps = 8/91 (8%)

Query: 186 DMDYIEKTNHTFYGAYVDNDLKGSICI----NEQGKISFIFIDKEYRNRGIGSKLLQVAR 241
D+ Y+E+ + Y++N+ G I I N I I + K+YR +G+G+ LL A
Sbjct: 56 DVSYVEEEGKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAI 115

Query: 242 D---ELNLESLLISFPNNSLLE-GFVKKTGF 268
+ E + L++ + ++ F K F
Sbjct: 116 EWAKENHFCGLMLETQDINISACHFYAKHHF 146



Score = 39.2 bits (91), Expect = 5e-06
Identities = 18/52 (34%), Positives = 22/52 (42%)

Query: 83 LAVHPNYRGVGVSQKLFELHKEEALQNECKQLFLEVIVGNDRAIRFYNKLGY 134
+AV +YR GV L E A +N L LE N A FY K +
Sbjct: 95 IAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2309  THERMOLYSIN  315    2e-99    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 315 bits (807), Expect = 2e-99
Identities = 191/591 (32%), Positives = 269/591 (45%), Gaps = 71/591 (12%)

Query: 1 MKNKKEIAIVALTTGLALTSIVPYGIGYAEETDQMQVDIQEDSFRTGELTKPSKKTPESV 60
M + + + L GL P+G ++ + SF +G L + +
Sbjct: 1 MNKRAMLGAIGLAFGLMAW---PFGASAKGKSMVWNEQWKTPSFVSGSLLGR---CSQEL 54

Query: 61 VKDAL-KEKTEHVLSPKQVSGDKGVDYKVLQKRGSYDGTTLVRLQQIYEGKEVYGHQLTA 119
V L +EK Q+ G ++ + G T++R +Q G L A
Sbjct: 55 VYRYLDQEKNTF-----QLGGQARERLSLIGNKLDELGHTVMRFEQAIAASLCMGAVLVA 109

Query: 120 HVDKKGIIKSVSGESAQNLEKEDLKNPINLSKEEAKQYIYKKYGNDIKFI---SEPEVKE 176
HV+ + S+SG NL+K LK +S ++A+ + + + +E
Sbjct: 110 HVNDGEL-SSLSGTLIPNLDKRTLKTEAAISIQQAEMIAKQDVADRVTKERPAAEEGKPT 168

Query: 177 VIFVDENNGQASNAYQVTFAAATPNYVSGTYLVDAHNGVMLK--NTLQESDLKVSEEQVE 234
+ + + AY+V TP + Y++DA +G +L N + E+
Sbjct: 169 RLVIYPDEETPRLAYEVNVRFLTPVPGNWIYMIDAADGKVLNKWNQMDEAKPG------- 221

Query: 235 SLKENKKSNSISLTGTGKDDLGITRIFGISEQSN-GKYALADYTRGQGIETYDVNYRDIN 293
+ S G G+ LG + + S G Y L D TRG GI TYD R
Sbjct: 222 ---GAQPVAGTSTVGVGRGVLGDQKYINTTYSSYYGYYYLQDNTRGSGIFTYDGRNR--- 275

Query: 294 FEERYYPGILATSTSTTFD---DPKAVSAHFLATKVYDFYKDKYKRNSFDNKGKKIVSVV 350
PG L F D AV AH+ A VYD+YK+ + R S+D I S V
Sbjct: 276 ---TVLPGSLWADGDNQFFASYDAAAVDAHYYAGVVYDYYKNVHGRLSYDGSNAAIRSTV 332

Query: 351 HAWHSGETDDPKNWGNAFSANINNVSMLIYGD-------PMVRAFDIAGHEFTHAVTSSE 403
H + + NAF N S ++YGD P D+ GHE THAVT
Sbjct: 333 HY--------GRGYNNAFW----NGSQMVYGDGDGQTFLPFSGGIDVVGHELTHAVTDYT 380

Query: 404 SNLEFFGESGAINEALSDIMGTAIEKYINNGKFNWTIGEQ------SGSVLRDMKNPSSV 457
+ L + ESGAINEA+SDI GT +E Y N +W IGE +G LR M +P+
Sbjct: 381 AGLVYQNESGAINEAMSDIFGTLVEFY-ANRNPDWEIGEDIYTPGVAGDALRSMSDPA-- 437

Query: 458 KFFDGVPYPDDYSKYSDLNGEDNEGVHFNSSIINKVAYLIAQGGTHNGVTVNGIGEDKMF 517
K+ D PD YSK +DN GVH NS IINK AYL++QGG H GV+V GIG DKM
Sbjct: 438 KYGD----PDHYSKRYT-GTQDNGGVHTNSGIINKAAYLLSQGGVHYGVSVTGIGRDKMG 492

Query: 518 DIFYYANTDELNMTSNFSELRLACLKVATNKYGANSIEVETVQKAFDAAKI 568
IFY A L TSNFS+LR AC++ A + YG+ S EV +V++AF+A +
Sbjct: 493 KIFYRALVYYLTPTSNFSQLRAACVQAAADLYGSTSQEVNSVKQAFNAVGV 543


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2319  TCRTETA  41     5e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 41.3 bits (97), Expect = 5e-06
Identities = 66/348 (18%), Positives = 118/348 (33%), Gaps = 27/348 (7%)

Query: 40 LPWIAYQLTGSAVVMSS---LFAINVLPIVLFGPLVGVIIDRYDRKKLLLVADITNIILV 96
LP + L S V + L A+ L P++G + DR+ R+ +LLV+ +
Sbjct: 28 LPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDY 87

Query: 97 SFVPILHSLHLLEIWYLYIITFMLAVMSMLFDVTTVTVIPQIAGASLTKANSFYQMVNQL 156
+ + L W LYI + + V + G + F
Sbjct: 88 AIMATAPFL-----WVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGF 142

Query: 157 ASLFGPMIAGVFISFIGGFQLLWINVLSFIATLVAVMLLPSMKTTNKKCEDKNTLQNVLY 216
+ GP++ G+ F L+ + L LLP + L+
Sbjct: 143 GMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGE-----RRPLRREAL 197

Query: 217 DLVNGFTWLKNDRLNIALSFQAMIGNFGASAVLGVFMYYLLSTLQLTPEQSGVNYSLIGI 276
+ + F W + + AL I +++ + G++ + GI
Sbjct: 198 NPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGI 257

Query: 277 -GGLLGSLIAIPLEKRLQRSILIPLLLFVGAIGLTFALWNT-YWFA-PGI----AFGVAM 329
L ++I P+ RL + L + G + T W A P + + G+ M
Sbjct: 258 LHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGM 317

Query: 330 TCNIAWNTIVATVRQETVPSNMQGRVLGFSRVLTRLAMPLGALVGGII 377
A +++ V QG++ G LT L +G L+ I
Sbjct: 318 P---ALQAMLSR----QVDEERQGQLQGSLAALTSLTSIVGPLLFTAI 358


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2326  BACINVASINB  30     0.026    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 29.7 bits (66), Expect = 0.026
Identities = 25/140 (17%), Positives = 53/140 (37%), Gaps = 4/140 (2%)

Query: 313 LDIDSPMISQAALEIYIKSDMFERHKNKIKSSYNNRSKKLAEALERIHSENPLLFTYKK- 371
LD P +QA + K + + + K +A + + +L ++
Sbjct: 174 LDPADPGYAQAEAAVEQAGKEATEAKEALDKATDATVKAGTDAKAKAEKADNILTKFQGT 233

Query: 372 QNTIGIHTCLEVHKTSISDMLIQRLSE-MQISIDTIDKNYVRGFPKQRLLKLNVSNVKED 430
N + + + ++S+ + RL+ M + I+ + KN L + ++
Sbjct: 234 ANAASQNQVSQGEQDNLSN--VARLTMLMAMFIEIVGKNTEESLQNDLALFNALQEGRQA 291

Query: 431 RIEKGIRIVMEEIKQAERLN 450
+EK EE ++AE N
Sbjct: 292 EMEKKSAEFQEETRKAEETN 311


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_2336  HTHFIS  68     3e-15    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 67.5 bits (165), Expect = 3e-15
Identities = 29/129 (22%), Positives = 58/129 (44%), Gaps = 1/129 (0%)

Query: 3 KIMIVEDDMKIAELLSTHVAKYGYEGIIVSDFQNVLNIFLEEQPELVLLDINLPSFDGYY 62
I++ +DD I +L+ +++ GY+ I S+ + +LV+ D+ +P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 63 WCRQIRGV-STCPILFISAREGTMDQVMALENGGDDFISKPFHYEVVMAKIRSHLRRAYG 121
+I+ P+L +SA+ M + A E G D++ KPF ++ I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 DYAPKVEDR 130
+ +D
Sbjct: 125 RPSKLEDDS 133


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2337  PF06580  48     2e-08    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 47.9 bits (114), Expect = 2e-08
Identities = 30/133 (22%), Positives = 50/133 (37%), Gaps = 25/133 (18%)

Query: 203 FFIRNFVYPE-LKVEKDITVE-SDAKWLQFLIGQILSNAIKYSSGSRE---KIKVKACKE 257
+ + + + L+ E I D + L+ ++ N IK+ KI +K K+
Sbjct: 229 LQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKD 288

Query: 258 GNTVILEIADNGVGIPKQDLPRVFKPFFTGENGRDFKESTGMGLYLVYEITKQL---GHS 314
TV LE+ + G K KESTG GL V E + L
Sbjct: 289 NGTVTLEVENTGSLALKNT-----------------KESTGTGLQNVRERLQMLYGTEAQ 331

Query: 315 VEIHSEVGKGTVV 327
+++ + GK +
Sbjct: 332 IKLSEKQGKVNAM 344


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2341  PF06057  29     0.012    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 28.7 bits (64), Expect = 0.012
Identities = 10/50 (20%), Positives = 17/50 (34%), Gaps = 12/50 (24%)

Query: 6 WFRHLP-QISMDLSEWTPFIQNNWHRKHYMKFVYVLQIIIFLIPYYFGAD 54
W + P ++ D Q + + + LI Y FGA+
Sbjct: 91 WKQKDPKDVTQDTLAIIDKYQAEFGTQK-----------VILIGYSFGAE 129


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2365  TCRTETA  33     0.002    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 33.3 bits (76), Expect = 0.002
Identities = 47/302 (15%), Positives = 105/302 (34%), Gaps = 25/302 (8%)

Query: 86 FIFSFIGGTFADRWKPKKTMIWCETLSAISVFAVLITLMFGTWKIVFFVTLISAILSQFS 145
F + + G +DR+ + ++ +A+ + T +V I I++ +
Sbjct: 57 FACAPVLGALSDRFGRRPVLLVSLAGAAVDYA------IMATAP-FLWVLYIGRIVAGIT 109

Query: 146 QPSG---MKLFKQHLSAEQIQLAMSIYQTIFAIFMVLGPILGTF---IFHSFGIYISIII 199
+G ++ F MV GP+LG + + +
Sbjct: 110 GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAAL 169

Query: 200 TGIAFLLAAAVLLFLPKDLENDNEKKEITLLQEMLDGIKYVKKKKALTLLGLCFMAAGLG 259
G+ FL LL E ++E L ++ + + L F L
Sbjct: 170 NGLNFLT-GCFLLPESHKGERRPLRREAL---NPLASFRWARGMTVVAALMAVFFIMQLV 225

Query: 260 IGLIQPLGIFIVTEQLGLSKESLQWLLTVNGAGMIVGGALAM-VFAKNVAPQKMLIIGML 318
+ L + ++ ++ L G + A+ A + ++ L++GM+
Sbjct: 226 GQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMI 285

Query: 319 GQAIGIGIIGYSTNLWITLTAQLF---SGLALPCIQIGINTLIIQNSDTDFIGRVNGILS 375
G ++ ++T W+ + G+ +P +Q ++ + D + G++ G L+
Sbjct: 286 ADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQ----AMLSRQVDEERQGQLQGSLA 341

Query: 376 PL 377
L
Sbjct: 342 AL 343



Score = 29.8 bits (67), Expect = 0.023
Identities = 35/188 (18%), Positives = 69/188 (36%), Gaps = 6/188 (3%)

Query: 240 VKKKKALTLLGLCFMAAGLGIGLIQPLGIFIVTEQL--GLSKESLQWLLTVNGAGMIVGG 297
+K + L ++ +GIGLI P+ ++ + + LL +
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 298 ALAMVFAKNVAPQKMLIIGMLGQAIGIGIIGYSTNLWITLTAQLFSGLALPCIQIGINTL 357
+ + + +L++ + G A+ I+ + LW+ ++ +G+
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATG-AVAGAY 119

Query: 358 IIQNSDTDFIGRVNGILSPLFTGSMVVTMSIAGSLKEMFSLSMMYEGTAFLFIIGLLFIL 417
I +D D R G +S F MV + G + FS + A L GL F+
Sbjct: 120 IADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALN--GLNFLT 176

Query: 418 PIYNLKPT 425
+ L +
Sbjct: 177 GCFLLPES 184



Score = 29.8 bits (67), Expect = 0.025
Identities = 16/113 (14%), Positives = 38/113 (33%), Gaps = 4/113 (3%)

Query: 76 MISVAEFAPIFIFSFIGGTFADRWKPKKTMIWCETLSAISVFAVLITLMFGTWKIVFFVT 135
++ + I G A R ++ ++ I L F T + F
Sbjct: 251 SLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTG----YILLAFATRGWMAFPI 306

Query: 136 LISAILSQFSQPSGMKLFKQHLSAEQIQLAMSIYQTIFAIFMVLGPILGTFIF 188
++ P+ + + + E+ + ++ ++GP+L T I+
Sbjct: 307 MVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIY 359


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2371  PF06580  39     4e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.1 bits (91), Expect = 4e-05
Identities = 17/102 (16%), Positives = 38/102 (37%), Gaps = 18/102 (17%)

Query: 474 QVFI-NILQNSIEAMPDGGRISIHIKEIGKDGIIISVIDKGIGIPEERIKRLGEPFYSTK 532
Q + N +++ I +P GG+I + + + + V + G +
Sbjct: 261 QTLVENGIKHGIAQLPQGGKILLKGTKDNGT-VTLEVENTGSLALKN------------T 307

Query: 533 EKGTGIGLMLSYKIIESHQGN---ISIMSEVGVGTTVTIYLP 571
++ TG GL + ++ G I + + G + +P
Sbjct: 308 KESTGTGLQNVRERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_2376  HTHFIS  31     0.019    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.6 bits (69), Expect = 0.019
Identities = 20/83 (24%), Positives = 30/83 (36%), Gaps = 31/83 (37%)

Query: 42 VLIAGRSGSGKSTLAHCI---------------NGLIP--------FSYE-GSSTGTISI 77
++I G SG+GK +A + IP F +E G+ TG +
Sbjct: 163 LMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQT- 221

Query: 78 SGKDPRKKSVFELSKHVGTILQD 100
R FE + GT+ D
Sbjct: 222 -----RSTGRFEQA-EGGTLFLD 238


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2379  PF06580  37     1e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 1e-04
Identities = 13/47 (27%), Positives = 23/47 (48%)

Query: 320 NLIKNGIEAMPNGGTLNISSSISNNKVIIRIEDSGIGMSQEQINRFG 366
N IK+GI +P GG + + + N V + +E++G + G
Sbjct: 266 NGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTG 312


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2385  HTHTETR  65     3e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 65.0 bits (158), Expect = 3e-15
Identities = 24/71 (33%), Positives = 43/71 (60%)

Query: 1 MNKRRYDSDLAKEIIAKKAIELFSLKGYTRTSIDNIAKASGYSKGHIYYHYKNKEELFVY 60
K + ++ ++ I A+ LFS +G + TS+ IAKA+G ++G IY+H+K+K +LF
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 61 LAKDSMKNWHD 71
+ + S N +
Sbjct: 62 IWELSESNIGE 72


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2386  DHBDHDRGNASE  90     2e-23    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 90.1 bits (223), Expect = 2e-23
Identities = 52/186 (27%), Positives = 91/186 (48%)

Query: 3 EQRIAIITGGASGIGKDLAIQLANKDIFVVIADINETSGQDLVNNIKNNNQLARFEYLDV 62
E +IA ITG A GIG+ +A LA++ + D N + +V+++K + A DV
Sbjct: 7 EGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADV 66

Query: 63 TKAESVEDLIIKIANEFGRIDYMFNNAGIAMYGEVSDMSLDNWKHIIEINLLGVIYGTQL 122
+ +++++ +I E G ID + N AG+ G + +S + W+ +N GV ++
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 123 AYQFMKKQGFGYIINTASATGLGPAPLCTAYATTKHAIVGLTTSLHYEAEEYGVNVSVLC 182
++M + G I+ S P AYA++K A V T L E EY + +++
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 183 PTFVDT 188
P +T
Sbjct: 187 PGSTET 192


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2387  TCRTETA  35     1e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 35.2 bits (81), Expect = 1e-04
Identities = 40/171 (23%), Positives = 71/171 (41%), Gaps = 9/171 (5%)

Query: 13 LLLSGVGIANLGAWIYLIALNVLVYHMGGSALAVATLYVIKPLAAL---FTNAWSGSVID 69
++LS V + +G + + L L+ + S A ++ L AL G++ D
Sbjct: 9 VILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSD 68

Query: 70 RLNKRKLMIHLDIYRAVCIAILPLLPSLWIVYVFVFFISMANAIYEPTAMTYMTKLIPVE 129
R +R +++ AV AI+ P LW++Y+ + A A Y+ + +
Sbjct: 69 RFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATG-AVAGAYIADITDGD 127

Query: 130 QRQR-FNSLRSLIGSGASVIGPSIAGALLIASTPE---FAIYMNAIAFLLS 176
+R R F + + G G V GP + G + S A +N + FL
Sbjct: 128 ERARHFGFMSACFGFGM-VAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTG 177



Score = 29.0 bits (65), Expect = 0.014
Identities = 23/116 (19%), Positives = 47/116 (40%), Gaps = 3/116 (2%)

Query: 48 TLYVIKPLAALFTNAWSGSVIDRLNKRKLMIHLDIYRAVCIAILPLLPSLWIVY-VFVFF 106
+L L +L +G V RL +R+ ++ I +L W+ + + V
Sbjct: 251 SLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLL 310

Query: 107 ISMANAIYEPTAMTYMTKLIPVEQRQRFNSLRSLIGSGASVIGPSIAGALLIASTP 162
S I P +++ + E++ + + + S S++GP + A+ AS
Sbjct: 311 AS--GGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASIT 364


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2390  SACTRNSFRASE  28     0.013    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 28.4 bits (63), Expect = 0.013
Identities = 18/56 (32%), Positives = 28/56 (50%), Gaps = 3/56 (5%)

Query: 81 IWHIAVHPDFRRMKIGNQLLNEAEKLAKELNLN--RLEAWTRDDLWVHGWYENNGF 134
I IAV D+R+ +G LL++A + AKE + LE + H +Y + F
Sbjct: 92 IEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACH-FYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2397  SACTRNSFRASE  31     0.001    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 31.1 bits (70), Expect = 0.001
Identities = 30/145 (20%), Positives = 54/145 (37%), Gaps = 19/145 (13%)

Query: 1 MSNVHFRDIDDTNEFKVRNIKLRSGQEKFIETVDECLNEA---NTYHEWHPVAIYYDEEI 57
M++++ +D + NE V ++ E + T E Y + Y +EE
Sbjct: 5 MTHLNMKDFNKPNEPFVVFGRMIPAFENGVWTYTEERFSKPYFKQYEDDDMDVSYVEEE- 63

Query: 58 IGFAMYGSFGPNK--------DTW-----IDRIMIDEKYQGKGYGKIAMMKLINIVSKEY 104
G A + + N W I+ I + + Y+ KG G A++ +KE
Sbjct: 64 -GKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGT-ALLHKAIEWAKEN 121

Query: 105 GVNVIYLSVTEENRTAYNLYESIGF 129
+ L + N +A + Y F
Sbjct: 122 HFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2401  SACTRNSFRASE  29     0.017    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 28.8 bits (64), Expect = 0.017
Identities = 16/55 (29%), Positives = 26/55 (47%), Gaps = 3/55 (5%)

Query: 199 IYDIATKEEMRGKGFGSTMFHYLLQEAKELNVVQCVLQASPD---GINIYKKAGF 250
I DIA ++ R KG G+ + H ++ AKE + +L+ + Y K F
Sbjct: 92 IEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2409  SACTRNSFRASE  29     0.005    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 29.1 bits (65), Expect = 0.005
Identities = 25/123 (20%), Positives = 37/123 (30%), Gaps = 16/123 (13%)

Query: 37 AKVYIKPDGDA---VEYQP------FAIYNGDLMVGFVMHAVVKETTDMYWINGFIIDQK 87
+K Y K D V Y F Y + +G + + I + +
Sbjct: 43 SKPYFKQYEDDDMDVSYVEEEGKAAFLYYLENNCIGRI--KIRSNWNGYALIEDIAVAKD 100

Query: 88 QQGNGYGKAALQESIYLIKNTFKACKEIRLTVHKDNISAKKLYEHYGFRPLGHD---YDG 144
+ G G A L ++I K + L NISA Y + F D Y
Sbjct: 101 YRKKGVGTALLHKAIEWAKE--NHFCGLMLETQDINISACHFYAKHHFIIGAVDTMLYSN 158

Query: 145 EEV 147

Sbjct: 159 FPT 161


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2415  TCRTETA  47     6e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 47.1 bits (112), Expect = 6e-08
Identities = 40/188 (21%), Positives = 80/188 (42%), Gaps = 8/188 (4%)

Query: 206 MMRTKQVYLLFFMLFTSCMGGLYLIGMVKDIGVQLVGLSAATAANAVAMIAIFNTVGRI- 264
M + + ++ + +G + LI V ++ + S A+ ++A++ +
Sbjct: 1 MKPNRPLIVILSTVALDAVG-IGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFAC 59

Query: 265 --VLGTLSDKIGRMKIVSATFIIIGLSVFTLSYIPLNYGIYFACVASVAFCFGGNITIFP 322
VLG LSD+ GR ++ + + ++ P + +Y + VA G +
Sbjct: 60 APVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRI--VAGITGATGAVAG 117

Query: 323 AIVGDYFGLKNHSTNYGIVYQGFGFGALAGSFIGALLGGFQP--TFITIGVLSVISFIIS 380
A + D + ++G + FGFG +AG +G L+GGF P F L+ ++F+
Sbjct: 118 AYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTG 177

Query: 381 LVIRPPKT 388
+ P
Sbjct: 178 CFLLPESH 185



Score = 38.3 bits (89), Expect = 4e-05
Identities = 27/146 (18%), Positives = 58/146 (39%), Gaps = 13/146 (8%)

Query: 8 PLLIVLGTIIVQIGLGTIYTWSLFNQPLVSKFGWNLNSVAITFS-ITSFSLSFSTLFAGK 66
L+ + I+ +G W +F + +F W+ ++ I+ + + G
Sbjct: 213 AALMAVFFIMQLVGQVPAALWVIFGE---DRFHWDATTIGISLAAFGILHSLAQAMITGP 269

Query: 67 LQQKLGLRKLIATAGIVLGLGLILSSQVSS----LPLLYLLAGVVVGYADGTAYITSLSN 122
+ +LG R+ + I G G IL + + P++ LLA +G A ++ +
Sbjct: 270 VAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVD 329

Query: 123 LIKWFPNRKGLISGISVSAYGMGSLI 148
R+G + G + + S++
Sbjct: 330 -----EERQGQLQGSLAALTSLTSIV 350



Score = 34.4 bits (79), Expect = 8e-04
Identities = 53/316 (16%), Positives = 95/316 (30%), Gaps = 36/316 (11%)

Query: 63 FAGKLQQKLGLRKLIATAGIVLGLGLILSSQVSSLPLLYL---LAGVV----VGYADGTA 115
G L + G R ++ + + + + L +LY+ +AG+ A
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIA 121

Query: 116 YITSLSNLIKWFPNRKGLISGISVSAYGMGSLIFRYINGSLIDSLGVSQAFLYWGIIVLL 175
IT + F G +S G ++ G L+ F + L
Sbjct: 122 DITDGDERARHF----GFMSACFGFGMVAGPVL-----GGLMGGFSPHAPFFAAAALNGL 172

Query: 176 LVLIGSFFLREAIVSNTVTETLH--NDYTPREMMRTKQ-VYLLFFMLFTSCMGGLYLIGM 232
L G F L E+ N R V L + F + G +
Sbjct: 173 NFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAAL 232

Query: 233 VKDIGVQLVGLSAATAANAVAMIAIFNTVGRIVL-GTLSDKIGRMKIVSATFIIIGLSVF 291
G A T ++A I +++ + ++ G ++ ++G + + I G
Sbjct: 233 WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYI 292

Query: 292 TLSYIPLNYGIYFACVASVAFCFGGNITIFPAIVGDYFGLKNHSTNYGIVYQGFGFGALA 351
L++ + + PA+ S QG G+LA
Sbjct: 293 LLAFATRGW-----MAFPIMVLLASGGIGMPALQAML------SRQVDEERQGQLQGSLA 341

Query: 352 -----GSFIGALLGGF 362
S +G LL
Sbjct: 342 ALTSLTSIVGPLLFTA 357


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2428  PF06340  29     0.027    Vibrio cholerae toxin co-regulated pilus biosynthesis pr...
		>PF06340#Vibrio cholerae toxin co-regulated pilus biosynthesis protein F (TcpF)

Length = 338

Score = 28.8 bits (64), Expect = 0.027
Identities = 15/50 (30%), Positives = 27/50 (54%)

Query: 48 YLSRQKGRNKKVNTELVADHRYISYRNLIRKRELFGMDFLFDEVPNKIVP 97
+L+++ GR VN V D+ + ++ KR L G + + +PN+I P
Sbjct: 160 FLTKENGRYDIVNVGGVPDNTPVKLPAIVSKRGLMGTTSVVNAIPNEIYP 209


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2432  SACTRNSFRASE  35     5e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 35.3 bits (81), Expect = 5e-05
Identities = 29/110 (26%), Positives = 48/110 (43%), Gaps = 17/110 (15%)

Query: 41 FKYFQKLLDDL-LIEQADGDSYFYLIRNEKKEIVGRINLVDIDTETRSSSLGY------R 93
FK ++ D+ +E+ ++ Y + N +GRI + RS+ GY
Sbjct: 47 FKQYEDDDMDVSYVEEEGKAAFLYYLENN---CIGRIKI-------RSNWNGYALIEDIA 96

Query: 94 VGEKFTKKGVATAAVKLVIEVAKNNKINEIHAKTTTNNLASQSVLEKSGF 143
V + + KKGV TA + IE AK N + +T N+++ K F
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2452  THERMOLYSIN  311    e-101    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 311 bits (799), Expect = e-101
Identities = 185/586 (31%), Positives = 268/586 (45%), Gaps = 63/586 (10%)

Query: 15 MKNKKTLTKVALTTGLALTAVAPYGVGHAEETDQLQVQIQEESFRSGELTQPSQKAPENV 74
M + L + L GL P+G ++ Q + SF SG L +
Sbjct: 1 MNKRAMLGAIGLAFGLM---AWPFGASAKGKSMVWNEQWKTPSFVSGSLLGRCSQ----- 52

Query: 75 VKDALKEKTEQALSPKQVNGETGVDYKVLQKRGSYDGTTLVRIQQTYKGKEVYGHQLTAH 134
+ + +Q + Q+ G+ ++ + G T++R +Q G L AH
Sbjct: 53 --ELVYRYLDQEKNTFQLGGQARERLSLIGNKLDELGHTVMRFEQAIAASLCMGAVLVAH 110

Query: 135 VDNSGVIKSVSGDSAQNLKQEELKKPINLSKDEATQYIYTKYGNDI---NFISEPEVKEV 191
V++ + S+SG NL + LK +S +A + + +E
Sbjct: 111 VNDGE-LSSLSGTLIPNLDKRTLKTEAAISIQQAEMIAKQDVADRVTKERPAAEEGKPTR 169

Query: 192 IFVDENNGQASNAYQVTFEAATPNYVSGTYLINAQNGDMLKNMVQQSNLKASEKLVGALK 251
+ + + AY+V TP + Y+I+A +G +L + + + ++
Sbjct: 170 LVIYPDEETPRLAYEVNVRFLTPVPGNWIYMIDAADGKVLN---KWNQMDEAKP-----G 221

Query: 252 KSKKSSLTSLTGTGKDDLGISRSFGISKQSN-GKYALADYTRGQGIETYDVNYRDITKEE 310
++ + TS G G+ LG + + S G Y L D TRG GI TYD R
Sbjct: 222 GAQPVAGTSTVGVGRGVLGDQKYINTTYSSYYGYYYLQDNTRGSGIFTYDGRNR------ 275

Query: 311 SYYPGTLATSTSTTF---NDPKAVSAHYLATKVFDFYKDKYNRNSFDNQGQKVVSVVHAW 367
+ PG+L F D AV AHY A V+D+YK+ + R S+D + S VH
Sbjct: 276 TVLPGSLWADGDNQFFASYDAAAVDAHYYAGVVYDYYKNVHGRLSYDGSNAAIRSTVHY- 334

Query: 368 DSGDTNDPKNWQNALSANNGSMLVYGD-------PIVKAYDVAGHEFTHAVTSSESNLEY 420
+ + NA NGS +VYGD P DV GHE THAVT + L Y
Sbjct: 335 -------GRGYNNA--FWNGSQMVYGDGDGQTFLPFSGGIDVVGHELTHAVTDYTAGLVY 385

Query: 421 YGESGAINEALSDIMGTSIEKYVNNGSFNWTMGEQT------GSVFRDMENPASVPSSLG 474
ESGAINEA+SDI GT +E Y N + +W +GE G R M +PA
Sbjct: 386 QNESGAINEAMSDIFGTLVEFY-ANRNPDWEIGEDIYTPGVAGDALRSMSDPAKYGD--- 441

Query: 475 VPYPDDYSEFNDFNGWDQGGVHFNSSIINKVAYLIAKGGTHNGVTVKGIGEDKMFDIFYY 534
PD YS+ D GGVH NS IINK AYL+++GG H GV+V GIG DKM IFY
Sbjct: 442 ---PDHYSKRYTGTQ-DNGGVHTNSGIINKAAYLLSQGGVHYGVSVTGIGRDKMGKIFYR 497

Query: 535 ANTDELNMTSNFKELRSACIRVATNKYGANSAEVQAVQKAFDAAKI 580
A L TSNF +LR+AC++ A + YG+ S EV +V++AF+A +
Sbjct: 498 ALVYYLTPTSNFSQLRAACVQAAADLYGSTSQEVNSVKQAFNAVGV 543


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2457  SECYTRNLCASE  445    e-157    Preprotein translocase SecY subunit signature.
		>SECYTRNLCASE#Preprotein translocase SecY subunit signature.

Length = 437

Score = 445 bits (1146), Expect = e-157
Identities = 181/445 (40%), Positives = 271/445 (60%), Gaps = 22/445 (4%)

Query: 1 MFRTISNFMRVAEIRRKILFTLAMLIVFRIGTFIPVPHTNAEVLK-----IQDQANVLGM 55
M + R ++R+K+LFTLA+++V+R+GT IP+P + + ++ + G+
Sbjct: 1 MLTAFARAFRTPDLRKKLLFTLAIIVVYRVGTHIPIPGVDYKNVQQCVREASGNQGLFGL 60

Query: 56 LNVFGGGALQHFSIFAVGITPYITASIIVQLLQMDVIPKFSEWAKQGEMGRKKSAQFTRY 115
+N+F GGAL +IFA+GI PYITASII+QLL + VIP+ K+G+ G K Q+TRY
Sbjct: 61 VNMFSGGALLQITIFALGIMPYITASIILQLLTV-VIPRLEALKKEGQAGTAKITQYTRY 119

Query: 116 FTIILAFIQAIGMSYGFNNI-------AGGQLITDQSWTTYLFIATVLTAGTAFLLWLGE 168
T+ LA +Q G+ + GGQ++ DQS T + + +TAGT ++WLGE
Sbjct: 120 LTVALAILQGTGLVATARSAPLFGRCSVGGQIVPDQSIFTTITMVICMTAGTCVVMWLGE 179

Query: 169 QITANGVGNGISMLIFAGLVAAIPNVANQIYLQQFQNAGDQLFMHIIKMLLIGLVILAIV 228
IT G+GNG+S+L+F + A P+ I Q G F +I V L +V
Sbjct: 180 LITDRGIGNGMSILMFISIAATFPSALWAIKKQGTLAGGWIEFGTVI------AVGLIMV 233

Query: 229 VGVIYIQQAVRKIPIQYAKAVSGNNQYQGAKNTHLPLKVNSAGVIPVIFASAFLMTPRTI 288
V++++QA R+IP+QYAK + G Y G +T++PLKVN AGVIPVIFAS+ L P +
Sbjct: 234 ALVVFVEQAQRRIPVQYAKRMIGRRSY-GGTSTYIPLKVNQAGVIPVIFASSLLYIPALV 292

Query: 289 AQLFPDSSVSKWLVAN--LDFAHPIGMTLYVGLIVAFTYFYAFIQVNPEQMAENLKKQNG 346
AQ +S K V HPI + Y LIV F +FY I NPE++A+N+KK G
Sbjct: 293 AQFAGGNSGWKSWVEQNLTKGDHPIYIVTYFLLIVFFAFFYVAISFNPEEVADNMKKYGG 352

Query: 347 YVPGIRPGKSTEQYVTKILYRLTFIGAIFLGAISILPLVFTKIATLPPSAQIGGTSLLII 406
++PGIR G+ T +Y++ +L R+T+ G+++LG I+++P + + GGTS+LII
Sbjct: 353 FIPGIRAGRPTAEYLSYVLNRITWPGSLYLGLIALVPTMALVGFGASQNFPFGGTSILII 412

Query: 407 VGVALETMKTLESQLVKRHYKGFIK 431
VGV LET+K +ESQL +R+Y+GF++
Sbjct: 413 VGVGLETVKQIESQLQQRNYEGFLR 437


22  BALH_2564  BALH_2577  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2564  0       13       -3.316582  x-prolyl-dipeptidyl aminopeptidase
BALH_2566  -2      13       -4.769653  sensor histidine kinase
BALH_2567  -2      15       -3.878936  two-component response regulator
BALH_2568  0       16       -4.231593  hypothetical protein
BALH_2569  1       17       -4.082386  acetyltransferase
BALH_2570  1       17       -3.811123  alpha/beta hydrolase
BALH_2571  0       15       -2.997068  bifunctional
BALH_2572  1       16       -3.191743  N-acetylmannosaminyltransferase
BALH_2573  1       16       -3.721479  oligoendopeptidase F
BALH_2575  1       16       -3.511383  amino acid permease
BALH_2576  1       17       -3.353516  AsnC family transcriptional regulator
BALH_2577  1       18       -3.435674  DegV family protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2566  PF06580  35     5e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 35.2 bits (81), Expect = 5e-04
Identities = 34/173 (19%), Positives = 60/173 (34%), Gaps = 38/173 (21%)

Query: 282 IIKQSDHISNLIEEL---LRFS---KLERDVLQKEEFSIKSLVQSILDKHKIELESKEIN 335
I++ ++ L +R+S R V +E + +V S L I+ E +
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELT---VVDSYLQLASIQFEDR--- 239

Query: 336 LQVNYNVGDAIVYADVNKMRMVFQNLISNAIKY-----TSNQNIKITLEDRNESVYFQIQ 390
LQ + AI+ V M + Q L+ N IK+ I + N +V +++
Sbjct: 240 LQFENQINPAIMDVQVPPM--LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVE 297

Query: 391 NGMNAEHMKDIDKIWEPFYVLESSRSKDHSGTGLGLAIVKSILE-RHGFDYGV 442
N TG GL V+ L+ +G + +
Sbjct: 298 N--TGSLALK----------------NTKESTGTGLQNVRERLQMLYGTEAQI 332


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_2567  HTHFIS  90     4e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.9 bits (223), Expect = 4e-23
Identities = 27/117 (23%), Positives = 54/117 (46%), Gaps = 1/117 (0%)

Query: 2 KVLIADDEQDMLRILKAYFEKEGFEVFLAKDGEEALQIFYDEKIDLAILDWMMPKHSGIT 61
+L+ADD+ + +L + G++V + + + DL + D +MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VCQEIKK-NSSVKVLMLTAKSESEDELAALQSGADEYVKKPFHPGVLITRAKKLIQH 117
+ IKK + VL+++A++ + A + GA +Y+ KPF LI + +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2569  SACTRNSFRASE  28     0.015    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 27.6 bits (61), Expect = 0.015
Identities = 20/100 (20%), Positives = 37/100 (37%), Gaps = 27/100 (27%)

Query: 29 EGFKFLKKLINEYENELNTF-----------------------NKSGECLYGIFQGEKLI 65
E F ++I +EN + T+ + G+ + + I
Sbjct: 18 EPFVVFGRMIPAFENGVWTYTEERFSKPYFKQYEDDDMDVSYVEEEGKAAFLYYLENNCI 77

Query: 66 GIGGLNADPYTENNKIGRLRRFYIAKDYRRIGLGKLLLNK 105
G + ++ N + +AKDYR+ G+G LL+K
Sbjct: 78 GRIKIRSNW----NGYALIEDIAVAKDYRKKGVGTALLHK 113


23  BALH_2588  BALH_2605  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2588  2       13       -1.963459  major facilitator family transporter
BALH_2589  2       17       -2.661094  hypothetical protein
BALH_2591  2       16       -3.132577  hypothetical protein
BALH_2593  1       15       -3.409380  hypothetical protein
BALH_2594  2       10       -3.094778  hypothetical protein
BALH_2595  1       12       -3.471055  inosine-uridine preferring nucleoside hydrolase
BALH_2596  -2      12       -4.435113  hypothetical protein
BALH_2597  -1      13       -3.062250  methyltransferase
BALH_2598  0       15       -2.407826  hypothetical protein
BALH_2599  -1      17       -2.301112  macrolide efflux protein
BALH_2600  0       17       -2.491528  hypothetical protein
BALH_2601  -1      15       -2.997256  hypothetical protein
BALH_2602  -1      16       -2.990480  aspartate aminotransferase
BALH_2603  -1      19       -4.042597  (3R)-hydroxymyristoyl-ACP dehydratase
BALH_2604  0       16       -3.851060  pantothenate kinase
BALH_2605  -2      13       -4.110765  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2588  TCRTETA  84     5e-20    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 84.5 bits (209), Expect = 5e-20
Identities = 51/319 (15%), Positives = 116/319 (36%), Gaps = 11/319 (3%)

Query: 78 LIFGLQPLSDIIFTLIAGRVTDKYGRKKIMLLGLLLQGVAIGSFIFAQSLFIFALLYVIN 137
++ L L + G ++D++GR+ ++L+ L V A L++ + ++
Sbjct: 47 ILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVA 106

Query: 138 GVGRSLYIPAQRAQIADLTKHGQQAEIFSLLQTMGAIGTLIGPLIGTIFYKAHPEYVFIV 197
G+ + A IAD+T ++A F + G + GP++G + P F
Sbjct: 107 GITGAT-GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFA 165

Query: 198 QSIVLIAYAVVVWTQLPETAPAMTTPTQKLEVSSPKQFVQKHY--AVFGLMITTLPISFF 255
+ + + LPE+ P ++ ++ F V LM +
Sbjct: 166 AAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLV 225

Query: 256 YAQTETNYLIFVKHTLPDFDRILVFIT-TCKALMEITLQVFLV-KWSERFSMAKIILISY 313
++IF + +D + I+ ++ Q + + R + +++
Sbjct: 226 GQVPAALWVIFGEDRF-HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLG- 283

Query: 314 TCYTIAAIGYGLSATIAS--LFFTLLFLVIGGSMALNHLLRFVSEIAPSDKRGLYFSIYG 371
GY L A + F ++ L+ G + + L +S +++G
Sbjct: 284 --MIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLA 341

Query: 372 LHWDVSRTCGPVIGAVLLS 390
++ GP++ + +
Sbjct: 342 ALTSLTSIVGPLLFTAIYA 360



Score = 46.4 bits (110), Expect = 1e-07
Identities = 24/137 (17%), Positives = 59/137 (43%), Gaps = 2/137 (1%)

Query: 58 AIIMIIYVNKMLNGNIMMTMLIFGLQPLSDIIF-TLIAGRVTDKYGRKKIMLLGLLLQGV 116
A + +I+ + + + + + +I G V + G ++ ++LG++ G
Sbjct: 230 AALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGT 289

Query: 117 AIGSFIFAQSLFIFALLYVINGVGRSLYIPAQRAQIADLTKHGQQAEIFSLLQTMGAIGT 176
FA ++ + V+ G + +PA +A ++ +Q ++ L + ++ +
Sbjct: 290 GYILLAFATRGWMAFPIMVLLASG-GIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTS 348

Query: 177 LIGPLIGTIFYKAHPEY 193
++GPL+ T Y A
Sbjct: 349 IVGPLLFTAIYAASITT 365


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2589  TYPE4SSCAGA  28     0.030    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 28.1 bits (62), Expect = 0.030
Identities = 35/130 (26%), Positives = 51/130 (39%), Gaps = 20/130 (15%)

Query: 20 LAACKGTDEKKETNP----TSENSKNEQNTSSEGK-----KEPEVKSNTDSNSKDTVINQ 70
L A KG+ + NP EN N GK K + KS+ +++ KD +INQ
Sbjct: 719 LKALKGSVKDLGINPEWISKVENLNAALNEFKNGKNKDFSKVTQAKSDLENSVKDVIINQ 778

Query: 71 KSINHVKNLFELAKEGKVPNVPFAAHTGDIEEIEKAWGKADKTEQAGNGMYATFTNKNVS 130
K + V NL + K TGD +E+A + A KN S
Sbjct: 779 KVTDKVDNLNQAVSVAKA--------TGDFSRVEQALADLKNFSKE---QLAQQAQKNES 827

Query: 131 FGFNKGSQVF 140
K S+++
Sbjct: 828 LNARKKSEIY 837


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2599  TCRTETA  46     1e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 46.0 bits (109), Expect = 1e-07
Identities = 66/342 (19%), Positives = 119/342 (34%), Gaps = 11/342 (3%)

Query: 1 MWRNKNVWIVLIGEFIAGLGLWLGILGNLEFMQKYVPSDFMKS---VILFIGLLAGVLVG 57
M N+ + ++L + +G+ L + ++ V S+ + + ++L + L
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 58 PMAGRIIDQYEKKKVLLYAGFGRVISVIFMFFAIQFESIAFMIAFMVALQISAAFYFPAL 117
P+ G + D++ ++ VLL + G + M A + I +VA A
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVL--YIGRIVAGITGATGAVAG- 117

Query: 118 QSVIPLIVREHELLQMNGVHMNVGTIARIAGTSLGGILLVVMSLQYMYAFSMAAYALLFL 177
+ I I E + G +AG LGG L+ S + + A L FL
Sbjct: 118 -AYIADITDGDERARHFGFMSACFGFGMVAGPVLGG-LMGGFSPHAPFFAAAALNGLNFL 175

Query: 178 STFFLQFEDKKSSTPSKQAAKDNSFMEVFRILRGIPIAFTALILSIIPLLFIAGFNLMVI 237
+ FL E K + N +A + I+ L+ L VI
Sbjct: 176 TGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVI 235

Query: 238 -NISEMQHDPTIKGFIYTIEGIAFMLG-AFVIKRLSDHFKPEKLLYFFAVCTAFAHLSLF 295
D T G GI L A + ++ + L + ++ L
Sbjct: 236 FGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLA 295

Query: 296 FSDIKWMSLTSFGLFGFSVGCFFPIMSTIFQTKVEKSYHGRL 337
F+ WM+ L G P + + +V++ G+L
Sbjct: 296 FATRGWMAFPIMVLLAS-GGIGMPALQAMLSRQVDEERQGQL 336


24  BALH_2701  BALH_2706  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2701  1       14       -3.066425  glycine betaine/L-proline ABC transporter
BALH_2702  3       18       -3.832223  hypothetical protein
BALH_2703  4       19       -4.068927  glycosyl transferase family protein
BALH_2704  4       19       -4.005645  hypothetical protein
BALH_2705  2       17       -3.155097  hypothetical protein
BALH_2706  3       17       -3.117031  glycosyl transferase family protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2701  TCRTETA  56     2e-10    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 55.6 bits (134), Expect = 2e-10
Identities = 51/312 (16%), Positives = 111/312 (35%), Gaps = 54/312 (17%)

Query: 78 QLVLTFGTFAAAF-LVRPIGGVFFGRIGDKYGRKIVLSTTIILMALSTLFIALLPTYEQI 136
+ +G A + L++ G + D++GR+ VL ++ A+ +A P
Sbjct: 40 DVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLW-- 97

Query: 137 GVWAPILLLVARMIQGFSTGGEYSGAMVYIAESSPDKKR--------GILGSGLEIGTLS 188
+L + R++ G TG + A YIA+ + +R G G+ G +
Sbjct: 98 ------VLYIGRIVAGI-TGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVL 150

Query: 189 GYIAASVIVTILTLLLTDEQMLSWGWRIPFLIAAPIGLVGLYLRRHLDESPIFEEMEKAQ 248
G + PF AA + + L E K +
Sbjct: 151 GGLMGGF-----------------SPHAPFFAAAALNGLNFLTGCFL-----LPESHKGE 188

Query: 249 EESEDNEQFSFMDIIKYHKKDFLLSTVIVAFFNITNYMILSYIPSYLTQVLKVKETTGLL 308
E + + ++ + +++ ++ FF + ++ +P+ L V+ ++
Sbjct: 189 RRPLRREALNPLASFRWARGMTVVAALMAVFFIM---QLVGQVPAAL-WVIFGEDRFHWD 244

Query: 309 IISITMAL-------MIPLALYFGKLSDKIGNKRVVQIGLLGLTVFAIPAFLLIGNGHIA 361
+I ++L + A+ G ++ ++G +R + +LG+ LL
Sbjct: 245 ATTIGISLAAFGILHSLAQAMITGPVAARLGERRAL---MLGMIADGTGYILLAFATRGW 301

Query: 362 AIFAGIFVLGFF 373
F + +L
Sbjct: 302 MAFPIMVLLASG 313


25  BALH_2721  BALH_2731  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2721  2       13       -3.172550  hypothetical protein
BALH_2722  1       14       -2.627468  MutT/Nudix family protein
BALH_2723  1       13       -3.925792  hypothetical protein
BALH_2724  1       14       -4.121627  ABC transporter ATP-binding protein
BALH_2725  1       16       -3.455620  GntR family transcriptional regulator
BALH_2726  1       15       -2.670463  isoflavone reductase
BALH_2727  0       16       -3.339128  ABC transporter ATP-binding protein
BALH_2728  1       16       -3.874490  ABC transporter permease
BALH_2729  1       17       -2.582982  ABC transporter permease
BALH_2730  0       17       -0.584652  hypothetical protein
BALH_2731  2       14       0.138432   beta-lactamase class C and other
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2726  NUCEPIMERASE  31     0.005    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 31.3 bits (71), Expect = 0.005
Identities = 9/27 (33%), Positives = 15/27 (55%)

Query: 1 MKILILGGTRFLGRAFVEKALNRGYEV 27
MK L+ G F+G ++ L G++V
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQV 27


26  BALH_2781  BALH_2829  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2781  3       19       -0.881435  metallo-beta-lactamase family protein
BALH_2782  5       18       -2.111204  hypothetical protein
BALH_2783  4       17       -2.142104  hypothetical protein
BALH_2784  5       16       -2.498663  spore coat protein CotF
BALH_2785  5       16       -2.287536  hypothetical protein
BALH_2786  5       15       -2.220125  hypothetical protein
BALH_2789  4       14       -1.052401  amino acid permease
BALH_2790  2       16       -0.339554  small acid-soluble spore protein
BALH_2791  2       11       0.617264   hypothetical protein
BALH_2792  2       10       1.037898   hypothetical protein
BALH_2793  1       11       0.922462   small, acid-soluble spore protein
BALH_2794  1       10       0.175702   alcohol dehydrogenase, glutathione-dependent
BALH_2795  1       10       -0.494236  Mn-containing catalase
BALH_2796  0       11       -0.988769  aspartate ammonia-lyase
BALH_2797  3       16       -2.605729  asparaginase
BALH_2798  2       16       -3.006032  ans operon repressor protein
BALH_2799  1       16       -2.911961  GerC family spore germination protein
BALH_2800  1       15       -2.541973  GerB family spore germination protein
BALH_2801  1       15       -2.591546  GerA family spore germination protein
BALH_2802  1       16       -3.109062  PAS/PAC sensor signal transduction histidine
BALH_2803  3       16       -3.016707  hemolysin BL binding component
BALH_2804  2       15       -3.734448  hemolysin BL binding component B
BALH_2806  2       14       -3.141254  hemolysin BL lytic component L1
BALH_2807  1       15       -3.722009  hemolysin BL lytic component L2
BALH_2808  0       17       -3.928514  AraC family transcriptional regulator
BALH_2809  1       15       -3.571048  hypothetical protein
BALH_2810  1       16       -2.889852  UvrC-like protein
BALH_2812  2       14       -1.830629  amino acid permease
BALH_2813  1       14       -2.934323  pyrroline-5-carboxylate reductase
BALH_2814  1       14       -3.698053  response regulator
BALH_2816  2       14       -3.268459  two-component sensor histidine kinase
BALH_2817  2       13       -3.180206  glutaminase
BALH_2819  2       14       -3.057736  glutamine permease
BALH_2820  0       17       -3.431093  transporter
BALH_2821  0       16       -2.251620  hypothetical protein
BALH_2822  1       15       -1.610026  hypothetical protein
BALH_2823  1       15       -1.539631  hypothetical protein
BALH_2824  0       15       -1.179150  5'-nucleotidase
BALH_2825  2       17       -1.521997  hypothetical protein
BALH_2826  3       17       -1.806913  hypothetical protein
BALH_2827  4       17       -2.311731  magnesium and cobalt transport protein
BALH_2828  2       18       -1.458588  hypothetical protein
BALH_2829  3       19       -2.048311  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2806  PHPHTRNFRASE  31     0.013    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 30.5 bits (69), Expect = 0.013
Identities = 21/88 (23%), Positives = 33/88 (37%), Gaps = 14/88 (15%)

Query: 197 DGKGGLTAILAGQQATIPQLQAEIEQLRATQKKHFDDVLAWSIGGGLGAAILVIAAIGGA 256
DG G I+ + + + + QK+ + ++ GA + + A IG
Sbjct: 222 DGIEG-IVIVNPTEEEVKAYEEKRAAFEK-QKQEWAKLVGEPSTTKDGAHVELAANIG-- 277

Query: 257 VVIVVTGGTATPAVVGGLSALGAAGIGL 284
TP V G+ A G GIGL
Sbjct: 278 ----------TPKDVDGVLANGGEGIGL 295


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_2814  HTHFIS  51     2e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 51.4 bits (123), Expect = 2e-09
Identities = 25/137 (18%), Positives = 60/137 (43%), Gaps = 8/137 (5%)

Query: 5 IVDDEKAVRSMLAQIIEDEDLG-EVTGEAEDGLSLEQHMPILKKIDILFIDLLMPIQDGI 63
+ DD+ A+R++L Q + +T A D++ D++MP ++
Sbjct: 8 VADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDG----DLVVTDVVMPDENAF 63

Query: 64 KTIRQIKPSFKG-KIIMVSQVESKELIAEAYSLGVEYYIIKPINRIEVLTVVRKVIE--R 120
+ +IK + ++++S + +A G Y+ KP + E++ ++ + + +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 121 IRLEKSIKNIQESLNMV 137
R K + Q+ + +V
Sbjct: 124 RRPSKLEDDSQDGMPLV 140


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2816  PF06580  36     3e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 35.6 bits (82), Expect = 3e-04
Identities = 23/125 (18%), Positives = 49/125 (39%), Gaps = 22/125 (17%)

Query: 308 IQFVYKIDGVHSHYHI--YTILSIINNIITNAVEAIQSMGTITIDINKDHHFVEFQIGDN 365
+QF +I+ + + +++ N I + + + G I + KD+ V ++ +
Sbjct: 240 LQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENT 299

Query: 366 GPGISSKYKDSIFEPGFTSKYDDLGNPSTGIGLSYVKEMVEQLEGD---VTLEDRTEGKG 422
G K+ STG GL V+E ++ L G + L ++
Sbjct: 300 GSLALKNTKE-----------------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN 342

Query: 423 SIFII 427
++ +I
Sbjct: 343 AMVLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2827  PF01540  29     0.035    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 28.5 bits (63), Expect = 0.035
Identities = 15/58 (25%), Positives = 31/58 (53%), Gaps = 6/58 (10%)

Query: 191 LENLTNTLCYKQTHVKLERGFQLVKEYQEELDTMIHLQEVVSSHRGNEIMKALTVLTA 248
+++ +T+ T KLER FQ+ ++++++L + I L + + E+ TV T
Sbjct: 259 IQSFADTI--ALTITKLERKFQIDEKFKKQLISTIELL----NKKSVEVKTFATVNTI 310


27  BALH_2841  BALH_2846  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2841  -1      12       -4.797204  1,4-dihydroxy-2-naphthoate
BALH_2842  0       13       -5.881028  alpha/beta fold family hydrolase
BALH_2843  0       13       -6.035935  spore maturation protein
BALH_2844  0       13       -5.428755  ABC transporter ATP-binding protein
BALH_2845  -1      13       -4.970472  ABC transporter permease associated with
BALH_2846  -1      13       -3.249037  two-component sensor histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2846  PF06580  47     1e-07    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 46.8 bits (111), Expect = 1e-07
Identities = 53/335 (15%), Positives = 123/335 (36%), Gaps = 62/335 (18%)

Query: 214 YIIILFPLFSIHAIAFKNHMSMKQYFPLKKGAIA--------FIFLCIFIFMSALGVLFQ 265
+ I + + + A+++ + + + L G I I + F+ +++ L
Sbjct: 44 FNIAISLMGLVLTHAYRSFIKRQGWLKLNMGQIILRVLPACVVIGMVWFVANTSIWRLLA 103

Query: 266 FHMMT-------------YFIFIHTIIWFVQLYFTLLFQDTKNTISNLNTKELKTLPDSS 312
F + + + T +W + LYF F N E+
Sbjct: 104 FINTKPVAFTLPLALSIIFNVVVVTFMWSL-LYFGWHFF------KNYKQAEIDQW---- 152

Query: 313 YVHNLVQIVKEEQLKSGFSNYLHDEI----LQDILAVKNMMNKSNKKEIHDMIVATLDNL 368
+ + +E QL + L +I + + L + + + +M+ + + +
Sbjct: 153 ---KMASMAQEAQLMA-----LKAQINPHFMFNALNNIRALILEDPTKAREMLTSLSELM 204

Query: 369 SHTIRIEMQEYHPTLLKTLTLKENYHNLLKMIQKKYETKHVNISFTCNDDLFLVEPYHLV 428
+++R +L LT+ ++Y L + ++E + + N + V+ +
Sbjct: 205 RYSLRYSNARQV-SLADELTVVDSYLQLASI---QFEDR-LQFENQINPAIMDVQVPPM- 258

Query: 429 VYRILKELVTNAFKH-----SNCSQIYLHLTQENDEIKLIVKDDGKGLATTEADILNGHK 483
+++ LV N KH +I L T++N + L V++ G + +
Sbjct: 259 ---LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLAL--KNTKESTGT 313

Query: 484 GLNSIKEQLLLLNGE--MTISNSNPSGLCITIFIP 516
GL +++E+L +L G + + + IP
Sbjct: 314 GLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


28  BALH_2885  BALH_2937  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2885  2       15       -2.517073  hypothetical protein
BALH_2886  5       17       -3.178125  3-oxoacyl-ACP synthase
BALH_2887  4       20       -3.503241  hypothetical protein
BALH_2888  2       17       -2.980100  hypothetical protein
BALH_2889  0       15       -2.726454  aminopeptidase
BALH_2890  1       13       -2.666609  CAAX amino terminal protease family protein
BALH_2891  3       12       -1.720379  MerR family transcriptional regulator
BALH_2892  4       11       -1.077249  hypothetical protein
BALH_2893  2       13       -0.498143  hypothetical protein
BALH_2894  3       14       -0.425399  cell wall anchor domain-containing protein
BALH_2895  2       14       0.332126   ABC transporter ATP-binding protein
BALH_2896  0       14       -0.484688  ABC transporter permease
BALH_2897  0       13       -0.825018  ArsR family transcriptional regulator
BALH_2900  0       13       -1.177470  macrolide-efflux protein
BALH_2901  2       12       -2.564977  sensor histidine kinase
BALH_2902  2       12       -2.726454  two-component response regulator
BALH_2904  2       11       -2.752951  hypothetical protein
BALH_2905  2       13       -0.448179  MarR family transcriptional regulator
BALH_2906  0       13       0.484986   spermine/spermidine N-acetyltransferase
BALH_2907  -1      12       1.210554   protease synthase and sporulation negative
BALH_2908  1       13       1.019312   hypothetical protein
BALH_2909  3       17       0.496949   F0F1 ATP synthase subunit alpha
BALH_2910  2       18       -0.155202  hypothetical protein
BALH_2911  2       17       -0.797475  microcin C7 self-immunity protein
BALH_2912  4       19       -1.827229  hypothetical protein
BALH_2913  4       20       -2.194024  hydroxyacylglutathione hydrolase
BALH_2914  1       17       -1.438536  sulfide-quinone reductase
BALH_2915  -1      18       -0.858159  neutral protease
BALH_2916  0       17       -1.370581  glycerophosphoryl diester phosphodiesterase
BALH_2917  2       19       -2.556905  teicoplanin resistance protein
BALH_2918  2       20       -2.171691  D-alanyl-D-alanine carboxypeptidase
BALH_2919  2       18       -2.425508  vancomycin sensor histidine kinase
BALH_2920  4       18       -3.957395  vancomycin response regulator
BALH_2921  6       19       -5.026696  penicillin-binding protein
BALH_2922  5       19       -5.180348  fucose permease
BALH_2923  4       19       -4.677345  Gfo/Idh/MocA family oxidoreductase
BALH_2924  3       17       -4.309908  HAD family phosphatase
BALH_2925  2       17       -4.000450  aminotransferase
BALH_2926  2       15       -3.023933  LacI family transcriptional regulator
BALH_2927  2       15       -1.922523  hypothetical protein
BALH_2928  2       16       -1.105333  peroxidase
BALH_2929  2       14       -1.223721  DNA-damage repair protein
BALH_2930  3       15       -1.303689  hypothetical protein
BALH_2931  3       15       -1.603522  methyl-accepting chemotaxis protein
BALH_2932  4       19       -2.112529  serine transporter
BALH_2933  5       21       -2.479217  L-serine ammonia-lyase
BALH_2934  6       19       -3.244194  L-serine ammonia-lyase
BALH_2935  6       19       -3.186114  cell filamentation protein
BALH_2936  6       19       -2.968722  diaminobutyrate--2-oxoglutarate
BALH_2937  5       19       -2.902144  hydrogenase maturation protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
BALH_2894  GPOSANCHOR  37     2e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 36.6 bits (84), Expect = 2e-04
Identities = 40/258 (15%), Positives = 95/258 (36%), Gaps = 3/258 (1%)

Query: 50 LAEIKQHKQELDAKLQQHKENVDQTLNELNKVKENVDTKVNELHERKQVADEKINEIKQH 109
A Q + A L++ E + + ++ + L RK ++ +
Sbjct: 111 KASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNF 170

Query: 110 KQELDAKLQQ---DKQIAEDKIAEIKEHKKQVEDKVAEVKEHKQNIDNKVNEIKEHKQTV 166
AK++ +K E + AE+++ + + + ++ + + K +
Sbjct: 171 STADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADL 230

Query: 167 DEKVNEMKQHKENIDQKVNELKEVKKQVDEKLAELKKAKQTAEDKLAELKENKPNTGNTL 226
++ + K+ L+ K ++ + AEL+KA + A +
Sbjct: 231 EKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEK 290

Query: 227 EELKKIKGNLDSLSANLELAKQDVKNKLAALQEARQDLINKINEIKQSKQTVSDELSKKK 286
L+ K +L+ S L +Q ++ L A +EA++ L + ++++ + +
Sbjct: 291 AALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLR 350

Query: 287 QDLDIKINDFKHTEKKID 304
+DLD K E +
Sbjct: 351 RDLDASREAKKQLEAEHQ 368


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2901  PF06580  36     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.4 bits (84), Expect = 2e-04
Identities = 17/102 (16%), Positives = 34/102 (33%), Gaps = 22/102 (21%)

Query: 359 NIFTNSIKFSNEGGTIEFFVEELEFSVIISISDNGIGMEKEEMDRIFDRFYKVDTARARN 418
N + I +GG I + +V + + + G K
Sbjct: 266 NGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTK----------------- 308

Query: 419 VEGSGLGLSIVQKIVELHNGN---VSVYSTKGEGTTVRVELP 457
E +G GL V++ +++ G + + +G V +P
Sbjct: 309 -ESTGTGLQNVRERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_2902  HTHFIS  81     7e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 81.0 bits (200), Expect = 7e-20
Identities = 34/123 (27%), Positives = 61/123 (49%), Gaps = 1/123 (0%)

Query: 1 MKMIHILLADDDKHIRELLHYHLQKEGFKVFEAEDGKVAQDVLEKENIHLAIVDIMMPFV 60
M IL+ADDD IR +L+ L + G+ V + + + L + D++MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGYTLCEEIRK-YHDIPVILLTAKDQLVDKEKGFISGTDDYIVKPFEPAEVIFRMKALLR 119
+ + L I+K D+PV++++A++ + K G DY+ KPF+ E+I + L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 RYQ 122
+
Sbjct: 121 EPK 123


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2906  SACTRNSFRASE  47     3e-09    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 47.2 bits (112), Expect = 3e-09
Identities = 32/158 (20%), Positives = 56/158 (35%), Gaps = 15/158 (9%)

Query: 1 MTTHIEKCTLEDIHKLQEISYETFNETF-KHQNSPENMHHYLEKAFNLKQLEKE------ 53
M + ++D +K E + F +N KQ E +
Sbjct: 1 MIMKMTHLNMKDFNKPNE-PFVVFGRMIPAFENGVWTYTEERFSKPYFKQYEDDDMDVSY 59

Query: 54 LSNISSQFFFVYFNDEIAGYLKVNIDDAQSEEMGDESLEVERIYIKSSFQKHGLGKYLLN 113
+ F Y + G +K+ + +E I + ++K G+G LL+
Sbjct: 60 VEEEGKAAFLYYLENNCIGRIKIR-------SNWNGYALIEDIAVAKDYRKKGVGTALLH 112

Query: 114 NAIEIAIEHNKKNIWLGVWEKNENAIAFYKKLGFVQAG 151
AIE A E++ + L + N +A FY K F+
Sbjct: 113 KAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2915  THERMOLYSIN  459    e-159    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 459 bits (1183), Expect = e-159
Identities = 213/539 (39%), Positives = 307/539 (56%), Gaps = 54/539 (10%)

Query: 64 WNEEQGSPSFLSGDLSDKKVETQKAVKEFLEENKELFKI--NPQKDLTLKEVKSDDLGMK 121
WNE+ +PSF+SG L + +Q+ V +L++ K F++ ++ L+L K D+LG
Sbjct: 32 WNEQWKTPSFVSGSLLGRC--SQELVYRYLDQEKNTFQLGGQARERLSLIGNKLDELGHT 89

Query: 122 HYVYTRSINKIPVDGAQFIVHTDKEGKVTTVNGDVHPAAEESLKGNTKAKITKETALSNA 181
+ ++I GA + H + +G++++++G + P ++ A
Sbjct: 90 VMRFEQAIAASLCMGAVLVAHVN-DGELSSLSGTLIPNLDKRTLKTEAA----------- 137

Query: 182 WKHIKLTKSDTLVKMDGNALDQIKENLESTNE-TADLVVYEKDGTYYLTFKVKLQFIKPY 240
I + +++ + K D KE + LV+Y + T L ++V ++F+ P
Sbjct: 138 ---ISIQQAEMIAKQDVAD-RVTKERPAAEEGKPTRLVIYPDEETPRLAYEVNVRFLTPV 193

Query: 241 GANWHIYVNAEDGTIVDSYNAVTDAD---------SAHKGYGYGVLGDKKELNTTFSSVK 291
NW ++A DG +++ +N + +A ++ G G GVLGD+K +NTT+SS
Sbjct: 194 PGNWIYMIDAADGKVLNKWNQMDEAKPGGAQPVAGTSTVGVGRGVLGDQKYINTTYSSYY 253

Query: 292 GKYYLKDTTKPMNGGYIETFTVNHSDEDYPINYRLFDEDNAWINKDQRPAVDAHYYAGKV 351
G YYL+D T+ G I T+ + P D DN + AVDAHYYAG V
Sbjct: 254 GYYYLQDNTR---GSGIFTYDGRNR-TVLP-GSLWADGDNQFFASYDAAAVDAHYYAGVV 308

Query: 352 YDYYKNVHNRNSIDGKGKTIRSAVNYGVNVNNAFWNGQQMIYGDGDGRRFIPLSGSLDVV 411
YDYYKNVH R S DG IRS V+YG NNAFWNG QM+YGDGDG+ F+P SG +DVV
Sbjct: 309 YDYYKNVHGRLSYDGSNAAIRSTVHYGRGYNNAFWNGSQMVYGDGDGQTFLPFSGGIDVV 368

Query: 412 AHELTHAVTEYSADLRYVNQSGALNESFSDVFGYFVDPT-----NWDVGDAVFTPGVSGD 466
HELTHAVT+Y+A L Y N+SGA+NE+ SD+FG V+ +W++G+ ++TPGV+GD
Sbjct: 369 GHELTHAVTDYTAGLVYQNESGAINEAMSDIFGTLVEFYANRNPDWEIGEDIYTPGVAGD 428

Query: 467 ALRSLSNPEKYGQPAHMRNYQYLPETEEGDNGGVHINSGIPNKAAYL----------TIN 516
ALRS+S+P KYG P H T DNGGVH NSGI NKAAYL ++
Sbjct: 429 ALRSMSDPAKYGDPDHYSKRY----TGTQDNGGVHTNSGIINKAAYLLSQGGVHYGVSVT 484

Query: 517 SIGKEKAEKIYYRALTIYLTPTSDFKQARTALLQSAADYDGYDSVTYKAIENAWNQVGV 575
IG++K KI+YRAL YLTPTS+F Q R A +Q+AAD G S +++ A+N VGV
Sbjct: 485 GIGRDKMGKIFYRALVYYLTPTSNFSQLRAACVQAAADLYGSTSQEVNSVKQAFNAVGV 543


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_2920  HTHFIS  87     5e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 87.2 bits (216), Expect = 5e-22
Identities = 30/120 (25%), Positives = 56/120 (46%), Gaps = 1/120 (0%)

Query: 7 KVLIVEDEREIADLVELYLKNENYTVFKYYTAKEALECIDKNAIDLAILDIMLPDVSGLT 66
+L+ +D+ I ++ L Y V A I DL + D+++PD +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 67 ICQKIREKHTY-PIIMLTAKDTEVDKITGLTIGADDYITKPFRPLELIARVKAQLRRYKK 125
+ +I++ P+++++A++T + I GA DY+ KPF ELI + L K+
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2921  BLACTAMASEA  38     1e-04    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 37.8 bits (88), Expect = 1e-04
Identities = 16/59 (27%), Positives = 24/59 (40%), Gaps = 18/59 (30%)

Query: 101 FRLASISKVFTASAVMQLVEQGKIDLNK-------DIVNYMGGLKYQNNMSEPVTMEHL 152
F + S KV AV+ V+ G L + D+V+Y PV+ +HL
Sbjct: 62 FPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDY-----------SPVSEKHL 109


29  BALH_3068  BALH_3086  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_3068  2       23       -0.926830  hypothetical protein
BALH_3069  2       19       -0.814567  PhzF family phenazine biosynthesis protein
BALH_3070  4       18       -1.265451  acetyltransferase
BALH_3071  4       16       -1.581673  short chain dehydrogenase
BALH_3072  2       13       -1.779255  acetyltransferase
BALH_3073  1       14       -2.191580  acetyltransferase
BALH_3074  0       10       -1.991651  long-chain-fatty-acid--CoA ligase
BALH_3075  1       13       -1.243795  NAD(P)H oxidoreductase
BALH_3076  1       15       -1.167979  hypothetical protein
BALH_3077  1       15       -1.229801  ankyrin repeat-containing protein
BALH_3078  2       17       -1.282772  ArsR family transcriptional regulator
BALH_3079  3       17       -1.188837  glycosyl transferase and polysaccharide
BALH_3080  6       22       -0.762058  membrane associated protein
BALH_3081  5       21       -2.851861  putative RNA polymerase sigma factor SigI
BALH_3082  6       21       -3.381165  CAAX amino terminal protease family protein
BALH_3083  8       21       -3.808066  TetR family transcriptional regulator
BALH_3084  9       22       -3.523022  hypothetical protein
BALH_3085  4       21       -1.740946  DeoR family transcriptional regulator
BALH_3086  3       21       0.843876   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_3071  DHBDHDRGNASE  61     2e-13    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 61.2 bits (148), Expect = 2e-13
Identities = 48/197 (24%), Positives = 84/197 (42%), Gaps = 19/197 (9%)

Query: 3 VFITGGNRGLGLQLVKVFHENGHIV----YPLVRTEVAVTQLK-QMFSCRCFPILADLSA 57
FITG +G+G + + G + Y + E V+ LK + FP AD+
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP--ADVRD 68

Query: 58 DESTEQIKKQLEEYTEYIDLVINNAGITGKETEVLHTNS-EELTDLFNIHCLGVIRAVKG 116
+ ++I ++E ID+++N AG+ ++H+ S EE F+++ GV A +
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVL--RPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 117 TYVALAKSDHPRIINVSSRLGSLHKMANKEFPQGQFSYSYRIAKAAQNMLTLCLQQEFED 176
+ I+ V S + + + +Y +KAA M T CL E +
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMA---------AYASSKAAAVMFTKCLGLELAE 177

Query: 177 KGISVTAIHPGKLKTDI 193
I + PG +TD+
Sbjct: 178 YNIRCNIVSPGSTETDM 194


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_3080  PF03544  31     0.006    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 31.5 bits (71), Expect = 0.006
Identities = 15/89 (16%), Positives = 33/89 (37%), Gaps = 3/89 (3%)

Query: 220 VTPPATPSNP-VENEEERQSQPDSSPDVAPDLSSVKDKKYEKPEHKEQKKIEEQPTKQIK 278
V P V+ E +P+ P+ P+ K+ + K + K + +P K+++
Sbjct: 55 VAPADLEPPQAVQPPPEPVVEPEPEPEPIPE--PPKEAPVVIEKPKPKPKPKPKPVKKVE 112

Query: 279 ENNGRGSQQENRGNQQENNGRGFQQGNNG 307
+ E+R N + ++
Sbjct: 113 QPKRDVKPVESRPASPFENTAPARPTSST 141


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_3083  HTHTETR  75     4e-19    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 75.4 bits (185), Expect = 4e-19
Identities = 29/172 (16%), Positives = 68/172 (39%), Gaps = 3/172 (1%)

Query: 8 KEKIIETSLYLFNTNGITRTSIQDIMTATELPKGSIYRRFKNKEEIVLAAYDKSGEIMWG 67
++ I++ +L LF+ G++ TS+ +I A + +G+IY FK+K ++ ++ S +
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGE 72

Query: 68 HFHKAMENK-KTAIDKILAIFLVYQDAANNPPI-AGGCPLLNSAIESTGVFPELQKAAAK 125
+ + + I + ++ ++ E G +Q+A
Sbjct: 73 LELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRN 132

Query: 126 GYDDTVMLMASLIKEGIEKHELKEEIDIRSLASFLASSMEGAIMASRVSNDN 177
++ + +K IE L ++ R A + + G M + +
Sbjct: 133 LCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGL-MENWLFAPQ 183


30  BALH_3096  BALH_3104  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_3096  2       17       -2.149647  hypothetical protein
BALH_3097  3       19       -2.128010  hypothetical protein
BALH_3098  2       18       -2.699957  hypothetical protein
BALH_3099  2       14       -2.650292  penicillin-binding protein
BALH_3100  3       19       -3.870400  hypothetical protein
BALH_3101  3       17       -2.808623  pyridine nucleotide-disulfide oxidoreductase,
BALH_3102  1       14       -3.726748  hypothetical protein
BALH_3103  -1      13       -3.613126  cyclic nucleotide-binding domain-containing
BALH_3104  0       14       -3.474817  membrane-spanning protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_3099  BLACTAMASEA  30     0.013    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 29.8 bits (67), Expect = 0.013
Identities = 13/58 (22%), Positives = 22/58 (37%), Gaps = 6/58 (10%)

Query: 39 LESGTNRTVTI---DSIFNSCSISKFITTILVLTLSDHKIVHLDEDIN---DRLTSWN 90
++ + RT+T D F S K + VL D L+ I+ L ++
Sbjct: 45 MDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYS 102


31  BALH_3149  BALH_3167  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_3149  0       16       -3.339533  LysR family transcriptional regulator
BALH_3150  2       18       -3.094487  hypothetical protein
BALH_3151  2       17       -1.529456  hypothetical protein
BALH_3152  1       17       -1.807650  glycerophosphoryl diester phosphodiesterase
BALH_3153  2       14       -1.797007  hypothetical protein
BALH_3154  -1      13       -0.047222  hypothetical protein
BALH_3155  -1      13       -0.779611  hypothetical protein
BALH_3156  -1      12       -0.824409  PheA/TfdB family FAD-binding monooxygenase
BALH_3157  -1      15       -1.517685  ArsR family transcriptional regulator
BALH_3158  -1      15       -1.593975  zinc-containing alcohol dehydrogenase
BALH_3159  0       15       -0.898497  hypothetical protein
BALH_3160  -1      16       -1.196399  response regulator aspartate phosphatase
BALH_3161  2       17       -0.225864  transcriptional repressor of sporulation and
BALH_3162  0       16       -1.937140  hypothetical protein
BALH_3163  1       15       -2.235980  hypothetical protein
BALH_3164  1       15       -2.068136  bile acid transporter family protein
BALH_3165  2       16       -3.545392  glycolate oxidase subunit D
BALH_3166  3       18       -4.334215  group 1 glycosyl transferase
BALH_3167  0       15       -3.015008  methyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_3167  SYCECHAPRONE  29     0.017    Gram-negative bacterial type III secretion SycE cha...
		>SYCECHAPRONE#Gram-negative bacterial type III secretion SycE chaperone signature.

Length = 130

Score = 28.9 bits (64), Expect = 0.017
Identities = 25/91 (27%), Positives = 45/91 (49%), Gaps = 20/91 (21%)

Query: 119 LMHVRPSVDSIVVNTQNEKEEIVKHLQGLEELYFNLNDSFSKELLIKLLTFRLLGNHKV- 177
LM PS+D+ +EKE ++ H + FS+++L +L++ +G H V
Sbjct: 47 LMFTLPSLDN-----NDEKETLLSH------------NIFSQDILKPILSWDEVGGHPVL 89

Query: 178 --KMPLNTIDYWKQRKSIPNLIHSSETLQTN 206
+ PLN++D + L+ +E LQT+
Sbjct: 90 WNRQPLNSLDNNSLYTQLEMLVQGAERLQTS 120


32  BALH_3181  BALH_3186  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_3181  4       14       2.779283   hypothetical protein
BALH_3182  3       13       2.853619   flavodoxin
BALH_3183  4       12       2.589475   GPR1/FUN34/yaaH family protein
BALH_3184  3       12       2.196007   hypothetical protein
BALH_3185  2       12       2.058663   chloramphenicol acetyltransferase
BALH_3186  2       13       1.961046   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_3186  CHLAMIDIAOM6  46     2e-06    Chlamydia cysteine-rich outer membrane protein 6 si...
		>CHLAMIDIAOM6#Chlamydia cysteine-rich outer membrane protein 6 signature.

Length = 547

Score = 45.8 bits (108), Expect = 2e-06
Identities = 41/165 (24%), Positives = 67/165 (40%), Gaps = 38/165 (23%)

Query: 626 YTVTIENTGNVLATNVVFQDPIPIGTTFITNSVTVDGVSQPGANPATGFTVANISPGGSR 685
Y + I N G A NVV ++P+P DG + FT+ ++ PG R
Sbjct: 229 YKINIVNQGTATARNVVVENPVP------------DGYAHSSGQRVLTFTLGDMQPGEHR 276

Query: 686 TVTFQV------RVTSTPSGGTIANRGNVTANFVVIPNQPPITINRQTNTVVTQVSTGGL 739
T+T + R T+ + N TA+ + N+P + QVS G
Sbjct: 277 TITVEFCPLKRGRATNIATVSYCGGHKN-TASVTTVINEPCV-----------QVSIAGA 324

Query: 740 NVIKEVNTTQAAVGDTLTYTIAVQNTGNVPLTNVFFQDAISSAVS 784
+ + V + Y I+V N G++ L +V +D +S V+
Sbjct: 325 D--------WSYVCKPVEYVISVSNPGDLVLRDVVVEDTLSPGVT 361



Score = 40.8 bits (95), Expect = 7e-05
Identities = 74/365 (20%), Positives = 136/365 (37%), Gaps = 55/365 (15%)

Query: 484 QTATLNDVLTYTVNVTNNGNVTANNVIFVDSIPAGTTFVANSVTVNGVARPGANPASSIN 543
+ A L + Y +N+ N G TA NV+ + +P +G A +
Sbjct: 219 ENACLRCPVVYKINIVNQGTATARNVVVENPVP------------DGYAHSSGQRVLTFT 266

Query: 544 LGSINASQTTVVRFQVRVTSNPLVNPIPNRASATFTFTPVPGQQPVSGQATSNTVVTTIN 603
LG + + + + N A+ ++ G + +V T IN
Sbjct: 267 LGDMQPGEHRTITVEFCPLKR---GRATNIATVSY----------CGGHKNTASVTTVIN 313

Query: 604 IADIRTRKIVDRAFATVNDVLTYTVTIENTGNVLATNVVFQDPIPIGTTFITNSVTVDGV 663
++ I ++ V + Y +++ N G+++ +VV +D + G T +
Sbjct: 314 EPCVQV-SIAGADWSYVCKPVEYVISVSNPGDLVLRDVVVEDTLSPGVTVL--------- 363

Query: 664 SQPGANPATG---FTVANISPGGSRTVTFQVRVTSTPSGGTIANRGNVTANFVVIPNQPP 720
GA + +TV ++PG ++ ++V + + G N V + +
Sbjct: 364 EAAGAQISCNKVVWTVKELNPG--ESLQYKV-LVRAQTPGQFTNNVVVKSC----SDCGT 416

Query: 721 ITINRQTNTVVTQVSTGGLNVIKEVNTTQAAVGDTLTYTIAVQNTGNVPLTNVFFQDAIS 780
T + T V+ + V+ + VG+ Y I V N G+ TNV S
Sbjct: 417 CTSCAEATTYWKGVAATHMCVVDTCDPV--CVGENTVYRICVTNRGSAEDTNVSLMLKFS 474

Query: 781 SAVSFVANSVTINGVPQSGLNPNTGF--SLPNIPAAQTV--VVTFDVLIIQDPENEDILN 836
+ V+ S G + + NT SLP + + +TV VT + D E IL+
Sbjct: 475 KELQPVSFS----GPTKGTITGNTVVFDSLPRLGSKETVEFSVTLKAVSAGDARGEAILS 530

Query: 837 QANVT 841
+T
Sbjct: 531 SDTLT 535



Score = 35.8 bits (82), Expect = 0.002
Identities = 33/159 (20%), Positives = 62/159 (38%), Gaps = 26/159 (16%)

Query: 362 YTITVPNTGTGSAENVVLRDSIPNGTTFVAGSVTVGGVTQPNANPTTGINLGTIPNNTQR 421
Y I + N GT +A NVV+ + +P+G +G + LG + R
Sbjct: 229 YKINIVNQGTATARNVVVENPVPDGYAHSSGQRVL------------TFTLGDMQPGEHR 276

Query: 422 IVTFQVRITSFPNPNPIPNRAMVSYQFRPFVGSPLITSMSSSNTVQTTVNQATISMQKSV 481
+T + N A VSY ++ +V T +N+ + + +
Sbjct: 277 TITVEF---CPLKRGRATNIATVSY----------CGGHKNTASVTTVINEPCVQVSIAG 323

Query: 482 DLQTATLNDVLTYTVNVTNNGNVTANNVIFVDSIPAGTT 520
+ V Y ++V+N G++ +V+ D++ G T
Sbjct: 324 ADWSYVCKPV-EYVISVSNPGDLVLRDVVVEDTLSPGVT 361


33  BALH_3225  BALH_3233  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_3225  -2      16       -3.656686  C4-dicarboxylate transporter DctA
BALH_3227  0       19       -3.895312  transcriptional regulator LytR
BALH_3228  0       17       -4.467914  hypothetical protein
BALH_3229  -3      14       -3.525921  ECF subfamily RNA polymerase sigma factor
BALH_3230  -1      13       -4.067762  hypothetical protein
BALH_3232  0       11       -1.932482  ABC transporter permease
BALH_3233  2       12       -0.770463  ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
BALH_3230  RTXTOXINA  30     0.006    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.5 bits (66), Expect = 0.006
Identities = 25/112 (22%), Positives = 42/112 (37%), Gaps = 15/112 (13%)

Query: 11 IRKEFKKKGNSFITYKKGGNELLQIVQSNITTEDEKDDIIMTNKFEFILGLIGGILGILF 70
I K++K +G+S + +EL VQ DEK+ +T + + G ++G+
Sbjct: 53 IPKDYKGQGSSLNDLVRTADELGIEVQ-----YDEKNGTAITKQ---VFGTAEKLIGLTE 104

Query: 71 SSFAIF-------IGFISDLGPIMDSTTGEIPTNTTMILGSISLIASILGII 115
IF + G I+ I N G +S + LG
Sbjct: 105 RGVTIFAPQLDKLLQKYQKAGNILGGGAENIGDNLGKAGGILSTFQNFLGTA 156


34  BALH_3282  BALH_3298  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_3282  -1      15       3.229439   Oye family NADH-dependent flavin oxidoreductase
BALH_3283  -1      16       3.776845   CarD family transcriptional regulator
BALH_3284  0       17       3.670853   formimidoylglutamase
BALH_3285  0       16       3.472214   imidazolonepropionase
BALH_3286  -1      12       2.294969   urocanate hydratase
BALH_3287  -2      10       0.738782   histidine ammonia-lyase
BALH_3288  -3      11       -1.106901  anti-terminator HutP
BALH_3289  -2      13       -0.906459  ubiquinone/menaquinone methyltransferase
BALH_3290  3       16       0.014934   4-methyl-5(beta-hydroxyethyl)-thiazole
BALH_3291  9       17       2.521044   hypothetical protein
BALH_3292  9       18       2.719453   hypothetical protein
BALH_3293  9       16       2.557458   hypothetical protein
BALH_3295  9       17       2.551324   hypothetical protein
BALH_3297  7       17       2.519996   hypothetical protein
BALH_3298  6       18       2.115118   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_3285  UREASE  37     2e-04    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 36.6 bits (85), Expect = 2e-04
Identities = 19/56 (33%), Positives = 27/56 (48%), Gaps = 8/56 (14%)

Query: 356 TVNSSYAINRGEVAGKIRVGRKADLVLWDAYNYAYVPYHYGVSHVNTVWKNGNIAY 411
T+N + A G + VG++ADLVLW+ P +GV + V G IA
Sbjct: 410 TINPAIAHGLSHEIGSLEVGKRADLVLWN-------PAFFGVK-PDMVLLGGTIAA 457


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_3291  PF05272  29     0.012    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.012
Identities = 21/121 (17%), Positives = 42/121 (34%), Gaps = 28/121 (23%)

Query: 28 YVISLQGPMASGKTTLAKKLELHGLSVIYENPYPIVEKR---KQL---------NLD-MN 74
Y + L+G GK+TL L GL + + I + +Q+ +
Sbjct: 597 YSVVLEGTGGIGKSTLINT--LVGLDFFSDTHFDIGTGKDSYEQIAGIVAYELSEMTAFR 654

Query: 75 SKEGFIANQKMFIEAKIKEFQNAKGSVVIFDRGPEDIEFYTIFYPTTIGKEWDIETELKD 134
+ K F ++ ++ A + R +D + + TT +++ L D
Sbjct: 655 RAD--AEAVKAFFSSRKDRYRGA------YGRYVQDHPRQVVIWCTTNKRQY-----LFD 701

Query: 135 E 135

Sbjct: 702 I 702


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_3298  CHLAMIDIAOM6  39     4e-04    Chlamydia cysteine-rich outer membrane protein 6 si...
		>CHLAMIDIAOM6#Chlamydia cysteine-rich outer membrane protein 6 signature.

Length = 547

Score = 38.5 bits (89), Expect = 4e-04
Identities = 39/160 (24%), Positives = 64/160 (40%), Gaps = 25/160 (15%)

Query: 271 ITYTITVPNNGNISATNVSITDPIPTGTNFIPNSVTVNGATQSGVTPTNIPLGTIPAGQT 330
+ Y I + N G +A NV + +P+P G A SG LG + G+
Sbjct: 227 VVYKINIVNQGTATARNVVVENPVPDGY-----------AHSSGQRVLTFTLGDMQPGEH 275

Query: 331 TTVTFQVQVTSLPANGTITNEANITFTSQPNPSESPTTTTTTPPPTTTSVRTAIVNPTKS 390
T+T + G TN A +++ + S TT P V+ +I S
Sbjct: 276 RTITVEF---CPLKRGRATNIATVSYCGGHKNTASVTTVINEP-----CVQVSIAGADWS 327

Query: 391 ASPQVVDIGDTITYTITLPNTGNISATNVIVTDPIPAGTT 430
+ V+ Y I++ N G++ +V+V D + G T
Sbjct: 328 YVCKPVE------YVISVSNPGDLVLRDVVVEDTLSPGVT 361



Score = 38.1 bits (88), Expect = 4e-04
Identities = 72/305 (23%), Positives = 112/305 (36%), Gaps = 57/305 (18%)

Query: 389 KSASPQVVDIGDTITYTITLPNTGNISATNVIVTDPIPAGTTFVPNSVTINSIAQPGINP 448
K P+ + + Y I + N G +A NV+V +P+P G +
Sbjct: 214 KQEGPENACLRCPVVYKINIVNQGTATARNVVVENPVPDGYAHSSGQRVLT--------- 264

Query: 449 SGGIQVGTIAAGSTSTVTFQVQVNSLPTS-GVIRNVGNVTFTYQPDPTKPAITTTNPTPP 507
+G + G T+T V P G N+ V++ T T N
Sbjct: 265 ---FTLGDMQPGEHRTIT----VEFCPLKRGRATNIATVSYCGGHKNTASVTTVIN---E 314

Query: 508 TTVPVNTAITNPIKTADKTAVDIGDTITYTVTFNNDGTIPSTNVIFTDTIPAGTTFIPNS 567
V V+ I AD + V + Y ++ +N G + +V+ DT+ G T
Sbjct: 315 PCVQVS------IAGADWSYV--CKPVEYVISVSNPGDLVLRDVVVEDTLSPGVT----- 361

Query: 568 VVLNNASVPNSNPATGISVGTINPGETKTLSFQVLV-TQVPGGGVITNEASTTYTYQPDP 626
VL A S +V +NPGE +L ++VLV Q PG TN
Sbjct: 362 -VLEAAGAQISCNKVVWTVKELNPGE--SLQYKVLVRAQTPGQ--FTNNVVVK------- 409

Query: 627 NLPPVTTTEPTTPTTVAVNTATVNPTKSANQAFVD------IGDIITYTISLQNNGTVPA 680
+ ++ T T+ A T + + VD +G+ Y I + N G+
Sbjct: 410 -----SCSDCGTCTSCAEATTYWKGVAATHMCVVDTCDPVCVGENTVYRICVTNRGSAED 464

Query: 681 TNVIL 685
TNV L
Sbjct: 465 TNVSL 469



Score = 37.7 bits (87), Expect = 6e-04
Identities = 35/165 (21%), Positives = 64/165 (38%), Gaps = 30/165 (18%)

Query: 1194 VTYTVTFTNQGTIPATGVTITDSLPPSTTFVTNSVTVNTIPQPGVSPISGISVGTVNPGE 1253
V Y + NQGT A V + + +P + ++G + PGE
Sbjct: 227 VVYKINIVNQGTATARNVVVENPVPDGYA------------HSSGQRVLTFTLGDMQPGE 274

Query: 1254 --IVTVTFQVQINAIPPNGKIENTASVTYISQPTPGEPPITTTETTPTVTLPVRTANPDT 1311
+TV F G+ N A+V+Y +TT P V + + A
Sbjct: 275 HRTITVEFCPL-----KRGRATNIATVSYCGG-HKNTASVTTVINEPCVQVSIAGA---- 324

Query: 1312 QKTVDREFASIGDTLTYTITLQNNGNIPATDVIITDSIPTGTTFI 1356
+++ + + Y I++ N G++ DV++ D++ G T +
Sbjct: 325 ------DWSYVCKPVEYVISVSNPGDLVLRDVVVEDTLSPGVTVL 363



Score = 35.8 bits (82), Expect = 0.002
Identities = 42/176 (23%), Positives = 65/176 (36%), Gaps = 26/176 (14%)

Query: 2105 KTATPETVTLGDIITYTISLQNTGTIPANNILVSDPIPTGTSFIQNSVTINNVSQPTANP 2164
K PE L + Y I++ N GT A N++V +P+P G + +
Sbjct: 214 KQEGPENACLRCPVVYKINIVNQGTATARNVVVENPVPDG------------YAHSSGQR 261

Query: 2165 ETGIQIPTLSPSESATISFRVLVTSIPPSGEIQNQGNVSFQYQPDATKPPVSVTTPTPTT 2224
+ + P E TI+ G N VS+ K SVTT
Sbjct: 262 VLTFTLGDMQPGEHRTITVEFCPLK---RGRATNIATVSY---CGGHKNTASVTTVINEP 315

Query: 2225 ITPVNVGTINPIKTADKSIVSVGDTITFTITFQNEGTIPVTDISVTDSLPAGTSFI 2280
V+ I AD S V + + I+ N G + + D+ V D+L G + +
Sbjct: 316 CVQVS------IAGADWSYVC--KPVEYVISVSNPGDLVLRDVVVEDTLSPGVTVL 363



Score = 32.4 bits (73), Expect = 0.025
Identities = 53/227 (23%), Positives = 83/227 (36%), Gaps = 38/227 (16%)

Query: 1722 ITYTISLQNTGTVPATNVLVTDPIPAGTTFIPNSVTINDVTQPGIVPSSGILIGTLEPNT 1781
+ Y I++ N GT A NV+V +P+P G + +G ++P
Sbjct: 227 VVYKINIVNQGTATARNVVVENPVPDGYAHSSGQRVLT------------FTLGDMQPGE 274

Query: 1782 SAVVTFQVQVTSIPPTGFIENEGSVSFQYQPDPNSPPVSVTTPTPTTKTQVSEVTTNPNK 1841
+T + G N +VS+ + SVTT QVS + +
Sbjct: 275 HRTITVEFCPLK---RGRATNIATVSY---CGGHKNTASVTTVINEPCVQVSIAGADWSY 328

Query: 1842 QATPQVINLGDTVTYTITFQNVGNINASDVIIADPTPAGTTFIPNS---VTINGVA---- 1894
P V Y I+ N G++ DV++ D G T + + ++ N V
Sbjct: 329 VCKP--------VEYVISVSNPGDLVLRDVVVEDTLSPGVTVLEAAGAQISCNKVVWTVK 380

Query: 1895 --SPGANPNSGVNVGIVTPGQIVTLTYQVTVTALPPDGIIKNTATVT 1939
+PG + V V TPGQ T V V + G + A T
Sbjct: 381 ELNPGESLQYKVLVRAQTPGQ---FTNNVVVKSCSDCGTCTSCAEAT 424



Score = 32.0 bits (72), Expect = 0.033
Identities = 40/174 (22%), Positives = 67/174 (38%), Gaps = 26/174 (14%)

Query: 1841 KQATPQVINLGDTVTYTITFQNVGNINASDVIIADPTPAGTTFIPNSVTINGVASPGANP 1900
KQ P+ L V Y I N G A +V++ +P P +G A
Sbjct: 214 KQEGPENACLRCPVVYKINIVNQGTATARNVVVENPVP------------DGYAHSSGQR 261

Query: 1901 NSGVNVGIVTPGQIVTLTYQVTVTALPPDGIIKNTATVTYTFQPNPSEPPITITDPTPTV 1960
+G + PG+ T+T + G N ATV+Y + T+ + P V
Sbjct: 262 VLTFTLGDMQPGEHRTITVEFCPLK---RGRATNIATVSYCGGHKNTASVTTVIN-EPCV 317

Query: 1961 EVSVITPTPNPNKLADKQIVDINEIITYTVTFQNRGSVPATSVIITDPLANGLT 2014
+VS+ A + + + Y ++ N G + V++ D L+ G+T
Sbjct: 318 QVSI----------AGADWSYVCKPVEYVISVSNPGDLVLRDVVVEDTLSPGVT 361


35  BALH_3333  BALH_3339  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_33332266.537852aluminum resistance protein
BALH_3334134812.363834GTP-binding protein
BALH_3335165413.570326hypothetical protein
BALH_3336155212.938220stage V sporulation protein K
BALH_3337185613.860848phage integrase
BALH_3338195414.599995triple helix repeat-containing collagen
BALH_3339124010.467472triple helix repeat-containing collagen
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_3336  HTHFIS  33  0.002  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.9 bits (75), Expect = 0.002
Identities = 26/83 (31%), Positives = 35/83 (42%), Gaps = 16/83 (19%)

Query: 90 LHMLFKGNPGTGKTTVARMIGKLLFEMNILSKGHLVEVERA----DLVG-EYIGH----- 139
L ++ G GTGK VAR + + G V + A DL+ E GH
Sbjct: 161 LTLMITGESGTGKELVARAL----HDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAF 216

Query: 140 -TAQKTRD-LIKKAMGGILFIDE 160
AQ ++A GG LF+DE
Sbjct: 217 TGAQTRSTGRFEQAEGGTLFLDE 239


36  BALH_3373  BALH_3384  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_3373211-3.826564hypothetical protein
BALH_3374312-3.609112hypothetical protein
BALH_3375412-3.816231phosphoglycerate mutase
BALH_3376314-3.609239alpha/beta fold family hydrolase
BALH_3377214-3.582781glyoxylase
BALH_3378012-3.542159PAS/PAC sensor-containing diguanylate
BALH_3379420-2.536628undecaprenyldiphospho-muramoylpentapeptide
BALH_3380216-2.828640hypothetical protein
BALH_3381215-2.862271UDP-2-acetamido-2,6-dideoxy-hexulose
BALH_3382113-1.739843UDP-N-acetylglucosamine 4,6-dehydratase
BALH_3383213-2.178847hypothetical protein
BALH_3384214-1.822122HAD superfamily hydrolase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_3381  NUCEPIMERASE  71  4e-16  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 71.0 bits (174), Expect = 4e-16
Identities = 64/349 (18%), Positives = 121/349 (34%), Gaps = 83/349 (23%)

Query: 8 LITGANGFTGRHACQYFLEQGFHVI----------PMFQNRAHRESIGNG----ITCNLT 53
L+TGA GF G H + LE G V+ + +A E + +L
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLK-QARLELLAQPGFQFHKIDLA 62

Query: 54 NKSEVIRVMKQIKPDYVLHLAGRNSVIESWTAALEYVEVNVIGTLYLLEAITQEAPHCRT 113
++ + + + V R +V S Y + N+ G L +LE +
Sbjct: 63 DREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKI--QH 120

Query: 114 LVIGSALQADCMSNIKI-----------LNPYSLSKTMQVIIAEAWSELMNSNIIIAKPT 162
L+ S+ + N K+ ++ Y+ +K ++A +S L +
Sbjct: 121 LLYASS-SSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRFF 179

Query: 163 NLIGP-GISNGICSILAKKMIDIESGRSKAVIEV-NSLKDSRDFLDVRDAVKA----YHV 216
+ GP G + K M++ +S I+V N K RDF + D +A V
Sbjct: 180 TVYGPWGRPDMALFKFTKAMLEGKS------IDVYNYGKMKRDFTYIDDIAEAIIRLQDV 233

Query: 217 LLRDGIS--------------GKQYNIGSGVKRSLLDVLEQYKELTRLNFFIDETEKS-G 261
+ + YNIG+ L+D +I E + G
Sbjct: 234 IPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMD-------------YIQALEDALG 280

Query: 262 SESNESLV-------------IEDIRR-LGWIPEIQFHQSLKDVLEYVK 296
E+ ++++ + + +G+ PE +K+ + + +
Sbjct: 281 IEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYR 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_3382  NUCEPIMERASE  79  8e-19  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 79.4 bits (196), Expect = 8e-19
Identities = 44/223 (19%), Positives = 82/223 (36%), Gaps = 25/223 (11%)

Query: 12 ILITGGTGSWGHELIKQLLEKSPKEIRVFSRNE--TIQFEM-QQQFINDDRLKFIIGDIR 68
L+TG G G + K+LLE + + + + N+ + + + + + +F D+
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLA 62

Query: 69 DKDQLA--YACQGVHYVFHLAALKHVPVCEYYPYEAIKTNINGTQNVIEASIQMQVEKVI 126
D++ + +A VF V P+ +N+ G N++E +++ ++
Sbjct: 63 DREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHLL 122

Query: 127 YVST---------------DKAADPSNTYGMTKAIGEKLMVHANRQTKKTKFICVRGGNV 171
Y S+ D P + Y TK E LM H +R V
Sbjct: 123 YASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANE-LMAHTYSHLYGLPATGLRFFTV 181

Query: 172 LGTSGS---VVPLFKKQIKKSSQVGI-TDANMTRFFLTIEDAV 210
G G + F K + + + + M R F I+D
Sbjct: 182 YGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIA 224


37  BALH_3434  BALH_3441  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_34342151.060613polysaccharide deacetylase
BALH_34353191.329771polynucleotide phosphorylase/polyadenylase
BALH_34363171.02203530S ribosomal protein S15
BALH_34372151.410237bifunctional riboflavin kinase/FMN
BALH_34384182.065505tRNA pseudouridine synthase B
BALH_34397232.563840ribosome-binding factor A
BALH_34406222.933516hypothetical protein
BALH_34414203.102137translation initiation factor IF-2
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_3441  TCRTETOQM  86  1e-19  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 86.4 bits (214), Expect = 1e-19
Identities = 66/302 (21%), Positives = 110/302 (36%), Gaps = 82/302 (27%)

Query: 215 IMGHVDHGKTTLLDSI-----RNSKVTAGEAG-------------GITQHIGAYQVEVND 256
++ HVD GKTTL +S+ +++ + + G GIT G + +
Sbjct: 8 VLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWEN 67

Query: 257 KKITFLDTPGHAAFTTMRARGAQVTDITILVVAADDGVMPQTVEAINHAKAAGVPIIVAV 316
K+ +DTPGH F R V D IL+++A DGV QT + + G+P I +
Sbjct: 68 TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFI 127

Query: 317 NKMDKPAANPDRVMQE-----------LTEYELVP----------EAWG----------- 344
NK+D+ + V Q+ + EL P E W
Sbjct: 128 NKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGNDDLLE 187

Query: 345 ----GDTI-----------------FVPI---SAIQGEGIDNLLEMI--LLVSEVEEYKA 378
G ++ P+ SA GIDNL+E+I S ++
Sbjct: 188 KYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTHRGQS 247

Query: 379 NPNRYATGTVIEAQLDKGKGTIATLLVQNGTLRVGDPIVVGT--SFGRVRAMVSDIGRRV 436
G V + + + + +A + + +G L + D + + S G
Sbjct: 248 EL----CGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKIKITEMYTSINGELC 303

Query: 437 KV 438
K+
Sbjct: 304 KI 305


38  BALH_3453  BALH_3459  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_34532271.717991phosphatidate cytidylyltransferase
BALH_34545332.479767undecaprenyl pyrophosphate synthase
BALH_34555292.144888ribosome recycling factor
BALH_34564262.710899uridylate kinase
BALH_34574212.252642elongation factor Ts
BALH_34582152.35740430S ribosomal protein S2
BALH_34592132.526328transcriptional repressor CodY
39  BALH_3470  BALH_3485  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_34703161.403663signal peptidase SipM
BALH_34711162.44547050S ribosomal protein L19
BALH_34721141.750307tRNA (guanine-N(1)-)-methyltransferase
BALH_34731151.18104216S rRNA-processing protein RimM
BALH_34744141.362435KH domain-containing protein
BALH_34754141.39401230S ribosomal protein S16
BALH_34764151.397258signal recognition particle subunit FFH/SRP54
BALH_34773160.888908putative DNA-binding protein
BALH_34782151.505478signal recognition particle-docking protein
BALH_34791151.888297condensin subunit Smc
BALH_3480-1112.381073ribonuclease III
BALH_3481-1112.872709acyl carrier protein
BALH_34820112.7849183-ketoacyl-ACP reductase
BALH_34830112.797272ACP S-malonyltransferase
BALH_34842132.694138putative glycerol-3-phosphate acyltransferase
BALH_34852142.291090fatty acid biosynthesis transcriptional
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_3476  FLGHOOKAP1  30  0.017  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 30.3 bits (68), Expect = 0.017
Identities = 10/39 (25%), Positives = 18/39 (46%)

Query: 399 KKRIAKGSGTTVQEINRLIKQFDDMKKMMKTMTGMQKGK 437
K++ G +V +IN KQ + + +TG+ G
Sbjct: 154 DKQVNIAIGASVDQINNYAKQIASLNDQISRLTGVGAGA 192


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_3479  GPOSANCHOR  51  2e-08  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 50.8 bits (121), Expect = 2e-08
Identities = 42/274 (15%), Positives = 86/274 (31%)

Query: 667 KQAKSSLLGRQRELEEWTNKLTDMEEKTTKLENFVKAVKQEIQEKEVQIRELRKSVEAER 726
++ ++ ++ LE N T K LE A+ + E + A+
Sbjct: 116 QELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADS 175

Query: 727 VDEQKLREEINRLELEEHRINDRLSIYDLEIEGFLQDQVKIQGRKEELEKILATLQAEIT 786
+ L E LE + + L ++ K L A L+ +
Sbjct: 176 AKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALE 235

Query: 787 GLDSKIVALTKQKSEQHSSKEKVQKEMTELKVLAAEKQQRLSNQKEKVERLTKEKEETDA 846
G + A + + + K ++ EL+ + K++ L EK +A
Sbjct: 236 GAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEA 295

Query: 847 TLVKTKEDLAFLKQEMTSNSSGEEQITNMIEKKAYDRNQTSELIRSRREQRVSLQERVEQ 906
+ L S + ++ + + E + R SL+ ++
Sbjct: 296 EKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDA 355

Query: 907 LERNLKETTGKHKYILEMLKDQEVKINRLDVELE 940
K+ +H+ + E K E L +L+
Sbjct: 356 SREAKKQLEAEHQKLEEQNKISEASRQSLRRDLD 389



Score = 48.1 bits (114), Expect = 2e-07
Identities = 54/354 (15%), Positives = 115/354 (32%), Gaps = 15/354 (4%)

Query: 175 KKAESKLADTQENLNRVQDIIHELSSQVEPLERQASIAKDYLEKKEELEKVEAALIVHEI 234
+ K +D N ++D EL+ ++ + + L +K +I
Sbjct: 67 NTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKA-----------SKI 115

Query: 235 EELHEKWEALRNQFGHNKNEEAKMSTHLQKGEEELEELRGQLQAVDESVDSLQEVLLLSS 294
+EL + L N S ++ E E L + ++++++ S
Sbjct: 116 QELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADS 175

Query: 295 KELEKLEGQRELLKERKQNATTHCAQLEQLIVELTEKATSYDGEIESSTEVLMQFVNHVK 354
+++ LE ++ L+ R+ + K + + E + ++
Sbjct: 176 AKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALE 235

Query: 355 ELEMKLHDNEQLLATFADNLEEQIENLKGDYIELLNQQASHRNELSMIEEQSKQQNSKNE 414
+ + T LE + L+ EL N + + K ++
Sbjct: 236 GAMNFSTADSAKIKT----LEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKA 291

Query: 415 RLDEENAKYVEMRMEITAKKTKLVESYEQVKEKVAGILSNIQKTEAALGKCKAQYSENET 474
L+ E A + A + L + +E + + QK E +A
Sbjct: 292 ALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRR 351

Query: 475 KLYQAYQFVQQARSRKEMLEEMQEDYSGFYQGVREVLKARENRLQGIEGAVAEL 528
L + + +Q + + LEE + Q +R L A + +E A+ E
Sbjct: 352 DLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALEEA 405



Score = 45.1 bits (106), Expect = 2e-06
Identities = 33/284 (11%), Positives = 85/284 (29%), Gaps = 3/284 (1%)

Query: 242 EALRNQFGHNKNEEAKMSTHLQKGEEELEELRGQLQAVDESVDSLQEVLLLSSKELEKLE 301
E ++ + + E + + L+ + E + + +E L + K L +
Sbjct: 53 EKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKA 112

Query: 302 GQRELLKERKQNATTHCAQLEQLIVELTEKATSYDGEIESSTEVLMQFVNHVKELE---M 358
+ + L+ RK + + K + + E + ++
Sbjct: 113 SKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFST 172

Query: 359 KLHDNEQLLATFADNLEEQIENLKGDYIELLNQQASHRNELSMIEEQSKQQNSKNERLDE 418
+ L LE + L+ +N + ++ +E + ++ L++
Sbjct: 173 ADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEK 232

Query: 419 ENAKYVEMRMEITAKKTKLVESYEQVKEKVAGILSNIQKTEAALGKCKAQYSENETKLYQ 478
+ +AK L ++ + A + ++ A+ E +
Sbjct: 233 ALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAA 292

Query: 479 AYQFVQQARSRKEMLEEMQEDYSGFYQGVREVLKARENRLQGIE 522
+ ++L ++ RE K E Q +E
Sbjct: 293 LEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLE 336



Score = 39.3 bits (91), Expect = 1e-04
Identities = 28/161 (17%), Positives = 55/161 (34%), Gaps = 1/161 (0%)

Query: 153 KSEERRGVFEEAAGVLKYKLRKKKAESKLADTQEN-LNRVQDIIHELSSQVEPLERQASI 211
E + E L+ L S + L + + + +E A
Sbjct: 180 TLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMN 239

Query: 212 AKDYLEKKEELEKVEAALIVHEIEELHEKWEALRNQFGHNKNEEAKMSTHLQKGEEELEE 271
K + + E A + EL + E N + + + E E +
Sbjct: 240 FSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKAD 299

Query: 272 LRGQLQAVDESVDSLQEVLLLSSKELEKLEGQRELLKERKQ 312
L Q Q ++ + SL+ L S + ++LE + + L+E+ +
Sbjct: 300 LEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNK 340



Score = 34.3 bits (78), Expect = 0.004
Identities = 38/172 (22%), Positives = 73/172 (42%)

Query: 677 QRELEEWTNKLTDMEEKTTKLENFVKAVKQEIQEKEVQIRELRKSVEAERVDEQKLREEI 736
++ LE N T K LE A++ E + E Q + L + ++ R D RE
Sbjct: 266 EKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAK 325

Query: 737 NRLELEEHRINDRLSIYDLEIEGFLQDQVKIQGRKEELEKILATLQAEITGLDSKIVALT 796
+LE E ++ ++ I + + +D + K++LE L+ + ++ +L
Sbjct: 326 KQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLR 385

Query: 797 KQKSEQHSSKEKVQKEMTELKVLAAEKQQRLSNQKEKVERLTKEKEETDATL 848
+ +K++V+K + E A ++ +E + KEK E A L
Sbjct: 386 RDLDASREAKKQVEKALEEANSKLAALEKLNKELEESKKLTEKEKAELQAKL 437



Score = 33.9 bits (77), Expect = 0.004
Identities = 25/194 (12%), Positives = 73/194 (37%), Gaps = 7/194 (3%)

Query: 665 AVKQAKSSLLGRQRELEEWTNKLTDMEEKTTKLENFVKAVKQEIQEKEVQIRELRKSVEA 724
A++ A + +++ + +E + +LE ++ +I+ L A
Sbjct: 233 ALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAA 292

Query: 725 ERVDEQKLREEINRLELEEHRINDRLSIYDLEIEGFLQDQVKIQGRKEELEKILATLQAE 784
++ L + L + L + + + +++ ++LE+ +A
Sbjct: 293 LEAEKADLEHQSQVLNANRQSLRRDL-------DASREAKKQLEAEHQKLEEQNKISEAS 345

Query: 785 ITGLDSKIVALTKQKSEQHSSKEKVQKEMTELKVLAAEKQQRLSNQKEKVERLTKEKEET 844
L + A + K + + +K++++ + ++ L +E +++ K EE
Sbjct: 346 RQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALEEA 405

Query: 845 DATLVKTKEDLAFL 858
++ L ++ L
Sbjct: 406 NSKLAALEKLNKEL 419



Score = 31.6 bits (71), Expect = 0.021
Identities = 75/457 (16%), Positives = 149/457 (32%), Gaps = 28/457 (6%)

Query: 393 ASHRNELSMIEEQSKQQNSKNERLDEENAKYVEMRMEITAKKTKLVESYEQVKEKVAGIL 452
S + L ++E++ + +N L +N+ + +L E KEK+
Sbjct: 46 RSQTDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKND 105

Query: 453 SNIQKTEAALGKCKAQYSENETKLYQAYQFVQQARSRKEMLEEMQEDYSGFYQGVREVLK 512
++ + + + + +A+ ++ E L A F ++ + LE + + + + L+
Sbjct: 106 KSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALE 165

Query: 513 ARENRLQGIEGAVAELLTVPKEYEIAMEIALGAAMQHIVVQKEEHARNAIAFLKQNKHGR 572
N + L E A Q + + E A N
Sbjct: 166 GAMNFSTADSAKIKTLEAEKAALE---------ARQAELEKALEGAMNFSTADSAKIKTL 216

Query: 573 ATFLPQAVMKGRSLSFEQLRIVNQHPSFVGVAAELVQYNNKYENVVSNLLGTVIVAKDLR 632
+ L +N + L E + L K L
Sbjct: 217 EAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAEL------EKALE 270

Query: 633 GANELAKQLQYRYRIVTIEGDVVNPGGSMTGGAVKQAKSSLLGRQRELEEWTNKLTDMEE 692
GA + + + + E + + + ++ +R+L+ +E
Sbjct: 271 GAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEA 330

Query: 693 KTTKLEN---FVKAVKQEIQEKEVQIRELRKSVEAERVDEQKLREEINRLELEEHRINDR 749
+ KLE +A +Q ++ RE +K +EAE QKL E+ E +
Sbjct: 331 EHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAE---HQKLEEQNKISEASRQSLRRD 387

Query: 750 LSIYDLEIEGFLQDQVKIQGRKEELEKILATLQAEITGLDSKIVALTKQKSEQHSSKEKV 809
L + + + +++ EE LA L+ L+ K+K+E + E
Sbjct: 388 L-------DASREAKKQVEKALEEANSKLAALEKLNKELEESKKLTEKEKAELQAKLEAE 440

Query: 810 QKEMTELKVLAAEKQQRLSNQKEKVERLTKEKEETDA 846
K + E AE+ +L K + K A
Sbjct: 441 AKALKEKLAKQAEELAKLRAGKASDSQTPDAKPGNKA 477



Score = 30.4 bits (68), Expect = 0.044
Identities = 43/327 (13%), Positives = 96/327 (29%), Gaps = 28/327 (8%)

Query: 150 LSSKSEERRGVFEEAAGVLKYKLRKKKAESKLADTQENLNRVQDIIHELSSQVEPLERQA 209
+SK +E + L+ + + + L + + + +E A
Sbjct: 111 KASKIQELEARKADLEKALEGAMNFST---ADSAKIKTLEAEKAALAARKADLEKALEGA 167

Query: 210 SIAKDYLEKKEELEKVEAALIVHEIEELHEKWEALRNQFGHNKNEEAKMSTHLQKGEEEL 269
K + + E A + EL + E N + + +
Sbjct: 168 MNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARK 227

Query: 270 EELRGQLQAVDESVDSLQEVLLLSSKELEKLEGQRELLKERKQNATTHCAQLEQLIVELT 329
+L L+ + + E LE ++ L++ + A I L
Sbjct: 228 ADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLE 287

Query: 330 EKATSYDGEIESSTEVLMQFVNHVKELEMKLHDNEQLLATFADNLEEQIENLKGDYIELL 389
+ + L + L + L ++L+ D
Sbjct: 288 AE-------------------------KAALEAEKADLEHQSQVLNANRQSLRRDLDASR 322

Query: 390 NQQASHRNELSMIEEQSKQQNSKNERLDEENAKYVEMRMEITAKKTKLVESYEQVKEKVA 449
+ E +EEQ+K + + L + E + ++ A+ KL E + +
Sbjct: 323 EAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQ 382

Query: 450 GILSNIQKTEAALGKCKAQYSENETKL 476
+ ++ + A + + E +KL
Sbjct: 383 SLRRDLDASREAKKQVEKALEEANSKL 409


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_3482  DHBDHDRGNASE  146  3e-45  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 146 bits (369), Expect = 3e-45
Identities = 91/252 (36%), Positives = 135/252 (53%), Gaps = 11/252 (4%)

Query: 2 LKGKVALVTGASRGIGRAIAIDLAKQGANVVVNYAGNEQKANEVVDEIKKLGSDAIAVRA 61
++GK+A +TGA++GIG A+A LA QGA++ N +K +VV +K A A A
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAA-VDYNPEKLEKVVSSLKAEARHAEAFPA 64

Query: 62 DVANAEDVTNMVKQTVDVFGQVDILVNNAGVTKDNLLMRMKEEEWDTVINTNLKGVFLCT 121
DV ++ + + + G +DILVN AGV + L+ + +EEW+ + N GVF +
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 122 KAVSRFMMRQRHGRIVNIASVVGVTGNPGQANYVAAKAGVIGLTKTSAKELASRNITVNA 181
++VS++MM +R G IV + S A Y ++KA + TK ELA NI N
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 182 IAPGFIATDMTDVL--DENIKAEMLK--------LIPAAQFGEAQDIANAVTFFASDQSK 231
++PG TDM L DEN +++K IP + + DIA+AV F S Q+
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAG 244

Query: 232 YITGQTLNVDGG 243
+IT L VDGG
Sbjct: 245 HITMHNLCVDGG 256


40  BALH_3512  BALH_3528  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_35120303.142469orotate phosphoribosyltransferase
BALH_35130293.225043orotidine 5'-phosphate decarboxylase
BALH_35141293.024848dihydroorotate dehydrogenase 1B
BALH_35151292.964039dihydroorotate dehydrogenase electron transfer
BALH_35161282.648740carbamoyl phosphate synthase large subunit
BALH_35172232.667873carbamoyl phosphate synthase small subunit
BALH_35182202.320173dihydroorotase
BALH_35192181.440213aspartate carbamoyltransferase catalytic
BALH_35213201.579691uracil permease
BALH_35222190.785162bifunctional pyrimidine regulatory protein
BALH_35231180.665724ribosomal large subunit pseudouridine synthase
BALH_35241190.513247lipoprotein signal peptidase
BALH_35251180.641281DnaK suppressor protein
BALH_35261170.793626isoleucyl-tRNA synthetase
BALH_3527213-0.739699cell-division initiation protein (DivIVA)
BALH_3528213-0.444036S4 domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_3518  UREASE  33  0.003  Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 32.8 bits (75), Expect = 0.003
Identities = 25/83 (30%), Positives = 36/83 (43%), Gaps = 20/83 (24%)

Query: 17 IVATDLLVQDGKIAKV--AEN---------ITADNAEVIDVNGKLIAPGLVDVHVHLREP 65
IV D+ ++DG+IA + A N I EVI GK++ G +D H+H P
Sbjct: 83 IVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIVTAGGMDSHIHFICP 142

Query: 66 GGEHKETIETGTLAAAKGGFTTI 88
+ IE A G T +
Sbjct: 143 -----QQIEE----ALMSGLTCM 156


41  BALH_3555  BALH_3569  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_35552150.034888ATP-dependent protease La
BALH_35560130.122122patatin-like phospholipase
BALH_3557-1140.236509hypothetical protein
BALH_3558-4150.172097phosphopantetheine adenylyltransferase
BALH_3559-2170.514799methyltransferase
BALH_3560-2170.188275hypothetical protein
BALH_3562-2180.491291hypothetical protein
BALH_35631180.289163broad-specificity phosphatase PhoE
BALH_3564319-0.107230hypothetical protein
BALH_3565318-0.150182hypothetical protein
BALH_35664150.646256hypothetical protein
BALH_35672110.876223hypothetical protein
BALH_35684110.841621formamidase
BALH_35693100.655876CtaG membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_3558  LPSBIOSNTHSS  228  5e-80  Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 228 bits (583), Expect = 5e-80
Identities = 88/155 (56%), Positives = 115/155 (74%)

Query: 4 IAISSGSFDPITLGHLDIIKRGAKVFDEVYVVVLNNSSKKPFFSVEERLDLIREATKDIP 63
AI GSFDPIT GHLDII+RG ++FD+VYV VL N +K+P FSV+ERL+ I +A +P
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 64 NVKVDSHSGLLVEYAKMRNANAILRGLRAVSDFEYEMQITSMNRKLDENIETFFIMTNNQ 123
N +VDS GL V YA+ R A AILRGLR +SDFE E+Q+ + N+ L ++ET F+ T+ +
Sbjct: 62 NAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTE 121

Query: 124 YSFLSSSIVKEVARYGGSVVDLVPPVVERALKEKF 158
YSFLSSS+VKEVAR+GG+V VP V AL ++F
Sbjct: 122 YSFLSSSLVKEVARFGGNVEHFVPSHVAAALYDQF 156


42  BALH_3578  BALH_3592  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_35782130.128876hypothetical protein
BALH_35790120.718389PhoH family protein
BALH_35804161.106750hypothetical protein
BALH_35814181.269500hypothetical protein
BALH_35823171.015802hypothetical protein
BALH_35831151.485685GTP-binding protein
BALH_3584-1111.032766myo-inositol-1(or 4)-monophosphatase
BALH_35850130.444343hypothetical protein
BALH_3586-115-0.022385hypothetical protein
BALH_35870211.490287hypothetical protein
BALH_35882272.455254arginine decarboxylase
BALH_35892373.143567transglutaminase
BALH_35904423.680332hypothetical protein
BALH_35913433.721235hypothetical protein
BALH_35922393.597915dihydrolipoamide dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_3578  STREPKINASE  27  0.030  Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 27.4 bits (60), Expect = 0.030
Identities = 15/51 (29%), Positives = 26/51 (50%), Gaps = 1/51 (1%)

Query: 93 FLANAEISDDATQVEIETVDDTITLPLETVKNAILGFSKEGKPLKEDGPVH 143
F ++A I+D +V D ++TLP + V+ +L +P KE P+
Sbjct: 129 FASDATITDRNGKVYFADKDGSVTLPTQPVQEFLLSGHVRVRPYKEK-PIQ 178


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_3583  TCRTETOQM  181  2e-51  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 181 bits (461), Expect = 2e-51
Identities = 101/476 (21%), Positives = 195/476 (40%), Gaps = 96/476 (20%)

Query: 10 LRNIAIIAHVDHGKTTLVDQLLRQAGTFRANEHVEE--RAMDSNDLERERGITILAKNTA 67
+ NI ++AHVD GKTTL + LL +G V++ D+ LER+RGITI T+
Sbjct: 3 IINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITS 62

Query: 68 IHYEDKRINILDTPGHADFGGEVERIMKMVDGVLLVVDAYEGCMPQTRFVLKKALEQNLT 127
+E+ ++NI+DTPGH DF EV R + ++DG +L++ A +G QTR + + +
Sbjct: 63 FQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIP 122

Query: 128 PIVVVNKIDRDFARPDEVVDEVIDLF---------IELG-------------------AN 159
I +NKID++ V ++ + +EL N
Sbjct: 123 TIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGN 182

Query: 160 EDQLE--------------------------FPVVFASAMNGTASLDSNPANQEENMKSL 193
+D LE FPV SA N + +L
Sbjct: 183 DDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNN------------IGIDNL 230

Query: 194 FDTIIEHIPAPIDNSEEPLQFQVALLDYNDYVGRIGVGRVFRGTMKVGQQVALMKVDGSV 253
+ I + + L +V ++Y++ R+ R++ G + + V + + +
Sbjct: 231 IEVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKE--- 287

Query: 254 KQFRVTKLFGYMGLKRQEIEEAKAGDLVAVSGMEDINVGETVCPVEHQDALPLLRIDEPT 313
+ ++T+++ + + +I++A +G++V + E + + + + + P
Sbjct: 288 -KIKITEMYTSINGELCKIDKAYSGEIVILQN-EFLKLNSVLGDTKLLPQRERIENPLPL 345

Query: 314 LQMTFLVNNSPFAGREGKYITSRKIEER------LRSQLETDVSLRVDNTESPDAWIVSG 367
LQ T + K ++R L ++D LR + I+S
Sbjct: 346 LQTT---------------VEPSKPQQREMLLDALLEISDSDPLLRYYVDSATHEIILSF 390

Query: 368 RGELHLSILIENMRRE-GYELQVSKPEVIIKEVDGVRCEPVERVQIDVPEEYTGSI 422
G++ + + ++ + E+++ +P VI E + E +++ P + SI
Sbjct: 391 LGKVQMEVTCALLQEKYHVEIEIKEPTVIYMERPLKKAEYTIHIEVP-PNPFWASI 445



Score = 39.8 bits (93), Expect = 3e-05
Identities = 17/77 (22%), Positives = 28/77 (36%), Gaps = 1/77 (1%)

Query: 405 EPVERVQIDVPEEYTGSIMESMGARKGEMLDMVNNGNGQVRLTFMVPARGLIGYTTEFLT 464
EP +I P+EY ++D N +V L+ +PAR + Y ++
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNN-EVILSGEIPARCIQEYRSDLTF 595

Query: 465 LTRGYGILNHTFDCYQP 481
T G + Y
Sbjct: 596 FTNGRSVCLTELKGYHV 612


43  BALH_3789  BALH_3795  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_37892141.212067bifunctional 5,10-methylene-tetrahydrofolate
BALH_37902140.576622transcription antitermination protein NusB
BALH_37913120.239145hypothetical protein
BALH_37921130.236509acetyl-CoA carboxylase biotin carboxylase
BALH_3793215-0.668527acetyl-CoA carboxylase biotin carboxyl carrier
BALH_3794215-0.918730stage III sporulation protein AH
BALH_3795220-0.008285stage III sporulation protein AG
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_3793  RTXTOXIND  27  0.025  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 27.5 bits (61), Expect = 0.025
Identities = 8/25 (32%), Positives = 12/25 (48%)

Query: 138 GEIVEILVNNGQLVEYGQPLFLVKA 162
+ EI+V G+ V G L + A
Sbjct: 105 SIVKEIIVKEGESVRKGDVLLKLTA 129


44  BALH_3811  BALH_3819  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_38112170.182134hypothetical protein
BALH_38122171.038607hypothetical protein
BALH_3813014-0.567423lipoate-protein ligase A
BALH_3814-216-1.075180rhodanese-related sulfurtransferase
BALH_3815-217-2.281660LacI family transcriptional regulator
BALH_3816-118-3.489166TetR family transcriptional regulator
BALH_3817-117-3.979390quaternary ammonium compound-resistance protein
BALH_3818-120-3.925854ABC transporter permease
BALH_3819-120-3.287621ABC transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_3811  SALSPVBPROT  29  0.014  Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 29.3 bits (65), Expect = 0.014
Identities = 26/86 (30%), Positives = 43/86 (50%), Gaps = 6/86 (6%)

Query: 3 RMKAQDMIKLNNKKRELLTPENEVAYSDMLVYLRLSNVPEQQVEELLL--EILDHLIEAQ 60
R K++ I +K+ + L + YS + YLR + PE Q +E LL + L +
Sbjct: 381 RPKSKWAIVEESKQIQALRYYSAQGYSVINKYLRGDDYPETQAKETLLSRDYLSTNEPSD 440

Query: 61 TENKNAYDIFGDDLQSYCDELISALP 86
E KNA ++ +D+ E +S+LP
Sbjct: 441 EEFKNAMSVYINDIA----EGLSSLP 462


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_3812  PF03544  35  3e-04  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 34.6 bits (79), Expect = 3e-04
Identities = 29/102 (28%), Positives = 34/102 (33%), Gaps = 9/102 (8%)

Query: 108 EAAEQEETVVEATPKKEVVVEVPKAVTPAPKPVTRVETPATASTPKPTPAPTPKPVSVEA 167
A+ E P + VV P+ P P P E P PKP P P PKPV
Sbjct: 56 APADLEPPQAVQPPPEPVVEPEPE---PEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVE 112

Query: 168 AVELSTPAPVKREVPTPVTKQETTPVAPAKPKQSALTETNSK 209
+ R APA+P S T SK
Sbjct: 113 QPKRDVKPVESRPASPF------ENTAPARPTSSTATAATSK 148


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_3813  DHBDHDRGNASE  30  0.008  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 30.4 bits (68), Expect = 0.008
Identities = 26/98 (26%), Positives = 41/98 (41%), Gaps = 8/98 (8%)

Query: 97 VIVSEDHPNMPKTVTEAYRVISQGLLDGFKALGLE-AYYAVPKTEADRENLKNPRSG-VC 154
V V + +P+T AY + K LGLE A Y + R N+ +P S
Sbjct: 140 VTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI------RCNIVSPGSTETD 193

Query: 155 FDAPSWYEIVVEGRKIAGSAQTRQKGVILQHGSIPLEI 192
W + + I GS +T + G+ L+ + P +I
Sbjct: 194 MQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDI 231


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_3816  HTHTETR  60  2e-13  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 60.0 bits (145), Expect = 2e-13
Identities = 39/205 (19%), Positives = 72/205 (35%), Gaps = 25/205 (12%)

Query: 2 KMTANRIKAVALSHFARYGYEGTSLANIAQEVGIKKPSIYAHFKRKEELYFICLESALQK 61
+ T I VAL F++ G TSL IA+ G+ + +IY HFK K +L+ E +
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 62 DLQSFTDDIENFSNSSTEELLLQLLKGYAKRFGESEESMFWLRTSYFPPDAFRE-QIIEK 120
+ + F +L ++L + E + + + E ++++
Sbjct: 70 IGELELEYQAKFP-GDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQ 128

Query: 121 --ANAHIENVGKLLFPIFKQANEKSELH-NIEVKDALEAFLCLLDGLM------------ 165
N +E+ ++ K E L ++ + A + GLM
Sbjct: 129 AQRNLCLESYDRIE-QTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDL 187

Query: 166 -------VELLFAGLNRFETRLNAS 183
V +L T N +
Sbjct: 188 KKEARDYVAILLEMYLLCPTLRNPA 212


45  BALH_3977  BALH_3992  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_39772153.178093hypothetical protein
BALH_39783163.180954tRNA-specific 2-thiouridylase MnmA
BALH_39793213.758244class V aminotransferase
BALH_39804243.671651BadM/Rrf2 family transcriptional regulator
BALH_39814233.410022recombination factor protein RarA
BALH_39824253.344501prespore-specific transcriptional regulator
BALH_39832242.628062hesA/moeB/thiF family protein
BALH_39842252.745043aspartyl-tRNA synthetase
BALH_39850161.977622histidyl-tRNA synthetase
BALH_39860131.723018hypothetical protein
BALH_39870161.573880D-tyrosyl-tRNA(Tyr) deacylase
BALH_39881161.390505GTP diphosphokinase
BALH_39891141.550609adenine phosphoribosyltransferase
BALH_39900121.255683exonuclease RecJ
BALH_39912160.817850cobalt-zinc-cadmium resistance protein
BALH_39922191.180810bifunctional preprotein translocase subunit
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_3977  SYCDCHAPRONE  33  4e-04  Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 33.0 bits (75), Expect = 4e-04
Identities = 17/90 (18%), Positives = 32/90 (35%)

Query: 11 GIQYMQEGNWEEAAKNFTEAIEKNPKDALGYINFANLLDVLGDSERAILFYKRALELDDK 70
Q G +E+A K F + D+ ++ +G + AI Y +D K
Sbjct: 43 AFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIK 102

Query: 71 SAAAYYGLGNVYYGQEQFAEAKAVFEQAMQ 100
+ + + AEA++ A +
Sbjct: 103 EPRFPFHAAECLLQKGELAEAESGLFLAQE 132



Score = 31.1 bits (70), Expect = 0.002
Identities = 17/96 (17%), Positives = 27/96 (28%)

Query: 112 LGITHVQLGNDRLALPFLQRATELDENDVEAVFQCGLCFARLEHIQEAKPYFEKVLEMDE 171
L Q G A Q LD D G C + A + MD
Sbjct: 42 LAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDI 101

Query: 172 EHADAYYNLGVAYVFEENNEKALALFKKATEIQPDH 207
+ ++ + + +A + A E+ D
Sbjct: 102 KEPRFPFHAAECLLQKGELAEAESGLFLAQELIADK 137


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
BALH_3979  RTXTOXINA  30     0.027    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.
Length = 1024

Score = 29.5 bits (66), Expect = 0.027
Identities = 25/123 (20%), Positives = 46/123 (37%), Gaps = 8/123 (6%)

Query: 114 GFEVTYLPVDETGRVQVSDIQKAL-TEETILVSVMFGNNEVGTMQPIAEIGKLLKEHQAY 172
G++ + E +S K E ++L++ + +G + + G ++Y
Sbjct: 444 GYDARHAAFLEDNFKILSQYNKEYSVERSVLITQQHWDTLIGELAGVTRNGDKTLSGKSY 503

Query: 173 FHTDAVQAYGLVEINVKEFGIDLLSISAHKINGPKGVGFLYAGTNVKF-EPLLIGGEQER 231
D + +E EF + I+ T +KF PLL GE+ R
Sbjct: 504 --IDYYEEGKRLEKKXDEFQKQVFDPLKGNIDLSDSKS----STLLKFVTPLLTPGEEIR 557

Query: 232 KRR 234
+RR
Sbjct: 558 ERR 560


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_3986  PF05043  25     0.020    Transcriptional activator
		>PF05043#Transcriptional activator

Length = 493

Score = 25.3 bits (55), Expect = 0.020
Identities = 9/32 (28%), Positives = 14/32 (43%)

Query: 32 FISKEQNNTSMELASEFGISLQDVKRLKKQIE 63
FI + + + EF IS + R+ QI
Sbjct: 94 FIFFNEGCQAESICKEFYISSSSLYRIISQIN 125


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_3987  THERMOLYSIN  28     0.009    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 28.4 bits (63), Expect = 0.009
Identities = 24/118 (20%), Positives = 46/118 (38%), Gaps = 16/118 (13%)

Query: 16 DGEIVGQIPFGLTLLVGITHEDTEKDATYIAEKIANLRIFEDESGKMNHSVLDVEGQVLS 75
DG+ +PF + V + HE T + + A L ++++ESG +N ++ D+ G ++
Sbjct: 352 DGDGQTFLPFSGGIDV-VGHELTHA----VTDYTAGL-VYQNESGAINEAMSDIFGTLVE 405

Query: 76 ----------ISQFTLYGDCRKGRRPNFMDAAKPDYAEHLYDFFNEEVRKQGLHVETG 123
I + + D AK +H + G+H +G
Sbjct: 406 FYANRNPDWEIGEDIYTPGVAGDALRSMSDPAKYGDPDHYSKRYTGTQDNGGVHTNSG 463


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_3992  SECFTRNLCASE  270    2e-86    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 270 bits (691), Expect = 2e-86
Identities = 100/318 (31%), Positives = 165/318 (51%), Gaps = 21/318 (6%)

Query: 443 PTKFDRINFVNVGHKFLIFSIVVVIAGAIILPIFKLNLGIDFASGTRIDLQSKQSVTVSD 502
P K + +F +IV++IA I+ + LN GIDF GT I +S ++ V
Sbjct: 9 PEKTN-FDFFRWQWATFGAAIVMMIASVILPLVIGLNFGIDFKGGTTIRTESTTAIDVGV 67

Query: 503 VHKDFKELNID---VKEENIVPTGDDNKGFAVR-----------TLGVLSKDEIAKTKTF 548
+ L + + E +D +R G ++ + K +T
Sbjct: 68 YRAALEPLELGDVIISEVRDPSFREDQHVAMIRIQMQEDGQGAEGQGAQGQELVNKVETA 127

Query: 549 FH--DKYGTDPNVSTVSPTIGKEIARNAFIAVLIASAVIILYVSIRFRFTYALSAVLALL 606
D + +V P + E+ A ++L A+ VI+ Y+ +RF + +AL AV+AL+
Sbjct: 128 LTAVDPALKITSFESVGPKVSGELVWTAVWSLLAATVVIMFYIWVRFEWQFALGAVVALV 187

Query: 607 HDAFVMIVIFSIFQLEVDLTFIAAVLTIIGYSINDSIVTFDRNRELYKQKKRVRDIKDLE 666
HD + + +F++ QL+ DLT +AA+LTI GYSIND++V FDR RE + K L
Sbjct: 188 HDVLLTVGLFAVLQLKFDLTTVAALLTITGYSINDTVVVFDRLRENLIKYKT----MPLR 243

Query: 667 EIVNASIRQTLGRSINTVLTVLFPVIALLIFGSESLRNFSFALLVGLIVGTYSSVFVASQ 726
+++N S+ +TL R++ T +T L ++ +LI+G + +R F FA++ G+ GTYSSV+VA
Sbjct: 244 DVMNLSVNETLSRTVMTGMTTLLALVPMLIWGGDVIRGFVFAMVWGVFTGTYSSVYVAKN 303

Query: 727 IWLMLENRRLKKGKNKKK 744
I L + R K+ K+
Sbjct: 304 IVLFIGLDRNKEKKDPSD 321



Score = 66.0 bits (161), Expect = 1e-13
Identities = 37/180 (20%), Positives = 84/180 (46%), Gaps = 11/180 (6%)

Query: 249 SVGAKFGQQALEQTIFASAIGIALIFVFMLV-FYRLPGIVAVIMLGLYIFVTLLVFNWMH 307
SVG K + + +++ +I ++ V F + AV+ L + +T+ +F +
Sbjct: 142 SVGPKVSGELVWTAVWSLLAATVVIMFYIWVRFEWQFALGAVVALVHDVLLTVGLFAVLQ 201

Query: 308 AVLTLPGIAALVLGVGIAVDANIITYERLKEELKIGKSMM------SAFRAGNHRSLATI 361
L +AAL+ G +++ ++ ++RL+E L K+M + R++ T
Sbjct: 202 LKFDLTTVAALLTITGYSINDTVVVFDRLRENLIKYKTMPLRDVMNLSVNETLSRTVMTG 261

Query: 362 LDANITTLAAAGVLFVYGNSSVKGFATSLIVSILVGFITNVFGTRFLLSLLVKSRYFDKK 421
+ TTL A + ++G ++GF +++ + G ++V+ + ++ + R +KK
Sbjct: 262 M----TTLLALVPMLIWGGDVIRGFVFAMVWGVFTGTYSSVYVAKNIVLFIGLDRNKEKK 317


46  BALH_4033  BALH_4067  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias    %GCBias  Product
BALH_4033       1       16  -3.407432  rod shape-determining protein MreD
BALH_4034       1       17  -3.785728  rod shape-determining protein MreC
BALH_4035       2       18  -4.238855  rod shape-determining protein MreB
BALH_4036       3       17  -5.617432  hypothetical protein
BALH_4037       4       17  -6.072575  hypothetical protein
BALH_4038       6       17  -6.072064  phage integrase
BALH_4039       6       20  -5.710926  hypothetical protein
BALH_4040       4       20  -4.191242  hypothetical protein
BALH_4041       4       19  -1.770041  hypothetical protein
BALH_4042       3       18   0.076456  ABC-type bacteriocin transporter family protein
BALH_4043       2       19   3.585183  hypothetical protein
BALH_4044       0       18   3.399784  hypothetical protein
BALH_4045      -1       16   2.826297  cell surface protein
BALH_4046      -1       14   3.675046  exosporium protein
BALH_4047      -1       14   2.613555  DNA repair protein RadC
BALH_4048       0       13   2.244538  Maf-like protein
BALH_4049       0       14   2.768584  stage II sporulation protein B
BALH_4050      -1       17   2.760614  folylpolyglutamate synthase
BALH_4051      -1       18   3.111734  valyl-tRNA synthetase
BALH_4052      -1       14   2.815234  hypothetical protein
BALH_4053      -1       11   2.203604  stage VI sporulation protein D
BALH_4054       1       12   2.178840  glutamate-1-semialdehyde aminotransferase
BALH_4055       0       10   1.053807  delta-aminolevulinic acid dehydratase
BALH_4056       2       12   1.130123  uroporphyrinogen-III synthase
BALH_4057       1       13   0.613344  porphobilinogen deaminase
BALH_4058       0       13   0.948087  HemX family protein
BALH_4059       0       15   2.062401  glutamyl-tRNA reductase
BALH_4060      -1       17   2.163983  MarR family transcriptional regulator
BALH_4061       1       19   2.738027  organic hydroperoxide resistance protein
BALH_4062       2       16   2.051002  ribosome biogenesis GTP-binding protein YsxC
BALH_4063       1       14   1.973608  Lon-A peptidase
BALH_4064       2       15   1.074021  ATP-dependent protease LA
BALH_4065       4       17   0.309482  ATP-dependent protease ATP-binding subunit ClpX
BALH_4066       5       19   0.317347  trigger factor
BALH_4067       2       18  -0.656060  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_4035  SHAPEPROTEIN  497    e-180    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.
Length = 347

Score = 497 bits (1281), Expect = e-180
Identities = 194/336 (57%), Positives = 252/336 (75%), Gaps = 5/336 (1%)

Query: 4 FGGFTRDLGIDLGTANTLVYVKGKGVVLREPSVVALQTD----TKQIVAVGSDAKQMIGR 59
G F+ DL IDLGTANTL+YVKG+G+VL EPSVVA++ D K + AVG DAKQM+GR
Sbjct: 6 RGMFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKSVAAVGHDAKQMLGR 65

Query: 60 TPGNVVALRPMKDGVIADYETTATMMKYYIQQAQKSNGFFSRKPYVMVCVPSGITAVERR 119
TPGN+ A+RPMKDGVIAD+ T M++++I+Q SN F P V+VCVP G T VERR
Sbjct: 66 TPGNIAAIRPMKDGVIADFFVTEKMLQHFIKQV-HSNSFMRPSPRVLVCVPVGATQVERR 124

Query: 120 AVIDATRQAGARDAYPIEEPFAAAIGANLPVWEPTGSMVVDIGGGTTEVAIISLGGIVTS 179
A+ ++ + AGAR+ + IEEP AAAIGA LPV E TGSMVVDIGGGTTEVA+ISL G+V S
Sbjct: 125 AIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYS 184

Query: 180 QSVRVAGDDMDDSIIQYIKKSYNLMIGERTAEALKLEIGSAGEPEGIEPMEIRGRDLVSG 239
SVR+ GD D++II Y++++Y +IGE TAE +K EIGSA + + +E+RGR+L G
Sbjct: 185 SSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEG 244

Query: 240 LPKTVLIQPEEIADALKDTVDAIVESVKNTLEKTPPELAADIMDRGIVLTGGGALLRNLD 299
+P+ + EI +AL++ + IV +V LE+ PPELA+DI +RG+VLTGGGALLRNLD
Sbjct: 245 VPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLD 304

Query: 300 KVISEETNMPVLVAEDPLDCVAIGTGKALDNIDLFK 335
+++ EET +PV+VAEDPL CVA G GKAL+ ID+
Sbjct: 305 RLLMEETGIPVVVAEDPLTCVARGGGKALEMIDMHG 340


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
BALH_4041  RTXTOXIND  104    2e-26    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 104 bits (261), Expect = 2e-26
Identities = 84/476 (17%), Positives = 175/476 (36%), Gaps = 68/476 (14%)

Query: 4 PIRDITE---LTDSVEMLEKNPPKFIKLFIYLVLTIIISALIWSYFSKIDISAKGSATIQ 60
P+R+ E L +E++E + +L Y ++ ++ A I S +++I A + +
Sbjct: 32 PVREKDENEFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLT 91

Query: 61 SSTATIAVKPKVSGEINDIKIKQGEKVRRGDPLIVINNKQLEQEKKHLEENRDKLDAGIK 120
S + +KP + + +I +K+GE VR+GD L+ + E + + + +
Sbjct: 92 HSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQT 151

Query: 121 SLKILKDVIDKNQSKLSIDVEESIRNELDSYLKQKTLLNNENKNKIVDFTHRLQVLNNVK 180
+IL I+ N+ ++ + Y + + E + N K
Sbjct: 152 RYQILSRSIELNK-------LPELKLPDEPYFQN--VSEEEVLRLTSLIKEQFSTWQNQK 202

Query: 181 DE-----------------GINNLEFEIESIKNQNNILKIEKEGLLKQNKIIKNENIAQE 223
+ IN E K++ + LL + I K+ + QE
Sbjct: 203 YQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSS----LLHKQAIAKHAVLEQE 258

Query: 224 TEANVNKVNQLDSQIQGNQKSILIKQEQIKNRAQEIELEKKVINENIRAQESSIQDSIES 283
NK + ++++ + + EQI++ + E +++ + +
Sbjct: 259 -----NKYVEAVNELRVYKSQL----EQIESEILSAKEEYQLVTQLFK------------ 297

Query: 284 LKASTVSDIAKEIDEKQQQLSLLQQDINNINLNYEQTVITAPKDGVIEMPNRIKVGDTIE 343
++I ++ + + LL ++ + +VI AP ++ G +
Sbjct: 298 ------NEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVT 351

Query: 344 QDKEVLSISPDEKTNRVLLYISAEEINKIKIGD--KIKYT-FEVGTTETYYGSVVQIYHD 400
+ ++ I P++ T V + ++I I +G IK F G V I D
Sbjct: 352 TAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLD 411

Query: 401 PVTSKNRGTTFFVV-----EGTIENTGNKKLRSGSTGKAAIVVDQKNILSLMLEKL 451
+ + G F V+ N L SG A I ++++S +L L
Sbjct: 412 AIEDQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRSVISYLLSPL 467


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
BALH_4046  RTXTOXINA  30     0.035    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.
Length = 1024

Score = 29.5 bits (66), Expect = 0.035
Identities = 29/107 (27%), Positives = 35/107 (32%), Gaps = 3/107 (2%)

Query: 179 GATGDPGPTGATGDPGPTGATGDPGPTGATGDPGPTGPTGNPGPTGPTGNPGPTGATGD- 237
GA GD G G+ G G+ +G GD G GN G GN G GD
Sbjct: 742 GADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGDGNDKLIGVAGNNYLNGGDGDD 801

Query: 238 --PGATGATGDPGPTGATGDPGPTGPTGDPGPTGATGDPGPTGPTGD 282
+ G G+ G G G GD G G+
Sbjct: 802 EFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDDLLKGGYGN 848


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_4055  ENTEROVIROMP  31     0.004    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.
Length = 171

Score = 31.0 bits (70), Expect = 0.004
Identities = 32/157 (20%), Positives = 54/157 (34%), Gaps = 25/157 (15%)

Query: 146 AVLAKTAVSQAKAGADIIAPSNMMDGFVTAIRHALDENGFGHVPVMSYAVKYSSAFYGPF 205
+V A + V+ A +D N M GF R+ D + G + +Y K +A G +
Sbjct: 21 SVAATSTVTGGYAQSDAQGQMNKMGGFNLKYRYEEDNSPLGVIGSFTYTEKSRTASSGDY 80

Query: 206 RDAAHGAPQFGDRKTYQMDPANRME-----------AFREAESDVMEGADFLIVKPALSY 254
+ G PA R+ + + ++ SY
Sbjct: 81 NKNQYYGITAG--------PAYRINDWASIYGVVGVGYGKFQTTEYPTYKHDTSDYGFSY 132

Query: 255 LDIVRDVKNNFN-LPVVAYNVSGEYSMIKAAAQNGWI 290
++ FN + VA + S E S I++ WI
Sbjct: 133 GAGLQ-----FNPMENVALDFSYEQSRIRSVDVGTWI 164


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
BALH_4062  TCRTETOQM  28     0.027    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.
Length = 639

Score = 27.9 bits (62), Expect = 0.027
Identities = 18/90 (20%), Positives = 37/90 (41%), Gaps = 13/90 (14%)

Query: 58 KTQTLNFFLINEMMHFVDVPGYGYAKVSKTERAAWGKMIETYFTTREQLDAAVLVVDLRH 117
+T +F N ++ +D PG+ +++ R+ LD A+L++ +
Sbjct: 57 QTGITSFQWENTKVNIIDTPGH-MDFLAEVYRSL------------SVLDGAILLISAKD 103

Query: 118 KPTNDDVMMYDFLKHYDIPTIIIATKADKI 147
+++ L+ IPTI K D+
Sbjct: 104 GVQAQTRILFHALRKMGIPTIFFINKIDQN 133


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_4063  HTHFIS  38     2e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 37.5 bits (87), Expect = 2e-04
Identities = 29/101 (28%), Positives = 43/101 (42%), Gaps = 14/101 (13%)

Query: 370 LCLVGPPGVGKTSLARSI-ATSLNRN--FVRVSLGGVRD---ESEIRGHRRTYVGAMPGR 423
L + G G GK +AR++ RN FV +++ + ESE+ GH + GA G
Sbjct: 163 LMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEK---GAFTGA 219

Query: 424 IIQGMKKAKSVNP-VFLLDEIDKMSNDFRGDPSAALLEVLD 463
+ + + LDEI M D + LL VL
Sbjct: 220 QTRSTGRFEQAEGGTLFLDEIGDMPMDAQ----TRLLRVLQ 256


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_4064  HTHFIS  58     4e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 58.3 bits (141), Expect = 4e-11
Identities = 43/214 (20%), Positives = 76/214 (35%), Gaps = 41/214 (19%)

Query: 44 ELEQLRKMREISLTEPLAEKVR----PTSFLDIVGQEDGIKSLK--AALCGPNPQHVIIY 97
+L +L + +L EP + + +VG+ ++ + A ++I
Sbjct: 107 DLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMIT 166

Query: 98 GPPGVGKTAAARLVLEEAKRNPKSPFRTNATFIELDATTARFDERGIADPLIGSVHDPIY 157
G G GK AR + + KR F+ ++ A I L G
Sbjct: 167 GESGTGKELVARALHDYGKRRNGP-------FVAINM--AAIPRDLIESELFGHE----- 212

Query: 158 QGAGAMGQAGIPQPKKGAVTDAHGGILFIDEIGELHPIQMNKMLKVLEDRKVFLESAYYS 217
GA G G A GG LF+DEIG++ ++L+VL+ +
Sbjct: 213 --KGAF--TGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGG--- 265

Query: 218 EENTMIPTYIHDIFQKGLPADFRLVGATTRSPEE 251
+ + +D R+V AT + ++
Sbjct: 266 --------------RTPIRSDVRIVAATNKDLKQ 285


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_4065  HTHFIS  29     0.041    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.041
Identities = 38/198 (19%), Positives = 76/198 (38%), Gaps = 38/198 (19%)

Query: 86 VPKPVEIREILDEY--VIGQDNAK-KALAVAVYNHYKRINSNSKIDDV-----ELAKSNI 137
+PKP ++ E++ + + + L + + ++ + ++ L ++++
Sbjct: 102 LPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDL 161

Query: 138 A--LIGPTGSGKTLLAQTL---ARILNVPF------AIADATSLTEAGYVGEDVENILLK 186
+ G +G+GK L+A+ L + N PF AI L E+ G +
Sbjct: 162 TLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPR--DLIESELFGH-EKGAFTG 218

Query: 187 LIQAADYDVEKAEKGIIYIDEIDKVARKSENPSITRDVSGEGVQQALLKILEGTVASVPP 246
+ E+AE G +++DEI D+ Q LL++L+
Sbjct: 219 AQTRSTGRFEQAEGGTLFLDEIG-------------DMP-MDAQTRLLRVLQQG--EYTT 262

Query: 247 QGGRKHPHQEFIQIDTTN 264
GGR + + TN
Sbjct: 263 VGGRTPIRSDVRIVAATN 280


47  BALH_4103  BALH_4117  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias    %GCBias  Product
BALH_4103       2       13   2.188701  hypothetical protein
BALH_4104       1       11   2.320627  excinuclease ABC subunit C
BALH_4105       0       13   3.189360  thioredoxin
BALH_4106      -2       14   4.654739  electron transfer flavoprotein subunit alpha
BALH_4107      -1       12   3.401120  electron transfer flavoprotein, subunit beta
BALH_4108      -1       12   3.893226  enoyl-CoA hydratase
BALH_4109      -1       13   3.692124  TetR family transcriptional regulator
BALH_4110      -1       13   3.293852  long-chain-fatty-acid--CoA ligase
BALH_4112      -1       15   2.145215  triple helix repeat-containing collagen
BALH_4113      -1       14  -0.079596  iron compound ABC transporter substrate-binding
BALH_4114      -1       13  -0.376584  iron-hydroxamate transporter permease subunit
BALH_4115      -2       13  -3.458581  hypothetical protein
BALH_4116      -2       14  -4.009301  hypothetical protein
BALH_4117      -1       13  -3.925992  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_4109  HTHTETR  114    7e-34    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 114 bits (286), Expect = 7e-34
Identities = 36/192 (18%), Positives = 75/192 (39%), Gaps = 10/192 (5%)

Query: 20 RPKYNQIIDAAVIVIAENGYHQAQVSKIAKQAGVADGTIYLYFKNKEDILISLFQEKMGE 79
+ I+D A+ + ++ G + +IAK AGV G IY +FK+K D+ +++
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 80 FVETIRQKTAGIESAVSKLFMLVETHFLLLSQNDPL--AIVTQLELRQSNQDLRLKINEV 137
E + A + + H L + + ++ + + + +
Sbjct: 70 IGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQA 129

Query: 138 LKGY----LQVIDEILETGIKQGEFQADLNVRVARQMIFGTVDEVVTNWVMSDHKYDLVA 193
+ I++ L+ I+ ADL R A ++ G + ++ NW+ + +DL
Sbjct: 130 QRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDL-- 187

Query: 194 LSKTVHGLLIAA 205
K +A
Sbjct: 188 --KKEARDYVAI 197


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_4113  FERRIBNDNGPP  183    6e-58    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 183 bits (466), Expect = 6e-58
Identities = 62/258 (24%), Positives = 115/258 (44%), Gaps = 11/258 (4%)

Query: 72 AKKVVVLEWVYSEDLLALGVQPVGMADIKNYNKWVNTKTKPSKDVVDVGTRQQPNLEEIS 131
++V LEW+ E LLALG+ P G+AD NY WV+ P V+DVG R +PNLE ++
Sbjct: 35 PNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPPLP-DSVIDVGLRTEPNLELLT 93

Query: 132 RLKPDLIITASFRGKAIKNELEQIAPTVMFDPSTSNNDHFAEMTETFKQIAKAVGKEEEG 191
+KP ++ S L +IAP F+ S A ++ ++A + +
Sbjct: 94 EMKPSFMVW-SAGYGPSPEMLARIAPGRGFNFSDGKQP-LAMARKSLTEMADLLNLQSAA 151

Query: 192 KKVLADMDKAFADAKAKIEKADLKDKNIAMAQAFTAKNVPTFRILTDNSLALQVTKKLGL 251
+ LA + K + K + + + +++ + NSL ++ + G+
Sbjct: 152 ETHLAQYEDFIRSMKPRFVKRGARP--LLLTTLIDPRHM---LVFGPNSLFQEILDEYGI 206

Query: 252 TNTFEAGKSEPDGFKQTTVESLQSVQDSNFIYIVADEDNIFDTQLKGNPAWEELKFKKEN 311
N ++ G++ G +++ L + +D + + D D L P W+ + F +
Sbjct: 207 PNAWQ-GETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMD-ALMATPLWQAMPFVRAG 264

Query: 312 KMYKLKGDTWIFGGPESA 329
+ ++ W +G SA
Sbjct: 265 RFQRVP-AVWFYGATLSA 281


48  BALH_4134  BALH_4162  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias    %GCBias  Product
BALH_4134       2       17   2.037612  pseudouridylate synthase
BALH_4135       2       23   2.523751  recombination and DNA strand exchange inhibitor
BALH_4136       2       28   2.779378  hypothetical protein
BALH_4137       4       33   3.195484  colicin V production protein
BALH_4138       4       32   3.565843  cell division protein ZapA
BALH_4139       4       34   3.379085  ribonuclease HIII
BALH_4140       4       34   3.267092  asparaginyl-tRNA synthetase
BALH_4141       3       29   2.172484  phenylalanyl-tRNA synthetase subunit beta
BALH_4142      -1       18  -0.084029  phenylalanyl-tRNA synthetase subunit alpha
BALH_4143       0       15  -1.663494  23S rRNA methyltransferase
BALH_4144       0       17  -3.602903  small acid-soluble spore protein SspI
BALH_4145      -2       12  -1.749510  metal-dependent phosphohydrolase
BALH_4146       0       12  -1.639919  CAAX amino protease
BALH_4147       1       12  -1.241011  CAAX amino terminal protease family protein
BALH_4149       1       15   0.548013  hypothetical protein
BALH_4150       1       17   0.495148  hypothetical protein
BALH_4152       3       21   1.620513  multidrug resistance protein B
BALH_4153       5       23   1.692957  multidrug resistance protein A
BALH_4154       5       22   1.899881  TetR family transcriptional regulator
BALH_4155       3       32   2.470564  M42 family deblocking aminopeptidase
BALH_4156       3       27   0.620267  hypothetical protein
BALH_4157       3       23   1.144777  50S ribosomal protein L20
BALH_4158       2       17   1.402449  50S ribosomal protein L35
BALH_4159       3       15   1.255563  translation initiation factor IF-3
BALH_4160       1       15   1.417422  threonyl-tRNA synthetase
BALH_4161       1       14   0.905649  hypothetical protein
BALH_4162       2       12   1.658969  primosomal protein DnaI
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
BALH_4135  GPOSANCHOR  37     2e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 37.4 bits (86), Expect = 2e-04
Identities = 35/118 (29%), Positives = 60/118 (50%), Gaps = 11/118 (9%)

Query: 518 KIENMIAKLEE-------SQKNAERDWNEAEALRKQSEKLHREL--QRQIIEFNEERDER 568
++E KLEE S+++ RD + + +KQ E H++L Q +I E + + R
Sbjct: 327 QLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRR 386

Query: 569 LLKAQKEGEEKVEAAKKEAEGIIQELRQLRKAQLANVK--DHELIEAKSRLEGAAPEL 624
L A +E +++VE A +EA + L +L K + K + E E +++LE A L
Sbjct: 387 DLDASREAKKQVEKALEEANSKLAALEKLNKELEESKKLTEKEKAELQAKLEAEAKAL 444


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_4144  DNABINDINGHU  24     0.040    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 24.3 bits (53), Expect = 0.040
Identities = 10/33 (30%), Positives = 15/33 (45%), Gaps = 1/33 (3%)

Query: 22 DQLQETIVDAIQSGEEKMLPGLGVLFEVIWKNA 54
D + + + GE+ L G G FEV + A
Sbjct: 27 DAVFSAVSSYLAKGEKVQLIGFGN-FEVRERAA 58


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_4152  TCRTETB  145    7e-40    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 145 bits (367), Expect = 7e-40
Identities = 85/400 (21%), Positives = 174/400 (43%), Gaps = 14/400 (3%)

Query: 108 FVSILNQTIINVALPPLMNEFNVSTSTAQWLITGFMLVNGILVPISAFLVSRFTYRKLFV 167
F S+LN+ ++NV+LP + N+FN ++ W+ T FML I + L + ++L +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 168 AAMLFFTVGSIICATSGN-FTMMMTGRVIQAVGAGILMPVGMNIFMTLFPPHKRGAAMGL 226
++ GS+I + F++++ R IQ GA + M + P RG A GL
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 227 LGVAMILAPAIGPTVTGWVIENYSWNLMFYAMFIIGLIITFLSLKFFTLAQPVSNTKLDV 286
+G + + +GP + G + W+ + I IIT L + D+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMI--TIITVPFLMKLLKKEVRIKGHFDI 201

Query: 287 FGVVSSSIGLGSLLYGFSEAGNNSWTSAEVIISLVIGVIGLALFIWRELTTDNKMLDLQV 346
G++ S+G+ + + S + L++ V+ +F+ + +D +
Sbjct: 202 KGIILMSVGIVFFMLF---TTSYSIS------FLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 347 FKYPVFTFTLVINAIVTMALFGGMLLLPVYLQNIRGFTPIESG-LLLLPGSLIMGIMGPV 405
K F ++ I+ + G + ++P ++++ + E G +++ PG++ + I G +
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 406 AGKLFDKYGIRPLAIIGLAITTYATYEFTKLSMDTPYSVIMTDYIIRSIGMSFIMMPIMT 465
G L D+ G + IG+ + + + L T + + + G+SF I T
Sbjct: 313 GGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSW-FMTIIIVFVLGGLSFTKTVIST 371

Query: 466 AGMNALPMKLISHGTATQNTSRQVAGSIGTAILITLMTQQ 505
++L + G + N + ++ G AI+ L++
Sbjct: 372 IVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIP 411


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
BALH_4153  RTXTOXIND  77     1e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 77.2 bits (190), Expect = 1e-18
Identities = 29/135 (21%), Positives = 49/135 (36%), Gaps = 12/135 (8%)

Query: 95 QTVDVTIPQNATVVQSNATT-NAFVGAGSPI-AYAFDMNNLWVTANIEETDVDDVQKGQD 152
Q + P + V Q T V + + + L VTA ++ D+ + GQ+
Sbjct: 326 QASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQN 385

Query: 153 VDVYVDAYPDTT---LSGKVEQVGLTTANTFSMLPSSNATANYTKVTQVVPVKISLDHSK 209
+ V+A+P T L GKV+ + L V + +K
Sbjct: 386 AIIKVEAFPYTRYGYLVGKVKNINLDAI-------EDQRLGLVFNVIISIEENCLSTGNK 438

Query: 210 SVNIVPGMNVTVRIH 224
++ + GM VT I
Sbjct: 439 NIPLSSGMAVTAEIK 453


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_4154  HTHTETR  60     2e-13    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 60.0 bits (145), Expect = 2e-13
Identities = 24/100 (24%), Positives = 39/100 (39%), Gaps = 6/100 (6%)

Query: 9 PRVKRTRQLIQDAFVALVGEKGFENVTVQHIAERAPVNRATFYSHYHDKYDLLDKSIEEM 68
+ TRQ I D + L ++G + ++ IA+ A V R Y H+ DK DL + E
Sbjct: 7 QEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELS 66

Query: 69 LEKLTEVIKPKNRNKEDFQLAFDSPHPNFLALFEHIAENA 108
+ E+ E P + H+ E+
Sbjct: 67 ESNIGELE------LEYQAKFPGDPLSVLREILIHVLEST 100


49  BALH_4253  BALH_4264  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias    %GCBias  Product
BALH_4253       2       13  -1.305049  hypothetical protein
BALH_4254       2       19   0.291472  hypothetical protein
BALH_4255       2       18   0.183777  LacI family transcriptional regulator
BALH_4256       1       17  -0.442683  hypothetical protein
BALH_4257       1       21   1.332503  hypothetical protein
BALH_4258       0       21   1.697474  hypothetical protein
BALH_4259       1       18   2.895599  aminopeptidase
BALH_4260       1       14   2.433546  hypothetical protein
BALH_4261       2       14   2.703234  ribosomal-protein-serine acetyltransferase
BALH_4262       2       16   2.737609  UDP-N-acetylmuramate--L-alanine ligase
BALH_4263       3       16   2.729156  nicotinate phosphoribosyltransferase
BALH_4264       3       15   3.029519  cell division protein FtsK
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_4258  TYPE4SSCAGA  30     0.004    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 30.4 bits (68), Expect = 0.004
Identities = 21/71 (29%), Positives = 39/71 (54%), Gaps = 4/71 (5%)

Query: 119 VTDEIENNADKVAQVVQWSSAAIEVY---NHYRATRQEKKVEKEERKLERLEKKVEKK-E 174
+ D + +N + V + + ++ A + N+ + +K +EK RK E LEK+VEKK E
Sbjct: 574 IKDFLSSNKELVGKTLNFNKAVADAKNTGNYDEVKKAQKDLEKSLRKREHLEKEVEKKLE 633

Query: 175 KRSRLRMRGES 185
+S + + E+
Sbjct: 634 SKSGNKNKMEA 644


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_4264  IGASERPTASE  59     1e-10    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 59.3 bits (143), Expect = 1e-10
Identities = 53/233 (22%), Positives = 84/233 (36%), Gaps = 13/233 (5%)

Query: 519 VEETPIVEETPIVEETPIV----EETPIVEETPVVEEQPVVEEAPIAEEQPVAEEAPVVE 574
V+ T I I + P V EE V+E PV P P + VAE + +
Sbjct: 992 VDTTNITTPNNIQADVPSVPSNNEEIARVDEAPV---PPPAPATPSETTETVAENSK--Q 1046

Query: 575 EQRVVEEAPIVEEQPVVQKEEPKREKKRHVPFNVVMLKQDRARLMERHASRTNAMQSSMS 634
E + VE+ + Q E +E K +V N + ++ + T +++
Sbjct: 1047 ESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATV 1106

Query: 635 ERVENKPVHQVEEKPQVEEKQMQQVVEPQVEEKPMQQIVVEPQVEEKQMQQVVEPQVEEV 694
E+ E +VE + E ++ V P+ E+ Q EP E + EPQ +
Sbjct: 1107 EKEEKA---KVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTN 1163

Query: 695 QPVQQVVAEQVQKPISSTEVEEKAYVVNQRENDVRNVLQTPPTYTIPSLTLLS 747
+ V E VN + V N T P T P++ S
Sbjct: 1164 TTADTEQPAKETSSNVEQPVTEST-TVNTGNSVVENPENTTPATTQPTVNSES 1215



Score = 53.9 bits (129), Expect = 4e-09
Identities = 42/210 (20%), Positives = 73/210 (34%), Gaps = 8/210 (3%)

Query: 313 SEEIKRNTEI-EQPTIEVE--EQSPEEAVIVKAEEKLEETIVVEIPEEVEVIAEAEELEE 369
+EEI R E P E + A K E K E + E E + +
Sbjct: 1014 NEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAK 1073

Query: 370 VEVIAEAEESEEVEVIAETEESEKVEV-IAETEAPEEVEPVALEEMQQE--MVLNEAIEQ 426
V A + +E + +ET+E++ E T EE V E+ Q+ + + +Q
Sbjct: 1074 SNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQ 1133

Query: 427 KNEFIHVAEADEQTKKDVQSFADVLIAEEQSVVEETPIVEETPIVEETPIVEETPIVEET 486
+ +A+ + D ++ + + +ET E P+ E T V
Sbjct: 1134 EQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTT-VNTG 1192

Query: 487 PIVEETPIVEETPIVEETPIVEETPIVEET 516
V E P + T + E+ +
Sbjct: 1193 NSVVENPENTTPATTQPT-VNSESSNKPKN 1221



Score = 41.2 bits (96), Expect = 3e-05
Identities = 51/280 (18%), Positives = 82/280 (29%), Gaps = 20/280 (7%)

Query: 421 NEAIEQKNEFIHVAEADEQTKKDVQSFADVLIAEEQSVVEETPIVEETPIV--EETPIVE 478
N +E++N+ + + EE + V+E P+ P E T V
Sbjct: 982 NPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVA 1041

Query: 479 ETPIVEETPIVEETPIVEETPIVEETPIVEETPIVEETPIVEETPIVEETPIVEETPIVE 538
E E + + ET E V+ E +T +
Sbjct: 1042 ENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETK 1101

Query: 539 ETPIVEETPVVEEQPVVEEAPIAEEQPVAEEAPVVEEQRVVEEAPIVEEQPVVQKEEPKR 598
ET VE+ EE+ VE E V + +EQ QP EP R
Sbjct: 1102 ETATVEK----EEKAKVETEKTQEVPKVTSQVSPKQEQSE-------TVQPQA---EPAR 1147

Query: 599 EKKRHVPFNVVMLKQDRARLMERHASRTNAM-QSSMSERVENKPVHQVEEKPQVE---EK 654
E V + + E+ A T++ + ++E + V E P+
Sbjct: 1148 ENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATT 1207

Query: 655 QMQQVVEPQVEEKPMQQIVVEPQVEEKQMQQVVEPQVEEV 694
Q E + K + V + V
Sbjct: 1208 QPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTV 1247



Score = 39.3 bits (91), Expect = 1e-04
Identities = 46/269 (17%), Positives = 79/269 (29%), Gaps = 16/269 (5%)

Query: 451 LIAEEQSVVEETPIVEETPIVEETPIVEETPIVEETPIVEETPIVEETPIVEETPIVEET 510
+AE +T E E T E E V+ E +T
Sbjct: 1039 TVAENSKQESKTVEKNEQDATETTAQNREV-AKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 511 PIVEETPIVEETPIVEETPIVEETPIVEETPIVEETPVVEEQPVVEEAPIAEEQPVAEEA 570
+ET VE+ EE VE E + + +EQ E + +P E
Sbjct: 1098 TETKETATVEK----EEKAKVETEKTQEVPKVTSQVSPKQEQS---ETVQPQAEPAREND 1150

Query: 571 PVVEEQRVVEEAPIVEEQPVVQKEEPKREKKRHVPFNVVMLKQDRARLMERHASRTNAMQ 630
P V + P + E+P +E +V V +
Sbjct: 1151 PTVN-----IKEPQSQTNTTADTEQPAKETSSNVEQPVT--ESTTVNTGNSVVENPENTT 1203

Query: 631 SSMSERVENKPVHQV-EEKPQVEEKQMQQVVEPQVEEKPMQQIVVEPQVEEKQMQQVVEP 689
+ ++ N + + + + + VEP + V + V+
Sbjct: 1204 PATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSD 1263

Query: 690 QVEEVQPVQQVVAEQVQKPISSTEVEEKA 718
+ Q V V + V + IS E+ +
Sbjct: 1264 ARAKAQFVALNVGKAVSQHISQLEMNNEG 1292



Score = 37.7 bits (87), Expect = 3e-04
Identities = 47/291 (16%), Positives = 98/291 (33%), Gaps = 23/291 (7%)

Query: 127 PVIKKATAPTQESNRRPF-RPTEMISPIYGYNRPSVEKKEEKQEEVKEREDLEISVEGKS 185
P +A P+ SN R E P PS + E E ++E + +
Sbjct: 1000 PNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPS--ETTETVAENSKQESKTVEKNEQD 1057

Query: 186 VVDAWLEKKGYTLSDFSEGQAPTSSSHGAANEQGERQYEES-KKEEKSVVDQWLEKNGYE 244
+ + + S +A T ++ A + ++ + + KE +V + EK E
Sbjct: 1058 ATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKE--EKAKVE 1115

Query: 245 IERQEPIVEEKEVVQEMSAPQEVPAAELLHETIAERMEGAKQESDVVDKNILQEELVDSK 304
E+ + E +V ++S QE ET+ + E A++ V+ ++E +
Sbjct: 1116 TEKTQ---EVPKVTSQVSPKQEQS------ETVQPQAEPARENDPTVN---IKEPQSQTN 1163

Query: 305 VEHEDTILSEEIKRN--TEIEQPTIEVEEQSPEEAVIVKAEEKLEETIVVE---IPEEVE 359
+ ++E N + + T S E + T+ E P+
Sbjct: 1164 TTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRH 1223

Query: 360 VIAEAEELEEVEVIAEAEESEEVEVIAETEESEKVEVIAETEAPEEVEPVA 410
+ VE + + + + V+++ A + +
Sbjct: 1224 RRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALN 1274


50  BALH_4284  BALH_4312  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias    %GCBias  Product
BALH_4284       2       17   1.574622  ArsR family transcriptional regulator
BALH_4285       0       17   1.280114  hypothetical protein
BALH_4286      -1       15   0.643219  hypothetical protein
BALH_4287      -1       16   0.593466  hypothetical protein
BALH_4288       0       16   1.593399  CarD family transcriptional regulator
BALH_4289      -1       16   1.952975  glucose-1-dehydrogenase
BALH_4290      -2       16   2.604768  glucose uptake protein
BALH_4291      -1       16   2.951041  hypothetical protein
BALH_4292      -2       17   3.282023  molybdopterin converting factor subunit 1
BALH_4293      -2       17   3.235606  molybdopterin synthase subunit MoaE
BALH_4294       6       27   8.509826  molybdopterin-guanine dinucleotide biosynthesis
BALH_4295       6       27   8.889708  molybdopterin biosynthesis protein
BALH_4296       8       30   9.414009  molybdenum cofactor biosynthesis protein MoaC
BALH_4297       8       33   9.209030  thiamine/molybdopterin biosynthesis MoeB-like
BALH_4298       8       33   8.993806  molybdenum cofactor biosynthesis protein A
BALH_4300       6       26   7.321813  collagen-like protein
BALH_4302      -1       16   0.455279  hypothetical protein
BALH_4304      -1       15  -0.287429  hypothetical protein
BALH_4305      -1       14  -0.142963  rhodanese-like domain-containing protein
BALH_4306       2       14   1.931479  homoserine O-acetyltransferase
BALH_4307       1       20   2.841366  spore germination protein GerHA
BALH_4308       2       20   3.664524  spore germination protein IB
BALH_4309       0       18   4.101680  spore germination protein
BALH_4310       0       18   4.800912  hypothetical protein
BALH_4312       0       17   4.113411  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_4289  DHBDHDRGNASE  120    4e-35    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.
Length = 261

Score = 120 bits (303), Expect = 4e-35
Identities = 74/257 (28%), Positives = 120/257 (46%), Gaps = 12/257 (4%)

Query: 7 LAGKVVVITGSATGLGRAMGVRFAKEKAKVV-INYRSRESEAHDVLEEIKKVGGEAIAVK 65
+ GK+ ITG+A G+G A+ A + A + ++Y + E V+ +K A A
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEK--VVSSLKAEARHAEAFP 63

Query: 66 GDVTVESDVVNLIQSAVKEFGTLDVMINNAGIENAVPSHEMPLEDWNKVINTNLTGAFLG 125
DV + + + +E G +D+++N AG+ H + E+W + N TG F
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 126 SREAIKYFVEHDIKGSIINMSSVHEKIPWPLFVHYAASKGGMKLMTETLAMEYAPKGIRV 185
SR KY ++ GSI+ + S +P YA+SK + T+ L +E A IR
Sbjct: 124 SRSVSKYMMDRR-SGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 186 NNIGPGAINTPINAEKFADPKKRADV--------ESMIPMGYIGKPEEIAAVATWLASSE 237
N + PG+ T + +AD V ++ IP+ + KP +IA +L S +
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 238 ASYVTGITLFADGGMTL 254
A ++T L DGG TL
Sbjct: 243 AGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4304  PF07675  25  0.045  Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 24.7 bits (53), Expect = 0.045
Identities = 13/41 (31%), Positives = 18/41 (43%)

Query: 33 VYAGAGGSSAAIFLNGKRQPEAVIRTSVFLPPLATSTRTLG 73
VYA + G+ A+ F N + +T V P TR G
Sbjct: 1175 VYASSTGNDASNFANALLEEVLTAKTVVTAPEAIRGTRAQG 1215


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4307  IGASERPTASE  47  3e-07  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 47.0 bits (111), Expect = 3e-07
Identities = 41/207 (19%), Positives = 84/207 (40%), Gaps = 14/207 (6%)

Query: 9 KKKSNTTEKNETDNSEQKPNNQEDDNKEQTRSTKHNKSNNSEQKKEEHKESSQDKQQNQS 68
K++S T EKNE D +E N+E + ++ + ++N Q E KE+ + + +
Sbjct: 1045 KQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETA 1104

Query: 69 NQNQQQSAKQD-ESSQEQQNHS-------KQGDSNQGQQNHSKQNDSDQGQQQHSKQDDS 120
+++ AK + E +QE + +Q ++ Q Q +++ND ++ Q ++
Sbjct: 1105 TVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNT 1164

Query: 121 NQGQQNHSKQGDSNQGQQNHSKQGDSNQGQQNHSKQNDSDQGQQQHSKQDDSNQGQQNHS 180
+ +K+ SN Q + + +N + Q + SN+ + H
Sbjct: 1165 TADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHR 1224

Query: 181 K------QNDSDQDDSSQDKQQSSKQD 201
+ N SS D+ + D
Sbjct: 1225 RSVRSVPHNVEPATTSSNDRSTVALCD 1251


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4312  56KDTSANTIGN  30  0.012  Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein

signature.
Length = 533

Score = 29.9 bits (67), Expect = 0.012
Identities = 14/34 (41%), Positives = 15/34 (44%), Gaps = 7/34 (20%)

Query: 259 QQQQQYQQYQQQQQSSPWAGGIGAGAAGAAAGAA 292
Q QQ Q QQQQ+ A A A A AA
Sbjct: 338 PQAQQQQGQGQQQQAQ-------ATAQEAVAAAA 364


51  BALH_4335  BALH_4346  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_4335  0  16  3.110209  transferase
BALH_4336  1  21  2.864944  phosphatidylglycerophosphatase B
BALH_4337  1  17  2.582686  glycosyl transferase family protein
BALH_4338  3  20  2.712143  molybdenum cofactor guanylyltransferase
BALH_4339  2  20  2.240562  molybdenum cofactor biosynthesis protein B
BALH_4340  0  18  1.241800  S-adenosylmethionine synthetase
BALH_4341  -2  10  -0.508339  phosphoenolpyruvate carboxykinase
BALH_4342  -1  14  -2.682039  DMT family permease
BALH_4343  -1  14  -3.385704  hypothetical protein
BALH_4344  -1  13  -2.689000  phosphatase
BALH_4345  0  13  -3.235027  DeoR family transcriptional regulator
BALH_4346  0  14  -3.101161  phosphoglycolate phosphatase
52  BALH_4365  BALH_4376  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_4365  2  12  0.221550  spore germination protein
BALH_4366  1  11  0.131327  S-layer protein
BALH_4367  0  12  0.108954  hypothetical protein
BALH_4368  0  12  0.353256  hypothetical protein
BALH_4369  1  12  0.577894  hypothetical protein
BALH_4370  4  14  -2.034070  hypothetical protein
BALH_4372  6  16  -3.954873  hypothetical protein
BALH_4373  2  16  -3.953120  phage shock protein A
BALH_4374  2  20  -5.129130  hypothetical protein
BALH_4375  3  22  -5.398464  hypothetical protein
BALH_4376  6  21  -5.905941  hypothetical protein
53  BALH_4387  BALH_4422  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_4387  2  17  0.290788  cell wall surface anchor family protein
BALH_4388  0  17  -0.202326  hypothetical protein
BALH_4389  -1  16  -0.504231  hypoxanthine phosphoribosyltransferase
BALH_4390  -2  14  -0.608282  diacylglycerol kinase
BALH_4391  -3  12  -2.269263  IS605 family transposase
BALH_4392  -1  14  -3.001936  hypothetical protein
BALH_4393  -1  11  -3.443300  hypothetical protein
BALH_4394  -1  11  -2.891289  DedA family protein
BALH_4395  -2  11  -2.308917  ribosomal RNA adenine dimethylase
BALH_4396  -1  12  -1.861314  AraC family transcriptional regulator
BALH_4397  0  14  -0.757396  hypothetical protein
BALH_4398  0  14  -1.544710  ABC transporter permease
BALH_4399  1  15  -1.545672  ABC transporter ATP-binding protein
BALH_4400  1  14  -1.664362  AraC family transcriptional regulator
BALH_4401  2  14  -2.359242  DNA-binding response regulator
BALH_4402  1  13  -2.149829  sensor histidine kinase
BALH_4403  2  11  -2.118725  ABC transporter permease
BALH_4404  2  11  -0.603721  ABC transporter permease
BALH_4405  2  11  0.514076  ABC transporter ATP-binding protein
BALH_4406  2  12  0.975643  ABC transporter permease
BALH_4408  2  14  2.969854  gluconate permease
BALH_4409  2  14  2.835082  GntR family transcriptional regulator
BALH_4410  2  15  2.412085  2-dehydro-3-deoxygluconokinase
BALH_4411  1  17  1.982146  hypothetical protein
BALH_4412  1  20  2.692513  L-seryl-tRNA(Sec) selenium transferase
BALH_4413  0  20  2.934520  dihydroorotase
BALH_4414  0  17  3.480100  M15B family D-Ala-D-Ala carboxypeptidase VanY
BALH_4415  1  18  4.330603  sensor histidine kinase
BALH_4416  1  17  5.192848  DNA-binding response regulator
BALH_4417  1  14  4.809660  o-succinylbenzoate synthase
BALH_4418  0  13  4.458762  O-succinylbenzoic acid--CoA ligase
BALH_4419  1  14  4.283035  naphthoate synthase
BALH_4420  2  13  3.747515  alpha/beta fold family hydrolase
BALH_4421  1  14  3.855395  2-succinyl-5-enolpyruvyl-6-hydroxy-3-
BALH_4422  -1  18  3.360324  menaquinone-specific isochorismate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4387  IGASERPTASE  35  6e-05  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.4 bits (81), Expect = 6e-05
Identities = 22/80 (27%), Positives = 37/80 (46%), Gaps = 1/80 (1%)

Query: 46 AAIQAQQKNDTEKKQVAQAQEKNEVAKKQAVQAQEKNEVAKKQAAQVQEKSNMAKEVVAP 105
A ++Q++ T +K A E ++ A +A+ N A Q +V + + KE
Sbjct: 1040 VAENSKQESKTVEKNEQDATETTAQNREVAKEAK-SNVKANTQTNEVAQSGSETKETQTT 1098

Query: 106 QVKEKEAIAKKEAAKEAGEK 125
+ KE + K+E AK EK
Sbjct: 1099 ETKETATVEKEEKAKVETEK 1118



Score = 30.0 bits (67), Expect = 0.004
Identities = 17/84 (20%), Positives = 29/84 (34%), Gaps = 4/84 (4%)

Query: 51 QQKNDTEKKQVAQAQEKNE----VAKKQAVQAQEKNEVAKKQAAQVQEKSNMAKEVVAPQ 106
+ KQ ++ EKNE Q + ++ + K Q E + E Q
Sbjct: 1037 TETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQ 1096

Query: 107 VKEKEAIAKKEAAKEAGEKLPNTA 130
E + A E ++A + T
Sbjct: 1097 TTETKETATVEKEEKAKVETEKTQ 1120



Score = 30.0 bits (67), Expect = 0.005
Identities = 21/106 (19%), Positives = 37/106 (34%), Gaps = 7/106 (6%)

Query: 24 ASTGTVTKEEAAQIQQDIAK-----KEAAIQAQQKNDTEKKQVAQAQEKNEVAKKQAVQA 78
+ T E + Q + + K E Q ++ K V + NEVA+ +
Sbjct: 1034 SETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQ-SGSET 1092

Query: 79 QEKNEVAKKQAAQVQEKSNMAKEVVAPQVKEKEAIAKKEAAKEAGE 124
+E K+ A EK AK + + ++ +E E
Sbjct: 1093 KETQTTETKETAT-VEKEEKAKVETEKTQEVPKVTSQVSPKQEQSE 1137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4389  ANTHRAXTOXNA  30  0.004  Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 30.1 bits (67), Expect = 0.004
Identities = 30/143 (20%), Positives = 59/143 (41%), Gaps = 19/143 (13%)

Query: 3 IEIKDTLISEEQLQEKVKELALQIERDFEGEEIVVIAVLKGSFVFAADLIRHIKNDV-TI 61
I IKD I+ EQ +E E+ I D +I+ K +LI+ + +D +
Sbjct: 154 INIKDYAINSEQSKEVYYEIGKGISLD-------IISKDKSLDPEFLNLIKSLSDDSDSS 206

Query: 62 DFISASSYGNQTETTGKVKLLKDIDVNITGKNVIVVEDIIDSGLTLHFLKDH---FFMHK 118
D + + + + E K ID+N +N+ + + +F DH ++
Sbjct: 207 DLLFSQKFKEKLELNNK-----SIDINFIKENLTEFQHAFSLAFSYYFAPDHRTVLELYA 261

Query: 119 PKALKFCTLLDKPERRKVDLTAE 141
P ++ ++K E+ + +E
Sbjct: 262 PDMFEY---MNKLEKGGFEKISE 281


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4392  FLGFLIH  35  2e-04  Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 35.2 bits (80), Expect = 2e-04
Identities = 17/56 (30%), Positives = 30/56 (53%)

Query: 214 DAREKVLMDEAAKFAHAETEGMKRGMEKGLEKGIEQGIEQGIEQGIEQGRKEGVQQ 269
+A + A A +G + G+ +G ++G +QG ++G+ QG+EQG E Q
Sbjct: 35 EAEPSLEQQLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQ 90



Score = 33.6 bits (76), Expect = 6e-04
Identities = 14/48 (29%), Positives = 31/48 (64%)

Query: 226 KFAHAETEGMKRGMEKGLEKGIEQGIEQGIEQGIEQGRKEGVQQGKIQ 273
+ A + + ++G + G+ +G +QG +QG ++G+ QG ++G+ + K Q
Sbjct: 43 QLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQ 90


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4401  HTHFIS  81  2e-19  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 80.6 bits (199), Expect = 2e-19
Identities = 33/117 (28%), Positives = 54/117 (46%), Gaps = 1/117 (0%)

Query: 21 KILIVEDDPNISSLLQSHIQKYGYEAVVAENFDDIMESFNAVKPHLVLLDVNLPKFDGFY 80
IL+ +DD I ++L + + GY+ + N + A LV+ DV +P + F
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 81 WCRQIR-HESTCPIIFISARAGEMEQIMAIESGADDYITKPFHYDVVMAKIKGQLRR 136
+I+ P++ +SA+ M I A E GA DY+ KPF ++ I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4402  PF06580  42  2e-06  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.8 bits (98), Expect = 2e-06
Identities = 23/113 (20%), Positives = 43/113 (38%), Gaps = 23/113 (20%)

Query: 216 DAKWLKFIIYQLMTNAVRY---SGERGKKVFLSAYRNGKDIILEVRDEGVGIPQEDIRRV 272
D + ++ L+ N +++ +G K+ L ++ + LEV + G +
Sbjct: 252 DVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNT---- 307

Query: 273 FEPFYTGKNGRTFGESTGMGLYIVSK-ICDYLG--HSVKLDSEVGKGTTIKII 322
ESTG GL V + + G +KL + GK + +I
Sbjct: 308 -------------KESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4413  UREASE  32  0.004  Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 32.0 bits (73), Expect = 0.004
Identities = 21/85 (24%), Positives = 38/85 (44%), Gaps = 17/85 (20%)

Query: 19 DIVIENNKIAQVTKAG-----------AGEGGKVLDYSGTYVSSGWIDLHVHAFPEFDPY 67
DI +++ +IA + KAG G G +V+ G V++G +D H+H
Sbjct: 87 DIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIVTAGGMDSHIHFI------ 140

Query: 68 GDEVDEIGVKQGVTTIVDAGSCGAD 92
+ E + G+T ++ G+ A
Sbjct: 141 CPQQIEEALMSGLTCMLGGGTGPAH 165


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4415  PF06580  35  4e-04  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.8 bits (80), Expect = 4e-04
Identities = 30/148 (20%), Positives = 53/148 (35%), Gaps = 28/148 (18%)

Query: 177 QIK-HF--------STIAYTKSQRLESLIDELFEITRMNYGMLKLDKKPIDISELLIQLE 227
QI HF + + ++ L E+ R + L+ + L
Sbjct: 169 QINPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYS---LRYSNAR------QVSLA 219

Query: 228 EEL-----YPLLEKHHLEARLNVDPHLPIN-GDGKLLARVFENLLTNAVRYG----YDGQ 277
+EL Y L E RL + + D ++ + + L+ N +++G G
Sbjct: 220 DELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGG 279

Query: 278 FVDLNGYIDNGEVVVQVMNYGDSIPEED 305
+ L G DNG V ++V N G +
Sbjct: 280 KILLKGTKDNGTVTLEVENTGSLALKNT 307


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4416  HTHFIS  102  3e-27  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 102 bits (255), Expect = 3e-27
Identities = 36/144 (25%), Positives = 71/144 (49%), Gaps = 5/144 (3%)

Query: 1 MKRISILIADDEAEIADLIEIHLEKEGYHVVKAADGEEAIHIIETQPIDLVVLDIMMPKM 60
M +IL+ADD+A I ++ L + GY V ++ I DLVV D++MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGYEVTRQIRA-KHHMPIIFLSAKTSDFDKVTGLVLGADDYMTKPFTPIELVARVNAQLR 119
+ +++ +I+ + +P++ +SA+ + + GA DY+ KPF EL+ +
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGII----G 116

Query: 120 RFLTLNQPKVAESKSALEVGGVVI 143
R L + + ++ + + G ++
Sbjct: 117 RALAEPKRRPSKLEDDSQDGMPLV 140


54  BALH_4440  BALH_4466  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_4440  2  16  1.468817  glucose-6-phosphate isomerase
BALH_4441  3  17  1.497539  general stress protein 13
BALH_4442  2  17  1.587034  hypothetical protein
BALH_4443  3  18  1.785517  AsnC family transcriptional regulator
BALH_4444  3  19  2.095818  D-isomer specific 2-hydroxyacid dehydrogenase
BALH_4445  3  18  1.684881  alpha/beta hydrolase fold protein
BALH_4446  2  15  1.857269  hypothetical protein
BALH_4447  -1  15  1.562591  class I and II aminotransferase
BALH_4448  -1  16  1.494243  superoxide dismutase, Cu-Zn
BALH_4449  1  23  1.899050  hypothetical protein
BALH_4450  1  22  1.957245  kinase-associated protein B
BALH_4451  0  21  1.267489  sporulation inhibitor KapD
BALH_4452  -1  25  1.556813  arsenical pump membrane protein
BALH_4453  2  24  1.619564  DNA-7-methylguanine glycosylase
BALH_4454  2  26  1.980659  glycyl-tRNA synthetase
BALH_4455  1  18  1.294455  comA operon protein
BALH_4456  1  18  1.815998  hypothetical protein
BALH_4457  1  20  2.533178  biotin synthesis protein
BALH_4458  1  20  2.593961  UDP-glucose pyrophosphorylase
BALH_4459  1  22  2.882831  alpha-phosphoglucomutase
BALH_4460  1  21  3.045236  hypothetical protein
BALH_4461  3  15  1.938549  leucyl aminopeptidase
BALH_4462  4  15  1.781379  hypothetical protein
BALH_4463  4  14  1.810930  MutT/NUDIX family protein
BALH_4464  3  14  1.652594  pyridine nucleotide-disulfide oxidoreductase
BALH_4465  5  14  1.584181  thioredoxin-disulfide reductase
BALH_4466  6  15  1.318854  cell surface anchor
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4466  IGASERPTASE  64  2e-11  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 63.5 bits (154), Expect = 2e-11
Identities = 40/262 (15%), Positives = 85/262 (32%), Gaps = 11/262 (4%)

Query: 3264 NQPLEPTVSIKPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPE 3323
NQ ++ T P + + P P E + E P E +
Sbjct: 989 NQTVDTTNITTPNNIQA--DVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQ 1046

Query: 3324 DPKEPEVKPEDPKEPEVKPEDPKEPEVK--PEDPKEPEVKPEDPKEPEVKPEDPKEPEVK 3381
+ K E +D E + + + + + EV + E + + KE
Sbjct: 1047 ESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATV 1106

Query: 3382 PEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVK 3441
++ K + + P+V + P+ + + +P+ +P +P V ++P+
Sbjct: 1107 EKEEKAKVETEKTQEVPKVTSQVS--PKQEQSETVQPQAEPARENDPTVNIKEPQSQTNT 1164

Query: 3442 PEDPKEP-EVKPEDPKEPEVKP---EDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKE 3497
D ++P + + ++P + PE+ +P E KP++
Sbjct: 1165 TADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHR 1224

Query: 3498 PEVKPEDPKEPEVKPEDPKEPE 3519
V+ P E +
Sbjct: 1225 RSVRSV-PHNVEPATTSSNDRS 1245



Score = 63.2 bits (153), Expect = 2e-11
Identities = 39/257 (15%), Positives = 82/257 (31%), Gaps = 10/257 (3%)

Query: 3280 VKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDP-KEPEVKPEDPKEP 3338
V + P D E+ + P P P E E ++ K
Sbjct: 992 VDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTV 1051

Query: 3339 EVKPEDPKEPEVKPEDPKEPEVK--PEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPK 3396
E +D E + + + + + EV + E + + KE ++ K
Sbjct: 1052 EKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEK 1111

Query: 3397 EPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPK 3456
+ + P+V + P+ + + +P+ +P +P V ++P+ D +
Sbjct: 1112 AKVETEKTQEVPKVTSQVS--PKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTE 1169

Query: 3457 EP-EVKPEDPKEPEVKP---EDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKP 3512
+P + + ++P + PE+ +P E KP++ V+
Sbjct: 1170 QPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRS 1229

Query: 3513 EDPKEPEVKPEDPKEPE 3529
P E +
Sbjct: 1230 V-PHNVEPATTSSNDRS 1245



Score = 61.6 bits (149), Expect = 6e-11
Identities = 37/232 (15%), Positives = 79/232 (34%), Gaps = 11/232 (4%)

Query: 3256 RAIVDDGINQPLEPTVSIKPKE--PEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVK-- 3311
A VD+ P P + E E ++ K E +D E + + +
Sbjct: 1017 IARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNV 1076

Query: 3312 PEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVK 3371
+ + EV + E + + KE ++ K + + P+V + P+ +
Sbjct: 1077 KANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVS--PKQE 1134

Query: 3372 PEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEP-EVKPEDPKEPEVKP---EDPKE 3427
+ +P+ +P +P V ++P+ D ++P + + ++P +
Sbjct: 1135 QSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNS 1194

Query: 3428 PEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPE 3479
PE+ +P E KP++ V+ P E +
Sbjct: 1195 VVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSV-PHNVEPATTSSNDRS 1245



Score = 61.2 bits (148), Expect = 8e-11
Identities = 39/257 (15%), Positives = 83/257 (32%), Gaps = 6/257 (2%)

Query: 3310 VKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDP-KEPEVKPEDPKEP 3368
V + P D E+ + P P P E E ++ K
Sbjct: 992 VDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTV 1051

Query: 3369 EVKPEDPKEPEVKPEDPKEPEVK--PEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPK 3426
E +D E + + + + + EV + E + + KE ++ K
Sbjct: 1052 EKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEK 1111

Query: 3427 EPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPK 3486
+ + P+V + P+ + + +P+ +P +P V ++P+ D +
Sbjct: 1112 AKVETEKTQEVPKVTSQVS--PKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTE 1169

Query: 3487 EP-EVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKLEKPEVRLEKLIKE 3545
+P + + ++P + E+P+ P KP+ R + ++
Sbjct: 1170 QPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRS 1229

Query: 3546 PQVKVERELPKTGAASP 3562
VE + S
Sbjct: 1230 VPHNVEPATTSSNDRST 1246



Score = 57.4 bits (138), Expect = 1e-09
Identities = 35/236 (14%), Positives = 75/236 (31%), Gaps = 10/236 (4%)

Query: 3221 NGKLTAKYENIIDTKERNITFKVKVKEKAGEEIVNRAIVDDGINQPLEPTVSIKPKEPEV 3280
N + A+ + T + A V+ E T + E
Sbjct: 1013 NNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEA 1072

Query: 3281 KPE---DPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKE 3337
K + + EV + E + + KE ++ K + + P+V +
Sbjct: 1073 KSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVS-- 1130

Query: 3338 PEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEP-EVKPEDPKEPEVKP---E 3393
P+ + + +P+ +P +P V ++P+ D ++P + + ++P +
Sbjct: 1131 PKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVN 1190

Query: 3394 DPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPE 3449
PE+ +P E KP++ V+ P E +
Sbjct: 1191 TGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSV-PHNVEPATTSSNDRS 1245



Score = 42.0 bits (98), Expect = 6e-05
Identities = 26/196 (13%), Positives = 62/196 (31%), Gaps = 8/196 (4%)

Query: 3209 VKAKGAIEVKVENG-KLTAKYENIIDTKERNITFKVKVKEKAGEEIVNRAIVDDGINQPL 3267
+ A E +N N+ + N + + K + + +
Sbjct: 1053 KNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKA 1112

Query: 3268 EPTVSIKPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKE 3327
+ + P+V + P+ + + +P+ +P +P V ++P+ D ++
Sbjct: 1113 KVETEKTQEVPKVTSQVS--PKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQ 1170

Query: 3328 P-EVKPEDPKEPEVKP---EDPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPEVKPE 3383
P + + ++P + PE+ +P E KP++ V+
Sbjct: 1171 PAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSV 1230

Query: 3384 DPKEPEVKPEDPKEPE 3399
P E +
Sbjct: 1231 -PHNVEPATTSSNDRS 1245



Score = 33.9 bits (77), Expect = 0.017
Identities = 36/216 (16%), Positives = 70/216 (32%), Gaps = 27/216 (12%)

Query: 3158 KAEKEVSNKEPKLGEEVEYRISFKNTVENGKLAEVKIEDQLPDGLEYVKDSVKAKGAIEV 3217
K E++ + + E + + N N + EV Q G E +
Sbjct: 1053 KNEQDATETTAQNREVAKE--AKSNVKANTQTNEV---AQS--GSETKETQTTETKETAT 1105

Query: 3218 KVENGKLTAKYENIIDTKERNITFKVKVKEKAGEEIVNRAIVDDGINQPLEPTVSIKPKE 3277
+ K AK E + +T +V K++ E + +A EP + +
Sbjct: 1106 VEKEEK--AKVETEKTQEVPKVTSQVSPKQEQSETVQPQA----------EPA---REND 1150

Query: 3278 PEVKPEDPKEPEVKPEDPKEP-EVKPEDPKEPEVKP---EDPKEPEVKPEDPKEPEVKPE 3333
P V ++P+ D ++P + + ++P + PE+ +P
Sbjct: 1151 PTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPT 1210

Query: 3334 DPKEPEVKPEDPKEPEVKPEDPKEPEVKPEDPKEPE 3369
E KP++ V+ P E +
Sbjct: 1211 VNSESSNKPKNRHRRSVRSV-PHNVEPATTSSNDRS 1245


55  BALH_4505  BALH_4524  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_4505  3  16  1.706624  M24/M37 family peptidase
BALH_4506  2  19  1.884325  hypothetical protein
BALH_4507  2  19  1.944459  hypothetical protein
BALH_4508  2  19  2.106524  5'-nucleotidase
BALH_4509  2  24  2.481880  PadR family transcriptional regulator
BALH_4510  3  31  2.933924  antibiotic resistance protein
BALH_4511  2  30  2.949814  FeS assembly protein SufB
BALH_4512  3  26  1.619750  NifU family protein
BALH_4513  3  25  1.683473  cysteine desulfurase
BALH_4514  2  22  1.454086  FeS assembly protein SufB
BALH_4515  1  17  0.122357  iron regulated ABC transporter ATP-binding
BALH_4516  0  16  -0.641550  ABC transporter substrate-binding protein
BALH_4517  0  15  -0.112075  ABC transporter substrate-binding protein
BALH_4518  1  15  1.038558  ABC transporter permease
BALH_4519  2  13  0.258621  ABC transporter ATP-binding protein
BALH_4520  3  15  -1.671892  hypothetical protein
BALH_4521  3  15  -3.005005  thioredoxin
BALH_4522  6  17  -2.630115  DNA primase
BALH_4523  5  18  -3.080593  glycine cleavage system protein H
BALH_4524  4  18  -3.439890  arsenate reductase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4517  adhesinb  28  0.040  Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 27.9 bits (62), Expect = 0.040
Identities = 16/49 (32%), Positives = 24/49 (48%), Gaps = 4/49 (8%)

Query: 4 MKKLLLTALISTSIFGLAACGGKDNDEK----KLVVGASNVPHAEILEK 48
MKK L+ + GLAAC + + + KL V A+N A+I +
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKN 49


56  BALH_4536  BALH_4545  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_4536  1  22  -4.361586  hypothetical protein
BALH_4537  1  22  -5.418761  transcriptional regulator
BALH_4538  2  23  -6.184435  TetR family transcriptional regulator
BALH_4539  4  25  -6.490569  metallo-beta-lactamase family protein
BALH_4540  0  16  -5.189926  ABC transporter ATP-binding protein
BALH_4541  0  15  -2.678721  hypothetical protein
BALH_4542  3  18  1.030023  hypothetical protein
BALH_4543  2  19  2.160182  hypothetical protein
BALH_4544  2  21  2.683945  hypothetical protein
BALH_4545  2  22  2.536704  acyl-CoA dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4538  TETREPRESSOR  42  3e-07  Tetracycline repressor protein signature.
		>TETREPRESSOR#Tetracycline repressor protein signature.

Length = 218

Score = 42.2 bits (99), Expect = 3e-07
Identities = 18/43 (41%), Positives = 26/43 (60%)

Query: 7 LTLPKIVETAAEIADTNGIQEVTLASLAQRLGIRSPSLYNHVK 49
L +++ A E+ + GI +T LAQ+LGI P+LY HVK
Sbjct: 4 LNRESVIDAALELLNETGIDGLTTRKLAQKLGIEQPTLYWHVK 46


57  BALH_4559  BALH_4576  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_4559  3  20  0.999478  hypothetical protein
BALH_4561  2  19  -0.018953  hypothetical protein
BALH_4562  3  18  -0.574407  hypothetical protein
BALH_4563  0  14  -2.044635  cytolysin immunity CylI protein
BALH_4564  0  14  -1.977727  hypothetical protein
BALH_4565  0  12  -2.635296  DNA-binding protein
BALH_4566  -1  12  -3.438514  hypothetical protein
BALH_4567  -1  15  -1.684122  DedA family protein
BALH_4568  -2  11  -0.814611  hypothetical protein
BALH_4569  -3  11  0.473024  hypothetical protein
BALH_4570  -2  13  0.629981  response regulator
BALH_4571  1  14  0.887142  sensor histidine kinase
BALH_4572  1  16  1.753612  DMT superfamily transporter
BALH_4573  1  12  1.316233  methionyl-tRNA synthetase
BALH_4574  3  16  1.152489  methyltransferase
BALH_4575  4  15  0.613007  methyl-accepting chemotaxis protein
BALH_4576  2  13  0.290288  methyl-accepting chemotaxis protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4570  HTHFIS  85  2e-21  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 85.3 bits (211), Expect = 2e-21
Identities = 33/145 (22%), Positives = 64/145 (44%), Gaps = 2/145 (1%)

Query: 2 THILVIEDNPDIQELIREFLMAQNFTVDVVGAGTEGILLFQKNSYDLVLLDVMLPDIDGY 61
ILV +D+ I+ ++ + L + V + DLV+ DV++PD + +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 S-ICKIMRGQSDVPIIMLTGLHNEESEIKGFELGIDDYITKPFHYTVFIKRVEAVLRRAA 120
+ +I + + D+P+++++ + + IK E G DY+ KPF T I + L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 121 TKEAEAATILQ-FHELMLNSTAYAA 144
+ ++ Q L+ S A
Sbjct: 124 RRPSKLEDDSQDGMPLVGRSAAMQE 148


58  BALH_4620  BALH_4638  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_4620  1  21  3.033487  response regulator
BALH_4621  0  21  3.392026  SsrA-binding protein
BALH_4622  0  25  3.998828  ribonuclease R
BALH_4623  0  31  3.836493  carboxylesterase
BALH_4624  0  35  4.595834  murein hydrolase export regulator
BALH_4625  1  42  4.824734  holin-like protein
BALH_4626  3  47  4.968139  inosine-uridine preferring nucleoside hydrolase
BALH_4627  4  45  5.361458  phosphopyruvate hydratase
BALH_4628  4  37  4.707848  phosphoglyceromutase
BALH_4629  5  26  4.004145  triosephosphate isomerase
BALH_4630  3  23  3.410801  phosphoglycerate kinase
BALH_4631  3  17  2.208495  glyceraldehyde-3-phosphate dehydrogenase
BALH_4632  2  17  1.980925  central glycolytic genes regulator
BALH_4633  1  16  2.807751  glutaredoxin family protein
BALH_4634  0  18  3.700153  RNA polymerase factor sigma-54
BALH_4635  0  27  4.413879  *hypothetical protein
BALH_4636  0  26  3.208435  hypothetical protein
BALH_4637  0  29  4.082663  stage V sporulation protein AC
BALH_4638  -1  26  4.098653  stage V sporulation protein AD
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4620  HTHFIS  58  3e-12  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 58.3 bits (141), Expect = 3e-12
Identities = 21/116 (18%), Positives = 50/116 (43%), Gaps = 1/116 (0%)

Query: 4 VLVIKNERSLAKKIVSGLTQEGHFILKLHNENEGLNIIYEQDWDIIILDWDSLSISGPEI 63
+LV ++ ++ + L++ G+ + N I D D+++ D + ++
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 64 CRQIR-LVKMTPIIIVTDNISSKDCVAGLQAGADDYIRKPFAKEELVARVQAILRR 118
+I+ P+++++ + + + GA DY+ KPF EL+ + L
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


59  BALH_4654  BALH_4687  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
BALH_4654  0  15  -4.137744  prolipoprotein diacylglyceryl transferase
BALH_4655  0  17  -5.158580  HPr kinase/phosphorylase
BALH_4656  3  22  -6.788742  membrane protein
BALH_4657  5  24  -6.695270  hypothetical protein
BALH_4658  6  28  -7.364350  periplasmic protein of efflux system
BALH_4659  -1  18  -1.036746  ABC-type bacteriocin transporter family protein
BALH_4660  1  21  3.282435  hypothetical protein
BALH_4661  1  21  3.098629  hypothetical protein
BALH_4662  2  20  2.597830  hypothetical protein
BALH_4663  2  19  2.452537  hypothetical protein
BALH_4664  2  20  3.464208  excinuclease ABC subunit A
BALH_4665  2  19  3.038745  excinuclease ABC subunit B
BALH_4666  3  22  2.618541  hypothetical protein
BALH_4667  2  22  2.890968  MerR family transcriptional regulator
BALH_4668  2  27  3.709190  hypothetical protein
BALH_4669  3  28  4.730793  hypothetical protein
BALH_4670  3  22  2.671274  LysR family transcriptional regulator
BALH_4671  1  19  2.843897  MerR family transcriptional regulator
BALH_4672  0  17  2.660213  FMN reductase, NADPH-dependent
BALH_4673  -1  16  2.222890  ABC transporter ATP-binding protein
BALH_4674  0  17  2.487787  hypothetical protein
BALH_4675  -1  16  2.273546  multidrug resistance ABC transporter ATP-binding
BALH_4676  -1  18  2.802435  hypothetical protein
BALH_4678  1  22  2.829102  carboxyl-terminal protease
BALH_4680  3  26  2.553413  cell division protein FtsX
BALH_4681  3  25  3.105450  cell division ATP-binding protein FtsE
BALH_4682  3  21  2.496718  cytochrome c-551
BALH_4683  3  17  2.662197  peptide chain release factor 2
BALH_4684  3  16  2.151963  preprotein translocase subunit SecA
BALH_4685  2  12  1.288881  sigma 54 modulation protein/30S ribosomal
BALH_4686  1  12  1.373368  cold-shock DNA-binding protein family protein
BALH_4687  2  12  1.120960  comF operon protein 3
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4658  RTXTOXIND  82  3e-19  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 81.8 bits (202), Expect = 3e-19
Identities = 49/281 (17%), Positives = 115/281 (40%), Gaps = 12/281 (4%)

Query: 20 EELTDSVELLEKKPPKFILLSLLVLFIFLIGFSIWALFGKVDIVSKGNASIQNKEDLIIY 79
E L +EL+E + L + FL+ I ++ G+V+IV+ N + +
Sbjct: 40 EFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEI 99

Query: 80 KSQINGVVSSVFVKSGDTVKKGDILIQLE----NQDLSNKKNELEAALKNFEMNKSMLEQ 135
K N +V + VK G++V+KGD+L++L D ++ L A + +
Sbjct: 100 KPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRS 159

Query: 136 LKNSIKLNQDSFMDDIDKKFLEEYKAYEDGYKSLKLEKEDISSIEAYKSNAIVSINQR-- 193
+ KL + D+ + + E E+ + L KE S+ + K ++++++
Sbjct: 160 I-ELNKLPELKLPDEPYFQNVSE----EEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRA 214

Query: 194 -IQAIDQEHFIKKQELDSINKQSELLKIQANKDGVVQFSSAIQKGDLIDANEELVALIPE 252
+ + + + +K + + + Q+ ++A EL +
Sbjct: 215 ERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQ 274

Query: 253 IDKKKVKIYLSAQEIKGIRKGNQVQYAFNLKGADKQLGKIT 293
+++ + +I + +E + + + + + L+ +G +T
Sbjct: 275 LEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLT 315



Score = 51.4 bits (123), Expect = 2e-09
Identities = 50/265 (18%), Positives = 93/265 (35%), Gaps = 22/265 (8%)

Query: 105 IQLENQDLSNKKNELEAALKNFEMNKSML-EQLKNSIKLNQDSFMDDIDKKFLEEYKAYE 163
+L ++ + A + +E + +L + L + LE+ Y
Sbjct: 205 KELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKH--AVLEQENKYV 262

Query: 164 DGYKSLKLEKEDISSIEAYKSNAIVSINQRIQAIDQEHFIK-----------KQELDSIN 212
+ L++ K + IE+ +A Q E K EL
Sbjct: 263 EAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNE 322

Query: 213 KQSELLKIQANKDGVVQFSSAIQKGDLIDANEELVALIPEIDKKKVKIYLSAQEIKGIRK 272
++ + I+A VQ +G ++ E L+ ++PE D +V + ++I I
Sbjct: 323 ERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINV 382

Query: 273 GNQVQY---AFNLKGADKQLGKITYISKYPIFDKNTKEYVY----ELDATINIKDTN-EL 324
G AF +GK+ I+ I D+ + ++ + N L
Sbjct: 383 GQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLSTGNKNIPL 442

Query: 325 YTGMIGKAAVITGESSVWEFILRKL 349
+GM A + TG SV ++L L
Sbjct: 443 SSGMAVTAEIKTGMRSVISYLLSPL 467


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4673  BLACTAMASEA  26  0.034  Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 26.3 bits (58), Expect = 0.034
Identities = 12/45 (26%), Positives = 23/45 (51%)

Query: 32 AFIRSVADHILQVDESEPRVFHGNYEQYTNRTADASVNVTAQELL 76
AF+R + D++ ++D E + + T AS+ T ++LL
Sbjct: 146 AFLRQIGDNVTRLDRWETELNEALPGDARDTTTPASMAATLRKLL 190


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4678  BINARYTOXINB  30  0.023  Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 30.0 bits (67), Expect = 0.023
Identities = 14/44 (31%), Positives = 22/44 (50%), Gaps = 1/44 (2%)

Query: 227 GKDIGYMQITSFAENTAKEFKDQLKELEKKNIKGLVIDVRGNPG 270
GKDI F + T++ K+QL EL NI ++ ++ N
Sbjct: 573 GKDITEFDFN-FDQQTSQNIKNQLAELNATNIYTVLDKIKLNAK 615


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
BALH_4684  SECA  1170  0.0  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 1170 bits (3029), Expect = 0.0
Identities = 446/897 (49%), Positives = 598/897 (66%), Gaps = 65/897 (7%)

Query: 1 MIGILKKVF-DVNQRQIKRMQKTVEQIDALESSIKPLTDEQLKGKTLEFKERLTKGETVD 59
+I +L KVF N R ++RM+K V I+A+E ++ L+DE+LKGKT EF+ RL KGE ++
Sbjct: 2 LIKLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGEVLE 61

Query: 60 DLLPEAFAVVREAATRVLGMRPYGVQLMGGIALHEGNISEMKTGEGKTLTSTLPVYLNAL 119
+L+PEAFAVVREA+ RV GMR + VQL+GG+ L+E I+EM+TGEGKTLT+TLP YLNAL
Sbjct: 62 NLIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNAL 121

Query: 120 TGKGVHVVTVNEYLAQRDASEMGQLHEFLGLTVGINLNSMSREEKQEAYAADITYSTNNE 179
TGKGVHVVTVN+YLAQRDA L EFLGLTVGINL M K+EAYAADITY TNNE
Sbjct: 122 TGKGVHVVTVNDYLAQRDAENNRPLFEFLGLTVGINLPGMPAPAKREAYAADITYGTNNE 181

Query: 180 LGFDYLRDNMVLYKEQCVQRPLHFAIIDEVDSILVDEARTPLIISGQAQKSTELYMFANA 239
GFDYLRDNM E+ VQR LH+A++DEVDSIL+DEARTPLIISG A+ S+E+Y N
Sbjct: 182 YGFDYLRDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYKRVNK 241

Query: 240 FVRTL-----------ENEKDYSFDVKTKNVMLTEDGITKAEKAFHI-------ENLFDL 281
+ L + E +S D K++ V LTE G+ E+ E+L+
Sbjct: 242 IIPHLIRQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYSP 301

Query: 282 KHVALLHHINQALRAHVVMHRDTDYVVQEGEIVIVDQFTGRLMKGRRYSEGLHQAIEAKE 341
++ L+HH+ ALRAH + RD DY+V++GE++IVD+ TGR M+GRR+S+GLHQA+EAKE
Sbjct: 302 ANIMLMHHVTAALRAHALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAKE 361

Query: 342 GVEIQNESMTLATITFQNYFRMYEKLSGMTGTAKTEEEEFRNIYNMNVIVIPTNKPIIRD 401
GV+IQNE+ TLA+ITFQNYFR+YEKL+GMTGTA TE EF +IY ++ +V+PTN+P+IR
Sbjct: 362 GVQIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMIRK 421

Query: 402 DRADLIFKSMEGKFNAVVEDIVNRHKQGQPVLVGTVAIETSELISKMLTRKGVRHNILNA 461
D DL++ + K A++EDI R +GQPVLVGT++IE SEL+S LT+ G++HN+LNA
Sbjct: 422 DLPDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLNA 481

Query: 462 KNHAREADIIAEAGMKGAVTIATNMAGRGTDIKLG------------------------- 496
K HA EA I+A+AG AVTIATNMAGRGTDI LG
Sbjct: 482 KFHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKADW 541

Query: 497 ----DDIKNIG-LAVIGTERHESRRIDNQLRGRAGRQGDPGVTQFYLSMEDELMRRFGSD 551
D + G L +IGTERHESRRIDNQLRGR+GRQGD G ++FYLSMED LMR F SD
Sbjct: 542 QVRHDAVLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMRIFASD 601

Query: 552 NMKAMMDRLGMDDSQPIESKMVSRAVESAQKRVEGNNYDARKQLLQYDDVLRQQREVIYK 611
+ MM +LGM + IE V++A+ +AQ++VE N+D RKQLL+YDDV QR IY
Sbjct: 602 RVSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVANDQRRAIYS 661

Query: 612 QRQEVMESENLRGIIEGMMKSTVERAV-ALHTQEEIEEDWNIKGLVDYLNTNLLQEGDVK 670
QR E+++ ++ I + + + + A + +EE W+I GL + L + + +
Sbjct: 662 QRNELLDVSDVSETINSIREDVFKATIDAYIPPQSLEEMWDIPGLQERLKNDFDLDLPIA 721

Query: 671 E--EELRRLAPEEMSEPIIAKLIERYNDKEKLMPEEQMREFEKVVVFRVVDTKWTEHIDA 728
E ++ L E + E I+A+ IE Y KE+++ E MR FEK V+ + +D+ W EH+ A
Sbjct: 722 EWLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAA 781

Query: 729 MDHLREGIHLRAYGQIDPLREYQMEGFAMFESMIASIEEEISRYIMKAEI---------- 778
MD+LR+GIHLR Y Q DP +EY+ E F+MF +M+ S++ E+ + K ++
Sbjct: 782 MDYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSKVQVRMPEEVEELE 841

Query: 779 -EQNLERQEVVQGEAVHPSSDGEEAKKKPVVKGDQ--VGRNDLCKCGSGKKYKNCCG 832
++ +E + + Q + + D A + + VGRND C CGSGKKYK C G
Sbjct: 842 QQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCPCGSGKKYKQCHG 898


60  BALH_4719  BALH_4732  Y  N  Y  Genomic Island
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_4719  -2      18       3.504807   LysR family transcriptional regulator
BALH_4720  -1      17       3.645749   hypothetical protein
BALH_4722  -2      17       3.629529   cytochrome d ubiquinol oxidase, subunit I
BALH_4723  -1      16       2.802584   cytochrome d ubiquinol oxidase, subunit II
BALH_4724  -1      15       2.645629   arsenical pump family protein
BALH_4725  0       17       2.447595   thiamine biosynthesis protein ThiC
BALH_4726  0       16       1.100195   L-lactate permease
BALH_4727  2       19       1.291267   IS605 family transposase
BALH_4728  2       25       1.725033   hypothetical protein
BALH_4729  1       26       1.467422   hypothetical protein
BALH_4730  1       26       1.387511   sulfatase
BALH_4731  0       21       1.753761   D-amino acid aminotransferase
BALH_4732  2       24       2.208369   cell wall hydrolase, N-acetylmuramoyl-L-alanine
61  BALH_4767  BALH_4817  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_4767  2       15       -1.795707  hypothetical protein
BALH_4768  3       16       -2.697291  membrane-bound transcriptional regulator LytR
BALH_4769  7       19       -4.536716  mannose-6-phosphate isomerase
BALH_4770  9       21       -5.325055  mannose-1-phosphate guanylyltransferase (GDP)
BALH_4771  12      25       -6.356170  glycosyltransferase, group 1 family protein
BALH_4772  12      25       -6.674375  mannosyltransferase
BALH_4773  9       24       -5.983407  UDP-glucose 6-dehydrogenase
BALH_4774  8       21       -5.759570  hypothetical protein
BALH_4775  1       18       -3.490343  glycoside hydrolase
BALH_4777  -2      16       -2.539817  polysaccharide biosynthesis protein
BALH_4778  0       13       -0.922330  sugar transferase
BALH_4779  1       13       -0.164095  UDP-glucose pyrophosphorylase
BALH_4780  0       17       0.536333   capsular polysaccharide biosynthesis protein
BALH_4781  0       18       0.857248   tyrosine-protein kinase
BALH_4782  1       19       1.317492   capsular polysaccharide biosynthesis protein
BALH_4783  2       18       0.961388   tyrosine-protein kinase
BALH_4784  2       19       0.816670   (3R)-hydroxymyristoyl-ACP dehydratase
BALH_4785  0       15       0.358062   rod shape-determining protein Mbl
BALH_4786  2       13       1.022778   stage III sporulation protein D
BALH_4787  0       12       1.766375   stage II sporulation protein Q
BALH_4788  -1      11       1.207738   ABC transporter permease
BALH_4789  -1      15       2.360937   ABC transporter ATP-binding protein
BALH_4790  -1      16       3.399749   ABC transporter ATP-binding protein
BALH_4791  0       19       4.493034   stage II sporulation protein D
BALH_4792  0       19       3.924058   UDP-N-acetylglucosamine
BALH_4793  2       25       4.066529   hypothetical protein
BALH_4794  4       27       5.124511   NADH dehydrogenase subunit N
BALH_4796  4       29       6.326354   NADH dehydrogenase subunit M
BALH_4797  3       27       6.313094   NADH dehydrogenase subunit L
BALH_4798  1       17       2.866863   NADH dehydrogenase subunit J
BALH_4799  1       15       2.419891   NADH dehydrogenase subunit H
BALH_4800  0       13       1.416995   NADH dehydrogenase subunit D
BALH_4801  -2      11       0.464023   NADH dehydrogenase subunit C
BALH_4802  -2      12       -0.068342  NADH dehydrogenase subunit B
BALH_4803  -1      14       0.142339   PAS/PAC sensor-containing diguanylate
BALH_4804  2       25       3.153816   hypothetical protein
BALH_4806  3       28       3.581322   hypothetical protein
BALH_4807  4       32       4.249846   F0F1 ATP synthase subunit epsilon
BALH_4808  4       33       4.265030   F0F1 ATP synthase subunit beta
BALH_4809  3       29       3.513510   F0F1 ATP synthase subunit gamma
BALH_4810  2       30       3.402422   F0F1 ATP synthase subunit alpha
BALH_4811  3       27       2.760089   F0F1 ATP synthase subunit delta
BALH_4812  2       25       3.512406   F0F1 ATP synthase subunit B
BALH_4813  3       26       3.769406   F0F1 ATP synthase subunit C
BALH_4814  2       22       3.685685   F0F1 ATP synthase subunit A
BALH_4815  2       21       3.251807   uracil phosphoribosyltransferase
BALH_4816  3       23       3.100666   serine hydroxymethyltransferase
BALH_4817  0       19       3.063372   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_4785  SHAPEPROTEIN  478    e-173    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 478 bits (1233), Expect = e-173
Identities = 179/330 (54%), Positives = 244/330 (73%), Gaps = 5/330 (1%)

Query: 1 MFARDIGIDLGTANVLIHVKGKGIVLNEPSVVAIDRNTG----KVLAVGEEARSMVGRTP 56
MF+ D+ IDLGTAN LI+VKG+GIVLNEPSVVAI ++ V AVG +A+ M+GRTP
Sbjct: 8 MFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKSVAAVGHDAKQMLGRTP 67

Query: 57 GNIVAIRPLKDGVIADFEITEAMLKYFINKLDVKSFFS-KPRILICCPTNITSVEQKAIR 115
GNI AIRP+KDGVIADF +TE ML++FI ++ SF PR+L+C P T VE++AIR
Sbjct: 68 GNIAAIRPMKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIR 127

Query: 116 EAAERSGGKTVFLEEEPKVAAVGAGMEIFQPSGNMVVDIGGGTTDIAVLSMGDIVTSSSI 175
E+A+ +G + VFL EEP AA+GAG+ + + +G+MVVDIGGGTT++AV+S+ +V SSS+
Sbjct: 128 ESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSV 187

Query: 176 KMAGDKFDMEILNYIKRKYKLLIGERTSEDIKIKVGTVFPGARSEELEIRGRDMVTGLPR 235
++ GD+FD I+NY++R Y LIGE T+E IK ++G+ +PG E+E+RGR++ G+PR
Sbjct: 188 RIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPR 247

Query: 236 TITVCSEEITEALKENAAVIVQAAKGVLERTPPELSADIIDRGVILTGGGALLHGIDMLL 295
T+ S EI EAL+E IV A LE+ PPEL++DI +RG++LTGGGALL +D LL
Sbjct: 248 GFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLL 307

Query: 296 AEELKVPVLIAENPMHCVAVGTGIMLENID 325
EE +PV++AE+P+ CVA G G LE ID
Sbjct: 308 MEETGIPVVVAEDPLTCVARGGGKALEMID 337


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_4801  IGASERPTASE  54     7e-10    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 53.9 bits (129), Expect = 7e-10
Identities = 42/212 (19%), Positives = 77/212 (36%), Gaps = 20/212 (9%)

Query: 36 SKLEEENREKEKALPKNDDMTIEEAKRRAAAAAKAKAAALAKQKREGTE------EVTEE 89
S EE R E +P T E A +K ++ + K +++ TE EV +E
Sbjct: 1012 SNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKE 1071

Query: 90 EK--VKAKAKAAAAAKAKAAALAKQKREG--TEEVTEEEKAKMKAKAVAAAKAKAAALAK 145
K VKA + A++ + Q E T V +EEKAK++ + +
Sbjct: 1072 AKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTS---- 1127

Query: 146 QKREGIEEVTEEEKVKAKAKAAEAAKAKAAALAKQKASQGNGDSGDEKAKAIAAAKAKAA 205
+ ++E+ + AE A+ + ++ + D + A +
Sbjct: 1128 ------QVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQ 1181

Query: 206 AAARAKTKGAEGKKEEEPKQEEPSVNEPYLNQ 237
+ T E P+ P+ +P +N
Sbjct: 1182 PVTESTTVNTGNSVVENPENTTPATTQPTVNS 1213



Score = 46.6 bits (110), Expect = 1e-07
Identities = 32/159 (20%), Positives = 57/159 (35%), Gaps = 8/159 (5%)

Query: 83 TEEVTEEEKVKAKAKAAAAAKAKAAALAKQKREGTEEVTE--EEKAKMKAKAVAAAKAKA 140
T + + + A+ A + E TE E +K ++K V + A
Sbjct: 999 TPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDA 1058

Query: 141 AALAKQKREGIEEVTEEEKVKAKAKAAEAAKAKAAALAKQK--ASQGNGDSGDEKAKAIA 198
Q RE +E + VKA + E A++ + Q + +EKAK
Sbjct: 1059 TETTAQNREVAKE--AKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVET 1116

Query: 199 AAKAKAAAAARAKTKGAEGKKEEEPKQEEPSV-NEPYLN 236
+ ++ + + E Q EP+ N+P +N
Sbjct: 1117 EKTQEVPKVT-SQVSPKQEQSETVQPQAEPARENDPTVN 1154



Score = 41.6 bits (97), Expect = 5e-06
Identities = 31/224 (13%), Positives = 67/224 (29%), Gaps = 21/224 (9%)

Query: 19 AKEEARKRLVAKHGVEISKLEEENREKEKALPKNDDMTIEEAKRR------AAAAAKAKA 72
A VA++ + SK E+N + + +EAK A++ +
Sbjct: 1031 ATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGS 1090

Query: 73 AALAKQKREGTEEVTEEEKVKAKA---KAAAAAKAKAAALAKQKREGTEEVTEEEKAKMK 129
Q E E T E++ KAK K K + KQ++ T + E
Sbjct: 1091 ETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAE------ 1144

Query: 130 AKAVAAAKAKAAALAKQKREGIEEVTEEEKVKAKAKAAEAAKAKAAALAKQKASQGNGDS 189
A + K+ + + T + + + + + ++
Sbjct: 1145 ----PARENDPTVNIKEPQS--QTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVEN 1198

Query: 190 GDEKAKAIAAAKAKAAAAARAKTKGAEGKKEEEPKQEEPSVNEP 233
+ A + ++ + K + + E + +
Sbjct: 1199 PENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSN 1242



Score = 35.8 bits (82), Expect = 3e-04
Identities = 32/201 (15%), Positives = 59/201 (29%), Gaps = 9/201 (4%)

Query: 14 EAARRAKEEARKRLVAKHGVEISKLEEENREKEKALPKNDDMTIEEAKRRAAAAAK-AKA 72
+A + E A+ K E EKE+ + T E K + + K ++
Sbjct: 1077 KANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQS 1136

Query: 73 AALAKQKREGTEEVTEEEKVKAKAKAAAAAKAKAAALAKQKREGTEEVTEEEKAKMKAKA 132
+ Q E + +++ A + A + + VTE
Sbjct: 1137 ETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPA-KETSSNVEQPVTESTTVNTGNSV 1195

Query: 133 VAAAKAKAAALAKQKREGIEEVTEEEKVKAKAKAAEAAKAKAAALAKQKASQGNGDSGDE 192
V + A + V E K K + + ++ + S N S
Sbjct: 1196 VENPENTTPATTQ------PTVNSESSNKPKNRHRRSVRSVPHNVEPATTSS-NDRSTVA 1248

Query: 193 KAKAIAAAKAKAAAAARAKTK 213
+ + ARAK +
Sbjct: 1249 LCDLTSTNTNAVLSDARAKAQ 1269



Score = 32.7 bits (74), Expect = 0.003
Identities = 31/198 (15%), Positives = 55/198 (27%), Gaps = 10/198 (5%)

Query: 13 REAARR-AKEEARKRLVAKHGVEISKLEEENREKEKALPKNDDMTIEEAKRRAAAAAKAK 71
+E KE A K VE K +E + + PK + + + A
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPT 1152

Query: 72 AAALAKQKREGTEEVTEEEKVKAKAKAAAAAKAKAAALAKQKREGTEEVTEEEKAKMKAK 131
Q + T TE+ + + E T +
Sbjct: 1153 VNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVN 1212

Query: 132 AVAAAKAKAAALAKQKREGIEEVTEEEKVKAKAKAAEAAKAKAAALAKQKASQGNGDSGD 191
+ ++ K K + R + V + + AL ++ N D
Sbjct: 1213 SESSNKPKN-----RHRRSVRSVPHNV----EPATTSSNDRSTVALCDLTSTNTNAVLSD 1263

Query: 192 EKAKAIAAAKAKAAAAAR 209
+AKA A A ++
Sbjct: 1264 ARAKAQFVALNVGKAVSQ 1281


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_4806  ACRIFLAVINRP  28     0.041    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.3 bits (63), Expect = 0.041
Identities = 10/51 (19%), Positives = 22/51 (43%)

Query: 146 TIVFGLLGFLDENANPSFIIVFFILLAISFIVMAILYFSYSMTYYVMVEKP 196
+ + + + + + I + F+ +A LY S+S+ VM+ P
Sbjct: 855 GYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVP 905


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_4812  IGASERPTASE  30     0.005    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.005
Identities = 23/116 (19%), Positives = 49/116 (42%), Gaps = 6/116 (5%)

Query: 36 PLMGIMKEREEHVANEIDAAERNNAEAKKLVEEQREMLKQSRVEAQELIERAKKQAVDQK 95
P E E VA + +++ + +K ++ E Q+R A+E ++ +A Q
Sbjct: 1028 PAPATPSETTETVA---ENSKQESKTVEKNEQDATETTAQNREVAKE--AKSNVKANTQT 1082

Query: 96 DVIVAAAKEEAESIKASAVQEIQREKEQAIAALQEQVASLSVQIASKVIEKELKEE 151
+ + + E E+ + EKE+ E+ + ++ S+V K+ + E
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVP-KVTSQVSPKQEQSE 1137


62  BALH_4826  BALH_4841  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_4826  -1      17       3.204997   stage II sporulation protein R
BALH_4827  -1      17       4.171809   HemK family modification methylase
BALH_4828  0       22       4.495493   peptide chain release factor 1
BALH_4829  -1      21       4.112111   thymidine kinase
BALH_4830  -1      22       3.720890   transcription termination factor Rho
BALH_4831  0       27       4.153546   fructose 1,6-bisphosphatase II
BALH_4832  2       26       3.554857   UDP-N-acetylglucosamine
BALH_4833  2       27       2.384804   fructose-bisphosphate aldolase
BALH_4834  -1      17       3.060727   stage 0 sporulation protein F
BALH_4835  -1      17       3.786339   hypothetical protein
BALH_4836  -2      18       4.001065   CTP synthetase
BALH_4837  1       16       5.177695   DNA-directed RNA polymerase subunit delta
BALH_4838  0       17       5.289718   TetR family transcriptional regulator
BALH_4840  0       15       4.869302   butyryl-CoA dehydrogenase
BALH_4841  -1      13       3.530118   butyryl-CoA dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_4826  IGASERPTASE  40     9e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.0 bits (93), Expect = 9e-06
Identities = 27/99 (27%), Positives = 42/99 (42%), Gaps = 5/99 (5%)

Query: 180 AESPEEEQVKQIDDEEVVDTEEKKEDEVKEKKVVKQEVATKVTASEKKVVKNETKVEEQP 239
A E + + ++ T EK E + E +EVA + K VK T+ E
Sbjct: 1031 ATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKE----AKSNVKANTQTNEVA 1086

Query: 240 VSKEETKTVEKVEKPVEQKQEKQNEY-VKVEEEEEEPEV 277
S ETK + E EK+ + V+ E+ +E P+V
Sbjct: 1087 QSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKV 1125



Score = 33.5 bits (76), Expect = 0.001
Identities = 24/119 (20%), Positives = 44/119 (36%), Gaps = 6/119 (5%)

Query: 169 TAVRKEEHVVKAESPEEEQVKQIDDEEVVDTEEKKEDEVKEKKVVKQEVATKVTASEKKV 228
+ ++ + V K E E Q V E K + + + ++ ++
Sbjct: 1043 NSKQESKTVEKNEQDATETTAQ---NREVAKEAKSNVKANTQTNEVAQSGSETKETQTTE 1099

Query: 229 VKNETKVEEQPVSKEETKTVEKVEKPVEQ---KQEKQNEYVKVEEEEEEPEVKLFIVEA 284
K VE++ +K ET+ ++V K Q KQE+ E E + + I E
Sbjct: 1100 TKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEP 1158



Score = 29.6 bits (66), Expect = 0.019
Identities = 20/103 (19%), Positives = 39/103 (37%), Gaps = 5/103 (4%)

Query: 179 KAESPEEEQVKQIDDEEVVDTEEKKEDEVKEKKVV----KQEVATKVTASEKKVVKNETK 234
K+ Q ++ +T+E + E KE V K +V T+ T KV +
Sbjct: 1073 KSNVKANTQTNEVAQSGS-ETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSP 1131

Query: 235 VEEQPVSKEETKTVEKVEKPVEQKQEKQNEYVKVEEEEEEPEV 277
+EQ + + + P +E Q++ + E+ +
Sbjct: 1132 KQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKE 1174


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_4834  HTHFIS  112    2e-32    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 112 bits (281), Expect = 2e-32
Identities = 31/117 (26%), Positives = 56/117 (47%)

Query: 6 GKILIVDDQYGIRVLLHEVFQKEGYQTFQAANGFQALDIVKKDNPDLVVLDMKIPGMDGI 65
IL+ DD IR +L++ + GY +N + + DLVV D+ +P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 66 EILKHVKEIDESIKVILMTAYGELDMIQEAKDLGALMHFAKPFDIDEIRQAVRNELA 122
++L +K+ + V++M+A +A + GA + KPFD+ E+ + LA
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_4838  HTHTETR  65     3e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 65.4 bits (159), Expect = 3e-15
Identities = 27/141 (19%), Positives = 61/141 (43%), Gaps = 6/141 (4%)

Query: 22 RREQMIKGAVQLFKQKGFPRTTTREIAKAAGFSIGTLYEYIRTKDDVLYLVCDSIYEHVK 81
R+ ++ A++LF Q+G T+ EIAKAAG + G +Y + + K D+ + + ++
Sbjct: 12 TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIG 71

Query: 82 ERLEEV-VCTEKGSVESLKIAITNYFKVMDELQEE---VLIMYQEVRFLPKESLPYVLEK 137
E E + L+ + + + + + I++ + F+ + ++ ++
Sbjct: 72 ELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQR 131

Query: 138 EF--QMVGMFENILEQCTENG 156
+ E L+ C E
Sbjct: 132 NLCLESYDRIEQTLKHCIEAK 152


63  BALH_4913  BALH_4929  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_4913  2       10       1.191274   pyridoxal kinase
BALH_4915  3       12       1.130713   diguanylate cyclase
BALH_4916  2       11       1.738150   hypothetical protein
BALH_4919  1       11       1.182441   carbon starvation protein A
BALH_4920  0       10       -1.228520  response regulator
BALH_4921  0       10       -0.310487  major facilitator family transporter
BALH_4922  -2      10       -1.832983  hypothetical protein
BALH_4923  -2      9        -3.080409  N-acetylmannosaminyltransferase
BALH_4924  -1      10       -3.310972  glycosyltransferase, group 1 family protein
BALH_4925  0       12       -4.192642  hypothetical protein
BALH_4926  0       14       -3.954955  permease
BALH_4927  -1      14       -3.267302  methyl-accepting chemotaxis protein
BALH_4928  -2      11       -3.184978  hypothetical protein
BALH_4929  -1      12       -3.122466  cytosolic long-chain acyl-CoA thioester
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_4920  HTHFIS  51     1e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 51.0 bits (122), Expect = 1e-09
Identities = 19/137 (13%), Positives = 47/137 (34%), Gaps = 12/137 (8%)

Query: 6 KILLIMEEVEERRSLAEKFTENIRNVECFEANTGTESLFMMKKHTPDFVFLNSKLIDGTG 65
IL+ ++ R L + + + + + D V + + D
Sbjct: 5 TILVADDDAAIRTVLNQAL--SRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 66 FEYASLLREVNCYTKFIFMGE--DIEESITAFRFQAFYYLLRPFREEDLQFLLYRMGKEQ 123
F+ +++ + M +I A A+ YL +PF +L ++
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGII------- 115

Query: 124 GEKAKSYLRKLPIEGQE 140
+A + ++ P + ++
Sbjct: 116 -GRALAEPKRRPSKLED 131


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_4921  TCRTETA  60     6e-12    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 59.8 bits (145), Expect = 6e-12
Identities = 72/380 (18%), Positives = 142/380 (37%), Gaps = 35/380 (9%)

Query: 7 ISKRKLLGIAGLGWLFDAMDVGMLSFVMVALQKDWGLSTQEMGWIG---SINSIGMAVGA 63
+ + L + DA+ +G++ V+ L +D S G ++ ++ A
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 64 LVFGILSDKIGRKSVFIITLLLFSIGSGLTALTTTLAMFLVLRFLIGMGLGGELPVASTL 123
V G LSD+ GR+ V +++L ++ + A L + + R + G+ G VA
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAY 119

Query: 124 VSESVEAHERGKIVVLLESFWAGGWLIAALISYF---VIPKYGWEVAMILSAIPALYALY 180
+++ + ER + + + + G + ++ P + A L+ + L +
Sbjct: 120 IADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCF 179

Query: 181 LRWNLPDSPRFQKVEKRPTVIENIKSVWSGEYRKATIMLWILWFSV---------VFSYY 231
L LP+S + ++ R + + S L ++F + ++ +
Sbjct: 180 L---LPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIF 236

Query: 232 GM--FLWLPSV--MVLKGFSLIKSFQYVLIMTLAQLPGYFTAAWFIERLGRKFVLVTYLI 287
G F W + + L F ++ S +I RLG + L+ +I
Sbjct: 237 GEDRFHWDATTIGISLAAFGILHSLAQAMITGP-----------VAARLGERRALMLGMI 285

Query: 288 GTACSAYLFGVAESLTVLIVAGMLLSFFNLGAWGALYAYTPEQYPTVIRGTGAGMAAAFG 347
L A + +LL+ +G AL A Q +G G AA
Sbjct: 286 ADGTGYILLAFATRGWMAFPIMVLLASGGIGM-PALQAMLSRQVDEERQGQLQGSLAALT 344

Query: 348 RIGGILGPLLVGYLVASQAS 367
+ I+GPLL + A+ +
Sbjct: 345 SLTSIVGPLLFTAIYAASIT 364



Score = 33.6 bits (77), Expect = 0.001
Identities = 29/125 (23%), Positives = 45/125 (36%), Gaps = 5/125 (4%)

Query: 274 ERLGRKFVLVTYLIGTACSAYLFGVAESLTVLIVAGMLLSFFNLGAWGALYAYTPEQYPT 333
+R GR+ VL+ L G A + A L VL + G +++ AY +
Sbjct: 68 DRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYI-GRIVAGITGATGAVAGAYIADITDG 126

Query: 334 VIRGTGAG-MAAAFGRIGGILGPLLVGYLVASQASLSLIFTIFCGSILIGVFAVIILGQE 392
R G M+A FG G + GP+L G + + L E
Sbjct: 127 DERARHFGFMSACFG-FGMVAGPVLGGLMGGFSPHAPFFAAAALN--GLNFLTGCFLLPE 183

Query: 393 TKQRE 397
+ + E
Sbjct: 184 SHKGE 188


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
BALH_4925  GPOSANCHOR  35     5e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 35.4 bits (81), Expect = 5e-04
Identities = 9/47 (19%), Positives = 18/47 (38%)

Query: 289 EQEQSAKKEEKKKEEAKEHKPPVTQQEKEKEKEKEKVAEKKEETQAL 335
Q + K E + + E EK + + A+ + ++Q L
Sbjct: 261 RQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVL 307



Score = 32.3 bits (73), Expect = 0.004
Identities = 19/63 (30%), Positives = 30/63 (47%), Gaps = 9/63 (14%)

Query: 291 EQSAKKEEKKKEEAKE-----HKPPVTQQEKEKEKEKEKVAEKKEETQALIFSGRQLFEQ 345
++ K+ EK EEA K +E +K EKEK AE + + +A + L E+
Sbjct: 392 REAKKQVEKALEEANSKLAALEKLNKELEESKKLTEKEK-AELQAKLEA---EAKALKEK 447

Query: 346 MYK 348
+ K
Sbjct: 448 LAK 450


64  BALH_0063  BALH_0070  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0063  -2      16       3.381349   FtsH-2 peptidase
BALH_0064  -1      17       2.490938   pantothenate kinase
BALH_0065  -1      16       2.695769   Hsp33-like chaperonin
BALH_0066  0       17       2.359618   cysteine synthase
BALH_0067  2       18       1.584259   aminodeoxychorismate synthase subunit I
BALH_0068  2       20       1.364060   para-aminobenzoate/anthranilate synthase
BALH_0069  -1      17       1.407913   4-amino-4-deoxychorismate lyase
BALH_0070  0       15       1.587533   dihydropteroate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_0063  HTHFIS  36     4e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 36.3 bits (84), Expect = 4e-04
Identities = 38/179 (21%), Positives = 57/179 (31%), Gaps = 41/179 (22%)

Query: 185 RKFAEVGARIPKGVLLVGPPGTGKTLLARAV---AGEAGVPFFS-----ISGSDFVEMFV 236
+ + +++ G GTGK L+ARA+ PF + I
Sbjct: 150 YRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELF 209

Query: 237 GV------GASRVRD-LFENAKKNAPCIIFIDEIDAVGRQRGAGLGGGHDEREQTLNQLL 289
G GA FE A+ +F+DEI + L +
Sbjct: 210 GHEKGAFTGAQTRSTGRFEQAEGGT---LFLDEIGDMPMDAQTRLLRVLQQG-------- 258

Query: 290 VEMDGFGANEGII----IIAATNRPDILDPALLRPGRFDRQITVDRPDVNGREAVLKVH 344
E G I I+AATN+ L + G F R D+ R V+ +
Sbjct: 259 -EYTTVGGRTPIRSDVRIVAATNKD--L-KQSINQGLF-------REDLYYRLNVVPLR 306


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0064  PF03309  379    e-136    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 379 bits (975), Expect = e-136
Identities = 96/269 (35%), Positives = 163/269 (60%), Gaps = 12/269 (4%)

Query: 1 MIFVLDVGNTNAVLGVF----EEGELRQHWRMETDRHKTEDEYGMLVKQLLEHEGLSFED 56
M+ +DV NT+ V+G+ + ++ Q WR+ T+ T DE + + L+ G E
Sbjct: 1 MLLAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTEPEVTADELALTIDGLI---GDDAER 57

Query: 57 VKGIIVSSVVPPIMFALERMCEKYFKIKP-LVVGPGIKTGLNIKYENPREVGADRIVNAV 115
+ G S VP ++ + M E+Y+ P +++ PG++TG+ + +NP+EVGADRIVN +
Sbjct: 58 LTGASGLSTVPSVLHEVRVMLEQYWPNVPHVLIEPGVRTGIPLLVDNPKEVGADRIVNCL 117

Query: 116 AGIHLYGSPLIIVDFGTATTYCYINEEKHYMGGVITPGIMISAEALYSRAAKLPRIEITK 175
A H YG+ I+VDFG++ ++ + ++GG I PG+ +S++A +R+A L R+E+T+
Sbjct: 118 AAYHKYGTAAIVVDFGSSICVDVVSAKGEFLGGAIAPGVQVSSDAAAARSAALRRVELTR 177

Query: 176 PSSVVGKNTVSAMQSGILYGYVGQVEGIVKRMKEEA----KQEPKVIATGGLAKLISEES 231
P SV+GKNTV MQ+G ++G+ G V+G+V R++++ + V+ATG A L+ +
Sbjct: 178 PRSVIGKNTVECMQAGAVFGFAGLVDGLVNRIRDDVDGFSGADVAVVATGHTAPLVLPDL 237

Query: 232 NVIDVVDPFLTLKGLYMLYERNANLQHEK 260
++ D LTL GL +++ERN Q K
Sbjct: 238 RTVEHYDRHLTLDGLRLVFERNRANQRGK 266


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
BALH_0069  RTXTOXINA  28     0.044    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 28.4 bits (63), Expect = 0.044
Identities = 19/94 (20%), Positives = 42/94 (44%), Gaps = 21/94 (22%)

Query: 191 ILYTPSLETGILNGITRAFIIKVAEELGIKVKEGFFTKDELLSADEVFVTNSIQEIVPLN 250
IL P G + + +++ A+ELGI+V+ + K+ +VF + ++++ L
Sbjct: 50 ILLIPKDYKGQGSSLND--LVRTADELGIEVQ--YDEKNGTAITKQVF--GTAEKLIGL- 102

Query: 251 RIEERDFPGKVGMVTKRFINLYEMQREKLWSRNE 284
T+R + ++ Q +KL + +
Sbjct: 103 --------------TERGVTIFAPQLDKLLQKYQ 122


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0070  PF07201  29     0.015    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 29.4 bits (66), Expect = 0.015
Identities = 10/72 (13%), Positives = 26/72 (36%), Gaps = 4/72 (5%)

Query: 163 ILMHNRDNMNYRNLMADMIADLYDSIKIAKDAGVRDENIILDPGIGFAKTPEQNLEAMRN 222
L + + +L+ + + + G R I +++ L+ +R+
Sbjct: 145 ALKGRPELAHLSHLVEQALVSMAEEQGETIVLGAR----ITPEAYRESQSGVNPLQPLRD 200

Query: 223 LEQLNVLGYPVL 234
+ V+GY +
Sbjct: 201 TYRDAVMGYQGI 212


65  BALH_0243  BALH_0249  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0243  -3      10       0.417148   *************kinase
BALH_0244  -3      10       0.392455   O-sialoglycoprotein endopeptidase
BALH_0245  -3      10       0.686848   30S ribosomal protein S18 alanine
BALH_0246  0       20       2.044508   putative DNA-binding/iron metalloprotein/AP
BALH_0247  -1      18       1.277978   ABC transporter ATP-binding protein
BALH_0248  0       20       1.213509   redox-sensing transcriptional repressor Rex
BALH_0249  -1      28       2.102943   CAAX amino protease
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0243  PF05272  30     0.005    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.005
Identities = 8/25 (32%), Positives = 12/25 (48%)

Query: 28 VRAQDVIILEGDLGAGKTTFTKGLA 52
+ ++LEG G GK+T L
Sbjct: 593 CKFDYSVVLEGTGGIGKSTLINTLV 617


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_0245  SACTRNSFRASE  44     2e-08    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 44.2 bits (104), Expect = 2e-08
Identities = 21/72 (29%), Positives = 32/72 (44%)

Query: 72 ITNIAILPEYRGLKLGDALLKEVISEAKTLGVKTMTLEVRVSNEVAKQLYRKYGFQNGGI 131
I +IA+ +YR +G ALL + I AK + LE + N A Y K+ F G +
Sbjct: 92 IEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAV 151

Query: 132 RKRYYADNQEDG 143
Y++
Sbjct: 152 DTMLYSNFPTAN 163


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0247  PF05272  30     0.024    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.024
Identities = 13/44 (29%), Positives = 18/44 (40%), Gaps = 2/44 (4%)

Query: 379 LVGPNGIGKSTLLKSIVNKLPLLHGDVSFGSNVSVGYYDQEQAN 422
L G GIGKSTL+ ++V G+ Y+Q
Sbjct: 601 LEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKD--SYEQIAGI 642


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_0249  SSPAMPROTEIN  29     0.009    Salmonella surface presentation of antigen gene typ...
		>SSPAMPROTEIN#Salmonella surface presentation of antigen gene type M signature.

Length = 147

Score = 29.3 bits (65), Expect = 0.009
Identities = 14/30 (46%), Positives = 19/30 (63%)

Query: 17 LSSIAGLPLLLKTGLYDNRGFTREEKFQLI 46
+ IAGL LLL T +NR +REE + L+
Sbjct: 43 VEQIAGLKLLLDTLRAENRQLSREEIYALL 72


66  BALH_0360  BALH_0366  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0360  1       15       -0.576422  methyl-accepting chemotaxis protein
BALH_0361  0       16       -1.530406  ArgR family transcriptional regulator
BALH_0362  1       16       -2.056190  arginine deiminase
BALH_0363  2       12       -2.109448  ornithine carbamoyltransferase
BALH_0365  1       13       -1.669342  arginine-ornithine antiporter
BALH_0366  -2      12       -2.216173  carbamate kinase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_0360  BACINVASINB  33     0.003    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 32.8 bits (74), Expect = 0.003
Identities = 40/204 (19%), Positives = 84/204 (41%), Gaps = 11/204 (5%)

Query: 124 VKNIDTTFSYTNNQVQQIRKQTGEATKQAQGVSETLAEISSGAEQSAASIQAIVSAVDTT 183
+ + ++ Q + + G ++ A + +++ + +A+ A D T
Sbjct: 150 TDTAKSVYDAATKKLTQAQNKLQSLDPADPGYAQAEAAVEQAGKEATEAKEALDKATDAT 209

Query: 184 TSIASEVEEKAKQSDELSSEMVQALGHSTRVFTSLIQGIQTLAKENEDSMQNVQKLEERM 243
++ + KA+++D + ++ T+ +++ +D++ NV +L M
Sbjct: 210 VKAGTDAKAKAEKADNILTKFQG---------TANAASQNQVSQGEQDNLSNVARLTMLM 260

Query: 244 KQVEHIVSVVSEIASQTNLLALNASIEAARAGEHGRGFAVVAEEVRKLADESDHSARNIS 303
IV +E S N LAL +++ R E + A EE RK A+E++ I
Sbjct: 261 AMFIEIVGKNTE-ESLQNDLALFNALQEGRQAEMEKKSAEFQEETRK-AEETNRIMGCIG 318

Query: 304 QLLRNMQEEVQQVAMKMTEQVKIA 327
++L + V VA T +A
Sbjct: 319 KVLGALLTIVSVVAAVFTGGASLA 342


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_0361  ARGREPRESSOR  134    2e-43    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 134 bits (339), Expect = 2e-43
Identities = 64/145 (44%), Positives = 95/145 (65%)

Query: 4 MKKEKRQRLIKQFVKEYEIEKQERLVELLAKKDVLVTQATVSRDIRELNLTKVPSQEGLM 63
M K +R I++ + EIE Q+ LV++L K VTQATVSRDI+EL+L KVP+ G
Sbjct: 1 MNKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIKELHLVKVPTNNGSY 60

Query: 64 IYKIFSEEHLQTDIKLKKKLREVVVKIDCVDQLMVIKTLPGNAHVIGVLFDELDWKEKIG 123
Y + +++ KLK+ L + VKID L+V+KT+PGNA IG L D LDW+E +G
Sbjct: 61 KYSLPADQRFNPLSKLKRSLMDAFVKIDSASHLIVLKTMPGNAQAIGALMDNLDWEEIMG 120

Query: 124 CICGNDTCLIISQSKSDREILEERL 148
ICG+DT LII ++ D +++++++
Sbjct: 121 TICGDDTILIICRTHDDTKVVQKKI 145


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_0362  ARGDEIMINASE  542    0.0      Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 542 bits (1399), Expect = 0.0
Identities = 190/409 (46%), Positives = 269/409 (65%), Gaps = 8/409 (1%)

Query: 4 PIHVTSEIGELQTVLLKRPGKEVENLTPDYLQQLLFDDIPYLPIIQKEHDYFAQTLRNRG 63
PI++ SEIG L+ VLL RPG+E+ENLTP ++ LFDDIPYL + ++EH+ FA L+N
Sbjct: 7 PINIFSEIGRLKKVLLHRPGEELENLTPFIMKNFLFDDIPYLEVARQEHEVFASILKNNL 66

Query: 64 VEVLYLEKLAAEALVD-KKLREEFVDRILKEGQADVNVAHQTLKEYLLSFSNEELIQKIM 122
VE+ Y+E L +E LV L +F+ + + E + + LK+Y S + + +I K++
Sbjct: 67 VEIEYIEDLISEVLVSSVALENKFISQFILEAEIKTDFTINLLKDYFSSLTIDNMISKMI 126

Query: 123 GGVRKNEIETSKKTHLYELMEDHYPFYLDPMPNLYFTRDPAASVGDGLTINKMREPARRR 182
GV E++ + + L +L+ F +DPMPN+ FTRDP AS+G+G+TINKM R+R
Sbjct: 127 SGVVTEELK-NYTSSLDDLVNGANLFIIDPMPNVLFTRDPFASIGNGVTINKMFTKVRQR 185

Query: 183 ESLFMEYIIKYHPRFSSHNVPIWLDRDYKFPIEGGDELILNEETIAIGVSARTSAKAIER 242
E++F EYI KYHP + NVPIWL+R + +EGGDEL+LN+ + IG+S RT AK++E+
Sbjct: 186 ETIFAEYIFKYHPVYK-ENVPIWLNRWEEASLEGGDELVLNKGLLVIGISERTEAKSVEK 244

Query: 243 LAKNLFSRQNKIKKVLAIEIPKCRAFMHLDTVFTMVDYDKFTIHPAIQGPKGNMNIYILE 302
LA +LF + +LA +IPK R++MHLDTVFT +DY FT +IY+L
Sbjct: 245 LAISLFKNKTSFDTILAFQIPKNRSYMHLDTVFTQIDYSVFTSFT---SDDMYFSIYVLT 301

Query: 303 KGPDEETLKIT-HSTSLMEALKEVLGLSELVLIPCGGGDVIASAREQWNDGSNTLAIAPG 361
P + I + + L LG ++ +I C GGD+I AREQWNDG+N LAIAPG
Sbjct: 302 YNPSSSKIHIKKEKARIKDVLSFYLG-RKIDIIKCAGGDLIHGAREQWNDGANVLAIAPG 360

Query: 362 VVVTYDRNYVSNTLLREHGIEVIEVLSSELSRGRGGPRCMSMPIVRKDI 410
++ Y RN+V+N L E+GI+V + SSELSRGRGGPRCMSMP++R+DI
Sbjct: 361 EIIAYSRNHVTNKLFEENGIKVHRIPSSELSRGRGGPRCMSMPLIREDI 409


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_0366  CARBMTKINASE  418    e-150    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 418 bits (1075), Expect = e-150
Identities = 154/311 (49%), Positives = 204/311 (65%), Gaps = 4/311 (1%)

Query: 4 RKIVVALGGNAIQ--SGKATAGAQQEALEKTAEQLVKIMENDVDIVIAHGNGPQVGNILL 61
+++V+ALGGNA+Q K + + + KTA Q+ +I+ ++VI HGNGPQVG++LL
Sbjct: 3 KRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLLL 62

Query: 62 QQKAAE-TEKTPAMPLDTCGAMSQGMIGYWMENAIEKALKKRNIKKDVATVITRVVVDKK 120
A + T PA P+D GAMSQG IGY ++ A++ L+KR ++K V T+IT+ +VDK
Sbjct: 63 HMDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVDKN 122

Query: 121 DEAFKNPTKPIGPFYTEEEARRLMEETKAVFKEDAGRGWRRVVPSPKPVSIHEHKVINSL 180
D AF+NPTKP+GPFY EE A+RL E + KED+GRGWRRVVPSP P E + I L
Sbjct: 123 DPAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKL 182

Query: 181 VEDGNIVIAVGGGGIPVIDSEEGLKGTEAVIDKDFAAQKLAELVDADTLVILTAVDHVYV 240
VE G IVIA GGGG+PVI + +KG EAVIDKD A +KLAE V+AD +ILT V+ +
Sbjct: 183 VERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGAAL 242

Query: 241 NYNQPNQKKLEHITVNKLEEYIEEQQFAAGSMLPKIEAAINFVNTNPKRKTIITSLEKVY 300
Y ++ L + V +L +Y EE F AGSM PK+ AAI F+ + II LEK
Sbjct: 243 YYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEW-GGERAIIAHLEKAV 301

Query: 301 EALEEKAGTII 311
EALE K GT +
Sbjct: 302 EALEGKTGTQV 312


67  BALH_0383  BALH_0397  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0383  -1      14       -0.524780  ABC transporter ATP-binding protein
BALH_0384  -2      15       -0.826060  ABC transporter ATP-binding protein
BALH_0385  -1      14       -0.463079  glycosyl hydrolase family chitinase
BALH_0386  0       18       -2.061118  hypothetical protein
BALH_0387  1       13       -1.169537  thioredoxin
BALH_0388  0       11       -0.448642  hypothetical protein
BALH_0389  -1      9        -0.415736  TetR family transcriptional regulator
BALH_0391  -1      14       2.046520   sugar transporter
BALH_0393  -1      17       2.012028   Cro/CI family transcriptional regulator
BALH_0394  -1      17       2.117515   major facilitator superfamily permease
BALH_0395  0       17       2.579740   type I phosphodiesterase/nucleotide
BALH_0396  1       21       3.118350   prolyl-tRNA synthetase
BALH_0397  1       25       3.263487   pentapeptide repeat-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0383  PF05272  35     2e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 34.7 bits (79), Expect = 2e-04
Identities = 11/28 (39%), Positives = 15/28 (53%)

Query: 34 IVGENGIGKSTLLRILTGELIHDDGNIE 61
+ G GIGKSTL+ L G D + +
Sbjct: 601 LEGTGGIGKSTLINTLVGLDFFSDTHFD 628


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0388  cloacin  25     0.027    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 25.4 bits (55), Expect = 0.027
Identities = 13/32 (40%), Positives = 16/32 (50%)

Query: 50 GSDNDSSHDGGSHDCGGSSGGDSGGSCGGGGD 81
GS + GGS G G+SGG G GG+
Sbjct: 49 GSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGN 80



Score = 24.7 bits (53), Expect = 0.048
Identities = 9/24 (37%), Positives = 12/24 (50%)

Query: 57 HDGGSHDCGGSSGGDSGGSCGGGG 80
H+ G+H G+ G G GGG
Sbjct: 9 HNTGAHSTSGNINGGPTGLGVGGG 32


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0389  HTHTETR  84     2e-22    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 83.9 bits (207), Expect = 2e-22
Identities = 46/198 (23%), Positives = 83/198 (41%), Gaps = 8/198 (4%)

Query: 3 MRRSAEEIKKEIAYKAEILFSQKGYAATSMEEICEITERSKGSIYYHFKSKEELFLFVVK 62
++ A+E ++ I A LFSQ+G ++TS+ EI + ++G+IY+HFK K +LF + +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 63 QHTYDWLEKWNEK-EKLYSTSTEKLYALAEYHVEDIQQPISN----AIEEFSMSQVVSKE 117
+ E E K L + + +E I V
Sbjct: 65 LSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMA 124

Query: 118 ILDEMLALT-RESYVMFETLIEAGIQSGEFRED-NTRDLMYIVNGLLSGL-GVLYYELDY 174
++ + ESY E ++ I++ D TR I+ G +SGL +
Sbjct: 125 VVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQS 184

Query: 175 KELKRIYKKAIDVLLKGM 192
+LK+ + + +LL+
Sbjct: 185 FDLKKEARDYVAILLEMY 202


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0391  TCRTETA  39     3e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.0 bits (91), Expect = 3e-05
Identities = 27/130 (20%), Positives = 50/130 (38%), Gaps = 5/130 (3%)

Query: 34 FIMERTNNDPVSVSL-LSVMEYAPIFIFSFIGGALADRWNPKRTMVAGDVLSVLSIIGIV 92
F +R + D ++ + L+ + I G +A R +R ++ G + I +
Sbjct: 236 FGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYI--L 293

Query: 93 LLLKLDYWQAIFFATLISAIVGQFSQPSSSRIFKRYVKEEQVANAIAFNQTLQSLFMIFG 152
L W + F ++ G P+ + R V EE+ L SL I G
Sbjct: 294 LAFATRGW--MAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVG 351

Query: 153 PVVGSLVYTQ 162
P++ + +Y
Sbjct: 352 PLLFTAIYAA 361



Score = 35.2 bits (81), Expect = 4e-04
Identities = 60/344 (17%), Positives = 124/344 (36%), Gaps = 26/344 (7%)

Query: 58 FIFSFIGGALADRWNPKRTMVAGDVLSVLS--IIGIVLLLKLDYWQAIFFATLISAIVGQ 115
F + + GAL+DR+ + ++ + + I+ L + ++ +++ I G
Sbjct: 57 FACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWV-----LYIGRIVAGITGA 111

Query: 116 FSQPSSSRIFKRYVKEEQVANAIAFNQTLQSLFMIFGPVVGSL---VYTQLGLFTSLYSL 172
+ + ++ A F M+ GPV+G L F +
Sbjct: 112 -TGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALN 170

Query: 173 IILFLLSAIALSFLPKWVEQEQVARDSLKNDIKEGWKYVLHTKNLRMITITFTIMGLAVG 232
+ FL LP+ + E+ + +++ + + F IM L
Sbjct: 171 GLNFLTGCF---LLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQ 227

Query: 233 LTNPLEVFLVIERLGMEKEAVQYLAAADGI-GMLIGGIVAAVFASKVNPKKMFVFGMSIL 291
+ L V +R + + AA GI L ++ A+++ ++ + GM
Sbjct: 228 VPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIAD 287

Query: 292 AMSFLVEGLSTSFWITSFMRFGTGICLACVNI---VVGTLMIQLVPENMVGRVNGTILPL 348
+++ +T W M F + LA I + ++ + V E G++ G++ L
Sbjct: 288 GTGYILLAFATRGW----MAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAAL 343

Query: 349 FMGAMLIGTALAGGLKEMTSLV---IVFCIAMALILLAIGPVLR 389
++G L + + + AL LL + P LR
Sbjct: 344 TSLTSIVGPLLFTAIYAASITTWNGWAWIAGAALYLLCL-PALR 386


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0394  TCRTETB  46     1e-07    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 46.4 bits (110), Expect = 1e-07
Identities = 30/158 (18%), Positives = 56/158 (35%), Gaps = 3/158 (1%)

Query: 264 DLGISATNLLIILFVTQIVACPFALLYGKLSTTFTGKKMLYVGIIIYIIICIYAYFLKTT 323
D + + + +YGKLS K++L GIII + + +
Sbjct: 43 DFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSF 102

Query: 324 LDFWILAMLV-ATSQGGIQALSRSYFAKLVPKESANEFFGFYNIFGKFAAIMGPVLVGVT 382
I+A + AL A+ +PKE+ + FG +GP + G+
Sbjct: 103 FSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMI 162

Query: 383 TQLTGKTNAGVLSIIVLFIIGGFLLTRVPENNTSVTPP 420
+ +L I ++ II L ++ + +
Sbjct: 163 AHYIHWSY--LLLIPMITIITVPFLMKLLKKEVRIKGH 198


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0397  cloacin  49     1e-08    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 49.3 bits (117), Expect = 1e-08
Identities = 36/81 (44%), Positives = 43/81 (53%)

Query: 106 SGDDGSGGNGSGGNGSGGNGSGGNGSGGNGSGGNGSGGSGSGDNGSGGNGSGGNGSGGSG 165
SG DG G N + SG G G G G +GSG S + GG+GSG + GGSG
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 166 SGDNGSGGNGSGGNGLGGNGS 186
G+ G GN GG+G GGN S
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLS 82



Score = 49.3 bits (117), Expect = 1e-08
Identities = 34/81 (41%), Positives = 43/81 (53%)

Query: 116 SGGNGSGGNGSGGNGSGGNGSGGNGSGGSGSGDNGSGGNGSGGNGSGGSGSGDNGSGGNG 175
SGG+G G N + SG G G G G +GSG + GGSGSG + GG+G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 176 SGGNGLGGNGSGGNGSGGSGS 196
G G GN GG+G+GG+ S
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLS 82



Score = 48.9 bits (116), Expect = 1e-08
Identities = 33/81 (40%), Positives = 43/81 (53%)

Query: 156 SGGNGSGGSGSGDNGSGGNGSGGNGLGGNGSGGNGSGGSGSGDNGSGGNGSGGSGSGGNG 215
SGG+G G + + SG G GLG G +GSG S + GG+GSG GG+G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 216 SGGSGSGDNGSGGNGSGGSGS 236
G G N GG+G+GG+ S
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLS 82



Score = 48.6 bits (115), Expect = 2e-08
Identities = 33/81 (40%), Positives = 43/81 (53%)

Query: 126 SGGNGSGGNGSGGNGSGGSGSGDNGSGGNGSGGNGSGGSGSGDNGSGGNGSGGNGLGGNG 185
SGG+G G N + SG G G G G +GSG S + GG+GSG + GG+G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 186 SGGNGSGGSGSGDNGSGGNGS 206
G G G+ G +G+GGN S
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLS 82



Score = 48.6 bits (115), Expect = 2e-08
Identities = 40/111 (36%), Positives = 51/111 (45%), Gaps = 2/111 (1%)

Query: 86 SGGNGSGDNGSGGNGSGDNGSGDDGSGGNGSGGNGSGGNGSGGNGSGGNGSGGNGSGGSG 145
SGG+G G N + SG+ G G G G +GSG + GG+GSG + GGSG
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 146 SGDNGSGGNGSGGNGSGGSGSGDNGSGGNGSGGNGLGGNGSGGNGSGGSGS 196
G GGNG+ G GSG G+ + G L G+GG S
Sbjct: 62 HG--NGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAG 110



Score = 48.2 bits (114), Expect = 3e-08
Identities = 35/81 (43%), Positives = 44/81 (54%)

Query: 136 SGGNGSGGSGSGDNGSGGNGSGGNGSGGSGSGDNGSGGNGSGGNGLGGNGSGGNGSGGSG 195
SGG+G G + + SG G G G G +GSG + GG+GSG + GGSG
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 196 SGDNGSGGNGSGGSGSGGNGS 216
G+ G GN GGSG+GGN S
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLS 82



Score = 47.8 bits (113), Expect = 3e-08
Identities = 36/81 (44%), Positives = 41/81 (50%)

Query: 161 SGGSGSGDNGSGGNGSGGNGLGGNGSGGNGSGGSGSGDNGSGGNGSGGSGSGGNGSGGSG 220
SGG G G N + SG G G G G GSG + GGSGSG + GGSG
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 221 SGDNGSGGNGSGGSGSGDNGS 241
G+ G GN GGSG+G N S
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLS 82



Score = 47.0 bits (111), Expect = 6e-08
Identities = 33/81 (40%), Positives = 42/81 (51%)

Query: 81 SGGNGSGGNGSGDNGSGGNGSGDNGSGDDGSGGNGSGGNGSGGNGSGGNGSGGNGSGGNG 140
SGG+G G N + SG G G G G +GSG + GG+GSG + GG+G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 141 SGGSGSGDNGSGGNGSGGNGS 161
G G N GG+G+GGN S
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLS 82



Score = 46.2 bits (109), Expect = 1e-07
Identities = 33/81 (40%), Positives = 40/81 (49%)

Query: 146 SGDNGSGGNGSGGNGSGGSGSGDNGSGGNGSGGNGLGGNGSGGNGSGGSGSGDNGSGGNG 205
SG +G G N + SG G G G G +G G + GGSGSG + GG+G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 206 SGGSGSGGNGSGGSGSGDNGS 226
G G GN GGSG+G N S
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLS 82



Score = 45.9 bits (108), Expect = 1e-07
Identities = 33/80 (41%), Positives = 40/80 (50%)

Query: 171 SGGNGSGGNGLGGNGSGGNGSGGSGSGDNGSGGNGSGGSGSGGNGSGGSGSGDNGSGGNG 230
SGG+G G N + SG G +G G G +GSG S GGSGSG + GG+G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 231 SGGSGSGDNGSGGSGSQGGN 250
G G N GGSG+ G
Sbjct: 62 HGNGGGNGNSGGGSGTGGNL 81



Score = 45.5 bits (107), Expect = 2e-07
Identities = 32/81 (39%), Positives = 42/81 (51%)

Query: 76 SGDNGSGGNGSGGNGSGDNGSGGNGSGDNGSGDDGSGGNGSGGNGSGGNGSGGNGSGGNG 135
SG +G G N + SG+ G G G G DGSG + GG+GSG + GG+G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 136 SGGNGSGGSGSGDNGSGGNGS 156
G G G+ G +G+GGN S
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLS 82



Score = 45.5 bits (107), Expect = 2e-07
Identities = 33/81 (40%), Positives = 39/81 (48%)

Query: 71 SDGSGSGDNGSGGNGSGGNGSGDNGSGGNGSGDNGSGDDGSGGNGSGGNGSGGNGSGGNG 130
S G G G N + SG G G G G +GSG GG+GSG + GG+G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 131 SGGNGSGGNGSGGSGSGDNGS 151
G G GN GGSG+G N S
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLS 82



Score = 42.8 bits (100), Expect = 1e-06
Identities = 30/81 (37%), Positives = 42/81 (51%)

Query: 66 SDGNGSDGSGSGDNGSGGNGSGGNGSGDNGSGGNGSGDNGSGDDGSGGNGSGGNGSGGNG 125
S G+G + + SG G G G G +GSG + + GG+GSG + GG+G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 126 SGGNGSGGNGSGGNGSGGSGS 146
G G GN GG+G+GG+ S
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLS 82



Score = 38.5 bits (89), Expect = 3e-05
Identities = 34/83 (40%), Positives = 39/83 (46%), Gaps = 3/83 (3%)

Query: 182 GGNGSGGNGSGGSGSGDNGSGGNGSGGSGSGGNGSGGSGSGDNGSGGNGSGGSGSGDNGS 241
GG+G G N S SG N +GG G G G + G S +N G G GSG G
Sbjct: 3 GGDGRGHNTGAHSTSG-NINGGPTGLGVGGGASDGSGWSSENNPWG--GGSGSGIHWGGG 59

Query: 242 GGSGSQGGNRVKGDSSSQGGNGS 264
G G+ GGN G S GGN S
Sbjct: 60 SGHGNGGGNGNSGGGSGTGGNLS 82



Score = 35.5 bits (81), Expect = 3e-04
Identities = 27/75 (36%), Positives = 35/75 (46%), Gaps = 2/75 (2%)

Query: 57 NGGSQDDSGSDGNGSDGSGSGDNGSGGNGSGGNGSGDNGSGGNGSGDNGSGDDGSGGNGS 116
N G+ SG+ G G G G S G+G + G G+G G G+GG
Sbjct: 10 NTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGN- 68

Query: 117 GGNGSGGNGSGGNGS 131
GN GG+G+GGN S
Sbjct: 69 -GNSGGGSGTGGNLS 82



Score = 31.2 bits (70), Expect = 0.005
Identities = 21/53 (39%), Positives = 23/53 (43%)

Query: 55 PGNGGSQDDSGSDGNGSDGSGSGDNGSGGNGSGGNGSGDNGSGGNGSGDNGSG 107
G G S S N G GSG G GSG G NG+ G GSG G+
Sbjct: 29 VGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNL 81


68  BALH_0438  BALH_0444  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0438  -1      14       -0.287429  PTS system N-acetylglucosamine-specific
BALH_0439  0       14       -0.788266  penicillin-binding protein
BALH_0440  -1      14       0.065171   hypothetical protein
BALH_0441  -3      12       0.356911   hypothetical protein
BALH_0442  -2      12       0.021456   cell wall biosynthesis glycosyltransferase
BALH_0443  -2      12       -0.167200  UDP-glucose/GDP-mannose dehydrogenase family
BALH_0444  -2      12       -0.131561  UDP-glucose 4-epimerase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0438  PF07299  33     0.002    Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 32.9 bits (75), Expect = 0.002
Identities = 26/159 (16%), Positives = 58/159 (36%), Gaps = 29/159 (18%)

Query: 84 YLVLQNTTNALSKTYSAAELNDKLKSVQ--------DLVGSVDPTK--LADTMTKV---S 130
Y +++ L+ ++ A +++++ + ++ + L DT+ V
Sbjct: 16 YNFIKSQAYILANGHATANDRGVIQALKSLAIEKIIHVFENLTDEQKELIDTVLTVQNRE 75

Query: 131 KAAALTPKINMAILG-GIIAGVVAGLLYNKFHKIKLPEW-------LGFFA------GKR 176
A + KIN ++ + L+ K K+KLP+ L + + ++
Sbjct: 76 DAESFLLKINPYVIPFQEVTAQTLKKLFPKAKKLKLPDMEELDMKELSYLSWIDKGSSRK 135

Query: 177 FVPIITSIVMLLLGLVFGQIWPTIQSGIDAVAHGIVNLG 215
F II + G + I ++ HG +G
Sbjct: 136 F--IIAKNDKNKFVGLQGTFQSLNKKSICSLCHGHEEVG 172


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_0439  TONBPROTEIN  31     0.009    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 30.7 bits (69), Expect = 0.009
Identities = 13/55 (23%), Positives = 21/55 (38%), Gaps = 4/55 (7%)

Query: 61 AEMKKQEELKKKEEKKKQEAKKQKEQKEQVKQAQAVEKPAEGPPQEVNENAQLDQ 115
A + ++ K + K K K Q++ K VK + P E A+L
Sbjct: 85 APVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVE----SRPASPFENTAPARLTS 135


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_0443  NUCEPIMERASE  30     0.016    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 30.1 bits (68), Expect = 0.016
Identities = 9/22 (40%), Positives = 15/22 (68%)

Query: 13 GYVGLPLAVHLAERGHTVLGLD 34
G++G ++ L E GH V+G+D
Sbjct: 10 GFIGFHVSKRLLEAGHQVVGID 31


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_0444  NUCEPIMERASE  211    1e-68    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 211 bits (538), Expect = 1e-68
Identities = 96/352 (27%), Positives = 157/352 (44%), Gaps = 61/352 (17%)

Query: 4 KCLITGGAGFIGSHLAEELVGRGYNVTIVDNFYKGKNKYHDELMKEIRV----------I 53
K L+TG AGFIG H+++ L+ G+ V +DN N Y+D +K+ R+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNL----NDYYDVSLKQARLELLAQPGFQFH 57

Query: 54 PISVLDKNSIYELVNQH--DVVFHLAAILGVKTTMEKSIELIETNFDGTRNILQAAL-NG 110
I + D+ + +L + VF L V+ ++E ++N G NIL+ N
Sbjct: 58 KIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNK 117

Query: 111 KKKVVFASTSEVYGK-AKPPFSEEG--DR---LYGATSKIRWSYAVCKTLEETLCLGYA- 163
+ +++AS+S VYG K PFS + D LY AT K E + Y+
Sbjct: 118 IQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAAT----------KKANELMAHTYSH 167

Query: 164 LEGLPVTIVRYFNIYGPRAKDGPYAGVIPRFISAALQGEDILVYGDGEQTRCFTYVSDAV 223
L GLP T +R+F +YGP + P + +F A L+G+ I VY G+ R FTY+ D
Sbjct: 168 LYGLPATGLRFFTVYGPWGR--PDM-ALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIA 224

Query: 224 EATIRAMD------------------EKVNGEIINIGSENEKSIKEVAEVIKKLTKSSSK 265
EA IR D + NIG+ + + + + ++ +K
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAK 284

Query: 266 IVQVPFEEVYPHGFEEIPNRRPDVAKLRELVQFQAEVTWEEGLKETIKWFRE 317
+P + ++ D L E++ F E T ++G+K + W+R+
Sbjct: 285 KNMLPLQP------GDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRD 330


69  BALH_0491  BALH_0500  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0491  1       9        -1.720379  internalin protein
BALH_0492  -1      10       -0.851454  acetyltransferase
BALH_0494  -1      9        -0.820295  glycine betaine transporter
BALH_0495  0       11       -0.888337  collagenase ColA
BALH_0496  0       11       -0.262122  hypothetical protein
BALH_0497  0       12       -0.137457  hypothetical protein
BALH_0498  -1      13       -0.914003  methyl-accepting chemotaxis protein
BALH_0499  -1      14       -0.454594  sensor histidine kinase
BALH_0500  -1      15       -0.032128  response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_0491  IGASERPTASE  55     1e-09    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 55.5 bits (133), Expect = 1e-09
Identities = 38/252 (15%), Positives = 80/252 (31%), Gaps = 7/252 (2%)

Query: 858 TQNIVAKEEPKEPVEEVEGSKEEPIKEAEGSKEEPKEPAKEVEGSKEEVKESKEEVKEPA 917
T N + + P P E ++ + EA P P++ E E K+ + V++
Sbjct: 999 TPNNIQADVPSVPSNNEEIAR---VDEAPVPPPAPATPSETTETVAENSKQESKTVEKNE 1055

Query: 918 KEVEGPKEEVKESAKEVEGPKEEVKEPAKEVEGPKEEVKEPAKEVEGPKEEVKEPTKEVE 977
++ + +E AKE + EV E KE V++ K
Sbjct: 1056 QDATETTAQNREVAKE-AKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKV 1114

Query: 978 GPKEEVKEPTKEVEGPKEEVKEPMKEVEGSKEEVKEPTKEAEGLKEEVKEPTTEVEGSKE 1037
++ + P + ++ + + + +PT + + + + +KE
Sbjct: 1115 ETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKE 1174

Query: 1038 VKEPGKEVEGSK---DAINQSAVAQETNVNNQVGKEKVVENQNMKENKPAVTKQEESKKS 1094
++ + N E E+ N +N+ + +
Sbjct: 1175 TSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNV 1234

Query: 1095 LGATGGQENTST 1106
AT + ST
Sbjct: 1235 EPATTSSNDRST 1246



Score = 50.4 bits (120), Expect = 4e-08
Identities = 34/223 (15%), Positives = 73/223 (32%), Gaps = 4/223 (1%)

Query: 859 QNIVAKEEPKEPVEEVEGSKEEPIKEAEGSKEEPKEPAKEVEGSKE--EVKESKEEVKEP 916
E K+ + VE ++++ + ++E KE V+ + + EV +S E KE
Sbjct: 1036 TTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKET 1095

Query: 917 AKEVEGPKEEVKESAKEVEGPKEEVKEPAKEVEGPKEEVKEPAKEVEGPKEEVKEPTKEV 976
V++ K ++ + P + ++ + + + +PT +
Sbjct: 1096 QTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNI 1155

Query: 977 EGPKEEVKEPTKEVEGPKEEVKEPMKEVEGSKEEVK--EPTKEAEGLKEEVKEPTTEVEG 1034
+ P+ + + KE + V S + E +PT E
Sbjct: 1156 KEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSES 1215

Query: 1035 SKEVKEPGKEVEGSKDAINQSAVAQETNVNNQVGKEKVVENQN 1077
S + K + S + A + + + N N
Sbjct: 1216 SNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTN 1258



Score = 39.3 bits (91), Expect = 9e-05
Identities = 44/247 (17%), Positives = 80/247 (32%), Gaps = 12/247 (4%)

Query: 858 TQNIVAKEEPKEPVEEVEGSKEEPIKEAEGSKEEPKEPAKEVEGSKEEVKESKEEVKEPA 917
TQ KE VE+ E +K E K E K + K+ + + + +P
Sbjct: 1095 TQTTETKETAT--VEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPT 1152

Query: 918 KEVEGPKEEVKESAKEVEGPKEEVKEPAKEVEGPKEEVKEPAKEVEGPKEEVKEPTKEVE 977
++ P+ + +A + E P +E ++ V VE P+ T+
Sbjct: 1153 VNIKEPQSQTNTTA-DTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTV 1211

Query: 978 GPKEEVKEPTKEVEGPKEEVKEPMKEVEGSKEEVKEPTKEAEGLKEEVKEPTTEVEGSKE 1037
+ K + + V+ VE + + L + T V
Sbjct: 1212 NSESSNKPKNRH----RRSVRSVPHNVEPAT--TSSNDRSTVALCDLTSTNTNAVLSDAR 1265

Query: 1038 VKEPGKEVEGSKDAINQSAVAQETNVNNQVGKEKVVENQNMKENKPAVTKQEESKKSLGA 1097
K + K + + +NN+ V N +M +N + + S KS
Sbjct: 1266 AKAQFVALNVGKAVSQHIS---QLEMNNEGQYNVWVSNTSMNKNYSSSQYRRFSSKSTQT 1322

Query: 1098 TGGQENT 1104
G + T
Sbjct: 1323 QLGWDQT 1329


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_0495  MICOLLPTASE  759    0.0      Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 759 bits (1961), Expect = 0.0
Identities = 412/888 (46%), Positives = 571/888 (64%), Gaps = 20/888 (2%)

Query: 94 YSMADLNKMNNQELVETLGSIKWHQITDLFQFNEDAKAFYKDKGKMQVVIDELAHRGSTF 153
Y+ +LN+MN +LVE + +I + + DLF FN+ + F+ ++ ++Q +I L G T+
Sbjct: 93 YTFDELNRMNYSDLVELIKTISYENVPDLFNFNDGSYTFFSNRDRVQAIIYGLEDSGRTY 152

Query: 154 TKDDSKGIQTFTEVLRSAFYLAFYNNELSELNERSFQDKCLPALKAIAKNPNFKLGTAEQ 213
T DD KGI T E LR+ +YL FYN +LS LN +++CLPA+KAI N NF+LGT Q
Sbjct: 153 TADDDKGIPTLVEFLRAGYYLGFYNKQLSYLNTPQLKNECLPAMKAIQYNSNFRLGTKAQ 212

Query: 214 DTVVSAYGKLISNASSDVETVQYASNILKQYNDNFTTYVNDRMKGQAIYDIMQGIDYDIQ 273
D VV A G+LI NAS+D E + +L + DN Y ++ KG A++++M+GIDY
Sbjct: 213 DGVVEALGRLIGNASADPEVINNCIYVLSDFKDNIDKYGSNYSKGNAVFNLMKGIDYYTN 272

Query: 274 SYLIEARKE-ANETMWYGKVDGFINEINRIALL-NEVTQENKWLVNNGIYFASRLGKFHS 331
S + + A T +Y ++D ++ + + + +++ +N WLVNN +Y+ R+GKF
Sbjct: 273 SVIYNTKGYDAKNTEFYNRIDPYMERLESLCTIGDKLNNDNAWLVNNALYYTGRMGKFRE 332

Query: 332 NPNKGLEVVTQAMHMYPRLSEPYFVAVEQITTNYNGKDYSGNTVDLEKIRKEGKEQYLPK 391
+P+ + +AM YP LS Y A + N+ GK+ SGN +D KI+ + +E+YLPK
Sbjct: 333 DPSISQRALERAMKEYPYLSYQYIEAANDLDLNFGGKNSSGNDIDFNKIKADAREKYLPK 392

Query: 392 TYTFDDGSIVFKTGDKVSEEKIKRLYWAAKEVKAQYHRVIGNDKALEPGNADDILTIVIY 451
TYTFDDG V K GDKV+EEKIKRLYWA+KEVKAQ+ RV+ NDKALE GN DDILT+VIY
Sbjct: 393 TYTFDDGKFVVKAGDKVTEEKIKRLYWASKEVKAQFMRVVQNDKALEEGNPDDILTVVIY 452

Query: 452 NSPEEYQLNRQLYGYETNNGGIYIEETGTFFTYERTPEQSIYSLEELFRHEFTHYLQGRY 511
NSPEEY+LNR + G+ T+NGGIYIE GTFFTYERTPE+SIY+LEELFRHEFTHYLQGRY
Sbjct: 453 NSPEEYKLNRIINGFSTDNGGIYIENIGTFFTYERTPEESIYTLEELFRHEFTHYLQGRY 512

Query: 512 EVPGLFGRGDMYQNERLTWFQEGNAEFFAGSTRTNNVVPRKSIISGLSSDPASRYTAERT 571
VPG++G+G+ YQ LTW++EG AEFFAGSTRT+ + PRKS+ GL+ D +R +
Sbjct: 513 VVPGMWGQGEFYQEGVLTWYEEGTAEFFAGSTRTDGIKPRKSVTQGLAYDRNNRMSLYGV 572

Query: 572 LFAKYGSWDFYNYSFALQSYLYTHQFETFDKIQDLIRANDVKNYDAYRENLSKDLKLNEE 631
L AKYGSWDFYNY FAL +Y+Y + F+K+ + I+ NDV Y Y ++S D LN++
Sbjct: 573 LHAKYGSWDFYNYGFALSNYMYNNNMGMFNKMTNYIKNNDVSGYKDYIASMSSDYGLNDK 632

Query: 632 YQEYMQQLIDNQDKYNVPEVADDYLAEHAPKSLTAVEKEITETLPMKDAKMTKHSSQFFN 691
YQ+YM L++N D +VP V+D+Y+ H K + + +I E +KD SQFF
Sbjct: 633 YQDYMDSLLNNIDNLDVPLVSDEYVNGHEAKDINEITNDIKEVSNIKDLSSNVEKSQFFT 692

Query: 692 TFTLEGTYTGSVTKGESEDWNAMSKKVNEALEQLAQKEWSGYKTVTAYFVNYRVNSSNQF 751
T+ + GTY G ++GE DW M+ K+N+ L++L++K W+GYKTVTAYFVN++V+ + +
Sbjct: 693 TYDMRGTYVGGRSQGEENDWKDMNSKLNDILKELSKKSWNGYKTVTAYFVNHKVDGNGNY 752

Query: 752 EYDVVFHG----IAKDDEENKAPTVHINGPYNGLVKEGIQFKSDGSKDEDGKIVSYLWDF 807
YDVVFHG D NK P I + +V+E I F SKDEDG+I +Y WDF
Sbjct: 753 VYDVVFHGMNTDTNTDVHVNKEPKAVIKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDF 812

Query: 808 GDGSTSAEVNPLHVYEREGSYKVALIVKDDKGKESKSETTVTVKDGS------LTESEPN 861
GDG S E H Y + G Y+V L V D+ G + + +K + ESEPN
Sbjct: 813 GDGEKSNEAKATHKYNKTGEYEVKLTVTDNNG--GINTESKKIKVVEDKPVEVINESEPN 870

Query: 862 NRPEEANRIG-LNTTIKGSLIGGDHTDVYTFNVASAKNIDISVLNEYGIGMTWVLHHESD 920
N E+AN+I N +KG+L D++D Y F+VA N+ I++ N +G+TW L+ E D
Sbjct: 871 NDFEKANQIAKSNMLVKGTLSEEDYSDKYYFDVAKKGNVKITLNNLNSVGITWTLYKEGD 930

Query: 921 MQNYAAYGQVNGNHI---EANFNAKPGKYYLYVYKYDNGDGTYELSVK 965
+ NY Y GN + +PG+YYL VY YDN GTY ++VK
Sbjct: 931 LNNYVLYA--TGNDGTVLKGEKTLEPGRYYLSVYTYDNQSGTYTVNVK 976



Score = 96.7 bits (240), Expect = 2e-22
Identities = 58/250 (23%), Positives = 99/250 (39%), Gaps = 47/250 (18%)

Query: 762 KDDEENKAPTVHINGPYNGLVKEGIQFKSD----GSKDEDGKIVSYLWDF---------- 807
K E+ ++ + P N K KS+ G+ E+ Y +D
Sbjct: 854 KVVEDKPVEVINESEPNNDFEKANQIAKSNMLVKGTLSEEDYSDKYYFDVAKKGNVKITL 913

Query: 808 ---------------GDGST-SAEVNPLHVYEREGSYKVA-----LIVKDDKGKES---- 842
GD + +G + L V +
Sbjct: 914 NNLNSVGITWTLYKEGDLNNYVLYATGNDGTVLKGEKTLEPGRYYLSVYTYDNQSGTYTV 973

Query: 843 ------KSETTVTVKDGSLTESEPNNRPEEANRIGLNTTIKGSLIGGDHTDVYTFNVASA 896
K+E T KD ++ E E NN ++A ++ N+ I G+L D D+Y+ ++ +
Sbjct: 974 NVKGNLKNEVKETAKD-AIKEVENNNDFDKAMKVDSNSKIVGTLSNDDLKDIYSIDIQNP 1032

Query: 897 KNIDISVLNEYGIGMTWVLHHESDMQNYAAYGQVNGNHIEANFNAKPGKYYLYVYKYDN- 955
+++I V N I M W+L+ D+ NY Y +GN + PGKYYL VY+++N
Sbjct: 1033 SDLNIVVENLDNIKMNWLLYSADDLSNYVDYANADGNKLSNTCKLNPGKYYLCVYQFENS 1092

Query: 956 GDGTYELSVK 965
G G Y ++++
Sbjct: 1093 GTGNYIVNLQ 1102


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_0497  IGASERPTASE  41     2e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.8 bits (95), Expect = 2e-05
Identities = 30/194 (15%), Positives = 67/194 (34%), Gaps = 12/194 (6%)

Query: 201 QPQIATVKRDATIANAEREKEARIEKARAEKEAKEAEYQRDAQIAEAEKHKELKVQSYKR 260
P + T AE K+ + E++A E Q EA+ + + Q+ +
Sbjct: 1026 PPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEV 1085

Query: 261 EQEQARADADLSYELQQAKAQQGVTEEQMRVKIIEREKQIELEEKEIARREKQYDAEVKK 320
Q + E Q + ++ T E+ E + ++E E+ + + + ++
Sbjct: 1086 AQSGSETK-----ETQTTETKETATVEK------EEKAKVETEKTQEVPKVTSQVSPKQE 1134

Query: 321 KADADRYAVEQSAEAEKVKQIKKADADQYKIEAEARARAEEVRVEGLAKAEIEKAQGQAK 380
+++ + E + E + IK+ + A+ A+E
Sbjct: 1135 QSETVQPQAEPARENDPTVNIKEPQSQTNT-TADTEQPAKETSSNVEQPVTESTTVNTGN 1193

Query: 381 AEVQKAQGTAEADV 394
+ V+ + T A
Sbjct: 1194 SVVENPENTTPATT 1207


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0499  PF06580  30     0.019    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 30.2 bits (68), Expect = 0.019
Identities = 22/105 (20%), Positives = 42/105 (40%), Gaps = 19/105 (18%)

Query: 430 ILGNLITNAFE-AIERNEEHNKKVRMFVTDIGEEIIIEVEDSGQGVHDEIITSIFYKGFS 488
++ L+ N + I + + K + + T + +EVE++G
Sbjct: 259 LVQTLVENGIKHGIAQLPQGGK-ILLKGTKDNGTVTLEVENTGSLALKN----------- 306

Query: 489 TKEGGKRGYGLAKVKELVEDLNG---SIAIEKGDLGGALFIIALP 530
TKE G GL V+E ++ L G I + + G ++ +P
Sbjct: 307 TKEST--GTGLQNVRERLQMLYGTEAQIKLSEKQ-GKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_0500  HTHFIS  78     1e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 77.6 bits (191), Expect = 1e-18
Identities = 23/107 (21%), Positives = 44/107 (41%), Gaps = 2/107 (1%)

Query: 7 VLIVEDDIRIADIHRRFTEKVEGFKVIGTATTGEQAKEWLDLVKPQLVLLDVYLPDMQGT 66
+L+ +DD I + + + G+ V + W+ LV+ DV +PD
Sbjct: 6 ILVADDDAAIRTVLNQALSR-AGYDVR-ITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 67 ELVTYIRHHLHDTDIIMITAASETDVVKHALRGGVTDYIVKPLMFDR 113
+L+ I+ D +++++A + A G DY+ KP
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTE 110


70  BALH_0506  BALH_0517  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0506  -3      14       -0.855975  glycerol-3-phosphate ABC transporter ATP-binding
BALH_0507  -3      14       -1.205804  glycerol-3-phosphate ABC transporter permease
BALH_0508  -1      15       -1.052740  glycerol-3-phosphate ABC transporter permease
BALH_0509  1       17       -0.925134  ABC transporter glycerol-3-phosphate periplasmic
BALH_0510  2       16       -1.229592  serine/threonine protein phosphatase
BALH_0511  2       16       -1.499631  response regulator
BALH_0512  2       17       -1.737871  sensor histidine kinase
BALH_0513  0       13       -1.393670  hypothetical protein
BALH_0514  -1      12       -0.754144  hypothetical protein
BALH_0515  -1      13       -0.760918  methyl-accepting chemotaxis protein
BALH_0516  -2      12       -0.575916  sensory histidine kinase DcuS
BALH_0517  -2      13       -0.207357  response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0506  PF05272  33     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.7 bits (74), Expect = 0.002
Identities = 13/32 (40%), Positives = 16/32 (50%)

Query: 44 VLVGPSGCGKSTLLRMIAGLEEISSGDLIINE 75
VL G G GKSTL+ + GL+ S I
Sbjct: 600 VLEGTGGIGKSTLINTLVGLDFFSDTHFDIGT 631


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
BALH_0509  MALTOSEBP  39     3e-05    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 38.9 bits (90), Expect = 3e-05
Identities = 71/327 (21%), Positives = 119/327 (36%), Gaps = 43/327 (13%)

Query: 131 IKKDKYDTSKLEKAITNYYSVDGKMYSMPFNSSTPVLIYNKDAFAKAGLDPEKAPKTYAE 190
I DK KL + +GK+ + P LIYNKD PKT+ E
Sbjct: 105 ITPDKAFQDKLYPFTWDAVRYNGKLIAYPIAVEALSLIYNKDLLP-------NPPKTWEE 157

Query: 191 LQEAAKKLTIKEGGNVKQYGFSMLNYGWFFEELLATQGALYVDNENGRKDAAKKAVFNGK 250
+ K+L K G + + + W L+A G ENG+ D V N
Sbjct: 158 IPALDKELKAK-GKSALMFNLQEPYFTW---PLIAADGGYAFKYENGKYDIKDVGVDNAG 213

Query: 251 EGQKVFGMLDELNKAGALGKYGASWDDIRAAFQSGQVAVYLDSSAGIRDLIDASKFNVGV 310
+ ++D + + AAF G+ A+ ++ + ID SK N GV
Sbjct: 214 AKAGLTFLVDLIKNKHM--NADTDYSIAEAAFNKGETAMTINGPWAWSN-IDTSKVNYGV 270

Query: 311 SYIPYPEDSKQN---GVVIGGASLWMTNMVSEETQQGAWDFMKYLTKPDVQAKWHTATGY 367
+ +P + GV+ G + N K L K ++ T G
Sbjct: 271 TVLPTFKGQPSKPFVGVLSAGINAASPN--------------KELAKEFLENYLLTDEGL 316

Query: 368 FSINPD----AYNEPLVKEQYEKYPQLKVTVDQLQATKQSPATQGALISVFPESRDAVVK 423
++N D A +E+ K P++ T++ Q +G ++ P+
Sbjct: 317 EAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQ--------KGEIMPNIPQMSAFWYA 368

Query: 424 ALEAMYDGKNSKEALDEAAKATDRAIS 450
A+ + + ++ +DEA K I+
Sbjct: 369 VRTAVINAASGRQTVDEALKDAQTRIT 395


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_0511  HTHFIS  95     1e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.9 bits (236), Expect = 1e-24
Identities = 35/140 (25%), Positives = 68/140 (48%), Gaps = 2/140 (1%)

Query: 11 RLLVVEDNASLLESIVQILHDE-FEVDTALNGEEGLFLALQNIYDVILLDVMMPEMDGFE 69
+LV +D+A++ + Q L ++V N D+++ DV+MP+ + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 70 VIQKIRDEKIETPVLFLTARDSLEDRVKGLDFGGDDYIVKPFQAPELKARI-RALLRRSG 128
++ +I+ + + PVL ++A+++ +K + G DY+ KPF EL I RAL
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 129 SLTTKQTIRYKGIELFGKDK 148
+ + G+ L G+
Sbjct: 125 RPSKLEDDSQDGMPLVGRSA 144


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0512  PF06580  39     3e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.7 bits (90), Expect = 3e-05
Identities = 35/198 (17%), Positives = 72/198 (36%), Gaps = 53/198 (26%)

Query: 234 TISKECRRLSKLVANLLL---------LARSDSNQIEMDKKIFELDKLLEEIVEPYKEIA 284
I +L L S++ Q+ + ++ +V+ Y ++A
Sbjct: 181 NIRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADEL--------TVVDSYLQLA 232

Query: 285 SYQEKEMILKVEHDISFMGDRERIHQMMV------ILLDNAMKY----TNEGGHIQIDCT 334
S Q ++ L+ E+ I+ I + V L++N +K+ +GG I + T
Sbjct: 233 SIQFEDR-LQFENQIN-----PAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGT 286

Query: 335 QTNSSIRIRVKDDGIGVKREDIPKLFDRFYQGDKARSTSEGAGLGLSIANWIVEKHYGK- 393
+ N ++ + V++ G ++T E G GL ++ YG
Sbjct: 287 KDNGTVTLEVENTG-----------------SLALKNTKESTGTGLQNVRERLQMLYGTE 329

Query: 394 --ISVESRWGEGTCFEVI 409
I + + G+ +I
Sbjct: 330 AQIKLSEKQGKVNAMVLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0516  PF06580  35     6e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.8 bits (80), Expect = 6e-04
Identities = 19/99 (19%), Positives = 40/99 (40%), Gaps = 19/99 (19%)

Query: 454 LIDNALE-AVTNCEKK-QVEVEIQY-GNTLIITVQDTGKGIQEKEIEALFTKGYSTKGDN 510
L++N ++ + + ++ ++ T+ + V++TG + E +
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE------------S 310

Query: 511 RGYGLYLVKESIQRINGE---IHMHSLVGKGTKITIEIP 546
G GL V+E +Q + G I + GK + IP
Sbjct: 311 TGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_0517  HTHFIS  80     1e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 80.3 bits (198), Expect = 1e-19
Identities = 32/129 (24%), Positives = 59/129 (45%), Gaps = 5/129 (3%)

Query: 2 IKVLIVEDDPMVAMLNAHYLEQVGGFELVQAVNSVKSAIEVLEESRIDLVLLDIFMPEET 61
+L+ +DD + + L + G V+ ++ + + DLV+ D+ MP+E
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GFELLMYIRNQEKEIDIMMISAVHDMGSIKKALQYGVVDYLIKPFTFERFKEALTIYREK 121
F+LL I+ ++ ++++SA + + KA + G DYL KPF E + I
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT---ELIGIIGRA 118

Query: 122 LTFMKEQQK 130
L K +
Sbjct: 119 LAEPKRRPS 127


71  BALH_0524  BALH_0527  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0524  -2      10       0.239044   acetyltransferase
BALH_0525  -2      8        0.495196   sensor histidine kinase
BALH_0526  -2      8        0.285432   response regulator
BALH_0527  -1      10       0.915003   acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_0524  SACTRNSFRASE  38     5e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 38.0 bits (88), Expect = 5e-06
Identities = 20/87 (22%), Positives = 31/87 (35%), Gaps = 4/87 (4%)

Query: 59 GAFKDGKLIGVATLETKPYVKQEHKAKIGSVYVSPKARGLGAGKALIKECLELAKSLEVE 118
+ + IG + + A I + V+ R G G AL+ + +E AK
Sbjct: 69 LYYLENNCIGRIKIRSN----WNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFC 124

Query: 119 QVMLDVVVGNDGAKKLYESLGFKTFGV 145
+ML+ N A Y F V
Sbjct: 125 GLMLETQDINISACHFYAKHHFIIGAV 151


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_0525  60KDINNERMP  31     0.014    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 30.7 bits (69), Expect = 0.014
Identities = 23/87 (26%), Positives = 35/87 (40%), Gaps = 9/87 (10%)

Query: 152 KKSKFITTVSP-IHTTEFQGKLYMLLKTSFLENMLLKLMKQFLIISVLTIILTTISVFIF 210
+ + V+P + T G L+ + + F LLK + F+ +II+ T V
Sbjct: 312 EIQDKMAAVAPHLDLTVDYGWLWFISQPLF---KLLKWIHSFVGNWGFSIIIITFIV--- 365

Query: 211 SRVITEPL-IKMKRATEKMSKLNKPIQ 236
R I PL + KM L IQ
Sbjct: 366 -RGIMYPLTKAQYTSMAKMRMLQPKIQ 391


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_0526  HTHFIS  95     1e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.9 bits (236), Expect = 1e-24
Identities = 34/121 (28%), Positives = 64/121 (52%), Gaps = 1/121 (0%)

Query: 7 KILLVDDEERMLRLLDLFLSPRGYFCMKATSGLEALKLIEQKDFDIILLDVMMPNMDGWD 66
IL+ DD+ + +L+ LS GY ++ + I D D+++ DV+MP+ + +D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 67 TCYQIRQI-SNVPIIMLTARNQNYDMVKGLTMGADDYITKPFDEHVLVARIEAILRRTKK 125
+I++ ++P+++++A+N +K GA DY+ KPFD L+ I L K+
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 126 D 126

Sbjct: 125 R 125


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_0527  SACTRNSFRASE  43     1e-07    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 42.6 bits (100), Expect = 1e-07
Identities = 29/123 (23%), Positives = 42/123 (34%), Gaps = 7/123 (5%)

Query: 37 TKNPEAFSSSYEDVLKHEDPVAAMAKRLSNPDKYTLGVFKDKDLIGIATLETKPFIKQEH 96
T E FS Y K + + K + + + IG + +
Sbjct: 36 TYTEERFSKPY---FKQYEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRSN----WNG 88

Query: 97 KAKIGSVFVSPKARGLGAGRALIKAIIENADKLHVEQLMLDVVVGNDAAKKLYESLGFQT 156
A I + V+ R G G AL+ IE A + H LML+ N +A Y F
Sbjct: 89 YALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFII 148

Query: 157 YGV 159
V
Sbjct: 149 GAV 151


72  BALH_0552  BALH_0559  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0552  0       14       0.618951   iron(III) dicitrate ABC transporter periplasmic
BALH_0554  0       14       1.457431   iron compound ABC transporter permease
BALH_0555  0       13       1.559807   iron(III) dicitrate ABC transporter permease
BALH_0556  0       14       0.679631   iron(III) dicitrate ABC transporter ATP-binding
BALH_0557  0       18       0.214723   SAM-dependent methyltransferase
BALH_0558  0       16       0.037079   2-amino-3-ketobutyrate CoA ligase
BALH_0559  -2      15       0.582549   L-threonine 3-dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_0552  FERRIBNDNGPP  99     4e-26    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 98.9 bits (246), Expect = 4e-26
Identities = 65/295 (22%), Positives = 111/295 (37%), Gaps = 46/295 (15%)

Query: 15 LLAFSLLLSACGKSNTKEESKEDTKKEMIPVEHAMGKTEVPANPKRVVILTNEGTEALLE 74
LL L + NT + D P R+V L E LL
Sbjct: 12 LLTAMALSPLLWQMNTAHAAAID--------------------PNRIVALEWLPVELLLA 51

Query: 75 LGVKPVGAV-----KSWTGDPWYPHIKDKMKDVKVVGDEGQVNVETIASLKPDLIIGNKM 129
LG+ P G + W +P P V VG + N+E + +KP ++ +
Sbjct: 52 LGIVPYGVADTINYRLWVSEPPLP------DSVIDVGLRTEPNLELLTEMKPSFMVWS-A 104

Query: 130 RHEKVYEQLKAIAPTV---FSETLR--GEWKDNFKFYAKALNKEKDGQKVLADYDKRMKD 184
+ E L IAP FS+ + + + A LN + + LA Y+ ++
Sbjct: 105 GYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRS 164

Query: 185 LKAKLGDKVNQEISMVRFM-PGDVRIYHGDTFSGVILKELGFKRPGDQNKDDFAERNVSK 243
+K + + + + + + P + ++ ++ IL E G + + VS
Sbjct: 165 MKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSI 224

Query: 244 ERISAM-DGDVLFYFTFDKGNEKKGSELEKEYINDPLFKNLNAVKNGKAYKVDDV 297
+R++A D DVL FD N K L PL++ + V+ G+ +V V
Sbjct: 225 DRLAAYKDVDVLC---FDHDNSKDMDALMA----TPLWQAMPFVRAGRFQRVPAV 272


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_0554  TYPE3IMSPROT  32     0.004    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 31.7 bits (72), Expect = 0.004
Identities = 24/173 (13%), Positives = 67/173 (38%), Gaps = 16/173 (9%)

Query: 106 GAAFFIVVAIVIFSVTSLSAFTWIAFL-------GAAIAAVLVFASSSLGKEGTTPLKLT 158
A + ++ ++ ++ + + + L + ++ E
Sbjct: 31 STALIVALSAMLMGLSDYYFEHFSKLMLIPAEQSYLPFSQALSYVVDNVLLEFFYLCFPL 90

Query: 159 LAGVAISALFSSLTQGLLVLNEKALE------EVLFWLAGSVQGRKL-EILQSVFPYLLI 211
L A+ A+ S + Q +++ +A++ + + L E L+S+ +L+
Sbjct: 91 LTVAALMAIASHVVQYGFLISGEAIKPDIKKINPIEGAKRIFSIKSLVEFLKSILKVVLL 150

Query: 212 GWIASIMMAGKVNTLMMGEDVAKGLGQRTILMKSFVLLIIVLLSGGSVAVAGP 264
+ I++ G + TL+ + G+ T L+ + ++V+ + G V ++
Sbjct: 151 SILIWIIIKGNLVTLL--QLPTCGIECITPLLGQILRQLMVICTVGFVVISIA 201


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_0555  BORPETOXINA  29     0.020    Bordetella pertussis toxin A subunit signature.
		>BORPETOXINA#Bordetella pertussis toxin A subunit signature.

Length = 269

Score = 29.4 bits (65), Expect = 0.020
Identities = 12/31 (38%), Positives = 20/31 (64%)

Query: 286 PHISRRLVGSLYGALLPVAAIVGAILVLAAD 316
P+ SRR V S+ G L+ +A ++GA + A+
Sbjct: 211 PYTSRRSVASIVGTLVRMAPVIGACMARQAE 241


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_0559  NUCEPIMERASE  88     6e-22    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 87.9 bits (218), Expect = 6e-22
Identities = 57/241 (23%), Positives = 101/241 (41%), Gaps = 17/241 (7%)

Query: 6 KILVTGSLGQIGSELVMKLRD----VYGASNVIA---TDIRETDSEVVTSGPFE--TLDV 56
K LVTG+ G IG + +L + V G N+ +++ E++ F+ +D+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 57 TDGQKLHDIAKRNEVDTIIHLAALLSAT-AEKNPLFAWNLNMGGLVNALEAARELNCKFF 115
D + + D+ + + L+ + +NP + N+ G +N LE R +
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 116 T-PSSIGAFGPSTPKDNTPQDTIQRPTTMYGVNKVAGELLCDYYHQKFGVDTRGVRFPGL 174
SS +G + + D++ P ++Y K A EL+ Y +G+ G+RF
Sbjct: 122 LYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRF--- 178

Query: 175 ISYVAPPGGGTTDYAVEIYYEAIKKGTYTSYIAEGTYM-DMMYMPDALQAIISLMEADPS 233
V P G D A+ + +A+ +G G D Y+ D +AII L + P
Sbjct: 179 -FTVYGP-WGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPH 236

Query: 234 K 234

Sbjct: 237 A 237


73  BALH_0573  BALH_0580  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0573  1       12       -0.954439  sensor histidine kinase domain-containing
BALH_0574  2       10       0.155456   DNA-binding response regulator
BALH_0575  2       11       0.338934   peptidase Vpr
BALH_0576  0       12       0.298938   camelysin
BALH_0577  1       13       0.597846   CAAX amino terminal protease family protein
BALH_0578  0       14       1.363867   hypothetical protein
BALH_0579  1       13       1.322752   sodium/alanine symporter family protein
BALH_0580  0       15       0.439165   glutamine ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_0573  SECYTRNLCASE  31     0.016    Preprotein translocase SecY subunit signature.
		>SECYTRNLCASE#Preprotein translocase SecY subunit signature.

Length = 437

Score = 30.9 bits (70), Expect = 0.016
Identities = 14/64 (21%), Positives = 28/64 (43%), Gaps = 7/64 (10%)

Query: 46 ILLLIFMGITEYFPVRFWRGTSSLTFPIIYAMSWQFGIHITIIAIVLVTLIIHLH---RR 102
+ +L+F+ I FP W + A W + + +++V L++ + RR
Sbjct: 190 MSILMFISIAATFPSALWA----IKKQGTLAGGWIEFGTVIAVGLIMVALVVFVEQAQRR 245

Query: 103 SPIQ 106
P+Q
Sbjct: 246 IPVQ 249


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_0574  HTHFIS  73     3e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 72.9 bits (179), Expect = 3e-17
Identities = 28/115 (24%), Positives = 48/115 (41%), Gaps = 2/115 (1%)

Query: 5 IKILVVDDHAFLRDAIRSILEDESDMNVVGEASSGDGVLEKVEACRPDCILMDINLPGKN 64
ILV DD A +R + L +V S+ + + A D ++ D+ +P +N
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 65 GIEATELVKKNYPNCRVLVFTMYEHDEYLMDALQAGADGYLLKDSSSEQVVAAIR 119
+ +KK P+ VLV + + A + GA YL K +++ I
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
BALH_0575  SUBTILISIN  163    2e-46    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 163 bits (415), Expect = 2e-46
Identities = 72/229 (31%), Positives = 100/229 (43%), Gaps = 44/229 (19%)

Query: 239 VPQIGVDKLHDEKITGKGIKVGVLDTGIDYNHPDLKDAYKGYRAKQGEDPSKIDPNSIKG 298
V I + + + G+G+KV VLDTG D +HPDLK I G
Sbjct: 26 VEMIQAPAVWN-QTRGRGVKVAVLDTGCDADHPDLKARI------------------IGG 66

Query: 299 WDFVNNDADPMETTYKDWQNSGGYPEIYDGSAYYTSHGTHVAGTIAGDKQNSVDYAVKGV 358
+F ++D E HGTHVAGTIA + V GV
Sbjct: 67 RNFTDDDEGDPEIFKDY-----------------NGHGTHVAGTIAA---TENENGVVGV 106

Query: 359 APDVDLYSYRVLGPYGSGQTSGILAAIDKAVKDDMDVINLSLGASINDPLYPTSIAVNNA 418
AP+ DL +VL GSGQ I+ I A++ +D+I++SLG + P AV A
Sbjct: 107 APEADLLIIKVLNKQGSGQYDWIIQGIYYAIEQKVDIISMSLGGPEDVP--ELHEAVKKA 164

Query: 419 MLAGVVTVVAAGNSGPGEG---TLGSPSAAALPITVGASDAAMTIPTFS 464
+ + ++ + AAGN G G+ LG P I+VGA + FS
Sbjct: 165 VASQILVMCAAGNEGDGDDRTDELGYPGCYNEVISVGAINFDRHASEFS 213



Score = 81.0 bits (200), Expect = 3e-18
Identities = 37/145 (25%), Positives = 57/145 (39%), Gaps = 27/145 (18%)

Query: 596 TEGDHLADFSSRGPATKTDDIKPDIVAPGVSIFSTVPEYINDPKDGENYPVAYGRMSGTS 655
H ++FS+ + D+VAPG I STVP Y SGTS
Sbjct: 204 NFDRHASEFSNSNN-------EVDLVAPGEDILSTVPGG------------KYATFSGTS 244

Query: 656 MATPHTAGVAALILQ-----EHPNYSPFEVKEALMNTAVDLKEARSVFEVGSGRIDAYRA 710
MATPH AG ALI Q + + E+ L+ + L S G+G +
Sbjct: 245 MATPHVAGALALIKQLANASFERDLTEPELYAQLIKRTIPL--GNSPKMEGNGLLYLTAV 302

Query: 711 VHADTAIEVIDKTSNIVNDEEVDIE 735
+ I + + I++ + ++
Sbjct: 303 E-ELSRIFDTQRVAGILSTASLKVK 326


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0580  PF05272  33     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.7 bits (74), Expect = 0.002
Identities = 12/53 (22%), Positives = 22/53 (41%), Gaps = 5/53 (9%)

Query: 65 VVVVVGPSGSGKSTLLRCINQLESITDGELIVQNTEVHNAKTDMNELRRNIGM 117
VV+ G G GKSTL+ + L+ +D ++ K ++ +
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHF-----DIGTGKDSYEQIAGIVAY 645


74  BALH_0609  BALH_0617  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0609  -1      13       1.984693   ribose ABC transporter ATP-binding protein
BALH_0611  1       16       1.794266   ribose ABC transporter permease
BALH_0612  2       16       1.311383   ribose ABC transporter periplasmic-binding
BALH_0613  0       16       0.914657   putative translaldolase
BALH_0614  -1      11       0.750029   immune inhibitor A
BALH_0615  -2      9        0.673105   multidrug efflux transporter
BALH_0616  -2      9        -0.258960  zinc-containing alcohol dehydrogenase
BALH_0617  -2      14       -0.814440  phospholipase C
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_0609  MICOLLPTASE  31     0.013    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 31.2 bits (70), Expect = 0.013
Identities = 23/97 (23%), Positives = 49/97 (50%), Gaps = 6/97 (6%)

Query: 179 IETLFTVINKLRKEGVSFVYIS-HRMEEIFSICD---AITILRDGEYVGKRSIPETSFDE 234
+++L I+ L VS Y++ H ++I I + ++ ++D ++S T++D
Sbjct: 637 MDSLLNNIDNLDVPLVSDEYVNGHEAKDINEITNDIKEVSNIKDLSSNVEKSQFFTTYDM 696

Query: 235 VVSMMVGRSIGERYPER--NSQIGDVIFEMRNGTKKG 269
+ + GRS GE + NS++ D++ E+ + G
Sbjct: 697 RGTYVGGRSQGEENDWKDMNSKLNDILKELSKKSWNG 733


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
BALH_0614  GPOSANCHOR  33     0.006    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.7 bits (74), Expect = 0.006
Identities = 25/97 (25%), Positives = 40/97 (41%), Gaps = 9/97 (9%)

Query: 74 EETKKAVEKYIEKKQGDQANKEILPADTAKEASDFVKKVKE---KKMEEKEKVK--KPEK 128
+K ++ E K+ + K L A EA K +KE K+ EE K++ K
Sbjct: 410 AALEKLNKELEESKKLTEKEKAELQAKLEAEA----KALKEKLAKQAEELAKLRAGKASD 465

Query: 129 NVSPEQKPEPNKKQLNGQVPTSKAKQAPYKGSVRTDK 165
+ +P+ KP GQ P + K K ++ K
Sbjct: 466 SQTPDAKPGNKAVPGKGQAPQAGTKPNQNKAPMKETK 502


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0615  TCRTETA  89     8e-22    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 89.5 bits (222), Expect = 8e-22
Identities = 67/355 (18%), Positives = 129/355 (36%), Gaps = 13/355 (3%)

Query: 4 FIYFIVIVAFLDTFSQLPIMSTFAQSLGGSPLII---GLVVGMYSFANMIGNIIAGAAVD 60
I V + + +P++ + L S + G+++ +Y+ + GA D
Sbjct: 9 VILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSD 68

Query: 61 KFGAKKILYISMGLTSFIVLLYTVVQSGEQLLVVRFMHGFSDGFLIPAAFTFLSKQTNAA 120
+FG + +L +S+ + + L + R + G + G A +++ T+
Sbjct: 69 RFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDGD 127

Query: 121 RQGKAMALSGAAVGTAAIVGPAFSGIMKATAGIEWVFITISILMVLGTIVSLFFLPNNVS 180
+ + A G + GP G+M F + L L + F LP S
Sbjct: 128 ERARHFGFMSACFGFGMVAGPVLGGLM-GGFSPHAPFFAAAALNGLNFLTGCFLLPE--S 184

Query: 181 RKDTSRTQMMNKEDMFELLKSEPLLQAYIGAFTLMF-----SQGIVTYMLPVKVEALALK 235
K R + + + + F Q + +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWD 244

Query: 236 ASTTGMMLSVFGITAILFFLLPTNRIYDRFNRSKLMLIGIAVMALALSLLGLFATKGMLF 295
A+T G+ L+ FGI L + T + R + +++G+ LL FAT+G +
Sbjct: 245 ATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILL-AFATRGWMA 303

Query: 296 IVMMIYGIGFAILFPSINALLVENTTDDKRGKAFGLFYAFFSLGVVAGSFTVGAI 350
+M+ I P++ A+L ++++G+ G A SL + G AI
Sbjct: 304 FPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAI 358



Score = 32.9 bits (75), Expect = 0.002
Identities = 28/118 (23%), Positives = 48/118 (40%), Gaps = 6/118 (5%)

Query: 53 IIAGAAVDKFGAKKILYISMGLTSF-IVLLYTVVQSGEQLLVVRFMHGFSDGFLIPAAFT 111
+I G + G ++ L + M +LL + ++ + G +PA
Sbjct: 265 MITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASG--GIGMPALQA 322

Query: 112 FLSKQTNAARQGKAMALSGAAVGTAAIVGP-AFSGIMKATAGI--EWVFITISILMVL 166
LS+Q + RQG+ A +IVGP F+ I A+ W +I + L +L
Sbjct: 323 MLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTWNGWAWIAGAALYLL 380


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_0617  PRPHPHLPASEC  273    1e-92    Prokaryotic zinc-dependent phospholipase C signature.
		>PRPHPHLPASEC#Prokaryotic zinc-dependent phospholipase C signature.

Length = 398

Score = 273 bits (700), Expect = 1e-92
Identities = 71/255 (27%), Positives = 114/255 (44%), Gaps = 40/255 (15%)

Query: 41 AEDKHKEGVNSHLWIVNRAIDIM----SRNTTLVKQDRVAQLNEWRTELENGIYAADYEN 96
A D +G +H IV + + I+ S+N + + L E EL+ G DY+
Sbjct: 28 AWDGKIDGTGTHAMIVTQGVSILENDLSKNEPESVRKNLEILKENMHELQLGSTYPDYDK 87

Query: 97 PYYDNSTFASHFYDPDNGKTYIP-----FAKQAKETGA----KYFKLAGESYKNKDMKQA 147
YD + HF+DPD + A +TG K+ LA ++ + KQA
Sbjct: 88 NAYD--LYQDHFWDPDTDNNFSKDNSWYLAYSIPDTGESQIRKFSALARYEWQRGNYKQA 145

Query: 148 FFYLGLSLHYLGDVNQPMHAANFTNLSYPQGFHSKYENFVDTIKDNYKVTDGNGYWNWKG 207
FYLG ++HY GD++ P H AN T + H K+E F + K+ YK+ N
Sbjct: 146 TFYLGEAMHYFGDIDTPYHPANVTAVD--SAGHVKFETFAEERKEQYKINTAGCKTN--- 200

Query: 208 TNPEDWIHGAAVVAKQDYS--------GIVNDNTKDWFVKAAVSQEYAD-KWRAEVTPMT 258
ED+ A ++ +D++ G ++ A++S + D + A+VT
Sbjct: 201 ---EDFY--ADILKNKDFNAWSKEYARGFAKTGKSIYYSHASMSHSWDDWDYAAKVT--- 252

Query: 259 GKRLIDAQRVTAGYI 273
L ++Q+ TAGYI
Sbjct: 253 ---LANSQKGTAGYI 264


75  BALH_0655  BALH_0666  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0655  -2      18       -1.066398  acriflavin resistance protein
BALH_0657  1       14       -1.257155  ********************hypothetical protein
BALH_0658  1       12       -1.152777  hypothetical protein
BALH_0659  0       12       -0.057265  hypothetical protein
BALH_0660  -1      11       0.717727   peptidase NLP/P60 /M23/M37 peptidase
BALH_0661  0       15       1.452711   transcriptional activator
BALH_0662  -1      16       1.620408   ABC transporter ATP-binding protein
BALH_0663  -2      14       1.707593   hydroxymethylpyrimidine ABC transporter
BALH_0665  -1      14       1.668920   ABC transporter hydroxymethylpyrimidine-binding
BALH_0666  1       15       2.207757   transcriptional regulator TenI
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_0655  ACRIFLAVINRP  564    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 564 bits (1456), Expect = 0.0
Identities = 222/1038 (21%), Positives = 446/1038 (42%), Gaps = 53/1038 (5%)

Query: 4 LTKFSLKNRAAVIIMVFLISILGVYSGSKLPMEFLPSIDNPAVTVTTLSPGLDAEAMTKE 63
+ F ++ ++ ++ + G + +LP+ P+I PAV+V+ PG DA+ +
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 64 VTDPLEKQFRNLEHIDNITS-STHEGLSRIDIAYTSKANMKDATREVEKAINTIK--LPK 120
VT +E+ ++++ ++S S G I + + S + A +V+ + LP+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 DATKPIVSQLNTSMIPLAQIAIQKQNGFSKADE--KQIEKEIVPQLESIDGVANVMFFGK 178
+ + +S +S L N + D+ + + L ++GV +V FG
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG- 179

Query: 179 STSELSIILDPNQLKDKNVTTEQILKVLQGKETSTPAG------AVTVNKEEYNLRVIGD 232
+ + I LD + L +T ++ L+ + AG A+ + ++
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 233 IKNVDDIKNITVAP-----QVKLQDVAQIEL-KQHYDTISHINGEEGTGLIIMKEPSKNA 286
KN ++ +T+ V+L+DVA++EL ++Y+ I+ ING+ GL I NA
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 287 VAIGKEIDKKIKDISKQYKDQFSIKLLASTHEQVENAVTSMGKEVILGAIAATLIILIFL 346
+ K I K+ ++ + + T V+ ++ + K + + L++ +FL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 347 RNFRTTLIAVVSIPLSILLTLFLLHQSNITLNTLTLGGLAVAVGRLVDDSIVVIENIFRR 406
+N R TLI +++P+ +L T +L ++NTLT+ G+ +A+G LVDD+IVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 407 LQKESFS-KDIILDATKEVAVAITSSTLTTVAVFLPIGLVSGVIGKLMLPMVLAVVYSIL 465
+ ++ K+ + ++ A+ + AVF+P+ G G + + +V ++
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 466 SSLLVALTVVPLMAFLLLKKIK---HKKPS------------SSPRYVATLKWALSHKFI 510
S+LVAL + P + LLK + H+ S Y ++ L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 511 ILLTSFLLFAGSIAAYVLLPKANIKSEDDTMLSINMTFPADYALETQKQKAFDFEKKLLS 570
LL L+ AG + ++ LP + + ED + + PA E ++ L
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 571 NSDVMDVILRMGSSAEDAQWGQTTKNNLASIFVVFK-------KGSDIDQYIKELKKEHK 623
N + + + + GQ N FV K + + I K E
Sbjct: 600 NE--KANVESVFTVNGFSFSGQA--QNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELG 655

Query: 624 TFEPA-----ELDYIKTSYSSSGGGNNLQFNVTATNETNLKKAATIVETKLKNMDDLSKV 678
+ I +++G L ++ + ++ ++ L V
Sbjct: 656 KIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSV 715

Query: 679 KTNLEDSKKEWQIHVDQTKAEQLGLTPELAAQQVTFLMKKSPIGEVSINNEKTTIMIEHK 738
+ N + ++++ VDQ KA+ LG++ Q ++ + + + + + ++
Sbjct: 716 RPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQAD 775

Query: 739 KESITKQEDILNTNILSPINGPIPLKDIATISEKQLQTEVFHKDGKETIQIIAEASNEDL 798
+ ED+ + S +P T + +G +++I EA+
Sbjct: 776 AKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTS 835

Query: 799 SKVSTEVNKAIANLDLPSGAKVNIAGATESMQENFTDLFKIMGIAIGIVYLIMVITFGQA 858
S + + + +A+ LP+G + G + + + ++ I+ +V+L + +
Sbjct: 836 SGDAMALMENLAS-KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESW 894

Query: 859 RAPFAILFSLPLAAVGGILGLIISGTPVDVNSLIGALMLIGIVVTNAIVLIERVQQNREH 918
P +++ +PL VG +L + DV ++G L IG+ NAI+++E + E
Sbjct: 895 SIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEK 954

Query: 919 -GIETREALLEAGSTRLRPIIMTAITTIVAMLPLLFGQSQAGSMVSKSLAVVVIGGLAVS 977
G EA L A RLRPI+MT++ I+ +LPL + AGS ++ + V+GG+ +
Sbjct: 955 EGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAIS-NGAGSGAQNAVGIGVMGGMVSA 1013

Query: 978 TVLTLVVVPVMYELLDKI 995
T+L + VPV + ++ +
Sbjct: 1014 TLLAIFFVPVFFVVIRRC 1031



Score = 128 bits (322), Expect = 5e-32
Identities = 93/516 (18%), Positives = 200/516 (38%), Gaps = 38/516 (7%)

Query: 509 FIILLTSFLLFAGSIAAYVLLPKANIKSEDDTMLSINMTFPADYALETQKQKAFDFEKKL 568
F +L L+ AG++A + LP A + +S++ +P A Q E+ +
Sbjct: 11 FAWVLAIILMMAGALA-ILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQVIEQNM 69

Query: 569 LSNSDVMDVILRMGSSAEDAQWGQTTKNNLASIFVVFKKGSDID---QYIKELKKEHKTF 625
++M + + +I + F+ G+D D ++ +
Sbjct: 70 NGIDNLMYMS------------STSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPL 117

Query: 626 EPAELDYIKTSYSSSGGGNNLQFNVTATNETNLKK-----AATIVETKLKNMDDLSKVKT 680
P E+ S S + + N + A+ V+ L ++ + V
Sbjct: 118 LPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDV-- 175

Query: 681 NLEDSKKEWQIHVDQTKAEQLGLTPE-----LAAQQVTFLMKKSPIGEVSINNEKTTIMI 735
L ++ +I +D + LTP L Q + G ++ ++ I
Sbjct: 176 QLFGAQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQ-IAAGQLGGTPALPGQQLNASI 234

Query: 736 EHKKESITKQEDILNTNILSPING-PIPLKDIATISE-KQLQTEVFHKDGKETIQI-IAE 792
+ E+ + +G + LKD+A + + + +GK + I
Sbjct: 235 IAQTRFKNP-EEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKL 293

Query: 793 ASNEDLSKVSTEVNKAIANL--DLPSGAKVNIA-GATESMQENFTDLFKIMGIAIGIVYL 849
A+ + + + +A L P G KV T +Q + ++ K + AI +V+L
Sbjct: 294 ATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFL 353

Query: 850 IMVITFGQARAPFAILFSLPLAAVGGILGLIISGTPVDVNSLIGALMLIGIVVTNAIVLI 909
+M + RA ++P+ +G L G ++ ++ G ++ IG++V +AIV++
Sbjct: 354 VMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVV 413

Query: 910 ERVQQ-NREHGIETREALLEAGSTRLRPIIMTAITTIVAMLPLLFGQSQAGSMVSKSLAV 968
E V++ E + +EA ++ S ++ A+ +P+ F G++ + ++
Sbjct: 414 ENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIY-RQFSI 472

Query: 969 VVIGGLAVSTVLTLVVVPVMYELLDKIGRKRRSRRK 1004
++ +A+S ++ L++ P + L K K
Sbjct: 473 TIVSAMALSVLVALILTPALCATLLKPVSAEHHENK 508



Score = 97.2 bits (242), Expect = 2e-22
Identities = 75/516 (14%), Positives = 175/516 (33%), Gaps = 41/516 (7%)

Query: 3 RLTKFSLKNRAAVIIMVFLISILGVYSGSKLPMEFLPSIDNPAVTVT-TLSPGLDAE--- 58
L + +++ LI V +LP FLP D L G E
Sbjct: 528 NSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQ 587

Query: 59 -AMTKEVTDPLEKQFRNLEHIDNITSSTHEGLSRID-IAYTSKANMKDATR---EVEKAI 113
+ + L+ + N+E + + + G ++ +A+ S ++ E I
Sbjct: 588 KVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVI 647

Query: 114 NTIKLPKDATK---------PIVSQLNTSMIPLAQIAIQKQNGFSKADEKQIEKEIVPQL 164
+ K+ + P + +L T+ + Q G Q +++
Sbjct: 648 HRAKMELGKIRDGFVIPFNMPAIVELGTATG--FDFELIDQAGLGHDALTQARNQLLGMA 705

Query: 165 -ESIDGVANVMFFGKS-TSELSIILDPNQLKDKNVTTEQILKVLQGKETSTPAGAVTVNK 222
+ + +V G T++ + +D + + V+ I + + T
Sbjct: 706 AQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRG 765

Query: 223 EEYNLRVIGD---IKNVDDIKNITVAPQ----VKLQDVAQIELKQHYDTISHINGEEGTG 275
L V D +D+ + V V + NG
Sbjct: 766 RVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSME 825

Query: 276 LIIMKEPSKNAVAIGKEIDKKIKDISKQYKDQFSIKLLASTHEQVENAVTSMGKEVILGA 335
+ P ++ + +++++ + ++++ S + L A
Sbjct: 826 IQGEAAPGTSS----GDAMALMENLASKLPAGIGYDWTGMSYQERL----SGNQAPALVA 877

Query: 336 IAATLIILIF---LRNFRTTLIAVVSIPLSILLTLFLLHQSNITLNTLTLGGLAVAVGRL 392
I+ ++ L ++ + ++ +PL I+ L N + + GL +G
Sbjct: 878 ISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLS 937

Query: 393 VDDSIVVIENIFRRLQKESFS-KDIILDATKEVAVAITSSTLTTVAVFLPIGLVSGVIGK 451
++I+++E ++KE + L A + I ++L + LP+ + +G
Sbjct: 938 AKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSG 997

Query: 452 LMLPMVLAVVYSILSSLLVALTVVPLMAFLLLKKIK 487
+ + V+ ++S+ L+A+ VP+ ++ + K
Sbjct: 998 AQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRCFK 1033


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_0658  MICOLLPTASE  32     0.004    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 32.0 bits (72), Expect = 0.004
Identities = 14/68 (20%), Positives = 28/68 (41%), Gaps = 1/68 (1%)

Query: 50 KEFYKEENLAAFIVYGM-NKAKNLPQFHKDEIPTLVRILRLCQEIGWYEEANTFMVNQGL 108
F+ + I+YG+ + + IPTLV LR +G+Y + +++ L
Sbjct: 129 YTFFSNRDRVQAIIYGLEDSGRTYTADDDKGIPTLVEFLRAGYYLGFYNKQLSYLNTPQL 188

Query: 109 AEFVHTSL 116
++
Sbjct: 189 KNECLPAM 196


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
BALH_0660  RTXTOXIND  32     0.004    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.004
Identities = 17/71 (23%), Positives = 27/71 (38%), Gaps = 14/71 (19%)

Query: 284 AAQGNVSIQAAAAGKVVKSYYSASYGNVVFIAHQINGKLYTTVYAHMKDRTVQAGDQVQA 343
+ G V I A A GK+ S G I N + K+ V+ G+ V+
Sbjct: 75 SVLGQVEIVATANGKLTHS------GRSKEIKPIENSIV--------KEIIVKEGESVRK 120

Query: 344 GQLVGHMGNTG 354
G ++ + G
Sbjct: 121 GDVLLKLTALG 131


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0662  PF05272  30     0.014    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.014
Identities = 11/32 (34%), Positives = 15/32 (46%)

Query: 55 GPSGCGKSTLFRLITGLEEASTGQIELTETKS 86
G G GKSTL + GL+ S ++ K
Sbjct: 603 GTGGIGKSTLINTLVGLDFFSDTHFDIGTGKD 634


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_0666  adhesinmafb  30     0.007    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 30.0 bits (67), Expect = 0.007
Identities = 23/133 (17%), Positives = 50/133 (37%), Gaps = 21/133 (15%)

Query: 87 TDVRSVKEKFSYLHAGYSVHSLEEAIEAFKNGADSLVYGHVFPTDCKKDVPARGLEEISD 146
TD RS++++ S ++ + + A EA + +F + K D +E I+
Sbjct: 180 TDTRSIRQRISDNYSNLGSNFSDRADEANRK---------MFEHNAKLDRWGNSMEFING 230

Query: 147 IARSLSIPIIAIGGITPENTKDILASEVSGIAVMSGI--VSSSNPYSKA----------K 194
+A P I+ G A M I + + ++ K
Sbjct: 231 VAAGALNPFISAGEALGIGDILYGTRYAIDKAAMRNIAPLPAEGKFAVIGGLGSVAGFEK 290

Query: 195 SYKESIRKWAEKH 207
+ +E++ +W +++
Sbjct: 291 NTREAVDRWIQEN 303


76  BALH_0756  BALH_0763  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0756  -2      13       0.230711   acetyltransferase
BALH_0757  -1      13       0.051324   quaternary ammonium compound-resistance protein
BALH_0758  0       13       0.035451   hypothetical protein
BALH_0759  -1      12       0.076947   TetR family transcriptional regulator
BALH_0761  -2      10       0.540674   multidrug resistance protein B
BALH_0762  -2      13       -0.048268  oligopeptide ABC transporter solute-binding
BALH_0763  -1      15       1.015347   multidrug resistance protein B
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_0756  SACTRNSFRASE  36     3e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 35.7 bits (82), Expect = 3e-05
Identities = 17/81 (20%), Positives = 30/81 (37%), Gaps = 7/81 (8%)

Query: 42 LYVVKEEGEIVGVAGLHVLGEDLAEVRSLVVSHTYAGKGIGRMLVNHVINEAAKIKVSRV 101
++ E +G + A + + V+ Y KG+G L++ I E AK
Sbjct: 67 AFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAI-EWAKENHFCG 125

Query: 102 ISLTYET------EFFQKCGF 116
+ L + F+ K F
Sbjct: 126 LMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0759  HTHTETR  78     5e-20    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 77.7 bits (191), Expect = 5e-20
Identities = 30/168 (17%), Positives = 57/168 (33%), Gaps = 8/168 (4%)

Query: 2 MNKKEKIVYAAIEVFQEKGVEKTKISDIVKLAGIAQGTFYLYFPSKLSVMPAIAEVMVEK 61
++ I+ A+ +F ++GV T + +I K AG+ +G Y +F K + I E+
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 62 MILAVKERVQNDAPFSSK-VTQVIGAVFHFIEEYREIQALMYAGLASTEHIKEWEAV--- 117
+ E + +++ V + LM E + E V
Sbjct: 70 IGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQA 129

Query: 118 ----YEPLYMWLSEFLSEAKEAGEIRESVHAERTAKLFIALVESAAEQ 161
Y + + L EA + + R A + + E
Sbjct: 130 QRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMEN 177


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0761  TCRTETA  262    2e-86    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 262 bits (671), Expect = 2e-86
Identities = 93/380 (24%), Positives = 173/380 (45%), Gaps = 15/380 (3%)

Query: 15 LVILLSNIFIAFLGIGLIIPVMPSFMNEMHLTGK---TMGYLVAVFAMAQLIASPITGRW 71
L+++LS + + +GIGLI+PV+P + ++ + G L+A++A+ Q +P+ G
Sbjct: 7 LIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 72 VDLYGRKKMIIIGLFIFGVSELLFGLGTDVWMLYVARVLGGISAAFIMPGVTAYVADITS 131
D +GR+ ++++ L V + +W+LY+ R++ GI+ A AY+ADIT
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGA-TGAVAGAYIADITD 125

Query: 132 IQERPKAMGYLSAAISTGFIIGPGIGGFIAEYGIRVPFFVAAAIAFIACVISIFILKEPL 191
ER + G++SA G + GP +GG + + PFF AAA+ + + F+L E
Sbjct: 126 GDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESH 185

Query: 192 TKEE--LAEISSNTKESSFIGDLKKSLHPMYAIAFIIVFVLAFGLSAYETVFSLFSDHKF 249
E L + N S + + A+ FI+ V ++ +F + +F
Sbjct: 186 KGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVP----AALWVIFGEDRF 241

Query: 250 GFTPKDIAAIITISSIFGVVVQVFMFGKLVDMFGEKVLIQICLIVGAVLAFVSTVVFNYW 309
+ I + I + Q + G + GE+ + + +I + W
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 310 IVLLVTCFIFLAFDLLRPALTTFLSKAAGKE-QGFVAGMNSTYTSLGNIAGPAMGGILFD 368
+ + + + + PAL LS+ +E QG + G + TSL +I GP + ++
Sbjct: 302 MAFPIM-VLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYA 360

Query: 369 MNIHYPYAFSGVVLIVGLAI 388
+I ++G I G A+
Sbjct: 361 ASITT---WNGWAWIAGAAL 377



Score = 34.4 bits (79), Expect = 7e-04
Identities = 22/119 (18%), Positives = 42/119 (35%)

Query: 274 MFGKLVDMFGEKVLIQICLIVGAVLAFVSTVVFNYWIVLLVTCFIFLAFDLLRPALTTFL 333
+ G L D FG + ++ + L AV + W++ + + A
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIA 121

Query: 334 SKAAGKEQGFVAGMNSTYTSLGNIAGPAMGGILFDMNIHYPYAFSGVVLIVGLAITFMW 392
G E+ G S G +AGP +GG++ + H P+ + + +
Sbjct: 122 DITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFL 180


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0763  TCRTETA  43     2e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 42.9 bits (101), Expect = 2e-06
Identities = 26/130 (20%), Positives = 49/130 (37%), Gaps = 12/130 (9%)

Query: 280 VLCSALLIKMLKSFNDLKILYVGLFIYTIGFTILGTSNSLWIL----LIAGLFQTVGEMM 335
C+ +L + F +L V L + + I+ T+ LW+L ++AG+ G
Sbjct: 57 FACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATG--- 113

Query: 336 YVPVRQSIMADMVPNEARGSYMAINGMVFQVAKMNGALGVMLGSFLASWGMSALYFIVGM 395
V + +AD+ + R + F G +LG + + A +F
Sbjct: 114 --AVAGAYIADITDGDERARHFGFMSACFGFGM---VAGPVLGGLMGGFSPHAPFFAAAA 168

Query: 396 SSILLFMKAI 405
+ L F+
Sbjct: 169 LNGLNFLTGC 178



Score = 42.1 bits (99), Expect = 2e-06
Identities = 51/339 (15%), Positives = 116/339 (34%), Gaps = 28/339 (8%)

Query: 45 GALLLINVMASLVIGLYGGYVGDRLGRKKVMIIGQSIQVISIACMGIANSDYVDSPWLTF 104
G LL + + G + DR GR+ V+++ + + A M A W+ +
Sbjct: 46 GILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAP-----FLWVLY 100

Query: 105 VFMLVNSLGSGLMNPATEAMLIDVSTPENRKVMYSINYWAINLSIAIGAIFGGLLFENYR 164
+ +V + +G A + D++ + R + + G + GGL+
Sbjct: 101 IGRIVAGI-TGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSP 159

Query: 165 LQLFIVLTLVAIITLYVMAVYMEEVYVARKTVEKKNVLKDMAD-SYKVVVKDRAFLIFCA 223
F + + + E + + ++ L +A + + A L+
Sbjct: 160 HAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVF 219

Query: 224 ASICTLSLEFQINNYLGVRLQKEFETVQFFFGNGFTFDLTGIRMLSWISAENTILVVLCS 283
+ L Q+ + FG F + A IL L
Sbjct: 220 FIM---QLVGQV-----------PAALWVIFGED-RFHWDAT-TIGISLAAFGILHSLAQ 263

Query: 284 ALLI-KMLKSFNDLKILYVGLFIYTIGFTILGTSNSLWILLIAGLFQTVGEMMYVPVRQS 342
A++ + + + L +G+ G+ +L + W+ + + +P Q+
Sbjct: 264 AMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFP-IMVLLASGGIGMPALQA 322

Query: 343 IMADMVPNEARGSYMAINGMVFQVAKMNGALGVMLGSFL 381
+++ V E +G + G + + + +G +L + +
Sbjct: 323 MLSRQVDEERQG---QLQGSLAALTSLTSIVGPLLFTAI 358


77  BALH_0827  BALH_0832  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0827  -3      8        0.772346   hypothetical protein
BALH_0828  -1      20       3.176507   hypothetical protein
BALH_0829  -1      18       2.213594   hypothetical protein
BALH_0830  -2      13       1.133195   major facilitator superfamily sugar transporter
BALH_0831  0       14       -0.512320  sensor histidine kinase
BALH_0832  2       15       -0.801921  two-component response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0827  PF00577  36     0.001    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 35.6 bits (82), Expect = 0.001
Identities = 36/156 (23%), Positives = 57/156 (36%), Gaps = 22/156 (14%)

Query: 214 QVEEMQSGYEKLYSQLVPFAQCDFNFNINDSQAITDSLTLGFNETLNESLTQTQSFT--- 270
QV Q+GY +Y+ VP F IND A +S L T+ E+ TQ FT
Sbjct: 310 QVTIKQNGY-DIYNSTVPPG----PFTINDIYAAGNSGDLQV--TIKEADGSTQIFTVPY 362

Query: 271 --NGNSETKGSSE---TKGKTRSVAGV--APLVGA-----GIGAVFAGPMGAAFGGSIGS 318
+ +G + T G+ RS P G+ A + G +
Sbjct: 363 SSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRA 422

Query: 319 AAASMMGSTSESKASNESVTESHSKTEGTSSTTGKS 354
+ + A + +T+++S S G+S
Sbjct: 423 FNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQS 458


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0830  TCRTETA  36     4e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 35.6 bits (82), Expect = 4e-04
Identities = 30/129 (23%), Positives = 57/129 (44%), Gaps = 16/129 (12%)

Query: 42 EFFPKGDPTSQLLNTAAIFAVGFLMRPIGSLLMGRYADRHGRRAALTLSITVMAGGSFII 101
+ D T+ A++A LM+ + ++G +DR GRR L +S+ A I+
Sbjct: 34 DLVHSNDVTAHYGILLALYA---LMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIM 90

Query: 102 ACTPSYESIGIMAPIILVLARLLQGLSLGGEYGTSATYLSEMASSGRR----GFYSSFQY 157
A P +L + R++ G++ G + Y++++ R GF S+
Sbjct: 91 ATAPFLW--------VLYIGRIVAGIT-GATGAVAGAYIADITDGDERARHFGFMSACFG 141

Query: 158 VTLVAGQMV 166
+VAG ++
Sbjct: 142 FGMVAGPVL 150



Score = 29.0 bits (65), Expect = 0.044
Identities = 21/82 (25%), Positives = 39/82 (47%), Gaps = 11/82 (13%)

Query: 287 VVLQPIAGLLSDKIGRRPLLMAFGILGTLLTAPIFFFMEKTTEPIVAFLLMMVGLII--V 344
P+ G LSD+ GRRP+L+ +L A + + + T ++ +G I+ +
Sbjct: 57 FACAPVLGALSDRFGRRPVLLV-----SLAGAAVDYAIMATAP---FLWVLYIGRIVAGI 108

Query: 345 TGYT-SINAIVKAELFPTEIRA 365
TG T ++ A++ + RA
Sbjct: 109 TGATGAVAGAYIADITDGDERA 130


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0831  PF06580  37     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 2e-04
Identities = 22/131 (16%), Positives = 51/131 (38%), Gaps = 24/131 (18%)

Query: 397 EKKIDFHIEGDSALHPLPDHIKVSHLITILGNIIDNAFD-AVSGQEEK-SVSFFVTDIGH 454
E ++ F + + A+ ++V ++ + +++N ++ + + T
Sbjct: 237 EDRLQFENQINPAIM----DVQVPPML--VQTLVENGIKHGIAQLPQGGKILLKGTKDNG 290

Query: 455 DIVFEVIDSGAGIPAEKITTIFQKGFSTKGNDRGYGLANVKEMVDLL---EGTIEIQNEK 511
+ EV ++G+ G GL NV+E + +L E I++ +EK
Sbjct: 291 TVTLEVENTGSL------------ALKNTKESTGTGLQNVRERLQMLYGTEAQIKL-SEK 337

Query: 512 NGGAIFTIYLP 522
G + +P
Sbjct: 338 QGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_0832  HTHFIS  62     3e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.8 bits (150), Expect = 3e-13
Identities = 37/142 (26%), Positives = 63/142 (44%), Gaps = 4/142 (2%)

Query: 3 KVAIAEDDFRVAQIQEEFLSKIK-DVKVIGKALNAKETIELLQKEEIDLLLLDNYLPDGI 61
+ +A+DD + + + LS+ DV++ NA + + DL++ D +PD
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITS---NAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GTDLLPKIHADFPDVDVIMVTAANENHMLEKAIRNGVSNYLIKPVTLEKFVRTIEDYKRK 121
DLLP+I PD+ V++++A N KA G +YL KP L + + I +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 122 KQLLHSNNEVNQALIDNFFGIS 143
+ S E + G S
Sbjct: 122 PKRRPSKLEDDSQDGMPLVGRS 143


78  BALH_0893  BALH_0905  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0893  -2      13       -1.264114  sensor histidine kinase
BALH_0894  0       12       -0.083657  hypothetical protein
BALH_0895  -1      13       -0.536487  hypothetical protein
BALH_0896  0       15       -0.696985  hypothetical protein
BALH_0897  0       16       0.984442   hypothetical protein
BALH_0898  4       19       1.804559   NADPH:quinone reductase (quinone
BALH_0899  5       17       1.306365   sensor histidine kinase
BALH_0900  6       16       2.114126   DNA-binding response regulator
BALH_0901  6       16       2.222276   hypothetical protein
BALH_0903  5       16       1.758500   DNA repair exonuclease
BALH_0904  4       13       0.849795   hypothetical protein
BALH_0905  0       9        -0.497477  3'-5' exoribonuclease YhaM
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_0893  HTHFIS  70     1e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 70.2 bits (172), Expect = 1e-14
Identities = 26/107 (24%), Positives = 50/107 (46%), Gaps = 3/107 (2%)

Query: 777 TILIVDDDHRNIFALQNALKKQHANIITAQNGLECLEILKNNTNIDLILMDIMMPNMDGY 836
TIL+ DDD L AL + ++ N + DL++ D++MP+ + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDG-DLVVTDVVMPDENAF 63

Query: 837 ETMEHIRMNLGLHEIPIIALTAKAMPNDKEKCLSAGASDYISKPLNL 883
+ + I+ ++P++ ++A+ K GA DY+ KP +L
Sbjct: 64 DLLPRIKK--ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDL 108


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0895  PF07132  29     0.012    Harpin protein (HrpN)
		>PF07132#Harpin protein (HrpN)

Length = 356

Score = 28.5 bits (63), Expect = 0.012
Identities = 21/59 (35%), Positives = 30/59 (50%), Gaps = 6/59 (10%)

Query: 70 GLVAGGVAGGLGGLLTGLGVLAVSGLGPIVAAGPIAAAIGGAGIGGGAGSLIGAFIGLG 128
++ GG+ GGLGGL + LG L LG + G G+ +G G GS +G +G
Sbjct: 63 SMMGGGLGGGLGGLGSSLGGLGGGLLGGGLGGG------LGSSLGSGLGSALGGGLGGA 115



Score = 28.5 bits (63), Expect = 0.016
Identities = 21/66 (31%), Positives = 31/66 (46%), Gaps = 6/66 (9%)

Query: 64 DIFSATGLVAGGVAGGLGGLLTGLGVLAVSGLGPIVAAGPIAAAIGGAGIGGGAGSLIGA 123
DI + + + GGLGG L GLG G ++ G G G+G GS +G+
Sbjct: 53 DIMTTMMFMGSMMGGGLGGGLGGLGSSLGGLGGGLLGGG------LGGGLGSSLGSGLGS 106

Query: 124 FIGLGI 129
+G G+
Sbjct: 107 ALGGGL 112


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0899  PF06580  34     9e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.1 bits (78), Expect = 9e-04
Identities = 53/354 (14%), Positives = 114/354 (32%), Gaps = 42/354 (11%)

Query: 62 IFHWYASSLKNRQLLYFFFVQLFIVFLAAFIVPNGSIAIFVGLTPILIAQSLYVYNNIFK 121
W +L + + + + + + Q N
Sbjct: 17 GIGWGVYTLTGFGFASLYGSPKLHSMIFNIAISLMGLVLTHAYRSFIKRQGWLKLNMGQI 76

Query: 122 VMAVFTLMYAIFCIAISMNYGVNKVAILISM----FLLVLAIIIPFSYINKQQYDARNRI 177
++ V I + N + ++ I+ F L LA+ I F+ + +
Sbjct: 77 ILRVLPACVVIGMVWFVANTSIWRLLAFINTKPVAFTLPLALSIIFNVVVVTFMWSLLYF 136

Query: 178 -QSYIQELESAYMRVEELTLANERQRMARDLHDTLAQGVASLIMQ---------LEAIDA 227
+ + + A + ++ MA++ AQ + +L Q L I A
Sbjct: 137 GWHFFKNYKQAEIDQWKM------ASMAQE-----AQ-LMALKAQINPHFMFNALNNIRA 184

Query: 228 HMQKGNTRRSQEIMKQTMIRARQTLHDARLVIDDLRHTTNSFNKAVEEEVQRFSEATSIH 287
+ + T+ + + + + R +L + L + ++ +F +
Sbjct: 185 LILEDPTKAREMLTSLSEL-MRYSLRYSNARQVSLADELTVVDSYLQLASIQFED----R 239

Query: 288 VRFTIQSPPHISS-LVKEHCLYVISECLTNIAKH---SQATDVHLKVEYSGSLERLTIEV 343
++F Q P I V + + E N KH + ++ + +T+EV
Sbjct: 240 LQFENQINPAIMDVQVPPMLVQTLVE---NGIKHGIAQLPQGGKILLKGTKDNGTVTLEV 296

Query: 344 EDNGIGFDTGYIGKNPGHYGLIGLNERVRLINGEIHIL--SEKVKGTKVCIQVP 395
E+ G K GL + ER++++ G + SEK + +P
Sbjct: 297 ENTGSLALKN--TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_0900  HTHFIS  79     2e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.5 bits (196), Expect = 2e-19
Identities = 24/116 (20%), Positives = 49/116 (42%), Gaps = 2/116 (1%)

Query: 10 VLIVDDHFVVREGLKLIIETSDSFQIIGEAENGEEALSFIEKKKPDVILMDLNMPKMSGL 69
+L+ DD +R L + + + + N +I D+++ D+ MP +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAG-YDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 70 ETIEALNKKQNHTPIIILTTYNEDELMLKGIELGAKGYLLKDTDRENLFRTLEAAI 125
+ + + K + P+++++ N +K E GA YL K D L + A+
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
BALH_0904  RTXTOXIND  31     0.019    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.3 bits (71), Expect = 0.019
Identities = 20/145 (13%), Positives = 46/145 (31%), Gaps = 5/145 (3%)

Query: 267 MARYEAIKAKMEPLQLQVDSLHKKIENVQSEIESIQIDEDFLQKESYVEELRMQHMS--Y 324
+ IK + Q Q ++ ++E ++ + + S VE+ R+ S
Sbjct: 185 LRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLL 244

Query: 325 EN---ARQEMRDLTGAITNIKEELAELEQQIGATFEKETVLSFDMSLATKELITQAVQKA 381
A+ + + EL + Q+ + + L T+ + + K
Sbjct: 245 HKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKL 304

Query: 382 RELETQKAQLDDRFKVAQEQLEEQE 406
R+ L +E+ +
Sbjct: 305 RQTTDNIGLLTLELAKNEERQQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_0905  MICOLLPTASE  31     0.005    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 31.2 bits (70), Expect = 0.005
Identities = 21/108 (19%), Positives = 40/108 (37%), Gaps = 7/108 (6%)

Query: 20 IKTATKGIASNGKPFLTVILQDPSGDIEAKLWDV-------SPEVEKQYVAETIVKVAGD 72
IK+ + I F +D G+I+A WD + +Y +V
Sbjct: 779 IKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDFGDGEKSNEAKATHKYNKTGEYEVKLT 838

Query: 73 ILNYKGRIQLRVKQIRVANENEVTDISDFVEKAPVKKEDMVEKITQYI 120
+ + G I K+I+V + V I++ +K + + K +
Sbjct: 839 VTDNNGGINTESKKIKVVEDKPVEVINESEPNNDFEKANQIAKSNMLV 886


79  BALH_0939  BALH_0944  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_0939  4       17       -1.534457  TetR family transcriptional regulator
BALH_0940  5       18       -1.652246  hypothetical protein
BALH_0941  6       20       -1.989262  hypothetical protein
BALH_0942  4       18       -1.403873  TetR family transcriptional regulator
BALH_0943  4       17       -1.226508  collagen adhesion protein
BALH_0944  2       15       1.525596   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0939  HTHTETR  75     7e-19    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 74.7 bits (183), Expect = 7e-19
Identities = 34/181 (18%), Positives = 68/181 (37%), Gaps = 7/181 (3%)

Query: 1 MRYSMTKNLQTSQNIVEASFKLMAEHGIEKMSLSMIAKEVGISKPAIYYHFSSKEALVDF 60
R + + +T Q+I++ + +L ++ G+ SL IAK G+++ AIY+HF K L
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 61 LFEEVFSGYHFVSYF-DKEQYTRENFAEKLIADGLHMLS---EYEGQEGILRVINEFIVT 116
++E S + + + + L +H+L E + ++ +I
Sbjct: 62 IWELSES--NIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 117 AARNEKYQKRLFEIQEEFLNGFHDLLKKGARLG-VVSQHATEENAHTLALVIDNMSNYML 175
Q+ + E + LK + + T A + I + L
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWL 179

Query: 176 M 176

Sbjct: 180 F 180


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_0942  HTHTETR  61     7e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 60.8 bits (147), Expect = 7e-14
Identities = 36/163 (22%), Positives = 63/163 (38%), Gaps = 19/163 (11%)

Query: 6 TRQKILAAASQIVQCKGVAKLTLEAVAKEAGVSKGGLLYHFSNKEALIEGMIVRGVEDYE 65
TRQ IL A ++ +GV+ +L +AK AGV++G + +HF +K L + +
Sbjct: 12 TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIG 71

Query: 66 GAIYNKVAEDPERKGRWVRS----FVEERLNNERRTEELSSSMMAAFMLKP-ELLEPLQQ 120
A+ P +R +E + ERR + + +++ Q+
Sbjct: 72 ELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQR 131

Query: 121 SFQQ---------LQHKIENDEID-----SVCATIIRLAADGL 149
+ L+H IE + A I+R GL
Sbjct: 132 NLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGL 174


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_0943  TONBPROTEIN  35     0.003    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 34.6 bits (79), Expect = 0.003
Identities = 19/85 (22%), Positives = 31/85 (36%), Gaps = 5/85 (5%)

Query: 1775 LTSLAPPGPEKPETTDPEKPETTDPEKPETTDPEKPGTTDPEKLETTDPEKPGTTNPEKP 1834
+T + P E P+ P +PE PE P E + KP KP
Sbjct: 47 VTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPP----KEAPVVIEKPKPKPKPKPKP 102

Query: 1835 ETMNPEKPEKELPKTGQKMPVEPYM 1859
E+P++++ + P P+
Sbjct: 103 VKKVQEQPKRDVKPVESR-PASPFE 126



Score = 33.4 bits (76), Expect = 0.007
Identities = 20/94 (21%), Positives = 28/94 (29%), Gaps = 6/94 (6%)

Query: 1742 VEAPKGYEKLTNPIPFEITKGMISPVQLQVLNKLTSLAPPGPEKPETTDPEKPETTDPEK 1801
+E P PI + V + P PE +P K EK
Sbjct: 36 IELPAP----AQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEK 91

Query: 1802 PETTDPEKPGTTDPEKLETTDPE-KPGTTNPEKP 1834
P+ KP + E + KP + P P
Sbjct: 92 PKPKPKPKP-KPVKKVQEQPKRDVKPVESRPASP 124


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_0944  IGASERPTASE  37     8e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 37.4 bits (86), Expect = 8e-05
Identities = 25/92 (27%), Positives = 36/92 (39%)

Query: 139 AEAKETLNKLVLETKDNKNLEEYNKKAVGLVTKMNEEEKTEKGKATAKAQKTSAVTSQKV 198
A ET + +K E N++ T N E E +T+ V
Sbjct: 1031 ATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGS 1090

Query: 199 AIEKTQKTEVEKNQKVEAEKNQKVETGKNQNA 230
++TQ TE ++ VE E+ KVET K Q
Sbjct: 1091 ETKETQTTETKETATVEKEEKAKVETEKTQEV 1122


80  BALH_1162  BALH_1168  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_1162  -2      13       0.802115   response regulator
BALH_1163  -1      12       0.311879   sensor histidine kinase
BALH_1164  -1      12       0.773109   GntR family transcriptional regulator
BALH_1165  1       12       0.671910   (S)-2-hydroxy-acid oxidase, iron-sulfur subunit
BALH_1166  0       13       0.526558   iron-sulfur cluster-binding protein
BALH_1167  0       13       -0.717434  hypothetical protein
BALH_1168  -1      15       -0.710417  N-methyltransferase)
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_1162  HTHFIS  112    6e-31    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 112 bits (281), Expect = 6e-31
Identities = 33/130 (25%), Positives = 61/130 (46%), Gaps = 1/130 (0%)

Query: 1 MSKYRVLVVDDESDMRQLVGMYLDNFGYEWGEAENGKEALKKLETDHYDFVVLDIMMPEM 60
M+ +LV DD++ +R ++ L GY+ N + + D VV D++MP+
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGLSVCKEIRKT-SDVPIIFLTAKGEEWNRVNGLRMGADDYIVKPFSPGELIARMEAVLR 119
+ + I+K D+P++ ++A+ + GA DY+ KPF ELI + L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 RYTKQEQQEE 129
++ + E
Sbjct: 121 EPKRRPSKLE 130


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_1163  PF06580  38     5e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.3 bits (89), Expect = 5e-05
Identities = 31/188 (16%), Positives = 73/188 (38%), Gaps = 32/188 (17%)

Query: 275 EKVTQLIHKEADRMQRLVHDLLDL--AQLEGEHFPLQKQPIVFSQ---LIEDVLDTYEIK 329
+ LI ++ + + ++ L +L L + + + +++ L I+
Sbjct: 180 NNIRALILEDPTKAREMLTSLSELMRYSLRYS----NARQVSLADELTVVDSYLQLASIQ 235

Query: 330 FIEKKIRISTNLNPEII-VMIDEDRMQQVLHNVLDNAIRYTHQNGDIMITLRQIDDYCEL 388
F E +++ +NP I+ V + +Q ++ N + + I Q G I++ + + L
Sbjct: 236 F-EDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTL 294

Query: 389 SIKDTGIGIDTEHLENLGERFYRVDKARSRQHGGTGLGLAIVRQ-IVHIHDGQW--KIES 445
+++TG A TG GL VR+ + ++ + K+
Sbjct: 295 EVENTG------------------SLALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSE 336

Query: 446 EKGNGTTV 453
++G +
Sbjct: 337 KQGKVNAM 344


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1166  ANTHRAXTOXNA  32     0.007    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 31.6 bits (71), Expect = 0.007
Identities = 22/87 (25%), Positives = 35/87 (40%), Gaps = 7/87 (8%)

Query: 88 KTKEEAAKYIQDVAKKKQAKKVVKSKSMVTEEISMNHALEEIGCEVLE--SDLGEYILQV 145
KT++E K + K + K T+++ L++I +VLE S+LG I
Sbjct: 53 KTEKEKFKDSINNLVKTEFTNETLDKIQQTQDL-----LKKIPKDVLEIYSELGGEIYFT 107

Query: 146 DNDPPSHIIAPALHKNRTQIRDVFKEK 172
D D H L + + EK
Sbjct: 108 DIDLVEHKELQDLSEEEKNSMNSRGEK 134


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1168  PREPILNPTASE  133    5e-40    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 133 bits (337), Expect = 5e-40
Identities = 64/264 (24%), Positives = 123/264 (46%), Gaps = 35/264 (13%)

Query: 13 YVYALLVGMVFGSFFMLIAMRIPL------------------------GESIIIPRSHCH 48
+ L ++ GSF ++ R+P+ ++++PRS C
Sbjct: 16 FSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDEPPYNLMVPRSCCP 75

Query: 49 YCKYVLKPKELIPIISFCIQRGRCTNCKRKISILYVIFELVTGIICLLTVYMIGVERELI 108
+C + + E IP++S+ RGRC C+ IS Y + EL+T ++ + + +
Sbjct: 76 HCNHPITALENIPLLSWLWLRGRCRGCQAPISARYPLVELLTALLSVAVAMTLAPGWGTL 135

Query: 109 IILSLFSLLLIISVTDYIYMLIPNRI---LAWFCCLLILECVFVPLVTWTESIVGSGVIF 165
L L +L+ ++ D ML+P+++ L W L L FV L ++++G+ +
Sbjct: 136 AALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLG---DAVIGAMAGY 192

Query: 166 ILLYCMRKIY-----PEGLGGGDIKLLSLLGFIVGLKGIFMILFLSSFFSLCFFGVGLVL 220
++L+ + + EG+G GD KLL+ LG +G + + ++L LSS ++L
Sbjct: 193 LVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLILL 252

Query: 221 KRMKMRTQIPFGPFISLGAICYML 244
+ IPFGP++++ +L
Sbjct: 253 RNHHQSKPIPFGPYLAIAGWIALL 276


81  BALH_1469  BALH_1473  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_1469  0       8        -0.963563  flagellar motor protein MotS
BALH_1470  0       7        -1.281576  chemotaxis protein
BALH_1471  1       8        -1.488238  chemotaxis protein
BALH_1472  1       9        -1.722330  flagellar motor switch protein
BALH_1473  3       12       -3.068451  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
BALH_1469  OMPADOMAIN  63     5e-14    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 63.4 bits (154), Expect = 5e-14
Identities = 30/127 (23%), Positives = 56/127 (44%), Gaps = 17/127 (13%)

Query: 112 SVVIVDNLIFDTGDANVKPEAKEIISQLVGFFQSVPNP---IVVEGHTDSRPIHNDKFPS 168
+ +++F+ A +KPE + + QL ++ +VV G+TD +D +
Sbjct: 214 HFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRI--GSDAY-- 269

Query: 169 NWELSSARAANMIHHLIEVYNVDDKRLAAVGYADTKPVVPN---------DSPQNWEKNR 219
N LS RA +++ +LI + +++A G ++ PV N +R
Sbjct: 270 NQGLSERRAQSVVDYLIS-KGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDR 328

Query: 220 RVVIYIK 226
RV I +K
Sbjct: 329 RVEIEVK 335


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_1470  HTHFIS  83     7e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 83.0 bits (205), Expect = 7e-22
Identities = 28/112 (25%), Positives = 46/112 (41%), Gaps = 2/112 (1%)

Query: 4 KILVVDDAMFMRTMIKNLLKSNSEFEVIGEAENGVEAIQKYKELQPDIVTLDITMPEMDG 63
ILV DD +RT++ L + + N + D+V D+ MP+ +
Sbjct: 5 TILVADDDAAIRTVLNQAL--SRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 64 LEALKEIIKIDSSAKVVICSAMGQQGMVLDAIKGGAKDFIVKPFQADRVIEA 115
+ L I K V++ SA + A + GA D++ KPF +I
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGI 114


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_1471  PF06580  36     3e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.4 bits (84), Expect = 3e-04
Identities = 10/48 (20%), Positives = 19/48 (39%), Gaps = 8/48 (16%)

Query: 396 LIRNAIDHGVETVEQRRDAGKNETGTIKLEAFHSGNHVVIQITDDGNG 443
L+ N I HG+ + G I L+ V +++ + G+
Sbjct: 263 LVENGIKHGIA--------QLPQGGKILLKGTKDNGTVTLEVENTGSL 302


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1472  FLGMOTORFLIN  56     2e-11    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 55.7 bits (134), Expect = 2e-11
Identities = 23/71 (32%), Positives = 40/71 (56%)

Query: 474 DTSILQNVEMNVKFVFGSTVKTIQDILSLQENEAVVLDEDIDEPIRIYVNDVLVAYGELV 533
D ++ ++ + + G T TI+++L L + V LD EP+ I +N L+A GE+V
Sbjct: 53 DIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVV 112

Query: 534 NVDGFFGVKVT 544
V +GV++T
Sbjct: 113 VVADKYGVRIT 123


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_1473  PF03544  33     0.002    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 33.0 bits (75), Expect = 0.002
Identities = 15/107 (14%), Positives = 31/107 (28%), Gaps = 7/107 (6%)

Query: 339 TEKQEDSKVEIPLQEEKPP--VVQIPKKEEKVNDFIKEPLKEKERITYVIKEPLTDNKEV 396
+ +E P + PP VV+ + E + + KE E+ +P K
Sbjct: 52 VTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEK-----PKPKPKPKPK 106

Query: 397 NKATTQKDKDNKNNNQDVSKKKEKKEEPADQKEAKSDEGIQASNVFA 443
++ K + + + PA + +
Sbjct: 107 PVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSV 153


82  BALH_1479  BALH_1488  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_1479  1       15       -1.510384  flagellar hook-associated protein FlgK
BALH_1480  2       17       -1.651185  flagellar hook-associated protein FlgL
BALH_1481  2       18       -1.627897  flagellar capping protein
BALH_1482  1       18       -0.185083  flagellar protein
BALH_1483  0       15       -0.278382  hypothetical protein
BALH_1484  0       13       -0.462387  flagellar basal body rod protein FlgB
BALH_1485  0       11       -0.034362  flagellar basal body rod protein FlgC
BALH_1486  2       11       -0.406962  flagellar hook-basal body protein FliE
BALH_1487  3       11       -0.661413  flagellar MS-ring protein
BALH_1488  3       11       -0.881077  flagellar motor switch protein G
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
BALH_1479  FLGHOOKAP1  104    3e-26    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 104 bits (260), Expect = 3e-26
Identities = 72/249 (28%), Positives = 112/249 (44%), Gaps = 14/249 (5%)

Query: 4 SDYNTPLSGLLAAQMGLQTTKQNLSNIHTPGYVRQMVNYGSAGASQGYSPEQKIGYGVQT 63
S N +SGL AAQ L T N+S+ + GY RQ A ++ G +G GV
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLG--AGGWVGNGVYV 59

Query: 64 LGVDRITDEVKTKQFNDQLSQLSYYNYMNSTLSRVESMVGTTGKNSLSSLMDGFFNAFRE 123
GV R D T Q +Q S +S++++M+ T+ SL++ M FF + +
Sbjct: 60 SGVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTS-SLATQMQDFFTSLQT 118

Query: 124 VAKNPEQPNYYDTLISETGKFTSQVNRLAKSLDTAEAQTTEDIEAHVNEFNRLAGSLAEA 183
+ N E P LI ++ +Q + L + Q I A V++ N A +A
Sbjct: 119 LVSNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASL 178

Query: 184 NKKI----GQAGTQVPNQLLDERDRIITEMSKYANIEVS---YESMNPNIASVRMNGVLT 236
N +I G PN LLD+RD++++E+++ +EVS + N +A NG
Sbjct: 179 NDQISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMA----NGYSL 234

Query: 237 VNGQDTYPL 245
V G L
Sbjct: 235 VQGSTARQL 243



Score = 53.0 bits (127), Expect = 1e-09
Identities = 18/51 (35%), Positives = 35/51 (68%)

Query: 380 LLEGIQQEKMGVEGVNMEEEMVNLMAFQKYFVANSKAITTMNEVFDSLFSI 430
++ + ++ + GVN++EE NL FQ+Y++AN++ + T N +FD+L +I
Sbjct: 495 VVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
BALH_1480  FLAGELLIN  40     6e-06    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 40.4 bits (94), Expect = 6e-06
Identities = 30/127 (23%), Positives = 60/127 (47%)

Query: 1 MRVSTFQNASWAKNQLMDLNVQQQYHRNQVTSGKKNLLMSEDPLAASKSFAIQHSLANIE 60
++T + +N L +++SG + +D + + ++ +
Sbjct: 2 QVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLT 61

Query: 61 QMQKDLADSKNVLTQTENTLQGVFKSLTRADQLTVQALNGTNSEKELKAIGAEIDQILKQ 120
Q ++ D ++ TE L + +L R +L+VQA NGTNS+ +LK+I EI Q L++
Sbjct: 62 QASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEE 121

Query: 121 VVYLANT 127
+ ++N
Sbjct: 122 IDRVSNQ 128


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
BALH_1484  FLGHOOKAP1  31     0.001    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 31.1 bits (70), Expect = 0.001
Identities = 10/27 (37%), Positives = 15/27 (55%)

Query: 23 NTVSSNIANANTPGYKAQDVTFAEQMN 49
NT S+NI++ N GY Q A+ +
Sbjct: 19 NTASNNISSYNVAGYTRQTTIMAQANS 45


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
BALH_1485  FLGHOOKAP1  35     8e-05    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 34.5 bits (79), Expect = 8e-05
Identities = 19/75 (25%), Positives = 33/75 (44%), Gaps = 7/75 (9%)

Query: 5 INASGSGLTTARKWMEVTSNNIVNANTTAAPGADLYERRSVVLESNNSFANMLDGSLTNG 64
IN + SGL A+ + SNNI + N Y R++ ++ NS G + NG
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAG------YTRQTTIMAQANSTLGA-GGWVGNG 56

Query: 65 VKIKSIEADKTENLV 79
V + ++ + +
Sbjct: 57 VYVSGVQREYDAFIT 71



Score = 28.0 bits (62), Expect = 0.014
Identities = 10/38 (26%), Positives = 17/38 (44%)

Query: 97 NIDVTAEMTNVMVAQKMYEANTSVLNANKKMLDKDLEI 134
+++ E N+ Q+ Y AN VL + D + I
Sbjct: 508 GVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_1486  FLGHOOKFLIE  36     4e-06    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 35.8 bits (82), Expect = 4e-06
Identities = 18/77 (23%), Positives = 36/77 (46%), Gaps = 1/77 (1%)

Query: 34 SQTSVVEGKKFIDLLEDMNQTQNNAQTAVYDLLTKGVG-ETHDVLIQQKKAESQMKTAAL 92
Q ++ + L+ ++ TQ A+T G +DV+ +KA M+
Sbjct: 27 PQPTISFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQ 86

Query: 93 VRDNLIENYKSLINMQI 109
VR+ L+ Y+ +++MQ+
Sbjct: 87 VRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1487  FLGMRINGFLIF  166    3e-47    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 166 bits (422), Expect = 3e-47
Identities = 99/540 (18%), Positives = 217/540 (40%), Gaps = 46/540 (8%)

Query: 17 LVIGAALLAIVTGALLYFTLPDKYVVVYQNLNDADKLEITAELSKLGVDYQLAADG-SIR 75
+V G+A +AIV +L+ PD Y ++ NL+D D I A+L+++ + Y+ A +I
Sbjct: 28 IVAGSAAVAIVVAMVLWAKTPD-YRTLFSNLSDQDGGAIVAQLTQMNIPYRFANGSGAIE 86

Query: 76 VQKNDAPWVRKEMNGMGLPFNSKSGEEILLESSLGSSEQDKKMKQIVGTKKQLEQDIVRN 135
V + +R + GLP G E+L + G S+ +++ + +L + I
Sbjct: 87 VPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQVNYQRALEGELARTI-ET 145

Query: 136 FATVETANVQITLPEKETIFDEEKAKGTAAITVGVKRGQLLTADQVAGIQQMISAAVPGV 195
V++A V + +P+ ++F E+ +A++TV ++ G+ L Q++ + ++S+AV G+
Sbjct: 146 LGPVKSARVHLAMPKP-SLFVREQKSPSASVTVTLEPGRALDEGQISAVVHLVSSAVAGL 204

Query: 196 KAEEVSVIDSKKGVISKGADEAHSTSSSSYEKEVEMQHQIEGKLKQDIDATLMTMFKPNE 255
V+++D ++++ + + ++ + +E ++++ I+A L +
Sbjct: 205 PPGNVTLVDQSGHLLTQSNTSGRDLNDAQ----LKFANDVESRIQRRIEAILSPIVGNGN 260

Query: 256 YKVNTKVSVNYDEVTRQSEKYG-DKGVLRSKQEQEESSTAQEGAETKQGA--GITANG-- 310
+++ + E Y + ++ + + +++ G G +N
Sbjct: 261 VHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPGGVPGALSNQPA 320

Query: 311 -------EVPNYGTNNNQNGKIVYDNKNGNKI----------ENYEIDKTVETIKKHP-E 352
P N QN + N N NYE+D+T+ K + +
Sbjct: 321 PPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSNYEVDRTIRHTKMNVGD 380

Query: 353 LTKTNVVVWVDNDTLVKRKI------DMTTFKEAIGTAAGLQADPNGNFTNGQVNVVTVQ 406
+ + +V V V+ TL K M ++ A G +NVV
Sbjct: 381 IERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDK-----RGDTLNVVNSP 435

Query: 407 FDQPKEEKKKEPEESGINWWLFGGIPAGLLAIGGLVWFFLARRKRKKEEEEYEEYLAEEE 466
F + P ++ L + + W + R + EE A +E
Sbjct: 436 FSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLTRRVEEAKAAQE 495

Query: 467 IAASNESIMEIPEEKI----VPEPKPEPEEPKEPTLDEQVQDATKEHVEGTAKVIKKWLN 522
A + E E ++ + + + + +++++ + A VI++W++
Sbjct: 496 QAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPRVVALVIRQWMS 555


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1488  FLGMOTORFLIG  206    4e-66    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 206 bits (525), Expect = 4e-66
Identities = 116/336 (34%), Positives = 195/336 (58%), Gaps = 6/336 (1%)

Query: 10 LDEISSKEKAAILIRTLEEGVAAKVIEYMTAEEKEVLLREIAKFRVYKPETLENVLGEFL 69
+ ++ K+KAAIL+ ++ +++KV +Y++ EE E L EIAK E +NVL EF
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 70 YELNVKELNLVTPDKEYIRRIF-KNMPEDELEKLLEDLWYN-KDNPFEFLNSLTDLEPLL 127
EL + + + +Y R + K++ + ++ +L + PFEF+ D +L
Sbjct: 72 -ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRA-DPANIL 129

Query: 128 TVLNDESPQTIAIIASYIKPQLASQLIERLPDHKRVETVMGIAKLEQVDGELINQIGDLL 187
+ E PQTIA+I SY+ PQ AS ++ LP + IA +++ E++ ++ +L
Sbjct: 130 NFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVL 189

Query: 188 KSKLNNMAFNAINKTDGLKTIVNILNNVSRGVEKTVFQKLDEMDYELSERIKENMFVFED 247
+ KL +++ G+ +V I+N R EK + + L+E D EL+E IK+ MFVFED
Sbjct: 190 EKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFED 249

Query: 248 LLGLEDLALRRVLEEITDNGVIAKALKIAKEEIKEKLFTCMSSNRKEMILEELDGLGPLK 307
++ L+D +++RVL EI D +AKALK ++EK+F MS M+ E+++ LGP +
Sbjct: 250 IVLLDDRSIQRVLREI-DGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTR 308

Query: 308 MTDAEKAQQTITGTVKKLEKEGRIIVQRG-EDDVLI 342
D E++QQ I ++KLE++G I++ RG E+DVL+
Sbjct: 309 RKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDVLV 344


83  BALH_1494  BALH_1509  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_1494  0       14       -0.382169  flagellar hook protein FlgE
BALH_1495  2       14       -0.337439  hypothetical protein
BALH_1496  2       15       -0.467238  chemotaxis protein
BALH_1497  2       17       0.337559   hypothetical protein
BALH_1498  2       21       1.495474   flagellin
BALH_1499  2       19       0.917633   flagellin
BALH_1500  3       26       0.470247   soluble lytic murein transglycosylase
BALH_1501  4       25       -0.310995  flagellar motor switch protein
BALH_1502  3       20       0.098688   flagellar motor switch protein FliM
BALH_1503  4       16       -0.268477  flagellar motor switch protein
BALH_1504  4       15       -0.271097  flagellar biosynthesis protein FliP
BALH_1505  4       12       -0.253432  flagellar biosynthesis protein FliR
BALH_1506  2       11       0.000684   flagellar biosynthesis protein FlhB
BALH_1507  1       10       0.505870   flagellar biosynthesis protein FlhA
BALH_1508  0       11       0.221255   flagellar biosynthesis regulator FlhF
BALH_1509  0       12       0.105451   flagellar basal body rod protein FlgG
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1494  FLGHOOKAP1    44     1e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 1e-06
Identities = 15/36 (41%), Positives = 24/36 (66%)

Query: 5 LYTSITGMNAAQNALSVTSNNIANAQTVGYKKQKAI 40
+ +++G+NAAQ AL+ SNNI++ GY +Q I
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI 39



Score = 37.6 bits (87), Expect = 8e-05
Identities = 10/39 (25%), Positives = 26/39 (66%)

Query: 397 SNVDLSVEFVDLMLYQRGFQGNAKVIKVSDEVLNEVVNL 435
S V+L E+ +L +Q+ + NA+V++ ++ + + ++N+
Sbjct: 507 SGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1496  HTHFIS        48     2e-08    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 47.9 bits (114), Expect = 2e-08
Identities = 35/125 (28%), Positives = 55/125 (44%), Gaps = 14/125 (11%)

Query: 176 IYIAEDSAMLRQILEETLSSAGYTKMNFFSNGAEALAQIEKLAKEQGEKMFEHIHLLITD 235
I +A+D A +R +L + LS AGY + A I L++TD
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSN-AATLWRWIAA----------GDGDLVVTD 54

Query: 236 IEMPKMDGHHLTKVVKDSEVMNRLPVIIFSSLITNELFHKGEAVGANAQVSKP-DIQELI 294
+ MP + L +K + LPV++ S+ T K GA + KP D+ ELI
Sbjct: 55 VVMPDENAFDLLPRIK--KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELI 112

Query: 295 GLVDK 299
G++ +
Sbjct: 113 GIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1498  FLAGELLIN     127    6e-36    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 127 bits (321), Expect = 6e-36
Identities = 51/140 (36%), Positives = 81/140 (57%)

Query: 4 MRIGTNVLSMNARQSLYENEKRMNVAMEHLATGKKLNHASDNPANVAIVTRMHARASGMR 63
I TN LS+ + +L +++ ++ A+E L++G ++N A D+ A AI R + G+
Sbjct: 2 QVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLT 61

Query: 64 VAIRNNEDAISMLRTAEAALQTVMNILQRMRDLAVQSANGTNSNKNRDSLNKEFQSLTEQ 123
A RN D IS+ +T E AL + N LQR+R+L+VQ+ NGTNS+ + S+ E Q E+
Sbjct: 62 QASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEE 121

Query: 124 IGYIGETTEFNDLSVFDGQN 143
I + T+FN + V N
Sbjct: 122 IDRVSNQTQFNGVKVLSQDN 141



Score = 88.6 bits (219), Expect = 4e-22
Identities = 35/100 (35%), Positives = 55/100 (55%)

Query: 169 DINISTEQEARAAIRKIEEALQNVSLHRADLGTMMNRLQFNIENLNSQSMALTDAASRIE 228
+ + ++ + I+ AL V R+ LG + NR I NL + L A SRIE
Sbjct: 408 EDAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIE 467

Query: 229 DADMAQEMSDFLKFKLLTEVALSMVSQANQIPQMISKLLQ 268
DAD A E+S+ K ++L + S+++QANQ+PQ + LL+
Sbjct: 468 DADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1499  FLAGELLIN     191    3e-57    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 191 bits (485), Expect = 3e-57
Identities = 135/508 (26%), Positives = 224/508 (44%), Gaps = 59/508 (11%)

Query: 1 MRINTNINSLRTQEYMRQNQTKMSNAMDRLSSGKRINNASDDAAGLAIATRMRARESGLG 60
INTN SL TQ + ++Q+ +S+A++RLSSG RIN+A DDAAG AIA R + GL
Sbjct: 2 QVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLT 61

Query: 61 VAANNTQDGMSLIRTADSAMNSVSNILLRMRDIANQSANGTNTDSNKSALQKEFSELQKQ 120
A+ N DG+S+ +T + A+N ++N L R+R+++ Q+ NGTN+DS+ ++Q E + ++
Sbjct: 62 QASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEE 121

Query: 121 ITYIADNTQFNDKNLLNADSEVKIQTLDSSNGDQQIGIDLKAVTLEKLGINNISIGSATT 180
I +++ TQFN +L+ D+++KIQ +N + I IDL+ + ++ LG++ ++
Sbjct: 122 IDRVSNQTQFNGVKVLSQDNQMKIQV--GANDGETITIDLQKIDVKSLGLDGFNVNGPKE 179

Query: 181 A---------------DLKQTDIEAVSTKIAALDKDSVAKDITDIKAAIDKIKDGMKPED 225
A D + + + T +G D
Sbjct: 180 ATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTD 239

Query: 226 VTKLNAALDGFKTGEADDDAAGVTAIKTALS----------------------------- 256
+ N A+D FKT ++ A AI A+
Sbjct: 240 DAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKV 299

Query: 257 KVELPKGSFEVAQKDLDDVSTKIAALDKDSVA-KDITDIKAAIDKIKDGMKPEDVTKLNA 315
+ + D+ + + A S + +
Sbjct: 300 STTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLE 359

Query: 316 ALDGFK------------TGEADDDAAGVTAIKTALSKVELPKLGDTIKPTTNSKADSLA 363
A + K T A D + + K + +K +
Sbjct: 360 ANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKSTAN 419

Query: 364 AVAAIDKALTTVADNRATLGATLNRLDFNVNNLKSQSSAMAASASQIEDADMAKEMSEMT 423
+A+ID AL+ V R++LGA NR D + NL + + + ++ S+IEDAD A E+S M+
Sbjct: 420 PLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSNMS 479

Query: 424 KFKILNEAGISMLSQANQTPQMVSKLLQ 451
K +IL +AG S+L+QANQ PQ V LL+
Sbjct: 480 KAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1500  PF06580       29     0.020    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.1 bits (65), Expect = 0.020
Identities = 8/42 (19%), Positives = 20/42 (47%), Gaps = 1/42 (2%)

Query: 122 LTKKY-NIQKIRSSNEGKYEDIIDRVSHTYGIPKTLIQKMIE 162
+ Y + I+ + ++E+ I+ +P L+Q ++E
Sbjct: 224 VVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVE 265


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1501  TYPE3OMOPROT  42     4e-08    Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 41.5 bits (97), Expect = 4e-08
Identities = 14/67 (20%), Positives = 31/67 (46%)

Query: 5 DDIPLTIYFEIGNTKKKIEDLLHITKGTLYRLENSTKNTVRLMLENEEIGTGKILTKNGK 64
+ +P+ + F + + +L + + L L + + V +M +G G+++ N
Sbjct: 228 NQLPVKLEFVLYRKNVTLAELEAMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDT 287

Query: 65 MYVEIVE 71
+ VEI E
Sbjct: 288 LGVEIHE 294


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1502  FLGMOTORFLIM  145    5e-43    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 145 bits (367), Expect = 5e-43
Identities = 91/329 (27%), Positives = 166/329 (50%), Gaps = 10/329 (3%)

Query: 4 EKLSQEQIDALLKAVNEGEEMPAFAQEAGKQEKFQEYDFNRPEKFGVEHLRSLQAIASTF 63
E LSQ++ID LL A++ G+ A+ K YDF RP+KF E +R+L + TF
Sbjct: 3 EVLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETF 62

Query: 64 GKQTSQTLSARMRIPIELEPSTVEQVPFTSEYVEKMPKDYYLYCVIDLGLPELGEIVIEI 123
+ T+ +LSA++R + + ++V+Q+ + E++ +P L VI + P G V+E+
Sbjct: 63 ARLTTTSLSAQLRSMVHVHVASVDQLTY-EEFIRSIPTPSTL-AVITMD-PLKGNAVLEV 119

Query: 124 DLAFVIYIHECWLGGDSKRNFTMRRPLTAFEFLTLDNIFMLLCKNLEQSFESVVAIEPKF 183
D + I + GG + ++R LT E ++ + + + N+ +S+ V+ + P+
Sbjct: 120 DPSITFSIIDRLFGGT-GQAAKVQRDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRL 178

Query: 184 VTTETDPNALKITTASDIISLLNVNMKTDFWNTTVRIGIPFLSVEEIMDKLTSENIVEHS 243
ET+P +I S+++ L+ + K + IP++++E I+ KL+S+ S
Sbjct: 179 GQIETNPQFAQIVPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFW--FS 236

Query: 244 SDKRKK---YTSEVEAKVNQVYKPVHVAIGEQKMTMGEIEQIEEGDIIPLH-TKVSDELL 299
S +R Y + K++ V V +G ++++ +I + GDII LH T V D +
Sbjct: 237 SVRRSSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFV 296

Query: 300 GYVDGKHKFNCFIGKDGTRKALLFKSFVE 328
+ + KF C G G + A +E
Sbjct: 297 LSIGNRKKFLCQPGVVGKKIAAQILERIE 325


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1503  FLGMOTORFLIN  58     4e-14    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 58.0 bits (140), Expect = 4e-14
Identities = 22/94 (23%), Positives = 50/94 (53%)

Query: 13 LEDFAGKRNEASKAHIDTVSDISIELGVKLGKASITLGDVKELKVGDVLEVEKNLGHKVD 72
+ G + ID + DI ++L V+LG+ +T+ ++ L G V+ ++ G +D
Sbjct: 39 FQQLGGGDVSGAMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLD 98

Query: 73 VYLSNMKVGIGEAIVMDEKFGIIISEIEADKKQA 106
+ ++ + GE +V+ +K+G+ I++I ++
Sbjct: 99 ILINGYLIAQGEVVVVADKYGVRITDIITPSERM 132


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1504  FLGBIOSNFLIP  163    4e-52    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 163 bits (415), Expect = 4e-52
Identities = 71/203 (34%), Positives = 127/203 (62%)

Query: 48 SSVQLFALVTLLSLSSSIVLLFTHFTYFMIVLGITRQGLGVMNLPPNQVLVGLALFLSLF 107
VQ +T L+ +I+L+ T FT +IV G+ R LG + PPNQVL+GLALFL+ F
Sbjct: 40 LPVQTLVFITSLTFIPAILLMMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFF 99

Query: 108 TMQPVLGQLKSDVWDPMTKEKITVSQAAETTAPIMKEYMSKHTYKHDLKMMLKVRGEELP 167
M PV+ ++ D + P ++EKI++ +A E A ++E+M + T + DL + ++
Sbjct: 100 IMSPVIDKIYVDAYQPFSEEKISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPL 159

Query: 168 KDLKDLSLFTLVPSFTLTQIQKGLLTGMFIYLAFVFIDLIISTLLMYLGMMMVPPMILSL 227
+ + + + L+P++ ++++ G I++ F+ IDL+I+++LM LGMMMVPP ++L
Sbjct: 160 QGPEAVPMRILLPAYVTSELKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIAL 219

Query: 228 PFKILVFVYLGGYTKIVDIMFKT 250
PFK+++FV + G+ +V + ++
Sbjct: 220 PFKLMLFVLVDGWQLLVGSLAQS 242


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1505  TYPE3IMRPROT  96     5e-26    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 96.4 bits (240), Expect = 5e-26
Identities = 53/233 (22%), Positives = 113/233 (48%), Gaps = 1/233 (0%)

Query: 10 FFAFCRITSFLYFLPFFSGRSIPAMAKVTFGLALSITVADQVDVSHIKTTWDVAA-YAGT 68
F+ R+ + + P S RS+P K+ + ++ +A + + + A A
Sbjct: 17 FWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDVPVFSFFALWLAVQ 76

Query: 69 QIVIGLSLSKIVEMLWNIPKMAGHILDFDIGLSQASLFDVNAGSQSTLLSTIFDIFFLII 128
QI+IG++L ++ + + AG I+ +GLS A+ D + +L+ I D+ L++
Sbjct: 77 QILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDMLALLL 136

Query: 129 FISLGGINYFVATILKSFQYTEAISKLLTTSFLDSLLATLLFAITSAVEIALPLMGSLFI 188
F++ G + ++ ++ +F + L ++ +L + + +ALPL+ L
Sbjct: 137 FLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFLNGLMLALPLITLLLT 196

Query: 189 INFVLILIAKNAPQLNIFMNAFVIKITCGILFIAMSVPMLGYVFKNMTDVLLE 241
+N L L+ + APQL+IF+ F + +T GI +A +P++ +++ +
Sbjct: 197 LNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCEHLFSEIFN 249


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1506  TYPE3IMSPROT  289    2e-98    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 289 bits (742), Expect = 2e-98
Identities = 92/343 (26%), Positives = 186/343 (54%), Gaps = 2/343 (0%)

Query: 4 DNKTEKATPQKRKKSREEGNIARSKDLNNLFSILVLAVVVYFFGDWLGFEIANSVSVLFD 63
KTE+ TP+K + +R++G +A+SK++ + I+ L+ ++ D+ + + + +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAE 62

Query: 64 QIGKNTDS--TEYFYMMGILLLKVSAPILILVYAFHLFNYMIQVGFLFSSKVIKPKASRI 121
Q + + + + P+L + + ++++Q GFL S + IKP +I
Sbjct: 63 QSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIKKI 122

Query: 122 NPKNYFTRLFSRKSLVDILKSLFYMGLIGYVAYVLFKKNLEKIVSMIGFNWTASLTEIIR 181
NP R+FS KSLV+ LKS+ + L+ + +++ K NL ++ + + +
Sbjct: 123 NPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLGQ 182

Query: 182 QIKFIFLAILIILIVLSIIDFIYQKWEYEQDIKMKKEEVKQEHKDNEGDPQVKGKRKNFM 241
++ + + + +V+SI D+ ++ ++Y +++KM K+E+K+E+K+ EG P++K KR+ F
Sbjct: 183 ILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQFH 242

Query: 242 HAILQGTIAKKMDGATFIVNNPTHISVVLRYNKHVDAAPIVVAKGEDELALYIRTLAREQ 301
I + + + ++ +V NPTHI++ + Y + P+V K D +R +A E+
Sbjct: 243 QEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEEE 302

Query: 302 EIPMVENRPLARSLYYQVEEDETIPEDLYVAVIEVMRYLIQTN 344
+P+++ PLAR+LY+ D IP + A EV+R+L + N
Sbjct: 303 GVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQN 345


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1509  FLGHOOKAP1    28     0.042    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 28.0 bits (62), Expect = 0.042
Identities = 9/57 (15%), Positives = 20/57 (35%), Gaps = 3/57 (5%)

Query: 2 NGLYIGSMGMMNYMQRINVHSNNVANAQTTGFKAENMTSKVFDVQDAYRRGDGAVTN 58
+ + G+ +N SNN+++ G+ + ++ G V N
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMA---QANSTLGAGGWVGN 55


84  BALH_1520  BALH_1524  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_1520  -3      13       -0.588501  multidrug resistance protein
BALH_1521  -3      13       -0.509399  hypothetical protein
BALH_1522  -3      14       -0.853961  LysR family transcriptional regulator
BALH_1523  -2      14       -0.649113  ABC transporter ATP-binding protein
BALH_1524  -1      14       -0.807262  ABC transporter substrate-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1520  TCRTETA       43     1e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 43.3 bits (102), Expect = 1e-06
Identities = 43/229 (18%), Positives = 89/229 (38%), Gaps = 11/229 (4%)

Query: 55 FATTLVCGSLPRMICGPIAGAVADRVSRRWLVVGTDLLSSLTMLIMFILATIFGPSLLFI 114
+ L +L + C P+ GA++DR RR +++ + +++ IM P L +
Sbjct: 45 YGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIM-----ATAPFLWVL 99

Query: 115 YISAALLSICASFYSVALTSSIPNLVDEGRIQKASALNQTAASLSNILGPIIGGVVFGFF 174
YI + I + +VA + I ++ D + + GP++GG++ G F
Sbjct: 100 YIGRIVAGITGATGAVA-GAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLM-GGF 157

Query: 175 SIQSFFLLNSITFFLAVILQLFIVFDLYKKEVAESKEHFLTSIKEGFSYVKRQHEIYGLM 234
S + F + L + F++ + +K E + L + F + + + LM
Sbjct: 158 SPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREAL-NPLASFRWARGMTVVAALM 216

Query: 235 KIALWVNFFACGLTVALPYIIVHTLHLSSKQLGIVEGMLAVGMLMGAIA 283
+ + H + +GI LA ++ ++A
Sbjct: 217 AVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGI---SLAAFGILHSLA 262



Score = 33.6 bits (77), Expect = 0.001
Identities = 18/97 (18%), Positives = 35/97 (36%), Gaps = 2/97 (2%)

Query: 76 VADRVSRRWLVVGTDLLSSLTMLIMFILATIFGPSLLFIYISAALLSICASFYSVALTSS 135
+ V+ R +L + +IL + I +L AL +
Sbjct: 266 ITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPI--MVLLASGGIGMPALQAM 323

Query: 136 IPNLVDEGRIQKASALNQTAASLSNILGPIIGGVVFG 172
+ VDE R + SL++I+GP++ ++
Sbjct: 324 LSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYA 360


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1522  TETREPRESSOR  29     0.018    Tetracycline repressor protein signature.
		>TETREPRESSOR#Tetracycline repressor protein signature.

Length = 218

Score = 29.1 bits (65), Expect = 0.018
Identities = 26/99 (26%), Positives = 40/99 (40%), Gaps = 9/99 (9%)

Query: 21 VAEKLGVKQPTITFHIKSLEEELGVSLFE--LRSGRYFLTEAGEALHHYACKIDALMKEA 78
+A+KLG++QPT+ +H+K+ L E R Y L AGE+ L A
Sbjct: 30 LAQKLGIEQPTLYWHVKNKRALLDALAVEILARHHDYSLPAAGESWQ------SFLRNNA 83

Query: 79 RRVTQEFKDFHKGA-ITIGASYVPATYLLPEIVYQFQCE 116
+ + GA + +G Y E +F E
Sbjct: 84 MSFRRALLRYRDGAKVHLGTRPDEKQYDTVETQLRFMTE 122


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1523  PF05272       30     0.019    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.019
Identities = 13/41 (31%), Positives = 18/41 (43%), Gaps = 7/41 (17%)

Query: 32 TLLGPSGCGKTTLLRMIAGLEEPDKGEIYFGDTCMYSSAKK 72
L G G GK+TL+ + GL+ +F DT K
Sbjct: 600 VLEGTGGIGKSTLINTLVGLD-------FFSDTHFDIGTGK 633


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1524  MALTOSEBP     29     0.036    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 28.5 bits (63), Expect = 0.036
Identities = 78/308 (25%), Positives = 116/308 (37%), Gaps = 69/308 (22%)

Query: 40 EKKIVVYSAGPKG---LAEKIQKDFEKKTGIKVEMFQGTTGKILARMEAEKKNPVVDV-- 94
E K+V++ G KG LAE + K FEK TGIKV + + E+K P V
Sbjct: 30 EGKLVIWINGDKGYNGLAE-VGKKFEKDTGIKVTVEHPD--------KLEEKFPQVAATG 80

Query: 95 ----VVLASLPAMEGLKKDGQTLAYKEAKQADKLRSEWSDDKGHYFG------YSASALG 144
++ + G + G K ++ D Y G + AL
Sbjct: 81 DGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTWDAVRYNGKLIAYPIAVEALS 140

Query: 145 IVYNTKNVKTAPEDWSDI--------TKGEWKGKVNLPDP--------ALSGSALDFVTG 188
++YN + P+ W +I KG+ NL +P A G A + G
Sbjct: 141 LIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEPYFTWPLIAADGGYAFKYENG 200

Query: 189 -YVKKN-------GKDGWNLFEQLKKNEVTVAGANQEALDPVVT-GAKDMVIAG------ 233
Y K+ K G L KN+ A + + G M I G
Sbjct: 201 KYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAEAAFNKGETAMTINGPWAWSN 260

Query: 234 -----VDY-MTYSAKAKGEPVDIVYPKSGTVISPRAAGIMKDSKNVEGAKEFID-YLLSD 286
V+Y +T KG+P S + +AGI S N E AKEF++ YLL+D
Sbjct: 261 IDTSKVNYGVTVLPTFKGQP-------SKPFVGVLSAGINAASPNKELAKEFLENYLLTD 313

Query: 287 DVQKQISK 294
+ + ++K
Sbjct: 314 EGLEAVNK 321
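
The input parameters at the top of this page list an RPSBlast E-value cutoff of 0.05, and every hit reported for this island falls under it, some only narrowly. The sketch below shows how such a cutoff could be applied to the hits of island 84, and how a stricter one changes the result. It assumes the hits are held as (locus tag, signature, E-value) tuples; the function name, the data layout, the inclusive comparison and the stricter cutoff of 1e-3 are all illustrative choices, not PredictBias settings.

```python
# Hits reported for island 84, copied from the report:
# (locus tag, signature, E-value).
hits = [
    ("BALH_1520", "TCRTETA", 1e-06),
    ("BALH_1522", "TETREPRESSOR", 0.018),
    ("BALH_1523", "PF05272", 0.019),
    ("BALH_1524", "MALTOSEBP", 0.036),
]

def filter_hits(hits, max_evalue):
    """Keep hits whose E-value is at or below max_evalue, best first."""
    kept = [h for h in hits if h[2] <= max_evalue]
    return sorted(kept, key=lambda h: h[2])

PAGE_CUTOFF = 0.05    # E-value (RPSBlast) from the input parameters
STRICT_CUTOFF = 1e-3  # illustrative only, not a PredictBias setting

print(len(filter_hits(hits, PAGE_CUTOFF)))                 # all four hits pass
print([h[0] for h in filter_hits(hits, STRICT_CUTOFF)])    # only BALH_1520 remains
```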


85  BALH_1578  BALH_1585  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_1578  -1      11       0.483228   multidrug ABC transporter permease
BALH_1579  -1      12       0.759809   cardiolipin synthetase
BALH_1580  -2      11       1.354906   uridylate kinase
BALH_1581  -3      9        0.961575   proton/sodium-glutamate symport protein
BALH_1582  -2      8        0.791604   aspartate ammonia-lyase
BALH_1583  -1      9        0.165057   malate dehydrogenase
BALH_1584  -1      10       -0.464066  sensor histidine kinase
BALH_1585  0       10       -0.401773  response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1578  ABC2TRNSPORT  45     2e-07    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 45.3 bits (107), Expect = 2e-07
Identities = 28/106 (26%), Positives = 47/106 (44%)

Query: 276 IVMIGVLMLFALIAIGISLVLVAFSKNSASANTMQNLVIVPTCLLAGCYFPYDIMPKAVQ 335
+ + V+ L L + +V+ A + + Q LVI P L+G FP D +P Q
Sbjct: 148 LYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQ 207

Query: 336 KVADFLPQRWLLDTIAKLQQGIPFSELYVNILILFAFAVAFFLIAI 381
A FLP +D I + G P ++ ++ L + V F ++
Sbjct: 208 TAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLST 253


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1580  CARBMTKINASE  28     0.028    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 28.3 bits (63), Expect = 0.028
Identities = 15/60 (25%), Positives = 24/60 (40%), Gaps = 14/60 (23%)

Query: 122 LDNGYIVIFGGGNGQPFVTT-------------DYPSVQRAIEMNSDAILVAKQGVDGVF 168
++ G IVI GG G P + D + A E+N+D ++ V+G
Sbjct: 183 VERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILT-DVNGAA 241


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1584  PF06580       37     1e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 1e-04
Identities = 26/130 (20%), Positives = 46/130 (35%), Gaps = 20/130 (15%)

Query: 297 LGKDIRFSKHIEGEHAAYHV--YTVLSIFNNLVANAVEAIEDRGLIHIKLYKREQHVIFE 354
++F I V V ++ N + + + + G I +K K V E
Sbjct: 236 FEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLE 295

Query: 355 VIDDGPGITQKYKKLVFKPGFTSKYDQTGTPSTGIGLSYIDEMVTEL-GGEVRLEDNENG 413
V + G + K+ STG GL + E + L G E +++ +E
Sbjct: 296 VENTGSLALKNTKE-----------------STGTGLQNVRERLQMLYGTEAQIKLSEKQ 338

Query: 414 NGCKFIVCLP 423
+V +P
Sbjct: 339 GKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1585  HTHFIS        54     3e-10    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 54.1 bits (130), Expect = 3e-10
Identities = 29/177 (16%), Positives = 71/177 (40%), Gaps = 10/177 (5%)

Query: 19 IVDDDEVFRSMLSQIIEDGDLGEVIGESEDGAFIEAEQLNYKKVDILFIDLLMPMRDGIE 78
+ DDD R++L+Q + I + + + D++ D++MP + +
Sbjct: 8 VADDDAAIRTVLNQALSRAGYDVRITSNAATLW---RWIAAGDGDLVVTDVVMPDENAFD 64

Query: 79 TVRHIASSFTG-KIIMISQVESKQLIGEAYTLGVEYYITKPLNKIEVVSVVRKVIERIRL 137
+ I + ++++S + +A G Y+ KP + E++ ++ + + +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 138 ERSIYDIQKSLNNVFQWEKPQMRNETVQEGKKISDSGRFLLSELGIAGENGS-KDLL 193
S + M+ E + ++ + L+ I GE+G+ K+L+
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAMQ-EIYRVLARLMQTDLTLM----ITGESGTGKELV 176


86  BALH_1653  BALH_1660  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_1653  -2      14       -0.876505  two-component response regulator
BALH_1654  -3      12       -0.884749  sensor histidine kinase
BALH_1655  -3      11       -0.185277  cell wall endopeptidase
BALH_1656  -3      10       -0.601403  proline iminopeptidase
BALH_1657  -2      10       -1.080040  manganese transport protein MntH
BALH_1658  -1      11       -1.351883  hypothetical protein
BALH_1659  0       10       -1.764790  hypothetical protein
BALH_1660  0       11       -1.674745  2-dehydropantoate 2-reductase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1653  HTHFIS        96     4e-25    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 96.4 bits (240), Expect = 4e-25
Identities = 39/169 (23%), Positives = 71/169 (42%), Gaps = 13/169 (7%)

Query: 16 MEKNSILIVDDDQDIVRFVNANLMQEGFKVLSAHNGEEALKIINNNSIQLAILDIMMPHM 75
M +IL+ DDD I +N L + G+ V N + I L + D++MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 76 DGIELCRRIREKHS-LPIMFLSAKSSDVDKVIGFSTGADDYIVKPFSTIEFIARVKAQLR 134
+ +L RI++ LP++ +SA+++ + + GA DY+ KPF E I + L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 135 RYTYFNQNAVQVIEKKISIRGL---------EIDEVSRT---VMLYGET 171
+ + + G + + +T +M+ GE+
Sbjct: 121 EPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGES 169


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1654  PF06580       30     0.017    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.8 bits (67), Expect = 0.017
Identities = 17/103 (16%), Positives = 39/103 (37%), Gaps = 23/103 (22%)

Query: 274 VISNSIMYG----KDGKQILIQISKRDLNVEIEIKNFGQCIPNENLPYVFEKFYRGEKSR 329
++ N I +G G +IL++ +K + V +E++N G
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGS----------------LALKN 306

Query: 330 SSHTGGKGMGLAIARSIAELHKGD--ITVRSNEKETVFTIALP 370
+ + G G+ R + L+ + I + + + + +P
Sbjct: 307 TKESTGTGLQNVRER-LQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1655  RTXTOXIND     29     0.024    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.0 bits (65), Expect = 0.024
Identities = 9/35 (25%), Positives = 18/35 (51%), Gaps = 3/35 (8%)

Query: 202 SGTFLVLAHLKKG---SIKVREGQHVNEGEFLAQV 233
SG + ++ I V+EG+ V +G+ L ++
Sbjct: 93 SGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKL 127


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1660  NUCEPIMERASE  32     0.003    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 31.7 bits (72), Expect = 0.003
Identities = 18/83 (21%), Positives = 32/83 (38%), Gaps = 14/83 (16%)

Query: 1 MRILVLGAGG-VGGFFGGRLVEKGEDVTFL----------VRSKRKKQLEERGLVIRSVN 49
M+ LV GA G +G RL+E G V + ++ R + L + G
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQF--HK 58

Query: 50 GDFSFQPKLITKEDRTAPFDVIL 72
D + + +T + F+ +
Sbjct: 59 IDLADRE-GMTDLFASGHFERVF 80


87  BALH_1747  BALH_1754  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_1747  -1      16       -2.789189  DNA-binding response regulator
BALH_1748  0       14       -2.437927  sensor histidine kinase
BALH_1749  -1      13       -1.921148  polysaccharide deacetylase
BALH_1750  0       14       -1.934897  lipoprotein
BALH_1751  1       14       -1.717294  MgtC/SapB family membrane protein
BALH_1752  0       13       -1.897561  acetyltransferase
BALH_1753  0       13       -1.722778  siderophore biosynthesis protein
BALH_1754  0       15       -1.576399  siderophore biosynthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1747  HTHFIS        96     4e-25    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 96.1 bits (239), Expect = 4e-25
Identities = 29/123 (23%), Positives = 58/123 (47%), Gaps = 3/123 (2%)

Query: 5 PTILVIEDEIPIRSFIVLNLKRAGFYVLEASTGEEALQILCEHTVDVALLDVMLPGMDGF 64
TILV +D+ IR+ + L RAG+ V S + + D+ + DV++P + F
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 65 QVCKAIREENKKIGIIMLTARVQNEDKVQGLGIGADDYIAKPFSP---VELTARIQSLLR 121
+ I++ + +++++A+ ++ GA DY+ KPF + + R + +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 122 RIE 124
R
Sbjct: 124 RRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1748  PF06580       39     2e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.5 bits (92), Expect = 2e-05
Identities = 22/101 (21%), Positives = 44/101 (43%), Gaps = 22/101 (21%)

Query: 371 IVQNAIKY----SHKNGKVYIEATKNEGQAVIKVKDDGIGIAKEHLPYIEQSFYQINNHT 426
+V+N IK+ + GK+ ++ TK+ G ++V++ G K N
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALK--------------NTK 308

Query: 427 TGAGLGLAIVKKMVELHGG---TINIISKEGIGTTILIKLP 464
G GL V++ +++ G I + K+G ++ +P
Sbjct: 309 ESTGTGLQNVRERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1753  PF04183       287    1e-91    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 287 bits (737), Expect = 1e-91
Identities = 101/543 (18%), Positives = 192/543 (35%), Gaps = 55/543 (10%)

Query: 82 QFYYQMGDSNSVMKADYVTVITFLIKEMSINYG-EGTNPAELMLRVIRSCQNIEEFTKER 140
+ + D+ ++ AD + L+ ++ AE M + + + K R
Sbjct: 54 IWGWLWIDAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKAR 113

Query: 141 KEDTSALYGFHTSFIEAEQSLLFGHLTHPTPKSRQGILEWKSAMYSPELKGECQLHYFRA 200
+ +++ + Q LL GH K R+G + Y+PE +LH+
Sbjct: 114 RGLSASDL--INLNADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAV 171

Query: 201 HKSIVNEKSLLLDSTTVILKEELRNDEM-VSKEFISKYCNEDEYSLLPIHPLQAEWLLHQ 259
+ + + +L + E + + + + LP+HP Q + +
Sbjct: 172 KREHMIWRCDNEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIAT 231

Query: 260 PYVQDWIEQGVLEYIGPTGKCYMATSSLRTLYHPDAKYMLKFSFPVKV--TNSMRINKLK 317
++ D +G + +G G ++A SLRTL + + L P+ + T+ R +
Sbjct: 232 DFIAD-FAEGRMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGR 290

Query: 318 ELESGLEGKAMLNTAI-GEVLEKFPGFDFICDPAFITL-----------NYGTQESGFEV 365
+ +G L + G + +PA + Y QE V
Sbjct: 291 YIAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEM-LGV 349

Query: 366 IIRENPFYSEHADDATLIAGLVQDAIPGERTRLSNIIHRLADLESRSCEEVSLEWFRRYM 425
I RENP D++ ++ + + + I R E W +
Sbjct: 350 IWRENPCRWLKPDESPVLMATLMECDENNQPLAGAYIDRS----GLDAET----WLTQLF 401

Query: 426 NISLKPMVWMYLQYGVALEAHQQNSVVQLKDGYPVKYYFRDNQG-FYFCNSMKEMLNNEL 484
+ + P+ + +YGVAL AH QN + +K+G P + +D QG +++
Sbjct: 402 RVVVVPLYHLLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLP 461

Query: 485 AGIGERTGNLYDDYIVDERFRYYL--IFNHMFGLINGFGTAGLIREEILLTELRTVLES- 541
+ + T L DY++ + + + + L+ G + E L VL
Sbjct: 462 QEVRDVTSRLSADYLIHDLQTGHFVTVLRFISPLMVRLG----VPERRFYQLLAAVLSDY 517

Query: 542 ----------FLPYNREPSTFLRELLEEDKLACKANLLTRFFDVDELSNPLEQAIYVQVQ 591
F ++ +R +L KL + D+D S L +Q
Sbjct: 518 MKKHPQMSERFALFSLFRPQIIRVVLNPVKLT--------WPDLDGGSRMLPN-YLEDLQ 568

Query: 592 NPL 594
NPL
Sbjct: 569 NPL 571


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1754  PF04183       585    0.0      IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 585 bits (1509), Expect = 0.0
Identities = 146/602 (24%), Positives = 267/602 (44%), Gaps = 45/602 (7%)

Query: 14 IESEDYISVRRRVLRQLVESLIYEGIITPARIEKEEQILFLIQGLDEDNKSVTYECYGRE 73
+ +D+ V RR++ +++ L YE + + +++ + G + + E
Sbjct: 1 MNHKDWDLVNRRLVAKMLSELEYEQVFHA-ESQGDDRYCINLPG-------AQWR-FIAE 51

Query: 74 RITFGRISIDSLIVRVQDGKQEIQSVAQFLEEVFRVVNVEQTKLDSFIHELEQTIFKDTI 133
R +G + ID+ +R D Q L ++ +V+++ + + +L T+ D
Sbjct: 52 RGIWGWLWIDAQTLRCADEPVLAQ---TLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQ 108

Query: 134 AQYER--CNKLKYTQKSYDELENHLIDGHPYHPSYKARIGFQYRDNFRYGYEFMRPIKLI 191
R + + D L+ L+ GHP K R G+ RY E+ +L
Sbjct: 109 LLKARRGLSASDLINLNADRLQ-CLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLH 167

Query: 192 WIAAHKKNATVGYENEVIYDKILKSEVGERKLEAYKERIHSMGCDPKQYLFIPVHPWQWE 251
W+A +++ +NE+ ++L + + ++ + + G D +L +PVHPWQW+
Sbjct: 168 WLAVKREHMIWRCDNEMDIHQLLTAAMDPQEFARFSQVWQENGLD-HNWLPLPVHPWQWQ 226

Query: 252 NFIISNYAEDIQDKGIIYLGESADDYCAQQSMRTLRNVTNPKRPYVKVSLNILNTSTLRT 311
I +++ D + ++ LGE D + AQQS+RTL N + +K+ L I NTS R
Sbjct: 227 QKIATDFIADFAEGRMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRG 286

Query: 312 LKPYSVASAPAISNWLSNVVSQDSYLRDESRVILLKEFSSVM----YDTNKKATYG---S 364
+ +A+ P S WL V + D+ L VIL + + + Y +A Y
Sbjct: 287 IPGRYIAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEM 346

Query: 365 LGCIWRESVHHYLGEQEDAVPFNGLYAKEKDGTPIIDAWLNKYGI--ENWLRLLIQKAII 422
LG IWRE+ +L E V L +++ P+ A++++ G+ E WL L + ++
Sbjct: 347 LGVIWRENPCRWLKPDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVV 406

Query: 423 PVIHLVVEHGIALESHGQNMILVHKEGLPVRIALKDFHEGLEFYRPFLKEMNKCPDFTKM 482
P+ HL+ +G+AL +HGQN+ L KEG+P R+ LKDF + + EM+ P +
Sbjct: 407 PLYHLLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRD 466

Query: 483 HKTYANGKMNDFFEMDRIECLQEMVLDALFLFNVGELAFVLADKYEWKEESFWMIVVEEI 542
+ + D+ D L V L + E F+ ++ +
Sbjct: 467 VTSRLS---ADYLIHD---------LQTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVL 514

Query: 543 ENHFRKYPHLKDRFESIQLYTPTFYAEQLTKRRL-YMDVESLVHEVP-------NPLYRA 594
++ +K+P + +RF L+ P L +L + D++ +P NPL+
Sbjct: 515 SDYMKKHPQMSERFALFSLFRPQIIRVVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWLV 574

Query: 595 RQ 596
Q
Sbjct: 575 TQ 576


88  BALH_1988  BALH_1993  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_1988  1       15       -1.412947  acetyltransferase
BALH_1989  0       16       -1.521634  hypothetical protein
BALH_1990  0       18       -2.235536  acetyltransferase
BALH_1991  1       18       -2.397939  ABC transporter ATP-binding protein
BALH_1992  1       15       -3.352799  hypothetical protein
BALH_1993  0       11       -2.075048  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1988  SACTRNSFRASE  52     5e-11    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 51.9 bits (124), Expect = 5e-11
Identities = 24/92 (26%), Positives = 44/92 (47%), Gaps = 5/92 (5%)

Query: 58 MERKESVIFVAVEDGEYIGFTQLYPSFSSISMKELWILNDLFVQAAKRGAGTGKKLLEAA 117
+E + F+ + IG ++ +++ ++ D+ V R G G LL A
Sbjct: 60 VEEEGKAAFLYYLENNCIGRIKIRSNWN-----GYALIEDIAVAKDYRKKGVGTALLHKA 114

Query: 118 KEFALENGAKGVKLQTEIDNLSAQRLYAENGY 149
E+A EN G+ L+T+ N+SA YA++ +
Sbjct: 115 IEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1990  SACTRNSFRASE  50     1e-10    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 50.3 bits (120), Expect = 1e-10
Identities = 27/109 (24%), Positives = 46/109 (42%), Gaps = 6/109 (5%)

Query: 32 SYEDMNNRLQFVQMSPFDFLYVYEEEKTIFGLLGFRIRENLEDITRYGEISIISVDSTIR 91
YED + + +V+ ++Y E G + R N Y I I+V R
Sbjct: 49 QYEDDDMDVSYVEEEG-KAAFLYYLENNCIGRIKIRSNWN-----GYALIEDIAVAKDYR 102

Query: 92 RKGIGHILMDYAEQLAKKHNCIGTWLVSGTKRVEAHPFYKKLGYEVNGY 140
+KG+G L+ A + AK+++ G L + + A FY K + +
Sbjct: 103 KKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAV 151


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1991  PF05272       32     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.0 bits (72), Expect = 0.002
Identities = 14/36 (38%), Positives = 18/36 (50%)

Query: 43 LIGANGAGKSTTIKTMLGLLVNVNGEISFGEKKNPY 78
L G G GKST I T++GL + G K+ Y
Sbjct: 601 LEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSY 636


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_1993  TRNSINTIMINR  29     0.019    Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 28.5 bits (63), Expect = 0.019
Identities = 23/91 (25%), Positives = 41/91 (45%), Gaps = 9/91 (9%)

Query: 42 DKPTSTAGQQNLESTSYTYEETNDRLTTDTFITYAMQEAEKQSMQKFGTKIGPVIEDEFK 101
D PT+T Q + + T D+LT + F E +K ++ G I E K
Sbjct: 264 DDPTTTDPDQ---AANAAESATKDQLTQEAFKN---PENQKVNIDANGNAIP---SGELK 314

Query: 102 DVILPKIEEAIAELANDVPEESLQSLAISQK 132
D I+ +I + E +++++S A +Q+
Sbjct: 315 DDIVEQIAQQAKEAGEVARQQAVESNAQAQQ 345


89  BALH_2103  BALH_2119  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2103  0       9        -0.188308  exonuclease
BALH_2104  0       12       0.754737   ArsR family transcriptional regulator
BALH_2105  0       11       1.130713   hypothetical protein
BALH_2106  -2      10       1.293997   oxalate/formate antiporter
BALH_2107  -3      10       1.355350   2,3-dihydroxybenzoate-2,3-dehydrogenase
BALH_2108  -3      10       1.258622   isochorismate synthase DhbC
BALH_2109  -3      11       1.071015   2,3-dihydroxybenzoate-AMP ligase
BALH_2110  -3      11       0.521596   isochorismatase
BALH_2111  -3      11       0.513619   nonribosomal peptide synthetase
BALH_2112  -1      16       -1.864172  mbtH-like protein
BALH_2113  -2      16       -2.038222  drug resistance transporter
BALH_2114  0       17       -2.045188  4'-phosphopantetheinyl transferase
BALH_2115  -1      14       -1.119009  hypothetical protein
BALH_2116  -1      16       -1.085040  HU family DNA-binding protein
BALH_2117  -1      14       -1.611212  hypothetical protein
BALH_2118  0       15       -1.226882  DinB protein
BALH_2119  -1      16       -1.039211  thermitase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
BALH_2103  GPOSANCHOR  35     0.002    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 34.7 bits (79), Expect = 0.002
Identities = 47/335 (14%), Positives = 103/335 (30%), Gaps = 35/335 (10%)

Query: 327 EQWHEEAMQNEQKAESLLKQIIAKKEKIMNNFELAQEKYEVVKNKEPERENVKKLVQRLE 386
E+ E A + E + +L + + E E + N + + K +
Sbjct: 53 EKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKA 112

Query: 387 -ELQPIIASLAEKQLSLQNAEIQIGKLKESMQNLDRQLEEHTNQKQLMTGELQQLERALE 445
++Q + A A+ + +L+ A ++ L+ + +K + L+
Sbjct: 113 SKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFST 172

Query: 446 QYVDKVEELTNMREDAKVLKQAYDVWQEKQKFEKEKEAAYSKMQLAVNAYENMERRWLSE 505
K++ L EK E + ++ A+N + +
Sbjct: 173 ADSAKIKTLEA----------------EKAALEARQAELEKALEGAMNFSTADSAKIKTL 216

Query: 506 QAGILALHLHDGESCPVCGSTTHPKKATEQSGAIDENELNDLRDKKNIAEKLHVQVEEKW 565
+A AL K E++ N K E +E +
Sbjct: 217 EAEKAAL--------------AARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQ 262

Query: 566 NFYHLQYEQVIEEVKKRGYQSEELVETYSALVQKGKQLATEVNTLKASEETRKQ----TA 621
E + + + L +AL + L + L A+ ++ ++ +
Sbjct: 263 AELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASR 322

Query: 622 VKIKSVEEKVDALQKQKREVETEQHRIEMDCMQLR 656
K +E + L++Q + E + + D R
Sbjct: 323 EAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASR 357


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2106  TCRTETA  47     8e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 46.7 bits (111), Expect = 8e-08
Identities = 41/195 (21%), Positives = 84/195 (43%), Gaps = 8/195 (4%)

Query: 206 MLGTKQVYLLFIMLFTSCMSGLYLIGMVKDIGVQLVGLSAATAANAVAMVAIFNTLGRI- 264
M + + ++ + + G+ LI V ++ + S A+ ++A++ +
Sbjct: 1 MKPNRPLIVILSTVALDAV-GIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFAC 59

Query: 265 --ILGPLSDKIGRLKIVTGTFVVMASSVLVLSFVDLNYGIYFVCVASVAFCFGGNITIFP 322
+LG LSD+ GR ++ + A +++ + +Y + VA G +
Sbjct: 60 APVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRI--VAGITGATGAVAG 117

Query: 323 AIVGDFFGMKNHSKNYGIVYQGFGFGALAGSFIGALLGGFKP--TFMVIGLLCVVSFIIA 380
A + D ++++G + FGFG +AG +G L+GGF P F L ++F+
Sbjct: 118 AYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTG 177

Query: 381 MLIQAPNHKKEQEEE 395
+ +HK E+
Sbjct: 178 CFLLPESHKGERRPL 192



Score = 36.0 bits (83), Expect = 2e-04
Identities = 24/146 (16%), Positives = 59/146 (40%), Gaps = 13/146 (8%)

Query: 8 PWLVVLGTVIVQMGLGTIYTWSLFNQPLVSKYGWSLNAVAITFSITSLSLA-FSTLFASK 66
L+ + ++ +G W +F + ++ W + I+ + + + +
Sbjct: 213 AALMAVFFIMQLVGQVPAALWVIFGE---DRFHWDATTIGISLAAFGILHSLAQAMITGP 269

Query: 67 LQEKWGLRKLIMIAGLALGLGLILSSQASS----LILLYVLAGVVVGYADGTAYITSLSN 122
+ + G R+ +M+ +A G G IL + A+ ++ +LA +G A ++ +
Sbjct: 270 VAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVD 329

Query: 123 LIKWFPERKGLIAGISVSAYGSGSLI 148
ER+G + G + S++
Sbjct: 330 -----EERQGQLQGSLAALTSLTSIV 350



Score = 32.1 bits (73), Expect = 0.004
Identities = 52/317 (16%), Positives = 107/317 (33%), Gaps = 38/317 (11%)

Query: 63 FASKLQEKWGLRKLIMIAGLALGLGLILSSQASSLILLYVLAGVVVGYADGT-------- 114
L +++G R +++++ + + + A L +LY + +V G T
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLY-IGRIVAGITGATGAVAGAYI 120

Query: 115 AYITSLSNLIKWFPERKGLIAGISVSAYGSGSLIFKYVNAQLIESVGVSQAFIYWGLIVT 174
A IT + F G + +G G ++ V L+ F +
Sbjct: 121 ADITDGDERARHF--------GFMSACFGFG-MVAGPVLGGLMGGFSPHAPFFAAAALNG 171

Query: 175 AMIVLGACLI---HQAADQSAVQETKTHEYTTKEMLGTKQVYLLFIMLFTSCMSGLYLIG 231
+ G L+ H+ + +E + + G V L + F + G
Sbjct: 172 LNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAA 231

Query: 232 MVKDIGVQLVGLSAATAANAVAMVAIFNTLGRIIL-GPLSDKIGRLKIVTGTFVVMASSV 290
+ G A T ++A I ++L + ++ GP++ ++G + + + +
Sbjct: 232 LWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGY 291

Query: 291 LVLSFVDLNYGIYFVCVASVAFCFGGNITIFPAIVGDFFGMKNHSKNYGIVYQGFGFGAL 350
++L+F + + PA+ S+ QG G+L
Sbjct: 292 ILLAFATRGW-----MAFPIMVLLASGGIGMPALQAML------SRQVDEERQGQLQGSL 340

Query: 351 A-----GSFIGALLGGF 362
A S +G LL
Sbjct: 341 AALTSLTSIVGPLLFTA 357


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2107  DHBDHDRGNASE  323    e-114    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 323 bits (829), Expect = e-114
Identities = 164/261 (62%), Positives = 197/261 (75%), Gaps = 3/261 (1%)

Query: 1 MNVGEFDGKTVLVTGAAQGIGSVVAKMFLERGATVIAVDQNGEGLNVLLNQNETRMKI-- 58
MN +GK +TGAAQGIG VA+ +GA + AVD N E L +++ + +
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 59 -FHLDVSDSNAVEDTVKRIENDIAPIDILVNVAGVLRMGAIHSLSDEDWNKTFSVNSTGV 117
F DV DS A+++ RIE ++ PIDILVNVAGVLR G IHSLSDE+W TFSVNSTGV
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 118 FYMSRAVSKHMMQRKSGAIVTVGSNAANTPRVEMAAYAASKAATTMFMKCLGLELAAYNI 177
F SR+VSK+MM R+SG+IVTVGSN A PR MAAYA+SKAA MF KCLGLELA YNI
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 178 RCNLVSPGSTETEMQRLLWADENGAKNIIAGSQNTYRLGIPLQKIAQPSEIAEAVLFLAS 237
RCN+VSPGSTET+MQ LWADENGA+ +I GS T++ GIPL+K+A+PS+IA+AVLFL S
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVS 240

Query: 238 DKASHITMHNLCVDGGATLGV 258
+A HITMHNLCVDGGATLGV
Sbjct: 241 GQAGHITMHNLCVDGGATLGV 261


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2110  ISCHRISMTASE  389    e-139    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 389 bits (1001), Expect = e-139
Identities = 177/306 (57%), Positives = 229/306 (74%), Gaps = 9/306 (2%)

Query: 1 MAIPSISVYKMPIESELPKNKVNWTPDPKRAVLLIHDMQEYFLDAYSDKESPKVELISNI 60
MAIP+I Y+MP S++P+NKV+W PDP RAVLLIHDMQ YF+DA++ SP EL +NI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 KVIREKCKELGIPVVYTAQPGGQTLEQRGLLQDFWGDGIPAGPDKKKIVDELSPNDDDIF 120
+ ++ +C +LGIPVVYTAQPG Q + R LL DFWG G+ +GP ++KI+ EL+P DDD+
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LTKWRYSAFKKTNLLEILNEQGRDQLIICGIYAHIGCLLTACEAFMDGIEPFFVADAVAD 180
LTKWRYSAFK+TNLLE++ ++GRDQLII GIYAHIGCL+TACEAFM+ I+ FFV DAVAD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSLGHHKQALEYASSRCAVTTSTHLLLKDLQSVKGD---------KSEGITLQEVHELVA 231
FSL H+ ALEYA+ RCA T T LL LQ+ D K T + + + +A
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 232 QLLREPVESIETDEDLLNRGLDSVRIMSLVEKWRREGKEITFADLAENPTVVDWYRLLSP 291
+LL+E E I EDLL+RGLDSVRIM+LVE+WRREG E+TF +LAE PT+ +W +LL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLTT 300

Query: 292 QTEHVL 297
+++ VL
Sbjct: 301 RSQQVL 306


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2113  TCRTETB  119    1e-31    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 119 bits (301), Expect = 1e-31
Identities = 90/398 (22%), Positives = 171/398 (42%), Gaps = 14/398 (3%)

Query: 20 FMAAMDATIVNVALQTISKELQVLPSAMGTVNVGYLVSLAVFLPISGWLGDRFGTKRIFL 79
F + ++ ++NV+L I+ + P++ VN ++++ ++ + G L D+ G KR+ L
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 TALFVFTTASALCGIANDITSLNIF-RIIQGAGGGLLTPVGMAMLFRTFSPEERPKISRF 138
+ + S + + + SL I R IQGAG + M ++ R E R K
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 IVLPIAVAPAVGPIIGGFFVDQMSWRWAFYINLPFGIIALLFGLLFLKEHIEKSAGRFDS 198
I +A+ VGP IGG + W++ + +P I + L+ L + + G FD
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHYIH--WSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 199 LGFILSAPGFAMIIYALSQGPSRGWISTEIISTGIAGTVFITLFILVELKVKQPMLDLRL 258
G IL + G + ++ IS I + +F+ KV P +D L
Sbjct: 202 KGIILMSVGIVFFMLF---------TTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 259 LKEPVFRKMSLISLFSSAGLLGMLFVFPLMYQNVIGVSALESG-LTTFPEAIGLMISSQI 317
K F L + G + + P M ++V +S E G + FP + ++I I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 318 VPWSYKKLGARKVISIGLICTVIIFVLLSFVNHDTNPWQIRALLFGIGIFLGQSVGAVQF 377
+ G V++IG+ + F+ SF+ T+ + ++F +G L + +
Sbjct: 313 GGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLG-GLSFTKTVIST 371

Query: 378 SAFNNITPPSMGRATTIFNVQNRLGSAIGVAVLASILA 415
+++ G ++ N + L G+A++ +L+
Sbjct: 372 IVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2114  ENTSNTHTASED  39     8e-06    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 38.9 bits (90), Expect = 8e-06
Identities = 22/129 (17%), Positives = 48/129 (37%), Gaps = 23/129 (17%)

Query: 53 RARFIIGCVISRLVLGKILSMSPVQVPIDRMCPVCKLQHGRPQLPEGMPQLSVSHSGEWV 112
+A + G + + L + + + V D+ +P P+G+ S+SH
Sbjct: 47 KAEHLAGRIAAVHALRE-VGVRTVPGMGDK---------RQPLWPDGLFG-SISHCATTA 95

Query: 113 VVAFTKFAPVGVDVEQMNPNVDVMKMAEGVLTDIEKAQVMKLPNEQKIEGFLTYWTR--- 169
+ ++ +G+D+E++ ++A ++ E+ Q
Sbjct: 96 LAVISR-QRIGIDIEKIMSQHTATELAPSIIDSDER------QILQASLLPFPLALTLAF 148

Query: 170 --KEAVLKA 176
KE+V KA
Sbjct: 149 SAKESVYKA 157


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2116  DNABINDINGHU  124    3e-41    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 124 bits (313), Expect = 3e-41
Identities = 57/89 (64%), Positives = 74/89 (83%)

Query: 2 NKTELIKNVAQSADISQKDASAAVQSVFDTIATALQSGDKVQLIGFGTFEVRERSARTGR 61
NK +LI VA++ ++++KD++AAV +VF +++ L G+KVQLIGFG FEVRER+AR GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGEEIQIAAGKVPAFKAGKELKEAVK 90
NPQTGEEI+I A KVPAFKAGK LK+AVK
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAVK 91


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
BALH_2119  SUBTILISIN  267    3e-89    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 267 bits (683), Expect = 3e-89
Identities = 101/304 (33%), Positives = 152/304 (50%), Gaps = 19/304 (6%)

Query: 110 TPNDPYYKN-QYGLQKIQAPLAWDSQRSDSSIKVAIIDTGVQGSHPDLSSKVIYGHDYVD 168
+ G++ IQAP W+ R +KVA++DTG HPDL +++I G ++ D
Sbjct: 13 IKQEQQVNEIPRGVEMIQAPAVWNQTRG-RGVKVAVLDTGCDADHPDLKARIIGGRNFTD 71

Query: 169 NDN----VSDDGNGHGTHCAGITGALTNNSVGIAGVAPQTSIYAVRVLDNQGSGTLDAVA 224
+D + D NGHGTH AG A T N G+ GVAP+ + ++VL+ QGSG D +
Sbjct: 72 DDEGDPEIFKDYNGHGTHVAGTIAA-TENENGVVGVAPEADLLIIKVLNKQGSGQYDWII 130

Query: 225 QGIREAADSGAKVISLSLGAPNGGTALQQAVQYAWNKGSVIVAAAGNAGNTKAN-----Y 279
QGI A + +IS+SLG P L +AV+ A +++ AAGN G+ Y
Sbjct: 131 QGIYYAIEQKVDIISMSLGGPEDVPELHEAVKKAVASQILVMCAAGNEGDGDDRTDELGY 190

Query: 280 PAYYSEVIAVASTDQSDRKSSFSTYGSWVDVAAPGSNIYSTYKGSTYQSLSGTSMATPHV 339
P Y+EVI+V + + S FS + VD+ APG +I ST G Y + SGTSMATPHV
Sbjct: 191 PGCYNEVISVGAINFDRHASEFSNSNNEVDLVAPGEDILSTVPGGKYATFSGTSMATPHV 250

Query: 340 AGVAAL-------LANQGYSNTQIRQIIESTTDKISGTGTYWKNGRVNAYKAVQYAKQLQ 392
AG AL + + ++ + T + + NG + + ++
Sbjct: 251 AGALALIKQLANASFERDLTEPELYAQLIKRTIPLGNSPKMEGNGLLYLTAVEELSRIFD 310

Query: 393 EKKA 396
++
Sbjct: 311 TQRV 314


90  BALH_2144  BALH_2151  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2144  -2      16       -3.269405  TetR family transcriptional regulator
BALH_2146  -1      17       -2.726454  MmpL family membrane protein
BALH_2147  0       18       -4.005430  hypothetical protein
BALH_2148  0       18       -3.965521  chloramphenicol O-acetyltransferase
BALH_2149  -1      16       -3.900004  acetyltransferase
BALH_2150  0       16       -3.050078  acetyltransferase
BALH_2151  -2      13       -2.571505  ribosomal-protein-alanine acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2144  HTHTETR  86     4e-23    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 86.2 bits (213), Expect = 4e-23
Identities = 37/181 (20%), Positives = 70/181 (38%), Gaps = 18/181 (9%)

Query: 1 MEQKQRPLGRPRQNKNTKSTKETILEVATRLFLTQNYQGVSMDEVAKVCGVTKATVYYYF 60
M +K + + T++ IL+VA RLF Q S+ E+AK GVT+ +Y++F
Sbjct: 1 MARKTKQEAQE--------TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHF 52

Query: 61 STKADLFTATMIQMMIRIRENMSQILS-TNNTLEERLLNFAKVYLHATMDIDMKNFMKDA 119
K+DLF+ I E + + L L +T+ + + + +
Sbjct: 53 KDKSDLFSEIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEI 112

Query: 120 KLSLSEEQLKELKK-------AEDSMYEVLEKALDKAMQLGEIQKG-NPKFAAHAFVSLL 171
+ E + E+ Y+ +E+ L ++ + + AA +
Sbjct: 113 -IFHKCEFVGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYI 171

Query: 172 S 172
S
Sbjct: 172 S 172


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2146  ACRIFLAVINRP  54     2e-09    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 54.5 bits (131), Expect = 2e-09
Identities = 38/232 (16%), Positives = 86/232 (37%), Gaps = 25/232 (10%)

Query: 203 LLIATVLLVLVLLILLYRSPILAILPLLVVGFAYGIISPTLGFLADHGWIKVDAQAISIM 262
L A +L+ LV+ + L ++ ++P + V + T LA G+ +I+ +
Sbjct: 344 LFEAIMLVFLVMYLFL-QNMRATLIPTIAVPVV---LLGTFAILAAFGY------SINTL 393

Query: 263 T----VLLFGAGTDYCLFLISRYREYLLEEESKYK-ALQLAIKASGGAIIMSALTVVLGL 317
T VL G D + ++ ++E++ K A + ++ GA++ A+ +
Sbjct: 394 TMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVF 453

Query: 318 GTLLL--AHYGAFHR-FAVPFSVAVFIMGIAALTILPALLLIFGRAAFFPFVPRTTSMNE 374
+ GA +R F++ A+ + + AL + PAL + P + +E
Sbjct: 454 IPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLK-------PVSAEHHE 506

Query: 375 ELARRKKKVVKVKKSKGAFSKKLGDVVVRRPWTIIMLTVFVLGGLASFVPRI 426
++ +++ ++ G+ R+
Sbjct: 507 NKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRL 558



Score = 37.9 bits (88), Expect = 2e-04
Identities = 27/161 (16%), Positives = 67/161 (41%), Gaps = 9/161 (5%)

Query: 203 LLIATVLLVLVLLILLYRSPILAILPLLVVGFAYGIISPTLGFLADHGWIKVDAQAISIM 262
L+ + ++V + L LY S + + +LVV I+ L + V +
Sbjct: 875 LVAISFVVVFLCLAALYESWSIPVSVMLVVPLG--IVGVLLAATLFNQKNDVYFM---VG 929

Query: 263 TVLLFGAGTDYCLFLISRYREYLLEE-ESKYKALQLAIKASGGAIIMSALTVVLGLGTLL 321
+ G + ++ ++ + +E + +A +A++ I+M++L +LG+ L
Sbjct: 930 LLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLA 989

Query: 322 LAH---YGAFHRFAVPFSVAVFIMGIAALTILPALLLIFGR 359
+++ GA + + + + A+ +P ++ R
Sbjct: 990 ISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRR 1030



Score = 33.7 bits (77), Expect = 0.003
Identities = 33/202 (16%), Positives = 70/202 (34%), Gaps = 21/202 (10%)

Query: 533 AGISNAEDQL--WIGGETASLYDTKQITERDEAVIIPVMISIIALLLLVYLRSIVAMIYL 590
A + N +L IG + + ++++ ++ + ++ L L S + +
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 591 IVTVVLSFFSALGAGWLLLHYGMGVPAIQGAIPLYAFVFLVALGEDYNIFMVSEIWKNRK 650
++ V L L A L + V + G + + L I +V +
Sbjct: 901 MLVVPLGIVGVLLAAT-LFNQKNDVYFMVG------LLTTIGLSAKNAILIVEFAKDLME 953

Query: 651 TQNHLDAVKNGVIQTGSVITSAGLILAGTFAVLGTLPIQV------LVQFGIVTAI--GV 702
+ V + + L+ + F +LG LP+ + Q + + G+
Sbjct: 954 KEGK--GVVEATLMAVRMRLRPILMTSLAF-ILGVLPLAISNGAGSGAQNAVGIGVMGGM 1010

Query: 703 LLDTFIVRPLLVPAITVVLGRF 724
+ T + VP VV+ R
Sbjct: 1011 VSATLLAI-FFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2150  SACTRNSFRASE  41     1e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 41.1 bits (96), Expect = 1e-06
Identities = 26/98 (26%), Positives = 42/98 (42%), Gaps = 6/98 (6%)

Query: 49 YSSVEMMRYSIEELDS--YKVIMDEKIIGGIIVTISGKSYGRIDRIFVEPVYQGKGIGSN 106
Y +M +EE + ++ IG I + + Y I+ I V Y+ KG+G+
Sbjct: 50 YEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTA 109

Query: 107 VIKL-IE--AEYPSIRIWDLETSSRQINNHHFYKKMGY 141
++ IE E + LET I+ HFY K +
Sbjct: 110 LLHKAIEWAKENHFCGLM-LETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2151  SACTRNSFRASE  38     6e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 38.0 bits (88), Expect = 6e-06
Identities = 25/91 (27%), Positives = 36/91 (39%), Gaps = 11/91 (12%)

Query: 72 FVAEYDGEVVGFVGLTQSPGRRSHSGDLFIGVDSEYHNKGIGKALLTKMLDLADNWLMLE 131
F+ + +G + + + + D I V +Y KG+G AL L A W E
Sbjct: 68 FLYYLENNCIGRIKIRSNWNGYALIED--IAVAKDYRKKGVGTAL----LHKAIEW-AKE 120

Query: 132 RVELGV-LET---NPKAKTLYEKFGFVEEGV 158
G+ LET N A Y K F+ V
Sbjct: 121 NHFCGLMLETQDINISACHFYAKHHFIIGAV 151


91  BALH_2203  BALH_2210  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2203  -3      13       -2.572133  Zn-dependent hydrolase
BALH_2204  -3      12       -2.131216  hypothetical protein
BALH_2205  1       16       0.145259   hypothetical protein
BALH_2206  5       20       2.097805   DEAD/DEAH box helicase
BALH_2208  11      31       2.505107   hypothetical protein
BALH_2209  13      34       3.268750   hypothetical protein
BALH_2210  15      37       2.976026   TetR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2203  PF06580  28     0.040    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 27.5 bits (61), Expect = 0.040
Identities = 9/30 (30%), Positives = 12/30 (40%), Gaps = 2/30 (6%)

Query: 42 PLIENAILKHGYELKNLKNII-ITHYDDDH 70
L+EN I KHG I + D+
Sbjct: 262 TLVENGI-KHGIAQLPQGGKILLKGTKDNG 290


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2206  TONBPROTEIN  30     0.013    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 30.3 bits (68), Expect = 0.013
Identities = 20/113 (17%), Positives = 39/113 (34%), Gaps = 6/113 (5%)

Query: 338 AGGSGLAITFVAAKDEKH------LEEIEKTLGAPIQREIIEQPKIKRVDENGKPLPKPA 391
A +++T V D + E + + V E KP PKP
Sbjct: 40 APAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPK 99

Query: 392 PKKSGEYRQRDSREGSRSGSKGRTRNDSRNSSRNENNRSFNKPSNKKGSTKQG 444
PK + +++ R+ S+ + ++ +R ++ + S S G
Sbjct: 100 PKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATAATSKPVTSVASG 152


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2208  BACTRLTOXIN  26     0.012    Bacterial toxin signature.
		>BACTRLTOXIN#Bacterial toxin signature.

Length = 266

Score = 26.4 bits (58), Expect = 0.012
Identities = 8/22 (36%), Positives = 13/22 (59%)

Query: 31 KINWYNDMKTSFANKELADLVK 52
K+ Y+ +KT N++LA K
Sbjct: 84 KLKNYDKVKTELLNEDLAKKYK 105


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2210  HTHTETR  72     4e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 72.0 bits (176), Expect = 4e-18
Identities = 32/161 (19%), Positives = 70/161 (43%), Gaps = 9/161 (5%)

Query: 8 EERRKEILETAERLFLTKGYTKTTVNDILKEIGIAKGTFYHYFKSKEEVMDEIIMRIIKE 67
+E R+ IL+ A RLF +G + T++ +I K G+ +G Y +FK K ++ E I + +
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE-IWELSES 68

Query: 68 DVAKAKVIVSNPNIPVLEKLFRVLME---QSPKSGDIKDKMIE-QFHQPNNA-EMHQKSL 122
++ + ++ + R ++ +S + + + ++E FH+ EM
Sbjct: 69 NIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQ 128

Query: 123 VQSIIHLSPV--LTEILEQGIEEGIFSTSY-PQETIELLLS 160
Q + L + + L+ IE + + ++
Sbjct: 129 AQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRG 169


92  BALH_2385  BALH_2390  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2385  2       17       -3.656254  TetR family transcriptional regulator
BALH_2386  2       16       -3.486188  short chain dehydrogenase family protein
BALH_2387  2       16       -3.012372  major facilitator superfamily permease
BALH_2388  1       15       -3.040919  major facilitator superfamily permease
BALH_2389  1       18       -2.699808  penicillin-binding protein
BALH_2390  1       19       -3.052452  acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2385  HTHTETR  65     3e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 65.0 bits (158), Expect = 3e-15
Identities = 24/71 (33%), Positives = 43/71 (60%)

Query: 1 MNKRRYDSDLAKEIIAKKAIELFSLKGYTRTSIDNIAKASGYSKGHIYYHYKNKEELFVY 60
K + ++ ++ I A+ LFS +G + TS+ IAKA+G ++G IY+H+K+K +LF
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 61 LAKDSMKNWHD 71
+ + S N +
Sbjct: 62 IWELSESNIGE 72


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2386  DHBDHDRGNASE  90     2e-23    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 90.1 bits (223), Expect = 2e-23
Identities = 52/186 (27%), Positives = 91/186 (48%)

Query: 3 EQRIAIITGGASGIGKDLAIQLANKDIFVVIADINETSGQDLVNNIKNNNQLARFEYLDV 62
E +IA ITG A GIG+ +A LA++ + D N + +V+++K + A DV
Sbjct: 7 EGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADV 66

Query: 63 TKAESVEDLIIKIANEFGRIDYMFNNAGIAMYGEVSDMSLDNWKHIIEINLLGVIYGTQL 122
+ +++++ +I E G ID + N AG+ G + +S + W+ +N GV ++
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 123 AYQFMKKQGFGYIINTASATGLGPAPLCTAYATTKHAIVGLTTSLHYEAEEYGVNVSVLC 182
++M + G I+ S P AYA++K A V T L E EY + +++
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 183 PTFVDT 188
P +T
Sbjct: 187 PGSTET 192


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2387  TCRTETA  35     1e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 35.2 bits (81), Expect = 1e-04
Identities = 40/171 (23%), Positives = 71/171 (41%), Gaps = 9/171 (5%)

Query: 13 LLLSGVGIANLGAWIYLIALNVLVYHMGGSALAVATLYVIKPLAAL---FTNAWSGSVID 69
++LS V + +G + + L L+ + S A ++ L AL G++ D
Sbjct: 9 VILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSD 68

Query: 70 RLNKRKLMIHLDIYRAVCIAILPLLPSLWIVYVFVFFISMANAIYEPTAMTYMTKLIPVE 129
R +R +++ AV AI+ P LW++Y+ + A A Y+ + +
Sbjct: 69 RFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATG-AVAGAYIADITDGD 127

Query: 130 QRQR-FNSLRSLIGSGASVIGPSIAGALLIASTPE---FAIYMNAIAFLLS 176
+R R F + + G G V GP + G + S A +N + FL
Sbjct: 128 ERARHFGFMSACFGFGM-VAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTG 177



Score = 29.0 bits (65), Expect = 0.014
Identities = 23/116 (19%), Positives = 47/116 (40%), Gaps = 3/116 (2%)

Query: 48 TLYVIKPLAALFTNAWSGSVIDRLNKRKLMIHLDIYRAVCIAILPLLPSLWIVY-VFVFF 106
+L L +L +G V RL +R+ ++ I +L W+ + + V
Sbjct: 251 SLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLL 310

Query: 107 ISMANAIYEPTAMTYMTKLIPVEQRQRFNSLRSLIGSGASVIGPSIAGALLIASTP 162
S I P +++ + E++ + + + S S++GP + A+ AS
Sbjct: 311 AS--GGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASIT 364


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2390  SACTRNSFRASE  28     0.013    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 28.4 bits (63), Expect = 0.013
Identities = 18/56 (32%), Positives = 28/56 (50%), Gaps = 3/56 (5%)

Query: 81 IWHIAVHPDFRRMKIGNQLLNEAEKLAKELNLN--RLEAWTRDDLWVHGWYENNGF 134
I IAV D+R+ +G LL++A + AKE + LE + H +Y + F
Sbjct: 92 IEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACH-FYAKHHF 146


93  BALH_2563  BALH_2569  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2563  1       13       -2.637184  methyltransferase
BALH_2564  0       13       -3.316582  x-prolyl-dipeptidyl aminopeptidase
BALH_2566  -2      13       -4.769653  sensor histidine kinase
BALH_2567  -2      15       -3.878936  two-component response regulator
BALH_2568  0       16       -4.231593  hypothetical protein
BALH_2569  1       17       -4.082386  acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2563  DHBDHDRGNASE  29     0.012    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 29.2 bits (65), Expect = 0.012
Identities = 12/49 (24%), Positives = 18/49 (36%), Gaps = 5/49 (10%)

Query: 46 VGSGR-----VIIPLLEAGFTVDGIDYSPEMLESCRIRCKERGLHPNLY 89
G+ + V L G + +DY+PE LE K H +
Sbjct: 14 TGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAF 62


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2566  PF06580  35     5e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 35.2 bits (81), Expect = 5e-04
Identities = 34/173 (19%), Positives = 60/173 (34%), Gaps = 38/173 (21%)

Query: 282 IIKQSDHISNLIEEL---LRFS---KLERDVLQKEEFSIKSLVQSILDKHKIELESKEIN 335
I++ ++ L +R+S R V +E + +V S L I+ E +
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELT---VVDSYLQLASIQFEDR--- 239

Query: 336 LQVNYNVGDAIVYADVNKMRMVFQNLISNAIKY-----TSNQNIKITLEDRNESVYFQIQ 390
LQ + AI+ V M + Q L+ N IK+ I + N +V +++
Sbjct: 240 LQFENQINPAIMDVQVPPM--LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVE 297

Query: 391 NGMNAEHMKDIDKIWEPFYVLESSRSKDHSGTGLGLAIVKSILE-RHGFDYGV 442
N TG GL V+ L+ +G + +
Sbjct: 298 N--TGSLALK----------------NTKESTGTGLQNVRERLQMLYGTEAQI 332


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_2567  HTHFIS  90     4e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.9 bits (223), Expect = 4e-23
Identities = 27/117 (23%), Positives = 54/117 (46%), Gaps = 1/117 (0%)

Query: 2 KVLIADDEQDMLRILKAYFEKEGFEVFLAKDGEEALQIFYDEKIDLAILDWMMPKHSGIT 61
+L+ADD+ + +L + G++V + + + DL + D +MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VCQEIKK-NSSVKVLMLTAKSESEDELAALQSGADEYVKKPFHPGVLITRAKKLIQH 117
+ IKK + VL+++A++ + A + GA +Y+ KPF LI + +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2569  SACTRNSFRASE  28     0.015    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 27.6 bits (61), Expect = 0.015
Identities = 20/100 (20%), Positives = 37/100 (37%), Gaps = 27/100 (27%)

Query: 29 EGFKFLKKLINEYENELNTF-----------------------NKSGECLYGIFQGEKLI 65
E F ++I +EN + T+ + G+ + + I
Sbjct: 18 EPFVVFGRMIPAFENGVWTYTEERFSKPYFKQYEDDDMDVSYVEEEGKAAFLYYLENNCI 77

Query: 66 GIGGLNADPYTENNKIGRLRRFYIAKDYRRIGLGKLLLNK 105
G + ++ N + +AKDYR+ G+G LL+K
Sbjct: 78 GRIKIRSNW----NGYALIEDIAVAKDYRKKGVGTALLHK 113


94  BALH_2583  BALH_2589  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2583  -2      11       -1.365909  multidrug resistance protein B
BALH_2584  -2      10       -2.114993  hypothetical protein
BALH_2585  -3      12       -2.386318  hypothetical protein
BALH_2586  -3      13       -2.707769  hypothetical protein
BALH_2587  -2      14       -2.014203  oligopeptide ABC transporter substrate-binding
BALH_2588  2       13       -1.963459  major facilitator family transporter
BALH_2589  2       17       -2.661094  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2583  TCRTETB  144    5e-40    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 144 bits (364), Expect = 5e-40
Identities = 92/410 (22%), Positives = 165/410 (40%), Gaps = 24/410 (5%)

Query: 19 ILMASMDNTIVVTAMGTIVGDLGGLENFV-WVVSAYMVAEMAGMPIFGKLSDMYGRKRFF 77
+ ++ ++ ++ I D WV +A+M+ G ++GKLSD G KR
Sbjct: 23 SFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLL 82

Query: 78 IFGLIVFMIGSALCGTAENITQLGIY-RAIQGIGGGALVPIAFTIVFDIFPPEKRGKMGG 136
+FG+I+ GS + + L I R IQG G A + +V P E RGK G
Sbjct: 83 LFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFG 142

Query: 137 LFGAVFGLSSIFGPLLGAYITDYISWHWVFYINLPLGVLALIFITFFYKESRVHRKQKID 196
L G++ + GP +G I YI HW + + +P+ + + + V K D
Sbjct: 143 LIGSIVAMGEGVGPAIGGMIAHYI--HWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFD 200

Query: 197 WFGAITLVGAVICLMFALELGGQKYDWDSSFILSLFGGFAILIIAFIFI----ERKVEEP 252
G I + ++ M L Y F I+ + I RKV +P
Sbjct: 201 IKGIILMSVGIVFFM----LFTTSYSI----------SFLIVSVLSFLIFVKHIRKVTDP 246

Query: 253 IISFEMFKQRLFGMSTIIALCYGAAFMSATVYIPLFIQGVYGGSATNSG-LLLLPMMLGS 311
+ + K F + + +P ++ V+ S G +++ P +
Sbjct: 247 FVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSV 306

Query: 312 VVTAQLGGFLTTKLSYRNIMIISAVIMLIGLFLLSTLTPETSRALLTVYMIIIGFGVGFS 371
++ +GG L + ++ I + + FL ++ ET+ +T+ ++ + G+ F+
Sbjct: 307 IIFGYIGGILVDRRGPLYVLNIGVTFLSVS-FLTASFLLETTSWFMTIIIVFVLGGLSFT 365

Query: 372 FSVLSMAAIHNFGMEQRGSATSTSNFIRSLGMTLGITIFGMIQRTGFQDQ 421
+V+S + ++ G+ S NF L GI I G + DQ
Sbjct: 366 KTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQ 415


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2586  PYOCINKILLER  31     0.003    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.9 bits (69), Expect = 0.003
Identities = 28/143 (19%), Positives = 47/143 (32%), Gaps = 23/143 (16%)

Query: 21 LELTGISYGQLYRWKRKNLIPEDWFVRKSTFTGQETFFPKEKILERINKIQTMKEDLSLD 80
L+ + G KNL P D R T G +K+L KI ++ L
Sbjct: 97 LDKADAALGPA-----KNLAPLDVINRSLTIVGNALQQKNQKLLLNQKKITSLGAKNFL- 150

Query: 81 ELANMFSPSVREILLTKEDILRKGIASEP--VLQFFIEQTNKTSEFQFVDILYVYMLEEL 138
R E +R+G + P ++F + + V + E +
Sbjct: 151 ---------TRTAEEIGEQAVREGNINGPEAYMRFLDREMEGLTAAYNVKLF----TEAI 197

Query: 139 --LQSGEISLEEGKMAMQVLREN 159
LQ +L K +++ N
Sbjct: 198 SSLQIRMNTLTAAKASIEAAAAN 220


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2588  TCRTETA  84     5e-20    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 84.5 bits (209), Expect = 5e-20
Identities = 51/319 (15%), Positives = 116/319 (36%), Gaps = 11/319 (3%)

Query: 78 LIFGLQPLSDIIFTLIAGRVTDKYGRKKIMLLGLLLQGVAIGSFIFAQSLFIFALLYVIN 137
++ L L + G ++D++GR+ ++L+ L V A L++ + ++
Sbjct: 47 ILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVA 106

Query: 138 GVGRSLYIPAQRAQIADLTKHGQQAEIFSLLQTMGAIGTLIGPLIGTIFYKAHPEYVFIV 197
G+ + A IAD+T ++A F + G + GP++G + P F
Sbjct: 107 GITGAT-GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFA 165

Query: 198 QSIVLIAYAVVVWTQLPETAPAMTTPTQKLEVSSPKQFVQKHY--AVFGLMITTLPISFF 255
+ + + LPE+ P ++ ++ F V LM +
Sbjct: 166 AAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLV 225

Query: 256 YAQTETNYLIFVKHTLPDFDRILVFIT-TCKALMEITLQVFLV-KWSERFSMAKIILISY 313
++IF + +D + I+ ++ Q + + R + +++
Sbjct: 226 GQVPAALWVIFGEDRF-HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLG- 283

Query: 314 TCYTIAAIGYGLSATIAS--LFFTLLFLVIGGSMALNHLLRFVSEIAPSDKRGLYFSIYG 371
GY L A + F ++ L+ G + + L +S +++G
Sbjct: 284 --MIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLA 341

Query: 372 LHWDVSRTCGPVIGAVLLS 390
++ GP++ + +
Sbjct: 342 ALTSLTSIVGPLLFTAIYA 360



Score = 46.4 bits (110), Expect = 1e-07
Identities = 24/137 (17%), Positives = 59/137 (43%), Gaps = 2/137 (1%)

Query: 58 AIIMIIYVNKMLNGNIMMTMLIFGLQPLSDIIF-TLIAGRVTDKYGRKKIMLLGLLLQGV 116
A + +I+ + + + + + +I G V + G ++ ++LG++ G
Sbjct: 230 AALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGT 289

Query: 117 AIGSFIFAQSLFIFALLYVINGVGRSLYIPAQRAQIADLTKHGQQAEIFSLLQTMGAIGT 176
FA ++ + V+ G + +PA +A ++ +Q ++ L + ++ +
Sbjct: 290 GYILLAFATRGWMAFPIMVLLASG-GIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTS 348

Query: 177 LIGPLIGTIFYKAHPEY 193
++GPL+ T Y A
Sbjct: 349 IVGPLLFTAIYAASITT 365


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2589  TYPE4SSCAGA  28     0.030    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 28.1 bits (62), Expect = 0.030
Identities = 35/130 (26%), Positives = 51/130 (39%), Gaps = 20/130 (15%)

Query: 20 LAACKGTDEKKETNP----TSENSKNEQNTSSEGK-----KEPEVKSNTDSNSKDTVINQ 70
L A KG+ + NP EN N GK K + KS+ +++ KD +INQ
Sbjct: 719 LKALKGSVKDLGINPEWISKVENLNAALNEFKNGKNKDFSKVTQAKSDLENSVKDVIINQ 778

Query: 71 KSINHVKNLFELAKEGKVPNVPFAAHTGDIEEIEKAWGKADKTEQAGNGMYATFTNKNVS 130
K + V NL + K TGD +E+A + A KN S
Sbjct: 779 KVTDKVDNLNQAVSVAKA--------TGDFSRVEQALADLKNFSKE---QLAQQAQKNES 827

Query: 131 FGFNKGSQVF 140
K S+++
Sbjct: 828 LNARKKSEIY 837


95  BALH_2624  BALH_2631  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2624  -2      18       -1.012800  acetyltransferase
BALH_2625  -3      16       -1.031538  hypothetical protein
BALH_2626  -1      19       -1.949090  hypothetical protein
BALH_2627  0       13       -1.051269  acetyltransferase
BALH_2628  1       12       -1.233511  acetyltransferase
BALH_2629  1       11       -0.416742  nucleotidyltransferase
BALH_2630  -2      14       -0.884400  hypothetical protein
BALH_2631  -2      20       -0.278149  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2624  SACTRNSFRASE  36     2e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 35.7 bits (82), Expect = 2e-05
Identities = 26/81 (32%), Positives = 34/81 (41%), Gaps = 9/81 (11%)

Query: 46 YEEQACIGIEIIGAN---KAKIRHIAVIPQYRHKGIALQMI---KEVVRVYQLTYLEAET 99
Y E CIG I +N A I IAV YR KG+ ++ E + L ET
Sbjct: 71 YLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLET 130

Query: 100 DD---EAVEFYKKIGFQVRSL 117
D A FY K F + ++
Sbjct: 131 QDINISACHFYAKHHFIIGAV 151


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2627  SACTRNSFRASE  45     1e-08    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 45.3 bits (107), Expect = 1e-08
Identities = 20/90 (22%), Positives = 37/90 (41%), Gaps = 5/90 (5%)

Query: 57 FGAFNEDHQLVGVVTLLTEEKEAYKHKGHIVAMYVDANNRRSGLARELICKAIERAKEMN 116
F + ++ +G + + + + I + V + R+ G+ L+ KAIE AKE +
Sbjct: 68 FLYY-LENNCIGRIKI----RSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENH 122

Query: 117 LEQLTLGVVSTNEPAKNLYESMGFKTYGIE 146
L L N A + Y F ++
Sbjct: 123 FCGLMLETQDINISACHFYAKHHFIIGAVD 152


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2628  SACTRNSFRASE  28     0.021    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 27.6 bits (61), Expect = 0.021
Identities = 17/58 (29%), Positives = 25/58 (43%), Gaps = 9/58 (15%)

Query: 97 KEYWGKGYGKAALYSMLHIAFFEFELEK----VWLRVDEDNLQARKSYEKAGFVCEGL 150
K+Y KG G A +LH A E+ E + L + N+ A Y K F+ +
Sbjct: 99 KDYRKKGVGTA----LLHKAI-EWAKENHFCGLMLETQDINISACHFYAKHHFIIGAV 151


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2631  IGASERPTASE  25     0.046    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 25.0 bits (54), Expect = 0.046
Identities = 17/61 (27%), Positives = 28/61 (45%), Gaps = 2/61 (3%)

Query: 28 KDEKEPDPTEEPSEQRQEEKNEKQD-PAKEQNNELNK-KDEQEPDPTEEPSEEQKKKKEN 85
++ E D TE ++ R+ K K + A Q NE+ + E + T E E +KE
Sbjct: 1051 VEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEE 1110

Query: 86 E 86
+
Sbjct: 1111 K 1111


96  BALH_2645  BALH_2651  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2645  -2      12       -1.328463  bifunctional 3-deoxy-7-phosphoheptulonate
BALH_2646  -2      15       -1.001608  hypothetical protein
BALH_2647  -1      15       0.008469   isochorismatase
BALH_2648  1       15       0.783317   acetyltransferase
BALH_2649  1       15       1.614295   hypothetical protein
BALH_2650  2       16       2.260662   hypothetical protein
BALH_2651  1       18       2.389704   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2645  PF06776  29     0.027    Invasion associated locus B
		>PF06776#Invasion associated locus B

Length = 214

Score = 28.7 bits (64), Expect = 0.027
Identities = 15/61 (24%), Positives = 23/61 (37%)

Query: 268 RATRNTLDISAVPILKKETHLPVIVDVTHSTGRRDLLLPTAKAALAIGADAVMAEVHPDP 327
R +R + AVP LK P + ++ RR A+ LA ++ D
Sbjct: 10 RISRRPVTNHAVPALKAIQMGPAELSPMLASCRRLARRNGARLMLAGAMAIALSFGWSDR 69

Query: 328 A 328
A
Sbjct: 70 A 70


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2647  ISCHRISMTASE  55     1e-11    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 55.0 bits (132), Expect = 1e-11
Identities = 43/158 (27%), Positives = 71/158 (44%), Gaps = 19/158 (12%)

Query: 2 KKALLVIDVQ---AGMYTAGMPVHNGEKFLEALQELIGECRSNDIPVIYVQHNGPKDHPL 58
+ LL+ D+Q +TAG + +++L +C IPV+Y G + +P
Sbjct: 30 RAVLLIHDMQNYFVDAFTAGASPV--TELSANIRKLKNQCVQLGIPVVYTAQPGSQ-NPD 86

Query: 59 EKG--TDGW-----------KIHAAITPLEGDSIVEKTTPDSFYKTNLNEVLQDKGIDHV 105
++ TD W KI + P + D ++ K +F +TNL E+++ +G D +
Sbjct: 87 DRALLTDFWGPGLNSGPYEEKIITELAPEDDDLVLTKWRYSAFKRTNLLEMMRKEGRDQL 146

Query: 106 IISGMQTQYCVDTTTRRAFSEGYKITLVSDAHSTFDTE 143
II+G+ T AF E K V DA + F E
Sbjct: 147 IITGIYAHIGCLVTACEAFMEDIKAFFVGDAVADFSLE 184


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_2648  SACTRNSFRASE  34     2e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 33.8 bits (77), Expect = 2e-04
Identities = 18/89 (20%), Positives = 30/89 (33%), Gaps = 14/89 (15%)

Query: 78 VDSEWKTLYGYEESQNVWG-------------MDQFIGEPTYWGKGIGTKFVKAAITYIF 124
V+ E K + Y N G ++ Y KG+GT + AI +
Sbjct: 60 VEEEGKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWA- 118

Query: 125 SEMGAEAIAMDPKVNNERAIKCYERCGFK 153
E + ++ + N A Y + F
Sbjct: 119 KENHFCGLMLETQDINISACHFYAKHHFI 147


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2651  IGASERPTASE  63     2e-12    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 62.8 bits (152), Expect = 2e-12
Identities = 57/316 (18%), Positives = 101/316 (31%), Gaps = 23/316 (7%)

Query: 103 NTESKILGRLKKDDVIESTNQVKDGWLQFEYKGKTAYVNVSFLSSKAPIEKKADEKTKQV 162
N E + + I + N ++ + V P E T+ V
Sbjct: 982 NPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNE-EIARVDEAPVPPPAPATPSETTETV 1040

Query: 163 AKVQKSVKAKEEAKTQKITKAKETIKPKEEVKVQEVVKPKEEVKVQEVAKPKEEVKVQEV 222
A+ K E Q A ET EV + K + EVA+ E K +
Sbjct: 1041 AENSKQESKTVEKNEQD---ATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 223 AKPKEEVKVQEVAKPKEEVKVQEVAKPKEEVKVQEVVKPKEEVKVQEVAKAKEEAKVQEI 282
+ KE V++ K K E K +E KV V PK+E + + V E A+ +
Sbjct: 1098 TETKETATVEKEEKAKV-----ETEKTQEVPKVTSQVSPKQE-QSETVQPQAEPARENDP 1151

Query: 283 AKAKEEAKAQEIAKAKEEAKAQEIAKAKEEAKAQEIAKAKEEAKAQEIAKAKEEAKAREI 342
+E ++Q A E A+E + E Q + ++ + E
Sbjct: 1152 TVNIKEPQSQTNTTADTEQPAKETSSNVE----QPVTESTTVNTGNSVV---------EN 1198

Query: 343 AKAKEEAKAQEIAKAKEEEKAREIAKAKEEAKAQEIAKAKEEAKVREALKAKEESKNNAQ 402
+ A Q ++ K + + + + A + R + + + N
Sbjct: 1199 PENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTN 1258

Query: 403 SAKRELMVVATAYTAD 418
+ + A +
Sbjct: 1259 AVLSDARAKAQFVALN 1274



Score = 58.9 bits (142), Expect = 3e-11
Identities = 38/214 (17%), Positives = 72/214 (33%), Gaps = 12/214 (5%)

Query: 209 EVAKPKEEVKVQEVAKPKEEVKVQEVAKPKEEVKVQEVAKPKEEVKVQEVVKPKEEVKVQ 268
+ P +E+A+ E V P + E K + K E
Sbjct: 1004 QADVPSVPSNNEEIARVDEAP----VPPPAPATPSETTETVAENSKQESKTVEKNEQDAT 1059

Query: 269 EVAKAKEEAKVQEIAKAKEEAKAQEIAKAKEEAKAQEIAKAKEEAKAQEIAKAKEEA-KA 327
E E + + K + E+A++ E K + + KE A ++ KAK E K
Sbjct: 1060 ETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKT 1119

Query: 328 QEIAKAKEEAKAREIAKAKEEAKAQEIAKAKEEEKAREIAKAKEEAKAQEIAKAKEEAKV 387
QE+ K + + + +++ + E + + +E ++Q A E
Sbjct: 1120 QEVPKVTSQVSPK-------QEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPA 1172

Query: 388 REALKAKEESKNNAQSAKRELMVVATAYTADPSE 421
+E E+ + + VV P+
Sbjct: 1173 KETSSNVEQPVTESTTVNTGNSVVENPENTTPAT 1206



Score = 58.2 bits (140), Expect = 6e-11
Identities = 41/214 (19%), Positives = 72/214 (33%), Gaps = 6/214 (2%)

Query: 213 PKEEVKVQEV-AKPKEEVKVQEVAKPKEEVKVQEVAKPKEEVKVQEVVKPKEEVKVQEVA 271
P+ E + Q V + P +E+A+ E V P +
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAP----VPPPAPATPSETTE 1038

Query: 272 KAKEEAKVQEIAKAKEEAKAQEIAKAKEEAKAQEIAKAKEEAKAQEIAKAKEEAKAQEIA 331
E +K + K E A E E + + K + E+A++ E K +
Sbjct: 1039 TVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTT 1098

Query: 332 KAKEEAKAREIAKAKEEA-KAQEIAKAKEEEKAREIAKAKEEAKAQEIAKAKEEAKVREA 390
+ KE A + KAK E K QE+ K + ++ + +A+ + ++E
Sbjct: 1099 ETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEP 1158

Query: 391 LKAKEESKNNAQSAKRELMVVATAYTADPSENGT 424
+ + Q AK V T + N
Sbjct: 1159 QSQTNTTADTEQPAKETSSNVEQPVTESTTVNTG 1192



Score = 53.9 bits (129), Expect = 1e-09
Identities = 47/268 (17%), Positives = 88/268 (32%), Gaps = 14/268 (5%)

Query: 161 QVAKVQKSVKAKEEAKTQKITKAKETIKPKEE--VKVQEVV--KPKEEVKVQEVAKPKEE 216
+V K ++V I ++ E +V E P + E
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 217 VKVQEVAKPKEEVKVQEVAKPKEEVKVQEVAKPKEEVKVQEVVKPKEEVKVQEVAKAKEE 276
K + K E E EV + + K + EV + E K + + KE
Sbjct: 1044 SKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKET 1103

Query: 277 AKVQEIAKAKEEAKAQEIAKAKEEAKAQEIAKAKEEAKAQEIAKAKEEAKAQEIAKAKEE 336
A V++ +E+AK E K +E K K+E +++ + E A+ + +E
Sbjct: 1104 ATVEK----EEKAKV-ETEKTQEVPKVTSQVSPKQE-QSETVQPQAEPARENDPTVNIKE 1157

Query: 337 AKAREIAKAKEEAKAQEIAKAKEEE----KAREIAKAKEEAKAQEIAKAKEEAKVREALK 392
+++ A E A+E + E+ + E + E+
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 393 AKEESKNNAQSAKRELMVVATAYTADPS 420
+ + + + AT + D S
Sbjct: 1218 KPKNRHRRSVRSVPHNVEPATTSSNDRS 1245



Score = 48.1 bits (114), Expect = 7e-08
Identities = 51/301 (16%), Positives = 100/301 (33%), Gaps = 20/301 (6%)

Query: 152 EKKADEKTKQVAKVQKSVKAKEEAKTQKITKAKETIKPKEEVKVQEVVKPKEEVKVQEVA 211
E+ A E T Q +V K K+ +A TQ E + E K + + KE V++
Sbjct: 1055 EQDATETTAQNREVAKEAKSNVKANTQ----TNEVAQSGSETKETQTTETKETATVEKEE 1110

Query: 212 KPKEEV-KVQEVAKPKEEVKVQEVAKPKEEVKVQEVAKPKEEVKVQEVVKPKEEVKVQEV 270
K K E K QEV K +V ++ + + E A+ + + + +
Sbjct: 1111 KAKVETEKTQEVPKVTSQVSPKQ-EQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTE 1169

Query: 271 AKAKEEAKVQEIAKAKEEAKAQEIAKAKEEAKAQEIAKAKEEAKAQEIAKAKEEAKAQEI 330
AKE + + + E+ + E + E + + + +
Sbjct: 1170 QPAKETSS--NVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSV 1227

Query: 331 AKAKEEAKAREIAKAKEEAKAQEIAKAKEEEKAREIAKAKEEAKAQEIAKAKEEAKVREA 390
+ + A + A+AK + A + KA + +
Sbjct: 1228 RSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLE 1287

Query: 391 LKAKEESK---NNAQSAKRELMVVATAYTADPSENGTYGGRVLTAMGHDLTANPNMRIIA 447
+ + + +N K ++ Y S+ T +G D T + N+++
Sbjct: 1288 MNNEGQYNVWVSNTSMNKN---YSSSQYRRFSSK------STQTQLGWDQTISNNVQLGG 1338

Query: 448 V 448
V
Sbjct: 1339 V 1339



Score = 44.3 bits (104), Expect = 1e-06
Identities = 41/240 (17%), Positives = 75/240 (31%), Gaps = 16/240 (6%)

Query: 132 EYKGKTAYVNVSFLSSKAPIEKKADEKTKQVAKVQKSVK------AKEEAKTQKITKAKE 185
E ++ +A KA+ +T +VA+ K KE A +K KAK
Sbjct: 1055 EQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKV 1114

Query: 186 TIKPKEEV-KVQEVVKPKEEVKVQEVAKPKEEVKVQEVAKPKEEVKVQEVAKPKEEVKVQ 244
+ +EV KV V PK+E E V+ Q + + V +
Sbjct: 1115 ETEKTQEVPKVTSQVSPKQE--------QSETVQPQAEPARENDPTVNIKEPQSQTNTTA 1166

Query: 245 EVAKPKEEVKVQEVVKPKEEVKVQEVAKAKEEAKVQEIAKAKEEAKAQEIAKAKEEAKAQ 304
+ +P +E V +P E + E + E + + +
Sbjct: 1167 DTEQPAKETS-SNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRR 1225

Query: 305 EIAKAKEEAKAQEIAKAKEEAKAQEIAKAKEEAKAREIAKAKEEAKAQEIAKAKEEEKAR 364
+ + + A + A+AK + A + KA + ++
Sbjct: 1226 SVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQ 1285


97  BALH_2868  BALH_2874  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2868  0       12       2.237768   NADPH-cytochrome P450 reductase
BALH_2869  0       16       0.847528   multidrug resistance protein B
BALH_2870  -2      17       -0.045888  hypothetical protein
BALH_2871  -1      14       -0.736710  hypothetical protein
BALH_2872  0       14       -1.282637  hypothetical protein
BALH_2873  -1      12       -0.063460  hypothetical protein
BALH_2874  -3      10       0.380621   two-component response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2868  MECHCHANNEL  36     3e-04    Bacterial mechano-sensitive ion channel signature.
		>MECHCHANNEL#Bacterial mechano-sensitive ion channel signature.

Length = 136

Score = 35.6 bits (82), Expect = 3e-04
Identities = 20/64 (31%), Positives = 31/64 (48%), Gaps = 13/64 (20%)

Query: 270 IITFLIAGHETTSGLLSFAIYFLLKNPDKLKKAYEEVDRVLTDPTPTYQQVMKLKYIRMI 329
+ FLI ++FAI+ +K +KL + EE P PT ++V+ L IR +
Sbjct: 82 VFDFLI---------VAFAIFMAIKLINKLNRKKEEPA---AAPAPTKEEVL-LTEIRDL 128

Query: 330 LNES 333
L E
Sbjct: 129 LKEQ 132


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2869  TCRTETB  129    7e-35    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 129 bits (327), Expect = 7e-35
Identities = 93/411 (22%), Positives = 187/411 (45%), Gaps = 20/411 (4%)

Query: 35 MLVILFIGAFVSFLNNSLLNVALPSIMKDLDIKDYSTIQWLSTGYMLVSGILIPASAFLI 94
+L+ L I +F S LN +LNV+LP I D + ST W++T +ML I L
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPAST-NWVNTAFMLTFSIGTAVYGKLS 73

Query: 95 TRFSNRSLFITSMMIFTLGTALAAVAPN-FGLLLTGRMVQAAGSSVMGPLLMNIMLVSFP 153
+ + L + ++I G+ + V + F LL+ R +Q AG++ L+M ++ P
Sbjct: 74 DQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIP 133

Query: 154 REKRGTAMGIFGLVMITAPAIGPTLSGYIVEYYDWRLLFEMILPLAIISLLLGIWKSENV 213
+E RG A G+ G ++ +GP + G I Y W L L + ++ + +
Sbjct: 134 KENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL-----LIPMITIITVPFLMKL 188

Query: 214 MRQNKNAK--LDYLSLLLSSIGFGGLLYGFSSASSDGWTNKVVVTTLIIGAIALIAFIIR 271
+++ K D ++L S+G + +S S ++ LI+ ++ + F+
Sbjct: 189 LKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYS---------ISFLIVSVLSFLIFVKH 239

Query: 272 QLKMNEPLLDLRVYKYPMFALASVIAIVNAVAMFSGMILTPAYVQNVRGISPLSSG-LMM 330
K+ +P +D + K F + + + + + + P +++V +S G +++
Sbjct: 240 IRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVII 299

Query: 331 LPGAVIMGIMSPITGKLFDKYGPRILGIVGLSITAVSTYMLANLQLDSSHTHIILIYTLR 390
PG + + I I G L D+ GP + +G++ +VS + L L+++ + +I
Sbjct: 300 FPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFL-LETTSWFMTIIIVFV 358

Query: 391 MFGMAMVMMPLMTNGLNQLPTRLNPHGTAVNNTAQQVSGSIGTAILVTIMN 441
+ G++ + T + L + G ++ N +S G AI+ +++
Sbjct: 359 LGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_2870  cloacin  29     0.018    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 28.5 bits (63), Expect = 0.018
Identities = 19/55 (34%), Positives = 21/55 (38%), Gaps = 1/55 (1%)

Query: 138 NTNNGGIFGNSPGWGGQISPMGWGNNGGYG-QPGMLGNLLTVGKGTMNGIGMLSS 191
N GG G+ WGG G NG G G GNL V G LS+
Sbjct: 43 NNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALST 97


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_2874  HTHFIS  102    2e-27    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 102 bits (256), Expect = 2e-27
Identities = 32/112 (28%), Positives = 58/112 (51%)

Query: 14 RVLIVEDEQDLQNILVKRLNAEHYSVDACGNGEDALDYINMATYDLIVLDIMIPGINGLQ 73
+L+ +D+ ++ +L + L+ Y V N +I DL+V D+++P N
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 74 VLQKLRADNHTTPVLLLTAKDTIDDRVKGLDLGADDYLVKPFAFDELLARIR 125
+L +++ PVL+++A++T +K + GA DYL KPF EL+ I
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


98  BALH_2993  BALH_2999  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_2993  1       12       -0.390668  transcriptional regulator
BALH_2994  2       12       -0.136765  high-affinity zinc uptake system
BALH_2995  1       12       0.539354   GntR family transcriptional regulator
BALH_2996  0       12       0.852583   L-proline dehydrogenase
BALH_2997  -1      11       1.090340   transcriptional regulator
BALH_2998  -1      11       0.394053   alkaline D-peptidase
BALH_2999  -2      14       -0.407271  DNA-binding response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2993  60KDINNERMP  30     0.020    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 29.9 bits (67), Expect = 0.020
Identities = 20/89 (22%), Positives = 39/89 (43%), Gaps = 10/89 (11%)

Query: 47 ITFLPLLASYFNISIDELISYKPQMEQEDIKELYHRLAEAFSEKSFDEVMMEC--REIIK 104
PL + + S+ ++ +P+++ ++E + S++ MM E +
Sbjct: 367 GIMYPLTKAQY-TSMAKMRMLQPKIQA--MRERLGDDKQRISQE-----MMALYKAEKVN 418

Query: 105 KYYSCFPLLIQIGILFINHHMLTEDTDKR 133
CFPLLIQ+ I ++ML + R
Sbjct: 419 PLGGCFPLLIQMPIFLALYYMLMGSVELR 447


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits      Score  E-value  Comments
BALH_2994  adhesinb  207    5e-65    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 207 bits (529), Expect = 5e-65
Identities = 81/321 (25%), Positives = 143/321 (44%), Gaps = 22/321 (6%)

Query: 3 KKLTMFSFLLIFTLILAGCSNKKESALKKEEKLSVYTTIFPLADFAKKIGGNYVNVEAIY 62
KK LL+ + LA CS++K S KL+V T +AD K I G+ +N+ +I
Sbjct: 2 KKCRFLVLLLLAFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKNIAGDKINLHSIV 61

Query: 63 PPGADSHTFEPSQKQTVKVAKADLFIYNGAELE-----PFAEKMEKTLQKENIKIVNASK 117
P G D H +EP + K ++ADL YNG LE F + +E +KEN S+
Sbjct: 62 PVGQDPHEYEPLPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENAKKKENKDYYAVSE 121

Query: 118 GIELRAATEDEHHDHGDGHKHKEDEHHHDKDPHVWLNPILAMKQAEKIKNALVELQPDHK 177
G+++ + +DPH WLN + A+ I L E P +K
Sbjct: 122 GVDV--------------IYLEGQSEKGKEDPHAWLNLENGIIYAQNIAKRLSEKDPANK 167

Query: 178 QEFEKNFAALQTKFTDLDDRFKAAVAN--AKTKEILVSHAAYGYWEQRYGLKQIPIAGIS 235
+ +EKN A K + LD K N + K I+ S + Y+ + Y + I I+
Sbjct: 168 ETYEKNLKAYVEKLSALDKEAKEKFNNIPGEKKMIVTSEGCFKYFSKAYNVPSAYIWEIN 227

Query: 236 ASDEPSQKKLAEITKTVKEYGLKYILFETFSTPKVASVIQKETGTKILRLNHLATISEDD 295
+E + ++ + + +++ + + E+ + + K+T I +++E
Sbjct: 228 TEEEGTPDQIKTLVEKLRKTKVPSLFVESSVDDRPMKTVSKDTNIPIYAKIFTDSVAEKG 287

Query: 296 AKNNKDYFTLMEENVNTLKEA 316
+ + Y+++M+ N+ + E
Sbjct: 288 EEGD-SYYSMMKYNLEKIAEG 307


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_2998  BLACTAMASEA  32     0.006    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 32.1 bits (73), Expect = 0.006
Identities = 15/102 (14%), Positives = 29/102 (28%), Gaps = 14/102 (13%)

Query: 17 AILVGALAIPSYENAHAVNKRTSIEQVIDKAADAKNIPGVIVTVKNGEASWAYASGEGNI 76
+ + +L HA + ++ + + G+I G
Sbjct: 5 RLCIISLLATLPLAVHASPQPLEQIKLSESQLSGR--VGMI----------EMDLASGRT 52

Query: 77 ERNHKVDADSAFRIGSTTKTFVATVVLQLAGEKKLSLDDTVE 118
+ AD F + ST K + VL L+ +
Sbjct: 53 LTAWR--ADERFPMMSTFKVVLCGAVLARVDAGDEQLERKIH 92


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_2999  HTHFIS  84     3e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.1 bits (208), Expect = 3e-19
Identities = 28/110 (25%), Positives = 49/110 (44%), Gaps = 2/110 (1%)

Query: 381 IRLLVVDDQALITNSLEQILENQTDFIVVGKAYDGSEALILCEQLQPDIVLMDIQMPEMN 440
+LV DD A I L Q L + + V + + D+V+ D+ MP+ N
Sbjct: 4 ATILVADDDAAIRTVLNQALS-RAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 441 GIEALLEMKRRWPNMKIVLMTTFEDSLQAATALEHGAEGYMLKSIHPQEM 490
+ L +K+ P++ +++M+ + A A E GA Y+ K E+
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTEL 111


99  BALH_3119  BALH_3126  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias   Product
BALH_3119  2       16       1.609590  short chain dehydrogenase
BALH_3120  0       13       1.727328  short chain dehydrogenase
BALH_3121  0       14       0.712505  hypothetical protein
BALH_3122  -2      12       0.663377  hypothetical protein
BALH_3123  -1      10       1.227353  arsenical pump family protein
BALH_3124  -1      12       0.644333  hypothetical protein
BALH_3125  0       13       0.272192  acetyltransferase
BALH_3126  -1      11       0.191318  iron compound ABC transporter substrate-binding
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_3119  DHBDHDRGNASE  55     3e-12    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 54.7 bits (131), Expect = 3e-12
Identities = 30/103 (29%), Positives = 48/103 (46%), Gaps = 1/103 (0%)

Query: 10 LKDKVAIITGGASGIGESTVRLFIEEGAKVVIADFS-EHGKELSDELNAHGYNTLFIKTD 68
++ K+A ITG A GIGE+ R +GA + D++ E +++ L A + D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 69 VTKEADIKQLIHETVSTYGKLDIMYANAGVADDAPANELSYEK 111
V A I ++ G +DI+ AGV + LS E+
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEE 108


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_3120  DHBDHDRGNASE  65     1e-15    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 65.1 bits (158), Expect = 1e-15
Identities = 44/141 (31%), Positives = 65/141 (46%), Gaps = 11/141 (7%)

Query: 3 GVFLSNKYSIEQFLKQGTGGVIVNAGSIHSFVSLPTTTAYSSAKGGVKLLTQNLCTAYAK 62
GVF +++ + + + G IV GS + V + AY+S+K + T+ L A+
Sbjct: 119 GVFNASRSVSKYMMDR-RSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAE 177

Query: 63 YGIRINAVCPGYIHTPLLGSVNPQQ-------KEYLASLH---PQGRLGTPEEVAKAVLF 112
Y IR N V PG T + S+ + K L + P +L P ++A AVLF
Sbjct: 178 YNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLF 237

Query: 113 LASDDASFVNGTTLLVDGGYT 133
L S A + L VDGG T
Sbjct: 238 LVSGQAGHITMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_3125  SACTRNSFRASE  48     3e-09    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 47.6 bits (113), Expect = 3e-09
Identities = 20/82 (24%), Positives = 35/82 (42%), Gaps = 3/82 (3%)

Query: 83 QDDRLIGFVEIHGIEWNNRTGLLAIGIGDANDRGKGYGREAIHLILKYAFYELNLHRVGL 142
++ IG ++I WN + I + + R KG G +H +++A E + + L
Sbjct: 72 LENNCIGRIKIRS-NWNGYALIEDIAV-AKDYRKKGVGTALLHKAIEWA-KENHFCGLML 128

Query: 143 DVISYNKAAIELYKKMGFQIEG 164
+ N +A Y K F I
Sbjct: 129 ETQDINISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_3126  FERRIBNDNGPP  83     2e-20    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 82.7 bits (204), Expect = 2e-20
Identities = 64/281 (22%), Positives = 107/281 (38%), Gaps = 44/281 (15%)

Query: 19 ACGQTKSNEEATKKTEKSNDPK-IASMSIHLTNDLLALGITPVG--SVIGGDLKDFLPHA 75
Q + A + DP I ++ LLALGI P G I L P
Sbjct: 21 LLWQMNTAHAA------AIDPNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPPL 74

Query: 76 KEQLKDTKKLGVVTDPNMEALLQLKPSEIYVDEKYAGKDVAKYEKIAKTHSFNLDEGT-- 133
+ + D +G+ T+PN+E L ++KPS V G +IA FN +G
Sbjct: 75 PDSVID---VGLRTEPNLELLTEMKPS-FMVWSAGYGPSPEMLARIAPGRGFNFSDGKQP 130

Query: 134 ---WRDHLKQVGKLVNREKEADTYIQGYEEQAKRVKSLIDKELGNNEK--VMAIRVTAKE 188
R L ++ L+N + A+T++ YE+ + +K + + + ++ + +
Sbjct: 131 LAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKP---RFVKRGARPLLLTTLIDPRH 187

Query: 189 LRVFSTKRPMGPILFQD----LGLQPANGVEKIDGNR-PFEVISQEVLPDF-DADAI-FV 241
+ VF LFQ+ G+ A E N +S + L + D D + F
Sbjct: 188 MLVFGP-----NSLFQEILDEYGIPNAWQGET---NFWGSTAVSIDRLAAYKDVDVLCFD 239

Query: 242 VVNRDDKAKAAFKQLQETPIWKDLKAVKGKHVYIINDQPWL 282
N D L TP+W+ + V+ + W
Sbjct: 240 HDNSKDM-----DALMATPLWQAMPFVRAGRFQRVPAV-WF 274


100  BALH_3234  BALH_3242  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_3234  1       13       -0.839661  TetR family transcriptional regulator
BALH_3235  0       14       -0.109610  Gfo/Idh/MocA family oxidoreductase
BALH_3236  1       14       -0.094561  IS605 family transposase
BALH_3237  0       13       1.080722   DNA topoisomerase IV subunit A
BALH_3238  -2      11       1.163673   DNA topoisomerase IV subunit B
BALH_3239  -3      12       0.534213   CoA-binding domain-containing protein
BALH_3241  -3      11       0.544130   serine protease
BALH_3242  -1      14       0.474660   transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_3234  HTHTETR  68     3e-16    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 68.5 bits (167), Expect = 3e-16
Identities = 29/171 (16%), Positives = 58/171 (33%), Gaps = 28/171 (16%)

Query: 52 KEKKKRAIKEAAFLLFSERGFNEVKIEHIAKEANVSQVTIYNHFGSKDALFRELIQEFII 111
++ ++ I + A LFS++G + + IAK A V++ IY HF K LF E+ +
Sbjct: 9 AQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWEL--- 65

Query: 112 CEFQYYKELAEEKLP-------------FHDMMKKMIVRKMNTG---GLFQPDMLLQMMQ 155
EL E +++ + + +F + M
Sbjct: 66 -SESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMA 124

Query: 156 RDEILREFIYSYQNEKILPWYLEILELAQRNNEI----NPHLTKEMMLLYI 202
+ + + ++I + L+ + +M YI
Sbjct: 125 VVQQAQRNLCLESYDRI----EQTLKHCIEAKMLPADLMTRRAAIIMRGYI 171


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_3238  ACRIFLAVINRP  31     0.015    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 31.3 bits (71), Expect = 0.015
Identities = 12/49 (24%), Positives = 22/49 (44%), Gaps = 1/49 (2%)

Query: 455 INTEKAKLADIFKNEEINTIIYAIGGGVGNEFDVEDINYDKVVIMTDAD 503
++ EKA+ + ++ TI A+GG N+F + + DA
Sbjct: 730 VDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKK-LYVQADAK 777


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
BALH_3241  V8PROTEASE  68     7e-15    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 68.5 bits (167), Expect = 7e-15
Identities = 33/166 (19%), Positives = 62/166 (37%), Gaps = 38/166 (22%)

Query: 176 NKAYIVTNNHVVDGANKLAVKLS------------DGKKVDAKLVGKDPWLDLAVVEI-- 221
K ++TN HVVD + L +G ++ DLA+V+
Sbjct: 110 GKDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSP 169

Query: 222 --DGANVN---KVATLGDSSKLRAGEKAIAIGNPLGFDG---SVTEGIISSKEREIPVDI 273
++ K AT+ ++++ + + G P ++G I+ + E
Sbjct: 170 NEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKPVATMWESKGKITYLKGEA---- 225

Query: 274 DGDKRADWNAQVIQTDAAINPGNSGGALFNQNGEIIGINSSKIAQQ 319
+Q D + GNSG +FN+ E+IGI+ + +
Sbjct: 226 ------------MQYDLSTTGGNSGSPVFNEKNEVIGIHWGGVPNE 259


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_3242  HTHFIS  88     2e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.3 bits (219), Expect = 2e-22
Identities = 34/164 (20%), Positives = 76/164 (46%), Gaps = 16/164 (9%)

Query: 4 TVLLVEDERRLREIVSDYFRNEGFEVIEAEDGKKALELFAEHEIDLIMLDIMLPEIDGWS 63
T+L+ +D+ +R +++ G++V + A + DL++ D+++P+ + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 VCRRIRKESA-VPIIMLTARSDEDDTLLGFELGADEYVTKPFSPKVLVA---RAKTLLKR 119
+ RI+K +P+++++A++ + E GA +Y+ KPF L+ RA KR
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 120 ADGVVGVAEENAMSLAGIE------------VNRLSRTVLVDGE 151
+ ++ M L G + + T+++ GE
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGE 168


101  BALH_3811  BALH_3816  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_3811  2       17       0.182134   hypothetical protein
BALH_3812  2       17       1.038607   hypothetical protein
BALH_3813  0       14       -0.567423  lipoate-protein ligase A
BALH_3814  -2      16       -1.075180  rhodanese-related sulfurtransferase
BALH_3815  -2      17       -2.281660  LacI family transcriptional regulator
BALH_3816  -1      18       -3.489166  TetR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_3811  SALSPVBPROT  29     0.014    Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 29.3 bits (65), Expect = 0.014
Identities = 26/86 (30%), Positives = 43/86 (50%), Gaps = 6/86 (6%)

Query: 3 RMKAQDMIKLNNKKRELLTPENEVAYSDMLVYLRLSNVPEQQVEELLL--EILDHLIEAQ 60
R K++ I +K+ + L + YS + YLR + PE Q +E LL + L +
Sbjct: 381 RPKSKWAIVEESKQIQALRYYSAQGYSVINKYLRGDDYPETQAKETLLSRDYLSTNEPSD 440

Query: 61 TENKNAYDIFGDDLQSYCDELISALP 86
E KNA ++ +D+ E +S+LP
Sbjct: 441 EEFKNAMSVYINDIA----EGLSSLP 462


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_3812  PF03544  35     3e-04    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 34.6 bits (79), Expect = 3e-04
Identities = 29/102 (28%), Positives = 34/102 (33%), Gaps = 9/102 (8%)

Query: 108 EAAEQEETVVEATPKKEVVVEVPKAVTPAPKPVTRVETPATASTPKPTPAPTPKPVSVEA 167
A+ E P + VV P+ P P P E P PKP P P PKPV
Sbjct: 56 APADLEPPQAVQPPPEPVVEPEPE---PEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVE 112

Query: 168 AVELSTPAPVKREVPTPVTKQETTPVAPAKPKQSALTETNSK 209
+ R APA+P S T SK
Sbjct: 113 QPKRDVKPVESRPASPF------ENTAPARPTSSTATAATSK 148


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_3813  DHBDHDRGNASE  30     0.008    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 30.4 bits (68), Expect = 0.008
Identities = 26/98 (26%), Positives = 41/98 (41%), Gaps = 8/98 (8%)

Query: 97 VIVSEDHPNMPKTVTEAYRVISQGLLDGFKALGLE-AYYAVPKTEADRENLKNPRSG-VC 154
V V + +P+T AY + K LGLE A Y + R N+ +P S
Sbjct: 140 VTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI------RCNIVSPGSTETD 193

Query: 155 FDAPSWYEIVVEGRKIAGSAQTRQKGVILQHGSIPLEI 192
W + + I GS +T + G+ L+ + P +I
Sbjct: 194 MQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDI 231


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_3816  HTHTETR  60     2e-13    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 60.0 bits (145), Expect = 2e-13
Identities = 39/205 (19%), Positives = 72/205 (35%), Gaps = 25/205 (12%)

Query: 2 KMTANRIKAVALSHFARYGYEGTSLANIAQEVGIKKPSIYAHFKRKEELYFICLESALQK 61
+ T I VAL F++ G TSL IA+ G+ + +IY HFK K +L+ E +
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 62 DLQSFTDDIENFSNSSTEELLLQLLKGYAKRFGESEESMFWLRTSYFPPDAFRE-QIIEK 120
+ + F +L ++L + E + + + E ++++
Sbjct: 70 IGELELEYQAKFP-GDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQ 128

Query: 121 --ANAHIENVGKLLFPIFKQANEKSELH-NIEVKDALEAFLCLLDGLM------------ 165
N +E+ ++ K E L ++ + A + GLM
Sbjct: 129 AQRNLCLESYDRIE-QTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDL 187

Query: 166 -------VELLFAGLNRFETRLNAS 183
V +L T N +
Sbjct: 188 KKEARDYVAILLEMYLLCPTLRNPA 212


102  BALH_4062  BALH_4065  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_4062   2      16        2.051002  ribosome biogenesis GTP-binding protein YsxC
BALH_4063   1      14        1.973608  Lon-A peptidase
BALH_4064   2      15        1.074021  ATP-dependent protease LA
BALH_4065   4      17        0.309482  ATP-dependent protease ATP-binding subunit ClpX
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
BALH_4062  TCRTETOQM  28     0.027    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.
Length = 639

Score = 27.9 bits (62), Expect = 0.027
Identities = 18/90 (20%), Positives = 37/90 (41%), Gaps = 13/90 (14%)

Query: 58 KTQTLNFFLINEMMHFVDVPGYGYAKVSKTERAAWGKMIETYFTTREQLDAAVLVVDLRH 117
+T +F N ++ +D PG+ +++ R+ LD A+L++ +
Sbjct: 57 QTGITSFQWENTKVNIIDTPGH-MDFLAEVYRSL------------SVLDGAILLISAKD 103

Query: 118 KPTNDDVMMYDFLKHYDIPTIIIATKADKI 147
+++ L+ IPTI K D+
Sbjct: 104 GVQAQTRILFHALRKMGIPTIFFINKIDQN 133


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_4063  HTHFIS  38     2e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 37.5 bits (87), Expect = 2e-04
Identities = 29/101 (28%), Positives = 43/101 (42%), Gaps = 14/101 (13%)

Query: 370 LCLVGPPGVGKTSLARSI-ATSLNRN--FVRVSLGGVRD---ESEIRGHRRTYVGAMPGR 423
L + G G GK +AR++ RN FV +++ + ESE+ GH + GA G
Sbjct: 163 LMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEK---GAFTGA 219

Query: 424 IIQGMKKAKSVNP-VFLLDEIDKMSNDFRGDPSAALLEVLD 463
+ + + LDEI M D + LL VL
Sbjct: 220 QTRSTGRFEQAEGGTLFLDEIGDMPMDAQ----TRLLRVLQ 256


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_4064  HTHFIS  58     4e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 58.3 bits (141), Expect = 4e-11
Identities = 43/214 (20%), Positives = 76/214 (35%), Gaps = 41/214 (19%)

Query: 44 ELEQLRKMREISLTEPLAEKVR----PTSFLDIVGQEDGIKSLK--AALCGPNPQHVIIY 97
+L +L + +L EP + + +VG+ ++ + A ++I
Sbjct: 107 DLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMIT 166

Query: 98 GPPGVGKTAAARLVLEEAKRNPKSPFRTNATFIELDATTARFDERGIADPLIGSVHDPIY 157
G G GK AR + + KR F+ ++ A I L G
Sbjct: 167 GESGTGKELVARALHDYGKRRNGP-------FVAINM--AAIPRDLIESELFGHE----- 212

Query: 158 QGAGAMGQAGIPQPKKGAVTDAHGGILFIDEIGELHPIQMNKMLKVLEDRKVFLESAYYS 217
GA G G A GG LF+DEIG++ ++L+VL+ +
Sbjct: 213 --KGAF--TGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGG--- 265

Query: 218 EENTMIPTYIHDIFQKGLPADFRLVGATTRSPEE 251
+ + +D R+V AT + ++
Sbjct: 266 --------------RTPIRSDVRIVAATNKDLKQ 285


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_4065  HTHFIS  29     0.041    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.041
Identities = 38/198 (19%), Positives = 76/198 (38%), Gaps = 38/198 (19%)

Query: 86 VPKPVEIREILDEY--VIGQDNAK-KALAVAVYNHYKRINSNSKIDDV-----ELAKSNI 137
+PKP ++ E++ + + + L + + ++ + ++ L ++++
Sbjct: 102 LPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDL 161

Query: 138 A--LIGPTGSGKTLLAQTL---ARILNVPF------AIADATSLTEAGYVGEDVENILLK 186
+ G +G+GK L+A+ L + N PF AI L E+ G +
Sbjct: 162 TLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPR--DLIESELFGH-EKGAFTG 218

Query: 187 LIQAADYDVEKAEKGIIYIDEIDKVARKSENPSITRDVSGEGVQQALLKILEGTVASVPP 246
+ E+AE G +++DEI D+ Q LL++L+
Sbjct: 219 AQTRSTGRFEQAEGGTLFLDEIG-------------DMP-MDAQTRLLRVLQQG--EYTT 262

Query: 247 QGGRKHPHQEFIQIDTTN 264
GGR + + TN
Sbjct: 263 VGGRTPIRSDVRIVAATN 280


103  BALH_4383  BALH_4392  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_4383  -1      12       -1.476723  thiamine transporter
BALH_4384   1      13       -2.067063  sensory histidine kinase
BALH_4385   0      16       -0.057466  two-component response regulator
BALH_4386   1      18        0.137323  sortase family protein
BALH_4387   2      17        0.290788  cell wall surface anchor family protein
BALH_4388   0      17       -0.202326  hypothetical protein
BALH_4389  -1      16       -0.504231  hypoxanthine phosphoribosyltransferase
BALH_4390  -2      14       -0.608282  diacylglycerol kinase
BALH_4391  -3      12       -2.269263  IS605 family transposase
BALH_4392  -1      14       -3.001936  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_4383  ACRIFLAVINRP  31     0.006    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 31.0 bits (70), Expect = 0.006
Identities = 13/76 (17%), Positives = 36/76 (47%), Gaps = 3/76 (3%)

Query: 37 NTNLQAMIESAILAAFAMVIDILPLSISLPTGGSISFAMIPIFIIAYRWGFK--MAFLGG 94
+ ++ + E+ +L M + + + +L ++ ++ F I +G+ + G
Sbjct: 338 HEVVKTLFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFG 397

Query: 95 LIWGLLQIVVGDAIIV 110
++ + ++V DAI+V
Sbjct: 398 MVLA-IGLLVDDAIVV 412


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_4385  HTHFIS  83     2e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.6 bits (204), Expect = 2e-20
Identities = 31/136 (22%), Positives = 64/136 (47%), Gaps = 3/136 (2%)

Query: 5 ILIVEDEDILREILKDYFLSEQYVVFEARDGKEALVVFEEEEVDLVILDIMLPELDGWSV 64
IL+ +D+ +R +L Y V + + DLV+ D+++P+ + + +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 65 CRRIRKT-SEVPIIMLTARVDEDDTLLGFELGADDYVTKPYSPPILLARAKRLLESRKVT 123
RI+K ++P+++++A+ + E GA DY+ KP+ L+ R L K
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK-- 123

Query: 124 KQLLENEDDTLSIHGI 139
++ + EDD+ +
Sbjct: 124 RRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_4387  IGASERPTASE  35     6e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.4 bits (81), Expect = 6e-05
Identities = 22/80 (27%), Positives = 37/80 (46%), Gaps = 1/80 (1%)

Query: 46 AAIQAQQKNDTEKKQVAQAQEKNEVAKKQAVQAQEKNEVAKKQAAQVQEKSNMAKEVVAP 105
A ++Q++ T +K A E ++ A +A+ N A Q +V + + KE
Sbjct: 1040 VAENSKQESKTVEKNEQDATETTAQNREVAKEAK-SNVKANTQTNEVAQSGSETKETQTT 1098

Query: 106 QVKEKEAIAKKEAAKEAGEK 125
+ KE + K+E AK EK
Sbjct: 1099 ETKETATVEKEEKAKVETEK 1118



Score = 30.0 bits (67), Expect = 0.004
Identities = 17/84 (20%), Positives = 29/84 (34%), Gaps = 4/84 (4%)

Query: 51 QQKNDTEKKQVAQAQEKNE----VAKKQAVQAQEKNEVAKKQAAQVQEKSNMAKEVVAPQ 106
+ KQ ++ EKNE Q + ++ + K Q E + E Q
Sbjct: 1037 TETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQ 1096

Query: 107 VKEKEAIAKKEAAKEAGEKLPNTA 130
E + A E ++A + T
Sbjct: 1097 TTETKETATVEKEEKAKVETEKTQ 1120



Score = 30.0 bits (67), Expect = 0.005
Identities = 21/106 (19%), Positives = 37/106 (34%), Gaps = 7/106 (6%)

Query: 24 ASTGTVTKEEAAQIQQDIAK-----KEAAIQAQQKNDTEKKQVAQAQEKNEVAKKQAVQA 78
+ T E + Q + + K E Q ++ K V + NEVA+ +
Sbjct: 1034 SETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQ-SGSET 1092

Query: 79 QEKNEVAKKQAAQVQEKSNMAKEVVAPQVKEKEAIAKKEAAKEAGE 124
+E K+ A EK AK + + ++ +E E
Sbjct: 1093 KETQTTETKETAT-VEKEEKAKVETEKTQEVPKVTSQVSPKQEQSE 1137


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_4389  ANTHRAXTOXNA  30     0.004    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 30.1 bits (67), Expect = 0.004
Identities = 30/143 (20%), Positives = 59/143 (41%), Gaps = 19/143 (13%)

Query: 3 IEIKDTLISEEQLQEKVKELALQIERDFEGEEIVVIAVLKGSFVFAADLIRHIKNDV-TI 61
I IKD I+ EQ +E E+ I D +I+ K +LI+ + +D +
Sbjct: 154 INIKDYAINSEQSKEVYYEIGKGISLD-------IISKDKSLDPEFLNLIKSLSDDSDSS 206

Query: 62 DFISASSYGNQTETTGKVKLLKDIDVNITGKNVIVVEDIIDSGLTLHFLKDH---FFMHK 118
D + + + + E K ID+N +N+ + + +F DH ++
Sbjct: 207 DLLFSQKFKEKLELNNK-----SIDINFIKENLTEFQHAFSLAFSYYFAPDHRTVLELYA 261

Query: 119 PKALKFCTLLDKPERRKVDLTAE 141
P ++ ++K E+ + +E
Sbjct: 262 PDMFEY---MNKLEKGGFEKISE 281


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_4392  FLGFLIH  35     2e-04    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 35.2 bits (80), Expect = 2e-04
Identities = 17/56 (30%), Positives = 30/56 (53%)

Query: 214 DAREKVLMDEAAKFAHAETEGMKRGMEKGLEKGIEQGIEQGIEQGIEQGRKEGVQQ 269
+A + A A +G + G+ +G ++G +QG ++G+ QG+EQG E Q
Sbjct: 35 EAEPSLEQQLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQ 90



Score = 33.6 bits (76), Expect = 6e-04
Identities = 14/48 (29%), Positives = 31/48 (64%)

Query: 226 KFAHAETEGMKRGMEKGLEKGIEQGIEQGIEQGIEQGRKEGVQQGKIQ 273
+ A + + ++G + G+ +G +QG +QG ++G+ QG ++G+ + K Q
Sbjct: 43 QLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQ 90


104  BALH_4844  BALH_4855  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_4844  -2      12       -0.727934  ferredoxin, 4Fe-4S
BALH_4845  -1      12       -2.286860  cardiolipin synthetase domain-containing
BALH_4846  -1      10       -2.236708  putative UV damage endonuclease
BALH_4847  -1      10       -1.990336  diguanylate cyclase/phosphodiesterase
BALH_4848   0      18       -0.916089  transcription activator PlcR
BALH_4849   0      14        0.546527  neutral protease B
BALH_4850   0      13       -0.865922  DNA-binding response regulator
BALH_4851   0      15       -0.641864  sensor histidine kinase
BALH_4852   1      17       -0.636501  ABC transporter permease
BALH_4853   1      17       -0.811560  ABC transporter ATP-binding protein
BALH_4854   1      18       -1.033378  methionine aminopeptidase
BALH_4855   2      19       -0.820723  collagen adhesion protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_4844  PF06580  30     0.034    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.8 bits (67), Expect = 0.034
Identities = 19/101 (18%), Positives = 36/101 (35%), Gaps = 9/101 (8%)

Query: 109 AFTFFQEIVTLVILIAVFWAFHRRYVEKLVRLKRNFKSGLVLIFIGGLMISVLLGNGMGI 168
F ++ LV+ A R+ KL + + + IG +
Sbjct: 43 IFNIAISLMGLVLTHAYRSFIKRQGWLKLNMGQIILRVLPACVVIGMVWFV-----ANTS 97

Query: 169 IWHGEELSWSEP----IASAIAYVFSGINETAAISVFYFSW 205
IW ++P + A++ +F+ + T S+ YF W
Sbjct: 98 IWRLLAFINTKPVAFTLPLALSIIFNVVVVTFMWSLLYFGW 138


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
BALH_4849  THERMOLYSIN  35     7e-06    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 34.6 bits (79), Expect = 7e-06
Identities = 14/50 (28%), Positives = 25/50 (50%), Gaps = 1/50 (2%)

Query: 13 NKEIKENQTEVRLAQTYKNYKVYGQDLIVKVDKNGVIITVSGNVVQNLDQ 62
NK + T +R Q G L+ V+ +G + ++SG ++ NLD+
Sbjct: 81 NKLDELGHTVMRFEQAIAASLCMGAVLVAHVN-DGELSSLSGTLIPNLDK 129


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_4850  HTHFIS  61     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.4 bits (149), Expect = 2e-13
Identities = 23/118 (19%), Positives = 47/118 (39%), Gaps = 2/118 (1%)

Query: 2 IRIIIAEDQRMLRGALGALLDLEDDIEVIGQAANGEEALKLIESLKPDVSIMDIEMPIQS 61
I++A+D +R L L + +N + I + D+ + D+ MP ++
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLDVAETLKKENSACKVMILTTFARPGYFERAMKAGVHGYLLKDSPSEDLSASIRNVM 119
D+ +KK V++++ +A + G + YL K +L I +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_4851  PF06580  53     5e-10    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 53.3 bits (128), Expect = 5e-10
Identities = 55/333 (16%), Positives = 114/333 (34%), Gaps = 69/333 (20%)

Query: 22 FVYLLFPIYHLVQASGWKLIIGSGMLIIFIITYRQLYFVQRTFIFWACIQMILIFLFALF 81
F L I S M ++ YR + ++ ++ Q+IL L
Sbjct: 30 FASLYGSPKLHSMIFN---IAISLMGLVLTHAYRS-FIKRQGWLKLNMGQIILRVL---- 81

Query: 82 YNPFMIFFGFFTASAMGFAPSKKVFRVLLCLLVIMLGAFVFVNMNQLTPTSLVNIVPMFI 141
P + G V + L+ + L + + N+V +
Sbjct: 82 --PACVVIGMVWF----------VANTSIWRLLAFINTKPVAFTLPLALSIIFNVVVVTF 129

Query: 142 LMILTPFGMRNFNQKKMLRNQLNEANEQIKDLVKREERQRIARDLHDTLGHTLSLITLKS 201
+ L FG F + D K + A+ L+ LK+
Sbjct: 130 MWSLLYFGWHFFKN----------YKQAEIDQWKMASMAQEAQ-----------LMALKA 168

Query: 202 QL-----------VEKLIVKNPERASTEAKEITQTSRTALKQLRELISDMRMITVEEELE 250
Q+ + LI+++P +A +++ R +L+ S+ R +++ +EL
Sbjct: 169 QINPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYSLRY-----SNARQVSLADELT 223

Query: 251 QIKAILQAANIR----LEIKQETSASSLSAIEQNILGMCLREAVTNVVKH-----SKATR 301
+ + LQ A+I+ L+ + + + + + + M ++ V N +KH + +
Sbjct: 224 VVDSYLQLASIQFEDRLQFENQINPAIMDVQ---VPPMLVQTLVENGIKHGIAQLPQGGK 280

Query: 302 CTVSVLESQGELILTVEDNGIGLADQNHDGNGI 334
+ + G + L VE+ G + G
Sbjct: 281 ILLKGTKDNGTVTLEVENTGSLALKNTKESTGT 313


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_4855  PF05616  38     5e-04    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 38.2 bits (88), Expect = 5e-04
Identities = 36/122 (29%), Positives = 51/122 (41%), Gaps = 19/122 (15%)

Query: 211 PEQNSATKPATD-NPEQNPATRPVTDNPEQNSATKPATDNPEQNPATKPATD-NPEQNPA 268
PE + A PA + P +NP TRP NPE +P+ NP P TD P P
Sbjct: 329 PEVSPAENPANNPAPNENPGTRP---NPEP---------DPDLNPDANPDTDGQPGTRPD 376

Query: 269 TKPAADNPEQNLASDPAESTNSGPK-QITTNILTGVKLTDKDGKPFTEDNRPSTDSPANI 327
+ D P + E + G + +IL +L + + P + N PS N+
Sbjct: 377 SPAVPDRPNGRHRKERKEGEDGGLLCKFFPDILACDRLPEPN--PAEDLNLPS--ETVNV 432

Query: 328 EF 329
EF
Sbjct: 433 EF 434


105  BALH_4911  BALH_4921  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_4911   0      11        1.294117  DNA-binding response regulator
BALH_4912   0      12        0.459142  two-component sensor protein
BALH_4913   2      10        1.191274  pyridoxal kinase
BALH_4915   3      12        1.130713  diguanylate cyclase
BALH_4916   2      11        1.738150  hypothetical protein
BALH_4919   1      11        1.182441  carbon starvation protein A
BALH_4920   0      10       -1.228520  response regulator
BALH_4921   0      10       -0.310487  major facilitator family transporter
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_4911  HTHFIS  88     1e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.3 bits (219), Expect = 1e-22
Identities = 41/219 (18%), Positives = 84/219 (38%), Gaps = 31/219 (14%)

Query: 2 KIKVLLVDDHTVVLKGLAFFLSTQEDIELVGEASNGKEALVKVGETNPDVVLMDLYMPEM 61
+L+ DD + L L ++ ++ SN + + D+V+ D+ MP+
Sbjct: 3 GATILVADDDAAIRTVLNQAL-SRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 62 DGVEATACIKKEYPDVKVIVLTSFSDQAHVLPALRAGASGYILKDVEPDQLVEAIRSAYK 121
+ + IKK PD+ V+V+++ + + A GA Y+ K + +L+ I
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG---- 116

Query: 122 GNIQLHPDIANALLSQTLPVEEKEEEPSIQVDVL--TARENEVLQLLAKGMSNKEIASVL 179
AL + E++ + ++ +A E+ ++LA+ M +++
Sbjct: 117 ----------RALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTD--LTLM 164

Query: 180 VITE----KTVKAHVSSILSKLH-LSDRTQAALYAVKNG 213
+ E K + A LH R A+
Sbjct: 165 ITGESGTGKELVARA------LHDYGKRRNGPFVAINMA 197


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_4912  PF06580  34     0.001    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.5 bits (79), Expect = 0.001
Identities = 21/83 (25%), Positives = 39/83 (46%), Gaps = 7/83 (8%)

Query: 447 NVSKHA---NVREATIYFKVTEKNVSLEIVDQGNGFVE-KNIKEKKSLGMTTMRERVELV 502
N KH + I K T+ N ++ + + G + KN KE G+ +RER++++
Sbjct: 266 NGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQML 325

Query: 503 GG---TIKIVSDKKRTSVKVNVP 522
G IK+ + + + V +P
Sbjct: 326 YGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_4920  HTHFIS  51     1e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 51.0 bits (122), Expect = 1e-09
Identities = 19/137 (13%), Positives = 47/137 (34%), Gaps = 12/137 (8%)

Query: 6 KILLIMEEVEERRSLAEKFTENIRNVECFEANTGTESLFMMKKHTPDFVFLNSKLIDGTG 65
IL+ ++ R L + + + + + D V + + D
Sbjct: 5 TILVADDDAAIRTVLNQAL--SRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 66 FEYASLLREVNCYTKFIFMGE--DIEESITAFRFQAFYYLLRPFREEDLQFLLYRMGKEQ 123
F+ +++ + M +I A A+ YL +PF +L ++
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGII------- 115

Query: 124 GEKAKSYLRKLPIEGQE 140
+A + ++ P + ++
Sbjct: 116 -GRALAEPKRRPSKLED 131


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_4921  TCRTETA  60     6e-12    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 59.8 bits (145), Expect = 6e-12
Identities = 72/380 (18%), Positives = 142/380 (37%), Gaps = 35/380 (9%)

Query: 7 ISKRKLLGIAGLGWLFDAMDVGMLSFVMVALQKDWGLSTQEMGWIG---SINSIGMAVGA 63
+ + L + DA+ +G++ V+ L +D S G ++ ++ A
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 64 LVFGILSDKIGRKSVFIITLLLFSIGSGLTALTTTLAMFLVLRFLIGMGLGGELPVASTL 123
V G LSD+ GR+ V +++L ++ + A L + + R + G+ G VA
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAY 119

Query: 124 VSESVEAHERGKIVVLLESFWAGGWLIAALISYF---VIPKYGWEVAMILSAIPALYALY 180
+++ + ER + + + + G + ++ P + A L+ + L +
Sbjct: 120 IADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCF 179

Query: 181 LRWNLPDSPRFQKVEKRPTVIENIKSVWSGEYRKATIMLWILWFSV---------VFSYY 231
L LP+S + ++ R + + S L ++F + ++ +
Sbjct: 180 L---LPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIF 236

Query: 232 GM--FLWLPSV--MVLKGFSLIKSFQYVLIMTLAQLPGYFTAAWFIERLGRKFVLVTYLI 287
G F W + + L F ++ S +I RLG + L+ +I
Sbjct: 237 GEDRFHWDATTIGISLAAFGILHSLAQAMITGP-----------VAARLGERRALMLGMI 285

Query: 288 GTACSAYLFGVAESLTVLIVAGMLLSFFNLGAWGALYAYTPEQYPTVIRGTGAGMAAAFG 347
L A + +LL+ +G AL A Q +G G AA
Sbjct: 286 ADGTGYILLAFATRGWMAFPIMVLLASGGIGM-PALQAMLSRQVDEERQGQLQGSLAALT 344

Query: 348 RIGGILGPLLVGYLVASQAS 367
+ I+GPLL + A+ +
Sbjct: 345 SLTSIVGPLLFTAIYAASIT 364



Score = 33.6 bits (77), Expect = 0.001
Identities = 29/125 (23%), Positives = 45/125 (36%), Gaps = 5/125 (4%)

Query: 274 ERLGRKFVLVTYLIGTACSAYLFGVAESLTVLIVAGMLLSFFNLGAWGALYAYTPEQYPT 333
+R GR+ VL+ L G A + A L VL + G +++ AY +
Sbjct: 68 DRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYI-GRIVAGITGATGAVAGAYIADITDG 126

Query: 334 VIRGTGAG-MAAAFGRIGGILGPLLVGYLVASQASLSLIFTIFCGSILIGVFAVIILGQE 392
R G M+A FG G + GP+L G + + L E
Sbjct: 127 DERARHFGFMSACFG-FGMVAGPVLGGLMGGFSPHAPFFAAAALN--GLNFLTGCFLLPE 183

Query: 393 TKQRE 397
+ + E
Sbjct: 184 SHKGE 188


106  BALH_4937  BALH_4955  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
BALH_4937  -2      14       -0.699520  TetR family transcriptional regulator
BALH_4938  -2      13       -0.605113  AcrB/AcrD/AcrF family transporter
BALH_4939  -2      14       -0.734839  peptide methionine sulfoxide reductase
BALH_4941  -1      14       -0.581239  hypothetical protein
BALH_4942   1      13        0.625078  antiholin-like protein LrgB
BALH_4945   0      10        0.690141  murein hydrolase regulator LrgA
BALH_4946   0      10        1.035169  response regulator
BALH_4948  -1      10        1.334881  sensor histidine kinase
BALH_4950   1      10        0.969916  major facilitator family transporter
BALH_4952   2      12        0.606880  glycine betaine transporter
BALH_4953   2      12       -0.339301  nitric-oxide synthase, oxygenase subunit
BALH_4954   4      13       -1.083597  superoxide dismutase, manganese
BALH_4955   2      12       -2.337727  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_4937  HTHTETR  63     4e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 63.5 bits (154), Expect = 4e-14
Identities = 21/62 (33%), Positives = 38/62 (61%)

Query: 2 KEKERLIIEMAMKLFATKGVNATSVQEIVTACGISKGAFYLYFKSKEELLLATLRYYYDK 61
+E + I+++A++LF+ +GV++TS+ EI A G+++GA Y +FK K +L
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 62 IQ 63
I
Sbjct: 70 IG 71


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_4938  ACRIFLAVINRP  669    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 669 bits (1728), Expect = 0.0
Identities = 240/1066 (22%), Positives = 459/1066 (43%), Gaps = 68/1066 (6%)

Query: 4 IINFSLKNKFAVWLLTIIVTIAGIYSGLNMKLETIPDITTPVVTVTTVYPGATPEEVADK 63
+ NF ++ W+L II+ +AG + L + + P I P V+V+ YPGA + V D
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 64 VSKPMEEQLQNLSGVNVVSSSSFQNASS-IQVEYDFDKNMEKAETEIKDALANVK--LPE 120
V++ +E+ + + + +SS+S S I + + + + A+ ++++ L LP+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 GVKDPKVSRVNF--NAFPVISLSVASKNESLATLTENVEKNVVPGLKGLDGVASVQISGQ 178
V+ +S + V + + +++ V NV L L+GV VQ+ G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 179 QVDEVQLVFKKDKMKELGLSEDTVKNVIKGSDVSLPLGLYTFKDT------EKSVVVDGN 232
Q +++ D + + L+ V N +K + + G S++
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 233 ITTMKALKELKIPAVPSSASSQGSQTAGAGAQMPQMNPAAMNGIPTVTLSEIADIKEVGK 292
+ ++ + +G V L ++A ++ G+
Sbjct: 240 FKNPEEFGKVTLRVNS-------------------------DGSV-VRLKDVARVELGGE 273

Query: 293 A-ESISRTNGKEAIGIQIVKAADANTVDVVNAVKDKVKELEKKY-KDLEIISTFDQGAPI 350
I+R NGK A G+ I A AN +D A+K K+ EL+ + + ++++ +D +
Sbjct: 274 NYNVIARINGKPAAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFV 333

Query: 351 EKSVETMLSKAIFGAIFAIVIIMLFLRNIRTTLISVVSIPLSLLIAVLVIKQMDITLNIM 410
+ S+ ++ + +++ LFL+N+R TLI +++P+ LL ++ ++N +
Sbjct: 334 QLSIHEVVKTLFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTL 393

Query: 411 TLGAMTVAIGRVVDDSIVVIENIYRRMSLSEEKLRGKDLIREATKEMFIPIMSSTIVTIA 470
T+ M +AIG +VDD+IVV+EN+ R M E+KL K+ ++ ++ ++ +V A
Sbjct: 394 TMFGMVLAIGLLVDDAIVVVENVERVM--MEDKLPPKEATEKSMSQIQGALVGIAMVLSA 451

Query: 471 VFLPLGLVKGMIGEMFLPFALTIVFALLASLLVAVTIVPMLAHSLFKKESMREKEVHH-- 528
VF+P+ G G ++ F++TIV A+ S+LVA+ + P L +L K S E
Sbjct: 452 VFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGF 511

Query: 529 ----EEKPSKLANIYKRILAWALNHKIITSSIAVLLLVGSLALVPIIGVSFLPSEEEKMI 584
N Y + L I L++ G + L + SFLP E++ +
Sbjct: 512 FGWFNTTFDHSVNHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVF 571

Query: 585 IATYNPEPGQTLEDVEKIATKAEKHFQDNKDVKTIQ--FSLGGENPMSPGQSNQAMFFVQ 642
+ G T E +K+ + ++ + ++ F++ G + Q N M FV
Sbjct: 572 LTMIQLPAGATQERTQKVLDQVTDYYL-KNEKANVESVFTVNGFSFSGQAQ-NAGMAFVS 629

Query: 643 YD--NDTKNFEKEKEQVVKDLQKMSGKGEWKN---------QDFGASGGSNEIKLYVYGD 691
+ E E V+ + GK + G + G + + G
Sbjct: 630 LKPWEERNGDENSAEAVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGL 689

Query: 692 SSEDIKPVVKDIQNIMKKN-KDLKDIDSSIAKTYAEYTLVADQEKLSKMGLTAAQIGMGL 750
+ + + + ++ L + + + A++ L DQEK +G++ + I +
Sbjct: 690 GHDALTQARNQLLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTI 749

Query: 751 SNQHDRPVLTTIKKDGKDVNVYVEAEKQTYETIDDLTNRKITTPLGNEVAVKDVMTVKEG 810
S + G+ +YV+A+ + +D+ + + G V T
Sbjct: 750 STALGGTYVNDFIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWV 809

Query: 811 ETSNTVKHRDGRVYAEVSAKLTSDDVSK-ASAAVQKEVDKMDLPSGVDVSMGGVTKDIEE 869
S ++ +G E+ + S A A ++ K LP+G+ G++
Sbjct: 810 YGSPRLERYNGLPSMEIQGEAAPGTSSGDAMALMENLASK--LPAGIGYDWTGMSYQERL 867

Query: 870 SFKQLGLAMLAAIAIVYFVLVVTFGGALAPFAILFSLPFTIIGALVALLISGETLSVSAM 929
S Q + + +V+ L + P +++ +P I+G L+A + + V M
Sbjct: 868 SGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFM 927

Query: 930 IGALMLIGIVVTNAIVLIDRVIH-KENEGLSTREALLEAGATRLRPILMTAIATIGALIP 988
+G L IG+ NAI++++ E EG EA L A RLRPILMT++A I ++P
Sbjct: 928 VGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLP 987

Query: 989 LALGFEGSGLISKGLGVTVIGGLTSSTLLTLLIVPIVYEVLSKFKK 1034
LA+ +G+ V+GG+ S+TLL + VP+ + V+ + K
Sbjct: 988 LAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRCFK 1033



Score = 93.4 bits (232), Expect = 2e-21
Identities = 93/518 (17%), Positives = 198/518 (38%), Gaps = 46/518 (8%)

Query: 546 ALNHKIITSSIAVLLLVGSLALVPIIGVSFLPSEEEKM--IIATYNPEPGQTLEDVE-KI 602
+ I +A++L++ + + V+ P+ + A Y PG + V+ +
Sbjct: 5 FIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANY---PGADAQTVQDTV 61

Query: 603 ATKAEKHFQDNKDVKTIQFSLGGENPMSPGQSNQAMFFVQYDNDTKNFEKEKEQVVKDLQ 662
E++ ++ + S S ++ + + + QV LQ
Sbjct: 62 TQVIEQNMNGIDNLMYMS---------STSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQ 112

Query: 663 KMSGK--GEWKNQDFGASGGSNEIKLYVYGDSSEDIKPVVKDIQNIMKKN--KDLKDID- 717
+ E + Q S+ L V G S++ DI + + N L ++
Sbjct: 113 LATPLLPQEVQQQGISVEKSSSSY-LMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNG 171

Query: 718 ----SSIAKTYAEYTLVADQEKLSKMGLTAAQIGMGLSNQHDR----PVLTTIKKDGKDV 769
YA + D + L+K LT + L Q+D+ + T G+ +
Sbjct: 172 VGDVQLFGAQYA-MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQL 230

Query: 770 NVYVEAEKQTYETIDDLTNRKI-TTPLGNEVAVKDVMTVKEG--ETSNTVKHRDGRVYAE 826
N + A+ + ++ + G+ V +KDV V+ G + +
Sbjct: 231 NASIIAQTRFK-NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGL 289

Query: 827 VSAKLTSDDVSKASAAVQKEVDKM--DLPSGVDVSMGGVTKD----IEESFKQLGLAMLA 880
T + + A++ ++ ++ P G+ V D ++ S ++ +
Sbjct: 290 GIKLATGANALDTAKAIKAKLAELQPFFPQGMKVL---YPYDTTPFVQLSIHEVVKTLFE 346

Query: 881 AIAIVYFVLVVTFGGALAPFAILFSLPFTIIGALVALLISGETLSVSAMIGALMLIGIVV 940
AI +V+ V+ + A ++P ++G L G +++ M G ++ IG++V
Sbjct: 347 AIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLV 406

Query: 941 TNAIVLIDRVI-HKENEGLSTREALLEAGATRLRPILMTAIATIGALIPLALGFEGS-GL 998
+AIV+++ V + L +EA ++ + ++ A+ IP+A F GS G
Sbjct: 407 DDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAF-FGGSTGA 465

Query: 999 ISKGLGVTVIGGLTSSTLLTLLIVPIVYEVLSKFKKKK 1036
I + +T++ + S L+ L++ P + L K +
Sbjct: 466 IYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAE 503


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
BALH_4946  HTHFIS  65     3e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 65.2 bits (159), Expect = 3e-14
Identities = 30/126 (23%), Positives = 57/126 (45%), Gaps = 6/126 (4%)

Query: 3 KVLVVDDEMLARDELKYLLERTK-EVEIIGEADCVEDALEELMKNKPDIVFLDIQLSDDN 61
+LV DD+ R L L R +V I A + D+V D+ + D+N
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNA---ATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GFEIANILKKMKNPPAIVFATAYDQY--ALQAFEVDALDYILKPFDEERIVQTLKKYKKQ 119
F++ +KK + ++ +A + + A++A E A DY+ KPFD ++ + + +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 120 KQSQIE 125
+ +
Sbjct: 122 PKRRPS 127


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_4948  PF06580  231    4e-73    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 231 bits (591), Expect = 4e-73
Identities = 65/216 (30%), Positives = 111/216 (51%), Gaps = 13/216 (6%)

Query: 359 QLELGEAELQSKLLQDAEIKALQAQINPHFLFNAINTVSALCRTDVEKARKLLLQLSVYF 418
Q E+ + ++ + Q+A++ AL+AQINPHF+FNA+N + AL D KAR++L LS
Sbjct: 146 QAEIDQWKMA-SMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTSLSELM 204

Query: 419 RCNLQGARQLLIPLEQELNHVQAYLSLEQARFPNKYEVKMYIEDELKTTLVPPFVLQLLV 478
R +L+ + + L EL V +YL L +F ++ + + I + VPP ++Q LV
Sbjct: 205 RYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLV 264

Query: 479 ENALRHAFPKKQPVCEVEVHVFEKEGMVHFEVKDNGQGIEEERLEQLGKMVVLSKKGTGT 538
EN ++H + ++ + + G V EV++ G + +K+ TGT
Sbjct: 265 ENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN-----------TKESTGT 313

Query: 539 ALYNINERLIGLFGKETMLHIESEVNEGTEVTFVIP 574
L N+ ERL L+G E + + + + +IP
Sbjct: 314 GLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
BALH_4950  TCRTETB  54     7e-10    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 53.7 bits (129), Expect = 7e-10
Identities = 81/419 (19%), Positives = 148/419 (35%), Gaps = 44/419 (10%)

Query: 35 LDMLLLSFVLVYILKEFHLSPVEGGNLTLATTIGMLIGSYLFGFIADLFGRIRTMAFTIL 94
L+ ++L+ L I +F+ P + A + IG+ ++G ++D G R + F I+
Sbjct: 28 LNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGII 87

Query: 95 LFSLATALIYFATDYWQLLIL-RFLVGMGVGGEFGIGMAIVTETWSKEMRAKATSVVALG 153
+ + + + ++ LLI+ RF+ G G + M +V KE R KA ++
Sbjct: 88 INCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSI 147

Query: 154 WQFGVLVASLLPAFIVPHFGWRAVFLFGLIPALLAVYVRKSLSEPKIWEQKQRYKKELL- 212
G V + I + W + L +I + ++ K L + + K +L
Sbjct: 148 VAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILM 207

Query: 213 -----------QKESEGNLTTT-------EAEQLKQMKKFPLRKLFGNKKVTITTIGLII 254
S L + K F L N I ++
Sbjct: 208 SVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMI----GVL 263

Query: 255 MSFIQNFGYYGIFTWMPTILANKYNYTLAKA-SGWMFISTIGMLIGIATFGILADKIGRR 313
I G + +P ++ + + + A+ S +F T+ ++I GIL D+ G
Sbjct: 264 CGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPL 323

Query: 314 KTFTIYYVGGTIYCLIY-FFLFTDSTLLLWG-SALLGFFANGMMGGFGAVLAENYPAEAR 371
I ++ L F L T S + +LG + F + + +
Sbjct: 324 YVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLS------FTKTVISTIVSSSL 377

Query: 372 STAENFIFGTGRGL--------AGFGPVIIGLLAAGGNVMGALSLIFIIYPIGLVTMLL 422
E G G L G G I+G L + + L + + L + LL
Sbjct: 378 KQQEA---GAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTYLYSNLL 433


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
BALH_4955  NUCEPIMERASE  36     1e-04    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 36.3 bits (84), Expect = 1e-04
Identities = 12/26 (46%), Positives = 15/26 (57%)

Query: 6 KVLVLGGTRFFGKHLVEALLKDGHDV 31
K LV G F G H+ + LL+ GH V
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQV 27



 
Contact Sachin Pundhir for Bugs/Comments.