PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: NC_003295.gb
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic): (none shown)
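A minimal sketch of how the E-value cutoff listed above could be applied to RPS-BLAST hits. The `Hit` tuple, the `significant_hits` helper, and the inclusive `<=` comparison are assumptions made for illustration, not PredictBias's actual code. The first two hits use the scores and E-values reported further down this page; the third is made up to show a rejected hit.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One RPS-BLAST hit (hypothetical record layout)."""
    locus_tag: str
    signature: str
    bit_score: float
    e_value: float


# "E-value (RPSBlast)" from the input parameters.
E_VALUE_CUTOFF = 0.05


def significant_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Keep hits with E-value <= cutoff (assumed inclusive), best first."""
    return sorted((h for h in hits if h.e_value <= cutoff),
                  key=lambda h: h.e_value)


hits = [
    Hit("RS_RS00210", "HTHFIS", 30.6, 0.015),
    Hit("RS_RS00200", "HTHFIS", 72.9, 2e-17),
    Hit("EXAMPLE_0001", "MADEUP", 20.0, 0.8),  # invented hit, above the cutoff
]

kept = significant_hits(hits)
for h in kept:
    print(f"{h.locus_tag}\t{h.signature}\t{h.bit_score}\t{h.e_value:g}")
```

Sorting by E-value is a presentation choice only; the page itself lists hits in genome order.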
 
C) Potential GIs and PAIs in NC_003295
S.No  Start       End         Bias  Virulence  Insertion elements  Prediction
1     RS_RS00170  RS_RS00240  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS00170  4       12       3.212412   RNA polymerase sigma factor
RS_RS00175  6       15       3.845231   anti-sigma factor
RS_RS00180  3       10       1.898401   hypothetical protein
RS_RS00185  2       11       0.422199   hypothetical protein
RS_RS00190  2       11       0.280338   membrane protein
RS_RS00195  0       11       0.337367   sensor histidine kinase
RS_RS00200  1       13       0.212189   chemotaxis protein CheY
RS_RS00205  2       14       -0.403977  type III effector protein
RS_RS00210  3       21       -0.237318  ATP-dependent protease ATPase subunit HslU
RS_RS00215  1       19       0.717997   ATP-dependent protease subunit HslV
RS_RS00220  1       17       1.556688   ABC transporter substrate-binding protein
RS_RS00225  1       17       1.618266   GTPase
RS_RS00230  0       16       1.715991   molecular chaperone DnaK
RS_RS00235  0       14       2.383056   cobalamin biosynthesis protein CobW
RS_RS00240  1       12       3.278254   Fur family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00200  HTHFIS        73     2e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 72.9 bits (179), Expect = 2e-17
Identities = 27/121 (22%), Positives = 52/121 (42%), Gaps = 11/121 (9%)

Query: 8 LILDDDDVFAQTLARALTRRGFAPQIAHTGGEALSLARQTPFAYVTVDLHLAASDRGLMP 67
L+ DDD L +AL+R G+ +I V D+ MP
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVV--------MP 58

Query: 68 HGGTDSGLHWIAPLRQALPEARMLVLTGYASIATAVQAVKLGADEYLAKPANVDSILLAL 127
+ + +++A P+ +LV++ + TA++A + GA +YL KP ++ ++ +
Sbjct: 59 DE---NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGII 115

Query: 128 Q 128

Sbjct: 116 G 116



Score = 45.2 bits (107), Expect = 5e-08
Identities = 12/47 (25%), Positives = 22/47 (46%)

Query: 141 ENPAPLSVARLEWEHIQRVLSEHGGNISATARALNMHRRTLQRKLGK 187
+A +E+ I L+ GN A L ++R TL++K+ +
Sbjct: 426 SGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRE 472


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00210  HTHFIS        31     0.015    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.6 bits (69), Expect = 0.015
Identities = 11/36 (30%), Positives = 18/36 (50%), Gaps = 3/36 (8%)

Query: 50 TPKNILMIGPTGVGKTEIAR---RLAKLADAPFIKI 82
T +++ G +G GK +AR K + PF+ I
Sbjct: 159 TDLTLMITGESGTGKELVARALHDYGKRRNGPFVAI 194


2     RS_RS00480  RS_RS00565  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS00480  4       22       0.233060   alkaline phosphatase
RS_RS00485  3       20       -1.074543  alkaline phosphatase
RS_RS00490  1       23       -2.893632  hypothetical protein
RS_RS00495  0       28       -5.014780  phosphomethylpyrimidine kinase
RS_RS00500  1       30       -5.865594  membrane protein
RS_RS00505  0       27       -4.584449  endonuclease DDE
RS_RS00515  -1      23       -3.929611  hypothetical protein
RS_RS00520  2       21       -2.785018  glyoxalase
RS_RS00530  -1      11       1.284479   thiamine monophosphate synthase
RS_RS00535  -1      10       0.230710   thiazole synthase
RS_RS00540  0       10       1.616025   endonuclease DDE
RS_RS00545  1       10       2.271606   thiamine biosynthesis protein ThiS
RS_RS00550  1       10       2.123142   cytochrome C biogenesis protein CcdA
RS_RS00555  1       10       1.660697   phosphomethylpyrimidine synthase
RS_RS00560  3       12       2.449894   2-nitropropane dioxygenase
RS_RS00565  5       13       2.482902   hemagglutinin
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00495  BACINVASINB   31     0.006    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 31.3 bits (70), Expect = 0.006
Identities = 35/133 (26%), Positives = 62/133 (46%), Gaps = 8/133 (6%)

Query: 72 DIVGAQIDAVAQDIGVDAAKTGMLGSPAVVEAIVSALARHPIAALVVDPVMVSTSGAQLG 131
+++G I + +GVD M GS +V AIV+A+A +A +VV V+ + A+LG
Sbjct: 383 ELIGKAITKALEGLGVDKKTAEMAGS--IVGAIVAAIAM--VAVIVVVAVVGKGAAAKLG 438

Query: 132 SDATAQAMAKWLFPRALLITPNLPE-ASALLGRAVRTADDMLPAARDLLTLGSRAVL--L 188
+A ++ M + + + L + S L + ++ L + L + A+ L
Sbjct: 439 -NALSKMMGETIKKLVPNVLKQLAQNGSKLFTQGMQRITSGLGNVGSKMGLQTNALSKEL 497

Query: 189 KGGHLADVAIPGE 201
G L VA+ E
Sbjct: 498 VGNTLNKVALGME 510


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00500  INTIMIN       28     0.011    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 27.7 bits (61), Expect = 0.011
Identities = 18/54 (33%), Positives = 26/54 (48%)

Query: 56 NGRDALTITANGPFVFAATVPFNGGYTVTIGTQPSSASCAVSNGSGIATADVTS 109
+G +A+T TA A + + GT SA+ A +NGSG AT + S
Sbjct: 573 DGTEAITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKS 626


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00565  INTIMIN       39     1e-04    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 39.3 bits (91), Expect = 1e-04
Identities = 69/383 (18%), Positives = 121/383 (31%), Gaps = 27/383 (7%)

Query: 400 TPTSAGTFSFTVTATDSSTGTGPFSATSGTLSLTIAAPTITVSPSTLTSPTVGAASSQSV 459
+ + T A D + S+ + L++T+ + V +T T S+++
Sbjct: 518 VQGGSNVYKVTARAYDRNGN----SSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKAD 573

Query: 460 SASGGTSPYTYA---VTAGALPAGMSLSSAGTLSGTPTA--GGAFSFTATATDST-GGTG 513
T T V +P ++ S + +A G+ T T G
Sbjct: 574 GTEAITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVV 633

Query: 514 PYTGSRSYTLTVNAPTLSITPAAGSLSATSGVAYSQAFVAASGTAPYTYALSVNSGALPT 573
+ T +NA + + ++ + + + A+G TY + V G P
Sbjct: 634 VSAKTAEMTSALNANAVIFVD--QTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPV 691

Query: 574 G---LSFNTATGMLSGTPTTAGTANFTVTGTDSSTGTGPYTVSATYTLTTSAPTISLAPA 630
++F T G LS + T + T +ST G VSA S + +
Sbjct: 692 SNQEVTFTTTLGKLSNSTEKTDTNGYA-KVTLTSTTPGKSLVSA----RVSDVAVDVKAP 746

Query: 631 TLTGATVGTAYSQSVTASG-GATPYTYAITSGALPAGLNLNTGTGALTGTPTAAGTFSFT 689
+ T T ++ G G + L + G G T S
Sbjct: 747 EVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIAS-- 804

Query: 690 VRGTDANAFSGTRSYSLTVAAPTIALTPTTLAGATVSSAYSQSVAASGGTAPYTYAVTSG 749
DA++ T ++ + A T+++ S V Y AV +
Sbjct: 805 ---VDASSGQVTLKEK-GTTTISVISSDNQTATYTIATPNSLIVPNMSKRVTYNDAVNTC 860

Query: 750 ALPAGLSLSSAGVLSGTPTAGGS 772
G SS L A G+
Sbjct: 861 KNFGGKLPSSQNELENVFKAWGA 883



Score = 32.7 bits (74), Expect = 0.015
Identities = 62/388 (15%), Positives = 112/388 (28%), Gaps = 57/388 (14%)

Query: 237 TVTTSGGTSATSSADQFTYAAPPTANASSATVA----HGSSSNAITLSISGGTPTSVAVS 292
+ SG SA +N T +G+SSN + L+I+ V
Sbjct: 498 QIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTIT--------VL 549

Query: 293 TAASHGTATASGTSITYTPTASYSGTDTFAYTASNAGGTSAAATVTITVSSATVSYAPAS 352
+ +A GT+ YTA+ A A V ++ + +
Sbjct: 550 SNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVS------- 602

Query: 353 PAAGTVGTAYSQSVASASGGTSPYTYALASGSLPPGLTLSTGGTLSGTPT---SAGTFSF 409
GTA + ++ + G+ T L S PG + + T T +A F
Sbjct: 603 ------GTAVLSANSANTNGSGKATVTLKSDK--PGQVVVSAKTAEMTSALNANAVIFVD 654

Query: 410 TVTATDSSTGTGPFSATSGTLSLTIAAPTITVSPSTLTSPTVGAASSQSVSASGGTSPYT 469
A+ + +A + IT + + + + + + G +
Sbjct: 655 QTKASITEIKADKTTAVANGQD------AITYTVKVMKGDKPVSNQEVTFTTTLGKLSNS 708

Query: 470 YAVTAGALPAGMSLSSAGTLSGTPTA---GGAFSFTATATDSTGGTGPYTGSRSYTLTVN 526
T A ++L+S +A A A + G N
Sbjct: 709 TEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDG--------N 760

Query: 527 APTLSITPAAGSLSATSGVAYSQAFVAASGTAPYTYALSVNSGALPTGLSFNTATGMLSG 586
+ + A+ G YT+ + + A + SG
Sbjct: 761 IEIVGTGVKGKLPTVWLQYGQVN-LKASGGNGKYTWRSANPAIA---------SVDASSG 810

Query: 587 TPTTAGTANFTVTGTDSSTGTGPYTVSA 614
T T++ S T YT++
Sbjct: 811 QVTLKEKGTTTISVISSDNQTATYTIAT 838


3     RS_RS00720  RS_RS00835  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS00720  0       15       4.686014   membrane protein
RS_RS00725  -2      11       4.037960   signal peptidase
RS_RS00730  -2      10       3.666909   hypothetical protein
RS_RS00735  1       13       3.506342   MarR family transcriptional regulator
RS_RS00740  1       13       3.689666   hypothetical protein
RS_RS00745  0       12       3.611098   ABC transporter permease
RS_RS00750  -1      12       2.481539   lipoprotein
RS_RS00755  0       12       3.500596   peptidase M48
RS_RS00760  1       13       3.551812   muropeptide transporter
RS_RS00765  -1      10       3.634347   hypothetical protein
RS_RS00770  0       10       3.787957   short-chain dehydrogenase
RS_RS00775  0       12       4.040487   diguanylate cyclase
RS_RS00780  0       11       4.533329   arginase
RS_RS00785  -1      10       3.919310   ornithine-oxoacid aminotransferase
RS_RS00790  -1      11       2.916719   LysR family transcriptional regulator
RS_RS00795  1       11       3.596914   aldehyde dehydrogenase
RS_RS00800  0       12       2.776130   TetR family transcriptional regulator
RS_RS00805  -1      9        0.937757   secretion protein HlyD
RS_RS00810  -1      8        -0.566478  multidrug ABC transporter ATP-binding protein
RS_RS00815  -1      15       -3.520158  mannose-1-phosphate guanyltransferase
RS_RS00820  -1      18       -3.485630  RND transporter
RS_RS00825  1       22       -4.889273  membrane protein
RS_RS00830  1       21       -4.528816  hypothetical protein
RS_RS00835  0       21       -3.208456  oxidoreductase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00725  PERTACTIN     29     0.005    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 28.9 bits (64), Expect = 0.005
Identities = 18/68 (26%), Positives = 23/68 (33%), Gaps = 4/68 (5%)

Query: 47 RMAPPPRTMQSINQDLQQHRRPSGKRPDGAPGRGRKPPPDSANADGKPPAPPDGPPPSDA 106
R+A S L + P +P PG P P +PP PP P
Sbjct: 551 RLAANGNGQWS----LVGAKAPPAPKPAPQPGPQPGPQPPQPPQPPQPPQPPQPPQRQPE 606

Query: 107 PPPDGPPS 114
P PP+
Sbjct: 607 APAPQPPA 614


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00760  TCRTETA       46     2e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 45.6 bits (108), Expect = 2e-07
Identities = 78/316 (24%), Positives = 125/316 (39%), Gaps = 30/316 (9%)

Query: 69 LMDRFTPPLIGR------RRGWLVLTQAGLAASIAAMAFCPPHAALWALAGLAVLVAFLS 122
LM P++G RR L+++ AG A A MA P LW L + +VA ++
Sbjct: 54 LMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAP---FLWVLY-IGRIVAGIT 109

Query: 123 ASQDIVFDAYSTDVLHASERGVGAAVKVLGYRLAMLVSGGLALWLADRVMGWSS--MYLL 180
+ V AY D+ ER + G+ A G +A + +MG S
Sbjct: 110 GATGAVAGAYIADITDGDER-----ARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFF 164

Query: 181 MAGLMAL--FVLATLWSPEPEAAARPPRSLQEAVVGPLRDFFARRGAWALLALIVLYKLG 238
A + F+ PE R P + + PL F RG + AL+ ++ +
Sbjct: 165 AAAALNGLNFLTGCFLLPESHKGERRPLRREA--LNPLASFRWARGMTVVAALMAVFFIM 222

Query: 239 DAFAGSLSTTFLIRG---VGFSAGEVGIVNKTLGLVATIIGALYGGTLMVRLGLVRALLL 295
+ ++I G + A +GI G++ ++ A+ G + RLG RAL+L
Sbjct: 223 QLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALML 282

Query: 296 FGVLQAVSNLGYWVLAVTPQHLWTMALAIGIENLCGGMGTAAFVALLMALCNRSFSATQY 355
+ ++L W MA I + GG+G A A+L +
Sbjct: 283 GMIADGTG----YILLAFATRGW-MAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQ 337

Query: 356 ALLSALASIGRVYVGP 371
L+AL S+ + VGP
Sbjct: 338 GSLAALTSLTSI-VGP 352


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00770  DHBDHDRGNASE  64     3e-14    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 63.9 bits (155), Expect = 3e-14
Identities = 72/266 (27%), Positives = 109/266 (40%), Gaps = 24/266 (9%)

Query: 6 NESPILPRIALVTGAARRVGRVIALALARQGWDV-AVHCHRSRAEADAVAAEIIAMGRRA 64
N I +IA +TGAA+ +G +A LA QG + AV + + E V + + A R A
Sbjct: 2 NAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLE--KVVSSLKAEARHA 59

Query: 65 AVLQADLADEAAAGRLIADCTAALGTPTCLVNNASLFQYDVATSFSYASLDTHMRINVAA 124
AD+ D AA + A +G LVN A + + + S S + +N
Sbjct: 60 EAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTG 119

Query: 125 PLLLARELHRVLAAEDGGAGEVRGVVVNLLDQKLDNLNPDFLSYTLSKAALATATTQLAQ 184
+R + + + G+ +V + +Y SKAA T L
Sbjct: 120 VFNASRSVSKYMMDRRSGS------IVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGL 173

Query: 185 ALAPT-LRVVGVAPGATLVSWKQSE-----------SGFERAHKMA-PLGRSSTPEDIAE 231
LA +R V+PG+T + S G K PL + + P DIA+
Sbjct: 174 ELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIAD 233

Query: 232 AVCYLTTARA--VTGTTLFVDGGQHL 255
AV +L + +A +T L VDGG L
Sbjct: 234 AVLFLVSGQAGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00800  HTHTETR       62     7e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 62.3 bits (151), Expect = 7e-14
Identities = 35/211 (16%), Positives = 79/211 (37%), Gaps = 7/211 (3%)

Query: 13 RPRGPVREDLRDRLLDIAVQRFARDGIDATTMAAIAREAGVTAPMVHYHFATRDQLLDAV 72
R ++ R +LD+A++ F++ G+ +T++ IA+ AGVT +++HF + L +
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 73 VDMRLKPLIDEVMDPALTAIPDDGHAPELARIVAGAAQRMVAVASATPWFPTLWIREIAS 132
++ + + ++ D L I+ + V ++ +
Sbjct: 63 WELSESNIGELELEYQAKFPGDP--LSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 133 EGGQLRERVFARIALERATVLVERIARAQAAGAVNAALEPTLVMVSVIGLAMLPLATRAL 192
+ ++ + LE + + + A + A L ++ + L
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAA-IIMRGYISGLMEN-- 177

Query: 193 WGRLPHAETIDAQAIGRHVAALLLHGVGPAP 223
W P ++ D + R A+LL P
Sbjct: 178 WLFAP--QSFDLKKEARDYVAILLEMYLLCP 206


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00805  RTXTOXIND     57     3e-11    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 57.1 bits (138), Expect = 3e-11
Identities = 40/357 (11%), Positives = 90/357 (25%), Gaps = 85/357 (23%)

Query: 44 VASPVGGRLEHLGVQRGQTVSAGTPLFILESTDEAAARQQAAAQQQAAEAQLADLNLGRR 103
+ ++ + V+ G++V G L L + A + + A + + R
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSR 158

Query: 104 VPEVDAV------------RAQLAQAVAADQLSATQRVRDEAQFRAGGIPQAQLDASRST 151
E++ + + + L Q + Q + + A R T
Sbjct: 159 SIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLT 218

Query: 152 AQTNAQRVRELTNQLRI-----------------------AQLPARTDQIHAQSAQVEAA 188
R L+ + + +++ +Q+E
Sbjct: 219 VLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQI 278

Query: 189 RAAVAQAQWRLDQKAQ------------------------------------KATQGGLV 212
+ + A+ Q +A V
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKV 338

Query: 213 FD-TLYREGEWVGAGSPVVRMLPPANV-KVRFFVPEGVVGRLKPGQAVRIRCDG----CA 266
++ EG V ++ ++P + +V V +G + GQ I+ +
Sbjct: 339 QQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRY 398

Query: 267 AEVNATISYV---ASEAEYTPPVIYSNSRRDKLVFMVEARPAAADGPKLRPGQPVEV 320
+ + + A E + V ++ L G V
Sbjct: 399 GYLVGKVKNINLDAIEDQRLGLVFNVIISIEE-----NCLSTGNKNIPLSSGMAVTA 450


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00815  ABC2TRNSPORT  36     1e-04    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 36.4 bits (84), Expect = 1e-04
Identities = 39/192 (20%), Positives = 80/192 (41%), Gaps = 8/192 (4%)

Query: 194 GVILTMTMVMMT----GLAMTRERERGTMENLLATPVRPLEVMTGKIVPYIFIGLVQVSI 249
G++ T M T A R + T E +L T +R +++ G++ + +
Sbjct: 72 GMVATSAMTAATFETIYAAFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAG 131

Query: 250 ILLAARFIFHVPFVGSAWMVYLAALL-FIVASLTVGITLSSLAQNQLQATQLTFFYFLPS 308
I + A + + ++ + + + AL ASL + +T + + + Q P
Sbjct: 132 IGVVAAALGYTQWLSLLYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVI--TPI 189

Query: 309 ILLSGFMFPFAGMPKWAQVIGDVLPMTYFHRLTRGILLKGNGWVELWPSIWPLLVFTAVV 368
+ LSG +FP +P Q LP+++ L R I+L V++ + L ++ +
Sbjct: 190 LFLSGAVFPVDQLPIVFQTAARFLPLSHSIDLIRPIMLGHPV-VDVCQHVGALCIYIVIP 248

Query: 369 MGIALRFYRKTL 380
++ R+ L
Sbjct: 249 FFLSTALLRRRL 260


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00820  RTXTOXIND     35     5e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 35.2 bits (81), Expect = 5e-04
Identities = 30/190 (15%), Positives = 52/190 (27%), Gaps = 35/190 (18%)

Query: 77 LDALVDEALRASPTVAQAAARLREAQAQA--DAQFGAVLPSVDGSASAVRQQVNPEAFGF 134
L AL EA + ARL + + Q + LP + Q V+ E
Sbjct: 127 LTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEE--- 183

Query: 135 NTPKPGPFTLYSASLSVSYALDLFGGVRRALEASRAQVDMQRYELEAARLSLAGNVVTAA 194
V R + Q + + L+L
Sbjct: 184 --------------------------VLRLTSLIKEQFSTWQNQKYQKELNLDK----KR 213

Query: 195 VRIASLDAQIATTQRLLAAQRDQLVITERRFGAGGVARVDVLSQRTQVAQTEATLPPLAQ 254
++ A+I + L ++ +L +A+ VL Q + + L
Sbjct: 214 AERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKS 273

Query: 255 QAAQARHRLS 264
Q Q +
Sbjct: 274 QLEQIESEIL 283


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00835  NUCEPIMERASE  49     6e-09    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 49.4 bits (118), Expect = 6e-09
Identities = 27/131 (20%), Positives = 53/131 (40%), Gaps = 18/131 (13%)

Query: 5 RIVLPGGAGLVGQNLVARLKARGYTNLVVLDK---------HEANLDVLRSVHPDITAVF 55
+ ++ G AG +G ++ RL G+ +V +D +A L++L P
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQ-VVGIDNLNDYYDVSLKQARLELLA--QPGFQFHK 58

Query: 56 ADLAEPGDWARHFE--GADVVVMLQAQIGAP--TREP--FVRNNIDSTRCVLEVIKQHAI 109
DLA+ F + V + ++ P + +N+ +LE + + I
Sbjct: 59 IDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKI 118

Query: 110 PYTVHISSSVV 120
+ ++ SSS V
Sbjct: 119 QHLLYASSSSV 129


4     RS_RS00880  RS_RS01040  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS00880  1       26       -3.063791  glutamine--fructose-6-phosphate aminotransferase
RS_RS00885  2       44       -5.476542  universal stress protein UspA
RS_RS00890  1       46       -5.819168  hypothetical protein
RS_RS00895  1       46       -6.639373  hypothetical protein
RS_RS00900  0       41       -7.163779  histidine kinase
RS_RS00910  0       38       -6.090480  universal stress protein UspA
RS_RS00915  0       36       -6.092359  universal stress protein UspA
RS_RS00920  -1      32       -5.206875  transporter
RS_RS00925  0       29       -5.143815  hypothetical protein
RS_RS00930  1       31       -5.754401  hypothetical protein
RS_RS00935  -1      31       -5.708080  Crp/Fnr family transcriptional regulator
RS_RS00940  -1      33       -5.647456  histidine kinase
RS_RS00945  -1      30       -5.610566  LuxR family transcriptional regulator
RS_RS00955  0       33       -6.034746  hypothetical protein
RS_RS00960  0       31       -5.475414  alcohol dehydrogenase
RS_RS00965  -1      29       -5.583785  ribonucleotide reductase-like protein
RS_RS00970  0       26       -5.288814  cytochrome C
RS_RS00975  -1      34       -5.195694  cytochrome C signal peptide protein
RS_RS00980  -1      34       -5.552593  membrane protein
RS_RS00985  -1      35       -4.702132  cytochrome B561
RS_RS00990  -1      35       -5.058742  heat-shock protein Hsp20
RS_RS00995  -1      36       -5.605183  hypothetical protein
RS_RS01000  -1      38       -5.705229  phosphoribosylpyrophosphate synthetase
RS_RS01005  -1      36       -5.728301  hypothetical protein
RS_RS01010  -1      36       -5.656896  thymidine phosphorylase
RS_RS01015  -1      34       -6.079050  LysR family transcriptional regulator
RS_RS01020  -1      32       -5.250438  membrane protein
RS_RS01025  0       33       -4.839717  transcriptional regulator
RS_RS01030  -1      32       -5.391553  transposase
RS_RS01035  0       34       -5.642607  membrane protein
RS_RS01040  0       31       -4.325296  MarR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00945  PF06580       38     5e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.3 bits (89), Expect = 5e-05
Identities = 28/140 (20%), Positives = 53/140 (37%), Gaps = 14/140 (10%)

Query: 318 PLLDELELVPAIEWIAKNFRQRFGLIVNVHAQEMPMQERAAISVFRIVQEALNNVVRHA- 376
L DEL +V + +A +F + Q P + +VQ + N ++H
Sbjct: 217 SLADELTVVDSYLQLASI---QFEDRLQFENQINPAIMDVQVPPM-LVQTLVENGIKHGI 272

Query: 377 ----DATRVEIEFVRLDDMLELSIQDNGRGWSGVPPAPGERKPLGLLGIRERARLL-GGQ 431
++ ++ + + + L +++ G S E GL +RER ++L G +
Sbjct: 273 AQLPQGGKILLKGTKDNGTVTLEVENTG---SLALKNTKESTGTGLQNVRERLQMLYGTE 329

Query: 432 ATIAHTP-DSGFRLSVRFPA 450
A I + V P
Sbjct: 330 AQIKLSEKQGKVNAMVLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00950  HTHFIS        81     3e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 81.4 bits (201), Expect = 3e-20
Identities = 30/116 (25%), Positives = 55/116 (47%), Gaps = 2/116 (1%)

Query: 2 IRVMIADDHAVMRDGVRHILERAGDFEVACEASDGTQVLQLVRDHQPQVIVLDLSMPGRS 61
+++ADD A +R + L RAG ++V S+ + + + ++V D+ MP +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAG-YDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLELLRQLHDDHPTVRVLVLTMHAEEQYVARAFRAGAAAYLTKESAATELVTALQK 117
+LL ++ P + VLV++ +A GA YL K TEL+ + +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00970  OMADHESIN     29     0.043    Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 29.1 bits (64), Expect = 0.043
Identities = 21/77 (27%), Positives = 37/77 (48%), Gaps = 7/77 (9%)

Query: 190 TLIDGGLAQQVVGAS---AYAFWQRHVYAELGAYRTADQAFSV---FRAGQDTATPGGVA 243
T +D GLA S Y + + A +G YR++ QA ++ +R ++ A GVA
Sbjct: 380 TRVDKGLASSAALNSLFQPYGVGKVNFTAGVGGYRSS-QALAIGSGYRVNENVALKAGVA 438

Query: 244 RLSGANPYWRVAYNQEW 260
++ + ++N EW
Sbjct: 439 YAGSSDVMYNASFNIEW 455


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01025  TCRTETA       35     6e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.8 bits (80), Expect = 6e-04
Identities = 57/354 (16%), Positives = 112/354 (31%), Gaps = 17/354 (4%)

Query: 31 AFILVASEFMPVSLLTPIAGSL----HLTEGQAGQAISVSGAFALLTSLFIP---AMAAR 83
VA + + + L+ P+ L + + +AL+ P A++ R
Sbjct: 10 ILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDR 69

Query: 84 LDRKVLLLLLTLLTVVSGVVVTLAPDYSTFIIGRALIGVAIGGFWSMSAATAMRLVPHHD 143
R+ +LL+ V ++ AP IGR + G+ G +++ A + +
Sbjct: 70 FGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDGDE 128

Query: 144 VPHALAIVNGGNALATVLAAPLGSLLGAVVGWRGAFFCIVPVATIALVWQWMTLPALKAD 203
++ V LG L+G FF + + + LP
Sbjct: 129 RARHFGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPESHKG 187

Query: 204 SSPGAAGDVFTLLRRPAVALGMAAVGLFFMGQFMLF-------TYLRPFLETVTNTRAST 256
+ L A GM V F++ F E + A+T
Sbjct: 188 ERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATT 247

Query: 257 LSFVLLVIGVAGFVGTAVI-GTFLKDGLYRTLIVIPVLMAGIALALASFGGSLAASTILL 315
+ L G+ + A+I G R +++ ++ G L +F + ++
Sbjct: 248 IGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIM 307

Query: 316 AIWGLVATAAPVGWWTWLARTLPQDAETGGGLMVAIIQLAITLGAAGGGAMFDA 369
+ P + + G + A+ L +G A++ A
Sbjct: 308 VLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAA 361


5     RS_RS01190  RS_RS01230  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS01190  7       22       -1.935269  hydrolase
RS_RS01195  6       21       -2.413049  LysR family transcriptional regulator
RS_RS01200  5       24       -3.256003  3-hydroxybutyryl-CoA dehydrogenase
RS_RS01205  10      30       -3.218224  hypothetical protein
RS_RS01210  10      36       -4.552011  hypothetical protein
RS_RS01215  8       32       -4.193494  transposase
RS_RS01220  4       27       -3.954069  transposase
RS_RS01225  3       23       -1.784858  hypothetical protein
RS_RS01230  3       21       -0.791764  transposase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01190  ISCHRISMTASE  27     0.047    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 27.3 bits (60), Expect = 0.047
Identities = 18/97 (18%), Positives = 42/97 (43%), Gaps = 1/97 (1%)

Query: 78 PGKELLERTSMNSWDDQKVRDVLARNGRKKVVVSGLWTEVCNTTFALCAMAEGGYEIYMV 137
+L + +++ + +++ + GR +++++G++ + A A E + + V
Sbjct: 116 DDDLVLTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMED-IKAFFV 174

Query: 138 ADASGGTSKEAHDYAMQRLIQAGAVPVTWQQVLLEWQ 174
DA S E H A++ A V +L + Q
Sbjct: 175 GDAVADFSLEKHQMALEYAAGRCAFTVMTDSLLDQLQ 211


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01225  PF04647       29     0.011    Accessory gene regulator B
		>PF04647#Accessory gene regulator B

Length = 212

Score = 29.4 bits (66), Expect = 0.011
Identities = 14/64 (21%), Positives = 23/64 (35%), Gaps = 3/64 (4%)

Query: 169 VDEMGYLPMDREQANLLFQVIAKRYETGSLVLTSNLPFGQWDQTFAGDATLTAALLDRLL 228
VD Y P ++E+ +V ++L G + L+AA+ R
Sbjct: 13 VDRSDY-PFNQEEIRYGIEVFLGTVFQIIIILLVAFVIGLAKEVAF--CLLSAAVYRRFS 69

Query: 229 HHAH 232
AH
Sbjct: 70 GGAH 73


6     RS_RS01415  RS_RS01550  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS01415  2       13       -1.682523  hypothetical protein
RS_RS01420  1       15       -1.568264  sensor histidine kinase
RS_RS01425  2       19       -1.544740  hypothetical protein
RS_RS01430  1       17       -1.221692  chemotaxis protein CheY
RS_RS01435  2       18       -1.465389  histidine kinase
RS_RS01440  0       17       -0.260723  hypothetical protein
RS_RS01445  0       16       0.577218   5-methyltetrahydrofolate--homocysteine
RS_RS01450  -1      13       2.508677   sulfurtransferase
RS_RS01455  0       11       2.480695   hypothetical protein
RS_RS01460  0       11       3.084649   3-oxoadipate enol-lactonase
RS_RS01465  1       10       3.479700   hypothetical protein
RS_RS01470  -1      10       3.255407   IclR family transcriptional regulator
RS_RS01475  1       8        3.352843   fumarylacetoacetate hydrolase
RS_RS01485  -1      10       3.834075   IclR family transcriptional regulator
RS_RS01490  -1      10       1.810220   hypothetical protein
RS_RS01495  -2      10       1.974526   enoyl-CoA hydratase
RS_RS01500  -1      13       2.398511   hypothetical protein
RS_RS01505  -2      14       2.678870   patatin
RS_RS01510  -3      16       3.181021   glutathione peroxidase
RS_RS01515  0       14       2.882338   ferritin
RS_RS01520  3       13       4.075461   transferase
RS_RS01525  2       12       4.101637   signal peptidase
RS_RS01530  2       12       4.296384   type III pantothenate kinase
RS_RS01535  1       14       3.664773   biotin--[acetyl-CoA-carboxylase] synthetase
RS_RS01540  2       15       3.250335   ABC transporter permease
RS_RS01545  1       12       2.795461   iron ABC transporter ATP-binding protein
RS_RS01550  2       11       2.900021   ABC transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01420  PF06580       34     0.001    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.7 bits (77), Expect = 0.001
Identities = 17/91 (18%), Positives = 35/91 (38%), Gaps = 11/91 (12%)

Query: 360 IVQESLTNASKYAHATIVS---VALDASEEG--VRLRVRDNGRGFPADLERRRMVGHHGL 414
+VQ + N K+ A + + L +++ V L V + G L+ + GL
Sbjct: 259 LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLA---LKNTKESTGTGL 315

Query: 415 LGMEQRAIALGG---TLTIDSTPGGGVTIIV 442
+ +R L G + + G +++
Sbjct: 316 QNVRERLQMLYGTEAQIKLSEKQGKVNAMVL 346


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01430  HTHFIS        53     6e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 52.5 bits (126), Expect = 6e-11
Identities = 26/127 (20%), Positives = 50/127 (39%), Gaps = 3/127 (2%)

Query: 14 RVLLIEDSPVLRGMVLEYLKASAFVAVVEWADTEDLALRLLAQGHYDVVIVDLQLRQGNG 73
+L+ +D +R VL + A V ++ R +A G D+V+ D+ + N
Sbjct: 5 TILVADDDAAIR-TVLNQALSRAGYDVRITSNAAT-LWRWIAAGDGDLVVTDVVMPDENA 62

Query: 74 FKVLQSLRDQASPSVRIVYTNHAQVPTYRQRCFEAGANYFFDKSLELDKVFEVIEERAGM 133
F +L ++ +V + T + E GA + K +L ++ +I
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTA-IKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 134 VRPRPQA 140
+ RP
Sbjct: 122 PKRRPSK 128


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01435  HTHFIS        91     1e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.0 bits (226), Expect = 1e-23
Identities = 37/122 (30%), Positives = 54/122 (44%), Gaps = 2/122 (1%)

Query: 2 IRVLIADDHEIVRAGLRQFLSEERDIEVAGEAASGEEVMEQLRTGTFDVVVLDISMPDRN 61
+L+ADD +R L Q L +V ++ + + G D+VV D+ MPD N
Sbjct: 4 ATILVADDDAAIRTVLNQAL-SRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GIDTLKLVRQRHPDLPVLILSTFPEDQYAINLIRAGASGYLTKESAPDELVKAIRTVSQG 121
D L +++ PDLPVL++S AI GA YL K EL+ I
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 122 RR 123
+
Sbjct: 122 PK 123


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01450  YERSINIAYOPE  28     0.045    Yersinia virulence determinant YopE protein signature.
		>YERSINIAYOPE#Yersinia virulence determinant YopE protein signature.

Length = 219

Score = 28.2 bits (62), Expect = 0.045
Identities = 24/119 (20%), Positives = 43/119 (36%), Gaps = 3/119 (2%)

Query: 215 SGTVTDASGRILSGQTVEAFWNSL--RHAKPITFGLNCALGAALMRPYIAELAKICDTAV 272
S +V + SGR +S QT + + N+L R P L + L + + I
Sbjct: 20 SSSVGEMSGRSVSQQTSDQYANNLAGRTESPQGSSLASRIIERLSSVAHSVIGFI-QRMF 78

Query: 273 SCYPNAGLPNPMSDTGFDETPEVTSSLVDEFAAAGLVNLVGGCCGTTPEHIRAIAERVA 331
S + + P +P S + + AA L + E ++ ++ A
Sbjct: 79 SEGSHKPVVTPAPTPAQMPSPTSFSDSIKQLAAETLPKYMQQLNSLDAEMLQKNHDQFA 137


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01470  TONBPROTEIN   40     5e-06    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 40.4 bits (94), Expect = 5e-06
Identities = 30/141 (21%), Positives = 44/141 (31%), Gaps = 13/141 (9%)

Query: 22 RWRRWGVVALAALLAHGIAVIWVARSHQVMWPPAPEQ------VVPTLLLQP-------E 68
R W + + +A + HQV+ PAP Q V P L P E
Sbjct: 7 RRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADLEPPQAVQPPPE 66

Query: 69 PVHPPAPPAPAPVAAKPRPPAPRPHPQHAPTPTPEPAPTVPDMADTELPELTGTGAQASA 128
PV P P P P+ P P P+P V + ++ + A
Sbjct: 67 PVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASPFE 126

Query: 129 DTGVVTDLGAERPDAPAAPPG 149
+T + A + P
Sbjct: 127 NTAPARLTSSTATAATSKPVT 147



Score = 35.3 bits (81), Expect = 2e-04
Identities = 23/107 (21%), Positives = 37/107 (34%), Gaps = 3/107 (2%)

Query: 53 PPAPEQVVPTLLLQPEPVHPPAPPAPAPVAA-KPRP-PAPRPHPQHAPTPTPEPAPTVPD 110
PP Q P +++PEP P P P +P P P+P P+ +P V
Sbjct: 57 PPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKP 116

Query: 111 MADTELPELTGTGAQASADTGVVTDLGAERPDAPAAPPGPGFALPPS 157
++ A A + T ++ + A+ P P
Sbjct: 117 -VESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQ 162



Score = 30.3 bits (68), Expect = 0.009
Identities = 22/112 (19%), Positives = 32/112 (28%), Gaps = 5/112 (4%)

Query: 44 VARSHQVMWPPAPEQVVPTLLLQPEPVHPPAPPAPAPVAAKPRPPAPR-PHPQHAPTPTP 102
V + + P PE + PV P KP P P +
Sbjct: 61 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 120

Query: 103 EPAPTVPDMADTELPELTGTGAQASADTGVVTDLGAERPDAPAAPPGPGFAL 154
+P +T LT + A A+ V + R + P P A
Sbjct: 121 PASP----FENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQ 168


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01530  PF03309       136    1e-40    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 136 bits (343), Expect = 1e-40
Identities = 55/280 (19%), Positives = 100/280 (35%), Gaps = 34/280 (12%)

Query: 13 LLLIDAGNTRIKWAWTAADVAPPAVAPGGTPWQ--HAGARPHDQLAELVEDWRDCHAGAG 70
LL ID NT + V W+ D+LA + G
Sbjct: 2 LLAIDVRNTHTVVGLISGSGDHAKVVQ---QWRIRTEPEVTADELALTI------DGLIG 52

Query: 71 MAPPDV----WISVVAGPALRDALCARIARVFDGARLRIVASEAAAAGLRNGYRDPAQLG 126
+ +S V P++ + + + + ++ G+ +P ++G
Sbjct: 53 DDAERLTGASGLSTV--PSVLHEVRVMLEQYW-PNVPHVLIEPGVRTGIPLLVDNPKEVG 109

Query: 127 TDRWVGAVGARHAWPDTALLLVTAGTATTLDIVAPDGRFAGGLILPGLTLMMRALSRNTA 186
DR V + A H + A ++V G++ +D+V+ G F GG I PG+ + A + +A
Sbjct: 110 ADRIVNCLAAYHKYGT-AAIVVDFGSSICVDVVSAKGEFLGGAIAPGVQVSSDAAAARSA 168

Query: 187 QLPEIDIGYLAARDDAQAPADVPSWADNTQDAIALGCVTAQAGAI----AQTWQALQAQY 242
L +++ P V NT + + G V AG + + +
Sbjct: 169 ALRRVELT---------RPRSV--IGKNTVECMQAGAVFGFAGLVDGLVNRIRDDVDGFS 217

Query: 243 PGPYRCVLSGGARAALAPHLRMPFQMHDNLVLLGLQVLAH 282
V +G + P LR +L L GL+++
Sbjct: 218 GADVAVVATGHTAPLVLPDLRTVEHYDRHLTLDGLRLVFE 257


7  RS_RS01990  RS_RS02050  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
RS_RS01990  -1      13       -3.516914   ATP-dependent protease
RS_RS01995  0       16       -3.840575   glmZ(sRNA)-inactivating NTPase
RS_RS02000  2       16       -3.722221   hypothetical protein
RS_RS02005  0       15       -2.899458   HPr kinase/phosphorylase
RS_RS02010  -1      19       -3.366136   PTS sugar transporter subunit IIA
RS_RS02015  -2      19       -2.533430   ribosome hibernation promoting factor
RS_RS02020  -2      18       -1.114511   RNA polymerase sigma-54 factor
RS_RS02025  0       14       0.741345    LPS export ABC transporter ATP-binding protein
RS_RS02030  1       14       1.266441    organic solvent tolerance protein OstA
RS_RS02035  1       13       1.071598    LPS export ABC transporter periplasmic protein
RS_RS02040  3       14       1.536561    3-deoxy-D-manno-octulosonate 8-phosphate
RS_RS02045  3       16       1.674565    KpsF/GutQ family protein
RS_RS02050  3       17       1.586453    potassium transporter
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS02035  TYPE3IMQPROT  27     0.018    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein

family signature.
Length = 86

Score = 27.0 bits (60), Expect = 0.018
Identities = 8/36 (22%), Positives = 17/36 (47%)

Query: 12 SLLQIMLRGLPILLMAVVCGVTFLLVQVNTPQTEET 47
+L +++ ++A + G+ L Q T E+T
Sbjct: 11 ALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQT 46


8  RS_RS02095  RS_RS02175  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
RS_RS02095  0       16       4.024766    hypothetical protein
RS_RS02100  0       12       5.561902    hypothetical protein
RS_RS02105  0       11       5.558277    3-oxoacyl-ACP synthase
RS_RS02110  -1      11       5.403499    3-oxoacyl-ACP synthase
RS_RS02115  1       11       5.330950    polysaccharide deacetylase
RS_RS02120  1       12       5.470020    membrane protein
RS_RS02125  1       11       4.719568    acyltransferase
RS_RS02130  1       13       3.444759    acyl-CoA synthetase
RS_RS02140  3       17       3.758135    beta-hydroxyacyl-ACP dehydratase
RS_RS02145  4       17       3.121216    membrane protein
RS_RS02150  4       16       2.739216    acyl carrier protein
RS_RS02155  5       15       3.546299    3-ketoacyl-ACP reductase
RS_RS02160  3       13       3.408747    glycosyl transferase family 2
RS_RS02165  2       12       2.216753    membrane protein
RS_RS02170  2       11       2.481980    hypothetical protein
RS_RS02175  2       11       2.099398    alpha/beta hydrolase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS02125  ACRIFLAVINRP  35     0.002    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 34.8 bits (80), Expect = 0.002
Identities = 30/189 (15%), Positives = 74/189 (39%), Gaps = 14/189 (7%)

Query: 637 QVRAAVE--RAGVPDALFIDMKAEADRLYVNYVREDIRLSLAGVGAIALLLAIALRSARR 694
++A + + P + + + + E ++ + + L++ + L++ R
Sbjct: 305 AIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNMRA 364

Query: 695 AM-----LALAPLAAAVLMVAAGFALAGVPLTILHL-IGMLL---IVAVGSNYALFFSRR 745
+ + + L ++ A G+++ + + + L IG+L+ IV V + + +
Sbjct: 365 TLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMMEDK 424

Query: 746 ANASAGAQPVTPQTLVSLLIANLATVAGFGLLAL---SRVPMLETFGLTVGPGAILALVF 802
+ Q +L+ + A F +A S + F +T+ L+++
Sbjct: 425 LPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLV 484

Query: 803 AAILAPAAT 811
A IL PA
Sbjct: 485 ALILTPALC 493


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS02155  DHBDHDRGNASE  100    2e-27    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 99.7 bits (248), Expect = 2e-27
Identities = 70/247 (28%), Positives = 112/247 (45%), Gaps = 14/247 (5%)

Query: 3 ALVTGGSGAIGQAVSEALARAGHEVWVHANRNLEQAEAVAQRIVAAGGTAHAIAFDVTDA 62
A +TG + IG+AV+ LA G + + N E+ E V + A A A DV D+
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHI-AAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDS 69

Query: 63 QATEAALAPVL-DGAPVQILVNNAGVHDDAPMAGMTQQQWRRVIDVSLNGFFNVTQPLLM 121
A + A + + P+ ILVN AGV + ++ ++W V+ G FN ++ +
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSK 129

Query: 122 PMIRTRWGRIINMASVAGVMGNRGQANYAAAKAGLIGATKSLAQELASRGITVNAVAPGI 181
M+ R G I+ + S + A YA++KA + TK L ELA I N V+PG
Sbjct: 130 YMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGS 189

Query: 182 IASPMAGEAFP------------AERIKQLVPAGRAGQPDEVAGMVAYLASDAAAYVTGQ 229
+ M + E K +P + +P ++A V +L S A ++T
Sbjct: 190 TETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHITMH 249

Query: 230 VLSINGG 236
L ++GG
Sbjct: 250 NLCVDGG 256


9  RS_RS02510  RS_RS02575  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
RS_RS02510  2       12       3.698901    Fis family transcriptional regulator
RS_RS02515  2       11       3.764852    tRNA-dihydrouridine synthase B
RS_RS02520  4       11       4.313818    transcriptional regulator
RS_RS02525  4       12       4.592473    FAD-dependent oxidoreductase
RS_RS02530  5       12       4.515392    2-octaprenyl-6-methoxyphenyl hydroxylase
RS_RS02535  5       13       3.740011    proline aminopeptidase P II
RS_RS02540  1       11       1.987209    membrane protein
RS_RS02545  -1      12       1.766694    mannose-1-phosphate guanylyltransferase
RS_RS02550  0       12       1.732917    branched-chain amino acid transporter
RS_RS02555  -1      12       1.815022    branched-chain amino acid ABC transporter
RS_RS02560  0       13       1.722368    aminoglycoside phosphotransferase
RS_RS02565  0       13       1.208124    LPS-assembly protein LptD
RS_RS02570  2       12       2.444658    chaperone SurA
RS_RS02575  2       10       1.699748    4-hydroxythreonine-4-phosphate dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS02510  DNABINDNGFIS  74     2e-21    DNA-binding protein FIS signature.
		>DNABINDNGFIS#DNA-binding protein FIS signature.

Length = 98

Score = 74.2 bits (182), Expect = 2e-21
Identities = 29/75 (38%), Positives = 55/75 (73%)

Query: 2 SRNPIERCIRDSLDTYYRDLDGENPSNVYDMVLQAIERPLLETVMEWASNNQSLAADYLG 61
++ P+ ++ +L Y+ L+G++ +++Y++VL +E+PLL+ VM++ NQ+ AA +G
Sbjct: 23 TQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQPLLDMVMQYTRGNQTRAALMMG 82

Query: 62 INRNTLRKKLQQHGL 76
INR TLRKKL+++G+
Sbjct: 83 INRGTLRKKLKKYGM 97


10  RS_RS02650  RS_RS02680  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
RS_RS02650  -1      14       3.215801    hypothetical protein
RS_RS02655  -1      13       4.000918    membrane protein
RS_RS02660  -2      12       3.535416    membrane protein
RS_RS02665  -1      13       3.995957    3-ketoacyl-ACP reductase
RS_RS02670  -1      11       3.614466    acyl-CoA dehydrogenase
RS_RS02675  -1      12       3.926139    TetR family transcriptional regulator
RS_RS02680  -1      12       3.492917    RNA helicase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS02680  DHBDHDRGNASE  87     2e-21    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 87.0 bits (215), Expect = 2e-21
Identities = 64/255 (25%), Positives = 112/255 (43%), Gaps = 14/255 (5%)

Query: 229 LAGRTALVTGASRGIGAAIAQVLARDGARVLCLD-VPAAQPALDGVARAIG--GEALAYD 285
+ G+ A +TGA++GIG A+A+ LA GA + +D P + +A EA D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 286 IAEAETPARLARQLAA-MGGIDILVHNAGITRDKTIARMTEAAWRSVLDINLAAQLRIND 344
+ ++ + ++ MG IDILV+ AG+ R I +++ W + +N +
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 345 ALLAAGALHAGGAIVCVSSISGIAGNPGQTNYATAKAGVIGLVQACAPLLAERGITINAV 404
++ G+IV V S YA++KA + + LAE I N V
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 405 APGFIETQMTAAVPFTIREAGRRMNAMAQG----------GQPVDVAEAIAWLACPASNG 454
+PG ET M ++ A + + + +P D+A+A+ +L +
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 455 VTGNVVRVCGQSLLG 469
+T + + V G + LG
Sbjct: 246 ITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS02690  HTHTETR       61     5e-13    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 60.8 bits (147), Expect = 5e-13
Identities = 24/104 (23%), Positives = 35/104 (33%), Gaps = 5/104 (4%)

Query: 1 MARPGAG---DTKNRILEATELLFIEFGYEAMSLRQITARAKVNLAAVNYHFGSKEALMQ 57
MAR +T+ IL+ LF + G + SL +I A V A+ +HF K L
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 58 SVLGRRLDPLNTRRLALLTACEE--RWPQRLSCEHVLGALFVPA 99
+ + L R HVL +
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEE 104


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS02695  SECA          30     0.033    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.033
Identities = 19/65 (29%), Positives = 33/65 (50%), Gaps = 4/65 (6%)

Query: 253 VLVFTRTKHGANRLAEQLTRDGIPALAIHG-NKSQSARTRALSEFKAGTLRVLVATDIAA 311
VLV T + + ++ +LT+ GI ++ + A A + + A V +AT++A
Sbjct: 452 VLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQAGYPAA---VTIATNMAG 508

Query: 312 RGIDI 316
RG DI
Sbjct: 509 RGTDI 513


11  RS_RS02755  RS_RS02785  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
RS_RS02755  2       21       -2.274462   two-component system response regulator
RS_RS02760  2       18       -3.007063   protein RecA
RS_RS02765  1       19       -3.581955   regulatory protein RecX
RS_RS02770  -1      23       -4.588626   hypothetical protein
RS_RS02775  1       26       -5.469438   succinyl-CoA ligase subunit beta
RS_RS02780  1       28       -5.263028   succinyl-CoA synthetase subunit alpha
RS_RS02785  -1      28       -3.070067   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS02755  HTHFIS        93     6e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 92.6 bits (230), Expect = 6e-24
Identities = 42/157 (26%), Positives = 79/157 (50%), Gaps = 5/157 (3%)

Query: 2 RILIAEDDATLADGLTRSLRQAGYAVDRAADGAAADAALSAQTAQTYDLLILDVGLPRLS 61
IL+A+DDA + L ++L +AGY V ++ A ++A DL++ DV +P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD---GDLVVTDVVMPDEN 61

Query: 62 GLEVLKRLRSRGAMLPVLILTAADSVDERVKGLDLGADDYMAKPFALSELEARVRALVRR 121
++L R++ LPVL+++A ++ +K + GA DY+ KPF L+EL + +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 122 GTGGGATLVRHGPLAFDQVGRIAYIRD--QMVDLSAR 156
+ L VGR A +++ +++ +
Sbjct: 122 PKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQ 158


12  RS_RS02890  RS_RS02935  Y  N  Y  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias     Product
RS_RS02890  -1      22       -3.211297   translation factor Sua5
RS_RS02895  0       25       -4.673258   transposase
RS_RS02900  -1      15       -2.073213   transposase
RS_RS02905  2       14       -0.955158   lipase
RS_RS02910  1       17       -1.398694   phosphoglycolate phosphatase
RS_RS02915  1       14       -0.356366   molybdopterin-guanine dinucleotide biosynthesis
RS_RS02920  0       11       2.426545    D-alanyl-D-alanine carboxypeptidase
RS_RS02925  1       12       3.490093    amino acid ABC transporter permease
RS_RS02930  2       13       3.515917    hypothetical protein
RS_RS02935  1       11       3.062353    hypothetical protein
13  RS_RS03065  RS_RS03210  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
RS_RS03065  2       19       0.369031    thiamine biosynthesis lipoprotein ApbE
RS_RS03070  3       22       -1.670950   membrane protein
RS_RS03075  1       22       -3.323785   hypothetical protein
RS_RS03080  1       24       -3.838526   hypothetical protein
RS_RS03085  0       26       -5.014438   LysR family transcriptional regulator
RS_RS03090  1       30       -7.107021   hypothetical protein
RS_RS03095  1       29       -5.545182   signal peptidase
RS_RS03100  2       29       -5.248323   recombination-associated protein RdgC
RS_RS03105  2       30       -5.442399   cytochrome B561
RS_RS03110  1       28       -4.823939   hypothetical protein
RS_RS03115  0       30       -4.919126   hypothetical protein
RS_RS03120  -2      28       -4.183146   DEAD/DEAH box helicase
RS_RS03125  1       29       -4.794165   hypothetical protein
RS_RS03130  1       30       -4.280247   IMP dehydrogenase
RS_RS03135  2       32       -4.102091   AraC family transcriptional regulator
RS_RS03140  2       33       -4.521666   (2Fe-2S)-binding protein
RS_RS03145  2       32       -4.067345   aldehyde oxidase
RS_RS03150  1       33       -4.515149   lipoprotein
RS_RS03155  -1      32       -5.031677   cytochrome C
RS_RS03160  1       38       -6.273844   hypothetical protein
RS_RS03165  2       39       -6.661345   hypothetical protein
RS_RS03170  1       38       -6.192257   hydroxyacid dehydrogenase
RS_RS03175  2       41       -6.629712   FMN reductase
RS_RS03180  1       37       -6.714511   transcriptional regulator
RS_RS03185  1       40       -6.908162   hypothetical protein
RS_RS03190  2       25       -5.259716   transcriptional regulator
RS_RS03195  1       28       -5.373942   endonuclease DDE
RS_RS03205  -1      18       -4.735223   heat-shock protein
RS_RS03210  -1      15       -3.717190   molecular chaperone GroES
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03195  HTHTETR       62     6e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 62.0 bits (150), Expect = 6e-14
Identities = 28/160 (17%), Positives = 56/160 (35%), Gaps = 2/160 (1%)

Query: 30 RERKRRDTLQRIAQTGLDLFIAKGYEATTLDDVAAAAGISRRTFFHYFTSKDEILLAWQV 89
+++ ++T Q I L LF +G +T+L ++A AAG++R + +F K ++
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 90 GLVDALYDAVLQAATDQ--SPLEALCGALQKLASHFDAEKAIDIARVLRSSEQLRAANHA 147
+ + L+ PL L L + E+ + + + A
Sbjct: 65 LSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMA 124

Query: 148 KFMNLENAAFEALCRLWPQRGRRGLLRMVAMAGLGAFRVA 187
+ Q + + + A L R A
Sbjct: 125 VVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAA 164


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03215  SHAPEPROTEIN  49     1e-08    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein

signature.
Length = 347

Score = 49.4 bits (118), Expect = 1e-08
Identities = 57/239 (23%), Positives = 89/239 (37%), Gaps = 61/239 (25%)

Query: 1 MMSNACGVDFGTSNSTVGWLRSDAPTLLPLEDGKVT-LPSVIFFHAEDPLVSYGRAALTD 59
M SN +D GT+N+ + G V PSV+ + +
Sbjct: 8 MFSNDLSIDLGTANTLI----------YVKGQGIVLNEPSVV------AIRQDRAGSPKS 51

Query: 60 YLA-GYEGRLM-----------RSLKSLLGTSMMDESTEVMGQAMPFRKLLAHFIGELKR 107
A G++ + M R +K G TE M L HFI K+
Sbjct: 52 VAAVGHDAKQMLGRTPGNIAAIRPMKD--GVIADFFVTEKM---------LQHFI---KQ 97

Query: 108 RAERAAGHEFRRAVLGRPVFFIDEDPKADQLAEDTLSEIARDAGFDEIAFQYEPIAAAFD 167
+ R ++ PV A Q+ + E A+ AG E+ EP+AAA
Sbjct: 98 VHSNSFMRPSPRVLVCVPV-------GATQVERRAIRESAQGAGAREVFLIEEPMAAAIG 150

Query: 168 YEAGISGEELVLVADIGGGTSDFSLVRLSPDRAKRPDRREDILANGGVHIGGTDFDRAL 226
+S +V DIGGGT++ +++ L+ ++ + V IGG FD A+
Sbjct: 151 AGLPVSEATGSMVVDIGGGTTEVAVISLN-----------GVVYSSSVRIGGDRFDEAI 198


14  RS_RS03255  RS_RS03285  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
RS_RS03255  2       10       -0.948011   type II secretion system protein
RS_RS03260  2       13       0.177218    pilus assembly protein TadB
RS_RS03265  4       15       0.531107    pilus assembly protein CpaF
RS_RS03270  3       15       1.424318    pilus assembly protein
RS_RS03280  2       13       1.484579    hypothetical protein
RS_RS03285  3       14       1.891945    secretin
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03285  HTHFIS        37     1e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 37.1 bits (86), Expect = 1e-04
Identities = 25/123 (20%), Positives = 44/123 (35%), Gaps = 7/123 (5%)

Query: 2 AKIAVVSADESHLQFLAGLVTQAA--NHVVQRARTTPSQALAQPGLTLGTDLLVLEAGNF 59
A I V D + L +++A + A T A G DL+V +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDG-----DLVVTDVVMP 58

Query: 60 TADDIDQLRRLTAEQQDTLCMLLTENPSAELLMRAMRAGVQCVLPWPPDAQEFRDEVQRC 119
+ D L R+ + D ++++ + ++A G LP P D E + R
Sbjct: 59 DENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118

Query: 120 TSH 122
+
Sbjct: 119 LAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03295  BCTERIALGSPD  125    3e-32    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D

signature.
Length = 660

Score = 125 bits (315), Expect = 3e-32
Identities = 55/268 (20%), Positives = 113/268 (42%), Gaps = 29/268 (10%)

Query: 260 RIINMMSVEAPQQVMLEVKVAEVSKTLINQMGAA---------------LNLNGSF-GSW 303
R+I + + QV++E +AEV +G L ++ + G+
Sbjct: 335 RVIAQLDIR-RPQVLVEAIIAEVQDADGLNLGIQWANKNAGMTQFTNSGLPISTAIAGAN 393

Query: 304 TFGALANFLSGAPDIFSADKANKLPF-----QLRVDAQKGDGLVKVLAEPNLMAISGQEA 358
+ S S+ F + + A +LA P+++ + EA
Sbjct: 394 QYNKDGTVSSSLASALSSFNGIAAGFYQGNWAMLLTALSSSTKNDILATPSIVTLDNMEA 453

Query: 359 SFLAGGKVYIPVPQSFGTGTT-TILLQEETFGVGLRFTPTVLENGRINLKVAPEVSELSP 417
+F G +V + +G ++ +T G+ L+ P + E + L++ EVS ++
Sbjct: 454 TFNVGQEVPVLTGSQTTSGDNIFNTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSSVAD 513

Query: 418 TGVAVTAPNVSGTSILPLITTRRASTTLQVHDGQSFAIGGLIKSNVTGSLKAIPGVGELP 477
+ + + TR + + V G++ +GGL+ +V+ + +P +G++P
Sbjct: 514 AAS------STSSDLGATFNTRTVNNAVLVGSGETVVVGGLLDKSVSDTADKVPLLGDIP 567

Query: 478 VIGALARSTSFQQDQTELLFVVTPHLVK 505
VIGAL RSTS + + L+ + P +++
Sbjct: 568 VIGALFRSTSKKVSKRNLMLFIRPTVIR 595


15  RS_RS03590  RS_RS03660  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
RS_RS03590  2       17       1.536806    3,4-dihydroxy-2-butanone 4-phosphate synthase
RS_RS03595  2       18       1.738255    riboflavin synthase subunit alpha
RS_RS03600  2       19       1.788773    diaminohydroxyphosphoribosylaminopyrimidine
RS_RS03605  2       25       1.565990    pilus assembly protein
RS_RS03610  1       26       0.873943    pilus assembly protein
RS_RS03615  0       26       0.069171    pilus assembly protein PilY
RS_RS03620  3       17       -3.752611   pilus assembly protein PilZ
RS_RS03625  4       18       -4.402937   pilus assembly protein PilW
RS_RS03630  5       20       -5.374730   membrane protein
RS_RS03635  4       21       -5.679788   general secretion pathway protein GspH
RS_RS03640  4       20       -5.727109   pilus assembly protein PilE
RS_RS03645  4       17       -5.296925   pilus biosynthesis protein PilY
RS_RS03650  -1      17       -4.356724   pilus assembly protein PilX
RS_RS03655  -1      16       -4.163155   pilus assembly protein PilW
RS_RS03660  -1      14       -3.466323   pilus assembly protein PilV
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03605  BCTERIALGSPG  29     0.009    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G

signature.
Length = 145

Score = 28.7 bits (64), Expect = 0.009
Identities = 11/31 (35%), Positives = 17/31 (54%)

Query: 1 MTRVAWQAGVTLAELTMVLAILAILAVFAMP 31
M Q G TL E+ +V+ I+ +LA +P
Sbjct: 1 MRATDKQRGFTLLEIMVVIVIIGVLASLVVP 31


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03610  BCTERIALGSPG  41     3e-07    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G

signature.
Length = 145

Score = 40.6 bits (95), Expect = 3e-07
Identities = 19/88 (21%), Positives = 32/88 (36%), Gaps = 2/88 (2%)

Query: 1 MLRPTGFTFIELLITLAIAGVLA-LAASQAWSGAYLRTERMAAHAALVATMAALEQHHAQ 59
+ GFT +E+++ + I GVLA L + ++ A + +VA AL+ +
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKE-KADKQKAVSDIVALENALDMYKLD 62

Query: 60 TGSYALPGDDTSPLDRWPHTLEHPRGYR 87
Y L P Y
Sbjct: 63 NHHYPTTNQGLESLVEAPTLPPLAANYN 90


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03620  BCTERIALGSPG  29     0.006    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G

signature.
Length = 145

Score = 29.5 bits (66), Expect = 0.006
Identities = 10/25 (40%), Positives = 15/25 (60%)

Query: 3 MRPLRPSRGFMLVFVMTSTVLLGLL 27
MR RGF L+ +M V++G+L
Sbjct: 1 MRATDKQRGFTLLEIMVVIVIIGVL 25


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03635  BCTERIALGSPG  40     1e-06    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G

signature.
Length = 145

Score = 39.9 bits (93), Expect = 1e-06
Identities = 19/58 (32%), Positives = 31/58 (53%), Gaps = 3/58 (5%)

Query: 9 RRRSARGFTLLELMVTIAIISIMLVLVAPSF---SDFLRKQRLLSAADSVASAIGQAR 63
RGFTLLE+MV I II ++ LV P+ + KQ+ +S ++ +A+ +
Sbjct: 3 ATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYK 60


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03640  BCTERIALGSPG  47     2e-09    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G

signature.
Length = 145

Score = 47.2 bits (112), Expect = 2e-09
Identities = 13/43 (30%), Positives = 31/43 (72%)

Query: 12 QRGFTLIELMIVVIVVAILSTIAYPSYTQFVQKSRRTQAKAAL 54
QRGFTL+E+M+V++++ +L+++ P+ +K+ + +A + +
Sbjct: 7 QRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDI 49


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03655  BCTERIALGSPG  33     6e-04    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G

signature.
Length = 145

Score = 32.9 bits (75), Expect = 6e-04
Identities = 15/33 (45%), Positives = 22/33 (66%), Gaps = 1/33 (3%)

Query: 10 RRTRGFSLVELMVALVIALLVLAATVSFYLMTR 42
+ RGF+L+E+MV +VI + VLA+ V LM
Sbjct: 5 DKQRGFTLLEIMVVIVI-IGVLASLVVPNLMGN 36


16  RS_RS03715  RS_RS03770  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
RS_RS03715  2       19       -1.345192   7-cyano-7-deazaguanine synthase
RS_RS03720  0       20       -1.448736   transporter
RS_RS03730  2       21       -1.710675   *hypothetical protein
RS_RS03735  1       18       -0.864908   chemotaxis protein CheY
RS_RS03740  0       16       0.207173    chemotaxis protein CheZ
RS_RS03745  -1      14       0.833541    phospho-2-dehydro-3-deoxyheptonate aldolase
RS_RS03750  0       11       2.318965    endonuclease
RS_RS03755  3       12       2.945318    nucleoside 2-deoxyribosyltransferase
RS_RS03760  3       11       3.556906    taurine dioxygenase
RS_RS03765  3       14       3.239981    NUDIX hydrolase
RS_RS03770  2       15       2.950529    ABC transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03735  HTHFIS        78     7e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 77.6 bits (191), Expect = 7e-20
Identities = 32/135 (23%), Positives = 58/135 (42%), Gaps = 9/135 (6%)

Query: 1 MDKSQYRFLVVDDFPTMRRIIRGQLKELGFANIDEAEDGTAGLSKIKESRFDFVISDWNM 60
M + LV DD +R ++ L G+ ++ + I D V++D M
Sbjct: 1 MTGA--TILVADDDAAIRTVLNQALSRAGY-DVRITSNAATLWRWIAAGDGDLVVTDVVM 57

Query: 61 PKMDGLQMLQAIRA-DPNPGISKLPVLIVTAEAKKENVIAAAQAGASGYVVKPFTAATLG 119
P + +L I+ P+ LPVL+++A+ I A++ GA Y+ KPF L
Sbjct: 58 PDENAFDLLPRIKKARPD-----LPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELI 112

Query: 120 EKLNKIFEKFERAEA 134
+ + + +R +
Sbjct: 113 GIIGRALAEPKRRPS 127


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03745  PF07520       29     0.037    Virulence protein SrfB
		>PF07520#Virulence protein SrfB

Length = 1041

Score = 29.2 bits (65), Expect = 0.037
Identities = 12/31 (38%), Positives = 15/31 (48%)

Query: 310 LSSGERRISGVMIESHLEEGRQDLKPGVPLQ 340
L G R G++IE E R DL PL+
Sbjct: 283 LDIGNSRTCGILIERFPGETRVDLTRSFPLE 313


17  RS_RS04090  RS_RS04515  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
RS_RS04090  2       26       -2.593136   glutathione S-transferase
RS_RS04095  3       29       -3.555662   endoglucanase
RS_RS04100  4       35       -6.421908   hypothetical protein
RS_RS04105  3       44       -8.850967   transcriptional regulator
RS_RS04110  1       47       -8.928082   NADH-ubiquinone oxidoreductase subunit 3
RS_RS04115  2       46       -8.985508   POPP protein
RS_RS04120  2       44       -8.100667   hypothetical protein
RS_RS04125  3       53       -10.968526  transposase
RS_RS04130  2       50       -10.521168  transposase
RS_RS04135  4       46       -9.241596   hypothetical protein
RS_RS04140  4       45       -9.666775   hypothetical protein
RS_RS04145  3       40       -8.326693   hypothetical protein
RS_RS04150  3       40       -8.417783   hypothetical protein
RS_RS04155  1       20       -3.085526   hypothetical protein
RS_RS04160  0       18       -1.469465   hypothetical protein
RS_RS04165  2       16       -1.075342   hypothetical protein
RS_RS04170  2       17       -0.595947   hypothetical protein
RS_RS04180  0       19       -1.471230   hypothetical protein
RS_RS04185  0       16       -1.047476   hypothetical protein
RS_RS04190  0       17       -0.982236   hypothetical protein
RS_RS04200  -1      17       -1.359160   hypothetical protein
RS_RS04205  -1      16       -1.690696   transposase
RS_RS04210  0       14       -1.276062   DNA methylase N-4
RS_RS04215  1       20       -2.640699   DNA methylase N-4
RS_RS04220  0       32       -3.566259   hypothetical protein
RS_RS04225  1       25       -1.929729   hypothetical protein
RS_RS04230  -2      15       -0.617389   hypothetical protein
RS_RS04235  -1      12       -0.149027   hypothetical protein
RS_RS04240  -1      11       0.247496    hypothetical protein
RS_RS04245  -1      12       0.725979    hypothetical protein
RS_RS04250  -1      10       1.148983    hypothetical protein
RS_RS04255  0       13       0.925264    terminase
RS_RS04260  2       17       1.291469    hypothetical protein
RS_RS04265  0       15       0.267398    hypothetical protein
RS_RS04270  1       16       0.229919    hypothetical protein
RS_RS04275  1       15       0.738472    hypothetical protein
RS_RS04280  2       15       -0.248710   hypothetical protein
RS_RS04285  2       15       -0.811205   hypothetical protein
RS_RS04290  1       14       -1.082674   hypothetical protein
RS_RS04295  -1      18       -0.251512   hypothetical protein
RS_RS04300  1       18       -0.645028   hypothetical protein
RS_RS04305  1       28       -4.307966   hypothetical protein
RS_RS04310  1       34       -4.198212   hypothetical protein
RS_RS04315  1       36       -4.643237   hypothetical protein
RS_RS04320  1       35       -4.872823   hypothetical protein
RS_RS04325  0       23       -1.591735   DNA-binding protein
RS_RS04330  -1      23       -1.370922   YOPP/AvrRxv family protein
RS_RS25165  0       14       0.777440    hypothetical protein
RS_RS04340  0       13       0.679663    recombinase
RS_RS04345  0       13       0.957561    prevent-host-death protein
RS_RS04350  0       12       1.164269    membrane protein
RS_RS04355  3       11       1.309039    hypothetical protein
RS_RS04360  3       11       1.375761    membrane protein
RS_RS04365  0       17       1.243765    hypothetical protein
RS_RS04370  0       22       1.170435    hypothetical protein
RS_RS04375  0       23       0.777892    hypothetical protein
RS_RS04380  0       25       0.052155    hypothetical protein
RS_RS04385  0       29       -2.690443   hypothetical protein
RS_RS04390  3       29       -4.433868   membrane protein
RS_RS04395  4       14       -0.942859   membrane protein
RS_RS04400  4       14       -1.144075   glycoside hydrolase
RS_RS04405  4       14       -1.493769   hypothetical protein
RS_RS04410  4       14       -1.529477   hypothetical protein
RS_RS04415  4       14       -1.540070   HrgA protein
RS_RS04420  4       14       -1.278683   membrane protein
RS_RS04425  2       24       -3.705107   transposase
RS_RS04430  0       16       -2.230608   hypothetical protein
RS_RS04435  -1      12       -1.554929   recombinase
RS_RS04440  -1      12       -1.469642   hypothetical protein
RS_RS04445  -1      14       -1.004727   transposase
RS_RS04450  0       13       -1.181473   membrane protein
RS_RS04455  -2      9        -1.702616   phosphoglycolate phosphatase
RS_RS04460  -1      14       -3.257245   ubiquinone biosynthesis O-methyltransferase
RS_RS04465  0       13       -3.793369   membrane protein
RS_RS04470  -2      11       -3.179942   membrane protein
RS_RS04475  -2      11       -2.906626   DNA gyrase subunit A
RS_RS04480  -1      11       -2.128656   hypothetical protein
RS_RS04485  -2      11       -0.414983   phosphoserine aminotransferase
RS_RS04490  -1      14       -0.286354   chorismate mutase
RS_RS04495  2       13       -1.612242   histidinol-phosphate aminotransferase
RS_RS04500  3       15       -1.896525   prephenate dehydrogenase
RS_RS04505  4       15       -2.408100   3-phosphoshikimate 1-carboxyvinyltransferase
RS_RS04510  3       14       -2.496839   cytidylate kinase
RS_RS04515  2       12       -3.225969   30S ribosomal protein S1
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS04115  HTHTETR       63     2e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 63.5 bits (154), Expect = 2e-14
Identities = 39/209 (18%), Positives = 76/209 (36%), Gaps = 16/209 (7%)

Query: 19 PTQERSRATVDAIMQAATYILVKFGWERLTTNAIAERAGVNIGSLYQYFPNKQAIIAELQ 78
T++ ++ T I+ A + + G + IA+ AGV G++Y +F +K + +E+
Sbjct: 4 KTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIW 63

Query: 79 RRHVLQTREILSKALPQVSEQ--RSLREALTLLV-----------VAMIDEHRVAPAVHR 125
E+ + + LRE L ++ + I H+
Sbjct: 64 ELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEM 123

Query: 126 AIDAELPRSVRVPLGDESRGGDQPLQVLQPFMKNVPDPHFALRIARIAAHAVIHEAASHQ 185
A+ + R++ + D + + ++ A I R ++
Sbjct: 124 AVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADL-MTRRAAIIMRGYISGLMENWLF-A 181

Query: 186 PELFDRPDFVDEVVA-LLEGYFRRPAARS 213
P+ FD + VA LLE Y P R+
Sbjct: 182 PQSFDLKKEARDYVAILLEMYLLCPTLRN 210


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS04150  SECA          56     4e-10    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 56.4 bits (136), Expect = 4e-10
Identities = 19/23 (82%), Positives = 22/23 (95%)

Query: 847 RKIGRNEPCPCGSGAKYKRCHGR 869
RK+GRN+PCPCGSG KYK+CHGR
Sbjct: 877 RKVGRNDPCPCGSGKKYKQCHGR 899


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS04190  PF05272       49     5e-08    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 49.3 bits (117), Expect = 5e-08
Identities = 95/361 (26%), Positives = 139/361 (38%), Gaps = 94/361 (26%)

Query: 31 LLDRLDSVLAILFPAGKKRRNKFVIGDIQGNLGDSLEIVLDGEKAGLWTDRATGDGG--- 87
LL R +L P G +++ G + G GDS ++ + G W D +TG+ G
Sbjct: 18 LLTRAKDLLPEWLPGGVLVGHEYECGSLAGGKGDSCKVNV---TTGKWCDFSTGESGRDL 74

Query: 88 -DVFAVIAGTLGVDVQTEFPR---VLARAADLLGLASTQPVRRK-RKEPPTDDL------ 136
D++A I G + R + + A ++G + P + R EPP +
Sbjct: 75 LDLYAEIHGLKVSKAAAQVAREEGLESVAGIVMGAPAGAPAPKPPRPEPPPRPVVEKECW 134

Query: 137 ------------------GPETAKWDYLDAAGR------LIGVVYRYDPPGRGKEFRPWD 172
P+ + D ++ R L G V R+ K P+
Sbjct: 135 ETIQPVPEHAVPPSFWHPAPKGREPDKIEHTARYQVGPVLWGYVVRFIKSDGDKLTLPYV 194

Query: 173 AKRRKMAPP---------DPRPLYNQPGLASATQ-VVLVEGEK---CAQALIDAG---IV 216
R + DPRPLY A ++ VVLVEGE+ C Q L+DAG +
Sbjct: 195 YSRSQRDGSEAWKWRGWDDPRPLYFPSHRAPESRTVVLVEGERKADCLQQLLDAGAPGVY 254

Query: 217 ATTAMHGANAPVEKTDWSPLAGKAVLIWPDRD--------------------------KP 250
+ G + K DWS LAG V++WPD D KP
Sbjct: 255 CVASWPGGSNGWPKADWSWLAGCTVVLWPDCDSLREKLTRQELKDTPDPLAREKLLAAKP 314

Query: 251 GWEYADRASQ-AILQAGALL------VAILLP---PDDKPEGWDAADAIE-EGFDVAGYL 299
+ + Q A+L G +L +LP P KP GWD DAIE +G+D+ L
Sbjct: 315 YLPFDKQPGQKAMLGIGEVLRDTHGCTVQMLPIDKPGIKPSGWDCRDAIETDGWDIDRVL 374

Query: 300 A 300
A
Sbjct: 375 A 375


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS04295  FLGFLIH       28     0.005    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 28.2 bits (62), Expect = 0.005
Identities = 9/26 (34%), Positives = 14/26 (53%)

Query: 21 VWRPSDGGPPRTNMVGFAAPDETLLD 46
W P D PP+ V P+ET+++
Sbjct: 9 TWTPDDLAPPQAEFVPIVEPEETIIE 34


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS04325  ACRIFLAVINRP  26     0.032    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 26.0 bits (57), Expect = 0.032
Identities = 11/50 (22%), Positives = 21/50 (42%), Gaps = 4/50 (8%)

Query: 5 KDLLSQKAKLEEQLEAAR---QKELAEITAQV-RQVVQEYGLTAEDIGLA 50
LL A+ L + R ++ A+ +V ++ Q G++ DI
Sbjct: 699 NQLLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQT 748


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS04350  GPOSANCHOR    49     1e-07    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 48.9 bits (116), Expect = 1e-07
Identities = 63/419 (15%), Positives = 136/419 (32%), Gaps = 32/419 (7%)

Query: 498 GEVEQAVSKASSTVNDATARMAEAYKGLTSIIDGHLQRQVEAVKARYQQELSALERSGQA 557
G V ++ T + + + + +++ AL+
Sbjct: 32 GLVVNTNEVSAVATRSQTDTLEKVQERADKFEIENNTLKLKNSD--LSFNNKALKDHNDE 89

Query: 558 QAVQIATSTQLLVGALTQQTALRQQAATEALKLIDDESRARVDAVAREGKTEAERAANVQ 617
+++ +L ++A+ + A T
Sbjct: 90 LTEELS---NAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTL 146

Query: 618 RVENEILATRRQTLTQAASEYRQHIDALNAEANRHLTEIRRIEDEKRQLSMSTEERIRDI 677
E LA R+ L +A A +A+ E +E + +L + E
Sbjct: 147 EAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGA---- 202

Query: 678 RRAGLSDYEAQEDRKRQIAEYQASARAALADGEFDQARQRASKAMDLAAQVASAQSGEAK 737
++ A + + + +A+ A AD E + ++
Sbjct: 203 ----MNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMN------------FSTADSA 246

Query: 738 RAEDARKQSEAAVTQVAQLEAQAREATGRREYAQAEALTRQADELRARSAQQAANADAQA 797
+ + + A + A+LE A A+ T +A++ + + +Q
Sbjct: 247 KIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQV 306

Query: 798 VRGKAAVNEAIGRIRDSQAILNQTLDAEAQAHQRAAQSAVSARQDIQQTLAQTDSQIAQL 857
A +++ R D+ + L+AE Q + + + ++RQ +++ L + QL
Sbjct: 307 ---LNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQL 363

Query: 858 TAKLQQGLKVTIDADTQRFDKAIADLDKALVERERLVVIKADLEQAEKTLQDYEQRLKE 916
A+ Q+ + ++ R DLD + RE ++ LE+A L E+ KE
Sbjct: 364 EAEHQKLEEQNKISEASRQS-LRRDLDAS---REAKKQVEKALEEANSKLAALEKLNKE 418


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS04425  IGASERPTASE   37     3e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 36.6 bits (84), Expect = 3e-04
Identities = 10/78 (12%), Positives = 24/78 (30%)

Query: 30 TPPTQSAATATANTEQEQRLRQQQQAREREQAVQAPAVRAQQAAPAEFPELPTETPCFRI 89
PP + + T T E ++ + + EQ + ++ A + T +
Sbjct: 1026 PPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEV 1085

Query: 90 DRFALEVPQDLPEAARTK 107
+ E + +
Sbjct: 1086 AQSGSETKETQTTETKET 1103


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS04455  PERTACTIN     32     0.026    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 31.6 bits (71), Expect = 0.026
Identities = 45/155 (29%), Positives = 57/155 (36%), Gaps = 19/155 (12%)

Query: 368 APDGSINFAQLGGKPAPEAKPAA-PAPAAQPKPLDVAVGKLQLANSSVHWRDATTQPAAD 426
A +G+ ++ +G K P KPA P P P+P R
Sbjct: 553 AANGNGQWSLVGAKAPPAPKPAPQPGPQPGPQPPQPPQPPQPPQPPQPPQRQPEAPAPQP 612

Query: 427 LVLDDLHG--DAAVRTLGGPTTLDVGAKLASGGTLNVKGSTSLEKRTGELELKLEALKLA 484
+L +AAV T G + TL S +L KR GEL L +A
Sbjct: 613 PAGRELSAAANAAVNTGG----------VGLASTLWYAESNALSKRLGELRLNPDAGGAW 662

Query: 485 GIGPYLRQAGAPQLQNGALSA-DGKIA-LEFGADK 517
G G RQ QL N A D K+A E GAD
Sbjct: 663 GRGFAQRQ----QLDNRAGRRFDQKVAGFELGADH 693


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
RS_RS04475  OMPADOMAIN  135    2e-40    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 135 bits (340), Expect = 2e-40
Identities = 63/147 (42%), Positives = 87/147 (59%), Gaps = 5/147 (3%)

Query: 73 APAAAPQAQPTPAAKPVVGSEKVTFAADALFDFDKAVLKPEGKSKLDDLVSKLKGITLE- 131
AAP P PA P V ++ T +D LF+F+KA LKPEG++ LD L S+L + +
Sbjct: 193 QGEAAPVVAPAPAPAPEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKD 252

Query: 132 -VVIAVGHTDSFGSDKYNDRLSVRRAEAVKGYLVSKGIEANRVYTEGKGKRQLKVDPKSC 190
V+ +G+TD GSD YN LS RRA++V YL+SKGI A+++ G G+ V +C
Sbjct: 253 GSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYLISKGIPADKISARGMGESN-PVTGNTC 311

Query: 191 KG--ARKAQIACQQPNRRVEVEVVGTR 215
R A I C P+RRVE+EV G +
Sbjct: 312 DNVKQRAALIDCLAPDRRVEIEVKGIK 338


18  RS_RS04735  RS_RS04780  Y  N  N  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS04735  3       21       -3.332737  RNA helicase
RS_RS04740  5       28       -4.949320  hypothetical protein
RS_RS04745  5       27       -4.844020  hypothetical protein
RS_RS04750  4       25       -4.273599  type VI secretion protein
RS_RS04755  3       18       -4.113262  hypothetical protein
RS_RS04760  4       19       -4.191843  hypothetical protein
RS_RS04765  3       21       -3.208185  hypothetical protein
RS_RS04775  3       20       -3.153335  hypothetical protein
RS_RS04780  2       16       -2.195473  hypothetical protein
19  RS_RS05200  RS_RS05250  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS05200  1       16       -3.205766  3-oxoacyl-ACP synthase III
RS_RS05205  2       21       -3.234894  malonyl CoA-ACP transacylase
RS_RS05210  0       19       -4.078938  3-ketoacyl-ACP reductase
RS_RS05215  1       17       -3.331723  acyl carrier protein
RS_RS05220  0       16       -2.943552  3-oxoacyl-ACP synthase
RS_RS05225  -1      15       -3.343488  RNA polymerase sigma factor RpoE
RS_RS05230  -1      12       -3.326050  anti-sigma factor
RS_RS05235  -1      13       -3.823995  sugar dehydratase
RS_RS05240  -2      14       -3.609203  peptidase
RS_RS05245  -1      12       -4.144814  glutaredoxin
RS_RS05250  -1      11       -3.496734  elongation factor 4
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS05210  DHBDHDRGNASE  125    3e-37    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 125 bits (316), Expect = 3e-37
Identities = 81/257 (31%), Positives = 125/257 (48%), Gaps = 10/257 (3%)

Query: 3 QALNNQVALVTGASRGIGRAIALELARQGATVVGTATSEAGAQAISAYFAEAGVKGAGVV 62
+ + ++A +TGA++GIG A+A LA QGA + + + + +
Sbjct: 4 KGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 LNVNDAARCEAVIDEAIKTHGGLNILVNNAGITQDNLAMRMKDDEWMAVIDTNLSAVFRL 122
+V D+A + + + G ++ILVN AG+ + L + D+EW A N + VF
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 123 SRAVLRPMMKARGGRIINITSVVGSAGNPGQANYAAAKAGVEGMARAMAKEIGSRNITVN 182
SR+V + MM R G I+ + S A YA++KA + + E+ NI N
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 183 SVAPGFIDTDMTKVLSDEQHTA----------LKAQIPLGRLGMPEDIANAVAFLASPAA 232
V+PG +TDM L +++ A K IPL +L P DIA+AV FL S A
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQA 243

Query: 233 GYITGATLHVNGGMYMG 249
G+IT L V+GG +G
Sbjct: 244 GHITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
RS_RS05240  V8PROTEASE  79     4e-18    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 78.9 bits (194), Expect = 4e-18
Identities = 36/157 (22%), Positives = 62/157 (39%), Gaps = 24/157 (15%)

Query: 114 SRGVGSGFIMSGDGYVLTNAHVVEGAETIYVTLTDKR-----------EFKA-KLIGSDK 161
+ SG ++ G +LTN HVV+ L F A ++
Sbjct: 100 GTFIASGVVV-GKDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSG 158

Query: 162 RTDVALVKVEAT--------GLPSLKLGDSDKVRVGEWVLAIGSPFGLDNTVTAGIVSAK 213
D+A+VK + + ++ + +V + + G P A + +K
Sbjct: 159 EGDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKPV---ATMWESK 215

Query: 214 GRDTGDYLPFIQSDVAVNPGNSGGPLINLRGEVIGIN 250
G+ T +Q D++ GNSG P+ N + EVIGI+
Sbjct: 216 GKITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIH 252


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS05250  TCRTETOQM  138    8e-37    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 138 bits (350), Expect = 8e-37
Identities = 107/497 (21%), Positives = 200/497 (40%), Gaps = 99/497 (19%)

Query: 3 NIRNFSIIAHIDHGKSTLADRIIQLCGG---LSDREMEAQVLDSMDIEKERGITIKAQTA 59
I N ++AH+D GK+TL + ++ G L + D+ +E++RGITI QT
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITI--QTG 59

Query: 60 ALSYKARDGKVYNLNLIDTPGHVDFSYEVSRSLSACEGALLVVDASQGVEAQTVANCYTA 119
S++ + KV N+IDTPGH+DF EV RSLS +GA+L++ A GV+AQT +
Sbjct: 60 ITSFQWENTKV---NIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHAL 116

Query: 120 IELGVEVVPVLNKIDLPAADPDSAIQEIEDVIGIDA--------------QDATRCSAKT 165
++G+ + +NKID D + Q+I++ + + + T
Sbjct: 117 RKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWD 176

Query: 166 GV--GVPDVLEALIAKVPAPKGDPDAPLQALIID---------SWFDNYVGVVMLVRVVN 214
V G D+LE ++ + + + S +N +G+ L+ V+
Sbjct: 177 TVIEGNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNN-IGIDNLIEVIT 235

Query: 215 GTL-------------------------------------RAKDKVLLMATGAQHLVEQV 237
+D V + + E
Sbjct: 236 NKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKIKITEMY 295

Query: 238 GVFSPKSVPRESLSAGQVGFVIAGIKELKAAKVGDTITHVVPRKADAPLPGFKEVKPQVF 297
+ + + +G++ + +L + +GDT + + PLP +
Sbjct: 296 TSINGELCKIDKAYSGEIVILQNEFLKLNSV-LGDTKLLPQRERIENPLPLLQTT----- 349

Query: 298 AGLYPVEANQYEALRESLEKLQLNDASLQFE-PEVSQALGFGFRCGFLGLLHMEIVQERL 356
+ P + Q E L ++L ++ +D L++ + + FLG + ME+ L
Sbjct: 350 --VEPSKPQQREMLLDALLEISDSDPLLRYYVDSATHEIIL----SFLGKVQMEVTCALL 403

Query: 357 EREFDMDLITTAPTVVYQVMQR----DGTTVQVENPAK--------MPDPSKIESILEPI 404
+ ++ +++ PTV+Y M+R T+ +E P P + S ++
Sbjct: 404 QEKYHVEIEIKEPTVIY--MERPLKKAEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYE 461

Query: 405 VTVNL-YMPQEYVGAVI 420
+V+L Y+ Q + AV+
Sbjct: 462 SSVSLGYLNQSFQNAVM 478



Score = 48.7 bits (116), Expect = 5e-08
Identities = 28/118 (23%), Positives = 48/118 (40%), Gaps = 18/118 (15%)

Query: 362 MDLITTAPTVVYQVMQRDGTTVQVENPAKMPDPSKIESILEPIVTVNLYMPQEYVGAVIT 421
D AP V+ QV+++ GT +LEP ++ +Y PQEY+ T
Sbjct: 514 ADFRMLAPIVLEQVLKKAGTE-----------------LLEPYLSFKIYAPQEYLSRAYT 556

Query: 422 LCEQKRGSQINMSYHGRQVQLTYEIPMGEIVLDFFDRLKSVSRGYASMDYEFKEYRPS 479
+ + ++ +V L+ EIP I ++ L + G + E K Y +
Sbjct: 557 DAPKYCANIVDTQLKNNEVILSGEIPARCI-QEYRSDLTFFTNGRSVCLTELKGYHVT 613


20  RS_RS05335  RS_RS05480  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS05335  1       11       3.090574   peptidoglycan-binding protein LysM
RS_RS05340  2       14       2.396600   sensor histidine kinase
RS_RS05345  -1      12       1.294773   transcriptional regulator
RS_RS05350  -1      13       1.307729   transglycosylase
RS_RS05355  0       13       1.889619   GntR family transcriptional regulator
RS_RS05360  0       13       1.783517   glucarate dehydratase
RS_RS05365  1       12       1.949262   hexuronate transporter ExuT
RS_RS05370  1       11       2.283062   alpha-glucosidase
RS_RS05375  2       12       2.413857   porin
RS_RS05380  1       12       3.101977   aldose epimerase
RS_RS05385  2       15       1.485629   membrane protein
RS_RS05390  3       16       -0.031234  maleylacetoacetate isomerase
RS_RS05395  4       15       -0.193238  5-carboxymethyl-2-hydroxymuconate isomerase
RS_RS05400  3       14       -0.833387  gentisate 1,2-dioxygenase
RS_RS05405  2       15       -0.567135  naphthalene 1,2-dioxygenase
RS_RS05410  0       15       -0.044175  salicylate-5-hydroxylase small oxygenase NagH
RS_RS05415  1       16       0.934346   salicylate 5-hydroxylase large oxygenase NagG
RS_RS05420  1       16       1.821997   naphthalene 1,2-dioxygenase
RS_RS05425  2       16       1.714843   LysR family transcriptional regulator
RS_RS05430  1       14       1.798942   4-hydroxybenzoate transporter
RS_RS05435  3       15       1.106353   LysR family transcriptional regulator
RS_RS05440  2       14       0.351318   transporter
RS_RS05445  2       13       -0.551375  aspartate aminotransferase
RS_RS05450  4       16       -1.379433  transcription regulator protein
RS_RS05455  4       16       -1.338125  AraC family transcriptional regulator
RS_RS05460  4       17       -1.312140  hypothetical protein
RS_RS05465  4       17       -1.637026  sarcosine oxidase subunit beta
RS_RS05470  4       19       -1.932289  sarcosine oxidase subunit delta
RS_RS05475  3       21       -2.355206  sarcosine oxidase subunit alpha
RS_RS05480  3       23       -3.320394  sarcosine oxidase subunit gamma
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits  Score  E-value  Comments
RS_RS05345  SECA  30     0.011    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.011
Identities = 10/46 (21%), Positives = 23/46 (50%), Gaps = 1/46 (2%)

Query: 105 RPPELVARLDALIRRVAVSRAVRAAEIKL-GHYTVDRQNRSIHLRE 149
E+ R++ +I + + + GH++VD ++R ++L E
Sbjct: 231 DSSEMYKRVNKIIPHLIRQEKEDSETFQGEGHFSVDEKSRQVNLTE 276


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS05365  TCRTETB  42     4e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 41.8 bits (98), Expect = 4e-06
Identities = 31/164 (18%), Positives = 62/164 (37%), Gaps = 11/164 (6%)

Query: 10 WIIALVCLGTITNYLARNSLGVLAPQLKTELGMSTQQYSYVVGAFQIGYTIMQPVCGFIV 69
W+ L + + L V P + + ++V AF + ++I V G +
Sbjct: 18 WLCILSFFSVLNEMV----LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLS 73

Query: 70 DLIGLR----IGFALFAVLWSIAGCLHAGASGWLSLASFRGLMGLTEAAAIPSGMKAVAE 125
D +G++ G + S+ G + L +A F + G AA M VA
Sbjct: 74 DQLGIKRLLLFGIIIN-CFGSVIGFVGHSFFSLLIMARF--IQGAGAAAFPALVMVVVAR 130

Query: 126 WFPDKEKSVAVGYFNAGTSLGALLAPPLVVFLSLRYGWQSAFVV 169
+ P + + A G + ++G + P + ++ W ++
Sbjct: 131 YIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLI 174


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS05375  ECOLNEIPORIN  93     4e-23    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 92.6 bits (230), Expect = 4e-23
Identities = 77/357 (21%), Positives = 120/357 (33%), Gaps = 59/357 (16%)

Query: 28 AAAQSSVTLYGVVDTAFAYSSNQGGHS-----NTYMSQGNLLASKFGLSGTEDLGGGTQA 82
AA + VTLYG + S + + + L SK G G EDLG G +A
Sbjct: 15 VAAMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLGSKIGFKGQEDLGNGLKA 74

Query: 83 LFRLESGFNSATGAQAAAGYLFNRLAFVGLSDKTYGALTLGRQYTPYFQYVAALGPTNVL 142
++++E + A NR +F+GL +G L +GR + +
Sbjct: 75 IWQVEQKASIAGTDSG----WGNRQSFIGLK-GGFGKLRVGRLNSVLKDTGDINPWDSKS 129

Query: 143 T---GATGAHPGDVDALDTTLRFNNSVTYTLPVIGGLQAGVQYGMGEQPGSISGGSSFSA 199
A P SV Y P GL VQY + + G S+ A
Sbjct: 130 DYLGVNKIAEPEA---------RLISVRYDSPEFAGLSGSVQYALNDNAGR-HNSESYHA 179

Query: 200 ALRYDYQAFAWSAGYIHLKNIPTSNSVGTFANNSPVNSGYASADSAQLIATAARYTFGKL 259
Y F G G + + V + + Q+ + Y L
Sbjct: 180 GFNYKNGGFFVQYG-------------GAYKRHHQVQEN-VNIEKYQIHRLVSGYDNDAL 225

Query: 260 MVGLNYSNVQYKPGAGSLFTQAAMFNS-YGLISTYA-----LTPAITVAAGYSYTAEKAR 313
Y++V + L + NS + +T A +TP + Y++ + +
Sbjct: 226 -----YASVAVQQQDAKLVEENYSHNSQTEVAATLAYRFGNVTPRV----SYAHGFKGSF 276

Query: 314 NGISSPARYHQFSMEQTYALSKRTAFYALEAYQKASGKTLRTVGGVSTIVDTVASVG 370
+ + Y Q + Y SKRT+ S L+ G S V T VG
Sbjct: 277 DATNYNNDYDQVVVGAEYDFSKRTSAL-------VSAGWLQEGKGESKFVSTAGGVG 326


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS05385  ECOLNEIPORIN  74     7e-17    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 74.1 bits (182), Expect = 7e-17
Identities = 76/355 (21%), Positives = 130/355 (36%), Gaps = 55/355 (15%)

Query: 8 LSLLLAAPALASAQSVTLYGVVDTGVEYVNRIGTTGN--SVVRMPNLSGTVPSRWGLRGM 65
++L LAA +A+ VTLYG + GVE + G + V + S+ G +G
Sbjct: 6 IALTLAALPVAAMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLGSKIGFKGQ 65

Query: 66 EDLGGGTRALFALESGFATDSGTANQGGRLFGRQAWVGLSNSSWGQLSFGRQYTMLF--- 122
EDLG G +A++ +E + A RQ+++GL +G+L GR ++L
Sbjct: 66 EDLGNGLKAIWQVEQK----ASIAGTDSGWGNRQSFIGLK-GGFGKLRVGRLNSVLKDTG 120

Query: 123 ----WATLDTDLLGPNAYGSSSLDNYLPNARADNAIAYKGKFGGLTVGATYSFGRDTVNA 178
W +D LG N + L + R D+ +F GL+ Y+ +
Sbjct: 121 DINPW-DSKSDYLGVNKIAEP--EARLISVRYDSP-----EFAGLSGSVQYALNDN---- 168

Query: 179 GPSPSGTNCAGENPADSLACREWSAMLMYETGRWGLSAAY------DSLRGGPGAFAGLT 232
AG + ++S + A Y+ G + + +
Sbjct: 169 ---------AGRHNSES-----YHAGFNYKNGGFFVQYGGAYKRHHQVQENVNIEKYQIH 214

Query: 233 SSALKDDRLAL--GGYALFDNARIGLGWIGR--RNGALATPRSDLFYAGAAYDVTPALTL 288
D AL +A++ + AT + V+ A
Sbjct: 215 RLVSGYDNDALYASVAVQQQDAKLVEENYSHNSQTEVAATLAYR--FGNVTPRVSYAHGF 272

Query: 289 AGEVFHLRYHNSANKAWLGAVRASYALSKRSSVYATVGYIDNGGSLALSVSSAAT 343
G Y+N ++ +G A Y SKR+S + G++ G + VS+A
Sbjct: 273 KGSFDATNYNNDYDQVVVG---AEYDFSKRTSALVSAGWLQEGKGESKFVSTAGG 324


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS05425  PF05043  32     0.003    Transcriptional activator
		>PF05043#Transcriptional activator

Length = 493

Score = 31.8 bits (72), Expect = 0.003
Identities = 15/54 (27%), Positives = 27/54 (50%)

Query: 8 LNLLVVFNELLRQRRVSAVAETLGITQPAISNALNRLRKLLGDELFVRTSKGMI 61
L LL + E R S +AE L T+ A+ + L+ ++ D +F ++ G+
Sbjct: 13 LELLELLFEHKRWFHRSELAELLNCTERAVKDDLSHVKSAFPDLIFHSSTNGIR 66


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS05430  TCRTETB  56     1e-10    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 56.4 bits (136), Expect = 1e-10
Identities = 40/160 (25%), Positives = 73/160 (45%), Gaps = 4/160 (2%)

Query: 48 APALAQDWHIGPAVLGTVFSAGLAGLMVGALIFGPLADRIGRKQTLLFTVGAFGLASVLS 107
P +A D++ PA V +A + +G ++G L+D++G K+ LLF + SV+
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 108 AFAPS-VGALVALRFLTGLGLGGAMPNAIA-LTSEYCPERRRAFLTTVMFCGFTLGSGFG 165
S L+ RF+ G G A P + + + Y P+ R ++ +G G G
Sbjct: 97 FVGHSFFSLLIMARFIQGAG-AAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVG 155

Query: 166 GIVAAQLVPEFGWRSVLLFGGVIPLLLLPVMVFALPESVR 205
+ + W +LL +I ++ +P ++ L + VR
Sbjct: 156 PAIGGMIAHYIHWSYLLLI-PMITIITVPFLMKLLKKEVR 194


21  RS_RS06210  RS_RS06240  Y  N  N  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS06210  2       9        -0.109912  3D-(3,5/4)-trihydroxycyclohexane-1,2-dione
RS_RS06215  4       8        -0.683956  5-dehydro-2-deoxygluconokinase
RS_RS06220  6       11       -1.641584  rhizopine-binding protein
RS_RS06225  6       11       -1.130491  D-ribose transporter ATP-binding protein
RS_RS06230  6       13       -0.825100  sugar ABC transporter permease
RS_RS06235  5       13       -0.887516  protein iolH
RS_RS06240  2       11       -0.406319  protein implicated IN myo-inositol catabolique
22  RS_RS06375  RS_RS06450  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS06375  4       9        0.415350   ATPase
RS_RS06380  4       9        0.483313   hypothetical protein
RS_RS06385  4       9        -0.462228  carbonate dehydratase
RS_RS06390  2       12       -3.012858  cytochrome oxidase maturation protein Cbb3
RS_RS06395  2       12       -2.830643  cytochrome oxidase subunit I
RS_RS06400  0       14       -2.234511  peptidase S41
RS_RS06405  0       15       -2.671140  cytochrome oxidase
RS_RS06410  1       17       -2.000292  cytochrome CBB3
RS_RS06415  0       18       -1.325779  cytochrome C oxidase
RS_RS06420  0       20       0.257141   membrane protein
RS_RS06425  1       22       -0.660997  membrane protein
RS_RS06430  2       19       -1.427423  transcriptional regulator
RS_RS06435  3       15       -0.645705  membrane protein
RS_RS06440  4       13       -0.900513  SMC-Scp complex subunit ScpB
RS_RS06445  4       12       -0.462392  hypothetical protein
RS_RS06450  3       12       -1.217951  ribosome maturation factor RimP
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS06425  ECOLNEIPORIN  28     0.004    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 27.8 bits (62), Expect = 0.004
Identities = 8/29 (27%), Positives = 16/29 (55%), Gaps = 1/29 (3%)

Query: 1 MKRRLLMWILWPAFLAAALAELVVFSVVD 29
MK+ L+ L A AA+A++ ++ +
Sbjct: 1 MKKSLIALTL-AALPVAAMADVTLYGTIK 28


23  RS_RS06520  RS_RS06570  Y  N  Y  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS06520  -2      17       -3.426534  *hypothetical protein
RS_RS06525  -2      14       -1.938861  hypothetical protein
RS_RS06530  -3      14       -2.452091  hypothetical protein
RS_RS06535  -1      17       -3.229954  hypothetical protein
RS_RS06540  -1      16       -3.872727  LexA repressor
RS_RS06545  0       19       -4.603608  ferredoxin--NADP reductase
RS_RS06550  -2      19       -4.330198  L-asparaginase
RS_RS06555  0       17       -6.467521  30S ribosomal protein S6
RS_RS06560  -1      14       -5.165268  hypothetical protein
RS_RS06565  -1      13       -3.665083  30S ribosomal protein S18
RS_RS06570  -1      13       -3.163167  50S ribosomal protein L9
24  RS_RS06700  RS_RS06725  Y  N  N  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS06700  0       16       3.556169   ABC transporter permease
RS_RS06705  3       15       4.049808   FMN reductase
RS_RS06710  1       13       4.040128   sulfonate ABC transporter substrate-binding
RS_RS06715  2       13       4.205793   alkanesulfonate monooxygenase
RS_RS06720  1       15       3.293202   sulfonate ABC transporter
RS_RS06725  -1      14       3.090685   aliphatic sulfonate ABC transporter ATP-binding
25  RS_RS06840  RS_RS06890  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS06840  -2      10       3.092385   hypothetical protein
RS_RS06845  0       12       2.099499   MFS transporter
RS_RS06850  0       12       1.910180   acyl-CoA-binding protein
RS_RS06855  1       13       2.374836   tRNA threonylcarbamoyladenosine biosynthesis
RS_RS06860  1       15       1.600500   acyltransferase
RS_RS06865  2       17       1.799303   DNA polymerase
RS_RS06870  1       15       2.125995   MFS transporter
RS_RS06875  0       12       3.151544   alanine racemase
RS_RS06880  -1      14       3.564946   DNA repair protein RadA
RS_RS06885  -1      14       3.982940   disulfide bond formation protein B
RS_RS06890  0       14       3.934766   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS06845  TCRTETB  124    4e-33    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 124 bits (312), Expect = 4e-33
Identities = 87/405 (21%), Positives = 179/405 (44%), Gaps = 13/405 (3%)

Query: 23 VLFTLTVGAVSSIISATIVNVAIPDLSRHFVLGQERAQWVSASFMVAMTLAMLLTPWLLL 82
+L L + + S+++ ++NV++PD++ F WV+ +FM+ ++ + L
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 83 RFGLRRTFIGALLLLGVGGLVGGLSPTY-GVMIAMRVVGGVAAGIMQPLPNILILRVFPE 141
+ G++R + +++ G ++G + ++ ++I R + G A L +++ R P+
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 142 REQGKAFGLFGFGVVLAPALGPSLGGFLVELFGWRSIFFVVVPFTLLGWAMARRFMAINS 201
+GKAFGL G V + +GP++GG + W S ++ T++ + +
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW-SYLLLIPMITIITVPFLMKLLKKEV 193

Query: 202 SMAGEPKPLDWRGLLLVGAATVALLNGLVELHADVTRGVALMAVSAVCLAVFLFWQRRVE 261
+ G D +G++L+ V + + ++ + VS + +F+ R+V
Sbjct: 194 RIKG---HFDIKGIILMSVGIVFFMLFT------TSYSISFLIVSVLSFLIFVKHIRKVT 244

Query: 262 SPLLDLRLFSYRQFAMGAVVAFIYGAGLYGSTYLLPVYMQVALAYTPSSAGLVLLPPG-L 320
P +D L F +G + I + G ++P M+ + + G V++ PG +
Sbjct: 245 DPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTM 304

Query: 321 ALAATIVVAGRLTSRIEPYRLVSFGLAALAVSFLLMTTNTRATGYLLLIGIAAIGRVGLG 380
++ + G L R P +++ G+ L+VSFL + T + + I I + GL
Sbjct: 305 SVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFV-LGGLS 363

Query: 381 FVLPSLSLGAMRGVDFTLIPQGSSAVNFLRQLGGAIGVSATGIFL 425
F +S + G S +NF L G++ G L
Sbjct: 364 FTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLL 408


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS06860  SACTRNSFRASE  43     7e-08    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 43.4 bits (102), Expect = 7e-08
Identities = 18/74 (24%), Positives = 30/74 (40%)

Query: 90 NVTVAPAWQRQGLGRWLLRAAQALTLAHGFASLLLEVRPSNAGAIALYRRVGFAEIGRRK 149
++ VA ++++G+G LL A + F L+LE + N A Y + F
Sbjct: 94 DIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAVDT 153

Query: 150 RYYPAENNTREDAL 163
Y E A+
Sbjct: 154 MLYSNFPTANEIAI 167


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS06865  SURFACELAYER  29     0.029    Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 28.9 bits (64), Expect = 0.029
Identities = 13/71 (18%), Positives = 28/71 (39%)

Query: 48 VADAPAPRIEASPVVVAEVVPVAAPAAAAAATVADVPTDSTVPTDSTDPVAPRAQRIAAF 107
+ A A + A + A +PV A A + + T++ D T ++ A +
Sbjct: 7 IVSAAAAALLAVAPIAATAMPVNAATTINADSAINANTNAKYDVDVTPSISAIAAVAKSD 66

Query: 108 DWAQLEAAVSG 118
+ +++G
Sbjct: 67 TMPAIPGSLTG 77


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
RS_RS06875  ALARACEMASE  419    e-149    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 419 bits (1080), Expect = e-149
Identities = 211/359 (58%), Positives = 256/359 (71%), Gaps = 4/359 (1%)

Query: 1 MPRPIQAVIHGPALVNNLQVVRRHAADSRVWAVIKANAYGHGIERAYEGLRQADGFGLLD 60
M RPIQA + AL NL +VR+ A +RVW+V+KANAYGHGIER + + DGF LL+
Sbjct: 1 MTRPIQASLDLQALKQNLSIVRQAATHARVWSVVKANAYGHGIERIWSAIGATDGFALLN 60

Query: 61 LDEAVRLRQLGWQGPILLLEGFFKPEDLALVEQYRLTTTVHCEEQLRMLELARLKGPVSI 120
L+EA+ LR+ GW+GPIL+LEGFF +DL + +Q+RLTT VH QL+ L+ ARLK P+ I
Sbjct: 61 LEEAITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAPLDI 120

Query: 121 QLKINTGMSRLGFAPAAYRAAWEHARAISGIGTIVHMTHFSDADGPRGIDHQLAAFEQAT 180
LK+N+GM+RLGF P W+ RA++ +G + M+HF++A+ P GI +A EQA
Sbjct: 121 YLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTLMSHFAEAEHPDGISGAMARIEQAA 180

Query: 181 QGLPGEASLSNSAATLWHPRAHRDWVRPGVILYGASPTGVAADIEGTGLMPAMTLKSELI 240
+GL SLSNSAATLWHP AH DWVRPG+ILYGASP+G DI TGL P MTL SE+I
Sbjct: 181 EGLECRRSLSNSAATLWHPEAHFDWVRPGIILYGASPSGQWRDIANTGLRPVMTLSSEII 240

Query: 241 AVQDLQPGATVGYGSRFEAEQPMRIGIVACGYADGYPRHAPGWDGNYTPVLVDGVRTRMV 300
VQ L+ G VGYG R+ A RIGIVA GYADGYPRHAP TPVLVDGVRT V
Sbjct: 241 GVQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAP----TGTPVLVDGVRTMTV 296

Query: 301 GRVSMDMITVDLAEVPGARVGAPVTLWGQGLPIDEVAHAAGTVGYELMCALAPRVPVTV 359
G VSMDM+ VDL P A +G PV LWG+ + ID+VA AAGTVGYELMCALA RVPV
Sbjct: 297 GTVSMDMLAVDLTPCPQAGIGTPVELWGKEIKIDDVAAAAGTVGYELMCALALRVPVVT 355


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS06880  TCRTETOQM  29     0.046    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 29.1 bits (65), Expect = 0.046
Identities = 15/74 (20%), Positives = 29/74 (39%), Gaps = 12/74 (16%)

Query: 106 LLQALSNLAASRRVLYVSGEESGAQIALRARRLGVESPNLALLAEIQLERIQATIEAEKP 165
LL AL ++ S +L + + +I L L ++Q+E A ++ +
Sbjct: 361 LLDALLEISDSDPLLRYYVDSATHEIILS------------FLGKVQMEVTCALLQEKYH 408

Query: 166 EVVVIDSIQTLYSE 179
+ I +Y E
Sbjct: 409 VEIEIKEPTVIYME 422


26  RS_RS07175  RS_RS07275  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS07175  -1      16       -3.043524  hypothetical protein
RS_RS07180  0       18       -4.657534  GMP synthase
RS_RS07210  4       33       -6.664626  **transcriptional regulator
RS_RS07215  5       33       -6.947845  hypothetical protein
RS_RS07220  5       34       -7.214729  transposase
RS_RS07225  4       29       -5.380803  transposase
RS_RS07230  3       24       -5.227065  transposase
RS_RS07235  1       19       -3.701576  transposase
RS_RS07240  -1      14       -2.241134  transposase
RS_RS07245  0       14       -0.454445  transposase
RS_RS07250  2       10       2.192447   LysR family transcriptional regulator
RS_RS07255  1       9        2.654096   protocatechuate 3,4-dioxygenase subunit beta
RS_RS07260  0       10       3.478174   protocatechuate 3,4-dioxygenase subunit alpha
RS_RS07265  -1      11       4.195402   2'-5' RNA ligase
RS_RS07270  -1      13       3.990594   protein-disulfide isomerase
RS_RS07275  -2      13       3.081145   MFS transporter
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
RS_RS07265  IGASERPTASE  31     0.003    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 31.2 bits (70), Expect = 0.003
Identities = 22/84 (26%), Positives = 31/84 (36%), Gaps = 12/84 (14%)

Query: 98 VSFHGRRDHRPFVLTGGVGLHALIDFQHALGAALERAGLRVPQARFVPHVTLLYDRGGFA 157
V+F G+ + F+LTGG L+ G G R PH D G +
Sbjct: 669 VTFKGKSEQNRFLLTGGTNLN---------GDLTVEKGTLFLSGRPTPHA---RDIAGIS 716

Query: 158 PKPIEPIAWTVREFVLIDSWLGRT 181
+P E V+ D W+ R
Sbjct: 717 STKKDPHFAENNEVVVEDDWINRN 740


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS07275  TCRTETA  58     2e-11    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 57.9 bits (140), Expect = 2e-11
Identities = 82/365 (22%), Positives = 135/365 (36%), Gaps = 33/365 (9%)

Query: 22 GVSLLMDVASEMIHSLLPMFMVSALGASALTVGLVEGLAESTALIVKIFSGALSDYLGRR 81
G+ L+M V ++ L+ G++ L GALSD GRR
Sbjct: 20 GIGLIMPVLPGLLRDLVHS------NDVTAHYGILLALYALMQFACAPVLGALSDRFGRR 73

Query: 82 KGLAVFGYALGAVSKPLFALAPTAGLVLAARLLDRVGKGVRGAPRDALVADIAPPELRGA 141
L + A AV + A AP ++ R++ + G GA A +ADI + R
Sbjct: 74 PVL-LVSLAGAAVDYAIMATAPFLWVLYIGRIVAGI-TGATGAVAGAYIADITDGDERAR 131

Query: 142 AFGLRQSLDTVGAFLGPLLAVGLMLRWANDFRAVFWVAVVPGLLSVALLAFGLREPAAR- 200
FG + G GP+L GLM ++ A F+ A L+ F L E
Sbjct: 132 HFGFMSACFGFGMVAGPVLG-GLMGGFSP--HAPFFAAAALNGLNFLTGCFLLPESHKGE 188

Query: 201 ---AATHRANPIRRANLRRLTRAYWWVVAIGAVFTLARFSEAFLVLRAQ----RGGVPIA 253
NP+ R ++A+ F + + L R
Sbjct: 189 RRPLRREALNPLASFRWARGMTVVAALMAVF--FIMQLVGQVPAALWVIFGEDRFHWDAT 246

Query: 254 LVPLVMVAMNVVYAL-SAYPFGKLSDRVSHTGLLALGLVV-LIAADLVLGATDHWGAVLA 311
+ + + A ++++L A G ++ R+ L LG++ L+ AT W A
Sbjct: 247 TIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPI 306

Query: 312 GVALWGVHMGI--TQGLLATMVADTAPADLRGTAYGAFNLVSGIAMLVASAL-------- 361
V L +G+ Q +L+ V + L+G+ +L S + L+ +A+
Sbjct: 307 MVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTW 366

Query: 362 AGWLW 366
GW W
Sbjct: 367 NGWAW 371


27  RS_RS07360  RS_RS07525  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS07360  0       13       3.200653   hypothetical protein
RS_RS07365  -2      12       3.111042   hypothetical protein
RS_RS07370  -1      12       3.192518   carbon monoxide dehydrogenase
RS_RS07375  -1      13       2.627791   VWA domain-containing protein
RS_RS07380  -2      13       1.939806   ATPase
RS_RS07385  0       11       1.719195   carbon-monoxide dehydrogenase
RS_RS07390  0       10       1.661967   carbon-monoxide dehydrogenase
RS_RS07395  4       11       2.379694   carbon monoxide dehydrogenase
RS_RS07400  0       9        2.071586   hypothetical protein
RS_RS07405  1       8        2.475900   LysR family transcriptional regulator
RS_RS07410  2       8        2.880022   MFS transporter
RS_RS07415  1       7        3.498739   hypothetical protein
RS_RS07420  1       7        3.241352   hypothetical protein
RS_RS07425  2       8        2.966297   glycosyl transferase family 51
RS_RS07430  4       12       3.796397   adenosylmethionine--8-amino-7-oxononanoate
RS_RS07435  3       12       3.655458   8-amino-7-oxononanoate synthase
RS_RS07440  1       14       0.181325   ATP-dependent dethiobiotin synthetase BioD
RS_RS07445  2       18       -1.310920  lipoprotein transmembrane
RS_RS07450  0       21       -1.902602  hypothetical protein
RS_RS07455  0       22       -2.419375  hypothetical protein
RS_RS07460  1       24       -2.796889  hypothetical protein
RS_RS07465  1       26       -3.281472  transposase
RS_RS07470  2       28       -4.039497  hypothetical protein
RS_RS07475  2       31       -4.676434  hypothetical protein
RS_RS07480  4       30       -4.777073  hypothetical protein
RS_RS07485  4       33       -5.612834  hypothetical protein
RS_RS07490  6       33       -2.546436  hypothetical protein
RS_RS07495  6       28       -1.037306  transposase
RS_RS07500  5       24       -0.136212  transposase
RS_RS07505  6       22       0.798209   integrase
RS_RS07510  4       23       1.169535   hemagglutinin
RS_RS07515  5       21       2.474451   hypothetical protein
RS_RS07520  2       17       1.531885   hypothetical protein
RS_RS07525  2       16       1.272171   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS07360  NUCEPIMERASE  38     3e-05    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 38.2 bits (89), Expect = 3e-05
Identities = 12/24 (50%), Positives = 15/24 (62%)

Query: 1 MSILVTGATGFVGGAIAANLAEKG 24
M LVTGA GF+G ++ L E G
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAG 24


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS07410  TCRTETB  67     6e-14    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 66.8 bits (163), Expect = 6e-14
Identities = 54/332 (16%), Positives = 121/332 (36%), Gaps = 22/332 (6%)

Query: 47 MANLMFSIAGPHIEGGVLASQDVYLWAVTSYAAAASLAILVMGRLADRMTYRRHTLLSLL 106
+ ++ +++ P I W T++ S+ V G+L+D++ +R L ++
Sbjct: 28 LNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGII 87

Query: 107 IFIAGTLMCAAAEGG-TMLIFGRAVQGFGGGPMLSTSRIFVQHSPAHERRTMMKGMIYGI 165
I G+++ ++LI R +QG G + + V E R G+I I
Sbjct: 88 INCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSI 147

Query: 166 FGL-TCVSPVLSAALTEQFGWRAIFVAQLLVAVMVCGLVAAFYPHPHQHERTGDFASLDW 224
+ V P + + W + + ++ + V L+ + R D
Sbjct: 148 VAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKL----LKKEVRIKG--HFDI 201

Query: 225 PSAIAFGLAALIALHGFQQARFVHPDGSPAQVLPILAVVALVAWIG-IRQTGHPLPWVDL 283
I + + + I++V++ + ++ IR+ P VD
Sbjct: 202 KGIILMSVGIVFFMLFTTSYSISF---------LIVSVLSFLIFVKHIRKVTDPF--VDP 250

Query: 284 RATLQRRFIVGMVFYAIYYGFASAWSFLSSSLLQNGLGFR-YETTAQFMSLSGLVTLLLG 342
F++G++ I +G + + + ++++ E + + + ++ G
Sbjct: 251 GLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFG 310

Query: 343 IANFQLTDYLPRKHVLIAIGFVLMAGAMLWLS 374
L D +VL IG ++ + L S
Sbjct: 311 YIGGILVDRRGPLYVLN-IGVTFLSVSFLTAS 341


28  RS_RS07745  RS_RS07830  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS07745  3       16       0.632244   MBL fold metallo-hydrolase
RS_RS07755  2       18       -1.217858  histidine kinase
RS_RS07760  3       26       -2.558308  hypothetical protein
RS_RS07765  3       26       -3.026696  glutamate-1-semialdehyde aminotransferase
RS_RS07770  1       26       -3.641459  formate acetyltransferase
RS_RS07775  2       24       -3.815977  hypothetical protein
RS_RS07780  2       32       -7.349766  membrane protein
RS_RS07785  2       34       -6.251245  transposase
RS_RS07790  2       21       -3.801785  transposase
RS_RS07795  2       22       -4.167659  transposase
RS_RS07800  2       24       -3.848779  hypothetical protein
RS_RS07805  1       18       -2.793682  hypothetical protein
RS_RS07810  1       17       -2.485363  hypothetical protein
RS_RS25170  1       15       -2.026617  integrase
RS_RS07820  1       11       -0.574295  *membrane protein
RS_RS07830  2       13       1.190829   D-2-hydroxyacid dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS07790  TETREPRESSOR  27     0.012    Tetracycline repressor protein signature.
		>TETREPRESSOR#Tetracycline repressor protein signature.

Length = 218

Score = 26.8 bits (59), Expect = 0.012
Identities = 12/36 (33%), Positives = 23/36 (63%)

Query: 12 VEAVNQVLDRGHSVAEVAQRLGVSQHSLYQWIKQRR 47
+E +N+ G + ++AQ+LG+ Q +LY +K +R
Sbjct: 14 LELLNETGIDGLTTRKLAQKLGIEQPTLYWHVKNKR 49


29  RS_RS08385  RS_RS08420  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS08385  3       28       -0.547321  hypothetical protein
RS_RS08390  2       31       -2.461817  hypothetical protein
RS_RS08395  1       31       -3.646740  hypothetical protein
RS_RS08400  4       36       -4.841219  hypothetical protein
RS_RS08405  4       41       -5.694664  hypothetical protein
RS_RS08410  0       43       -5.841247  hypothetical protein
RS_RS08415  -2      38       -5.275131  hypothetical protein
RS_RS08420  -2      37       -4.994042  transposase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
RS_RS08385  IGASERPTASE  35     2e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 34.7 bits (79), Expect = 2e-04
Identities = 18/133 (13%), Positives = 35/133 (26%), Gaps = 8/133 (6%)

Query: 56 AAAVQIYEPSALSILQQAQAAATANSAAGKTKAIANEPPKASSKNASTTDAPPKRGPGRP 115
A + + Q + K A + KA + T + P P
Sbjct: 1072 AKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSP 1131

Query: 116 PKNAGAAAAKAEVTSSQSEDDSEDTGAPEVDPRQMSLVTPEPEATPAATDTP--PPAEAS 173
+ ++E Q+E E+ + Q T PA +
Sbjct: 1132 KQ------EQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTE 1185

Query: 174 PQPSDATPAVADS 186
+ +V ++
Sbjct: 1186 STTVNTGNSVVEN 1198



Score = 30.8 bits (69), Expect = 0.004
Identities = 31/138 (22%), Positives = 47/138 (34%), Gaps = 9/138 (6%)

Query: 63 EPSALSILQQAQAAATANSAAGKTKAIANEPPK-ASSKNASTTDAPPKRGPGRPPKNAGA 121
E A + Q+++ A +T A E K A S + T G K
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 122 AAAKAEVTSSQSEDDSEDTG--------APEVDPRQMSLVTPEPEATPAATDTPPPAEAS 173
K T + E +T +V P+Q T +P+A PA + P
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 174 PQPSDATPAVADSDPRKT 191
PQ T A + ++T
Sbjct: 1158 PQSQTNTTADTEQPAKET 1175


30  RS_RS08490  RS_RS08515  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS08490  2       18       -1.918083  hypothetical protein
RS_RS08495  2       18       -1.849373  phage capsid protein
RS_RS08500  4       22       -1.758634  hypothetical protein
RS_RS08505  3       22       -1.796523  nucleoid-structuring protein H-NS
RS_RS08510  2       16       -0.705764  hypothetical protein
RS_RS08515  2       16       -0.691499  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS08495  ENTSNTHTASED  29     0.023    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 29.2 bits (65), Expect = 0.023
Identities = 28/110 (25%), Positives = 40/110 (36%), Gaps = 19/110 (17%)

Query: 227 FLAGRIAARIGRAESRL-VVQGTGD-GKPQQPQGLAASVTLTKSTANAAKLTWQEVNTLI 284
LAGRIAA E + V G GD +P P GL S++ +TA A ++ Q + I
Sbjct: 50 HLAGRIAAVHALREVGVRTVPGMGDKRQPLWPDGLFGSISHCATTA-LAVISRQRIGIDI 108

Query: 285 HAVDPAYRNAPMYRLAFNDQTLKVLEELVDGNGRPLWLPGLESSAPPTIL 334
+ + EL L++S P L
Sbjct: 109 EKIMSQH----------------TATELAPSIIDSDERQILQASLLPFPL 142


31  RS_RS08565  RS_RS08650  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS08565  3       19       -2.110378  phage tail assembly protein
RS_RS08570  3       19       -2.054830  DNA repair protein HhH-GPD
RS_RS08575  2       19       -1.674249  hexulose-6-phosphate synthase
RS_RS08580  2       20       -1.326090  hypothetical protein
RS_RS08585  3       25       -1.307636  hypothetical protein
RS_RS08590  1       22       -2.193759  hypothetical protein
RS_RS08595  4       26       -3.883636  lysozyme
RS_RS08600  4       31       -4.433289  membrane protein
RS_RS08605  3       33       -4.577324  membrane protein
RS_RS08610  2       23       -0.477315  hypothetical protein
RS_RS08615  2       16       -1.268991  transposase
RS_RS08620  3       15       -0.824341  hypothetical protein
RS_RS08625  2       12       -1.592464  membrane protein
RS_RS08630  3       9        -2.207408  hypothetical protein
RS_RS08635  3       8        -2.398015  tRNA 5-methylaminomethyl-2-thiouridine
RS_RS08640  4       10       -4.574107  trigger factor
RS_RS08645  2       10       -4.143563  ATP-dependent Clp protease proteolytic subunit
RS_RS08650  2       9        -3.756282  ATP-dependent Clp protease ATP-binding subunit
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS08590  SHAPEPROTEIN  28     0.026    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 28.2 bits (63), Expect = 0.026
Identities = 15/46 (32%), Positives = 26/46 (56%), Gaps = 5/46 (10%)

Query: 84 VYVFDEPAAPSGQPGLQVFKADGTLAFDSG-----LAYLKVSGVVT 124
V++ +EP A + GL V +A G++ D G +A + ++GVV
Sbjct: 138 VFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVY 183


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
RS_RS08650  HTHFIS  31     0.013    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.6 bits (69), Expect = 0.013
Identities = 15/90 (16%), Positives = 34/90 (37%), Gaps = 16/90 (17%)

Query: 67 LPTPHEIRQSLDQYVIGQEQAKKILAVAVYNHYKRLKHLGKKDDVELSKSNILLIGPTGS 126
P+ E ++G+ A + + + + + + +++ G +G+
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAA-----------MQEIYRVLAR--LMQTDLTLMITGESGT 171

Query: 127 GKTLLAQTLARL---LNVPFVIADATTLTE 153
GK L+A+ L N PFV + +
Sbjct: 172 GKELVARALHDYGKRRNGPFVAINMAAIPR 201


32  RS_RS08940  RS_RS08965  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS08940  2       7        1.008631   alpha/beta hydrolase
RS_RS08945  3       8        0.765098   4-hydroxybutyrate dehydrogenase
RS_RS08950  3       10       0.012179   alpha/beta hydrolase
RS_RS08955  4       10       -0.587284  MFS transporter
RS_RS08960  4       10       -0.258046  N-acetyltransferase GCN5
RS_RS08965  4       10       -0.301881  membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS08955  RTXTOXINA  29     0.009    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 28.8 bits (64), Expect = 0.009
Identities = 15/78 (19%), Positives = 25/78 (32%), Gaps = 13/78 (16%)

Query: 51 TEVVEAFGVDFYVVKST-LEYHASARFDDVLDLRCRVARLGRSSLRFIIEIYRGEGDDRA 109
+ +++ V + +E H D V L S IY G+G D
Sbjct: 594 SNLIQHASVGNNQYREIRIESHLGDGDDKVF--------LSAGS----ANIYAGKGHDVV 641

Query: 110 HLVTGEVIYVCADPATQT 127
+ + Y+ D T
Sbjct: 642 YYDKTDTGYLTIDGTKAT 659


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS08960  SACTRNSFRASE  36     2e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 36.1 bits (83), Expect = 2e-05
Identities = 17/56 (30%), Positives = 27/56 (48%), Gaps = 3/56 (5%)

Query: 76 IGRMAVRKEARGTGIGARVLQALIDKARALGYTQLILNAQTHAMP---FYARAGFT 128
I +AV K+ R G+G +L I+ A+ + L+L Q + FYA+ F
Sbjct: 92 IEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFI 147


33  RS_RS09010  RS_RS09160  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS09010  0       13       3.054441   type 4 fimbrial biogenesis protein
RS_RS09015  -1      14       2.819915   DNAase
RS_RS09020  -1      13       3.003828   hypothetical protein
RS_RS09025  1       15       3.294037   hypothetical protein
RS_RS09030  0       15       3.668310   LacI family transcriptional regulator
RS_RS09035  1       16       3.717680   ABC transporter substrate-binding protein
RS_RS09040  0       10       1.645229   ABC transporter permease
RS_RS09045  0       12       2.311613   ABC transporter permease
RS_RS09050  0       10       2.180468   ABC transporter ATP-binding protein
RS_RS09055  0       10       2.946905   metallophosphatase
RS_RS09060  0       13       2.605366   N-acetylmuramoyl-L-alanine amidase
RS_RS09065  -1      16       2.144941   ABC transporter permease
RS_RS09070  0       16       3.281755   2-oxoisovalerate dehydrogenase subunit beta
RS_RS09075  0       16       3.012196   branched-chain alpha-keto acid dehydrogenase
RS_RS09080  0       14       3.814060   GALA protein
RS_RS09085  1       14       3.787798   GALA protein
RS_RS09090  1       10       5.230894   transcriptional regulator
RS_RS09095  0       10       5.113290   voltage-gated chloride channel protein
RS_RS09100  0       9        4.926436   hypothetical protein
RS_RS09105  0       10       4.967027   siderophore synthase
RS_RS09115  1       8        5.103498   polyketide synthase
RS_RS09120  1       8        4.404699   ligand-gated channel
RS_RS09125  2       8        4.218159   ABC transporter ATP-binding protein
RS_RS09130  1       12       3.335606   ABC transporter ATP-binding protein
RS_RS09135  1       13       2.927677   polyketide synthase
RS_RS09140  1       19       0.432997   short-chain isoprenyl diphosphate synthase
RS_RS09145  1       31       -3.509079  hypothetical protein
RS_RS09150  1       32       -3.661976  AraC family transcriptional regulator
RS_RS09155  0       32       -3.180730  hypothetical protein
RS_RS09160  1       24       -4.068098  acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS09025  PF05616  32     0.003    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 32.0 bits (72), Expect = 0.003
Identities = 23/62 (37%), Positives = 27/62 (43%), Gaps = 5/62 (8%)

Query: 225 PGAQPVPARQPAAPAGMRPAAPVVPDV-PAPMPAPRA-PESAIDRDATSGLPGSLSDQKT 282
P AQP+P PA PA P P P P P P++ D D G PG+ D
Sbjct: 323 PNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTD---GQPGTRPDSPA 379

Query: 283 VP 284
VP
Sbjct: 380 VP 381


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS09055  PF05272  28     0.049    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.049
Identities = 9/22 (40%), Positives = 14/22 (63%)

Query: 45 LVLLGPSGCGKTTTLRIVAGLE 66
+VL G G GK+T + + GL+
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS09070  SECETRNLCASE  28     0.034    Bacterial translocase SecE signature.
		>SECETRNLCASE#Bacterial translocase SecE signature.

Length = 127

Score = 27.9 bits (62), Expect = 0.034
Identities = 10/48 (20%), Positives = 23/48 (47%), Gaps = 2/48 (4%)

Query: 69 GQEAIGVGVASAMRAEDVLFPSYRD--HSAQLLRGVSMAESLLYWGGD 114
G+ + + V++P+ ++ H+ ++ V+ SL+ WG D
Sbjct: 65 GKATVAFAREARTEVRKVIWPTRQETLHTTLIVAAVTAVMSLILWGLD 112


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS09115  NUCEPIMERASE  34     0.009    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 34.4 bits (79), Expect = 0.009
Identities = 36/156 (23%), Positives = 59/156 (37%), Gaps = 26/156 (16%)

Query: 1842 YLVAGAYGALGRHTTDWLAAHGATHLVLA----GRRAPPAGWQARLALLRAQGVRIDPVD 1897
YLV GA G +G H + L G H V+ + QARL LL G +
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAG--HQVVGIDNLNDYYDVSLKQARLELLAQPGFQF--HK 58

Query: 1898 ADLAEAADVERLFDAVAALEATTGRTLAGVFHCAGTSRFNDLAGL--TTDDCAAVTGAKM 1955
DLA+ + LF + VF + + ++ A + +
Sbjct: 59 IDLADREGMTDLFASG---HFER------VFISPHR------LAVRYSLENPHAYADSNL 103

Query: 1956 TGAWLLHEQTRARRLDWFVCFTSISGVWGSRLQIPY 1991
TG + E R ++ + + S S V+G ++P+
Sbjct: 104 TGFLNILEGCRHNKIQHLL-YASSSSVYGLNRKMPF 138


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS09135  ISCHRISMTASE  34     0.005    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 34.2 bits (78), Expect = 0.005
Identities = 19/42 (45%), Positives = 25/42 (59%)

Query: 1798 LRAGIATLLHCAPDTITDQANLIQLGLDSLLFLELCETIARE 1839
+R IA LL P+ ITDQ +L+ GLDS+ + L E RE
Sbjct: 235 IRKQIAELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRRE 276



Score = 32.3 bits (73), Expect = 0.023
Identities = 15/71 (21%), Positives = 36/71 (50%), Gaps = 1/71 (1%)

Query: 1681 DTVRARVAEVLGRDAQAIPAEANLIQLGLDSLLFLDLSERLGKQFGVSISAETAFKANTL 1740
+ +R ++AE+L + I + +L+ GLDS+ + L E+ ++ G ++ + T+
Sbjct: 233 ENIRKQIAELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRRE-GAEVTFVELAERPTI 291

Query: 1741 NAFVQALAAQL 1751
+ + L +
Sbjct: 292 EEWQKLLTTRS 302


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
RS_RS09155  GPOSANCHOR  52     8e-09    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 52.4 bits (125), Expect = 8e-09
Identities = 76/387 (19%), Positives = 125/387 (32%), Gaps = 31/387 (8%)

Query: 455 GAPYALSTEQVVAIASHNGGKQALEAVKADLLELRGAPYALSTEQVVAIASHNGGKQA-- 512
GA ++T V+ + LE V+ + + ++ +N +
Sbjct: 30 GAGLVVNTN-EVSAVATRSQTDTLEKVQERADKFEI-ENNTLKLKNSDLSFNNKALKDHN 87

Query: 513 ------LEAVKAHLLDLRGVPYALSTEQVVAIASHNGGKQALEAVKAQLLDLRGAPYALS 566
L K L +++ A ++ALE L
Sbjct: 88 DELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLE 147

Query: 567 TAQVVAIASNGGGKQALEGIGEQLLKLRTAPYGLSTEQVVAIASHDGGKQALEAVGAQLV 626
+ A ++ALEG L E K ALEA A+L
Sbjct: 148 AEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAE-----------KAALEARQAELE 196

Query: 627 ALRAAPYALSTEQVVAIASNKGGKQALEAVKAQLLELRGAPYALSTAQVVAIASHDGGNQ 686
ST I + + K AL A KA L + STA I + +
Sbjct: 197 KALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKA 256

Query: 687 ALEAVGTQLVALRAAPYALSTEQVVAIASHDGGKQALEAVGAQLVALRAAPYALNTEQVV 746
ALEA +L A+ ++ + A+ AL A L + V
Sbjct: 257 ALEARQAELEKALE----------GAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQV 306

Query: 747 AIASSHGGKQALEAVRALFPDLRAAPYALSTAQLVAIASNPGGKQALEAVRALFRELRAA 806
A+ ++ L+A R L A L ++ AS ++ L+A R ++L A
Sbjct: 307 LNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAE 366

Query: 807 PYALSTEQVVAIASNHGGKQALEAVRA 833
L + ++ AS ++ L+A R
Sbjct: 367 HQKLEEQNKISEASRQSLRRDLDASRE 393



Score = 48.1 bits (114), Expect = 2e-07
Identities = 75/383 (19%), Positives = 125/383 (32%), Gaps = 29/383 (7%)

Query: 490 GAPYALSTEQVVAIASHNGGKQALEAVKAHLLDLRGVPYALSTEQVVAIASHNGGKQALE 549
GA ++T V+ + LE V+ L + ++ K +
Sbjct: 30 GAGLVVNTN-EVSAVATRSQTDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHND 88

Query: 550 AVKAQLLDLRGAPYALSTAQVVAIASNGGGKQALEGIGEQL-------LKLRTAPYGLST 602
+ +L + + + + + + + L L
Sbjct: 89 ELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEA 148

Query: 603 EQVVAIASHDGGKQALEAVGAQLVALRAAPYALSTEQVVAIASNKGGKQALEAVKAQLLE 662
E+ A ++ALE A A L E K ALEA +A+L +
Sbjct: 149 EKAALAARKADLEKALEGAMNFSTADSAKIKTLEAE-----------KAALEARQAELEK 197

Query: 663 LRGAPYALSTAQVVAIASHDGGNQALEAVGTQLVALRAAPYALSTEQVVAIASHDGGKQA 722
STA I + + AL A L ST I + + K A
Sbjct: 198 ALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAA 257

Query: 723 LEAVGAQLVALRAAPYALNTEQVVAIASSHGGKQALEAVRALFPDLRAAPYALSTAQLVA 782
LEA A+L +T I + K ALEA +A L V
Sbjct: 258 LEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKA----------DLEHQSQVL 307

Query: 783 IASNPGGKQALEAVRALFRELRAAPYALSTEQVVAIASNHGGKQALEAVRALFRGLRAAP 842
A+ ++ L+A R ++L A L + ++ AS ++ L+A R + L A
Sbjct: 308 NANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEH 367

Query: 843 YGLSTAQVVAIASSNGGKQALEA 865
L ++ AS ++ L+A
Sbjct: 368 QKLEEQNKISEASRQSLRRDLDA 390


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS09160  SACTRNSFRASE  33     3e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.6 bits (74), Expect = 3e-04
Identities = 20/96 (20%), Positives = 39/96 (40%), Gaps = 6/96 (6%)

Query: 40 ERIEGDESLIFVADAGNNALVGFCQLYPTFCSVAAAPIYVLYDLFVSPTVRRQGIAKQLL 99
+E + F+ NN +G ++ + A ++ D+ V+ R++G+ LL
Sbjct: 58 SYVEEEGKAAFLYYLENN-CIGRIKIRSNWNGYA-----LIEDIAVAKDYRKKGVGTALL 111

Query: 100 QAAAYQAKVDGKARIDLTTGKQNVNAQALYEMLGWK 135
A AK + + L T N++A Y +
Sbjct: 112 HKAIEWAKENHFCGLMLETQDINISACHFYAKHHFI 147


34  RS_RS09260  RS_RS09355  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS09260  2       13       2.361257   flavin reductase
RS_RS09265  2       13       1.755303   alanine-phosphoribitol ligase
RS_RS09270  2       15       1.630587   short-chain dehydrogenase
RS_RS09275  3       16       1.381692   hypothetical protein
RS_RS09280  2       16       0.959299   hypothetical protein
RS_RS09285  -1      24       -2.586319  aminotransferase
RS_RS09290  -1      26       -2.842788  hydrolase
RS_RS09295  2       26       -2.975983  hypothetical protein
RS_RS09300  1       27       -3.798355  transposase
RS_RS09305  1       27       -3.896705  hypothetical protein
RS_RS09310  2       29       -4.383683  hypothetical protein
RS_RS09315  2       29       -4.388464  integrase
RS_RS09320  2       32       -5.042758  integrase
RS_RS09325  4       30       -4.868674  hypothetical protein
RS_RS09330  7       22       -3.001827  MarR family transcriptional regulator
RS_RS09335  6       18       -1.648774  hypothetical protein
RS_RS09340  5       18       -0.525046  hypothetical protein
RS_RS09345  5       20       -2.204525  MarR family transcriptional regulator
RS_RS09350  3       20       -1.114511  DSBA oxidoreductase
RS_RS09355  2       18       -1.128541  hemolysin D
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS09270  DHBDHDRGNASE  110    4e-31    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 110 bits (275), Expect = 4e-31
Identities = 77/254 (30%), Positives = 115/254 (45%), Gaps = 12/254 (4%)

Query: 2 KGLEGKVAIVTGGATLIGAAVVRTLRSYRVKVALFDIDAASGNRVAAEDSDGTRF---WP 58
KG+EGK+A +TG A IG AV RTL S +A D + +V + R +P
Sbjct: 4 KGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 59 VDITDDRQLERGVADAAAHFGRVDFLVNLAATYLDDGA--KSGREDWLKALDVNLVSAVM 116
D+ D ++ A G +D LVN A L G E+W VN
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVN-VAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 117 AARAVHPHLAAAGGGAIVNFTSISSKVAQTGRWLYPVSKAALVQLTRGMAMDYAADGIRV 176
A+R+V ++ G+IV S + V +T Y SKAA V T+ + ++ A IR
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 177 NSVSPGWTWSRVMDALTHGDRAKTDRVAADYHL------LRRVGDPEEVAEVVAFLLSGH 230
N VSPG T + + +L + + L+++ P ++A+ V FL+SG
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 231 AAFVTGADYAVDGG 244
A +T + VDGG
Sbjct: 243 AGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
RS_RS09305  HTHFIS  33     0.003    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.9 bits (75), Expect = 0.003
Identities = 6/51 (11%), Positives = 21/51 (41%)

Query: 1 MGRPDTITMSMRELDRCKVIQAVADDGLMVWRAAEKLGISKRQVERLVLRY 51
+ + E++ ++ A+ +AA+ LG+++ + + +
Sbjct: 423 LPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIREL 473


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS09355  TCRTETB  116    2e-30    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 116 bits (293), Expect = 2e-30
Identities = 85/426 (19%), Positives = 166/426 (38%), Gaps = 21/426 (4%)

Query: 1 MTTPSSDSQAPLTGGMLWLAAIVLAAANFVAVLDMTIANVSVPNISGNLGVSTSQGTWVI 60
M T S S ++WL + F +VL+ + NVS+P+I+ + + WV
Sbjct: 1 MNTSYSQSNLRHNQILIWLCILS-----FFSVLNEMVLNVSLPDIANDFNKPPASTNWVN 55

Query: 61 TSYAVAEAILVPLTGWLASRFGTVRVFAVAMGCFGLFSALCGLALS-LGMLVAARICQGL 119
T++ + +I + G L+ + G R+ + S + + S +L+ AR QG
Sbjct: 56 TAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGA 115

Query: 120 AGGPLMPLSQTLLLRIFPKEKAGAAIGLWSMTTIVAPVLGPILGGYLCDEYSWPWIFYIN 179
L ++ R PKE G A GL + +GP +GG + W ++ I
Sbjct: 116 GAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIP 175

Query: 180 LPIAVGCAWAAWKMLARYESPIARNPIDKVGLALLVLWVGSLQIMLDEGKQLDWFASSTI 239
+ + + + + + D G+ L+ VG + ML F +S
Sbjct: 176 MITIITVPFLMKLLK---KEVRIKGHFDIKGIILMS--VGIVFFML--------FTTSYS 222

Query: 240 AGLAVVAAIGFAAFLIWELHAEHPAVNLRVFRHRGFSASVLTISLAFGAFFGANVLTPQW 299
+V+ + F F+ P V+ + ++ F VL + FG G + P
Sbjct: 223 ISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYM 282

Query: 300 LQSFMGYTALEAGKTTAWSG-FFALFAAPIAAGLGGKVDARKLVFGGVIWLGIITLWRTV 358
++ + E G + G + I L + ++ GV +L ++
Sbjct: 283 MKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL-SVSFLTAS 341

Query: 359 ATTDMTYWQVAVPLMAMGLGIPFFFVPTTGLAMASVDEEEMASAAGLMNFLRTLSGAFAT 418
+ T W + + ++ + G+ F + + +S+ ++E + L+NF LS
Sbjct: 342 FLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGI 401

Query: 419 SLITTL 424
+++ L
Sbjct: 402 AIVGGL 407


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS09360  RTXTOXIND  91     3e-22    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 91.4 bits (227), Expect = 3e-22
Identities = 63/410 (15%), Positives = 124/410 (30%), Gaps = 65/410 (15%)

Query: 19 RRARLFLILAAVVAAGAIGGTAYWQLYASRFVSTDNAYAAVEIAQVTPAIDGTIAEVRVS 78
RR RL A + + + ++ P + + E+ V
Sbjct: 55 RRPRLVAYFIMGFLVIAFI-LSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 79 DTQAVKKGDVLVVIDPTDAKLALEQAEAQLGSA--------------------------- 111
+ ++V+KGDVL+ + A+ + ++ L A
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 112 -----------VRRVRGYVANDSSLGAQIIAREADEKRAAADLFSAQA-------DFERA 153
+R S+ Q +E + + A+ + A
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 154 KIDLRRREALVASGSVSGEELTKARNAHATADAARAAPASAQARANRNAAVGSRQANAV- 212
K L +L+ +++ + + N + A S + + V
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 213 -----LIANATEDTNPEVALARAKRDQAAVDLGRTVIRAPVDGVVAKRQVQ-LGQRVKSG 266
I + T + L + + +VIRAPV V + +V G V +
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 267 TALLSIVPVQD-MHVDANFKEVQLENVRIGQPATVKADIYGGS--ASYRGVVEGFSGGSG 323
L+ IVP D + V A + + + +GQ A +K + + + G V+ +
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNIN---- 409

Query: 324 AAFAAIPAQNATGNWIKVVQRLPVRIKLDAAELEKHPLKVGLSMVVTIDT 373
G V+ + + PL G+++ I T
Sbjct: 410 ---LDAIEDQRLGLVFNVIISIEENCLS--TGNKNIPLSSGMAVTAEIKT 454


35  RS_RS09410  RS_RS09440  Y  N  Y  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS09410  2       27       -2.934976  DNA primase
RS_RS09415  1       33       -4.400896  hypothetical protein
RS_RS09420  3       35       -5.100315  hypothetical protein
RS_RS09425  2       27       -4.113780  hypothetical protein
RS_RS09430  2       21       -3.673609  transcriptional regulator
RS_RS09435  1       19       -2.676706  hypothetical protein
RS_RS09440  1       18       -3.132406  integrase
36  RS_RS09495  RS_RS09775  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS09495  2       14       3.578348   LysR family transcriptional regulator
RS_RS09500  0       15       2.804542   FAD-dependent oxidoreductase
RS_RS09505  1       13       3.149996   ABC transporter permease
RS_RS09510  -1      12       3.274829   peptide ABC transporter permease
RS_RS09515  -1      12       4.112843   peptide ABC transporter substrate-binding
RS_RS09520  -2      11       4.211686   peptide ABC transporter ATP-binding protein
RS_RS09525  -2      11       4.231664   ABC transporter substrate-binding protein
RS_RS09530  0       11       4.353867   acyl CoA thioester hydrolase
RS_RS09535  2       17       2.566981   alkylhydroperoxidase
RS_RS09540  3       20       2.240327   acylaldehyde oxidase
RS_RS09545  3       21       -0.967344  (2Fe-2S)-binding protein
RS_RS09550  2       18       -1.630308  hypothetical protein
RS_RS09555  1       19       -2.065019  two-component system response regulator
RS_RS09560  0       15       -2.276727  diguanylate cyclase
RS_RS09565  1       15       -2.638492  chemotaxis protein
RS_RS09570  2       17       -3.572612  transcription regulator protein
RS_RS09580  3       18       -2.876535  *integrase
RS_RS09585  3       21       -3.547792  hypothetical protein
RS_RS09590  3       20       -3.241619  hypothetical protein
RS_RS09595  5       34       -5.311154  hypothetical protein
RS_RS09600  4       36       -6.519917  hypothetical protein
RS_RS09605  2       40       -7.569910  hypothetical protein
RS_RS09610  2       23       -5.242952  hypothetical protein
RS_RS09615  2       19       -4.679574  transcriptional regulator
RS_RS09620  1       14       -2.206465  transcriptional regulator
RS_RS09625  1       14       -1.489511  hypothetical protein
RS_RS09630  1       13       -1.297327  hypothetical protein
RS_RS09635  1       11       -1.308648  tail protein
RS_RS09640  3       12       -0.960724  oxidoreductase
RS_RS09645  2       12       -0.561498  tail protein
RS_RS09650  0       17       -0.034403  hypothetical protein
RS_RS09655  2       14       0.560325   tail fiber protein
RS_RS09660  3       15       0.871222   major tail tube protein
RS_RS09665  2       12       1.660283   tail sheath protein
RS_RS09670  1       12       2.248246   hypothetical protein
RS_RS09675  3       14       2.306381   tail assembly protein
RS_RS09680  3       14       0.235437   tail protein
RS_RS09685  1       17       -0.130813  tail protein
RS_RS09690  1       17       -0.670067  baseplate assembly protein
RS_RS09695  2       20       -1.114511  hypothetical protein
RS_RS09700  4       22       -0.968569  baseplate assembly protein
RS_RS09705  2       24       -1.338655  hypothetical protein
RS_RS09710  1       18       1.114677   virion morphogenesis protein
RS_RS09715  3       21       1.132680   tail protein
RS_RS09720  1       19       1.942579   hypothetical protein
RS_RS09725  1       18       1.813647   hypothetical protein
RS_RS09730  1       15       -0.034117  membrane protein
RS_RS09735  2       15       -0.733248  membrane protein
RS_RS09740  1       13       -1.378642  tail protein X
RS_RS09745  1       12       -1.761269  phage head protein
RS_RS09750  2       12       -3.356100  hypothetical protein
RS_RS09755  3       14       -4.457578  phage capsid protein
RS_RS09760  3       16       -3.586240  hypothetical protein
RS_RS09765  3       18       -3.911219  bacteriophage protein
RS_RS09770  3       22       -4.638321  hypothetical protein
RS_RS09775  3       26       -4.763382  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
RS_RS09555  HTHFIS  31     0.003    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.3 bits (71), Expect = 0.003
Identities = 20/118 (16%), Positives = 35/118 (29%), Gaps = 5/118 (4%)

Query: 18 RTLRTEGYSAAHYADTREFLRSIKRDSHDLVLISHAASHLALSKLPMLIRSRLQQHTPLL 77
+ L GY ++ R I DLV+ L I+ + P+L
Sbjct: 21 QALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLLPRIKKA-RPDLPVL 79

Query: 78 LVGHDHDEDAVAASLGAGADGYIALPISP----ALFSARVTALLRRIYPIRAARQTFM 131
++ + + GA Y+ P + + RR + Q M
Sbjct: 80 VMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGM 137


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS09585  TETREPRESSOR  28     0.003    Tetracycline repressor protein signature.
		>TETREPRESSOR#Tetracycline repressor protein signature.

Length = 218

Score = 27.6 bits (61), Expect = 0.003
Identities = 13/63 (20%), Positives = 25/63 (39%), Gaps = 12/63 (19%)

Query: 1 MKAINIKQVAEKVSLGQSTIYRMITKGEFPKPFSLGGNRTAWLDEDIDAWLAMRAGRPIP 60
+ + +++A+K+ + Q T+Y + N+ A LD LA +P
Sbjct: 22 IDGLTTRKLAQKLGIEQPTLYWHV------------KNKRALLDALAVEILARHHDYSLP 69

Query: 61 KPG 63
G
Sbjct: 70 AAG 72


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS09645  RTXTOXIND  32     0.012    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.012
Identities = 18/188 (9%), Positives = 57/188 (30%), Gaps = 20/188 (10%)

Query: 1 MSDARRLRLEVVLAAVDKATRPLRNLMN-------ANNDLARAVKATRAQLKDLERTQAS 53
+ + R +++ +++ P L + + ++ R + Q + +
Sbjct: 145 QARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQ 204

Query: 54 ID-TFRKLSRDAAITGNQLKAVRGRADELARQLKQTSEPSAALSKAFEAARREAQALKSK 112
+ K + ++ + +L ++L A+ ++K
Sbjct: 205 KELNLDKKRAERLTVLARINRYENLSRVEKSRLDDF----SSLLHKQAIAKHAVLEQENK 260

Query: 113 QSELSEKLHQVRGRLSEAGIGTQSL--------AEHQRALKTRIAETNQQLEAQTQRMAA 164
E +L + +L + S + + ++ +T + T +A
Sbjct: 261 YVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAK 320

Query: 165 VTAQQRRM 172
+Q+
Sbjct: 321 NEERQQAS 328


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS09735  RTXTOXINA  28     0.009    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 28.4 bits (63), Expect = 0.009
Identities = 24/101 (23%), Positives = 42/101 (41%), Gaps = 9/101 (8%)

Query: 25 LPGVDP-----GTVLGAFAGA-AVFALNSGELTVAKKLS--FLLLSIVAGVLSAPLAASL 76
LP +D TV G + A F L++ + K + L + V G + ++ +
Sbjct: 232 LPNLDNIGAGLDTVSGILSAISASFILSNADADTRTKAAAGVELTTKVLGNVGKGISQYI 291

Query: 77 IARALPANTEVSEAVGALVASTVMVRLL-LALIRAADNSDK 116
IA+ S A L+AS V + + L+ + AD +
Sbjct: 292 IAQRAAQGLSTSAAAAGLIASAVTLAISPLSFLSIADKFKR 332


37  RS_RS09990  RS_RS10045  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS09990  -1      14       -3.013479  3-isopropylmalate dehydrogenase
RS_RS09995  -1      16       -2.892072  3-isopropylmalate dehydratase small subunit
RS_RS10000  -1      15       -2.995818  3-isopropylmalate dehydratase large subunit
RS_RS10005  0       14       -4.270452  type II citrate synthase
RS_RS10010  2       13       -3.273942  hypothetical protein
RS_RS10015  1       16       -3.112533  succinate dehydrogenase iron-sulfur subunit
RS_RS10020  0       15       -1.958156  succinate dehydrogenase flavoprotein subunit
RS_RS10025  -1      13       -1.249282  succinate dehydrogenase
RS_RS10030  -1      16       -0.296251  succinate dehydrogenase
RS_RS10035  0       13       -0.272406  GntR family transcriptional regulator
RS_RS10040  2       21       -1.206629  malate dehydrogenase
RS_RS10045  2       19       -0.763172  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits  Score  E-value  Comments
RS_RS09990  SECA  31     0.007    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 31.4 bits (71), Expect = 0.007
Identities = 24/84 (28%), Positives = 36/84 (42%), Gaps = 11/84 (13%)

Query: 119 IVAGLDILIVRELTGDIYFGQPRGVRAAPDGLFAG--AREGFDTMRYSEPEIRRIAHVAF 176
IV +++IV E TG G R DGL A+EG + E + +A + F
Sbjct: 327 IVKDGEVIIVDEHTGRTMQG-----RRWSDGLHQAVEAKEGVQI----QNENQTLASITF 377

Query: 177 QAAAKRGKKLCSVDKANVLETFQF 200
Q + +KL + E F+F
Sbjct: 378 QNYFRLYEKLAGMTGTADTEAFEF 401


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS10045  PHPHTRNFRASE  32     0.003    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 32.1 bits (73), Expect = 0.003
Identities = 21/124 (16%), Positives = 42/124 (33%), Gaps = 9/124 (7%)

Query: 98 ILVGQAGRRLAYVTVPKVRDVVEVARVTDRVNQA-----ARAAGIARHLPIHVLIETHGA 152
+L L V P + + E+ + + + + ++ + + +++E
Sbjct: 378 LLRASTYGNL-KVMFPMIATLEELRQAKAIMQEEKDKLLSEGVDVSDSIEVGIMVEIPST 436

Query: 153 LAQVFDIAALVQVECLSFGLMDFVSAHNGAIPGAAMGSPEQFE-HPLIRRALTDIAAACH 211
A +V+ S G D + A S HP I R + + A H
Sbjct: 437 AVAANLFAK--EVDFFSIGTNDLIQYTMAADRMNERVSYLYQPYHPAILRLVDMVIKAAH 494

Query: 212 AHGK 215
+ GK
Sbjct: 495 SEGK 498


38  RS_RS10260  RS_RS10330  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS10260  0       18       -3.231775  acyl-CoA dehydrogenase
RS_RS10265  0       20       -3.727970  hypothetical protein
RS_RS10270  0       20       -4.274601  ADP-ribose pyrophosphatase
RS_RS10275  1       21       -4.363031  membrane protein
RS_RS10280  0       22       -4.270606  NADH:ubiquinone oxidoreductase subunit N
RS_RS10285  0       21       -4.949173  NADH:ubiquinone oxidoreductase subunit M
RS_RS10290  -2      19       -3.111906  NADH:ubiquinone oxidoreductase subunit L
RS_RS10295  -3      17       -2.872324  NADH-quinone oxidoreductase subunit K
RS_RS10300  -2      16       -2.833212  NADH:ubiquinone oxidoreductase subunit J
RS_RS10305  -3      16       -3.067692  NADH-quinone oxidoreductase subunit I
RS_RS10310  -3      16       -3.051181  NADH-quinone oxidoreductase subunit H
RS_RS10315  -2      15       -2.624062  NADH dehydrogenase subunit G
RS_RS10320  0       16       -4.492289  NADH dehydrogenase
RS_RS10325  1       14       -4.295818  NADH dehydrogenase subunit E
RS_RS10330  1       15       -3.143497  NADH dehydrogenase subunit D
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS10330  BINARYTOXINA  30     0.014    Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 30.4 bits (68), Expect = 0.014
Identities = 21/98 (21%), Positives = 30/98 (30%), Gaps = 28/98 (28%)

Query: 152 AYYRPGGVYRDLPDSMPQYKASKVRNEKALAAMNETRSGSLLDFIE-------------- 197
Y R G L + P+Y +K+ N A E + + +FI
Sbjct: 334 VYRRSGPQEFGLTLTSPEYDFNKIENIDAFKEKWEGKVITYPNFISTSIGSVNMSAFAKR 393

Query: 198 --------------AFTDRFPTYVDEYETLLTDNRIWK 221
A+ P Y EYE LL +K
Sbjct: 394 KIILRINIPKDSPGAYLSAIPGYAGEYEVLLNHGSKFK 431


39  RS_RS10390  RS_RS10485  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS10390  2       15       -1.565757  membrane protein
RS_RS10395  1       15       -2.425107  2-isopropylmalate synthase
RS_RS10400  1       14       -2.700924  CDP-diacylglycerol--serine
RS_RS10405  1       15       -2.922352  phosphatidylserine decarboxylase proenzyme
RS_RS10410  -1      14       -2.070236  ketol-acid reductoisomerase
RS_RS10415  -1      12       -1.157605  acetolactate synthase
RS_RS10420  -2      9        0.897483   acetolactate synthase
RS_RS10425  -1      8        2.948850   DNA-directed RNA polymerase sigma-70 factor
RS_RS10430  0       9        3.597423   membrane protein
RS_RS10435  0       10       3.483190   hypothetical protein
RS_RS10440  0       12       2.129763   hypothetical protein
RS_RS10445  -1      12       1.485069   permease
RS_RS10450  -2      12       -0.746949  ser/threonine protein phosphatase
RS_RS10455  -1      12       -0.637963  GDP-mannose-dependent alpha-mannosyltransferase
RS_RS10460  2       12       -1.204846  membrane protein
RS_RS10465  1       14       1.181785   TetR family transcriptional regulator
RS_RS10470  2       13       2.330794   LOG family protein
RS_RS10475  0       12       2.283147   hypothetical protein
RS_RS10480  2       11       2.507659   membrane protein
RS_RS10485  2       11       2.551338   oxygen-binding protein (globin)
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS10465  HTHTETR  62     9e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 61.6 bits (149), Expect = 9e-14
Identities = 21/78 (26%), Positives = 39/78 (50%)

Query: 4 KPHATGPRRTRDRILEVSLRLFNELGEPNVTTTTIAEELEISPGNLYYHFRNKDDIINSI 63
+ + TR IL+V+LRLF++ G + + IA+ ++ G +Y+HF++K D+ + I
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 64 FVQFEQEISRRLRLPDDH 81
+ E I
Sbjct: 63 WELSESNIGELELEYQAK 80


40  RS_RS10530  RS_RS10610  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS10530  2       16       -0.356936  guanine deaminase
RS_RS10535  4       18       -1.808956  membrane protein
RS_RS10540  6       22       -2.561908  hypothetical protein
RS_RS10545  7       22       -3.499049  GntR family transcriptional regulator
RS_RS10550  5       21       -4.596756  amino acid ABC transporter substrate-binding
RS_RS10555  4       25       -5.192526  polar amino acid ABC transporter permease
RS_RS10560  1       21       -4.622285  amino acid ABC transporter permease
RS_RS10565  1       20       -4.297727  amino acid ABC transporter ATP-binding protein
RS_RS10570  1       20       -4.088368  hypothetical protein
RS_RS10575  2       24       -4.673975  porin
RS_RS10580  2       23       -4.699697  aromatic amino acid exporter
RS_RS10585  0       17       -3.684074  aldehyde dehydrogenase
RS_RS10590  -1      19       -3.984269  dihydrodipicolinate synthetase
RS_RS10595  -1      18       -3.486742  metabolite transporter
RS_RS10600  -1      12       -2.164245  oxidoreductase
RS_RS10605  1       11       0.694143   transcriptional regulator
RS_RS10610  2       15       1.325703   membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
RS_RS10530  UREASE  30     0.024    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 29.7 bits (67), Expect = 0.024
Identities = 12/28 (42%), Positives = 17/28 (60%), Gaps = 2/28 (7%)

Query: 64 AGAQVHDYSGKLIVPGFIDTHIHF--PQ 89
G +V GK++ G +D+HIHF PQ
Sbjct: 116 PGTEVIAGEGKIVTAGGMDSHIHFICPQ 143


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS10575	ECOLNEIPORIN	78	3e-18	E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 78.3 bits (193), Expect = 3e-18
Identities = 48/207 (23%), Positives = 81/207 (39%), Gaps = 25/207 (12%)

Query: 13 AVVTPAF---SQGSVTLYGVLDEGLNYTNNVGGHSQVALASG-----FPHGSRWGLKGAE 64
A+ A + VTLYG + G+ + +V + A + GS+ G KG E
Sbjct: 7 ALTLAALPVAAMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLGSKIGFKGQE 66

Query: 65 AIGGGIKTIFQLENGFDVDSGRAFQGGLLFGRQAYVGVSSDTLGALTAGRQYDAVVDFLA 124
+G G+K I+Q+E + + G RQ+++G+ G L GR + D
Sbjct: 67 DLGNGLKAIWQVEQKASIAGTDSGWG----NRQSFIGLKGG-FGKLRVGRLNSVLKDTGD 121

Query: 125 QN--TAGGTW-GGYMFAHPYDSDNLINTFRTNNAVKYTSPALGGLKFGATYGFSNDVKAS 181
N + + G A P + + +V+Y SP GL Y +++
Sbjct: 122 INPWDSKSDYLGVNKIAEP---EARL------ISVRYDSPEFAGLSGSVQYALNDNAGRH 172

Query: 182 NNRLFSVGGQYTVGGWLLSAAYLQADH 208
N+ + G Y GG+ + H
Sbjct: 173 NSESYHAGFNYKNGGFFVQYGGAYKRH 199


41	RS_RS10675	RS_RS10760	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
RS_RS10675	2	28	-1.073528	*DNA-binding transcriptional regulator
RS_RS10680	1	28	-1.159214	xylulokinase
RS_RS10685	0	31	-2.512668	D-arabinitol 4-dehydrogenase
RS_RS10690	0	30	-2.979835	SKWP protein 6
RS_RS10695	0	34	-4.967401	hypothetical protein
RS_RS10700	1	27	-0.884870	hypothetical protein
RS_RS10705	1	26	-1.267346	hypothetical protein
RS_RS10710	2	25	-0.694404	porin
RS_RS10715	3	23	-0.428456	hypothetical protein
RS_RS10720	4	23	-0.961351	HxlR family transcriptional regulator
RS_RS10725	3	20	-0.930009	AWR family protein
RS_RS10730	7	25	-1.050959	sugar ABC transporter ATP-binding protein
RS_RS10735	8	27	-0.202608	haloacid dehalogenase
RS_RS10740	7	27	-0.953714	mannitol ABC transporter permease
RS_RS10745	7	25	-1.028910	sugar ABC transporter permease
RS_RS10750	4	19	0.470382	sugar ABC transporter substrate-binding protein
RS_RS10755	5	20	1.267896	tagatose-bisphosphate aldolase
RS_RS10760	2	16	0.907222	carbohydrate kinase
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS10710	ECOLNEIPORIN	86	5e-21	E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 86.0 bits (213), Expect = 5e-21
Identities = 74/348 (21%), Positives = 123/348 (35%), Gaps = 29/348 (8%)

Query: 13 VIAAAL-ALPAHAQSSVTLYGRVVGGVEYIDKVAVPATGKTGSLLQAAGNQWGTSMFGMK 71
+IA L ALP A + VTLYG + GVE VA G S G K
Sbjct: 5 LIALTLAALPVAAMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLG-SKIGFK 63

Query: 72 GSESLGGGLQALFTLESGFSLPNSALNGPGLFNRRSYVGLSSPSWGTLIVGKNLSISNDI 131
G E LG GL+A++ +E + A G NR+S++GL +G L VG+ S+ D
Sbjct: 64 GQEDLGNGLKAIWQVEQK---ASIAGTDSGWGNRQSFIGLKG-GFGKLRVGRLNSVLKDT 119

Query: 132 WDIDPTGQQFMSTATLVKGRNWPGADNM---IEYRSPDLGGFSFGAQASLSEQPAGRRVS 188
DI+P + S + + + + Y SP+ G S Q +L++
Sbjct: 120 GDINP----WDSKSDYLGVNKIAEPEARLISVRYDSPEFAGLSGSVQYALNDNAGRHNSE 175

Query: 189 AQGLSATYQTSDLMLRGIYTERRDSAGAFSDVYNASKDGILGGTYRIGPAKLFAGYERIL 248
+ Y+ ++ +R + + L Y ++
Sbjct: 176 SYHAGFNYKNGGFFVQYGGAYKRHHQVQENVNIEKYQIHRLVSGYDNDALYASVAVQQQD 235

Query: 249 ASQTGADTAPSALSQAWLGIRYDLTQAVTLIGAVYHVKS--NQSGGNATL--AMLGVDYY 304
A + + ++ ++ + Y + + K + + N ++G +Y
Sbjct: 236 AKLVEENYSHNSQTEVAATLAYRFGNVTPRVSYAHGFKGSFDATNYNNDYDQVVVGAEYD 295

Query: 305 LSKRTFLYASLGGVSNGANADYAADVTAAGPGVGKSQRVVYVGMGHSF 352
SKRT S G + G VG+ H F
Sbjct: 296 FSKRTSALVSAGWLQEGKGESKFVSTAGG------------VGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS10725	PF03544	31	0.019	Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 31.1 bits (70), Expect = 0.019
Identities = 11/55 (20%), Positives = 18/55 (32%)

Query: 76 TPTRAQPKPDVVPALKPAPRRMRPPAAPGRKHQRNVDAVVPPAMATSPRAEIRFP 130
P + P V+ KP P+ P + +R+V V + P
Sbjct: 83 IPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARP 137


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS10730	PF05272	31	0.007	Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.007
Identities = 19/86 (22%), Positives = 30/86 (34%), Gaps = 20/86 (23%)

Query: 32 VVFVGPSGCGKSTLMRMIAGLEDISSGDLLIDGAKVNDVPSAKRGIAMVFQSYALYPHMT 91
VV G G GKSTL+ + GL+ S D+ + K Y
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHF--------DIGTGKDS-YEQIAGIVAY---- 645

Query: 92 LYDNMAFGLKLAGAKKAEIDQAVKGA 117
++ ++A+ +AVK
Sbjct: 646 -----ELS-EMTAFRRADA-EAVKAF 664


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS10750	MALTOSEBP	30	0.015	Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 30.5 bits (68), Expect = 0.015
Identities = 66/307 (21%), Positives = 114/307 (37%), Gaps = 37/307 (12%)

Query: 124 DSLSYDGQLYALPFYVESSMTFYRKDLFAAKGLTMPDQP-TYDQIAQFADKLTDKDKGVY 182
D++ Y+G+L A P VE+ Y KDL +P+ P T+++I +L K K
Sbjct: 121 DAVRYNGKLIAYPIAVEALSLIYNKDL-------LPNPPKTWEEIPALDKELKAKGKSAL 173

Query: 183 GICLRGKAGWGENMAYVSTVVNTFGGRWFDEKWNAQLTSPEWKKAIGFYVNLLKKNGPPG 242
L+ +A + +D K + + + K + F V+L+K
Sbjct: 174 MFNLQEPYFTWPLIAADGGYAFKYENGKYDIK-DVGVDNAGAKAGLTFLVDLIKNKHMNA 232

Query: 243 ASSNGFNENLTLTASGKCAMWIDATVAAGMLYNKQQSQVADKIGFAAAPIADTPKGSHWL 302
+ E G+ AM I+ A N S+V G P ++
Sbjct: 233 DTDYSIAE--AAFNKGETAMTINGPWAWS---NIDTSKV--NYGVTVLPTFKGQPSKPFV 285

Query: 303 WAWALAIPKTSKQQEAAKTFV-TWATSKQYIEMVGKDEGWASVPPGTRASTYQRAEYKAA 361
+ I S +E AK F+ + + + +E V KD+ +V +Y+ K
Sbjct: 286 GVLSAGINAASPNKELAKEFLENYLLTDEGLEAVNKDKPLGAVA----LKSYEEELAK-- 339

Query: 362 APFSDFVLRAIETADPTDPSAKKVPYTGVAYVGIPEFQSFGTVVGQSIAGAVAGQFSVDQ 421
DP + + G IP+ +F V ++ A +G+ +VD+
Sbjct: 340 --------------DPRIAATMENAQKGEIMPNIPQMSAFWYAVRTAVINAASGRQTVDE 385

Query: 422 ALAAGQA 428
AL Q
Sbjct: 386 ALKDAQT 392


42	RS_RS10875	RS_RS10955	Y	Y	N	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
RS_RS10875	2	10	2.993799	membrane protein
RS_RS10880	2	11	3.611227	membrane protein
RS_RS10885	2	12	3.604412	LamB/YcsF family protein
RS_RS10890	2	13	3.414139	allophanate hydrolase
RS_RS10895	1	10	1.461386	hypothetical protein
RS_RS10900	-2	10	0.837890	Bcr/CflA family drug resistance efflux
RS_RS10905	-4	11	-0.028581	ABC transporter
RS_RS10910	-1	11	-0.908865	sugar ABC transporter permease
RS_RS10915	1	10	-1.151901	ABC transporter permease
RS_RS10920	2	13	-2.533725	endonuclease DDE
RS_RS10925	3	14	-1.509604	membrane protein
RS_RS10930	3	20	-1.240678	hypothetical protein
RS_RS10935	1	16	-0.177797	hypothetical protein
RS_RS10940	1	14	-0.018940	membrane protein
RS_RS10945	-1	12	0.698177	radical SAM protein
RS_RS10950	2	12	2.592093	hypothetical protein
RS_RS10955	2	14	2.589192	membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS10900	TCRTETB	66	8e-14	Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 65.7 bits (160), Expect = 8e-14
Identities = 45/198 (22%), Positives = 80/198 (40%), Gaps = 6/198 (3%)

Query: 22 LLLLGALTAIGPLSVDMYLPSLPTIARDLRTSSAAAGITLTAFLVSLAIGQLIYGPASDR 81
L+ L L+ L+ + SLP IA D A+ TAF+++ +IG +YG SD+
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 82 FGRKPPLYIGLALYVAASVGCAFA-TDATMLAVLRAVQGFGGCAGMVISRAAVRDRMDPA 140
G K L G+ + SV + ++L + R +QG G A + V +
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 141 GAAQAYSTLMLVMGVAPILAPMIGGAVLQVTSWRMIFAVLAAFGVLSLAAVHFLMRES-- 198
+A+ + ++ + + P IGG + W + + + + L +E
Sbjct: 136 NRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRI 195

Query: 199 ---LDAAHARPLAVGRVL 213
D ++VG V
Sbjct: 196 KGHFDIKGIILMSVGIVF 213


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS10935	YERSSTKINASE	29	0.014	Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 28.6 bits (63), Expect = 0.014
Identities = 15/48 (31%), Positives = 22/48 (45%), Gaps = 2/48 (4%)

Query: 97 GEPTIVDLAQHPERG--PKIRTNGFSLPGYHSGWYVGADKRRIFAAIA 142
GEP ++DL H G PK T F P G ++K +F ++
Sbjct: 283 GEPVVIDLGLHSRSGEQPKGFTESFKAPELGVGNLGASEKSDVFLVVS 330


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS10955	ACRIFLAVINRP	29	0.002	Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 29.4 bits (66), Expect = 0.002
Identities = 9/38 (23%), Positives = 14/38 (36%), Gaps = 2/38 (5%)

Query: 21 FALRHPEAPWWLTPAALLLAGYVVS--PVDLIPDVIPV 56
F +R P W L ++ + PV P + P
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPP 41


43	RS_RS11270	RS_RS11715	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
RS_RS11270	2	13	3.251348	branched-chain amino acid ABC transporter
RS_RS11275	3	13	3.979936	hemolysin III
RS_RS11280	3	12	4.070674	ABC transporter permease
RS_RS11285	3	12	3.916580	ABC transporter permease
RS_RS11290	3	14	3.937176	4-carboxymuconolactone decarboxylase
RS_RS11295	2	12	3.520651	3-oxoadipate enol-lactonase
RS_RS11300	1	11	3.426661	3-carboxy-cis,cis-muconate cycloisomerase
RS_RS11305	-1	13	1.468822	acetyl-CoA acetyltransferase
RS_RS11310	-1	10	0.306406	3-oxoadipate CoA-transferase subunit B
RS_RS11315	0	11	0.227150	3-oxoadipate CoA-transferase subunit A
RS_RS11320	1	12	0.050670	IclR family transcriptional regulator
RS_RS11325	0	15	-0.959369	hypothetical protein
RS_RS11330	0	14	0.252714	amino acid ABC transporter substrate-binding
RS_RS11335	2	12	0.517698	ABC transporter permease
RS_RS11340	1	9	-0.619374	amino acid ABC transporter permease
RS_RS11345	-1	10	-1.371486	amino acid ABC transporter ATP-binding protein
RS_RS11350	-1	12	-2.258390	ABC transporter
RS_RS11355	1	18	-1.960944	D-amino acid dehydrogenase 2
RS_RS11360	2	22	-4.323067	hypothetical protein
RS_RS11370	0	19	-5.481324	transposase
RS_RS11375	-1	11	-3.042716	transposase
RS_RS11380	1	11	-1.486328	transposase
RS_RS11385	2	14	0.114445	transposase
RS_RS11390	3	15	0.754346	hypothetical protein
RS_RS11395	3	13	2.458710	hypothetical protein
RS_RS11400	3	10	2.854367	glycosyl transferase family 1
RS_RS11405	2	9	2.778106	hypothetical protein
RS_RS11410	1	8	2.686237	sugar ABC transporter
RS_RS11415	1	7	2.737052	hypothetical protein
RS_RS11420	1	8	2.012568	signal peptidase
RS_RS11425	1	11	-1.999154	hypothetical protein
RS_RS11430	4	26	-5.484160	magnesium transporter
RS_RS11435	9	32	-5.652033	signal peptidase
RS_RS11440	7	28	-5.630431	hypothetical protein
RS_RS11445	4	24	-4.389272	membrane protein
RS_RS11450	2	19	-2.195457	hypothetical protein
RS_RS11455	1	15	-0.025335	hypothetical protein
RS_RS11460	1	14	-0.670324	structural protein MipA
RS_RS11465	1	13	-0.921461	signal peptidase
RS_RS11470	1	12	-0.475072	hypothetical protein
RS_RS11475	2	13	-1.136645	alkyl hydroperoxide reductase
RS_RS11480	1	11	-0.753566	hypothetical protein
RS_RS11485	2	13	-0.598752	cytochrome C
RS_RS11490	1	16	0.600387	iron oxidase
RS_RS11495	2	18	0.464976	lytic transglycosylase
RS_RS11500	1	14	-1.042344	membrane protein
RS_RS11505	1	14	-0.995517	cytochrome C
RS_RS11510	2	14	-1.720300	hypothetical protein
RS_RS11515	2	13	-1.732360	hypothetical protein
RS_RS11520	3	13	-1.617656	membrane protein
RS_RS11525	2	12	-1.569735	hypothetical protein
RS_RS11530	1	13	-0.431570	hypothetical protein
RS_RS11535	1	14	0.258137	hypothetical protein
RS_RS11540	-1	18	1.727085	hypothetical protein
RS_RS11545	-1	19	1.606577	general secretion pathway protein GspG
RS_RS11550	0	20	1.976234	GSPG signal peptide protein
RS_RS11555	2	20	1.522526	GSPD
RS_RS11560	2	22	1.679685	hypothetical protein
RS_RS11565	3	23	1.075807	membrane protein
RS_RS11570	3	20	0.164984	membrane protein
RS_RS11575	3	20	-0.389173	hypothetical protein
RS_RS11580	1	19	-1.672887	general secretion pathway protein GspE
RS_RS11585	-1	21	-2.687047	general secretion pathway protein GspF
RS_RS11590	-1	25	-4.911636	type II secretion system protein GspG
RS_RS11595	-2	23	-4.735240	ATPase
RS_RS11600	-2	30	-5.922462	transcriptional regulator
RS_RS11605	0	34	-6.485162	transposase
RS_RS11610	-1	35	-6.551670	transposase
RS_RS11615	-1	39	-7.025016	hypothetical protein
RS_RS11620	-1	40	-7.072875	MFS transporter
RS_RS11625	0	38	-7.407141	AsnC family transcriptional regulator
RS_RS11630	0	34	-4.739670	lysophospholipase
RS_RS11635	0	32	-4.042306	AsnC family transcriptional regulator
RS_RS11640	1	29	-4.143146	oxidoreductase
RS_RS11645	2	27	-3.708466	MerR family transcriptional regulator
RS_RS11650	1	29	-4.065996	MFS transporter
RS_RS11655	3	31	-4.286032	MFS transporter
RS_RS11660	3	38	-7.897304	MarR family transcriptional regulator
RS_RS11665	4	40	-8.989655	hypothetical protein
RS_RS11670	4	38	-8.305147	hypothetical protein
RS_RS11675	4	37	-7.963448	LysR family transcriptional regulator
RS_RS11680	2	31	-7.215938	hypothetical protein
RS_RS11685	1	30	-7.033514	flavodoxin
RS_RS11690	1	30	-6.280886	hypothetical protein
RS_RS11695	2	31	-5.700465	transposase
RS_RS11700	1	19	-4.405828	transposase
RS_RS11705	1	13	-2.438764	transposase
RS_RS11710	2	10	-1.114511	transposase
RS_RS11715	3	9	-0.393791	transposase
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS11275	AUTOINDCRSYN	29	0.013	Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 29.0 bits (65), Expect = 0.013
Identities = 26/144 (18%), Positives = 49/144 (34%), Gaps = 24/144 (16%)

Query: 51 TTFINLLTGAFAPTAGQVRLDGDDITRLS----AHQRVKRGITRTFQINT-LFPGL---- 101
T + N++TG F P ++ + + S R K + + I++ LF +
Sbjct: 72 TKYPNMITGTFFPYFKEINIPEGNYLESSRFFVDKSRAKDILGNEYPISSMLFLSMINYS 131

Query: 102 ---------TVLESVMLAIFERTGASWHWHRTVAAHTAARDEAMALLGRLQLGADASSPT 152
T++ ML I +R+G W V + E L + L D +
Sbjct: 132 KDKGYDGIYTIVSHPMLTILKRSG----WGIRVVEQGLSEKEERVYL--VFLPVDDENQE 185

Query: 153 HALPYGKQRLLEIALALATLPRIL 176
+ ++ L P +
Sbjct: 186 ALARRINRSGTFMSNELKQWPLRV 209


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS11375	ADHESNFAMILY	27	0.043	Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 26.7 bits (59), Expect = 0.043
Identities = 12/54 (22%), Positives = 22/54 (40%), Gaps = 13/54 (24%)

Query: 106 SKQDPQPAAQPAQDERSREDLLKENECLRAEVAYLKKLDALLRAKQQQAPKKKR 159
S +DP ++ ++N L+ L KLD + K + P +K+
Sbjct: 159 SAKDPN-----------NKEFYEKN--LKEYTDKLDKLDKESKDKFNKIPAEKK 199


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS11425	AEROLYSIN	33	0.007	Aerolysin signature.
		>AEROLYSIN#Aerolysin signature.

Length = 493

Score = 32.7 bits (74), Expect = 0.007
Identities = 12/18 (66%), Positives = 14/18 (77%)

Query: 799 DMAIARDGDAWVVRGNGD 816
DM + RDGD WV+RGN D
Sbjct: 162 DMDVTRDGDGWVIRGNND 179


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS11545	BCTERIALGSPG	74	7e-20	Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 73.8 bits (181), Expect = 7e-20
Identities = 37/127 (29%), Positives = 62/127 (48%), Gaps = 18/127 (14%)

Query: 8 RRRGFTLIELVVVMAIIGLLLTLALPRYFHSIERGRAQVQQQNLAVIRDAIDKYYGDNGQ 67
++RGFTL+E++VV+ IIG+L +L +P + E+ Q ++ + +A+D Y DN
Sbjct: 6 KQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDNHH 65

Query: 68 YPDT---LDDLVAK-------------RYLRGIPVDPVSGGTAWAVIAPPDTSKTGIYDV 111
YP T L+ LV Y++ +P DP G + ++ P + +
Sbjct: 66 YPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADP--WGNDYVLVNPGEHGAYDLLSA 123

Query: 112 GPAREAG 118
GP E G
Sbjct: 124 GPDGEMG 130


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS11550	BCTERIALGSPG	39	1e-06	Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 39.5 bits (92), Expect = 1e-06
Identities = 31/143 (21%), Positives = 56/143 (39%), Gaps = 15/143 (10%)

Query: 13 RANCRGHGFTLIELVITLALVGIVAMAIVPLSELAVQRQKEQALRVALREIRTAIDAYKE 72
RA + GFTL+E+++ + ++G++A +VP ++ +Q + + A+D YK
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYK- 60

Query: 73 ASDSGSVEHEAGASGYP---PSLAVLVEGVKDAKDPKG-GLLMFLRRVPRDPFFTGEAST 128
D+ YP L LVE +++R+P DP+
Sbjct: 61 -LDNHH---------YPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLV 110

Query: 129 PAADTWNLRAYGVPVDGGTGGDD 151
+ DG G +D
Sbjct: 111 NPGEHGAYDLLSAGPDGEMGTED 133


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS11555	BCTERIALGSPD	175	3e-48	Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 175 bits (444), Expect = 3e-48
Identities = 86/361 (23%), Positives = 154/361 (42%), Gaps = 53/361 (14%)

Query: 279 KVRTFQLSNTDAKHIDTLLKNLLKIKE-----------------IVTDERANTVSIRATP 321
+ L A + +L + + I + N + + A P
Sbjct: 268 NTKVIYLKYAKASDLVEVLTGISSTMQSEKQAAKPVAALDKNIIIKAHGQTNALIVTAAP 327

Query: 322 ETIRVAERMIAAQDLPEPEVMLEVEVLEVSRDRLTDLGIDWPNS-------FSLGTPSSA 374
+ + ER+IA D+ P+V++E + EV +LGI W N + G P S
Sbjct: 328 DVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLNLGIQWANKNAGMTQFTNSGLPIST 387

Query: 375 STWGALHHLTVNQLTASGLSVTANLKL------------------TDTDANLLASPRIRT 416
+ GA + +++S S ++ + T ++LA+P I T
Sbjct: 388 AIAGANQYNKDGTVSSSLASALSSFNGIAAGFYQGNWAMLLTALSSSTKNDILATPSIVT 447

Query: 417 RNKEKAKILIGDKVPVISSSSVPSTSGPVYSQSIQYLDVGIKLEVEPQVYRDNDVGIKMS 476
+ +A +G +VPV++ S S +++ VGIKL+V+PQ+ + V +++
Sbjct: 448 LDNMEATFNVGQEVPVLTGSQTTSGDNIF--NTVERKTVGIKLKVKPQINEGDSVLLEIE 505

Query: 477 LEVSNITQIISSSNAQTGLSSLAYQIGTRNASTTLRLKDGETQILGGLIQDEDRDAANKV 536
EVS++ A + S L TR + + + GET ++GGL+ D A+KV
Sbjct: 506 QEVSSVAD-----AASSTSSDLGATFNTRTVNNAVLVGSGETVVVGGLLDKSVSDTADKV 560

Query: 537 PGLGQLPVLGRLFSNHNGDHKKTEIVLQITPHIVRPQLAADADTREIWSGTDSTVRTEQL 596
P LG +PV+G LF + + K ++L I P ++R + + R+ SG + Q
Sbjct: 561 PLLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVIRDR----DEYRQASSGQYTAFNDAQS 616

Query: 597 R 597
+
Sbjct: 617 K 617



Score = 48.8 bits (116), Expect = 7e-08
Identities = 28/188 (14%), Positives = 69/188 (36%), Gaps = 21/188 (11%)

Query: 185 AAASIMRKPVSLQ-----FRDANVRMVFEALSRTTGLSVIFDRDVR--VDLKTTIFVSNA 237
A+++ +P + + F+ +++ +S+ +VI D VR + +++ ++
Sbjct: 16 IFAALLFRPAAAEEFSASFKGTDIQEFINTVSKNLNKTVIIDPSVRGTITVRSYDMLNEE 75

Query: 238 SLEDTVDMILLQNQLAKKVLNANTVFIYPATPAKQQEY-----------QDLKVRTFQLS 286
+L A +N + + + AK ++ R L+
Sbjct: 76 QYYQFFLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAPGIGDEVVTRVVPLT 135

Query: 287 NTDAKHIDTLLKNLL---KIKEIVTDERANTVSIRATPETIRVAERMIAAQDLPEPEVML 343
N A+ + LL+ L + +V E +N + + I+ ++ D ++
Sbjct: 136 NVAARDLAPLLRQLNDNAGVGSVVHYEPSNVLLMTGRAAVIKRLLTIVERVDNAGDRSVV 195

Query: 344 EVEVLEVS 351
V + S
Sbjct: 196 TVPLSWAS 203


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS11585	BCTERIALGSPF	204	7e-64	Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 204 bits (520), Expect = 7e-64
Identities = 102/403 (25%), Positives = 181/403 (44%), Gaps = 6/403 (1%)

Query: 3 YLLRVFDSGGLVQTVCIEGDTPAGATAAAHARGWKVIAVRAGGARGRHRIRGTVL----- 57
Y + D+ G E D+ A RG ++V + +
Sbjct: 4 YHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLRRKI 63

Query: 58 GGTAFDVELFARELAALLDAGVSVIDALRTLGSNERREASAAVYRDLLRRLEEGHALSAA 117
+ D+ L R+LA L+ A + + +AL + + + + + ++ EGH+L+ A
Sbjct: 64 RLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLADA 123

Query: 118 LELADNVFPPVLVACVKASEQTGGLAASLKRYSQNSATLQALRARVVSASIYPAVLLAVG 177
++ F + A V A E +G L A L R + + Q +R+R+ A IYP VL V
Sbjct: 124 MKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLTVVA 183

Query: 178 GSVAVFLLAFVVPRFAGLLEHSGRELPMLSQWLMAWGAMVHAHGQGLAVGFAICVLAGIG 237
+V LL+ VVP+ H + LP+ ++ LM V G + + +A
Sbjct: 184 IAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMAFRV 243

Query: 238 ALRRQATRSWATDRLLSLPGVGEHFRVYRQAQFFRTSAMLVDGGIPAVQAFDLACGLVS- 296
LR++ R RLL LP +G R A++ RT ++L +P +QA ++ ++S
Sbjct: 244 MLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDVMSN 303

Query: 297 RADRAALASAMERIRNGGRMSDAFLGSGLADPITYRLLTVAEKTGGLGPVLDKIAAFQEA 356
R L+ A + +R G + A + L P+ ++ E++G L +L++ A Q+
Sbjct: 304 DYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADNQDR 363

Query: 357 HVSHAIDLASRLVEPAMMMVIGVVIGGIVVLMYLPIFQLASSV 399
S + LA L EP +++ + V+ IV+ + PI QL + +
Sbjct: 364 EFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS11590	BCTERIALGSPG	160	1e-53	Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 160 bits (406), Expect = 1e-53
Identities = 57/140 (40%), Positives = 81/140 (57%), Gaps = 6/140 (4%)

Query: 19 RCARREGGFTLLELLVVMVIIGLLAGLVAPQYFDQIGKSNLKIAKAQIESLGKALDQYRL 78
R ++ GFTLLE++VV+VIIG+LA LV P K++ + A + I +L ALD Y+L
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKL 61

Query: 79 DVGAYPSTEEGLDALNTRPQSQP---RWSGPYLKKAVPLDPWDRPYVYRSPGEHGEYDLY 135
D YP+T +GL++L P P ++ K +P DPW YV +PGEHG YDL
Sbjct: 62 DNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLL 121

Query: 136 SLGKSGQPGGTGENVAVTSW 155
S G G+ G + +T+W
Sbjct: 122 SAGPDGEMGTEDD---ITNW 138


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS11600	HTHFIS	69	7e-16	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 69.5 bits (170), Expect = 7e-16
Identities = 31/118 (26%), Positives = 53/118 (44%), Gaps = 2/118 (1%)

Query: 6 RLLIAEDHHLLRCGLRSMLSALGEYDVVGEAKDGREACQLAISLAPDLVLTDLSMPGMNG 65
+L+A+D +R L LS G YDV + + + DLV+TD+ MP N
Sbjct: 5 TILVADDDAAIRTVLNQALSRAG-YDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 66 IDMVATIKRRLPQIRVVVLTVYKSEEYVREALRVGVDGYVLKDASFEELVAALRAVMQ 123
D++ IK+ P + V+V++ + +A G Y+ K EL+ + +
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS11620	TCRTETB	36	3e-04	Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 35.6 bits (82), Expect = 3e-04
Identities = 63/388 (16%), Positives = 130/388 (33%), Gaps = 57/388 (14%)

Query: 23 NSTVIFATGAIIGHMLAPSPALATVPVSIFVVGMAAATLPVGVVTRRYGRKTSALFGSVC 82
N V+ + I + PA + F++ + T G ++ + G K LFG
Sbjct: 29 NEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFG--- 85

Query: 83 GVTVGLLAALALVIQSFSLFCVA--MLFGGAYAAVVLTYRFAAAECVPAEHRARAM---S 137
+ + + V SF + + G AA A +P E+R +A
Sbjct: 86 IIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIG 145

Query: 138 TVLAGGVAAGVLGPQLVTAT--------------------MNLWSPHAYAVTYLASAGTA 177
+++A G G ++ M L + G
Sbjct: 146 SIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGII 205

Query: 178 VLSAIVLQGVRFEHQPPVSAH-----------AHGRPSSDIMRQPRFV--VAMLCGVVSY 224
++S ++ + F +S H R +D P + + GV+
Sbjct: 206 LMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCG 265

Query: 225 MMMNFMMTSAPLAMELCGIPRVHANYGIEIHVIAMYAPS-------FFTGRLIARFGAPR 277
++ + + + VH EI + ++ + + G L+ R G
Sbjct: 266 GIIFGTVAGFVSMVPYM-MKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLY 324

Query: 278 VSLAGLALIALAATTGMTGVSVNHFWVALLLLGLGWNFGFLGASATVLTCH-----TPAE 332
V G+ ++++ T + +++ ++++ + G L + TV++ E
Sbjct: 325 VLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFV---LGGLSFTKTVISTIVSSSLKQQE 381

Query: 333 GPRVQSIYDFVVFGAMVVGSFVSGGLLA 360
S+ +F F + G + GGLL+
Sbjct: 382 AGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS11630	RTXTOXINA	31	0.010	Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 30.7 bits (69), Expect = 0.010
Identities = 15/54 (27%), Positives = 22/54 (40%), Gaps = 9/54 (16%)

Query: 128 LCGGSLGGYLAYLCAARMGAG-----PIAGVIATTLADPRSPL----VRRQFAR 172
+ G G Y+ A R G AG+IA+ + SPL + +F R
Sbjct: 279 VLGNVGKGISQYIIAQRAAQGLSTSAAAAGLIASAVTLAISPLSFLSIADKFKR 332


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS11640	DHBDHDRGNASE	82	2e-20	2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 81.6 bits (201), Expect = 2e-20
Identities = 53/192 (27%), Positives = 86/192 (44%), Gaps = 1/192 (0%)

Query: 3 NCFDFRGKTALITGASSGIGREFAYALAKRGAKLLLVARSRDKLHDLTAELRRDYACDAD 62
N GK A ITGA+ GIG A LA +GA + V + +KL + + L+ + A A+
Sbjct: 2 NAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE-ARHAE 60

Query: 63 FLTVDLSAPDAIPTVAHLLKATGTVVDVLINNAGFATYGRFETIPWTRQRDEVLVNCMAA 122
D+ AI + ++ +D+L+N AG G ++ VN
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 123 IELTHLLLPGMQARSDGAVINVASTAAFQPDPYMAIYGATKAFLLSFSEAVWAENRHRGI 182
+ + M R G+++ V S A P MA Y ++KA + F++ + E I
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 183 RVLALCPGATQT 194
R + PG+T+T
Sbjct: 181 RCNIVSPGSTET 192


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS11650	TCRTETB	128	3e-34	Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 128 bits (322), Expect = 3e-34
Identities = 95/413 (23%), Positives = 175/413 (42%), Gaps = 17/413 (4%)

Query: 21 RNTVALLAVCLAALMFGLEISSVPVILPTLEVALHGDFSGIQWIMNAYTIACTAVLMATG 80
R+ L+ +C+ + L + V LP + + + W+ A+ + + G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 81 TLADRYGRKRVFLYTIALFGVASLLCGLAWNT-PILIASRFLQGASGGAMLICLVAVLSH 139
L+D+ G KR+ L+ I + S++ + + +LI +RF+QG +G A LV V+
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQG-AGAAAFPALVMVVVA 129

Query: 140 QFPEGAERGKAFGIWGIVFGIGLGFGPLIGGLIVAMSGWKWVFWVHVFIAALTFGLAIVG 199
++ RGKAFG+ G + +G G GP IGG+I W ++ + + L +
Sbjct: 130 RYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLL 189

Query: 200 VRESRDPHARRLDLAGIVTLSLTVLGLAYFVTQGADTGFGSASALRVIAATAVSFALFLI 259
+E R D+ GI+ +S+ ++ F T S S +I + +SF +F+
Sbjct: 190 KKEVR--IKGHFDIKGIILMSVGIVFFMLFTT--------SYSISFLIVSV-LSFLIFVK 238

Query: 260 VEMRAAHPMFDFSVFRVRNFSGALLGSVGMNFSFWPFIIYLPIYFQSALGQGAAAAG-VS 318
+ P D + + F +L + + F+ +P + A G V
Sbjct: 239 HIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVI 298

Query: 319 LLAYTLPTLVFPPIGERLALRYRPGIVIPAGLFTIGLGFFLMLIGSSIDQASWLTVLPGC 378
+ T+ ++F IG L R P V+ G+ + + F + ++ SW +
Sbjct: 299 IFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSF--LTASFLLETTSWFMTIIIV 356

Query: 379 LLAGAGLGITNTPVTNTTTGSVSPARAGMASGIDMSARMISLAINIALMGFLL 431
+ G GL T T ++ + S+ AG + +S IA++G LL
Sbjct: 357 FVLG-GLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLL 408


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS11655	TCRTETB	60	1e-11	Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 59.5 bits (144), Expect = 1e-11
Identities = 61/322 (18%), Positives = 127/322 (39%), Gaps = 23/322 (7%)

Query: 61 WNTTLFIVASITGAAVAAQVIQHLGPRGAYLLGAMVFGGGSALCALA-PIMQVLLVGRVL 119
W T F++ G AV ++ LG + L G ++ GS + + +L++ R +
Sbjct: 53 WVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFI 112

Query: 120 QGLGGGVLLSAPYVLMRSVLPEPLWPRALALLSGMWGIATLLGPAVGGIFAQYATWRLAF 179
QG G + V++ +P+ +A L+ + + +GPA+GG+ A Y
Sbjct: 113 QGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHY-----IH 167

Query: 180 WSLVVLIALAAAAAVVVLKPAAPQAERPT-PVPTRQLVLLAVAVVAASWASVSDSPWLGA 238
WS ++LI + V L + R + ++L++V +V + S S
Sbjct: 168 WSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLI 227

Query: 239 LGLIAAALTLLAIRRVEMRSSRRILPADAYTGRTALAALYGVSALLAVTVTCTEIFVPLF 298
+ +++ + + IR+V + + + + ++ TV VP
Sbjct: 228 VSVLSFLIFVKHIRKV----TDPFVDPGLGKNIPFMIGVL-CGGIIFGTVAGFVSMVPYM 282

Query: 299 LQRLHGQSPLIAGYIAATASAGWTLGAILSAALRAASVTR-----AIRLAPWACALGLLV 353
++ +H S G + T+ I+ + V R + + ++ L
Sbjct: 283 MKDVHQLSTAEIGSVIIFPG---TMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLT 339

Query: 354 LAMLVPIGAGSAWMTTLIIVAL 375
+ L+ ++W T+IIV +
Sbjct: 340 ASFLL---ETTSWFMTIIIVFV 358


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS11710	PF04647	29	0.011	Accessory gene regulator B
		>PF04647#Accessory gene regulator B

Length = 212

Score = 29.4 bits (66), Expect = 0.011
Identities = 14/64 (21%), Positives = 23/64 (35%), Gaps = 3/64 (4%)

Query: 169 VDEMGYLPMDREQANLLFQVIAKRYETGSLVLTSNLPFGQWDQTFAGDATLTAALLDRLL 228
VD Y P ++E+ +V ++L G + L+AA+ R
Sbjct: 13 VDRSDY-PFNQEEIRYGIEVFLGTVFQIIIILLVAFVIGLAKEVAF--CLLSAAVYRRFS 69

Query: 229 HHAH 232
AH
Sbjct: 70 GGAH 73


44	RS_RS11990	RS_RS12070	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
RS_RS11990	1	10	3.600654	miscellaneous; unknown
RS_RS11995	2	10	5.429671	cobyrinic acid a,c-diamide synthase
RS_RS12000	3	11	5.871304	cobyric acid synthase
RS_RS12005	3	10	5.661711	adenosylcobinamide kinase
RS_RS12010	4	9	5.906923	cobalamin biosynthesis protein CobD
RS_RS12015	5	11	5.867985	threonine-phosphate decarboxylase
RS_RS12020	8	14	5.659620	cobalamin-binding protein
RS_RS12025	4	17	3.698497	alpha-ribazole phosphatase
RS_RS12030	3	16	3.298252	adenosylcobinamide-GDP ribazoletransferase
RS_RS12035	2	15	3.045253	nicotinate-nucleotide--dimethylbenzimidazole
RS_RS12040	1	13	1.943972	ABC transporter ATP-binding protein
RS_RS12045	-1	11	0.691509	iron ABC transporter
RS_RS12050	-2	10	-1.114511	TonB-dependent receptor
RS_RS12055	-1	17	-1.649011	hypothetical protein
RS_RS12060	0	18	-4.105964	cell division protein ZapA
RS_RS12065	3	23	-5.164691	hypothetical protein
RS_RS12070	2	19	-4.116678	transposase
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS12020	FERRIBNDNGPP	38	4e-05	Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 37.6 bits (87), Expect = 4e-05
Identities = 39/178 (21%), Positives = 71/178 (39%), Gaps = 15/178 (8%)

Query: 44 AQRVVSLAPHATELLFAAG----GGARIVGTVTYSDYPSAARDIPRVGDNRAVDLERIAA 99
R+V+L ELL A G G A + + P + VG +LE +
Sbjct: 35 PNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTE 94

Query: 100 LKPDLVVV-WRHGNAQQQTDRLRALGIPLFFSEPRRLTDIPRAIEALGTLLDTRAGAHDA 158
+KP +V +G + + R+ A G FS+ ++ + R ++L + D A
Sbjct: 95 MKPSFMVWSAGYGPSPEMLARI-APGRGFNFSDGKQPLAMAR--KSLTEMAD-LLNLQSA 150

Query: 159 AERFRRQADDL----RERYAGR--PPVTVFFQVWQQPLMTLNGQHVFSDMLALCGGRN 210
AE Q +D + R+ R P+ + + + ++ +F ++L G N
Sbjct: 151 AETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGIPN 208


45	RS_RS12185	RS_RS12255	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
RS_RS12185	0	9	3.606396	tartronate semialdehyde reductase
RS_RS12190	-1	8	3.119683	*MarR family transcriptional regulator
RS_RS12200	-1	8	2.255175	chloride channel protein
RS_RS12205	-2	7	2.232653	diguanylate cyclase
RS_RS12210	-2	7	1.341972	NAD-dependent dehydratase
RS_RS12215	-3	10	0.822811	CDP-6-deoxy-delta-3,4-glucoseen reductase
RS_RS12220	-2	13	-1.449333	acetylornithine aminotransferase
RS_RS12225	-2	12	-2.406796	N-acetyltransferase GCN5
RS_RS12230	-2	14	-3.427746	amino acid ABC transporter ATP-binding protein
RS_RS12240	-2	12	-3.272654	ABC transporter
RS_RS12245	-2	11	-3.275176	ABC transporter ATP-binding protein
RS_RS12250	-1	13	-3.485965	ABC transporter permease
RS_RS12255	3	13	-3.810001	branched-chain amino acid ABC transporter
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS12215	NUCEPIMERASE	34	0.001	Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 33.6 bits (77), Expect = 0.001
Identities = 18/69 (26%), Positives = 25/69 (36%), Gaps = 15/69 (21%)

Query: 149 QPRTLVYASTSGVYGDCGGAWVDETRPV-RPATPRAMRRVAAERRV-------------- 193
+ + L+YAS+S VYG V P + A + A E
Sbjct: 117 KIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGL 176

Query: 194 RWFGVGGGW 202
R+F V G W
Sbjct: 177 RFFTVYGPW 185


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
RS_RS12230	SACTRNSFRASE	43	6e-08	Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 43.4 bits (102), Expect = 6e-08
Identities = 19/73 (26%), Positives = 31/73 (42%), Gaps = 4/73 (5%)

Query: 68 ELAGFARVISDHATFAYLCDVFVLPEWRGKGISHALMRLLREHPELQGLRRTVLVTTD-- 125
G ++ S+ +A + D+ V ++R KG+ AL+ E + +L T D
Sbjct: 75 NCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDIN 134

Query: 126 --ADGLYRKHGFT 136
A Y KH F
Sbjct: 135 ISACHFYAKHHFI 147


46  RS_RS12370  RS_RS12560  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
RS_RS12370  0  12  -4.459792  ATP-dependent Clp protease ATP-binding subunit
RS_RS12375  4  47  -9.004058  ATP-dependent Clp protease adapter protein ClpS
RS_RS12380  3  46  -8.432078  cold-shock protein
RS_RS12385  3  46  -8.497552  membrane protein
RS_RS12390  3  50  -9.921778  hypothetical protein
RS_RS12395  3  49  -9.584159  hypothetical protein
RS_RS12400  4  46  -7.870764  type IV secretion protein Rhs
RS_RS12405  4  35  -5.890435  hypothetical protein
RS_RS12410  2  33  -5.418605  membrane protein
RS_RS12415  2  32  -4.649488  hypothetical protein
RS_RS12420  2  31  -4.483687  hypothetical protein
RS_RS12425  2  31  -4.070532  hypothetical protein
RS_RS12430  1  26  -3.693217  hypothetical protein
RS_RS12435  1  26  -3.299519  hypothetical protein
RS_RS12440  1  24  -3.130813  hypothetical protein
RS_RS12445  0  24  -3.211789  hypothetical protein
RS_RS12450  1  27  -4.142941  hypothetical protein
RS_RS12455  1  25  -4.161401  DNA primase
RS_RS12460  3  19  -5.941274  hypothetical protein
RS_RS12465  4  21  -6.170967  hypothetical protein
RS_RS12470  1  14  -3.547046  hypothetical protein
RS_RS12475  1  11  -1.462558  hypothetical protein
RS_RS12480  2  12  -0.400344  integrase
RS_RS12485  1  15  0.648141  isocitrate dehydrogenase
RS_RS12490  0  14  2.472115  hypothetical protein
RS_RS12495  -1  11  3.098232  ser/threonine protein phosphatase
RS_RS12500  -1  13  3.860028  cytochrome B6
RS_RS12505  0  15  2.800414  hypothetical protein
RS_RS12510  0  15  2.584976  MBL fold metallo-hydrolase
RS_RS12515  2  13  2.740708  hypothetical protein
RS_RS12520  3  15  3.187100  LysR family transcriptional regulator
RS_RS12525  2  12  2.789241  LysR family transcriptional regulator
RS_RS12530  4  14  2.696992  MFS transporter
RS_RS12535  4  10  3.702648  membrane protein
RS_RS12540  4  11  4.352498  gamma-glutamyltransferase
RS_RS12545  4  10  4.862903  hypothetical protein
RS_RS12550  1  9  3.476206  flagellar biosynthesis protein FlgM
RS_RS12555  1  9  3.612419  glucose-1-dehydrogenase
RS_RS12560  2  10  3.365775  transcription regulator protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS12370  HTHFIS  31  0.027  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.6 bits (69), Expect = 0.027
Identities = 30/124 (24%), Positives = 46/124 (37%), Gaps = 24/124 (19%)

Query: 178 NALAKAGKIDPLIGREQEVERVVQVLCR--RRKNNPLLVGEAGVGKTAIAEGL---AWRI 232
+ PL+GR ++ + +VL R + ++ GE+G GK +A L R
Sbjct: 128 KLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRR 187

Query: 233 TK-------GEVPDILARSVVYSLDMGALLAGTKYR-GDFEQRLKGVLKSLKDNPNAILF 284
+P L S ++ + GA G FEQ G LF
Sbjct: 188 NGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGT-----------LF 236

Query: 285 IDEI 288
+DEI
Sbjct: 237 LDEI 240


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS12435  RTXTOXINA  30  0.042  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.9 bits (67), Expect = 0.042
Identities = 22/115 (19%), Positives = 39/115 (33%), Gaps = 9/115 (7%)

Query: 411 KLGSALKRFNEYAEKNPGTIERMTDGFIGLSAVLIGGGVISMLNSSVRSFRLLTDVLFPA 470
LGS L K+ + L + G +S + S++ + +L++
Sbjct: 211 TLGSVL-----SNTKHLNGVGNKLQNLPNLDNIGAGLDTVSGILSAISASFILSNADADT 265

Query: 471 SRGLGGAAEATSALSKAIAMKEVGGALGIARIAAALGPVGLAGALIALGAGAIAL 525
E T+ + + K + + R A L A LI A A+ L
Sbjct: 266 RTKAAAGVELTTKVLGNVG-KGISQYIIAQRAAQGLSTSAAAAGLI---ASAVTL 316


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS12530  TCRTETB  55  6e-10  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 54.5 bits (131), Expect = 6e-10
Identities = 54/332 (16%), Positives = 121/332 (36%), Gaps = 26/332 (7%)

Query: 45 VAGQHIQGGVHADPQAYLYAVTAYAVAAVVANLAIGRIAARIGYRMFSLIGIVLFGVGCV 104
V+ I + P + + TA+ + + G+++ ++G + L GI++ G V
Sbjct: 35 VSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSV 94

Query: 105 VCAQSNS-IDMLVAGRAIQGLGAGGLFSASRILVQLTTDPDERIAPMLMFSVGLFGLTTI 163
+ +S +L+ R IQG GA + ++V + R + + +
Sbjct: 95 IGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGV 154

Query: 164 APWICAEVLEYSEWRVIFWLELLLAAAAWLAMFFLPPERHQPRTRAARRPHEAPPAEGRL 223
P I + Y W + + ++ M L E R +G
Sbjct: 155 GPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKE--------VRI-------KGHF 199

Query: 224 DWIGVGAIAIGALAFLMGLSELRYNRLTATPAIPLLLLGGAAGLLLAVHRLRTHPDPWLD 283
D G+ +++G + F++ T + +I L++ + L+ H + DP++D
Sbjct: 200 DIKGIILMSVGIVFFMLF---------TTSYSISFLIVSVLSFLIFVKHIRKVT-DPFVD 249

Query: 284 LTRLNGRRYLWGIGFYGIYYLMSGIWSYLFPAVSQGGLGFTFRTTTLFLMISGAVSMVVA 343
++ G+ GI + + + P + + + ++ G +S+++
Sbjct: 250 PGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIF 309

Query: 344 MIFTIWLPFFFRKRRVIAIGFGIYAAAALLLA 375
L V+ IG + + L +
Sbjct: 310 GYIGGILVDRRGPLYVLNIGVTFLSVSFLTAS 341


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS12555  DHBDHDRGNASE  72  4e-17  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 72.0 bits (176), Expect = 4e-17
Identities = 64/256 (25%), Positives = 108/256 (42%), Gaps = 19/256 (7%)

Query: 3 KTIVITGGSRGIGRATAVLCARRGWSV-AIQYRGNRQAADETAGLVEQAGGRLLAVQGDV 61
K ITG ++GIG A A A +G + A+ Y + ++ E A DV
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE--ARHAEAFPADV 66

Query: 62 SSEADVMALFEAAQDRFGALHGVVNNAGIVAPAQDVADMSAQRLRAIFETNVLGAFLVAR 121
A + + + G + +VN AG++ P + +S + A F N G F +R
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGL-IHSLSDEEWEATFSVNSTGVFNASR 125

Query: 122 EAARRLSTARGGAGGALVNVSSAAARLGSPHEYVD-YAASKGAVDTMTLGLARELGREGV 180
++ + R G+ +V V S A G P + YA+SK A T L EL +
Sbjct: 126 SVSKYMMDRRSGS---IVTVGSNPA--GVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 181 RVNAVRPGLIDTEIHAC--GGQPDRAQRLGA-------ATPMGRPGTAAEVAETIVWLLS 231
R N V PG +T++ + Q + P+ + +++A+ +++L+S
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVS 240

Query: 232 DAASYVTGALLDCSGG 247
A ++T L GG
Sbjct: 241 GQAGHITMHNLCVDGG 256


47  RS_RS12790  RS_RS12890  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
RS_RS12790  1  20  -3.292542  alpha/beta hydrolase
RS_RS12795  0  14  -3.081516  hypothetical protein
RS_RS12800  0  12  -2.979047  argininosuccinate synthase
RS_RS12805  -1  10  -3.147298  ornithine carbamoyltransferase
RS_RS12810  -1  11  -2.517036  hypothetical protein
RS_RS12815  -3  11  -2.414056  30S ribosomal protein S20
RS_RS12820  -2  13  -0.832447  membrane protein
RS_RS12825  -3  9  0.561554  transglutaminase
RS_RS12830  -2  9  1.773353  membrane protein
RS_RS12835  -1  9  2.533524  transcriptional regulator
RS_RS12840  -1  9  3.189649  hypothetical protein
RS_RS12845  -1  10  3.804552  mammalian cell entry protein
RS_RS12850  -1  12  4.107274  DNA mismatch repair protein MutL
RS_RS12855  2  13  4.412386  tRNA dimethylallyltransferase
RS_RS12860  0  14  4.601147  GCN5 family N-acetyltransferase
RS_RS12865  1  15  4.911130  methylated-DNA--protein-cysteine
RS_RS12870  0  15  4.638050  prolyl 4-hydroxylase
RS_RS12875  -1  11  1.996600  dioxygenase
RS_RS12880  0  12  1.268291  DNA-3-methyladenine glycosylase
RS_RS12885  1  12  0.596600  ADA regulatory
RS_RS12890  2  17  -1.046944  metal-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS12835  HTHTETR  58  9e-13  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 58.5 bits (141), Expect = 9e-13
Identities = 22/132 (16%), Positives = 50/132 (37%), Gaps = 1/132 (0%)

Query: 7 RSERSRKAAIQAALAILSRDGPGQLTFDAIARESGISKGGLMHQFPNKGAVLKALLEHQI 66
++ +R+ + AL + S+ G + IA+ +G+++G + F +K + + E
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 67 EHFDKFTRGYLAEIGEDRPQAHLAAQIATLRESITTPHSVAF-AILAALIEDPELLALNR 125
+ + Y A+ D I L ++T I+ E +A+ +
Sbjct: 68 SNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQ 127

Query: 126 QMEADAIERIRA 137
Q + +
Sbjct: 128 QAQRNLCLESYD 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS12855  PF06917  30  0.011  Periplasmic pectate lyase
		>PF06917#Periplasmic pectate lyase

Length = 555

Score = 30.3 bits (68), Expect = 0.011
Identities = 28/145 (19%), Positives = 45/145 (31%), Gaps = 13/145 (8%)

Query: 124 ADPAIRAEIDAEAARDGWPALHAKLAQVDPVTAARLHATDAQRIQRALELYRLTGQPMSA 183
D + I R L+ K + + AA+ + +EL P
Sbjct: 408 EDEELLDLIGVLLLRWQLAELN-KTQRRATLMAAQRPIASPYLLLALVELAEHCQCPTLF 466

Query: 184 LLAREAGAAAFHRHEAAAAYLSIALEPADRAVLHARIAQRFDAMLAGGLLDEVEALRRRG 243
LA + G F RH ++ R D +A LL + A +
Sbjct: 467 TLAWQIGDDLFKRHYHRGLFVES----------AQHRYFRIDNPIALALLTLIAAK--QD 514

Query: 244 DLSPVLPSIRCVGYRQAWAYLDGEI 268
L+ + I GY ++GE
Sbjct: 515 KLAAIPQFITNGGYIHGDYRVNGES 539


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS12860  SACTRNSFRASE  45  2e-08  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 44.6 bits (105), Expect = 2e-08
Identities = 15/56 (26%), Positives = 21/56 (37%)

Query: 89 LYLEDLFVEPAWRGHGIGKALLVHVARLARERDCGRFEWSVLDWNAPSIAFYEAMG 144
+ED+ V +R G+G ALL A+E D N + FY
Sbjct: 90 ALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHH 145


48  RS_RS12960  RS_RS13125  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
RS_RS12960  0  23  -3.873746  CopG family transcriptional regulator
RS_RS12965  0  26  -4.761618  conjugal transfer protein TraG
RS_RS12970  2  39  -8.281641  hypothetical protein
RS_RS12975  3  38  -7.090426  D-alanyl-D-alanine endopeptidase
RS_RS12980  6  40  -7.602045  limonene-1,2-epoxide hydrolase
RS_RS12985  5  39  -6.934224  limonene-1,2-epoxide hydrolase
RS_RS12990  5  35  -6.654351  AraC family transcriptional regulator
RS_RS12995  4  31  -6.175339  hypothetical protein
RS_RS13000  3  33  -5.649220  carboxylesterase
RS_RS13005  3  31  -5.042081  glutathione S-transferase
RS_RS13010  1  32  -4.349160  hypothetical protein
RS_RS13015  0  26  -2.367785  amidase
RS_RS13020  0  27  -1.600039  3-oxoacyl-ACP reductase
RS_RS13025  -1  25  -0.492185  AraC family transcriptional regulator
RS_RS13030  -1  21  1.076724  ATP-dependent helicase
RS_RS13035  -1  20  1.194388  type VI secretion protein
RS_RS13040  -1  17  1.026161  peptidase
RS_RS13045  1  19  0.896170  transposase
RS_RS13050  2  18  1.509222  chromosome partitioning protein ParB
RS_RS13055  0  17  1.101380  ATPase
RS_RS13060  1  18  0.303258  RepA replication protein
RS_RS13065  2  24  -3.341036  transcriptional regulator
RS_RS13070  3  33  -8.851455  hypothetical protein
RS_RS13075  3  31  -9.010086  lipoprotein
RS_RS13080  6  38  -13.225863  XRE family transcriptional regulator
RS_RS13085  6  39  -13.700324  hypothetical protein
RS_RS13090  4  30  -8.411481  hypothetical protein
RS_RS13095  4  28  -6.556224  hypothetical protein
RS_RS13100  5  36  -10.093475  hypothetical protein
RS_RS13105  4  35  -9.902992  hypothetical protein
RS_RS13110  5  30  -8.665896  hypothetical protein
RS_RS13115  4  28  -8.209194  hypothetical protein
RS_RS13120  4  25  -8.867449  DNA repair protein RadC
RS_RS13125  3  21  -8.152472  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS13020  DHBDHDRGNASE  82  1e-20  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 82.0 bits (202), Expect = 1e-20
Identities = 64/258 (24%), Positives = 107/258 (41%), Gaps = 16/258 (6%)

Query: 4 KIALITGASRGLGRNMALHLAKRGVHIIGTYRSGAEQAQTLKQEIEALGGKATMLALDVT 63
KIA ITGA++G+G +A LA +G HI E+ + + ++A A DV
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAA-VDYNPEKLEKVVSSLKAEARHAEAFPADVR 67

Query: 64 ETAGYPAFTSAVTDTLKNDFGRERFDFLINNAGNGLFAKFVDATEEQFASLIATHLRGPI 123
++A T+ + + D L+N AG ++E++ + + + G
Sbjct: 68 DSAAIDEITARIEREM------GPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVF 121

Query: 124 FLTQKLLPLL--ENGGRILNVSSGFVRFTMPGYSVYAAVKAALEVLTRYMAVELGSRQIR 181
++ + + G I+ V S + YA+ KAA + T+ + +EL IR
Sbjct: 122 NASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIR 181

Query: 182 VNAIAPGAIATDF-------GGGAVRDNKDVNAYVAQGIALGRVGLPDDIGGAVAAILSD 234
N ++PG+ TD GA + K GI L ++ P DI AV ++S
Sbjct: 182 CNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSG 241

Query: 235 DMGWANGTTFDISGGQLL 252
G + GG L
Sbjct: 242 QAGHITMHNLCVDGGATL 259


49  RS_RS13170  RS_RS13290  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
RS_RS13170  2  11  3.299492  3-methyl-2-oxobutanoate
RS_RS13175  0  11  2.570325  ABC transporter permease
RS_RS13180  -2  12  1.092902  ABC transporter ATP-binding protein
RS_RS13185  -1  12  1.052671  aminodeoxychorismate synthase component I
RS_RS13190  -1  10  0.809157  molecular chaperone DnaJ
RS_RS13195  0  11  0.496393  molecular chaperone DnaK
RS_RS13205  2  13  -1.139406  elongation factor GreAB
RS_RS13210  -1  11  1.342956  thioredoxin
RS_RS13215  -1  10  2.514117  protein GrpE
RS_RS13220  -1  10  3.427951  membrane protein
RS_RS13225  1  12  3.822406  RNA-binding protein S4
RS_RS13230  0  12  3.669570  ferrochelatase
RS_RS13235  0  12  3.518120  imidazolonepropionase
RS_RS13240  2  12  3.322674  formimidoylglutamase
RS_RS13245  1  13  2.573432  histidine ammonia-lyase
RS_RS13250  1  12  2.589192  urocanate hydratase
RS_RS13255  1  12  2.470847  histidine utilization repressor
RS_RS13260  2  10  2.905821  HrcA family transcriptional regulator
RS_RS13265  2  8  2.953025  inorganic polyphosphate/ATP-NAD kinase
RS_RS13270  2  7  3.519635  DNA repair protein RecN
RS_RS13275  1  9  3.564604  hypothetical protein
RS_RS13280  0  9  3.696659  peptidase
RS_RS13290  -1  10  3.242626  peptidase S8
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS13190  PF05272  30  0.012  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.012
Identities = 14/36 (38%), Positives = 21/36 (58%)

Query: 35 NAVVITGANGVGKTTLLRMLAGLVPAAEACVDFGRG 70
+VV+ G G+GK+TL+ L GL ++ D G G
Sbjct: 597 YSVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTG 632


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS13205  SHAPEPROTEIN  137  5e-38  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 137 bits (348), Expect = 5e-38
Identities = 81/386 (20%), Positives = 143/386 (37%), Gaps = 79/386 (20%)

Query: 5 IGIDLGTTNSCVSIMEG----NTPKVIE-NAEGARTTPSIIAYMEDGEILVGAPAKRQAV 59
+ IDLGT N+ + + N P V+ + A + S+ A VG AK+
Sbjct: 13 LSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKSVAA--------VGHDAKQMLG 64

Query: 60 TNPRNTLYAVKRLIGRKFEEKEVQKDIGLMPYTISKADNGDAWVEVRDKKMAPPQISAEV 119
P N + A++ + +D +A ++ ++
Sbjct: 65 RTPGN-IAAIRPM---------------------------------KDGVIADFFVTEKM 90

Query: 120 LRK-MKKTAEDYLGEEVTEAVITVPAYFNDSQRQATKDAGRIAGLDVKRIINEPTAAALA 178
L+ +K+ + ++ VP +R+A +++ + AG +I EP AAA+
Sbjct: 91 LQHFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIG 150

Query: 179 FGLDKNEKGDRKIAVYDLGGGTFDISIIEIADVDGEKQFEVLSTNGDTFLGGEDFDQRII 238
GL +E V D+GGGT ++++I + V + +GG+ FD+ II
Sbjct: 151 AGLPVSE--ATGSMVVDIGGGTTEVAVISLNGV---------VYSSSVRIGGDRFDEAII 199

Query: 239 DYIIGEFKKESGVDLSKDVLALQRLKDAAEKAKIELSST----QQTEINLPYITADASGP 294
+Y+ + G + AE+ K E+ S + EI + P
Sbjct: 200 NYVRRNYGSLIG-------------EATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVP 246

Query: 295 KHLNLKITRAKLEALVEDLIARTIEPCRTAIKDAGVKVSDIHD--VILVGGMTRMPKVQE 352
+ L + LEAL E L + SDI + ++L GG + +
Sbjct: 247 RGFTLN-SNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDR 305

Query: 353 KVKEFFGKEARKDVNPDEAVAVGAAI 378
+ E G +P VA G
Sbjct: 306 LLMEETGIPVVVAEDPLTCVARGGGK 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS13240  UREASE  31  0.011  Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 30.9 bits (70), Expect = 0.011
Identities = 17/62 (27%), Positives = 25/62 (40%), Gaps = 9/62 (14%)

Query: 25 YGEIRDGAIAVRDGRIAWLGARA--------DLPAGARAEREHDGGGAWLTPGLIDCHTH 76
+ I I ++DGRIA +G + G E G G +T G +D H H
Sbjct: 80 HWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTE-VIAGEGKIVTAGGMDSHIH 138

Query: 77 LV 78
+
Sbjct: 139 FI 140


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS13270  PF06057  31  0.005  Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 30.6 bits (69), Expect = 0.005
Identities = 15/68 (22%), Positives = 25/68 (36%), Gaps = 9/68 (13%)

Query: 49 DIVFERETALNIGVQDYPALP----PDEMARHAD-VAVVLGGDGTLLGIGRHLAGA---- 99
F E A N+G+ P P + + + L GDG + + + G
Sbjct: 18 ANAFADEFADNLGLTLLPVEPSTQVNAASSHTKPPLVIFLSGDGGWATLDKAVGGILQQQ 77

Query: 100 SVPVIGVN 107
PV+G +
Sbjct: 78 GWPVVGWS 85


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS13275  CHANLCOLICIN  34  0.002  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 33.9 bits (77), Expect = 0.002
Identities = 47/227 (20%), Positives = 90/227 (39%), Gaps = 26/227 (11%)

Query: 163 RAWQAVVRLREAAEQQSREAQLERERVEWQVSELQKLAPQPGEWEEVQAEHHRLSHAASL 222
+A + + EAAE+ +EA+ R+ +E + +E ++ Q E + LS A
Sbjct: 134 KAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETER---QLKLAEAEEKRLAALSEEAKA 190

Query: 223 IEGTRAALDT-------LSEADSAVLTQLGAAVHGLQALAEIDPALADVLAALEPAQVQV 275
+E + L + + ++L +++H A + LA L A +
Sbjct: 191 VEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMK---TLAGKRNELAQASAKY 247

Query: 276 QEAVHSLARYADRAELDPDR----LAEVDARLQALHTM--ARKYRVAPET----LPAELT 325
+E + + + RA DP + R+ A +K A ET + A++T
Sbjct: 248 KELDELVKKLSPRAN-DPLQNRPFFEATRRRVGAGKIREEKQKQVTASETRINRINADIT 306

Query: 326 QRQAQLAALQAASDLDALQAQEAQTHAAYLQAAQALSRGRAKAAREL 372
Q Q A Q +++ +A A+ + +A L + K A +
Sbjct: 307 --QIQKAISQVSNNRNAGIARVHEAEENLKKAQNNLLNSQIKDAVDA 351


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS13285  SUBTILISIN  115  1e-30  Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 115 bits (289), Expect = 1e-30
Identities = 82/386 (21%), Positives = 131/386 (33%), Gaps = 87/386 (22%)

Query: 130 PNDPLFSTQNNLQSPTVVAGGINVASAWDITLGSSSLVVAVVDTGY-TDHPDLSGKILPG 188
N + I + W+ T G + VAV+DTG DHPDL +I+ G
Sbjct: 11 QVIKQEQQVNEIPRG---VEMIQAPAVWNQTRGRG-VKVAVLDTGCDADHPDLKARIIGG 66

Query: 189 YNFISDPARAGNSTGRGSDAHDTGDGVTSADVSAISGCTSSDIGNSTWHGTEVMSVLAAG 248
NF D D D HGT V A
Sbjct: 67 RNFTDD---------DEGDPEIFKDY--------------------NGHGTHVAGT-IAA 96

Query: 249 TNNALDIAGVGWNTRIVPVRTSGKCG-ALLSDTVDGMLWAGGISVSGVPANPNPARVINV 307
T N + GV ++ ++ K G + G+ + A +I++
Sbjct: 97 TENENGVVGVAPEADLLIIKVLNKQGSGQYDWIIQGIYY----------AIEQKVDIISM 146

Query: 308 SLGSVGSCSAAEQDAINRLAALGTVVVAAAGNEGSAVDA------PANCSGAIAVTAHVD 361
SLG +A+ + A +V+ AAGNEG D P + I+V A
Sbjct: 147 SLGGPED-VPELHEAVKKAVASQILVMCAAGNEGDGDDRTDELGYPGCYNEVISVGAINF 205

Query: 362 SGENASYANVGSQVALSAPGGGCANSQATSSGCTGPVSVIQADSNDGQYSLGNSVVKSVA 421
+ ++N ++V L APG I + G+Y+ + +
Sbjct: 206 DRHASEFSNSNNEVDLVAPGED-----------------ILSTVPGGKYA-------TFS 241

Query: 422 GTSFSTPEVAGTIALMLS-----VQSQLSNAQILAGLQQTARAHPSGTFCATRGGCGAGL 476
GTS +TP VAG +AL+ + L+ ++ A L + + G GL
Sbjct: 242 GTSMATPHVAGALALIKQLANASFERDLTEPELYAQLIKRTIP-----LGNSPKMEGNGL 296

Query: 477 LDTAGAVRYAQTTTPAGIGSGSSSSS 502
L ++ + S++S
Sbjct: 297 LYLTAVEELSRIFDTQRVAGILSTAS 322


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS13290  SUBTILISIN  130  2e-35  Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 130 bits (328), Expect = 2e-35
Identities = 81/359 (22%), Positives = 118/359 (32%), Gaps = 94/359 (26%)

Query: 172 NLTTAWDTTKGSNTVTVTVIDTGLLAGHADLSGATIQPGYDFVSSTAMTGTPDPVTNLSI 231
W+ T+G V V V+DTG A H DL I G +F
Sbjct: 30 QAPAVWNQTRGRG-VKVAVLDTGCDADHPDLKA-RIIGGRNFTD---------------- 71

Query: 232 PSGFVENDATAGRDSDPTDPGDWITTSDATNYPSFCGTSVTDSSWHGTFVTGLIAAQHNA 291
+ DP D + HGT V G IAA N
Sbjct: 72 -----------DDEGDPEIF--------------------KDYNGHGTHVAGTIAATENE 100

Query: 292 IGVAGVAPGVSVQMTRAIGKCG-GASSDILDALTWAAGGTVPGVTTNATPAKVINMSLGG 350
GV GVAP + + + + K G G I+ + +A +I+MSLGG
Sbjct: 101 NGVVGVAPEADLLIIKVLNKQGSGQYDWIIQGIYYAI----------EQKVDIISMSLGG 150

Query: 351 STACTTAQQSAITAARSLGATIVVATGNEFQNA----AIDAPASCSGTIAVTAHTLEGDI 406
+ A+ A + ++ A GNE + P + I+V A +
Sbjct: 151 PEDVPELHE-AVKKAVASQILVMCAAGNEGDGDDRTDELGYPGCYNEVISVGAINFDRHA 209

Query: 407 ANYANVGTGTTLSAPGGGNGSTATGLGALVPSTSNAGTTTASTDTYVGEEGTSMSTPQVA 466
+ ++N L APG +T Y GTSM+TP VA
Sbjct: 210 SEFSNSNNEVDLVAPGEDI------------------LSTVPGGKYATFSGTSMATPHVA 251

Query: 467 GVAALMLSVNAA-----LTPDQIKTILQTSSRPFPPDTFCTTHAGVCGAGMLDAGSAIA 520
G AL+ + A LT ++ L + P + G G+L +
Sbjct: 252 GALALIKQLANASFERDLTEPELYAQLIKRTIPLGNSPK------MEGNGLLYLTAVEE 304


50  RS_RS13345  RS_RS13395  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
RS_RS13345  3  11  2.729671  LysR family transcriptional regulator
RS_RS13350  2  13  2.580897  FAD-binding protein
RS_RS13355  2  12  2.338527  FAD-binding protein
RS_RS13360  1  10  1.255519  glycolate oxidase
RS_RS13365  -1  11  -0.074814  murein hydrolase transporter LrgA
RS_RS13370  1  10  -0.948094  membrane protein
RS_RS13375  1  10  -1.561017  glyoxalase
RS_RS13380  1  9  -1.470769  cupin
RS_RS13385  2  10  -1.373381  DNA-binding transcriptional regulator
RS_RS13390  3  11  -1.775200  glutathione peroxidase
RS_RS13395  3  12  -1.654194  type 4 fimbrial biogenesis PilY1 signal peptide
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS13385  ARGREPRESSOR  30  0.005  Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 29.8 bits (67), Expect = 0.005
Identities = 16/60 (26%), Positives = 27/60 (45%), Gaps = 5/60 (8%)

Query: 6 RRADRLFQIAQILRGRRLTTAAMLADRL-----GVSERTVYRDIRDLSVSGVPIEGEAGI 60
+ R +I +I+ + T L D L V++ TV RDI++L + VP +
Sbjct: 2 NKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIKELHLVKVPTNNGSYK 61


51  RS_RS13455  RS_RS13545  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
RS_RS13455  3  13  -1.850051  ferritin
RS_RS13460  2  19  -1.808518  membrane protein
RS_RS13465  3  18  -1.515887  hypothetical protein
RS_RS13470  3  17  -1.825282  LysR family transcriptional regulator
RS_RS13475  5  20  -2.145031  hypothetical protein
RS_RS13480  3  23  -2.475436  hypothetical protein
RS_RS13485  2  23  -2.651775  usher protein
RS_RS13490  2  26  -3.620777  phytochrome sensor protein
RS_RS13495  3  32  -5.505098  spore coat protein U
RS_RS13500  1  39  -6.807395  spore coat protein U
RS_RS13505  1  44  -7.599435  transglutaminase
RS_RS13510  0  42  -8.050199  hypothetical protein
RS_RS13515  1  41  -8.357205  hypothetical protein
RS_RS13520  -1  30  -4.527128  hypothetical protein
RS_RS13525  -3  21  -2.098925  hypothetical protein
RS_RS13530  -2  16  -1.034957  translation initiation factor 1
RS_RS13535  -3  16  -0.442468  transposase
RS_RS13540  0  11  3.282819  transposase
RS_RS13545  1  11  3.719836  alanine racemase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS13455  HELNAPAPROT  152  3e-50  Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 152 bits (386), Expect = 3e-50
Identities = 46/150 (30%), Positives = 76/150 (50%), Gaps = 1/150 (0%)

Query: 18 IAEKDRKK-IAEGLSRLLADTYSLYLKTHNFHWNVTGPMFNTLHLMFETQYNELWTAVDS 76
K + + L+ L++ + LY K H FHW V GP F TLH FE Y+ VD+
Sbjct: 4 ENAKTNQTLVENSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDT 63

Query: 77 VAERIRALGYPAPGSYSEFARLSSIPEAKGVPAAEDMIRELVAGHESVTRTARSLFPDVD 136
+AER+ A+G + E+ +SI + +A +M++ LV ++ ++ ++ + +
Sbjct: 64 IAERLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVNDYKQISSESKFVIGLAE 123

Query: 137 KAADEPTADLLTQRMDIHEKTAWMLRSLLA 166
+ D TADL ++ EK WML S L
Sbjct: 124 ENQDNATADLFVGLIEEVEKQVWMLSSYLG 153


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS13485  PF00577  344  e-108  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 344 bits (884), Expect = e-108
Identities = 150/800 (18%), Positives = 260/800 (32%), Gaps = 92/800 (11%)

Query: 27 LNQQGLDDTVLLLKTKGGELYA----SAEDLKRWRLRQPEATP--LAHNNEVFYPLKAIT 80
LN + + T E + L L + L ++ I
Sbjct: 84 LNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIH 143

Query: 81 GLSYQLDETLLTVAVTAPAQAFLPNAFDEANKPVALSLKPGFGGFLNYDLLAEHATGQT- 139
+ QLD + +T P QAF+ N P L G LNY+ +
Sbjct: 144 DATAQLDVGQQRLNLTIP-QAFMSNRARGYIPP-ELWDPGINAGLLNYNFSGNSVQNRIG 201

Query: 140 RSSGLFEAGVFNGLGVGVASF----------VANSGDTEHRLIRLETTWTKDLPDHLSSV 189
+S + +GL +G +S ++++ + T +D+ S +
Sbjct: 202 GNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRL 261

Query: 190 RLGDSISRAGAWGRSVRFGGIQYSTNFAVQPGFVALPLQSIGGQAALPSTVDVYVNNVLS 249
LGD ++ + + F G Q +++ + P I G A + V + N
Sbjct: 262 TLGDGYTQGDIFD-GINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDI 320

Query: 250 SRKDVAPGPFSITNVPVVTGQGDVRVVVRDLLGREQVISVPFYASASLLAKGLHDFSYEA 309
V PGPF+I ++ GD++V +++ G Q+ +VP+ + L +G +S A
Sbjct: 321 YNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITA 380

Query: 310 GAERRNFGTDSNDYGRAFAAATHRLGFTDWLTGEVHAESQLERQTVGLGTVMLAPSVGVF 369
G R F +T G T + + G ++G
Sbjct: 381 GEYRSGNAQQEKPR---FFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGAL 437

Query: 370 SAAAAASRGRQGSGGLV---SVGFERQASPVSVGVHAQLASPRF---------------- 410
S + SV F S G + QL R+
Sbjct: 438 SVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRM 497

Query: 411 ----------------TQLGLELDMPAPRLLSSVNVGMALPGGGSLGLAYVYLD--NRDG 452
R + V L +L L+ +
Sbjct: 498 NGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSN 557

Query: 453 HPTQLASASYGTQLGRFGF-ISLALSKTLHQPGGLM-VGLNWTLPLGERTTTSINMTQQQ 510
Q A T + +S +L+K Q G + LN +P + +
Sbjct: 558 VDEQF-QAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRH 616

Query: 511 ------------GRTEVQAQVQQGLPPGDGFGYSVRAS--------QLGAWDAALNAQNR 550
GR A V L + YSV+ A LN +
Sbjct: 617 ASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGG 676

Query: 551 VGTYLLEAAGDRGQTGVRMGASGGIAILD-GLHLSRRISDSFAVVKVPDFPNVRIYADNQ 609
G + + + G SGG+ G+ L + ++D+ +VK P + ++ +NQ
Sbjct: 677 YGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKV--ENQ 734

Query: 610 LVAKTNASGEALIPRLRAYEKNAISLEQLDLPFDAKVGTLSLDAVPYYRSGMVLDFPVTH 669
+T+ G A++P Y +N ++L+ L + + + VP + + +F
Sbjct: 735 TGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARV 794

Query: 670 AHDALMTLKLENGSDLPAGATVQISGQKEAFPVGNDGVVYLTGLAAQNRLQANW---RGQ 726
LMTL N LP GA V + + V ++G VYL+G+ ++Q W
Sbjct: 795 GIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENA 853

Query: 727 SC--DISVPFTPGADPLPDL 744
C + +P L L
Sbjct: 854 HCVANYQLPPESQQQLLTQL 873


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS13550  ALARACEMASE  32  0.004  Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 31.7 bits (72), Expect = 0.004
Identities = 37/229 (16%), Positives = 70/229 (30%), Gaps = 59/229 (25%)

Query: 11 ALIDVARMTRNIERMQQRLDALGVRFRPHVKTTKCEPVVRAQLAAGAQGITVSTLKEAEQ 70
A +D+ + +N+ ++Q H + V + + A A G + + A
Sbjct: 7 ASLDLQALKQNLSIVRQAA--------THAR-------VWSVVKANAYGHGIERIWSA-- 49

Query: 71 FFSAGICDIVYAVGMAPARLPQALALRRRGCDLKIV----ADSVTCAQAIAAF------- 119
I G A L +A+ LR RG I+ +
Sbjct: 50 --------IGATDGFALLNLEEAITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVH 101

Query: 120 GGEHGEAF---------DVWIEVDVDGHRSGIAPDDDRLITVGRVLAEGGMRLGGVLAHA 170
+A D++++V+ +R G P D L ++ A + +++H
Sbjct: 102 SNWQLKALQNARLKAPLDIYLKVNSGMNRLGFQP-DRVLTVWQQLRAMANVGEMTLMSH- 159

Query: 171 GSSYEYHAHDALVRIAELERSGCVRAAERLRAAGLPCETVSIGSTPTAL 219
+ A A GL C S+ ++ L
Sbjct: 160 -----------FAEAEHPDGISGAMARIEQAAEGLECR-RSLSNSAATL 196


52  RS_RS14030  RS_RS14110  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
RS_RS14030  3  10  2.704483  two-component system sensor histidine kinase
RS_RS14035  3  10  2.226084  hypothetical protein
RS_RS14040  4  10  1.387214  ABC transporter permease
RS_RS14050  0  9  0.554118  signal recognition particle protein
RS_RS14055  -1  13  0.104382  hypoxanthine phosphoribosyltransferase
RS_RS14060  -1  12  -0.247960  membrane protein
RS_RS14065  -1  9  0.079349  hypothetical protein
RS_RS14070  -1  9  0.557326  lytic transglycosylase
RS_RS14075  -1  11  0.178378  proline--tRNA ligase
RS_RS14080  0  11  -0.340573  RNA pyrophosphohydrolase
RS_RS14085  2  12  -1.042100  lipoprotein
RS_RS14090  0  13  -1.045641  gamma-glutamyl kinase
RS_RS14095  -3  13  -1.854363  GTPase Obg
RS_RS14100  -1  11  -3.520928  50S ribosomal protein L27
RS_RS14105  -2  11  -4.473819  50S ribosomal protein L21
RS_RS14110  -3  10  -3.709883  octaprenyl diphosphate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS14055  PilS_PF08805  31  0.006  PilS N terminal
		>PilS_PF08805#PilS N terminal

Length = 185

Score = 31.1 bits (70), Expect = 0.006
Identities = 12/52 (23%), Positives = 17/52 (32%), Gaps = 6/52 (11%)

Query: 430 QEVNRLLNQFEQMQTMMKKLKGGG------MMKMMRAMGGLKGGMKGLLPGG 475
+ + N + MK LK G +K + A G L M G
Sbjct: 58 IQSSNEQNNVLTVIANMKSLKFQGRYTDSNYIKTLYAQGLLPSDMIADTTGA 109


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS14095  CARBMTKINASE  37  8e-05  Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 37.1 bits (86), Expect = 8e-05
Identities = 32/130 (24%), Positives = 47/130 (36%), Gaps = 10/130 (7%)

Query: 130 GLGVVPIINENDTVVTDEIKFGDNDTLGALVTNLIEGDALIILTDQRGLYTADPRKHPDA 189
G G VP+I E+ + E D D G + + D +ILTD G
Sbjct: 193 GGGGVPVILEDGEIKGVEAVI-DKDLAGEKLAEEVNADIFMILTDVNGAAL-YYGT-EKE 249

Query: 190 RFVDEAQAGAPELEAMAGGAGSSIGKGGMLTKIVAAKRAAKSGAHTVIASGREADVLARL 249
+++ E + EL G M K++AA R + G I + L
Sbjct: 250 QWLREVKVE--ELRKYY--EEGHFKAGSMGPKVLAAIRFIEWGGERAII-AHLEKAVEAL 304

Query: 250 AGGEAIGTQL 259
G GTQ+
Sbjct: 305 EG--KTGTQV 312


53  RS_RS14285  RS_RS14310  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
RS_RS14285  -1  9  3.076373  MFS transporter
RS_RS14290  0  9  3.335584  long-chain fatty acid--CoA ligase
RS_RS14295  2  10  4.441044  aminopeptidase
RS_RS14300  2  11  4.069028  molybdopterin oxidoreductase
RS_RS14305  1  10  3.873020  LacI family transcriptional regulator
RS_RS14310  1  10  3.642322  PTS fructose transporter subunit IIA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS14285  TCRTETA  33  0.003  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 32.9 bits (75), Expect = 0.003
Identities = 27/116 (23%), Positives = 46/116 (39%), Gaps = 11/116 (9%)

Query: 292 LLCLAVVFMALATPLSAWASDRWGRKPVLLTGILAAILSGFTMAPLLGSGSTPLVALFLI 351
LL L + P+ SDR+GR+PVLL + A + MA + P + + I
Sbjct: 48 LLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMA------TAPFLWVLYI 101

Query: 352 LELF--LMGVTFAPMGALLPELFPTH--VRYTG-AGVSYNLGGILGASIAPYIAQV 402
+ + G T A GA + ++ R+ G + G + G + +
Sbjct: 102 GRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGF 157


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
RS_RS14310  PHPHTRNFRASE  601  0.0  Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 601 bits (1550), Expect = 0.0
Identities = 213/564 (37%), Positives = 314/564 (55%), Gaps = 10/564 (1%)

Query: 273 AIGGIGASPGLAIGTVHV-MQPGVSAIPDHPVPLATGGDLLDRALADTRAELAALARDTA 331
I GI AS G+AI + ++P V ++T + L AL ++ EL A+ T
Sbjct: 4 KITGIAASSGVAIAKAFIHLEPNVDIEKTSITDVSTEIEKLTAALEKSKEELRAIKDQTE 63

Query: 332 ARLGEAEAGIFKAQAELLGDTDLMT-LTCQAMVEGHGVAWSWHHAVERLAERLAALGNPL 390
A +G +A IF A +L D +L+ + + E ++ + ++ N
Sbjct: 64 ASMGADKAEIFAAHLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDNEY 123

Query: 391 LAARAADLRDVGRRVLGHLDPALRGTAQALPDGPCILVAQDLSPSDTAALDTRRIAGIVT 450
+ RAAD+RDV +RVLGHL G+ + + +++A+DL+PSDTA L+ + + G T
Sbjct: 124 MKERAADIRDVSKRVLGHLIGVETGSLATIAE-ETVIIAEDLTPSDTAQLNKQFVKGFAT 182

Query: 451 AQGGPTSHTAILTRTLGIPAVVAAGPAVLDVASGTQAIVDGSGGQLYLDPDAAAVAGAEA 510
GG TSH+AI++R+L IPAVV + G IVDG G + ++P V E
Sbjct: 183 DIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYEE 242

Query: 511 WLREDAARAQREQAERGLPARTRDGHAVEIAANVNLPAQAIEAVTLGAEGVGLMRTEFLF 570
+ Q G P+ T+DG VE+AAN+ P + G EG+GL RTEFL+
Sbjct: 243 KRAAFEKQKQEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFLY 302

Query: 571 LERDHAPDEDAQHEVYAAMLGALGGRPLIVRTLDIGGDKQVPHLNLPKEENPFLGVRGAR 630
++RD P E+ Q E Y ++ + G+P+++RTLDIGGDK++ +L LPKE NPFLG R R
Sbjct: 303 MDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYLQLPKELNPFLGFRAIR 362

Query: 631 LLLRRPDLMEPQLRALYRAASGGGPLSIMFPMITSLGEVIALRAACERIRAELDAPAVP- 689
L L + D+ QLRAL R AS G L +MFPMI +L E+ +A + + +L + V
Sbjct: 363 LCLEKQDIFRTQLRALLR-ASTYGNLKVMFPMIATLEELRQAKAIMQEEKDKLLSEGVDV 421

Query: 690 -----LGIMVEVPAAALLADQLAEHVDFFSIGTNDLTQYTLAIDRQHPDLAAEADSLHPA 744
+GIMVE+P+ A+ A+ A+ VDFFSIGTNDL QYT+A DR + ++ HPA
Sbjct: 422 SDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPYHPA 481

Query: 745 VLRLIQLTVQGAARHGRWVGVCGGLAGDPFGALLLTGLGVHELSMSPRDIPAVKARLRDA 804
+LRL+ + ++ A G+WVG+CG +AGD LL GLG+ E SMS I +++L
Sbjct: 482 ILRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQLLKL 541

Query: 805 HVRQLTELAGQALACASAEDVRAL 828
+L A +AL +AE+V L
Sbjct: 542 SKEELKPFAQKALMLDTAEEVEQL 565


54  RS_RS14640  RS_RS14680  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
RS_RS14640  0       13       -4.495014   two-component system sensor histidine kinase
RS_RS14645  2       11       -5.548345   acetyltransferase
RS_RS14655  2       12       -5.304455   *peptidase
RS_RS14660  1       13       -5.030903   transcription modulator protein
RS_RS14665  1       10       -4.282636   cytochrome C
RS_RS14670  2       15       -4.617581   cytochrome B
RS_RS14675  3       19       -3.653833   ubiquinol-cytochrome C reductase (iron-sulfur
RS_RS14680  2       15       -2.602607   large-conductance mechanosensitive channel
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
RS_RS14665  STREPTOPAIN  31     0.004    Streptopain (C10) cysteine protease family signature.
		>STREPTOPAIN#Streptopain (C10) cysteine protease family signature.

Length = 398

Score = 31.2 bits (70), Expect = 0.004
Identities = 21/78 (26%), Positives = 36/78 (46%), Gaps = 12/78 (15%)

Query: 3 KLLAIFALAGFMIAAPVFANEGGVRLDPAPNQS-----EDLSALQRGAKL-------FVN 50
+LL++ AL GF++A PVFA++ R + S + +A++ GA+ VN
Sbjct: 9 RLLSLLALGGFVLANPVFADQNFARNEKEAKDSAITFIQKSAAIKAGARSAEDIKLDKVN 68

Query: 51 YCLNCHGASAMRYNRLRD 68
G++ YN
Sbjct: 69 LGGELSGSNMYVYNISTG 86


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
RS_RS14680  MECHCHANNEL  118    3e-37    Bacterial mechano-sensitive ion channel signature.
		>MECHCHANNEL#Bacterial mechano-sensitive ion channel signature.

Length = 136

Score = 118 bits (296), Expect = 3e-37
Identities = 62/143 (43%), Positives = 89/143 (62%), Gaps = 13/143 (9%)

Query: 1 MALMQDFKKFAMRGNVIDLAVGVIIGAAFGKIVDSLVNDLIMPLVARIVGKLDFSNLFIQ 60
M+++++F++FAMRGNV+DLAVGVIIGAAFGKIV SLV D+IMP + ++G +DF +
Sbjct: 1 MSIIKEFREFAMRGNVVDLAVGVIIGAAFGKIVSSLVADIIMPPLGLLIGGIDFKQFAVT 60

Query: 61 LADAPAGVPQTLADLKKAGVPVFAYGNFITVAVNFLILAFIVFLMVRAITRVIDTNPPPA 120
L DA +P V YG FI +FLI+AF +F+ ++ I ++ PA
Sbjct: 61 LRDAQGDIPAV----------VMHYGVFIQNVFDFLIVAFAIFMAIKLINKLNRKKEEPA 110

Query: 121 DTP---ENTLLLRDIRDSLKSKN 140
P + +LL +IRD LK +N
Sbjct: 111 AAPAPTKEEVLLTEIRDLLKEQN 133


55  RS_RS15400  RS_RS15540  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
RS_RS15400  0       14       3.015993    peptidase
RS_RS15405  1       14       3.150245    mechanosensitive ion channel protein MscS
RS_RS15410  1       15       3.017102    MFS transporter
RS_RS15415  2       15       2.309731    glutathione S-transferase
RS_RS15420  1       18       1.258192    hypothetical protein
RS_RS15425  2       19       2.661495    hypothetical protein
RS_RS15430  2       17       2.648238    hypothetical protein
RS_RS15435  4       17       3.140038    hypothetical protein
RS_RS15440  3       16       2.811236    cation transporter
RS_RS15445  2       23       2.790684    histidine kinase
RS_RS15450  3       20       2.621281    PAS domain-containing two-component system
RS_RS15455  1       24       1.834095    MerR family transcriptional regulator
RS_RS15460  0       24       1.569710    hypothetical protein
RS_RS15465  -3      32       0.122694    cytochrome C
RS_RS15470  -3      33       0.348652    *hypothetical protein
RS_RS15475  -2      14       0.147318    hypothetical protein
RS_RS15490  -2      13       -1.612024   high-potential iron-sulfur protein
RS_RS15495  0       16       -2.443415   two-component system response regulator
RS_RS15500  3       14       -3.160716   ABC transporter substrate-binding protein
RS_RS15505  3       15       -3.087029   transposase
RS_RS15510  2       12       -2.970705   transposase
RS_RS15515  2       12       -3.058444   membrane protein
RS_RS15520  1       14       -1.788531   hypothetical protein
RS_RS15525  0       12       -0.283973   glutamate carboxypeptidase
RS_RS15530  1       10       0.945711    DNA repair protein
RS_RS15535  1       10       0.755567    hypothetical protein
RS_RS15540  2       9        0.686778    potassium transporter
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS15415  TCRTETB  53     8e-10    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 53.3 bits (128), Expect = 8e-10
Identities = 71/376 (18%), Positives = 125/376 (33%), Gaps = 54/376 (14%)

Query: 63 MPLLGHDFGLSPAQTSLVLSATTMLLAFAILFAGLLSESIHRKTLMGVSLVLSSALTVAA 122
+P + +DF PA T+ V +A + + G LS+ + K L+ ++++ +V
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 123 AVLPGWQGLL-ATRAVLGVVLGGVPAVAMAYLAEEVHPQGLGMAMGLYVGGTAFGGMAGR 181
V + LL R + G PA+ M +A + + G A GL A G G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 182 VLTGMVADYAGW---------------------------RVAMGFIGALCLLAALAFAWL 214
+ GM+A Y W + G + + + F L
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFML 216

Query: 215 LPPSRRFA------------PRRGVALADLFDTLAEHLREPGLRALFAMAFLLMG---GF 259
S + + + D F P + + ++ G GF
Sbjct: 217 FTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVL-CGGIIFGTVAGF 275

Query: 260 VTIYNYASYRLLSAPYALSHSAVGAIFM--VYLLGIGASTWFGRLADRHGRGRVLLAGTA 317
V++ Y ++ + LS + +G++ + + I G L DR G VL G
Sbjct: 276 VSMVPY----MMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVT 331

Query: 318 AMSAGVLMTLAMPLG--LVIGGIALLTFGFFGAHA-VASGWVGRRARRAK-GQAAALYLL 373
+S L + + I + G V S V ++ + G +L
Sbjct: 332 FLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNF 391

Query: 374 AYYLGSSIAGTAGGKL 389
+L G L
Sbjct: 392 TSFLSEGTGIAIVGGL 407


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
RS_RS15455  HTHFIS  47     2e-08    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 47.1 bits (112), Expect = 2e-08
Identities = 21/102 (20%), Positives = 41/102 (40%), Gaps = 9/102 (8%)

Query: 1 MQAPVRFAIADDHPGVVAAVRHLVCRVEGFEVVGEATSADGLLALLGRVHCDVVITDYAM 60
M +ADD + + + R G++V ++A L + D+V+TD M
Sbjct: 1 MTGA-TILVADDDAAIRTVLNQALSR-AGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVM 57

Query: 61 PASRHGDGIVLLEFLLRRHPGVRVVVLTMLETHAVVDRMLRA 102
P + LL + + P + V+V++ ++A
Sbjct: 58 P---DENAFDLLPRIKKARPDLPVLVMS---AQNTFMTAIKA 93


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
RS_RS15460  HTHFIS  59     4e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 59.5 bits (144), Expect = 4e-11
Identities = 31/119 (26%), Positives = 51/119 (42%), Gaps = 6/119 (5%)

Query: 736 HILVAEDHPINRELITRQLRLLGYRVTLAEDGAAALQRLCETRFDALLTDCCMPRLDGFD 795
ILVA+D R ++ + L GY V + + A + + D ++TD MP + FD
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 796 LARQIRQREQQAPGGRRLPILAITATTLAEEHARCRTVGMDGYVLKPTTLATLQDALSR 854
L +I++ LP+L ++A + G Y+ KP L L + R
Sbjct: 65 LLPRIKKARP------DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
RS_RS15505  HTHFIS  84     5e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.5 bits (209), Expect = 5e-21
Identities = 39/122 (31%), Positives = 58/122 (47%), Gaps = 1/122 (0%)

Query: 10 PTVLVIDDEPHIRRFVRAALEAEGCEVFEADRVARGLIEAGTRQPDAVILDLGLPDGDGM 69
T+LV DD+ IR + AL G +V A D V+ D+ +PD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 70 SLIRDLRT-WTEVPVLVLSARVDERDKIDALDAGADDYLTKPFGVGELIARLRVLLRRHA 128
L+ ++ ++PVLV+SA+ I A + GA DYL KPF + ELI + L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 129 KR 130
+R
Sbjct: 124 RR 125


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS15525  ECOLNEIPORIN  66     3e-14    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 66.4 bits (162), Expect = 3e-14
Identities = 56/234 (23%), Positives = 86/234 (36%), Gaps = 25/234 (10%)

Query: 1 MKKSAILVAVGALFAGSAYAQSSVTLYGIVDATIHYTTNANQAG---NSLLRMDNGAVSN 57
MKKS I + + AL A + VTLYG + A + + + G S+
Sbjct: 1 MKKSLIALTLAALPVA---AMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLG 57

Query: 58 SRWGLKGSEDLGGGNKALFVLESGFDPDTGRVNSGGLFNRQSFVGLSNKDYGTLTMGRQY 117
S+ G KG EDLG G KA++ +E G NRQSF+GL +G L +GR
Sbjct: 58 SKIGFKGQEDLGNGLKAIWQVEQKASIAGT---DSGWGNRQSFIGLK-GGFGKLRVGRLN 113

Query: 118 NFGFTMGGNFDPLGVGNYDENSWLYYGVTGLRVSNML---KYEG-KWNGLYVGLGYGFGE 173
D + +D S L +Y+ ++ GL + Y +
Sbjct: 114 ------SVLKDTGDINPWDSKSDYLGVNKIAEPEARLISVRYDSPEFAGLSGSVQYALND 167

Query: 174 QAGSTANNRYMGGAVSYEFGPALIGAFYQQ----QQDSTAAGNKQKVWGIGGNY 223
AG + Y +Y+ G + Q K ++ + Y
Sbjct: 168 NAGRHNSESY-HAGFNYKNGGFFVQYGGAYKRHHQVQENVNIEKYQIHRLVSGY 220


56  RS_RS15590  RS_RS15625  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
RS_RS15590  2       12       3.363543    type II secretion system protein GspG
RS_RS15595  4       12       3.669680    type II secretion system protein GspH
RS_RS15600  3       11       3.939955    type II secretion system protein GspI
RS_RS15605  4       12       4.292045    general secretion pathway protein GspJ
RS_RS15610  3       12       4.507025    general secretion pathway protein GspK
RS_RS15615  1       12       3.530065    general secretion pathway protein GspL
RS_RS15620  1       11       3.406765    general secretion pathway protein GspM
RS_RS15625  1       13       3.308398    general secretion pathway protein GspN
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS15600  BCTERIALGSPG  201    1e-69    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 201 bits (512), Expect = 1e-69
Identities = 67/135 (49%), Positives = 89/135 (65%), Gaps = 3/135 (2%)

Query: 21 RGFTLIEIMVVVVILGILAALVVPKIMSRPDEARIIAAKQDIASISQALKLYRLDNGRYP 80
RGFTL+EIMVV+VI+G+LA+LVVP +M ++A A DI ++ AL +Y+LDN YP
Sbjct: 8 RGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDNHHYP 67

Query: 81 TTEQGLAALVTKPTTEPVPNNWKGGGYLERLPKDPWGHPYQYLNPGVRGEVDIFSYGADG 140
TT QGL +LV PT P+ N+ GY++RLP DPWG+ Y +NPG G D+ S G DG
Sbjct: 68 TTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLLSAGPDG 127

Query: 141 QPGGTANDADIGNWD 155
+ G + DI NW
Sbjct: 128 EMGT---EDDITNWG 139


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS15605  BCTERIALGSPH  53     3e-11    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 52.6 bits (126), Expect = 3e-11
Identities = 23/75 (30%), Positives = 39/75 (52%), Gaps = 6/75 (8%)

Query: 14 RRGFTLLELLVVVVIIGIVLGVVAVNATPNPRSQLADDAQKLARL---IELAQEEAQLTA 70
+RGFTLLE++++++++G+ G+V + A P R AQ LAR + Q+ T
Sbjct: 3 QRGFTLLEMMLILLLMGVSAGMV-LLAFPASRDD--SAAQTLARFEAQLRFVQQRGLQTG 59

Query: 71 RPVAWEGDAQGWRFL 85
+ W+FL
Sbjct: 60 QFFGVSVHPDRWQFL 74


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS15610  BCTERIALGSPG  34     7e-05    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 33.7 bits (77), Expect = 7e-05
Identities = 25/94 (26%), Positives = 38/94 (40%), Gaps = 11/94 (11%)

Query: 8 RRAARRHGFTLLEVLVALTIVAVALTATM-RAMGSMTVASESLQTRMLATWSAENHLAGL 66
R ++ GFTLLE++V + I+ V + + MG+ A + + EN L
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVA--LENALDMY 59

Query: 67 RL-AHAYPEPGMRGFACPQGDAQLWCEETVAPTP 99
+L H YP QG L T+ P
Sbjct: 60 KLDNHHYP-------TTNQGLESLVEAPTLPPLA 86


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS15615  BCTERIALGSPG  37     2e-05    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 36.8 bits (85), Expect = 2e-05
Identities = 27/75 (36%), Positives = 39/75 (52%), Gaps = 7/75 (9%)

Query: 1 MRAISSDRVRGFTLLELLVAITLLAILAVLAWRGLDSMTRTHEALAQRDE-RIEALKTAY 59
MRA +D+ RGFTLLE++V I ++ +LA L L M +A Q+ I AL+ A
Sbjct: 1 MRA--TDKQRGFTLLEIMVVIVIIGVLASLVVPNL--MGNKEKADKQKAVSDIVALENAL 56

Query: 60 AQFDADCTQLADPST 74
+ D P+T
Sbjct: 57 DMYKLDNHHY--PTT 69


57  RS_RS15705  RS_RS15760  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
RS_RS15705  -1      13       3.047263    ethanolamine ammonia-lyase light chain
RS_RS15710  -1      13       2.761458    ethanolamine ammonia lyase large subunit
RS_RS15715  -2      12       2.562402    aldehyde dehydrogenase
RS_RS15720  2       9        3.831320    Fis family transcriptional regulator
RS_RS15725  0       9        2.728715    alcohol dehydrogenase
RS_RS15730  1       11       3.019980    lactate dehydrogenase
RS_RS15735  1       12       3.446506    DNA-binding protein
RS_RS15740  1       12       3.263562    N-acetyltransferase
RS_RS15745  1       12       3.239625    MFS transporter
RS_RS15750  -2      12       2.812906    ferritin
RS_RS15755  -1      11       3.180360    chemotaxis protein
RS_RS15760  1       11       3.250939    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
RS_RS15720  HTHFIS  340    e-112    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 340 bits (873), Expect = e-112
Identities = 143/430 (33%), Positives = 211/430 (49%), Gaps = 51/430 (11%)

Query: 251 IRAANQSALNLLGRTRQSLLGQPV-----ETVFDLTADALLARATDSGGVAWPLHTHQGR 305
+ +++A +LL R +++ PV + F A A D + P +
Sbjct: 55 VVMPDENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDY--LPKPFDLTELI 112

Query: 306 LLFGLLRAPRPRPTADAAPASPPMPATPGLCLQADGLRTGFHRALRVFEHDVPLLLHGET 365
G++ P + L ++ ++ + R+ + D+ L++ GE+
Sbjct: 113 ---GIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGES 169

Query: 366 GTGKEAFARAVHQASSRAAQPFVAVNCAAIPETLIESELFGYRGGSFTGARREGMRGKLQ 425
GTGKE ARA+H R PFVA+N AAIP LIESELFG+ G+FTGA+ G+ +
Sbjct: 170 GTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRST-GRFE 228

Query: 426 QADGGTLFLDEIGDMPLALQSRLLRVLEERAVVPIGG-EAQTVDVRIVSASHRDMEARVR 484
QA+GGTLFLDEIGDMP+ Q+RLLRVL++ +GG DVRIV+A+++D++ +
Sbjct: 229 QAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSIN 288

Query: 485 DGRFREDLYYRLNGLRITLPPLRERADKAALLAHVLAEESR---GRPARLDDDARDALLA 541
G FREDLYYRLN + + LPPLR+RA+ L +++ R D +A + + A
Sbjct: 289 QGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKA 348

Query: 542 QPWPGNVRQLRNVLRTLVALSDDGRIRLRDLPPELRPPAMASAAAPAP------------ 589
PWPGNVR+L N++R L AL I + ELR S A
Sbjct: 349 HPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAV 408

Query: 590 ------------------------LDNAEKAALLAALQAQQWRMTHAAKALGISRNTLYR 625
L E +LAAL A + AA LG++RNTL +
Sbjct: 409 EENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRK 468

Query: 626 KLRKHGIARP 635
K+R+ G++
Sbjct: 469 KIRELGVSVY 478


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS15725  DHBDHDRGNASE  28     0.044    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 28.1 bits (62), Expect = 0.044
Identities = 29/109 (26%), Positives = 44/109 (40%), Gaps = 13/109 (11%)

Query: 169 GQWIAISGIG-GLGHVAVQYAVAMGLHVVAVDVAPEKLALARELGARLSVDASQ-----Q 222
G+ I+G G+G + + G H+ AVD PEKL + A +
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVR 67

Query: 223 DPAAV------IQKEVGGV-HGVLVTAVSRSAFAQALGMVRRGGTVSLN 264
D AA+ I++E+G + V V V R +L T S+N
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVN 116


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS15735  HTHTETR  27     0.042    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 27.3 bits (60), Expect = 0.042
Identities = 18/156 (11%), Positives = 48/156 (30%), Gaps = 15/156 (9%)

Query: 18 ERIARRVRDLRAARGY---TLDALAARCGVSRSMISLIERGAASPTAVVLDKLAAGLGVS 74
+ I L + +G +L +A GV+R I + + + ++ +
Sbjct: 14 QHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSD----LFSEIWELSESN 69

Query: 75 LASLFGGEREGVPAQPLMRRAQQAQWRDPASGYVRRNLSPPDWPSPIQLVEVNFPAGARV 134
+ L + P PL + R+ + ++ ++++ +
Sbjct: 70 IGELELEYQAKFPGDPL------SVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEM 123

Query: 135 AYETGGRENAMQQQVWVIDGRIDVMLGDQRHELHPG 170
A + N + I+ + + + L
Sbjct: 124 AVVQQAQRNLCLESYDRIEQTLKHCI--EAKMLPAD 157


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS15755  RTXTOXIND  30     0.031    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.031
Identities = 31/264 (11%), Positives = 67/264 (25%), Gaps = 29/264 (10%)

Query: 325 DELEADLVSARNRFLLLGLALGALLAGGLYWMLRRAVSAPLAEVAGVARRVAAGDLTHR- 383
LE R L+ + L +V + VA ++ +
Sbjct: 44 AHLELIETPVSRRPRLVAYFIMGFLVIAFIL----SVLGQVEIVATANGKLTHSGRSKEI 99

Query: 384 --FSGTRRDEI----------GQLMHAINGLGDGLSGIVDKVRASASTIASSTGQIAAGN 431
+ EI G ++ + LG + + + + + QI + +
Sbjct: 100 KPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRS 159

Query: 432 VDLSARTEAQAGNLERTASSIEQLAATVRQNADSAQHAHDMVQSASEAANAGGRTVARLV 491
++L+ E + + + E+ + + E L
Sbjct: 160 IELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELN---------LD 210

Query: 492 GTMSGIHTTAQKIADITGIIDGIAFQTNI---LALNAAVEAARAGEQGRGFAVVAGEVRS 548
+ T +I + + + L A+ EQ + E+R
Sbjct: 211 KKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRV 270

Query: 549 LAQRSAAAAKEIKELISRSVQEVQ 572
+ EI Q
Sbjct: 271 YKSQLEQIESEILSAKEEYQLVTQ 294


58  RS_RS15805  RS_RS15850  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
RS_RS15805  5       20       1.831028    hypothetical protein
RS_RS15810  3       23       0.359214    amino acid-binding protein
RS_RS15815  4       31       -4.715166   membrane protein
RS_RS15820  4       30       -4.730163   transcription regulator protein
RS_RS15825  4       28       -4.291187   hypothetical protein
RS_RS15830  3       29       -5.013147   tryptophan 2-monooxygenase oxidoreductase
RS_RS15835  3       34       -7.221060   hypothetical protein
RS_RS15840  4       30       -6.603470   hypothetical protein
RS_RS15845  3       22       -1.995181   cytochrome C
RS_RS15850  4       22       -1.817572   antibiotic hydrolase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS15835  NUCEPIMERASE  59     3e-12    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 59.0 bits (143), Expect = 3e-12
Identities = 30/141 (21%), Positives = 52/141 (36%), Gaps = 24/141 (17%)

Query: 1 MNVLITGARGYLGALLSVAL-SGSHDVV----------------RTSRGDISNSLFQYLD 43
M L+TGA G++G +S L H VV R F +D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 44 VTDADCVKRVVAATVPDVIIHAAAMANLSVCEANPEAAFLVNAEGTLNVVQAANEVG-AR 102
+ D + + + A+ + + + + NP A N G LN+++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 103 VLFISTLAASNPSIVYGRSKS 123
+L+ S+ S VYG ++
Sbjct: 121 LLYASS------SSVYGLNRK 135


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS15850  SURFACELAYER  30     0.025    Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 29.6 bits (66), Expect = 0.025
Identities = 59/285 (20%), Positives = 102/285 (35%), Gaps = 29/285 (10%)

Query: 1 MKRLLYLAAPAIVGAALLCLPAPSSAAANAVGANAVIGEFATPSA-ASSPQGVDVAGDGS 59
MK+ L + + A AA L AP +A A V A I + +A ++ VDV S
Sbjct: 1 MKKNLRIVSAA---AAALLAVAPIAATAMPVNAATTINADSAINANTNAKYDVDVTPSIS 57

Query: 60 VWYSETTAGKIAVLRPDGSSAEFPVPNGGQPFILKVADD--GVWFTDSGTRAIGHLDPAT 117
+ + + + P + G+ + + D TDS + +
Sbjct: 58 AIAAVAKSDTMPAI-PGSLTGSISASYNGKSYTANLPKDSGNATITDSNNNTVKPAELEA 116

Query: 118 GTVETYAIPSGASPFFIQVGPDGSKWFTETAGIGRLSPNGEFTEWTVTLEHADDNME--Q 175
T +P + F GS+ + IG +PN FTE + D +
Sbjct: 117 DKAYTVTVPDVSFNF-------GSENAGKEITIGSANPNVTFTE-----KTGDQPASTVK 164

Query: 176 LSLDPWGNVWFVERNFDGIGAAGTNKVRRLNPYSNVISTYRVPTLGGTPSGVVANANGSV 235
++LD G + A T +N Y ++T T G + A+ G +
Sbjct: 165 VTLDQDGVAKLSSVQIKNVYAIDTTYNSNVNFYD--VTTGATVTTGAV--SIDADNQGQL 220

Query: 236 WVSEYYANAIALLNPFAAPHSDEVVQPNALKADSRTASVSPIRAA 280
++ A + FAA + + Q + D+ TA ++A
Sbjct: 221 NITSVVAAINS--KYFAAQYDKK--QLTNVTFDTETAVKDALKAQ 261


59  RS_RS15900  RS_RS16375  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
RS_RS15900  -1      26       -3.088567   glycosyl transferase family 8
RS_RS15905  -2      23       -3.692481   esterase
RS_RS15910  -1      24       -4.162927   adenylylsulfate kinase
RS_RS15915  -1      26       -4.472265   hypothetical protein
RS_RS15920  0       26       -4.038488   transposase
RS_RS15925  3       34       -6.404739   hypothetical protein
RS_RS15930  3       38       -6.946625   recombinase
RS_RS15935  5       54       -10.051949  hypothetical protein
RS_RS15940  3       56       -10.168369  hypothetical protein
RS_RS15945  3       46       -8.784593   hypothetical protein
RS_RS15950  2       26       -4.207741   hypothetical protein
RS_RS15955  0       17       -0.995197   hypothetical protein
RS_RS15960  -1      16       -1.298335   hypothetical protein
RS_RS15965  2       19       -0.413372   hypothetical protein
RS_RS15970  2       19       -0.260779   hypothetical protein
RS_RS15975  2       19       -0.436652   activation/secretion signal peptide protein
RS_RS15980  3       23       -0.930252   hemagglutinin
RS_RS15985  2       25       -1.478576   hypothetical protein
RS_RS15990  2       24       -0.889720   hypothetical protein
RS_RS15995  -1      27       -3.601360   hypothetical protein
RS_RS16000  2       28       -3.375214   glycoside hydrolase
RS_RS16005  0       25       -1.847586   membrane protein
RS_RS16015  -1      23       -2.680507   membrane protein
RS_RS16020  -1      36       -4.227265   hypothetical protein
RS_RS16025  -1      37       -4.618581   hypothetical protein
RS_RS16030  -1      38       -4.998470   hypothetical protein
RS_RS16035  -1      29       -5.239478   transposase
RS_RS16040  0       32       -4.877483   hypothetical protein
RS_RS16045  0       31       -5.086100   hypothetical protein
RS_RS16050  2       26       -4.068098   multidrug transporter
RS_RS16055  1       28       -4.375671   hemolysin secretion protein D
RS_RS16060  2       27       -4.185581   AcrR family transcriptional regulator
RS_RS16065  5       33       -2.829700   RND transporter
RS_RS16070  4       31       -3.486156   short-chain dehydrogenase
RS_RS16075  3       33       -3.503494   hypothetical protein
RS_RS16080  3       35       -5.383805   hypothetical protein
RS_RS16085  2       38       -5.057490   hypothetical protein
RS_RS16090  2       38       -4.784788   hypothetical protein
RS_RS16095  1       29       -4.963174   hypothetical protein
RS_RS16100  1       21       -4.653488   hypothetical protein
RS_RS16105  1       18       -3.625195   hypothetical protein
RS_RS16110  1       17       -3.258981   virulence factor
RS_RS16115  1       14       -2.370421   membrane protein
RS_RS16120  1       15       -2.946819   transposase
RS_RS16125  0       16       -2.866104   hypothetical protein
RS_RS16130  1       24       -2.806695   hypothetical protein
RS_RS16135  0       32       -3.344558   toxin RelE
RS_RS16140  -1      38       -3.845063   addiction module antitoxin
RS_RS16145  0       32       -5.332618   RNA polymerase subunit sigma
RS_RS16150  1       19       -1.191316   hypothetical protein
RS_RS16155  1       17       -0.364576   hypothetical protein
RS_RS16160  -1      16       -0.953924   hypothetical protein
RS_RS16165  -1      15       -1.509401   hypothetical protein
RS_RS16170  -1      13       -1.374928   isoleucyl-tRNA synthetase
RS_RS16175  0       12       -0.676189   hypothetical protein
RS_RS16180  -1      23       -2.928968   hypothetical protein
RS_RS16185  -1      26       -3.725477   hypothetical protein
RS_RS16190  0       25       -3.227409   hypothetical protein
RS_RS16195  0       25       -2.803920   hypothetical protein
RS_RS16200  0       25       -2.964528   hypothetical protein
RS_RS25185  -1      17       -1.389580   hypothetical protein
RS_RS16215  -1      17       -1.590419   excisionase
RS_RS16220  -2      18       -1.987211   hypothetical protein
RS_RS16225  -2      19       -2.185537   hypothetical protein
RS_RS16230  -1      23       -2.838947   transposase
RS_RS16235  1       31       -4.430393   hypothetical protein
RS_RS16240  0       25       -3.794480   hypothetical protein
RS_RS16245  2       25       -3.522989   hypothetical protein
RS_RS16250  0       25       -3.073799   hypothetical protein
RS_RS16255  2       27       -4.810216   recombinase
RS_RS16265  1       19       -4.170659   hypothetical protein
RS_RS16270  2       22       -3.720648   hypothetical protein
RS_RS16275  1       20       -3.383168   hypothetical protein
RS_RS16280  1       19       -2.602270   hypothetical protein
RS_RS16285  1       17       -0.580568   *thiosulfohydrolase SoxB
RS_RS16295  -1      21       2.501308    alkyl hydroperoxide reductase
RS_RS16300  -2      19       2.012443    sulfur oxidation c-type cytochrome SoxX
RS_RS16305  -1      19       1.528369    sulfur oxidation c-type cytochrome SoxA
RS_RS16310  1       14       1.749596    hypothetical protein
RS_RS16315  0       13       2.670978    thiosulfate oxidation carrier complex protein
RS_RS16320  2       11       2.889818    thiosulfate oxidation carrier protein SoxY
RS_RS16325  1       11       3.242632    cytochrome C protein
RS_RS16330  1       11       3.396584    membrane protein
RS_RS16335  2       12       3.975695    flavocytochrome C sulfide dehydrogenase
RS_RS16340  1       12       4.463446    hypothetical protein
RS_RS16345  -1      12       4.166534    hypothetical protein
RS_RS16350  0       12       3.674899    hypothetical protein
RS_RS16360  0       13       3.391506    16S rRNA
RS_RS16365  0       15       2.649444    hypothetical protein
RS_RS16370  0       12       2.772128    hypothetical protein
RS_RS16375  -1      14       3.062598    exonuclease
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS15905  PF06057  29     0.020    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 29.0 bits (65), Expect = 0.020
Identities = 19/100 (19%), Positives = 34/100 (34%), Gaps = 8/100 (8%)

Query: 136 YWTGKRFAPEAVAS-IDQAVSHFAAKVPGQRIHLIGYSGG-----GALAVLVAARRTDVA 189
YW K P+ V + + A+ Q++ LIGYS G L + A R +V
Sbjct: 90 YWKQK--DPKDVTQDTLAIIDKYQAEFGTQKVILIGYSFGAEVIPFVLNEMPARYRKNVL 147

Query: 190 SIRTVAGNLDHAFVNRLHDVSSMPQSENAIDFAQRVASIP 229
++ + F + ++ + V
Sbjct: 148 GAVLLSPSQSSDFEIHVSEMVTSDNQSARYLTLPEVNKQT 187


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS15915  SYCDCHAPRONE  31     0.003    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 31.1 bits (70), Expect = 0.003
Identities = 11/59 (18%), Positives = 19/59 (32%)

Query: 135 SVAIILYLARRYDQAREELNKALEIDPNHFLLHFRLGLVYQQQKLFHDAIEEMQKAVTL 193
S+A Y + +Y+ A + +D LG Q + AI +
Sbjct: 41 SLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIM 99


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
RS_RS15950  MICOLLPTASE  31     0.002    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 30.8 bits (69), Expect = 0.002
Identities = 15/43 (34%), Positives = 22/43 (51%), Gaps = 2/43 (4%)

Query: 37 IGVAAKISDLYPHVINVYMNATLPPLRKKSWAGYYFKTLIPYF 79
+G ++ + +N +N L L KKSW GY KT+ YF
Sbjct: 701 VGGRSQGEENDWKDMNSKLNDILKELSKKSWNGY--KTVTAYF 741


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS15955  SACTRNSFRASE  36     2e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 36.1 bits (83), Expect = 2e-05
Identities = 17/86 (19%), Positives = 31/86 (36%), Gaps = 8/86 (9%)

Query: 54 TAPACKYFVAMESGLLVGVCALK----DRKHIYHLFVAPEAQGRGVARALWEYARADAEL 109
A F+ +G ++ I + VA + + +GV AL A A+
Sbjct: 63 EGKAA--FLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKE 120

Query: 110 DGATGSF--TVNSSLHAVPVYERLGF 133
+ G T + ++ A Y + F
Sbjct: 121 NHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS15975  PF03544  30     0.023    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 29.9 bits (67), Expect = 0.023
Identities = 12/70 (17%), Positives = 20/70 (28%), Gaps = 2/70 (2%)

Query: 9 AAALLSVATAHAQV-LPPAPPGPVLEDPAQRAQRERQDTERRREATQPAPQIAVAPSVPD 67
A L + ++ P P + PA + +P P+ P P
Sbjct: 30 AGLLYTSVHQVIELPAPAQPISVTMVAPADLEPPQAVQPPPEPVV-EPEPEPEPIPEPPK 88

Query: 68 DAAVDAVAEP 77
+A V
Sbjct: 89 EAPVVIEKPK 98


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS15980  PF05860  76     1e-18    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 75.6 bits (186), Expect = 1e-18
Identities = 28/117 (23%), Positives = 47/117 (40%), Gaps = 4/117 (3%)

Query: 28 AGIVPDG--GTATTVTTGANGRPVVNIAPSTAGVSHNIYTSFNVGPVGADLNNAIVRART 85
A I PD + +TT N R + + + + H+ + F+V G N +
Sbjct: 1 AQITPDTTLPINSNITTEGNTRIIERGTQAGSNLFHS-FQEFSVPTSGTAFFNNPTNIQN 59

Query: 86 IVNQVTSTDPSLIQGNIAVLGPRANVIIANPNGITVDGGSFTNTGNVALTTGQVSFN 142
I+++VT S I G I AN+ + NPNGI + + G + +
Sbjct: 60 IISRVTGGSVSNIDGLIRANA-TANLFLINPNGIIFGQNARLDIGGSFVGSTANRLK 115


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS16060  ACRIFLAVINRP  452    e-144    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 452 bits (1164), Expect = e-144
Identities = 226/1054 (21%), Positives = 423/1054 (40%), Gaps = 69/1054 (6%)

Query: 8 LSALAVRERAVTLFLICLISLAGLISFFKLGRAEDPAFTVKVMTIITAWPGATAQEMQDQ 67
++ +R L ++ +AG ++ +L A+ P +++ +PGA AQ +QD
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 68 VAEKIEKRIQELRWYDRTETYT-RPGLAFTTLTLLDSTPPSEVQEEFYQARKKVSDEVAN 126
V + IE+ + + + + G TLT T P Q Q + K+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQV---QVQNKLQLATPL 117

Query: 127 LPPGVIGPMVNDEYADVTFAL---FALKAQGEPQRVLVRDAE-TLRQRLLHVPGVKKVNI 182
LP V ++ E + ++ + F G Q + ++ L + GV V +
Sbjct: 118 LPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQL 177

Query: 183 IGEQSE-RIYVEFSHDRLATLGVSPQEVFAALNNQNALTAAGSVET------KGPQVFIR 235
G Q RI+++ D L ++P +V L QN AAG + + I
Sbjct: 178 FGAQYAMRIWLDA--DLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASII 235

Query: 236 LDGAFDELQKIRDTPVVSQ--GRTLKLSDIATVKRGYEDPATFMVRNGGQPALLLGIVMR 293
F ++ + G ++L D+A V+ G E+ R G+PA LGI +
Sbjct: 236 AQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVI-ARINGKPAAGLGIKLA 294

Query: 294 EGWNGLDLGKALDKEVGAINADMPLGMSLTKVTDQAVNISAAVDEFMLK-FFAALLVVML 352
G N LD KA+ ++ + P GM + D + ++ E + F A +LV ++
Sbjct: 295 TGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLV 354

Query: 353 VSFVSMGWRAGLVVAAAVPLTLAVVFVVMAATGKNFDRITLGSLILALGLLVDDAIIAIE 412
+ RA L+ AVP+ L F ++AA G + + +T+ ++LA+GLLVDDAI+ +E
Sbjct: 355 MYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVE 414

Query: 413 MMV-VKMEEGYSRVAASAYAWSHTAAPMLSGTLVTAVGFMPNGFARSTAGEYTSNMFWIV 471
+ V ME+ A+ + S ++ +V + F+P F + G +
Sbjct: 415 NVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITI 474

Query: 472 GIALIASWVVAVVFTPYLGVKMLPDFKKVE--------GGHDAIYDTPRYNRFRALLGRV 523
A+ S +VA++ TP L +L G + +D N + +G++
Sbjct: 475 VSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFD-HSVNHYTNSVGKI 533

Query: 524 IARKWLVAGSVVGLFALAILGMAVVKKQFFPISDRPEVLVEVQMPYGTSINQTSAATAKL 583
+ + A ++ + F P D+ L +Q+P G + +T ++
Sbjct: 534 LGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQV 593

Query: 584 EAWLAKQKEAQIVTSYVGQGAPRFYLAMGPELPDSSFAKIVI-----RTDSQEERDALKQ 638
+ K ++A + + + G + + ++ A + + R + +A+
Sbjct: 594 TDYYLKNEKANVESVFTVNG-----FSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIH 648

Query: 639 RLRQAIADGLAPEARVRVTQLVFGPYSPFPVAYRVTGPDAETLRRIAADVRQVMDAS--- 695
R + + + V + TG D E + + + A
Sbjct: 649 RAKMELGK--IRDGFVIPFNM-----PAIVELGTATGFDFELIDQAGLGHDALTQARNQL 701

Query: 696 --------PMMRTVNTDWGMRVPTLHFTLQQDRLQAVGLTSSAVAQQLQFLLNGIPVTAV 747
+ +V + + Q++ QA+G++ S + Q + L G V
Sbjct: 702 LGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDF 761

Query: 748 REDIRTVQVTARSAGDIRLDPAKIGDFTLAGANGQRVPLSQVGKIDVRMEEPIIRRRDRM 807
+ R ++ ++ R+ P + + ANG+ VP S P + R + +
Sbjct: 762 IDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGL 821

Query: 808 PTITVRGDIADGLQPPDVSTALSKQLQPIIEKLPSGYRIEQAGSIEESGKATKAMLPLFP 867
P++ ++G+ A G D + + KLP+G + G + + L
Sbjct: 822 PSMEIQGEAAPGTSSGDAMALMEN----LASKLPAGIGYDWTGMSYQERLSGNQAPALVA 877

Query: 868 IMLAVTLLILIFQVRSIPAMVMVFLTSPLGLIGVVPTLILFGQPFGINALVGLIALSGIL 927
I V L L S V V L PLG++GV+ LF Q + +VGL+ G+
Sbjct: 878 ISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLS 937

Query: 928 MRNTLILIGQIQHNKE-EGLDPFTAVVEATVQRARPVILTALAAILAFIPLTHSVFWGA- 985
+N ++++ + E EG A + A R RP+++T+LA IL +PL S G+
Sbjct: 938 AKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSG 997

Query: 986 ----LAYTLIGGTFAGTILTLVFLPAMYSIWFRI 1015
+ ++GG + T+L + F+P + + R
Sbjct: 998 AQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRC 1031



Score = 83.7 bits (207), Expect = 2e-18
Identities = 55/323 (17%), Positives = 126/323 (39%), Gaps = 20/323 (6%)

Query: 712 LHFTLQQDRLQAVGLT----SSAVAQQLQFLLNGIPVTAVREDIRTVQVTARSAGDIRLD 767
+ L D L LT + + Q + G + + + + + +
Sbjct: 184 MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK-N 242

Query: 768 PAKIGDFTL-AGANGQRVPLSQVGKIDVRMEEPIIRRR-DRMPTITVRGDIADGLQPPDV 825
P + G TL ++G V L V ++++ E + R + P + +A G D
Sbjct: 243 PEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDT 302

Query: 826 STALSKQLQPIIEKLPSGYRIE----QAGSIEESGKATKAMLPLFPIMLAVTLLILIFQV 881
+ A+ +L + P G ++ ++ S L IML L++ +F +
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTL-FEAIMLVF-LVMYLF-L 359

Query: 882 RSIPAMVMVFLTSPLGLIGVVPTLILFGQPFGINALVGLIALSGILMRNTLILIGQIQ-H 940
+++ A ++ + P+ L+G L FG + G++ G+L+ + ++++ ++
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 941 NKEEGLDPFTAVVEATVQRARPVILTALAAILAFIPL-----THSVFWGALAYTLIGGTF 995
E+ L P A ++ Q ++ A+ FIP+ + + + T++
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 996 AGTILTLVFLPAMYSIWFRIRPN 1018
++ L+ PA+ + +
Sbjct: 480 LSVLVALILTPALCATLLKPVSA 502


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS16065  RTXTOXIND  40     1e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.8 bits (93), Expect = 1e-05
Identities = 18/118 (15%), Positives = 40/118 (33%), Gaps = 9/118 (7%)

Query: 70 GKVLERLVDAGQTVKRGQPLMRIDPVD-----LKLAAHAQQEAVSAARARAQQTA--EDE 122
V E +V G++V++G L+++ + LK + Q + R + + ++
Sbjct: 105 SIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNK 164

Query: 123 ARYRDLRGTGAISASAYDQAKAAADAAKAQLSAAEAQADVARNASRYAELLADGDGVV 180
L + ++ K Q S + Q + A+ V+
Sbjct: 165 LPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKE--LNLDKKRAERLTVL 220


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS16070  HTHTETR  61     8e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 61.2 bits (148), Expect = 8e-14
Identities = 41/199 (20%), Positives = 66/199 (33%), Gaps = 16/199 (8%)

Query: 17 DVRDQIVAAATEHFSRYGYEKTAVSDLAKAIGFSKAYIYKFFESKQAIGEMICANCLREI 76
+ R I+ A FS+ G T++ ++AKA G ++ IY F+ K + I I
Sbjct: 11 ETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNI 70

Query: 77 EADVSAALAE-ADSPPEKLRRMFKAL-----TEASLRLFSHDRKLYEIAASAATERWQAV 130
A+ P LR + + TE RL QA
Sbjct: 71 GELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQ 130

Query: 131 IAYEERIQKVLRDVLQEGRETGDFERKTPLDEATTAIYLVMRPYMNPLLLQY-----SFD 185
+ L+ E L AI +MR Y++ L+ + SFD
Sbjct: 131 RNLCLESYDRIEQTLKHCIEAKML--PADLMTRRAAI--IMRGYISGLMENWLFAPQSFD 186

Query: 186 YTESAPGLLSSLVLRSLSP 204
+ A +++L
Sbjct: 187 LKKEA-RDYVAILLEMYLL 204


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS16080  DHBDHDRGNASE  104    6e-29    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 104 bits (260), Expect = 6e-29
Identities = 53/186 (28%), Positives = 86/186 (46%), Gaps = 8/186 (4%)

Query: 5 KVVLITGVSSGIGRATAAKFALRGCRVFGTVRNLAKAQPIPGVELVE--------MDIRD 56
K+ ITG + GIG A A A +G + N K + + E D+RD
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 57 QASVQQGIQTLIAQAKRIDVLVNSAGVTLLGATEETSIAEAQALFDTNVFGILRTTQAAL 116
A++ + + + ID+LVN AGV G S E +A F N G+ +++
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 117 PHMRAQRSGRIINISSVLGFLPAPYMGLYSASKHAVEGMSETLDHEVRKFGIRVVLVEPS 176
+M +RSG I+ + S +P M Y++SK A ++ L E+ ++ IR +V P
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 177 FTKTSL 182
T+T +
Sbjct: 189 STETDM 194


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS16175  PF05272  49     6e-08    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 49.3 bits (117), Expect = 6e-08
Identities = 72/273 (26%), Positives = 105/273 (38%), Gaps = 57/273 (20%)

Query: 24 ALLARLEFVLSVLFPAGKKRRGTFVVGDILGSPGDSLEVVLDGEKAGLWTDRATGDGG-D 82
ALL R + +L P G + G + G GDS +V + G W D +TG+ G D
Sbjct: 17 ALLTRAKDLLPEWLPGGVLVGHEYECGSLAGGKGDSCKVNV---TTGKWCDFSTGESGRD 73

Query: 83 IFDLIAAQAGLRVSTDFSGVL-ERALQ-----LLGQASKQPVRRKRREPPTDDLGPETAK 136
+ DL A GL+VS + V E L+ ++G + P + R P E
Sbjct: 74 LLDLYAEIHGLKVSKAAAQVAREEGLESVAGIVMGAPAGAPAPKPPRPEPPPRPVVEKEC 133

Query: 137 WDYLDAAGKLIGVVYRYDPPGRGKEFRPWDAKRRKMAPP--------------------- 175
W+ + + + P +G+E + R P
Sbjct: 134 WETIQPVPEHAVPPSFWHPAPKGREPDKIEHTARYQVGPVLWGYVVRFIKSDGDKLTLPY 193

Query: 176 -------------------EPRPLYNQPGLATATQ-VVLVEGEK---CAQALIDIG---I 209
+PRPLY A ++ VVLVEGE+ C Q L+D G +
Sbjct: 194 VYSRSQRDGSEAWKWRGWDDPRPLYFPSHRAPESRTVVLVEGERKADCLQQLLDAGAPGV 253

Query: 210 VATTAMHGANAPVEKTDWSPLAGKAVLIWPDRD 242
+ G + K DWS LAG V++WPD D
Sbjct: 254 YCVASWPGGSNGWPKADWSWLAGCTVVLWPDCD 286


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
RS_RS16300  ECOLIPORIN  28     0.018    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 28.0 bits (62), Expect = 0.018
Identities = 11/18 (61%), Positives = 14/18 (77%)

Query: 13 AALVPMALAAGAAHAVEV 30
A ++P LAAGAAHA E+
Sbjct: 7 ALVIPALLAAGAAHAAEI 24


60  RS_RS16540  RS_RS16640  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS16540   1      12        3.094420  bifunctional proline
RS_RS16545   2      13        3.495584  primosome assembly protein PriA
RS_RS16550   2      13        2.301201  uroporphyrinogen decarboxylase
RS_RS16555   1      10        2.025101  cupin
RS_RS16560  -1       9        1.874673  aldehyde-activating protein
RS_RS16565  -1       9        1.904739  hypothetical protein
RS_RS16570  -1      10        2.271330  chemotaxis protein
RS_RS16575   0       9        2.493613  LysR family transcriptional regulator
RS_RS16580  -1      10        2.689225  glutathione S-transferase
RS_RS16585   1       9        3.806431  phosphate ABC transporter substrate-binding
RS_RS16590   0      11        3.786390  hypothetical protein
RS_RS16595  -1       8        2.393482  alpha/beta hydrolase
RS_RS16600  -1       8        1.376859  CoA-binding protein
RS_RS16605   0      12        0.073659  hypothetical protein
RS_RS16610   1      15       -0.792247  membrane protein
RS_RS16615   2      19       -3.688882  ATP synthase epsilon chain 1
RS_RS16620   1      20       -4.057419  ATP synthase subunit beta
RS_RS16625   0      17       -4.546205  ATP synthase subunit gamma
RS_RS16630   0      17       -4.180991  ATP synthase subunit alpha
RS_RS16635   2      13       -3.366021  ATP synthase subunit delta
RS_RS16640   1      12       -3.299040  ATP synthase subunit b 1
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
RS_RS16545  HTHFIS  30     0.043    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.8 bits (67), Expect = 0.043
Identities = 16/105 (15%), Positives = 33/105 (31%), Gaps = 2/105 (1%)

Query: 636 DAPALLALVRHDYAGFARQLLRERKQACLPPFAYQALLTAEHRELARALEFLGAARAAGQ 695
D AL + H + G R+L ++ Q ++T E E E +
Sbjct: 339 DQEALELMKAHPWPGNVRELENLVRRLTA--LYPQDVITREIIENELRSEIPDSPIEKAA 396

Query: 696 AVAEAQSLPVQLHDPVPMTMVRLANRERAQLLLESASRAALQRLL 740
A + + S+ + + + + L + L+
Sbjct: 397 ARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLI 441


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
RS_RS16610  TONBPROTEIN  43     2e-06    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 42.7 bits (100), Expect = 2e-06
Identities = 26/93 (27%), Positives = 35/93 (37%), Gaps = 1/93 (1%)

Query: 666 PMPSAQQERPRSEPPAQARPEPQPHPAPVPVPHQPEARIEAPRPMPQPHPVEIRPPQPQP 725
P + + P PEP+P P P P P + IE P+P P+P P ++ Q QP
Sbjct: 52 PADLEPPQAVQPPPEPVVEPEPEPEPIPEP-PKEAPVVIEKPKPKPKPKPKPVKKVQEQP 110

Query: 726 QPQPHVVEMRPPQPQPQPRPEVRPPQPPQPPQP 758
+ VE RP P P
Sbjct: 111 KRDVKPVESRPASPFENTAPARLTSSTATAATS 143



Score = 40.7 bits (95), Expect = 9e-06
Identities = 23/107 (21%), Positives = 35/107 (32%)

Query: 681 AQARPEPQPHPAPVPVPHQPEARIEAPRPMPQPHPVEIRPPQPQPQPQPHVVEMRPPQPQ 740
PQ P +PE E P+ PV I P+P+P+P+P V+ QP+
Sbjct: 52 PADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPK 111

Query: 741 PQPRPEVRPPQPPQPPQPRVEHHEAAPAQPHPQAGHNEAQQRDTDHR 787
+P P P + + + A R
Sbjct: 112 RDVKPVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRALSR 158



Score = 35.0 bits (80), Expect = 7e-04
Identities = 20/105 (19%), Positives = 35/105 (33%), Gaps = 5/105 (4%)

Query: 651 PQPESMPGTPREERRPMPSAQQERPRSEPPAQARPEPQPHPAPVPVPHQPEARIEAPRPM 710
P+++ P P P + + +P+P P P P +P +++ +P
Sbjct: 56 EPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKP---KPVKKVQE-QPK 111

Query: 711 PQPHPVEIRPPQPQPQPQPHVVEMRPPQPQPQPRPEVRPPQPPQP 755
PVE RP P P + +P P+
Sbjct: 112 RDVKPVESRPASPFENTAPARL-TSSTATAATSKPVTSVASGPRA 155



Score = 32.7 bits (74), Expect = 0.004
Identities = 24/80 (30%), Positives = 31/80 (38%), Gaps = 2/80 (2%)

Query: 704 IEAPRPMPQPHPVEIRPPQPQPQPQPHVVEMRPPQPQPQPRPEVRPPQPPQPPQPRVEHH 763
IE P P QP V + P PQ V P +P+P PE P P + P +
Sbjct: 36 IELPAP-AQPISVTMVTPADLEPPQA-VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPK 93

Query: 764 EAAPAQPHPQAGHNEAQQRD 783
+P P E +RD
Sbjct: 94 PKPKPKPKPVKKVQEQPKRD 113



Score = 32.7 bits (74), Expect = 0.005
Identities = 24/87 (27%), Positives = 39/87 (44%), Gaps = 12/87 (13%)

Query: 695 PVPHQP-EARIEAPRPMPQPHPVEIRPP---QPQPQPQPHVVEMRPPQPQPQPRPEVRPP 750
P P QP + P + P V+ P +P+P+P+P + E P +P+ +P
Sbjct: 39 PAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEP-IPEPPKEAPVVIEKPKPKPK 97

Query: 751 QPPQP-------PQPRVEHHEAAPAQP 770
P+P P+ V+ E+ PA P
Sbjct: 98 PKPKPVKKVQEQPKRDVKPVESRPASP 124


61  RS_RS16705  RS_RS16775  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS16705   1      15        3.130315  LysR family transcriptional regulator
RS_RS16710   1      15        2.205919  membrane protein
RS_RS16715   2      16        0.638491  membrane protein
RS_RS16720   2      15       -0.514911  membrane protein
RS_RS16725   1      14       -0.445615  hypothetical protein
RS_RS16730   2      15       -1.322196  ABC transporter permease
RS_RS16735   0      16       -0.472560  ABC transporter permease
RS_RS16740   0      13        0.130370  ABC transporter permease
RS_RS16745   0      13        0.780226  histidinol dehydrogenase
RS_RS16750   3      13        3.471419  amino acid ABC transporter ATP-binding protein
RS_RS16755   3      12        3.709625  choline dehydrogenase
RS_RS16760   2      12        3.438334  alpha/beta hydrolase
RS_RS16765   1      10        2.331182  MerR family transcriptional regulator
RS_RS16770   2      10        2.678341  copper-transporting ATPase
RS_RS16775   3      11        2.874389  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS16705  NUCEPIMERASE  29     0.031    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 28.6 bits (64), Expect = 0.031
Identities = 16/54 (29%), Positives = 23/54 (42%), Gaps = 7/54 (12%)

Query: 166 ELGVHAHRRYLERRG-----TPATLDDLAGHALIGYDKETAFIRGMKGQVPWMR 214
LG+ A + L + T A L + +IG+ ET G+K V W R
Sbjct: 278 ALGIEAKKNMLPLQPGDVLETSADTKAL--YEVIGFTPETTVKDGVKNFVNWYR 329


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS16710  NUCEPIMERASE  31     0.009    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 30.5 bits (69), Expect = 0.009
Identities = 31/154 (20%), Positives = 52/154 (33%), Gaps = 37/154 (24%)

Query: 9 LVLGATGGIGGEMARRLTAHGWRVRAL------------HRNPDALARRDPAFEWVRGDA 56
LV GA G IG +++RL G +V + + LA+ P F++ + D
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQ--PGFQFHKIDL 61

Query: 57 MVR------------DDVVAAARGAAVIVHAVNPPGY--RNWAGLVLPMIDHTIAAARAC 102
R + V + AV NP Y N G + + + R
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFL-----NILEGCRHN 116

Query: 103 G-ARVVLPGT--VYNFGPDAFPVLCETSAQQPLT 133
++ + VY P + S P++
Sbjct: 117 KIQHLLYASSSSVYGLNRK-MPFSTDDSVDHPVS 149


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS16740  PF06580  31     0.004    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.0 bits (70), Expect = 0.004
Identities = 15/73 (20%), Positives = 32/73 (43%), Gaps = 1/73 (1%)

Query: 23 SFYAMLSLGLAVIFGLLNVINFAHG-ALFMLGAVLTWMGLSYLDLPYWVMLVAAPLAVGV 81
Y + G A ++G + + A+ ++G VLT S++ W+ L + + V
Sbjct: 21 GVYTLTGFGFASLYGSPKLHSMIFNIAISLMGLVLTHAYRSFIKRQGWLKLNMGQIILRV 80

Query: 82 VGVVIERTMLRFI 94
+ + M+ F+
Sbjct: 81 LPACVVIGMVWFV 93


62  RS_RS00015  RS_RS00055  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS00015  -3       9        0.159946  protein translocase component YidC
RS_RS00020  -2      10       -0.250366  tRNA modification GTPase TrmE
RS_RS00025  -1      12       -0.306876  hypothetical protein
RS_RS00035  -1      14        0.151973  30S ribosomal protein S21
RS_RS00040  -1      12        1.154368  multidrug transporter
RS_RS00045  -1      10        1.408341  multidrug efflux RND transporter permease
RS_RS00050   0       8        2.649460  MexE family multidrug efflux RND transporter
RS_RS00055  -1      11        1.742209  TetR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
RS_RS00015  60KDINNERMP  493    e-172    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 493 bits (1270), Expect = e-172
Identities = 203/561 (36%), Positives = 304/561 (54%), Gaps = 44/561 (7%)

Query: 1 MDIKRTILWVIFSLSVVLLFDNWQRANGHQSMFFPTPQTVTTTAAAPGGTPAGDVPKAAA 60
MD +R +L + +++ W++ P PQ TT A
Sbjct: 1 MDSQRNLLVIALLFVSFMIWQAWEQDKN------PQPQAQQTTQTTT-----------TA 43

Query: 61 PAAAGSQAAPATGAVSQTPASEKIVVTTDVIRATVDTAGAIVTKLELL---TQKDHDGNP 117
+A Q PA+G I V TDV+ T++T G V + L + + P
Sbjct: 44 AGSAADQGVPASGQGKL------ISVKTDVLDLTINTRGGDVEQALLPAYPKELNST-QP 96

Query: 118 MVLFDRSLERTYLARSGLIGGDFPNHTT-----VFTASAGPRDLGTGGE---VSLTLTAD 169
L + S + Y A+SGL G D P++ ++ L G V +T T D
Sbjct: 97 FQLLETSPQFIYQAQSGLTGRDGPDNPANGPRPLYNVEKDAYVLAEGQNELQVPMTYT-D 155

Query: 170 KGGAKLAKTYVFKRGSYVIDTRFDVTNDGAAPINPTLYMELARDGGAVEQSRFYS----- 224
G KT+V KRG Y ++ ++V N G P+ + + +L + S
Sbjct: 156 AAGNTFTKTFVLKRGDYAVNVNYNVQNAGEKPLEISSFGQLKQSITLPPHLDTGSSNFAL 215

Query: 225 -TFTGPAVYTDTDHYHKITFADIDKSKAHVPAPTDSGWVAMVQHYFASAWIPAASAKREF 283
TF G A T + Y K F I ++ + GWVAM+Q YFA+AWIP F
Sbjct: 216 HTFRGAAYSTPDEKYEKYKFDTI-ADNENLNISSKGGWVAMLQQYFATAWIPHNDGTNNF 274

Query: 284 YVDRIDTNFYRVGMQQALGTVAPGASVSATARLFAGPQEERMLEGITPGLELVKDYGWLT 343
Y + +G + V PG + + + L+ GP+ + + + P L+L DYGWL
Sbjct: 275 YTANLGNGIAAIGYKSQPVLVQPGQTGAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLW 334

Query: 344 IIAKPLFWLLEKIHKLLGNWGWSIVALTVLVKLVFFPLSATSYRSMAKMKDLQPRMTAIR 403
I++PLF LL+ IH +GNWG+SI+ +T +V+ + +PL+ Y SMAKM+ LQP++ A+R
Sbjct: 335 FISQPLFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMR 394

Query: 404 ERHKGDPQKMNQEMMTLYRTEKVNPLGGCLPIVIQIPVFIALYWVLLSSVEMRGAPWLGW 463
ER D Q+++QEMM LY+ EKVNPLGGC P++IQ+P+F+ALY++L+ SVE+R AP+ W
Sbjct: 395 ERLGDDKQRISQEMMALYKAEKVNPLGGCFPLLIQMPIFLALYYMLMGSVELRQAPFALW 454

Query: 464 VHDLASPDPFYILPILMAVSMFVQTRLNPTP-PDPVQAKMMMFMPIAFSVMFFFFPAGLV 522
+HDL++ DP+YILPILM V+MF +++PT DP+Q K+M FMP+ F+V F +FP+GLV
Sbjct: 455 IHDLSAQDPYYILPILMGVTMFFIQKMSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLV 514

Query: 523 LYWVVNNCLSIAQQWSINRML 543
LY++V+N ++I QQ I R L
Sbjct: 515 LYYIVSNLVTIIQQQLIYRGL 535


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS00020  TCRTETOQM  36     4e-04    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 35.6 bits (82), Expect = 4e-04
Identities = 29/95 (30%), Positives = 42/95 (44%), Gaps = 19/95 (20%)

Query: 280 TIQIEGIPLNIVDTAGLRDTEDEVERIGIERTWAAIARADVVLHLLDAADYRAHGLSAED 339
+ Q E +NI+DT G D EV R +++ D + L+ A D G+ A+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYR--------SLSVLDGAILLISAKD----GVQAQ- 108

Query: 340 AAIDARIAEHVPP--GVPTLRVINKIDLAGAAVPG 372
RI H G+PT+ INKID G +
Sbjct: 109 ----TRILFHALRKMGIPTIFFINKIDQNGIDLST 139


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00045  ACRIFLAVINRP  1184   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1184 bits (3064), Expect = 0.0
Identities = 586/1036 (56%), Positives = 753/1036 (72%), Gaps = 7/1036 (0%)

Query: 1 MAKFFIDRPVFAWVLALFIIVAGAISITQLPIAQYPTIAPPSIIITATYPGASAKTLDDA 60
MA FFI RP+FAWVLA+ +++AGA++I QLP+AQYPTIAPP++ ++A YPGA A+T+ D
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTSIIEQEMNGADGLLYIESVSQAGNGQATITVTFKPGTDPALAQVDVQNRLKRVEARLP 120
VT +IEQ MNG D L+Y+ S S G TIT+TF+ GTDP +AQV VQN+L+ LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTS-DSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLP 119

Query: 121 SSVTQQGVQVDKTRSNFLLFATLISKDGKMDPVALGDYISRNVLNEVKRVPGVGQAVLFG 180
V QQG+ V+K+ S++L+ A +S + + DY++ NV + + R+ GVG LFG
Sbjct: 120 QEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG 179

Query: 181 TERAMRIWIDPAKLVGYKLTPTDVYNAIRNQNALVSAGTLGDLPSTSDQPIAATVVVEGQ 240
+ AMRIW+D L YKLTP DV N ++ QN ++AG LG P+ Q + A+++ + +
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 241 MTTTEQFGNIVLVSKPDGSQVRIKDVARLELGGQTYATSARINGQPISAIGVQLSPTGNA 300
E+FG + L DGS VR+KDVAR+ELGG+ Y ARING+P + +G++L+ NA
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 301 LGTAKAVKAKLDELSKYFPAGVEYKIPYDTSKFVQISIEEVVKTLFEAMALVFLVMLVFL 360
L TAKA+KAKL EL +FP G++ PYDT+ FVQ+SI EVVKTLFEA+ LVFLVM +FL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 361 QNIRYTLIPSIVVPISLLGAFATMNALGFSINVLTMFGLVLAIGILVDDAIVVVENVERI 420
QN+R TLIP+I VP+ LLG FA + A G+SIN LTMFG+VLAIG+LVDDAIVVVENVER+
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 421 MSQEGLPPREATRKAMGQITGAIIGITLVLMAVFIPMAFFSGSVGAIYRQFSLSMVASIF 480
M ++ LPP+EAT K+M QI GA++GI +VL AVFIPMAFF GS GAIYRQFS+++V+++
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 481 FSALMALTLTPALCSTLLKPIEKGSHHEKKGFFGWFNRMFTSTTDRYQSLVERVLKKTFR 540
S L+AL LTPALC+TLLKP+ H K GFFGWFN F + + Y + V ++L T R
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 541 YMVIYGALIAAVVFLFMRLPSSFLPNEDQGYIITNIQLPPAASANRTLEVIKHVENYYQH 600
Y++IY ++A +V LF+RLPSSFLP EDQG +T IQLP A+ RT +V+ V +YY
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 601 --EKAVENIVAVQGFSFSGNGPNAALVFTTLKDWSSR-GADQSADAVAGRAFGALFGGIR 657
+ VE++ V GFSFSG NA + F +LK W R G + SA+AV RA G IR
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKME-LGKIR 658

Query: 658 DAIVFPLNPPPIPELGNATGFTFRLQDRGGLGHDALMAARNQLLGMAGQSKV-LKNVRPD 716
D V P N P I ELG ATGF F L D+ GLGHDAL ARNQLLGMA Q L +VRP+
Sbjct: 659 DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 717 GLEDSPQYQVDIDREKANALGVAFADINNVLSTALGSAYANDFPNYGRQQRVIVQADRIN 776
GLED+ Q+++++D+EKA ALGV+ +DIN +STALG Y NDF + GR +++ VQAD
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 777 RMQPEDIMNLYVRNTQGSMVPMSAFAKGHWVVGPVQLVRYNGYPSVRISGDAATGASTGD 836
RM PED+ LYVR+ G MVP SAF HWV G +L RYNG PS+ I G+AA G S+GD
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 837 AMDEMEHLASRLPAGIGFEWTGQSLQEKSSGSQAPALYALSLLAVFLVLAALYESWSIPA 896
AM ME+LAS+LPAGIG++WTG S QE+ SG+QAPAL A+S + VFL LAALYESWSIP
Sbjct: 839 AMALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPV 898

Query: 897 AVILVVPLGVLGALLGVTLRFMPDDVYFKVGLIAVVGLSAKNAILIIEFAKDLQ-AQGKG 955
+V+LVVPLG++G LL TL +DVYF VGL+ +GLSAKNAILI+EFAKDL +GKG
Sbjct: 899 SVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKG 958

Query: 956 LIEATLEAVHLRFRPIIMTSFAFILGVLPLAIATGASSASQRAIGTGVMGGMITATVLAV 1015
++EATL AV +R RPI+MTS AFILGVLPLAI+ GA S +Q A+G GVMGGM++AT+LA+
Sbjct: 959 VVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAI 1018

Query: 1016 VLVPVFFVVVRRIFKG 1031
VPVFFVV+RR FKG
Sbjct: 1019 FFVPVFFVVIRRCFKG 1034



Score = 73.7 bits (181), Expect = 3e-15
Identities = 56/331 (16%), Positives = 125/331 (37%), Gaps = 21/331 (6%)

Query: 723 QYQVDIDREKANALGVAFADINNVLSTA---LGSAYANDFPNYGRQQRVIVQADRINRMQ 779
++ +D + N + D+ N L + + P QQ +
Sbjct: 183 AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKN 242

Query: 780 PEDIMNLYVR-NTQGSMVPMSAFAKGHWVVGPVQ-LVRYNGYPSVRISGDAATGASTGDA 837
PE+ + +R N+ GS+V + A+ + R NG P+ + ATGA+ D
Sbjct: 243 PEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDT 302

Query: 838 MD----EMEHLASRLPAGIGFEWT-GQSLQEKSSGSQAPALYALSLLAVFLVLAALYESW 892
++ L P G+ + + + S + +++ VFLV+ ++
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 893 SIPAAVILVVPLGVLGAL-----LGVTLRFMPDDVYFKVGLIAVVGLSAKNAILIIE-FA 946
+ VP+ +LG G ++ + G++ +GL +AI+++E
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLT-----MFGMVLAIGLLVDDAIVVVENVE 417

Query: 947 KDLQAQGKGLIEATLEAVHLRFRPIIMTSFAFILGVLPLAIATGASSASQRAIGTGVMGG 1006
+ + EAT +++ ++ + +P+A G++ A R ++
Sbjct: 418 RVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSA 477

Query: 1007 MITATVLAVVLVPVFFVVVRRIFKGSERQRR 1037
M + ++A++L P + + + +
Sbjct: 478 MALSVLVALILTPALCATLLKPVSAEHHENK 508


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS00050  RTXTOXIND  45     3e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 45.2 bits (107), Expect = 3e-07
Identities = 36/209 (17%), Positives = 71/209 (33%), Gaps = 30/209 (14%)

Query: 99 AQYQASLDSAKAQLARAEATQTQAQLKAERYKPLVATNAISKQDYDDAVAAA-KQATADV 157
+ L K+QL + E+ A+ + + Q + + + +Q T ++
Sbjct: 262 VEAVNELRVYKSQLEQIESEILSAKEEYQLVT----------QLFKNEILDKLRQTTDNI 311

Query: 158 GAARAAVETAKLNLGYATVTSPISGR-AGLAQVTEGALVGQG--------SDATLLATVQ 208
G + + + + +P+S + L TEG +V D TL T
Sbjct: 312 GLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTA- 370

Query: 209 QIDPIYLTFTQSSTEVMRLQEALKAGKLAAAGDTAAKVTLVTEDGRVYAQTGKLYFSDLT 268
+ + F + EA + G KV + D + G ++ ++
Sbjct: 371 LVQNKDIGFINVGQNAIIKVEAFPYTR---YGYLVGKVKNINLDAIEDQRLGLVFNVIIS 427

Query: 269 VDQTTGSITLRAIFPNAERTLLPGMYVRA 297
+++ S + I L GM V A
Sbjct: 428 IEENCLSTGNKNIP------LSSGMAVTA 450



Score = 37.1 bits (86), Expect = 1e-04
Identities = 16/105 (15%), Positives = 38/105 (36%)

Query: 64 RVAQVRARVAGIVLKRTYQEGSDVKANDVLFRIDPAQYQASLDSAKAQLARAEATQTQAQ 123
R +++ IV + +EG V+ DVL ++ +A ++ L +A QT+ Q
Sbjct: 95 RSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQ 154

Query: 124 LKAERYKPLVATNAISKQDYDDAVAAAKQATADVGAARAAVETAK 168
+ + + + + ++ + T +
Sbjct: 155 ILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQ 199


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS00055  HTHTETR  126    3e-38    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 126 bits (317), Expect = 3e-38
Identities = 75/209 (35%), Positives = 111/209 (53%), Gaps = 1/209 (0%)

Query: 1 MVRRTKEEALETRNRILDAAEEVFYVRGVARTSLSDIAQAAGVTRGAIYWHFRNKADVFA 60
M R+TK+EA ETR ILD A +F +GV+ TSL +IA+AAGVTRGAIYWHF++K+D+F+
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 AMCERVKLPM-ETLCSPLGVEPNDPLGELRNVANFVLQQLVEDMHWRRVFEIMFNKCEFV 119
+ E + + E P DPL LR + VL+ V + R + EI+F+KCEFV
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 120 EDTSPILKRQQESFEEGLARLTHIIGQAARRGQLPPDLDVPLAVAHFHASFGGIMGDYLF 179
+ + + + Q+ E R+ + LP DL A G+M ++LF
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 180 YPDASSLAVKRERFVDACIDTLKYSPALR 208
P + L + +V ++ P LR
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLR 209


63  RS_RS00260  RS_RS00295  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS00260  -1      12        2.661530  Tyrosine recombinase XerC 1
RS_RS00265  -1      13        2.332500  hypothetical protein
RS_RS00270  -1      15        1.595656  aspartyl/glutamyl-tRNA(Asn/Gln) amidotransferase
RS_RS00275  -1      14        2.742392  dienelactone hydrolase
RS_RS00280  -1      16        1.455868  hypothetical protein
RS_RS00285  -2      16        1.638728  glutamyl-tRNA(Gln) amidotransferase subunit A
RS_RS00290  -2      17        0.808864  glutamyl-tRNA(Gln) amidotransferase subunit C
RS_RS00295  -1      15        1.168070  rod shape-determining protein MreB
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00260  BCTERIALGSPF  30     0.012    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 30.2 bits (68), Expect = 0.012
Identities = 20/77 (25%), Positives = 37/77 (48%), Gaps = 2/77 (2%)

Query: 26 ALKFERKLSQHTLASYARELAVLQQLGARFAAGIDLMRLQPH--HIRRMMAQLHGGGLSG 83
+L+ + +LS LA R+LA L +D + Q H+ ++MA + + G
Sbjct: 58 SLRRKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEG 117

Query: 84 RSIARALSAWRGWYQWL 100
S+A A+ + G ++ L
Sbjct: 118 HSLADAMKCFPGSFERL 134


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00265  TYPE3OMGPROT  29     0.015    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 29.1 bits (65), Expect = 0.015
Identities = 12/49 (24%), Positives = 21/49 (42%), Gaps = 4/49 (8%)

Query: 62 LVRHGNENDRTQQRIHAWTARLLGEADAHAL----PYTVQDGLREIFEV 106
L R +E R R+ R++ E AH L ++ G+ + E+
Sbjct: 487 LFRRKSELTRRTVRLFIIEPRIIDEGIAHHLALGNGQDLRTGILTVDEI 535


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
RS_RS00270  TYPE4SSCAGA  31     0.013    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 30.8 bits (69), Expect = 0.013
Identities = 27/87 (31%), Positives = 45/87 (51%), Gaps = 5/87 (5%)

Query: 394 NTAKKDVFPAMWAGEHGGDADAIIAEKGLK--QMSDSGELEKIIDEVLAANAKSVEEFRA 451
N+ K ++F A+ E DA AI + LK + S +LE + ++ L KS +EF+
Sbjct: 649 NSQKDEIF-ALINKEANRDARAIAYAQNLKGIKRELSDKLENV-NKNLKDFDKSFDEFKN 706

Query: 452 GKDKAFNALVGQAMKATRGKANPSQVN 478
GK+K F+ + +KA +G +N
Sbjct: 707 GKNKDFSK-AEETLKALKGSVKDLGIN 732


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00295  SHAPEPROTEIN  511    0.0      Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 511 bits (1318), Expect = 0.0
Identities = 251/348 (72%), Positives = 296/348 (85%), Gaps = 2/348 (0%)

Query: 1 MFGFLRSYFSNDLAIDLGTANTLIYVRDKGIVLDEPSVVAIRQEGGPNAKKTIQAVGKEA 60
M R FSNDL+IDLGTANTLIYV+ +GIVL+EPSVVAIRQ+ + K++ AVG +A
Sbjct: 1 MLKKFRGMFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRA-GSPKSVAAVGHDA 59

Query: 61 KQMLGKVPGNIEAIRPMKDGVIADFTVTEQMLKQFIKMVHDSKLLRPSPRIIICVPCGST 120
KQMLG+ PGNI AIRPMKDGVIADF VTE+ML+ FIK VH + +RPSPR+++CVP G+T
Sbjct: 60 KQMLGRTPGNIAAIRPMKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGAT 119

Query: 121 QVERRAIRESALGAGASQVYLIEEPMAAAIGAGLPVSEPSGSMVVDIGGGTTEVGIISLG 180
QVERRAIRESA GAGA +V+LIEEPMAAAIGAGLPVSE +GSMVVDIGGGTTEV +ISL
Sbjct: 120 QVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLN 179

Query: 181 GMVYKGSVRVGGDKFDEAIVNYIRRNYGMLIGEQTAEAIKKEIGSAFPGSEVREMEVKGR 240
G+VY SVR+GGD+FDEAI+NY+RRNYG LIGE TAE IK EIGSA+PG EVRE+EV+GR
Sbjct: 180 GVVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGR 239

Query: 241 NLSEGIPRAFTVSSNEILEALTDPLNQIVSSVKIALEQTPPELGADIAERGMMLTGGGAL 300
NL+EG+PR FT++SNEILEAL +PL IVS+V +ALEQ PPEL +DI+ERGM+LTGGGAL
Sbjct: 240 NLAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGAL 299

Query: 301 LRDLDRLLAEETGLPVLVAEDPLTCVVRGSGMALERMDKL-GSIFSYE 347
LR+LDRLL EETG+PV+VAEDPLTCV RG G ALE +D G +FS E
Sbjct: 300 LRNLDRLLMEETGIPVVVAEDPLTCVARGGGKALEMIDMHGGDLFSEE 347


64  RS_RS00800  RS_RS00845  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS00800   0      12        2.776130  TetR family transcriptional regulator
RS_RS00805  -1       9        0.937757  secretion protein HlyD
RS_RS00810  -1       8       -0.566478  multidrug ABC transporter ATP-binding protein
RS_RS00815  -1      15       -3.520158  mannose-1-phosphate guanyltransferase
RS_RS00820  -1      18       -3.485630  RND transporter
RS_RS00825   1      22       -4.889273  membrane protein
RS_RS00830   1      21       -4.528816  hypothetical protein
RS_RS00835   0      21       -3.208456  oxidoreductase
RS_RS00840   0      17       -2.840601  membrane protein
RS_RS00845  -1      11        0.242449  short-chain dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS00800  HTHTETR  62     7e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 62.3 bits (151), Expect = 7e-14
Identities = 35/211 (16%), Positives = 79/211 (37%), Gaps = 7/211 (3%)

Query: 13 RPRGPVREDLRDRLLDIAVQRFARDGIDATTMAAIAREAGVTAPMVHYHFATRDQLLDAV 72
R ++ R +LD+A++ F++ G+ +T++ IA+ AGVT +++HF + L +
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 73 VDMRLKPLIDEVMDPALTAIPDDGHAPELARIVAGAAQRMVAVASATPWFPTLWIREIAS 132
++ + + ++ D L I+ + V ++ +
Sbjct: 63 WELSESNIGELELEYQAKFPGDP--LSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 133 EGGQLRERVFARIALERATVLVERIARAQAAGAVNAALEPTLVMVSVIGLAMLPLATRAL 192
+ ++ + LE + + + A + A L ++ + L
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAA-IIMRGYISGLMEN-- 177

Query: 193 WGRLPHAETIDAQAIGRHVAALLLHGVGPAP 223
W P ++ D + R A+LL P
Sbjct: 178 WLFAP--QSFDLKKEARDYVAILLEMYLLCP 206


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS00805  RTXTOXIND  57     3e-11    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 57.1 bits (138), Expect = 3e-11
Identities = 40/357 (11%), Positives = 90/357 (25%), Gaps = 85/357 (23%)

Query: 44 VASPVGGRLEHLGVQRGQTVSAGTPLFILESTDEAAARQQAAAQQQAAEAQLADLNLGRR 103
+ ++ + V+ G++V G L L + A + + A + + R
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSR 158

Query: 104 VPEVDAV------------RAQLAQAVAADQLSATQRVRDEAQFRAGGIPQAQLDASRST 151
E++ + + + L Q + Q + + A R T
Sbjct: 159 SIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLT 218

Query: 152 AQTNAQRVRELTNQLRI-----------------------AQLPARTDQIHAQSAQVEAA 188
R L+ + + +++ +Q+E
Sbjct: 219 VLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQI 278

Query: 189 RAAVAQAQWRLDQKAQ------------------------------------KATQGGLV 212
+ + A+ Q +A V
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKV 338

Query: 213 FD-TLYREGEWVGAGSPVVRMLPPANV-KVRFFVPEGVVGRLKPGQAVRIRCDG----CA 266
++ EG V ++ ++P + +V V +G + GQ I+ +
Sbjct: 339 QQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRY 398

Query: 267 AEVNATISYV---ASEAEYTPPVIYSNSRRDKLVFMVEARPAAADGPKLRPGQPVEV 320
+ + + A E + V ++ L G V
Sbjct: 399 GYLVGKVKNINLDAIEDQRLGLVFNVIISIEE-----NCLSTGNKNIPLSSGMAVTA 450


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00815  ABC2TRNSPORT  36     1e-04    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 36.4 bits (84), Expect = 1e-04
Identities = 39/192 (20%), Positives = 80/192 (41%), Gaps = 8/192 (4%)

Query: 194 GVILTMTMVMMT----GLAMTRERERGTMENLLATPVRPLEVMTGKIVPYIFIGLVQVSI 249
G++ T M T A R + T E +L T +R +++ G++ + +
Sbjct: 72 GMVATSAMTAATFETIYAAFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAG 131

Query: 250 ILLAARFIFHVPFVGSAWMVYLAALL-FIVASLTVGITLSSLAQNQLQATQLTFFYFLPS 308
I + A + + ++ + + + AL ASL + +T + + + Q P
Sbjct: 132 IGVVAAALGYTQWLSLLYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVI--TPI 189

Query: 309 ILLSGFMFPFAGMPKWAQVIGDVLPMTYFHRLTRGILLKGNGWVELWPSIWPLLVFTAVV 368
+ LSG +FP +P Q LP+++ L R I+L V++ + L ++ +
Sbjct: 190 LFLSGAVFPVDQLPIVFQTAARFLPLSHSIDLIRPIMLGHPV-VDVCQHVGALCIYIVIP 248

Query: 369 MGIALRFYRKTL 380
++ R+ L
Sbjct: 249 FFLSTALLRRRL 260


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00820  RTXTOXIND     35     5e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 35.2 bits (81), Expect = 5e-04
Identities = 30/190 (15%), Positives = 52/190 (27%), Gaps = 35/190 (18%)

Query: 77 LDALVDEALRASPTVAQAAARLREAQAQA--DAQFGAVLPSVDGSASAVRQQVNPEAFGF 134
L AL EA + ARL + + Q + LP + Q V+ E
Sbjct: 127 LTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEE--- 183

Query: 135 NTPKPGPFTLYSASLSVSYALDLFGGVRRALEASRAQVDMQRYELEAARLSLAGNVVTAA 194
V R + Q + + L+L
Sbjct: 184 --------------------------VLRLTSLIKEQFSTWQNQKYQKELNLDK----KR 213

Query: 195 VRIASLDAQIATTQRLLAAQRDQLVITERRFGAGGVARVDVLSQRTQVAQTEATLPPLAQ 254
++ A+I + L ++ +L +A+ VL Q + + L
Sbjct: 214 AERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKS 273

Query: 255 QAAQARHRLS 264
Q Q +
Sbjct: 274 QLEQIESEIL 283


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00835  NUCEPIMERASE  49     6e-09    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 49.4 bits (118), Expect = 6e-09
Identities = 27/131 (20%), Positives = 53/131 (40%), Gaps = 18/131 (13%)

Query: 5 RIVLPGGAGLVGQNLVARLKARGYTNLVVLDK---------HEANLDVLRSVHPDITAVF 55
+ ++ G AG +G ++ RL G+ +V +D +A L++L P
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQ-VVGIDNLNDYYDVSLKQARLELLA--QPGFQFHK 58

Query: 56 ADLAEPGDWARHFE--GADVVVMLQAQIGAP--TREP--FVRNNIDSTRCVLEVIKQHAI 109
DLA+ F + V + ++ P + +N+ +LE + + I
Sbjct: 59 IDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKI 118

Query: 110 PYTVHISSSVV 120
+ ++ SSS V
Sbjct: 119 QHLLYASSSSV 129


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS00845  DHBDHDRGNASE  77     7e-19    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 76.6 bits (188), Expect = 7e-19
Identities = 53/191 (27%), Positives = 78/191 (40%), Gaps = 1/191 (0%)

Query: 2 KNILIVGATSAIAIACARQWAAEGARFFLVARNGERLEQVADDLSARGAQLASRHQLDIT 61
K I GA I A AR A++GA V N E+LE+V L A A+ A D+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE-ARHAEAFPADVR 67

Query: 62 DLERHAAMFEQCLAALGTIDIVLVAPGTLPDQAACQADPAIAVREFNTNATAVIALLTRI 121
D + + +G IDI++ G L F+ N+T V +
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 122 ANALEAQRSGALAVISSVAADRGRPSNYLYGSAKAALSAFCEGLNARLFKAGVHVLTIKP 181
+ + +RSG++ + S A R S Y S+KAA F + L L + + + P
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 182 GFVSTPMTAGL 192
G T M L
Sbjct: 188 GSTETDMQWSL 198


65  RS_RS01395  RS_RS01465  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS01395  -2      14       1.262695   short-chain dehydrogenase
RS_RS01400  -2      15       0.505722   DSBA oxidoreductase
RS_RS01405  -1      15       0.763714   cell division protein FtsN
RS_RS01410  0       15       -0.442384  arginine--tRNA ligase
RS_RS01415  2       13       -1.682523  hypothetical protein
RS_RS01420  1       15       -1.568264  sensor histidine kinase
RS_RS01425  2       19       -1.544740  hypothetical protein
RS_RS01430  1       17       -1.221692  chemotaxis protein CheY
RS_RS01435  2       18       -1.465389  histidine kinase
RS_RS01440  0       17       -0.260723  hypothetical protein
RS_RS01445  0       16       0.577218   5-methyltetrahydrofolate--homocysteine
RS_RS01450  -1      13       2.508677   sulfurtransferase
RS_RS01455  0       11       2.480695   hypothetical protein
RS_RS01460  0       11       3.084649   3-oxoadipate enol-lactonase
RS_RS01465  1       10       3.479700   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01395  DHBDHDRGNASE  64     3e-14    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 64.3 bits (156), Expect = 3e-14
Identities = 53/195 (27%), Positives = 89/195 (45%), Gaps = 10/195 (5%)

Query: 3 QIVFITGASSGIGQALARQYAARGATLGLVARRVDALHGFVQTLPA-GIAVHCYAADVRD 61
+I FITGA+ GIG+A+AR A++GA + V + L V +L A + ADVRD
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 62 ARSMHDAASAFMAVAGVPDVVIANAGISVGTDLREAGDLPAFAAVMETNWMGVLHTLQPF 121
+ ++ + + G D+++ AG+ + D + A N GV + +
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSD-EEWEATFSVNSTGVFNASRSV 127

Query: 122 VGPMLARAEPGRPGGTLVGIASVAGVRGLPGAAAYSASKAAVIKLMESLRVEFSPKRRPG 181
M+ R G++V + S AAY++SKAA + + L +E +
Sbjct: 128 SKYMMDRR-----SGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEY---N 179

Query: 182 LRVVTLAPGYIRTPM 196
+R ++PG T M
Sbjct: 180 IRCNIVSPGSTETDM 194


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01405  PERTACTIN     34     6e-04    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 33.5 bits (76), Expect = 6e-04
Identities = 34/97 (35%), Positives = 41/97 (42%), Gaps = 9/97 (9%)

Query: 81 ADPNAPLWSRVPAKPVDQVPEPNQPGTQPGTQPASQATPKPAEPGSSVAMQRPPKPVTPA 140
A+ N WS V AK QPG QPG QP P+P +P +PP+ A
Sbjct: 554 ANGNG-QWSLVGAKAPPAPKPAPQPGPQPGPQP-----PQPPQPPQPPQPPQPPQRQPEA 607

Query: 141 PAEKPVADPIAEIARQDAAKTGYFLQVGAYDSAEYAE 177
PA +P A A A TG VG + YAE
Sbjct: 608 PAPQPPAGRELSAAANAAVNTG---GVGLASTLWYAE 641


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01420  PF06580       34     0.001    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.7 bits (77), Expect = 0.001
Identities = 17/91 (18%), Positives = 35/91 (38%), Gaps = 11/91 (12%)

Query: 360 IVQESLTNASKYAHATIVS---VALDASEEG--VRLRVRDNGRGFPADLERRRMVGHHGL 414
+VQ + N K+ A + + L +++ V L V + G L+ + GL
Sbjct: 259 LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLA---LKNTKESTGTGL 315

Query: 415 LGMEQRAIALGG---TLTIDSTPGGGVTIIV 442
+ +R L G + + G +++
Sbjct: 316 QNVRERLQMLYGTEAQIKLSEKQGKVNAMVL 346


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01430  HTHFIS        53     6e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 52.5 bits (126), Expect = 6e-11
Identities = 26/127 (20%), Positives = 50/127 (39%), Gaps = 3/127 (2%)

Query: 14 RVLLIEDSPVLRGMVLEYLKASAFVAVVEWADTEDLALRLLAQGHYDVVIVDLQLRQGNG 73
+L+ +D +R VL + A V ++ R +A G D+V+ D+ + N
Sbjct: 5 TILVADDDAAIR-TVLNQALSRAGYDVRITSNAAT-LWRWIAAGDGDLVVTDVVMPDENA 62

Query: 74 FKVLQSLRDQASPSVRIVYTNHAQVPTYRQRCFEAGANYFFDKSLELDKVFEVIEERAGM 133
F +L ++ +V + T + E GA + K +L ++ +I
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTA-IKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 134 VRPRPQA 140
+ RP
Sbjct: 122 PKRRPSK 128


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01435  HTHFIS        91     1e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.0 bits (226), Expect = 1e-23
Identities = 37/122 (30%), Positives = 54/122 (44%), Gaps = 2/122 (1%)

Query: 2 IRVLIADDHEIVRAGLRQFLSEERDIEVAGEAASGEEVMEQLRTGTFDVVVLDISMPDRN 61
+L+ADD +R L Q L +V ++ + + G D+VV D+ MPD N
Sbjct: 4 ATILVADDDAAIRTVLNQAL-SRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GIDTLKLVRQRHPDLPVLILSTFPEDQYAINLIRAGASGYLTKESAPDELVKAIRTVSQG 121
D L +++ PDLPVL++S AI GA YL K EL+ I
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 122 RR 123
+
Sbjct: 122 PK 123


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01450  YERSINIAYOPE  28     0.045    Yersinia virulence determinant YopE protein signature.
		>YERSINIAYOPE#Yersinia virulence determinant YopE protein signature.

Length = 219

Score = 28.2 bits (62), Expect = 0.045
Identities = 24/119 (20%), Positives = 43/119 (36%), Gaps = 3/119 (2%)

Query: 215 SGTVTDASGRILSGQTVEAFWNSL--RHAKPITFGLNCALGAALMRPYIAELAKICDTAV 272
S +V + SGR +S QT + + N+L R P L + L + + I
Sbjct: 20 SSSVGEMSGRSVSQQTSDQYANNLAGRTESPQGSSLASRIIERLSSVAHSVIGFI-QRMF 78

Query: 273 SCYPNAGLPNPMSDTGFDETPEVTSSLVDEFAAAGLVNLVGGCCGTTPEHIRAIAERVA 331
S + + P +P S + + AA L + E ++ ++ A
Sbjct: 79 SEGSHKPVVTPAPTPAQMPSPTSFSDSIKQLAAETLPKYMQQLNSLDAEMLQKNHDQFA 137
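
Every hit listed in this report has an E-value below the 0.05 RPSBlast threshold given under the input parameters; the YopE hit above (0.045) is close to that limit. The following Python sketch shows such a cutoff filter. It is an illustration only: the helper `significant_hits`, the strict `<` comparison, and the last row of the sample data are assumptions, not taken from PredictBias.

```python
# Hypothetical helper: keep hits whose E-value is below a cutoff.
E_VALUE_CUTOFF = 0.05  # "E-value (RPSBlast)" from the input parameters

def significant_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Return the (locus_tag, profile, e_value) tuples with e_value < cutoff."""
    return [hit for hit in hits if hit[2] < cutoff]

hits = [
    ("RS_RS01420", "PF06580", 0.001),
    ("RS_RS01435", "HTHFIS", 1e-23),
    ("RS_RS01450", "YERSINIAYOPE", 0.045),
    ("EXAMPLE_ORF", "EXAMPLE_PROFILE", 0.2),  # made-up row, not from the report
]

for locus_tag, profile, e_value in significant_hits(hits):
    print(locus_tag, profile, e_value)
```

Only the three rows taken from the report pass the filter; the made-up row with E-value 0.2 is dropped.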


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01470  TONBPROTEIN   40     5e-06    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 40.4 bits (94), Expect = 5e-06
Identities = 30/141 (21%), Positives = 44/141 (31%), Gaps = 13/141 (9%)

Query: 22 RWRRWGVVALAALLAHGIAVIWVARSHQVMWPPAPEQ------VVPTLLLQP-------E 68
R W + + +A + HQV+ PAP Q V P L P E
Sbjct: 7 RRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADLEPPQAVQPPPE 66

Query: 69 PVHPPAPPAPAPVAAKPRPPAPRPHPQHAPTPTPEPAPTVPDMADTELPELTGTGAQASA 128
PV P P P P+ P P P+P V + ++ + A
Sbjct: 67 PVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASPFE 126

Query: 129 DTGVVTDLGAERPDAPAAPPG 149
+T + A + P
Sbjct: 127 NTAPARLTSSTATAATSKPVT 147



Score = 35.3 bits (81), Expect = 2e-04
Identities = 23/107 (21%), Positives = 37/107 (34%), Gaps = 3/107 (2%)

Query: 53 PPAPEQVVPTLLLQPEPVHPPAPPAPAPVAA-KPRP-PAPRPHPQHAPTPTPEPAPTVPD 110
PP Q P +++PEP P P P +P P P+P P+ +P V
Sbjct: 57 PPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKP 116

Query: 111 MADTELPELTGTGAQASADTGVVTDLGAERPDAPAAPPGPGFALPPS 157
++ A A + T ++ + A+ P P
Sbjct: 117 -VESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQ 162



Score = 30.3 bits (68), Expect = 0.009
Identities = 22/112 (19%), Positives = 32/112 (28%), Gaps = 5/112 (4%)

Query: 44 VARSHQVMWPPAPEQVVPTLLLQPEPVHPPAPPAPAPVAAKPRPPAPR-PHPQHAPTPTP 102
V + + P PE + PV P KP P P +
Sbjct: 61 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 120

Query: 103 EPAPTVPDMADTELPELTGTGAQASADTGVVTDLGAERPDAPAAPPGPGFAL 154
+P +T LT + A A+ V + R + P P A
Sbjct: 121 PASP----FENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQ 168


66  RS_RS01715  RS_RS01755  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS01715  0       12       -0.865490  phosphoenolpyruvate-protein phosphotransferase
RS_RS01720  0       15       -1.698078  (2Fe-2S)-binding protein
RS_RS01725  0       15       -1.817746  bacterioferritin
RS_RS01730  -1      15       -1.114511  molybdopterin biosynthesis protein MoeB
RS_RS01735  -1      14       -0.524347  carboxyl-terminal processing protease
RS_RS01740  -2      12       0.397996   2,3-bisphosphoglycerate-dependent
RS_RS01745  -1      12       1.463110   sulfurtransferase
RS_RS01750  1       11       2.012872   glutaredoxin
RS_RS01755  0       11       1.462439   protein-export protein SecB
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01715  PHPHTRNFRASE  631    0.0      Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 631 bits (1629), Expect = 0.0
Identities = 220/580 (37%), Positives = 331/580 (57%), Gaps = 14/580 (2%)

Query: 1 MPFTLHGIPVSRGIAIGRAHMLARAALDVSHYLVDEDKLDAEVERLRAARTAVRAELVLL 60
M + GI S G+AI +A + +D+ + + + E+E+L AA + EL +
Sbjct: 1 MHHKITGIAASSGVAIAKAFIHLEPNVDIEKTSITD--VSTEIEKLTAALEKSKEELRAI 58

Query: 61 KRELPTDAPEEMGAFLDVHRMILDDALLAQAPETLIHQRRYNAEWALTTQLEELMRQFGE 120
K + + H ++LDD L + I + NAE+AL + + F
Sbjct: 59 KDQTEASMGADKAEIFAAHLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFES 118

Query: 121 IEDEYLRERKADIEQVVERILKALAGTPVLAPAPVAAGGDPDAGMIVVAHDIAPADMLQF 180
+++EY++ER ADI V +R+L L G + A + +++A D+ P+D Q
Sbjct: 119 MDNEYMKERAADIRDVSKRVLGHLIGVETGSLATI------AEETVIIAEDLTPSDTAQL 172

Query: 181 RGQTFAGFVTDLGGRTSHTAIVARSLDIPAAVGVHNASTLIRQDDRIIIDGDNGIVIVDP 240
Q GF TD+GGRTSH+AI++RSL+IPA VG + I+ D +I+DG GIVIV+P
Sbjct: 173 NKQFVKGFATDIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNP 232

Query: 241 SLAILEEYRHRRSESELQKKRLERLRHTPTVTLDGTQIDLLANIEMPEDAAAALAAGAVG 300
+ ++ Y +R+ E QK+ +L P+ T DG ++L ANI P+D LA G G
Sbjct: 233 TEEEVKAYEEKRAAFEKQKQEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEG 292

Query: 301 VGLFRSEFLFMNRRGALPDEEEQFAAYRTAVESMNGLPVTIRTIDIGADKPLDVRDGRDD 360
+GL+R+EFL+M+R LP EEEQF AY+ V+ M+G PV IRT+DIG DK L +
Sbjct: 293 IGLYRTEFLYMDR-DQLPTEEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYLQLPKE 351

Query: 361 LFETAPNPALGLRAIRWSLSEPAMFLTQLRALLRASAFGPVRMLVPMLAHAREIDQTLEL 420
L NP LG RAIR L + +F TQLRALLRAS +G ++++ PM+A E+ Q +
Sbjct: 352 L-----NPFLGFRAIRLCLEKQDIFRTQLRALLRASTYGNLKVMFPMIATLEELRQAKAI 406

Query: 421 IERAKASLDADGMAYDPTIKVGAMIEIPAAVLILPVFLRRMDFLSIGTNDLIQYTLAIDR 480
++ K L ++G+ +I+VG M+EIP+ + +F + +DF SIGTNDLIQYT+A DR
Sbjct: 407 MQEEKDKLLSEGVDVSDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADR 466

Query: 481 ADNAVAHLYDPLHPAVLQLIARTIQEGQRAGRPVSVCGEMAGDATMTRLLLGMGLTEFSM 540
+ V++LY P HPA+L+L+ I+ G+ V +CGEMAGD LLLG+GL EFSM
Sbjct: 467 MNERVSYLYQPYHPAILRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSM 526

Query: 541 HPSQLLSVKQQILRADRPRLMREVDRLLHACEPTEVQDVL 580
+ +L + Q+L+ + L + L EV+ ++
Sbjct: 527 SATSILPARSQLLKLSKEELKPFAQKALMLDTAEEVEQLV 566


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01725  HELNAPAPROT   34     1e-04    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 33.7 bits (77), Expect = 1e-04
Identities = 21/103 (20%), Positives = 42/103 (40%), Gaps = 10/103 (9%)

Query: 44 EYKESIGEMKHADRLIERILMLDGLPN--LQDLGKLL------IGQDTKEMLECDLKLEK 95
E + E D + ER+L + G P +++ + EM++ + K
Sbjct: 52 ELYDHAAE--TVDTIAERLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVNDYK 109

Query: 96 AAHATVVEAIAYCEEVKDYVSRTLFVDILDDTEEHIDWLETQL 138
+ I EE +D + LFV ++++ E+ + L + L
Sbjct: 110 QISSESKFVIGLAEENQDNATADLFVGLIEEVEKQVWMLSSYL 152


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01730  PF05272       35     2e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 35.0 bits (80), Expect = 2e-04
Identities = 28/135 (20%), Positives = 47/135 (34%), Gaps = 18/135 (13%)

Query: 41 AAALPYLAAAGIGTLTIVDDDSVDLTNLQRQIIHTTESVGHPKVESARQAIARLNPEVRV 100
A P+ A G + D D + L + + TT G ++ QAI RV
Sbjct: 481 VRAFPWRKAPG----PLEDADVLRLAD----YVETTYGTGEASAQTTEQAINVAADMNRV 532

Query: 101 HAVRQRLDA---DGIGALLDGVTVVLDCSDNFATRQATNQACVRARVPLVSGAA------ 151
H R + A D + L + VL + + + + + L+ A
Sbjct: 533 HPFRDWVKAQQWDEVPRLEKWLVHVLGKTPDDYKPRRLRYLQLVGKYILMGHVARVMEPG 592

Query: 152 IRFDGQISVFDSRTG 166
+FD + V + G
Sbjct: 593 CKFDYSV-VLEGTGG 606


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01735  GPOSANCHOR    30     0.033    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 29.6 bits (66), Expect = 0.033
Identities = 21/57 (36%), Positives = 26/57 (45%), Gaps = 6/57 (10%)

Query: 479 LEAAAAVPDKATKSPAAKESSAPAKLPKGKAAPASEPAAPSGP--ASGVIPAPEPTG 533
LEA A KA K AK++ AKL GKA+ + P A G G AP+
Sbjct: 437 LEAEA----KALKEKLAKQAEELAKLRAGKASDSQTPDAKPGNKAVPGKGQAPQAGT 489


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01745  SHAPEPROTEIN  28     0.011    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 28.2 bits (63), Expect = 0.011
Identities = 20/62 (32%), Positives = 31/62 (50%), Gaps = 8/62 (12%)

Query: 83 SKAAGLAKNKETPIILVC------QTGQRAGRAQAVLKQAGYSEVYSLEGGLAAWQQAGL 136
+ + + +P +LVC Q +RA R A + AG EV+ +E +AA AGL
Sbjct: 96 KQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESA--QGAGAREVFLIEEPMAAAIGAGL 153

Query: 137 PI 138
P+
Sbjct: 154 PV 155


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS01755  SECBCHAPRONE  151    2e-49    Bacterial protein-transport SecB chaperone protein ...
		>SECBCHAPRONE#Bacterial protein-transport SecB chaperone protein signature.

Length = 170

Score = 151 bits (383), Expect = 2e-49
Identities = 53/160 (33%), Positives = 85/160 (53%), Gaps = 5/160 (3%)

Query: 1 MSDQQQQQPGQ---DGQPFFNIQRVYLKDLSLEQPNSPHIFLEQEQPTVEVQVDVAATQL 57
MS++ Q QP IQR+Y+KD+S E PN PHIF + +P + + A Q+
Sbjct: 1 MSEENQVNAADTQATQQPVLQIQRIYVKDVSFEAPNLPHIFQQDWEPKLSFDLSTEAKQV 60

Query: 58 AEGVFEVTVIGTVTTKVK--EKVAFLVEAKQAGIFDIRNVPVEQMDPLLGIACPTIVYPY 115
+ ++EV + +V T ++ VAF+ E KQAG+F I + QM L CP +++PY
Sbjct: 61 GDDLYEVCLNISVETTMESSGDVAFICEVKQAGVFTISGLEEMQMAHCLTSQCPNMLFPY 120

Query: 116 LRSNIADTIGRAGFQPIHLAEINFQALYEQRLASAMEEAQ 155
R ++ + R F ++L+ +NF AL+ L + Q
Sbjct: 121 ARELVSSLVNRGTFPALNLSPVNFDALFMDYLQRQEQAEQ 160


67  RS_RS02665  RS_RS02755  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS02665  -1      13       3.995957   3-ketoacyl-ACP reductase
RS_RS02670  -1      11       3.614466   acyl-CoA dehydrogenase
RS_RS02675  -1      12       3.926139   TetR family transcriptional regulator
RS_RS02680  -1      12       3.492917   RNA helicase
RS_RS02685  -1      12       2.866519   serine/threonine protein kinase
RS_RS02690  -1      11       2.167589   hypothetical protein
RS_RS02695  -1      12       1.670474   LysR family transcriptional regulator
RS_RS02700  0       11       2.358878   two-component system response regulator
RS_RS02705  -1      11       2.112721   hypothetical protein
RS_RS02715  0       10       0.322817   FAD-binding monooxygenase
RS_RS02720  0       12       1.000456   hypothetical protein
RS_RS02730  0       13       -0.301503  *MFS transporter
RS_RS02745  -2      12       0.430462   membrane protein
RS_RS02750  0       13       -0.719822  histidine kinase
RS_RS02755  2       21       -2.274462  two-component system response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS02680  DHBDHDRGNASE  87     2e-21    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 87.0 bits (215), Expect = 2e-21
Identities = 64/255 (25%), Positives = 112/255 (43%), Gaps = 14/255 (5%)

Query: 229 LAGRTALVTGASRGIGAAIAQVLARDGARVLCLD-VPAAQPALDGVARAIG--GEALAYD 285
+ G+ A +TGA++GIG A+A+ LA GA + +D P + +A EA D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 286 IAEAETPARLARQLAA-MGGIDILVHNAGITRDKTIARMTEAAWRSVLDINLAAQLRIND 344
+ ++ + ++ MG IDILV+ AG+ R I +++ W + +N +
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 345 ALLAAGALHAGGAIVCVSSISGIAGNPGQTNYATAKAGVIGLVQACAPLLAERGITINAV 404
++ G+IV V S YA++KA + + LAE I N V
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 405 APGFIETQMTAAVPFTIREAGRRMNAMAQG----------GQPVDVAEAIAWLACPASNG 454
+PG ET M ++ A + + + +P D+A+A+ +L +
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 455 VTGNVVRVCGQSLLG 469
+T + + V G + LG
Sbjct: 246 ITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS02690  HTHTETR       61     5e-13    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 60.8 bits (147), Expect = 5e-13
Identities = 24/104 (23%), Positives = 35/104 (33%), Gaps = 5/104 (4%)

Query: 1 MARPGAG---DTKNRILEATELLFIEFGYEAMSLRQITARAKVNLAAVNYHFGSKEALMQ 57
MAR +T+ IL+ LF + G + SL +I A V A+ +HF K L
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 58 SVLGRRLDPLNTRRLALLTACEE--RWPQRLSCEHVLGALFVPA 99
+ + L R HVL +
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEE 104


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS02695  SECA          30     0.033    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.033
Identities = 19/65 (29%), Positives = 33/65 (50%), Gaps = 4/65 (6%)

Query: 253 VLVFTRTKHGANRLAEQLTRDGIPALAIHG-NKSQSARTRALSEFKAGTLRVLVATDIAA 311
VLV T + + ++ +LT+ GI ++ + A A + + A V +AT++A
Sbjct: 452 VLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQAGYPAA---VTIATNMAG 508

Query: 312 RGIDI 316
RG DI
Sbjct: 509 RGTDI 513


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS02715  HTHFIS        83     2e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 83.0 bits (205), Expect = 2e-20
Identities = 37/118 (31%), Positives = 60/118 (50%), Gaps = 1/118 (0%)

Query: 2 RIALLEDDPFQSEVIAQILSQSGHDVVTFTNGATLLRMLGRSSYDMLVLDWHTPGMLGID 61
I + +DD V+ Q LS++G+DV +N ATL R + D++V D P D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 LLSVVRSRHQEVLPILFVTAEARERGLVRALGQGADDYITKPFHIPELRARVEALLRR 119
LL ++ + LP+L ++A+ ++A +GA DY+ KPF + EL + L
Sbjct: 65 LLPRIKKARPD-LPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS02740  TCRTETA       38     7e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 38.3 bits (89), Expect = 7e-05
Identities = 45/258 (17%), Positives = 86/258 (33%), Gaps = 28/258 (10%)

Query: 84 ALVFGRLGDMIGRKYTFLITILIMGTSTFIVGLLPSYTTIGAAAPVILIMLRLLQGLALG 143
A V G L D GR+ L+++ I+ P +L + R++ G+ G
Sbjct: 60 APVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLW--------VLYIGRIVAGIT-G 110

Query: 144 GEYGGAATYVAEHAPHGRRGNYTSWIQTTATLGLFLSLIVILVVRELTGASFEDWGWRIP 203
A Y+A+ R + ++ G+ ++ G + P
Sbjct: 111 ATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVL--------GGLMGGFSPHAP 162

Query: 204 FLVSIVL-LATSVYIRLSMSESPAFQKMKAEGKTSKAPLTESFGQWRNLKIVILALVGLT 262
F + L + + ES K E + + +R + + L
Sbjct: 163 FFAAAALNGLNFLTGCFLLPES-----HKGERRPLRREALNPLASFR-WARGMTVVAALM 216

Query: 263 AGQAVVWYTGQFYA---LFFLTQVLKVDAFTANVLIAVALAIGTPF-FVFFGSLSDRIGR 318
A ++ GQ A + F DA T + +A + + + G ++ R+G
Sbjct: 217 AVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGE 276

Query: 319 KGIIMAGCLLAVLTYFPL 336
+ +M G + Y L
Sbjct: 277 RRALMLGMIADGTGYILL 294


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS02750  PF06580       37     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 2e-04
Identities = 25/107 (23%), Positives = 41/107 (38%), Gaps = 25/107 (23%)

Query: 401 LLDNAIRY----TPDGGHVTVRVTTAPFEPFVFLDVEDTGPGIPAGERERVMQRFYRILG 456
L++N I++ P GG + ++ T V L+VE+TG L
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKD--NGTVTLEVENTGSLA---------------LK 305

Query: 457 TQAEGSGLGLAIVREIVQQHGGDIAVLDYVYQSSPRLAGARFRITLP 503
E +G GL VRE +Q G + + S + + +P
Sbjct: 306 NTKESTGTGLQNVRERLQMLYGT----EAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS02755  HTHFIS        93     6e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 92.6 bits (230), Expect = 6e-24
Identities = 42/157 (26%), Positives = 79/157 (50%), Gaps = 5/157 (3%)

Query: 2 RILIAEDDATLADGLTRSLRQAGYAVDRAADGAAADAALSAQTAQTYDLLILDVGLPRLS 61
IL+A+DDA + L ++L +AGY V ++ A ++A DL++ DV +P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD---GDLVVTDVVMPDEN 61

Query: 62 GLEVLKRLRSRGAMLPVLILTAADSVDERVKGLDLGADDYMAKPFALSELEARVRALVRR 121
++L R++ LPVL+++A ++ +K + GA DY+ KPF L+EL + +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 122 GTGGGATLVRHGPLAFDQVGRIAYIRD--QMVDLSAR 156
+ L VGR A +++ +++ +
Sbjct: 122 PKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQ 158


68  RS_RS03605  RS_RS03655  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS03605  2       25       1.565990   pilus assembly protein
RS_RS03610  1       26       0.873943   pilus assembly protein
RS_RS03615  0       26       0.069171   pilus assembly protein PilY
RS_RS03620  3       17       -3.752611  pilus assembly protein PilZ
RS_RS03625  4       18       -4.402937  pilus assembly protein PilW
RS_RS03630  5       20       -5.374730  membrane protein
RS_RS03635  4       21       -5.679788  general secretion pathway protein GspH
RS_RS03640  4       20       -5.727109  pilus assembly protein PilE
RS_RS03645  4       17       -5.296925  pilus biosynthesis protein PilY
RS_RS03650  -1      17       -4.356724  pilus assembly protein PilX
RS_RS03655  -1      16       -4.163155  pilus assembly protein PilW
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03605  BCTERIALGSPG  29     0.009    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 28.7 bits (64), Expect = 0.009
Identities = 11/31 (35%), Positives = 17/31 (54%)

Query: 1 MTRVAWQAGVTLAELTMVLAILAILAVFAMP 31
M Q G TL E+ +V+ I+ +LA +P
Sbjct: 1 MRATDKQRGFTLLEIMVVIVIIGVLASLVVP 31


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03610  BCTERIALGSPG  41     3e-07    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 40.6 bits (95), Expect = 3e-07
Identities = 19/88 (21%), Positives = 32/88 (36%), Gaps = 2/88 (2%)

Query: 1 MLRPTGFTFIELLITLAIAGVLA-LAASQAWSGAYLRTERMAAHAALVATMAALEQHHAQ 59
+ GFT +E+++ + I GVLA L + ++ A + +VA AL+ +
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKE-KADKQKAVSDIVALENALDMYKLD 62

Query: 60 TGSYALPGDDTSPLDRWPHTLEHPRGYR 87
Y L P Y
Sbjct: 63 NHHYPTTNQGLESLVEAPTLPPLAANYN 90


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03620  BCTERIALGSPG  29     0.006    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 29.5 bits (66), Expect = 0.006
Identities = 10/25 (40%), Positives = 15/25 (60%)

Query: 3 MRPLRPSRGFMLVFVMTSTVLLGLL 27
MR RGF L+ +M V++G+L
Sbjct: 1 MRATDKQRGFTLLEIMVVIVIIGVL 25


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03635  BCTERIALGSPG  40     1e-06    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 39.9 bits (93), Expect = 1e-06
Identities = 19/58 (32%), Positives = 31/58 (53%), Gaps = 3/58 (5%)

Query: 9 RRRSARGFTLLELMVTIAIISIMLVLVAPSF---SDFLRKQRLLSAADSVASAIGQAR 63
RGFTLLE+MV I II ++ LV P+ + KQ+ +S ++ +A+ +
Sbjct: 3 ATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYK 60


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03640  BCTERIALGSPG  47     2e-09    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 47.2 bits (112), Expect = 2e-09
Identities = 13/43 (30%), Positives = 31/43 (72%)

Query: 12 QRGFTLIELMIVVIVVAILSTIAYPSYTQFVQKSRRTQAKAAL 54
QRGFTL+E+M+V++++ +L+++ P+ +K+ + +A + +
Sbjct: 7 QRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDI 49


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03655  BCTERIALGSPG  33     6e-04    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 32.9 bits (75), Expect = 6e-04
Identities = 15/33 (45%), Positives = 22/33 (66%), Gaps = 1/33 (3%)

Query: 10 RRTRGFSLVELMVALVIALLVLAATVSFYLMTR 42
+ RGF+L+E+MV +VI + VLA+ V LM
Sbjct: 5 DKQRGFTLLEIMVVIVI-IGVLASLVVPNLMGN 36


69  RS_RS03675  RS_RS03705  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS03675  -1      11       -0.768640  oxidoreductase
RS_RS03680  -1      10       -0.955131  4-hydroxybenzoyl-CoA thioesterase
RS_RS03685  -1      10       -1.009491  protein tolQ
RS_RS03690  -1      11       -0.524168  biopolymer transporter ExbD
RS_RS03695  -3      14       -0.155809  TonB-dependent receptor
RS_RS03700  -1      14       -0.861507  TolB protein
RS_RS03705  -2      16       -1.738536  membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03675  DHBDHDRGNASE  84     2e-21    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 83.9 bits (207), Expect = 2e-21
Identities = 49/180 (27%), Positives = 81/180 (45%), Gaps = 4/180 (2%)

Query: 2 IVFVTGATAGFGAAVARRFVREGHRVIAA---GRRQDRLDALKAELGDALLPFLLDVQDA 58
I F+TGA G G AVAR +G + A + +++ + F DV+D+
Sbjct: 10 IAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDS 69

Query: 59 AAVAALPGALPEGWREVDVLVNNAGLALGLEPAHRASLSDWDLMIGTNVTGLVHVTRALL 118
AA+ + + +D+LVN AG+ L H S +W+ N TG+ + +R++
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGV-LRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 119 PGMVARNRGHVINLGSIAGTYPYPGGNVYGGTKAFVRQFSLNLRADLTGTRVRVSNVEPG 178
M+ R G ++ +GS P Y +KA F+ L +L +R + V PG
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03690  PF04335       30     0.003    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 29.8 bits (67), Expect = 0.003
Identities = 18/83 (21%), Positives = 33/83 (39%), Gaps = 9/83 (10%)

Query: 27 LVLLVIFMVAAPVVNPGVVNLPSVANATPQQTQPPIVVTI---KGDGTILARVKSGSGAT 83
V+ + A + +VA TP +T P V+T+ G+ +I A++ + T
Sbjct: 36 WVVAGVAGALA------TAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAKLHGDATIT 89

Query: 84 EQKLANNQELRAFVKQRTGENPD 106
+ L +V+ R G
Sbjct: 90 YDEAVRKYFLATYVRYREGWIAA 112


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS03695  IGASERPTASE   57     6e-11    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 57.0 bits (137), Expect = 6e-11
Identities = 27/183 (14%), Positives = 56/183 (30%), Gaps = 19/183 (10%)

Query: 60 QQAVQSAPAPEPKPEPAVQAPPAKVEEEADIALEQQKRKREEEAAAAREAELARRATAER 119
+V S + + A PPA + K+E + E +
Sbjct: 1007 VPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQD--------- 1057

Query: 120 EDAKRQALLKEQKAQQARQAELQKRALAEQLAQQAKERQAHEKAVAAEAQARKDAQLKAQ 179
E AQ A+ K + + E + ++ A ++ +
Sbjct: 1058 --------ATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKE 1109

Query: 180 AEAKAKAE--AEARQKAKAVAEAKAKADAEQRAAQARSNAQRQARLRDLQALAGTAGATG 237
+AK + E E + V+ + +++ Q A+ +++ Q+ T T
Sbjct: 1110 EKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTE 1169

Query: 238 AAA 240
A
Sbjct: 1170 QPA 1172



Score = 39.3 bits (91), Expect = 2e-05
Identities = 25/184 (13%), Positives = 64/184 (34%), Gaps = 16/184 (8%)

Query: 49 EEAELWDEAALQQA-VQSAPAPEPKPEPAVQAPPAKVEEEADIALEQQKRKREEEAAAAR 107
+EA+ +A Q V + + + + A VE+E A + ++ +E ++
Sbjct: 1070 KEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEK-AKVETEKTQEVPKVTSQ 1128

Query: 108 EAELARRATAEREDAKRQA------LLKEQKAQQARQAELQK------RALAEQLAQQAK 155
+ ++ + A+ +KE ++Q A+ ++ + + + +
Sbjct: 1129 VSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTT 1188

Query: 156 ERQAHEKAVAAEAQARKDAQLKAQAEAKAKAEAEARQKAKAVAEAK--AKADAEQRAAQA 213
+ E Q +E+ K + R+ ++V A + R+ A
Sbjct: 1189 VNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVA 1248

Query: 214 RSNA 217
+
Sbjct: 1249 LCDL 1252



Score = 35.4 bits (81), Expect = 4e-04
Identities = 21/175 (12%), Positives = 50/175 (28%), Gaps = 10/175 (5%)

Query: 51 AELWDEAALQQAVQSAPAPEPKPEPAVQAPPAKVEEEADIALEQQKRKREEEAAAAREAE 110
+++ + + VQ P + +P V + + EQ ++
Sbjct: 1127 SQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTES 1186

Query: 111 LARRATAEREDAKRQALLKEQKAQQARQAELQKRALAEQLAQQAKERQAHEKAVAAEAQA 170
+ Q +E + + + ++ V +
Sbjct: 1187 TTVNTGNSVVENPENTT--PATTQPTVNSESSNKP----KNRHRRSVRSVPHNVEPATTS 1240

Query: 171 RKDAQLKAQAEAKAK----AEAEARQKAKAVAEAKAKADAEQRAAQARSNAQRQA 221
D A + + ++AR KA+ VA KA ++ + +N +
Sbjct: 1241 SNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLEMNNEGQYN 1295



Score = 33.9 bits (77), Expect = 0.001
Identities = 31/213 (14%), Positives = 61/213 (28%), Gaps = 13/213 (6%)

Query: 60 QQAVQSAPAPEPKPEPAVQAPPAKVEEEADIALEQQKRKREEEAAAAREAELARRATAER 119
+ A ++ + + A +E ++ E+E A E E +
Sbjct: 1067 EVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETE------KTQ 1120

Query: 120 EDAKRQALLKEQKAQQARQAELQKRALAEQ---LAQQAKERQAHEKAVAAEAQARKDAQL 176
E K + + +Q + +Q +A + KE Q+ A Q K+
Sbjct: 1121 EVPKVTS---QVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSS 1177

Query: 177 KAQAEAKAKAEAEARQKAKAVAEAKAKADAEQRAAQARSNAQRQARLRDLQALAGTAGAT 236
+ E A + SN + R ++++
Sbjct: 1178 NVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPA 1237

Query: 237 GAAAGSGGGVGSGTGAGGTASAGYADRVRAKVQ 269
++ V +A +D RAK Q
Sbjct: 1238 TTSSNDRSTVALCDLTSTNTNAVLSD-ARAKAQ 1269


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
RS_RS03705  OMPADOMAIN  105    7e-30    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 105 bits (264), Expect = 7e-30
Identities = 37/145 (25%), Positives = 61/145 (42%), Gaps = 17/145 (11%)

Query: 37 GGAAAGADTRNVTPVDVSRDELTDPNSPLAKRSVYFDFDSYTVKPEYQGLLTQHARYLQS 96
G AA +V T + V F+F+ T+KPE Q L Q L +
Sbjct: 194 GEAAPVVAPAPAPAPEVQTKHFTLKSD------VLFNFNKATLKPEGQAALDQLYSQLSN 247

Query: 97 HN--QRKVLIQGNTDERGTSEYNLALGQKRAEAVRRALSSLGVPDSQMESVSLGKEKPQA 154
+ V++ G TD G+ YN L ++RA++V L S G+P ++ + +G+ P
Sbjct: 248 LDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYLISKGIPADKISARGMGESNPVT 307

Query: 155 SGHDE---------ESWAQNRRSDI 170
+ + A +RR +I
Sbjct: 308 GNTCDNVKQRAALIDCLAPDRRVEI 332
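
Note: the "Score = ..." and "Identities = ..." lines that open each alignment can be read mechanically. The Python sketch below is one way to do it and is not part of PredictBias; the function names are hypothetical. It makes two assumptions: that a bare "e-149" stands for 1e-149, and that the printed percentages are truncated rather than rounded (37/145 is shown as 25%).

```python
import re

# Patterns for the two summary lines that open each alignment block.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
FRACTION_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_expect(text):
    """Convert an Expect string to float.

    Assumption: a bare exponent such as "e-149" means 1e-149.
    """
    return float("1" + text if text.startswith("e") else text)


def parse_hsp(score_line, stats_line):
    """Return bit score, raw score, E-value and match counts as a dict."""
    match = SCORE_RE.search(score_line)
    if match is None:
        raise ValueError("not a Score line: %r" % score_line)
    hsp = {
        "bits": float(match.group(1)),
        "raw_score": int(match.group(2)),
        "expect": parse_expect(match.group(3)),
    }
    for name, num, den in FRACTION_RE.findall(stats_line):
        num, den = int(num), int(den)
        # Integer division reproduces the truncated percentage in the report.
        hsp[name.lower()] = (num, den, num * 100 // den)
    # The Gaps field is omitted when an alignment has no gaps.
    hsp.setdefault("gaps", None)
    return hsp
```

Comparing the parsed "expect" value with the E-value threshold given under Input parameters (0.05) is then a plain float comparison.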


70  RS_RS06460  RS_RS06495  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS06460   0      13        1.506034  translation initiation factor IF-2
RS_RS06465   1      14        2.194372  ribosome-binding factor A
RS_RS06470   0      14        1.390763  tRNA pseudouridine synthase B
RS_RS06475   0      16        0.898618  DSBA oxidoreductase
RS_RS06480   0      16        1.716411  hemolysin D
RS_RS06485   0      12        1.187154  outer membrane CHANEL lipoprotein
RS_RS06490  -1       9       -1.183063  MarR family transcriptional regulator
RS_RS06495   0       9       -1.546962  GTP-binding protein
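
Note: each gene row holds a locus tag, the dinucleotide bias, the codon bias, the %GC bias and the product name. The Python sketch below splits such a row whether or not the columns are separated by spaces. It is not part of PredictBias, and the names are hypothetical. It assumes that locus tags are "RS_RS" plus five digits, that the dinucleotide bias is a single signed digit, and that the %GC bias has one integer digit and six decimals, as in the rows of this report.

```python
import re
from typing import NamedTuple


class GeneRow(NamedTuple):
    locus_tag: str
    dn_bias: int      # dinucleotide bias
    cdn_bias: int     # codon bias
    gc_bias: float    # %GC bias
    product: str


# Assumed layout: tag, signed single-digit DN bias, unsigned codon bias,
# %GC bias written as d.dddddd (optionally negative), then the product.
ROW_RE = re.compile(
    r"^(RS_RS\d{5})\s*(-?\d)\s*(\d+)\s*(-?\d\.\d{6})\s*(.*)$"
)


def parse_gene_row(line):
    """Split one gene-table row into typed fields."""
    match = ROW_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised gene row: %r" % line)
    tag, dn, cdn, gc, product = match.groups()
    return GeneRow(tag, int(dn), int(cdn), float(gc), product)
```

Because the %GC bias is matched with exactly six decimals, a product name that starts with a digit (for example "3-oxoacyl-ACP reductase") is not absorbed into the number.
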
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS06460  TCRTETOQM  73     3e-15    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 73.0 bits (179), Expect = 3e-15
Identities = 64/279 (22%), Positives = 100/279 (35%), Gaps = 80/279 (28%)

Query: 471 VMGHVDHGKTSLLDYIRRAKVAAGEAG------------------GITQHIGAYHVETPR 512
V+ HVD GKT+L + + A E G GIT G +
Sbjct: 8 VLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWEN 67

Query: 513 GVITFLDTPGHEAFTAMRARGAKATDIVILVVAADDGVMPQTKEAIAHAKAAGVPIVVAI 572
+ +DTPGH F A R D IL+++A DGV QT+ + G+P + I
Sbjct: 68 TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFI 127

Query: 573 NKIDKPEANPDRVKQEL---VSESVIP--------------------------------E 597
NKID+ + V Q++ +S ++ E
Sbjct: 128 NKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGNDDLLE 187

Query: 598 EY-GGDSP-----------------FVPV---SAKTGQGIENLLENV--LLQAEVLELKA 634
+Y G S PV SAK GI+NL+E + + ++
Sbjct: 188 KYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTHRGQS 247

Query: 635 PINAAAKGLVVEAQLDKGKGPIATVLVQSGTLKRGDVVL 673
+ G V + + + + +A + + SG L D V
Sbjct: 248 EL----CGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVR 282


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS06475  TCRTETB  128    2e-34    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 128 bits (324), Expect = 2e-34
Identities = 84/402 (20%), Positives = 163/402 (40%), Gaps = 22/402 (5%)

Query: 24 LSLATFMNVLDSSIANVSIPAISGDLGVAPNQGTWVITSFAVANAISVPLTGWLTQRFGQ 83
L + +F +VL+ + NVS+P I+ D P WV T+F + +I + G L+ + G
Sbjct: 19 LCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGI 78

Query: 84 VRLFVTSILLFVLSSWACGLAPNMGT-LIAARILQGAVAGPMIPLSQSLLLSTYPPAKSS 142
RL + I++ S + + + LI AR +QGA A L ++ P
Sbjct: 79 KRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRG 138

Query: 143 MALALWGMTTLVAPIMGPIFGGWISDNMTWPWIFYINIPVGILAAYATWVIYKDRESPTR 202
A L G + +GP GG I+ + W ++ IP+ I +++ ++
Sbjct: 139 KAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL--LIPM-ITIITVPFLMKLLKKEVRI 195

Query: 203 ALPIDRIGLALLVIWVGSLQLMLDRGKELDWFNSAEIVTLTVVAVVGFAFFLVWELTEDH 262
D G+ L+ + + L F ++ ++ +V+V+ F F+
Sbjct: 196 KGHFDIKGIILMSVGIVFFML----------FTTSYSISFLIVSVLSFLIFVKHIRKVTD 245

Query: 263 PVVDLTLFKGRNFSAGVVAISVAYGLFFGNLVILPLWLQTIVGYTATDAG-LVMAPVGIF 321
P VD L K F GV+ + +G G + ++P ++ + + + G +++ P +
Sbjct: 246 PFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMS 305

Query: 322 AILLSPVIGKNLPKMDARWVATTAFITFALVFW---MRSRFTIQVDTWTLMVPTLIQGAA 378
I+ + G + + +V ++ F T T ++ L +
Sbjct: 306 VIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVF-VLGGLSF 364

Query: 379 MAMFFIPLTSIILSGLRPERIPAASGLSNFVRIMFGGIGASI 420
+++I+ S L+ + A L NF + G G +I
Sbjct: 365 TKTV---ISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS06480  RTXTOXIND  79     4e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 79.1 bits (195), Expect = 4e-18
Identities = 52/281 (18%), Positives = 102/281 (36%), Gaps = 26/281 (9%)

Query: 92 QLVTAGQPLIELDRADARVALEQAEAALAQAVRQVRTLYSNTGAYTSTLAMRESDLAKAK 151
+ V LI+ + + Q E L + + T+ + Y + + +S L
Sbjct: 182 EEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFS 241

Query: 152 EDLARRKQIAGTGAVSQE--------EIAHAQTSLQAAEAAVETAREQLQANRVLTEQTT 203
L ++ IA + QE E+ ++ L+ E+ + +A+E+ Q L +
Sbjct: 242 S-LLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEI 300

Query: 204 LER----HPNVLQAAAKVREAYLAYARTSLPASVTGYVAKRSVQ-VGQRVAPGTPLMAIV 258
L++ N+ ++ + + + A V+ V + V G V LM IV
Sbjct: 301 LDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIV 360

Query: 259 PLDQ-LWVDANFKEVQVRHMRVGQPVELVADVYGSSVTYH--GKVVGFSAGTGSAFSLLP 315
P D L V A + + + VGQ + + + + + GKV +
Sbjct: 361 PEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA-------I 413

Query: 316 AQNATGNWIKVVQRLPVRVSLDPKELKDHPLRVGLSMEARV 356
G V+ + K+ PL G+++ A +
Sbjct: 414 EDQRLGLVFNVIISIEENCLST--GNKNIPLSSGMAVTAEI 452



Score = 52.5 bits (126), Expect = 1e-09
Identities = 34/212 (16%), Positives = 69/212 (32%), Gaps = 35/212 (16%)

Query: 38 LAAALVVAGVGYGLYWGLYGRWFESTDDAYVQGNVV------QVTPQVAGTVVAIRADDT 91
L A ++ + + G+ E A G + ++ P V I +
Sbjct: 59 LVAYFIMGFLVIAFILSVLGQ-VEIV--ATANGKLTHSGRSKEIKPIENSIVKEIIVKEG 115

Query: 92 QLVTAGQPLIELDRADARVALEQAEAALAQA-VRQVRTLYSNTGAYTSTL---------- 140
+ V G L++L A + +++L QA + Q R + + L
Sbjct: 116 ESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPY 175

Query: 141 --------AMRESDLAKAKEDLARR----KQIAGTGAVSQEEIAHAQTSLQAAEAAVETA 188
+R + L K + + K++ ++ A+ + E
Sbjct: 176 FQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLAR--INRYENLSRVE 233

Query: 189 REQLQANRVLTEQTTLERHPNVLQAAAKVREA 220
+ +L L + + +H VL+ K EA
Sbjct: 234 KSRLDDFSSLLHKQAIAKH-AVLEQENKYVEA 264


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS06495  TCRTETOQM  162    6e-45    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 162 bits (412), Expect = 6e-45
Identities = 98/445 (22%), Positives = 171/445 (38%), Gaps = 82/445 (18%)

Query: 4 LRNIAIIAHVDHGKTTLVDQLLRQAGTFRENEQIAE--RVMDSNDIEKERGITILAKNCA 61
+ NI ++AHVD GKTTL + LL +G E + + D+ +E++RGITI +
Sbjct: 3 IINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITS 62

Query: 62 VEYNGTHINIVDTPGHADFGGEVERVLSMVDGVLLLVDAVEGPMPQTRFVTRKALALGLK 121
++ T +NI+DTPGH DF EV R LS++DG +LL+ A +G QTR + +G+
Sbjct: 63 FQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIP 122

Query: 122 PIVVINKVDRPGAR-------------PEWVINQTFDLFDK---------------LGAN 153
I INK+D+ G E VI Q +L+ + N
Sbjct: 123 TIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGN 182

Query: 154 DDQLD--------------------------FPVVYASGLNGYAGLTDDVRSGDMKPLFE 187
DD L+ FPV + S N + L E
Sbjct: 183 DDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIG----------IDNLIE 232

Query: 188 TILDKVPQRNDDPNGSFQLQIISLDYSSYVGKIGVGRITRGRVRPLQDVVVKFGPEGAPI 247
I +K ++ ++YS ++ R+ G + L+D V E
Sbjct: 233 VITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLH-LRDSVRISEKE---- 287

Query: 248 KGRVNQVLKFRGLERELVQEAEAGDIVLINGIDELGIGCTVCAPDAQDALPMLKVDEPTL 307
K ++ ++ E + +A +G+IV++ + L + + ++ P L
Sbjct: 288 KIKITEMYTSINGELCKIDKAYSGEIVILQN-EFLKLNSVLGDTKLLPQRERIENPLPLL 346

Query: 308 TMNFCVNTSPLAGREGKFVTSRQLRDRLDRELKSNVALRVKDTGDDTIFEVSGRGELHLT 367
+ + D L + V + I +S G++ +
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPL-------LRYYVDSATHEII--LSFLGKVQME 397

Query: 368 ILVENMRRE-GYELAVSRPRVVFKE 391
+ ++ + E+ + P V++ E
Sbjct: 398 VTCALLQEKYHVEIEIKEPTVIYME 422


71  RS_RS06845  RS_RS06900  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS06845   0      12        2.099499  MFS transporter
RS_RS06850   0      12        1.910180  acyl-CoA-binding protein
RS_RS06855   1      13        2.374836  tRNA threonylcarbamoyladenosine biosynthesis
RS_RS06860   1      15        1.600500  acyltransferase
RS_RS06865   2      17        1.799303  DNA polymerase
RS_RS06870   1      15        2.125995  MFS transporter
RS_RS06875   0      12        3.151544  alanine racemase
RS_RS06880  -1      14        3.564946  DNA repair protein RadA
RS_RS06885  -1      14        3.982940  disulfide bond formation protein B
RS_RS06890   0      14        3.934766  hypothetical protein
RS_RS06895  -2      14        2.398619  carnitine dehydratase
RS_RS06900  -2      15        1.472842  ABC transporter
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS06845  TCRTETB  124    4e-33    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 124 bits (312), Expect = 4e-33
Identities = 87/405 (21%), Positives = 179/405 (44%), Gaps = 13/405 (3%)

Query: 23 VLFTLTVGAVSSIISATIVNVAIPDLSRHFVLGQERAQWVSASFMVAMTLAMLLTPWLLL 82
+L L + + S+++ ++NV++PD++ F WV+ +FM+ ++ + L
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 83 RFGLRRTFIGALLLLGVGGLVGGLSPTY-GVMIAMRVVGGVAAGIMQPLPNILILRVFPE 141
+ G++R + +++ G ++G + ++ ++I R + G A L +++ R P+
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 142 REQGKAFGLFGFGVVLAPALGPSLGGFLVELFGWRSIFFVVVPFTLLGWAMARRFMAINS 201
+GKAFGL G V + +GP++GG + W S ++ T++ + +
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW-SYLLLIPMITIITVPFLMKLLKKEV 193

Query: 202 SMAGEPKPLDWRGLLLVGAATVALLNGLVELHADVTRGVALMAVSAVCLAVFLFWQRRVE 261
+ G D +G++L+ V + + ++ + VS + +F+ R+V
Sbjct: 194 RIKG---HFDIKGIILMSVGIVFFMLFT------TSYSISFLIVSVLSFLIFVKHIRKVT 244

Query: 262 SPLLDLRLFSYRQFAMGAVVAFIYGAGLYGSTYLLPVYMQVALAYTPSSAGLVLLPPG-L 320
P +D L F +G + I + G ++P M+ + + G V++ PG +
Sbjct: 245 DPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTM 304

Query: 321 ALAATIVVAGRLTSRIEPYRLVSFGLAALAVSFLLMTTNTRATGYLLLIGIAAIGRVGLG 380
++ + G L R P +++ G+ L+VSFL + T + + I I + GL
Sbjct: 305 SVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFV-LGGLS 363

Query: 381 FVLPSLSLGAMRGVDFTLIPQGSSAVNFLRQLGGAIGVSATGIFL 425
F +S + G S +NF L G++ G L
Sbjct: 364 FTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLL 408


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS06860  SACTRNSFRASE  43     7e-08    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 43.4 bits (102), Expect = 7e-08
Identities = 18/74 (24%), Positives = 30/74 (40%)

Query: 90 NVTVAPAWQRQGLGRWLLRAAQALTLAHGFASLLLEVRPSNAGAIALYRRVGFAEIGRRK 149
++ VA ++++G+G LL A + F L+LE + N A Y + F
Sbjct: 94 DIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAVDT 153

Query: 150 RYYPAENNTREDAL 163
Y E A+
Sbjct: 154 MLYSNFPTANEIAI 167


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS06865  SURFACELAYER  29     0.029    Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 28.9 bits (64), Expect = 0.029
Identities = 13/71 (18%), Positives = 28/71 (39%)

Query: 48 VADAPAPRIEASPVVVAEVVPVAAPAAAAAATVADVPTDSTVPTDSTDPVAPRAQRIAAF 107
+ A A + A + A +PV A A + + T++ D T ++ A +
Sbjct: 7 IVSAAAAALLAVAPIAATAMPVNAATTINADSAINANTNAKYDVDVTPSISAIAAVAKSD 66

Query: 108 DWAQLEAAVSG 118
+ +++G
Sbjct: 67 TMPAIPGSLTG 77


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
RS_RS06875  ALARACEMASE  419    e-149    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 419 bits (1080), Expect = e-149
Identities = 211/359 (58%), Positives = 256/359 (71%), Gaps = 4/359 (1%)

Query: 1 MPRPIQAVIHGPALVNNLQVVRRHAADSRVWAVIKANAYGHGIERAYEGLRQADGFGLLD 60
M RPIQA + AL NL +VR+ A +RVW+V+KANAYGHGIER + + DGF LL+
Sbjct: 1 MTRPIQASLDLQALKQNLSIVRQAATHARVWSVVKANAYGHGIERIWSAIGATDGFALLN 60

Query: 61 LDEAVRLRQLGWQGPILLLEGFFKPEDLALVEQYRLTTTVHCEEQLRMLELARLKGPVSI 120
L+EA+ LR+ GW+GPIL+LEGFF +DL + +Q+RLTT VH QL+ L+ ARLK P+ I
Sbjct: 61 LEEAITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAPLDI 120

Query: 121 QLKINTGMSRLGFAPAAYRAAWEHARAISGIGTIVHMTHFSDADGPRGIDHQLAAFEQAT 180
LK+N+GM+RLGF P W+ RA++ +G + M+HF++A+ P GI +A EQA
Sbjct: 121 YLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTLMSHFAEAEHPDGISGAMARIEQAA 180

Query: 181 QGLPGEASLSNSAATLWHPRAHRDWVRPGVILYGASPTGVAADIEGTGLMPAMTLKSELI 240
+GL SLSNSAATLWHP AH DWVRPG+ILYGASP+G DI TGL P MTL SE+I
Sbjct: 181 EGLECRRSLSNSAATLWHPEAHFDWVRPGIILYGASPSGQWRDIANTGLRPVMTLSSEII 240

Query: 241 AVQDLQPGATVGYGSRFEAEQPMRIGIVACGYADGYPRHAPGWDGNYTPVLVDGVRTRMV 300
VQ L+ G VGYG R+ A RIGIVA GYADGYPRHAP TPVLVDGVRT V
Sbjct: 241 GVQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAP----TGTPVLVDGVRTMTV 296

Query: 301 GRVSMDMITVDLAEVPGARVGAPVTLWGQGLPIDEVAHAAGTVGYELMCALAPRVPVTV 359
G VSMDM+ VDL P A +G PV LWG+ + ID+VA AAGTVGYELMCALA RVPV
Sbjct: 297 GTVSMDMLAVDLTPCPQAGIGTPVELWGKEIKIDDVAAAAGTVGYELMCALALRVPVVT 355


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS06880  TCRTETOQM  29     0.046    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 29.1 bits (65), Expect = 0.046
Identities = 15/74 (20%), Positives = 29/74 (39%), Gaps = 12/74 (16%)

Query: 106 LLQALSNLAASRRVLYVSGEESGAQIALRARRLGVESPNLALLAEIQLERIQATIEAEKP 165
LL AL ++ S +L + + +I L L ++Q+E A ++ +
Sbjct: 361 LLDALLEISDSDPLLRYYVDSATHEIILS------------FLGKVQMEVTCALLQEKYH 408

Query: 166 EVVVIDSIQTLYSE 179
+ I +Y E
Sbjct: 409 VEIEIKEPTVIYME 422


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
RS_RS06900  GPOSANCHOR  30     0.025    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 30.4 bits (68), Expect = 0.025
Identities = 21/118 (17%), Positives = 41/118 (34%), Gaps = 1/118 (0%)

Query: 524 LRQAAAKRAAASTPAAAGTGDAAPAVNRRDQKREEAE-QRQRLAARRKPLQKDVDTVEKR 582
+ +A A A ++ E A +A+ K L+ + +E R
Sbjct: 202 AMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEAR 261

Query: 583 MAPLQQEKTALEAFLADGAAYEDTNKARLMESLRRQAEVNTELDQLEARWLALQEQLE 640
A L++ F +A T +A +A++ + L A +L+ L+
Sbjct: 262 QAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLD 319


72  RS_RS08015  RS_RS08060  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS08015  -2      10        1.402399  alpha-dehydro-beta-deoxy-D-glucarate aldolase
RS_RS08020  -2      14        0.781454  MarR family transcriptional regulator
RS_RS08025  -2      12        0.709160  exodeoxyribonuclease III
RS_RS08030  -2      14       -0.268138  cytosine deaminase
RS_RS08035  -2      15       -0.527056  oligopeptidase A
RS_RS08040  -3      14       -1.025711  bifunctional protein FolD
RS_RS08045  -2      13       -1.124588  LuxR family transcriptional regulator
RS_RS08050  -3      12       -0.952058  PAS domain-containing two-component system
RS_RS08055  -1      15       -1.750887  pyruvate dehydrogenase
RS_RS08060  -1      12       -0.804964  dihydrolipoamide acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS08015  PHPHTRNFRASE  46     9e-08    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase

signature.
Length = 572

Score = 45.9 bits (109), Expect = 9e-08
Identities = 35/167 (20%), Positives = 59/167 (35%), Gaps = 31/167 (18%)

Query: 93 LLIPMVQSAEEARAAVATMRYPPHGNRGVGSALARASRWNRVDDYVRRADAEMCTLVQVE 152
++ PM+ + EE R A A M+ G ++ + + VE
Sbjct: 389 VMFPMIATLEELRQAKAIMQEEKDKLLSEGVDVSDSIEVG----------------IMVE 432

Query: 153 TPRGVEHLDAILAVAGVDGVFIGPGDLA----------ASMGHLGEPGHPEVRQTIDRTI 202
P + VD IG DL + +L +P HP + + +D I
Sbjct: 433 IPSTAVAANLFAKE--VDFFSIGTNDLIQYTMAADRMNERVSYLYQPYHPAILRLVDMVI 490

Query: 203 RRIVAAGKAAGI---LSADPALARHYLSLGATFVAVGSDVTLLARSA 246
+ + GK G+ ++ D L LG ++ + L ARS
Sbjct: 491 KAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQ 537


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS08025  MALTOSEBP  30     0.007    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 30.5 bits (68), Expect = 0.007
Identities = 47/181 (25%), Positives = 73/181 (40%), Gaps = 38/181 (20%)

Query: 49 LAELEAAGYASLFTGQKTYNGVAILARKAAMPEGRDVVRNIPDFADEQQRVVAATYDVAG 108
LA++E G K YNG+A + +K G V PD +E+ VAAT D
Sbjct: 25 LAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIKVTVEHPDKLEEKFPQVAATGD--- 81

Query: 109 GPVRVISAYFPNGQALDS-------------DKMVYKMRWLAALQDWLQAEMAAHP---- 151
GP + A+ G S DK+ Y W A + ++ A+P
Sbjct: 82 GPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKL-YPFTWDAVRYN---GKLIAYPIAVE 137

Query: 152 RLMLLGDFNIAPDDRDVHDPKKWEGMNLVSPEERAAFRALESAGLVDAFRMFEQEDKLFS 211
L L+ + ++ P+ PK WE E A + L++ G + MF ++ F+
Sbjct: 138 ALSLIYNKDLLPN-----PPKTWE-------EIPALDKELKAKG--KSALMFNLQEPYFT 183

Query: 212 W 212
W
Sbjct: 184 W 184


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
RS_RS08045  HTHFIS  105    8e-29    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 105 bits (264), Expect = 8e-29
Identities = 38/150 (25%), Positives = 68/150 (45%), Gaps = 1/150 (0%)

Query: 14 TVFVVDDDEAMRDSLTWLLEGNGYNVRTYRSAEEFLVDDKRGEGVGCLILDVRMQGMSGP 73
T+ V DDD A+R L L GY+VR +A G+G ++ DV M +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDG-DLVVTDVVMPDENAF 63

Query: 74 ELQDRLLAENNRMPIVFVTGHGDVPMAVSTMKKGAVDFIEKPFDESELRELVERMLSKAR 133
+L R+ +P++ ++ A+ +KGA D++ KPFD +EL ++ R L++ +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 134 TEDSAAREQKAANDLLSRLTTREQQVLERI 163
S + L + Q++ +
Sbjct: 124 RRPSKLEDDSQDGMPLVGRSAAMQEIYRVL 153


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
RS_RS08050  TONBPROTEIN  30     0.023    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 30.3 bits (68), Expect = 0.023
Identities = 11/62 (17%), Positives = 19/62 (30%), Gaps = 5/62 (8%)

Query: 5 PRPQNSGVRPPIMLSPSHATPEPADPATQPPA-----AAMPVPVPAPELPPEDGVRPNVP 59
PQ P ++ P +P + P P P P P ++ + +V
Sbjct: 56 EPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVK 115

Query: 60 RG 61

Sbjct: 116 PV 117


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS08060  RTXTOXIND  30     0.021    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 30.2 bits (68), Expect = 0.021
Identities = 12/37 (32%), Positives = 23/37 (62%)

Query: 49 VPSPKSGVVKEVKIKVGDAVSEGSLVLLLEEQGATAA 85
+ ++ +VKE+ +K G++V +G ++L L GA A
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEAD 135


73  RS_RS08300  RS_RS08335  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS08300  -1       7        0.463589  two-component system sensor histidine kinase
RS_RS08305  -1       7        0.866667  two-component system response regulator
RS_RS08310  -3      11        0.281278  membrane protein
RS_RS08315  -3      10       -0.537311  hypothetical protein
RS_RS08320  -3       9       -0.589841  hypothetical protein
RS_RS08325  -4       9       -0.717964  phytoene synthase
RS_RS08330  -3      10       -1.244121  RND transporter MFP subunit
RS_RS08335  -3      11       -1.864418  drug efflux protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS08300  PF06580  34     0.001    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.1 bits (78), Expect = 0.001
Identities = 14/82 (17%), Positives = 27/82 (32%), Gaps = 17/82 (20%)

Query: 345 ILDNLIENARRYGKTHDTGRAEITVSTALQGNEVVLCVADRGAGVPADQLALLTRPFYRL 404
++ L+EN ++G +I + V L V + G+
Sbjct: 259 LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLA--------------- 303

Query: 405 ESARSEAKGAGLGMSIVSRILQ 426
++ + G G+ V LQ
Sbjct: 304 --LKNTKESTGTGLQNVRERLQ 323


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
RS_RS08305  HTHFIS  104    3e-28    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 104 bits (262), Expect = 3e-28
Identities = 43/137 (31%), Positives = 71/137 (51%), Gaps = 2/137 (1%)

Query: 4 SGHKILVVDDDPRLRDLLRRYLGEQGFTVLVAENATAMNKLWLRERFDLLVLDLMMPGED 63
+G ILV DDD +R +L + L G+ V + NA + + DL+V D++MP E+
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 64 GLSICRRLRGANDQTPIIMLTAKGEDVDRIVGLEMGADDYLPKPFNPRELIARIHAVL-- 121
+ R++ A P+++++A+ + I E GA DYLPKPF+ ELI I L
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 122 RRRGPAEIPGAPSETPE 138
+R P+++ +
Sbjct: 122 PKRRPSKLEDDSQDGMP 138


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS08330  RTXTOXIND  56     1e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 55.6 bits (134), Expect = 1e-10
Identities = 39/261 (14%), Positives = 78/261 (29%), Gaps = 59/261 (22%)

Query: 104 ATVRKGQVLARLDPTDLALAQQSAQAQLQAAKTDRDLAASDLKRFSELFAKGFIS----- 158
+T + + L+ + + A++ + + S L FS L K I+
Sbjct: 196 STWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVL 255

Query: 159 AAEQQRHQANYDAAQAR-----------------------YDQAVAGYRNQSN------- 188
E + +A + + + + Q+
Sbjct: 256 EQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLT 315

Query: 189 --------QAAYATLEADADGVVTSVDA-EVGQVVTAGQPVVRVAQTAEK-EVVVGIPED 238
+ + + A V + G VVT + ++ + + EV +
Sbjct: 316 LELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNK 375

Query: 239 QVDALRKSPDVRVKLWADPSRS---LPGKVREIAPAADPVTRT---YTVKITVP------ 286
+ + + +K+ A P L GKV+ I A R + V I++
Sbjct: 376 DIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLST 435

Query: 287 -NPPPDLKLGMTAVATFVRPG 306
N L GM A ++ G
Sbjct: 436 GNKNIPLSSGMAVTA-EIKTG 455



Score = 49.4 bits (118), Expect = 1e-08
Identities = 28/169 (16%), Positives = 55/169 (32%), Gaps = 10/169 (5%)

Query: 69 AAGTVEFSGDVRPRVESRLGFRVPGKIIARLVDVGATVRKGQVLARLDPTDLALAQQSAQ 128
A G + SG + ++ V +I +V G +VRKG VL +L Q
Sbjct: 86 ANGKLTHSGRSK-EIKPIENSIV-KEI---IVKEGESVRKGDVLLKLTALGAEADTLKTQ 140

Query: 129 AQLQAAKTD--RDLAASDLKRFSELFAKGFIS-AAEQQRHQANYDAAQARYDQAVAGYRN 185
+ L A+ + R S ++L Q + + + + ++N
Sbjct: 141 SSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQN 200

Query: 186 QSNQ--AAYATLEADADGVVTSVDAEVGQVVTAGQPVVRVAQTAEKEVV 232
Q Q A+ V+ ++ + + K+ +
Sbjct: 201 QKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAI 249


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS08335  ACRIFLAVINRP  502    e-163    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 502 bits (1294), Expect = e-163
Identities = 234/1056 (22%), Positives = 438/1056 (41%), Gaps = 71/1056 (6%)

Query: 8 LSRWALEHQPLTRFLLVALLLGGIFAYTQLGQDEDPPFTFRAMVVQAFWPGATAEQMSRQ 67
++ + + L + L++ G A QL + P A+ V A +PGA A+ +
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 68 VTDKIEKALQEVPYAWKIRSYSKPGETL---VTFQLADTSPTKDTQQLWYTVRKKVGDIA 124
VT IE+ + + + S S ++ +TFQ T P Q V+ K+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQS-GTDPDIAQVQ----VQNKLQLAT 115

Query: 125 SSLPTGVRGPY-FNDDFGDVYGSIYALSADGFTYRQ--LNDYADA-IRQQLLRVPNVAKV 180
LP V+ + Y + +D Q ++DY + ++ L R+ V V
Sbjct: 116 PLLPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDV 175

Query: 181 TLLGDQDEKIYIEFQQAKFAQMGLDINSIANQIAQQNNIGPSGVLVTPTD------NVQI 234
L G Q + I + L + NQ+ QN+ +G L N I
Sbjct: 176 QLFGAQYA-MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASI 234

Query: 235 RLSGQFSDIRDLENLTLRGPGGTSNIRLGDIATIRHGYIDPPRAKMRFNGKEVIGLGISM 294
+F + + +TLR S +RL D+A + G + R NGK GLGI +
Sbjct: 235 IAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELG-GENYNVIARINGKPAAGLGIKL 293

Query: 295 AKGGDIIQLGKDLRATVERIRAKLPVGIEMQQVQDQPQSVQHSVGEFVHVLIEAVVIVLA 354
A G + + K ++A + ++ P G+++ D VQ S+ E V L EA+++V
Sbjct: 294 ATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFL 353

Query: 355 VSFLSLGLHTKPRLRIDVWPGLVVGLTIPLVLAVTFLFMNIFDIGLHKISLGALIIALGL 414
V +L L ++ L+ + +P+VL TF + F ++ +++ +++A+GL
Sbjct: 354 VMYLFLQ---------NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGL 404

Query: 415 LVDDAIIAVEMMVR-KLEEGFSKMEAATFAYTSTAMPMLTGTLITATGFLPVGLARSTVG 473
LVDDAI+ VE + R +E+ EA + + ++ ++ + F+P+ + G
Sbjct: 405 LVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTG 464

Query: 474 EYTFGIFAVT-ALALVLSWVAAVVFVPYLGYLLL---HTKSHVGDGGHHELFDTPFYNRF 529
+ F++T A+ LS + A++ P L LL + H GG F+T F +
Sbjct: 465 AI-YRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSV 523

Query: 530 RGW---VNWCVEYRKTVIVITLAAFVLGVFGFKYVEKQFFPDSSRPELMVELWLPEGASF 586
+ V + ++I V F + F P+ + + + LP GA+
Sbjct: 524 NHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQ 583

Query: 587 GQTEAEAKRFE--ALIRQQKSVESVAFFIGSGAPRFYLPLDQILPQTNVAQAIVMPTSLE 644
+T+ + L ++ +VESV G N A V E
Sbjct: 584 ERTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSG---------QAQNAGMAFVSLKPWE 634

Query: 645 TR---EDVRQEIIGLLKSQFPQLRGRVKLLPNGPPV-------PYPVQ-FRVMGPDIGGV 693
R E+ + +I K + ++R + N P + + + G +
Sbjct: 635 ERNGDENSAEAVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDAL 694

Query: 694 RKIADQVKAIMQANPNTV-GVNDNWNENVKLLRLDIDQDKARALGVTTGSIAQVTQTVMS 752
+ +Q+ + +P ++ V N E+ +L++DQ+KA+ALGV+ I Q T +
Sbjct: 695 TQARNQLLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALG 754

Query: 753 GAPIAQYRDGDKLLDIVMRPQERERNTLDALQNVQVPTASGRVVPLTQVARVGFAWEPGV 812
G + + D ++ + ++ + R + + + V +A+G +VP + + +
Sbjct: 755 GTYVNDFIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPR 814

Query: 813 IWRENRDYGITVQSDVVDGVQGPTVTAQINPLLDKIRADLPPDYQIKIAGAEEESANAGA 872
+ R N + +Q + G + L++ + + LP G + +G
Sbjct: 815 LERYNGLPSMEIQGEAAPGT----SSGDAMALMENLASKLPAGIGYDWTGMSYQERLSGN 870

Query: 873 SIVAQMPLCIFIIFTLLMLQLHSFSRSVMVFLTGPLGLIGAAATLLLLRAPMGFVAQLGI 932
A + + ++F L S+S V V L PLG++G L +G+
Sbjct: 871 QAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGL 930

Query: 933 TALIGMIIRNSVILVDQIEQ-DVATGVPTWTAIVEAAVRRFRPIILTAAAAVLAMIPLSR 991
IG+ +N++++V+ + G A + A R RPI++T+ A +L ++PL+
Sbjct: 931 LTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAI 990

Query: 992 SVFWG-----PMAAAIMGGLIVATVLTLLFLPALYA 1022
S G + +MGG++ AT+L + F+P +
Sbjct: 991 SNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFV 1026



Score = 113 bits (284), Expect = 2e-27
Identities = 99/528 (18%), Positives = 186/528 (35%), Gaps = 49/528 (9%)

Query: 7 NLSRWALEHQPLTRFLLVALLLGGIFAYTQLGQDEDPPF-TFRAMVVQAFWPGATAEQMS 65
N L + ++ G + + +L P + + GAT E+
Sbjct: 528 NSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQ 587

Query: 66 R---QVTD---KIEKALQEVPYAWKIRSYSKP----GETLVTFQLAD--TSPTKDTQQLW 113
+ QVTD K EKA E + S+S G V+ + + + +
Sbjct: 588 KVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVI 647

Query: 114 YTVRKKVGDIASSLPTGVRGPYFNDDFGDVYGSIYALSAD-GFTYRQLNDYADAIRQQLL 172
+ + ++G I P + G G + L G + L + +
Sbjct: 648 HRAKMELGKIRDGFVIPFNMPAI-VELGTATGFDFELIDQAGLGHDALTQARNQLLGMAA 706

Query: 173 RVP-NVAKVTLLGDQDE-KIYIEFQQAKFAQMGLDINSIANQIAQQNNIGPSGVLVTPTD 230
+ P ++ V G +D + +E Q K +G+ ++ I I +G + V
Sbjct: 707 QHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTI--STALGGTYVNDFIDR 764

Query: 231 ----NVQIRLSGQF-SDIRDLENLTLRGPGGTSNIRLGDIATIRHGYIDPPRAKMRFNGK 285
+ ++ +F D++ L +R G + T Y PR + R+NG
Sbjct: 765 GRVKKLYVQADAKFRMLPEDVDKLYVRSANGEM-VPFSAFTTSHWVY-GSPRLE-RYNGL 821

Query: 286 EVIGLGISMAKGGDIIQLGKDLRATVERIRAKLPVGIEMQQVQDQPQSVQHSVGEFVHVL 345
+ + A G D A +E + +KLP GI + S + ++
Sbjct: 822 PSMEIQGEAAPGTSS----GDAMALMENLASKLPAGIGYD-WTGMSYQERLSGNQAPALV 876

Query: 346 IEAVVIV---LAVSFLSLGLHTKPRLRIDVWPGLVVGLTIPLVLAVTFLFMNIFDIGLHK 402
+ V+V LA + S + + V L +PL + L +F+
Sbjct: 877 AISFVVVFLCLAALYESWSI------------PVSVMLVVPLGIVGVLLAATLFNQKNDV 924

Query: 403 ISLGALIIALGLLVDDAIIAVEMMV-RKLEEGFSKMEAATFAYTSTAMPMLTGTLITATG 461
+ L+ +GL +AI+ VE +EG +EA A P+L +L G
Sbjct: 925 YFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILG 984

Query: 462 FLPVGLARSTVGEYTFGIFAVTALALVLSWVAAVVFVPYLGYLLLHTK 509
LP+ ++ + +V + + A+ FVP ++++
Sbjct: 985 VLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVF-FVVIRRC 1031



Score = 78.3 bits (193), Expect = 1e-16
Identities = 78/515 (15%), Positives = 166/515 (32%), Gaps = 47/515 (9%)

Query: 542 TVIVITLAAFVLGVFGFKYVEKQFFPDSSRPELMVELWLPEGASFGQTEAEAKRF--EAL 599
V+ + + G + +P + P + V P GA + + + +
Sbjct: 11 FAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYP-GADAQTVQDTVTQVIEQNM 69

Query: 600 --IRQQKSVESVAFFIGSGAPRFYLPLDQILPQTNVAQAIVMPTSLETREDVRQEIIGLL 657
I + S + GS T+ A V V Q + L
Sbjct: 70 NGIDNLMYMSSTSDSAGSVTITLTFQSG-----TDPDIAQV---------QV-QNKLQLA 114

Query: 658 KSQFPQ--LRGRVKLLPNGPPVPYPVQFRVMGPDIG-------GVRKIADQVKAIMQANP 708
PQ + + + + F P + D + +
Sbjct: 115 TPLLPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRL----N 170

Query: 709 NTVGVNDNWNENVKLLRLDIDQDKARALGVTTGSIAQV----TQTVMSGAPIAQYRDGDK 764
V + +R+ +D D +T + + +G +
Sbjct: 171 GVGDVQLFGAQ--YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQ 228

Query: 765 LLDIVMRPQERERNTLDALQNVQVPTASGRVVPLTQVARVGFAWEP-GVIWRENRDYGIT 823
L+ + Q R +N + + + G VV L VARV E VI R N
Sbjct: 229 QLNASIIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAG 288

Query: 824 VQSDVVDGVQGPTVTAQINPLLDKIRADLPPDYQIKIAGAEEESANAGASIVAQMPL-CI 882
+ + G I L +++ P ++ V + I
Sbjct: 289 LGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAI 348

Query: 883 FIIFTLLMLQLHSFSRSVMVFLTGPLGLIGAAATLLLLRAPMGFVAQLGITALIGMIIRN 942
++F ++ L L + +++ + P+ L+G A L + + G+ IG+++ +
Sbjct: 349 MLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDD 408

Query: 943 SVILVDQIEQ-DVATGVPTWTAIVEAAVRRFRPIILTAAAAVLAMIPL-----SRSVFWG 996
++++V+ +E+ + +P A ++ + ++ A IP+ S +
Sbjct: 409 AIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYR 468

Query: 997 PMAAAIMGGLIVATVLTLLFLPALYAAWFRVKRPE 1031
+ I+ + ++ ++ L+ PAL A + E
Sbjct: 469 QFSITIVSAMALSVLVALILTPALCATLLKPVSAE 503


74  RS_RS08740  RS_RS08775  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS08740   0      15        1.595769  3-oxoacyl-ACP reductase
RS_RS08745   1      17        1.818317  polyamine ABC transporter permease
RS_RS08750   1      18        2.037613  polyamine ABC transporter substrate-binding
RS_RS08755   0      18        1.984357  spermidine/putrescine ABC transporter
RS_RS08760   1      17        2.453328  Fe3+/spermidine/putrescine ABC transporter
RS_RS08765   0      18        2.327888  sensor histidine kinase
RS_RS08770  -2      15        1.179909  two-component system response regulator
RS_RS08775  -2      12        1.175434  porin
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS08740  DHBDHDRGNASE  89     2e-23    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 89.3 bits (221), Expect = 2e-23
Identities = 66/258 (25%), Positives = 112/258 (43%), Gaps = 19/258 (7%)

Query: 10 IALVTGGSRGLGRNMVQKLARGGTDVIFTYRSNQAEAQALVAELAALGRRAAALQLDAGQ 69
IA +TG ++G+G + + LA G + N + + +V+ L A R A A D
Sbjct: 10 IAFITGAAQGIGEAVARTLASQGAHIA-AVDYNPEKLEKVVSSLKAEARHAEAFPADV-- 66

Query: 70 IATFAAFADAVRGVLAQWQRE--RIDALVNNAGIGIHASLAEMTEAQFDALMNVHFKGVF 127
+ A+ + A+ +RE ID LVN AG+ + +++ +++A +V+ GVF
Sbjct: 67 -----RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVF 121

Query: 128 FLTQALLPLIAD--GGRIVNISSGLTRFAFVGYGAYAAMKGAVEVLTKYMAKELGPRGIA 185
++++ + D G IV + S AYA+ K A + TK + EL I
Sbjct: 122 NASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIR 181

Query: 186 VNVVAPGAIETDFGGGRVRDDANLNAMVAAQTA-------LGRVGLPDDIGGVVASLLAP 238
N+V+PG+ ETD D+ ++ L ++ P DI V L++
Sbjct: 182 CNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSG 241

Query: 239 DNRWINAQRIEASGGMFL 256
I + GG L
Sbjct: 242 QAGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS08750  CHANLCOLICIN  29     0.045    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 28.9 bits (64), Expect = 0.045
Identities = 45/179 (25%), Positives = 73/179 (40%), Gaps = 12/179 (6%)

Query: 11 STARLKRELKSAEARRRAMALALVAPL----ALFLLLIFVVPIGALLTRAVQNPEVADAL 66
STA+LK+ AR +A A A AL L +V AL A + P +
Sbjct: 58 STAQLKKTQAEQAARAKAAAEAQAKAKANRDALTQRLKDIVN-EALRHNASRTPSATELA 116

Query: 67 PRTVAALKSWDRRTPPADAAYAALADDLTATQAGEAMGALARRLNTEISGFRSLVAKTAR 126
AA+++ D R A A A + A +A + A + EI A+T R
Sbjct: 117 HANNAAMQAEDERLRLAKAEEKARKEAEAAEKAFQE----AEQRRKEIE---REKAETER 169

Query: 127 AMPLADEQGHALAPAQTRARLVELDERWADTAYWQAIAKNSSRTSPFYLLAALDHRQDA 185
+ LA+ + LA A+ VE+ ++ A + + + + L++ H +DA
Sbjct: 170 QLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDA 228


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS08760  PF05272       30     0.011    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.011
Identities = 9/31 (29%), Positives = 15/31 (48%)

Query: 37 LTLLGPSGSGKTTCLMMLAGFEFPTGGEIRL 67
+ L G G GK+T + L G +F + +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDI 629


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS08765  PF06580       31     0.008    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.4 bits (71), Expect = 0.008
Identities = 22/106 (20%), Positives = 41/106 (38%), Gaps = 24/106 (22%)

Query: 356 LQELVGNLLDNAIRYAGPDARVTLRVARQAAPGQDTGALLQVEDNGPGIAEAERAAVFGR 415
+Q LV N + + I ++ L+ + + L+VE+ G +
Sbjct: 260 VQTLVENGIKHGIAQLPQGGKILLKGTKD-----NGTVTLEVENTGSLALK--------- 305

Query: 416 FYRGSTAQAQEGSGLGLAIVRE-IARVHG--ATVALSDTAGGGLTV 458
+E +G GL VRE + ++G A + LS+ G +
Sbjct: 306 -------NTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM 344


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS08770  HTHFIS        96     3e-25    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 96.4 bits (240), Expect = 3e-25
Identities = 52/199 (26%), Positives = 86/199 (43%), Gaps = 26/199 (13%)

Query: 2 RILLVEDNPKLAGTLEEALGQAGFTVDCVHDGHAADLLLTTQDYALLLLDLGLPRLDGLE 61
IL+ +D+ + L +AL +AG+ V + + D L++ D+ +P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLRRLRLRRNPLPVMILTAHGAVEERVRGLNLGADDYLTKPFDLTE-VEARARALIRRSH 120
+L R++ R LPV++++A ++ GA DYL KPFDLTE + RAL
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 121 GHERTQLQCGPLHY-DGTSGAFT-----------------LRGEALALTGRE---RAVLE 159
+ + G S A + GE + TG+E RA+ +
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGE--SGTGKELVARALHD 182

Query: 160 VLMLRDGR--AVNKAAISE 176
R+G A+N AAI
Sbjct: 183 YGKRRNGPFVAINMAAIPR 201


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS08775  ECOLNEIPORIN  91     1e-22    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 90.6 bits (225), Expect = 1e-22
Identities = 93/364 (25%), Positives = 141/364 (38%), Gaps = 44/364 (12%)

Query: 1 MKITSMALAAAALATAGTAAAQSNVTLYGIMDAGIEYVNHAGANGGGAARLVSGGKNT-- 58
MK + +AL AAL A A +VTLYG + AG+E NG AA + +G
Sbjct: 1 MKKSLIALTLAALPVAAMA----DVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDL 56

Query: 59 -SRWGLRGSEDLGGGLKGIFNLESGIAIDTGRLDTDNTLFDRRAVVGLAGSFGQVVFGRT 117
S+ G +G EDLG GLK I+ +E TD+ +R++ +GL G FG++ GR
Sbjct: 57 GSKIGFKGQEDLGNGLKAIWQVEQKA----SIAGTDSGWGNRQSFIGLKGGFGKLRVGRL 112

Query: 118 FTTTYDF--MLPYDPMGYAPNYSWATSSTATGDRKDGLFSRASNAVRYDGT-FGGLKLGA 174
+ D + P+D +Y R VRYD F GL
Sbjct: 113 NSVLKDTGDINPWDSKS---DYLGVNKIAEPEARLIS--------VRYDSPEFAGLSGSV 161

Query: 175 TVGFGEVAGNFNSSSKYDVGIGYSAGGFSAAATWDRQNGAGTSTTPADTTDYIQGIHAGA 234
+ AG NS S Y G Y GGF + + Q +
Sbjct: 162 QYALNDNAGRHNSES-YHAGFNYKNGGFFVQYGGAYKRHH--QVQENVNIEKYQIHRLVS 218

Query: 235 SYDFGALKLF--AGYRNYKRAFTTAAATQRSDMYWAGASYDF---TPAFTLFGAVYKQNI 289
YD AL ++ K + ++++ A +Y F TP +
Sbjct: 219 GYDNDALYASVAVQQQDAKLVEENYSHNSQTEVA-ATLAYRFGNVTPRVSYAHGFKGSFD 277

Query: 290 KGGTDADPILFSLRAQYALSKRTTAYLAGGYAKARNGQNVSLSRDVAGFGNSQVGVTAGL 349
+ D + A+Y SKRT+A ++ G+ + G++ +S GL
Sbjct: 278 ATNYNNDYDQVVVGAEYDFSKRTSALVSAGWLQEGKGESKFVSTAGG----------VGL 327

Query: 350 QHRF 353
+H+F
Sbjct: 328 RHKF 331


75  RS_RS11585  RS_RS11655  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS11585  -1      21       -2.687047  general secretion pathway protein GspF
RS_RS11590  -1      25       -4.911636  type II secretion system protein GspG
RS_RS11595  -2      23       -4.735240  ATPase
RS_RS11600  -2      30       -5.922462  transcriptional regulator
RS_RS11605  0       34       -6.485162  transposase
RS_RS11610  -1      35       -6.551670  transposase
RS_RS11615  -1      39       -7.025016  hypothetical protein
RS_RS11620  -1      40       -7.072875  MFS transporter
RS_RS11625  0       38       -7.407141  AsnC family transcriptional regulator
RS_RS11630  0       34       -4.739670  lysophospholipase
RS_RS11635  0       32       -4.042306  AsnC family transcriptional regulator
RS_RS11640  1       29       -4.143146  oxidoreductase
RS_RS11645  2       27       -3.708466  MerR family transcriptional regulator
RS_RS11650  1       29       -4.065996  MFS transporter
RS_RS11655  3       31       -4.286032  MFS transporter
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS11585  BCTERIALGSPF  204    7e-64    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 204 bits (520), Expect = 7e-64
Identities = 102/403 (25%), Positives = 181/403 (44%), Gaps = 6/403 (1%)

Query: 3 YLLRVFDSGGLVQTVCIEGDTPAGATAAAHARGWKVIAVRAGGARGRHRIRGTVL----- 57
Y + D+ G E D+ A RG ++V + +
Sbjct: 4 YHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLRRKI 63

Query: 58 GGTAFDVELFARELAALLDAGVSVIDALRTLGSNERREASAAVYRDLLRRLEEGHALSAA 117
+ D+ L R+LA L+ A + + +AL + + + + + ++ EGH+L+ A
Sbjct: 64 RLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLADA 123

Query: 118 LELADNVFPPVLVACVKASEQTGGLAASLKRYSQNSATLQALRARVVSASIYPAVLLAVG 177
++ F + A V A E +G L A L R + + Q +R+R+ A IYP VL V
Sbjct: 124 MKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLTVVA 183

Query: 178 GSVAVFLLAFVVPRFAGLLEHSGRELPMLSQWLMAWGAMVHAHGQGLAVGFAICVLAGIG 237
+V LL+ VVP+ H + LP+ ++ LM V G + + +A
Sbjct: 184 IAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMAFRV 243

Query: 238 ALRRQATRSWATDRLLSLPGVGEHFRVYRQAQFFRTSAMLVDGGIPAVQAFDLACGLVS- 296
LR++ R RLL LP +G R A++ RT ++L +P +QA ++ ++S
Sbjct: 244 MLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDVMSN 303

Query: 297 RADRAALASAMERIRNGGRMSDAFLGSGLADPITYRLLTVAEKTGGLGPVLDKIAAFQEA 356
R L+ A + +R G + A + L P+ ++ E++G L +L++ A Q+
Sbjct: 304 DYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADNQDR 363

Query: 357 HVSHAIDLASRLVEPAMMMVIGVVIGGIVVLMYLPIFQLASSV 399
S + LA L EP +++ + V+ IV+ + PI QL + +
Sbjct: 364 EFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS11590  BCTERIALGSPG  160    1e-53    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 160 bits (406), Expect = 1e-53
Identities = 57/140 (40%), Positives = 81/140 (57%), Gaps = 6/140 (4%)

Query: 19 RCARREGGFTLLELLVVMVIIGLLAGLVAPQYFDQIGKSNLKIAKAQIESLGKALDQYRL 78
R ++ GFTLLE++VV+VIIG+LA LV P K++ + A + I +L ALD Y+L
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKL 61

Query: 79 DVGAYPSTEEGLDALNTRPQSQP---RWSGPYLKKAVPLDPWDRPYVYRSPGEHGEYDLY 135
D YP+T +GL++L P P ++ K +P DPW YV +PGEHG YDL
Sbjct: 62 DNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLL 121

Query: 136 SLGKSGQPGGTGENVAVTSW 155
S G G+ G + +T+W
Sbjct: 122 SAGPDGEMGTEDD---ITNW 138


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS11600  HTHFIS        69     7e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 69.5 bits (170), Expect = 7e-16
Identities = 31/118 (26%), Positives = 53/118 (44%), Gaps = 2/118 (1%)

Query: 6 RLLIAEDHHLLRCGLRSMLSALGEYDVVGEAKDGREACQLAISLAPDLVLTDLSMPGMNG 65
+L+A+D +R L LS G YDV + + + DLV+TD+ MP N
Sbjct: 5 TILVADDDAAIRTVLNQALSRAG-YDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 66 IDMVATIKRRLPQIRVVVLTVYKSEEYVREALRVGVDGYVLKDASFEELVAALRAVMQ 123
D++ IK+ P + V+V++ + +A G Y+ K EL+ + +
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS11620  TCRTETB       36     3e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 35.6 bits (82), Expect = 3e-04
Identities = 63/388 (16%), Positives = 130/388 (33%), Gaps = 57/388 (14%)

Query: 23 NSTVIFATGAIIGHMLAPSPALATVPVSIFVVGMAAATLPVGVVTRRYGRKTSALFGSVC 82
N V+ + I + PA + F++ + T G ++ + G K LFG
Sbjct: 29 NEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFG--- 85

Query: 83 GVTVGLLAALALVIQSFSLFCVA--MLFGGAYAAVVLTYRFAAAECVPAEHRARAM---S 137
+ + + V SF + + G AA A +P E+R +A
Sbjct: 86 IIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIG 145

Query: 138 TVLAGGVAAGVLGPQLVTAT--------------------MNLWSPHAYAVTYLASAGTA 177
+++A G G ++ M L + G
Sbjct: 146 SIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGII 205

Query: 178 VLSAIVLQGVRFEHQPPVSAH-----------AHGRPSSDIMRQPRFV--VAMLCGVVSY 224
++S ++ + F +S H R +D P + + GV+
Sbjct: 206 LMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCG 265

Query: 225 MMMNFMMTSAPLAMELCGIPRVHANYGIEIHVIAMYAPS-------FFTGRLIARFGAPR 277
++ + + + VH EI + ++ + + G L+ R G
Sbjct: 266 GIIFGTVAGFVSMVPYM-MKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLY 324

Query: 278 VSLAGLALIALAATTGMTGVSVNHFWVALLLLGLGWNFGFLGASATVLTCH-----TPAE 332
V G+ ++++ T + +++ ++++ + G L + TV++ E
Sbjct: 325 VLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFV---LGGLSFTKTVISTIVSSSLKQQE 381

Query: 333 GPRVQSIYDFVVFGAMVVGSFVSGGLLA 360
S+ +F F + G + GGLL+
Sbjct: 382 AGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS11630  RTXTOXINA     31     0.010    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 30.7 bits (69), Expect = 0.010
Identities = 15/54 (27%), Positives = 22/54 (40%), Gaps = 9/54 (16%)

Query: 128 LCGGSLGGYLAYLCAARMGAG-----PIAGVIATTLADPRSPL----VRRQFAR 172
+ G G Y+ A R G AG+IA+ + SPL + +F R
Sbjct: 279 VLGNVGKGISQYIIAQRAAQGLSTSAAAAGLIASAVTLAISPLSFLSIADKFKR 332


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS11640  DHBDHDRGNASE  82     2e-20    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 81.6 bits (201), Expect = 2e-20
Identities = 53/192 (27%), Positives = 86/192 (44%), Gaps = 1/192 (0%)

Query: 3 NCFDFRGKTALITGASSGIGREFAYALAKRGAKLLLVARSRDKLHDLTAELRRDYACDAD 62
N GK A ITGA+ GIG A LA +GA + V + +KL + + L+ + A A+
Sbjct: 2 NAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE-ARHAE 60

Query: 63 FLTVDLSAPDAIPTVAHLLKATGTVVDVLINNAGFATYGRFETIPWTRQRDEVLVNCMAA 122
D+ AI + ++ +D+L+N AG G ++ VN
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 123 IELTHLLLPGMQARSDGAVINVASTAAFQPDPYMAIYGATKAFLLSFSEAVWAENRHRGI 182
+ + M R G+++ V S A P MA Y ++KA + F++ + E I
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 183 RVLALCPGATQT 194
R + PG+T+T
Sbjct: 181 RCNIVSPGSTET 192


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS11650  TCRTETB       128    3e-34    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 128 bits (322), Expect = 3e-34
Identities = 95/413 (23%), Positives = 175/413 (42%), Gaps = 17/413 (4%)

Query: 21 RNTVALLAVCLAALMFGLEISSVPVILPTLEVALHGDFSGIQWIMNAYTIACTAVLMATG 80
R+ L+ +C+ + L + V LP + + + W+ A+ + + G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 81 TLADRYGRKRVFLYTIALFGVASLLCGLAWNT-PILIASRFLQGASGGAMLICLVAVLSH 139
L+D+ G KR+ L+ I + S++ + + +LI +RF+QG +G A LV V+
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQG-AGAAAFPALVMVVVA 129

Query: 140 QFPEGAERGKAFGIWGIVFGIGLGFGPLIGGLIVAMSGWKWVFWVHVFIAALTFGLAIVG 199
++ RGKAFG+ G + +G G GP IGG+I W ++ + + L +
Sbjct: 130 RYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLL 189

Query: 200 VRESRDPHARRLDLAGIVTLSLTVLGLAYFVTQGADTGFGSASALRVIAATAVSFALFLI 259
+E R D+ GI+ +S+ ++ F T S S +I + +SF +F+
Sbjct: 190 KKEVR--IKGHFDIKGIILMSVGIVFFMLFTT--------SYSISFLIVSV-LSFLIFVK 238

Query: 260 VEMRAAHPMFDFSVFRVRNFSGALLGSVGMNFSFWPFIIYLPIYFQSALGQGAAAAG-VS 318
+ P D + + F +L + + F+ +P + A G V
Sbjct: 239 HIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVI 298

Query: 319 LLAYTLPTLVFPPIGERLALRYRPGIVIPAGLFTIGLGFFLMLIGSSIDQASWLTVLPGC 378
+ T+ ++F IG L R P V+ G+ + + F + ++ SW +
Sbjct: 299 IFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSF--LTASFLLETTSWFMTIIIV 356

Query: 379 LLAGAGLGITNTPVTNTTTGSVSPARAGMASGIDMSARMISLAINIALMGFLL 431
+ G GL T T ++ + S+ AG + +S IA++G LL
Sbjct: 357 FVLG-GLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLL 408


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS11655  TCRTETB       60     1e-11    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 59.5 bits (144), Expect = 1e-11
Identities = 61/322 (18%), Positives = 127/322 (39%), Gaps = 23/322 (7%)

Query: 61 WNTTLFIVASITGAAVAAQVIQHLGPRGAYLLGAMVFGGGSALCALA-PIMQVLLVGRVL 119
W T F++ G AV ++ LG + L G ++ GS + + +L++ R +
Sbjct: 53 WVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFI 112

Query: 120 QGLGGGVLLSAPYVLMRSVLPEPLWPRALALLSGMWGIATLLGPAVGGIFAQYATWRLAF 179
QG G + V++ +P+ +A L+ + + +GPA+GG+ A Y
Sbjct: 113 QGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHY-----IH 167

Query: 180 WSLVVLIALAAAAAVVVLKPAAPQAERPT-PVPTRQLVLLAVAVVAASWASVSDSPWLGA 238
WS ++LI + V L + R + ++L++V +V + S S
Sbjct: 168 WSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLI 227

Query: 239 LGLIAAALTLLAIRRVEMRSSRRILPADAYTGRTALAALYGVSALLAVTVTCTEIFVPLF 298
+ +++ + + IR+V + + + + ++ TV VP
Sbjct: 228 VSVLSFLIFVKHIRKV----TDPFVDPGLGKNIPFMIGVL-CGGIIFGTVAGFVSMVPYM 282

Query: 299 LQRLHGQSPLIAGYIAATASAGWTLGAILSAALRAASVTR-----AIRLAPWACALGLLV 353
++ +H S G + T+ I+ + V R + + ++ L
Sbjct: 283 MKDVHQLSTAEIGSVIIFPG---TMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLT 339

Query: 354 LAMLVPIGAGSAWMTTLIIVAL 375
+ L+ ++W T+IIV +
Sbjct: 340 ASFLL---ETTSWFMTIIIVFV 358


76  RS_RS13265  RS_RS13290  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS13265  2       8        2.953025   inorganic polyphosphate/ATP-NAD kinase
RS_RS13270  2       7        3.519635   DNA repair protein RecN
RS_RS13275  1       9        3.564604   hypothetical protein
RS_RS13280  0       9        3.696659   peptidase
RS_RS13290  -1      10       3.242626   peptidase S8
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS13270  PF06057       31     0.005    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 30.6 bits (69), Expect = 0.005
Identities = 15/68 (22%), Positives = 25/68 (36%), Gaps = 9/68 (13%)

Query: 49 DIVFERETALNIGVQDYPALP----PDEMARHAD-VAVVLGGDGTLLGIGRHLAGA---- 99
F E A N+G+ P P + + + L GDG + + + G
Sbjct: 18 ANAFADEFADNLGLTLLPVEPSTQVNAASSHTKPPLVIFLSGDGGWATLDKAVGGILQQQ 77

Query: 100 SVPVIGVN 107
PV+G +
Sbjct: 78 GWPVVGWS 85


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS13275  CHANLCOLICIN  34     0.002    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 33.9 bits (77), Expect = 0.002
Identities = 47/227 (20%), Positives = 90/227 (39%), Gaps = 26/227 (11%)

Query: 163 RAWQAVVRLREAAEQQSREAQLERERVEWQVSELQKLAPQPGEWEEVQAEHHRLSHAASL 222
+A + + EAAE+ +EA+ R+ +E + +E ++ Q E + LS A
Sbjct: 134 KAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETER---QLKLAEAEEKRLAALSEEAKA 190

Query: 223 IEGTRAALDT-------LSEADSAVLTQLGAAVHGLQALAEIDPALADVLAALEPAQVQV 275
+E + L + + ++L +++H A + LA L A +
Sbjct: 191 VEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMK---TLAGKRNELAQASAKY 247

Query: 276 QEAVHSLARYADRAELDPDR----LAEVDARLQALHTM--ARKYRVAPET----LPAELT 325
+E + + + RA DP + R+ A +K A ET + A++T
Sbjct: 248 KELDELVKKLSPRAN-DPLQNRPFFEATRRRVGAGKIREEKQKQVTASETRINRINADIT 306

Query: 326 QRQAQLAALQAASDLDALQAQEAQTHAAYLQAAQALSRGRAKAAREL 372
Q Q A Q +++ +A A+ + +A L + K A +
Sbjct: 307 --QIQKAISQVSNNRNAGIARVHEAEENLKKAQNNLLNSQIKDAVDA 351


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS13285  SUBTILISIN    115    1e-30    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 115 bits (289), Expect = 1e-30
Identities = 82/386 (21%), Positives = 131/386 (33%), Gaps = 87/386 (22%)

Query: 130 PNDPLFSTQNNLQSPTVVAGGINVASAWDITLGSSSLVVAVVDTGY-TDHPDLSGKILPG 188
N + I + W+ T G + VAV+DTG DHPDL +I+ G
Sbjct: 11 QVIKQEQQVNEIPRG---VEMIQAPAVWNQTRGRG-VKVAVLDTGCDADHPDLKARIIGG 66

Query: 189 YNFISDPARAGNSTGRGSDAHDTGDGVTSADVSAISGCTSSDIGNSTWHGTEVMSVLAAG 248
NF D D D HGT V A
Sbjct: 67 RNFTDD---------DEGDPEIFKDY--------------------NGHGTHVAGT-IAA 96

Query: 249 TNNALDIAGVGWNTRIVPVRTSGKCG-ALLSDTVDGMLWAGGISVSGVPANPNPARVINV 307
T N + GV ++ ++ K G + G+ + A +I++
Sbjct: 97 TENENGVVGVAPEADLLIIKVLNKQGSGQYDWIIQGIYY----------AIEQKVDIISM 146

Query: 308 SLGSVGSCSAAEQDAINRLAALGTVVVAAAGNEGSAVDA------PANCSGAIAVTAHVD 361
SLG +A+ + A +V+ AAGNEG D P + I+V A
Sbjct: 147 SLGGPED-VPELHEAVKKAVASQILVMCAAGNEGDGDDRTDELGYPGCYNEVISVGAINF 205

Query: 362 SGENASYANVGSQVALSAPGGGCANSQATSSGCTGPVSVIQADSNDGQYSLGNSVVKSVA 421
+ ++N ++V L APG I + G+Y+ + +
Sbjct: 206 DRHASEFSNSNNEVDLVAPGED-----------------ILSTVPGGKYA-------TFS 241

Query: 422 GTSFSTPEVAGTIALMLS-----VQSQLSNAQILAGLQQTARAHPSGTFCATRGGCGAGL 476
GTS +TP VAG +AL+ + L+ ++ A L + + G GL
Sbjct: 242 GTSMATPHVAGALALIKQLANASFERDLTEPELYAQLIKRTIP-----LGNSPKMEGNGL 296

Query: 477 LDTAGAVRYAQTTTPAGIGSGSSSSS 502
L ++ + S++S
Sbjct: 297 LYLTAVEELSRIFDTQRVAGILSTAS 322


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS13290  SUBTILISIN    130    2e-35    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 130 bits (328), Expect = 2e-35
Identities = 81/359 (22%), Positives = 118/359 (32%), Gaps = 94/359 (26%)

Query: 172 NLTTAWDTTKGSNTVTVTVIDTGLLAGHADLSGATIQPGYDFVSSTAMTGTPDPVTNLSI 231
W+ T+G V V V+DTG A H DL I G +F
Sbjct: 30 QAPAVWNQTRGRG-VKVAVLDTGCDADHPDLKA-RIIGGRNFTD---------------- 71

Query: 232 PSGFVENDATAGRDSDPTDPGDWITTSDATNYPSFCGTSVTDSSWHGTFVTGLIAAQHNA 291
+ DP D + HGT V G IAA N
Sbjct: 72 -----------DDEGDPEIF--------------------KDYNGHGTHVAGTIAATENE 100

Query: 292 IGVAGVAPGVSVQMTRAIGKCG-GASSDILDALTWAAGGTVPGVTTNATPAKVINMSLGG 350
GV GVAP + + + + K G G I+ + +A +I+MSLGG
Sbjct: 101 NGVVGVAPEADLLIIKVLNKQGSGQYDWIIQGIYYAI----------EQKVDIISMSLGG 150

Query: 351 STACTTAQQSAITAARSLGATIVVATGNEFQNA----AIDAPASCSGTIAVTAHTLEGDI 406
+ A+ A + ++ A GNE + P + I+V A +
Sbjct: 151 PEDVPELHE-AVKKAVASQILVMCAAGNEGDGDDRTDELGYPGCYNEVISVGAINFDRHA 209

Query: 407 ANYANVGTGTTLSAPGGGNGSTATGLGALVPSTSNAGTTTASTDTYVGEEGTSMSTPQVA 466
+ ++N L APG +T Y GTSM+TP VA
Sbjct: 210 SEFSNSNNEVDLVAPGEDI------------------LSTVPGGKYATFSGTSMATPHVA 251

Query: 467 GVAALMLSVNAA-----LTPDQIKTILQTSSRPFPPDTFCTTHAGVCGAGMLDAGSAIA 520
G AL+ + A LT ++ L + P + G G+L +
Sbjct: 252 GALALIKQLANASFERDLTEPELYAQLIKRTIPLGNSPK------MEGNGLLYLTAVEE 304


77  RS_RS13385  RS_RS13435  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS13385  2       10       -1.373381  DNA-binding transcriptional regulator
RS_RS13390  3       11       -1.775200  glutathione peroxidase
RS_RS13395  3       12       -1.654194  type 4 fimbrial biogenesis PilY1 signal peptide
RS_RS13400  0       12       -2.108830  pilus assembly protein
RS_RS13405  -1      12       -2.509565  type 4 fimbrial biogenesis PilW transmembrane
RS_RS13410  -2      12       -1.905472  type 4 fimbrial biogenesis protein
RS_RS13415  -2      10       -1.200367  general secretion pathway protein GspH
RS_RS13420  -2      10       -1.069557  pilus assembly protein PilE
RS_RS13425  -2      10       -0.684536  twitching motility protein PilT
RS_RS13430  -2      10       -0.078730  twitching motility protein PilT
RS_RS13435  1       11       1.583487   YggS family pyridoxal phosphate enzyme
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS13385  ARGREPRESSOR  30     0.005    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 29.8 bits (67), Expect = 0.005
Identities = 16/60 (26%), Positives = 27/60 (45%), Gaps = 5/60 (8%)

Query: 6 RRADRLFQIAQILRGRRLTTAAMLADRL-----GVSERTVYRDIRDLSVSGVPIEGEAGI 60
+ R +I +I+ + T L D L V++ TV RDI++L + VP +
Sbjct: 2 NKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIKELHLVKVPTNNGSYK 61


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS13405  BCTERIALGSPG  30     0.008    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 29.9 bits (67), Expect = 0.008
Identities = 14/65 (21%), Positives = 31/65 (47%), Gaps = 11/65 (16%)

Query: 6 KMLRRQRGMSLVELMIGMA-LGLLLLTALGSLYFSTTQSRAQLANS----------ASQI 54
+ +QRG +L+E+M+ + +G+L + +L + ++ Q A S ++
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKL 61

Query: 55 ENGRY 59
+N Y
Sbjct: 62 DNHHY 66


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS13410  BCTERIALGSPG  31     0.001    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 31.0 bits (70), Expect = 0.001
Identities = 11/24 (45%), Positives = 19/24 (79%), Gaps = 2/24 (8%)

Query: 17 QRGLTLIEVLISVVIL--LAALLG 38
QRG TL+E+++ +VI+ LA+L+
Sbjct: 7 QRGFTLLEIMVVIVIIGVLASLVV 30


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS13415  BCTERIALGSPG  31     0.001    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 30.6 bits (69), Expect = 0.001
Identities = 13/44 (29%), Positives = 23/44 (52%)

Query: 16 RGVTLPELLVGLAVLSILIVIAVPSFSGLIATQRARNASLDLSA 59
RG TL E++V + ++ +L + VP+ G + A D+ A
Sbjct: 8 RGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVA 51


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS13420  BCTERIALGSPG  53     6e-12    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 53.4 bits (128), Expect = 6e-12
Identities = 20/67 (29%), Positives = 37/67 (55%)

Query: 8 QRARSARGFTLIELMIVAAIVAILAVLAYPSYVQYVVRSNRAAAESFMQEVAAAQERFLL 67
+ RGFTL+E+M+V I+ +LA L P+ + ++++ A S + + A + + L
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKL 61

Query: 68 DNRAYAS 74
DN Y +
Sbjct: 62 DNHHYPT 68


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS13435  ALARACEMASE   31     0.005    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 30.5 bits (69), Expect = 0.005
Identities = 51/248 (20%), Positives = 79/248 (31%), Gaps = 49/248 (19%)

Query: 1 MSVIAANLQAVRQRITTACTLASRDPASVTLLAVSKTFDADA-----VRAAHAAGQ-RLF 54
+ + NL VRQ T A + +V K A+A R A G F
Sbjct: 11 LQALKQNLSIVRQAATHAR-----------VWSVVK---ANAYGHGIERIWSAIGATDGF 56

Query: 55 GENYVQEGTAKTSALTDLR----NGPEAIAWHFIGPLQSNKTRAVAEHFD-WVHSIDRLK 109
++E LR GP + F + VHS +LK
Sbjct: 57 ALLNLEEAIT-------LRERGWKGPILMLEGFFHA--QDLEIYDQHRLTTCVHSNWQLK 107

Query: 110 IAERLSAQRPATLPPLQVCLEINISRQASKHGVPPDLDEVLPLARAVAALPRLRLRGLMA 169
+ + P + L++N ++ G PD VL + + + A+ + LM+
Sbjct: 108 ALQNARLKAPLD-----IYLKVNSG--MNRLGFQPD--RVLTVWQQLRAMANVGEMTLMS 158

Query: 170 VPEPADDPAARRAPFAALRDLADRLCAAGLPVDTLSMGMSADLEDAIAEGATIVRVGSAI 229
A+ P A + A GL +A L A VR G +
Sbjct: 159 HFAEAEHPDGISGAMARIEQA-----AEGLECRRSLSNSAATLWHPEA-HFDWVRPGIIL 212

Query: 230 FGARQYAH 237
+GA
Sbjct: 213 YGASPSGQ 220


78  RS_RS13670  RS_RS13695  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS13670  -1      13       -1.997613  NAD(P) transhydrogenase subunit beta
RS_RS13675  0       13       -1.711688  chemotaxis protein CheY
RS_RS13680  -1      11       -0.732154  SAM-dependent methyltransferase
RS_RS13685  -2      11       -0.518208  two-component system sensor histidine kinase
RS_RS13690  -2      12       0.069999   two-component system sensor histidine
RS_RS13695  -2      12       1.332986   chemotaxis protein CheY
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS13675  ACRIFLAVINRP  31     0.017    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 30.6 bits (69), Expect = 0.017
Identities = 21/83 (25%), Positives = 36/83 (43%), Gaps = 13/83 (15%)

Query: 186 MLNLLLALAMLG-FGVLFFLSQSWL-PFVIMTAIAFAL-GVLIIIPIGGA--DMPVVVSM 240
L+A++ + F L L +SW P +M + + GVL+ + D+ +V +
Sbjct: 871 QAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGL 930

Query: 241 LNSYSGWAAAGIGFSLNNPMLII 263
L IG S N +LI+
Sbjct: 931 L--------TTIGLSAKNAILIV 945


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS13680  HTHFIS        37     9e-05    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 37.1 bits (86), Expect = 9e-05
Identities = 15/80 (18%), Positives = 31/80 (38%), Gaps = 9/80 (11%)

Query: 101 QPSVIVVDYSMPRMNGLEFCAKLK----DLPCMTILLTGMADENIAVQGFNDGLIDRYIK 156
++V D MP N + ++K DLP ++++ A++ G D Y+
Sbjct: 47 DGDLVVTDVVMPDENAFDLLPRIKKARPDLP--VLVMSAQNTFMTAIKASEKGAYD-YLP 103

Query: 157 K--DHPAMAERLSTEIEALQ 174
K D + + + +
Sbjct: 104 KPFDLTELIGIIGRALAEPK 123


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS13690  PF06580       29     0.037    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.1 bits (65), Expect = 0.037
Identities = 5/35 (14%), Positives = 12/35 (34%)

Query: 352 KAQREHPCIDISAEWHNGRLYVVVKDNGPGIAEKN 386
+ I + NG + + V++ G +
Sbjct: 273 AQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNT 307


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS13695  HTHFIS        84     4e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.5 bits (209), Expect = 4e-20
Identities = 31/133 (23%), Positives = 56/133 (42%), Gaps = 4/133 (3%)

Query: 7 TGPTILYVDDEQQACKWFTRMVS-ASYNVLTANSVEEAKAVLRDAHDRIGVVLTDFRMPG 65
TG TIL DD+ + +S A Y+V ++ + +V+TD MP
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD--GDLVVTDVVMPD 59

Query: 66 GDGTELLRFIDAEYSDIAAILVTAYADKDLLIQAVNTGRVFKILEKPYQPEDVRRSLQEA 125
+ +LL I D+ ++++A I+A G + L KP+ ++ + A
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKG-AYDYLPKPFDLTELIGIIGRA 118

Query: 126 LALRQDRLLRTQR 138
LA + R + +
Sbjct: 119 LAEPKRRPSKLED 131


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS13700  HTHFIS        87     4e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 87.2 bits (216), Expect = 4e-21
Identities = 30/135 (22%), Positives = 59/135 (43%), Gaps = 6/135 (4%)

Query: 9 PKILYVDDETLALTYFGRAIGSLA-PVLTATSVEEGKRMLDEHAATLGVLVSDQRMPGEL 67
IL DD+ T +A+ V ++ R + A ++V+D MP E
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIA--AGDGDLVVTDVVMPDEN 61

Query: 68 GNELLRYARERYPHMVRILTTAYSEIEDTVEAVNQGQIHRYIKKPWDITALRVELKQALE 127
+LL ++ P + ++ +A + ++A +G + Y+ KP+D+T L + +AL
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKG-AYDYLPKPFDLTELIGIIGRALA 120

Query: 128 FADLRRERDALLREK 142
+ +R L +
Sbjct: 121 --EPKRRPSKLEDDS 133


79  RS_RS14665  RS_RS14735  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS14665  1       10       -4.282636  cytochrome C
RS_RS14670  2       15       -4.617581  cytochrome B
RS_RS14675  3       19       -3.653833  ubiquinol-cytochrome C reductase (iron-sulfur
RS_RS14680  2       15       -2.602607  large-conductance mechanosensitive channel
RS_RS14685  0       12       -0.634115  Nif3-like dinuclear metal center hexameric
RS_RS14690  1       11       -1.055722  2-alkenal reductase
RS_RS14695  0       14       -0.679847  membrane protein
RS_RS14700  -1      10       0.529549   ABC transporter substrate-binding protein
RS_RS14705  0       14       0.207328   two-component system response regulator
RS_RS14710  0       13       -0.006846  sensor histidine kinase
RS_RS14715  -1      15       -1.385955  iron ABC transporter substrate-binding protein
RS_RS14720  -1      15       -1.197793  spermidine/putrescine ABC transporter permease
RS_RS14725  -1      13       -1.202540  lipase
RS_RS14730  -1      10       -3.038985  preprotein translocase subunit TatC
RS_RS14735  0       10       -2.327301  Sec-independent protein translocase TatB
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS14665  STREPTOPAIN   31     0.004    Streptopain (C10) cysteine protease family signature.
		>STREPTOPAIN#Streptopain (C10) cysteine protease family signature.

Length = 398

Score = 31.2 bits (70), Expect = 0.004
Identities = 21/78 (26%), Positives = 36/78 (46%), Gaps = 12/78 (15%)

Query: 3 KLLAIFALAGFMIAAPVFANEGGVRLDPAPNQS-----EDLSALQRGAKL-------FVN 50
+LL++ AL GF++A PVFA++ R + S + +A++ GA+ VN
Sbjct: 9 RLLSLLALGGFVLANPVFADQNFARNEKEAKDSAITFIQKSAAIKAGARSAEDIKLDKVN 68

Query: 51 YCLNCHGASAMRYNRLRD 68
G++ YN
Sbjct: 69 LGGELSGSNMYVYNISTG 86


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS14680  MECHCHANNEL   118    3e-37    Bacterial mechano-sensitive ion channel signature.
		>MECHCHANNEL#Bacterial mechano-sensitive ion channel signature.

Length = 136

Score = 118 bits (296), Expect = 3e-37
Identities = 62/143 (43%), Positives = 89/143 (62%), Gaps = 13/143 (9%)

Query: 1 MALMQDFKKFAMRGNVIDLAVGVIIGAAFGKIVDSLVNDLIMPLVARIVGKLDFSNLFIQ 60
M+++++F++FAMRGNV+DLAVGVIIGAAFGKIV SLV D+IMP + ++G +DF +
Sbjct: 1 MSIIKEFREFAMRGNVVDLAVGVIIGAAFGKIVSSLVADIIMPPLGLLIGGIDFKQFAVT 60

Query: 61 LADAPAGVPQTLADLKKAGVPVFAYGNFITVAVNFLILAFIVFLMVRAITRVIDTNPPPA 120
L DA +P V YG FI +FLI+AF +F+ ++ I ++ PA
Sbjct: 61 LRDAQGDIPAV----------VMHYGVFIQNVFDFLIVAFAIFMAIKLINKLNRKKEEPA 110

Query: 121 DTP---ENTLLLRDIRDSLKSKN 140
P + +LL +IRD LK +N
Sbjct: 111 AAPAPTKEEVLLTEIRDLLKEQN 133


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS14690  V8PROTEASE    66     3e-14    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 66.2 bits (161), Expect = 3e-14
Identities = 35/160 (21%), Positives = 58/160 (36%), Gaps = 29/160 (18%)

Query: 116 LGSGVIVSPEGYILTNHHVVDGADEIEVALT------------DGRKANAKVVGTDPETD 163
+ SGV+V +LTN HVVD AL +G ++ E D
Sbjct: 103 IASGVVVGK-DTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGD 161

Query: 164 LAVLKISLT--------NLPAITLGRLENVRVGDVVLAIGNPFGVGQTVTMGIVSALGRS 215
LA++K S + T+ +V + G P V+ + S
Sbjct: 162 LAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGD-------KPVATMWES 214

Query: 216 HLGINTFEN-FIQTDAAINPGNSGGALVDAEGNLLGINTA 254
I + +Q D + GNSG + + + ++GI+
Sbjct: 215 KGKITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHWG 254


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS14695  ECOLNEIPORIN  92     4e-23    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 92.2 bits (229), Expect = 4e-23
Identities = 89/391 (22%), Positives = 130/391 (33%), Gaps = 72/391 (18%)

Query: 1 MKMKLFAAAVAALAAGGAYAQSSVTLYGVADVGVEWANKVPGADGQGHSRVAMQSGNLSG 60
MK L A +AAL + VTLYG GVE + V Q S G
Sbjct: 1 MKKSLIALTLAALPVAAM---ADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLG 57

Query: 61 SRWGLRGVEDLGGGLKGIFNLESGFNLDTGTSAQSSRLFGRNAYVGLQGQWGQLTLGRQQ 120
S+ G +G EDLG GLK I+ +E ++ S +R +++GL+G +G+L +GR
Sbjct: 58 SKIGFKGQEDLGNGLKAIWQVEQKASIAGTDSGWGNRQ----SFIGLKGGFGKLRVGRLN 113

Query: 121 NLLYD---FALAYDPMAIAGRYSITAQDAFMAG-RADNAIKYIGTFGGLSVSALYAFNRD 176
++L D G I +A + R D+ F GLS S YA N +
Sbjct: 114 SVLKDTGDINPWDSKSDYLGVNKIAEPEARLISVRYDSP-----EFAGLSGSVQYALNDN 168

Query: 177 GQEVAGVNKLGREWSLGANYAAGPFGLGVVYDQSN---GTAASNASFAAQNAAADTKEQR 233
+ G NY G F + N + +
Sbjct: 169 AG-----RHNSESYHAGFNYKNGGFFVQYGGAYKRHHQVQENVNIEKYQIHRLVSGYDND 223

Query: 234 ATIAGSYA-FGPAKLYAGYRWYHANFATVAGA----GNLRSNLYWLGAGYQATPALTLTG 288
A A AKL +++ A GN+ + Y +
Sbjct: 224 ALYASVAVQQQDAKLVEENYSHNSQTEVAATLAYRFGNVTPRV-----SYAHGFKGSFDA 278

Query: 289 TAYYQQFKNSGTGNPWLFVVGTDYALSKRTDAYFNVAYAKNSNGSTLGVAGINGASSGNA 348
T Y + VVG +Y SKRT A + + + G + V+
Sbjct: 279 TNYNNDYDQ--------VVVGAEYDFSKRTSALVSAGWLQEGKGESKFVSTA-------- 322

Query: 349 TTLQSTDNQVLSNSTNTGNQFGAVVGIRHKF 379
VG+RHKF
Sbjct: 323 ----------------------GGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS14705 | HTHFIS | 91 | 2e-23 | FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.4 bits (227), Expect = 2e-23
Identities = 35/123 (28%), Positives = 65/123 (52%), Gaps = 1/123 (0%)

Query: 2 RILLVEDEGELAAWLARALAQSGFVVERAADGLAAEAFLASGEFDAVVLDLRLPRKDGFA 61
IL+ +D+ + L +AL+++G+ V ++ ++A+G+ D VV D+ +P ++ F
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLADMRARDDRTPVLILTAQGALDERVRGLNAGADDFLTKPFALTE-LEARLMALIRRSR 120
+L ++ PVL+++AQ ++ GA D+L KPF LTE + AL R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 121 GRA 123
+
Sbjct: 125 RPS 127


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS14710 | PF06580 | 36 | 2e-04 | Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.4 bits (84), Expect = 2e-04
Identities = 20/109 (18%), Positives = 37/109 (33%), Gaps = 36/109 (33%)

Query: 372 LVHNAIRY----TPPGGRITVRAIKDSDAALVCVDDTGPGMNAVERAHAFERFRRAHEGG 427
LV N I++ P GG+I ++ KD+ + V++TG
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLA------------------- 303

Query: 428 TLPGGPKDARYAAEGSGLGLA-----IARAYAARSGGRIELADGEPNSR 471
+ E +G GL + Y + ++ G+ N+
Sbjct: 304 --------LKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM 344


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS14725 | PF05272 | 30 | 0.018 | Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.018
Identities = 13/34 (38%), Positives = 17/34 (50%)

Query: 31 VVSLLGPSGSGKTTLLRAVAGLEQASSGTVKIGE 64
V L G G GK+TL+ + GL+ S IG
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGT 631


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS14735 | TATBPROTEIN | 77 | 1e-20 | Bacterial sec-independent translocation TatB protein...
		>TATBPROTEIN#Bacterial sec-independent translocation TatB protein signature.

Length = 171

Score = 77.4 bits (190), Expect = 1e-20
Identities = 30/110 (27%), Positives = 55/110 (50%), Gaps = 4/110 (3%)

Query: 1 MIDLGISKLALIGAVALVVIGPERLPKVARTAGALIGRAQRYIADVKAEVSREIELEELR 60
M D+G S+L L+ + LVV+GP+RLP +T I + V+ E+++E++L+E +
Sbjct: 1 MFDIGFSELLLVFIIGLVVLGPQRLPVAVKTVAGWIRALRSLATTVQNELTQELKLQEFQ 60

Query: 61 KMRTEFE-EAARNVEQTIHQEVS--RHTSEINERLNEAL-GDPASRQTAT 106
+ E + N+ + + R +E +R A + AS + T
Sbjct: 61 DSLKKVEKASLTNLTPELKASMDELRQAAESMKRSYVANDPEKASDEAHT 110


80 | RS_RS15365 | RS_RS15390 | N | Y | N | Pathogenicity Island (unbiased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
RS_RS15365 | 1 | 14 | 1.643333 | MFS transporter
RS_RS15370 | 0 | 14 | 1.322667 | MFS transporter
RS_RS15375 | -1 | 13 | 1.510161 | ethanolamine utilization protein EutH
RS_RS15380 | -1 | 11 | 2.027417 | AraC family transcriptional regulator
RS_RS15385 | 0 | 12 | 2.513914 | two-component system sensor histidine kinase
RS_RS15390 | 1 | 12 | 2.648477 | two-component system response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15370 | TCRTETA | 35 | 6e-04 | Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.8 bits (80), Expect = 6e-04
Identities = 43/191 (22%), Positives = 74/191 (38%), Gaps = 21/191 (10%)

Query: 64 LMRPLGALILGAYIDRHGRRAGLILTLGLMACGTLLIALVPGYATIGLAAPLLVLIGRLL 123
LM+ A +LGA DR GRR L+++L A ++A P ++ IGR++
Sbjct: 54 LMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLW--------VLYIGRIV 105

Query: 124 QGFSAGVELGGVSVYLAEISTPGRKGFFVSWQSASQQVAVMFAALLGVLLSFQLSPKEMG 183
G + G Y+A+I+ + + SA ++ +LG L MG
Sbjct: 106 AGIT-GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGL---------MG 155

Query: 184 EWGWRVPFLVGCLIVPFLFLIRRSLQETEEFKQRKHRPALGEIFRSMGENWRLVGAGTLL 243
+ PF + FL E + + RP E + ++R T++
Sbjct: 156 GFSPHAPFFAAAALNGLNFLT--GCFLLPESHKGERRPLRREAL-NPLASFRWARGMTVV 212

Query: 244 VVMTTVSFYLI 254
+ V F +
Sbjct: 213 AALMAVFFIMQ 223


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15375 | TCRTETA | 32 | 0.003 | Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 32.5 bits (74), Expect = 0.003
Identities = 43/192 (22%), Positives = 76/192 (39%), Gaps = 23/192 (11%)

Query: 64 LMRPLGAIVLGAYIDRHGRRAGLILTLVLMACGTMLIALVPGYGTIGLAAPLLVLIGRLL 123
LM+ A VLGA DR GRR L+++L A ++A P ++ IGR++
Sbjct: 54 LMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAP--------FLWVLYIGRIV 105

Query: 124 QGFSAGGEPGGVSVYLSEIATPGRKGFYVSWQSGSQQVAVMFAALLGVLLSFKLSPKEMG 183
G + G Y+++I + + + S ++ +LG L MG
Sbjct: 106 AGIT-GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGL---------MG 155

Query: 184 EWGWRVPFLIGCLIVPFLFFIRR-SLRETEEFEQRKHRPTMGELLRSMGENWRLAGAGTL 242
+ PF + F L E+ + E+R + + ++R A T+
Sbjct: 156 GFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERR----PLRREALNPLASFRWARGMTV 211

Query: 243 LAVMTTVSFYLI 254
+A + V F +
Sbjct: 212 VAALMAVFFIMQ 223


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15390 | PF06580 | 32 | 0.005 | Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.8 bits (72), Expect = 0.005
Identities = 19/108 (17%), Positives = 38/108 (35%), Gaps = 31/108 (28%)

Query: 330 LIRNAMRYA--------RQAITVRAEAGTYGSLCLTVEDDGPGIPAAERARVFEPFYRLD 381
L+ N +++ + + + GT + L VE+ G +
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGT---VTLEVENTGSLALKNTK----------- 308

Query: 382 ASRDRHTGGFGLGLAIVR-RIALVHGGEVRLDTGGS-GGARFVLLLPA 427
G GL VR R+ +++G E ++ G ++L+P
Sbjct: 309 -------ESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15395 | HTHFIS | 76 | 3e-18 | FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 76.4 bits (188), Expect = 3e-18
Identities = 25/151 (16%), Positives = 58/151 (38%), Gaps = 5/151 (3%)

Query: 7 KTRVLLIEDDDRLAQLVSEYLGNYEFTVEVVRRGDIAVAAVREHRPALVILDLMLPHMDG 66
+L+ +DD + ++++ L + V + + LV+ D+++P +
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 67 MEVCRRIRA-FSRVPVLILTARVDTYDQVAGLEIGADDYVLKPVEPRLLVARARALLRRV 125
++ RI+ +PVL+++A+ + E GA DY+ KP + ++ R
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT----ELIGIIGRA 118

Query: 126 ASAMPAAEPATPRDDTLVFGELAISPPNRTV 156
+ D + S + +
Sbjct: 119 LAEPKRRPSKLEDDSQDGMPLVGRSAAMQEI 149


81 | RS_RS15565 | RS_RS15630 | N | Y | N | Pathogenicity Island (unbiased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
RS_RS15565 | 1 | 10 | -1.312218 | serine protease
RS_RS15570 | 0 | 10 | -0.334450 | AsnC family transcriptional regulator
RS_RS15575 | 0 | 10 | -0.589282 | 4-hydroxyphenylpyruvate dioxygenase
RS_RS15580 | -1 | 13 | 0.293939 | calcium sensor EFh
RS_RS15585 | 0 | 12 | 0.937181 | pilus assembly protein PilZ
RS_RS15590 | 2 | 12 | 3.363543 | type II secretion system protein GspG
RS_RS15595 | 4 | 12 | 3.669680 | type II secretion system protein GspH
RS_RS15600 | 3 | 11 | 3.939955 | type II secretion system protein GspI
RS_RS15605 | 4 | 12 | 4.292045 | general secretion pathway protein GspJ
RS_RS15610 | 3 | 12 | 4.507025 | general secretion pathway protein GspK
RS_RS15615 | 1 | 12 | 3.530065 | general secretion pathway protein GspL
RS_RS15620 | 1 | 11 | 3.406765 | general secretion pathway protein GspM
RS_RS15625 | 1 | 13 | 3.308398 | general secretion pathway protein GspN
RS_RS15630 | -1 | 9 | 0.568607 | type II secretion system protein GspD
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15575 | SUBTILISIN | 59 | 2e-11 | Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 59.1 bits (143), Expect = 2e-11
Identities = 19/65 (29%), Positives = 28/65 (43%), Gaps = 7/65 (10%)

Query: 585 FYGTSAAAPHVAGVAALMRQAVPKA-----TPEQIYSALRKTAVDMDAPGHDNATGAGFV 639
F GTS A PHVAG AL++Q + T ++Y+ L K + + G G +
Sbjct: 240 FSGTSMATPHVAGALALIKQLANASFERDLTEPELYAQLIKRTIPLG--NSPKMEGNGLL 297

Query: 640 QPERA 644

Sbjct: 298 YLTAV 302



Score = 34.4 bits (79), Expect = 0.001
Identities = 27/196 (13%), Positives = 49/196 (25%), Gaps = 43/196 (21%)

Query: 157 VKGLGIELTGKGITVGLISDSFN-----CNSQLNQDARYVA-QNGRQDTMEDDIARGELP 210
+ + G+G+ V ++ + +++ + G + +D G
Sbjct: 31 APAVWNQTRGRGVKVAVLDTGCDADHPDLKARIIGGRNFTDDDEGDPEIFKDYNGHGTH- 89

Query: 211 GNGRIRIVKELRDCTDGTDEGRAMAEIIHDVAPGADIVFY-----SGGAGMADFAQGIET 265
GT + VAP AD++ G QGI
Sbjct: 90 --------------VAGTIAATENENGVVGVAPEADLLIIKVLNKQGSGQYDWIIQGIY- 134

Query: 266 LALPKNKSNAQGVAGGGAQVIVDDLQYSYEPAFQSGIVGAAIDNVVKNHGIAYFTAGGND 325
+I S + A+ V I A GN+
Sbjct: 135 -----------YAIEQKVDII----SMSLGGPEDVPELHEAVKKAV-ASQILVMCAAGNE 178

Query: 326 GVGASPVSYINNNARF 341
G G + +
Sbjct: 179 GDGDDRTDELGYPGCY 194


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15600 | BCTERIALGSPG | 201 | 1e-69 | Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 201 bits (512), Expect = 1e-69
Identities = 67/135 (49%), Positives = 89/135 (65%), Gaps = 3/135 (2%)

Query: 21 RGFTLIEIMVVVVILGILAALVVPKIMSRPDEARIIAAKQDIASISQALKLYRLDNGRYP 80
RGFTL+EIMVV+VI+G+LA+LVVP +M ++A A DI ++ AL +Y+LDN YP
Sbjct: 8 RGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDNHHYP 67

Query: 81 TTEQGLAALVTKPTTEPVPNNWKGGGYLERLPKDPWGHPYQYLNPGVRGEVDIFSYGADG 140
TT QGL +LV PT P+ N+ GY++RLP DPWG+ Y +NPG G D+ S G DG
Sbjct: 68 TTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLLSAGPDG 127

Query: 141 QPGGTANDADIGNWD 155
+ G + DI NW
Sbjct: 128 EMGT---EDDITNWG 139


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15605 | BCTERIALGSPH | 53 | 3e-11 | Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 52.6 bits (126), Expect = 3e-11
Identities = 23/75 (30%), Positives = 39/75 (52%), Gaps = 6/75 (8%)

Query: 14 RRGFTLLELLVVVVIIGIVLGVVAVNATPNPRSQLADDAQKLARL---IELAQEEAQLTA 70
+RGFTLLE++++++++G+ G+V + A P R AQ LAR + Q+ T
Sbjct: 3 QRGFTLLEMMLILLLMGVSAGMV-LLAFPASRDD--SAAQTLARFEAQLRFVQQRGLQTG 59

Query: 71 RPVAWEGDAQGWRFL 85
+ W+FL
Sbjct: 60 QFFGVSVHPDRWQFL 74


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15610 | BCTERIALGSPG | 34 | 7e-05 | Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 33.7 bits (77), Expect = 7e-05
Identities = 25/94 (26%), Positives = 38/94 (40%), Gaps = 11/94 (11%)

Query: 8 RRAARRHGFTLLEVLVALTIVAVALTATM-RAMGSMTVASESLQTRMLATWSAENHLAGL 66
R ++ GFTLLE++V + I+ V + + MG+ A + + EN L
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVA--LENALDMY 59

Query: 67 RL-AHAYPEPGMRGFACPQGDAQLWCEETVAPTP 99
+L H YP QG L T+ P
Sbjct: 60 KLDNHHYP-------TTNQGLESLVEAPTLPPLA 86


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15615 | BCTERIALGSPG | 37 | 2e-05 | Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 36.8 bits (85), Expect = 2e-05
Identities = 27/75 (36%), Positives = 39/75 (52%), Gaps = 7/75 (9%)

Query: 1 MRAISSDRVRGFTLLELLVAITLLAILAVLAWRGLDSMTRTHEALAQRDE-RIEALKTAY 59
MRA +D+ RGFTLLE++V I ++ +LA L L M +A Q+ I AL+ A
Sbjct: 1 MRA--TDKQRGFTLLEIMVVIVIIGVLASLVVPNL--MGNKEKADKQKAVSDIVALENAL 56

Query: 60 AQFDADCTQLADPST 74
+ D P+T
Sbjct: 57 DMYKLDNHHY--PTT 69


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15640 | BCTERIALGSPD | 393 | e-128 | Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 393 bits (1011), Expect = e-128
Identities = 205/706 (29%), Positives = 314/706 (44%), Gaps = 94/706 (13%)

Query: 46 PAASNPGDEVSLNFVNADLETVVKAVGQATGKNFIVDPRVKGTVNLVTEKPVTRAQALES 105
PAA+ E S +F D++ + V + K I+DP V+GT+ + + + Q +
Sbjct: 24 PAAAE---EFSASFKGTDIQEFINTVSKNLNKTVIIDPSVRGTITVRSYDMLNEEQYYQF 80

Query: 106 LGSILRMQGYAIVE-GNGFTKVVPEADAKLQGSPTSVGPGGARGGEQVVTQVFRLQYESA 164
S+L + G+A++ NG KVV DAK P + G E VVT+V L +A
Sbjct: 81 FLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAPGIGDE-VVTRVVPLTNVAA 139

Query: 165 NNLVPVLRPMI--APNNTITAYPANNTLVITDYADNLRRIARIITSIDSPAAGETELIAL 222
+L P+LR + A ++ Y +N L++T A ++R+ I+ +D+ + L
Sbjct: 140 RDLAPLLRQLNDNAGVGSVVHYEPSNVLLMTGRAAVIKRLLTIVERVDNAGDRSVVTVPL 199

Query: 223 KNAVAIDAAATLQKLLDPSGTAGGAGAGAALADPSLRTSVVAEPRSNSVLVRASSAARMA 282
A A D + +L + + G+ A +VVA+ R+N+VLV +R
Sbjct: 200 SWASAADVVKLVTELNKDTSKSALPGSMVA--------NVVADERTNAVLVSGEPNSR-Q 250

Query: 283 QAKQLLAKLDVPGTRPGNIWVVPLKNANAVQLATTLRAIVAADATLSASQSGGPGGQSAA 342
+ ++ +LD GN V+ LK A A L L I
Sbjct: 251 RIIAMIKQLDRQQATQGNTKVIYLKYAKASDLVEVLTGI--------------------- 289

Query: 343 QGAQAQQPATTGTQTSQNTQTGSYSSSSGSSSGMGSGNSSFRASFGQSNLPTTGGIIQAD 402
SS+ S ++ + II+A
Sbjct: 290 ------------------------SSTMQSEKQAAKPVAALDKNI----------IIKAH 315

Query: 403 PATNALIITASEPVYRNLRTVIDDLDARRAQVYIESMIVEVTSDKASQLGIQWMVGAGGP 462
TNALI+TA+ V +L VI LD RR QV +E++I EV LGIQW
Sbjct: 316 GQTNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLNLGIQW----ANK 371

Query: 463 NTYGFGGTNFGSGVGNILNLGVIAATVGSGGIGSTAAQTALGSITGSNVSGLNGGNFGVF 522
N TN G + + G + S S +S NG G +
Sbjct: 372 NAGMTQFTNSGLPISTAI-----------AGANQYNKDGTVSSSLASALSSFNGIAAGFY 420

Query: 523 NKNTGLGAILSALGSDGSVNVLSTPNLITLDNEEAKILIGQNVPITTGSYAQTGSSASVT 582
N +L+AL S ++L+TP+++TLDN EA +GQ VP+ TGS +G +
Sbjct: 421 QGN--WAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTGSQTTSGDN---- 474

Query: 583 PFQTFDRKDVGITLRVKPQITDGGMVKMQIFQESSAVVNGTQNATQ--GPTTNVRSIETN 640
F T +RK VGI L+VKPQI +G V ++I QE S+V + + + G T N R++
Sbjct: 475 IFNTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSSVADAASSTSSDLGATFNTRTVNNA 534

Query: 641 VIANDGQVIVLGGLLEDNYQDSEQKVPGLGDIPVLGALFRSESKTRKKTNLLVFLRPYIL 700
V+ G+ +V+GGLL+ + D+ KVP LGDIPV+GALFRS SK K NL++F+RP ++
Sbjct: 535 VLVGSGETVVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVI 594

Query: 701 RTAEATGALSDNRYNYMRDAQQGFVSPNVLPTTSDRDTPVLPSPES 746
R + S +Y DAQ ++D + +
Sbjct: 595 RDRDEYRQASSGQYTAFNDAQSKQRGKENNDAMLNQDLLEIYPRQD 640


82 | RS_RS15720 | RS_RS15755 | N | Y | N | Pathogenicity Island (unbiased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
RS_RS15720 | 2 | 9 | 3.831320 | Fis family transcriptional regulator
RS_RS15725 | 0 | 9 | 2.728715 | alcohol dehydrogenase
RS_RS15730 | 1 | 11 | 3.019980 | lactate dehydrogenase
RS_RS15735 | 1 | 12 | 3.446506 | DNA-binding protein
RS_RS15740 | 1 | 12 | 3.263562 | N-acetyltransferase
RS_RS15745 | 1 | 12 | 3.239625 | MFS transporter
RS_RS15750 | -2 | 12 | 2.812906 | ferritin
RS_RS15755 | -1 | 11 | 3.180360 | chemotaxis protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15720 | HTHFIS | 340 | e-112 | FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 340 bits (873), Expect = e-112
Identities = 143/430 (33%), Positives = 211/430 (49%), Gaps = 51/430 (11%)

Query: 251 IRAANQSALNLLGRTRQSLLGQPV-----ETVFDLTADALLARATDSGGVAWPLHTHQGR 305
+ +++A +LL R +++ PV + F A A D + P +
Sbjct: 55 VVMPDENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDY--LPKPFDLTELI 112

Query: 306 LLFGLLRAPRPRPTADAAPASPPMPATPGLCLQADGLRTGFHRALRVFEHDVPLLLHGET 365
G++ P + L ++ ++ + R+ + D+ L++ GE+
Sbjct: 113 ---GIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGES 169

Query: 366 GTGKEAFARAVHQASSRAAQPFVAVNCAAIPETLIESELFGYRGGSFTGARREGMRGKLQ 425
GTGKE ARA+H R PFVA+N AAIP LIESELFG+ G+FTGA+ G+ +
Sbjct: 170 GTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRST-GRFE 228

Query: 426 QADGGTLFLDEIGDMPLALQSRLLRVLEERAVVPIGG-EAQTVDVRIVSASHRDMEARVR 484
QA+GGTLFLDEIGDMP+ Q+RLLRVL++ +GG DVRIV+A+++D++ +
Sbjct: 229 QAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSIN 288

Query: 485 DGRFREDLYYRLNGLRITLPPLRERADKAALLAHVLAEESR---GRPARLDDDARDALLA 541
G FREDLYYRLN + + LPPLR+RA+ L +++ R D +A + + A
Sbjct: 289 QGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKA 348

Query: 542 QPWPGNVRQLRNVLRTLVALSDDGRIRLRDLPPELRPPAMASAAAPAP------------ 589
PWPGNVR+L N++R L AL I + ELR S A
Sbjct: 349 HPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAV 408

Query: 590 ------------------------LDNAEKAALLAALQAQQWRMTHAAKALGISRNTLYR 625
L E +LAAL A + AA LG++RNTL +
Sbjct: 409 EENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRK 468

Query: 626 KLRKHGIARP 635
K+R+ G++
Sbjct: 469 KIRELGVSVY 478


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15725 | DHBDHDRGNASE | 28 | 0.044 | 2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 28.1 bits (62), Expect = 0.044
Identities = 29/109 (26%), Positives = 44/109 (40%), Gaps = 13/109 (11%)

Query: 169 GQWIAISGIG-GLGHVAVQYAVAMGLHVVAVDVAPEKLALARELGARLSVDASQ-----Q 222
G+ I+G G+G + + G H+ AVD PEKL + A +
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVR 67

Query: 223 DPAAV------IQKEVGGV-HGVLVTAVSRSAFAQALGMVRRGGTVSLN 264
D AA+ I++E+G + V V V R +L T S+N
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVN 116


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15735 | HTHTETR | 27 | 0.042 | TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 27.3 bits (60), Expect = 0.042
Identities = 18/156 (11%), Positives = 48/156 (30%), Gaps = 15/156 (9%)

Query: 18 ERIARRVRDLRAARGY---TLDALAARCGVSRSMISLIERGAASPTAVVLDKLAAGLGVS 74
+ I L + +G +L +A GV+R I + + + ++ +
Sbjct: 14 QHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSD----LFSEIWELSESN 69

Query: 75 LASLFGGEREGVPAQPLMRRAQQAQWRDPASGYVRRNLSPPDWPSPIQLVEVNFPAGARV 134
+ L + P PL + R+ + ++ ++++ +
Sbjct: 70 IGELELEYQAKFPGDPL------SVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEM 123

Query: 135 AYETGGRENAMQQQVWVIDGRIDVMLGDQRHELHPG 170
A + N + I+ + + + L
Sbjct: 124 AVVQQAQRNLCLESYDRIEQTLKHCI--EAKMLPAD 157


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15755 | RTXTOXIND | 30 | 0.031 | Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.031
Identities = 31/264 (11%), Positives = 67/264 (25%), Gaps = 29/264 (10%)

Query: 325 DELEADLVSARNRFLLLGLALGALLAGGLYWMLRRAVSAPLAEVAGVARRVAAGDLTHR- 383
LE R L+ + L +V + VA ++ +
Sbjct: 44 AHLELIETPVSRRPRLVAYFIMGFLVIAFIL----SVLGQVEIVATANGKLTHSGRSKEI 99

Query: 384 --FSGTRRDEI----------GQLMHAINGLGDGLSGIVDKVRASASTIASSTGQIAAGN 431
+ EI G ++ + LG + + + + + QI + +
Sbjct: 100 KPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRS 159

Query: 432 VDLSARTEAQAGNLERTASSIEQLAATVRQNADSAQHAHDMVQSASEAANAGGRTVARLV 491
++L+ E + + + E+ + + E L
Sbjct: 160 IELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELN---------LD 210

Query: 492 GTMSGIHTTAQKIADITGIIDGIAFQTNI---LALNAAVEAARAGEQGRGFAVVAGEVRS 548
+ T +I + + + L A+ EQ + E+R
Sbjct: 211 KKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRV 270

Query: 549 LAQRSAAAAKEIKELISRSVQEVQ 572
+ EI Q
Sbjct: 271 YKSQLEQIESEILSAKEEYQLVTQ 294


83 | RS_RS15880 | RS_RS15915 | N | Y | N | Pathogenicity Island (unbiased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
RS_RS15880 | -1 | 15 | -0.534890 | histidine kinase
RS_RS15885 | -2 | 14 | 0.041013 | LuxR family transcriptional regulator
RS_RS15890 | -2 | 17 | -1.102985 | hemagglutinin
RS_RS15895 | 0 | 22 | -1.473088 | hypothetical protein
RS_RS15900 | -1 | 26 | -3.088567 | glycosyl transferase family 8
RS_RS15905 | -2 | 23 | -3.692481 | esterase
RS_RS15910 | -1 | 24 | -4.162927 | adenylylsulfate kinase
RS_RS15915 | -1 | 26 | -4.472265 | hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15880 | PF06580 | 33 | 0.003 | Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.3 bits (76), Expect = 0.003
Identities = 11/71 (15%), Positives = 28/71 (39%), Gaps = 5/71 (7%)

Query: 571 EIRVGTAVESTGVLVSIVDNGRGFDVEKTLAAGKGNGLRNLQRRAQAI---DGTVRWTSG 627
+I + ++ V + + + G K G GL+N++ R Q + + ++ +
Sbjct: 280 KILLKGTKDNGTVTLEVENTGSLA--LKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEK 337

Query: 628 PEGTQFTLWLP 638
+ +P
Sbjct: 338 QGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15885 | HTHFIS | 59 | 2e-12 | FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 59.5 bits (144), Expect = 2e-12
Identities = 21/118 (17%), Positives = 49/118 (41%), Gaps = 2/118 (1%)

Query: 10 LPSIRVAIVEDDSGFLDALTQALACAPDMHLTGVAGSRAEGLLLLDGAPADVLLVDLGLP 69
+ + + +DD+ L QAL+ A + + + A + D+++ D+ +P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 70 DGSGIDVIEAAARKWSACSIMVSTNFGDEIHVMRSIEVGAAGYLLKDSSAARILDEIR 127
D + D++ + ++V + + +++ E GA YL K ++ I
Sbjct: 59 DENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15890 | IGASERPTASE | 36 | 0.001 | IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.8 bits (82), Expect = 0.001
Identities = 34/141 (24%), Positives = 59/141 (41%), Gaps = 20/141 (14%)

Query: 751 DQIDGYSANYGGLLIGADREINDRWRAGGVFSYSNTAIDNTGDTSGNS---ARVNGYGLI 807
Q +S+ +G D+ I++ + GGVF+Y N D + + A+VN Y
Sbjct: 1310 SQYRRFSSKSTQTQLGWDQTISNNVQLGGVFTYVRN--SNNFDKATSKNTLAQVNFYS-- 1365

Query: 808 GYASYTGNPWYVNLS-GAAVQQRYDTSRLVSMQGLSGTASGHFSGQQYVARTEAGYPLSV 866
Y N WY+ + G Q S+L + + F+ AG ++
Sbjct: 1366 --KYYADNHWYLGIDLGYGKFQ----SKLQTNH------NAKFARHTAQFGLTAGKAFNL 1413

Query: 867 GSVTLTPLASLTYSYLTQDSY 887
G+ +TP+ + YSYL+ +
Sbjct: 1414 GNFGITPIVGVRYSYLSNADF 1434


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15895 | SYCDCHAPRONE | 42 | 1e-06 | Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 42.2 bits (99), Expect = 1e-06
Identities = 19/93 (20%), Positives = 33/93 (35%)

Query: 86 NLAEMYRQKGLLAEAEDAARRAVAMDPALVSGWNNLGIVLQEAGKFAESLDCLERVILLQ 145
+LA Q G +A + +D + LG Q G++ ++ ++
Sbjct: 41 SLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMD 100

Query: 146 PDGAQAHNNLANTLRRLERLERAESHYRQALEL 178
+ + A L + L AES A EL
Sbjct: 101 IKEPRFPFHAAECLLQKGELAEAESGLFLAQEL 133



Score = 29.1 bits (65), Expect = 0.040
Identities = 22/124 (17%), Positives = 42/124 (33%), Gaps = 1/124 (0%)

Query: 211 ELSPRLVDAYLNLAEAEMGRHRHEAALRVLDTLSTFAPQHPAALTARANVLKRVERLDEA 270
E+S ++ +LA + ++E A +V L + + + D A
Sbjct: 30 EISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLA 89

Query: 271 LAVARQAVVLAPRSAEAHHALAMALQTLGQTDEALPHFEQAARLPGAVAE-EALVGRATL 329
+ ++ + A L G+ EA A L E + L R +
Sbjct: 90 IHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFLAQELIADKTEFKELSTRVSS 149

Query: 330 LMEA 333
++EA
Sbjct: 150 MLEA 153


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15905 | PF06057 | 29 | 0.020 | Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 29.0 bits (65), Expect = 0.020
Identities = 19/100 (19%), Positives = 34/100 (34%), Gaps = 8/100 (8%)

Query: 136 YWTGKRFAPEAVAS-IDQAVSHFAAKVPGQRIHLIGYSGG-----GALAVLVAARRTDVA 189
YW K P+ V + + A+ Q++ LIGYS G L + A R +V
Sbjct: 90 YWKQK--DPKDVTQDTLAIIDKYQAEFGTQKVILIGYSFGAEVIPFVLNEMPARYRKNVL 147

Query: 190 SIRTVAGNLDHAFVNRLHDVSSMPQSENAIDFAQRVASIP 229
++ + F + ++ + V
Sbjct: 148 GAVLLSPSQSSDFEIHVSEMVTSDNQSARYLTLPEVNKQT 187


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15915 | SYCDCHAPRONE | 31 | 0.003 | Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 31.1 bits (70), Expect = 0.003
Identities = 11/59 (18%), Positives = 19/59 (32%)

Query: 135 SVAIILYLARRYDQAREELNKALEIDPNHFLLHFRLGLVYQQQKLFHDAIEEMQKAVTL 193
S+A Y + +Y+ A + +D LG Q + AI +
Sbjct: 41 SLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIM 99


84 | RS_RS15950 | RS_RS15980 | N | Y | N | Pathogenicity Island (unbiased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
RS_RS15950 | 2 | 26 | -4.207741 | hypothetical protein
RS_RS15955 | 0 | 17 | -0.995197 | hypothetical protein
RS_RS15960 | -1 | 16 | -1.298335 | hypothetical protein
RS_RS15965 | 2 | 19 | -0.413372 | hypothetical protein
RS_RS15970 | 2 | 19 | -0.260779 | hypothetical protein
RS_RS15975 | 2 | 19 | -0.436652 | activation/secretion signal peptide protein
RS_RS15980 | 3 | 23 | -0.930252 | hemagglutinin
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15950 | MICOLLPTASE | 31 | 0.002 | Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 30.8 bits (69), Expect = 0.002
Identities = 15/43 (34%), Positives = 22/43 (51%), Gaps = 2/43 (4%)

Query: 37 IGVAAKISDLYPHVINVYMNATLPPLRKKSWAGYYFKTLIPYF 79
+G ++ + +N +N L L KKSW GY KT+ YF
Sbjct: 701 VGGRSQGEENDWKDMNSKLNDILKELSKKSWNGY--KTVTAYF 741


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15955 | SACTRNSFRASE | 36 | 2e-05 | Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 36.1 bits (83), Expect = 2e-05
Identities = 17/86 (19%), Positives = 31/86 (36%), Gaps = 8/86 (9%)

Query: 54 TAPACKYFVAMESGLLVGVCALK----DRKHIYHLFVAPEAQGRGVARALWEYARADAEL 109
A F+ +G ++ I + VA + + +GV AL A A+
Sbjct: 63 EGKAA--FLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKE 120

Query: 110 DGATGSF--TVNSSLHAVPVYERLGF 133
+ G T + ++ A Y + F
Sbjct: 121 NHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15975 | PF03544 | 30 | 0.023 | Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 29.9 bits (67), Expect = 0.023
Identities = 12/70 (17%), Positives = 20/70 (28%), Gaps = 2/70 (2%)

Query: 9 AAALLSVATAHAQV-LPPAPPGPVLEDPAQRAQRERQDTERRREATQPAPQIAVAPSVPD 67
A L + ++ P P + PA + +P P+ P P
Sbjct: 30 AGLLYTSVHQVIELPAPAQPISVTMVAPADLEPPQAVQPPPEPVV-EPEPEPEPIPEPPK 88

Query: 68 DAAVDAVAEP 77
+A V
Sbjct: 89 EAPVVIEKPK 98


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS15980 | PF05860 | 76 | 1e-18 | haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 75.6 bits (186), Expect = 1e-18
Identities = 28/117 (23%), Positives = 47/117 (40%), Gaps = 4/117 (3%)

Query: 28 AGIVPDG--GTATTVTTGANGRPVVNIAPSTAGVSHNIYTSFNVGPVGADLNNAIVRART 85
A I PD + +TT N R + + + + H+ + F+V G N +
Sbjct: 1 AQITPDTTLPINSNITTEGNTRIIERGTQAGSNLFHS-FQEFSVPTSGTAFFNNPTNIQN 59

Query: 86 IVNQVTSTDPSLIQGNIAVLGPRANVIIANPNGITVDGGSFTNTGNVALTTGQVSFN 142
I+++VT S I G I AN+ + NPNGI + + G + +
Sbjct: 60 IISRVTGGSVSNIDGLIRANA-TANLFLINPNGIIFGQNARLDIGGSFVGSTANRLK 115


85 | RS_RS16050 | RS_RS16070 | N | Y | N | Pathogenicity Island (unbiased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
RS_RS16050 | 2 | 26 | -4.068098 | multidrug transporter
RS_RS16055 | 1 | 28 | -4.375671 | hemolysin secretion protein D
RS_RS16060 | 2 | 27 | -4.185581 | AcrR family transcriptional regulator
RS_RS16065 | 5 | 33 | -2.829700 | RND transporter
RS_RS16070 | 4 | 31 | -3.486156 | short-chain dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
RS_RS16060 | ACRIFLAVINRP | 452 | e-144 | Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 452 bits (1164), Expect = e-144
Identities = 226/1054 (21%), Positives = 423/1054 (40%), Gaps = 69/1054 (6%)

Query: 8 LSALAVRERAVTLFLICLISLAGLISFFKLGRAEDPAFTVKVMTIITAWPGATAQEMQDQ 67
++ +R L ++ +AG ++ +L A+ P +++ +PGA AQ +QD
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 68 VAEKIEKRIQELRWYDRTETYT-RPGLAFTTLTLLDSTPPSEVQEEFYQARKKVSDEVAN 126
V + IE+ + + + + G TLT T P Q Q + K+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQV---QVQNKLQLATPL 117

Query: 127 LPPGVIGPMVNDEYADVTFAL---FALKAQGEPQRVLVRDAE-TLRQRLLHVPGVKKVNI 182
LP V ++ E + ++ + F G Q + ++ L + GV V +
Sbjct: 118 LPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQL 177

Query: 183 IGEQSE-RIYVEFSHDRLATLGVSPQEVFAALNNQNALTAAGSVET------KGPQVFIR 235
G Q RI+++ D L ++P +V L QN AAG + + I
Sbjct: 178 FGAQYAMRIWLDA--DLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASII 235

Query: 236 LDGAFDELQKIRDTPVVSQ--GRTLKLSDIATVKRGYEDPATFMVRNGGQPALLLGIVMR 293
F ++ + G ++L D+A V+ G E+ R G+PA LGI +
Sbjct: 236 AQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVI-ARINGKPAAGLGIKLA 294

Query: 294 EGWNGLDLGKALDKEVGAINADMPLGMSLTKVTDQAVNISAAVDEFMLK-FFAALLVVML 352
G N LD KA+ ++ + P GM + D + ++ E + F A +LV ++
Sbjct: 295 TGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLV 354

Query: 353 VSFVSMGWRAGLVVAAAVPLTLAVVFVVMAATGKNFDRITLGSLILALGLLVDDAIIAIE 412
+ RA L+ AVP+ L F ++AA G + + +T+ ++LA+GLLVDDAI+ +E
Sbjct: 355 MYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVE 414

Query: 413 MMV-VKMEEGYSRVAASAYAWSHTAAPMLSGTLVTAVGFMPNGFARSTAGEYTSNMFWIV 471
+ V ME+ A+ + S ++ +V + F+P F + G +
Sbjct: 415 NVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITI 474

Query: 472 GIALIASWVVAVVFTPYLGVKMLPDFKKVE--------GGHDAIYDTPRYNRFRALLGRV 523
A+ S +VA++ TP L +L G + +D N + +G++
Sbjct: 475 VSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFD-HSVNHYTNSVGKI 533

Query: 524 IARKWLVAGSVVGLFALAILGMAVVKKQFFPISDRPEVLVEVQMPYGTSINQTSAATAKL 583
+ + A ++ + F P D+ L +Q+P G + +T ++
Sbjct: 534 LGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQV 593

Query: 584 EAWLAKQKEAQIVTSYVGQGAPRFYLAMGPELPDSSFAKIVI-----RTDSQEERDALKQ 638
+ K ++A + + + G + + ++ A + + R + +A+
Sbjct: 594 TDYYLKNEKANVESVFTVNG-----FSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIH 648

Query: 639 RLRQAIADGLAPEARVRVTQLVFGPYSPFPVAYRVTGPDAETLRRIAADVRQVMDAS--- 695
R + + + V + TG D E + + + A
Sbjct: 649 RAKMELGK--IRDGFVIPFNM-----PAIVELGTATGFDFELIDQAGLGHDALTQARNQL 701

Query: 696 --------PMMRTVNTDWGMRVPTLHFTLQQDRLQAVGLTSSAVAQQLQFLLNGIPVTAV 747
+ +V + + Q++ QA+G++ S + Q + L G V
Sbjct: 702 LGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDF 761

Query: 748 REDIRTVQVTARSAGDIRLDPAKIGDFTLAGANGQRVPLSQVGKIDVRMEEPIIRRRDRM 807
+ R ++ ++ R+ P + + ANG+ VP S P + R + +
Sbjct: 762 IDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGL 821

Query: 808 PTITVRGDIADGLQPPDVSTALSKQLQPIIEKLPSGYRIEQAGSIEESGKATKAMLPLFP 867
P++ ++G+ A G D + + KLP+G + G + + L
Sbjct: 822 PSMEIQGEAAPGTSSGDAMALMEN----LASKLPAGIGYDWTGMSYQERLSGNQAPALVA 877

Query: 868 IMLAVTLLILIFQVRSIPAMVMVFLTSPLGLIGVVPTLILFGQPFGINALVGLIALSGIL 927
I V L L S V V L PLG++GV+ LF Q + +VGL+ G+
Sbjct: 878 ISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLS 937

Query: 928 MRNTLILIGQIQHNKE-EGLDPFTAVVEATVQRARPVILTALAAILAFIPLTHSVFWGA- 985
+N ++++ + E EG A + A R RP+++T+LA IL +PL S G+
Sbjct: 938 AKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSG 997

Query: 986 ----LAYTLIGGTFAGTILTLVFLPAMYSIWFRI 1015
+ ++GG + T+L + F+P + + R
Sbjct: 998 AQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRC 1031



Score = 83.7 bits (207), Expect = 2e-18
Identities = 55/323 (17%), Positives = 126/323 (39%), Gaps = 20/323 (6%)

Query: 712 LHFTLQQDRLQAVGLT----SSAVAQQLQFLLNGIPVTAVREDIRTVQVTARSAGDIRLD 767
+ L D L LT + + Q + G + + + + + +
Sbjct: 184 MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK-N 242

Query: 768 PAKIGDFTL-AGANGQRVPLSQVGKIDVRMEEPIIRRR-DRMPTITVRGDIADGLQPPDV 825
P + G TL ++G V L V ++++ E + R + P + +A G D
Sbjct: 243 PEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDT 302

Query: 826 STALSKQLQPIIEKLPSGYRIE----QAGSIEESGKATKAMLPLFPIMLAVTLLILIFQV 881
+ A+ +L + P G ++ ++ S L IML L++ +F +
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTL-FEAIMLVF-LVMYLF-L 359

Query: 882 RSIPAMVMVFLTSPLGLIGVVPTLILFGQPFGINALVGLIALSGILMRNTLILIGQIQ-H 940
+++ A ++ + P+ L+G L FG + G++ G+L+ + ++++ ++
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 941 NKEEGLDPFTAVVEATVQRARPVILTALAAILAFIPL-----THSVFWGALAYTLIGGTF 995
E+ L P A ++ Q ++ A+ FIP+ + + + T++
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 996 AGTILTLVFLPAMYSIWFRIRPN 1018
++ L+ PA+ + +
Sbjct: 480 LSVLVALILTPALCATLLKPVSA 502


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
RS_RS16065  RTXTOXIND  40     1e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.8 bits (93), Expect = 1e-05
Identities = 18/118 (15%), Positives = 40/118 (33%), Gaps = 9/118 (7%)

Query: 70 GKVLERLVDAGQTVKRGQPLMRIDPVD-----LKLAAHAQQEAVSAARARAQQTA--EDE 122
V E +V G++V++G L+++ + LK + Q + R + + ++
Sbjct: 105 SIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNK 164

Query: 123 ARYRDLRGTGAISASAYDQAKAAADAAKAQLSAAEAQADVARNASRYAELLADGDGVV 180
L + ++ K Q S + Q + A+ V+
Sbjct: 165 LPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKE--LNLDKKRAERLTVL 220


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS16070  HTHTETR  61     8e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 61.2 bits (148), Expect = 8e-14
Identities = 41/199 (20%), Positives = 66/199 (33%), Gaps = 16/199 (8%)

Query: 17 DVRDQIVAAATEHFSRYGYEKTAVSDLAKAIGFSKAYIYKFFESKQAIGEMICANCLREI 76
+ R I+ A FS+ G T++ ++AKA G ++ IY F+ K + I I
Sbjct: 11 ETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNI 70

Query: 77 EADVSAALAE-ADSPPEKLRRMFKAL-----TEASLRLFSHDRKLYEIAASAATERWQAV 130
A+ P LR + + TE RL QA
Sbjct: 71 GELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQ 130

Query: 131 IAYEERIQKVLRDVLQEGRETGDFERKTPLDEATTAIYLVMRPYMNPLLLQY-----SFD 185
+ L+ E L AI +MR Y++ L+ + SFD
Sbjct: 131 RNLCLESYDRIEQTLKHCIEAKML--PADLMTRRAAI--IMRGYISGLMENWLFAPQSFD 186

Query: 186 YTESAPGLLSSLVLRSLSP 204
+ A +++L
Sbjct: 187 LKKEA-RDYVAILLEMYLL 204


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS16080  DHBDHDRGNASE  104    6e-29    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 104 bits (260), Expect = 6e-29
Identities = 53/186 (28%), Positives = 86/186 (46%), Gaps = 8/186 (4%)

Query: 5 KVVLITGVSSGIGRATAAKFALRGCRVFGTVRNLAKAQPIPGVELVE--------MDIRD 56
K+ ITG + GIG A A A +G + N K + + E D+RD
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 57 QASVQQGIQTLIAQAKRIDVLVNSAGVTLLGATEETSIAEAQALFDTNVFGILRTTQAAL 116
A++ + + + ID+LVN AGV G S E +A F N G+ +++
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 117 PHMRAQRSGRIINISSVLGFLPAPYMGLYSASKHAVEGMSETLDHEVRKFGIRVVLVEPS 176
+M +RSG I+ + S +P M Y++SK A ++ L E+ ++ IR +V P
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 177 FTKTSL 182
T+T +
Sbjct: 189 STETDM 194


86  RS_RS16955  RS_RS17010  N        Y        N        Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
RS_RS16955  -1      11       1.944138   potassium-transporting ATPase C chain
RS_RS16960  -3      11       1.119663   two-component system sensor histidine kinase
RS_RS16965  -2      12       -0.840988  dioxygenase
RS_RS16970  -1      10       -0.788424  diguanylate cyclase
RS_RS16975  1       19       0.763954   LuxR family transcriptional regulator
RS_RS16980  1       19       -0.796162  porin
RS_RS16985  0       15       0.228472   nitrate ABC transporter substrate-binding
RS_RS16990  0       16       0.711460   two-component system response regulator
RS_RS16995  0       15       0.957679   endonuclease DDE
RS_RS17000  0       15       1.008939   restriction endonuclease subunit R
RS_RS17005  -1      9        0.539002   membrane protein
RS_RS17010  1       8        2.081836   DNA methyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS16955  TCRTETB  28     0.033    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 27.9 bits (62), Expect = 0.033
Identities = 15/54 (27%), Positives = 25/54 (46%), Gaps = 3/54 (5%)

Query: 21 AALVIFVGLSLVTGVLYPVVVTGIGKAAFPAQAGGSI---IERGGKPVGSALIG 71
+++ FVG S + ++ + G G AAFPA + I + + LIG
Sbjct: 92 GSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIG 145


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS16970  LIPPROTEIN48  29     0.023    Mycoplasma P48 major surface lipoprotein signature.
		>LIPPROTEIN48#Mycoplasma P48 major surface lipoprotein signature.

Length = 428

Score = 28.8 bits (64), Expect = 0.023
Identities = 16/58 (27%), Positives = 29/58 (50%), Gaps = 4/58 (6%)

Query: 41 QIWEVVQKVAKKDGLDIKIIEFNDYAQP--NPALDAGD--LDANGFQHQPFLDGQVKA 94
+E ++ + K+ G++I +E + + N AL AG NGF+HQ + + A
Sbjct: 81 SAFEALKAINKQTGIEINNVEPSSNFESAYNSALSAGHKIWVLNGFKHQQSIKQYIDA 138


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
RS_RS16975  HTHFIS  53     8e-10    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 53.3 bits (128), Expect = 8e-10
Identities = 17/73 (23%), Positives = 27/73 (36%), Gaps = 1/73 (1%)

Query: 9 VLVVDGQLGQRTAARQVLQTLGVQQILAAGDGREALDLLHLCQFDLVLCDIAMPGMDGSA 68
+LV D RT Q L G + + + DLV+ D+ MP +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYD-VRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 69 LMQRMRLHDGRPP 81
L+ R++ P
Sbjct: 65 LLPRIKKARPDLP 77


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
RS_RS16980  HTHFIS  41     1e-06    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 41.4 bits (97), Expect = 1e-06
Identities = 21/83 (25%), Positives = 36/83 (43%), Gaps = 5/83 (6%)

Query: 6 SIVAADDHPVILMGLATAIQQFPRQKIIAQAHDGHALLALLESTDCDVVVTDYHMNGDAA 65
+I+ ADD I L A+ + + + L + + D D+VVTD M +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYD--VRITSNAATLWRWIAAGDGDLVVTDVVMPDE-- 60

Query: 66 SDGMQLLTRLRARHPQRAVVVCT 88
+ LL R++ P V+V +
Sbjct: 61 -NAFDLLPRIKKARPDLPVLVMS 82


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
RS_RS16985  ECOLNEIPORIN  56     8e-11    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 55.6 bits (134), Expect = 8e-11
Identities = 71/362 (19%), Positives = 122/362 (33%), Gaps = 56/362 (15%)

Query: 25 AQTSAVTLYGLIDTTISTVSNTNAAGARTTGFQVP---WFSGSRWGLTGKEDLGGGTSAI 81
A + VTLYG I + T + GA+ + GS+ G G+EDLG G AI
Sbjct: 16 AAMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLGSKIGFKGQEDLGNGLKAI 75

Query: 82 FKLESEFETPTGNMDTPGVLFNRDAWVGLSNPLFGKLTFGRQNALARDVSGIYGDPYTSA 141
+++E + T NR +++GL FGKL GR N++ +D I +P+ S
Sbjct: 76 WQVEQK----ASIAGTDSGWGNRQSFIGLKGG-FGKLRVGRLNSVLKDTGDI--NPWDSK 128

Query: 142 EVTLDEGGYTNVNNFKQLIFYAGSATGTRLNNGVVWKKLWDGGFFTGLAYQFGEVPGQFS 201
L + Y G + Y + G
Sbjct: 129 SDYLGVNKIAEPEARLISVRYDSPEF---------------AGLSGSVQYALNDNAG-RH 172

Query: 202 QNTTESAALGYNGGNFHLAAFAQQAKVNGFTDRSYSLGGNV--ILGMFRVNAGYYHYTGE 259
+ + A Y G F + + + + + ++ + +A Y +
Sbjct: 173 NSESYHAGFNYKNGGFFVQYGGAYKRHHQVQENVNIEKYQIHRLVSGYDNDALYASVAVQ 232

Query: 260 QPAAIGNRKDDAYTVSLKIAPPGAFDYE-----IGYQIMKAGNAAFNADGNTLNAYADVS 314
Q A ++ ++ ++A A+ + + Y G+ + N V
Sbjct: 233 QQDAKLVEENYSHNSQTEVAATLAYRFGNVTPRVSYAHGFKGSFD-ATNYNNDYDQVVVG 291

Query: 315 TATASGSGRKNTLYGSVFYHVSKRTEFYVAADYMKLKDQYIVGSTNGHNSQTEFGVGMRT 374
Y SKRT V+A +++ G T GVG+R
Sbjct: 292 AE----------------YDFSKRTSALVSAGWLQE------GKGESKFVSTAGGVGLRH 329

Query: 375 RF 376
+F
Sbjct: 330 KF 331


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
RS_RS16995  HTHFIS  92     8e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 92.2 bits (229), Expect = 8e-24
Identities = 32/112 (28%), Positives = 59/112 (52%)

Query: 2 KLLLIEDNAPLAQWLSEALRRAQFTVDHASDGESADTLLLTQHYDVVLLDLQLPSLSGRA 61
+L+ +D+A + L++AL RA + V S+ + + D+V+ D+ +P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLQRLRDRRNAVPVMILTAAGGIDDKVACLGAGADDYLVKPFEIRELIARIQ 113
+L R++ R +PV++++A + GA DYL KPF++ ELI I
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
RS_RS17020  PF09025  34     5e-04    YopR Core
		>PF09025#YopR Core

Length = 143

Score = 33.8 bits (77), Expect = 5e-04
Identities = 20/74 (27%), Positives = 34/74 (45%)

Query: 76 WSVFRTLSPQAMFDVVSTRGIPFLQALGDNDPAAGRHMEDIRFTLTTPALLARIVQLLDA 135
++V+ + SP+A + + F QALG PAAGR + + LL R Q L
Sbjct: 10 FAVYPSASPKAANLPAVDQVLAFEQALGGEPPAAGRRLAGLENGALGERLLQRFAQPLQG 69

Query: 136 IPLHRRDVRGAVYE 149
+ R +++ +
Sbjct: 70 LEADRLELKAMLRA 83



 
Contact Sachin Pundhir for Bugs/Comments.