PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: 2266.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
 
C) Potential GIs and PAIs in NC_004337
S.No  Start   End     Bias  Virulence  Insertion elements  Prediction
1     SF0052  SF0059  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF0052    -2      16       3.402546    Dna-J like membrane chaperone protein
SF0053    -2      15       3.360166    hypothetical protein
SF0054    -2      15       3.325900    ATP-dependent helicase HepA
SF0055    0       14       3.689127    DNA polymerase II
SF0056    1       17       4.597826    L-ribulose-5-phosphate 4-epimerase
SF0058    1       17       4.175727    ribulokinase
SF0059    0       17       3.436910    DNA-binding transcriptional regulator AraC
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0052    56KDTSANTIGN  29     0.023    Rickettsia 56kDa type-specific antigen protein sign...
>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.
Length = 533

Score = 28.8 bits (64), Expect = 0.023
Identities = 32/120 (26%), Positives = 51/120 (42%), Gaps = 18/120 (15%)

Query: 157 IAEELGISRAQFD-----QFLRMMQGGAQFGGGYQQQSGGGNWQQAQRGPTLEDACNVLG 211
EEL R FD F+ + QQQ G G QQAQ T ++A
Sbjct: 310 TLEEL---RDSFDGYINNAFVNQIHLNFVMPPQAQQQQGQGQQQQAQ--ATAQEAVAAAA 364

Query: 212 VKPTDDATTIKRAYRKLMS-EHHPDKLVAKGLPPEMMEMAKQKAQEIQ-QAYELIKQQKG 269
V+ + + I + Y+ L+ + H G+ M ++A Q+ ++ + Q KQQ+G
Sbjct: 365 VRLLNGSDQIAQLYKDLVKLQRH------AGIRKAMEKLAAQQEEDAKNQGKGDCKQQQG 418
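
The Score, Expect and Identities lines of each hit follow the usual BLAST text layout, so they can be read programmatically. The following is a minimal Python sketch and not part of PredictBias itself: `parse_hsp` and `E_VALUE_CUTOFF` are hypothetical names, and the cutoff of 0.05 is assumed to correspond to the "E-value (RPSBlast)" input parameter listed in section A.

```python
import re

# Hypothetical helper for illustration only; it handles the two line shapes
# seen in this report, not every BLAST output variant.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
IDENT_RE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)")

# Assumption: mirrors the "E-value (RPSBlast)" value shown in section A.
E_VALUE_CUTOFF = 0.05


def parse_hsp(score_line, ident_line):
    """Extract bit score, raw score, E-value and percent identity."""
    s = SCORE_RE.search(score_line)
    i = IDENT_RE.search(ident_line)
    if s is None or i is None:
        raise ValueError("line does not look like a BLAST Score/Identities line")
    expect = s.group(3)
    # The report abbreviates values such as 1e-137 to "e-137".
    if expect.startswith("e"):
        expect = "1" + expect
    matches, length = int(i.group(1)), int(i.group(2))
    return {
        "bits": float(s.group(1)),
        "raw_score": int(s.group(2)),
        "expect": float(expect),
        "identity_pct": 100.0 * matches / length,
    }


hsp = parse_hsp(
    "Score = 28.8 bits (64), Expect = 0.023",
    "Identities = 32/120 (26%), Positives = 51/120 (42%), Gaps = 18/120 (15%)",
)
print(hsp["expect"] <= E_VALUE_CUTOFF)   # True: the hit passes the assumed cutoff
print(round(hsp["identity_pct"], 1))     # 26.7
```

The computed identity is 26.7%, whereas the report prints 26%; the report appears to drop the fractional part rather than round it.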


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0058    TCRTETOQM     29     0.040    Tetracycline resistance protein TetO/TetQ/TetM family ...
>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.
Length = 639

Score = 29.4 bits (66), Expect = 0.040
Identities = 19/103 (18%), Positives = 39/103 (37%), Gaps = 18/103 (17%)

Query: 300 ILIADKQSVGERAVKGICGQVDGSVV------PGFIGLEAGQS-AFGDIYAWFGRVLGWP 352
+ I++K+ + + + ++G + G I + + + G P
Sbjct: 281 VRISEKEKIK---ITEMYTSINGELCKIDKAYSGEIVILQNEFLKLNSV---LGDTKLLP 334

Query: 353 L-EQLAAQHPELKAQINASQKQ----LLPALTEAWAKNPSLDH 390
E++ P L+ + S+ Q LL AL E +P L +
Sbjct: 335 QRERIENPLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRY 377


2     SF0097  SF0112  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF0097    2       12       0.083822    hypothetical protein
SF0098    2       11       0.097954    DNA gyrase inhibitor
SF0099    1       13       0.327315    hypothetical protein
SF0100    0       14       0.673955    dephospho-CoA kinase
SF0101    -1      14       0.628366    guanosine 5'-monophosphate oxidoreductase
SF0103    0       13       0.822046    type IV pilin biogenesis protein
SF0105    -2      11       1.362778    major pilin subunit
SF0106    0       18       1.328716    quinolinate phosphoribosyltransferase
SF0107    2       26       1.595336    N-acetyl-anhydromuranmyl-L-alanine amidase
SF0108    3       32       1.439689    regulatory protein AmpE
SF0109    3       36       2.097358    aromatic amino acid transporter
SF0110    4       37       2.251598    transcriptional regulator PdhR
SF0111    3       38       1.450721    pyruvate dehydrogenase subunit E1
SF0112    2       31       1.258206    pyruvate dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0103    BCTERIALGSPF  227    5e-73    Bacterial general secretion pathway protein F signa...
>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.
Length = 408

Score = 227 bits (581), Expect = 5e-73
Identities = 94/405 (23%), Positives = 184/405 (45%), Gaps = 13/405 (3%)

Query: 6 LWRWHGITGDGNAQDGMLWAESRTLLLMALQQQMVTPLSLKRIAINSAQ----------- 54
+ + + G G A+S L+++ + PLS+ + +
Sbjct: 3 QYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLRRK 62

Query: 55 WRGDKS--AEVIHQLATLLKAGLTLSEGLALLAEQHPSKQWQALLQSLAHDLEQGIAFSN 112
R S A + QLATL+ A + L E L +A+Q L+ ++ + +G + ++
Sbjct: 63 IRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD 122

Query: 113 ALLPWSEAFPPLYQAMIRTGELTGKLDECCFELARQQKSQRQLTDKVKSALRYPIIILAM 172
A+ + +F LY AM+ GE +G LD LA + ++Q+ +++ A+ YP ++ +
Sbjct: 123 AMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLTVV 182

Query: 173 AIMVVVAMLHFVLPEFAAIYKTFNTPLPALTQGIMTLADFSGEWGWLLVLFGFLLAIANK 232
AI VV +L V+P+ + LP T+ +M ++D +G ++L +A +
Sbjct: 183 AIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMAFR 242

Query: 233 LLMRRPTWLIARQKLLLRIPIMGSLMRGQKLTQIFTILALTQSAGITFLQGVESVRETMR 292
+++R+ ++ + LL +P++G + RG + L++ ++ + LQ + + M
Sbjct: 243 VMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDVMS 302

Query: 293 CPYWVQLLTQIQHDISNGHPIWLALKNAGEFSPLCLQLVRTGEASGSLDLMLDNLAHHHR 352
Y L+ + G + AL+ F P+ ++ +GE SG LD ML+ A +
Sbjct: 303 NDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADNQD 362

Query: 353 DNTMALADNLAALLEPALLIITGGIIGTLVVAMYLPIFHLGDAMS 397
+ L EP L++ ++ +V+A+ PI L MS
Sbjct: 363 REFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLMS 407


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0105    BCTERIALGSPG  50     2e-10    Bacterial general secretion pathway protein G signa...
>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 49.5 bits (118), Expect = 2e-10
Identities = 27/79 (34%), Positives = 43/79 (54%), Gaps = 1/79 (1%)

Query: 1 MDKQRGFTLIELMVVIGIIAILSSIGIPAYQNYLRKAALTDMLQTFVPYRTAVELCALEH 60
DKQRGFTL+E+MVVI II +L+S+ +P KA + V A+++ L++
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDN 63

Query: 61 GGLDTCD-GGSNGIPSPTT 78
T + G + + +PT
Sbjct: 64 HHYPTTNQGLESLVEAPTL 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0112    RTXTOXIND     32     0.006    Gram-negative bacterial RTX secretion protein D signat...
>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 32.1 bits (73), Expect = 0.006
Identities = 15/60 (25%), Positives = 29/60 (48%), Gaps = 2/60 (3%)

Query: 119 EVTEILVKVGDKV-EAEQSLITVEGDKASMEVPAPFAGTVKEIKVN-VGDKVSTGSLIMV 176
E+ + L + D + L E + + + AP + V+++KV+ G V+T +MV
Sbjct: 299 EILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMV 358



Score = 32.1 bits (73), Expect = 0.007
Identities = 16/63 (25%), Positives = 27/63 (42%), Gaps = 2/63 (3%)

Query: 26 DKVEAEQSLITVEGDKASMEVPSPQAGIVKEIKVSVGDKTQTGALIMIFDSADGAADAAP 85
+ V +T G S E+ + IVKEI V G+ + G +++ + AD
Sbjct: 81 EIVATANGKLTHSGR--SKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLK 138

Query: 86 AQA 88
Q+
Sbjct: 139 TQS 141



Score = 31.3 bits (71), Expect = 0.013
Identities = 17/106 (16%), Positives = 38/106 (35%), Gaps = 4/106 (3%)

Query: 230 DKVAAEQSLITVEGDKASMEVPAPFAGVVKELKVNVGDKVKTGSLIMIFEVEGAAPAAAP 289
+ VA +T G S E+ +VKE+ V G+ V+ G + + ++ A
Sbjct: 81 EIVATANGKLTHSGR--SKEIKPIENSIVKEIIVKEGESVRKGDV--LLKLTALGAEADT 136

Query: 290 AKQEAAAPAPAAKAEAPAAKAEGKSEFAENDSYVHATPLIRRLARE 335
K +++ + + + + P + ++ E
Sbjct: 137 LKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEE 182


3     SF0196  SF0224  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF0196    -2      22       -3.299178   biotin synthase
SF0197    -2      17       -1.149519   membrane-bound lytic murein transglycosylase D
SF0198    -1      15       1.346865    hydroxyacylglutathione hydrolase
SF0199    -2      16       3.419988    hypothetical protein
SF0201    -1      17       3.858923    RNase HI
SF0200    0       18       4.449188    DNA polymerase III subunit epsilon
SF0202    0       18       4.740728    *hypothetical protein
SF0204    0       21       3.057024    outer membrane usher protein
SF0205    4       26       -0.063150   hypothetical protein
SF0209    4       27       -0.163538   IS2 repressor TnpA
SF0210    5       26       -0.083465   IS2 transposase TnpB
SF0211    5       28       -0.765197   hypothetical protein
SF0215    4       29       -0.734343   terminase large subunit
SF0216    5       28       -0.819694   packaging glycoprotein
SF0217    5       28       -0.893762   scaffolding protein
SF0218    5       28       -1.523426   coat protein
SF0219    4       29       -1.754360   bacteriophage protein
SF0220    3       29       -1.359814   DNA stabilization protein
SF0221    4       30       -1.158668   packaged DNA stabilization protein
SF0222    4       30       -0.893847   packaged DNA stabilization protein
SF0223    4       30       -1.043923   head assembly protein
SF0224    3       25       -0.593856   DNA transfer protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0197    INTIMIN       29     0.050    Intimin signature.
>INTIMIN#Intimin signature.

Length = 939

Score = 28.9 bits (64), Expect = 0.050
Identities = 20/84 (23%), Positives = 41/84 (48%), Gaps = 9/84 (10%)

Query: 316 RESLASGE---IAAVQSTLVANNTPLNSRVYTVRSGDTLSSIASRLGVSTKDLQQWNKLR 372
+ S A+GE S L+ +N+ N YT+++G+T++ ++ ++ + NK
Sbjct: 35 QNSFANGENYFKLGSDSKLLTHNSYQNRLFYTLKTGETVADLSKSQDINLSTIWSLNKHL 94

Query: 373 GS------KLKPGQSLTIGAQRLA 390
S K +PGQ + + ++L
Sbjct: 95 YSSESEMMKAEPGQQIILPLKKLP 118


4     SF0238  SF0277  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF0238    3       25       -3.143797   hypothetical protein
SF0240    5       26       -1.641133   hypothetical protein
SF0241    4       27       -1.444267   insertion sequence element IS600 protein
SF0242    5       23       -1.343562   insertion sequence element IS600 transposase
SF0243    6       21       -1.922138   hypothetical protein
SF0244    3       18       0.292953    IS2 transposase TnpB
SF0245    0       15       0.119264    IS2 repressor TnpA
SF0246    0       16       1.636726    insertion element IS1 protein InsA
SF0247    0       17       2.111885    insertion element IS1 protein InsB
SF0248    -1      17       2.722726    hypothetical protein
SF0249    -1      19       3.520046    delta-aminolevulinic acid dehydratase
SF0250    -1      20       3.780468    taurine dioxygenase
SF0251    0       17       0.809468    taurine ABC transporter permease
SF0252    1       16       -0.617462   taurine ABC transporter ATP-binding subunit
SF0253    -1      14       -1.708032   taurine ABC transporter substrate-binding
SF0254    1       17       -3.023130   insertion element IS1 protein InsB
SF0255    0       19       -3.170776   insertion element IS1 protein InsA
SF0256    1       17       -3.115466   hypothetical protein
SF0258    0       19       -2.742550   transporter
SF0259    1       23       -4.864189   hypothetical protein
SF0260    2       16       -2.465124   dehydrogenase subunit
SF0261    3       18       -2.115424   insertion element IS1 protein InsA
SF0262    3       21       1.342082    insertion element IS1 protein InsB
SF0263    4       23       1.845534    hypothetical protein
SF0264    3       23       3.212853    hypothetical protein
SF0265    3       23       4.023913    Rhs-family protein
SF0266    0       23       4.555362    Rhs-family protein
SF0267    -1      23       4.242567    Rhs-family protein
SF0268    -1      18       2.668470    Rhs-family protein
SF0269    -1      16       1.690537    C-N hydrolase amidase
SF0270    0       15       0.726731    C-lysozyme inhibitor
SF0271    0       20       2.348377    acyl-CoA dehydrogenase
SF0272    2       21       1.278952    phosphoheptose isomerase
SF0273    1       21       1.651540    amidotransferase
SF0274    1       21       0.934893    hypothetical protein
SF0276    3       19       0.705745    hypothetical protein
SF0278    2       19       0.551789    flagellar biosynthetic protein FlhA
SF0277    2       19       -1.908227   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0249    BINARYTOXINB  30     0.019    Binary toxin B family signature.
>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 29.7 bits (66), Expect = 0.019
Identities = 19/69 (27%), Positives = 30/69 (43%)

Query: 265 DIVRELRERTELPIGAYQVSGEYAMIKFAALAGAIDEEKVVLESLGSIKRAGADLIFSYF 324
+ EL + +L + QV G A F +D E L I+ A +IF+
Sbjct: 466 NQFLELEKTKQLRLDTDQVYGNIATYNFENGRVRVDTGSNWSEVLPQIQETTARIIFNGK 525

Query: 325 ALDLAEKKI 333
L+L E++I
Sbjct: 526 DLNLVERRI 534


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0265    CHANLCOLICIN  30     0.027    Channel forming colicin signature.
>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 30.4 bits (68), Expect = 0.027
Identities = 31/101 (30%), Positives = 42/101 (41%), Gaps = 9/101 (8%)

Query: 10 PVGNGGPVITT-----PPIAGESGGMSTGSAVTDVSGAAEEMAEQAAADLFGALPEPSGL 64
P + G VI T P +G GG G + ++ S A A+ + A L E +
Sbjct: 13 PYDDKGQVIITLLNGTPDGSGSGGGGGKGGSKSESSAAIHATAKWSTAQLKKTQAEQAAR 72

Query: 65 VKAAVAAAQAAAAA---AGISDMAGAVQDAAASLAAGAPGA 102
KAA A AQA A A A + V +A A+ P A
Sbjct: 73 AKAA-AEAQAKAKANRDALTQRLKDIVNEALRHNASRTPSA 112


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0277    OMPADOMAIN    38     2e-05    OMPA domain signature.
>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 38.4 bits (89), Expect = 2e-05
Identities = 29/118 (24%), Positives = 45/118 (38%), Gaps = 22/118 (18%)

Query: 121 FERGSAQIMPFFKTLLVELAPVFDSLY---NKIIITGHTDAM---AYKNNIYNNWNLSGD 174
F A + P + L +L +L +++ G+TD + AY N LS
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAY------NQGLSER 276

Query: 175 RALSARRVLEEAGMPEDKVMQVS-----AMADQMLLDAKNPQS-----AGNRRIEIMV 222
RA S L G+P DK+ + + K + A +RR+EI V
Sbjct: 277 RAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


5     SF0294  SF0331  Y     N          Y                   Genomic Island
LocusTag  DNBias  CDNBias  %GCBias     Product
SF0294    3       31       -4.593352   *phage integrase
SF0295    4       29       -3.773785   insertion sequence element IS600 protein
SF0296    7       33       -5.059076   insertion sequence element IS600 transposase
SF0297    4       28       -3.773785   phage integrase
SF0298    3       27       -3.730189   hypothetical protein
SF0299    4       28       -5.084356   bacteriophage protein
SF0301    4       37       -10.322922  insertion sequence element IS629 protein
SF0302    5       38       -10.345700  insertion sequence element IS600 transposase
SF0303    5       41       -12.396150  insertion sequence element IS600 protein
SF0305    5       46       -14.162129  flippase
SF0306    4       44       -13.007783  bactoprenol glucosyl transferase
SF0307    4       40       -11.772013  glucosyl transferase II
SF0308    3       34       -8.626898   hypothetical protein
SF0309    1       35       -8.203724   hypothetical protein
SF0310    1       36       -7.937026   phage tail fiber protein
SF0311    1       34       -6.338940   bacteriophage protein
SF0312    2       35       -6.887847   insertion sequence element IS600 protein
SF0313    2       35       -7.442478   insertion sequence element IS600 transposase
SF0314    0       29       -5.473777   phage integrase
SF0315    0       22       -4.090766   hypothetical protein
SF0317    -1      14       1.099912    insertion sequence element IS600 protein
SF0318    1       17       1.353678    insertion sequence element IS600 transposase
SF0320    1       16       1.681689    insertion sequence element IS629 protein
SF0321    0       17       0.373519    diguanylate cyclase AdrA
SF0322    0       17       -0.227169   pyrroline-5-carboxylate reductase
SF0323    0       24       -1.573728   hypothetical protein
SF0324    -1      28       -3.951188   shikimate kinase II
SF0325    -1      22       -3.558235   hypothetical protein
SF0326    -1      21       -2.007433   hypothetical protein
SF0327    1       21       -1.262957   hypothetical protein
SF0328    3       18       0.192482    hypothetical protein
SF0329    2       15       0.459838    hypothetical protein
SF0331    2       16       1.074005    recombination associated protein
6     SF0358  SF0381  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF0358    -1      19       -4.992048   geranyltranstransferase
SF0359    1       29       -9.896159   exodeoxyribonuclease VII small subunit
SF0361    1       29       -9.534820   oxidative-stress-resistance chaperone
SF0362    1       32       -10.533232  2-dehydropantoate 2-reductase
SF0363    2       29       -10.270877  nucleotide-binding protein
SF0364    3       30       -10.446358  hypothetical protein
SF0365    1       20       -6.868978   hypothetical protein
SF0366    2       25       0.978887    insertion element IS1 protein InsB
SF0368    2       25       0.391557    hypothetical protein
SF0369    1       20       1.247415    protoheme IX farnesyltransferase
SF0370    0       18       0.768325    cytochrome o ubiquinol oxidase subunit IV
SF0371    -1      18       0.621607    cytochrome o ubiquinol oxidase subunit III
SF0375    -1      18       0.062459    insertion sequence element ISSfl4 ORF2
SF0376    1       18       -0.540938   insertion sequence element ISSfl4 transposase
SF0377    0       18       -0.427346   muropeptide transporter
SF0378    2       27       -1.271972   polymerase/proteinase
SF0379    4       25       -0.620596   murein genes regulator
SF0380    3       27       -0.536111   hypothetical protein
SF0381    3       26       -0.201769   trigger factor
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0377    TCRTETA       39     3e-05    Tetracycline resistance protein signature.
>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.0 bits (91), Expect = 3e-05
Identities = 71/347 (20%), Positives = 135/347 (38%), Gaps = 20/347 (5%)

Query: 62 KFLWSPLMDRYTPPFFGRRRGWLLATQILLLVAIAAMGFLEPGTQLRWMAALAVVIAFCS 121
+F +P++ + F RR LL + V A M W+ + ++A +
Sbjct: 56 QFACAPVLGALSDRF--GRRPVLLVSLAGAAVDYAIMAT----APFLWVLYIGRIVAGIT 109

Query: 122 ASQDIVFDAWKTDVLPAEERGAGAAISVLGYRLGMLVSGGLALWLADKWLGWQGMYWLMA 181
+ V A+ D+ +ER + GM+ L + ++ A
Sbjct: 110 GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG--FSPHAPFFAAA 167

Query: 182 AL-LIPCIIATLLAPEP--TDTIPVPKTLEQAVVAPLRDFFGRNNAWLILLLIVLYKLGD 238
AL + + L PE + P+ + + + A L+ + ++ +G
Sbjct: 168 ALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQ 227

Query: 239 AFAMSLTTTFLIRGVGFDAGEVGVVNKTLGLLATIVGALYGGILMQRLSLFRALLIFGIL 298
A +L F +DA +G+ G+L ++ A+ G + RL RAL+ G++
Sbjct: 228 VPA-ALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALM-LGMI 285

Query: 299 QGASNAGYWLLSITDKHLYSMGAAVFFENLCGGMGTSAFVALLMTLCNKSFSATQFALLS 358
A GY LL+ + + V GG+G A A+L ++ L+
Sbjct: 286 --ADGTGYILLAFATRGWMAFPIMVLL--ASGGIGMPALQAMLSRQVDEERQGQLQGSLA 341

Query: 359 ALSAVGRVYVGPVAGWFVEAHGWSTF--YLFSVAAAVPGLILLLVCR 403
AL+++ + VGP+ + A +T+ + + AA+ L L + R
Sbjct: 342 ALTSLTSI-VGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0378    PF06291       29     0.006    Lambda prophage Bor protein
>PF06291#Lambda prophage Bor protein

Length = 102

Score = 28.9 bits (64), Expect = 0.006
Identities = 12/37 (32%), Positives = 19/37 (51%)

Query: 34 NMFKKILFPLVALFMLAGCAKPPTTIEVSPTITLPQQ 70
N KK+LF ++ GCA+ T+ PT P++
Sbjct: 4 NKMKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKE 40


7     SF0399  SF0415  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF0399    2       19       -1.913404   insertion sequence IS4 transposase InsG
SF0400    2       18       -4.566848   hypothetical protein
SF0401    0       13       -1.778675   hypothetical protein
SF0402    0       13       -1.174100   hypothetical protein
SF0403    2       16       -0.959059   hypothetical protein
SF0405    0       15       -0.426251   gene expression modulator
SF0406    0       14       -0.230555   Hha toxicity attenuator
SF0407    0       15       0.662550    multidrug efflux system protein AcrB
SF0408    1       12       0.072911    multidrug efflux system transporter AcrA
SF0409    1       14       -0.116117   DNA-binding transcriptional repressor AcrR
SF0410    2       14       2.052489    hypothetical protein
SF0411    4       16       3.771721    hypothetical protein
SF0412    3       16       4.335160    primosomal replication protein N''
SF0413    3       22       2.836400    hypothetical protein
SF0414    3       26       2.660582    adenine phosphoribosyltransferase
SF0415    2       22       2.697059    DNA polymerase III subunits gamma and tau
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0402    BCTERIALGSPF  30     0.026    Bacterial general secretion pathway protein F signa...
>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.
Length = 408

Score = 29.8 bits (67), Expect = 0.026
Identities = 31/137 (22%), Positives = 54/137 (39%), Gaps = 24/137 (17%)

Query: 245 IWLPLGLVIGLLAAMFVLRILRRIQSPHHRLQDAIENRDICVHYQPIVSLANGKIVGAEA 304
W+ L L+ G +A +LR R+ + + P++ G+I
Sbjct: 228 PWMLLALLAGFMAFRVMLR------QEKRRVS-----FHRRLLHLPLI----GRIARGLN 272

Query: 305 LARWPQTDGSWLSPDSFIPLAQQTGLS-EPLTLLIIRSVFEDMGDCLRQHPQQHISINLE 363
AR+ +T + S +PL Q +S + ++ R D +R+ H + LE
Sbjct: 273 TARYARTLSILNA--SAVPLLQAMRISGDVMSNDYARHRLSLATDAVREGVSLHKA--LE 328

Query: 364 STVLTSEKIPQLLREMI 380
T L P ++R MI
Sbjct: 329 QTAL----FPPMMRHMI 341


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0407    ACRIFLAVINRP  1361   0.0      Acriflavin resistance protein family signature.
>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1361 bits (3523), Expect = 0.0
Identities = 798/1033 (77%), Positives = 910/1033 (88%), Gaps = 1/1033 (0%)

Query: 1 MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISRFYPGADAKTVQDT 60
M NFFI RPIFAWV+AII+M+AG LAIL+LPVAQYPTIAPPAV++S YPGADA+TVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMTGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQ 120
VTQVIEQNM GIDNLMYMSS SDS G+V ITLTF+SGTD DIAQVQVQNKLQLA PLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVSVEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDAISRTSGVGDVQLFGS 180
EVQQQG+SVEKSSSS+LMV G ++ + TQ+DISDYVA+N+KD +SR +GVGDVQLFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWMNPNELNKFQLTPVDVITAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQTRL 240
QYAMRIW++ + LNK++LTPVDVI +K QN Q+AAGQLGGTP + GQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 TSTEEFGKILLKVNQDGSRVLLRDVAKIELGGENYDIIAEFNGQPASGLGIKLATGANAL 300
+ EEFGK+ L+VN DGS V L+DVA++ELGGENY++IA NG+PA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTAAAIRAELAKMEPFFPSGLKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFLQ 360
DTA AI+A+LA+++PFFP G+K++YPYDTTPFV++SIHEVVKTL EAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTFA+LAAFG+SINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 AEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGSTGAIYRQFSITIVSAMAL 480
E+ LPPKEAT KSM QIQGALVGIAMVLSAVF+PMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATMLKPIAKGDHGEGKKGFFGWLNRMFEKSTHHYTDSVGGILRSTGR 540
SVLVALILTPALCAT+LKP++ H E K GFFGW N F+ S +HYT+SVG IL STGR
Sbjct: 481 SVLVALILTPALCATLLKPVSAE-HHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 541 YLVLYLIIVVGMAYLFVRLPSSFLPDEDQGVFMTMVQLPAGATQERTQKVLNEVTHYYLT 600
YL++Y +IV GM LF+RLPSSFLP+EDQGVF+TM+QLPAGATQERTQKVL++VT YYL
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 601 KEKNNVESVFAVNGFGFAGRGQNTGIAFVSLKDWADRPGEENKVEAITMRATRAFSQIKD 660
EK NVESVF VNGF F+G+ QN G+AFVSLK W +R G+EN EA+ RA +I+D
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 661 AMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLLAEAAKHPDMLTSVRPNG 720
V FN+PAIVELGTATGFDFELIDQAGLGH+ LTQARNQLL AA+HP L SVRPNG
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 721 LEDTPQFKIDIDQEKAQALGVSINDINTTLGAAWGGSYVNDFIDRGRVKKVYVMSEAKYR 780
LEDT QFK+++DQEKAQALGVS++DIN T+ A GG+YVNDFIDRGRVKK+YV ++AK+R
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFR 779

Query: 781 MLPDDIGDWYVRAADGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILGQAAPGKSTGEA 840
MLP+D+ YVR+A+G+MVPFSAF++S W YGSPRLERYNGLPSMEI G+AAPG S+G+A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 841 MELMEQLASKLPTGVGYDWTGMSYQERLSGNQAPSLYAISLIVVFLCLAALYESWSTPFS 900
M LME LASKLP G+GYDWTGMSYQERLSGNQAP+L AIS +VVFLCLAALYESWS P S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 901 VMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGL 960
VMLVVPLG++G LLAAT NDVYF VGLLTTIGLSAKNAILIVEFAKDLM+KEGKG+
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 961 IEATLDAVRMRLRPILMTSLAFILGVMPLVISTGAGSGAQNAVGTGVMGGMVTATVLAIF 1020
+EATL AVRMRLRPILMTSLAFILGV+PL IS GAGSGAQNAVG GVMGGMV+AT+LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1021 FVPVFFVVVRRRF 1033
FVPVFFVV+RR F
Sbjct: 1020 FVPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0408    RTXTOXIND     45     3e-07    Gram-negative bacterial RTX secretion protein D signat...
>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 45.2 bits (107), Expect = 3e-07
Identities = 33/212 (15%), Positives = 72/212 (33%), Gaps = 23/212 (10%)

Query: 100 TYQATYDSAKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQGYDQALADAQQANAAVTA 159
+ Y A +L + + Q+ Q +++ ++ L +Q +
Sbjct: 256 EQENKYVEAVNELR--VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 160 AKAAVETARINLAYTKVTSPISGRIGKSNV-TEGALVQNGQATVLATVQQLDPIYVDVTQ 218
+ + + +P+S ++ + V TEG +V + T++ V + D + V
Sbjct: 314 LTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTALV 372

Query: 219 SSNDFLRLKQELA----------NGTLKQENGKAKVSLITSDGIKFPQDGTLEFSDVTVD 268
+ D + KV I D I+ + G + ++++
Sbjct: 373 QNKDIGFINVGQNAIIKVEAFPYTRYGYLV---GKVKNINLDAIEDQRLGLVFNVIISIE 429

Query: 269 QTTGSITLRAIFPNPDHTLLPGMFVRARLEEG 300
+ S + I L GM V A ++ G
Sbjct: 430 ENCLSTGNKNIP------LSSGMAVTAEIKTG 455



Score = 32.9 bits (75), Expect = 0.002
Identities = 26/127 (20%), Positives = 50/127 (39%), Gaps = 10/127 (7%)

Query: 49 PLQITTELPGR-TSAYRIAEVRPQVSGIILKRNFKEGSDIEAGVSLYQIDPATYQATYDS 107
++I G+ T + R E++P + I+ + KEG + G L ++ +A
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA---- 134

Query: 108 AKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQGYDQALADAQQANAAVTAAKAAVETA 167
D K Q++ A+L RYQ L + I + + V+ + T+
Sbjct: 135 ---DTLKTQSSLLQARLEQTRYQILSRS--IELNKLPELKLPDEPYFQNVSEEEVLRLTS 189

Query: 168 RINLAYT 174
I ++
Sbjct: 190 LIKEQFS 196


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0409    HTHTETR       222    5e-76    TetR bacterial regulatory protein HTH signature.
>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 222 bits (567), Expect = 5e-76
Identities = 215/215 (100%), Positives = 215/215 (100%)

Query: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60
MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120
EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180
GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215
APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0410    RTXTOXIND     32     0.017    Gram-negative bacterial RTX secretion protein D signat...
>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 31.7 bits (72), Expect = 0.017
Identities = 19/125 (15%), Positives = 40/125 (32%), Gaps = 6/125 (4%)

Query: 28 QNTAFARASSNGDLPTKADLQAQLDSLNKQKDLSAQDKLVQQDLTDTLATLDKIDRIKEE 87
N RA L + + L L+ + A L++ ++ E
Sbjct: 207 LNLDKKRAERLTVLARINRYENLSRVEKSR--LDDFSSLLHKQAIAKHAVLEQENKYVEA 264

Query: 88 TVQLRQKVAEAPEKMRQATAALTALSDVDND--EETRKIL--STLSLRQLETRVAQALDD 143
+LR ++ + + +A V E L +T ++ L +A+ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEER 324

Query: 144 LQNAQ 148
Q +
Sbjct: 325 QQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0415    IGASERPTASE   39     7e-05    IgA-specific serine endopeptidase (S6) signature.
>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.9 bits (90), Expect = 7e-05
Identities = 40/251 (15%), Positives = 77/251 (30%), Gaps = 31/251 (12%)

Query: 402 PLPETTSQVLAARQQLQRVQGATKAKKSEPAA----ATRARPVNNAALERLASVTDRVQA 457
P E +Q + + + P+ AR + A + A T
Sbjct: 983 PEVEKRNQTVDTTN----ITTPNNIQADVPSVPSNNEEIARV-DEAPVPPPAPATPSETT 1037

Query: 458 RPVPSALEKAPAKKEAYRWKATTPVMQQKE--------VVATPKALKKA---LEHEKTPE 506
V ++ E AT Q +E V A + + A E ++T
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 507 LAVKLAA---------EAIERDPWAAQVSQLSLPKLVEQVALNAWKE-ESDNAVCLHLRS 556
K A E+ +V+ PK + + E +N ++++
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 557 SQRHLNNRGAQQKLAEALS-MLKGSTVELTIVEDDNPAVRTPLEWRQAIYEEKLAQARES 615
Q N ++ A+ S ++ E T V N V P A + + +
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 616 IIADNNIQTLR 626
+ + +++R
Sbjct: 1218 KPKNRHRRSVR 1228


8     SF0450  SF0463  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF0450    1       18       3.345734    hydantoin utilization protein
SF0451    1       17       3.485788    carboxylase
SF0452    0       19       3.154112    carbamate kinase
SF0453    2       19       2.489583    phosphoribosylaminoimidazole carboxylase ATPase
SF0454    3       20       1.984273    phosphoribosylaminoimidazole carboxylase
SF0455    3       19       1.444524    UDP-2,3-diacylglucosamine hydrolase
SF0456    2       17       -0.016283   peptidyl-prolyl cis-trans isomerase B
SF0457    0       14       -0.952581   cysteinyl-tRNA synthetase
SF0458    -1      22       -3.648959   hypothetical protein
SF0459    -2      21       -3.828676   hypothetical protein
SF0460    -1      22       -3.866001   bifunctional 5,10-methylene-tetrahydrofolate
SF0461    1       30       -6.722345   type-1 fimbrial protein subunit A
SF0462    -1      26       -6.387994   fimbrial chaperone protein FimC
SF0463    0       21       -4.058715   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0452    CARBMTKINASE  385    e-137    Bacterial carbamate kinase signature.
>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 385 bits (990), Expect = e-137
Identities = 125/310 (40%), Positives = 175/310 (56%), Gaps = 16/310 (5%)

Query: 2 KTLVVALGGNALLQRGEALTAENQYRNIASAVPALARL-ARSYRLAIVHGNGPQVGLLAL 60
K +V+ALGGNAL QRG+ + E N+ +A + AR Y + I HGNGPQVG L L
Sbjct: 3 KRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLLL 62

Query: 61 QNLAWKE---VEPYPLDVLVAESQGMIGYMLAQSLSAQPQM----PPVTTVLTRIEVSPD 113
A + + P+DV A SQG IGYM+ Q+L + + V T++T+ V +
Sbjct: 63 HMDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVDKN 122

Query: 114 DPAFLQPEKFIGPVYQPEEQEALEAAYGWQMKRD-GKYLRRVVASPQPRKILDSEAIELL 172
DPAF P K +GP Y E + L GW +K D G+ RRVV SP P+ +++E I+ L
Sbjct: 123 DPAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKL 182

Query: 173 LKEGHVVICSGGGGVPVTEDG---AGSEAVIDKDLAAALLAEQINADGLVILTDADAVYE 229
++ G +VI SGGGGVPV + G EAVIDKDLA LAE++NAD +ILTD +
Sbjct: 183 VERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGAAL 242

Query: 230 NWGTPQQRAIRHATPDELAPFAKAD----GSMGPKVTAVSGYVRSRSKPAWIGALSRIEE 285
+GT +++ +R +EL + + GSMGPKV A ++ + A I L + E
Sbjct: 243 YYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAIIAHLEKAVE 302

Query: 286 TLAGEAGTCI 295
L G+ GT +
Sbjct: 303 ALEGKTGTQV 312


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0457    RTXTOXIND     29     0.031    Gram-negative bacterial RTX secretion protein D signat...
>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 29.4 bits (66), Expect = 0.031
Identities = 16/150 (10%), Positives = 44/150 (29%), Gaps = 8/150 (5%)

Query: 299 RSQLNYSEENLKQARAALERLYTALRGTDKTVAPAGGEAFEARFIEAMDDDFNTP----- 353
+ ++ +L QAR R R + P E F +++
Sbjct: 133 EADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIK 192

Query: 354 EAYSVLFDMAREVNRLKAEDMAAANAMASHLRKLSAVLGLLEQEPEAFLQSGAQADDSEV 413
E +S + + + A + + + + + + + + F +
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSS---LLHKQAI 249

Query: 414 AEIEALIQQRLDARKAKDWAAADAARDRLN 443
A+ L Q+ + + +++
Sbjct: 250 AKHAVLEQENKYVEAVNELRVYKSQLEQIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0463    PF00577       133    9e-39    Outer membrane usher protein FimD
>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 133 bits (336), Expect = 9e-39
Identities = 59/167 (35%), Positives = 89/167 (53%), Gaps = 16/167 (9%)

Query: 11 QRYTWCL------AGICYSSLAILPSFLSY-----AESYFNPAFLLENGTSVADLSRFER 59
QR T CL + L + +F + AE YFNP FL ++ +VADLSRFE
Sbjct: 10 QRNTQCLHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFEN 69

Query: 60 GNHQPAGVYRVDLWRNDEFIGSQDIVFESTTENTGDKSGGLMPCFNQVLLERIGLNSSAF 119
G P G YRVD++ N+ ++ ++D+ F NTGD G++PC + L +GLN+++
Sbjct: 70 GQELPPGTYRVDIYLNNGYMATRDVTF-----NTGDSEQGIVPCLTRAQLASMGLNTASV 124

Query: 120 PELAQQQNNKCINLLKAVPDATINFDFAAMRLNITIPQIALLSSAHG 166
+ ++ C+ L + DAT D RLN+TIPQ + + A G
Sbjct: 125 SGMNLLADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARG 171


9     SF0483  SF0528  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF0483    0       19       -3.921053   transporter
SF0484    1       22       -3.137377   insertion element iso-IS10R transposase
SF0486    1       20       -1.264712   hypothetical protein
SF0487    2       19       -0.904606   hypothetical protein
SF0488    4       24       -2.687705   gamma-glutamyl:cysteine ligase
SF0489    7       31       -3.933203   insertion element IS150 protein
SF0490    5       29       -3.589185   insertion sequence element IS600 protein
SF0492    2       16       -1.205173   insertion sequence element IS629 protein
SF0493    0       14       -0.639767   insertion sequence element IS600 transposase
SF0494    -2      10       1.617217    insertion element IS150 protein InsB
SF0494a   -2      11       2.681751    hypothetical protein
SF0495    -2      10       2.682977    enterobactin synthase multienzyme complex
SF0496    -2      11       3.039275    outer membrane receptor FepA
SF0497    0       13       3.742899    enterobactin/ferric enterobactin esterase
SF0498    0       14       4.651064    enterobactin synthase subunit F
SF0499    1       14       4.893856    insertion element IS1 protein InsA
SF0500    1       15       4.958568    insertion element IS1 protein InsB
SF0502    1       17       5.062131    iron-enterobactin ABC transporter ATP-binding
SF0503    1       17       5.212540    iron-enterobactin ABC transporter permease
SF0504    0       17       5.068700    iron-enterobactin ABC transporter permease
SF0505    0       18       4.504594    enterobactin exporter EntS
SF0506    -1      20       4.574070    iron-enterobactin ABC transporter
SF0508    -1      21       4.514306    enterobactin synthase subunit E
SF0509    0       19       4.404245    2,3-dihydro-2,3-dihydroxybenzoate synthetase
SF0510    -1      17       4.055559    2,3-dihydroxybenzoate-2,3-dehydrogenase
SF0511    0       16       3.639950    thioesterase
SF0512    -1      14       2.885293    carbon starvation protein A
SF0513    -2      15       1.341238    hypothetical protein
SF0514    -1      15       0.875271    oxidoreductase
SF0515    0       20       -2.779145   methionine aminotransferase
SF0516    2       24       -4.753859   insertion sequence element IS911 integrase core
SF0517    2       22       -6.009050   insertion sequence element IS911 transposase
SF0519    0       16       -3.899543   hypothetical protein
SF0520    -1      15       -3.826682   hypothetical protein
SF0522    -1      15       -3.650625   LysR family transcriptional regulator
SF0523    -1      17       -1.393085   thiol:disulfide interchange protein
SF0524    -1      19       -0.605379   alkyl hydroperoxide reductase
SF0525    -1      15       0.158397    alkyl hydroperoxide reductase
SF0526    -2      15       1.190460    universal stress protein UspG
SF0527    -1      15       2.064341    oxidoreductase
SF0528    -2      20       3.081154    nucleoside diphosphate kinase regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SF0494a   HOKGEFTOXIC  56     2e-15    Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 56.4 bits (136), Expect = 2e-15
Identities = 18/50 (36%), Positives = 27/50 (54%)

Query: 1 MLTKYALVAVIVLCLTVLGFTLLAGDSLCEFTVKERNIEFRAVLAYEPKK 50
+ + V+++CLT+L FT L SLCE ++ E A +AYE K
Sbjct: 3 LPRSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYESGK 52


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0495    ENTSNTHTASED  275    7e-97    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 275 bits (705), Expect = 7e-97
Identities = 105/183 (57%), Positives = 130/183 (71%), Gaps = 1/183 (0%)

Query: 4 MKTTHTSLPFAGHTLHFVEFDPANFCEQDLLWLPHYAQLQHAGRKRKTEHLAGRIAAVYA 63
M T+H LPFAGH LH V+FD ++F E DLLWLPH+ +L+ AGRKRK EHLAGRIAAV+A
Sbjct: 1 MLTSHFPLPFAGHRLHIVDFDASSFREHDLLWLPHHDRLRSAGRKRKAEHLAGRIAAVHA 60

Query: 64 LREYGYKCVPAIGELRQPVWPAEVYGSISHCGATALAVVSRQPIGVDIEEIFSAQTATEL 123
LRE G + VP +G+ RQP+WP ++GSISHC TALAV+SRQ IG+DIE+I S TATEL
Sbjct: 61 LREVGVRTVPGMGDKRQPLWPDGLFGSISHCATTALAVISRQRIGIDIEKIMSQHTATEL 120

Query: 124 TDNIITPAEHERLADCGLAFSLALTLAFSAKESAFKA-SEIQTDAGFLDYQIISWNKQQV 182
+II E + L L F LALTLAFSAKES +KA S+ T GF ++ S +
Sbjct: 121 APSIIDSDERQILQASLLPFPLALTLAFSAKESVYKAFSDRVTLPGFNSAKVTSLTATHI 180

Query: 183 IIH 185
+H
Sbjct: 181 SLH 183


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF0505    TCRTETA  37     1e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.7 bits (85), Expect = 1e-04
Identities = 81/391 (20%), Positives = 142/391 (36%), Gaps = 38/391 (9%)

Query: 27 FISIVSLGLLGVAVPVQIQMMTHSTWQV---GLSVTLTGGAMFVGLMVGGVLADRYERKK 83
+ V +GL+ +P ++ + HS G+ + L F V G L+DR+ R+
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 84 VILLARGTCGIGFIGLCLNALL--PEPSLLAIYLLGLWDGFFASLGVTALLAATSALVGR 141
V+L + G ++ + P L +Y+ + G + G A A + +
Sbjct: 75 VLL-------VSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVA-GAYIADITDG 126

Query: 142 ENLMQAGAITMLTVRLGSVISPMIGGLLLATGGVAWNYGLAAAGTFITLLPLLSLPE-LP 200
+ + G V P++GGL+ A + AA L LPE
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHK 186

Query: 201 PPPQPLEHPLKSLLAGFRFLLASPLLGGLLTMA----------SAVLVLYPALADNWQMS 250
+PL + LA FR+ ++ L+ + +A+ V++ D +
Sbjct: 187 GERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG--EDRFHWD 244

Query: 251 AAQIGFLYAAIP-LGAAIGALTSGKLAHSARPGLLMLLSTLGS---FLAIGLFGLMPMWI 306
A IG AA L + A+ +G +A ++L + ++ + M
Sbjct: 245 ATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAF 304

Query: 307 LGVVCLALFGWLSAVSSLLQYTMLQTQTPEAMLGRINGLWTAQNVTGDAIGAALLGGLGA 366
+V LA G ML Q E G++ G A +G L + A
Sbjct: 305 PIMVLLASGGIGMPALQ----AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYA 360

Query: 367 MMTPVASASASGFGLLIIGVLLLLVLVELRR 397
+ + +G+ + L LL L LRR
Sbjct: 361 ----ASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0506    FERRIBNDNGPP  63     2e-13    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 63.0 bits (153), Expect = 2e-13
Identities = 61/285 (21%), Positives = 102/285 (35%), Gaps = 35/285 (12%)

Query: 40 HTLESQPQRIVSTSVTLTGSLLAIDAPVIASGATTPNNRVADDQGFLRQWSKVAKERKLQ 99
H P RIV+ LLA+ VAD + R W E L
Sbjct: 29 HAAAIDPNRIVALEWLPVELLLALGIVPYG---------VADTINY-RLW---VSEPPLP 75

Query: 100 RLYIG-----EPSAEAVAAQMPDLILISATGGDSALALYDQLSTIAPTLIINYDDKS--- 151
I EP+ E + P ++ SA G S + L+ IAP N+ D
Sbjct: 76 DSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPL 131

Query: 152 --WQSLLTQLGEITGHEKQAAERIAQFDKQLAAAKEQIKLPPQPVTAIVYTAAAHSANLW 209
+ LT++ ++ + A +AQ++ + + K + + ++
Sbjct: 132 AMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVF 191

Query: 210 TPESAQGQMLEQLGFTLAKLPAGLNASQSQGKRHDIIQLGGENLAAGLNGESLFLFAGDQ 269
P S ++L++ G NA Q + + + LAA + + L +
Sbjct: 192 GPNSLFQEILDEYGIP--------NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNS 243

Query: 270 KDADAIYANPLLAHLPAVQNKQVYALGTETFRLDYYSAMQVLERL 314
KD DA+ A PL +P V+ + + F SAM + L
Sbjct: 244 KDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVL 288


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0509    ISCHRISMTASE  444    e-161    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 444 bits (1142), Expect = e-161
Identities = 146/299 (48%), Positives = 195/299 (65%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQAYALPESHDIPQNKVDWAFEPQRAALLIHDMQDYFVSFWGENCPMMEQVIANI 60
MAIP +Q Y +P + D+PQNKV W +P RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRDYCKQHNIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVDRLTPDADDTV 120
L++ C Q IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P+ DD V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKESGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M+++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSRDEHLMSLKYVAGRSGRVVMTEELL------PAPIPASKA-----------ALREVIL 223
FS ++H M+L+Y AGR VMT+ LL PA + + A +R+ I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDESDEPFDDD-NLIDYGLDSVRMMALAARWRKVHGDIDFVMLAKNPTIDAWWKLLS 281
LL E+ E D +L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W KLL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0510    DHBDHDRGNASE  359    e-129    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 359 bits (922), Expect = e-129
Identities = 108/258 (41%), Positives = 149/258 (57%), Gaps = 20/258 (7%)

Query: 5 GKNVWVTGAGKGIGYATALTFVEAGAKVTGFD---------------QAFTQEQYPFATE 49
GK ++TGA +GIG A A T GA + D +A E +P
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP---- 63

Query: 50 VMDVADAGQVAQVCQRLLAETERLDVLINAAGILRMGATDQLSKEDWQQTFAVNVGGAFN 109
DV D+ + ++ R+ E +D+L+N AG+LR G LS E+W+ TF+VN G FN
Sbjct: 64 -ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 110 LFQQTMNQFRRQRGGAIVTVASDAAHTARIGMSAYGASKAALKSLALSVGLELAGSGVRC 169
+ +R G+IVTV S+ A R M+AY +SKAA +GLELA +RC
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 170 NVVSPGSTDTDMQRTLWVSDDAEEQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDL 229
N+VSPGST+TDMQ +LW ++ EQ I+G E FK GIPL K+A+P +IA+ +LFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 230 ASHITLQDIVVDGGSTLG 247
A HIT+ ++ VDGG+TLG
Sbjct: 243 AGHITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SF0519    HTHFIS  27     0.013    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 26.7 bits (59), Expect = 0.013
Identities = 7/45 (15%), Positives = 16/45 (35%), Gaps = 1/45 (2%)

Query: 6 KRYPEEFKTEAVKQVVDR-GYSVASVATRLDITTHSLYAWIKKYG 49
R E + + + + A L + ++L I++ G
Sbjct: 430 DRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELG 474


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0523    BCTLIPOCALIN  28     0.019    Bacterial lipocalin signature.
		>BCTLIPOCALIN#Bacterial lipocalin signature.

Length = 171

Score = 28.4 bits (63), Expect = 0.019
Identities = 18/98 (18%), Positives = 39/98 (39%), Gaps = 13/98 (13%)

Query: 50 QGITIIKTFDAPGGMKGYLGKYQDMGVTIYLTPDGKHAISG--YMYNEKGENLSNTLIEK 107
+ + + F+ YLGK+ ++ + G ++ + N+ G ++ N
Sbjct: 21 ESVKPVSDFEL----NNYLGKWYEVARLDHSFERGLSQVTAEYRVRNDGGISVLN----- 71

Query: 108 EIYAPAGREMWQRMEQSHWLLDGKKDAPVIVYVFSDPF 145
Y+ + W+ E + ++G D + V F PF
Sbjct: 72 RGYSEE-KGEWKEAEGKAYFVNGSTDGYLKVSFFG-PF 107


10  SF0553  SF0558  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF0553    3       18       -0.947394  nicotinamide riboside transporter PnuC
SF0554    3       20       -0.523029  quinolinate synthetase
SF0555    3       21       -0.448391  ***tol-pal system protein YbgF
SF0556    3       22       -0.218602  peptidoglycan-associated outer membrane
SF0557    3       18       -0.285670  translocation protein TolB
SF0558    3       18       -0.163512  cell envelope integrity inner membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SF0556    OMPADOMAIN  116    5e-34    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 116 bits (292), Expect = 5e-34
Identities = 35/119 (29%), Positives = 54/119 (45%), Gaps = 4/119 (3%)

Query: 55 EEQARLQMQQLQQNNIVYFDLDKYDIRSDFAQMLDAHANFLRSN--PSYKVTVEGHADER 112
+Q + + V F+ +K ++ + LD + L + V V G+ D
Sbjct: 205 APAPEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRI 264

Query: 113 GTPEYNISLGERRANAVKMYLQGKGVSADQISIVSYGKEKPAVLGHDEAAYSKNRRAVL 171
G+ YN L ERRA +V YL KG+ AD+IS G+ P V G+ K R A++
Sbjct: 265 GSDAYNQGLSERRAQSVVDYLISKGIPADKISARGMGESNP-VTGN-TCDNVKQRAALI 321


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SF0558    IGASERPTASE  64     7e-13    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 63.5 bits (154), Expect = 7e-13
Identities = 39/210 (18%), Positives = 70/210 (33%), Gaps = 9/210 (4%)

Query: 79 EQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQAELKQKQAEVAA 138
E+R A+ + +E E A +E + AE +
Sbjct: 986 EKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSK 1045

Query: 139 AKAAADAKAAEEAAKKAAADAKKKAEAEAAKAAAEAQKKAEVAAAALKKKAEAAEAAAAE 198
++ K E+ A + A ++ A+ + A Q EVA + +E E E
Sbjct: 1046 QESKTVEKN-EQDATETTAQNREVAKEAKSNVKANTQT-NEVA----QSGSETKETQTTE 1099

Query: 199 ARKKAATEAAEKAKAEAEKKAAAEKAAADKKAAAEKAAADKKAAEKAAADKAAADKKAAA 258
++ A E EKAK E EK K + E++ + AE A +
Sbjct: 1100 TKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTV---NIK 1156

Query: 259 EKAAADKKAAAAKAAAEKAAAAKAAAEADD 288
E + A + A++ ++ +
Sbjct: 1157 EPQSQTNTTADTEQPAKETSSNVEQPVTES 1186



Score = 55.1 bits (132), Expect = 3e-10
Identities = 30/244 (12%), Positives = 81/244 (33%), Gaps = 20/244 (8%)

Query: 51 DAVMVDSGAVVEQYKRMQSQESSAKRSDEQRKMKEQQAAE-ELREKQAAEQER------L 103
D V A + ++ ++K+ + + EQ A E + ++ A++ +
Sbjct: 1021 DEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANT 1080

Query: 104 KQLEKERLAAQEQKKQAEEAAKQAELKQKQAEVAAAKAAADAKAAEEAAKKAAADAKKKA 163
+ E + ++ ++ Q K+ +K+ KA + + +E K + + K+
Sbjct: 1081 QTNEVAQSGSETKETQ-TTETKETATVEKE-----EKAKVETEKTQEVPKVTSQVSPKQE 1134

Query: 164 EAEAAKAAAEAQKKAEVAAAALKKKAEAAEAAAAEARKKAATEAAEKAKAEAEKKAAAEK 223
++E + AE ++ + E + + A++ + E+
Sbjct: 1135 QSETVQPQAEPARENDPTVN-------IKEPQSQTNTTADTEQPAKETSSNVEQPVTEST 1187

Query: 224 AAADKKAAAEKAAADKKAAEKAAADKAAADKKAAAEKAAADKKAAAAKAAAEKAAAAKAA 283
+ E A + + +++K + + + A +
Sbjct: 1188 TVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTV 1247

Query: 284 AEAD 287
A D
Sbjct: 1248 ALCD 1251



Score = 54.7 bits (131), Expect = 5e-10
Identities = 33/221 (14%), Positives = 76/221 (34%), Gaps = 8/221 (3%)

Query: 66 RMQSQESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAK 125
R ++E+ + + + Q+ E +E Q E + +EKE A E +K E
Sbjct: 1066 REVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKV 1125

Query: 126 QAELKQKQAEVAAAKAAADAKAAEEAAKKAAADAKKKAEAEAAKAAAEAQKKAEVAAAAL 185
+++ KQ ++ AE A + K+ +++ A Q E ++
Sbjct: 1126 TSQVSPKQ-----EQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVE 1180

Query: 186 KKKAEAAEAAAAEARKKAATEAAEKAKAEAEKKAAAEKAAADKKAAAEKAAADKKAAEKA 245
+ E+ + + ++ K + + + + A +
Sbjct: 1181 QPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTS 1240

Query: 246 AADKAAADKKAAAEKAAADKKAAAAKAAAEKAAAAKAAAEA 286
+ D++ A + + + A + A A+ A +A
Sbjct: 1241 SNDRSTV---ALCDLTSTNTNAVLSDARAKAQFVALNVGKA 1278



Score = 52.4 bits (125), Expect = 3e-09
Identities = 32/184 (17%), Positives = 67/184 (36%), Gaps = 12/184 (6%)

Query: 99 EQERLKQLEKERLAAQEQKKQAEEAAKQAELKQK----QAEVAAAKAAADAKAAEEAAK- 153
E E+ Q QA+ + + ++ +A V A ++ E A+
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 154 -KAAADAKKKAEAEAAKAAAEAQKKAEVAAAALKKKAEAAEAAAAEARKKAATEAAEKAK 212
K + +K E +A + A+ ++ A+ A + +K + E A ++ +E E
Sbjct: 1044 SKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVA------QSGSETKETQT 1097

Query: 213 AEAEKKAAAEKAAADKKAAAEKAAADKKAAEKAAADKAAADKKAAAEKAAADKKAAAAKA 272
E ++ A EK K + K ++ + + + + AE A + K
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 273 AAEK 276
+
Sbjct: 1158 PQSQ 1161



Score = 47.4 bits (112), Expect = 9e-08
Identities = 29/213 (13%), Positives = 68/213 (31%), Gaps = 6/213 (2%)

Query: 71 ESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQAELK 130
+S ++ + Q ++ A E EK E E+ +++ K +++Q+E QAE
Sbjct: 1087 QSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPA 1146

Query: 131 QKQAEVAAAKAAADAKAAEEAAKKAAADAKKKAEAEAAKAAAEAQKKAEVAAAALKKKAE 190
++ K ++ A + E ++ + V A
Sbjct: 1147 RENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPAT 1206

Query: 191 AAEAAAAEARKKAATEAAEKAKAEAEKKAAAEKAAADKKAAAEKAAADKKAAEKAAADKA 250
+E+ K ++ A ++ D+ A +
Sbjct: 1207 TQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTN------TNAV 1260

Query: 251 AADKKAAAEKAAADKKAAAAKAAAEKAAAAKAA 283
+D +A A+ A + A ++ ++ +
Sbjct: 1261 LSDARAKAQFVALNVGKAVSQHISQLEMNNEGQ 1293



Score = 40.4 bits (94), Expect = 1e-05
Identities = 22/156 (14%), Positives = 44/156 (28%), Gaps = 11/156 (7%)

Query: 126 QAELKQKQAEVAAAKAAADAKAAEEAAK-KAAADAKKKAEAEAAKAAAEAQKKAEVAAAA 184
+ E + + + + +A + A+ A A + E A
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 185 LKKKAEAAEAAAAEARKKAATEAAEKAKAEAEKKAAAEKAAADKKAAAEKAAADKKAAEK 244
K++++ E K +A E E A+ E A + + E
Sbjct: 1044 SKQESKTVE--------KNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKET 1095

Query: 245 AAADKAAADKKAAAEKAAA--DKKAAAAKAAAEKAA 278
+ EKA +K K ++ +
Sbjct: 1096 QTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSP 1131


11  SF0587  SF0594  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
SF0587    0       18       3.640064   metal-binding protein
SF0588    0       20       3.707474   dipeptide/tripeptide permease D
SF0589    1       24       4.443331   deoxyribodipyrimidine photolyase
SF0590    2       28       5.227796   hypothetical protein
SF0591    2       25       5.117722   hypothetical protein
SF0594    2       20       3.942275   Rhs-family protein
12  SF0668  SF0725  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF0668    2       20       1.258899   bacteriophage protein
SF0669    1       20       0.809325   bacteriophage protein
SF0670    0       22       -0.641663  bacteriophage protein
SF0672    0       26       -2.288490  IS2 transposase TnpB
SF0673    1       25       -2.608808  IS2 repressor TnpA
SF0674    0       26       -3.377296  bacteriophage protein
SF0675    2       28       -2.615706  bacteriophage protein
SF0676    2       27       -3.258591  hypothetical protein
SF0677    3       27       -1.854728  insertion sequence element IS911 transposase
SF0678    4       29       -0.797993  insertion sequence element IS911 integrase core
SF0679    3       27       -1.832913  bacteriophage protein
SF0680    3       25       -1.273306  ****insertion sequence element IS600 protein
SF0681    3       22       -0.947394  insertion sequence element IS600 transposase
SF0682    3       22       -0.768209  bacteriophage protein
SF0683    3       21       -0.625677  insertion sequence element IS911 integrase core
SF0685    4       23       -1.415802  insertion sequence element IS911 transposase
SF0684    4       23       -1.474952  replication protein DnaC
SF0686    3       23       -1.377119  helicase
SF0687    2       25       -1.966179  bacteriophage protein
SF0688    1       26       -2.432706  Q protein
SF0689    3       25       -1.738282  ****S protein
SF0690    2       24       -1.676088  bacteriophage protein
SF0691    1       25       -1.675575  endolysin R of prophage CP-933V
SF0692    1       22       1.711351   endopeptidase
SF0693    1       22       2.086860   insertion sequence element IS600 transposase
SF0696    1       23       2.598586   insertion sequence element IS629 protein
SF0697    2       25       2.717244   insertion sequence element IS600 protein
SF0698    2       26       3.128519   DNA packaging protein of prophage; terminase
SF0703    4       26       3.886732   head-tail preconnector gp5
SF0704    4       28       2.195188   capsid protein small subunit
SF0705    5       23       2.187578   major capsid protein
SF0706    6       26       1.987057   DNA-packaging protein
SF0707    7       25       2.119718   tail attachment protein
SF0708    3       25       3.173115   prophage CP-933K tail protein
SF0709    3       26       3.034424   prophage CP-933K tail protein
SF0710    2       25       3.314015   prophage CP-933K tail protein
SF0711    4       29       3.812387   prophage CP-933K tail protein
SF0712    3       28       4.154077   prophage CP-933K tail protein
SF0713    2       26       4.085454   tail length tape measure protein
SF0714    5       26       4.346588   minor tail protein
SF0715    5       25       4.276964   minor tail protein
SF0716    4       20       2.833348   tail assembly protein
SF0717    2       16       2.267265   phage tail protein
SF0718    1       13       2.222138   host specificity protein
SF0719    -1      11       1.386845   membrane protein
SF0721    0       11       2.047039   prophage CP-933K tail protein
SF0722    0       12       2.345009   invasion plasmid antigen
SF0723    2       15       3.657055   kinase inhibitor protein
SF0725    -1      13       3.217294   7,8-diaminopelargonic acid synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF0682    PF05272  54     2e-09    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 53.9 bits (129), Expect = 2e-09
Identities = 30/82 (36%), Positives = 48/82 (58%), Gaps = 2/82 (2%)

Query: 4 SELSDLLWAQVDRVAPHLLPNGKIEGHEWVAGNVNGDKGNSLKVNLIGKKKWADFAEGDG 63
+ L+D L + + P LP G + GHE+ G++ G KG+S KVN + KW DF+ G+
Sbjct: 12 TSLADALLTRAKDLLPEWLPGGVLVGHEYECGSLAGGKGDSCKVN-VTTGKWCDFSTGES 70

Query: 64 G-DMLDLWMACRGINLHQAMQE 84
G D+LDL+ G+ + +A +
Sbjct: 71 GRDLLDLYAEIHGLKVSKAAAQ 92


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0688    FIMREGULATRY  27     0.014    Escherichia coli: P pili regulatory PapB protein si...
		>FIMREGULATRY#Escherichia coli: P pili regulatory PapB protein signature.

Length = 104

Score = 26.8 bits (59), Expect = 0.014
Identities = 8/31 (25%), Positives = 15/31 (48%)

Query: 71 LVDYYVFGMTFMTLARKHGCSDGYIGKKLQK 101
+ DY V G + + K+ ++GY L +
Sbjct: 51 MKDYLVGGHSRKEVCEKYQMNNGYFSTTLGR 81


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF0713    RTXTOXIND  40     6e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.8 bits (93), Expect = 6e-05
Identities = 14/194 (7%), Positives = 43/194 (22%), Gaps = 12/194 (6%)

Query: 29 LNGAASDAERSSARMQRFMERQTQAARQTMQAASSAATAASAHAQTVEKNARAHERMARE 88
L ++A+ + Q + + Q S + + E
Sbjct: 127 LTALGAEADTLKTQSSL---LQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEE 183

Query: 89 VEQTRLRVDALNQKMREEQAQARALAEAQDKAAAAFYRQIDSVKQAGAGLQELQRIQQQI 148
V + + + ++ Q + + +I+ + +
Sbjct: 184 VLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDD---F 240

Query: 149 RQARNSGGVGQQDYLALISEITAKTRALTQAE------EQATRQKAAFIRQLKEQATRQN 202
+ + + L ++ L + E + + + +
Sbjct: 241 SSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEI 300

Query: 203 LSSSELLRARAAQL 216
L L
Sbjct: 301 LDKLRQTTDNIGLL 314



Score = 38.7 bits (90), Expect = 1e-04
Identities = 25/224 (11%), Positives = 63/224 (28%), Gaps = 30/224 (13%)

Query: 545 NYQEQQKRRNAENAALNRMNETEAARHQREIARINAMQYADQAVRDAAIQRENERYEKAI 604
+ + + A L + + E+ ++ ++ D+ + E R I
Sbjct: 133 EADTLKTQSSLLQARLEQ-TRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLI 191

Query: 605 KKNTRATRNDEATRLLLQYSQQQAQVEGQIAAARQSAGIATERMTEAHKQLLALQQRISD 664
K+ + Q+ Q E + R R+ + R+ D
Sbjct: 192 KEQ------------FSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDD 239

Query: 665 LDGKKLTADEKSVLARKNELIQALTLLDVKQQELQKQTALNDLRKKTVQLTSQLADKERA 724
L + I +L+ + + ++ L + + Q+ S++ +
Sbjct: 240 F--SSL---------LHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEE 288

Query: 725 LREQHNLDIATAGMGDKQRQRYQAQLRIRQEYRQQLQQLENDSR 768
+ + + Q I +L + E +
Sbjct: 289 YQ-LVTQLFKN----EILDKLRQTTDNIGL-LTLELAKNEERQQ 326



Score = 32.5 bits (74), Expect = 0.009
Identities = 31/238 (13%), Positives = 69/238 (28%), Gaps = 42/238 (17%)

Query: 402 DPVNAAKALDNALHFLNATQLEQIRVLGEQGRSSDAARIAMSALAEETGKRTSDIDNNLN 461
+ A L +LEQ R RS + ++ L +E + + L
Sbjct: 128 TALGAEADTLKTQSSLLQARLEQTRYQILS-RSIELNKLPELKLPDEPYFQNVSEEEVLR 186

Query: 462 ALGSTLQTLSDWWKQFWDAAMNIGREDSLDAQIDALQEKIQRAKKYPWTNASTQVEYDQQ 521
+ S W Q + +N+ D A+ + +I R + ++
Sbjct: 187 LTSLIKEQFSTWQNQKYQKELNL---DKKRAERLTVLARINRYEN--------LSRVEKS 235

Query: 522 RLNDLQEKKRRKDLQDAKAQAERNYQEQQKRRNAENAALNRMNETEAARHQREIARINAM 581
RL+D L +A A+ EQ+ + L +++ +
Sbjct: 236 RLDDFSS------LLHKQAIAKHAVLEQENKYVEAVNELRVY-----------KSQLEQI 278

Query: 582 QYADQAVRDAAIQRENERYEKAIKKNTRATRNDEATRLLLQYSQQQAQVEGQIAAARQ 639
+ + ++ + + K + T + ++A +
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTT-------------DNIGLLTLELAKNEE 323



Score = 31.0 bits (70), Expect = 0.030
Identities = 26/185 (14%), Positives = 57/185 (30%), Gaps = 13/185 (7%)

Query: 13 IDAAEFKNEIPRIKNLLNGAASDAERSSARMQRFMERQTQAARQTMQAASSAATAASAHA 72
+ E + +E R+ ++ Q + A
Sbjct: 157 SRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAER 216

Query: 73 QTVEKNARAHERMAREVEQTRLRVDALNQKMREEQAQARALAEAQDKAAAAFYRQIDSVK 132
TV +E ++R VE++RL + +QA A+ Q+ ++
Sbjct: 217 LTVLARINRYENLSR-VEKSRL---DDFSSLLHKQAIAKHAVLEQENKYVEAVNEL---- 268

Query: 133 QAGAGLQELQRIQQQIRQARNSGGVGQQDYLALISEITAKTRA-LTQAEEQATRQKAAFI 191
+L++I+ +I A+ + Q + I + +T + + K
Sbjct: 269 --RVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLE--LAKNEER 324

Query: 192 RQLKE 196
+Q
Sbjct: 325 QQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0719    ENTEROVIROMP  134    2e-42    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 134 bits (339), Expect = 2e-42
Identities = 62/200 (31%), Positives = 98/200 (49%), Gaps = 30/200 (15%)

Query: 1 MRKLYAAILSAAICLAVSGAPAWASEHRSTLSAGYLHASTNAPGSDDLNGINVKYRYEFT 60
M+K+ AA+ +G A+ ST++ GY + + + G N+KYRYE
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAAT---STVTGGYAQSDAQGQMNK-MGGFNLKYRYEED 56

Query: 61 DT-LGLVTSFSYANAEDEQKTHYSDTRWHEDSVRNRWFSVMAGLSVRVNEWFSAYAMAGV 119
++ LG++ SF+Y T S T D +N+++ + AG + R+N+W S Y + GV
Sbjct: 57 NSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 AYSRVSTFSGDYLRVTDNKGKTHDVLTGSDDNRHSNTSLAWGAGVQFNPTESVAIDLAYE 179
Y + T T+ HD S+ ++GAG+QFNP E+VA+D +YE
Sbjct: 109 GYGKFQT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYE 151

Query: 180 GSGSGDWRTDGFIVGVGYKF 199
S +I GVGY+F
Sbjct: 152 QSRIRSVDVGTWIAGVGYRF 171


13  SF0815  SF0820  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF0815    0       16       -6.244721  arginine transporter permease subunit ArtM
SF0816    -2      18       -6.706681  arginine transporter permease subunit ArtQ
SF0817    -1      17       -6.230199  arginine ABC transporter substrate-binding
SF0818    1       15       -5.072567  arginine transporter ATP-binding subunit
SF0819    1       16       -3.864914  lipoprotein
SF0820    1       14       -3.037396  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF0818    PF05272  30     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.010
Identities = 9/18 (50%), Positives = 12/18 (66%)

Query: 31 LVLLGPSGAGKSSLLRVL 48
+VL G G GKS+L+ L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


14  SF0857  SF0895  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF0857    2       21       0.082210   MFS family transporter protein
SF0858    3       24       0.104899   insertion sequence element ISSfl4 transposase
SF0859    3       22       -0.006359  insertion sequence element ISSfl4 ORF2
SF0860    3       23       -0.552974  insertion sequence element ISSfl4 transposase
SF0862    2       24       -1.197716  insertion sequence element IS911 integrase core
SF0863    1       29       -3.873105  insertion sequence element IS911 transposase
SF0864    1       31       -3.276024  bacteriophage protein
SF0865    1       28       -3.529681  insertion sequence element IS600 transposase
SF0866    2       28       -3.821874  insertion sequence element IS600 protein
SF0867    2       28       -4.225263  bacteriophage protein
SF0868    4       30       -4.765510  hypothetical protein
SF0868a   3       25       -2.998828  hypothetical protein
SF0869    2       24       -3.049100  hypothetical protein
SF0870a   5       27       -3.648398  hypothetical protein
SF0871    4       25       -3.042219  bacteriophage protein
SF0873    1       26       -1.317527  bacteriophage protein
SF0874    0       25       -0.388160  bacteriophage protein
SF0875    2       24       -1.224654  DNA-binding transcriptional regulator DicC
SF0876    3       24       -1.513960  transcriptional repressor DicA
SF0877    2       25       -0.712388  hypothetical protein
SF0878    2       24       -0.831981  IS2 transposase TnpB
SF0879    3       26       -1.805106  IS2 repressor TnpA
SF0880    3       25       -2.189440  bacteriophage protein
SF0881    3       23       -1.508431  insertion sequence element IS600 transposase
SF0882    3       18       -0.269338  insertion sequence element IS600 protein
SF0883    0       15       -0.863982  exodeoxyribonuclease VIII
SF0884    2       12       -0.929293  insertion sequence element IS600 protein
SF0885    1       13       -0.815287  insertion sequence element IS600 transposase
SF0886    2       13       -0.263718  bacteriophage protein
SF0888    2       12       -0.448017  6-phosphogluconolactonase
SF0890    1       12       -1.109663  hypothetical protein
SF0891    1       13       -0.940110  membrane pump protein
SF0892    1       25       -0.805226  hydratase
SF0893    4       30       -1.228171  acyl-CoA thioesterase
SF0895    2       28       -1.615711  insertion sequence element IS600 protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF0860    RTXTOXIND  41     7e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 41.4 bits (97), Expect = 7e-06
Identities = 9/74 (12%), Positives = 33/74 (44%), Gaps = 1/74 (1%)

Query: 16 LRKQQSRLRQYACQVAGYEQEIERLKAQLDRLRRMLFGQSSEKKRHKLENQIRQAEKRLS 75
+ +Q+++ + ++ Y+ ++E++++++ + + K L+ ++RQ +
Sbjct: 254 VLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILD-KLRQTTDNIG 312

Query: 76 ELENRLNTARNLLE 89
L L +
Sbjct: 313 LLTLELAKNEERQQ 326


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SF0868a   HOKGEFTOXIC  62     4e-17    Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 61.8 bits (150), Expect = 4e-17
Identities = 20/48 (41%), Positives = 34/48 (70%)

Query: 23 QKAMLIALIVICLTVIVTALVTRKDLCEVRIRTGQTEVAVFVDYESRK 70
+ +++ ++++CLT+++ +TRK LCE+R R G EVA F+ YES K
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYESGK 52


15  SF0933  SF0943  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF0933    2       17       1.221325   IS2 repressor TnpA
SF0934    0       15       1.430887   alkanesulfonate monooxygenase
SF0935    1       21       -0.000630  hypothetical protein
SF0936    1       18       -2.503137  NAD(P)H-dependent FMN reductase
SF0937    2       21       -3.270825  hypothetical protein
SF0938    1       21       -3.209167  insertion element IS1 protein InsA
SF0939    1       22       -3.573429  insertion element IS1 protein InsB
SF0940    0       23       -3.549669  insertion element IS1 protein InsA
SF0941    0       21       -4.002465  outer membrane usher protein
SF0942    0       22       -3.693335  FimH-like protein
SF0943    0       21       -3.184551  fimbrial-like protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF0941    PF00577  822    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 822 bits (2125), Expect = 0.0
Identities = 414/862 (48%), Positives = 570/862 (66%), Gaps = 18/862 (2%)

Query: 15 GVPSFIGGLVVFVSAAFNAQAETWFDPAFFKDDPSMVADLSRFEKGQKITPGVYRVDIVL 74
G + F + A + AE +F+P F DDP VADLSRFE GQ++ PG YRVDI L
Sbjct: 25 GFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYL 84

Query: 75 NQTIVDTRNVNFVEITPEKGIAACLTTESLDAMGVNTDAFPAFKQLDKQVCVPLAEIIPD 134
N + TR+V F E+GI CLT L +MG+NT + L CVPL +I D
Sbjct: 85 NNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHD 144

Query: 135 ASVTFNVNKLRLEISVPQIAIKSNARGYVPPERWDEGINALLLGYSFSGANSIHSSADSD 194
A+ +V + RL +++PQ + + ARGY+PPE WD GINA LL Y+FSG + + +
Sbjct: 145 ATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNS 204

Query: 195 SGDSYFLNLNSGVNLGPWRLRNNSTWSR-----SSGQTAEWKNLSSYLQRAVIPLKGELT 249
+LNL SG+N+G WRLR+N+TWS SSG +W++++++L+R +IPL+ LT
Sbjct: 205 --HYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLT 262

Query: 250 VGDDYTAGDFFDSVSFRGVQLASDDNMLPDSLKGFAPVVRGIAKSNAQITIKQNGYTIYQ 309
+GD YT GD FD ++FRG QLASDDNMLPDS +GFAPV+ GIA+ AQ+TIKQNGY IY
Sbjct: 263 LGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYN 322

Query: 310 TYVSPGAFEISDLYSTSSSGDLLVEIKEADGSVNSYSVPFSSVPLLQRQGRIKYAVTLAK 369
+ V PG F I+D+Y+ +SGDL V IKEADGS ++VP+SSVPLLQR+G +Y++T +
Sbjct: 323 STVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGE 382

Query: 370 YRTNSNEQQESKFAQTTLQWGGPWGTTWYGGGQYAEYYRAAMFGLGFNLGDFGAISFDAT 429
YR+ + +Q++ +F Q+TL G P G T YGG Q A+ YRA FG+G N+G GA+S D T
Sbjct: 383 YRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMT 442

Query: 430 QAKSTLADQSEHKGQSYRFLYAKTLNQLGTNFQLMGYRYSTSGFYTLSDTMYKHMDGY-- 487
QA STL D S+H GQS RFLY K+LN+ GTN QL+GYRYSTSG++ +DT Y M+GY
Sbjct: 443 QANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNI 502

Query: 488 EFNDGDDEDTPMWSRYYNLFYTKRGKLQVNISQQLGEYGSFYLSGSQQTYWHTDQQDRLL 547
E DG + P ++ YYNL Y KRGKLQ+ ++QQLG + YLSGS QTYW T D
Sbjct: 503 ETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQF 562

Query: 548 QFGYNTQIKDLSLGVSWNYSKSRGQPDADQVFALNFSLPLNLLLPRSNDSYTRKKNYAWM 607
Q G NT +D++ +S++ +K+ Q DQ+ ALN ++P + L + S R +A
Sbjct: 563 QAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWR---HASA 619

Query: 608 TSNTSIDNEGHITQNLGLTETLLDDGNLSYSVQQGYNSEGKTANGS---ASMDYKGAFAD 664
+ + S D G +T G+ TLL+D NLSYSVQ GY G +GS A+++Y+G + +
Sbjct: 620 SYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGN 679

Query: 665 ARVGYNYSDNGSQQQLNYALSGSLVAHSQGITLGQSLGETNVLIAAPGAENTRVANSTGL 724
A +GY++SD+ +QL Y +SG ++AH+ G+TLGQ L +T VL+ APGA++ +V N TG+
Sbjct: 680 ANIGYSHSDD--IKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGV 737

Query: 725 KTDWRGYTVVPYATSYRENRIALDAASLKRNVDLENAVVNVVPTKGALVLAEFNAHAGAR 784
+TDWRGY V+PYAT YRENR+ALD +L NVDL+NAV NVVPT+GA+V AEF A G +
Sbjct: 738 RTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIK 797

Query: 785 VLMKTSKQGIPLRFGAIATLDGIQTNSGIIDDDGSLYMSGLPAQGAITVRWGEAPDQICH 844
+LM + PL FGA+ T + +SGI+ D+G +Y+SG+P G + V+WGE + C
Sbjct: 798 LLMTLTHNNKPLPFGAMVTSES-SQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCV 856

Query: 845 ISYQLTEQQINSAITRMDAICR 866
+YQL + +T++ A CR
Sbjct: 857 ANYQLPPESQQQLLTQLSAECR 878


16  SF0997  SF1010  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
SF0997    -1      18       -5.045086  trimethylamine N-oxide reductase cytochrome
SF1001    0       18       -4.986060  insertion element IS1 protein InsB
SF1002    -1      20       -5.529892  insertion element IS1 protein InsA
SF1003    0       18       -4.896908  chaperone-modulator protein CbpM
SF1004    1       17       -3.532256  curved DNA-binding protein CbpA
SF1005    0       16       -3.243431  hypothetical protein
SF1006    1       15       1.959766   glucose-1-phosphatase/inositol phosphatase
SF1007    0       17       3.408927   hypothetical protein
SF1008    1       17       3.737248   NAD(P)H:quinone oxidoreductase
SF1009    -1      18       3.944948   transporter
SF1010    -2      18       4.178884   hypothetical protein
17  SF1024  SF1062  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
SF1024    0       20       -4.831507  insertion sequence IS3 transposase InsE
SF1026    0       22       -5.427110  hypothetical protein
SF1028    0       22       -4.848761  *hydrolase
SF1029    1       25       -5.668549  oxidoreductase subunit
SF1030    2       29       -6.836093  hypothetical protein
SF1032    2       36       -8.470629  insertion element iso-IS10R transposase
SF1033    2       37       -9.144447  curli assembly protein CsgE
SF1034    3       35       -7.318364  DNA-binding transcriptional regulator CsgD
SF1035    1       28       -4.171176  curlin minor subunit CsgB
SF1037    1       26       -4.676040  insertion sequence element IS600 transposase
SF1039    0       17       -3.958880  curli assembly protein CsgC
SF1040    -2      12       -1.505186  hypothetical protein
SF1041    -1      13       -0.692579  RNase III inhibitor
SF1042    -2      13       -0.741078  synthase
SF1043    -1      11       -0.432740  glucans biosynthesis protein
SF1044    0       13       0.674650   glucan biosynthesis protein G
SF1045    1       15       1.816980   glucosyltransferase MdoH
SF1046    4       22       -0.253774  insertion sequence IS91 transposase
SF1048    4       24       -1.585492  insertion sequence IS91 protein
SF1050    3       26       -0.309206  insertion sequence element IS629 protein
SF1051    3       26       -0.339263  insertion sequence element IS911 transposase
SF1052    4       26       -0.169086  insertion sequence element IS911 integrase core
SF1053    4       27       -1.120583  bacteriophage protein
SF1054    5       29       -1.912813  IS2 repressor TnpA
SF1055    2       22       -1.377830  IS2 transposase TnpB
SF1056    -1      18       -2.581354  insertion sequence element IS629 protein
SF1058    -1      16       -3.277791  insertion sequence element IS600 transposase
SF1059    -2      16       -2.859764  insertion sequence element IS600 protein
SF1060    -3      14       -3.520401  bacteriophage protein
SF1061    2       17       -1.455709  lipid A biosynthesis lauroyl acyltransferase
SF1062    2       19       -2.211515  rhodanese-related sulfurtransferase
18  SF1153  SF1157  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
SF1153         0       17  -4.537609  phosphohydrolase
SF1154         1       19  -4.844657  23S rRNA pseudouridine synthase E
SF1155         1       19  -5.442807  isocitrate dehydrogenase
SF1156         1       26  -5.996037  hypothetical protein
SF1156a        2       27  -5.098725  membrane protein
SF1156b        1       26  -4.813489  ATPase
SF1157         0       21  -3.992097  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1157  PRTACTNFAMLY  98  4e-23  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 98.2 bits (244), Expect = 4e-23
Identities = 142/655 (21%), Positives = 227/655 (34%), Gaps = 93/655 (14%)

Query: 137 DVDITTHGDNAHAIAARQGTVSFNQGEIYTTGPDAAIAKIYNGGTVTLKNTSAVAHQGSG 196
D + + +V Q + AAI + G VT+ S A G+
Sbjct: 285 PGGFGPVLDGWYGVDVSGSSVELAQSIVEAPELGAAIR-VGRGARVTVSGGSLSAPHGNV 343

Query: 197 IVLESSIN--GQEATVDILSGSSLRSANEILYHKDETSNVTITDSEVSSAADVFINNIKG 254
I + Q A + I + + + L ++ V +T ++ AD + +
Sbjct: 344 IETGGARRFAPQAAPLSITLQAGAHAQGKALLYRVLPEPVKLT---LTGGADAQGDIVAT 400

Query: 255 HLTVDATNSKITGSANISTDDN------THTYLSLS-DNSTWDIKADSTVSNLTV--DNS 305
L S G +++ T SLS DN+TW + +S V L + D S
Sbjct: 401 ELPSIPGTS--IGPLDVALASQARWTGATRAVDSLSIDNATWVMTDNSNVGALRLASDGS 458

Query: 306 TVYISRADGRDVEPTRLTITENYVGNNGVLHLRTELDDDNSATDKVVINGNTSGTTRVKV 365
+ A+ + +T N + +G+ + D +DK+V+ + SG R+ V
Sbjct: 459 VDFQQPAEAGRFK----VLTVNTLAGSGLFRMNVFAD--LGLSDKLVVMQDASGQHRLWV 512

Query: 366 TNAGGSGAYTLNGIEIISVEGESNGEFI---KDSRIFAGAYEYSLTRGNTEATNKNWYLT 422
N+G S + N + ++ S F KD ++ G Y Y L N W L
Sbjct: 513 RNSG-SEPASANTLLLVQTPLGSAATFTLANKDGKVDIGTYRYRLA----ANGNGQWSLV 567

Query: 423 NFQAT-------SGGETNSGGSSAPTVAPTPVLRPEAGSYVANLAAANTLFVMRLNDRAG 475
+A G AP P A AA NT V A
Sbjct: 568 GAKAPPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAGRELSAAANAAVNTGGV----GLAS 623

Query: 476 EMRYIDPVTEQERSSRLWLRQIGGHNAWRDSNGQLRTTSHRY-------VS--QLGGDLL 526
+ Y + +R L L G AW Q + +R V+ +LG D
Sbjct: 624 TLWYAESNALSKRLGELRLNPDAG-GAWGRGFAQRQQLDNRAGRRFDQKVAGFELGADHA 682

Query: 527 TGGFTDSDSWRLGVMAGYARDYNLTHSSVSDYRSKGSVRGYSAGLYATWFADDISKKGAY 586
W LG +AGY R + G G YAT+ AD G Y
Sbjct: 683 VAVAGGR--WHLGGLAGYTR----GDRGFTGDGG-GHTDSVHVGGYATYIADS----GFY 731

Query: 587 IDSWAQYSWFKN----------SVKGDELAYESYSAKGATVSLEAGYGFALNKSFGLEAA 636
+D+ + S +N +VKG Y G SLEAG F
Sbjct: 732 LDATLRASRLENDFKVAGSDGYAVKGK------YRTHGVGASLEAGRRFTHADG------ 779

Query: 637 KYTWIFQPQAQAIWMGVDHNAHTEANGSRIENDANNNIQTRLGFRTFIRTQEKNSGPHGD 696
W +PQA+ A+ ANG R+ ++ +++ RLG R + G
Sbjct: 780 ---WFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEVGKRIELAG----GR 832

Query: 697 DFEPFVEMNWIHNSK-DFAVSMNGVKVEQDGVSNLGEIKLGVNGNLNPAASVWGN 750
+P+++ + + V NG+ + E+ LG+ L S++ +
Sbjct: 833 QVQPYIKASVLQEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRGHSLYAS 887


19  SF1247  SF1261  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
SF1247         0       17  -3.118544  oligopeptide ABC transporter ATP-binding protein
SF1248        -1       18  -3.163896  oligopeptide ABC transporter ATP-binding protein
SF1249        -1       19  -3.844260  hypothetical protein
SF1250        -2       21  -2.869140  cardiolipin synthetase
SF1251        -2       23  -3.928429  voltage-gated potassium channel
SF1252         1       20  -1.806226  insertion sequence element IS600 transposase
SF1253         1       16  -1.613042  insertion sequence element IS600 integrase core
SF1254         2       17  -2.037302  hypothetical protein
SF1255         0       19  -3.300294  transporter
SF1256         0       23  -5.550977  acyl-CoA esterase
SF1257         0       23  -5.669049  intracellular septation protein A
SF1258         0       21  -3.856050  hypothetical protein
SF1259         0       16  -2.097523  outer membrane protein W
SF1260         1       14  -1.399794  hypothetical protein
SF1261         2       12   1.055583  structural protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1248  HTHFIS  31  0.008  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.6 bits (69), Expect = 0.008
Identities = 9/16 (56%), Positives = 11/16 (68%)

Query: 55 VVGESGCGKSTFARAI 70
+ GESG GK ARA+
Sbjct: 165 ITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1254  adhesinmafb  31  4e-04  Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 31.2 bits (70), Expect = 4e-04
Identities = 16/57 (28%), Positives = 20/57 (35%), Gaps = 2/57 (3%)

Query: 41 GPMPAVDSNDPGAAGFTGSTVIAEFESLEAAQAWADADPYVAAGVYEHVSVKPFKKV 97
P+PA G GS E + EA W +P A V +V KV
Sbjct: 268 APLPA--EGKFAVIGGLGSVAGFEKNTREAVDRWIQENPNAAETVEAVFNVAAAAKV 322


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1255  TONBPROTEIN  256  2e-88  Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 256 bits (654), Expect = 2e-88
Identities = 236/239 (98%), Positives = 237/239 (99%)

Query: 4 ITLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVAPADLEPPQA 63
+TLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMV PADLEPPQA
Sbjct: 1 MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADLEPPQA 60

Query: 64 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 123
VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR
Sbjct: 61 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 120

Query: 124 PASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF 183
PASPFENTAPAR TSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF
Sbjct: 121 PASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF 180

Query: 184 DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ 242
DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ
Sbjct: 181 DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ 239


20  SF1302  SF1323  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
SF1302        -1       16   3.405735  glutamine synthetase
SF1303         0       16   3.406680  gamma-glutamyl-gamma-aminobutyrate hydrolase
SF1304         1       17   3.100481  DNA-binding transcriptional repressor PuuR
SF1305         2       18   3.176626  gamma-glutamyl-gamma-aminobutyraldehyde
SF1306         1       16   2.122730  oxidoreductase
SF1307         3       14   0.924294  4-aminobutyrate aminotransferase
SF1308         2       13  -1.244210  psp operon transcriptional activator
SF1309         2       14  -1.578862  phage shock protein PspA
SF1310        -1       15   0.132645  phage shock protein B
SF1311        -2       19   0.636716  DNA-binding transcriptional activator PspC
SF1312        -1       21   1.772920  peripheral inner membrane phage-shock protein
SF1313         0       17   1.678634  thiosulfate:cyanide sulfurtransferase
SF1315         0       17   2.188488  insertion element IS1 protein InsA
SF1317         1       19   1.775339  binding-protein dependent transport protein
SF1318         2       17   0.524958  transport system permease
SF1319         3       16   0.966570  oxidoreductase
SF1322         3       16   0.969729  hypothetical protein
SF1323         2       17   0.526504  beta-phosphoglucomutase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1308  HTHFIS  342  e-117  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 342 bits (879), Expect = e-117
Identities = 125/341 (36%), Positives = 182/341 (53%), Gaps = 23/341 (6%)

Query: 11 DNLLGEANSFLEVLEQVSHLAPLDKPVLIIGERGTGKELIASRLHYLSSRWQGPFISLNC 70
L+G + + E+ ++ L D ++I GE GTGKEL+A LH R GPF+++N
Sbjct: 137 MPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINM 196

Query: 71 AALNENLLDSELFGHEAGAFTGAQKLHPGRFERADGGTLFLDELATAPMMVQEKLLRVIE 130
AA+ +L++SELFGHE GAFTGAQ GRFE+A+GGTLFLDE+ PM Q +LLRV++
Sbjct: 197 AAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQ 256

Query: 131 YGELERVGGSQPLQVNVRLVCATNADLPAMVNEGTFRADLLDRLAFDVVQLPPLRERESD 190
GE VGG P++ +VR+V ATN DL +N+G FR DL RL ++LPPLR+R D
Sbjct: 257 QGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAED 316

Query: 191 IMLMAEHFAIQMCREIKLPLFPGFTERARETLLNYRWPGNIRELKNVVERSVYRHGTSDY 250
I + HF Q +E F + A E + + WPGN+REL+N+V R +
Sbjct: 317 IPDLVRHFVQQAEKEGLDVK--RFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVI 374

Query: 251 PLDDIIID---PFKRRPPEEAIAVSENTSLPTLPLD------------------LREFQM 289
+ I + P E+A A S + S+ +
Sbjct: 375 TREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLA 434

Query: 290 QQEKELLQLSLQQGKYNQKRAAELLGLTYHQFRALLKKHQI 330
+ E L+ +L + NQ +AA+LLGL + R +++ +
Sbjct: 435 EMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1310  MPTASEINHBTR  25  0.041  Metalloprotease inhibitor signature.
		>MPTASEINHBTR#Metalloprotease inhibitor signature.

Length = 122

Score = 24.6 bits (53), Expect = 0.041
Identities = 6/43 (13%), Positives = 16/43 (37%)

Query: 30 SGRSELSQSEQQRLAQLVDEAKRMRERIQALESILDAEHPNWR 72
+G+ + + A ++A + + E L + +W
Sbjct: 37 AGQLGIEATGSGVCAGPAEQANALAGDVACAEQWLGDKPVSWS 79


21  SF1333  SF1383  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
SF1333         2       26  -1.599270  hypothetical protein
SF1334         1       23   0.440718  hypothetical protein
SF1335         2       25   0.812405  insertion sequence element ISSfl2 transposase
SF1337         0       22  -0.461292  hypothetical protein
SF1339         2       25   0.223102  insertion sequence element IS600 protein
SF1340         1       26  -0.415673  insertion sequence element IS600 transposase
SF1342         0       27  -1.013843  IS2 transposase TnpB
SF1343         0       26  -2.937666  IS2 repressor TnpA
SF1344        -1       22  -3.224364  DnaC-like chromosome replication protein
SF1345         0       28  -4.778419  bacteriophage protein
SF1346         1       29  -5.126863  bacteriophage protein
SF1347         1       28  -5.753831  hypothetical protein
SF1348         1       27  -3.711727  hypothetical protein
SF1349         2       26  -3.090882  hypothetical protein
SF1350         2       25  -3.029782  hypothetical protein
SF1350a        0       25  -2.697768  regulatory protein
SF1350b        0       25  -2.977897  hypothetical protein
SF1351        -1       23  -2.234612  insertion element IS1 protein InsB
SF1352         0       24  -2.112918  insertion element IS1 protein InsA
SF1353         1       22  -2.177501  bacteriophage protein
SF1354         2       24  -0.718987  bacteriophage protein
SF1355         3       24  -0.513358  bacteriophage protein
SF1356         3       23  -0.585689  insertion sequence element IS600 transposase
SF1357         4       21  -0.193087  insertion sequence element IS600 protein
SF1359         3       20  -0.164940  tail fiber assembly protein
SF1360         3       20  -0.697638  tail fiber protein
SF1361         1       18  -1.293983  hypothetical protein
SF1362         1       20  -1.386542  iron transporter inner membrane protein
SF1363         1       19  -0.508725  iron ABC transporter permease
SF1364         1       21  -0.315310  iron ABC transporter ATP-binding protein
SF1365         0       19  -1.543298  iron ABC transporter substrate-binding protein
SF1366         1       20  -1.119661  insertion element IS1 protein InsA
SF1367         1       20  -1.451071  insertion element IS1 protein InsB
SF1370         2       22  -2.381125  insertion sequence element IS629 integrase core
SF1371         2       23  -3.708678  insertion sequence element IS629 protein
SF1372         0       22  -2.675053  bacteriophage protein
SF1373         1       23  -1.909146  host-nuclease inhibitor protein Gam
SF1375         1       21  -1.350230  exodeoxyribonuclease VIII
SF1376         5       25  -2.019643  hypothetical protein
SF1377         5       25  -1.070304  FtsZ inhibitor protein
SF1378         5       26  -0.594299  IS2 transposase TnpB
SF1379         6       26  -1.863961  IS2 repressor TnpA
SF1382         6       25  -2.007433  hypothetical protein
SF1383         3       22  -1.910970  invasion plasmid antigen
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1350a  HOKGEFTOXIC  65  2e-18  Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 65.2 bits (159), Expect = 2e-18
Identities = 19/46 (41%), Positives = 32/46 (69%)

Query: 23 QKAMLIALIVICLTVIVTALVTRKDLCEVRIRTGQTEVAVFTAYEP 68
+ +++ ++++CLT+++ +TRK LCE+R R G EVA F AYE
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1365  adhesinb  332  e-116  Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 332 bits (852), Expect = e-116
Identities = 90/296 (30%), Positives = 163/296 (55%), Gaps = 7/296 (2%)

Query: 1 MLLGCLALTCSIAFQASATEKFKVITTFTIIADMAKNVAGDAAEVSSITKPGAEIHEYQP 60
+G A + + + + K V+ T +IIAD+ KN+AGD + SI G + HEY+P
Sbjct: 13 AFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKNIAGDKINLHSIVPVGQDPHEYEP 72

Query: 61 TPGDIKRAQGAQLILANGMNLEL----WFQRFYQHLNGVPE---VIVSSGVTPVGITEGP 113
P D+K+ A LI NG+NLE WF + ++ VS GV + +
Sbjct: 73 LPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENAKKKENKDYYAVSEGVDVIYLEGQS 132

Query: 114 YEGKPNPHAWMSPDNALIYVDNIRDALIKYDPANAQTYQRNADTYKAKITQTLAPLRKQI 173
+GK +PHAW++ +N +IY NI L + DPAN +TY++N Y K++ +++
Sbjct: 133 EKGKEDPHAWLNLENGIIYAQNIAKRLSEKDPANKETYEKNLKAYVEKLSALDKEAKEKF 192

Query: 174 TELPENQRWMVTSEGAFSYLARDLGLKELYLWPINADQQGTPQQVRKVVDIVKKNHIPAV 233
+P ++ +VTSEG F Y ++ + Y+W IN +++GTP Q++ +V+ ++K +P++
Sbjct: 193 NNIPGEKKMIVTSEGCFKYFSKAYNVPSAYIWEINTEEEGTPDQIKTLVEKLRKTKVPSL 252

Query: 234 FSESTISDKPARQVARETGAHYGGVLYVDSLSTENGPVPTYIDLLKVTTSTLVQGI 289
F ES++ D+P + V+++T ++ DS++ + +Y ++K + +G+
Sbjct: 253 FVESSVDDRPMKTVSKDTNIPIYAKIFTDSVAEKGEEGDSYYSMMKYNLEKIAEGL 308


22  SF1435  SF1452  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
SF1435         2       16  -1.015169  hypothetical protein
SF1436         2       19  -1.703297  hypothetical protein
SF1437         1       16  -1.385542  hypothetical protein
SF1439         2       21  -1.497681  insertion sequence element IS600 transposase
SF1440         2       21  -1.434212  insertion sequence element IS600 protein
SF1441         0       23  -1.897058  hypothetical protein
SF1442         0       20  -2.291073  aldehyde reductase
SF1443        -1       17  -2.554697  hypothetical protein
SF1444        -1       19  -3.604781  glyceraldehyde-3-phosphate dehydrogenase
SF1445         0       23  -4.614453  methionine sulfoxide reductase B
SF1446         1       22  -4.584942  hypothetical protein
SF1447         0       22  -5.131907  oxidoreductase
SF1448         1       24  -6.003252  transporter
SF1450         1       23  -5.926327  aldolase
SF1451         1       20  -4.851244  kinase
SF1452        -1       14  -3.191404  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1435  PRTACTNFAMLY  28  0.022  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 27.7 bits (61), Expect = 0.022
Identities = 18/61 (29%), Positives = 26/61 (42%)

Query: 49 QGLSIGIIILTIGVMAPIASGTLPPSTLIHSFLNWKSLVAIAVGVIVSWLGGRGVTLMGS 108
Q +I L IG + + LPPS ++ N ++ A VS LG +TL G
Sbjct: 174 QRSAIVDGGLHIGALQSLQPEDLPPSRVVLRDTNVTAVPASGAPAAVSVLGASELTLDGG 233

Query: 109 Q 109

Sbjct: 234 H 234


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1443  INVEPROTEIN  29  0.021  Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE)

signature.
Length = 372

Score = 28.9 bits (64), Expect = 0.021
Identities = 18/81 (22%), Positives = 34/81 (41%), Gaps = 13/81 (16%)

Query: 158 ETTSALHTYFNVGDIAKVSVSGLGDRFIDKVNDAKED-----------VLTDGIQTFPDR 206
E ++AL + N D K S S L + F ++V + + V ++ F +
Sbjct: 57 EMSAALAQFRNRRDYEKKS-SNLSNSF-ERVLEDEALPKAKQILKLISVHGGALEDFLRQ 114

Query: 207 TDRVYLNPQDCSVINDEALNR 227
++ +P D ++ E L R
Sbjct: 115 ARSLFPDPSDLVLVLRELLRR 135


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1448  TCRTETB  29  0.041  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.1 bits (65), Expect = 0.041
Identities = 32/142 (22%), Positives = 47/142 (33%), Gaps = 23/142 (16%)

Query: 71 MFLGALVGGIIGDKTGRRNAFILYEAIHIASMVVGAFSPNMDF-LIACRFVMDVGLGALL 129
+G V G + D+ G + + I+ V+G + LI RF+ G A
Sbjct: 62 FSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFP 121

Query: 130 VTLFAGFTEYMPGRNR----GTWSSRVSFIGNWSYPLCSLIAMGLTPLISA----EWNWR 181
+ Y+P NR G S V+ + G+ P I +W
Sbjct: 122 ALVMVVVARYIPKENRGKAFGLIGSIVA------------MGEGVGPAIGGMIAHYIHWS 169

Query: 182 VQLLIPAILSLIATALAWRYFP 203
LLIP I I T
Sbjct: 170 YLLLIPMI--TIITVPFLMKLL 189


23  SF1479  SF1498  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
SF1479         2       13   2.548248  arginine succinyltransferase
SF1480         1       13   1.286914  aldehyde dehydrogenase
SF1482         0       15  -0.904676  succinylglutamate desuccinylase
SF1483         1       17  -2.269213  periplasmic protein
SF1484         2       17  -2.449088  hypothetical protein
SF1485         1       18  -2.878171  nucleotide excision repair endonuclease
SF1486         0       16  -2.969776  NAD synthetase
SF1487         1       16  -2.551449  DNA-binding transcriptional activator OsmE
SF1488         0       16  -3.502368  PTS system N,N'-diacetylchitobiose-specific
SF1489         0       15  -2.843553  PTS system N,N'-diacetylchitobiose-specific
SF1490        -2       12  -1.942582  PTS system N,N'-diacetylchitobiose-specific
SF1492        -2       11  -1.743780  insertion element IS1 protein InsB
SF1493        -1       16  -3.873039  insertion element IS1 protein InsA
SF1494        -1       15  -3.782325  cryptic phospho-beta-glucosidase
SF1495         0       17  -3.029144  hypothetical protein
SF1496         0       15  -2.956660  hydroperoxidase II
SF1497         0       24  -4.339745  cell division modulator
SF1498         0       22  -5.278461  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1480  DNABINDINGHU  31  0.002  Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 31.2 bits (71), Expect = 0.002
Identities = 14/61 (22%), Positives = 28/61 (45%), Gaps = 5/61 (8%)

Query: 93 SNKAELTAIIARETGKPRWEAATEVTAMINKIAISIKAYHVRTGEQRSEMPDGAASLRHR 152
+NK +L A +A T + ++A V A+ + ++ + GE+ + G +R R
Sbjct: 2 ANKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAK-----GEKVQLIGFGNFEVRER 56

Query: 153 P 153

Sbjct: 57 A 57


24  SF1532  SF1547a  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
SF1532         1       15  -3.623232  electron transfer flavoprotein subunit YdiR
SF1533         1       19  -3.705063  hypothetical protein
SF1535         0       23  -4.713619  insertion element IS1 protein InsA
SF1536         0       23  -4.692141  insertion element IS1 protein InsB
SF1537         0       23  -4.670663  transcriptional regulator YdeO
SF1537a        2       21  -2.760669  two-protein-system connector protein SafA
SF1538         2       21  -2.210816  oxidoreductase
SF1539         5       25  -2.296450  minor fimbrial subunit, D-mannose specific
SF1540         5       28  -3.079024  insertion element IS1 protein InsA
SF1541         4       30  -3.486075  insertion element IS1 protein InsB
SF1542         3       29  -3.525952  insertion sequence element IS600 protein
SF1542a        2       24  -2.092728  hypothetical protein
SF1542b        1       22  -2.802662  hypothetical protein
SF1543         0       22  -2.916524  anti-adapter protein IraM
SF1544        -1       22  -1.494086  **antitermination protein
SF1545         0       24  -1.280937  endodeoxyribonuclease RUS
SF1546         2       25  -1.099016  bacteriophage protein
SF1547         4       29  -3.802305  hypothetical protein
SF1547a        3       23  -3.163287  regulatory protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1547a  HOKGEFTOXIC  61  5e-17  Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 60.6 bits (147), Expect = 5e-17
Identities = 19/46 (41%), Positives = 32/46 (69%)

Query: 4 QKAMLIALIVICLTVIVTALVTRKDLCEVRIRTGQTEVAVFTAYEP 49
+ +++ ++++CLT+++ +TRK LCE+R R G EVA F AYE
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50


25  SF1592  SF1608  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
SF1592         5       25  -1.102314  hypothetical protein
SF1593         9       30  -0.882507  hypothetical protein
SF1595        11       33  -0.853154  hypothetical protein
SF1596        13       41   1.237684  hypothetical protein
SF1599        16       42   0.884636  hypothetical protein
SF1600        13       35   1.269721  hypothetical protein
SF1601        12       33   0.172180  hypothetical protein
SF1602        14       35  -0.056485  hypothetical protein
SF1603        11       29  -1.146381  hypothetical protein
SF1604         8       26  -1.138449  integrase
SF1605         5       23  -1.050052  insertion sequence element IS911 integrase core
SF1606         4       22  -1.381354  insertion sequence element IS911 transposase
SF1608         3       18  -0.314028  integrase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1600  PYOCINKILLER  27  0.048  Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 26.7 bits (58), Expect = 0.048
Identities = 11/31 (35%), Positives = 16/31 (51%)

Query: 78 AYGRRQMNKRAEAERIAKEQRLQAERMREEN 108
A + + AEA+R A+EQ Q +R N
Sbjct: 218 AANKAREQAAAEAKRKAEEQARQQAAIRAAN 248


26  SF1718  SF1727  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
SF1718         2       17  -3.515854  hypothetical protein
SF1719         1       18  -4.738974  hypothetical protein
SF1720         1       18  -4.298421  transport system permease
SF1721        -1       17  -4.674100  amino acid/amine transport protein
SF1722        -1       21  -6.484894  quinate/shikimate dehydrogenase
SF1723         0       18  -6.703909  3-dehydroquinate dehydratase
SF1724         1       19  -6.808597  insertion element IS1 protein InsB
SF1725         1       19  -6.903575  insertion element IS1 protein InsA
SF1726         1       19  -6.776266  sulfatase
SF1727         1       11  -4.517276  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1720  TCRTETA  31  0.007  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 31.3 bits (71), Expect = 0.007
Identities = 63/331 (19%), Positives = 113/331 (34%), Gaps = 19/331 (5%)

Query: 41 FAGLLSDRFGRRPFIMLGMCCYMAFFFDILQTNNIIIAYVFGFLAGMANSFLDAGTYPSL 100
G LSDRFGRRP +++ + + + + + Y+ +AG+ + A +
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATG-AVAGAYI 120

Query: 101 MEAFPRSPGTANI-LIKAFVSSGQFLLPLIISLLVWAELWFGWSFMIAAGIMFINALFLY 159
+ + + A G P++ L+ F AA + +N L
Sbjct: 121 ADITDGDERARHFGFMSACFGFGMVAGPVLGGLM--GGFSPHAPFFAAAALNGLNFLTGC 178

Query: 160 RCTFPPHPGRRLPV---IKKTTSSTEHRCSIIDLASYTLYGYISMATFYLVSQWLAQYGQ 216
H G R P+ +S + +A+ +I + + +G+
Sbjct: 179 FLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGE 238

Query: 217 FVAGMSYTM-SIKLLSIYTVGSLLCVFITAPLIRNTVRPTTLLMLYTFISFIALFTVCLH 275
T I L + + SL IT P+ L++ IA T +
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALML-----GMIADGTGYIL 293

Query: 276 PTFYVVIIFAF-VIGFTSAGGVVQIGLTLMAERF--PYAKGKATGIYYSTGSIATFTIPL 332
F AF ++ ++GG+ L M R +G+ G + S+ + PL
Sbjct: 294 LAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPL 353

Query: 333 ITAHLSQRSIA---DIMWFDTAIGFLLALFI 360
+ + SI W A +LL L
Sbjct: 354 LFTAIYAASITTWNGWAWIAGAALYLLCLPA 384



Score = 29.8 bits (67), Expect = 0.019
Identities = 25/137 (18%), Positives = 51/137 (37%), Gaps = 3/137 (2%)

Query: 20 NAAGVSIVISSLGI-GRLSVLLFAGLLSDRFGRRPFIMLGMCCYMAFFFDILQTNNIIIA 78
+A + I +++ GI L+ + G ++ R G R +MLGM + + +A
Sbjct: 244 DATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMA 303

Query: 79 YVFGFLAGMANSFLDAGTYPSLMEAFPRSPGTANILIKAFVSSGQFLLPLIISLL--VWA 136
+ L + A + G + A S + PL+ + +
Sbjct: 304 FPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASI 363

Query: 137 ELWFGWSFMIAAGIMFI 153
W GW+++ A + +
Sbjct: 364 TTWNGWAWIAGAALYLL 380


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1721  TCRTETB  30  0.021  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.8 bits (67), Expect = 0.021
Identities = 31/150 (20%), Positives = 63/150 (42%), Gaps = 7/150 (4%)

Query: 4 LAEKFSTDNAGIAYLISGIGLGRLISILFFGVISDKFGRRAVILMAVIMY----LLFFFG 59
+A F+ A ++ + L I +G +SD+ G + ++L +I+ ++ F G
Sbjct: 40 IANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVG 99

Query: 60 IPACPNLTLAYGLAVCVGIANSALDTGGYPALMECFPKASGSAVILVKAMVSFGQMFYPM 119
L +A + A AL + G A L+ ++V+ G+ P
Sbjct: 100 HSFFSLLIMARFIQGAGAAAFPALVMVVVARY--IPKENRGKAFGLIGSIVAMGEGVGP- 156

Query: 120 LVSYMLLNNIWYGYGLIIPGILFVLITLML 149
+ M+ + I + Y L+IP I + + ++
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMITIITVPFLM 186


27  SF1804  SF1811  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias    %GCBias  Product
SF1804        -1       27  -4.658424  insertion sequence element IS600 transposase
SF1805        -1       26  -4.183939  insertion sequence element IS600 protein
SF1807        -2       23  -4.224325  hypothetical protein
SF1808        -3       20  -3.006525  hypothetical protein
SF1810        -3       19  -3.596709  hypothetical protein
SF1811        -3       18  -3.573525  hypothetical protein
28  SF1824  SF1843  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
SF1824         0       22  -3.542642  universal stress protein F
SF1825        -1       17  -1.648659  tRNA 2-thiocytidine biosynthesis protein TtcA
SF1826         0       22  -2.073967  ATP-dependent RNA helicase DbpA
SF1827         1       22  -2.791261  zinc transporter
SF1830         2       25  -2.920675  multidrug resistant protein-like protein
SF1831        -1       20   0.167662  hypothetical protein
SF1832        -1       17  -0.363289  IS1294 transposase
SF1834        -1       20  -1.541747  hypothetical protein
SF1834a        0       19  -0.872025  hypothetical protein
SF1835        -1       17  -2.291847  O-6-alkylguanine-DNA:cysteine-protein
SF1836        -1       13  -3.222522  fumarate/nitrate reduction transcriptional
SF1837         0       15  -3.685948  universal stress protein UspE
SF1838         0       16  -3.786712  insertion sequence element IS911 integrase core
SF1839         0       17  -4.530628  insertion sequence element IS911 transposase
SF1840         0       16  -4.452476  hypothetical protein
SF1841         1       18  -3.125560  transport protein
SF1843         3       21  -2.403762  insertion sequence element IS600 transposase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1830  TCRTETB  102  2e-25  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 102 bits (255), Expect = 2e-25
Identities = 78/418 (18%), Positives = 172/418 (41%), Gaps = 21/418 (5%)

Query: 3 SMRKHIAFASMCMGLFIAQLDIQIVSSSLNEIGGGLSAGKDEMAWLQTSYLIAEIIVIPL 62
++R + +C+ F + L+ +++ SL +I + W+ T++++ I +
Sbjct: 9 NLRHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAV 68

Query: 63 SGWLSRVFSTRWLFTLSAGIFTLMSIACGLAWN-IQIMILFRALQGAAGASMIPLVFTMA 121
G LS + L I S+ + + ++I+ R +QGA A+ LV +
Sbjct: 69 YGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVV 128

Query: 122 FIYYQGKELGLAAAVVSALASLSPTLGPTLGGWLTDNLDWRWLFYINILPGIYLVLSIPF 181
Y + G A ++ ++ ++ +GP +GG + + W +L I ++ ++++PF
Sbjct: 129 ARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMI----TIITVPF 184

Query: 182 LVNFDKPDLSLLKVADYPSIILLAMTLGCLEYTLEEGARWGWLDDNTILLTSVLALVSFI 241
L+ K ++ + D IIL+++ + +L +++++SF+
Sbjct: 185 LMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTS-YSISFL---------IVSVLSFL 234

Query: 242 LFAARTLTISNPIMDLHAFKDKNFTLGCFFSFSGGVGIFSTVYLIPVFLGQIRGLNAEEI 301
+F +++P +D K+ F +G + V ++P + + L+ EI
Sbjct: 235 IFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEI 294

Query: 302 GFAVCTTG-IFQLFSVPFYFWLSKKINLRWLLMAGLGGFVFSMYL--FTPITHEWGWQEL 358
G + G + + L + ++L G+ S F T W + +
Sbjct: 295 GSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSW-FMTI 353

Query: 359 LFPQAIRGISQQFAMAPIVTLTLGGIPKERLKLASGVFNLTRNLGGAIGIALCGSILN 416
+ + G+S F I T+ + ++ + N T L GIA+ G +L+
Sbjct: 354 IIVFVLGGLS--FTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


29  SF1878  SF1893  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
SF1878         2       18   1.713144  insertion sequence element ISSfl2 transposase
SF1879         3       18   1.262435  bacteriophage protein
SF1880         4       21   1.986514  invasion plasmid antigen
SF1881         5       25   4.148998  prophage CP-933K tail protein
SF1883         5       26   4.214576  membrane protein
SF1884         2       26   3.985206  host specificity protein
SF1885         3       28   3.875991  phage tail protein
SF1886         3       29   3.541873  tail assembly protein
SF1887         3       25   3.207179  minor tail protein
SF1888         3       25   2.878623  minor tail protein
SF1889         3       26   3.057836  tail length tape measure protein
SF1890         7       28   1.549808  prophage CP-933K tail protein
SF1891         7       29   1.705866  prophage CP-933K tail protein
SF1892         4       22   3.188249  prophage CP-933K tail protein
SF1893         2       26   1.468210  prophage CP-933K tail protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1883  ENTEROVIROMP  134  2e-42  Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein

signature.
Length = 171

Score = 134 bits (339), Expect = 2e-42
Identities = 62/200 (31%), Positives = 98/200 (49%), Gaps = 30/200 (15%)

Query: 1 MRKLYAAILSAAICLAVSGAPAWASEHRSTLSAGYLHASTNAPGSDDLNGINVKYRYEFT 60
M+K+ AA+ +G A+ ST++ GY + + + G N+KYRYE
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAAT---STVTGGYAQSDAQGQMNK-MGGFNLKYRYEED 56

Query: 61 DT-LGLVTSFSYANAEDEQKTHYSDTRWHEDSVRNRWFSVMAGLSVRVNEWFSAYAMAGV 119
++ LG++ SF+Y T S T D +N+++ + AG + R+N+W S Y + GV
Sbjct: 57 NSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 AYSRVSTFSGDYLRVTDNKGKTHDVLTGSDDNRHSNTSLAWGAGVQFNPTESVAIDLAYE 179
Y + T T+ HD S+ ++GAG+QFNP E+VA+D +YE
Sbjct: 109 GYGKFQT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYE 151

Query: 180 GSGSGDWRTDGFIVGVGYKF 199
S +I GVGY+F
Sbjct: 152 QSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1889  RTXTOXIND  40  6e-05  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 39.8 bits (93), Expect = 6e-05
Identities = 14/194 (7%), Positives = 43/194 (22%), Gaps = 12/194 (6%)

Query: 29 LNGAASDAERSSARMQRFMERQTQAARQTMQAASSAATAASAHAQTVEKNARAHERMARE 88
L ++A+ + Q + + Q S + + E
Sbjct: 127 LTALGAEADTLKTQSSL---LQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEE 183

Query: 89 VEQTRLRVDALNQKMREEQAQARALAEAQDKAAAAFYRQIDSVKQAGAGLQELQRIQQQI 148
V + + + ++ Q + + +I+ + +
Sbjct: 184 VLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDD---F 240

Query: 149 RQARNSGGVGQQDYLALISEITAKTRALTQAE------EQATRQKAAFIRQLKEQATRQN 202
+ + + L ++ L + E + + + +
Sbjct: 241 SSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEI 300

Query: 203 LSSSELLRARAAQL 216
L L
Sbjct: 301 LDKLRQTTDNIGLL 314



Score = 38.7 bits (90), Expect = 1e-04
Identities = 25/224 (11%), Positives = 63/224 (28%), Gaps = 30/224 (13%)

Query: 545 NYQEQQKRRNAENAALNRMNETEAARHQREIARINAMQYADQAVRDAAIQRENERYEKAI 604
+ + + A L + + E+ ++ ++ D+ + E R I
Sbjct: 133 EADTLKTQSSLLQARLEQ-TRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLI 191

Query: 605 KKNTRATRNDEATRLLLQYSQQQAQVEGQIAAARQSAGIATERMTEAHKQLLALQQRISD 664
K+ + Q+ Q E + R R+ + R+ D
Sbjct: 192 KEQ------------FSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDD 239

Query: 665 LDGKKLTADEKSVLARKNELIQALTLLDVKQQELQKQTALNDLRKKTVQLTSQLADKERA 724
L + I +L+ + + ++ L + + Q+ S++ +
Sbjct: 240 F--SSL---------LHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEE 288

Query: 725 LREQHNLDIATAGMGDKQRQRYQAQLRIRQEYRQQLQQLENDSR 768
+ + + Q I +L + E +
Sbjct: 289 YQ-LVTQLFKN----EILDKLRQTTDNIGL-LTLELAKNEERQQ 326



Score = 32.5 bits (74), Expect = 0.009
Identities = 31/238 (13%), Positives = 69/238 (28%), Gaps = 42/238 (17%)

Query: 402 DPVNAAKALDNALHFLNATQLEQIRVLGEQGRSSDAARIAMSALAEETGKRTSDIDNNLN 461
+ A L +LEQ R RS + ++ L +E + + L
Sbjct: 128 TALGAEADTLKTQSSLLQARLEQTRYQILS-RSIELNKLPELKLPDEPYFQNVSEEEVLR 186

Query: 462 ALGSTLQTLSDWWKQFWDAAMNIGREDSLDAQIDALQEKIQRAKKYPWTNASTQVEYDQQ 521
+ S W Q + +N+ D A+ + +I R + ++
Sbjct: 187 LTSLIKEQFSTWQNQKYQKELNL---DKKRAERLTVLARINRYEN--------LSRVEKS 235

Query: 522 RLNDLQEKKRRKDLQDAKAQAERNYQEQQKRRNAENAALNRMNETEAARHQREIARINAM 581
RL+D L +A A+ EQ+ + L +++ +
Sbjct: 236 RLDDFSS------LLHKQAIAKHAVLEQENKYVEAVNELRVY-----------KSQLEQI 278

Query: 582 QYADQAVRDAAIQRENERYEKAIKKNTRATRNDEATRLLLQYSQQQAQVEGQIAAARQ 639
+ + ++ + + K + T + ++A +
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTT-------------DNIGLLTLELAKNEE 323



Score = 31.0 bits (70), Expect = 0.030
Identities = 26/185 (14%), Positives = 57/185 (30%), Gaps = 13/185 (7%)

Query: 13 IDAAEFKNEIPRIKNLLNGAASDAERSSARMQRFMERQTQAARQTMQAASSAATAASAHA 72
+ E + +E R+ ++ Q + A
Sbjct: 157 SRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAER 216

Query: 73 QTVEKNARAHERMAREVEQTRLRVDALNQKMREEQAQARALAEAQDKAAAAFYRQIDSVK 132
TV +E ++R VE++RL + +QA A+ Q+ ++
Sbjct: 217 LTVLARINRYENLSR-VEKSRL---DDFSSLLHKQAIAKHAVLEQENKYVEAVNEL---- 268

Query: 133 QAGAGLQELQRIQQQIRQARNSGGVGQQDYLALISEITAKTRA-LTQAEEQATRQKAAFI 191
+L++I+ +I A+ + Q + I + +T + + K
Sbjct: 269 --RVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLE--LAKNEER 324

Query: 192 RQLKE 196
+Q
Sbjct: 325 QQASV 329


30  SF1970  SF1995  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF1970    -1      12       -3.624710   cytoplasmic alpha-amylase
SF1971    1       22       -4.881812   lipoprotein
SF1972    2       27       -6.180130   hypothetical protein
SF1973    4       36       -8.505129   hypothetical protein
SF1977    1       24       -5.629539   porin
SF1978    2       23       -2.904975   regulator
SF1979    0       17       2.044716    kinase inhibitor
SF1980    1       18       3.276518    multidrug efflux protein
SF1982    1       18       3.905880    flagellar hook-basal body protein FliE
SF1984    1       18       3.688263    flagellar motor switch protein G
SF1985    -1      17       3.048522    flagellar assembly protein H
SF1986    -2      18       2.801276    flagellum-specific ATP synthase
SF1987    -2      16       1.719021    flagellar biosynthesis chaperone FliJ
SF1988    -2      17       1.793737    flagellar hook-length control protein
SF1989    -2      22       1.325900    flagellar basal body protein FliL
SF1990    1       19       -0.034524   flagellar motor switch protein FliM
SF1991    2       19       -3.218047   flagellar motor switch protein FliN
SF1992    1       21       -4.122128   flagellar protein FliO
SF1993    1       21       -4.994159   flagellar biosynthesis protein FliP
SF1994    0       17       -3.598723   flagellar biosynthesis protein FliQ
SF1995    -1      18       -3.624028   flagellar biosynthesis protein FliR
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1972  RTXTOXIND  30  0.017  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.017
Identities = 10/57 (17%), Positives = 17/57 (29%), Gaps = 2/57 (3%)

Query: 164 RFTLLPIFRIPVKMQKVSAASPLTQKPDQARRRF--RLGMLVFFGMLGWALLTAMNQ 218
R L R + + + A L + P R R M ++L +
Sbjct: 26 RKQLDTPVREKDENEFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEI 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1973  PF01206  93  6e-29  SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 92.5 bits (230), Expect = 6e-29
Identities = 16/71 (22%), Positives = 37/71 (52%)

Query: 7 DYRLDMVGEPCPYPAVATLEAMPQLKKGEILEVVSDCPQSINNIPLDARNHGYTVLDIQQ 66
D LD G CP P + + + + GE+L V++ P S+ + ++ G+ +L+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 67 DGPTIRYLIQK 77
+ T + +++
Sbjct: 65 EDGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1977  ECOLIPORIN  509  0.0  E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 509 bits (1312), Expect = 0.0
Identities = 239/388 (61%), Positives = 282/388 (72%), Gaps = 33/388 (8%)

Query: 1 MKKLTVAISAVAASVLMAMSAQAAEIYNKDSNKLDLYGKVNAKHYFSSNDADDGDTTYVR 60
MK+ +A+ V ++L A +A AAEIYNKD NKLDLYGKV+ HYFS + + DGD TY+R
Sbjct: 1 MKRKVLAL--VIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMR 58

Query: 61 LGFKGETQINDQLTGFGQWEYEFKGNRAESQGSSKDKTRLAFAGLKFGDYGSIDYGRNYG 120
+GFKGETQINDQLTG+GQWEY + N E +G++ TRLAFAGLKFGDYGS DYGRNYG
Sbjct: 59 VGFKGETQINDQLTGYGQWEYNVQANTTEGEGANS-WTRLAFAGLKFGDYGSFDYGRNYG 117

Query: 121 VAYDIGAWTDVLPEFGGDTWTQTDVFMTGRTTGVATYRNNDFFGLVDGLNFAAQYQGKND 180
V YD+ WTD+LPEFGGD++T D +MTGR GVATYRN DFFGLVDGLNFA QYQGKN+
Sbjct: 118 VLYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNE 177

Query: 181 R----------------TDVTEANGDGFGFSTTYEY-EGFGVGATYAKSDRTNDQVIYGN 223
D+ NGDGFG STTY+ GF GA Y SDRTN+QV G
Sbjct: 178 SQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGG 237

Query: 224 NSLNASGQNAEVWAAGLKYDANNIYLATTYSETQNMTVFG------NNHIANKAQNFEVV 277
A G A+ W AGLKYDANNIYLAT YSET+NMT +G + +ANK QNFEV
Sbjct: 238 T--IAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVT 295

Query: 278 AQYQFDFGLRPSVAYLQSKGKDLG----AWGDQDLIEYIDVGATYYFNKNMSTFVDYKIN 333
AQYQFDFGLRP+V++L SKGKDL D+DL++Y DVGATYYFNKN ST+VDYKIN
Sbjct: 296 AQYQFDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKIN 355

Query: 334 LIDKSD-FTKASGVATDDIVAVGLVYQF 360
L+D D F K +G++TDDIVA+G+VYQF
Sbjct: 356 LLDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1978  HTHFIS  29  0.017  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.017
Identities = 8/30 (26%), Positives = 16/30 (53%)

Query: 176 RTKWTANKVARYLYISVSTLHRRLASEGIS 205
T+ K A L ++ +TL +++ G+S
Sbjct: 447 ATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1982  FLGHOOKFLIE  117  8e-38  Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 117 bits (293), Expect = 8e-38
Identities = 102/103 (99%), Positives = 102/103 (99%)

Query: 2 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTVARTQAEKFTL 61
SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQT ARTQAEKFTL
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 60

Query: 62 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 104
GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV
Sbjct: 61 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1984  FLGMOTORFLIG  338  e-118  Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 338 bits (868), Expect = e-118
Identities = 117/329 (35%), Positives = 197/329 (59%), Gaps = 2/329 (0%)

Query: 1 MSNLTGTDKSVILLMTIGEDRAAEVFKHLSQREVQTLSAAMANVTQISNKQLTDVLAEFE 60
+S LTG K+ ILL++IG + +++VFK+LSQ E+++L+ +A + I+++ +VL EF+
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 61 QEAEQFAALNINANDYLRSVLVKALGEERAASLLEDILETRDTASGIETLNFMEPQSAAD 120
+ + DY R +L K+LG ++A ++ + L + + E + +P + +
Sbjct: 72 ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPANILN 130

Query: 121 LIRDEHPQIIATILVHLRRAQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLN 180
I+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 131 FIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLE 190

Query: 181 GLLDGQ-NLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENL 239
L + + GGV EIIN+ + E+ +I ++ E D ELA++I +MF+FE++
Sbjct: 191 KKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDI 250

Query: 240 VDVDDRSIQRLLQEVDSESLLIALKGAEQPLREKFLRNMSQRAADILRDDLAKRGPVRLS 299
V +DDRSIQR+L+E+D + L ALK + P++EK +NMS+RAA +L++D+ GP R
Sbjct: 251 VLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRK 310

Query: 300 QVENEQKAILLIVRRLAETGEMVIGSGED 328
VE Q+ I+ ++R+L E GE+VI G +
Sbjct: 311 DVEESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1985  FLGFLIH  373  e-135  Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 373 bits (958), Expect = e-135
Identities = 223/228 (97%), Positives = 226/228 (99%)

Query: 1 MSDNLPWKTWTPDDLAPPPAEFVPMVESEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60
MSDNLPWKTWTPDDLAPP AEFVP+VE EETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI
Sbjct: 1 MSDNLPWKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60

Query: 61 AEGRQQGHEQGYQEGLAQGLEQGLAEAKAQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120
AEGRQQGH+QGYQEGLAQGLEQGLAEAK+QQAPIHARMQQLVSEFQTTLDALDSVIASRL
Sbjct: 61 AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120

Query: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180
MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT
Sbjct: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180

Query: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228
LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV
Sbjct: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1987  FLGFLIJ  152  5e-51  Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 152 bits (384), Expect = 5e-51
Identities = 112/113 (99%), Positives = 113/113 (100%)

Query: 2 AEEQLKMLIDYQNEYRNNLNSDMSAGMTSNRWINYQQFIQTLEKAITQHRQQLNQWTQKV 61
AEEQLKMLIDYQNEYRNNLNSDMSAG+TSNRWINYQQFIQTLEKAITQHRQQLNQWTQKV
Sbjct: 35 AEEQLKMLIDYQNEYRNNLNSDMSAGITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKV 94

Query: 62 DIALNSWREKKQRLQAWQTLQERQSTAALLAENRLDQKKMDEFAQRAAMRKPE 114
DIALNSWREKKQRLQAWQTLQERQSTAALLAENRLDQKKMDEFAQRAAMRKPE
Sbjct: 95 DIALNSWREKKQRLQAWQTLQERQSTAALLAENRLDQKKMDEFAQRAAMRKPE 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1988  FLGHOOKFLIK  468  e-168  Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 468 bits (1204), Expect = e-168
Identities = 364/375 (97%), Positives = 369/375 (98%)

Query: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLTLLSEALAGETTTDKAAPQLLVATDKPTTK 60
MIRLAPLITADVDTTTLPGGKASDAAQDFL LLSEALAGETTTDKAAPQLLVATDKPTTK
Sbjct: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60

Query: 61 GEPLVSDILADAQQADLLIPVDETLPVINDEQSTSTPLTTAQTMTLAAVADKNTTKDEKA 120
GEPL+SDI++DAQQA+LLIPVDET PVINDEQSTSTPLTTAQTM LAAVADKNTTKDEKA
Sbjct: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120

Query: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPAEKPTLFTKLTSAQLTTAQPDDAP 180
DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLP EKPTLFTKLTS QLTTAQPDDAP
Sbjct: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP 180

Query: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTADASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240
GTPAQPLTPLVAEAQSKAEVISTPSPVTA ASPLITPHQTQPLPTVAAPVLSAPLGSHEW
Sbjct: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240

Query: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMISPHQHVRAALEAA 300
QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQM+SPHQHVRAALEAA
Sbjct: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300

Query: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360
LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS
Sbjct: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360

Query: 361 LQGRVTGNSGVDIFA 375
LQGRVTGNSGVDIFA
Sbjct: 361 LQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1990  FLGMOTORFLIM  380  e-134  Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 380 bits (977), Expect = e-134
Identities = 86/324 (26%), Positives = 148/324 (45%), Gaps = 10/324 (3%)

Query: 5 ILSQAEIDALLNGDS--EVKDEPTASISGESDIRPYDPNTQRRVVRERLQALEIINERFA 62
+LSQ EID LL S + E IS I YD + +E+++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 63 RHFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSPSL 122
R L LR + V ++ Y EF R++P P+ L +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 123 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 182
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 183 EMQVEFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 240
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 241 NEDQNWRDNLVRQVQHSQLELVANFADISLRLSQILKLKPGDVLPIEKP---DRIIAHVD 297
+ + L ++ +++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 298 GVPVLTSQYGTLNGQYALRIEHLI 321
Q G + + A +I I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1991  FLGMOTORFLIN  210  6e-74  Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 210 bits (537), Expect = 6e-74
Identities = 125/137 (91%), Positives = 133/137 (97%)

Query: 1 MSDMNNPADDNNGAMDDLWAEALSEQKSTSEKSAADAVFQQFGGGDVSGTLQDIDLIMDI 60
MSDMNNP+D+N GA+DDLWA+AL+EQK+T+ KSAADAVFQQ GGGDVSG +QDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1993  FLGBIOSNFLIP  281  4e-99  Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 281 bits (720), Expect = 4e-99
Identities = 203/204 (99%), Positives = 204/204 (100%)

Query: 1 MQTLVFITSLTFIPAILLMMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIM 60
+QTLVFITSLTFIPAILLMMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIM
Sbjct: 42 VQTLVFITSLTFIPAILLMMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIM 101

Query: 61 SPVIDKIYVDAYQPFSEEKISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQG 120
SPVIDKIYVDAYQPFSEEKISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQG
Sbjct: 102 SPVIDKIYVDAYQPFSEEKISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQG 161

Query: 121 PEAVPMRILLPAYVTSELKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPF 180
PEAVPMRILLPAYVTSELKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPF
Sbjct: 162 PEAVPMRILLPAYVTSELKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPF 221

Query: 181 KLMLFVLVDGWQLLVGSLAQSFYS 204
KLMLFVLVDGWQLLVGSLAQSFYS
Sbjct: 222 KLMLFVLVDGWQLLVGSLAQSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1994  TYPE3IMQPROT  67  1e-18  Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 67.1 bits (164), Expect = 1e-18
Identities = 22/78 (28%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALVTGLIISILQAATQINEMTLSFIPKIIAVFIAII 63
+ ++ G +A+ + L L+ +VA + GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1995  TYPE3IMRPROT  203  4e-67  Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 203 bits (517), Expect = 4e-67
Identities = 254/261 (97%), Positives = 257/261 (98%)

Query: 1 MMQETSDQWLSWLSLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60
M+Q TS+QWLSWL+LYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPGSHL 120
NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDP SHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGSEPLNSNAFLAPTKAGSLIF 180
NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIG EPLNSNAFLA TKAGSLIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240
LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEIFNLLADIISELPLI 261
EHLFSEIFNLLADIISELPLI
Sbjct: 241 EHLFSEIFNLLADIISELPLI 261


31  SF2007  SF2063  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF2007    3       24       -1.611953   hypothetical protein
SF2008    4       23       -1.351013   hypothetical protein
SF2009    1       20       -1.790631   outer membrane pore protein
SF2010    -1      26       -4.126481   IS2 transposase TnpB
SF2011    0       31       -6.344296   IS2 repressor TnpA
SF2012    -1      31       -7.235450   insertion element IS1 protein InsA
SF2013    -1      32       -7.285788   insertion element IS1 protein InsB
SF2014    0       34       -8.791901   chaperone protein HchA
SF2015    2       40       -9.478697   2-component sensor protein
SF2016    3       36       -8.589559   transcriptional regulatory protein YedW
SF2017    1       33       -8.255809   hydroxyisourate hydrolase
SF2019    2       30       -6.698791   sulfite oxidase subunit YedZ
SF2020    4       29       -5.964437   zinc/cadmium-binding protein
SF2020a   4       26       -3.965338   tail fiber assembly protein
SF2021    3       29       -5.322157   hypothetical protein
SF2023    2       22       -2.688437   insertion element iso-IS10R transposase
SF2024    2       24       -1.635646   hypothetical protein
SF2025    1       24       -1.906287   insertion sequence element IS600 protein
SF2026    1       24       -1.002947   insertion element IS1 protein InsB
SF2027    1       23       -1.143780   hypothetical protein
SF2028    0       21       -0.107015   tail protein
SF2029    1       21       0.233004    bacteriophage protein
SF2030    2       24       0.155608    bacteriophage protein
SF2031    2       24       0.066180    sheath protein
SF2033    3       26       -2.089535   insertion sequence element IS600 protein
SF2034    3       25       -2.200334   insertion sequence element IS600 transposase
SF2036    0       22       -0.908532   insertion sequence element ISSfl4 ORF2
SF2037    -1      23       -1.243479   insertion sequence element ISSfl4 transposase
SF2038    0       23       -0.781943   lysis protein S
SF2039    -1      25       -0.657976   ****Q antiterminator of prophage
SF2040    0       26       -0.499895   crossover junction endodeoxyribonuclease
SF2041    0       25       -1.201665   bacteriophage protein
SF2042    1       27       -4.362505   hypothetical protein
SF2042a   1       25       -2.928343   hypothetical protein
SF2042b   2       25       -3.911169   regulatory protein
SF2042c   3       26       -4.338679   membrane protein
SF2043    2       23       -2.921271   bacteriophage protein
SF2045    0       20       -2.664463   *DgsA anti-repressor MtfA
SF2048    -1      23       -3.446929   *insertion sequence element IS911 integrase core
SF2049    -1      26       -4.783382   insertion sequence element IS911 transposase
SF2051    -1      27       -4.460704   insertion sequence element IS600 transposase
SF2052    -1      27       -4.148185   insertion sequence element IS600 protein
SF2053    -1      30       -3.947337   AMP nucleosidase
SF2054    0       31       -3.597407   hypothetical protein
SF2056    1       30       -2.289589   **transcriptional regulator Cbl
SF2057    2       26       -1.598006   nitrogen assimilation transcriptional regulator
SF2058    1       25       -1.415448   *L,D-transpeptidase
SF2059    1       22       -0.672514   nicotinate-nucleotide--dimethylbenzimidazole
SF2060    1       18       0.009174    cobalamin synthase
SF2061    2       22       -0.898465   adenosylcobinamide kinase
SF2062    2       23       -1.189949   insertion sequence element IS600 protein
SF2063    2       24       -0.560488   insertion sequence element IS600 transposase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2008  HTHFIS  27  0.013  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 26.7 bits (59), Expect = 0.013
Identities = 7/45 (15%), Positives = 16/45 (35%), Gaps = 1/45 (2%)

Query: 6 KRYPEEFKTEAVKQVVDR-GYSVASVATRLDITTHSLYAWIKKYG 49
R E + + + + A L + ++L I++ G
Sbjct: 430 DRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELG 474


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2009  ECOLIPORIN  288  1e-98  E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 288 bits (739), Expect = 1e-98
Identities = 137/269 (50%), Positives = 165/269 (61%), Gaps = 31/269 (11%)

Query: 31 DTSYARVGVKGETQINPEMTGYGQFELDLEASNRHNPDQ---TRLAYAGLSYKDFGSFDY 87
D +Y RVG KGETQIN ++TGYGQ+E +++A+ TRLA+AGL + D+GSFDY
Sbjct: 53 DQTYMRVGFKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDY 112

Query: 88 SRNVGVAYDAEAFTDMFVEWGGDSWAGTDLFMTNRTNGVATYRNTDFFGMVEGLNFALQY 147
RN GV YD E +TDM E+GGDS+ D +MT R NGVATYRNTDFFG+V+GLNFALQY
Sbjct: 113 GRNYGVLYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQY 172

Query: 148 QGKNEGTGNY----------------KANGDGHGLSATYTID-GFSFAGAYANSDRTDWQ 190
QGKNE NGDG G+S TY I GFS AY SDRT+ Q
Sbjct: 173 QGKNESQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQ 232

Query: 191 SGDGK----GERAEVWALSTKYDANNVYAAVMYGESHNM-------NSDDGDVVNKTQNF 239
G G++A+ W KYDANN+Y A MY E+ NM DG V NKTQNF
Sbjct: 233 VNAGGTIAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNF 292

Query: 240 EAVLQYQFDFGLRPSIGYSYSKALDVAGQ 268
E QYQFDFGLRP++ + SK D+
Sbjct: 293 EVTAQYQFDFGLRPAVSFLMSKGKDLTYN 321


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2015  PF06580  35  4e-04  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 35.2 bits (81), Expect = 4e-04
Identities = 38/195 (19%), Positives = 73/195 (37%), Gaps = 34/195 (17%)

Query: 261 TLSQIRSIAEYQKTIAGN-IEELENISRLTENILFLARADKNNVLVKLDSLSLNKEVENL 319
L+ IR++ T A + L + R + L ++ V SL E+ +
Sbjct: 178 ALNNIRALILEDPTKAREMLTSLSELMRYS-----LRYSNARQV-------SLADELTVV 225

Query: 320 LDYL--EYLSDEKEICFKVKCNQQIFADKI---LLQRMLSNLIVNAIRYSPEKSRIHITS 374
YL + E + F+ + N I ++ L+Q ++ N I + I P+ +I +
Sbjct: 226 DSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKG 285

Query: 375 FLDANGSLNIDIASPGTKINEPEKLFRRFWRGDNSRHSVGQGLGLSLVKA-IAELHGGSA 433
D NG++ +++ + G+ + K G GL V+ + L+G A
Sbjct: 286 TKD-NGTVTLEVENTGSLALKNTKE--------------STGTGLQNVRERLQMLYGTEA 330

Query: 434 TYHYLSKHNVFRITL 448
K +
Sbjct: 331 QIKLSEKQGKVNAMV 345


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2016  HTHFIS  83  1e-20  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.6 bits (204), Expect = 1e-20
Identities = 30/117 (25%), Positives = 60/117 (51%), Gaps = 1/117 (0%)

Query: 2 KILLIEDNQRTQEWVTQGLSEAGYVIDAVSDGRDGLYLALKDDYALIILDIMLPGMDGWQ 61
IL+ +D+ + + Q LS AGY + S+ D L++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 ILQTLRTA-KQTPVICLTARDSVDDRVRGLDSGANDYLVKPFSFSELLARVRAQLRQ 117
+L ++ A PV+ ++A+++ ++ + GA DYL KPF +EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2021  LUXSPROTEIN  31  0.002  Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 31.4 bits (71), Expect = 0.002
Identities = 18/66 (27%), Positives = 30/66 (45%), Gaps = 7/66 (10%)

Query: 41 TKEHLLPHFL-EHLGNNHLDI------GVGTGFYLTHVPESSLISLMDLNEASLNAASTR 93
T EHL F+ HL + ++I G TGFY++ + S + D A++
Sbjct: 54 TLEHLYAGFMRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKV 113

Query: 94 AGESKI 99
++KI
Sbjct: 114 ENQNKI 119


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2038  FLAGELLIN  25  0.029  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 25.0 bits (54), Expect = 0.029
Identities = 14/77 (18%), Positives = 31/77 (40%), Gaps = 9/77 (11%)

Query: 2 KSMDKISTGIAYGTSAGSAGYWFL--------QWLDQVSPSQWAAIGVLGSLVLGFLTYL 53
+++++S+G+ ++ A + + L Q S + I + G L +
Sbjct: 26 SAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNANDGISIA-QTTEGALNEI 84

Query: 54 TNLYFKIREDKRKAARG 70
N ++RE +A G
Sbjct: 85 NNNLQRVRELSVQATNG 101


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2042b  HOKGEFTOXIC  64  5e-18  Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 64.1 bits (156), Expect = 5e-18
Identities = 19/46 (41%), Positives = 32/46 (69%)

Query: 23 QKAMLIALIVICLTVIVTALVTRKDLCEVRIRTGQTEVAVFTAYEP 68
+ +++ ++++CLT+++ +TRK LCE+R R G EVA F AYE
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50


32  SF2081  SF2111  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF2081    -1      23       3.430607    ATP phosphoribosyltransferase
SF2082    0       22       3.091666    bifunctional histidinal dehydrogenase/
SF2083    0       24       1.997097    histidinol-phosphate aminotransferase
SF2084    -1      18       0.034234    imidazole glycerol-phosphate
SF2085    -1      16       -0.596653   imidazole glycerol phosphate synthase subunit
SF2086    -3      14       -3.046649   1-(5-phosphoribosyl)-5-[(5-
SF2087    -1      18       -5.937300   imidazole glycerol phosphate synthase subunit
SF2088    1       26       -8.257433   bifunctional phosphoribosyl-AMP
SF2089    4       39       -12.964859  chain length determinant protein WzzB
SF2091    5       45       -15.827340  6-phosphogluconate dehydrogenase
SF2092    6       63       -21.130714  hypothetical protein
SF2093    8       63       -20.541077  hypothetical protein
SF2096    7       61       -20.167810  glycosyl translocase
SF2097    5       56       -18.173384  O-antigen polymerase
SF2098    4       49       -14.182241  dTDP-rhamnosyl transferase
SF2099    2       44       -11.784371  dTDP-rhamnosyl transferase
SF2100    1       34       -8.731633   polysaccharide biosynthesis protein
SF2101    -1      19       -5.621468   dTDP-6-deoxy-L-mannose-dehydrogenase
SF2102    -1      13       -3.379982   glucose-1-phosphate thymidylyltransferase
SF2103    -2      13       -1.690183   dTDP-4-dehydrorhamnose reductase
SF2104    -2      12       -0.824725   dTDP-glucose 4,6 dehydratase
SF2105    -2      17       0.673585    GalU regulator GalF
SF2106    -1      20       1.198168    colanic acid biosynthesis protein
SF2107    -1      22       2.452489    colanic acid biosynthesis glycosyltransferase
SF2108    0       22       2.603536    colanic acid biosynthesis protein
SF2109    -1      21       2.717723    colanic acid exporter
SF2110    -1      22       3.039106    UDP-glucose lipid carrier transferase
SF2111    -1      22       3.134169    phosphomannomutase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2103  NUCEPIMERASE  47  5e-08  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 46.7 bits (111), Expect = 5e-08
Identities = 33/175 (18%), Positives = 64/175 (36%), Gaps = 35/175 (20%)

Query: 1 MNILLFGKTGQVGWELQRALAPLGN-LIALDVHSTDY--------------------CGD 39
M L+ G G +G+ + + L G+ ++ +D + Y D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 40 FSNPEGVAETVKKIRPDVIVNAAAHTAVDKAESEP------NFAQLLNATCVEAIAKAAN 93
++ EG+ + + + + AV + P N LN + +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLN---ILEGCRHNK 117

Query: 94 EVGAWVIHYSTDYVFPGNGDTPWLETDATA-PLNVYGETKLAGEKALQEHCAKHL 147
+ +++ S+ V+ N P+ D+ P+++Y TK A E L H HL
Sbjct: 118 -IQH-LLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANE--LMAHTYSHL 168


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2104  NUCEPIMERASE  184  1e-57  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 184 bits (469), Expect = 1e-57
Identities = 89/360 (24%), Positives = 149/360 (41%), Gaps = 48/360 (13%)

Query: 1 MKILVTGGAGFIGSAVVRHIINNTQDSVVNVDKLT--YAGNL-ESLADVSDSERYAFEHA 57
MK LVTG AGFIG V + ++ VV +D L Y +L ++ ++ + F
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKI 59

Query: 58 DICDAVAMSRIFAQHQPDAVMHLAAESHVDRSITGPAAFIETNIVGTYVLLEAARNYWSA 117
D+ D M+ +FA + V V S+ P A+ ++N+ G +LE R+
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHN--- 116

Query: 118 LNDEKKKSFRFHHISTDEVYGDLPHPDEANNNEALPLFTETTAYAPSSPYSASKASSDHL 177
K + S+ VYG N +P T+ + P S Y+A+K +++ +
Sbjct: 117 ------KIQHLLYASSSSVYGL---------NRKMPFSTDDSVDHPVSLYAATKKANELM 161

Query: 178 VRAWKRTYGLPTIVTNCSNNYGPYHFPEKLIPLVILNALEGKALPIYGKGDQIRDWLYVE 237
+ YGLP YGP+ P+ + LEGK++ +Y G RD+ Y++
Sbjct: 162 AHTYSHLYGLPATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYID 221

Query: 238 D-------------HARALYTVVTEGKA-----GETYNIGGHNEKKNIDVVLTICDLLDE 279
D HA +TV T A YNIG + + +D + + D L
Sbjct: 222 DIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALG- 280

Query: 280 IVPKEKSYREQITYVADRPGHDRRYAIDADKISRELGWKPQETFESGIRKTVEWYLANTN 339
+ +K+ +PG + D + +G+ P+ T + G++ V WY
Sbjct: 281 -IEAKKNMLPL------QPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYK 333


33  SF2132  SF2211  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF2132    -2      19       3.134225    insertion sequence element ISSfl2 transposase
SF2133    -3      13       3.482220    3-methyladenine DNA glycosylase
SF2136    -3      13       3.425483    hypothetical protein
SF2137    -3      15       3.665600    hypothetical protein
SF2138    -2      15       3.822796    insertion sequence element ISSfl2 transposase
SF2139    -1      18       3.158113    acriflavin resistance protein AcrA-like protein
SF2141    -1      18       3.200526    multidrug efflux system subunit MdtC
SF2142    -2      12       1.025281    transporter
SF2143    -1      9        -0.265849   signal transduction histidine-protein kinase
SF2144    1       12       -2.084952   DNA-binding transcriptional regulator BaeR
SF2145    1       14       -3.010950   hypothetical protein
SF2146    0       13       -2.655981   protease
SF2148    0       23       -4.115627   hypothetical protein
SF2149    3       18       -2.354875   lipid kinase
SF2150    2       20       -2.781502   galactitol utilization operon repressor
SF2151    1       21       -3.097530   galactitol-1-phosphate dehydrogenase
SF2152    2       18       -3.012645   insertion element IS1 protein InsB
SF2153    3       17       -2.519911   insertion element IS1 protein InsA
SF2154    2       16       -2.301830   PTS system galactitol-specific transporter
SF2155    1       17       -1.484356   PTS system galactitol-specific transporter
SF2156    -1      18       -2.240232   PTS system galactitol-specific transporter
SF2157    -1      17       -1.926369   D-tagatose-1,6-bisphosphate aldolase subunit
SF2158    -2      19       -0.903332   tagatose-bisphosphate aldolase
SF2159    -1      21       -0.766298   fructose-bisphosphate aldolase
SF2161    -2      21       -0.978626   kinase
SF2162    -1      22       -2.231860   insertion element iso-IS10R transposase
SF2164    1       24       -3.205350   hypothetical protein
SF2165    2       26       -4.526073   bifunctional hydroxy-methylpyrimidine kinase/
SF2166    2       26       -4.967833   hydroxyethylthiazole kinase
SF2167    2       27       -6.786696   transcriptional repressor RcnR
SF2169    2       27       -6.403797   hypothetical protein
SF2170    2       25       -6.269138   type-1 fimbrial protein
SF2171    0       23       -5.009225   fimbrial outer membrane usher protein
SF2173    0       15       -1.247653   insertion element IS1 protein InsB
SF2175    1       16       -0.785283   fimbrial-like protein
SF2176    1       16       0.162528    insertion element IS1 protein InsB
SF2177    1       16       -0.038113   insertion element IS1 protein InsA
SF2177a   1       16       -0.038113   hypothetical protein
SF2178    2       18       0.463203    antiporter inner membrane protein
SF2179    2       19       -1.370039   methionyl-tRNA synthetase
SF2181    0       22       -2.022327   insertion sequence element IS600 transposase
SF2182    1       17       0.504753    insertion sequence element IS600 protein
SF2183    1       18       0.947273    insertion element IS1 protein InsA
SF2184    2       20       1.339450    insertion element IS1 protein InsB
SF2187    1       19       1.317855    hypothetical protein
SF2188    2       18       1.467008    hypothetical protein
SF2189    2       17       1.524331    hypothetical protein
SF2190    2       17       0.572057    hypothetical protein
SF2191    3       16       -0.247133   hypothetical protein
SF2192    0       16       -2.907837   insertion element IS1 protein InsB
SF2193    -1      14       -2.579515   hypothetical protein
SF2194    -2      15       -2.114960   insertion element IS1 protein InsB
SF2195    -2      17       -2.396971   insertion element IS1 protein InsA
SF2196    -2      18       -4.151903   hypothetical protein
SF2197    -2      22       -4.766125   two-component response-regulatory protein YehT
SF2198    -1      21       -4.217318   2-component sensor protein
SF2200    3       27       -3.663378   DNA-damage-inducible protein
SF2200a   3       25       -3.258278   tail fiber assembly protein
SF2201    2       25       -3.054023   hypothetical protein
SF2203    2       22       -1.335970   insertion element iso-IS10R transposase
SF2204    3       26       1.094757    bacteriophage protein
SF2205    1       24       1.553821    tail fiber protein
SF2206    -1      18       2.266479    tail fiber assembly protein
SF2208    1       19       3.705355    insertion element IS1 protein InsA
SF2209    2       18       3.667758    insertion element IS1 protein InsB
SF2211    -1      16       3.212693    insertion sequence element IS629 protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2139  RTXTOXIND  44  5e-07  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.0 bits (104), Expect = 5e-07
Identities = 33/167 (19%), Positives = 64/167 (38%), Gaps = 11/167 (6%)

Query: 61 ALAQTQGQLAKDKATLANARRDLARYQQLAKTNLVSRQELDAQQALVSETEGTIKADEAS 120
+ +L K+ L ++ AK +L + L + T +
Sbjct: 260 KYVEAVNELRVYKSQLEQIESEILS----AKEEYQLVTQLFKNEILDKLRQTTDNIGLLT 315

Query: 121 --VASAQLQLDWSRITAPVDGRV-GLKQVDVGNQISSGDTTGIVVITQTHPIDLVFTLPE 177
+A + + S I APV +V LK G +++ +T +V++ + +++ +
Sbjct: 316 LELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL-MVIVPEDDTLEVTALVQN 374

Query: 178 SDIATVVQAQKAGKPLMVEAWDRTNSKKL-SEGTLLSLDNQIDATTG 223
DI + Q A + VEA+ T L + ++LD D G
Sbjct: 375 KDIGFINVGQNA--IIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLG 419



Score = 43.7 bits (103), Expect = 8e-07
Identities = 21/122 (17%), Positives = 47/122 (38%), Gaps = 13/122 (10%)

Query: 15 GTITAA-NTVTVRSRVDGQLMALHFQEGQQVKAGDLLAEIDPSQFKVALAQTQGQLAKDK 73
G +T + + ++ + + + +EG+ V+ GD+L ++ + K +
Sbjct: 88 GKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTAL-------GAEADTLKTQ 140

Query: 74 ATLANARRDLARYQQLAKTNLVSRQELDAQQALVSETEGTIKADEASVASAQLQLDWSRI 133
++L AR + RYQ L+++ EL+ L E + L +
Sbjct: 141 SSLLQARLEQTRYQILSRS-----IELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQF 195

Query: 134 TA 135
+
Sbjct: 196 ST 197


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2141  ACRIFLAVINRP  905  0.0  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 905 bits (2340), Expect = 0.0
Identities = 286/1035 (27%), Positives = 502/1035 (48%), Gaps = 40/1035 (3%)

Query: 6 LFIYRPVATILLSVAITLCGILGFRMLPVAPLPQVDFPVIMVSASLPGASPETMASSVAT 65
FI RP+ +L++ + + G L LPVA P + P + VSA+ PGA +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERSLGRIAGVSEMTSSS-SLGSTRIILQFDFDRDINGAARDVQAAINAAQSLLPSGMP 124
+E+++ I + M+S+S S GS I L F D + A VQ + A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKANPSDAPIMILTLTSDT--YSQGELYDFASTQLAPTISQIDGVGDVDVGGSSL 182
+ S + +M+ SD +Q ++ D+ ++ + T+S+++GVGDV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVGLTPQALFNQGVSLDDVRTAISNANVRKPQG------ALEDGTHRWQIQTNDELK 236
A+R+ L L ++ DV + N + G AL I K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAAEYQPLIIHYN-NGGAVRLGDVATVTDSVQDVRNAGMTNAKPAILLMIRKLPEANIIQ 295
E+ + + N +G VRL DVA V ++ N KPA L I+ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDSIRAKLPELQETIPAAIDLQIAQDRSPTIRASLEEVEQTLIISVALVILVVFLFLRS 355
T +I+AKL ELQ P + + D +P ++ S+ EV +TL ++ LV LV++LFL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATIIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHL- 414
RAT+IP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGMKPLQAALQGTREVGFTVLSMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIGIS 474
E + P +A + ++ ++ +++ L AVF+P+ GG G + R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LLVSLTLTPMMCGWMLKASKPREQKRLRGFG----RMLVALQQGYGKSLKWVLNHTRLVG 530
+LV+L LTP +C +LK + GF Y S+ +L T
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 VVLLGTIALNI----SIPKTFFPEQDTGVLMGGIQADQSISFQ----AMRGKLQDFMKII 582
++ +A + +P +F PE+D GV + IQ + + + ++K
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 583 RD-DPAVDNVTGFT-GGSRVSSGMMFITLKPRDERS---ETAQQIIDRLRVKLAKEPGAN 637
+ +V V GF+ G ++GM F++LKP +ER+ +A+ +I R +++L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 638 LFLMAVQDIRVGGRQSNASYQYTLLSDDLAALREWEPKIRKKLATL-----PELADVNSD 692
+ + I G + ++ L D + + R +L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 693 QQDNGAEMNLVYDRDTMARLGIDVQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRY 752
++ A+ L D++ LG+ + N ++ A G ++ K+ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 753 TQDISALEKMFVINNEGKAIPLSYFAKWQPANAPLSVNHQGLSAASTISFNLPTGKSLSD 812
++K++V + G+ +P S F + + I G S D
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 813 ASAAIDRAMTQLGVPSTVRGSFAGTAQVFQETMNSQVILIIAAIATVYIVLGILYESYVH 872
A A ++ ++L P+ + + G + + + N L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 873 PLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRHGN 932
P++++ +P VG LLA LFN + ++G++ IG+ KNAI++V+FA +
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 933 LTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLEITIVGGLVMSQLL 992
EA A +R RPI+MT+LA + G LPL +S G GS + + I ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 993 TLYTTPVVYLFFDRL 1007
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031



Score = 78.7 bits (194), Expect = 7e-17
Identities = 77/446 (17%), Positives = 162/446 (36%), Gaps = 26/446 (5%)

Query: 588 VDNVTGFTGGS-RVSSGMMFITLKPRDERSETAQQIIDRLRVKLAKEPGANLFLMAVQDI 646
+DN+ + S S + +T + + Q+ ++L++ P + Q I
Sbjct: 72 IDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE----VQQQGI 127

Query: 647 RVGGRQSNASYQYTLLSDDLAALREW-----EPKIRKKLATLPELADVNSDQQDNGAE-- 699
V S+ +SD+ ++ ++ L+ L + DV GA+
Sbjct: 128 SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQL----FGAQYA 183

Query: 700 MNLVYDRDTMARLGID----VQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRYTQD 755
M + D D + + + + + + T P Q + R+
Sbjct: 184 MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNP 243

Query: 756 ISALEKMFVINNEGKAIPLSYFAK--WQPANAPLSVNHQGLSAASTISFNLPTGKSLSDA 813
+ +N++G + L A+ N + G AA +L D
Sbjct: 244 EEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL-DT 302

Query: 814 SAAIDRAMTQL--GVPSTVRGSFA-GTAQVFQETMNSQVILIIAAIATVYIVLGILYESY 870
+ AI + +L P ++ + T Q +++ V + AI V++V+ + ++
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 871 VHPLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRH 930
L +P +G L F + + + G++L IG++ +AI++V+
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 931 GNLTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLEITIVGGLVMSQ 990
L P+EA ++ ++ + +P+ GG + + ITIV + +S
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 991 LLTLYTTPVVYLFFDRLRLRFSRKPK 1016
L+ L TP + + + K
Sbjct: 483 LVALILTPALCATLLKPVSAEHHENK 508


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2142  TCRTETB  125  2e-33  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 125 bits (315), Expect = 2e-33
Identities = 97/429 (22%), Positives = 188/429 (43%), Gaps = 23/429 (5%)

Query: 20 FMQSLDTTIVNTALPSMAQSLGESPLHMHMVIVSYVLTVAVMLPASGWLADKVGVRNIFF 79
F L+ ++N +LP +A + P + V +++LT ++ G L+D++G++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 TAIVLFTLGSLFCALSGTLNELL-LARALQGVGGAMMVPVGRLTVMKIVPREQYMAAMTF 138
I++ GS+ + + LL +AR +QG G A + + V + +P+E A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQVGPLLGPALGGLLVEYASWHWIFLINIPVGIIGAIATLM-LMPNYTMQTRRFDL 197
+ +G +GPA+GG++ Y HW +L+ IP+ I + LM L+ FD+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 SGFLLLAVGMAVLTLALDGSKGTGLSPLAIAGLVAVGVVALVLYLLHARNNNRALFSLKL 257
G +L++VG+ L + + V V++ ++++ H R L
Sbjct: 202 KGIILMSVGIVFFMLF---------TTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FRTRTFSLGLAGSFAGRIGSGMLPFMTPVFLQIGLGFSPFHAG-LMMIPMVLGSMGMKRI 316
+ F +G+ M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 VVQVVNRFGYRRVLVATTLGLSLVTLLFMTTALL----GWYYVLPFVLFLQGMVNSTRFS 372
+V+R G VL +G++ +++ F+T + L W+ + V L G+ S +
Sbjct: 313 GGILVDRRGPLYVL---NIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL--SFTKT 367

Query: 373 SMNTLTLKDLPDNLASSGNSLLSMIMQLSMSIGVTIAGLLLGLFGSQHISVDSGTTQTVF 432
++T+ L A +G SLL+ LS G+ I G LL + + Q+ +
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTY 427

Query: 433 MYTWLSMAF 441
+Y+ L + F
Sbjct: 428 LYSNLLLLF 436


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2143  BCTERIALGSPF  31  0.009  Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F

signature.
Length = 408

Score = 31.3 bits (71), Expect = 0.009
Identities = 27/93 (29%), Positives = 34/93 (36%), Gaps = 27/93 (29%)

Query: 173 LATLLAALATFLLA-------------RGLLAPVKRLVDGTHKLAAGDFTTRVTPTSEDE 219
LATL+AA A L+A V+ V H LA + P S +
Sbjct: 77 LATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD---AMKCFPGSFER 133

Query: 220 L-----------GKLAQDFNQLASTLEKNQQMR 241
L G L N+LA E+ QQMR
Sbjct: 134 LYCAMVAAGETSGHLDAVLNRLADYTEQRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2144  HTHFIS  76  6e-18  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.6 bits (186), Expect = 6e-18
Identities = 28/136 (20%), Positives = 65/136 (47%), Gaps = 1/136 (0%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLISHGDQVLAYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + S+ + ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDVPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVARVKTILRRCK 129
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L K
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 130 PQRELQQQDAESPLII 145
+ + D++ + +
Sbjct: 124 RRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2148  LIPOLPP20  27  0.026  LPP20 lipoprotein precursor signature.
		>LIPOLPP20#LPP20 lipoprotein precursor signature.

Length = 175

Score = 26.6 bits (58), Expect = 0.026
Identities = 13/38 (34%), Positives = 24/38 (63%), Gaps = 1/38 (2%)

Query: 18 EGEMKKIAAISLISIFLISGCAVHNDETSIGKFGLAYK 55
+ ++KKI +S+++ +I GC+ H ++ I K AYK
Sbjct: 2 KNQVKKILGMSVVAAMVIVGCS-HAPKSGISKSNKAYK 38


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2151  DHBDHDRGNASE  34  7e-04  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 33.9 bits (77), Expect = 7e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 2/92 (2%)

Query: 156 AQGCENKNVIIIGAGT-IGLLAIQCAVALGAKSVTAIDISSEKLALAKSFGAMQTFNSSE 214
A+G E K I GA IG + + GA + A+D + EKL S + ++
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAH-IAAVDYNPEKLEKVVSSLKAEARHAEA 61

Query: 215 MSAPQMQSVLRELRFNQLILETAGVPQTVELA 246
A S + ++ E + V +A
Sbjct: 62 FPADVRDSAAIDEITARIEREMGPIDILVNVA 93


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2169  TYPE3OMGPROT  26  0.029  Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein

family signature.
Length = 607

Score = 26.4 bits (58), Expect = 0.029
Identities = 13/42 (30%), Positives = 21/42 (50%), Gaps = 1/42 (2%)

Query: 6 KMLLGVLLLVTSAAWAAPATAGSTNTSGISKYE-LSSFIADF 46
++L G LLL++S +WA ++K E L + DF
Sbjct: 11 RVLTGTLLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDF 52


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2170  BINARYTOXINB  28  0.043  Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 28.1 bits (62), Expect = 0.043
Identities = 17/79 (21%), Positives = 34/79 (43%), Gaps = 8/79 (10%)

Query: 93 NITLSNNQ---SSFTSGYSVTVTPAASNAKVNISAGGGGSVMINGVATLSSA-----SSS 144
NI LS N+ + T + T++ S ++ + S G + + + + S+S
Sbjct: 297 NIILSKNEDQSTQNTDSQTRTISKNTSTSRTHTSEVHGNAEVHASFFDIGGSVSAGFSNS 356

Query: 145 TRGSAAVQFLLCLLGGKSW 163
+ A+ L L G ++W
Sbjct: 357 NSSTVAIDHSLSLAGERTW 375


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2171  PF00577  713  0.0  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 713 bits (1843), Expect = 0.0
Identities = 239/843 (28%), Positives = 389/843 (46%), Gaps = 35/843 (4%)

Query: 2 LRMTPLASAI---VALLLGIEAYAAEETFDTHFMIGGMKDQQVANIRL--DDNQPLPGQY 56
R+ + A +AE F+ F+ Q VA++ + + PG Y
Sbjct: 21 HRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADD--PQAVADLSRFENGQELPPGTY 78

Query: 57 DIDIYVNKQWRGKYEIIVKDNPQET----CLSREVIKRLGIN-----SDNFASGKQCLTF 107
+DIY+N + ++ E CL+R + +G+N N + C+
Sbjct: 79 RVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPL 138

Query: 108 EQLVQGGSYSWDIGVFRLDFSVPQAWVEELESGYVPPENWERGINAFYTSYYVSQYYSDY 167
++ + D+G RL+ ++PQA++ GY+PPE W+ GINA +Y S
Sbjct: 139 TSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQN 198

Query: 168 KASGNNKSTYVRFNSGLNLLEWQLHSDASFSKTNNNPGV-----WKSNTLYLERGFAQFL 222
+ GN+ Y+ SGLN+ W+L + ++S +++ W+ +LER
Sbjct: 199 RIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLR 258

Query: 223 GTLRVGDMYTSSDIFDSVRFSGVRLFRDMQMLPNSKQNFTPRVQGIAQSNALVTIEQNGF 282
L +GD YT DIFD + F G +L D MLP+S++ F P + GIA+ A VTI+QNG+
Sbjct: 259 SRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGY 318

Query: 283 VVYQKEVPPGPFAITDLQLAGGGADLDVSVKEADGSVTTYLVPYAAVPNMLQPGVSKYDF 342
+Y VPPGPF I D+ AG DL V++KEADGS + VPY++VP + + G ++Y
Sbjct: 319 DIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSI 378

Query: 343 AAGRSHIEGASKQSD-FVQAGYQYGFNNLLTLYGGTMVANNYYAFTLGTGWNT-RIGAIS 400
AG A ++ F Q+ +G T+YGGT +A+ Y AF G G N +GA+S
Sbjct: 379 TAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALS 438

Query: 401 VDATKSHSKQDNGDVFDGQSYQIAYNKFVSQTSTRFGLAAWRYSSRDYRTFNDHVWANNK 460
VD T+++S + DGQS + YNK ++++ T L +RYS+ Y F D ++
Sbjct: 439 VDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMN 498

Query: 461 DNYRRDENDIYDI----ADYYQNDFGRKNSFSANMSQSLPEGWGSVSLSTLWRDYWGRSG 516
++ + + DYY + ++ ++Q L ++ LS + YWG S
Sbjct: 499 GYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQTYWGTSN 557

Query: 517 SSKDYQLSYSNNWRRISYTLAASQAYGENHHE-EKRFNIFISIPCD--WGDDVTTPRRQI 573
+ +Q + + I++TL+ S ++ + ++IP D + R
Sbjct: 558 VDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHA 617

Query: 574 YMSNSTTFDDQGFASNNTGLSGTVGSRDQFNYGVNLSHQHQGN---ETTAGANLTWNAPV 630
S S + D G +N G+ GT+ + +Y V + G+ +T A L +
Sbjct: 618 SASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGY 677

Query: 631 ATVNGSYSQSSTYRQTGASVSGGIVAWSGGVNLANRLSETFAVMNAPGIKDAYVNGQKYR 690
N YS S +Q VSGG++A + GV L L++T ++ APG KDA V Q
Sbjct: 678 GNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGV 737

Query: 691 TTNRNGVVVYDGMTPYRENHLMLDVSQSDSEAELRGNRKIAAPYRGAVVLVNFDTDQRKP 750
T+ G V T YREN + LD + +L P RGA+V F +
Sbjct: 738 RTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKA-RVGI 796

Query: 751 WFIKALRADGQPLTFGYEVNDIHGHNIGVVGQGSQLFIRTNEIPPSVNVAIDKQQGLSCT 810
+ L + +PL FG V + G+V Q+++ + V V +++ C
Sbjct: 797 KLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCV 856

Query: 811 ITF 813
+
Sbjct: 857 ANY 859


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2193  INTIMIN  28  0.015  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 28.1 bits (62), Expect = 0.015
Identities = 20/94 (21%), Positives = 32/94 (34%)

Query: 36 LNGTEIAITYVYKGDKVLKQSSETKIQFASIGATTKEDAAKTLEPLSAKYKNIAGVEEKS 95
+ + AITY K K K S ++ F + KT AK + KS
Sbjct: 671 VANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKS 730

Query: 96 TYTDTYAQENVTIDMEKVDFKALQGISGINVSAE 129
+ + V + +V+F I N+
Sbjct: 731 LVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIV 764


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2197  HTHFIS  71  2e-16  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 71.4 bits (175), Expect = 2e-16
Identities = 41/178 (23%), Positives = 76/178 (42%), Gaps = 14/178 (7%)

Query: 2 IKVLIVDDEPLARENL-RIFLQEQSDIEIVGECSNAVEGIGAVHKLRPDVLFLDIQMPRI 60
+L+ DD+ R L + + D+ I NA + D++ D+ MP
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITS---NAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 SGLEMVGMLDPEHRPYI--VFLTAFD--EYAIKAFEEHAFDYLLKPIDEARLEKTLARLR 116
+ +++ + + RP + + ++A + AIKA E+ A+DYL KP D L + R
Sbjct: 61 NAFDLLPRIK-KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 117 QERSKQDVSLLPENQQALKFIPCTGHSRIYLLQMKDVAFVSSRMSGVYVT--SHEGKE 172
E ++ L ++Q + + G S + +A + + +T S GKE
Sbjct: 120 AEPKRRPSKLEDDSQDGMPLV---GRSAAMQEIYRVLARLMQTDLTLMITGESGTGKE 174


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2198  PF06580  220  4e-69  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 220 bits (562), Expect = 4e-69
Identities = 63/216 (29%), Positives = 115/216 (53%), Gaps = 3/216 (1%)

Query: 343 LGEGIAQLLSAQILAGQYERQKAMLTQSEIKLLHAQVNPHFLFNALNTIKAVIRRDSEQA 402
L G + + + +M ++++ L AQ+NPHF+FNALN I+A+I D +A
Sbjct: 134 LYFGWHFFKNYKQAEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKA 193

Query: 403 SQLVQYLSTFFRKNLKR-PSEFVTLADEIEHVNAYLQIEKARFQSRLQVNIAIPQELSQQ 461
+++ LS R +L+ + V+LADE+ V++YLQ+ +F+ RLQ I +
Sbjct: 194 REMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDV 253

Query: 462 QLPAFTLQPIVENAIKHGTSQLLDTGRVAISARREGQHLMLEIEDNAGL-YQPVTNASGL 520
Q+P +Q +VEN IKHG +QL G++ + ++ + LE+E+ L + ++G
Sbjct: 254 QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGT 313

Query: 521 GMNLVDKRLRERFGDDYGISVACEPDSYTRITLRLP 556
G+ V +RL+ +G + I ++ + + +P
Sbjct: 314 GLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2201  LUXSPROTEIN  31  0.002  Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS

signature.
Length = 171

Score = 31.4 bits (71), Expect = 0.002
Identities = 18/66 (27%), Positives = 30/66 (45%), Gaps = 7/66 (10%)

Query: 41 TKEHLLPHFL-EHLGNNHLDI------GVGTGFYLTHVPESSLISLMDLNEASLNAASTR 93
T EHL F+ HL + ++I G TGFY++ + S + D A++
Sbjct: 54 TLEHLYAGFMRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKV 113

Query: 94 AGESKI 99
++KI
Sbjct: 114 ENQNKI 119


34  SF2277  SF2288  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF2277    0       20       3.255187    transcriptional regulator NarP
SF2278    0       21       3.681911    heme lyase subunit
SF2279    1       21       3.947281    thiol:disulfide interchange protein DsbE
SF2280    0       19       4.099028    cytochrome c-type biogenesis protein
SF2281    0       16       2.784610    cytochrome c-type biogenesis protein CcmE
SF2282    1       16       2.980402    heme exporter protein C
SF2283    0       15       3.091502    heme exporter protein C
SF2284    -1      17       3.684875    heme exporter protein B
SF2285    -1      19       3.827647    cytochrome c biogenesis protein CcmA
SF2286    -1      21       3.790479    cytochrome c-type protein NapC
SF2287    -1      19       4.156951    nitrate reductase cytochrome C550 subunit
SF2288    -1      19       3.592189    quinol dehydrogenase membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2277  HTHFIS  63  7e-14  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.9 bits (153), Expect = 7e-14
Identities = 22/114 (19%), Positives = 46/114 (40%), Gaps = 2/114 (1%)

Query: 9 VMIVDDHPLMRRGVRQLLELDPGFEVVAEAGEGASAIDLANRLDIDVILLDLNMKGMSGL 68
+++ DD +R + Q L G++V A+ D D+++ D+ M +
Sbjct: 6 ILVADDDAAIRTVLNQALS-RAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 69 DTLNALRRDGVTAQIIILTVSDASSDVFALIDAGADGYLLKDSDPEVLLEAIRT 122
D L +++ +++++ + + GA YL K D L+ I
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


35  SF2337  SF2359  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF2337    0       14       3.432080    4-amino-4-deoxy-L-arabinose-phospho-UDP flippase
SF2338    1       14       4.642227    signal transduction protein PmrD
SF2339    1       15       4.566391    O-succinylbenzoic acid--CoA ligase
SF2340    0       14       4.389155    O-succinylbenzoate synthase
SF2341    0       13       3.073511    dihydroxynaphthoic acid synthetase
SF2342    0       14       2.228878    2-succinyl-6-hydroxy-2,
SF2343    0       13       1.825689    2-succinyl-5-enolpyruvyl-6-hydroxy-3-
SF2344    -1      17       -0.252284   isochorismate hydroxymutase 2
SF2345    -1      19       -2.118544   hypothetical protein
SF2346    0       13       0.361848    acyltransferase
SF2347    0       16       1.663453    hypothetical protein
SF2348    2       25       3.288352    insertion element IS1 protein InsA
SF2349    2       26       3.360097    insertion element IS1 protein InsB
SF2351    1       27       3.393802    hypothetical protein
SF2352    0       29       3.900681    NADH dehydrogenase I subunit N
SF2353    0       29       3.287298    NADH:ubiquinone oxidoreductase subunit M
SF2354    0       29       3.749619    NADH:ubiquinone oxidoreductase subunit L
SF2355    0       29       3.563641    NADH:ubiquinone oxidoreductase subunit K
SF2356    0       28       3.608345    NADH:ubiquinone oxidoreductase subunit J
SF2357    1       27       3.759360    NADH dehydrogenase subunit I
SF2358    0       26       3.573091    NADH:ubiquinone oxidoreductase subunit H
SF2359    1       25       3.597168    NADH dehydrogenase subunit G
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2346  AUTOINDCRSYN  35  6e-05  Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 34.8 bits (80), Expect = 6e-05
Identities = 14/79 (17%), Positives = 32/79 (40%), Gaps = 12/79 (15%)

Query: 1 MIEWQDLHHSELSVSQLYALLQLRCAVFV--------VEQNCPYQDIDGDDLTGDNRHIL 52
M+E D++H+ LS ++ L LR F + D + + ++
Sbjct: 1 MLEIFDVNHTLLSETKSGELFTLRKETFKDRLNWAVQCTDGMEFDQYDNN----NTTYLF 56

Query: 53 GWKNDELVAYARILKSDDD 71
G K++ ++ R +++
Sbjct: 57 GIKDNTVICSLRFIETKYP 75


36  SF2407  SF2440  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF2407    0       23       -3.983970   N5-glutamine S-adenosyl-L-methionine-dependent
SF2406    3       28       -5.498424   hypothetical protein
SF2408    4       27       -5.864912   hypothetical protein
SF2409    4       22       -4.254826   fimbrial-like protein
SF2410    3       18       -2.919176   minor fimbrial subunit
SF2411    -1      14       -1.214861   minor fimbrial subunit
SF2415    -2      10       0.368002    fimbrial-like protein
SF2416    -2      10       1.049646    insertion element IS1 protein InsB
SF2417    -2      12       0.395430    insertion element IS1 protein InsA
SF2418    -1      15       -0.926710   phosphohistidine phosphatase
SF2419    -1      18       -3.341579   multifunctional fatty acid oxidation complex
SF2420    1       20       -4.076168   3-ketoacyl-CoA thiolase
SF2420a   3       23       -6.120990   hypothetical protein
SF2421    4       23       -5.134075   long-chain fatty acid outer membrane
SF2422    3       24       -4.799574   insertion element iso-IS10R transposase
SF2423    2       20       -3.511325   hypothetical protein
SF2424    1       18       -0.581241   ABC transporter outer membrane lipoprotein
SF2426    1       18       -0.928564   *integrase
SF2427    1       22       -2.786249   insertion sequence element IS911 integrase core
SF2430    0       27       -4.795096   hypothetical protein
SF2431    0       30       -6.420935   sucrose specific repressor
SF2432    0       33       -9.229655   gluconate-like high-affinity transporter
SF2433    0       34       -9.065066   D-serine dehydratase
SF2434    0       36       -9.965891   multidrug resistance protein Y
SF2435    0       36       -9.109737   multidrug resistance protein K
SF2436    1       34       -8.366940   DNA-binding transcriptional activator EvgA
SF2437    1       34       -7.998789   hybrid sensory histidine kinase in two-component
SF2438    3       33       -6.310953   hypothetical protein
SF2439    2       32       -6.059876   transporter YfdV
SF2440    1       27       -4.962516   oxalyl-CoA decarboxylase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2409  FIMBRIALPAPE  33  4e-04  Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein

signature.
Length = 173

Score = 32.7 bits (74), Expect = 4e-04
Identities = 49/182 (26%), Positives = 75/182 (41%), Gaps = 19/182 (10%)

Query: 1 MKKKRTLFFISSL-MLLGSGTTIAGDNLHFTGNLISKSCTPVINGSQLAEVHFPAIAASD 59
MKK R L L +L S A DNL F G LI +CT Q AEV++ I +
Sbjct: 1 MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACT-----VQNAEVNWGDIEIQN 55

Query: 60 LMNLGQSERVPLVFQLKDCHSSTLFNVKVTLTGTEDSALPGFLAFDSSSSASGAGIGIET 119
L+ G +++ + +S V +T G ++ + ++S+ASG G+ I
Sbjct: 56 LVQSGGNQK-DFTVDMNCPYSLGTMKVTITSNGQTGNS----ILVPNTSTASGDGLLIYL 110

Query: 120 AAGTSVPINNTTGVTLPLNQGN---NSLNFNTWLQAKSG-----RDVTSGDFSATVTATF 171
+ I N + + G + L AK G + + +G FSAT T
Sbjct: 111 YNSNNSGIGNAVTLGSQVTPGKITGTAPARKITLYAKLGYKGNMQSLQAGTFSATATLVA 170

Query: 172 EY 173
Y
Sbjct: 171 SY 172


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2410  FIMBRIALPAPF  43  8e-08  Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein

signature.
Length = 167

Score = 42.8 bits (100), Expect = 8e-08
Identities = 44/171 (25%), Positives = 74/171 (43%), Gaps = 21/171 (12%)

Query: 1 MKRISL---ILLWGFCSMALSNVSFHGYLVQPPNCTISNAQTIEITFQDVLIDDINGSNY 57
M R+SL +LL +A ++ G + PP CTI+N Q I + F ++ + ++ S
Sbjct: 1 MIRLSLFISLLLTSVAVLADVQINIRGNVYIPP-CTINNGQNIVVDFGNINPEHVDNSRG 59

Query: 58 EQTVPYSITCDTAVRDPLMEMTLSWSGTPSDFDNAAVSSNITGLGIQLKQ---------- 107
E T SI+C +++T + G N +++NIT GI L Q
Sbjct: 60 EVTKNISISCPYKSGSLWIKVTGNTMGVGQ---NNVLATNITHFGIALYQGKGMSTPLTL 116

Query: 108 ---AGQSFTINTPLVVNETDLPVLTAVPVKKSGVILPEADFEAWATLQVDY 155
+G + + L + T+VP + IL DF A++ + Y
Sbjct: 117 GNGSGNGYRVTAGLDTARSTF-TFTSVPFRNGSGILNGGDFRTTASMSMIY 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2424  VACJLIPOPROT  407  e-148  VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 407 bits (1048), Expect = e-148
Identities = 251/251 (100%), Positives = 251/251 (100%)

Query: 1 MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR 60
MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR
Sbjct: 1 MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR 60

Query: 61 DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM 120
DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM
Sbjct: 61 DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM 120

Query: 121 ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADALYPVLSWLTWPM 180
ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADALYPVLSWLTWPM
Sbjct: 121 ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADALYPVLSWLTWPM 180

Query: 181 SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA 240
SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA
Sbjct: 181 SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA 240

Query: 241 IQDDLKDIDSE 251
IQDDLKDIDSE
Sbjct: 241 IQDDLKDIDSE 251


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2434  TCRTETB  119  3e-31  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 119 bits (300), Expect = 3e-31
Identities = 98/408 (24%), Positives = 168/408 (41%), Gaps = 25/408 (6%)

Query: 19 VTIALSLATFMQMLDSTISNVAIPTISGFLGASTDEGTWVITSFGVANAIAIPVTGRLAQ 78
+ I L + +F +L+ + NV++P I+ WV T+F + +I V G+L+
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 79 RIGELRLFLLSVTFFSLSSLMCSLS-TNLDVLIFFRVVQGLMAGPLIPLSQSLLLRNYPP 137
++G RL L + S++ + + +LI R +QG A L ++ R P
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 138 EKRTFALALWSMTVIIAPICGPILGGYICDNFSWGWIFLINVPMGIIVLTLCLTLLKGRE 197
E R A L V + GP +GG I W +L+ +PM I+ L L +E
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKE 192

Query: 198 TETSPVKMNLPRLTLLVLGVGGLQIMLDKGRDLDWFNSSTIIILTVVSVISLISLVIWES 257
K + ++++ VG + ML F +S I +VSV+S + V
Sbjct: 193 VRI---KGHFDIKGIILMSVGIVFFML--------FTTSYSISFLIVSVLSFLIFVKHIR 241

Query: 258 TSENPILDLSLFKSRNFTIGIVSITCAYLFYSGAIVLMPQLLQKTMGYNAIWAGLAYAPI 317
+P +D L K+ F IG++ + +G + ++P +++ + G
Sbjct: 242 KVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFP 301

Query: 318 GIMPLLISPLIG-----RYGNKIDMRVLVTFSFLMYAVCYYWRSVTFMPTIDFTGIILPQ 372
G M ++I IG R G + + VTF +V + S T F II+
Sbjct: 302 GTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL----SVSFLTASFLLETTSWFMTIIIVF 357

Query: 373 FFQGFAVACFFLPLTTISFSGLPDNKFANASSMSNFFRTLSGSVGTSL 420
G + ++TI S L + S+ NF LS G ++
Sbjct: 358 VLGGLSFTK--TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2435  RTXTOXIND  77  1e-17  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 77.2 bits (190), Expect = 1e-17
Identities = 63/419 (15%), Positives = 125/419 (29%), Gaps = 96/419 (22%)

Query: 8 KKQSNRKKYFSLLVIVLFIAFSGAYAYWSMELEDMISTDDAYVT-GNADPISAQVSGSVT 66
+ +R+ I+ F+ + + ++E + + + G + I + V
Sbjct: 50 ETPVSRRPRLVAYFIMGFLVIAFILSVLG-QVEIVATANGKLTHSGRSKEIKPIENSIVK 108

Query: 67 VVNHKDTNYVRQGDILVSLDKTDATIALNKA----------------------------- 97
+ K+ VR+GD+L+ L A K
Sbjct: 109 EIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPEL 168

Query: 98 -----------------------KNNLANIVRQTNKLYLQDKQYSAEVASARIQ---YQQ 131
K + Q + L + AE + + Y+
Sbjct: 169 KLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYEN 228

Query: 132 SLEDYNRRV----PLAKQGVISKE----------TLEHTKDTLISSKAALNAAIQAYKAN 177
R+ L + I+K + S + + I + K
Sbjct: 229 LSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEE 288

Query: 178 KALVMN-------TPLNR-QPQVVEAADATKEAWLVLKRTDIRSPVTGYIAQRSVQ-VGE 228
LV L + + + + + IR+PV+ + Q V G
Sbjct: 289 YQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGG 348

Query: 229 TVSSGQSLMAVVPARQ-MWVNANFKETQLTDVRIGQSVNIISDLYGENVVFHGRVTGINM 287
V++ ++LM +VP + V A + + + +GQ+ I + F G +
Sbjct: 349 VVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE------AFPYTRYGYLV 402

Query: 288 GTGNAFSLLPAQNATGNWIKIVQRVPVEVSLDPKELMEH----PLRIGLSMTATIDTKD 342
G + + +V V +S++ L PL G+++TA I T
Sbjct: 403 GK---VKNINLDAIEDQRLGLVFNVI--ISIEENCLSTGNKNIPLSSGMAVTAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2436  HTHFIS  49  3e-09  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.1 bits (117), Expect = 3e-09
Identities = 22/148 (14%), Positives = 53/148 (35%), Gaps = 31/148 (20%)

Query: 4 IIIDDHPLAIAAIRNLLIKNDIEILAELTEGGSAVQRVETLKPDIVIIDVDIPGVNGIQV 63
++ DD + L + ++ + + + + D+V+ DV +P N +
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 64 LETLRKRQYSGIIIIVSAKNDHFYGKHCADAGANGFVSKKEGMNNIIAAIEAAKNGYCYF 123
L ++K + ++++SA+N + AI+A++ G +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNT------------------------FMTAIKASEKGAYDY 101

Query: 124 ---PFSLNRFVGSLTSDQQKLDSLSKQE 148
PF L + + L ++
Sbjct: 102 LPKPFDLTE---LIGIIGRALAEPKRRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2437  HTHFIS  80  2e-17  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.9 bits (197), Expect = 2e-17
Identities = 30/105 (28%), Positives = 51/105 (48%)

Query: 960 SILIADDHPTNRLLLKRQLNLLGYDVDEATDGVQALHKVSMQHYDLLITDVNMPNMDGFE 1019
+IL+ADD R +L + L+ GYDV ++ ++ DL++TDV MP+ + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 1020 LTRKLREQNSSLPIWGLTANAQANEREKGLSCGMNLCLFKPLTLD 1064
L ++++ LP+ ++A K G L KP L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


37  SF2484  SF2499  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF2484    2       15       0.976555    hypothetical protein
SF2485    2       14       0.906883    RpoE-regulated lipoprotein
SF2486    2       15       1.893276    hypothetical protein
SF2487    0       13       0.704064    acetyltransferase
SF2488    1       15       -0.247297   N-acetylmuramoyl-L-alanine amidase
SF2489    0       18       -0.313724   coproporphyrinogen III oxidase
SF2491    0       19       -1.785633   carboxysome structural protein EutK
SF2492    1       20       -3.330716   ethanolamine utilization protein EutL
SF2493    0       23       -5.443452   hypothetical protein
SF2494    -1      18       -5.361248   hypothetical protein
SF2495    5       35       -8.004782   hypothetical protein
SF2496    6       39       -10.200945  hypothetical protein
SF2497    6       39       -10.158497  hypothetical protein
SF2497a   7       41       -10.162830  hypothetical protein
SF2498    2       26       -5.190954   hypothetical protein
SF2499    2       24       -5.003688   amino acid antiporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2487  SACTRNSFRASE  31  6e-04  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 31.5 bits (71), Expect = 6e-04
Identities = 15/102 (14%), Positives = 38/102 (37%), Gaps = 4/102 (3%)

Query: 24 LRPWNDPEMDIERKMNHDVSLFLVAEVNGEVVG--TVMGGYDGHRGSAYYLGVHPEFRGR 81
+ + D +MD+ + FL + +G + ++G + V ++R +
Sbjct: 47 FKQYEDDDMDVSYVEEEGKAAFL-YYLENNCIGRIKIRSNWNG-YALIEDIAVAKDYRKK 104

Query: 82 GIANALLNRLEKKLIARGCPKIQINVPEDNDMVLGMYERLGY 123
G+ ALL++ + + + + N Y + +
Sbjct: 105 GVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


38  SF2565  SF2575  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF2565    -1      12       3.278978    penicillin-binding protein 1B-like protein
SF2569    0       16       2.393373    enhanced serine sensitivity protein SseB
SF2570    0       19       3.356042    aminopeptidase
SF2571    2       21       2.432401    hypothetical protein
SF2572    2       25       2.280750    [2FE-2S] ferredoxin electron carrer protein
SF2573    2       25       2.342212    chaperone protein HscA
SF2574    1       24       0.797578    co-chaperone HscB
SF2575    2       28       1.098628    iron-sulfur cluster assembly protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2569  STREPKINASE  30  0.014  Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 29.7 bits (66), Expect = 0.014
Identities = 27/120 (22%), Positives = 52/120 (43%), Gaps = 21/120 (17%)

Query: 127 GNPLSSQEILEGGESLILSE-----VAEPPAQMIDSLTTLFKTIKPVKRAFICSIKENEE 181
G+ ++SQE+L +S++ + E + ++ +F+TI P+ + F +K E+
Sbjct: 217 GDTITSQELLAQAQSILNKNHPGYTIYERDSSIVTHDNDIFRTILPMDQEFTYRVKNREQ 276

Query: 182 A-QPNLLIGIEADGDIEEIIQATGSVATDTLPGDEPIDICQVKKGEKGISHFITEHIAPF 240
A + N G+ + + ++I V +KKGEK F H+ F
Sbjct: 277 AYRINKKSGLNEEINNTDLISEKYYV---------------LKKGEKPYDPFDRSHLKLF 321


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2573  SHAPEPROTEIN  114  9e-30  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 114 bits (286), Expect = 9e-30
Identities = 80/371 (21%), Positives = 143/371 (38%), Gaps = 74/371 (19%)

Query: 23 GIDLGTTNSLVATVRSGQAETLADHEGRHLLPSVVHYQQQGHS-------VGYDARTNAA 75
IDLGT N+L+ G + +E PSVV +Q VG+DA+
Sbjct: 14 SIDLGTANTLIYVKGQG----IVLNE-----PSVVAIRQDRAGSPKSVAAVGHDAK-QML 63

Query: 76 LDTANTISSVKRLMGRSLADIQQRYPHLPYQFQASENGLPMIETAAGLLNPVRVSADILK 135
T I++++ + +AD V+ +L+
Sbjct: 64 GRTPGNIAAIRPMKDGVIADF-------------------------------FVTEKMLQ 92

Query: 136 ALAARATEALAGE-LDGVVITVPAYFDDAQRQGTKDAARLAGLHVLRLLNEPTAAAIAYG 194
+ V++ VP +R+ +++A+ AG + L+ EP AAAI G
Sbjct: 93 HFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAG 152

Query: 195 LDSGQEGVIAVYDLGGGTFDISILRLSRGVFEVLATGGDSALGGDDFDHLLADYIREQAD 254
L + V D+GGGT +++++ L+ V +GGD FD + +Y+R
Sbjct: 153 LPVSEATGSMVVDIGGGTTEVAVISLNGVV-----YSSSVRIGGDRFDEAIINYVRRNYG 207

Query: 255 --IPDRSDNRVQRELLDATIAAKIALSDADSVTVNVAG---WQG-----EISREQFNELI 304
I + + R++ E+ A + + V G +G ++ + E +
Sbjct: 208 SLIGEATAERIKHEI-------GSAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEAL 260

Query: 305 APLVKRTLLACRRALKDAGVE-ADEVLE--VVMVGGSTRVPLVRERVGEFFGRPPLTSID 361
+ + A AL+ E A ++ E +V+ GG + + + E G P + + D
Sbjct: 261 QEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLLMEETGIPVVVAED 320

Query: 362 PDKVVAIGAAI 372
P VA G
Sbjct: 321 PLTCVARGGGK 331


39  SF2608  SF2617  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias     %GCBias  Product
SF2608        2       21   -2.992706  hypothetical protein
SF2609        5       27   -3.178855  DNA-invertase
SF2610        4       26   -2.070247  invasion plasmid antigen
SF2611        1       21   -0.108699  prophage CP-933K tail protein
SF2613        2       23   -0.673601  tail fiber assembly protein
SF2614        3       24   -0.378360  bacteriophage protein
SF2615        2       24   -0.285122  IS2 repressor TnpA
SF2616        3       24   -0.131053  IS2 transposase TnpB
SF2617        3       24   -1.628645  tail fiber protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2609  STREPKINASE  28  0.019  Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 27.8 bits (61), Expect = 0.019
Identities = 15/24 (62%), Positives = 18/24 (75%), Gaps = 2/24 (8%)

Query: 42 RPGLK--KLLKTLSAGDTLVVWKL 63
RPGLK KLLKTL+ GDT+ +L
Sbjct: 202 RPGLKDTKLLKTLAIGDTITSQEL 225


40  SF2673  SF2701  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias     %GCBias  Product
SF2673        2       15   -1.389268  heat shock protein GrpE
SF2674        2       15   -1.213782  inorganic polyphosphate/ATP-NAD kinase
SF2675        2       17   -1.469075  recombination and repair protein
SF2676        1       22   -2.039985  small membrane protein A
SF2677        1       21   -1.623556  hypothetical protein
SF2678        2       20   -1.625754  hypothetical protein
SF2679        1       21   -0.314178  SsrA-binding protein
SF2680        2       28   -6.190476  integrase
SF2681       -1       22   -5.792216  insertion element IS1 protein InsA
SF2682       -2       16   -3.717662  insertion element IS1 protein InsB
SF2683       -1       13   -2.209453  insertion sequence IS3 transposase InsE
SF2684        0       14   -1.911332  insertion sequence IS3A transposase InsF
SF2685        0       13   -2.670022  *hypothetical protein
SF2686        1       20    2.863717  hypothetical protein
SF2687       -1       22    3.584034  hydroxyglutarate oxidase
SF2689       -1       23    2.579600  GABA transaminase
SF2691       -1       26    0.640542  DNA-binding transcriptional regulator CsiR
SF2692        0       23   -0.716993  LysM domain/BON superfamily protein
SF2693a      -1       23   -0.750966  hypothetical protein
SF2693       -1       22   -1.056482  IS2 transposase TnpB
SF2694        1       23   -3.238960  IS2 repressor TnpA
SF2695        0       24   -3.306420  hypothetical protein
SF2696        2       21   -2.595174  hypothetical protein
SF2697        1       12   -0.834079  DNA binding protein, nucleoid-associated
SF2698        1       12   -1.033551  hypothetical protein
SF2699        3       15   -1.333937  hypothetical protein
SF2701        2       13    0.237601  glutaredoxin-like protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2676  BLACTAMASEA  26  0.032  Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 26.3 bits (58), Expect = 0.032
Identities = 23/87 (26%), Positives = 36/87 (41%), Gaps = 11/87 (12%)

Query: 4 KTLTAAAAVLLMLTAGCSTLERVVYRPDINQGNYLTANDVSKIRV--GMTQQQVAYALGT 61
K + AVL + AG LER ++ Q + + + VS+ + GMT ++ A
Sbjct: 69 KVV-LCGAVLARVDAGDEQLERKIH---YRQQDLVDYSPVSEKHLADGMTVGELCAA--A 122

Query: 62 PLMSDPFGTNTWFYVFRQQPGHEGVTQ 88
MSD N + G G+T
Sbjct: 123 ITMSDNSAANL---LLATVGGPAGLTA 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2701  PF07675  26  0.014  Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 26.2 bits (57), Expect = 0.014
Identities = 11/36 (30%), Positives = 14/36 (38%), Gaps = 1/36 (2%)

Query: 31 INVDRVPEAAEALRAQG-FRQLPVVIAGDLSWSGFR 65
V PEA R QG + Q V + + FR
Sbjct: 750 KTVVTAPEAIRGTRVQGTWYQKTVQLPAGTKYVAFR 785


41  SF2730  SF2742  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias     %GCBias  Product
SF2730        1       15    3.153937  DNA-binding transcriptional repressor SrlR
SF2731        1       15    3.632811  hypothetical protein
SF2732        1       17    3.171049  2-component transcriptional regulator
SF2733       -1       17    3.465573  anaerobic nitric oxide reductase
SF2734        0       18    4.063996  NADH:flavorubredoxin oxidoreductase
SF2735        0       18    4.809646  transcriptional regulatory protein
SF2736       -1       22    4.572909  formate dehydrogenase-H ferredoxin subunit
SF2737       -2       22    4.425181  ascBF operon repressor
SF2738        0       24    4.861305  hydrogenase 3 large subunit
SF2739        2       21    4.544428  hydrogenase 3 membrane-spanning protein
SF2740        2       22    4.138897  formate hydrogenlyase subunit 3
SF2741        2       20    2.729409  hydrogenase-3 iron-sulfur protein
SF2742        1       20    3.063960  formate hydrogenlyase regulatory protein HycA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2730  ARGREPRESSOR  29  0.014  Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 28.7 bits (64), Expect = 0.014
Identities = 20/105 (19%), Positives = 35/105 (33%), Gaps = 17/105 (16%)

Query: 1 MKPRQRQAAILEYLQKQGKCSVEEL-----AQYFDTTGTTIRKDLVILEHAGTVIRTYGG 55
M QR I E + + +EL ++ T T+ +D+ E + T G
Sbjct: 1 MNKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIK--ELHLVKVPTNNG 58

Query: 56 ---VVLNKEESDPPIDHKTLINTHKKELIAEAAVSFIHDGDSIIL 97
L ++ P+ K + +A V I+L
Sbjct: 59 SYKYSLPADQRFNPLS-------KLKRSLMDAFVKIDSASHLIVL 96


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2732  HTHFIS  377  e-128  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 377 bits (970), Expect = e-128
Identities = 125/388 (32%), Positives = 194/388 (50%), Gaps = 33/388 (8%)

Query: 174 IAALAAGALS----------NALLIEQLESQNMLPGDATPFEAVKQTQMIGLSPGMTQLK 223
I A GA +I + ++ ++ ++G S M ++
Sbjct: 91 IKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIY 150

Query: 224 KEIEIVAASDLNVLISGETGTGKELVAKAIHEASPRAVNPLVYLNCAALPESVAESELFG 283
+ + + +DL ++I+GE+GTGKELVA+A+H+ R P V +N AA+P + ESELFG
Sbjct: 151 RVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFG 210

Query: 284 HVKGAFTGAISNRSGKFEMADNGTLFLDEIGELSLALQAKLLRVLQYGDIQRVGDDRSLR 343
H KGAFTGA + +G+FE A+ GTLFLDEIG++ + Q +LLRVLQ G+ VG +R
Sbjct: 211 HEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIR 270

Query: 344 VDVRVLAATNRDLREEVLAGRFRADLFHRLSVFPLSVPPLRERGDDVILLAGYFCEQCRL 403
DVR++AATN+DL++ + G FR DL++RL+V PL +PPLR+R +D+ L +F +Q
Sbjct: 271 SDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAE- 329

Query: 404 RQGLSRVVLSAGARNLLQHYSFPGNVRELEHAIHRAVVLARATRNGDEVIL-----EAQH 458
++GL A L++ + +PGNVRELE+ + R L E+I E
Sbjct: 330 KEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPD 389

Query: 459 FAFPEVTLPPPEAAAVPVVKQNLR-----------------EATEAFQRETIRQALAQNH 501
+ + V++N+R + I AL
Sbjct: 390 SPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATR 449

Query: 502 HNWAACARMLETDVANLHRLAKRLGLKD 529
N A +L + L + + LG+
Sbjct: 450 GNQIKAADLLGLNRNTLRKKIRELGVSV 477


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2740  PF00577  30  0.039  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 29.8 bits (67), Expect = 0.039
Identities = 9/47 (19%), Positives = 13/47 (27%), Gaps = 2/47 (4%)

Query: 4 PLFWHFSFQKALSGWIAGIGGAVGS--LYTAAAGFTVLTGAVGVSGA 48
P F+ + L GG + G GA+G
Sbjct: 393 PRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSV 439


42  SF2779  SF2789  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias     %GCBias  Product
SF2779       -1       13    3.103259  sulfite reductase subunit beta
SF2780        0       14    3.051434  sulfite reductase subunit alpha
SF2781        1       19    2.075009  6-pyruvoyl tetrahydrobiopterin synthase
SF2782        1       16    2.339974  hypothetical protein
SF2783        2       15    1.756008  hypothetical protein
SF2785        2       16    1.608087  flavoprotein
SF2786        2       16    1.236747  transporter
SF2787        1       19   -0.087310  transporter
SF2789        3       26   -0.534079  insertion element IS1 protein InsB
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2779  PF07675  30  0.021  Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 30.4 bits (68), Expect = 0.021
Identities = 20/92 (21%), Positives = 39/92 (42%), Gaps = 12/92 (13%)

Query: 206 ILGQTYLPRKFKTTVVIP---PQND--IDLHANDMNFVAIAENGKLVGFNLLVGGGLSIE 260
++ +P+ T +P PQN + A+ ++VAI+++G L G + G++
Sbjct: 240 VMPYRAMPKT--NTYTLPASLPQNQASYSIQASAGSYVAISKDGVLYGTGVANASGVATV 297

Query: 261 HGNK-----KTYARTASEFGYLPLEHTLAVAE 287
+ K Y + YLP+ + E
Sbjct: 298 NMTKQITENGNYDVVITRSNYLPVIKQIQAGE 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2787  TCRTETB  34  0.001  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.5 bits (79), Expect = 0.001
Identities = 27/137 (19%), Positives = 55/137 (40%), Gaps = 11/137 (8%)

Query: 93 LGSLVLGWISDHIGRQKIFTFSFLLITLASFLQFFATTP-EHLIGLRILIGIGLGGDYSV 151
+G+ V G +SD +G +++ F ++ S + F + LI R + G G ++
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPAL 123

Query: 152 GHTLLAEFSPRRHRGILLGAFSVVWT----VGYVLASIAGHHFISENPEAWRWLLASAAL 207
++A + P+ +RG G + VG + + H+ W +LL +
Sbjct: 124 VMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI------HWSYLLLIPMI 177

Query: 208 PALLITLLRWGTPESPR 224
+ + L + R
Sbjct: 178 TIITVPFLMKLLKKEVR 194


43  SF2854  SF2861  Y  N  Y  Genomic Island
LocusTag DNBias  CDNBias     %GCBias  Product
SF2854        3       20   -0.180640  acyltransferase
SF2856        7       27   -1.019686  insertion sequence IS3 transposase InsE
SF2857        6       27   -1.448982  insertion sequence IS3A transposase InsF
SF2858        7       33   -3.480967  hypothetical protein
SF2859       11       37   -2.096085  hypothetical protein
SF2860        7       29   -2.611803  hypothetical protein
SF2861        2       21   -1.549018  hypothetical protein
44  SF2963  SF3006  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias     %GCBias  Product
SF2963        2       24   -5.445054  transporter
SF2964        2       24   -5.458814  *P4-type integrase
SF2965        2       24   -5.568039  superfamily I DNA helicase
SF2967a       5       26   -4.515341  hypothetical protein
SF2967        7       27   -3.934353  hypothetical protein
SF2968        6       27   -3.775379  serine protease
SF2970        6       26   -0.321158  hypothetical protein
SF2971        6       26   -0.467257  hypothetical protein
SF2972        6       25   -0.364614  hypothetical protein
SF2973        6       25   -0.845968  serine protease
SF2976        5       24    0.622641  reverse transcriptase-like protein
SF2977        5       29   -2.664093  hypothetical protein
SF2979        5       30   -2.826617  insertion sequence element IS629 protein
SF2980        5       28   -2.046710  hypothetical protein
SF2981        4       29   -3.787600  hypothetical protein
SF2982        4       27   -4.178092  hypothetical protein
SF2983        3       26   -3.702987  insertion element iso-IS10R transposase
SF2984        2       29   -5.593931  IS2 repressor TnpA
SF2985        4       25   -3.676450  insertion element IS2 transposase InsD
SF2986        6       23    0.100544  hypothetical protein
SF2987        7       23    1.135974  hypothetical protein
SF2988        7       23    0.852044  hypothetical protein
SF2989        6       23    1.010651  hypothetical protein
SF2990        8       24    3.385277  hypothetical protein
SF2991        8       24    3.121786  outer membrane fluffing protein
SF2992        6       24    1.701886  hypothetical protein
SF2993        7       25    3.007650  hypothetical protein
SF2993a       8       29    5.102124  hypothetical protein
SF2994        8       30    4.814862  hypothetical protein
SF2995        8       28    4.260549  hypothetical protein
SF2996        7       29    3.671751  RADC family DNA repair protein
SF2997        7       28    0.544401  hypothetical protein
SF2998        5       26   -1.993127  structural protein
SF2999        6       26   -2.594287  hypothetical protein
SF3000        7       27   -3.100575  hypothetical protein
SF3001        5       25   -2.343726  hypothetical protein
SF3002        3       27   -3.424539  hypothetical protein
SF3003        2       28   -3.662279  insertion element iso-IS10R transposase
SF3004        2       26   -3.787368  hypothetical protein
SF3005        1       27   -4.248937  insertion sequence IS3 transposase InsE
SF3006        1       28   -4.038168  insertion sequence IS3A transposase InsF
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2968  IGASERPTASE  283  3e-80  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 283 bits (726), Expect = 3e-80
Identities = 136/571 (23%), Positives = 223/571 (39%), Gaps = 122/571 (21%)

Query: 33 KKVILGIILSSIYGSYGETAFA-AMLDINNIWTRDYLDLAQNRGEFRPGATNVQLMMKDG 91
KK L I ++ +Y T + A L +++ + + D A+N+G+F GATNV + K+
Sbjct: 4 KKFKLNFIALTV--AYALTPYTEAALVRDDVDYQIFRDFAENKGKFSVGATNVLVKDKNN 61

Query: 92 KIFH--FPE-LPVPDFSAVS-NKGATTSIGGAYSVTATH--------------------N 127
K P +P+ DFS V +K T I Y V H N
Sbjct: 62 KDLGTALPNGIPMIDFSVVDVDKRIATLINPQYVVGVKHVSNGVSELHFGNLNGNMNNGN 121

Query: 128 GTQHHAITTQSWDQTAYKASNRVSS----------------GDFSVHRLNKFVVETTGVT 171
H ++++ + + + + D+ + RL+KFV T
Sbjct: 122 AKAHRDVSSEENRYFSVEKNEYPTKLNGKTVTTEDQTQKRREDYYMPRLDKFV------T 175

Query: 172 ESADFSLSPEDAMKRYGVNYNGKEQ-IIGFRAGAGTTSTILNGKQY-------------- 216
E A S + YN + + R G+G+ G Y
Sbjct: 176 EVAPIEASTASS---DAGTYNDQNKYPAFVRLGSGSQFIYKKGDNYSLILNNHEVGGNNL 232

Query: 217 -LFGQNYNPDLLSASLFNLDWKNKSYIYT--------------NRTPFKNSPIFGDSGSG 261
L G Y ++ + + ++ +N I ++ P N + GDSGS
Sbjct: 233 KLVGDAYTY-GIAGTPYKVNHENNGLIGFGNSKEEHSDPKGILSQDPLTNYAVLGDSGSP 291

Query: 262 SYLYDKEQQKWVFHGVTSTVGFISSTNIAWTNYSLFNNILVNNLKKNFTNTMQLDGKKQE 321
++YD+E+ KW+F G + +W ++++ + ++ + + K
Sbjct: 292 LFVYDREKGKWLFLGSYD--FWAGYNKKSWQEWNIYKSQFTKDVLNKDSAGSLIGSKTDY 349

Query: 322 LSSIIKD-------------------------KDLSVSGGGVLTLKQDTDLGIGGLIFDK 356
S K ++ G G LTL + D G GGL F+
Sbjct: 350 SWSSNGKTSTITGGEKSLNVDLADGKDKPNHGKSVTFEGSGTLTLNNNIDQGAGGLFFEG 409

Query: 357 NQTYKVYGKDKSYKGAGIDIDNNTTVEWNVKGVAGDNLHKIGSGTLDVKIAQGN--NLKI 414
+ K + ++KGAG+ + TV W V D L KIG GTL V+ N +LK+
Sbjct: 410 DYEVKGTSDNTTWKGAGVSVAEGKTVTWKVHNPQYDRLAKIGKGTLIVEGTGDNKGSLKV 469

Query: 415 GNGTVIL------SAEKAFNKIYMAGGKGTVKINAKDALSESGNGEIYFTRNGGTLDLNG 468
G+GTVIL S + AF + + G+ T+ +N + + IYF GG LDLNG
Sbjct: 470 GDGTVILKQQTNGSGQHAFASVGIVSGRSTLVLNDDKQVDPNS---IYFGFRGGRLDLNG 526

Query: 469 YDQSFQKIAATDAGTTVTNSNVKQ-STLSLT 498
+F I D G + N N+ S +++T
Sbjct: 527 NSLTFDHIRNIDDGARLVNHNMTNASNITIT 557


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2973  IGASERPTASE  752  0.0  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 752 bits (1942), Expect = 0.0
Identities = 258/870 (29%), Positives = 400/870 (45%), Gaps = 115/870 (13%)

Query: 45 ICLCYSQISQAGIVRSDIAYQIYRDFAENKGLFVPGANDIPVYDKDGKLVGRL--GKAPM 102
+ + ++A +VR D+ YQI+RDFAENKG F GA ++ V DK+ K +G PM
Sbjct: 15 VAYALTPYTEAALVRDDVDYQIFRDFAENKGKFSVGATNVLVKDKNNKDLGTALPNGIPM 74

Query: 103 ADFSSVSSN-GVATLVSPQYIVSVKH-NGGYRSVSFGN------------------GKNT 142
DFS V + +ATL++PQY+V VKH + G + FGN +N
Sbjct: 75 IDFSVVDVDKRIATLINPQYVVGVKHVSNGVSELHFGNLNGNMNNGNAKAHRDVSSEENR 134

Query: 143 YSLVDRNNHPSI-----------------DFHAPRLNKLVTEVIPSAVTSEGTKANAYKY 185
Y V++N +P+ D++ PRL+K VTEV P ++ + A Y
Sbjct: 135 YFSVEKNEYPTKLNGKTVTTEDQTQKRREDYYMPRLDKFVTEVAPIEASTASSDAGTYND 194

Query: 186 TERYTAFYRVGSGTQYTKDKDGNLVKVAGGYAFKTGGTTGVPLISDATIVSNPGQTYNPV 245
+Y AF R+GSG+Q+ K N + + V I P + +
Sbjct: 195 QNKYPAFVRLGSGSQFIYKKGDNYSLILNNHEVGGNNLKLVGDAYTYGIAGTPYKVNHEN 254

Query: 246 NG---------------------PLPDYGAPGDSGSPLFAYDKQQKKWVIVAVLRAYAGI 284
NG PL +Y GDSGSPLF YD+++ KW+ + +AG
Sbjct: 255 NGLIGFGNSKEEHSDPKGILSQDPLTNYAVLGDSGSPLFVYDREKGKWLFLGSYDFWAGY 314

Query: 285 NGAT-NWWNVIPTDYLNQVMQDDFDAPVDFVSGLGPLNWTYDKTSGTGTLSQGSKNWTMH 343
N + WN+ + + V+ D + + +W+ + + T T + S N +
Sbjct: 315 NKKSWQEWNIYKSQFTKDVLNKD--SAGSLIGSKTDYSWSSNGKTSTITGGEKSLNVDLA 372

Query: 344 GQKDNDLNAGKNLVFSGQNGAIILKDSVTQGAGYLEFKDSYTVSAES-GKTWTGAGIITD 402
KD N GK++ F G +G + L +++ QGAG L F+ Y V S TW GAG+
Sbjct: 373 DGKDKP-NHGKSVTFEG-SGTLTLNNNIDQGAGGLFFEGDYEVKGTSDNTTWKGAGVSVA 430

Query: 403 KGTNVTWKVNGVAGDNLHKLGEGTLTINGTGVNPGGLKTGDGIVVLNQQADTAGNIQAFS 462
+G VTWKV+ D L K+G+GTL + GTG N G LK GDG V+L QQ + +G AF+
Sbjct: 431 EGKTVTWKVHNPQYDRLAKIGKGTLIVEGTGDNKGSLKVGDGTVILKQQTNGSGQ-HAFA 489

Query: 463 SVNLASGRPTVVLGDARQVNPDNISWGYRGGKLDLNGNAVTFTRLQAADYGAVITN-NAQ 521
SV + SGR T+VL D +QV+P++I +G+RGG+LDLNGN++TF ++ D GA + N N
Sbjct: 490 SVGIVSGRSTLVLNDDKQVDPNSIYFGFRGGRLDLNGNSLTFDHIRNIDDGARLVNHNMT 549

Query: 522 QKSQLLLDLKAQDT--------NVSEPTIGNISPFGGTGTPGNLYSMILNSQTRFYILKS 573
S + + ++ T N+ P N F G LY + L + T + + K
Sbjct: 550 NASNITITGESLITDPNTITPYNIDAPDEDNPYAFRRIKDGGQLY-LNLENYTYYALRKG 608

Query: 574 ASYGNTLWGNSLNDPAQWEFVGMNKNKAVQTVKDRILAGRAKQPVIF----HGQLTGNMD 629
AS + L NS W ++G ++A + V + I R + G+ GN++
Sbjct: 609 ASTRSELPKNSGESNENWLYMGKTSDEAKRNVMNHINNERMNGFNGYFGEEEGKNNGNLN 668

Query: 630 VAIPQVPGGRKVIFDGSVNLPEGTLSQDSGTLIFQGHPVIHA-SISGSAPVSLN------ 682
V + + G NL G L+ + GTL G P HA I+G + +
Sbjct: 669 VTFKGKSEQNRFLLTGGTNL-NGDLTVEKGTLFLSGRPTPHARDIAGISSTKKDPHFAEN 727

Query: 683 -----QKDWENRQFTMKTLSLK-DADFHLSRN-ASLNSDIKSDNS---HITLGSDRAFVD 732
+ DW NR F T+++ +A + RN A++ S+I + N HI +
Sbjct: 728 NEVVVEDDWINRNFKATTMNVTGNASLYSGRNVANITSNITASNKAQVHIGYKTGDTVCV 787

Query: 733 KNDGTGNYVIPEEGTSVPDTVNDR-------SQYEGNITLNHNSALDIGSR--FTGGIDA 783
++D TG T D ++D+ + GN+ L ++ +G F
Sbjct: 788 RSDYTGY------VTCTTDKLSDKALNSFNPTNLRGNVNLTESANFVLGKANLFGTIQSR 841

Query: 784 YDSAVSITSPDVLLTAPGAFAGSSLTVHDG 813
+S V +T + G L + +G
Sbjct: 842 GNSQVRLTE-NSHWHLTGNSDVHQLDLANG 870



Score = 78.6 bits (193), Expect = 1e-16
Identities = 41/154 (26%), Positives = 65/154 (42%), Gaps = 18/154 (11%)

Query: 865 DNAALEITRGAHASGDIHASAASTVTIGSDTPAELASAETAASAFAG--------SLLEG 916
+ A + I + + + VT +D ++ A + G + + G
Sbjct: 771 NKAQVHIGYKTGDTVCVRSDYTGYVTCTTDKLSDKALNSFNPTNLRGNVNLTESANFVLG 830

Query: 917 YNAAFNGAITGGRADVSM-HNALWTLGGDSAIHSLTVRNSRI------SSEGDRTFRTLT 969
F + G + V + N+ W L G+S +H L + N I +S + TLT
Sbjct: 831 KANLFGTIQSRGNSQVRLTENSHWHLTGNSDVHQLDLANGHIHLNSADNSNNVTKYNTLT 890

Query: 970 VNKLDATGSDFVLRTDLKN--ADKINVTEKATGS 1001
VN L GS + L TDL N DK+ VT+ ATG+
Sbjct: 891 VNSLSGNGSFYYL-TDLSNKQGDKVVVTKSATGN 923


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2991  PRTACTNFAMLY  43  6e-06  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 43.1 bits (101), Expect = 6e-06
Identities = 124/629 (19%), Positives = 199/629 (31%), Gaps = 75/629 (11%)

Query: 311 QNTGGALVTSTAATVTGTNRLGAFSVVEGKADNVVLENGGRLDVLTGHTATNTRVDDGGT 370
++ + V VT GA + V + + +GG + G A +
Sbjct: 194 EDLPPSRVVLRDTNVTAVPASGAPAAVSVLGASELTLDGGH--ITGGRAAGVAAMQGAVV 251

Query: 371 LDVRNGGTATTVSMGNGGVLLADSGAAVSGTRSDGKAFSIGGGQA----DALMLEKGSSF 426
R G A G AV G G + G +E S
Sbjct: 252 HLQRATIRRGDAPAGGAVPGGAVPGGAVPGGFGPGGFGPVLDGWYGVDVSGSSVELAQSI 311

Query: 427 TLNAGDTATDTTVNGGLFTARGGTLAGTTTLNNGAILTLSGKTV---NNDTLTIR-EGDA 482
A G T GG+L+ G ++ G L+I + A
Sbjct: 312 VEAPELGAAIRVGRGARVTVSGGSLSAPH----GNVIETGGARRFAPQAAPLSITLQAGA 367

Query: 483 LLQGGALTGNGSVEKSGSGTLTVSNTTLTQKAVNLNEGTLTLNDSTVTTDVIAQRGTALK 542
QG AL E LT++ Q + E S DV
Sbjct: 368 HAQGKALLYRVLPEPV---KLTLTGGADAQGDIVATELPSIPGTSIGPLDVALASQARWT 424

Query: 543 LTGSTVLNGAIDPTNVTLASGATWNIPDNATVQSVVDDLSHAGQIHF-TSTRTGKFVPAT 601
GA + ATW + DN+ V ++ L+ G + F G+F
Sbjct: 425 --------GATRAVDSLSIDNATWVMTDNSNVGAL--RLASDGSVDFQQPAEAGRF--KV 472

Query: 602 LKVKNLNGQNGTISLRVRPDMAQNNADRLVIDGGRATGKTILNLVNAGNSASGLATSGKG 661
L V L G +G + V D+ + D+LV+ A+G+ L + N+G+ + T
Sbjct: 473 LTVNTLAG-SGLFRMNVFADLGLS--DKLVVMQD-ASGQHRLWVRNSGSEPASANTL--- 525

Query: 662 IQVVEAINGATTEEGAFIQGNKLQAGAFNYSLNRDSDESWYLRSENAYRAEVPLYASMLT 721
+ V + A T A + K+ G + Y L + + W L A A P
Sbjct: 526 LLVQTPLGSAATFTLAN-KDGKVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQ 584

Query: 722 QAMDYDRILAGSRSHQTGVSGENNSVRLSIQGGHLGHDNNGGIARG-----------ATP 770
+ + ++ G +G + A P
Sbjct: 585 PPQPPQPQPEAPAPQPPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRLNP 644

Query: 771 ESSGSYG--FVRLE------GDLLRTEVAG--------MSVTAGVYGAAGHSSVDVKDDD 814
++ G++G F + + G +VAG ++V G + G + D
Sbjct: 645 DAGGAWGRGFAQRQQLDNRAGRRFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRG 704

Query: 815 GSRAGTVRDDAGSLGGYLNLIHNASGLWADIVAQGTRH-------SMKASSDNNDFRVRG 867
+ G D+ +GGY I + SG + D + +R + +R G
Sbjct: 705 FTGDGGGHTDSVHVGGYATYIAD-SGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHG 763

Query: 868 WGWLGSLETGLPFSITDNLMLEPQLQYTW 896
G SLE G F+ D LEPQ +
Sbjct: 764 VGA--SLEAGRRFTHADGWFLEPQAELAV 790


45  SF3195  SF3209  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias     %GCBias  Product
SF3195        1       19    3.265095  hypothetical protein
SF3196       -1       18    3.413682  GIY-YIG nuclease superfamily protein
SF3197       -1       18    2.946998  hypothetical protein
SF3198       -1       17    3.088108  hypothetical protein
SF3199       -1       14    2.450433  collagenase
SF3200        0       21    1.869818  protease
SF3201        1       25    1.553594  hypothetical protein
SF3202        1       27    1.122990  tryptophan permease
SF3203        4       32    1.046117  inducible ATP-independent RNA helicase
SF3204        5       32    0.534709  lipoprotein NlpI
SF3205        5       37    1.064561  polynucleotide phosphorylase
SF3206        5       32    0.608051  30S ribosomal protein S15
SF3207        3       29    0.593088  tRNA pseudouridine synthase B
SF3208        4       29    0.546874  ribosome-binding factor A
SF3209        2       27    0.850273  translation initiation factor IF-2
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF3209  TCRTETOQM  73  2e-15  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 73.4 bits (180), Expect = 2e-15
Identities = 69/313 (22%), Positives = 109/313 (34%), Gaps = 77/313 (24%)

Query: 388 IMGHVDHGKTSLLDYI-----RSTKVASGEAG-------------GITQHIGAYHVETEN 429
++ HVD GKT+L + + T++ S + G GIT G + EN
Sbjct: 8 VLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWEN 67

Query: 430 GMITFLDTPGHAAFTSMRARGAQATDIVVLVVAADDGVMPQTIEAIQHAKAAQVPVVVAV 489
+ +DTPGH F + R D +L+++A DGV QT + +P + +
Sbjct: 68 TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFI 127

Query: 490 NKIDKPEADPDRV----KNELSQYGI-----------------LPEEWG----------- 517
NKID+ D V K +LS + E+W
Sbjct: 128 NKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGNDDLLE 187

Query: 518 ---------------GESQFV---------HVSAKAGTGIDELLDAILLQAEVLELKAVR 553
ES H SAK GID L++ I +
Sbjct: 188 KYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVIT--NKFYSSTHRG 245

Query: 554 KGMASGAVIESFLDKGRGPVATVLVREGTLHKGDIVL-CGFEYGRVRAMRNELGQEVLEA 612
+ G V + + R +A + + G LH D V E ++ M + E+ +
Sbjct: 246 QSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKIKITEMYTSINGELCKI 305

Query: 613 GPSIPVEILGLSG 625
+ EI+ L
Sbjct: 306 DKAYSGEIVILQN 318


46  SF3469  SF3514  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias     %GCBias  Product
SF3469       -1       25    3.026124  glycerol-3-phosphate ABC transporter permease
SF3470       -2       25    3.287883  glycerol-3-phosphate transporter permease
SF3471       -1       23    3.095609  glycerol-3-phosphate ABC transporter
SF3472       -1       21    3.580291  leucine/isoleucine/valine ABC transporter
SF3473       -1       21    2.967691  leucine/isoleucine/valine ABC transporter
SF3474       -2       22    2.970695  leucine/isoleucine/valine ABC transporter
SF3475       -2       21    2.413964  branched-chain amino acid ABC transporter
SF3476        0       21    2.269008  leucine ABC transporter substrate-binding
SF3477        1       19    1.948123  hypothetical protein
SF3478        1       17    1.795239  leucine ABC transporter substrate-binding
SF3479        2       15    1.338029  RNA polymerase factor sigma-32
SF3480        2       13    1.002450  cell division ABC transporter subunit FtsX
SF3481        3       12    1.375652  cell division protein FtsE
SF3482        2       12    2.929504  signal recognition particle-docking protein
SF3483        0       16    3.493846  16S rRNA m(2)G966-methyltransferase
SF3484        0       14    2.960516  hypothetical protein
SF3485       -1       14    2.981951  receptor
SF3486       -1       14    3.447112  hypothetical protein
SF3487        1       15    2.608844  zinc/cadmium/mercury/lead-transporting ATPase
SF3488        2       18    1.226398  sulfur transfer protein SirA
SF3489        1       15    1.235481  hypothetical protein
SF3490        1       16    2.287758  hypothetical protein
SF3491        1       18    3.446822  major facilitator superfamily transporter
SF3492        0       20    3.837625  hypothetical protein
SF3493        0       23    4.702759  holo-(acyl carrier protein) synthase 2
SF3494        0       23    4.636711  nickel ABC transporter substrate-binding
SF3495        2       22    4.739467  nickel ABC transporter permease NikB
SF3496        2       21    3.920822  nickel ABC transporter permease NikC
SF3497        0       20    3.650356  nickel ABC transporter ATP-binding protein NikD
SF3498        0       19    3.825131  nickel ABC transporter ATP-binding protein NikE
SF3499       -1       14   -0.230020  nickel responsive regulator
SF3500        1       19   -4.274928  hypothetical protein
SF3501        1       20   -4.811594  transporter
SF3502        1       25   -6.851846  ABC transporter ATP-binding protein
SF3503        2       40  -12.606511  hypothetical protein
SF3504        4       47  -15.517089  hypothetical protein
SF3505        6       45  -14.185868  hypothetical protein
SF3507        6       40  -10.753789  hypothetical protein
SF3508        3       34   -7.333665  hypothetical protein
SF3509        4       29   -5.712337  hypothetical protein
SF3510        4       28   -3.813791  hypothetical protein
SF3511        4       25   -3.220371  hypothetical protein
SF3512        3       24   -2.159073  IS2 repressor TnpA
SF3513        4       23   -1.819261  IS2 transposase TnpB
SF3514        2       19   -3.320402  outer membrane pore protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF3471  MALTOSEBP  40  2e-05  Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 39.7 bits (92), Expect = 2e-05
Identities = 41/160 (25%), Positives = 68/160 (42%), Gaps = 14/160 (8%)

Query: 134 GHLLSQPFNSSTSVLYYNKDAFKKAGLDPEQPPKTWQDLADYSAKLKASGIKCGYASGWQ 193
G L++ P L YNKD L P PPKTW+++ +LKA G + +
Sbjct: 127 GKLIAYPIAVEALSLIYNKD------LLP-NPPKTWEEIPALDKELKAKGKSALMFNLQE 179

Query: 194 GWIQLENFSAWNGLPFASKNNGFDGTDAVLEF--NKPEQVKHIAMLEEMNKKGDFSYVGR 251
+ +A G F +N +D D ++ K + +++ + D Y
Sbjct: 180 PYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDY--- 236

Query: 252 KDESTEKFYNGDCAMTTASSGSLANIREYAKFNYGVGMMP 291
+ F G+ AMT + +NI + +K NYGV ++P
Sbjct: 237 -SIAEAAFNKGETAMTINGPWAWSNI-DTSKVNYGVTVLP 274


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF3482  IGASERPTASE  51  9e-09  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 50.8 bits (121), Expect = 9e-09
Identities = 36/181 (19%), Positives = 60/181 (33%), Gaps = 13/181 (7%)

Query: 19 EQTPEKETEVQNEQPVVEEI---VQAQEPAKASEQAVEEQPQAHTEAEAETFAADVVEVT 75
TP + TE E E Q+ + + Q E +A + +A T EV
Sbjct: 1030 PATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTN---EVA 1086

Query: 76 EQVVESEKAQPEAEVVAQPEPVVEETPEPVAIEREELPLPEDVNAEEVSPEEWQAEAETV 135
+ E+++ Q + V +E V E+ + +VSP++ Q+E
Sbjct: 1087 QSGSETKETQTTE--TKETATVEKEEKAKVETEKTQ---EVPKVTSQVSPKQEQSETVQP 1141

Query: 136 EIVEAAEEEAAK--EEITDEELEAQALAAEAAEEAVMVVPPAEEEQPVAEIAQEQEKPTK 193
+ A E + +E + A E + V P E V E P
Sbjct: 1142 QAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPEN 1201

Query: 194 E 194

Sbjct: 1202 T 1202



Score = 47.4 bits (112), Expect = 1e-07
Identities = 40/192 (20%), Positives = 69/192 (35%), Gaps = 26/192 (13%)

Query: 20 QTPEK-ETEVQNEQPVVEEIVQAQE----------PAKASEQAVEEQPQAHTEAE----- 63
TP + +V + EEI + E P++ +E E Q E
Sbjct: 998 TTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQD 1057

Query: 64 AETFAADVVEVTEQVVESEKAQPEAEVVAQPEPVVEETPEPVAIEREELPLPEDVNAEEV 123
A A EV ++ + KA + VAQ +ET + E V EE
Sbjct: 1058 ATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKET------QTTETKETATVEKEEK 1111

Query: 124 SPEEWQAEAETVEIVEAAEEEAAKEEITDEELEAQALAAEAAEEAVMVVPPAEEEQPVAE 183
+ E +T E+ + + + K+E E ++ QA A + V + P + A+
Sbjct: 1112 AKVE---TEKTQEVPKVTSQVSPKQE-QSETVQPQAEPARENDPTVNIKEPQSQTNTTAD 1167

Query: 184 IAQEQEKPTKEG 195
Q ++ +
Sbjct: 1168 TEQPAKETSSNV 1179



Score = 46.2 bits (109), Expect = 3e-07
Identities = 36/177 (20%), Positives = 60/177 (33%), Gaps = 18/177 (10%)

Query: 22 PEKETE---VQNEQPVVEEIVQAQEPAKASEQAVEEQPQAHTEAEAETFAADVVEVTEQV 78
PE E V +QA P+ S + A E TE V
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVD--EAPVPPPAPATPSETTETV 1040

Query: 79 VESEKAQPEAEVVAQPEPVVEETPEPVAIEREELPLPEDVNAEEVSPEEWQAEAETVEIV 138
E+ K + + + + + + + + EV+ Q+ +ET E
Sbjct: 1041 AENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVA----QSGSETKETQ 1096

Query: 139 EAAEEEAAKEEITDEELEAQALAAEAAEEAVMV--VPP----AEEEQPVAEIAQEQE 189
+E A E +E +A+ + E + V P +E QP AE A+E +
Sbjct: 1097 TTETKETATVE---KEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND 1150



Score = 45.1 bits (106), Expect = 6e-07
Identities = 26/159 (16%), Positives = 48/159 (30%), Gaps = 7/159 (4%)

Query: 17 QKEQTPEKETEVQNEQPVVEEIVQAQEPAKASE------QAVEEQPQAHTEAEAETFAAD 70
Q +T E T + E+ VE + P S+ Q+ QPQA E +
Sbjct: 1096 QTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNI 1155

Query: 71 VVEVTEQVVESEKAQPEAEVVAQPEPVVEETPEPVAIEREELPLPEDVNAEEVSPEEWQA 130
++ ++ QP E + E V E+ V + PE+ P
Sbjct: 1156 KEPQSQTNTTADTEQPAKETSSNVEQPVTES-TTVNTGNSVVENPENTTPATTQPTVNSE 1214

Query: 131 EAETVEIVEAAEEEAAKEEITDEELEAQALAAEAAEEAV 169
+ + + + + + A +
Sbjct: 1215 SSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLT 1253



Score = 38.5 bits (89), Expect = 7e-05
Identities = 27/177 (15%), Positives = 52/177 (29%), Gaps = 12/177 (6%)

Query: 17 QKEQTPEKETEVQNEQPVVEEIVQAQEPAKASEQAVEEQPQAHTEAEAETFAAD------ 70
+E E ++ V+ E E + +E + E +
Sbjct: 1065 NREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE-TATVEKEEKAKVETEKTQEVP 1123

Query: 71 --VVEVTEQVVESEKAQPEAEVVAQPEPVV--EETPEPVAIEREELPLPEDVNAEEVSPE 126
+V+ + +SE QP+AE + +P V +E + ++ ++ P
Sbjct: 1124 KVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPV 1183

Query: 127 EWQAEAETV-EIVEAAEEEAAKEEITDEELEAQALAAEAAEEAVMVVPPAEEEQPVA 182
T +VE E E+ +V VP E +
Sbjct: 1184 TESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTS 1240


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF3485  SHIGARICIN  26  0.039  Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 25.9 bits (57), Expect = 0.039
Identities = 6/21 (28%), Positives = 13/21 (61%)

Query: 7 FFIVIIGLIVVAASFRFMQQR 27
+V+I AA ++F++Q+
Sbjct: 173 ALMVLIQSTSEAARYKFIEQQ 193


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF3488  PF01206  105  3e-34  SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 105 bits (265), Expect = 3e-34
Identities = 24/72 (33%), Positives = 41/72 (56%)

Query: 9 DHTLDALGLRCPEPVMMVRKTVRNMQPGETLLIIADDPATTRDIPGFCTFMEHELVAKET 68
D +LDA GL CP P++ +KT+ M GE L ++A DP + +D F HEL+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 69 DGLPYRYLIRKG 80
+ Y + +++
Sbjct: 65 EDGTYHFRLKRA 76


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF3491  TCRTETA  55  2e-10  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 55.2 bits (133), Expect = 2e-10
Identities = 80/398 (20%), Positives = 147/398 (36%), Gaps = 32/398 (8%)

Query: 27 LRLNLRIVSIVMFNFASYLTIGLPLAVLPGYVHDVM--GFSAFWAGLVISLQYFATLLSR 84
++ N ++ I+ + IGL + VLPG + D++ G++++L
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 85 PHAGRYADLLGPKKIVVFGLCGCFLSGLGYLTAGLTASLPVISLLLLCLGRVILGI-GQS 143
P G +D G + +++ L G + + Y L V L +GR++ GI G +
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAG---AAVDYAIMATAPFLWV-----LYIGRIVAGITGAT 112

Query: 144 FAGTGSTLWGVGVVGSL--HIGRVISWNGIVTYGAMAMGAPLGVVFYHWGGLQALALIIM 201
A G+ + + H G + + G +G +G H A AL +
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGL 172

Query: 202 GVALVAILLAIPRPTVK--ASKGKPLPFRAVLGRVWLYGMALALA-----SAGFGVIATF 254
LL + + P + + +A +A V A
Sbjct: 173 NFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAAL 232

Query: 255 ITLFYDAK-GWDGAAFALTLFSCAFVGT---RLLFPNGINRIGGLNVAMICFSVEIIGLL 310
+F + + WD ++L + + + ++ R+G M+ + G +
Sbjct: 233 WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYI 292

Query: 311 LVGVATMPWMAKIG-VLLAGAGFSLVFPALGVVAVKAVPQQNQGAALATYTVFMDLSLGV 369
L+ AT WMA VLLA G + PAL + + V ++ QG + L+ +
Sbjct: 293 LLAFATRGWMAFPIMVLLASGGIGM--PALQAMLSRQVDEERQGQLQGSLAALTSLT-SI 349

Query: 370 TGPLAGLVMSWAGVPV----IYLAAAGLVAIALLLTWR 403
GPL + A + ++A A L + L R
Sbjct: 350 VGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SF3498    HTHFIS  29     0.018    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.4 bits (66), Expect = 0.018
Identities = 10/34 (29%), Positives = 19/34 (55%)

Query: 25 QAVLNNVSLTLKSGETVALLGRSGCGKSTLARLL 58
Q + ++ +++ T+ + G SG GK +AR L
Sbjct: 147 QEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3501    ABC2TRNSPORT  50     5e-09    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 49.9 bits (119), Expect = 5e-09
Identities = 41/171 (23%), Positives = 73/171 (42%), Gaps = 7/171 (4%)

Query: 200 REREHGTVEHLLVMPITPFEIMMAKI-WSMGLVVLVVSGLSLVLMVKGVLGVPIEGSIPL 258
R T E +L + +I++ ++ W+ L +G+ +V G + L
Sbjct: 93 RMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGY----TQWLSLL 148

Query: 259 FMLGV-ALSLFATTSIGIFMGTIARSMPQLGLLVILVLLPLQMLSGGSTPRESMPQMVQD 317
+ L V AL+ A S+G+ + +A S LV+ P+ LSG P + +P + Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 318 IMLTMPTTHFVSLAQAILYRGAGFEIVWPQFLTLMAIGGAFF-TIALLRFR 367
+P +H + L + I+ ++ + I FF + ALLR R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3502    PF05272  30     0.045    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.045
Identities = 9/26 (34%), Positives = 14/26 (53%)

Query: 37 ARCMVGLIGPDGVGKSSLLSLISGAR 62
V L G G+GKS+L++ + G
Sbjct: 595 FDYSVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF3503    RTXTOXIND  84     4e-20    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 84.5 bits (209), Expect = 4e-20
Identities = 71/408 (17%), Positives = 139/408 (34%), Gaps = 81/408 (19%)

Query: 6 RHLAWWVVGALAVAAVVAWWLLRPAGVP-EGFAVSNGRIEATEVDIASKIAGRIDTILVK 64
R +A++++G L +A +++ G +GR + I + I+VK
Sbjct: 58 RLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKE----IKPIENSIVKEIIVK 113

Query: 65 EGQFVREGEVLAKMDTRV----------------LQEQRLEAI----------------- 91
EG+ VR+G+VL K+ L++ R + +
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 92 -------------------AQIKEAQSAVAAAQALLEQRQSETRAAQSLVNQRQAELDSV 132
Q Q+ + L+++++E + +N+ +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 133 AKRHTRSRSLAQRGAISAQQLDDDRAAAESARAALESAKAQVSASKAAIEAARTNIIQ-- 190
R SL + AI+ + + A L K+Q+ ++ I +A+
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 191 -----------AQTRVEAAQATERRIAADID--DSELKAPRDGRV-QYRVAEPGEVLAAG 236
QT T + S ++AP +V Q +V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 237 GRVLNMVDLSDVY-MTFFLPTEQAGTLKLGGEARLILDAAPDLRIPATISFVASVAQFTP 295
++ +V D +T + + G + +G A + ++A P R V V
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYG---YLVGKVKNINL 410

Query: 296 KTVETSDERLKLMFRVKARIPPELLQQHLEYV--KTGLPGVAWVRVNE 341
+E D+RL L+F V I L + + +G+ A ++
Sbjct: 411 DAIE--DQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3511    PF00577  264    9e-83    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 264 bits (675), Expect = 9e-83
Identities = 106/409 (25%), Positives = 176/409 (43%), Gaps = 35/409 (8%)

Query: 2 SITTNRYAS-GYATLTEAVSAQDERNRKRDKNSHDGS------------------TISLS 42
+ RY++ GY + ++ ++ ++++
Sbjct: 475 QLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVT 534

Query: 43 QPLGNIGNLNFNTTRYNSSRGTGNTRSTSLSYSTVWRGITFSINWAKNDLLTSHKWKVDR 102
Q LG L + + + +T + I ++++++ D+
Sbjct: 535 QQLGRTSTLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGR--DQ 592

Query: 103 KLSVGISVPLSLG-------DENQIYASSQMSRSGEQGNNYQVSLSGQ--NSGGVWWDVA 153
L++ +++P S AS MS + G + + V
Sbjct: 593 MLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQ 652

Query: 154 TNITNAHQSQPKSTMNIVQVGKNGSYGQFSSHYSSSENMKQLGANLSGGILITRDGLTFG 213
T ST + G YG + YS S+++KQL +SGG+L +G+T G
Sbjct: 653 TGYAGGGDGNSGSTGY-ATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLG 711

Query: 214 QNVDGTLALIEAPGATGVNVNGWPGLSTDFRGYAILP-VQPYRRDDVILDEKTIGKNYDL 272
Q ++ T+ L++APGA V G+ TD+RGYA+LP YR + V LD T+ N DL
Sbjct: 712 QPLNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDL 771

Query: 273 PQTSQLVVPTAGAVVPATLAVKSGDKGLVTLKQKEGKPIPFGAVISYSKDTENMAGIVGE 332
VVPT GA+V A + G K L+TL KP+PFGA+++ +GIV +
Sbjct: 772 DNAVANVVPTRGAIVRAEFKARVGIKLLMTLTH-NNKPLPFGAMVTSESSQS--SGIVAD 828

Query: 333 DGIAYVSGLSAEGEFNVKWGYSKDQSCIAKYQLPAKKSASGLYQIAATC 381
+G Y+SG+ G+ VKWG ++ C+A YQLP + L Q++A C
Sbjct: 829 NGQVYLSGMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SF3514    ECOLIPORIN  296    e-101    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 296 bits (758), Expect = e-101
Identities = 137/268 (51%), Positives = 165/268 (61%), Gaps = 31/268 (11%)

Query: 31 DTSYARVGVKGETQINPEMTGYGQFELDLEASNRHNPDQ---TRLAYAGLSYKDFGSFDY 87
D +Y RVG KGETQIN ++TGYGQ+E +++A+ TRLA+AGL + D+GSFDY
Sbjct: 53 DQTYMRVGFKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDY 112

Query: 88 SRNVGVAYDAEAFTDMFVEWGGDSWAGTDLFMTNRTNGVATYRNTDFFGMVEGLNFALQY 147
RN GV YD E +TDM E+GGDS+ D +MT R NGVATYRNTDFFG+V+GLNFALQY
Sbjct: 113 GRNYGVLYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQY 172

Query: 148 QGKNEGTGNY----------------KANGDGHGLSATYTID-GFSFAGAYANSDRTDWQ 190
QGKNE NGDG G+S TY I GFS AY SDRT+ Q
Sbjct: 173 QGKNESQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQ 232

Query: 191 SGDGK----GERAEVWALSTKYDANNVYAAVMYGESHNM-------NSDDGDVVNKTQNF 239
G G++A+ W KYDANN+Y A MY E+ NM DG V NKTQNF
Sbjct: 233 VNAGGTIAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNF 292

Query: 240 EAVLQYQFDFGLRPSIGYSYSKALDVAG 267
E QYQFDFGLRP++ + SK D+
Sbjct: 293 EVTAQYQFDFGLRPAVSFLMSKGKDLTY 320


47  SF3528  SF3544  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias     Product
SF3528    -1      20       3.183245    methyltransferase
SF3529    -1      20       2.295181    oligopeptidase A
SF3530    -1      17       1.936417    hypothetical protein
SF3531    0       20       1.492935    glutathione reductase
SF3532    0       21       1.034156    insertion element IS1 protein InsB
SF3533    0       16       -0.698948   insertion element IS1 protein InsA
SF3534    0       22       -5.399950   hypothetical protein
SF3535    1       22       -5.856663   arsenical pump membrane protein
SF3536    1       28       -9.291733   arsenate reductase
SF3537    2       32       -10.515255  insertion element IS1 protein InsB
SF3538    3       36       -11.846790  hypothetical protein
SF3539    2       32       -12.312989  hypothetical protein
SF3540    2       24       -10.926166  carbon starvation outer membrane protein
SF3541    3       24       -10.892450  hypothetical protein
SF3543    2       20       -8.584230   acid-resistance protein
SF3544    1       19       -6.934058   acid-resistance protein
48  SF3587  SF3606  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF3587    1       21       -3.268561  bifunctional glyoxylate/hydroxypyruvate
SF3588    0       21       -4.055788  hypothetical protein
SF3589    -1      14       -2.672511  RNA chaperone/anti-terminator
SF3590    -1      17       -3.821913  insertion element IS150 protein InsB
SF3591    -2      15       -3.798964  insertion element IS150 protein
SF3592    -1      14       -3.439344  insertion element IS1 protein InsB
SF3593    -1      13       -3.569622  cytochrome C peroxidase
SF3594    -1      14       -0.945471  glutamate decarboxylase
SF3595    0       16       -0.893762  DNA-binding transcriptional regulator GadX
SF3596    1       19       0.651119   AraC family transcriptional regulator
SF3600    -2      20       1.002668   insertion element IS1 protein InsA
SF3601    -1      19       0.066119   insertion element IS1 protein InsB
SF3602    -2      16       -0.722340  glycyl-tRNA synthetase subunit beta
SF3603    1       17       -1.838628  glycyl-tRNA synthetase subunit alpha
SF3604    1       20       -4.461014  hypothetical protein
SF3605    0       19       -4.626233  hypothetical protein
SF3606    0       21       -3.284315  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3606    FLGBIOSNFLIP  27     0.017    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 27.5 bits (61), Expect = 0.017
Identities = 19/66 (28%), Positives = 26/66 (39%), Gaps = 1/66 (1%)

Query: 77 MTCLTVFIISVALLMVGLWNATLLLSEKGFYGLAFFLSLFGAVAVQKNIRDAGINPPKET 136
MT T II LL L + + GLA FL+ F V I P E
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAP-PNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEE 119

Query: 137 QVTQEE 142
+++ +E
Sbjct: 120 KISMQE 125


49  SF3635  SF3642  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF3635    0       21       -3.644713  mannitol repressor protein
SF3635a   2       24       -3.825306  hypothetical protein
SF3636    4       21       -2.112585  hypothetical protein
SF3637    3       19       -0.942995  insertion element IS1 protein InsB
SF3638    2       17       -0.693070  insertion element IS1 protein InsA
SF3639    1       15       -0.120754  hypothetical protein
SF3640    2       15       0.973102   hypothetical protein
SF3641    1       15       1.533679   hypothetical protein
SF3642    -1      12       3.059907   L-lactate permease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF3640    OMADHESIN  63     5e-13    Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 63.0 bits (152), Expect = 5e-13
Identities = 52/146 (35%), Positives = 86/146 (58%), Gaps = 3/146 (2%)

Query: 153 GRYSKALGKLSIAMGDSSKAEGANAIALGRSSVASGTDSLAFGRQSLASAANAIAIGAET 212
G + A G SIA+G +++A A+A+G S+A+G +S+A G S A +A+ GA +
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 213 EAAENATAIGNNAKAKGTNSMAMGFGSLADKVNTIALGNGSQALADN--AIAIGQGNKAD 270
A ++ AIG A T +A+GF S AD N++A+G+ S A++ +IAIG +K D
Sbjct: 122 TAQKDGVAIGARASTSDT-GVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKTD 180

Query: 271 GVDAIALGNGSQSRGLNTIALGTASN 296
+++++G+ S +R L +A GT
Sbjct: 181 RENSVSIGHESLNRQLTHLAAGTKDT 206



Score = 53.0 bits (126), Expect = 8e-10
Identities = 56/158 (35%), Positives = 87/158 (55%), Gaps = 21/158 (13%)

Query: 222 GNNAKAKGTNSMAMGFGSLADKVNTIALGNGSQALADNAIAIGQGNKADGVDAIALGNGS 281
G NA AKG +S+A+G + A K +A+G GS A N++AIG +KA G A+ G S
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 282 QSRGLNTIALGTASNATGDKSLALGSNSSANGINSVALGAD----------------SIA 325
++ + +A+G A +T D +A+G NS A+ NSVA+G S
Sbjct: 122 TAQK-DGVAIG-ARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKT 179

Query: 326 DLDNTVSVGNSSLKRKIVNVKNGAIKSDSYDAINGSQL 363
D +N+VS+G+ SL R++ ++ G + DA+N +QL
Sbjct: 180 DRENSVSIGHESLNRQLTHLAAG---TKDTDAVNVAQL 214



Score = 48.4 bits (114), Expect = 3e-08
Identities = 64/214 (29%), Positives = 102/214 (47%), Gaps = 24/214 (11%)

Query: 97 GYDAIAEGQYSSAIGSKTHAIGGASMAFGVSAISEGDRSIALGASSYSLGQYSMALGRYS 156
G +A A+G +S AIG+ A GA ++A+GA S + G S+A+G S
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGA--------------AVAVGAGSIATGVNSVAIGPLS 107

Query: 157 KALGKLSIAMGDSSKAEGANAIALGRSSVASGTDSLAFGRQSLASAANAIAIGAETEAAE 216
KALG ++ G +S A+ + +A+G + S T +A G S A A N++AIG + A
Sbjct: 108 KALGDSAVTYGAASTAQ-KDGVAIGARASTSDT-GVAVGFNSKADAKNSVAIGHSSHVAA 165

Query: 217 N---ATAIGNNAKAKGTNSMAMGFGSLADKVNTIALGNGSQALADNA-----IAIGQGNK 268
N + AIG+ +K NS+++G SL ++ +A G + A I Q N
Sbjct: 166 NHGYSIAIGDRSKTDRENSVSIGHESLNRQLTHLAAGTKDTDAVNVAQLKKEIEKTQENT 225

Query: 269 ADGVDAIALGNGSQSRGLNTIALGTASNATGDKS 302
+ + + ++ LG A+N T KS
Sbjct: 226 NKRSAELLANANAYADNKSSSVLGIANNYTDSKS 259



Score = 40.7 bits (94), Expect = 7e-06
Identities = 49/160 (30%), Positives = 82/160 (51%), Gaps = 9/160 (5%)

Query: 75 VAIGKGAKANTFMNTSGSSTAVGYDAIAEGQYSSAIGSKTHAIGGASMAFGVSAISEGDR 134
+AIG A+A G++ AVG +IA G S AIG + A+G +++ +G ++ ++ D
Sbjct: 73 IAIGATAEA-----AKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAASTAQKD- 126

Query: 135 SIALGASSYSLGQYSMALGRYSKALGKLSIAMGDSS--KAEGANAIALGRSSVASGTDSL 192
+A+GA + S +A+G SKA K S+A+G SS A +IA+G S +S+
Sbjct: 127 GVAIGARA-STSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKTDRENSV 185

Query: 193 AFGRQSLASAANAIAIGAETEAAENATAIGNNAKAKGTNS 232
+ G +SL +A G + A N + + N+
Sbjct: 186 SIGHESLNRQLTHLAAGTKDTDAVNVAQLKKEIEKTQENT 225


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3641    PF03895  67     6e-16    Serum resistance protein DsrA.
		>PF03895#Serum resistance protein DsrA.

Length = 79

Score = 67.2 bits (164), Expect = 6e-16
Identities = 19/79 (24%), Positives = 36/79 (45%), Gaps = 2/79 (2%)

Query: 913 ESKLSGGIASAMAMTGLPQAYTPGASMASIGGGTYNGESAVALGV-SMVSANGRWVYKLQ 971
+L G+A+ A++ L Q G + S G Y ++A+A+GV S ++ +
Sbjct: 2 SKELQTGLANQSALSMLVQPNGVGKTSVSAAVGGYRDKTALAIGVGSRITDRFTAKAGVA 61

Query: 972 GSTNSQGEYSAALGAGIQW 990
+T + G S G ++
Sbjct: 62 FNTYN-GGMSYGASVGYEF 79


50  SF3657  SF3666  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF3657    0       16       -5.044327   2-amino-3-ketobutyrate CoA ligase
SF3658    1       27       -8.386435   hypothetical protein
SF3659    2       28       -9.480884   ADP-L-glycero-D-manno-heptose-6-epimerase
SF3660    3       38       -12.165224  ADP-heptose:LPS heptosyltransferase II
SF3661    4       48       -16.060369  ADP-heptose:LPS heptosyl transferase I
SF3662    4       47       -16.772249  O-antigen ligase RfaL
SF3663    3       42       -14.246882  lipopolysaccharide
SF3664    3       36       -12.398973  UDP-glucose:(galactosyl) LPS
SF3665    1       29       -9.659794   lipopolysaccharide core biosynthesis protein
SF3666    0       25       -7.263267   lipopolysaccharide 1,3-galactosyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3659    NUCEPIMERASE  104    7e-28    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 104 bits (260), Expect = 7e-28
Identities = 77/348 (22%), Positives = 127/348 (36%), Gaps = 67/348 (19%)

Query: 2 IIVTGGAGFIGSNIVKALNDKGITDILVVDNLKD--------------GTKFVNLVDLDI 47
+VTG AGFIG ++ K L + G ++ +DNL D +D+
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 48 ADYMDKEDFLIQIMAGEEFGDVEAIFHEGACSSTTEWDGKYMMDNNYQYSK-------EL 100
AD + + + A F E +F + +Y ++N + Y+ +
Sbjct: 62 ADR----EGMTDLFASGHF---ERVFISPHRLAV-----RYSLENPHAYADSNLTGFLNI 109

Query: 101 LHYCLEREIP-FLYASSAATYGGRTSD-FIESREYEKPLNVYGYSKFLFDEYVRQILPEA 158
L C +I LYASS++ YG F + P+++Y +K +
Sbjct: 110 LEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLY 169

Query: 159 NSQIVGFRYFNVYGPREGHKGSMASVAFHLNTQLNNGESPKLFEGSENFKRDFVYVGDVA 218
G R+F VYGP + MA F + G+S ++ KRDF Y+ D+A
Sbjct: 170 GLPATGLRFFTVYGPWG--RPDMA--LFKFTKAMLEGKSIDVY-NYGKMKRDFTYIDDIA 224

Query: 219 DVNL------------WFLENGVSG-------IFNLGTGRAESFQAVADATLAY-HKKGQ 258
+ + W +E G ++N+G A + +
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAK 284

Query: 259 IEYIPFPDKLKGRYQAFTQADLTNLRAA-GYDKPFKTVAEGVTEYMAW 305
+P G T AD L G+ P TV +GV ++ W
Sbjct: 285 KNMLPLQ---PGDVL-ETSADTKALYEVIGF-TPETTVKDGVKNFVNW 327


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF3666    RTXTOXINA  32     0.003    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 32.2 bits (73), Expect = 0.003
Identities = 25/117 (21%), Positives = 45/117 (38%), Gaps = 10/117 (8%)

Query: 60 HVFTDYISDKDKLYFSDL-------AKQYNSRINIYVINCDKLKSLPSTKNWTYATYFRF 112
H+ D +DKL +D+ ++ N I + S+ T+ +F
Sbjct: 860 HIIDDDGGKEDKLSLADIDFRDVAFKREGNDLIMYKGEG--NVLSIGHKNGITFRNWFEK 917

Query: 113 IIADYFYHKHEKILYLDADIACKGSIKELLDYQFSTNEIAAVVAERDIEWWQNRASV 169
D H+ E+I I S+K+ L+YQ N A+ V D + ++ +
Sbjct: 918 ESGDISNHEIEQIFDKSGRIITPDSLKKALEYQ-QRNNKASYVYGNDALAYGSQGDL 973


51  SF3698  SF3737  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF3698    2       27       -4.577914  *integrase
SF3699    0       23       -1.524341  hypothetical protein
SF3700    0       25       0.026908   hypothetical protein
SF3703    0       23       -0.058843  hypothetical protein
SF3706    0       24       -0.057892  insertion sequence element IS629 protein
SF3707    1       28       -3.525372  hypothetical protein
SF3708    3       26       -1.027041  IS2 transposase TnpB
SF3709    5       21       -0.880505  IS2 repressor TnpA
SF3710    4       20       0.074783   hypothetical protein
SF3712    4       19       0.981507   insertion element IS1 protein InsB
SF3713    4       19       -0.058079  hypothetical protein
SF3714    4       18       1.247648   membrane transport protein
SF3715    3       17       -0.421761  siderophore biosynthesis protein
SF3716    3       19       -0.448498  siderophore biosynthesis protein
SF3717    3       18       -1.130240  siderophore biosynthesis protein
SF3718    3       19       -3.120184  lysine:N6-hydroxylase
SF3719    3       18       -2.501106  ferric siderophore receptor
SF3720    2       25       -3.648488  serine protease-like protein
SF3721    1       24       -3.562733  IS2 transposase TnpB
SF3722    2       19       -4.112992  IS2 repressor TnpA
SF3723    0       13       -3.190561  long polar fimbriae
SF3724    -1      18       -0.388385  insertion element IS1 protein InsB
SF3725    -1      21       -0.238101  insertion element IS1 protein InsA
SF3726    -2      22       -0.217457  fimbrial protein
SF3727    -3      21       -0.047790  phosphate ABC transporter substrate-binding
SF3728    -2      14       -0.163000  high-affinity phosphate ABC transporter
SF3729    -2      13       -0.656585  phosphate ABC transporter permease subunit PtsA
SF3730    -2      11       -1.576399  phosphate ABC transporter ATP-binding protein
SF3731    -2      12       -3.293550  transcriptional regulator PhoU
SF3732    -1      13       -3.394962  transcriptional antiterminator BglG
SF3733    0       13       -2.463713  PTS system beta-glucoside-specific transporter
SF3735    0       15       -2.685663  insertion element IS1 protein InsA
SF3736    1       16       -2.763152  insertion element IS1 protein InsB
SF3737    1       17       -3.515274  receptor protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF3703    RTXTOXIND  26     0.024    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 26.3 bits (58), Expect = 0.024
Identities = 15/80 (18%), Positives = 27/80 (33%), Gaps = 6/80 (7%)

Query: 17 PEFRNEALKLAERIGVAAAARELSLYESQLYAWRSKQQQ-----QMSSSERESELAAENV 71
PE + + + R SL + Q W++++ Q +ER + LA N
Sbjct: 166 PELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARIN- 224

Query: 72 RLKRQLAEQAEELSILQKAA 91
R + + L
Sbjct: 225 RYENLSRVEKSRLDDFSSLL 244


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3714    TCRTETA  48     5e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 47.5 bits (113), Expect = 5e-08
Identities = 81/375 (21%), Positives = 135/375 (36%), Gaps = 41/375 (10%)

Query: 20 FSAGLLGIGQNGLLVVLPVLVIQTNLSLSV---WAALLMLGSMLFLPSSPWWGKQISRTG 76
+ L +G ++ VLP L+ S V + LL L +++ +P G R G
Sbjct: 12 STVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFG 71

Query: 77 SKPVVLWALGGYGISFTLLGLGSVLMATSAITTAVGLGILIIARIAYGLTVSAMVPACQV 136
+PV+L +L G + + ++ L +L I RI G+T + A
Sbjct: 72 RRPVLLVSLAGAAVDYAIMATAPFLW------------VLYIGRIVAGITGATGAVAGAY 119

Query: 137 WALQRAGEGNRMAALATISSGLSCGRLFGPLCAAAMLAIHPLAPLGLLMAAPVLALLMLL 196
A R +S+ G + GP+ M P AP A L L
Sbjct: 120 IA-DITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGC 178

Query: 197 RL------PGTPPQPTPECKSVSLKRDCLPYLLCAILLAAAVSMMQLGLSPAL------T 244
L P ++ R + A L+A M +G PA
Sbjct: 179 FLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGE 238

Query: 245 RQFATDTTAISQQVAWLLGLSAVAALIAQFGVLRPQRLTPVALLLSAGVLMSGGLAIMLS 304
+F D T I +A L ++A + G + + AL+L +G + + +
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMI-TGPVAARLGERRALMLGMIADGTGYILLAFA 297

Query: 305 EQLWLFYPGCAVLSFGAALATPAYQLLLNDKLADGAGAGWLATSHTLGYGLCALLVPLVS 364
+ W+ +P +L+ G + PA Q +L+ + D G L L L S
Sbjct: 298 TRGWMAFPIMVLLASG-GIGMPALQAMLS-RQVDEERQGQLQ----------GSLAALTS 345

Query: 365 KTGVAIALIMAALFA 379
T + L+ A++A
Sbjct: 346 LTSIVGPLLFTAIYA 360


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3715    PF04183  339    e-111    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 339 bits (872), Expect = e-111
Identities = 104/480 (21%), Positives = 178/480 (37%), Gaps = 46/480 (9%)

Query: 56 ELLIPLDEQKSLHFRVAYFSPTQHHRF-----AFPARLVTASGSYPVDFTTLSRLIIDKL 110
E + + Q + + P RF + + A D L++ ++ +L
Sbjct: 24 EQVFHAESQGDDRYCIN--LPGAQWRFIAERGIWGWLWIDAQTLRCADEPVLAQTLLMQL 81

Query: 111 RHQLFLPVPLCETFHQRVLESHVHTQQAIDARHDWAALREKALNFGEAEQALLTGHAFHP 170
+ L + Q + + + Q + AR +A LN + Q LL+GH
Sbjct: 82 KQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASDLINLNA-DRLQCLLSGHPKFV 140

Query: 171 APKSHEPFNRREAERYLPDMAPHFPLRWFSVDKTQIAGES-LHLNLQQRLTRFAAENAPQ 229
K + + ERY P+ A F L W +V + + +++ Q LT A PQ
Sbjct: 141 FNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEMDIHQLLT---AAMDPQ 197

Query: 230 LLNELS--------DNQWLF-PLHPWQGEYLLQQGWCQALVAKGLIKDLGEAGTSWLPTT 280
S D+ WL P+HPWQ + + + A+G + LGE G WL
Sbjct: 198 EFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADF-AEGRMVSLGEFGDQWLAQQ 256

Query: 281 SSRSLYCATSRD--MIKFSLSVRLTNSIRTLSVKEVKRGMRLARLAQ----TDGWQMLQ- 333
S R+L A+ R IK L++ T+ R + + + G +R Q TD +
Sbjct: 257 SLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWLQQVFATDATLVQSG 316

Query: 334 ---VRFPTFRVMQEDGWAGLLDLNGNIMQESLFALRENLLVDQPKSQTNVLVSLTQAAPD 390
+ P + +G+A L + REN ++ VL++ +
Sbjct: 317 AVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLKPDESPVLMATLMECDE 376

Query: 391 GGDSLLVSAVKRLSDRLGITVQQAAHAWVDAYCQQVLKPLFTAEADYGLVLLAHQQNILV 450
L + + DR G+ A W+ + V+ PL+ YG+ L+AH QNI +
Sbjct: 377 NNQPLAGAYI----DRSGLD----AETWLTQLFRVVVVPLYHLLCRYGVALIAHGQNITL 428

Query: 451 QMLGDLPVGFIYRDCQGSAFMPHATDWLDSIGEAQAENIFTHEQLLRYFPYYLLVNSTFA 510
M +P + +D QG M + + E + L++
Sbjct: 429 AMKEGVPQRVLLKDFQGD--MRLVKEEFPEMDSLPQE----VRDVTSRLSADYLIHDLQT 482


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3717    PF04183  816    0.0      IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 816 bits (2109), Expect = 0.0
Identities = 565/580 (97%), Positives = 571/580 (98%)

Query: 1 MNHKDWDFVNRRLVAKMLSEMEYEQVFHAESQGDDHYCINLPGAQWRFIAERGIWGWLWI 60
MNHKDWD VNRRLVAKMLSE+EYEQVFHAESQGDD YCINLPGAQWRFIAERGIWGWLWI
Sbjct: 1 MNHKDWDLVNRRLVAKMLSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIWGWLWI 60

Query: 61 DAQTLRCTDEPVLAQTLLMQLKPVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASD 120
DAQTLRC DEPVLAQTLLMQLK VLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASD
Sbjct: 61 DAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASD 120

Query: 121 LINLDADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYTNTFRLHWLAVKREHMIWRC 180
LINL+ADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEY NTFRLHWLAVKREHMIWRC
Sbjct: 121 LINLNADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRC 180

Query: 181 DNDLDIQQLLTAAMDPQEFTRFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG 240
DN++DI QLLTAAMDPQEF RFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG
Sbjct: 181 DNEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG 240

Query: 241 RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR 300
RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR
Sbjct: 241 RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR 300

Query: 301 WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK 360
WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK
Sbjct: 301 WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK 360

Query: 361 PDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI 420
PDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI
Sbjct: 361 PDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI 420

Query: 421 AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEAFPEMDSLPQEVRDVTSRLSADYLIHDL 480
AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKE FPEMDSLPQEVRDVTSRLSADYLIHDL
Sbjct: 421 AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVTSRLSADYLIHDL 480

Query: 481 QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMNKHPQMAERFALFSLFRPQIIR 540
QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYM KHPQM+ERFALFSLFRPQIIR
Sbjct: 481 QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMKKHPQMSERFALFSLFRPQIIR 540

Query: 541 VVLNPVKLTWPDLDGGSRMLPNYLENLQNPLWLVTQEYES 580
VVLNPVKLTWPDLDGGSRMLPNYLE+LQNPLWLVTQEYES
Sbjct: 541 VVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWLVTQEYES 580


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SF3720    IGASERPTASE  80     2e-19    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 80.5 bits (198), Expect = 2e-19
Identities = 48/214 (22%), Positives = 74/214 (34%), Gaps = 52/214 (24%)

Query: 35 NRKLVATMLSLAVAGTVNA---ANIDISNVWARDYLDLAQNKGIFQPGATDVTITLKNGD 91
N+K ++L VA + A + +V + + D A+NKG F GAT+V + KN
Sbjct: 3 NKKFKLNFIALTVAYALTPYTEAALVRDDVDYQIFRDFAENKGKFSVGATNVLVKDKNNK 62

Query: 92 KF--SFHN-LSIPDFSGAAAS-GAATAIGGSYSVTVAH-----------------NKKNP 130
+ N + + DFS AT I Y V V H N N
Sbjct: 63 DLGTALPNGIPMIDFSVVDVDKRIATLINPQYVVGVKHVSNGVSELHFGNLNGNMNNGNA 122

Query: 131 QAAETQVYAQSSYKVVDRRNSN-------------------DFEIQRLNKFVVETVGATP 171
+A ++ Y V++ D+ + RL+KFV E
Sbjct: 123 KAHRDVSSEENRYFSVEKNEYPTKLNGKTVTTEDQTQKRREDYYMPRLDKFVTEVAPIEA 182

Query: 172 AETNPTTYSDALERYGIVTSDGSKKIIGFRAGSG 205
+ + +D +K R GSG
Sbjct: 183 STAS---------SDAGTYNDQNKYPAFVRLGSG 207


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3723    PF00577  756    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 756 bits (1954), Expect = 0.0
Identities = 328/870 (37%), Positives = 484/870 (55%), Gaps = 54/870 (6%)

Query: 6 IVVGLTAGTCLIFSQNLMAEVSVFNPALLEINHQSGVDIRQFNRANLMPPGVYSVDIFIN 65
V L L + FNP L + Q+ D+ +F +PPG Y VDI++N
Sbjct: 26 FFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLN 85

Query: 66 GKMFERQDVTFVQDNPDADLHACFIAIKKTLSSFGIKVDALKSFNDVDETVCLDPAPRIE 125
+DVTF + + + C L+S G+ ++ N + + C+ I
Sbjct: 86 NGYMATRDVTFNTGDSEQGIVPCLTR--AQLASMGLNTASVSGMNLLADDACVPLTSMIH 143

Query: 126 GSSWQFDSDKLQLNISIHQIYMDAMAYDYISPTRWDEGINALTINYDFSGSHTLRSDYGS 185
++ Q D + +LN++I Q +M A YI P WD GINA +NY+FSG+ +
Sbjct: 144 DATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSV--QNRIG 201

Query: 186 QETDTSYLNLRNGLNIGPWRLRNYSTLN------TSDGRAEYNSISTWIQRDIAALRSQI 239
+ +YLNL++GLNIG WRLR+ +T + +S + ++ I+TW++RDI LRS++
Sbjct: 202 GNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRL 261

Query: 240 MIGDTWTASDIFDSTQIRGARLYTDNDMLPASQNGFAPVVRGIAKSNATVIIRQNGYVIY 299
+GD +T DIFD RGA+L +D++MLP SQ GFAPV+ GIA+ A V I+QNGY IY
Sbjct: 262 TLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIY 321

Query: 300 QSAVPQGAFEITDLNTASTGGDLDVTIKEEDGSEQRFTQPYASLAILKREGLTDVDVSVG 359
S VP G F I D+ A GDL VTIKE DGS Q FT PY+S+ +L+REG T ++ G
Sbjct: 322 NSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAG 381

Query: 360 ELRDEDG--FTPDVLQAQILHGFSHGITLYGGMQAAENYGSAALGVGKDLGALGAISFDV 417
E R + P Q+ +LHG G T+YGG Q A+ Y + G+GK++GALGA+S D+
Sbjct: 382 EYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDM 441

Query: 418 THARANFSHDDTETGQSYRFLYSKLFDDTDTSLRLVGYRYSTEGYYTLNEWASRRNS--- 474
T A + D GQS RFLY+K +++ T+++LVGYRYST GY+ + R +
Sbjct: 442 TQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYN 501

Query: 475 --------------PEDFWETGNRRSRVEGTLTQSLGRDYGNLYLTLSRQQYWHTDDVER 520
+ + N+R +++ T+TQ LGR LYL+ S Q YW T +V+
Sbjct: 502 IETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQTYWGTSNVDE 560

Query: 521 LMQFGYSSSWKRLSWNVSWSYSNTARQGTGNNHASDNTSEQIYMLSLSVPLSGW------ 574
Q G +++++ ++W +S+S + A +Q+ L++++P S W
Sbjct: 561 QFQAGLNTAFEDINWTLSYSLTKN---------AWQKGRDQMLALNVNIPFSHWLRSDSK 611

Query: 575 --WGNSYATYSVSQNDNSGSSHQLGLSGTALERNNLSWNLMQSYNSHDDEVGGN---MSL 629
W ++ A+YS+S + N ++ G+ GT LE NNLS+++ Y D G+ +L
Sbjct: 612 SQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATL 671

Query: 630 TYDGSYGTVNGSYNYSQNSQRLNYGIRGGILAHSEGVTLSQELGETIALVKAPGAAGLEI 689
Y G YG N Y++S + ++L YG+ GG+LAH+ GVTL Q L +T+ LVKAPGA ++
Sbjct: 672 NYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKV 731

Query: 690 DNMRGAATDWRGYTVKTQLNPYDENRVAISDNYFSKSNIELDNTVVTMVPTRGAVVKAEF 749
+N G TDWRGY V Y ENRVA+ N + N++LDN V +VPTRGA+V+AEF
Sbjct: 732 ENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLA-DNVDLDNAVANVVPTRGAIVRAEF 790

Query: 750 VTHVGYRVLFRVLNANGKPVPFGAIAAIQDASLADSGIVGDRGELYLSGLPEKGQVTLSW 809
VG ++L L N KP+PFG A + S SGIV D G++YLSG+P G+V + W
Sbjct: 791 KARVGIKLLMT-LTHNNKPLPFG--AMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKW 847

Query: 810 GENASTKCIFNYSFSTPESESGLIEQGVTC 839
GE + C+ NY + L + C
Sbjct: 848 GEEENAHCVANYQLPPESQQQLLTQLSAEC 877


52  SF3785  SF3815  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF3785    3       15       2.752801   hypothetical protein
SF3786    2       16       3.050370   hypothetical protein
SF3787    1       19       3.828744   hypothetical protein
SF3788    1       18       4.451465   transcriptional regulator
SF3789    1       18       4.402823   multidrug resistance protein D
SF3790    0       17       3.413714   acetolactate synthase catalytic subunit
SF3791    -1      15       2.975084   acetolactate synthase 1 regulatory subunit
SF3792    -1      15       1.836794   DNA-binding transcriptional activator UhpA
SF3793    -1      15       1.325900   sensory histidine kinase UhpB
SF3794    1       15       0.425279   regulatory protein UhpC
SF3795    0       13       -0.560104  sugar phosphate antiporter
SF3796    1       18       -1.161183  cryptic adenine deaminase
SF3797    0       17       -2.575493  hypothetical protein
SF3798    0       22       -5.449206  hypothetical protein
SF3799    0       24       -5.637915  transporter
SF3800    1       29       -5.673831  transporter
SF3801    2       28       -5.230821  insertion sequence element IS600 transposase
SF3802    2       25       -3.315400  insertion sequence element IS600 protein
SF3804    0       18       -2.081727  serine protease-like protein
SF3805    0       14       1.052524   IS2 repressor TnpA
SF3806    0       14       1.338321   IS2 transposase TnpB
SF3807    1       21       1.461778   fimbrial protein
SF3808    1       22       1.939023   insertion sequence IS4 transposase InsG
SF3809    3       35       1.841838   glucosamine--fructose-6-phosphate
SF3810    3       34       1.627145   bifunctional N-acetylglucosamine-1-phosphate
SF3811    4       39       1.679203   ATP synthase F0F1 subunit epsilon
SF3812    4       40       1.568252   ATP synthase F0F1 subunit beta
SF3813    3       33       0.466263   ATP synthase F0F1 subunit gamma
SF3814    4       33       0.169744   ATP synthase F0F1 subunit alpha
SF3815    2       20       -1.162729  ATP synthase F0F1 subunit delta
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3789    TCRTETB  46     9e-08    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 46.4 bits (110), Expect = 9e-08
Identities = 33/141 (23%), Positives = 62/141 (43%), Gaps = 1/141 (0%)

Query: 4 VMGAYLLTYGVSQLFYGPISDRVGRRPVILVGMSIFMLATLVA-VTTSSLTVLIAASAMQ 62
V A++LT+ + YG +SD++G + ++L G+ I +++ V S ++LI A +Q
Sbjct: 54 VNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQ 113

Query: 63 GMGTGVGGVMARTLPRDLYERTQLRHANSLLNMGILVSPLLAPLIGGLLDTMWNWRACYL 122
G G + + + A L+ + + + P IGG++ +W L
Sbjct: 114 GAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLL 173

Query: 123 FLLVLCAGVTFSMARWMPETR 143
++ V F M E R
Sbjct: 174 IPMITIITVPFLMKLLKKEVR 194


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SF3792    HTHFIS  61     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.4 bits (149), Expect = 2e-13
Identities = 29/174 (16%), Positives = 59/174 (33%), Gaps = 20/174 (11%)

Query: 2 ITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGRGVQVCICDISMPDIS 61
T+ + DD +R+ Q L V + + + + D+ MPD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLELLSQLPK---GMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHTVATG 118
+LL ++ K + +++S ++ +A GA +L K ELI +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR---- 117

Query: 119 GCYLTPDIAIKLASGRQDPLTKRERQVAEKLAQG---MAVKEIAAELGLSPKTV 169
A+ R L + + + + + A L + T+
Sbjct: 118 --------ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTL 163


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3793    PF06580  40     2e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.8 bits (93), Expect = 2e-05
Identities = 28/142 (19%), Positives = 56/142 (39%), Gaps = 11/142 (7%)

Query: 365 LRPRQLDDLTLEQAIRSLMREMELEGRGIVSHLEWRIDESALSENQRVTLFRVCQEGLNN 424
LR ++L + + ++L L++ + + +V + Q + N
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPM-LVQTLVEN 266

Query: 425 IVKHA-----DASAVTLQGWQQDERLMLVIEDDGSGLPPGSGQ-QGFGLTGMRERVTALG 478
+KH + L+G + + + L +E+ GS + + G GL +RER+ L
Sbjct: 267 GIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLY 326

Query: 479 G---TLHISCLHG-TRVSVSLP 496
G + +S G V +P
Sbjct: 327 GTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3794    TCRTETB  41     8e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 40.6 bits (95), Expect = 8e-06
Identities = 65/408 (15%), Positives = 137/408 (33%), Gaps = 60/408 (14%)

Query: 29 RHILLTIWLGYALFY--FTRKSFNAAVPEILANGVLSRSDIGLLATLFYITYGVSKFVSG 86
RH + IWL F+ N ++P+I + + + T F +T+ + V G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 87 IVSDRSNARYFMGIGLIATGIMNILFGFSTSLWAFAVLWVLNAFFQGWGS---PVCARLL 143
+SD+ + + G+I +++ S F L ++ F QG G+ P ++
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHS---FFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 144 TAWY-SRTERGGWWALWNTAHNVGGALIPIVMAASALHYGWRAGMMIAGCMAIVVGIFLC 202
A Y + RG + L + +G + P + A + W ++ M ++ +
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIPMITIITVPF- 184

Query: 203 WRLRDRPQALGLPAVGEWRHDALEIAQQQEGAGLTRKEILTKYVLLNPYIWLLSFCYVLV 262
+ L +I G L I+ + Y VL
Sbjct: 185 --------LMKLLKKEVRIKGHFDIK----GIILMSVGIVFFMLFTTSYSISFLIVSVLS 232

Query: 263 YVV-----RAAINDWGNLYMSETLGVDLVTANTAVTMFELGGFI-----------GALVA 306
+++ R + + + + + + + + + GF+ A
Sbjct: 233 FLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTA 292

Query: 307 GWGSDKLFNGNRGPMNLIFAAGILL-SVGSLWLMPFASYVMQATCFFTIGFFVFGPQMLI 365
GS +F G + + GIL+ G L+++ + + F T F + +
Sbjct: 293 EIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL-SVSFLTASFLLETTSWFM 351

Query: 366 ---------GMAAAECS---------HKEAAGAATGFVGLFAYLGASL 395
G++ + ++ AGA + ++L
Sbjct: 352 TIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGT 399


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3795    TCRTETB  34     0.001    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.1 bits (78), Expect = 0.001
Identities = 28/168 (16%), Positives = 61/168 (36%), Gaps = 17/168 (10%)

Query: 49 FNIAQNDMISTYGLSMTQLGMIGLGFSITYGVGKTLVSYYADGKNTKQFLPFMLILSAIC 108
N++ D+ + + + F +T+ +G + +D K+ L F +I++ C
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIIN--C 90

Query: 109 MLGFSASMGSGSVSLFLMIAFYALSGFFQSTGGSCSYSTI----TKWTPRRKRGTFLGFW 164
+G SL +M + F Q G + + + ++ P+ RG G
Sbjct: 91 FGSVIGFVGHSFFSLLIM------ARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLI 144

Query: 165 NISHNLGGAGAAGVALFGANYLFDGHVIGMFIFPSIIALIVGFIGLRY 212
+G + A+Y+ + + + P + I+ L
Sbjct: 145 GSIVAMGEGVGPAIGGMIAHYIHWSY---LLLIP--MITIITVPFLMK 187


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SF3796    UREASE  40     3e-05    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 39.7 bits (93), Expect = 3e-05
Identities = 30/105 (28%), Positives = 43/105 (40%), Gaps = 17/105 (16%)

Query: 22 AVSRGDAVADYIIDNVSILDLINAGEISGPIVIKGRYIAGVG-AEYADT---------PA 71
V+R D +I N ILD + G + I +K IA +G A D P
Sbjct: 60 QVTREGGAVDTVITNALILD--HWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPG 117

Query: 72 LQRIDARGATAVPGFIDAHLHIESSMMTPVTFETATLPRGLTTVI 116
+ I G G +D+H+H + P E A L GLT ++
Sbjct: 118 TEVIAGEGKIVTAGGMDSHIH----FICPQQIEEA-LMSGLTCML 157


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3799    TCRTETA  39     2e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.4 bits (92), Expect = 2e-05
Identities = 35/208 (16%), Positives = 71/208 (34%), Gaps = 13/208 (6%)

Query: 88 IIVEFLPVSLLTP----MAQDLGISEGVAGQSVTVTAFVAMFASLFITQTIQATDR--RY 141
+ ++ + + L+ P + +DL S V + A A+ +DR R
Sbjct: 14 VALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRR 73

Query: 142 VVILFAVLL-TLSCLLVSFANSFSLLLIGRACLGLALGGFWAMSASLTMRLVPPRTVPKA 200
V+L ++ + +++ A +L IGR G+ G A++ + + +
Sbjct: 74 PVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDGDERARH 132

Query: 201 LSVIFGAVSIALVIAAPLGSFLGELIGWRNVFNAAAVMG----VLCIFWIIKSLPSLPGE 256
+ +V LG +G F AAA + + F + +S
Sbjct: 133 FGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRP 191

Query: 257 PSHQKQNTFRLLQRPGVMAGMIAIFMSF 284
+ N + M + A+ F
Sbjct: 192 LRREALNPLASFRWARGMTVVAALMAVF 219


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SF3804    IGASERPTASE  185    2e-54    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 185 bits (470), Expect = 2e-54
Identities = 84/316 (26%), Positives = 139/316 (43%), Gaps = 29/316 (9%)

Query: 4 YKNDKTFRNLEIFGDSGSGAYLYDNKLEKWVLVGTTHGIASVNGDQLTWITKYNDKLVSE 63
+ N + GDSGS ++YD + KW+ +G+ A N Y + +
Sbjct: 273 ILSQDPLTNYAVLGDSGSPLFVYDREKGKWLFLGSYDFWAGYNKKSWQEWNIYKSQFTKD 332

Query: 64 LKDTYS----------HKINLNGNNVTIKNTDITLHQNNADTTGTQEKITKDKDIVFTNG 113
+ + S + + NG TI + +L N D ++K K + F
Sbjct: 333 VLNKDSAGSLIGSKTDYSWSSNGKTSTITGGEKSL---NVDLADGKDKPNHGKSVTFEGS 389

Query: 114 GNVLFKDNLDFGSGGIIFDEGHEYNINGQRFTFKGAGIDIGKESIVNWNALYSSDDVLHK 173
G + +N+D G+GG+ F+ +E T+KGAG+ + + V W D L K
Sbjct: 390 GTLTLNNNIDQGAGGLFFEGDYEVKGTSDNTTWKGAGVSVAEGKTVTWKVHNPQYDRLAK 449

Query: 174 IGPGTLNVQKKQG--ANIKIGEGNVILNEEG------TFNNIYLASGNGKVILNKDNSLG 225
IG GTL V+ ++K+G+G VIL ++ F ++ + SG ++LN D +
Sbjct: 450 IGKGTLIVEGTGDNKGSLKVGDGTVILKQQTNGSGQHAFASVGIVSGRSTLVLNDDKQVD 509

Query: 226 NDQYAGIFFTKRGGTLDLNGHNQTFTRIAATDDGTTITNSDTKKEAVLAINNEDSYIYHG 285
+ I+F RGG LDLNG++ TF I DDG + N + + + I E
Sbjct: 510 PN---SIYFGFRGGRLDLNGNSLTFDHIRNIDDGARLVNHNMTNASNITITGESLI---- 562

Query: 286 NINGNIKLTHNINSQD 301
+ N +NI++ D
Sbjct: 563 -TDPNTITPYNIDAPD 577



Score = 32.7 bits (74), Expect = 0.002
Identities = 52/280 (18%), Positives = 84/280 (30%), Gaps = 42/280 (15%)

Query: 71 KINLNGNNVT---IKNTDI-TLHQNNADTTGTQEKITKDKDIVFTNGGNVLFKDNLDFGS 126
+++LNGN++T I+N D N+ T + IT + I N D D +
Sbjct: 521 RLDLNGNSLTFDHIRNIDDGARLVNHNMTNASNITITGESLITDPNTITPYNIDAPDEDN 580

Query: 127 GGIIFDEGHEYNI-----NGQRFTFKGAG------IDIGKESIVNWNALYSSDDVLHKIG 175
+ N + + ES NW + + D +
Sbjct: 581 PYAFRRIKDGGQLYLNLENYTYYALRKGASTRSELPKNSGESNENWLYMGKTSDEAKRNV 640

Query: 176 PGTLNVQKKQGANIKIGEGNVILNEEGTFN-NIYLASGNGKVILNKDNSLGNDQYAGIFF 234
+N ++ G N GE N G N S + +L +L D
Sbjct: 641 MNHINNERMNGFNGYFGEEEGKNN--GNLNVTFKGKSEQNRFLLTGGTNLNGD------L 692

Query: 235 TKRGGTLDLNG----HNQTFTRIAATDDGTTITNSDT------------KKEAVLAINNE 278
T GTL L+G H + I++T ++ K + N
Sbjct: 693 TVEKGTLFLSGRPTPHARDIAGISSTKKDPHFAENNEVVVEDDWINRNFKATTMNVTGNA 752

Query: 279 DSY--IYHGNINGNIKLTHNINSQDKKTNAKLILDGSVNT 316
Y NI NI ++ + S T
Sbjct: 753 SLYSGRNVANITSNITASNKAQVHIGYKTGDTVCVRSDYT 792


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF3810    RTXTOXINA  29     0.047    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.2 bits (65), Expect = 0.047
Identities = 23/80 (28%), Positives = 31/80 (38%), Gaps = 10/80 (12%)

Query: 367 LGDAEIGDNVNIGAGTITCNYDGANKFKTIIGDDVFVGSDTQLVAPVTVGKGATIAAGTT 426
LGD + D V + AG+ N G DV T G AT A T
Sbjct: 616 LGDGD--DKVFLSAGSA--NIYAGK------GHDVVYYDKTDTGYLTIDGTKATEAGNYT 665

Query: 427 VTRNVGENALAISRVPQTQK 446
VTR +G + + V + Q+
Sbjct: 666 VTRVLGGDVKVLQEVVKEQE 685


53  SF3926  SF3944  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF3926    -1      16       -3.065858  protoporphyrinogen oxidase
SF3927    -1      21       -5.900050  *molybdopterin-guanine dinucleotide biosynthesis
SF3928    -2      20       -6.284356  molybdopterin-guanine dinucleotide biosynthesis
SF3929    -1      22       -7.555635  hypothetical protein
SF3930    -2      16       -4.645731  serine/threonine protein kinase
SF3931    -2      14       -4.304730  periplasmic protein disulfide isomerase I
SF3932    -1      15       -3.971364  hypothetical protein
SF3933    0       13       -1.402219  acyltransferase
SF3933a   1       15       0.484981   hypothetical protein
SF3934    0       14       1.318136   DNA polymerase I
SF3935    0       16       1.842354   hypothetical protein
SF3936    2       22       1.779846   Der GTPase activator
SF3937    1       20       1.431312   coproporphyrinogen III oxidase
SF3938    1       17       0.226523   nitrogen regulation protein NR(I)
SF3939    1       17       -1.730247  nitrogen regulation protein NR(II)
SF3940    2       20       -2.318885  glutamine synthetase
SF3941    1       13       -2.931771  GTP-binding protein
SF3942    0       18       -5.629480  transcriptional regulator
SF3943    0       19       -6.548850  hypothetical protein
SF3944    0       13       -3.383580  resistance protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF3936    SECA  30     0.004    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 30.2 bits (68), Expect = 0.004
Identities = 11/71 (15%), Positives = 30/71 (42%)

Query: 14 AKARRKTREELDQEARDRKRQKKRRGHAPGSRAAGGNTTSGSKGQNAPKDPRIGSKTPIP 73
+K + + EE+++ + R+ + +R ++ + + + ++G P P
Sbjct: 827 SKVQVRMPEEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCP 886

Query: 74 LGVTEKVTKQH 84
G +K + H
Sbjct: 887 CGSGKKYKQCH 897


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SF3938    HTHFIS  597    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 597 bits (1542), Expect = 0.0
Identities = 206/478 (43%), Positives = 300/478 (62%), Gaps = 11/478 (2%)

Query: 1 MQRGIVWVVDDDSSIRWVLERALAGAGLTCTTFENGAEVLEALASKTPDVLLSDIRMPGM 60
M + V DDD++IR VL +AL+ AG N A + +A+ D++++D+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGLALLKQIKQRHPMLPVIIMTAHSDLDAAVSAYQQGAFDYLPKPFDIDEAVALVERAIS 120
+ LL +IK+ P LPV++M+A + A+ A ++GA+DYLPKPFD+ E + ++ RA++
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 HYQEQQQPRNIQLNGPTTDIIGEAQAMQDVFRIIGRLSRSSISVLINGESGTGKELVAHA 180
+ + ++G + AMQ+++R++ RL ++ ++++I GESGTGKELVA A
Sbjct: 121 EPKRRPSKLEDDSQDGM-PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 181 LHRHSPRTKAPFIALNMAAIPKDLIESELFGHEKGAFTGANTIRQGRFEQADGGTLFLDE 240
LH + R PF+A+NMAAIP+DLIESELFGHEKGAFTGA T GRFEQA+GGTLFLDE
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 241 IGDMPLDVQTRLLRVLADGQFYRVGGYAPVKVDVRIIAATHQNLEQRVQEGKFREDLFHR 300
IGDMP+D QTRLLRVL G++ VGG P++ DVRI+AAT+++L+Q + +G FREDL++R
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 301 LNVIRVHLPPLRERREDIPRLARHFLQVAARELGVEAKLLHPETEAALTRLAWPGNVRQL 360
LNV+ + LPPLR+R EDIP L RHF+Q A +E G++ K E + WPGNVR+L
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 361 ENTCRWLTVMAAGQEVLIQDLPGELFESTVAESTSQMQPDSWATLLAQWADRALRS---- 416
EN R LT + + + + EL + S + ++Q + +R
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 417 -----GHQNLLSEAQPELERTLLTTALRHTQGHKQEAARLLGWGRNTLTRKLKELGME 469
L E+E L+ AL T+G++ +AA LLG RNTL +K++ELG+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3939    PF06580  28     0.042    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.3 bits (63), Expect = 0.042
Identities = 34/190 (17%), Positives = 72/190 (37%), Gaps = 41/190 (21%)

Query: 171 IIEQADRLRNLVDRL---LGPQLPGTRVTE-SIHKVAERV---VTLVSMELPDNVRLIRD 223
I+E + R ++ L + L + + S+ V + L S++ D ++
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 224 YDPSLPELAHDPDQIEQVLLN-IVRNALQ---ALGPEGGEIILRTRTAFQLTLHGERYRL 279
+P++ ++ Q+ +L+ +V N ++ A P+GG+I+L+
Sbjct: 246 INPAIMDV-----QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGT------KDNGTVT- 293

Query: 280 AARIDVEDNGPGIPPHLQDTLFYPMVSGREGGTGLGLSIARNLIDQHSGK---IEFTSWP 336
++VE+ G + ++ TG GL R + G I+ +
Sbjct: 294 ---LEVENTGSLALKNTKE------------STGTGLQNVRERLQMLYGTEAQIKLSEKQ 338

Query: 337 GHTEFSVYLP 346
G V +P
Sbjct: 339 GKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF3941    TCRTETOQM  180    4e-51    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 180 bits (458), Expect = 4e-51
Identities = 97/445 (21%), Positives = 170/445 (38%), Gaps = 81/445 (18%)

Query: 4 KLRNIAIIAHVDHGKTTLVDKLLQQSGTFDSRAETQE--RVMDSNDLEKERGITILAKNT 61
K+ NI ++AHVD GKTTL + LL SG + D+ LE++RGITI T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 62 AIKWNDYRINIVDTPGHADFGGEVERVMSMVDSVLLVVDAFDGPMPQTRFVTKKAFAYGL 121
+ +W + ++NI+DTPGH DF EV R +S++D +L++ A DG QTR + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 122 KPIVVINKVDRPGARPDWVVDQVFD-------------LFVNLDATDEQLD--------- 159
I INK+D+ G V + + L+ N+ T+
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 160 --------------------------------FPIVYASALNGIAGLDHEDMAEDMTPLY 187
FP+ + SA N I G+D+ L
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNI-GIDN---------LI 231

Query: 188 QAIVDHVPAPDVDLDGPFQMQISQLDYNSYVGVIGIGRIKRGKVKPNQQVTIIDSEGKTR 247
+ I + + ++ +++Y+ + R+ G + V I + E
Sbjct: 232 EVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-- 289

Query: 248 NAKVGKVLGHLGLERIETDLAEAGDIVAITGLGELNISDTVCDTQNVEALPALSVDEPTV 307
K+ ++ + E + D A +G+IV + L ++ + DT+ + + P +
Sbjct: 290 --KITEMYTSINGELCKIDKAYSGEIVILQNEF-LKLNSVLGDTKLLPQRERIENPLPLL 346

Query: 308 SMFFCVNTSPFCGKEGKFVTSRQILDRLNKELVHNVALRVEETEDADAFRVSGRGELHLS 367
+ + D L LR +S G++ +
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPL---------LRYYVDSATHEIILSFLGKVQME 397

Query: 368 VLIENMRRE-GFELAVSRPKVIFRE 391
V ++ + E+ + P VI+ E
Sbjct: 398 VTCALLQEKYHVEIEIKEPTVIYME 422



Score = 32.5 bits (74), Expect = 0.005
Identities = 13/75 (17%), Positives = 29/75 (38%), Gaps = 1/75 (1%)

Query: 398 EPYENVTLDVEEQHQGSVMQALGERKGDLKNMNPDGKGRVRLDYVIPSRGLIGFRSEFMT 457
EPY + + +++ + ++ + V L IP+R + +RS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKN-NEVILSGEIPARCIQEYRSDLTF 595

Query: 458 MTSGTGLLYSTFSHY 472
T+G + + Y
Sbjct: 596 FTNGRSVCLTELKGY 610


54  SF4010  SF4018  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
SF4010    3       12       2.976316  ATP-dependent protease subunit HslV
SF4011    2       13       3.034821  essential cell division protein FtsN
SF4012    1       13       2.534731  DNA-binding transcriptional regulator CytR
SF4013    2       16       4.230848  primosome assembly protein PriA
SF4014    0       18       3.157011  50S ribosomal protein L31
SF4015    -1      17       3.517392  hypothetical protein
SF4016    -2      17       3.577573  transcriptional repressor protein MetJ
SF4017    -2      17       3.490552  cystathionine gamma-synthase
SF4018    -2      18       3.327175  bifunctional aspartate kinase II/homoserine
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SF4011    IGASERPTASE  42     2e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 42.0 bits (98), Expect = 2e-06
Identities = 32/155 (20%), Positives = 64/155 (41%), Gaps = 5/155 (3%)

Query: 114 LTPEQRQLLEQMQADMRQQPTQLVEVPWNEQTPEQRQQTLQRQRQAQQLAEQQRLAQQSR 173
+ +QAD+ P+ E+ ++ P + +AE + Q+S+
Sbjct: 992 VDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSK--QESK 1049

Query: 174 TTEQSWQQQT-RTSQAAPVQAQPRQSKPASTQQPYQDLLQTPAHTTAQSKPQQAAPVARA 232
T E++ Q T T+Q V + + + A+TQ + T ++ ++ A V +
Sbjct: 1050 TVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKE 1109

Query: 233 ADAPKPTAEKKDERRWMVQCGSFRGAEQAETVRAQ 267
A T + ++ + Q + EQ+ETV+ Q
Sbjct: 1110 EKAKVETEKTQEVPKVTSQVSPKQ--EQSETVQPQ 1142


55  SF4182  SF4218  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF4182    2       24       -0.008237  insertion sequence element ISSfl4 transposase
SF4184    -1      24       -0.013688  insertion sequence element ISSfl4 transposase
SF4185    -1      17       1.395104   IS2 repressor TnpA
SF4186    -2      15       1.197695   IS2 transposase TnpB
SF4187    0       16       0.370222   hypothetical protein
SF4188    -1      15       0.974502   hypothetical protein
SF4189    -1      16       1.071827   hypothetical protein
SF4190    -1      18       -1.731744  isoaspartyl dipeptidase
SF4191    2       17       -2.996016  cell density-dependent motility repressor
SF4192    1       16       -0.461539  DNA replication/recombination/repair protein
SF4193    -1      18       0.480887   insertion element IS1 protein InsB
SF4194    0       17       0.590080   insertion element IS1 protein InsA
SF4195    -2      15       0.162465   hypothetical protein
SF4196    -1      17       1.889828   DNA-binding transcriptional repressor UxuR
SF4197    -1      16       1.518208   D-mannonate oxidoreductase
SF4198    0       15       0.182573   mannonate dehydratase
SF4199    3       15       -0.464691  fructuronate transporter
SF4200    1       22       -2.606953  minor fimbrial subunit, D-mannose specific
SF4201    1       24       -2.593371  minor fimbrial subunit
SF4202    0       23       -2.857609  minor fimbrial subunit
SF4203    -1      24       -3.122733  insertion element IS1 protein InsB
SF4204    1       30       -4.711374  insertion element IS1 protein InsA
SF4206    1       30       -5.996152  periplasmic chaperone
SF4207    1       30       -6.375904  fimbrial protein
SF4208    1       30       -6.355259  major type 1 subunit fimbrin
SF4209    1       28       -6.426883  tyrosine recombinase
SF4210    0       27       -6.223119  recombinase
SF4211    1       25       -5.620487  hypothetical protein
SF4212    -1      23       -4.696443  N-acetylneuraminic acid mutarotase
SF4213    1       21       -2.631164  hypothetical protein
SF4214    2       23       -2.867648  insertion element IS1 protein InsB
SF4215    2       22       -2.594044  insertion element IS1 protein InsA
SF4216    2       23       -3.375214  DeoR family transcriptional regulator
SF4217    3       23       -1.186177  hypothetical protein
SF4218    2       24       -0.916163  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF4182    RTXTOXIND  41     7e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 41.4 bits (97), Expect = 7e-06
Identities = 9/74 (12%), Positives = 33/74 (44%), Gaps = 1/74 (1%)

Query: 16 LRKQQSRLRQYACQVAGYEQEIERLKAQLDRLRRMLFGQSSEKKRHKLENQIRQAEKRLS 75
+ +Q+++ + ++ Y+ ++E++++++ + + K L+ ++RQ +
Sbjct: 254 VLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILD-KLRQTTDNIG 312

Query: 76 ELENRLNTARNLLE 89
L L +
Sbjct: 313 LLTLELAKNEERQQ 326


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SF4190    UREASE  35     4e-04    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 35.1 bits (81), Expect = 4e-04
Identities = 21/85 (24%), Positives = 37/85 (43%), Gaps = 20/85 (23%)

Query: 26 CDVLVANGKIIAVASNIPSDIVPNCT--------VVDLSGQILCPGFIDQHVHLIGGGGE 77
D+ + +G+I A+ D+ P T V+ G+I+ G +D H+H I
Sbjct: 86 ADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIVTAGGMDSHIHFI----- 140

Query: 78 AGPTTRTPEVALSRLTEAGVTSVVG 102
+ E AL +G+T ++G
Sbjct: 141 ---CPQQIEEALM----SGLTCMLG 158


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF4199    PF06580  31     0.008    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.4 bits (71), Expect = 0.008
Identities = 10/49 (20%), Positives = 25/49 (51%)

Query: 230 LVPLIPAIIMISTTIANIWLVKDTPAWEVVNFIGSSPIAMFIAMVVAFV 278
+ +I ++ I +W V +T W ++ FI + P+A + + ++ +
Sbjct: 73 MGQIILRVLPACVVIGMVWFVANTSIWRLLAFINTKPVAFTLPLALSII 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4200    SURFACELAYER  28     0.047    Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 28.1 bits (62), Expect = 0.047
Identities = 19/79 (24%), Positives = 32/79 (40%), Gaps = 1/79 (1%)

Query: 211 SQNLGYYLSGTTADAGNSIFTNTASFSPAQGVGVQLTRNGTIIPANNTVSLGAVGTSAVS 270
S+N G ++ +A+ N FT PA V V L ++G ++ + + +
Sbjct: 133 SENAGKEITIGSAN-PNVTFTEKTGDQPASTVKVTLDQDGVAKLSSVQIKNVYAIDTTYN 191

Query: 271 LGLTANYARTGGQVTAGNV 289
+ TG VT G V
Sbjct: 192 SNVNFYDVTTGATVTTGAV 210


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4201    VACCYTOTOXIN  33     4e-04    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 33.5 bits (76), Expect = 4e-04
Identities = 30/158 (18%), Positives = 49/158 (31%), Gaps = 9/158 (5%)

Query: 3 WCKRGYVLAAMLALASATIQAADVTITVNGKVVAKPCTVSTTNATVDLGDLYSFSLMSAG 62
W R + A LA + +TI + VT VN + + + + G
Sbjct: 258 WMGRLQYVGAYLAPSYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTH------IG 311

Query: 63 AASAWHDVALELTNCPVG--TSRVTASFSGAADSTGYYKNQGTAQNIQLELQDDSGNTLN 120
W L + P G + S + Q ++QN + N+
Sbjct: 312 TLDLWQSAGLNIIAPPEGGYKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQVINPPNSAQ 371

Query: 121 TGATKTVQVDDSSQSAHFPLQVRALTVNGGATQGTIQA 158
+ QV D + V +N A GTI+
Sbjct: 372 KTEIQPTQVIDGPFAGGKNTVVNINRINTNA-DGTIRV 408


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SF4217    HTHFIS  27     0.013    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 26.7 bits (59), Expect = 0.013
Identities = 7/45 (15%), Positives = 16/45 (35%), Gaps = 1/45 (2%)

Query: 6 KRYPEEFKTEAVKQVVDR-GYSVASVATRLDITTHSLYAWIKKYG 49
R E + + + + A L + ++L I++ G
Sbjct: 430 DRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELG 474


56  SF4283  SF4288  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
SF4283    3       20       -0.377263  hypothetical protein
SF4284    4       21       -0.678835  insertion element IS1 protein InsB
SF4285    2       14       0.064639   insertion sequence element IS600 protein
SF4286    2       13       0.136412   lysine/cadaverine transporter
SF4287    2       15       0.968368   insertion element IS1 protein InsA
SF4288    2       20       -0.065434  insertion element IS1 protein InsB
57  SF4319  SF4334  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
SF4319    -1      14       3.035007  hypothetical protein
SF4322    -1      13       3.063241  ****hypothetical protein
SF4321    -1      13       3.166049  carbohydrate kinase
SF4323    -1      13       2.411698  ADP-binding protein
SF4324    -1      12       2.865903  N-acetylmuramoyl-L-alanine amidase
SF4325    0       13       2.531190  DNA mismatch repair protein
SF4326    2       18       1.589689  tRNA delta(2)-isopentenylpyrophosphate
SF4327    4       24       1.673434  RNA-binding protein Hfq
SF4328    4       22       1.585546  GTPase HflX
SF4329    4       22       2.054318  FtsH protease regulator HflK
SF4330    4       22       1.849989  FtsH protease regulator HflC
SF4331    3       21       2.009187  hypothetical protein
SF4332    4       20       2.063474  adenylosuccinate synthetase
SF4333    3       15       1.556064  transcriptional repressor NsrR
SF4334    2       15       1.849534  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF4328    SECA  32     0.005    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 32.2 bits (73), Expect = 0.005
Identities = 26/144 (18%), Positives = 54/144 (37%), Gaps = 6/144 (4%)

Query: 282 HVIDAADVRVQENIEAVNTVLEEIDAHEIPTLLVMNKIDMLEDFEPRIDRDEENK-PIRV 340
++D +DV N + IDA+ P L ++ + + R+ D + PI
Sbjct: 665 ELLDVSDVSETINSIREDVFKATIDAYIPPQSL--EEMWDIPGLQERLKNDFDLDLPIAE 722

Query: 341 WLSAQTGAGIPQLFQALTERLSGEVAQHTLRLPPQEGRLRSRFYQLQAIEKEWMEEDGSV 400
WL + L + + + + + + R + LQ ++ W E ++
Sbjct: 723 WLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAAM 782

Query: 401 SLQVRMPIVDWRRLCKQEPALIDY 424
+R I R +++P +Y
Sbjct: 783 D-YLRQGIH-LRGYAQKDP-KQEY 803


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF4329    cloacin  32     0.006    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 31.6 bits (71), Expect = 0.006
Identities = 25/81 (30%), Positives = 30/81 (37%), Gaps = 10/81 (12%)

Query: 17 GSSKPGGNSEGNGNKGGRDQGPPDLDDIFRKLSKKLGGLGGGKGTGSGGGSSSQGP---- 72
S G +SE N GG G G GGG GTG G S+ P
Sbjct: 33 ASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTG-GNLSAVAAPVAFG 91

Query: 73 -----RPQLGGRVVTIAAAAI 88
P GG V+I+A A+
Sbjct: 92 FPALSTPGAGGLAVSISAGAL 112


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF4334    RTXTOXIND  31     0.026    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.026
Identities = 12/55 (21%), Positives = 24/55 (43%), Gaps = 1/55 (1%)

Query: 179 VVPDDSRLSFDILIPPDQIMGARMGFVVVVELTQRPTRRTKAV-GKIVEVLGDNM 232
+VP+D L L+ I +G ++++ P R + GK+ + D +
Sbjct: 359 IVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAI 413


58  SF4353a  SF4368  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF4353a   3       29       -1.932917  membrane protein
SF4354    4       33       -0.362989  30S ribosomal protein S6
SF4353b   5       31       -0.641926  primosomal replication protein N
SF4355    5       29       -1.274626  30S ribosomal protein S18
SF4356    3       26       -2.188987  50S ribosomal protein L9
SF4358    3       22       -1.818469  hypothetical protein
SF4359    1       22       -5.767671  hypothetical protein
SF4360    0       18       -5.389222  insertion sequence element IS600 protein
SF4361    0       15       -4.877846  insertion sequence element IS600 transposase
SF4362    0       14       -4.591531  hypothetical protein
SF4363    -1      13       -4.116277  endoribonuclease SymE
SF4364    -2      12       -3.428838  hypothetical protein
SF4365    -2      17       -0.382374  restriction modification enzyme M subunit
SF4366    -1      20       -0.004275  Type I restriction enzyme EcoAI R protein
SF4367    1       26       3.159234   insertion element IS1 protein InsA
SF4368    1       26       3.184149   insertion element IS1 protein InsB
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SF4358    HTHFIS  27     0.013    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 26.7 bits (59), Expect = 0.013
Identities = 7/45 (15%), Positives = 16/45 (35%), Gaps = 1/45 (2%)

Query: 4 KRYPEEFKTEAVKQVVDR-GYSVASVATRLDITTHSLYAWIKKYG 47
R E + + + + A L + ++L I++ G
Sbjct: 430 DRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELG 474


59  SF0330  SF0337  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
SF0330    1       16       1.274510  NagC-like transcriptional regulator
SF0333    1       12       1.529635  MFS transport protein AraJ
SF0334    0       13       1.786740  exonuclease SbcC
SF0335    -1      12       1.803760  exonuclease SbcD
SF0336    0       14       1.923073  transcriptional regulator PhoB
SF0337    0       13       1.470774  phosphate regulon sensor protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0330    ACETATEKNASE  29     0.021    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 29.4 bits (66), Expect = 0.021
Identities = 17/69 (24%), Positives = 29/69 (42%), Gaps = 10/69 (14%)

Query: 233 FISGTGFATDYRRLSGHALKGSEIIRLVEESDPVAELALRRYELRLAKSLAHVVNILDP- 291
+G ++D+R L A + D A+LAL + R+ K++ +
Sbjct: 273 VYGISGISSDFRDLEDAAF---------KNGDKRAQLALNVFAYRVKKTIGSYAAAMGGV 323

Query: 292 DVIVLGGGM 300
DVIV G+
Sbjct: 324 DVIVFTAGI 332


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF0333    TCRTETA  53     1e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 52.9 bits (127), Expect = 1e-09
Identities = 73/356 (20%), Positives = 126/356 (35%), Gaps = 36/356 (10%)

Query: 5 ILSLALGTFGLGMAEFGIMSVLTELAHNVGISIPAAGH---MISYYALVVVVGAPIIALF 61
+ ++AL G+G+ IM VL L ++ S H +++ YAL+ AP++
Sbjct: 11 LSTVALDAVGIGL----IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 62 SSRYSLKHILLFLVALCVIGNAMFTLSSSYLMLAIGRLVSGFPHGAFFGVGAIVLSKIIK 121
S R+ + +LL +A + A+ + +L IGR+V+G GA + I
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIAD--IT 124

Query: 122 PGKVTAAVAGMVSGMTVANLLGIP-LGTYLSQECWRYTFLLIAVFNIAVMASVYFWVPDI 180
G A G +S ++ P LG + F A N + F +P+
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 181 RDEAKGNLREQ----------FHFLRSPAPWLI--FAATMFGNAGVFAWFSYVKPYMMFI 228
+ LR + + A + F + G W +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG------E 238

Query: 229 SGFSETAMTFIMMLVGLGM---VLGNMLSGRISGRYSPLRIAAVTDFIIVLALLMLFFCG 285
F A T + L G+ + M++G ++ R R + ++L F
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 286 GMKTTSLIFAFICCAGLFALSAPLQILLLQNAKGGELLGAAGGQIAF--NLGSAVG 339
I + G+ LQ +L + E G G +A +L S VG
Sbjct: 299 RGWMAFPIMVLLASGGIG--MPALQAMLSRQV-DEERQGQLQGSLAALTSLTSIVG 351


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF0334    RTXTOXIND  39     7e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.4 bits (92), Expect = 7e-05
Identities = 34/199 (17%), Positives = 71/199 (35%), Gaps = 14/199 (7%)

Query: 671 QQEAQSWQQRQNELTALQNRIQQLTPILETLPQSDDLPHSEETVALDNWRQVHEQCLALH 730
+ + Q + Q R Q L+ +E + E + +V +
Sbjct: 133 EADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIK 192

Query: 731 SQQQTLQQQDVLAAQSLQKAQAQFDTAL--------QASVFDDQQAFLAALMDEQTLTQL 782
Q T Q Q +L K +A+ T L + V + ++L+ +Q +
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIA-- 250

Query: 783 EQLKQNLENQRRQAQTLVTQTAETLAQHQQHRPDGLALTVTVEQIQQEL-AQTHQKLREN 841
K + Q + V + +Q +Q + L+ + + Q + KLR+
Sbjct: 251 ---KHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQT 307

Query: 842 TTSQGEIRQQLKQDADNRQ 860
T + G + +L ++ + +Q
Sbjct: 308 TDNIGLLTLELAKNEERQQ 326



Score = 39.4 bits (92), Expect = 7e-05
Identities = 25/204 (12%), Positives = 59/204 (28%), Gaps = 18/204 (8%)

Query: 487 EARIKTLEAQRAQLQAGQPCPLCGSTSHPAVEAYQALEPGVNQSRLLALENEVKKLGEEG 546
EA ++ Q + Q ++E + E + +E + L
Sbjct: 133 EADTLKTQSSLLQARLEQ---TRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLT- 188

Query: 547 AALRGQLDALTKQLQRDENEAQSLRQDEQALTQQWQAVTASLNITLQPQDDIQPWLDAQD 606
+ ++ Q Q + E R + + + + DD L Q
Sbjct: 189 SLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQA 248

Query: 607 -------EHERQL-RLLSQRHELQGQIAAHNQQIIQYQQQIEQRQQQLLTALAGYALTLP 658
E E + +++ + Q+ +I+ +++ + Q L
Sbjct: 249 IAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLF------KNEILD 302

Query: 659 QEDEEESWLATRQQEAQSWQQRQN 682
+ + + E ++RQ
Sbjct: 303 KLRQTTDNIGLLTLELAKNEERQQ 326



Score = 32.5 bits (74), Expect = 0.009
Identities = 16/150 (10%), Positives = 42/150 (28%), Gaps = 5/150 (3%)

Query: 731 SQQQTLQQQDVLAAQSLQKAQAQFDTA----LQASVFDDQQAFLAALMDEQTLTQLEQLK 786
+ Q + A + Q + L D+ F +E+ L +K
Sbjct: 134 ADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVS-EEEVLRLTSLIK 192

Query: 787 QNLENQRRQAQTLVTQTAETLAQHQQHRPDGLALTVTVEQIQQELAQTHQKLRENTTSQG 846
+ + Q + A+ + L L + ++
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKH 252

Query: 847 EIRQQLKQDADNRQQQQTLMQQIAQMTQQV 876
+ +Q + + + + Q+ Q+ ++
Sbjct: 253 AVLEQENKYVEAVNELRVYKSQLEQIESEI 282


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SF0335    FRAGILYSIN  31     0.010    Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.

Length = 405

Score = 30.8 bits (69), Expect = 0.010
Identities = 14/70 (20%), Positives = 25/70 (35%), Gaps = 4/70 (5%)

Query: 149 KQQHLLAAITDYYQQHYADACKLRGDQPLPIIATGHLTTVGSSKSDAVRDIYIGTLDAFP 208
K+ ++ I ++Y + + + I T S+ D + + I A
Sbjct: 135 KEAQMMNEIAEFYAAPFKKTRAINEKEAFECI-YDSRTR--SAGKD-IVSVKINIDKAKK 190

Query: 209 AQNFPPADYI 218
N P DYI
Sbjct: 191 ILNLPECDYI 200


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SF0336    HTHFIS  95     1e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.5 bits (235), Expect = 1e-24
Identities = 32/149 (21%), Positives = 63/149 (42%), Gaps = 9/149 (6%)

Query: 4 RILVVEDEAPIREMVCFVLEQNGFQPVEAEDYDSAVNQLNEPWPDLILLDWMLPGGSGIQ 63
ILV +D+A IR ++ L + G+ + + + DL++ D ++P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 FIKHLKRESMTRDIPVVMLTARGEEEDRVRGLETGADDYITKPFSPKELVARIKAVMRRI 123
+ +K+ D+PV++++A+ ++ E GA DY+ KPF EL+ I +
Sbjct: 65 LLPRIKKARP--DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA-- 120

Query: 124 SPMAVEEVIKMQGLSLNPTSHRVMAGEEP 152
E + L + + G
Sbjct: 121 -----EPKRRPSKLEDDSQDGMPLVGRSA 144


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF0337    PF06580  34     0.001    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.1 bits (78), Expect = 0.001
Identities = 19/105 (18%), Positives = 33/105 (31%), Gaps = 26/105 (24%)

Query: 325 LVYNAVNH----TPEGTHITVRWQRVPHGAEFSVEDNGPGIAPEHIPRLTERFYRVDKAR 380
LV N + H P+G I ++ + VE+ G
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN---------------- 306

Query: 381 SRQTGGSGLGLAIVKHAVNH---HESRLNIESTVGKGTRFSFVIP 422
+G GL V+ + E+++ + GK +IP
Sbjct: 307 --TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM-VLIP 348


60  SF0377  SF0385  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF0377    0       18       -0.427346  muropeptide transporter
SF0378    2       27       -1.271972  polymerase/proteinase
SF0379    4       25       -0.620596  murein genes regulator
SF0380    3       27       -0.536111  hypothetical protein
SF0381    3       26       -0.201769  trigger factor
SF0382    0       19       0.082362   ATP-dependent Clp protease proteolytic subunit
SF0383    1       20       -0.174252  ATP-dependent protease ATP-binding subunit ClpX
SF0384    0       18       -0.222617  DNA-binding ATP-dependent protease La
SF0385    -1      13       -0.389406  transcriptional regulator HU subunit beta
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF0377    TCRTETA  39     3e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.0 bits (91), Expect = 3e-05
Identities = 71/347 (20%), Positives = 135/347 (38%), Gaps = 20/347 (5%)

Query: 62 KFLWSPLMDRYTPPFFGRRRGWLLATQILLLVAIAAMGFLEPGTQLRWMAALAVVIAFCS 121
+F +P++ + F RR LL + V A M W+ + ++A +
Sbjct: 56 QFACAPVLGALSDRF--GRRPVLLVSLAGAAVDYAIMAT----APFLWVLYIGRIVAGIT 109

Query: 122 ASQDIVFDAWKTDVLPAEERGAGAAISVLGYRLGMLVSGGLALWLADKWLGWQGMYWLMA 181
+ V A+ D+ +ER + GM+ L + ++ A
Sbjct: 110 GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG--FSPHAPFFAAA 167

Query: 182 AL-LIPCIIATLLAPEP--TDTIPVPKTLEQAVVAPLRDFFGRNNAWLILLLIVLYKLGD 238
AL + + L PE + P+ + + + A L+ + ++ +G
Sbjct: 168 ALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQ 227

Query: 239 AFAMSLTTTFLIRGVGFDAGEVGVVNKTLGLLATIVGALYGGILMQRLSLFRALLIFGIL 298
A +L F +DA +G+ G+L ++ A+ G + RL RAL+ G++
Sbjct: 228 VPA-ALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALM-LGMI 285

Query: 299 QGASNAGYWLLSITDKHLYSMGAAVFFENLCGGMGTSAFVALLMTLCNKSFSATQFALLS 358
A GY LL+ + + V GG+G A A+L ++ L+
Sbjct: 286 --ADGTGYILLAFATRGWMAFPIMVLL--ASGGIGMPALQAMLSRQVDEERQGQLQGSLA 341

Query: 359 ALSAVGRVYVGPVAGWFVEAHGWSTF--YLFSVAAAVPGLILLLVCR 403
AL+++ + VGP+ + A +T+ + + AA+ L L + R
Sbjct: 342 ALTSLTSI-VGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF0378    PF06291  29     0.006    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 28.9 bits (64), Expect = 0.006
Identities = 12/37 (32%), Positives = 19/37 (51%)

Query: 34 NMFKKILFPLVALFMLAGCAKPPTTIEVSPTITLPQQ 70
N KK+LF ++ GCA+ T+ PT P++
Sbjct: 4 NKMKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKE 40


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SF0383    HTHFIS  29     0.043    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.043
Identities = 16/73 (21%), Positives = 29/73 (39%), Gaps = 13/73 (17%)

Query: 60 ERSALPTPHEIRNHLDDYVIGQEQAKKVLAVAVYNHYKRLRNGDTSNGVELGKSNILLIG 119
E P+ E + ++G+ A + +Y RL D +++ G
Sbjct: 121 EPKRRPSKLEDDSQDGMPLVGRSAAMQ----EIYRVLARLMQTD---------LTLMITG 167

Query: 120 PTGSGKTLLAETL 132
+G+GK L+A L
Sbjct: 168 ESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SF0384    GPOSANCHOR  35     0.001    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 35.0 bits (80), Expect = 0.001
Identities = 34/133 (25%), Positives = 69/133 (51%), Gaps = 15/133 (11%)

Query: 191 ERLEYLMAMMESEIDLLQVEKRIRNRVKKQMEKSQREYYLNEQMKAIQKELGEMDDVPD- 249
LE A +E + +L R +++ ++ S+ +Q++A ++L E + + +
Sbjct: 291 AALEAEKADLEHQSQVLNAN---RQSLRRDLDASREAK---KQLEAEHQKLEEQNKISEA 344

Query: 250 ENEALKRKIDAAKMPKEAKEKAEAELQKLKMMSPMS-AEATVVRGYIDWMVQVPWNARSK 308
++L+R +DA++ EAK++ EAE QKL+ + +S A +R +D + A+ +
Sbjct: 345 SRQSLRRDLDASR---EAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASRE----AKKQ 397

Query: 309 VKKDLRQAQEILD 321
V+K L +A L
Sbjct: 398 VEKALEEANSKLA 410


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0385    DNABINDINGHU  117    3e-38    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 117 bits (294), Expect = 3e-38
Identities = 49/88 (55%), Positives = 67/88 (76%)

Query: 2 NKSQLIDKIAAGADISKAAAGRALDAIIASVTESLKEGDDVALVGFGTFAVKERAARTGR 61
NK LI K+A +++K + A+DA+ ++V+ L +G+ V L+GFG F V+ERAAR GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGKEITIAAAKVPSFRAGKALKDAV 89
NPQTG+EI I A+KVP+F+AGKALKDAV
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAV 90


61  SF0402  SF0415  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF0402    0       13       -1.174100  hypothetical protein
SF0403    2       16       -0.959059  hypothetical protein
SF0405    0       15       -0.426251  gene expression modulator
SF0406    0       14       -0.230555  Hha toxicity attenuator
SF0407    0       15       0.662550   multidrug efflux system protein AcrB
SF0408    1       12       0.072911   multidrug efflux system transporter AcrA
SF0409    1       14       -0.116117  DNA-binding transcriptional repressor AcrR
SF0410    2       14       2.052489   hypothetical protein
SF0411    4       16       3.771721   hypothetical protein
SF0412    3       16       4.335160   primosomal replication protein N''
SF0413    3       22       2.836400   hypothetical protein
SF0414    3       26       2.660582   adenine phosphoribosyltransferase
SF0415    2       22       2.697059   DNA polymerase III subunits gamma and tau
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0402    BCTERIALGSPF  30     0.026    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 29.8 bits (67), Expect = 0.026
Identities = 31/137 (22%), Positives = 54/137 (39%), Gaps = 24/137 (17%)

Query: 245 IWLPLGLVIGLLAAMFVLRILRRIQSPHHRLQDAIENRDICVHYQPIVSLANGKIVGAEA 304
W+ L L+ G +A +LR R+ + + P++ G+I
Sbjct: 228 PWMLLALLAGFMAFRVMLR------QEKRRVS-----FHRRLLHLPLI----GRIARGLN 272

Query: 305 LARWPQTDGSWLSPDSFIPLAQQTGLS-EPLTLLIIRSVFEDMGDCLRQHPQQHISINLE 363
AR+ +T + S +PL Q +S + ++ R D +R+ H + LE
Sbjct: 273 TARYARTLSILNA--SAVPLLQAMRISGDVMSNDYARHRLSLATDAVREGVSLHKA--LE 328

Query: 364 STVLTSEKIPQLLREMI 380
T L P ++R MI
Sbjct: 329 QTAL----FPPMMRHMI 341


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0407    ACRIFLAVINRP  1361   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1361 bits (3523), Expect = 0.0
Identities = 798/1033 (77%), Positives = 910/1033 (88%), Gaps = 1/1033 (0%)

Query: 1 MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISRFYPGADAKTVQDT 60
M NFFI RPIFAWV+AII+M+AG LAIL+LPVAQYPTIAPPAV++S YPGADA+TVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMTGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQ 120
VTQVIEQNM GIDNLMYMSS SDS G+V ITLTF+SGTD DIAQVQVQNKLQLA PLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVSVEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDAISRTSGVGDVQLFGS 180
EVQQQG+SVEKSSSS+LMV G ++ + TQ+DISDYVA+N+KD +SR +GVGDVQLFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWMNPNELNKFQLTPVDVITAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQTRL 240
QYAMRIW++ + LNK++LTPVDVI +K QN Q+AAGQLGGTP + GQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 TSTEEFGKILLKVNQDGSRVLLRDVAKIELGGENYDIIAEFNGQPASGLGIKLATGANAL 300
+ EEFGK+ L+VN DGS V L+DVA++ELGGENY++IA NG+PA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTAAAIRAELAKMEPFFPSGLKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFLQ 360
DTA AI+A+LA+++PFFP G+K++YPYDTTPFV++SIHEVVKTL EAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTFA+LAAFG+SINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 AEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGSTGAIYRQFSITIVSAMAL 480
E+ LPPKEAT KSM QIQGALVGIAMVLSAVF+PMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATMLKPIAKGDHGEGKKGFFGWLNRMFEKSTHHYTDSVGGILRSTGR 540
SVLVALILTPALCAT+LKP++ H E K GFFGW N F+ S +HYT+SVG IL STGR
Sbjct: 481 SVLVALILTPALCATLLKPVSAE-HHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 541 YLVLYLIIVVGMAYLFVRLPSSFLPDEDQGVFMTMVQLPAGATQERTQKVLNEVTHYYLT 600
YL++Y +IV GM LF+RLPSSFLP+EDQGVF+TM+QLPAGATQERTQKVL++VT YYL
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 601 KEKNNVESVFAVNGFGFAGRGQNTGIAFVSLKDWADRPGEENKVEAITMRATRAFSQIKD 660
EK NVESVF VNGF F+G+ QN G+AFVSLK W +R G+EN EA+ RA +I+D
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 661 AMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLLAEAAKHPDMLTSVRPNG 720
V FN+PAIVELGTATGFDFELIDQAGLGH+ LTQARNQLL AA+HP L SVRPNG
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 721 LEDTPQFKIDIDQEKAQALGVSINDINTTLGAAWGGSYVNDFIDRGRVKKVYVMSEAKYR 780
LEDT QFK+++DQEKAQALGVS++DIN T+ A GG+YVNDFIDRGRVKK+YV ++AK+R
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFR 779

Query: 781 MLPDDIGDWYVRAADGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILGQAAPGKSTGEA 840
MLP+D+ YVR+A+G+MVPFSAF++S W YGSPRLERYNGLPSMEI G+AAPG S+G+A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 841 MELMEQLASKLPTGVGYDWTGMSYQERLSGNQAPSLYAISLIVVFLCLAALYESWSTPFS 900
M LME LASKLP G+GYDWTGMSYQERLSGNQAP+L AIS +VVFLCLAALYESWS P S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 901 VMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGL 960
VMLVVPLG++G LLAAT NDVYF VGLLTTIGLSAKNAILIVEFAKDLM+KEGKG+
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 961 IEATLDAVRMRLRPILMTSLAFILGVMPLVISTGAGSGAQNAVGTGVMGGMVTATVLAIF 1020
+EATL AVRMRLRPILMTSLAFILGV+PL IS GAGSGAQNAVG GVMGGMV+AT+LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1021 FVPVFFVVVRRRF 1033
FVPVFFVV+RR F
Sbjct: 1020 FVPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF0408    RTXTOXIND  45     3e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 45.2 bits (107), Expect = 3e-07
Identities = 33/212 (15%), Positives = 72/212 (33%), Gaps = 23/212 (10%)

Query: 100 TYQATYDSAKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQGYDQALADAQQANAAVTA 159
+ Y A +L + + Q+ Q +++ ++ L +Q +
Sbjct: 256 EQENKYVEAVNELR--VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 160 AKAAVETARINLAYTKVTSPISGRIGKSNV-TEGALVQNGQATVLATVQQLDPIYVDVTQ 218
+ + + +P+S ++ + V TEG +V + T++ V + D + V
Sbjct: 314 LTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTALV 372

Query: 219 SSNDFLRLKQELA----------NGTLKQENGKAKVSLITSDGIKFPQDGTLEFSDVTVD 268
+ D + KV I D I+ + G + ++++
Sbjct: 373 QNKDIGFINVGQNAIIKVEAFPYTRYGYLV---GKVKNINLDAIEDQRLGLVFNVIISIE 429

Query: 269 QTTGSITLRAIFPNPDHTLLPGMFVRARLEEG 300
+ S + I L GM V A ++ G
Sbjct: 430 ENCLSTGNKNIP------LSSGMAVTAEIKTG 455



Score = 32.9 bits (75), Expect = 0.002
Identities = 26/127 (20%), Positives = 50/127 (39%), Gaps = 10/127 (7%)

Query: 49 PLQITTELPGR-TSAYRIAEVRPQVSGIILKRNFKEGSDIEAGVSLYQIDPATYQATYDS 107
++I G+ T + R E++P + I+ + KEG + G L ++ +A
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA---- 134

Query: 108 AKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQGYDQALADAQQANAAVTAAKAAVETA 167
D K Q++ A+L RYQ L + I + + V+ + T+
Sbjct: 135 ---DTLKTQSSLLQARLEQTRYQILSRS--IELNKLPELKLPDEPYFQNVSEEEVLRLTS 189

Query: 168 RINLAYT 174
I ++
Sbjct: 190 LIKEQFS 196


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF0409    HTHTETR  222    5e-76    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 222 bits (567), Expect = 5e-76
Identities = 215/215 (100%), Positives = 215/215 (100%)

Query: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60
MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120
EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180
GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215
APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF0410    RTXTOXIND  32     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.017
Identities = 19/125 (15%), Positives = 40/125 (32%), Gaps = 6/125 (4%)

Query: 28 QNTAFARASSNGDLPTKADLQAQLDSLNKQKDLSAQDKLVQQDLTDTLATLDKIDRIKEE 87
N RA L + + L L+ + A L++ ++ E
Sbjct: 207 LNLDKKRAERLTVLARINRYENLSRVEKSR--LDDFSSLLHKQAIAKHAVLEQENKYVEA 264

Query: 88 TVQLRQKVAEAPEKMRQATAALTALSDVDND--EETRKIL--STLSLRQLETRVAQALDD 143
+LR ++ + + +A V E L +T ++ L +A+ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEER 324

Query: 144 LQNAQ 148
Q +
Sbjct: 325 QQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SF0415    IGASERPTASE  39     7e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.9 bits (90), Expect = 7e-05
Identities = 40/251 (15%), Positives = 77/251 (30%), Gaps = 31/251 (12%)

Query: 402 PLPETTSQVLAARQQLQRVQGATKAKKSEPAA----ATRARPVNNAALERLASVTDRVQA 457
P E +Q + + + P+ AR + A + A T
Sbjct: 983 PEVEKRNQTVDTTN----ITTPNNIQADVPSVPSNNEEIARV-DEAPVPPPAPATPSETT 1037

Query: 458 RPVPSALEKAPAKKEAYRWKATTPVMQQKE--------VVATPKALKKA---LEHEKTPE 506
V ++ E AT Q +E V A + + A E ++T
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 507 LAVKLAA---------EAIERDPWAAQVSQLSLPKLVEQVALNAWKE-ESDNAVCLHLRS 556
K A E+ +V+ PK + + E +N ++++
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 557 SQRHLNNRGAQQKLAEALS-MLKGSTVELTIVEDDNPAVRTPLEWRQAIYEEKLAQARES 615
Q N ++ A+ S ++ E T V N V P A + + +
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 616 IIADNNIQTLR 626
+ + +++R
Sbjct: 1218 KPKNRHRRSVR 1228


62  SF0505  SF0510  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
SF0505    0       18       4.504594  enterobactin exporter EntS
SF0506    -1      20       4.574070  iron-enterobactin ABC transporter
SF0508    -1      21       4.514306  enterobactin synthase subunit E
SF0509    0       19       4.404245  2,3-dihydro-2,3-dihydroxybenzoate synthetase
SF0510    -1      17       4.055559  2,3-dihydroxybenzoate-2,3-dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF0505    TCRTETA  37     1e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.7 bits (85), Expect = 1e-04
Identities = 81/391 (20%), Positives = 142/391 (36%), Gaps = 38/391 (9%)

Query: 27 FISIVSLGLLGVAVPVQIQMMTHSTWQV---GLSVTLTGGAMFVGLMVGGVLADRYERKK 83
+ V +GL+ +P ++ + HS G+ + L F V G L+DR+ R+
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 84 VILLARGTCGIGFIGLCLNALL--PEPSLLAIYLLGLWDGFFASLGVTALLAATSALVGR 141
V+L + G ++ + P L +Y+ + G + G A A + +
Sbjct: 75 VLL-------VSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVA-GAYIADITDG 126

Query: 142 ENLMQAGAITMLTVRLGSVISPMIGGLLLATGGVAWNYGLAAAGTFITLLPLLSLPE-LP 200
+ + G V P++GGL+ A + AA L LPE
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHK 186

Query: 201 PPPQPLEHPLKSLLAGFRFLLASPLLGGLLTMA----------SAVLVLYPALADNWQMS 250
+PL + LA FR+ ++ L+ + +A+ V++ D +
Sbjct: 187 GERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG--EDRFHWD 244

Query: 251 AAQIGFLYAAIP-LGAAIGALTSGKLAHSARPGLLMLLSTLGS---FLAIGLFGLMPMWI 306
A IG AA L + A+ +G +A ++L + ++ + M
Sbjct: 245 ATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAF 304

Query: 307 LGVVCLALFGWLSAVSSLLQYTMLQTQTPEAMLGRINGLWTAQNVTGDAIGAALLGGLGA 366
+V LA G ML Q E G++ G A +G L + A
Sbjct: 305 PIMVLLASGGIGMPALQ----AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYA 360

Query: 367 MMTPVASASASGFGLLIIGVLLLLVLVELRR 397
+ + +G+ + L LL L LRR
Sbjct: 361 ----ASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0506    FERRIBNDNGPP  63     2e-13    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 63.0 bits (153), Expect = 2e-13
Identities = 61/285 (21%), Positives = 102/285 (35%), Gaps = 35/285 (12%)

Query: 40 HTLESQPQRIVSTSVTLTGSLLAIDAPVIASGATTPNNRVADDQGFLRQWSKVAKERKLQ 99
H P RIV+ LLA+ VAD + R W E L
Sbjct: 29 HAAAIDPNRIVALEWLPVELLLALGIVPYG---------VADTINY-RLW---VSEPPLP 75

Query: 100 RLYIG-----EPSAEAVAAQMPDLILISATGGDSALALYDQLSTIAPTLIINYDDKS--- 151
I EP+ E + P ++ SA G S + L+ IAP N+ D
Sbjct: 76 DSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPL 131

Query: 152 --WQSLLTQLGEITGHEKQAAERIAQFDKQLAAAKEQIKLPPQPVTAIVYTAAAHSANLW 209
+ LT++ ++ + A +AQ++ + + K + + ++
Sbjct: 132 AMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVF 191

Query: 210 TPESAQGQMLEQLGFTLAKLPAGLNASQSQGKRHDIIQLGGENLAAGLNGESLFLFAGDQ 269
P S ++L++ G NA Q + + + LAA + + L +
Sbjct: 192 GPNSLFQEILDEYGIP--------NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNS 243

Query: 270 KDADAIYANPLLAHLPAVQNKQVYALGTETFRLDYYSAMQVLERL 314
KD DA+ A PL +P V+ + + F SAM + L
Sbjct: 244 KDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVL 288


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0509    ISCHRISMTASE  444    e-161    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 444 bits (1142), Expect = e-161
Identities = 146/299 (48%), Positives = 195/299 (65%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQAYALPESHDIPQNKVDWAFEPQRAALLIHDMQDYFVSFWGENCPMMEQVIANI 60
MAIP +Q Y +P + D+PQNKV W +P RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRDYCKQHNIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVDRLTPDADDTV 120
L++ C Q IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P+ DD V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKESGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M+++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSRDEHLMSLKYVAGRSGRVVMTEELL------PAPIPASKA-----------ALREVIL 223
FS ++H M+L+Y AGR VMT+ LL PA + + A +R+ I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDESDEPFDDD-NLIDYGLDSVRMMALAARWRKVHGDIDFVMLAKNPTIDAWWKLLS 281
LL E+ E D +L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W KLL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0510    DHBDHDRGNASE  359    e-129    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 359 bits (922), Expect = e-129
Identities = 108/258 (41%), Positives = 149/258 (57%), Gaps = 20/258 (7%)

Query: 5 GKNVWVTGAGKGIGYATALTFVEAGAKVTGFD---------------QAFTQEQYPFATE 49
GK ++TGA +GIG A A T GA + D +A E +P
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP---- 63

Query: 50 VMDVADAGQVAQVCQRLLAETERLDVLINAAGILRMGATDQLSKEDWQQTFAVNVGGAFN 109
DV D+ + ++ R+ E +D+L+N AG+LR G LS E+W+ TF+VN G FN
Sbjct: 64 -ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 110 LFQQTMNQFRRQRGGAIVTVASDAAHTARIGMSAYGASKAALKSLALSVGLELAGSGVRC 169
+ +R G+IVTV S+ A R M+AY +SKAA +GLELA +RC
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 170 NVVSPGSTDTDMQRTLWVSDDAEEQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDL 229
N+VSPGST+TDMQ +LW ++ EQ I+G E FK GIPL K+A+P +IA+ +LFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 230 ASHITLQDIVVDGGSTLG 247
A HIT+ ++ VDGG+TLG
Sbjct: 243 AGHITMHNLCVDGGATLG 260


63  SF0742  SF0747  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
SF0742    -2      20       2.885999  hypothetical protein
SF0743    -2      18       2.942352  hypothetical protein
SF0744    -2      17       2.629055  ABC transporter ATP-binding protein
SF0745    -2      13       2.568786  efflux pump membrane fusion protein
SF0746    -1      14       2.348973  DNA-binding transcriptional regulator
SF0747    -1      13       2.201813  ATP-dependent RNA helicase RhlE
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0742    ABC2TRNSPORT  47     3e-08    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 47.2 bits (112), Expect = 3e-08
Identities = 36/146 (24%), Positives = 63/146 (43%), Gaps = 5/146 (3%)

Query: 197 AREREQGTLDQLLVSPLTTWQIFIGKAVPALIVATFQATIVLAIGIWAYQIPFAGSLALF 256
R Q T + +L + L I +G+ A A IG+ A + + L+L
Sbjct: 92 GRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGA---GIGVVAAALGYTQWLSLL 148

Query: 257 YFTMVI--YGLSLVGFGLLISSLCSTQQQAFIGVFVFMMPAILLSGYVSPVENMPVWLQN 314
Y VI GL+ G+++++L + + + P + LSG V PV+ +P+ Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 315 LTWINPIRHFTDITKQIYLKDASLDI 340
P+ H D+ + I L +D+
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDV 234


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF0744    PF05272  31     0.012    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.012
Identities = 20/90 (22%), Positives = 28/90 (31%), Gaps = 21/90 (23%)

Query: 298 TPRFEDAFIDLLGGAGTSESPLGAILHTVEGTPGETVIEAKELTKKFGDFAATDHVNFAV 357
PR E + +LG P + + + K HV +
Sbjct: 547 VPRLEKWLVHVLGKTPDDYKP-------------RRLRYLQLVGKYI----LMGHVARVM 589

Query: 358 KRGEIFG----LLGPNGAGKSTTFKMMCGL 383
+ G F L G G GKST + GL
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGL 619



Score = 29.3 bits (65), Expect = 0.047
Identities = 11/23 (47%), Positives = 13/23 (56%)

Query: 39 YVTGLVGPDGAGKTTLMRMLAGL 61
Y L G G GK+TL+ L GL
Sbjct: 597 YSVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF0745    RTXTOXIND  63     5e-13    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 62.5 bits (152), Expect = 5e-13
Identities = 42/259 (16%), Positives = 92/259 (35%), Gaps = 25/259 (9%)

Query: 83 ALMQAKAGVSVAQAQYDLMLAGYRDEEIAQAAAAVKQAQAAYDYEQNFYNRQQGLWKSRT 142
Q + + +A+ +LA E + + + + L +
Sbjct: 201 QKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENK 260

Query: 143 ISA--NDLENARSSRDQAQATLKSAQDKLRQYRSGNREQ---DIAQAKASLEQAQAQLAQ 197
N+L +S +Q ++ + SA+++ + + + + Q ++ +LA+
Sbjct: 261 YVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAK 320

Query: 198 AELNLQDSTLIAPSDGTLLTRAV-EPGTVLNEGGTVFTVSLT-RPVWVRAYVDERNLDQA 255
E Q S + AP + V G V+ T+ + + V A V +++
Sbjct: 321 NEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFI 380

Query: 256 QPGRKVLLYTDGRPDKPYH---GQIGFVSPTAEFTPKTVETPDLRTDLVYRLRIVVT--- 309
G+ ++ + P Y G++ ++ A D R LV+ + I +
Sbjct: 381 NVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA--------IEDQRLGLVFNVIISIEENC 432

Query: 310 ----DADDALRQGMPVTVQ 324
+ + L GM VT +
Sbjct: 433 LSTGNKNIPLSSGMAVTAE 451


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF0746    HTHTETR  73     6e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 73.1 bits (179), Expect = 6e-18
Identities = 33/214 (15%), Positives = 77/214 (35%), Gaps = 17/214 (7%)

Query: 13 KGEQAKKQLIAAALAQFGEYGMNATT-REIAAQAGQNIAAITYYFGSKEDLYLACAQWIA 71
+ ++ ++ ++ AL F + G+++T+ EIA AG AI ++F K DL+ +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 72 DFIGEQFRPHAEEAERLFAQPQPDRAAIRELILRACRNMIKLLTQDDTVNLSKFISREQL 131
IGE E + P + +RE+++ + + + + + F E +
Sbjct: 68 SNIGELEL---EYQAKFPGDP---LSVLREILIHVLESTVTEERRRLLMEII-FHKCEFV 120

Query: 132 SPTAAYHLVHEQVISPLHSHLTRLIAAWTGCDANDTRMILHTHALIGEILAFRLGKETIL 191
A + + + + + +A L T + + G
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKH--CIEAKMLPADLMTRRAAIIMRGYISG----- 173

Query: 192 LRTGWTAFDEEKTELINQTVTCHIDLILQGLSQR 225
L W + + + ++ ++L+
Sbjct: 174 LMENWLFAPQSFD--LKKEARDYVAILLEMYLLC 205


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF0747    SECA  31     0.013    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 30.6 bits (69), Expect = 0.013
Identities = 20/67 (29%), Positives = 34/67 (50%), Gaps = 4/67 (5%)

Query: 246 QQVLVFTRTKHGANHLAEQLNKDGIRSVAIHG-NKSQGARTRALADFKSGDIRVLVATDI 304
Q VLV T + + ++ +L K GI+ ++ + A A A + + V +AT++
Sbjct: 450 QPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQAGYPAA---VTIATNM 506

Query: 305 AARGLDI 311
A RG DI
Sbjct: 507 AGRGTDI 513


64  SF0795  SF0800  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF0795    1       14       -0.516067  multidrug efflux system translocase MdfA
SF0796    0       16       -0.881307  hypothetical protein
SF0797    0       17       -1.317707  hypothetical protein
SF0798    -1      15       -0.515238  DeoR family transcriptional regulator
SF0799    -2      19       -3.109361  DeoR family transcriptional regulator
SF0800    -2      16       -3.100872  transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF0795    TCRTETB  39     3e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 38.7 bits (90), Expect = 3e-05
Identities = 28/155 (18%), Positives = 61/155 (39%), Gaps = 5/155 (3%)

Query: 48 QAGIDWVPTSMTAYLAGGMFLQWLLGPLSDRIGRRPVMLAGVVWFIVTCLAILLAQNIEQ 107
A +WV T+ + G + G LSD++G + ++L G++ + + +
Sbjct: 48 PASTNWVNTAFMLTFSIGTAV---YGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFS 104

Query: 108 FTLL-RFLQGISFCFIGAVGYAAIQESFEEAVCIKITALMANVALIAPLLGPLVGAAWIH 166
++ RF+QG A+ + + K L+ ++ + +GP +G H
Sbjct: 105 LLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAH 164

Query: 167 VLPWEGMFVLFAALAAISFFGLQRAMPETAMRIGE 201
+ W +L + I+ L + + + G
Sbjct: 165 YIHW-SYLLLIPMITIITVPFLMKLLKKEVRIKGH 198


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF0798    PF05272  33     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 33.1 bits (75), Expect = 0.002
Identities = 24/61 (39%), Positives = 29/61 (47%), Gaps = 1/61 (1%)

Query: 302 DSAWVAGVSVVLWGLGASLGFPLTISAASDTGPDAPTRVSVVATTGYLAFLVGPPLLGYL 361
D +W+AG +VVLW SL LT DT PD R ++A YL F P L
Sbjct: 270 DWSWLAGCTVVLWPDCDSLREKLTRQELKDT-PDPLAREKLLAAKPYLPFDKQPGQKAML 328

Query: 362 G 362
G
Sbjct: 329 G 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF0799    HTHTETR  46     1e-08    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 46.2 bits (109), Expect = 1e-08
Identities = 13/83 (15%), Positives = 29/83 (34%), Gaps = 4/83 (4%)

Query: 2 RRANDPQRRGKMIQATLEAVKLYGIHAVTHRKIATLAGVPLGSMTYYFSGIDELLLEAFS 61
+ + R ++ L G+ + + +IA AGV G++ ++F +L E +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIW- 63

Query: 62 SFTEIMSRQYQAFFSDVSDAQGA 84
E+ +
Sbjct: 64 ---ELSESNIGELELEYQAKFPG 83


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF0800    TCRTETA  32     0.006    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 32.1 bits (73), Expect = 0.006
Identities = 21/106 (19%), Positives = 34/106 (32%), Gaps = 6/106 (5%)

Query: 394 LMIGMITFQFSTFSFGMGNAAGLLFAGIML-GFMRANHPTFG-YIPQ--GALSMVKEFGL 449
L++ + +L+ G ++ G A G YI + FG
Sbjct: 76 LLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGF 135

Query: 450 MVFMAGVGLSAGSGINNGLGAIGGQM--LIAGLIVSLVPVVICFLF 493
M G G+ AG + +G A + L + CFL
Sbjct: 136 MSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLL 181


65  SF0818  SF0824  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF0818    1       15       -5.072567  arginine transporter ATP-binding subunit
SF0819    1       16       -3.864914  lipoprotein
SF0820    1       14       -3.037396  hypothetical protein
SF0821    0       13       2.640986   hypothetical protein
SF0822    0       13       2.791196   regulator
SF0823    -2      14       2.603040   nucleotide di-P-sugar epimerase or dehydratase
SF0824    -2      12       1.952194   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF0818    PF05272  30     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.010
Identities = 9/18 (50%), Positives = 12/18 (66%)

Query: 31 LVLLGPSGAGKSSLLRVL 48
+VL G G GKS+L+ L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SF0822    ECOLIPORIN  30     0.010    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 29.9 bits (67), Expect = 0.010
Identities = 21/54 (38%), Positives = 27/54 (50%), Gaps = 9/54 (16%)

Query: 2 RRVFWLVAAALLLAGCAGEKGIVEKEGYQLDTRHQAQAAYPRIKVLVIHYTADD 55
R+V LV ALL AG A I K+G +LD Y ++ L HY +DD
Sbjct: 3 RKVLALVIPALLAAGAAHAAEIYNKDGNKLDL-------YGKVDGL--HYFSDD 47


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF0823    NUCEPIMERASE  74     6e-17    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 74.0 bits (182), Expect = 6e-17
Identities = 70/363 (19%), Positives = 123/363 (33%), Gaps = 65/363 (17%)

Query: 13 MKVLVTGATSGLGRNAVEFLCQKGISVRA---------SGRNEAMGKLLEKMGAEFVPTD 63
MK LVTGA +G + + L + G V +A +LL + G +F D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 64 LTELVSSQAKVMLAGIDTLWHCS-------SFTSPWGTQQAFDLANVRATRRLGEWAVAW 116
L + + ++ S +P A+ +N+ + E
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPH----AYADSNLTGFLNILEGCRHN 116

Query: 117 GVRNFIHISSPSLYFDYHHHRNIKEDFRPHRFANEFARSKAASEEVINMLSQANPQTRFT 176
+++ ++ SS S+Y + D + +A +K A+E + + S T
Sbjct: 117 KIQHLLYASSSSVYGL-NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSH-LYGLPAT 174

Query: 177 ILRPQSLFGPHDK--VFIPRLAHMMHHYGSILLPHGGSALVDMTYYENAVHAMWLASQEA 234
LR +++GP + + + + M SI + + G D TY ++ A+
Sbjct: 175 GLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVI 234

Query: 235 CDKLPS--------------GRVYNITNGEHRTLRSIVQKLIDELNIDCRIRSVPYPMLD 280
RVYNI N L +Q L D L I+ + +P D
Sbjct: 235 PHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGD 294

Query: 281 MIARSMERLGRKSAKEPPLTHYGASKLNFDFTLDITRAQEELGYQPVITLDEGIEKTAAW 340
+ T D E +G+ P T+ +G++ W
Sbjct: 295 V----------------LETS-----------ADTKALYEVIGFTPETTVKDGVKNFVNW 327

Query: 341 LRD 343
RD
Sbjct: 328 YRD 330


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF0824  NUCEPIMERASE  56  2e-10  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 55.6 bits (134), Expect = 2e-10
Identities = 29/125 (23%), Positives = 52/125 (41%), Gaps = 17/125 (13%)

Query: 4 RILVLGASGYIGQHLVRTLSQQGHQILA---------AARHVDRLAKLQLANVSCHKVDL 54
+ LV GA+G+IG H+ + L + GHQ++ + RL L HK+DL
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 55 SWPDNLPALLQD--IDTVYFLVH------SMGEGGDFIAQERQLALNVRDALREVPVKQL 106
+ + + L + V+ H S+ + LN+ + R ++ L
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 107 IFLSS 111
++ SS
Sbjct: 122 LYASS 126


66  SF1080  SF1088  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
SF10800122.138965flagellar hook protein FlgE
SF10820142.187482flagellar basal body rod protein FlgG
SF10831141.926700flagellar basal body L-ring protein
SF10841141.515263flagellar basal body P-ring biosynthesis protein
SF10852161.216909flagellar rod assembly protein/muramidase FlgJ
SF10883171.127202RNase E
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1080  FLGHOOKAP1  40  4e-05  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 39.6 bits (92), Expect = 4e-05
Identities = 16/49 (32%), Positives = 28/49 (57%)

Query: 594 TLTNGALEASNVDLSKELVNMIVAQRNYKSNAQTIKTQDQILNTRVNLR 642
L+N S V+L +E N+ Q+ Y +NAQ ++T + I + +N+R
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 37.6 bits (87), Expect = 2e-04
Identities = 22/56 (39%), Positives = 30/56 (53%), Gaps = 4/56 (7%)

Query: 246 AVSGLNAAATNLDVIGNNIANSATYGFKSGTASFAD----MFAGSKVGLGVKVAGI 297
A+SGLNAA L+ NNI++ G+ T A + AG VG GV V+G+
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1082  FLGHOOKAP1  44  4e-07  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 4e-07
Identities = 18/81 (22%), Positives = 36/81 (44%), Gaps = 14/81 (17%)

Query: 3 SSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQTTL 62
S + A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTL 47

Query: 63 PSGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 48 GAGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 21/41 (51%)

Query: 220 ETSNVNVAEELVNMIQVQRAYEINSKAVSTTDQMLQKLTQL 260
S VN+ EE N+ + Q+ Y N++ + T + + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1083  FLGLRINGFLGH  349  e-126  Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 349 bits (897), Expect = e-126
Identities = 232/232 (100%), Positives = 232/232 (100%)

Query: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60
MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY
Sbjct: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60

Query: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120
GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120

Query: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180
RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF
Sbjct: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180

Query: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232
SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM
Sbjct: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1084  FLGPRINGFLGI  427  e-152  Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 427 bits (1100), Expect = e-152
Identities = 157/363 (43%), Positives = 213/363 (58%), Gaps = 9/363 (2%)

Query: 4 FLSALILLLVTTAAQAERIRDLTSVQGVRQNSLIGYGLVVGLDGTGDQTTQTPFTTQTLN 63
F + L A RI+D+ S+Q R N LIGYGLVVGL GTGD +PFT Q++
Sbjct: 13 FSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMR 72

Query: 64 NMLSQLGITVPTGTNMQLKNVAAVMVTASLPPFGRQGQTIDVVVSSMGNAKSLRGGTLLM 123
ML LGIT G + KN+AAVMVTA+LPPF G +DV VSS+G+A SLRGG L+M
Sbjct: 73 AMLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIM 131

Query: 124 TPLKGVDSQVYALAQGNILVGGAGASAGGSSVQVNQLNGGRITNGAVIERELPSQFGVGN 183
T L G D Q+YA+AQG ++V G A +++ R+ NGA+IERELPS+F
Sbjct: 132 TSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSV 191

Query: 184 TLNLQLNDEDFSMAQQIADTINRVR----GYGSATALDARTIQVRVPSGNSSQVRFLADI 239
L LQL + DFS A ++AD +N G A D++ I V+ P + R +A+I
Sbjct: 192 NLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRV-ADLTRLMAEI 250

Query: 240 QNMQVNVTPQDAKVVINSRTGSVVMNREVTLDSCAVAQGNLSVTVNRQANVSQPDTPFGG 299
+N+ V T AKVVIN RTG++V+ +V + AV+ G L+V V V QP PF
Sbjct: 251 ENLTVE-TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSR 308

Query: 300 GQTVVTPQTQIDLRQSGGSLQSVRSSASLNNVVRALNALGATPMDLMSILQSMQSAGCLR 359
GQT V PQT I Q G + ++ L +V LN++G +++ILQ ++SAG L+
Sbjct: 309 GQTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQ 367

Query: 360 AKL 362
A+L
Sbjct: 368 AEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1085  FLGFLGJ  503  0.0  Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 503 bits (1297), Expect = 0.0
Identities = 308/313 (98%), Positives = 309/313 (98%)

Query: 1 MISDSKLLASAAWDAQSLNELKAKASEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60
MISDSKLLASAAWDAQSLNELKAKA EDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG
Sbjct: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60

Query: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120
LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET
Sbjct: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120

Query: 121 VVRYQNQTLSQLVQKAVPRNYDDSLPGDSRAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180
VVRYQNQ LSQLVQKAVPRNYDDSLPGDS+AFLAQLSLPAQLASQQSGVPHHLILAQAAL
Sbjct: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180

Query: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGQVTEITTTEYENGEAKKVKAKFRVYSSYL 240
ESGWGQRQIRRENGEPSYNLFGVKASGNWKG VTEITTTEYENGEAKKVKAKFRVYSSYL
Sbjct: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240

Query: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQVLQDAGYATDPHYARKLTNMIQQMKSISDK 300
EALSDYVGLLTRNPRYAAVTTAASAEQGAQ LQDAGYATDPHYARKLTNMIQQMKSISDK
Sbjct: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300

Query: 301 VSKTYSMNIDNLF 313
VSKTYSMNIDNLF
Sbjct: 301 VSKTYSMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1088  IGASERPTASE  68  2e-13  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 68.2 bits (166), Expect = 2e-13
Identities = 49/261 (18%), Positives = 87/261 (33%), Gaps = 26/261 (9%)

Query: 548 VAPAPKAATATPAAPAQPGLLSRFFGALKALFSGGEEAKPTEQP-TPKAEAKPERQQDRR 606
T P + S E A+ E P P A A P
Sbjct: 991 TVDTTNITTPNNIQADVPSVPSN----------NEEIARVDEAPVPPPAPATPSET---- 1036

Query: 607 KPRQSNRRDRNERRDTRSERTEGSDNREENRRNRRQAQQQTAETRESRQQAEV------T 660
+ N ++++++ D E +NR A++ + + + Q EV T
Sbjct: 1037 ----TETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 661 EKARTTDEQQAPRRERSRRRNDDKRQAQQEAKALNVEEQSVQETEQEERVRPVQPRRKQR 720
++ +TT+ ++ E+ + + + Q+ K + + QE + + + R
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPK-VTSQVSPKQEQSETVQPQAEPARENDP 1151

Query: 721 QLNQKVRYEQSVAEEAVVAPVVEETAAAEPIVQEAPAPRTELVKVPLPVVAQTAPEQQEE 780
+N K Q+ P E ++ E V E+ T V P A Q
Sbjct: 1152 TVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTV 1211

Query: 781 NNADNRDNGGMPRRSRRSPRH 801
N+ + RRS RS H
Sbjct: 1212 NSESSNKPKNRHRRSVRSVPH 1232



Score = 61.2 bits (148), Expect = 2e-11
Identities = 48/288 (16%), Positives = 88/288 (30%), Gaps = 36/288 (12%)

Query: 510 PSEEEFAERKRPEQPALATFAMPDVPPAPT-PAEPAATVVAPAPKAATATPAAPAQPGLL 568
P E+ + DVP P+ E A AP P A ATP+ +
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTE---- 1038

Query: 569 SRFFGALKALFSGGEEAKPTEQPTPKAEAKPERQQDRRKPRQSNRRDRNERRDTRSER-- 626
A E +K + K E Q+ + + + ++
Sbjct: 1039 ------TVA-----ENSKQESKTVEKNEQDATE-----TTAQNREVAKEAKSNVKANTQT 1082

Query: 627 TEGSDNREENRRNRRQAQQQTAETRESRQQAEVTEKARTTDEQQAPRRERSRRRNDDKRQ 686
E + + E + + ++TA + + TEK + + + + + + Q
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQ 1142

Query: 687 AQ---QEAKALNVEEQSVQETEQEERVRPVQPRRKQRQLNQKVRYEQSV--AEEAVVAPV 741
A+ + +N++E Q + +P + + Q V +V V P
Sbjct: 1143 AEPARENDPTVNIKEPQSQTNTTADTEQPA--KETSSNVEQPVTESTTVNTGNSVVENPE 1200

Query: 742 VEETAAAEPIVQEAPA------PRTELVKVPLPVVAQTAPEQQEENNA 783
A +P V + R + VP V T A
Sbjct: 1201 NTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVA 1248


67  SF1201  SF1205  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
SF1201-116-1.443293dihydroxyacetone kinase subunit M
SF1202-213-1.311125dihydroxyacetone kinase subunit DhaL
SF1203-214-1.602497dihydroxyacetone kinase subunit DhaK
SF1204-115-1.827758DNA-binding transcriptional regulator DhaR
SF1205115-1.056976adhesion and penetration protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1201  PHPHTRNFRASE  140  2e-38  Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 140 bits (355), Expect = 2e-38
Identities = 60/206 (29%), Positives = 100/206 (48%), Gaps = 1/206 (0%)

Query: 258 GKAFYYQPVLCTVQAKSTLTAEEEQDRLRQAIDFTLLDLMTLTAKAEASGLDDIAAIFSG 317
KAF + ++ S E ++L A++ + +L + + EAS D A IF+
Sbjct: 17 AKAFIHLEPNVDIEKTSITDVSTEIEKLTAALEKSKEELRAIKDQTEASMGADKAEIFAA 76

Query: 318 HHTLLGDPELLAAASELLQHEHCTAEYAWQQVLKELSQQYQQLDDEYLQARYIDVDDLLH 377
H +L DPEL+ +++E AEYA ++V ++ +D+EY++ R D+ D+
Sbjct: 77 HLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDNEYMKERAADIRDVSK 136

Query: 378 RTLVHLT-QTKEELPQFNSPTILLAENIYPSTVLQLDPAVVKGICLSAGSPVSHSALIAR 436
R L HL L T+++AE++ PS QL+ VKG G SHSA+++R
Sbjct: 137 RVLGHLIGVETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFATDIGGRTSHSAIMSR 196

Query: 437 ELGIGWICQQGEKLYAIQPEETLTLD 462
L I + E IQ + + +D
Sbjct: 197 SLEIPAVVGTKEVTEKIQHGDMVIVD 222


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1202  adhesinmafb  28  0.020  Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 28.5 bits (63), Expect = 0.020
Identities = 10/47 (21%), Positives = 26/47 (55%)

Query: 138 VESLRQSSEQNLSVPVALEAASSIAESAAQSTITMQARKGRASYLGE 184
E++ + ++N + +EA ++A +A + + A+ G+A+ G+
Sbjct: 293 REAVDRWIQENPNAAETVEAVFNVAAAAKVAKLAKAAKPGKAAVSGD 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1204  HTHFIS  244  1e-75  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 244 bits (624), Expect = 1e-75
Identities = 91/363 (25%), Positives = 155/363 (42%), Gaps = 33/363 (9%)

Query: 311 QMRQLMTSQLGKVSHTFAHMPQDDPQTRRLIHFGRQAARSSFPVLLCGEEGVGKALLSQA 370
+ S+L S + + + + ++ +++ GE G GK L+++A
Sbjct: 120 AEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 371 IHNESERAAGPYIAVNCELYGDAALAEEFIG---GDRTDNENGRLSRLELAHGGTLFLEK 427
+H+ +R GP++A+N + E G G T + R E A GGTLFL++
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 428 IEYLAVELQSALLQVIKQGVITRLDARRLIPIDVKVIATTTADLAMLVEQNRFSRQLYYA 487
I + ++ Q+ LL+V++QG T + R I DV+++A T DL + Q F LYY
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 488 LHAFEITIPPLRMRRGSIPALVNNKLRSLEKRFSTRLKIDDDALARLVSCAWPGNDFELY 547
L+ + +PPLR R IP LV + ++ EK + D +AL + + WPGN EL
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELE 359

Query: 548 SVIENLALSSDNGRIRVSDLPEHLFTEQATDDVSATRLSTS------------------- 588
+++ L I + L +E + +
Sbjct: 360 NLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASF 419

Query: 589 -----------LSFAEVEKEAIINAAQVTGGRIQEMSALLGIGRTTLWRKMKQHGIDAGQ 637
AE+E I+ A T G + + LLG+ R TL +K+++ G+ +
Sbjct: 420 GDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVSVYR 479

Query: 638 FKR 640
R
Sbjct: 480 SSR 482


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1205  PRTACTNFAMLY  212  3e-59  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 212 bits (540), Expect = 3e-59
Identities = 247/980 (25%), Positives = 402/980 (41%), Gaps = 117/980 (11%)

Query: 14 RLAELKIRSPSIQLIKFGAIGLNAIIFSPLLIAADTGSQYGTNITINDGDRI---TGDTA 70
+ A L+ + ++ L GA ++ I Q+G +I +D + +G T
Sbjct: 10 KAAPLRRTTLAMALGALGAAPAAHADWNNQSIVKTGERQHGIHIQGSDPGGVRTASGTTI 69

Query: 71 DPSGN-LYGVMTPAGNTPGNINLGNDVTVN---VNDASGYAKGIIIQGKNSSLTANRLTV 126
SG G++ N + N + ++D + K L A+ T+
Sbjct: 70 KVSGRQAQGILLE--NPAAELQFRNGSVTSSGQLSDDGIRRFLGTVTVKAGKLVADHATL 127

Query: 127 DVVGQT---SAIGINLIGDYTHADLGTGSTIKSNDDGIIIGHSSTLTATQFTIENSNGIG 183
VG T I + + G+ A + ST++ G+ I + +T + I + G+
Sbjct: 128 ANVGDTWDDDGIALYVAGEQAQASIAD-STLQGAG-GVQIERGANVTVQRSAIVD-GGLH 184

Query: 184 LTINDYGTSVDLGSGSKIKTDGS-TGVYIGGLNGNNANGAARFTATDLTID---VQGYSA 239
+ DL + D + T V G + A++LT+D + G A
Sbjct: 185 IGALQSLQPEDLPPSRVVLRDTNVTAVPASGAPA----AVSVLGASELTLDGGHITGGRA 240

Query: 240 MGINVQKNSVVDLGTNSTIKTNGDNAHGLWSFGQVSANAL-------TVDVTGAAANGVE 292
G+ + +VV L +TI+ A G G V A+ GV+
Sbjct: 241 AGVAAMQGAVVHL-QRATIRRGDAPAGGAVPGGAVPGGAVPGGFGPGGFGPVLDGWYGVD 299

Query: 293 VRGGTTTIGADSHISSAQGGGLVTSSSDATINFSG---TAAQRNSIFSGGSYGASAQTAT 349
V G + + A S + + + G + A + SG +A N I +GG+ + Q A
Sbjct: 300 VSGSSVEL-AQSIVEAPELGAAIRVGRGARVTVSGGSLSAPHGNVIETGGARRFAPQAAP 358

Query: 350 AVINMQNTDITVDRNGSLALGLWALSGGRITGDSLAITGAAGARGIYAMTNSQIDLTSDL 409
I +Q G+ A G L L +TG A A+G T + +
Sbjct: 359 LSITLQA--------GAHAQGKALLYRVLPEPVKLTLTGGADAQGDIVATELPSIPGTSI 410

Query: 410 VIDMSTPDQMAIATQHDDGYAASRINASGRMLINGSVLSKGGLINLDMHPGSVWTGSSLS 469
P +A+A+ + WTG++
Sbjct: 411 -----GPLDVALAS------------------------------------QARWTGAT-- 427

Query: 470 DNVNGGKLDVAMNNSVWNVTSNSNLDTLAL-SHSTVDFASHGSTAGTFTTLNVENLSGNS 528
V+ +D N+ W +T NSN+ L L S +VDF + AG F L V L+G+
Sbjct: 428 RAVDSLSID----NATWVMTDNSNVGALRLASDGSVDFQQ-PAEAGRFKVLTVNTLAGSG 482

Query: 529 TFIMRADVVGEGNGVNNRGDLLNISGSSAGNHVLAIRNQGSEATTGNEVLTVVKTTDGAA 588
F M D L + ++G H L +RN GSE + N +L V AA
Sbjct: 483 LFRMNV------FADLGLSDKLVVMQDASGQHRLWVRNSGSEPASANTLLLVQTPLGSAA 536

Query: 589 SFSASS---QVELGGYLYDVRKNG-TNWELYASGTVPEPTPNPEPTPAPAQPPIVNPD-P 643
+F+ ++ +V++G Y Y + NG W L + P P P P+P P P QPP P+ P
Sbjct: 537 TFTLANKDGKVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQPQPEAP 596

Query: 644 TPEPAPTPKPTTTADAGGNYLNVGYL--LNYVENRTLMQRMGDLRNQSKDGNIWLRSYG- 700
P+P + + A+A N VG L Y E+ L +R+G+LR G W R +
Sbjct: 597 APQPPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRLNPDAGGAWGRGFAQ 656

Query: 701 -GSLDSFASGKLSGFDMGYSGIQFGGDKRLSDVM-PLYVGLYIDSTHASPDYSG-GDGTA 757
LD+ A + FD +G + G D ++ ++G T ++G G G
Sbjct: 657 RQQLDNRAGRR---FDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRGFTGDGGGHT 713

Query: 758 RSDYMGMYASYMAQNGFYSDLVIKASRQKNSFHVLDSQNNGVNANGTANGMSISLEAGQR 817
S ++G YA+Y+A +GFY D ++ASR +N F V S V +G+ SLEAG+R
Sbjct: 714 DSVHVGGYATYIADSGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGASLEAGRR 773

Query: 818 FNLSPTGYGFYIEPQTQLTYSHQNEMAMKASNGLNIHLNHYESLLGRASMILGYDIT-AG 876
F + G+++EPQ +L A +A+NGL + S+LGR + +G I AG
Sbjct: 774 FTHAD---GWFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEVGKRIELAG 830

Query: 877 NSQLNVYVKTGAIREFSGDTEYLLNDSREKYSFKGNGWNNGVGVSAQYNKQHTFYLEADY 936
Q+ Y+K ++EF G N + +G G+G++A + H+ Y +Y
Sbjct: 831 GRQVQPYIKASVLQEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRGHSLYASYEY 890

Query: 937 TQGNLFDQK-QVNGGYRFSF 955
++G + GYR+S+
Sbjct: 891 SKGPKLAMPWTFHAGYRYSW 910


68  SF1223  SF1226  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
SF1223-2151.534053invasin
SF1224-2181.917725transcriptional regulator NarL
SF1225-1202.185854nitrate/nitrite sensor protein NarX
SF1226-2231.691905nitrate transport protein NarK
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1223  INTIMIN  254  2e-78  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 254 bits (650), Expect = 2e-78
Identities = 120/378 (31%), Positives = 197/378 (52%), Gaps = 21/378 (5%)

Query: 32 GEQAKAFALGKVRDALSQQVNQHVESWLSPWGNASVDVKVDNEGHFTGSRGSWFVPLQDN 91
G+ AK ALG + Q + +++WL +G A V+++ N F GS + +P D+
Sbjct: 184 GDYAKDTALGIAGN----QASSQLQAWLQHYGTAEVNLQSGNN--FDGSSLDFLLPFYDS 237

Query: 92 DRYLTWSQLGLTQQDNGLVSNVGVGQRWARGNWLVGYNTFYDNLQDENLQRAGFGAEAWG 151
++ L + Q+G D+ +N+G GQR+ ++GYN F D + R G G E W
Sbjct: 238 EKMLAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDFSGDNTRLGIGGEYWR 297

Query: 152 EYLRLSANFYQPFAAWHE--QTATQEQRMARGYDLTARMRMPFYQHLNTSVSLEQYFGDR 209
+Y + S N Y + WHE ++R A G+D+ +P Y L + EQY+GD
Sbjct: 298 DYFKSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDN 357

Query: 210 VDLFNSGTGYHNPVALSLGLNYTPVPLVTVTAQHKQGESGENQNNLGLNLNYRFGVPLKK 269
V LFNS NP A ++G+NYTP+PLVT+ ++ G EN + Y+F P +
Sbjct: 358 VALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQ 417

Query: 270 QLSAGEVAESQSLRGSRYDNPQRNNLPTLEYRQRKTLTVFLATPPWDLKPGETVPLKLQI 329
Q+ V E ++L GSRYD QRNN LEY+++ L++ + + T ++L +
Sbjct: 418 QIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNI-PHDINGTERSTQKIQLIV 476

Query: 330 RSRYGIRQLIWQGDTQILS-----LTPGAQANSAEGWTLIMPDWQNGEGASNHWRLSVVV 384
+S+YG+ +++W D+ + S G+Q SA+ + I+P + +G SN ++++
Sbjct: 477 KSKYGLDRIVWD-DSALRSQGGQIQHSGSQ--SAQDYQAILPAYV--QGGSNVYKVTARA 531

Query: 385 EDNQGQRVSSNEITLTLV 402
D G SSN + LT+
Sbjct: 532 YDRNGN--SSNNVLLTIT 547


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1224  HTHFIS  74  2e-17  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 73.7 bits (181), Expect = 2e-17
Identities = 32/117 (27%), Positives = 56/117 (47%), Gaps = 2/117 (1%)

Query: 7 ATILLIDDHPMLRTGVKQLISMAPDITVVGEASNGEQGIELAESLDPDLILLDLNMPGMN 66
ATIL+ DD +RT + Q +S A + SN + D DL++ D+ MP N
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRI--TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 67 GLETLDKLREKSLSGRIVVFSVSNHEEDVVTALKRGADGYLLKDMEPEDLLKALHQA 123
+ L ++++ ++V S N + A ++GA YL K + +L+ + +A
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1225  PF06580  53  1e-09  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 53.3 bits (128), Expect = 1e-09
Identities = 36/172 (20%), Positives = 73/172 (42%), Gaps = 23/172 (13%)

Query: 424 PESSRELLSQIRNELNASWAQLRELLTTFRLQLTEPGLRPALEASCEEYSAKFGFPVKLD 483
P +RE+L+ + + S + +LT +++ + S +F ++ +
Sbjct: 190 PTKAREMLTSLSELMRYSLRYSNARQVSLADELT------VVDSYLQLASIQFEDRLQFE 243

Query: 484 YQLPPRL----VPSHQAIHLLQIAREALSNALKH-----SQASEVVVTVAQNDNQVKLTV 534
Q+ P + VP L+Q E N +KH Q ++++ +++ V L V
Sbjct: 244 NQINPAIMDVQVPPM----LVQTLVE---NGIKHGIAQLPQGGKILLKGTKDNGTVTLEV 296

Query: 535 QDNGCGVPENAIRSNHYGMIIMRDRAQSLRG-DCRVRRRESGGTEVVVTFIP 585
++ G +N S G+ +R+R Q L G + +++ E G + IP
Sbjct: 297 ENTGSLALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1226  ACRIFLAVINRP  33  0.004  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 32.5 bits (74), Expect = 0.004
Identities = 35/166 (21%), Positives = 60/166 (36%), Gaps = 22/166 (13%)

Query: 258 IMSLLYLATFGSFIGFSAGFAMLSKTQFPDVQILQYAFFGPFIGALARSA---GGALSDR 314
I+S + L+ + I A A L K + + FFG F S ++
Sbjct: 474 IVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKI 533

Query: 315 LGGTRVTLVNFILMAIFSGLLFLTLPTD----GQGGSFMAFFAVFLALFLTAGLGSGSTF 370
LG T L+ + L+ +LFL LP+ G F+ L +G+T
Sbjct: 534 LGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTM----------IQLPAGATQ 583

Query: 371 QMISVIFRKLTMDRVKAEGGSDER-----AMREAATGTAAALGFIS 411
+ + ++T +K E + E + A + F+S
Sbjct: 584 ERTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVS 629


69  SF1685  SF1697  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
SF1685014-0.422373transporter
SF1687015-1.172284DNA-binding transcriptional regulator
SF1688-118-1.366050transporter
SF1689020-2.179046cyclopropane fatty acyl phospholipid synthase
SF1690022-1.786390riboflavin synthase subunit alpha
SF1694-121-1.133732hypothetical protein
SF1695-216-2.155625*monooxygenase
SF1697-215-1.951692hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1685  TCRTETA  46  2e-07  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 45.6 bits (108), Expect = 2e-07
Identities = 75/382 (19%), Positives = 138/382 (36%), Gaps = 22/382 (5%)

Query: 7 LLALAIGAFGIGTTEFSPMGLLPVIARGVDVSIPAA---GMLISAYAVGVMVGAPLMTLL 63
L +A+ A GIG M +LP + R + S G+L++ YA+ AP++ L
Sbjct: 11 LSTVALDAVGIGLI----MPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 64 LSHRARRSALIFLMAIFTLGNVLSAIAPDYMTLMLSRILTSLNHGAFFGLGSVVAASVVP 123
RR L+ +A + + A AP L + RI+ + G+ + A +
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYI-ADITD 125

Query: 124 KHKQASAVATMFMGLTLANIGGVPAATWLGETIGWRMSFLATAGLGVISMVSLFFSLPKG 183
++A M + G +G F A A L ++ ++ F LP+
Sbjct: 126 GDERARHFGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 184 GTGARPEVKKELAVLMRPQVLSALLTTVLGAGAMFTLYTYISPVLQSI--------THAT 235
G R +++E + + +T V A+F + + V ++ H
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWD 244

Query: 236 PVFVTAMLVLIGVGFSIGN-YLGGKLADRSVNGTLKGFLLLLMVIMLAIPFLARNEFGAA 294
+ L G+ S+ + G +A R ++ + A + A
Sbjct: 245 ATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAF 304

Query: 295 ISMVVWGAATFAVVPPLQ-MRVMHVASEAPGLSSSVNIGAFNLGNALGAAAGGAVISAGL 353
MV+ + +P LQ M V E G +L + +G A+ +A +
Sbjct: 305 PIMVLLASGGIG-MPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASI 363

Query: 354 GY--SFVPVTGAIVAGLALLLV 373
+ + GA + L L +
Sbjct: 364 TTWNGWAWIAGAALYLLCLPAL 385


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1688  TCRTETB  75  8e-17  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 74.9 bits (184), Expect = 8e-17
Identities = 45/188 (23%), Positives = 81/188 (43%), Gaps = 2/188 (1%)

Query: 6 RFLVWLAGLSVLGFLATDMYLPAFAAIQADLQTPASAVSASLSLFLAGFAAAQLLWGPLS 65
+ L+WL LS L + + I D P ++ + + F+ F+ ++G LS
Sbjct: 14 QILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLS 73

Query: 66 DRYGRKPVLLIGLTIFALGSLGMLWVENAATLLVL-RFVQAVGVCAAPVIWQALVTDYYR 124
D+ G K +LL G+ I GS+ + +LL++ RF+Q G A P + +V Y
Sbjct: 74 DQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIP 133

Query: 125 SQKVNRIFATIMPLVGLSPALAPLLGSWIMVHFSWQAILATLYAITVVMILPIFWLKPTT 184
+ + F I +V + + P +G I + W + L + IT++ + + L
Sbjct: 134 KENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW-SYLLLIPMITIITVPFLMKLLKKE 192

Query: 185 KARNNSQD 192
D
Sbjct: 193 VRIKGHFD 200


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1694  IGASERPTASE  30  0.023  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.023
Identities = 26/128 (20%), Positives = 45/128 (35%), Gaps = 5/128 (3%)

Query: 278 VELTGSVALLDGASMIIGYGAELQQSTITV-QQGGVLILDGSTVKGDSVTFIVGNINLNG 336
V LT S + G + + G S + + + + S V + G+I+LN
Sbjct: 819 VNLTESANFVLGKANLFGTIQSRGNSQVRLTENSHWHLTGNSDVH--QLDLANGHIHLNS 876

Query: 337 GKLWLITGAATHVQLKVKRLRGEGAICLQTSAKEISPDFINVKGEVNGDIRVEITDASRQ 396
+ L V L G G+ T D + V G+ +++ D + +
Sbjct: 877 ADN--SNNVTKYNTLTVNSLSGNGSFYYLTDLSNKQGDKVVVTKSATGNFTLQVADKTGE 934

Query: 397 TLCNALKL 404
N L L
Sbjct: 935 PNHNELTL 942


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1697  ICENUCLEATIN  29  0.026  Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 29.0 bits (64), Expect = 0.026
Identities = 39/163 (23%), Positives = 74/163 (45%), Gaps = 24/163 (14%)

Query: 108 TAGYAAQIASMGYSVRIGSVGFNSHIGSSGERARVAVTGNSSRISSAGDSSRIANTGMRV 167
TAGY + + S SV++ +GER ++ +S++ +AGD S++
Sbjct: 1113 TAGYRSTLISGADSVQM-----------AGERGKLIAGADSTQ--TAGDRSKL--LAGNN 1157

Query: 168 RVCTLGERCHVASNGDLVQIASFGANARIANSGDNVHIIASGENSTVVSTGVVDSIILGL 227
T G+R + + D + +A G +++ +G N + A + + S G L
Sbjct: 1158 SYLTAGDRSKLTAGNDCILMA--GDRSKLT-AGINSILTAGCRSKLIGSNGST----LTA 1210

Query: 228 GGSAALAYH--DGERVRFAVAIEGENNIRTGVRYRLNEQHQFV 268
G ++ L + DG+R VA G+ I + Y+++E + V
Sbjct: 1211 GENSVLIFRCWDGKRYTNVVAKTGKGGIEADMPYQMDEDNNIV 1253


70  SF1931  SF1939  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
SF19311141.016129chemotaxis regulatory protein CheY
SF19320121.263106chemotaxis-specific methylesterase
SF19330121.176984chemotaxis methyltransferase CheR
SF19340130.920495methyl-accepting protein IV
SF1935-1140.345057methyl-accepting chemotaxis protein II
SF1936-2140.259545purine-binding chemotaxis protein
SF1937-1140.367840chemotaxis protein CheA
SF1938-117-0.365333flagellar motor protein MotB
SF1939-312-0.913285flagellar motor protein MotA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1931  HTHFIS  90  4e-24  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.5 bits (222), Expect = 4e-24
Identities = 30/105 (28%), Positives = 51/105 (48%), Gaps = 3/105 (2%)

Query: 7 KFLVVDDFSTMRRIVRNLLKELGFNNVEEAEDGVDALNKLQAGGYGFVISDWNMPNMDGL 66
LV DD + +R ++ L G++ V + + AG V++D MP+ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYD-VRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 67 ELLKTIRADGAMSALPVLMVTAEAKKENIIAAAQAGASGYVVKPF 111
+LL I+ LPVL+++A+ I A++ GA Y+ KPF
Sbjct: 64 DLLPRIKKARPD--LPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1932  HTHFIS  65  9e-14  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 65.2 bits (159), Expect = 9e-14
Identities = 35/188 (18%), Positives = 72/188 (38%), Gaps = 23/188 (12%)

Query: 1 MSKIRVLSVDDSALMRQIMTEIINSHSDMEMVATAPDPLVARDLIKKFNPDVLTLDVEMP 60
M+ +L DD A +R ++ + ++ V + I + D++ DV MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 RMDGLDFLEKLMRLRPMPVVMVSSLTGKGS-EVTLRALELGAIDFVTKPQLGIREGMLAY 119
+ D L ++ + RP V+V ++ + + ++A E GA D++ KP + E +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLV--MSAQNTFMTAIKASEKGAYDYLPKP-FDLTELIGII 115

Query: 120 SEMIAEKVRTAAKASLAAHKPLSAPTTLKAGPLLSSEKLIAIGASTGGTEAIRHVLQPLP 179
+AE R +K + + +G S E R + + +
Sbjct: 116 GRALAEPKRRPSKLEDDSQDGMP-----------------LVGRSAAMQEIYRVLARLMQ 158

Query: 180 LSSPALLI 187
++
Sbjct: 159 TDLTLMIT 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1937  PF06580  42  4e-06  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 42.2 bits (99), Expect = 4e-06
Identities = 23/151 (15%), Positives = 49/151 (32%), Gaps = 52/151 (34%)

Query: 361 ELDKSLIERIIDPLT--HLVRNSLDHGIELPEKRLAAGKNSVGNLILSAEHQGGNICIEV 418
+++ ++++ + P+ LV N + HGI G ++L G + +EV
Sbjct: 245 QINPAIMDVQVPPMLVQTLVENGIKHGIA--------QLPQGGKILLKGTKDNGTVTLEV 296

Query: 419 TDDGAGLNRERILAKAASQGLTVSENMSDDEVAMLIFAPGFSTAEQVTDVSGRGVGMDVV 478
+ G+ + G G+ V
Sbjct: 297 ENTGSLALKNTK--------------------------------------ESTGTGLQNV 318

Query: 479 KRNIQEMGG---HVEIQSKQGTGTTIRILLP 506
+ +Q + G +++ KQG +L+P
Sbjct: 319 RERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1938  PF05272  31  0.009  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.8 bits (69), Expect = 0.009
Identities = 22/93 (23%), Positives = 35/93 (37%), Gaps = 11/93 (11%)

Query: 46 LISISSPKELIQIAEYFRTPLATAVTGGDRISNSESPIPGGGDDYTQSQGEVNKQPNIEE 105
L +SSP A P + G + ++ PGGGDD GE +++
Sbjct: 384 LADVSSPTAAAGGAGGGEPPKKRDPSAG---AGTDPGGPGGGDD-----GEDPFGEWLDD 435

Query: 106 LKKRM---EQSRLRKLRGDLDQLIESDPKLRAL 135
R+ + L+ R L + + S P L
Sbjct: 436 EVARLRLRGRWLLKPRRAALIEALRSAPALAGC 468


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1939  PF05844  33  0.001  YopD protein
		>PF05844#YopD protein

Length = 295

Score = 33.1 bits (75), Expect = 0.001
Identities = 12/28 (42%), Positives = 22/28 (78%), Gaps = 2/28 (7%)

Query: 76 MDLLALLYRLMAKSRQMGMFSLERDIEN 103
++LL +L+R+ K+R++G+ L+RD EN
Sbjct: 74 VELLLILFRIAQKARELGV--LQRDNEN 99


71  SF1966  SF1995  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
SF1966-112-1.545680flagellin
SF1967-2160.071176flagellar capping protein
SF1968-114-0.198742flagellar protein FliS
SF1969-113-1.621292flagellar biosynthesis protein FliT
SF1970-112-3.624710cytoplasmic alpha-amylase
SF1971122-4.881812lipoprotein
SF1972227-6.180130hypothetical protein
SF1973436-8.505129hypothetical protein
SF1977124-5.629539porin
SF1978223-2.904975regulator
SF19790172.044716kinase inhibitor
SF19801183.276518multidrug efflux protein
SF19821183.905880flagellar hook-basal body protein FliE
SF19841183.688263flagellar motor switch protein G
SF1985-1173.048522flagellar assembly protein H
SF1986-2182.801276flagellum-specific ATP synthase
SF1987-2161.719021flagellar biosynthesis chaperone FliJ
SF1988-2171.793737flagellar hook-length control protein
SF1989-2221.325900flagellar basal body protein FliL
SF1990119-0.034524flagellar motor switch protein FliM
SF1991219-3.218047flagellar motor switch protein FliN
SF1992121-4.122128flagellar protein FliO
SF1993121-4.994159flagellar biosynthesis protein FliP
SF1994017-3.598723flagellar biosynthesis protein FliQ
SF1995-118-3.624028flagellar biosynthesis protein FliR
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1966  FLAGELLIN  234  9e-73  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 234 bits (599), Expect = 9e-73
Identities = 260/551 (47%), Positives = 311/551 (56%), Gaps = 47/551 (8%)

Query: 2 AQVINTNSLSLITQNNINKNQSALSSSIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 61
AQVINTNSLSL+TQNN+NK+QS+LSS+IERLSSGLRINSAKDDAAGQAIANRFTSNIKGL
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 TQAARNANDGISVAQTTEGALSEINNNLQRIRELTVQASTGTNSDSDLDSIQDEIKSRLD 121
TQA+RNANDGIS+AQTTEGAL+EINNNLQR+REL+VQA+ GTNSDSDL SIQDEI+ RL+
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 EIDRVSGQTQFNGVNVLAKDGSMKIQVGANDGQTITIDLKKIDSDTLGLNGFNVNGGGAV 181
EIDRVS QTQFNGV VL++D MKIQVGANDG+TITIDL+KID +LGL+GFNVNG
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEA 180

Query: 182 A---NTAASKADLVAANATVVGNKYTVSAGYDAAKASDLLAGVSDGDTVQATINNGFGTA 238
++ K V NKY V A V D V A N T
Sbjct: 181 TVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAA-NGQLTTD 239

Query: 239 ASATNYKYDSASKSYSFDTTTASAADVQKYLTPGVGDTAKGTITIDGSAQDVQISSDGKI 298
+ N D + S T + A GDT +GK+
Sbjct: 240 DAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKV 299

Query: 299 TASNGDKLYIDTTGRLTKNGSGASLTEASLSTLAANNTKATTIDIGGTSISFTGNSTTPD 358
+ T NG +LT A ++ AAN AT S T D
Sbjct: 300 ST--------------TINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFD 345

Query: 359 TITYSVTGAKVDQAAFDKAVSTSGNNVDFTTAGYSVNGTTGAVTKGVDSVYVDNNEALTT 418
T + + D A + S V+ + G +
Sbjct: 346 DKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLA---------------- 389

Query: 419 SDTVDFYLQDDGSVTNGSGKAVYKDADGKLTTDAETKAATTADPLKALDEAISSIDKFRS 478
+ DA +TA+PL ++D A+S +D RS
Sbjct: 390 -------------GKTMFIDKTASGVSTLINEDAAAAKKSTANPLASIDSALSKVDAVRS 436

Query: 479 SLGAVQNRLDSAVTNLNNTTTNLSEAQSRIQDADYATEVSNMSKAQIIQQAGNSVLAKAN 538
SLGA+QNR DSA+TNL NT TNL+ A+SRI+DADYATEVSNMSKAQI+QQAG SVLA+AN
Sbjct: 437 SLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSNMSKAQILQQAGTSVLAQAN 496

Query: 539 QVPQQVLSLLQ 549
QVPQ VLSLL+
Sbjct: 497 QVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1967  TYPE3OMBPROT  32  0.005  Type III secretion system outer membrane B protein ...
		>TYPE3OMBPROT#Type III secretion system outer membrane B protein

family signature.
Length = 538

Score = 32.0 bits (72), Expect = 0.005
Identities = 24/72 (33%), Positives = 37/72 (51%), Gaps = 2/72 (2%)

Query: 214 NGMEVSVAAQNAQLTVNNVAIENSSNTISDALENITLNLNDVTTGNQTLTITQDTSKVQT 273
N E +VAA+N + + A+ + +S AL T++L V+T LT T T ++
Sbjct: 236 NSSERAVAARNKAEELVSAALYSRPELLSQALSGKTVDLKIVSTS--LLTPTSLTGGEES 293

Query: 274 AIKDWVNAYNSL 285
+KD VNA L
Sbjct: 294 MLKDQVNALKGL 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1972  RTXTOXIND  30  0.017  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 30.2 bits (68), Expect = 0.017
Identities = 10/57 (17%), Positives = 17/57 (29%), Gaps = 2/57 (3%)

Query: 164 RFTLLPIFRIPVKMQKVSAASPLTQKPDQARRRF--RLGMLVFFGMLGWALLTAMNQ 218
R L R + + + A L + P R R M ++L +
Sbjct: 26 RKQLDTPVREKDENEFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEI 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1973  PF01206  93  6e-29  SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 92.5 bits (230), Expect = 6e-29
Identities = 16/71 (22%), Positives = 37/71 (52%)

Query: 7 DYRLDMVGEPCPYPAVATLEAMPQLKKGEILEVVSDCPQSINNIPLDARNHGYTVLDIQQ 66
D LD G CP P + + + + GE+L V++ P S+ + ++ G+ +L+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 67 DGPTIRYLIQK 77
+ T + +++
Sbjct: 65 EDGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1977  ECOLIPORIN  509  0.0  E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 509 bits (1312), Expect = 0.0
Identities = 239/388 (61%), Positives = 282/388 (72%), Gaps = 33/388 (8%)

Query: 1 MKKLTVAISAVAASVLMAMSAQAAEIYNKDSNKLDLYGKVNAKHYFSSNDADDGDTTYVR 60
MK+ +A+ V ++L A +A AAEIYNKD NKLDLYGKV+ HYFS + + DGD TY+R
Sbjct: 1 MKRKVLAL--VIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMR 58

Query: 61 LGFKGETQINDQLTGFGQWEYEFKGNRAESQGSSKDKTRLAFAGLKFGDYGSIDYGRNYG 120
+GFKGETQINDQLTG+GQWEY + N E +G++ TRLAFAGLKFGDYGS DYGRNYG
Sbjct: 59 VGFKGETQINDQLTGYGQWEYNVQANTTEGEGANS-WTRLAFAGLKFGDYGSFDYGRNYG 117

Query: 121 VAYDIGAWTDVLPEFGGDTWTQTDVFMTGRTTGVATYRNNDFFGLVDGLNFAAQYQGKND 180
V YD+ WTD+LPEFGGD++T D +MTGR GVATYRN DFFGLVDGLNFA QYQGKN+
Sbjct: 118 VLYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNE 177

Query: 181 R----------------TDVTEANGDGFGFSTTYEY-EGFGVGATYAKSDRTNDQVIYGN 223
D+ NGDGFG STTY+ GF GA Y SDRTN+QV G
Sbjct: 178 SQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGG 237

Query: 224 NSLNASGQNAEVWAAGLKYDANNIYLATTYSETQNMTVFG------NNHIANKAQNFEVV 277
A G A+ W AGLKYDANNIYLAT YSET+NMT +G + +ANK QNFEV
Sbjct: 238 T--IAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVT 295

Query: 278 AQYQFDFGLRPSVAYLQSKGKDLG----AWGDQDLIEYIDVGATYYFNKNMSTFVDYKIN 333
AQYQFDFGLRP+V++L SKGKDL D+DL++Y DVGATYYFNKN ST+VDYKIN
Sbjct: 296 AQYQFDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKIN 355

Query: 334 LIDKSD-FTKASGVATDDIVAVGLVYQF 360
L+D D F K +G++TDDIVA+G+VYQF
Sbjct: 356 LLDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1978  HTHFIS  29  0.017  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.017
Identities = 8/30 (26%), Positives = 16/30 (53%)

Query: 176 RTKWTANKVARYLYISVSTLHRRLASEGIS 205
T+ K A L ++ +TL +++ G+S
Sbjct: 447 ATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1982  FLGHOOKFLIE  117  8e-38  Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE

signature.
Length = 103

Score = 117 bits (293), Expect = 8e-38
Identities = 102/103 (99%), Positives = 102/103 (99%)

Query: 2 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTVARTQAEKFTL 61
SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQT ARTQAEKFTL
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 60

Query: 62 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 104
GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV
Sbjct: 61 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1984  FLGMOTORFLIG  338  e-118  Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 338 bits (868), Expect = e-118
Identities = 117/329 (35%), Positives = 197/329 (59%), Gaps = 2/329 (0%)

Query: 1 MSNLTGTDKSVILLMTIGEDRAAEVFKHLSQREVQTLSAAMANVTQISNKQLTDVLAEFE 60
+S LTG K+ ILL++IG + +++VFK+LSQ E+++L+ +A + I+++ +VL EF+
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 61 QEAEQFAALNINANDYLRSVLVKALGEERAASLLEDILETRDTASGIETLNFMEPQSAAD 120
+ + DY R +L K+LG ++A ++ + L + + E + +P + +
Sbjct: 72 ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPANILN 130

Query: 121 LIRDEHPQIIATILVHLRRAQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLN 180
I+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 131 FIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLE 190

Query: 181 GLLDGQ-NLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENL 239
L + + GGV EIIN+ + E+ +I ++ E D ELA++I +MF+FE++
Sbjct: 191 KKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDI 250

Query: 240 VDVDDRSIQRLLQEVDSESLLIALKGAEQPLREKFLRNMSQRAADILRDDLAKRGPVRLS 299
V +DDRSIQR+L+E+D + L ALK + P++EK +NMS+RAA +L++D+ GP R
Sbjct: 251 VLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRK 310

Query: 300 QVENEQKAILLIVRRLAETGEMVIGSGED 328
VE Q+ I+ ++R+L E GE+VI G +
Sbjct: 311 DVEESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1985  FLGFLIH  373  e-135  Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 373 bits (958), Expect = e-135
Identities = 223/228 (97%), Positives = 226/228 (99%)

Query: 1 MSDNLPWKTWTPDDLAPPPAEFVPMVESEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60
MSDNLPWKTWTPDDLAPP AEFVP+VE EETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI
Sbjct: 1 MSDNLPWKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60

Query: 61 AEGRQQGHEQGYQEGLAQGLEQGLAEAKAQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120
AEGRQQGH+QGYQEGLAQGLEQGLAEAK+QQAPIHARMQQLVSEFQTTLDALDSVIASRL
Sbjct: 61 AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120

Query: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180
MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT
Sbjct: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180

Query: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228
LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV
Sbjct: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1987  FLGFLIJ  152  5e-51  Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 152 bits (384), Expect = 5e-51
Identities = 112/113 (99%), Positives = 113/113 (100%)

Query: 2 AEEQLKMLIDYQNEYRNNLNSDMSAGMTSNRWINYQQFIQTLEKAITQHRQQLNQWTQKV 61
AEEQLKMLIDYQNEYRNNLNSDMSAG+TSNRWINYQQFIQTLEKAITQHRQQLNQWTQKV
Sbjct: 35 AEEQLKMLIDYQNEYRNNLNSDMSAGITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKV 94

Query: 62 DIALNSWREKKQRLQAWQTLQERQSTAALLAENRLDQKKMDEFAQRAAMRKPE 114
DIALNSWREKKQRLQAWQTLQERQSTAALLAENRLDQKKMDEFAQRAAMRKPE
Sbjct: 95 DIALNSWREKKQRLQAWQTLQERQSTAALLAENRLDQKKMDEFAQRAAMRKPE 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1988  FLGHOOKFLIK  468  e-168  Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 468 bits (1204), Expect = e-168
Identities = 364/375 (97%), Positives = 369/375 (98%)

Query: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLTLLSEALAGETTTDKAAPQLLVATDKPTTK 60
MIRLAPLITADVDTTTLPGGKASDAAQDFL LLSEALAGETTTDKAAPQLLVATDKPTTK
Sbjct: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60

Query: 61 GEPLVSDILADAQQADLLIPVDETLPVINDEQSTSTPLTTAQTMTLAAVADKNTTKDEKA 120
GEPL+SDI++DAQQA+LLIPVDET PVINDEQSTSTPLTTAQTM LAAVADKNTTKDEKA
Sbjct: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120

Query: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPAEKPTLFTKLTSAQLTTAQPDDAP 180
DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLP EKPTLFTKLTS QLTTAQPDDAP
Sbjct: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP 180

Query: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTADASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240
GTPAQPLTPLVAEAQSKAEVISTPSPVTA ASPLITPHQTQPLPTVAAPVLSAPLGSHEW
Sbjct: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240

Query: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMISPHQHVRAALEAA 300
QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQM+SPHQHVRAALEAA
Sbjct: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300

Query: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360
LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS
Sbjct: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360

Query: 361 LQGRVTGNSGVDIFA 375
LQGRVTGNSGVDIFA
Sbjct: 361 LQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1990  FLGMOTORFLIM  380  e-134  Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 380 bits (977), Expect = e-134
Identities = 86/324 (26%), Positives = 148/324 (45%), Gaps = 10/324 (3%)

Query: 5 ILSQAEIDALLNGDS--EVKDEPTASISGESDIRPYDPNTQRRVVRERLQALEIINERFA 62
+LSQ EID LL S + E IS I YD + +E+++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 63 RHFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSPSL 122
R L LR + V ++ Y EF R++P P+ L +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 123 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 182
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 183 EMQVEFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 240
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 241 NEDQNWRDNLVRQVQHSQLELVANFADISLRLSQILKLKPGDVLPIEKP---DRIIAHVD 297
+ + L ++ +++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 298 GVPVLTSQYGTLNGQYALRIEHLI 321
Q G + + A +I I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1991  FLGMOTORFLIN  210  6e-74  Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 210 bits (537), Expect = 6e-74
Identities = 125/137 (91%), Positives = 133/137 (97%)

Query: 1 MSDMNNPADDNNGAMDDLWAEALSEQKSTSEKSAADAVFQQFGGGDVSGTLQDIDLIMDI 60
MSDMNNP+D+N GA+DDLWA+AL+EQK+T+ KSAADAVFQQ GGGDVSG +QDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1993  FLGBIOSNFLIP  281  4e-99  Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP

signature.
Length = 245

Score = 281 bits (720), Expect = 4e-99
Identities = 203/204 (99%), Positives = 204/204 (100%)

Query: 1 MQTLVFITSLTFIPAILLMMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIM 60
+QTLVFITSLTFIPAILLMMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIM
Sbjct: 42 VQTLVFITSLTFIPAILLMMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIM 101

Query: 61 SPVIDKIYVDAYQPFSEEKISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQG 120
SPVIDKIYVDAYQPFSEEKISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQG
Sbjct: 102 SPVIDKIYVDAYQPFSEEKISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQG 161

Query: 121 PEAVPMRILLPAYVTSELKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPF 180
PEAVPMRILLPAYVTSELKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPF
Sbjct: 162 PEAVPMRILLPAYVTSELKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPF 221

Query: 181 KLMLFVLVDGWQLLVGSLAQSFYS 204
KLMLFVLVDGWQLLVGSLAQSFYS
Sbjct: 222 KLMLFVLVDGWQLLVGSLAQSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1994  TYPE3IMQPROT  67  1e-18  Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein

family signature.
Length = 86

Score = 67.1 bits (164), Expect = 1e-18
Identities = 22/78 (28%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALVTGLIISILQAATQINEMTLSFIPKIIAVFIAII 63
+ ++ G +A+ + L L+ +VA + GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF1995  TYPE3IMRPROT  203  4e-67  Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein

family signature.
Length = 261

Score = 203 bits (517), Expect = 4e-67
Identities = 254/261 (97%), Positives = 257/261 (98%)

Query: 1 MMQETSDQWLSWLSLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60
M+Q TS+QWLSWL+LYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPGSHL 120
NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDP SHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGSEPLNSNAFLAPTKAGSLIF 180
NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIG EPLNSNAFLA TKAGSLIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240
LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEIFNLLADIISELPLI 261
EHLFSEIFNLLADIISELPLI
Sbjct: 241 EHLFSEIFNLLADIISELPLI 261


72  SF2139  SF2151  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
SF2139  -1  18   3.158113  acriflavin resistance protein AcrA-like protein
SF2141  -1  18   3.200526  multidrug efflux system subunit MdtC
SF2142  -2  12   1.025281  transporter
SF2143  -1   9  -0.265849  signal transduction histidine-protein kinase
SF2144   1  12  -2.084952  DNA-binding transcriptional regulator BaeR
SF2145   1  14  -3.010950  hypothetical protein
SF2146   0  13  -2.655981  protease
SF2148   0  23  -4.115627  hypothetical protein
SF2149   3  18  -2.354875  lipid kinase
SF2150   2  20  -2.781502  galactitol utilization operon repressor
SF2151   1  21  -3.097530  galactitol-1-phosphate dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2139  RTXTOXIND  44  5e-07  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 44.0 bits (104), Expect = 5e-07
Identities = 33/167 (19%), Positives = 64/167 (38%), Gaps = 11/167 (6%)

Query: 61 ALAQTQGQLAKDKATLANARRDLARYQQLAKTNLVSRQELDAQQALVSETEGTIKADEAS 120
+ +L K+ L ++ AK +L + L + T +
Sbjct: 260 KYVEAVNELRVYKSQLEQIESEILS----AKEEYQLVTQLFKNEILDKLRQTTDNIGLLT 315

Query: 121 --VASAQLQLDWSRITAPVDGRV-GLKQVDVGNQISSGDTTGIVVITQTHPIDLVFTLPE 177
+A + + S I APV +V LK G +++ +T +V++ + +++ +
Sbjct: 316 LELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL-MVIVPEDDTLEVTALVQN 374

Query: 178 SDIATVVQAQKAGKPLMVEAWDRTNSKKL-SEGTLLSLDNQIDATTG 223
DI + Q A + VEA+ T L + ++LD D G
Sbjct: 375 KDIGFINVGQNA--IIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLG 419



Score = 43.7 bits (103), Expect = 8e-07
Identities = 21/122 (17%), Positives = 47/122 (38%), Gaps = 13/122 (10%)

Query: 15 GTITAA-NTVTVRSRVDGQLMALHFQEGQQVKAGDLLAEIDPSQFKVALAQTQGQLAKDK 73
G +T + + ++ + + + +EG+ V+ GD+L ++ + K +
Sbjct: 88 GKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTAL-------GAEADTLKTQ 140

Query: 74 ATLANARRDLARYQQLAKTNLVSRQELDAQQALVSETEGTIKADEASVASAQLQLDWSRI 133
++L AR + RYQ L+++ EL+ L E + L +
Sbjct: 141 SSLLQARLEQTRYQILSRS-----IELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQF 195

Query: 134 TA 135
+
Sbjct: 196 ST 197


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2141  ACRIFLAVINRP  905  0.0  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 905 bits (2340), Expect = 0.0
Identities = 286/1035 (27%), Positives = 502/1035 (48%), Gaps = 40/1035 (3%)

Query: 6 LFIYRPVATILLSVAITLCGILGFRMLPVAPLPQVDFPVIMVSASLPGASPETMASSVAT 65
FI RP+ +L++ + + G L LPVA P + P + VSA+ PGA +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERSLGRIAGVSEMTSSS-SLGSTRIILQFDFDRDINGAARDVQAAINAAQSLLPSGMP 124
+E+++ I + M+S+S S GS I L F D + A VQ + A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKANPSDAPIMILTLTSDT--YSQGELYDFASTQLAPTISQIDGVGDVDVGGSSL 182
+ S + +M+ SD +Q ++ D+ ++ + T+S+++GVGDV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVGLTPQALFNQGVSLDDVRTAISNANVRKPQG------ALEDGTHRWQIQTNDELK 236
A+R+ L L ++ DV + N + G AL I K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAAEYQPLIIHYN-NGGAVRLGDVATVTDSVQDVRNAGMTNAKPAILLMIRKLPEANIIQ 295
E+ + + N +G VRL DVA V ++ N KPA L I+ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDSIRAKLPELQETIPAAIDLQIAQDRSPTIRASLEEVEQTLIISVALVILVVFLFLRS 355
T +I+AKL ELQ P + + D +P ++ S+ EV +TL ++ LV LV++LFL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATIIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHL- 414
RAT+IP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGMKPLQAALQGTREVGFTVLSMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIGIS 474
E + P +A + ++ ++ +++ L AVF+P+ GG G + R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LLVSLTLTPMMCGWMLKASKPREQKRLRGFG----RMLVALQQGYGKSLKWVLNHTRLVG 530
+LV+L LTP +C +LK + GF Y S+ +L T
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 VVLLGTIALNI----SIPKTFFPEQDTGVLMGGIQADQSISFQ----AMRGKLQDFMKII 582
++ +A + +P +F PE+D GV + IQ + + + ++K
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 583 RD-DPAVDNVTGFT-GGSRVSSGMMFITLKPRDERS---ETAQQIIDRLRVKLAKEPGAN 637
+ +V V GF+ G ++GM F++LKP +ER+ +A+ +I R +++L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 638 LFLMAVQDIRVGGRQSNASYQYTLLSDDLAALREWEPKIRKKLATL-----PELADVNSD 692
+ + I G + ++ L D + + R +L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 693 QQDNGAEMNLVYDRDTMARLGIDVQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRY 752
++ A+ L D++ LG+ + N ++ A G ++ K+ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 753 TQDISALEKMFVINNEGKAIPLSYFAKWQPANAPLSVNHQGLSAASTISFNLPTGKSLSD 812
++K++V + G+ +P S F + + I G S D
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 813 ASAAIDRAMTQLGVPSTVRGSFAGTAQVFQETMNSQVILIIAAIATVYIVLGILYESYVH 872
A A ++ ++L P+ + + G + + + N L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 873 PLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRHGN 932
P++++ +P VG LLA LFN + ++G++ IG+ KNAI++V+FA +
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 933 LTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLEITIVGGLVMSQLL 992
EA A +R RPI+MT+LA + G LPL +S G GS + + I ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 993 TLYTTPVVYLFFDRL 1007
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031



Score = 78.7 bits (194), Expect = 7e-17
Identities = 77/446 (17%), Positives = 162/446 (36%), Gaps = 26/446 (5%)

Query: 588 VDNVTGFTGGS-RVSSGMMFITLKPRDERSETAQQIIDRLRVKLAKEPGANLFLMAVQDI 646
+DN+ + S S + +T + + Q+ ++L++ P + Q I
Sbjct: 72 IDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE----VQQQGI 127

Query: 647 RVGGRQSNASYQYTLLSDDLAALREW-----EPKIRKKLATLPELADVNSDQQDNGAE-- 699
V S+ +SD+ ++ ++ L+ L + DV GA+
Sbjct: 128 SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQL----FGAQYA 183

Query: 700 MNLVYDRDTMARLGID----VQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRYTQD 755
M + D D + + + + + + T P Q + R+
Sbjct: 184 MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNP 243

Query: 756 ISALEKMFVINNEGKAIPLSYFAK--WQPANAPLSVNHQGLSAASTISFNLPTGKSLSDA 813
+ +N++G + L A+ N + G AA +L D
Sbjct: 244 EEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL-DT 302

Query: 814 SAAIDRAMTQL--GVPSTVRGSFA-GTAQVFQETMNSQVILIIAAIATVYIVLGILYESY 870
+ AI + +L P ++ + T Q +++ V + AI V++V+ + ++
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 871 VHPLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRH 930
L +P +G L F + + + G++L IG++ +AI++V+
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 931 GNLTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLEITIVGGLVMSQ 990
L P+EA ++ ++ + +P+ GG + + ITIV + +S
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 991 LLTLYTTPVVYLFFDRLRLRFSRKPK 1016
L+ L TP + + + K
Sbjct: 483 LVALILTPALCATLLKPVSAEHHENK 508


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2142  TCRTETB  125  2e-33  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 125 bits (315), Expect = 2e-33
Identities = 97/429 (22%), Positives = 188/429 (43%), Gaps = 23/429 (5%)

Query: 20 FMQSLDTTIVNTALPSMAQSLGESPLHMHMVIVSYVLTVAVMLPASGWLADKVGVRNIFF 79
F L+ ++N +LP +A + P + V +++LT ++ G L+D++G++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 TAIVLFTLGSLFCALSGTLNELL-LARALQGVGGAMMVPVGRLTVMKIVPREQYMAAMTF 138
I++ GS+ + + LL +AR +QG G A + + V + +P+E A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQVGPLLGPALGGLLVEYASWHWIFLINIPVGIIGAIATLM-LMPNYTMQTRRFDL 197
+ +G +GPA+GG++ Y HW +L+ IP+ I + LM L+ FD+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 SGFLLLAVGMAVLTLALDGSKGTGLSPLAIAGLVAVGVVALVLYLLHARNNNRALFSLKL 257
G +L++VG+ L + + V V++ ++++ H R L
Sbjct: 202 KGIILMSVGIVFFMLF---------TTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FRTRTFSLGLAGSFAGRIGSGMLPFMTPVFLQIGLGFSPFHAG-LMMIPMVLGSMGMKRI 316
+ F +G+ M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 VVQVVNRFGYRRVLVATTLGLSLVTLLFMTTALL----GWYYVLPFVLFLQGMVNSTRFS 372
+V+R G VL +G++ +++ F+T + L W+ + V L G+ S +
Sbjct: 313 GGILVDRRGPLYVL---NIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL--SFTKT 367

Query: 373 SMNTLTLKDLPDNLASSGNSLLSMIMQLSMSIGVTIAGLLLGLFGSQHISVDSGTTQTVF 432
++T+ L A +G SLL+ LS G+ I G LL + + Q+ +
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTY 427

Query: 433 MYTWLSMAF 441
+Y+ L + F
Sbjct: 428 LYSNLLLLF 436


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2143  BCTERIALGSPF  31  0.009  Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F

signature.
Length = 408

Score = 31.3 bits (71), Expect = 0.009
Identities = 27/93 (29%), Positives = 34/93 (36%), Gaps = 27/93 (29%)

Query: 173 LATLLAALATFLLA-------------RGLLAPVKRLVDGTHKLAAGDFTTRVTPTSEDE 219
LATL+AA A L+A V+ V H LA + P S +
Sbjct: 77 LATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD---AMKCFPGSFER 133

Query: 220 L-----------GKLAQDFNQLASTLEKNQQMR 241
L G L N+LA E+ QQMR
Sbjct: 134 LYCAMVAAGETSGHLDAVLNRLADYTEQRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2144  HTHFIS  76  6e-18  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.6 bits (186), Expect = 6e-18
Identities = 28/136 (20%), Positives = 65/136 (47%), Gaps = 1/136 (0%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLISHGDQVLAYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + S+ + ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDVPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVARVKTILRRCK 129
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L K
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 130 PQRELQQQDAESPLII 145
+ + D++ + +
Sbjct: 124 RRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2148  LIPOLPP20  27  0.026  LPP20 lipoprotein precursor signature.
		>LIPOLPP20#LPP20 lipoprotein precursor signature.

Length = 175

Score = 26.6 bits (58), Expect = 0.026
Identities = 13/38 (34%), Positives = 24/38 (63%), Gaps = 1/38 (2%)

Query: 18 EGEMKKIAAISLISIFLISGCAVHNDETSIGKFGLAYK 55
+ ++KKI +S+++ +I GC+ H ++ I K AYK
Sbjct: 2 KNQVKKILGMSVVAAMVIVGCS-HAPKSGISKSNKAYK 38


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2151  DHBDHDRGNASE  34  7e-04  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 33.9 bits (77), Expect = 7e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 2/92 (2%)

Query: 156 AQGCENKNVIIIGAGT-IGLLAIQCAVALGAKSVTAIDISSEKLALAKSFGAMQTFNSSE 214
A+G E K I GA IG + + GA + A+D + EKL S + ++
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAH-IAAVDYNPEKLEKVVSSLKAEARHAEA 61

Query: 215 MSAPQMQSVLRELRFNQLILETAGVPQTVELA 246
A S + ++ E + V +A
Sbjct: 62 FPADVRDSAAIDEITARIEREMGPIDILVNVA 93


73  SF2434  SF2437  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
SF2434  0  36  -9.965891  multidrug resistance protein Y
SF2435  0  36  -9.109737  multidrug resistance protein K
SF2436  1  34  -8.366940  DNA-binding transcriptional activator EvgA
SF2437  1  34  -7.998789  hybrid sensory histidine kinase in two-component
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2434  TCRTETB  119  3e-31  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 119 bits (300), Expect = 3e-31
Identities = 98/408 (24%), Positives = 168/408 (41%), Gaps = 25/408 (6%)

Query: 19 VTIALSLATFMQMLDSTISNVAIPTISGFLGASTDEGTWVITSFGVANAIAIPVTGRLAQ 78
+ I L + +F +L+ + NV++P I+ WV T+F + +I V G+L+
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 79 RIGELRLFLLSVTFFSLSSLMCSLS-TNLDVLIFFRVVQGLMAGPLIPLSQSLLLRNYPP 137
++G RL L + S++ + + +LI R +QG A L ++ R P
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 138 EKRTFALALWSMTVIIAPICGPILGGYICDNFSWGWIFLINVPMGIIVLTLCLTLLKGRE 197
E R A L V + GP +GG I W +L+ +PM I+ L L +E
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKE 192

Query: 198 TETSPVKMNLPRLTLLVLGVGGLQIMLDKGRDLDWFNSSTIIILTVVSVISLISLVIWES 257
K + ++++ VG + ML F +S I +VSV+S + V
Sbjct: 193 VRI---KGHFDIKGIILMSVGIVFFML--------FTTSYSISFLIVSVLSFLIFVKHIR 241

Query: 258 TSENPILDLSLFKSRNFTIGIVSITCAYLFYSGAIVLMPQLLQKTMGYNAIWAGLAYAPI 317
+P +D L K+ F IG++ + +G + ++P +++ + G
Sbjct: 242 KVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFP 301

Query: 318 GIMPLLISPLIG-----RYGNKIDMRVLVTFSFLMYAVCYYWRSVTFMPTIDFTGIILPQ 372
G M ++I IG R G + + VTF +V + S T F II+
Sbjct: 302 GTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL----SVSFLTASFLLETTSWFMTIIIVF 357

Query: 373 FFQGFAVACFFLPLTTISFSGLPDNKFANASSMSNFFRTLSGSVGTSL 420
G + ++TI S L + S+ NF LS G ++
Sbjct: 358 VLGGLSFTK--TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2435  RTXTOXIND  77  1e-17  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 77.2 bits (190), Expect = 1e-17
Identities = 63/419 (15%), Positives = 125/419 (29%), Gaps = 96/419 (22%)

Query: 8 KKQSNRKKYFSLLVIVLFIAFSGAYAYWSMELEDMISTDDAYVT-GNADPISAQVSGSVT 66
+ +R+ I+ F+ + + ++E + + + G + I + V
Sbjct: 50 ETPVSRRPRLVAYFIMGFLVIAFILSVLG-QVEIVATANGKLTHSGRSKEIKPIENSIVK 108

Query: 67 VVNHKDTNYVRQGDILVSLDKTDATIALNKA----------------------------- 97
+ K+ VR+GD+L+ L A K
Sbjct: 109 EIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPEL 168

Query: 98 -----------------------KNNLANIVRQTNKLYLQDKQYSAEVASARIQ---YQQ 131
K + Q + L + AE + + Y+
Sbjct: 169 KLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYEN 228

Query: 132 SLEDYNRRV----PLAKQGVISKE----------TLEHTKDTLISSKAALNAAIQAYKAN 177
R+ L + I+K + S + + I + K
Sbjct: 229 LSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEE 288

Query: 178 KALVMN-------TPLNR-QPQVVEAADATKEAWLVLKRTDIRSPVTGYIAQRSVQ-VGE 228
LV L + + + + + IR+PV+ + Q V G
Sbjct: 289 YQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGG 348

Query: 229 TVSSGQSLMAVVPARQ-MWVNANFKETQLTDVRIGQSVNIISDLYGENVVFHGRVTGINM 287
V++ ++LM +VP + V A + + + +GQ+ I + F G +
Sbjct: 349 VVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE------AFPYTRYGYLV 402

Query: 288 GTGNAFSLLPAQNATGNWIKIVQRVPVEVSLDPKELMEH----PLRIGLSMTATIDTKD 342
G + + +V V +S++ L PL G+++TA I T
Sbjct: 403 GK---VKNINLDAIEDQRLGLVFNVI--ISIEENCLSTGNKNIPLSSGMAVTAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SF2436  HTHFIS  49  3e-09  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.1 bits (117), Expect = 3e-09
Identities = 22/148 (14%), Positives = 53/148 (35%), Gaps = 31/148 (20%)

Query: 4 IIIDDHPLAIAAIRNLLIKNDIEILAELTEGGSAVQRVETLKPDIVIIDVDIPGVNGIQV 63
++ DD + L + ++ + + + + D+V+ DV +P N +
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 64 LETLRKRQYSGIIIIVSAKNDHFYGKHCADAGANGFVSKKEGMNNIIAAIEAAKNGYCYF 123
L ++K + ++++SA+N + AI+A++ G +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNT------------------------FMTAIKASEKGAYDY 101

Query: 124 ---PFSLNRFVGSLTSDQQKLDSLSKQE 148
PF L + + L ++
Sbjct: 102 LPKPFDLTE---LIGIIGRALAEPKRRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SF2437    HTHFIS  80     2e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.9 bits (197), Expect = 2e-17
Identities = 30/105 (28%), Positives = 51/105 (48%)

Query: 960 SILIADDHPTNRLLLKRQLNLLGYDVDEATDGVQALHKVSMQHYDLLITDVNMPNMDGFE 1019
+IL+ADD R +L + L+ GYDV ++ ++ DL++TDV MP+ + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 1020 LTRKLREQNSSLPIWGLTANAQANEREKGLSCGMNLCLFKPLTLD 1064
L ++++ LP+ ++A K G L KP L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


74  SF3186  SF3193  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF3186    0       13       0.576756   fimbrial protein
SF3187    1       12       2.390968   SAM-dependent 16S ribosomal RNA C1402 ribose
SF3188    0       13       2.502150   glycosylase
SF3189    0       16       1.840721   hypothetical protein
SF3190    1       17       1.767606   chromosome replication initiator DnaA
SF3191    1       18       2.485079   outer membrane lipoprotein
SF3192    0       20       2.689794   hypothetical protein
SF3193    0       20       1.272885   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3186    FIMBRIALPAPF  29     0.022    Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein

signature.
Length = 167

Score = 28.9 bits (64), Expect = 0.022
Identities = 41/160 (25%), Positives = 67/160 (41%), Gaps = 21/160 (13%)

Query: 208 VKLSIQGNLTAPQSCKINQGDVIKVNFGFINGQKFTTRNAMPDGFTPVDFDITYDCGDTS 267
V+++I+GN+ P C IN G I V+FG IN + V +I+ C S
Sbjct: 21 VQINIRGNVYIP-PCTINNGQNIVVDFGNINPEHVDNSRG------EVTKNISISCPYKS 73

Query: 268 KIKNSLQMRIDGTTGVVDQYNLVARRRSSDNVPDVGIRIENLGGGVANIPFQNG------ 321
SL +++ G T V Q N++A N+ GI + G + NG
Sbjct: 74 ---GSLWIKVTGNTMGVGQNNVLA-----TNITHFGIALYQGKGMSTPLTLGNGSGNGYR 125

Query: 322 ILPVDPSGHGTVNMRAWPVNLVGGELETGKFQGTATITVM 361
+ + T + P G L G F+ TA+++++
Sbjct: 126 VTAGLDTARSTFTFTSVPFRNGSGILNGGDFRTTASMSMI 165


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SF3188    IGASERPTASE  30     0.029    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.029
Identities = 41/266 (15%), Positives = 83/266 (31%), Gaps = 15/266 (5%)

Query: 277 QQGFEAAKNIGTQPVAAQVAAAPAADVAEQPQPQTADSVASPAQASVSDLTGDQPAAQPV 336
Q G E + T+ E + Q V S QP A+P
Sbjct: 1087 QSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPA 1146

Query: 337 PVSAPATSTAAVSAPANPSAELKIYDTSSQPLSQILSQVQQDGASIVVGPLLKNNVEELL 396
+ P + + N +A+ + + S + V + +++N
Sbjct: 1147 RENDPTVNIKEPQSQTNTTAD--TEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTP 1204

Query: 397 KSNTPLNVLALNQPENIENRVNICYFALSPEDEARDAARHIRDQGKQAPLVLIPR---SA 453
+ P + +R ++ + E A D+ A L +
Sbjct: 1205 ATTQPTVNSESSNKPKNRHRRSVRSVPHNVE----PATTSSNDRSTVALCDLTSTNTNAV 1260

Query: 454 LGDRVANAFAQEWQKLGGGTVLQQKFGSTSELRAGVNGGSGIALTGSPITPRATTDSGMT 513
L D A A ++ L G + Q S+L G + ++ + + ++
Sbjct: 1261 LSDARAKA---QFVALNVGKAVSQHI---SQLEMNNEGQYNVWVSNTSMNKNYSSSQYRR 1314

Query: 514 TNNPTLQTTPTDDQFTNNGGRVDAVY 539
++ + QT DQ +N ++ V+
Sbjct: 1315 FSSKSTQTQLGWDQTISNNVQLGGVF 1340


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF3190    RTXTOXINA  28     0.036    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family

signature.
Length = 1024

Score = 27.6 bits (61), Expect = 0.036
Identities = 26/111 (23%), Positives = 44/111 (39%), Gaps = 22/111 (19%)

Query: 42 NKILCCGNGTSAANAQHFAASMINRFETERPSLPAIALNTDNVVLTAIA-------NDRL 94
K+L GN + A T + IA + V AI+ D+
Sbjct: 277 TKVL--GNVGKGISQYIIAQRAAQGLSTSAAAAGLIA----SAVTLAISPLSFLSIADKF 330

Query: 95 HD----EVYAKQVRALGHAGDVLLAISTRGNSRDIVKAVEAAVTRDMTIVA 141
E Y+++ + LG+ GD LLA + A++A++T T++A
Sbjct: 331 KRANKIEEYSQRFKKLGYDGDSLLAAFHKETG-----AIDASLTTISTVLA 376


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3193    NUCEPIMERASE  29     0.013    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 29.0 bits (65), Expect = 0.013
Identities = 8/22 (36%), Positives = 13/22 (59%)

Query: 19 VLITGATGLVGGHLLRMLINEP 40
L+TGA G +G H+ + L+
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAG 24


75  SF3275  SF3281  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF3275    -3      13       -0.775808  serine endoprotease
SF3276    -2      12       -0.588776  malate dehydrogenase
SF3277    -2      12       -0.979793  arginine repressor ArgR
SF3278    -3      13       -0.338627  hypothetical protein
SF3279    -2      12       0.671292   hypothetical protein
SF3280    -3      10       1.212325   p-hydroxybenzoic acid efflux subunit AaeB
SF3281    -2      10       1.178758   p-hydroxybenzoic acid efflux subunit AaeA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SF3275    V8PROTEASE  53     8e-10    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 52.7 bits (126), Expect = 8e-10
Identities = 31/160 (19%), Positives = 59/160 (36%), Gaps = 26/160 (16%)

Query: 77 RTLGSGVIMDQRGYIITNKHVINDADQIIVALQ------------DGRVFEALLVGSDSL 124
+ SGV++ + ++TNKHV++ AL+ +G +
Sbjct: 101 TFIASGVVVG-KDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGE 159

Query: 125 TDLAVLKI-------NATGGLPTIPINARRVPHIGDVVLAIGNPYNLGQTITQGIISATG 177
DLA++K + + ++ + + G P + T + G
Sbjct: 160 GDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGD-KPVATMW--ESKG 216

Query: 178 RIGLNPTGRQNFLQTDASINHGNSGGALVNSLGELMGINT 217
+I + +Q D S GNSG + N E++GI+
Sbjct: 217 KI---TYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHW 253


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3276    DHBDHDRGNASE  28     0.045    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 28.1 bits (62), Expect = 0.045
Identities = 37/167 (22%), Positives = 61/167 (36%), Gaps = 27/167 (16%)

Query: 3 VAVLGAAGGIGQALALLLKTQLPSGSELSLYDIAPVTPGVAVDLSHIPTAVKIKGFSGED 62
+ GAA GIG+A+A L G+ ++ D P V S A + F +
Sbjct: 11 AFITGAAQGIGEAVARTL---ASQGAHIAAVDYNP-EKLEKVVSSLKAEARHAEAFPADV 66

Query: 63 ATPA------------LEGADVVLISAGVARK------PGMDRSDLFNVNAGIVKNLVQQ 104
A + D+++ AGV R + F+VN+ V N +
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 105 VAKNCPK----ACIGIITNPVNTT-VAIAAEVLKKAGVYDKNKLFGV 146
V+K + + + +NP ++AA KA K G+
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGL 173


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3277    ARGREPRESSOR  169    4e-57    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 169 bits (430), Expect = 4e-57
Identities = 44/141 (31%), Positives = 71/141 (50%), Gaps = 5/141 (3%)

Query: 15 KALLKEEKFSSQGEIVAALQEQGFDNINQSKVSRMLTKFGAVRTRNAKMEMVYCLPAELG 74
+ ++ + +Q E+V L++ G+ N+ Q+ VSR + + V+ Y LPA+
Sbjct: 11 REIITANEIETQDELVDILKKDGY-NVTQATVSRDIKELHLVKVPTNNGSYKYSLPADQR 69

Query: 75 VPTTSSPLKNLV---LDIDYNDAVVVIHTSPGAAQLIARLLDSLGKAEGILGTIAGDDTI 131
S ++L+ + ID ++V+ T PG AQ I L+D+L E I+GTI GDDTI
Sbjct: 70 FNPLSKLKRSLMDAFVKIDSASHLIVLKTMPGNAQAIGALMDNLDWEE-IMGTICGDDTI 128

Query: 132 FTTPANGFTVKDLYEAILELF 152
K + + ILEL
Sbjct: 129 LIICRTHDDTKVVQKKILELL 149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF3281    RTXTOXIND  53     5e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 53.3 bits (128), Expect = 5e-10
Identities = 28/163 (17%), Positives = 59/163 (36%), Gaps = 16/163 (9%)

Query: 6 RKFSRTAITVVLVILAFIAIFNAWVYYTE----SPWTRDARFSADVVAIAPDVSGLITQV 61
SR V I+ F+ I + + S I P + ++ ++
Sbjct: 51 TPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEI 110

Query: 62 NVHDNQLVKKGQILFTIDQPR-------YQKALEEAQADVAYYQVLAQEKRQEAGRRNRL 114
V + + V+KG +L + Q +L +A+ + YQ+L++ E + L
Sbjct: 111 IVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRS--IELNKLPEL 168

Query: 115 GVQAMSREEIDQANNVL---QTVLHQLAKAQATRDLAKLDLER 154
+ + VL + Q + Q + +L+L++
Sbjct: 169 KLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDK 211



Score = 51.4 bits (123), Expect = 2e-09
Identities = 28/147 (19%), Positives = 54/147 (36%), Gaps = 15/147 (10%)

Query: 100 LAQEKRQEAGRRNRLGVQ-AMSREEIDQANNVLQT-VLHQLAKAQAT-------RDLAKL 150
E R + ++ + ++EE + + +L +L + +
Sbjct: 264 AVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEE 323

Query: 151 DLERTVIRAPADGWVTNLNVYT-GEFITRGSTAVALVKQNSFY-VLAYMEETKLEGVRPG 208
+ +VIRAP V L V+T G +T T + +V ++ V A ++ + + G
Sbjct: 324 RQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVG 383

Query: 209 YRAEIT----PLGSNKVLKGTVDSVAA 231
A I P L G V ++
Sbjct: 384 QNAIIKVEAFPYTRYGYLVGKVKNINL 410


76  SF3364  SF3371  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF3364    1       20       -1.184587  hypothetical protein
SF3365    2       18       1.489070   FKBP-type peptidylprolyl isomerase
SF3366    -1      18       2.525900   phi X174 lysis protein
SF3367    -1      19       2.348123   FKBP-type peptidylprolyl isomerase
SF3368    -1      17       2.020606   hypothetical protein
SF3370    -1      17       2.096520   glutathione-regulated potassium-efflux system
SF3371    -1      17       1.329850   ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3364    ACRIFLAVINRP  29     0.021    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 29.0 bits (65), Expect = 0.021
Identities = 14/62 (22%), Positives = 29/62 (46%), Gaps = 1/62 (1%)

Query: 160 ASSVEDLVTQTLEFTIEEVNADRNV-SNNAKNRQIVLNLYEKGIFDIKDAINQVADRLNI 218
A +V+D VTQ +E + ++ + S + + + L + D A QV ++L +
Sbjct: 54 AQTVQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQL 113

Query: 219 SK 220
+
Sbjct: 114 AT 115


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3365    INFPOTNTIATR  133    2e-40    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 133 bits (337), Expect = 2e-40
Identities = 80/226 (35%), Positives = 124/226 (54%), Gaps = 9/226 (3%)

Query: 28 AAKPATTADSKASFKNDDQKSAYALGASLGRYMENSLKEQEKLGIKLDKDQLIAGVQDAF 87
A A A S D K +Y++GA LG K + GI ++ D L G+QD
Sbjct: 14 AMSTAMAATDATSLTTDKDKLSYSIGADLG-------KNFKNQGIDINPDVLAKGMQDGM 66

Query: 88 A-DKSKLSDQEIEQTLQAFEARVKSSAQAKMEKDAADNEAKGKEYREKFAKEKGVKTSST 146
+ + L++++++ L F+ + + A+ K A +N+AKG + + G+ +
Sbjct: 67 SGAQLILTEEQMKDVLSKFQKDLMAKRSAEFNKKAEENKAKGDAFLSANKSKPGIVVLPS 126

Query: 147 GLVYQVVEAGKGEAPKDSDTVVVNYKGTLIDGKEFDNSYTRGEPLSFRLDGVIPGWTEGL 206
GL Y++++AG G P SDTV V Y GTLIDG FD++ G+P +F++ VIPGWTE L
Sbjct: 127 GLQYKIIDAGTGAKPGKSDTVTVEYTGTLIDGTVFDSTEKAGKPATFQVSQVIPGWTEAL 186

Query: 207 KNIKKGGKIKLVIPPELAYGKAGVPG-IPPNSTLVFDVELLDVKPA 251
+ + G ++ +P +LAYG V G I PN TL+F + L+ VK A
Sbjct: 187 QLMPAGSTWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISVKKA 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3370    ISCHRISMTASE  32     0.001    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 31.9 bits (72), Expect = 0.001
Identities = 32/135 (23%), Positives = 51/135 (37%), Gaps = 16/135 (11%)

Query: 11 YAHPESQDSVANRVLLKPATQLSNVTVHDLYAHYPDFFIDIPREQALLREHEVIVFQH-- 68
Y P + D N+V P + + +HD+ ++ D F L + +
Sbjct: 9 YQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCV 68

Query: 69 ----PLYTYSCPALLKEWLDRVLSRGFASGPGGNQLAGKYWRNVITTGEPESA------Y 118
P+ + P DR L F GPG N +G Y +IT PE +
Sbjct: 69 QLGIPVVYTAQPGSQNP-DDRALLTDFW-GPGLN--SGPYEEKIITELAPEDDDLVLTKW 124

Query: 119 RYDALNRYPMSDVLR 133
RY A R + +++R
Sbjct: 125 RYSAFKRTNLLEMMR 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SF3371    GPOSANCHOR  33     0.005    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.7 bits (74), Expect = 0.005
Identities = 28/152 (18%), Positives = 54/152 (35%), Gaps = 22/152 (14%)

Query: 504 KVEPFDGDLEDYQQWLSDVQKQENQTDEAPKENANSAQARKDQKRREAELRAQTQPLRKE 563
+ D + ++ E + + ++ R+ +R R + L E
Sbjct: 272 AMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAE 331

Query: 564 IARLEKEME---------------------KLNAQLAQAEEKLGDSELYDQSRKAELTAC 602
+LE++ + +L A+ + EE+ SE QS + +L A
Sbjct: 332 HQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDAS 391

Query: 603 LQQQASAKSGLEECEMAWLEAQEQLEQMLLEG 634
+ + + LEE L A E+L + L E
Sbjct: 392 REAKKQVEKALEEANSK-LAALEKLNKELEES 422



Score = 32.0 bits (72), Expect = 0.008
Identities = 13/125 (10%), Positives = 39/125 (31%), Gaps = 7/125 (5%)

Query: 513 EDYQQWLSDVQKQENQTDEAPKENANSAQARKDQKRREAELRAQTQPLRKEIARLEKEME 572
+ + ++ + E A A + D ++ + +++
Sbjct: 127 KALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFST-------ADSAKIK 179

Query: 573 KLNAQLAQAEEKLGDSELYDQSRKAELTACLQQQASAKSGLEECEMAWLEAQEQLEQMLL 632
L A+ A E + + E + TA + + ++ + ++ LE +
Sbjct: 180 TLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMN 239

Query: 633 EGQSN 637
++
Sbjct: 240 FSTAD 244


77  SF3463  SF3471  N        Y        Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF3463    -2      15       0.331748   acetyltransferase YhhY
SF3464    -2      17       0.845573   gamma-glutamyltranspeptidase
SF3465    0       17       0.279068   hypothetical protein
SF3466    0       19       0.965635   cytoplasmic glycerophosphodiester
SF3467    0       19       0.674559   insertion element iso-IS10R transposase
SF3469    -1      25       3.026124   glycerol-3-phosphate ABC transporter permease
SF3470    -2      25       3.287883   glycerol-3-phosphate transporter permease
SF3471    -1      23       3.095609   glycerol-3-phosphate ABC transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3463    SACTRNSFRASE  37     1e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 37.2 bits (86), Expect = 1e-05
Identities = 21/92 (22%), Positives = 33/92 (35%), Gaps = 16/92 (17%)

Query: 55 VACIDGDVVGHLTIDVQQRPRRSHVADFGICVDSRWKNRGVASALMREMIE------MCD 108
+ ++ + +G + I + + D + D R K GV +AL+ + IE C
Sbjct: 69 LYYLENNCIGRIKIR-SNWNGYALIEDIAVAKDYRKK--GVGTALLHKAIEWAKENHFCG 125

Query: 109 NWLRVDRIELTVFVDNAPAIKVYKKFGFEIEG 140
L I N A Y K F I
Sbjct: 126 LMLETQDI-------NISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF3464    NAFLGMOTY  32     0.007    Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 31.6 bits (71), Expect = 0.007
Identities = 27/82 (32%), Positives = 37/82 (45%), Gaps = 17/82 (20%)

Query: 275 RTPISGDYRGYQVYSMPPPSSGGIHIVQILNI--LENFDMKKYGF-GSADAMQIMAEAEK 331
R P+ G+ R + SMPPP G H +I N+ + FD G+ G A I++E EK
Sbjct: 77 RRPM-GETRNVSLISMPPPWRPGEHADRITNLKFFKQFD----GYVGGQTAWGILSELEK 131

Query: 332 YAYADRSEYLGDPDFVKVPWQA 353
Y P F WQ+
Sbjct: 132 GRY---------PTFSYQDWQS 144


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3466    PF04619  28     0.017    Dr-family adhesin
		>PF04619#Dr-family adhesin

Length = 160

Score = 28.4 bits (63), Expect = 0.017
Identities = 12/60 (20%), Positives = 22/60 (36%), Gaps = 4/60 (6%)

Query: 29 VGAKYGHKMIEFDAKLSKDGEIFLLHDDNLERTSNGWGVAGELNWQD----LLRVDAGSW 84
+G ++ D + G+ FL+ D+N ++ W + D GSW
Sbjct: 70 LGCDARQVALKADTDNFEQGKFFLISDNNRDKLYVNIRPTDNSAWTTDNGVFYKNDVGSW 129


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF3471    MALTOSEBP  40     2e-05    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 39.7 bits (92), Expect = 2e-05
Identities = 41/160 (25%), Positives = 68/160 (42%), Gaps = 14/160 (8%)

Query: 134 GHLLSQPFNSSTSVLYYNKDAFKKAGLDPEQPPKTWQDLADYSAKLKASGIKCGYASGWQ 193
G L++ P L YNKD L P PPKTW+++ +LKA G + +
Sbjct: 127 GKLIAYPIAVEALSLIYNKD------LLP-NPPKTWEEIPALDKELKAKGKSALMFNLQE 179

Query: 194 GWIQLENFSAWNGLPFASKNNGFDGTDAVLEF--NKPEQVKHIAMLEEMNKKGDFSYVGR 251
+ +A G F +N +D D ++ K + +++ + D Y
Sbjct: 180 PYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDY--- 236

Query: 252 KDESTEKFYNGDCAMTTASSGSLANIREYAKFNYGVGMMP 291
+ F G+ AMT + +NI + +K NYGV ++P
Sbjct: 237 -SIAEAAFNKGETAMTINGPWAWSNI-DTSKVNYGVTVLP 274


78  SF3498  SF3503  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SF3498    0       19       3.825131    nickel ABC transporter ATP-binding protein NikE
SF3499    -1      14       -0.230020   nickel responsive regulator
SF3500    1       19       -4.274928   hypothetical protein
SF3501    1       20       -4.811594   transporter
SF3502    1       25       -6.851846   ABC transporter ATP-binding protein
SF3503    2       40       -12.606511  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SF3498    HTHFIS  29     0.018    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.4 bits (66), Expect = 0.018
Identities = 10/34 (29%), Positives = 19/34 (55%)

Query: 25 QAVLNNVSLTLKSGETVALLGRSGCGKSTLARLL 58
Q + ++ +++ T+ + G SG GK +AR L
Sbjct: 147 QEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3501    ABC2TRNSPORT  50     5e-09    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein

signature.
Length = 262

Score = 49.9 bits (119), Expect = 5e-09
Identities = 41/171 (23%), Positives = 73/171 (42%), Gaps = 7/171 (4%)

Query: 200 REREHGTVEHLLVMPITPFEIMMAKI-WSMGLVVLVVSGLSLVLMVKGVLGVPIEGSIPL 258
R T E +L + +I++ ++ W+ L +G+ +V G + L
Sbjct: 93 RMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGY----TQWLSLL 148

Query: 259 FMLGV-ALSLFATTSIGIFMGTIARSMPQLGLLVILVLLPLQMLSGGSTPRESMPQMVQD 317
+ L V AL+ A S+G+ + +A S LV+ P+ LSG P + +P + Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 318 IMLTMPTTHFVSLAQAILYRGAGFEIVWPQFLTLMAIGGAFF-TIALLRFR 367
+P +H + L + I+ ++ + I FF + ALLR R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3502    PF05272  30     0.045    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.045
Identities = 9/26 (34%), Positives = 14/26 (53%)

Query: 37 ARCMVGLIGPDGVGKSSLLSLISGAR 62
V L G G+GKS+L++ + G
Sbjct: 595 FDYSVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SF3503    RTXTOXIND  84     4e-20    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 84.5 bits (209), Expect = 4e-20
Identities = 71/408 (17%), Positives = 139/408 (34%), Gaps = 81/408 (19%)

Query: 6 RHLAWWVVGALAVAAVVAWWLLRPAGVP-EGFAVSNGRIEATEVDIASKIAGRIDTILVK 64
R +A++++G L +A +++ G +GR + I + I+VK
Sbjct: 58 RLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKE----IKPIENSIVKEIIVK 113

Query: 65 EGQFVREGEVLAKMDTRV----------------LQEQRLEAI----------------- 91
EG+ VR+G+VL K+ L++ R + +
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 92 -------------------AQIKEAQSAVAAAQALLEQRQSETRAAQSLVNQRQAELDSV 132
Q Q+ + L+++++E + +N+ +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 133 AKRHTRSRSLAQRGAISAQQLDDDRAAAESARAALESAKAQVSASKAAIEAARTNIIQ-- 190
R SL + AI+ + + A L K+Q+ ++ I +A+
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 191 -----------AQTRVEAAQATERRIAADID--DSELKAPRDGRV-QYRVAEPGEVLAAG 236
QT T + S ++AP +V Q +V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 237 GRVLNMVDLSDVY-MTFFLPTEQAGTLKLGGEARLILDAAPDLRIPATISFVASVAQFTP 295
++ +V D +T + + G + +G A + ++A P R V V
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYG---YLVGKVKNINL 410

Query: 296 KTVETSDERLKLMFRVKARIPPELLQQHLEYV--KTGLPGVAWVRVNE 341
+E D+RL L+F V I L + + +G+ A ++
Sbjct: 411 DAIE--DQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGM 456


79  SF3581  SF3586  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF3581    1       13       1.222808   resistance protein
SF3583    1       13       1.393928   lipase
SF3582    1       10       1.280430   3-methyladenine DNA glycosylase
SF3584    0       11       1.448874   acetyltransferase
SF3585    -1      11       -0.017125  biotin sulfoxide reductase
SF3586    0       19       -2.187419  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3581    TCRTETA  43     1e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 42.9 bits (101), Expect = 1e-06
Identities = 47/275 (17%), Positives = 94/275 (34%), Gaps = 32/275 (11%)

Query: 44 PVSQVAFSFGLLSLGLAIS----SSVAGKLQERFGVKRVTVASGILLGLGFFLTAHSNNL 99
+ V +G+L A+ + V G L +RFG + V + S + + + A + L
Sbjct: 37 HSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFL 96

Query: 100 MMLWLS---AGVLVGLADGAGYLL----TLSNCVKWFPERKGLISAFAIGSYGLGSLGFK 152
+L++ AG+ AG + + F + LG
Sbjct: 97 WVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGG---- 152

Query: 153 FIDTQLLETVGLEKTFVIWGAIALVMIVFGATLMKDAPKQEVKTSNGVVEKDYTLAESMR 212
L+ F A+ + + G L+ ++ K E + R
Sbjct: 153 -----LMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWAR 207

Query: 213 --KPQYWMLAVMFLTACMSG----LYVIGVAKDIAQSLAHLDVVSAANAVTVISIAN-LS 265
++AV F+ + L+VI + H D + ++ I + L+
Sbjct: 208 GMTVVAALMAVFFIMQLVGQVPAALWVI-----FGEDRFHWDATTIGISLAAFGILHSLA 262

Query: 266 GRLVLGILSDKIARIRVITIGQVISLVGMAALLFA 300
++ G ++ ++ R + +G + G L FA
Sbjct: 263 QAMITGPVAARLGERRALMLGMIADGTGYILLAFA 297



Score = 36.0 bits (83), Expect = 2e-04
Identities = 37/155 (23%), Positives = 64/155 (41%), Gaps = 9/155 (5%)

Query: 241 AQSLAHLDVVSAANAVTVISIANLSGRLVLGILSDKIARIRVITIGQVISLVGMAALLFA 300
AH ++ A A+ + A + G L SD+ R V+ + + V A + A
Sbjct: 39 NDVTAHYGILLALYALMQFACAPVLGAL-----SDRFGRRPVLLVSLAGAAVDYAIMATA 93

Query: 301 PLNAVTFFAAIACVAFNFGGTITVFPSLVSEFFGLNNLAKNYGVIYLGFGIGSIFGSIIA 360
P V + I VA G T V + +++ + A+++G + FG G + G ++
Sbjct: 94 PFLWVLYIGRI--VAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLG 151

Query: 361 SLFGGF--YVTFYVIFALLILSLALSTTIRQPEQK 393
L GGF + F+ AL L+ + K
Sbjct: 152 GLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHK 186


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3583    ECOLNEIPORIN  28     0.039    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 27.8 bits (62), Expect = 0.039
Identities = 19/90 (21%), Positives = 37/90 (41%), Gaps = 13/90 (14%)

Query: 119 SMYNEFGDSTTTLTDPLWHASVSSLGWRVDSRLGDLRPWAQISYNQQFGENIWKAQSGLS 178
S+ + D+ + H S + + + R G++ P ++SY F +
Sbjct: 228 SVAVQQQDAKLV-EENYSHNSQTEVAATLAYRFGNVTP--RVSYAHGFKGSF-------- 276

Query: 179 RMTATNQNGNWLDVTVGADMLLNQNIAAYA 208
ATN N ++ V VGA+ ++ +A
Sbjct: 277 --DATNYNNDYDQVVVGAEYDFSKRTSALV 304


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3584    SACTRNSFRASE  35     5e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 34.9 bits (80), Expect = 5e-05
Identities = 16/52 (30%), Positives = 22/52 (42%), Gaps = 5/52 (9%)

Query: 76 VAPKAVRRGIGKALMQYV-----QQRYPHLMLEVYQKNQPAIDFYRAQGFHI 122
VA ++G+G AL+ + + LMLE N A FY F I
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFII 148


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SF3586    OMPADOMAIN  111    1e-31    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 111 bits (280), Expect = 1e-31
Identities = 40/122 (32%), Positives = 61/122 (50%), Gaps = 11/122 (9%)

Query: 97 LNMPNNVTFDSSSAPLKPAGANTLTGVAMVLKEY--PKTAVNVIGYTDSTGGHDLNMRLS 154
+ ++V F+ + A LKP G L + L +V V+GYTD G N LS
Sbjct: 215 FTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLS 274

Query: 155 QQRADSVASALITQGVDASRIRTQGLGPANPIASNSTAEGK---------AQNRRVEITL 205
++RA SV LI++G+ A +I +G+G +NP+ N+ K A +RRVEI +
Sbjct: 275 ERRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334

Query: 206 SP 207

Sbjct: 335 KG 336


80  SF3714  SF3720  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF3714    4       18       1.247648   membrane transport protein
SF3715    3       17       -0.421761  siderophore biosynthesis protein
SF3716    3       19       -0.448498  siderophore biosynthesis protein
SF3717    3       18       -1.130240  siderophore biosynthesis protein
SF3718    3       19       -3.120184  lysine:N6-hydroxylase
SF3719    3       18       -2.501106  ferric siderophore receptor
SF3720    2       25       -3.648488  serine protease-like protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3714    TCRTETA  48     5e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 47.5 bits (113), Expect = 5e-08
Identities = 81/375 (21%), Positives = 135/375 (36%), Gaps = 41/375 (10%)

Query: 20 FSAGLLGIGQNGLLVVLPVLVIQTNLSLSV---WAALLMLGSMLFLPSSPWWGKQISRTG 76
+ L +G ++ VLP L+ S V + LL L +++ +P G R G
Sbjct: 12 STVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFG 71

Query: 77 SKPVVLWALGGYGISFTLLGLGSVLMATSAITTAVGLGILIIARIAYGLTVSAMVPACQV 136
+PV+L +L G + + ++ L +L I RI G+T + A
Sbjct: 72 RRPVLLVSLAGAAVDYAIMATAPFLW------------VLYIGRIVAGITGATGAVAGAY 119

Query: 137 WALQRAGEGNRMAALATISSGLSCGRLFGPLCAAAMLAIHPLAPLGLLMAAPVLALLMLL 196
A R +S+ G + GP+ M P AP A L L
Sbjct: 120 IA-DITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGC 178

Query: 197 RL------PGTPPQPTPECKSVSLKRDCLPYLLCAILLAAAVSMMQLGLSPAL------T 244
L P ++ R + A L+A M +G PA
Sbjct: 179 FLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGE 238

Query: 245 RQFATDTTAISQQVAWLLGLSAVAALIAQFGVLRPQRLTPVALLLSAGVLMSGGLAIMLS 304
+F D T I +A L ++A + G + + AL+L +G + + +
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMI-TGPVAARLGERRALMLGMIADGTGYILLAFA 297

Query: 305 EQLWLFYPGCAVLSFGAALATPAYQLLLNDKLADGAGAGWLATSHTLGYGLCALLVPLVS 364
+ W+ +P +L+ G + PA Q +L+ + D G L L L S
Sbjct: 298 TRGWMAFPIMVLLASG-GIGMPALQAMLS-RQVDEERQGQLQ----------GSLAALTS 345

Query: 365 KTGVAIALIMAALFA 379
T + L+ A++A
Sbjct: 346 LTSIVGPLLFTAIYA 360


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3715    PF04183  339    e-111    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 339 bits (872), Expect = e-111
Identities = 104/480 (21%), Positives = 178/480 (37%), Gaps = 46/480 (9%)

Query: 56 ELLIPLDEQKSLHFRVAYFSPTQHHRF-----AFPARLVTASGSYPVDFTTLSRLIIDKL 110
E + + Q + + P RF + + A D L++ ++ +L
Sbjct: 24 EQVFHAESQGDDRYCIN--LPGAQWRFIAERGIWGWLWIDAQTLRCADEPVLAQTLLMQL 81

Query: 111 RHQLFLPVPLCETFHQRVLESHVHTQQAIDARHDWAALREKALNFGEAEQALLTGHAFHP 170
+ L + Q + + + Q + AR +A LN + Q LL+GH
Sbjct: 82 KQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASDLINLNA-DRLQCLLSGHPKFV 140

Query: 171 APKSHEPFNRREAERYLPDMAPHFPLRWFSVDKTQIAGES-LHLNLQQRLTRFAAENAPQ 229
K + + ERY P+ A F L W +V + + +++ Q LT A PQ
Sbjct: 141 FNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEMDIHQLLT---AAMDPQ 197

Query: 230 LLNELS--------DNQWLF-PLHPWQGEYLLQQGWCQALVAKGLIKDLGEAGTSWLPTT 280
S D+ WL P+HPWQ + + + A+G + LGE G WL
Sbjct: 198 EFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADF-AEGRMVSLGEFGDQWLAQQ 256

Query: 281 SSRSLYCATSRD--MIKFSLSVRLTNSIRTLSVKEVKRGMRLARLAQ----TDGWQMLQ- 333
S R+L A+ R IK L++ T+ R + + + G +R Q TD +
Sbjct: 257 SLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWLQQVFATDATLVQSG 316

Query: 334 ---VRFPTFRVMQEDGWAGLLDLNGNIMQESLFALRENLLVDQPKSQTNVLVSLTQAAPD 390
+ P + +G+A L + REN ++ VL++ +
Sbjct: 317 AVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLKPDESPVLMATLMECDE 376

Query: 391 GGDSLLVSAVKRLSDRLGITVQQAAHAWVDAYCQQVLKPLFTAEADYGLVLLAHQQNILV 450
L + + DR G+ A W+ + V+ PL+ YG+ L+AH QNI +
Sbjct: 377 NNQPLAGAYI----DRSGLD----AETWLTQLFRVVVVPLYHLLCRYGVALIAHGQNITL 428

Query: 451 QMLGDLPVGFIYRDCQGSAFMPHATDWLDSIGEAQAENIFTHEQLLRYFPYYLLVNSTFA 510
M +P + +D QG M + + E + L++
Sbjct: 429 AMKEGVPQRVLLKDFQGD--MRLVKEEFPEMDSLPQE----VRDVTSRLSADYLIHDLQT 482


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF3717    PF04183  816    0.0      IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 816 bits (2109), Expect = 0.0
Identities = 565/580 (97%), Positives = 571/580 (98%)

Query: 1 MNHKDWDFVNRRLVAKMLSEMEYEQVFHAESQGDDHYCINLPGAQWRFIAERGIWGWLWI 60
MNHKDWD VNRRLVAKMLSE+EYEQVFHAESQGDD YCINLPGAQWRFIAERGIWGWLWI
Sbjct: 1 MNHKDWDLVNRRLVAKMLSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIWGWLWI 60

Query: 61 DAQTLRCTDEPVLAQTLLMQLKPVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASD 120
DAQTLRC DEPVLAQTLLMQLK VLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASD
Sbjct: 61 DAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASD 120

Query: 121 LINLDADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYTNTFRLHWLAVKREHMIWRC 180
LINL+ADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEY NTFRLHWLAVKREHMIWRC
Sbjct: 121 LINLNADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRC 180

Query: 181 DNDLDIQQLLTAAMDPQEFTRFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG 240
DN++DI QLLTAAMDPQEF RFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG
Sbjct: 181 DNEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG 240

Query: 241 RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR 300
RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR
Sbjct: 241 RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR 300

Query: 301 WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK 360
WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK
Sbjct: 301 WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK 360

Query: 361 PDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI 420
PDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI
Sbjct: 361 PDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI 420

Query: 421 AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEAFPEMDSLPQEVRDVTSRLSADYLIHDL 480
AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKE FPEMDSLPQEVRDVTSRLSADYLIHDL
Sbjct: 421 AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVTSRLSADYLIHDL 480

Query: 481 QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMNKHPQMAERFALFSLFRPQIIR 540
QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYM KHPQM+ERFALFSLFRPQIIR
Sbjct: 481 QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMKKHPQMSERFALFSLFRPQIIR 540

Query: 541 VVLNPVKLTWPDLDGGSRMLPNYLENLQNPLWLVTQEYES 580
VVLNPVKLTWPDLDGGSRMLPNYLE+LQNPLWLVTQEYES
Sbjct: 541 VVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWLVTQEYES 580


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SF3720    IGASERPTASE  80     2e-19    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 80.5 bits (198), Expect = 2e-19
Identities = 48/214 (22%), Positives = 74/214 (34%), Gaps = 52/214 (24%)

Query: 35 NRKLVATMLSLAVAGTVNA---ANIDISNVWARDYLDLAQNKGIFQPGATDVTITLKNGD 91
N+K ++L VA + A + +V + + D A+NKG F GAT+V + KN
Sbjct: 3 NKKFKLNFIALTVAYALTPYTEAALVRDDVDYQIFRDFAENKGKFSVGATNVLVKDKNNK 62

Query: 92 KF--SFHN-LSIPDFSGAAAS-GAATAIGGSYSVTVAH-----------------NKKNP 130
+ N + + DFS AT I Y V V H N N
Sbjct: 63 DLGTALPNGIPMIDFSVVDVDKRIATLINPQYVVGVKHVSNGVSELHFGNLNGNMNNGNA 122

Query: 131 QAAETQVYAQSSYKVVDRRNSN-------------------DFEIQRLNKFVVETVGATP 171
+A ++ Y V++ D+ + RL+KFV E
Sbjct: 123 KAHRDVSSEENRYFSVEKNEYPTKLNGKTVTTEDQTQKRREDYYMPRLDKFVTEVAPIEA 182

Query: 172 AETNPTTYSDALERYGIVTSDGSKKIIGFRAGSG 205
+ + +D +K R GSG
Sbjct: 183 STAS---------SDAGTYNDQNKYPAFVRLGSG 207


81  SF3789  SF3799  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF3789    1       18       4.402823   multidrug resistance protein D
SF3790    0       17       3.413714   acetolactate synthase catalytic subunit
SF3791    -1      15       2.975084   acetolactate synthase 1 regulatory subunit
SF3792    -1      15       1.836794   DNA-binding transcriptional activator UhpA
SF3793    -1      15       1.325900   sensory histidine kinase UhpB
SF3794    1       15       0.425279   regulatory protein UhpC
SF3795    0       13       -0.560104  sugar phosphate antiporter
SF3796    1       18       -1.161183  cryptic adenine deaminase
SF3797    0       17       -2.575493  hypothetical protein
SF3798    0       22       -5.449206  hypothetical protein
SF3799    0       24       -5.637915  transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3789    TCRTETB       46     9e-08    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 46.4 bits (110), Expect = 9e-08
Identities = 33/141 (23%), Positives = 62/141 (43%), Gaps = 1/141 (0%)

Query: 4 VMGAYLLTYGVSQLFYGPISDRVGRRPVILVGMSIFMLATLVA-VTTSSLTVLIAASAMQ 62
V A++LT+ + YG +SD++G + ++L G+ I +++ V S ++LI A +Q
Sbjct: 54 VNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQ 113

Query: 63 GMGTGVGGVMARTLPRDLYERTQLRHANSLLNMGILVSPLLAPLIGGLLDTMWNWRACYL 122
G G + + + A L+ + + + P IGG++ +W L
Sbjct: 114 GAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLL 173

Query: 123 FLLVLCAGVTFSMARWMPETR 143
++ V F M E R
Sbjct: 174 IPMITIITVPFLMKLLKKEVR 194


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3792    HTHFIS        61     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.4 bits (149), Expect = 2e-13
Identities = 29/174 (16%), Positives = 59/174 (33%), Gaps = 20/174 (11%)

Query: 2 ITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGRGVQVCICDISMPDIS 61
T+ + DD +R+ Q L V + + + + D+ MPD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLELLSQLPK---GMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHTVATG 118
+LL ++ K + +++S ++ +A GA +L K ELI +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR---- 117

Query: 119 GCYLTPDIAIKLASGRQDPLTKRERQVAEKLAQG---MAVKEIAAELGLSPKTV 169
A+ R L + + + + + A L + T+
Sbjct: 118 --------ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTL 163


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3793    PF06580       40     2e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.8 bits (93), Expect = 2e-05
Identities = 28/142 (19%), Positives = 56/142 (39%), Gaps = 11/142 (7%)

Query: 365 LRPRQLDDLTLEQAIRSLMREMELEGRGIVSHLEWRIDESALSENQRVTLFRVCQEGLNN 424
LR ++L + + ++L L++ + + +V + Q + N
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPM-LVQTLVEN 266

Query: 425 IVKHA-----DASAVTLQGWQQDERLMLVIEDDGSGLPPGSGQ-QGFGLTGMRERVTALG 478
+KH + L+G + + + L +E+ GS + + G GL +RER+ L
Sbjct: 267 GIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLY 326

Query: 479 G---TLHISCLHG-TRVSVSLP 496
G + +S G V +P
Sbjct: 327 GTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3794    TCRTETB       41     8e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 40.6 bits (95), Expect = 8e-06
Identities = 65/408 (15%), Positives = 137/408 (33%), Gaps = 60/408 (14%)

Query: 29 RHILLTIWLGYALFY--FTRKSFNAAVPEILANGVLSRSDIGLLATLFYITYGVSKFVSG 86
RH + IWL F+ N ++P+I + + + T F +T+ + V G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 87 IVSDRSNARYFMGIGLIATGIMNILFGFSTSLWAFAVLWVLNAFFQGWGS---PVCARLL 143
+SD+ + + G+I +++ S F L ++ F QG G+ P ++
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHS---FFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 144 TAWY-SRTERGGWWALWNTAHNVGGALIPIVMAASALHYGWRAGMMIAGCMAIVVGIFLC 202
A Y + RG + L + +G + P + A + W ++ M ++ +
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIPMITIITVPF- 184

Query: 203 WRLRDRPQALGLPAVGEWRHDALEIAQQQEGAGLTRKEILTKYVLLNPYIWLLSFCYVLV 262
+ L +I G L I+ + Y VL
Sbjct: 185 --------LMKLLKKEVRIKGHFDIK----GIILMSVGIVFFMLFTTSYSISFLIVSVLS 232

Query: 263 YVV-----RAAINDWGNLYMSETLGVDLVTANTAVTMFELGGFI-----------GALVA 306
+++ R + + + + + + + + + GF+ A
Sbjct: 233 FLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTA 292

Query: 307 GWGSDKLFNGNRGPMNLIFAAGILL-SVGSLWLMPFASYVMQATCFFTIGFFVFGPQMLI 365
GS +F G + + GIL+ G L+++ + + F T F + +
Sbjct: 293 EIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL-SVSFLTASFLLETTSWFM 351

Query: 366 ---------GMAAAECS---------HKEAAGAATGFVGLFAYLGASL 395
G++ + ++ AGA + ++L
Sbjct: 352 TIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGT 399


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3795    TCRTETB       34     0.001    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.1 bits (78), Expect = 0.001
Identities = 28/168 (16%), Positives = 61/168 (36%), Gaps = 17/168 (10%)

Query: 49 FNIAQNDMISTYGLSMTQLGMIGLGFSITYGVGKTLVSYYADGKNTKQFLPFMLILSAIC 108
N++ D+ + + + F +T+ +G + +D K+ L F +I++ C
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIIN--C 90

Query: 109 MLGFSASMGSGSVSLFLMIAFYALSGFFQSTGGSCSYSTI----TKWTPRRKRGTFLGFW 164
+G SL +M + F Q G + + + ++ P+ RG G
Sbjct: 91 FGSVIGFVGHSFFSLLIM------ARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLI 144

Query: 165 NISHNLGGAGAAGVALFGANYLFDGHVIGMFIFPSIIALIVGFIGLRY 212
+G + A+Y+ + + + P + I+ L
Sbjct: 145 GSIVAMGEGVGPAIGGMIAHYIHWSY---LLLIP--MITIITVPFLMK 187


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3796    UREASE        40     3e-05    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 39.7 bits (93), Expect = 3e-05
Identities = 30/105 (28%), Positives = 43/105 (40%), Gaps = 17/105 (16%)

Query: 22 AVSRGDAVADYIIDNVSILDLINAGEISGPIVIKGRYIAGVG-AEYADT---------PA 71
V+R D +I N ILD + G + I +K IA +G A D P
Sbjct: 60 QVTREGGAVDTVITNALILD--HWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPG 117

Query: 72 LQRIDARGATAVPGFIDAHLHIESSMMTPVTFETATLPRGLTTVI 116
+ I G G +D+H+H + P E A L GLT ++
Sbjct: 118 TEVIAGEGKIVTAGGMDSHIH----FICPQQIEEA-LMSGLTCML 157


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3799    TCRTETA       39     2e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.4 bits (92), Expect = 2e-05
Identities = 35/208 (16%), Positives = 71/208 (34%), Gaps = 13/208 (6%)

Query: 88 IIVEFLPVSLLTP----MAQDLGISEGVAGQSVTVTAFVAMFASLFITQTIQATDR--RY 141
+ ++ + + L+ P + +DL S V + A A+ +DR R
Sbjct: 14 VALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRR 73

Query: 142 VVILFAVLL-TLSCLLVSFANSFSLLLIGRACLGLALGGFWAMSASLTMRLVPPRTVPKA 200
V+L ++ + +++ A +L IGR G+ G A++ + + +
Sbjct: 74 PVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDGDERARH 132

Query: 201 LSVIFGAVSIALVIAAPLGSFLGELIGWRNVFNAAAVMG----VLCIFWIIKSLPSLPGE 256
+ +V LG +G F AAA + + F + +S
Sbjct: 133 FGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRP 191

Query: 257 PSHQKQNTFRLLQRPGVMAGMIAIFMSF 284
+ N + M + A+ F
Sbjct: 192 LRREALNPLASFRWARGMTVVAALMAVF 219


82  SF3936  SF3941  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF3936    2       22       1.779846   Der GTPase activator
SF3937    1       20       1.431312   coproporphyrinogen III oxidase
SF3938    1       17       0.226523   nitrogen regulation protein NR(I)
SF3939    1       17       -1.730247  nitrogen regulation protein NR(II)
SF3940    2       20       -2.318885  glutamine synthetase
SF3941    1       13       -2.931771  GTP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3936    SECA          30     0.004    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 30.2 bits (68), Expect = 0.004
Identities = 11/71 (15%), Positives = 30/71 (42%)

Query: 14 AKARRKTREELDQEARDRKRQKKRRGHAPGSRAAGGNTTSGSKGQNAPKDPRIGSKTPIP 73
+K + + EE+++ + R+ + +R ++ + + + ++G P P
Sbjct: 827 SKVQVRMPEEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCP 886

Query: 74 LGVTEKVTKQH 84
G +K + H
Sbjct: 887 CGSGKKYKQCH 897


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3938    HTHFIS        597    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 597 bits (1542), Expect = 0.0
Identities = 206/478 (43%), Positives = 300/478 (62%), Gaps = 11/478 (2%)

Query: 1 MQRGIVWVVDDDSSIRWVLERALAGAGLTCTTFENGAEVLEALASKTPDVLLSDIRMPGM 60
M + V DDD++IR VL +AL+ AG N A + +A+ D++++D+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGLALLKQIKQRHPMLPVIIMTAHSDLDAAVSAYQQGAFDYLPKPFDIDEAVALVERAIS 120
+ LL +IK+ P LPV++M+A + A+ A ++GA+DYLPKPFD+ E + ++ RA++
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 HYQEQQQPRNIQLNGPTTDIIGEAQAMQDVFRIIGRLSRSSISVLINGESGTGKELVAHA 180
+ + ++G + AMQ+++R++ RL ++ ++++I GESGTGKELVA A
Sbjct: 121 EPKRRPSKLEDDSQDGM-PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 181 LHRHSPRTKAPFIALNMAAIPKDLIESELFGHEKGAFTGANTIRQGRFEQADGGTLFLDE 240
LH + R PF+A+NMAAIP+DLIESELFGHEKGAFTGA T GRFEQA+GGTLFLDE
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 241 IGDMPLDVQTRLLRVLADGQFYRVGGYAPVKVDVRIIAATHQNLEQRVQEGKFREDLFHR 300
IGDMP+D QTRLLRVL G++ VGG P++ DVRI+AAT+++L+Q + +G FREDL++R
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 301 LNVIRVHLPPLRERREDIPRLARHFLQVAARELGVEAKLLHPETEAALTRLAWPGNVRQL 360
LNV+ + LPPLR+R EDIP L RHF+Q A +E G++ K E + WPGNVR+L
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 361 ENTCRWLTVMAAGQEVLIQDLPGELFESTVAESTSQMQPDSWATLLAQWADRALRS---- 416
EN R LT + + + + EL + S + ++Q + +R
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 417 -----GHQNLLSEAQPELERTLLTTALRHTQGHKQEAARLLGWGRNTLTRKLKELGME 469
L E+E L+ AL T+G++ +AA LLG RNTL +K++ELG+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3939    PF06580       28     0.042    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.3 bits (63), Expect = 0.042
Identities = 34/190 (17%), Positives = 72/190 (37%), Gaps = 41/190 (21%)

Query: 171 IIEQADRLRNLVDRL---LGPQLPGTRVTE-SIHKVAERV---VTLVSMELPDNVRLIRD 223
I+E + R ++ L + L + + S+ V + L S++ D ++
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 224 YDPSLPELAHDPDQIEQVLLN-IVRNALQ---ALGPEGGEIILRTRTAFQLTLHGERYRL 279
+P++ ++ Q+ +L+ +V N ++ A P+GG+I+L+
Sbjct: 246 INPAIMDV-----QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGT------KDNGTVT- 293

Query: 280 AARIDVEDNGPGIPPHLQDTLFYPMVSGREGGTGLGLSIARNLIDQHSGK---IEFTSWP 336
++VE+ G + ++ TG GL R + G I+ +
Sbjct: 294 ---LEVENTGSLALKNTKE------------STGTGLQNVRERLQMLYGTEAQIKLSEKQ 338

Query: 337 GHTEFSVYLP 346
G V +P
Sbjct: 339 GKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF3941    TCRTETOQM     180    4e-51    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 180 bits (458), Expect = 4e-51
Identities = 97/445 (21%), Positives = 170/445 (38%), Gaps = 81/445 (18%)

Query: 4 KLRNIAIIAHVDHGKTTLVDKLLQQSGTFDSRAETQE--RVMDSNDLEKERGITILAKNT 61
K+ NI ++AHVD GKTTL + LL SG + D+ LE++RGITI T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 62 AIKWNDYRINIVDTPGHADFGGEVERVMSMVDSVLLVVDAFDGPMPQTRFVTKKAFAYGL 121
+ +W + ++NI+DTPGH DF EV R +S++D +L++ A DG QTR + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 122 KPIVVINKVDRPGARPDWVVDQVFD-------------LFVNLDATDEQLD--------- 159
I INK+D+ G V + + L+ N+ T+
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 160 --------------------------------FPIVYASALNGIAGLDHEDMAEDMTPLY 187
FP+ + SA N I G+D+ L
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNI-GIDN---------LI 231

Query: 188 QAIVDHVPAPDVDLDGPFQMQISQLDYNSYVGVIGIGRIKRGKVKPNQQVTIIDSEGKTR 247
+ I + + ++ +++Y+ + R+ G + V I + E
Sbjct: 232 EVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-- 289

Query: 248 NAKVGKVLGHLGLERIETDLAEAGDIVAITGLGELNISDTVCDTQNVEALPALSVDEPTV 307
K+ ++ + E + D A +G+IV + L ++ + DT+ + + P +
Sbjct: 290 --KITEMYTSINGELCKIDKAYSGEIVILQNEF-LKLNSVLGDTKLLPQRERIENPLPLL 346

Query: 308 SMFFCVNTSPFCGKEGKFVTSRQILDRLNKELVHNVALRVEETEDADAFRVSGRGELHLS 367
+ + D L LR +S G++ +
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPL---------LRYYVDSATHEIILSFLGKVQME 397

Query: 368 VLIENMRRE-GFELAVSRPKVIFRE 391
V ++ + E+ + P VI+ E
Sbjct: 398 VTCALLQEKYHVEIEIKEPTVIYME 422



Score = 32.5 bits (74), Expect = 0.005
Identities = 13/75 (17%), Positives = 29/75 (38%), Gaps = 1/75 (1%)

Query: 398 EPYENVTLDVEEQHQGSVMQALGERKGDLKNMNPDGKGRVRLDYVIPSRGLIGFRSEFMT 457
EPY + + +++ + ++ + V L IP+R + +RS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKN-NEVILSGEIPARCIQEYRSDLTF 595

Query: 458 MTSGTGLLYSTFSHY 472
T+G + + Y
Sbjct: 596 FTNGRSVCLTELKGY 610


83  SF4072  SF4085  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF4072    0       14       3.120076   transcriptional regulator HU subunit alpha
SF4073    0       16       3.616289   hypothetical protein
SF4074    0       17       3.592835   hypothetical protein
SF4075    -1      17       3.681421   sensor protein ZraS
SF4076    0       21       3.892756   transcriptional regulatory protein ZraR
SF4077    -1      20       3.776455   phosphoribosylamine--glycine ligase
SF4078    -1      18       3.363791   bifunctional
SF4081    -2      16       3.003970   *isocitrate lyase
SF4084    -1      15       2.740629   aceBA operon repressor
SF4085    -1      17       2.790269   B12-dependent methionine synthase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4072    DNABINDINGHU  120    2e-39    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 120 bits (302), Expect = 2e-39
Identities = 50/89 (56%), Positives = 66/89 (74%)

Query: 2 NKTQLIDVIAEKAELSKTQAKAALESTLAAITESLKEGDAVQLVGFGTFKVNHRAERTGR 61
NK LI +AE EL+K + AA+++ +A++ L +G+ VQL+GFG F+V RA R GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGKEIKIAAANVPAFVSGKALKDAVK 90
NPQTG+EIKI A+ VPAF +GKALKDAVK
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAVK 91


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4075    PF06580       36     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.4 bits (84), Expect = 2e-04
Identities = 49/262 (18%), Positives = 104/262 (39%), Gaps = 43/262 (16%)

Query: 197 ILFALATVLLA-SVLSFFW-YRRYLRSRQLLQDEMKRKEKLVALGHLAAGV-AHEIRNPL 253
I+F + V S+L F W + + + ++ Q +M + L L A + H + N L
Sbjct: 120 IIFNVVVVTFMWSLLYFGWHFFKNYKQAEIDQWKMASMAQEAQLMALKAQINPHFMFNAL 179

Query: 254 SSIKGLAKYFAERAPAGGEAHQLAQVM---AKEADRLNRVVSELLELVKPTHLALQAVDL 310
++I+ L +A L+++M + ++ +++ L +V ++L L ++
Sbjct: 180 NNIRALILEDPTKAREM--LTSLSELMRYSLRYSNARQVSLADELTVVD-SYLQLASIQF 236

Query: 311 NTLINHSLQLVSQDANSREIQLRFTANDTLPEIQADPDRLTQVLL-NLYLNAIQAIGQHG 369
+ Q+ + ++Q+ P L Q L+ N + I + Q G
Sbjct: 237 EDRLQFENQI---NPAIMDVQV--------------PPMLVQTLVENGIKHGIAQLPQGG 279

Query: 370 VISVTASESGAGVKISVTDSGKGIAADQLEAIFTPYFTTKAEGTGLGLAVVHNIVEQHGG 429
I + ++ V + V ++G + E TG GL V ++ G
Sbjct: 280 KILLKGTKDNGTVTLEVENTGSLALKNT------------KESTGTGLQNVRERLQMLYG 327

Query: 430 ---TIQVASQEGKGSTFTLWLP 448
I+++ ++GK + +P
Sbjct: 328 TEAQIKLSEKQGKV-NAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4076    HTHFIS        525    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 525 bits (1355), Expect = 0.0
Identities = 183/468 (39%), Positives = 253/468 (54%), Gaps = 35/468 (7%)

Query: 8 ILVVDDDISHCTILQALLRGWGYNVALANSGRQALEQVREQVFDLVLCDVRMAEMDGIAT 67
ILV DDD + T+L L GY+V + ++ + DLV+ DV M + +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 68 LKEIKALNPAIPVLIMTAYSSVETAVEALKTGAQDYLIKPLDFDNLQATLEKALAHTHSI 127
L IK P +PVL+M+A ++ TA++A + GA DYL KP D L + +ALA
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRR 125

Query: 128 DAETPAVTASQFGMVGKSPAMQHLLSEIALVAPSEATVLIHGDSGTGKELVARAIHASSA 187
++ + +VG+S AMQ + +A + ++ T++I G+SGTGKELVARA+H
Sbjct: 126 PSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGK 185

Query: 188 RSEKPLVTLNCAALNESLLESELFGHEKGAFTGADKRREGRFVEADGGTLFLDEIGDISP 247
R P V +N AA+ L+ESELFGHEKGAFTGA R GRF +A+GGTLFLDEIGD+
Sbjct: 186 RRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPM 245

Query: 248 MMQVRLLRAIQEREVQRVGSNQTISVDVRLIAATHRDLAAEVNAGRFRQDLYYRLNVVAI 307
Q RLLR +Q+ E VG I DVR++AAT++DL +N G FR+DLYYRLNVV +
Sbjct: 246 DAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPL 305

Query: 308 EVPSLRQRREDIPLLAGHFLQRFAERNRKAVKGFTPQAMDLLIHYDWPGNIRELENAVER 367
+P LR R EDIP L HF+Q+ + VK F +A++L+ + WPGN+RELEN V R
Sbjct: 306 RLPPLRDRAEDIPDLVRHFVQQAEKEGLD-VKRFDQEALELMKAHPWPGNVRELENLVRR 364

Query: 368 AVVLLTGEYISERELPLAIASTPIPLGQSQDIQP-------------------------- 401
L + I+ + + S +
Sbjct: 365 LTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALP 424

Query: 402 --------LVEVEKEVILAALEKTGGNKTEAARQLGITRKTLLAKLSR 441
L E+E +ILAAL T GN+ +AA LG+ R TL K+
Sbjct: 425 PSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRE 472


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4081    BINARYTOXINB  32     0.004    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 32.3 bits (73), Expect = 0.004
Identities = 14/58 (24%), Positives = 23/58 (39%)

Query: 289 ETSTPDLELARRFAQAIHAKYPGKLLAYNCSPSFNWQKNLDDKTIASFQQQLSDMGYK 346
ET+ PD+ L A P L Y + N D +T + + QL+++
Sbjct: 544 ETTKPDMTLKEALKIAFGFNEPNGNLQYQGKDITEFDFNFDQQTSQNIKNQLAELNAT 601


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4085    BCTERIALGSPD  34     0.004    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D

signature.
Length = 660

Score = 34.1 bits (78), Expect = 0.004
Identities = 20/87 (22%), Positives = 37/87 (42%), Gaps = 17/87 (19%)

Query: 343 SGLEPLNIGDDSLFVNVGERTN---VTGSA----KFKRLIKEEKYSEALDVARQQVENGA 395
+P+ D ++ + +TN VT + +R+I + LD+ R QV A
Sbjct: 298 QAAKPVAALDKNIIIKAHGQTNALIVTAAPDVMNDLERVIAQ------LDIRRPQVLVEA 351

Query: 396 QIIDINMDEGMLDAEAAMVRFLNLIAG 422
I ++ D L+ +++ N AG
Sbjct: 352 IIAEVQ-DADGLNLG---IQWANKNAG 374


84  SF4093  SF4099  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF4093    -1      14       -0.671589  sorbitol-6-phosphate 2-dehydrogenase
SF4094    0       14       -0.768817  sor-operon regulator
SF4095    0       14       -0.282123  23S rRNA pseudouridine synthase F
SF4096    0       15       0.253345   IS2 transposase TnpB
SF4097    -1      15       0.131730   IS2 repressor TnpA
SF4098    0       13       -1.004723  sensory histidine kinase DcuS
SF4099    0       13       -0.557546  DNA-binding transcriptional activator DcuR
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4093    DHBDHDRGNASE  115    5e-33    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 115 bits (289), Expect = 5e-33
Identities = 79/272 (29%), Positives = 128/272 (47%), Gaps = 27/272 (9%)

Query: 7 LQDKIIIVTGGASGIGLAIVEELLAQGANVQMVDIHG-------GDGQYEGHKGYQFWPT 59
++ KI +TG A GIG A+ L +QGA++ VD + + E F P
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAF-PA 64

Query: 60 DISSTKEVNHTVAEIIQRFGRIDGLVNNAGVNFPRLLVDEKAPAGQYELNEAAFEKMVNI 119
D+ + ++ A I + G ID LVN AGV P L+ + L++ +E ++
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLI---------HSLSDEEWEATFSV 115

Query: 120 NQKGVFLMSQAVARQMVKQHDGVIVNVSSESGLEGSEGQSCYAATKAALNSFTRSWSKEL 179
N GVF S++V++ M+ + G IV V S + YA++KAA FT+ EL
Sbjct: 116 NSTGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLEL 175

Query: 180 GKHGIRVVGIAPGILEKTGLRTPEYEEALAWTRNITVEQLREGYT---KNAIPIGRAGRL 236
++ IR ++PG E + W EQ+ +G K IP+ + +
Sbjct: 176 AEYNIRCNIVSPGSTETDMQWS-------LWADENGAEQVIKGSLETFKTGIPLKKLAKP 228

Query: 237 AEVADFVCYLLSERASYITGVTTNIAGGKTRG 268
+++AD V +L+S +A +IT + GG T G
Sbjct: 229 SDIADAVLFLVSGQAGHITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4094    HTHFIS        29     0.033    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 28.6 bits (64), Expect = 0.033
Identities = 6/21 (28%), Positives = 12/21 (57%)

Query: 24 QAQIARELGIYRTTISRLLKR 44
Q + A LG+ R T+ + ++
Sbjct: 452 QIKAADLLGLNRNTLRKKIRE 472


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4098    PF06580       41     8e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.0 bits (96), Expect = 8e-06
Identities = 21/99 (21%), Positives = 38/99 (38%), Gaps = 18/99 (18%)

Query: 442 LIENALE-ALGP-EPGGEISVTLHYRHGWLHCEVNDDGPGIAPDKIDHIFDKGVSTKGSE 499
L+EN ++ + GG+I + +G + EV + G +
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN------------TKES 310

Query: 500 RGVGLALVKQQVENLGG---SIAVESEPGIFTQFFVQIP 535
G GL V+++++ L G I + + G V IP
Sbjct: 311 TGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4099    HTHFIS        71     2e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 71.4 bits (175), Expect = 2e-16
Identities = 31/109 (28%), Positives = 50/109 (45%), Gaps = 4/109 (3%)

Query: 4 VLIIDDDAMVAELNRRYVAQIPGFQCCGTASTLEKAKEIIFNSDAPIDLILLDIYMQKEN 63
+L+ DDDA + + + +++ G+ S I + DL++ D+ M EN
Sbjct: 6 ILVADDDAAIRTVLNQALSRA-GYDVR-ITSNAATLWRWI--AAGDGDLVVTDVVMPDEN 61

Query: 64 GLDLLPVLHNARCKSDVIVISSAADAATIKDSLHYGVVDYLIKPFQASR 112
DLLP + AR V+V+S+ T + G DYL KPF +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTE 110


85  SF4110  SF4117  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF4110    -3      15       0.072160   DNA-binding transcriptional regulator BasR
SF4111    -2      15       -0.110855  sensor protein BasS/PmrB
SF4112    -1      15       0.278235   proline/glycine betaine transporter
SF4113    1       18       1.375362   hypothetical protein
SF4114    0       18       0.784737   hypothetical protein
SF4115    0       28       1.829277   hypothetical protein
SF4116    0       28       2.428796   hypothetical protein
SF4117    -1      26       2.538022   phosphonate/organophosphate ester transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4110    HTHFIS        91     2e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 90.7 bits (225), Expect = 2e-23
Identities = 40/121 (33%), Positives = 59/121 (48%)

Query: 2 KILIVEDDTLLLQGLILAAQTEGYACDGVTTARMAEQSLEDGHYSLVVLDLGLPDEDGLH 61
IL+ +DD + L A GY + A + + G LVV D+ +PDE+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 FLARIRQKKYTLPVLILTARDTLTDKIAGLDVGADDYLVKPFALEELHARIRALLRRHNN 121
L RI++ + LPVL+++A++T I + GA DYL KPF L EL I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 Q 122
+
Sbjct: 125 R 125


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4111    PF06580       37     1e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 1e-04
Identities = 40/182 (21%), Positives = 80/182 (43%), Gaps = 34/182 (18%)

Query: 184 ARLDQMMESVSQLLQLARAGQSFSSGNYQHVKLLEDV-ILPSYDELSTML--DQRQQTLL 240
+ +M+ S+S+L++ S N + V L +++ ++ SY +L+++ D+ Q
Sbjct: 191 TKAREMLTSLSELMR-----YSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 241 LPESAADITVQGDATLLRMLLRNLVENAHRY----SPQGSNIMIKLQEDGGAV-MAVEDE 295
+ + D+ V ML++ LVEN ++ PQG I++K +D G V + VE+
Sbjct: 246 INPAIMDVQV------PPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENT 299

Query: 296 GPGIDESKCGELSKAFVRMDSRYGGIGLGLSIV-SRITQLHHGQFFLQNRQETSGTRAWV 354
G + + G GL V R+ L+ + ++ ++ A V
Sbjct: 300 GSLA--------------LKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 355 RL 356
+
Sbjct: 346 LI 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4112    TCRTETA       43     2e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 42.9 bits (101), Expect = 2e-06
Identities = 57/290 (19%), Positives = 105/290 (36%), Gaps = 55/290 (18%)

Query: 85 FFGMLGDKYGRQKILAITIVIMSISTFCIGLIPSYDTIGIWAPILLLICKMAQGFSVGGE 144
G L D++GR+ +L +++ ++ + P +W +L I ++ G + G
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPF-----LW---VLYIGRIVAGIT-GAT 112

Query: 145 YTGASIFVAEYSPDRKR----GFMGSWLDFGSIAGFVLGAGVVVLISTIVGEANFLDWGW 200
A ++A+ + +R GFM + FG +AG VLG G++ S
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLG-GLMGGFSP------------ 159

Query: 201 RIPFFIALPLGIIGLYLRHALEETPAFQQHVDKLEQGDREGLQDGPKVSFKEIATKYWRS 260
PFF A L + L K E+ P SF+ W
Sbjct: 160 HAPFFAAAALNGLNFLTGCFLLPESH------KGERRPLRREALNPLASFR------WAR 207

Query: 261 LLTCIGLVIATNVTYYML----LTYMPSYLSHNLHYS-EDHGVLIIIAIMIGMLFVQPVM 315
+T + ++A ++ + H+ G+ + ++ L +
Sbjct: 208 GMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMIT 267

Query: 316 GLLSDRFGRRPFVLLG----SVALFVLA--------IPAFILINSNVIGL 353
G ++ R G R ++LG +LA P +L+ S IG+
Sbjct: 268 GPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGM 317



Score = 41.0 bits (96), Expect = 8e-06
Identities = 39/164 (23%), Positives = 73/164 (44%), Gaps = 16/164 (9%)

Query: 286 LSHNLHYSEDHGVLI-IIAIMIGMLFVQPVMGLLSDRFGRRPFVLLGSVALFVLAIPAFI 344
L H+ + +G+L+ + A+M PV+G LSDRFGRRP +L+ L A+ I
Sbjct: 35 LVHSNDVTAHYGILLALYALM--QFACAPVLGALSDRFGRRPVLLVS---LAGAAVDYAI 89

Query: 345 LINSNVIGLIFAGLLMLAVILNCFMGVMASTLPAMFPTHIR---YSALAAAFNISVLVAG 401
+ + + +++ G ++A I V + + + R + ++A F +VAG
Sbjct: 90 MATAPFLWVLYIG-RIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFG-MVAG 147

Query: 402 LTPTLAAWLVESSQNLMMPAYYLMVVAVIGLITG-VTMKETANR 444
P L + S + P + + + +TG + E+
Sbjct: 148 --PVLGGLMGGFSPH--APFFAAAALNGLNFLTGCFLLPESHKG 187


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4117    PF05272       29     0.019    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.019
Identities = 12/22 (54%), Positives = 13/22 (59%)

Query: 32 MVALLGPSGSGKSTLLRHLSGL 53
V L G G GKSTL+ L GL
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGL 619


86  SF4170  SF4177  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF4170    -1      17       -0.031566  maltose ABC transporter ATP-binding protein
SF4171    0       17       -0.755923  maltose ABC transporter substrate-binding
SF4172    -1      13       0.209316   maltose transporter membrane protein
SF4173    0       10       0.657197   maltose ABC transporter permease
SF4174    0       10       0.532250   D-xylose transporter XylE
SF4175    -1      14       2.286009   phosphate-starvation-inducible protein PsiE
SF4175a   -1      14       1.921390   hypothetical protein
SF4176    -1      15       2.238899   hypothetical protein
SF4177    1       14       1.799349   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4170    PF05272       35     6e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 34.7 bits (79), Expect = 6e-04
Identities = 13/35 (37%), Positives = 18/35 (51%)

Query: 32 VVFVGPSGCGKSTLLRMIAGLETITSGDLFIGEKR 66
VV G G GKSTL+ + GL+ + IG +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGK 633


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4171    MALTOSEBP     755    0.0      Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 755 bits (1951), Expect = 0.0
Identities = 395/396 (99%), Positives = 395/396 (99%)

Query: 1 MKIKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIK 60
MKIKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIK
Sbjct: 1 MKIKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIK 60

Query: 61 VTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTW 120
VTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTW
Sbjct: 61 VTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTW 120

Query: 121 DAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEP 180
DAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEP
Sbjct: 121 DAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEP 180

Query: 181 YFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAE 240
YFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAE
Sbjct: 181 YFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAE 240

Query: 241 AAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSTGINAASPNKE 300
AAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLS GINAASPNKE
Sbjct: 241 AAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGINAASPNKE 300

Query: 301 LAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIP 360
LAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIP
Sbjct: 301 LAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIP 360

Query: 361 QMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK 396
QMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK
Sbjct: 361 QMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK 396


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4172    FLGHOOKAP1    31     0.012    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 31.1 bits (70), Expect = 0.012
Identities = 22/124 (17%), Positives = 43/124 (34%), Gaps = 21/124 (16%)

Query: 128 GDEWQLALSDGETGKNYLSDAFKFGGEQKLQLKETTAQPEGERANLRVITQNRQALSDIT 187
++WQ+ T DA L+L T + L+ + A+ ++
Sbjct: 367 NNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPV---SDAIVNMD 423

Query: 188 AILPDGNKVMMSSLRQFSGTQPLYTLDGDGTLTNNQSGVKYRPNNQ--------IGFYQS 239
++ D K+ M+S GD N Q+ + + N++ Y S
Sbjct: 424 VLITDEAKIAMAS----------EEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYAS 473

Query: 240 ITAD 243
+ +D
Sbjct: 474 LVSD 477


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4174    TCRTETA       36     3e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.0 bits (83), Expect = 3e-04
Identities = 20/87 (22%), Positives = 42/87 (48%), Gaps = 3/87 (3%)

Query: 279 VIGVMLSIFQQFVGINVVLYYAPEVFKTLGASTDIALLQTIIVGVINLTFTVLAIMT--- 335
+I ++ ++ VGI +++ P + + L S D+ I++ + L A +
Sbjct: 7 LIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 336 VDKFGRKPLQIIGALGMAIGMFSLGTA 362
D+FGR+P+ ++ G A+ + TA
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATA 93


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4177    CHANLCOLICIN  31     0.006    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 30.8 bits (69), Expect = 0.006
Identities = 21/95 (22%), Positives = 38/95 (40%), Gaps = 3/95 (3%)

Query: 20 AAGTVKVFSNGSSEAKTLTGAEHLIDLVGQPRLANSWWPGAVISEELATAAALRQQQALL 79
A + + + LT + L D+V + N+ + A AA++ + L
Sbjct: 73 AKAAAEAQAKAKANRDALT--QRLKDIVNEALRHNASRTPSATELAHANNAAMQAEDERL 130

Query: 80 TRLAEQGADSSTDDAAAINALRQQIQALKVTGRQK 114
RLA+ + + AA A ++ Q K R+K
Sbjct: 131 -RLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREK 164


87  SF4427  SF4433  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SF4427    -1      17       0.684875   phosphoglycerate mutase
SF4428    -1      14       0.081552   right oriC-binding transcriptional activator
SF4429                                hypothetical protein
SF4430                                DNA-binding response regulator CreB
SF4431                                sensory histidine kinase CreC
SF4432                                hypothetical protein
SF4433                                two-component response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4427    VACCYTOTOXIN  29     0.014    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 29.2 bits (65), Expect = 0.014
Identities = 14/45 (31%), Positives = 20/45 (44%), Gaps = 4/45 (8%)

Query: 145 PLLVSHGIALGCLVSTILGLPAWAERRLRLRNCSISRVDYQESLW 189
P +V GIA G V T+ GL W ++ N D + +W
Sbjct: 42 PAIVG-GIATGAAVGTVSGLLGWGLKQAEEAN---KTPDKPDKVW 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SF4430    HTHFIS        87     6e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 86.8 bits (215), Expect = 6e-22
Identities = 33/139 (23%), Positives = 60/139 (43%)

Query: 1 MQRETVWLVEDEQGIADTLVYMLQQEGFAVEVFERGLPVLDKARQQVPDVMILDVGLPDI 60
M T+ + +D+ I L L + G+ V + + D+++ DV +PD
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 SGFELCRQLLALHPALPVLFLTARSEEVDRLLGLEIGADDYVAKPFSPREVCARVRTLLR 120
+ F+L ++ P LPVL ++A++ + + E GA DY+ KPF E+ + L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 RVKKFSTPSPVIRIGHFEL 139
K+ + L
Sbjct: 121 EPKRRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SF4431    PF06580  33     0.003    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.9 bits (75), Expect = 0.003
Identities = 47/207 (22%), Positives = 80/207 (38%), Gaps = 51/207 (24%)

Query: 298 LTQNARMQAL---------VETL--LRQARLENRQEVVLTAVDVAALFR---RVSEARTV 343
+ Q A++ AL L +R LE+ + ++ L R R S AR V
Sbjct: 157 MAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYSLRYSNARQV 216

Query: 344 QLAE--KNITLHVM--------PTEVNVAAEPALLDQALGNLL-----DNA----IDFTP 384
LA+ + ++ + PA++D + +L +N I P
Sbjct: 217 SLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLP 276

Query: 385 ESGCITLSAEVDQEHVTLKVLDTGSGIPDYALSRIFERFYSLPRANGQKSSGLGLAFVSE 444
+ G I L D VTL+V +TGS N ++S+G GL V E
Sbjct: 277 QGGKILLKGTKDNGTVTLEVENTGSLALK----------------NTKESTGTGLQNVRE 320

Query: 445 -VARLFNGEVTLR-NVQEGGVLASLRL 469
+ L+ E ++ + ++G V A + +
Sbjct: 321 RLQMLYGTEAQIKLSEKQGKVNAMVLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SF4433    HTHFIS  82     4e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 81.8 bits (202), Expect = 4e-20
Identities = 30/122 (24%), Positives = 60/122 (49%), Gaps = 1/122 (0%)

Query: 1 MQTPHILIVEDELVTRNTLKSIFEAEGYDVFEATDGAEMHQILSEYDINLVIMDINLPGK 60
M IL+ +D+ R L GYDV ++ A + + ++ D +LV+ D+ +P +
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 NGLLLARELRE-QANVALMFLTGRDNEVDKILGLEIGADDYITKPFNPRELTIRARNLLS 119
N L +++ + ++ ++ ++ ++ + I E GA DY+ KPF+ EL L+
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 RT 121

Sbjct: 121 EP 122



 
Contact Sachin Pundhir for Bugs/Comments.