PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: 1080.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
C) Potential GIs and PAIs in NC_007946
S.No | Start | End | Bias | Virulence | Insertion elements | Prediction
1 | UTI89_C0010 | UTI89_C0021 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
UTI89_C0010 | 2 | 30 | -0.259446 | molybdenum cofactor biosynthesis protein MogA
UTI89_C0011 | 2 | 30 | 0.083146 | hypothetical protein
UTI89_C0012 | 3 | 29 | -0.022550 | hypothetical protein
UTI89_C0014 | 2 | 28 | -1.256200 | hypothetical protein
UTI89_C0015 | 0 | 22 | -2.198069 | glutamate dehydrogenase
UTI89_C0016 | -1 | 19 | -3.765972 | molecular chaperone DnaK
UTI89_C0017 | -1 | 20 | -4.308055 | molecular chaperone DnaJ
UTI89_C0018 | 1 | 23 | -5.748132 | gef membrane toxin
UTI89_C0019 | 1 | 23 | -5.887556 | hypothetical protein
UTI89_C0020 | 1 | 22 | -5.149639 | hypothetical protein
UTI89_C0021 | 1 | 22 | -4.500208 | hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0012 | PF07201 | 30 | 0.006 | Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 30.2 bits (68), Expect = 0.006
Identities = 9/51 (17%), Positives = 24/51 (47%)

Query: 138 LHAVDARVNELEELLPLLMKDKLLAKGVSHLLSSQLTRILRTHAAMSVLGH 188
+ V+ +VN+ +P L + + +++ +S L +S + + A +
Sbjct: 80 VSDVEEQVNQYLSKVPELEQKQNVSELLSLLSNSPNISLSQLKAYLEGKSE 130


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0016 | SHAPEPROTEIN | 142 | 7e-40 | Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein

signature.
Length = 347

Score = 142 bits (361), Expect = 7e-40
Identities = 83/387 (21%), Positives = 149/387 (38%), Gaps = 84/387 (21%)

Query: 5 IGIDLGTTNSCVAIMDGTTPRVLENAEGDRTTPSIIAYTQDGET------LVGQPAKRQA 58
+ IDLGT N+ + + + E PS++A QD VG AK+
Sbjct: 13 LSIDLGTANTLIYVKGQGIV-LNE--------PSVVAIRQDRAGSPKSVAAVGHDAKQML 63

Query: 59 VTNPQNTLFAIKRLIGRRFQDEEVQRDVSIMPFKIIAADNGDAWVEVKGQKMAPPQISAE 118
P N + AI+ + +D I F + +
Sbjct: 64 GRTPGN-IAAIRPM-----------KDGVIADFFVTEK------------------MLQH 93

Query: 119 VLKKMKKTAEDYLGEPVTEAVITVPAYFNDAQRQATKDAGRIAGLEVKRIINEPTAAALA 178
+K++ + P ++ VP +R+A +++ + AG +I EP AAA+
Sbjct: 94 FIKQVHS---NSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIG 150

Query: 179 YGL--DKGTGNRTIAVYDLGGGTFDISIIEIDEVDGEKTFEVLATNGDTHLGGEDFDSRL 236
GL + TG+ V D+GGGT ++++I ++ V + +GG+ FD +
Sbjct: 151 AGLPVSEATGS---MVVDIGGGTTEVAVISLNGV---------VYSSSVRIGGDRFDEAI 198

Query: 237 INYLVEEFKKDQGIDLRNDPLAMQRLKEAAEKAKIELSSA----QQTDVNLPYITADATG 292
INY+ + G + AE+ K E+ SA + ++ +
Sbjct: 199 INYVRRNYGSLIG-------------EATAERIKHEIGSAYPGDEVREIEVRGRNLAEGV 245

Query: 293 PKHMNIKVTRAKLESLVEDLVNRSIEPLKVALQD-AGLSVSDIDD--VILVGGQTRMPMV 349
P+ + + LE+L E + + + VAL+ SDI + ++L GG + +
Sbjct: 246 PRGFTLN-SNEILEALQEP-LTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNL 303

Query: 350 QKKVAEFFGKEPRKDVNPDEAVAIGAA 376
+ + E G +P VA G
Sbjct: 304 DRLLMEETGIPVVVAEDPLTCVARGGG 330


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0018 | HOKGEFTOXIC | 63 | 2e-17 | Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 62.5 bits (152), Expect = 2e-17
Identities = 18/46 (39%), Positives = 30/46 (65%)

Query: 23 HKAMIVALIVICITAVVAALVTRKDLCEVHIRTGQTEVAVFTAYES 68
+++ ++++C+T ++ +TRK LCE+ R G EVA F AYES
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50


2 | UTI89_C0062 | UTI89_C0075 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
UTI89_C0062 | -2 | 20 | 3.506468 | Dna-J like membrane chaperone protein
UTI89_C0063 | -2 | 20 | 4.260335 | 23S rRNA/tRNA pseudouridine synthase A
UTI89_C0064 | -2 | 19 | 4.019706 | ATP-dependent helicase HepA
UTI89_C0065 | -2 | 17 | 3.981274 | DNA polymerase II
UTI89_C0066 | -3 | 12 | 1.752597 | L-ribulose-5-phosphate 4-epimerase
UTI89_C0067 | -2 | 12 | -0.330837 | L-arabinose isomerase
UTI89_C0068 | 1 | 17 | -1.087955 | ribulokinase
UTI89_C0069 | 2 | 24 | -3.666169 | DNA-binding transcriptional regulator AraC
UTI89_C0070 | 1 | 17 | -0.696521 | hypothetical protein
UTI89_C0071 | 1 | 14 | -0.512433 | hypothetical protein
UTI89_C0072 | 0 | 14 | 2.153863 | hypothetical protein
UTI89_C0073 | 1 | 16 | 3.486225 | hypothetical protein
UTI89_C0074 | 1 | 16 | 3.660035 | thiamine transporter ATP-binding subunit
UTI89_C0075 | 1 | 18 | 3.525476 | thiamine transporter membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0062 | 56KDTSANTIGN | 29 | 0.024 | Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein

signature.
Length = 533

Score = 28.8 bits (64), Expect = 0.024
Identities = 32/120 (26%), Positives = 51/120 (42%), Gaps = 18/120 (15%)

Query: 157 IAEELGISRAQFD-----QFLRMMQGGAQFGGGYQQQSGGGNWQQAQRGPTLEDACNVLG 211
EEL R FD F+ + QQQ G G QQAQ T ++A
Sbjct: 310 TLEEL---RDSFDGYINNAFVNQIHLNFVMPPQAQQQQGQGQQQQAQ--ATAQEAVAAAA 364

Query: 212 VKPTDDATTIKRAYRKLMS-EHHPDKLVAKGLPPEMMEMAKQKAQEIQ-QAYELIKQQKG 269
V+ + + I + Y+ L+ + H G+ M ++A Q+ ++ + Q KQQ+G
Sbjct: 365 VRLLNGSDQIAQLYKDLVKLQRH------AGIRKAMEKLAAQQEEDAKNQGKGDCKQQQG 418


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0072 | SECYTRNLCASE | 26 | 0.047 | Preprotein translocase SecY subunit signature.
		>SECYTRNLCASE#Preprotein translocase SecY subunit signature.

Length = 437

Score = 26.3 bits (58), Expect = 0.047
Identities = 13/29 (44%), Positives = 17/29 (58%), Gaps = 1/29 (3%)

Query: 2 SKYIYILLSF-LVLFFIFFYAYISLMSKE 29
IYI+ F L++FF FFY IS +E
Sbjct: 314 DHPIYIVTYFLLIVFFAFFYVAISFNPEE 342


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0075 | PF06580 | 32 | 0.005 | Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.1 bits (73), Expect = 0.005
Identities = 17/80 (21%), Positives = 28/80 (35%), Gaps = 5/80 (6%)

Query: 4 RRQPLIPGWLIPGVSAATLVVAVALAAFLALWWNAPQGDWVAVWQDS-YLWHVVRFSFWQ 62
R GWL + L V A +W+ A ++W+ ++
Sbjct: 60 RSFIKRQGWLKLNMGQIILRVLPACVVIGMVWFVAN----TSIWRLLAFINTKPVAFTLP 115

Query: 63 AFLSALLSVVPAIFLARALY 82
LS + +VV F+ LY
Sbjct: 116 LALSIIFNVVVVTFMWSLLY 135


3 | UTI89_C0119 | UTI89_C0127 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
UTI89_C0119 | 2 | 16 | -3.498763 | regulatory protein AmpE
UTI89_C0120 | 3 | 14 | -2.909036 | aromatic amino acid transporter
UTI89_C0121 | 4 | 22 | -4.924892 | uropathogenic specific protein
UTI89_C0122 | 4 | 25 | -1.932315 | hypothetical protein
UTI89_C0123 | 5 | 33 | 0.263211 | hypothetical protein
UTI89_C0124 | 5 | 38 | 0.911135 | hypothetical protein
UTI89_C0125 | 4 | 32 | 1.887722 | transcriptional regulator PdhR
UTI89_C0126 | 3 | 35 | 2.306406 | hypothetical protein
UTI89_C0127 | 3 | 35 | 2.466333 | pyruvate dehydrogenase subunit E1
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0121 | PYOCINKILLER | 184 | 1e-52 | Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 184 bits (467), Expect = 1e-52
Identities = 97/287 (33%), Positives = 136/287 (47%), Gaps = 27/287 (9%)

Query: 313 ALAGSTATTRVRFFWGTDIHGKPQVYGVHTGEGTPY-ENVRVANMQWNEQTQRYEFT--- 368
A+A ++ T + + G V + +G + V V +N T YE T
Sbjct: 345 AVAKASGTVDLPMRLTNEARGNTTTLSVVSTDGVSVPKAVPVRMAAYNATTGLYEVTVPS 404

Query: 369 PAHDVDGPLITWTPENPEHGYVPGHTGN--DRPPLEQPTILVTPIPDGTDTYTTPPFPVP 426
+ ++TWTP +P P T +P +TP+ T +P
Sbjct: 405 TTAEAPPLILTWTPASPPGNQNPSSTTPVVPKPVPVYEGATLTPV-----KATPETYPGV 459

Query: 427 DPKEFNDYILVFPAGSGIKPIYVYLKEDPRKLPGVVTGRGVPLSPGTRWLDMSVSNNGNG 486
D I+ FPA SGIKPIYV + DPR +PG TG+G P+ WL ++ G G
Sbjct: 460 ITLP-EDLIIGFPADSGIKPIYVMFR-DPRDVPGAATGKGQPV--SGNWLG--AASQGEG 513

Query: 487 APIPAHIADKLRGREFKTFDEFREALWLEVSQDPELIAQFSSGNQTRIKQGLTAKAPIDG 546
APIP+ IADKLRG+ FK + +FRE W+ V+ DPEL QF+ G+ ++ G
Sbjct: 514 APIPSQIADKLRGKTFKNWRDFREQFWIAVANDPELSKQFNPGSLAVMRDGGAPYVR--- 570

Query: 547 WYYGPKEIV---KKFQIHHRVAVEYGGSVYDIDNLRIVTPRLHDEIH 590
E K +IHH+V V GG VY++ NL VTP+ H EIH
Sbjct: 571 ----ESEQAGGRIKIEIHHKVRVADGGGVYNMGNLVAVTPKRHIEIH 613


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0123 | PF04605 | 26 | 0.018 | Virulence-associated protein D (VapD)
		>PF04605#Virulence-associated protein D (VapD)

Length = 125

Score = 26.0 bits (57), Expect = 0.018
Identities = 13/34 (38%), Positives = 20/34 (58%)

Query: 1 MYNFKDEIEDYTEREFIELLGEFTNPTGDNAQLK 34
Y+ K+ I+D ++F + L EFT T N +LK
Sbjct: 88 QYSLKETIQDLCAKDFHQKLKEFTEKTPKNQKLK 121


4 | UTI89_C0145 | UTI89_C0169 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
UTI89_C0145 | 0 | 16 | -3.853695 | hypothetical protein
UTI89_C0146 | 1 | 19 | -4.538744 | hypothetical protein
UTI89_C0147 | 2 | 25 | -5.526910 | pantoate--beta-alanine ligase
UTI89_C0148 | 2 | 27 | -6.811388 | 3-methyl-2-oxobutanoate
UTI89_C0149 | 3 | 32 | -8.121870 | fimbrial-like adhesin protein
UTI89_C0150 | 3 | 31 | -7.703762 | fimbrial subunit YadK
UTI89_C0151 | 2 | 30 | -7.164006 | fimbrial subunit YadL
UTI89_C0152 | 0 | 21 | -4.475133 | fimbrial subunit YadM
UTI89_C0153 | -1 | 18 | -3.054657 | outer membrane usher protein
UTI89_C0154 | -1 | 17 | -0.607818 | chaperone protein EcpD
UTI89_C0155 | 0 | 15 | 0.296122 | fimbrial subunit YadN
UTI89_C0156 | 0 | 14 | 2.094244 | 2-amino-4-hydroxy-6-
UTI89_C0157 | 0 | 13 | 3.517693 | poly(A) polymerase
UTI89_C0158 | 1 | 16 | 3.031382 | glutamyl-Q tRNA(Asp) synthetase
UTI89_C0159 | 0 | 17 | 3.183442 | RNA polymerase-binding transcription factor
UTI89_C0160 | 0 | 17 | 2.931224 | sugar fermentation stimulation protein A
UTI89_C0161 | 0 | 14 | 2.501596 | 2'-5' RNA ligase
UTI89_C0162 | -1 | 14 | 2.568550 | ATP-dependent RNA helicase HrpB
UTI89_C0163 | -1 | 15 | 1.690926 | hypothetical protein
UTI89_C0164 | -2 | 17 | 3.314578 | penicillin-binding protein 1b
UTI89_C0165 | -1 | 15 | 3.458375 | hypothetical protein
UTI89_C0166 | -1 | 14 | 3.537555 | ferrichrome outer membrane transporter
UTI89_C0167 | 0 | 16 | 4.475395 | iron-hydroxamate transporter ATP-binding
UTI89_C0168 | 1 | 14 | 4.147695 | iron-hydroxamate transporter substrate-binding
UTI89_C0169 | 0 | 14 | 4.036280 | iron-hydroxamate transporter permease subunit
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0148 | FLGMRINGFLIF | 29 | 0.018 | Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 29.2 bits (65), Expect = 0.018
Identities = 27/100 (27%), Positives = 40/100 (40%), Gaps = 22/100 (22%)

Query: 110 MVKIEGGEWL----VETVQMLTERAVPVCGHLGLTPQSVNIFGGYKVQGRGDEAGDRLL- 164
V +E G L + V L AV GL P +V + D++G LL
Sbjct: 176 TVTLEPGRALDEGQISAVVHLVSSAVA-----GLPPGNVTLV---------DQSG-HLLT 220

Query: 165 -SDALALEAAGAQLLVLECVPVELAKRITEALAIPVIGIG 203
S+ + AQL V + +RI L+ P++G G
Sbjct: 221 QSNTSGRDLNDAQLKFANDVESRIQRRIEAILS-PIVGNG 259


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0153 | PF00577 | 805 | 0.0 | Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 805 bits (2080), Expect = 0.0
Identities = 263/869 (30%), Positives = 428/869 (49%), Gaps = 40/869 (4%)

Query: 12 IATFCALLYSNSALCAELVEYDHTFLMGKDASNIDLSRYTEGNPTLPGIYDVSVYVNDQP 71
+ CA AE + ++ FL + DLSR+ G PG Y V +Y+N+
Sbjct: 30 LFVACAFAAQAPLSSAE-LYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGY 88

Query: 72 IMSQSIAFAVIEGKKNAQACITQKNLLQFHISSPDKNSEKAILLKRDDDLGDCLNLAEMI 131
+ ++ + F + ++ C+T+ L +++ + + C+ L MI
Sbjct: 89 MATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLL------ADDACVPLTSMI 142

Query: 132 PQSSIRYDVNDQRLDIDVPQAWIMKNYQNYVDPSLWENGINAAMLSYNLNGYHSESP-GR 190
++ + DV QRL++ +PQA++ + Y+ P LW+ GINA +L+YN +G ++ G
Sbjct: 143 HDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGG 202

Query: 191 TNDSIYAAFNGGINLGAWRLRASGNYNWMTNVHS-----DYDFQNRYLQRDLASLRSQLV 245
+ Y G+N+GAWRLR + +++ ++ S + N +L+RD+ LRS+L
Sbjct: 203 NSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLT 262

Query: 246 IGESYTTGETFDSVRIRGIRLYSDSRMLPPVLASFAPIIHGVANTNAKVTVMQNGYKIYE 305
+G+ YT G+ FD + RG +L SD MLP FAP+IHG+A A+VT+ QNGY IY
Sbjct: 263 LGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYN 322

Query: 306 TTVPPGAFAIDDLSPSGYGSDLIVTIEEADGTKRTFSQPFSSVVQMLRPGVGRWDISAGQ 365
+TVPPG F I+D+ +G DL VTI+EADG+ + F+ P+SSV + R G R+ I+AG+
Sbjct: 323 STVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGE 382

Query: 366 VLKD-SIQDEPNLFQASYYYGLNNYLTGYTGIQLTDNNYTAGLLGLGMNT-PVGAFSVDV 423
+ Q++P FQ++ +GL T Y G QL D Y A G+G N +GA SVD+
Sbjct: 383 YRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLAD-RYRAFNFGIGKNMGALGALSVDM 441

Query: 424 THSNVSIPDDKTYQGQSYRISWNKLFENTSTSLNIAAYRYSTQHYLGLNDALTLIDEVEH 483
T +N ++PDD + GQS R +NK + T++ + YRYST Y D +
Sbjct: 442 TQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYN 501

Query: 484 PE-----QDLEPKSMRNYSRM---KNQVTVSINQPLKFEKKDYGSFYLSGSWSDYWASGQ 535
E ++PK Y+ + ++ +++ Q L + YLSGS YW +
Sbjct: 502 IETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQL----GRTSTLYLSGSHQTYWGTSN 557

Query: 536 NSTNYSIGYSNSASWGSYSISAQRSLNE-DGQTDDSIYLSFTIPIENLLGTEHRSS-GFQ 593
+ G + + ++++S + N D + L+ IP + L ++ +S
Sbjct: 558 VDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHA 617

Query: 594 SIDTQLNSDFKGNNQLNISSSGYSDT-NRISYSVNTGYMMNKSSDDLSYIGGYASYESPW 652
S ++ D G G N +SYSV TGY + S +Y +
Sbjct: 618 SASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGY 677

Query: 653 GTLSGSASASSDNSRQFSLNTDGGFVLHSGGLTFSNDSFSDSDTLAVIQAPGAKGARINY 712
G + S S D +Q GG + H+ G+T +DT+ +++APGAK A++
Sbjct: 678 GNANIGYSHSDDI-KQLYYGVSGGVLAHANGVTLGQPL---NDTVVLVKAPGAKDAKVEN 733

Query: 713 GNST-VDRWGYGVTSALSPYHENRIALDINDLENDVELKSTSTVAVPRQGAVVFADFETV 771
D GY V + Y ENR+ALD N L ++V+L + VP +GA+V A+F+
Sbjct: 734 QTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKAR 793

Query: 772 QGQSAIMNIVRSDGKNIPFAADIYDEQNNIIGNVGQGGQAFVRGIGQEGNIRITWIEEGK 831
G +M + + K +PF A + E + G V GQ ++ G+ G +++ W EE
Sbjct: 794 VGIKLLMT-LTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEEN 852

Query: 832 PVSCFAHYQQNTTSEKIAQSIILNGLRCQ 860
C A+YQ S++ Q + C+
Sbjct: 853 A-HCVANYQLPPESQQ--QLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0168 | FERRIBNDNGPP | 509 | 0.0 | Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 509 bits (1312), Expect = 0.0
Identities = 293/296 (98%), Positives = 294/296 (99%)

Query: 2 MSGLPLISRRRLLTAMALSPLLWQMNTAHAAVIDPNRIVALEWLPVELLLALGIVPYGVA 61
MSGLPLISRRRLLTAMALSPLLWQMNTAHAA IDPNRIVALEWLPVELLLALGIVPYGVA
Sbjct: 1 MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA 60

Query: 62 DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR 121
DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR
Sbjct: 61 DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR 120

Query: 122 GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAHYEDFIRSMKPRFVKRGARPLLLT 181
GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLA YEDFIRSMKPRFVKRGARPLLLT
Sbjct: 121 GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLT 180

Query: 182 TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH 241
TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH
Sbjct: 181 TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH 240

Query: 242 DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRILDNAIGGKA 297
DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVR+LDNAIGGKA
Sbjct: 241 DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVLDNAIGGKA 296


5 | UTI89_C0230 | UTI89_C0271 | Y | Y | Y | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
UTI89_C0230 | -2 | 22 | -4.119034 | membrane-bound lytic murein transglycosylase D
UTI89_C0231 | 1 | 24 | -5.376734 | hydroxyacylglutathione hydrolase
UTI89_C0232 | 0 | 16 | -1.456383 | hypothetical protein
UTI89_C0233 | -1 | 17 | 2.075495 | ribonuclease H
UTI89_C0234 | 0 | 19 | 3.031615 | DNA polymerase III subunit epsilon
UTI89_C0236 | 1 | 20 | 3.968529 | *aminopeptidase
UTI89_C0237 | 0 | 24 | 6.064396 | hypothetical protein
UTI89_C0238 | 0 | 24 | 6.016260 | hypothetical protein
UTI89_C0239 | 0 | 25 | 6.284467 | hypothetical protein
UTI89_C0240 | 0 | 25 | 6.009732 | hypothetical protein
UTI89_C0241 | -1 | 27 | 6.417232 | hypothetical protein
UTI89_C0242 | -1 | 26 | 5.857985 | hypothetical protein
UTI89_C0243 | 0 | 25 | 4.490692 | hypothetical protein
UTI89_C0244 | 0 | 23 | 4.365345 | hypothetical protein
UTI89_C0245 | 0 | 21 | 3.366693 | hypothetical protein
UTI89_C0246 | 1 | 19 | 3.041330 | hypothetical protein
UTI89_C0247 | 0 | 17 | 1.982269 | hypothetical protein
UTI89_C0248 | 2 | 23 | 2.083328 | hypothetical protein
UTI89_C0249 | 2 | 20 | -1.066560 | hypothetical protein
UTI89_C0250 | 3 | 24 | -3.281959 | hypothetical protein
UTI89_C0251 | 5 | 32 | -5.426607 | hypothetical protein
UTI89_C0252 | 5 | 36 | -6.619553 | hypothetical protein
UTI89_C0253 | 6 | 41 | -8.095946 | hypothetical protein
UTI89_C0254 | 5 | 48 | -11.388128 | hypothetical protein
UTI89_C0255 | 4 | 42 | -9.315378 | hypothetical protein
UTI89_C0256 | 0 | 16 | -1.629122 | hypothetical protein
UTI89_C0257 | -1 | 13 | -0.452232 | hypothetical protein
UTI89_C0258 | -1 | 15 | 1.279432 | hypothetical protein
UTI89_C0259 | -2 | 17 | 2.100166 | hypothetical protein
UTI89_C0260 | -1 | 18 | 1.755918 | C-lysozyme inhibitor
UTI89_C0261 | 0 | 19 | 1.892139 | acyl-CoA dehydrogenase
UTI89_C0262 | 2 | 18 | -1.723239 | phosphoheptose isomerase
UTI89_C0263 | 2 | 21 | -1.045884 | amidotransferase
UTI89_C0264 | 0 | 22 | 1.241918 | hypothetical protein
UTI89_C0265 | 1 | 24 | 1.848472 | hypothetical protein
UTI89_C0266 | 1 | 21 | 2.130945 | lipoprotein YafL
UTI89_C0267 | 2 | 17 | 1.500127 | hypothetical protein
UTI89_C0268 | 2 | 18 | 1.934378 | hypothetical protein
UTI89_C0269 | 2 | 17 | 0.605011 | FhiA protein
UTI89_C0270 | 3 | 16 | -1.096877 | hypothetical protein
UTI89_C0271 | 2 | 18 | -0.935618 | DNA polymerase IV
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0231 | BINARYTOXINB | 35 | 2e-04 | Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 35.4 bits (81), Expect = 2e-04
Identities = 13/55 (23%), Positives = 28/55 (50%), Gaps = 4/55 (7%)

Query: 186 NDYYRKVKELRAKNQITLPVILKNERQINVFLRT----EDIDLINVINEETLLKQ 236
+ ++ EL A N T+ +K ++N+ +R D + I V +E+++K+
Sbjct: 589 QNIKNQLAELNATNIYTVLDKIKLNAKMNILIRDKRFHYDRNNIAVGADESVVKE 643


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0253 | ICENUCLEATIN | 32 | 0.012 | Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 31.6 bits (71), Expect = 0.012
Identities = 30/107 (28%), Positives = 42/107 (39%), Gaps = 10/107 (9%)

Query: 519 TETIGNDQKITVGLG--QTVNVGSKKEGGHDQKVTVANDQHLTIKNDRHKVVNNNQTSKV 576
T+T G D +T G G QT GS G+ T D L + QT+
Sbjct: 359 TQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAG------YGSTQTAGE 412

Query: 577 TGTDTEEVVKKQSIKIGDNYELKVEHGTNIISGDSIELICGQGESGT 623
T T Q+ + G + L +G+ +GD LI G G + T
Sbjct: 413 ESTQTAGYGSTQTAQKGSD--LTAGYGSTGTAGDDSSLIAGYGSTQT 457


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0270 | OMPADOMAIN | 40 | 6e-06 | OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 39.9 bits (93), Expect = 6e-06
Identities = 30/118 (25%), Positives = 47/118 (39%), Gaps = 22/118 (18%)

Query: 122 FERGSAKIMPFFKTLLVELAPVLDSL---DNKIIITGHTDAM---AYKNNIYNNWNLSGD 175
F A + P + L +L L +L D +++ G+TD + AY N LS
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAY------NQGLSER 276

Query: 176 RALSARRVLEEAGMPENKVMQVS-----AMADQMLLDAKNPQS-----AGNRRIEIMV 223
RA S L G+P +K+ + + K + A +RR+EI V
Sbjct: 277 RAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


6 | UTI89_C0282 | UTI89_C0344 | Y | Y | Y | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
UTI89_C0282 | 0 | 20 | -6.132312 | outer membrane phosphoporin protein E
UTI89_C0283 | 3 | 29 | -8.718487 | gamma-glutamyl kinase
UTI89_C0284 | 4 | 34 | -9.868082 | gamma-glutamyl phosphate reductase
UTI89_C0286 | 7 | 41 | -11.781464 | *integrase
UTI89_C0287 | 8 | 43 | -12.559125 | hypothetical protein
UTI89_C0288 | 8 | 46 | -13.084061 | hypothetical protein
UTI89_C0289 | 6 | 43 | -10.395316 | hypothetical protein
UTI89_C0290 | 3 | 40 | -6.087644 | resolvase
UTI89_C0291 | 4 | 42 | -9.509707 | hypothetical protein
UTI89_C0292 | 5 | 43 | -10.919629 | transcription regulator
UTI89_C0293 | 6 | 45 | -12.230800 | hypothetical protein
UTI89_C0295 | 6 | 40 | -10.995393 | *CP4-like integrase
UTI89_C0296 | 4 | 34 | -11.295186 | hypothetical protein
UTI89_C0297 | 1 | 29 | -10.237699 | hypothetical protein
UTI89_C0298 | 2 | 28 | -5.985125 | hypothetical protein
UTI89_C0299 | 2 | 28 | -4.202203 | InsA protein
UTI89_C0300 | 0 | 22 | -2.420744 | hypothetical protein
UTI89_C0301 | 0 | 17 | -1.505953 | InsB protein
UTI89_C0302 | 0 | 18 | 0.676725 | hypothetical protein
UTI89_C0303 | 1 | 19 | 1.106711 | hypothetical protein
UTI89_C0304 | 2 | 21 | 1.826195 | ferredoxin
UTI89_C0305 | 2 | 21 | 0.874186 | hypothetical protein
UTI89_C0306 | 2 | 20 | 0.418708 | hypothetical protein
UTI89_C0307 | 2 | 21 | 0.231932 | hypothetical protein
UTI89_C0308 | 5 | 24 | -3.530104 | hypothetical protein
UTI89_C0309 | 6 | 24 | -6.137185 | MatB
UTI89_C0310 | 4 | 32 | -8.140796 | hypothetical protein
UTI89_C0311 | 1 | 35 | -6.773627 | hypothetical protein
UTI89_C0312 | 2 | 35 | -7.154768 | hypothetical protein
UTI89_C0313 | 2 | 31 | -6.044214 | 50S ribosomal protein L36
UTI89_C0314 | 1 | 28 | -4.996631 | 50S ribosomal protein L31
UTI89_C0315 | 1 | 26 | -4.375201 | NADH-dependent flavin oxidoreductase
UTI89_C0316 | 0 | 19 | -1.714306 | hypothetical protein
UTI89_C0317 | 0 | 19 | -1.927416 | LysR family transcriptional regulator
UTI89_C0318 | 0 | 18 | -1.779516 | transcriptional regulator YcjZ
UTI89_C0319 | 0 | 17 | -2.192081 | aldo/keto reductase
UTI89_C0320 | 0 | 19 | -2.876918 | 2,5-diketo-D-gluconate reductase A
UTI89_C0321 | -1 | 22 | -3.614793 | attaching and effacing protein, pathogenesis
UTI89_C0322 | 2 | 29 | -6.752671 | transcriptional regulator YkgA
UTI89_C0323 | 2 | 28 | -5.568722 | 2,5-diketo-D-gluconate reductase A
UTI89_C0324 | 2 | 25 | -4.012113 | hypothetical protein
UTI89_C0326 | 1 | 23 | -3.097455 | hypothetical protein
UTI89_C0327 | 0 | 21 | -2.676939 | pyridine nucleotide-disulfide oxidoreductase
UTI89_C0328 | 0 | 20 | -3.368806 | transcriptional regulator YkgD
UTI89_C0329 | 0 | 19 | -4.401724 | hypothetical protein
UTI89_C0330 | 0 | 21 | -4.734247 | electron transport protein YkgF
UTI89_C0331 | 2 | 24 | -5.326474 | hypothetical protein
UTI89_C0332 | 5 | 26 | -6.109847 | hypothetical protein
UTI89_C0333 | 4 | 26 | -6.303049 | hypothetical protein
UTI89_C0334 | 4 | 27 | -5.291481 | hypothetical protein
UTI89_C0335 | 0 | 16 | -0.469312 | hypothetical protein
UTI89_C0336 | 0 | 15 | 2.032299 | hypothetical protein
UTI89_C0337 | 0 | 16 | 3.062481 | hypothetical protein
UTI89_C0338 | 0 | 19 | 3.302906 | type 1 fimbriae regulatory protein FimX
UTI89_C0339 | 0 | 19 | 2.913945 | hypothetical protein
UTI89_C0340 | 0 | 19 | 2.751118 | choline dehydrogenase
UTI89_C0341 | 0 | 16 | 1.808035 | betaine aldehyde dehydrogenase
UTI89_C0342 | -1 | 15 | 0.005960 | transcriptional regulator BetI
UTI89_C0343 | 0 | 15 | -1.158766 | choline transport protein BetT
UTI89_C0344 | 2 | 20 | -4.031661 | hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0282 | ECOLIPORIN | 548 | 0.0 | E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 548 bits (1413), Expect = 0.0
Identities = 231/384 (60%), Positives = 267/384 (69%), Gaps = 34/384 (8%)

Query: 3 MKKSTLALVVMGIVASVSVQAAEIYNKDGNKLDVYGKVKAMHYMSDNDSKDGDQSYIRFG 62
MK+ LALV+ ++A+ + AAEIYNKDGNKLD+YGKV +HY SD+ SKDGDQ+Y+R G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 63 FKGETQINVQLTGYGRWEAEFAGNKAESDTAQQKTRLAFAGLKYKDLGSFDYGRNLGALY 122
FKGETQIN QLTGYG+WE N E + A TRLAFAGLK+ D GSFDYGRN G LY
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYGVLY 120

Query: 123 DVEAWTDMFPEFGGDSSAQTDNFMTKRASGLATYRNTDFFGVIDGLNLTLQYQGKNEN-- 180
DVE WTDM PEFGGDS DN+MT RA+G+ATYRNTDFFG++DGLN LQYQGKNE+
Sbjct: 121 DVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQS 180

Query: 181 --------------RDVKKQNGDGFGTSLTYDFGGSDFAISGAYTNSDRTNEQNLQSR-- 224
D++ NGDGFG S TYD G F+ AYT SDRTNEQ
Sbjct: 181 ADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDI-GMGFSAGAAYTTSDRTNEQVNAGGTI 239

Query: 225 GTGKRAEAWATGLKYDANNIYLATFYSETRKMTP-------ISGGFANKTQNFEAVAQYQ 277
G +A+AW GLKYDANNIYLAT YSETR MTP GG ANKTQNFE AQYQ
Sbjct: 240 AGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQYQ 299

Query: 278 FDFGLRPSLGYVLSKGKDIE----GIGDEDLVNYVDVGATYYFNKNMSAFVDYKINQLDS 333
FDFGLRP++ +++SKGKD+ D+DLV Y DVGATYYFNKN S +VDYKIN LD
Sbjct: 300 FDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLLDD 359

Query: 334 DNKL----NINNDDIVAVGMTYQF 353
D+ I+ DDIVA+GM YQF
Sbjct: 360 DDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0283 | CARBMTKINASE | 37 | 6e-05 | Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 37.5 bits (87), Expect = 6e-05
Identities = 28/127 (22%), Positives = 48/127 (37%), Gaps = 17/127 (13%)

Query: 119 DTLRALLDNNI---------VPVINENDAVATAEIKVGDNDNLSALAAILAGADKLLLLT 169
+T++ L++ + VPVI E+ + E V D D A AD ++LT
Sbjct: 177 ETIKKLVERGVIVIASGGGGVPVILEDGEIKGVE-AVIDKDLAGEKLAEEVNADIFMILT 235

Query: 170 DQKGLYTADPRSNPQAELIKDVYGIDDALRAIAGDSVSGLGTGGMSTKLQAA-DVACRAG 228
D G + + +++V +++ + G M K+ AA G
Sbjct: 236 DVNGAALY--YGTEKEQWLREV-KVEELRKYYEEG---HFKAGSMGPKVLAAIRFIEWGG 289

Query: 229 IDTIIAA 235
IIA
Sbjct: 290 ERAIIAH 296



Score = 30.2 bits (68), Expect = 0.013
Identities = 16/76 (21%), Positives = 33/76 (43%), Gaps = 13/76 (17%)

Query: 4 SQTLVVKLGTSVLTGGSRRLNRAHIVELVRQCAQ----LHAAGHRIVIVTSG-------- 51
+ +V+ LG + L ++ + +++ VR+ A+ + A G+ +VI
Sbjct: 2 GKRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLL 61

Query: 52 -AIAAGREHLGYPELP 66
+ AG+ G P P
Sbjct: 62 LHMDAGQATYGIPAQP 77


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0286 | STREPTOPAIN | 30 | 0.021 | Streptopain (C10) cysteine protease family signature.
		>STREPTOPAIN#Streptopain (C10) cysteine protease family signature.

Length = 398

Score = 30.4 bits (68), Expect = 0.021
Identities = 17/78 (21%), Positives = 29/78 (37%), Gaps = 4/78 (5%)

Query: 131 DKDDPEVFGFI----IDSAINIGIRGSLSGYVQPLVSSRILREMIAGEPLDAETAFKSSE 186
DK PE+ G+ D+ I + YV+ + ++ L AG + KS
Sbjct: 94 DKRSPEILGYSTSGSFDANGKENIASFMESYVEQIKENKKLDTTYAGTAEIKQPVVKSLL 153

Query: 187 ASSKAAWFVDFPGVEISP 204
S + P ++P
Sbjct: 154 DSKGIHYNQGNPYNLLTP 171


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0307 | PF00577 | 63 | 5e-12 | Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 62.6 bits (152), Expect = 5e-12
Identities = 29/247 (11%), Positives = 73/247 (29%), Gaps = 23/247 (9%)

Query: 487 TLNLNSLWSKLGTFSISYNDDRRYNSHYYTADYYQSVYSGTFGSLGLRAGIQRYNNGDSS 546
L + + T +S + Y + +Q+ + F + N
Sbjct: 530 QLTVTQQLGRTSTLYLSG-SHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQK 588

Query: 547 ANTGKYIALDLSLPLGNWFSAGMTHQNGYTMANLSARKQFDEGT------------IRTV 594
+ +AL++++P +W + Q + A+ S + +
Sbjct: 589 -GRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNL 647

Query: 595 GANLSRAISGDTGDDKTLSGGAYAQFDARYASGTLNVNSAADGYINTNLTANGSVGWQGK 654
++ +G + +G A + Y + + S +D +G V
Sbjct: 648 SYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGY-SHSDDIKQLYYGVSGGVLAHAN 706

Query: 655 NIAASGRTDGNAGVIFDTGLEN---DGQISAKINGRIFPLNGKRNYLPLSPYGRYEVELQ 711
+ + ++ G ++ + Q + + R G + Y V L
Sbjct: 707 GVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWR-----GYAVLPYATEYRENRVALD 761

Query: 712 NSKNSLD 718
+ + +
Sbjct: 762 TNTLADN 768


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
UTI89_C0321 | INTIMIN | 549 | e-178 | Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 549 bits (1416), Expect = e-178
Identities = 234/832 (28%), Positives = 356/832 (42%), Gaps = 78/832 (9%)

Query: 41 PVMAARAQHAVQPRLSMENTTVTADNNVEKNVASLAANAGTFLSSQPDS-----DATRNF 95
P++AA +L+ + VT N + + AA L SQ S D ++
Sbjct: 131 PLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSRSLNGDYAKDT 190

Query: 96 ITGMATAKANQEIQEWLGKYGTARVKLNVDKNFSLKDSSLEMLYPIYDTPTNMLFTQGAI 155
G+A +A+ ++Q WL YGTA V L NF SSL+ L P YD+ + F Q
Sbjct: 191 ALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFD--GSSLDFLLPFYDSEKMLAFGQVGA 248

Query: 156 HRTDDRTQSNIGFGWRHFSENDWMAGVNTFIDHDLSRSHTRIGVGAEYWRDYLKLSANGY 215
D R +N+G G R F + M G N FID D S +TR+G+G EYWRDY K S NGY
Sbjct: 249 RYIDSRFTANLGAGQRFF-LPENMLGYNVFIDQDFSGDNTRLGIGGEYWRDYFKSSVNGY 307

Query: 216 IRASGWKKSPDVEDYQERPANGWDIRAEGYLPAWPQLGASLMYEQYYGDEVGLFGKDKRQ 275
R SGW +S + +DY ERPANG+DIR GYLP++P LGA LMYEQYYGD V LF DK Q
Sbjct: 308 FRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDNVALFNSDKLQ 367

Query: 276 KDPHAITAEVNYTPVPLLTLSAGHKQGKSGENDTRFGLEVNYRIGEPLEKQLDTDSIRER 335
+P A T VNYTP+PL+T+ ++ G END + ++ Y+ +P +Q++ + E
Sbjct: 368 SNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQQIEPQYVNEL 427

Query: 336 RMLAGSRYDLVERNNNIVLEYRKSEVIRIALPERIEGKGGQTVSLGLVVSKATHGLKNVQ 395
R L+GSRYDLV+RNNNI+LEY+K +++ + +P I G T + L+V K+ +GL +
Sbjct: 428 RTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTERSTQKIQLIV-KSKYGLDRIV 486

Query: 396 WEAPSLLAAGGKITGQG----NQWQVTLPAYQAGKDNYYAISAIAYDNKGNASKRVQTEV 451
W+ +L + GG+I G +Q LPAY G N Y ++A AYD GN+S V +
Sbjct: 487 WDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTI 546

Query: 452 VISGAGMSADRTALTLDGQSRIQMLANGNEQKPLVLSLR----DAEGQPVTGMKDQIKTE 507
+ G D+ +T + A+G E +++ PV+
Sbjct: 547 TVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVS--------- 597

Query: 508 LTFKPAGNIVTRTLKATKSQAKPTLGEFTETEAGVYQSVFTTGTQSGEATITVSVDDMSK 567
NIV+ T + + A +G + G+ ++ +M+
Sbjct: 598 ------FNIVSGTAVLSANSAN-------TNGSGKATVTLKSDK-PGQVVVSAKTAEMTS 643

Query: 568 TVTAELRATMMDVSNSTLSANEPSGDVVADGQQAYTLTLTAVDSEGNPVTGEASRLRLVP 627
+ A + S VA+GQ A T T+ V PV+ +
Sbjct: 644 ALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVK-VMKGDKPVSNQEVTF---- 698

Query: 628 QDTNGVTVGAIS--EIKPGVYSATVSSTRAGNVVVRAFSEQYQLGTLQQTLKFVAGP--- 682
T + + G T++ST G +V A + ++F
Sbjct: 699 -TTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTID 757

Query: 683 ----------LDAAHSSITLNPDK---PVVGGTVTAIWTAKDANDNPVTGLNPDAPSLSG 729
+ ++ L + GG W + + V + +L
Sbjct: 758 DGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDA-SSGQVTLKE 816

Query: 730 AAAAGSTASGWTDNGDGTWTAQISLGTTAGELDVMPKLNGQDAAANAAKVTVVADALSSN 789
+ +DN T+T T L V L S+
Sbjct: 817 KGTTTISVIS-SDNQTATYTIA-----TPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSS 870

Query: 790 QSKV-------SVAEDHVKAGESTTVTLVAKDAHGNAISGLSLSASLTGTAS 834
Q+++ A + S T+ + +A SG++ + L
Sbjct: 871 QNELENVFKAWGAANKYEYYKSSQTIISWVQQTAQDAKSGVASTYDLVKQNP 922



Score = 73.6 bits (180), Expect = 4e-15
Identities = 75/364 (20%), Positives = 120/364 (32%), Gaps = 39/364 (10%)

Query: 882 TVIAGEMSSANSTLVADNKTPTVKTTTELTFTMKDAYGNPVTGLKPDAPVFSGAASTGSE 941
V + ++ ++ AD T T + PV+ + SG A
Sbjct: 557 QVGVTDFTADKTSAKADGTEAITYTAT-VKKNGVAQANVPVSFN-----IVSGTAV---- 606

Query: 942 RPSAGNWTEKGNGVYVSTLTLGSAAGQLSVMPRVNGQNAVAQPLVLNVAGDASKAEIRDM 1001
SA + G+G TL + +A+ V+ V D +KA I ++
Sbjct: 607 -LSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFV--DQTKASITEI 663

Query: 1002 TVKVNNQLANGQSANQITLTV-VDSYGNPLQGQEVTLTLPQGVTSKTGNTVTTNAAGKVD 1060
+ANGQ A IT TV V P+ QEVT T G S + T T+ G
Sbjct: 664 KADKTTAVANGQDA--ITYTVKVMKGDKPVSNQEVTFTTTLGKLSNS--TEKTDTNGYAK 719

Query: 1061 IELMSTVAGELEIEASVKNSQKTVKVKFKADFSTGQASLEVDAA-AQKVANGKDAFTLTA 1119
+ L ST G+ + A V + VK F +L +D + V G T
Sbjct: 720 VTLTSTTPGKSLVSARVSDVAVDVKAPEVEFF----TTLTIDDGNIEIVGTGVKGKLPTV 775

Query: 1120 TVK-DQYGNLLPGAVVVFNLPRGVKPLADGNIMVNADKEGKAELKVVSVTAGTYEITASA 1178
++ Q G + A+ I G+ LK GT I+ +
Sbjct: 776 WLQYGQVNLKASGGNGKYTW-----RSANPAIASVDASSGQVTLK----EKGTTTISVIS 826

Query: 1179 GNDQPSNAQSVTFVADKTTATISSIEVIGNRAVADGKTKQTYKVTVTDANNNLLKDSEVT 1238
++Q T+ + I + D ++ N L++
Sbjct: 827 SDNQT-----ATYTIATPNSLI-VPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKA 880

Query: 1239 LTAS 1242
A+
Sbjct: 881 WGAA 884



Score = 54.7 bits (131), Expect = 2e-09
Identities = 46/249 (18%), Positives = 82/249 (32%), Gaps = 24/249 (9%)

Query: 1168 TAGTYEITASA----GNDQPSNAQSVTFVADKTTAT---ISSIEVIGNRAVADGKTKQTY 1220
+ Y++TA A GN + ++T +++ ++ A ADG TY
Sbjct: 521 GSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITY 580

Query: 1221 KVTVTDANNNLLKDSEVTLTASPENLVLTPNGTATTNEQGQAIFTATTTVAATYTLTAKV 1280
TV S + +A TN G+A T + ++AK
Sbjct: 581 TATVKKNGVAQANVPVSFNIVS--GTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAK- 637

Query: 1281 EQADGQESTKTAESKFVADDKNAVLAASPERVDSLVADGKTTATLTVTLMSGVNPVGGTM 1340
A+ + FV K ++ ++ + VA+G+ T TV +M G PV
Sbjct: 638 -TAEMTSALNANAVIFVDQTKASITEIKADK-TTAVANGQDAITYTVKVMKGDKPVSNQE 695

Query: 1341 WVDIEA--PEGVTEADYQFLPSKNDHFASGKITRTFSTNKPGTYTFTFNSLTYGGYEMKP 1398
+ K D +G T ++ PG + ++ ++K
Sbjct: 696 VTFTTTLGKLSNSTE-------KTD--TNGYAKVTLTSTTPGKSLVS-ARVSDVAVDVKA 745

Query: 1399 VTVTINAVP 1407
V
Sbjct: 746 PEVEFFTTL 754



Score = 45.8 bits (108), Expect = 1e-06
Identities = 55/368 (14%), Positives = 104/368 (28%), Gaps = 56/368 (15%)

Query: 779 VTVVADALSSNQSKV---SVAEDHVKAGESTTVTLVA------KDAHGNAISGLSLSASL 829
+TV+++ +Q V + + KA + +T A +S +S
Sbjct: 546 ITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVSG-- 603

Query: 830 TGTASEGATVSSWTEKGDGSYVAT--LTTGGKTGELRVMPLFNGQPAATEAAQLTVIAGE 887
A +S+ + +GS AT L + + A A + +
Sbjct: 604 ------TAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALN--ANAVIFVDQ 655

Query: 888 MSSANSTLVADNKTPTVKTTTELTFTMKDAY-GNPVTGLKPDAPVFSGAASTGSERPSAG 946
++ + + AD T +T+T+K PV+ + +T + S
Sbjct: 656 TKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEV-------TFTTTLGKLSNS 708

Query: 947 NWTEKGNGVYVSTLTLGSAAGQLSVMPRVNGQN-AVAQPLVLNVAG---DASKAEIRDMT 1002
NG TLT + G+ V RV+ V P V D EI
Sbjct: 709 TEKTDTNGYAKVTLT-STTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEI---- 763

Query: 1003 VKVNNQLANGQSANQITLTV-VDSYGNPLQGQEVTLTLPQGVTSKTGNTVTTNAAGKVDI 1061
+ G T+ + G T ++ ++G+V +
Sbjct: 764 ------VGTGVKGKLPTVWLQYGQVNLKASGGNGKYTW---RSANPAIASVDASSGQVTL 814

Query: 1062 ELMSTVAGELEIEASVKNSQKTVKVKFKADFSTGQASLEVDAAAQKVANGKDAFTLTATV 1121
+ G I ++Q + + + +
Sbjct: 815 K----EKGTTTISVISSDNQ---TATYTIATPNSLIVPNMSKRVT-YNDAVNTCKNFGGK 866

Query: 1122 KDQYGNLL 1129
N L
Sbjct: 867 LPSSQNEL 874


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0322  HTHTETR  28     0.028    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 28.4 bits (63), Expect = 0.028
Identities = 12/42 (28%), Positives = 19/42 (45%)

Query: 14 RQKILQQLLEWIECNLEHPISIEDIAQKSGYSRRNIQLLFRN 55
RQ IL L S+ +IA+ +G +R I F++
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKD 54


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0334  PRTACTNFAMLY  122    1e-30    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 122 bits (306), Expect = 1e-30
Identities = 112/509 (22%), Positives = 187/509 (36%), Gaps = 73/509 (14%)

Query: 330 LDINLSDSSVWKGKVSGAGDASVSLQNGSVWNVTGSSTVDALAVKDSTVNITKATVNTGT 389
LD+ L+ + W G S+S+ N W +T +S V AL + + G
Sbjct: 413 LDVALASQARWTGATRAVD--SLSIDNA-TWVMTDNSNVGALRLASDGSVDFQQPAEAGR 469

Query: 390 FA-------SQNGTLI----VDASSENTLDISGKASGDLRVY---------SAGSLDLIN 429
F + +G D + L + ASG R++ SA +L L+
Sbjct: 470 FKVLTVNTLAGSGLFRMNVFADLGLSDKLVVMQDASGQHRLWVRNSGSEPASANTLLLVQ 529

Query: 430 EQ----TAFISTGKDSTLKATGTTEGGLYQYDLTQGADGNFYFVKNTHK----------- 474
F KD G + G Y+Y L +G + V
Sbjct: 530 TPLGSAATFTLANKD------GKVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGP 583

Query: 475 -----------------------ASNASSVIQAMA-AAPANVANLQADTLSARQDAVRLS 510
++ A++ + + + +++ LS R +RL+
Sbjct: 584 QPPQPPQPQPEAPAPQPPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRLN 643

Query: 511 ENDKGGVWIQYFGGKQKHTTAGNASYDLDVNGVMLGGDTRFMTEDGSWLAGVAMSSAKGD 570
D GG W + F +Q+ +D V G LG D G W G +GD
Sbjct: 644 P-DAGGAWGRGFAQRQQLDNRAGRRFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGD 702

Query: 571 MT-TMQSKGDTEGYSFHAYLSRQYNNGIFIDTAAQFGHYSNTADVRLMNGGGTIKADFNT 629
T G T+ Y + ++G ++D + N V +G +K + T
Sbjct: 703 RGFTGDGGGHTDSVHVGGYATYIADSGFYLDATLRASRLENDFKVAGSDGY-AVKGKYRT 761

Query: 630 NGFGAMVKGGYTWKDGNGLFIQPYAKLSALTLEGVDYQL-NGVDVHSDSYNSVLGEAGTR 688
+G GA ++ G + +G F++P A+L+ G Y+ NG+ V + +SVLG G
Sbjct: 762 HGVGASLEAGRRFTHADGWFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLE 821

Query: 689 VGYDFAVGNA-TVKPYLNLAALNEFSDGNKVRLGDESVNASIDGAAFRVGAGVQADITKN 747
VG + V+PY+ + L EF V + + G +G G+ A + +
Sbjct: 822 VGKRIELAGGRQVQPYIKASVLQEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRG 881

Query: 748 MGAYASLDYTKGDDIENPLQGVVGINVTW 776
YAS +Y+KG + P G +W
Sbjct: 882 HSLYASYEYSKGPKLAMPWTFHAGYRYSW 910


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0342  HTHTETR  62     3e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 62.3 bits (151), Expect = 3e-14
Identities = 31/172 (18%), Positives = 58/172 (33%), Gaps = 15/172 (8%)

Query: 16 RRRQLIDATLEAINEVGMHDATIAQIARRAGVSTGIISHYFRDKNGLLEATMRDITSQLR 75
R+ ++D L ++ G+ ++ +IA+ AGV+ G I +F+DK+ L S +
Sbjct: 12 TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIG 71

Query: 76 DAVLNRLHALPQGSAELRLQAIVGGNFDETQVSSAAMKAWLAFWASSMHQP-------ML 128
+ L P L+ I+ + T V+ + +
Sbjct: 72 ELELEYQAKFPGDPLS-VLREILIHVLEST-VTEERRRLLMEIIFHKCEFVGEMAVVQQA 129

Query: 129 YRLQQVSSRRLLSNLVSEFRRE---LPRQQAQEAGYGLAALIDGL---WLRA 174
R + S + + + A + I GL WL A
Sbjct: 130 QRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFA 181


7    UTI89_C0407    UTI89_C0417    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C0407  1       22       -3.017662  shikimate kinase II
UTI89_C0408  0       25       -4.063409  hypothetical protein
UTI89_C0409  1       16       -2.337639  hypothetical protein
UTI89_C0410  1       19       -0.557141  hypothetical protein
UTI89_C0411  2       18       -0.011277  hypothetical protein
UTI89_C0412  4       17       1.013698   hypothetical protein
UTI89_C0413  3       15       1.324357   hypothetical protein
UTI89_C0414  2       15       1.380711   recombination associated protein
UTI89_C0415  1       17       1.590837   fructokinase
UTI89_C0416  1       15       1.351166   MFS transport protein AraJ
UTI89_C0417  2       16       1.662886   exonuclease SbcC
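
For readers who want to process the per-gene rows programmatically, the following Python sketch splits one row into its five fields. It is not part of PredictBias: the name `parse_gene_row` is hypothetical, and the field widths (a one-digit DNBias with an optional sign, a two-digit CDNBias, and a %GCBias printed with six decimals) are assumptions inferred from the rows in this report. The pattern accepts rows with or without whitespace between the fields.

```python
import re

# Assumed layout, inferred from this report rather than from PredictBias documentation:
# locus tag, DNBias (one digit, optional sign), CDNBias (two digits),
# %GCBias (six decimals), product name. Whitespace between fields is optional.
ROW_RE = re.compile(
    r"^(UTI89_C\d{4})\s*(-?\d)\s*(\d{2})\s*(-?\d+\.\d{6})\s*(.*)$"
)


def parse_gene_row(line):
    """Split one per-gene row into (locus_tag, dn_bias, cdn_bias, gc_bias, product)."""
    match = ROW_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized row: {line!r}")
    locus, dn, cdn, gc, product = match.groups()
    return locus, int(dn), int(cdn), float(gc), product


if __name__ == "__main__":
    print(parse_gene_row("UTI89_C0407 1 22 -3.017662 shikimate kinase II"))
```

Fixing the %GCBias field at six decimals is what keeps product names that begin with a digit from being absorbed into the number.
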
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0407  PF05272  28     0.029    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 27.7 bits (61), Expect = 0.029
Identities = 17/68 (25%), Positives = 25/68 (36%), Gaps = 6/68 (8%)

Query: 4 PLFLIGPRGCGKTTVGMALADSLNRRFVDTDLWL----QSQLNMTVAEIVEREEWAGFRA 59
+ L G G GK+T+ L F DT + S + E E FR
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGL--DFFSDTHFDIGTGKDSYEQIAGIVAYELSEMTAFRR 655

Query: 60 RETAALEA 67
+ A++A
Sbjct: 656 ADAEAVKA 663


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0415  ACETATEKNASE  30     0.015    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 29.8 bits (67), Expect = 0.015
Identities = 17/69 (24%), Positives = 29/69 (42%), Gaps = 10/69 (14%)

Query: 187 FISGTGFATDYRRLSGHALKGSEIIRLVEESDPVAELALRRYELRLAKSLAHVVNILDP- 245
+G ++D+R L A + D A+LAL + R+ K++ +
Sbjct: 273 VYGISGISSDFRDLEDAAF---------KNGDKRAQLALNVFAYRVKKTIGSYAAAMGGV 323

Query: 246 DVIVLGGGM 254
DVIV G+
Sbjct: 324 DVIVFTAGI 332


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0416  TCRTETA  51     4e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 51.0 bits (122), Expect = 4e-09
Identities = 74/356 (20%), Positives = 126/356 (35%), Gaps = 35/356 (9%)

Query: 33 ILSLALGTFGLGMAEFGIMGVLTELAHNVGISIPAAGH---MISYYALGVVVGAPIIALF 89
+ ++AL G+G+ IM VL L ++ S H +++ YAL AP++
Sbjct: 11 LSTVALDAVGIGL----IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 90 SSRYSLKHILLFLVALCVIGNAMFTLSSSYLMLAIGRLVSGFPHGAFFGVGAIVLSKIIK 149
S R+ + +LL +A + A+ + +L IGR+V+G GA + I
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIAD--IT 124

Query: 150 PGKVTAAVAGMVSGMTVANLLGIPLGTYLSQEFSWRYTFLLIAVFNIAVMASVYFWVPDI 209
G A G +S ++ P+ L FS F A N + F +P+
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 210 RDEAKGKLREQ----------FHFLRSPAPWLI--FAATMFGNAGVFAWFSYVKPYMMFI 257
+ LR + + A + F + G W +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG------E 238

Query: 258 SGFSETAMTFIMMLVGLGM---VLGNMLSGRISGRYSPLRIAAVTDFIIVLALLMLFFFG 314
F A T + L G+ + M++G ++ R R + ++L F
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 315 GMKTTSLIFAFICCAGLFALSAPLQILLLQNAKGGELLGAAGGQIAF--NLGSAVG 368
I + G+ LQ +L + E G G +A +L S VG
Sbjct: 299 RGWMAFPIMVLLASGGIG--MPALQAMLSRQV-DEERQGQLQGSLAALTSLTSIVG 351


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C0417  IGASERPTASE  39     2e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.5 bits (89), Expect = 2e-04
Identities = 40/264 (15%), Positives = 81/264 (30%), Gaps = 11/264 (4%)

Query: 162 LNAKPKERAELLEELTGTEIYGQISAMVFEQHKSARTELEKLQAQASGVALLTPEQVQSL 221
A P E E + E + Q S V + + A + + A Q+
Sbjct: 1029 APATPSETTETVAENSK-----QESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTN 1083

Query: 222 TASLQVLTDEEKQLITAQQQEQQSLNWLTRLD-ELQQEGSRRQQALQQALAEEEKAQPQL 280
+ +E Q ++ +++ E QE + + + E QPQ
Sbjct: 1084 EVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQA 1143

Query: 281 AALSLAQPARNLRPHWE---RIAEYSTALAHTRQQIEEVNTRLQSTMALRASIRHHAAKQ 337
P N++ A+ T +E+ T + + + +
Sbjct: 1144 EPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTT 1203

Query: 338 SAELQQQQQSLNAWLQEHDRLRQWNNELAGWRAQFSQQTSDREHLRQWQQQLTHAEQKLN 397
A Q S ++ ++ R + + ++DR + T+ L+
Sbjct: 1204 PATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPA-TTSSNDRSTVALCDLTSTNTNAVLS 1262

Query: 398 ALAAITLTLTADEVASALAQHAEQ 421
A + + V A++QH Q
Sbjct: 1263 DARAKAQFVALN-VGKAVSQHISQ 1285



Score = 33.9 bits (77), Expect = 0.005
Identities = 27/139 (19%), Positives = 54/139 (38%), Gaps = 13/139 (9%)

Query: 738 QQDVLAAQSLQKAQAQFDTALQASVFDDQQAFLAALMDEQTLTQLEQLKQNLENQRRQAQ 797
Q DV + S + A+ D A A E T T E KQ + + Q
Sbjct: 1004 QADVPSVPSNNEEIARVDEAPVPPP-------APATPSETTETVAENSKQESKTVEKNEQ 1056

Query: 798 TLVTQTAETLTQHQQHRPGGLSLTVTVEQIQQELAQTHQKLRENTTSQGEIRQQLKQDAD 857
TA+ ++ + V E+AQ+ + +E T++ + ++++
Sbjct: 1057 DATETTAQNREVAKEAKS-----NVKANTQTNEVAQSGSETKETQTTETKETATVEKEEK 1111

Query: 858 NRQQQQTLMQQIAQMTQQV 876
+ + + Q++ ++T QV
Sbjct: 1112 AKVETEK-TQEVPKVTSQV 1129
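
The statistics lines that precede each alignment in this report (`Score = ... bits (...), Expect = ...` and `Identities = a/b (p%)`) can be read with a short Python sketch like the one below. The function names `parse_expect` and `parse_hsp` are hypothetical and not part of PredictBias or BLAST. Reading an abbreviated value such as `e-136` as `1e-136` is an assumption about the notation used in this output.

```python
import re

# Patterns for the two statistics lines printed above each alignment.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
IDENT_RE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)")


def parse_expect(text):
    """Convert an Expect string to float; 'e-136' is read as '1e-136' (assumption)."""
    if text.startswith("e"):
        text = "1" + text
    return float(text)


def parse_hsp(score_line, ident_line):
    """Return bit score, raw score, E-value and fractional identity for one alignment."""
    score = SCORE_RE.search(score_line)
    ident = IDENT_RE.search(ident_line)
    if score is None or ident is None:
        raise ValueError("unrecognized statistics lines")
    matches, length = int(ident.group(1)), int(ident.group(2))
    return {
        "bits": float(score.group(1)),
        "raw": int(score.group(2)),
        "expect": parse_expect(score.group(3)),
        "identity": matches / length,
    }


if __name__ == "__main__":
    print(parse_hsp(
        "Score = 38.5 bits (89), Expect = 2e-04",
        "Identities = 40/264 (15%), Positives = 81/264 (30%), Gaps = 11/264 (4%)",
    ))
```

The identity is recomputed from the `a/b` counts rather than taken from the rounded percentage in parentheses, so the full precision is kept.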


8    UTI89_C0451    UTI89_C0464    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C0451  4       25       0.735105   protoheme IX farnesyltransferase
UTI89_C0452  2       21       0.927779   cytochrome o ubiquinol oxidase subunit IV
UTI89_C0453  1       22       0.968795   hypothetical protein
UTI89_C0454  0       21       0.924005   cytochrome o ubiquinol oxidase subunit III
UTI89_C0455  -1      19       0.588215   cytochrome o ubiquinol oxidase subunit I
UTI89_C0456  -1      14       -0.380560  cytochrome o ubiquinol oxidase subunit II
UTI89_C0457  0       15       -0.450094  muropeptide transporter
UTI89_C0458  4       27       -1.619476  hypothetical protein
UTI89_C0459  9       29       -1.290597  hypothetical protein
UTI89_C0460  7       27       -1.104682  transcriptional regulator BolA
UTI89_C0461  6       26       -0.611590  hypothetical protein
UTI89_C0462  6       25       0.067027   hypothetical protein
UTI89_C0463  5       27       0.072014   trigger factor
UTI89_C0464  3       23       0.249379   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0457  TCRTETA  39     4e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 38.7 bits (90), Expect = 4e-05
Identities = 71/347 (20%), Positives = 135/347 (38%), Gaps = 20/347 (5%)

Query: 62 KFLWSPLMDRYTPPFFGRRRGWLLATQILLLVAIAAMGFLEPGTQLRWMAALAVVIAFCS 121
+F +P++ + F RR LL + V A M W+ + ++A +
Sbjct: 56 QFACAPVLGALSDRF--GRRPVLLVSLAGAAVDYAIMAT----APFLWVLYIGRIVAGIT 109

Query: 122 ASQDIVFDAWKTDVLPAEERGAGAAISVLGYRLGMLVSGGLALWLADKWLGWQGMYWLMA 181
+ V A+ D+ +ER + GM+ L + ++ A
Sbjct: 110 GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG--FSPHAPFFAAA 167

Query: 182 AL-LIPCIIATLLAPEP--TDTIPVPKTLEQAVVAPLRDFFGRNNAWLILLLIVLYKLGD 238
AL + + L PE + P+ + + + A L+ + ++ +G
Sbjct: 168 ALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQ 227

Query: 239 AFAMSLTTTFLIRGVGFDAGEVGVVNKTLGLLATIVGALYGGILMQRLSLFRALLIFGIL 298
A +L F +DA +G+ G+L ++ A+ G + RL RAL+ G++
Sbjct: 228 VPA-ALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALM-LGMI 285

Query: 299 QGASNAGYWLLSITDKHLYSMGAAVFFENLCGGMGTSAFVALLMTLCNKSFSATQFALLS 358
A GY LL+ + + V GG+G A A+L ++ L+
Sbjct: 286 --ADGTGYILLAFATRGWMAFPIMVLL--ASGGIGMPALQAMLSRQVDEERQGQLQGSLA 341

Query: 359 ALSAVGRVYVGPVAGWFVEAHGWSTF--YLFSVAAAVPGLILLLVCR 403
AL+++ + VGP+ + A +T+ + + AA+ L L + R
Sbjct: 342 ALTSLTSI-VGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0458  PF06291  27     0.027    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 26.5 bits (58), Expect = 0.027
Identities = 11/34 (32%), Positives = 18/34 (52%)

Query: 3 KKILFPLVALFMLAGCAKPPTTIEVSPTITLPQQ 36
KK+LF ++ GCA+ T+ PT P++
Sbjct: 7 KKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKE 40


9    UTI89_C0477    UTI89_C0499    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C0477  -1      14       3.149592   multidrug ABC transporter ATP-binding protein
UTI89_C0478  -2      14       1.947084   nitrogen regulatory protein P-II 2
UTI89_C0479  -2      12       0.479944   ammonium transporter
UTI89_C0480  -1      13       -1.312379  acyl-CoA thioesterase
UTI89_C0481  -1      16       -2.037345  hypothetical protein
UTI89_C0482  0       18       -3.414158  hypothetical protein
UTI89_C0483  1       19       -4.969961  hypothetical protein
UTI89_C0484  -1      12       -1.699668  hypothetical protein
UTI89_C0485  2       17       -0.543747  hypothetical protein
UTI89_C0486  1       16       -0.743330  maltose O-acetyltransferase
UTI89_C0487  1       15       -0.146596  hemolysin expression-modulating protein
UTI89_C0488  1       15       0.093627   hypothetical protein
UTI89_C0489  1       16       0.963931   acridine efflux pump
UTI89_C0490  1       12       0.438245   acriflavine resistance protein A
UTI89_C0491  1       15       0.301304   DNA-binding transcriptional repressor AcrR
UTI89_C0492  3       16       2.390898   potassium efflux protein KefA
UTI89_C0493  4       17       3.902411   hypothetical protein
UTI89_C0494  4       17       3.868293   primosomal replication protein N''
UTI89_C0495  2       18       3.675070   hypothetical protein
UTI89_C0496  4       24       3.148087   adenine phosphoribosyltransferase
UTI89_C0497  4       25       2.889216   DNA polymerase III subunits gamma and tau
UTI89_C0498  3       26       1.580067   hypothetical protein
UTI89_C0499  2       22       1.396016   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0477  ACRIFLAVINRP  34     0.002    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 33.7 bits (77), Expect = 0.002
Identities = 21/96 (21%), Positives = 42/96 (43%), Gaps = 4/96 (4%)

Query: 130 AAVGVVQQLRTDVMDAA--LRQPLSEFDTQ-PVGQVISRVTNDTEVIRDLYVTVVATVLR 186
A +G+ + +D A ++ L+E P G + + T ++ VV T+
Sbjct: 287 AGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFE 346

Query: 187 SAALVGAMLVAMFSLDWRMALVAIMIFPVVLVVMVI 222
+ LV +++ +F + R L+ + PVVL+
Sbjct: 347 AIMLV-FLVMYLFLQNMRATLIPTIAVPVVLLGTFA 381


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0484  BCTERIALGSPF  29     0.035    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 29.4 bits (66), Expect = 0.035
Identities = 31/137 (22%), Positives = 54/137 (39%), Gaps = 24/137 (17%)

Query: 247 IWLPLGLVIGLLAAMFVLRILRRIQSPHHRLQDAIENRDICVHYQPIVSLANGKIVGAEA 306
W+ L L+ G +A +LR R+ + + P++ G+I
Sbjct: 228 PWMLLALLAGFMAFRVMLR------QEKRRVS-----FHRRLLHLPLI----GRIARGLN 272

Query: 307 LARWPQTDGSWLSPDSFIPLAQQTGLS-EPLTLLIIRSVFEDMGDWLRQHPQQHISINLE 365
AR+ +T + S +PL Q +S + ++ R D +R+ H + LE
Sbjct: 273 TARYARTLSILNA--SAVPLLQAMRISGDVMSNDYARHRLSLATDAVREGVSLHKA--LE 328

Query: 366 STVLTSEKIPQLLREMI 382
T L P ++R MI
Sbjct: 329 QTAL----FPPMMRHMI 341


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0489  ACRIFLAVINRP  1369   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1369 bits (3546), Expect = 0.0
Identities = 802/1033 (77%), Positives = 915/1033 (88%), Gaps = 1/1033 (0%)

Query: 1 MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISASYPGADAKTVQDT 60
M NFFI RPIFAWV+AII+M+AG LAIL+LPVAQYPTIAPPAV++SA+YPGADA+TVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQ 120
VTQVIEQNMNGIDNLMYMSS SDS G+V ITLTF+SGTD DIAQVQVQNKLQLA PLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVSVEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDAISRTSGVGDVQLFGS 180
EVQQQG+SVEKSSSS+LMV G ++ + TQ+DISDYVA+N+KD +SR +GVGDVQLFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWMNPNELNKFQLTPVDVITAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQTRL 240
QYAMRIW++ + LNK++LTPVDVI +K QN Q+AAGQLGGTP + GQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 TSTEEFGKILLKVNQDGSRVLLRDVAKIELGGENYDIIAEFNGQPASGLGIKLATGANAL 300
+ EEFGK+ L+VN DGS V L+DVA++ELGGENY++IA NG+PA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTAAAIRAELAKMEPFFPSGLKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFLQ 360
DTA AI+A+LA+++PFFP G+K++YPYDTTPFV++SIHEVVKTL EAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTFA+LAAFG+SINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 AEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGSTGAIYRQFSITIVSAMAL 480
E+ LPPKEAT KSM QIQGALVGIAMVLSAVF+PMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATMLKPIAKGDHGEGKKGFFGWFNRMFEKSTHHYTDSVGGILRSTGR 540
SVLVALILTPALCAT+LKP++ H E K GFFGWFN F+ S +HYT+SVG IL STGR
Sbjct: 481 SVLVALILTPALCATLLKPVSAE-HHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 541 YLVLYLIIVVGMAYLFVRLPSSFLPDEDQGVFMTMVQLPAGATQERTQKVLNEVTHYYLT 600
YL++Y +IV GM LF+RLPSSFLP+EDQGVF+TM+QLPAGATQERTQKVL++VT YYL
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 601 KEKNNVESVFAVNGFGFAGRGQNTGIAFVSLKDWADRPGEENKVEAITMRATRAFSQIKD 660
EK NVESVF VNGF F+G+ QN G+AFVSLK W +R G+EN EA+ RA +I+D
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 661 AMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLLAEAAKHPDMLTSVRPNG 720
V FN+PAIVELGTATGFDFELIDQAGLGH+ LTQARNQLL AA+HP L SVRPNG
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 721 LEDTPQFKIDIDQEKAQALGVSINDINTTLGAAWGGSYVNDFIDRGRVKKVYVMSEAKYR 780
LEDT QFK+++DQEKAQALGVS++DIN T+ A GG+YVNDFIDRGRVKK+YV ++AK+R
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFR 779

Query: 781 MLPDDIGDWYVRAADGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILGQAAPGKSTGEA 840
MLP+D+ YVR+A+G+MVPFSAF++S W YGSPRLERYNGLPSMEI G+AAPG S+G+A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 841 MELMEQLASKLPTGVGYDWTGMSYQERLSGNQAPSLYAISLIVVFLCLAALYESWSIPFS 900
M LME LASKLP G+GYDWTGMSYQERLSGNQAP+L AIS +VVFLCLAALYESWSIP S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 901 VMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGL 960
VMLVVPLG++G LLAAT NDVYF VGLLTTIGLSAKNAILIVEFAKDLM+KEGKG+
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 961 IEATLDAVRMRLRPILMTSLAFILGVMPLVISTGAGSGAQNAVGTGVMGGMVTATVLAIF 1020
+EATL AVRMRLRPILMTSLAFILGV+PL IS GAGSGAQNAVG GVMGGMV+AT+LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1021 FVPVFFVVVRRRF 1033
FVPVFFVV+RR F
Sbjct: 1020 FVPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C0490  RTXTOXIND  44     6e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.4 bits (105), Expect = 6e-07
Identities = 33/212 (15%), Positives = 71/212 (33%), Gaps = 23/212 (10%)

Query: 112 TYQAAYDSAKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTA 171
+ Y A +L + + Q+ Q +++ ++ L +Q +
Sbjct: 256 EQENKYVEAVNELR--VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 172 AKAAVETARINLAYTKVTSPISGRIGKSNV-TEGALVQNGQATALATVQQLDPIYVDVTQ 230
+ + + +P+S ++ + V TEG +V + T + V + D + V
Sbjct: 314 LTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTALV 372

Query: 231 SSNDFLRLKQELA----------NGTLKQENGKAKVSLITSDGIKFPQDGTLEFSDVTVD 280
+ D + KV I D I+ + G + ++++
Sbjct: 373 QNKDIGFINVGQNAIIKVEAFPYTRYGYLV---GKVKNINLDAIEDQRLGLVFNVIISIE 429

Query: 281 QTTGSITLRAIFPNPDHTLLPGMFVRARLEEG 312
+ S + I L GM V A ++ G
Sbjct: 430 ENCLSTGNKNIP------LSSGMAVTAEIKTG 455



Score = 34.4 bits (79), Expect = 7e-04
Identities = 24/125 (19%), Positives = 43/125 (34%), Gaps = 13/125 (10%)

Query: 61 PLQITTELPGR-TSAYRIAEVRPQVSGIILKRNFKEGSDIEAGVSLYQIDPATYQAAYDS 119
++I G+ T + R E++P + I+ + KEG + G L ++ +A
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA---- 134

Query: 120 AKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTAAKAAVETA 179
D K Q++ A+L RYQ L E ++
Sbjct: 135 ---DTLKTQSSLLQARLEQTRYQILS-----RSIELNKLPELKLPDEPYFQNVSEEEVLR 186

Query: 180 RINLA 184
+L
Sbjct: 187 LTSLI 191


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0491  HTHTETR  222    5e-76    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 222 bits (567), Expect = 5e-76
Identities = 215/215 (100%), Positives = 215/215 (100%)

Query: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60
MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120
EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180
GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215
APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C0492  RTXTOXIND  32     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.017
Identities = 19/125 (15%), Positives = 40/125 (32%), Gaps = 6/125 (4%)

Query: 28 QNTAFARASSNGDLPTKADLQAQLDSLNKQKDLSAQDKLVQQDLTDTLATLDKIDRVKEE 87
N RA L + + L L+ + A L++ ++ E
Sbjct: 207 LNLDKKRAERLTVLARINRYENLSRVEKSR--LDDFSSLLHKQAIAKHAVLEQENKYVEA 264

Query: 88 TVQLRQKVAEAPEKMRQATAALTALSDVDND--EETRKIL--STLSLRQLETRVAQALDD 143
+LR ++ + + +A V E L +T ++ L +A+ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEER 324

Query: 144 LQNAQ 148
Q +
Sbjct: 325 QQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C0497  IGASERPTASE  39     9e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.5 bits (89), Expect = 9e-05
Identities = 40/251 (15%), Positives = 78/251 (31%), Gaps = 31/251 (12%)

Query: 404 PLPETTSQVLAARQQLQRVQGATKAKKSEPAA----ATRARPVNNAALERLASVTDRVQA 459
P E +Q + + + P+ AR + A + A T
Sbjct: 983 PEVEKRNQTVDTTN----ITTPNNIQADVPSVPSNNEEIARV-DEAPVPPPAPATPSETT 1037

Query: 460 RPVPSALEKAPAKKEAYRWKATTPVMQQKE--------VVATPKALKKA---LEHEKTPE 508
V ++ E AT Q +E V A + + A E ++T
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 509 LAAKLAA---------EAIERDPWAAQVSQLSLPKLVEQVALNAWKE-ESDNAVCLHLRS 558
K A E+ +V+ PK + + E +N ++++
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 559 SQRHLNNRGAQQKLAEALST-LKGSTVELTIVEDDNPAVRTPLEWRQAIYEEKLAQARES 617
Q N ++ A+ S+ ++ E T V N V P A + + +
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 618 IIADNNIQTLR 628
+ + +++R
Sbjct: 1218 KPKNRHRRSVR 1228


10    UTI89_C0526    UTI89_C0562    Y    Y    Y    Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C0526  0       18       3.211855   metal resistance protein
UTI89_C0527  1       17       4.361731   thioredoxin-like protein
UTI89_C0528  0       16       3.935132   short chain dehydrogenase
UTI89_C0529  1       16       3.748923   multifunctional acyl-CoA thioesterase I and
UTI89_C0530  0       15       3.688571   ABC transporter ATP-binding protein
UTI89_C0531  -2      12       3.176122   hypothetical protein
UTI89_C0532  -2      15       0.470520   tRNA 2-selenouridine synthase
UTI89_C0533  -2      15       -0.371677  DNA-binding transcriptional activator AllS
UTI89_C0534  0       17       -1.454677  ureidoglycolate hydrolase
UTI89_C0535  -1      17       -1.172790  DNA-binding transcriptional repressor AllR
UTI89_C0536  1       16       -1.261442  glyoxylate carboligase
UTI89_C0537  2       17       -1.079351  hydroxypyruvate isomerase
UTI89_C0538  2       15       -0.871048  2-hydroxy-3-oxopropionate reductase
UTI89_C0539  2       14       -0.719768  allantoin permease
UTI89_C0540  2       13       0.245770   allantoinase
UTI89_C0541  3       17       1.168449   purine permease YbbY
UTI89_C0542  2       15       2.204573   glycerate kinase II
UTI89_C0543  2       14       2.400170   hypothetical protein
UTI89_C0544  1       15       3.704605   allantoate amidohydrolase
UTI89_C0545  0       15       4.518102   ureidoglycolate dehydrogenase
UTI89_C0546  1       16       5.283123   membrane protein FdrA
UTI89_C0547  1       16       4.883285   hypothetical protein
UTI89_C0548  1       15       4.208139   hypothetical protein
UTI89_C0549  1       16       2.743141   carbamate kinase
UTI89_C0550  1       20       2.148933   phosphoribosylaminoimidazole carboxylase ATPase
UTI89_C0551  2       18       1.398330   phosphoribosylaminoimidazole carboxylase
UTI89_C0552  2       18       0.742436   hypothetical protein
UTI89_C0553  3       21       1.552728   UDP-2,3-diacylglucosamine hydrolase
UTI89_C0554  3       21       1.755250   peptidyl-prolyl cis-trans isomerase B
UTI89_C0555  2       19       2.134290   cysteinyl-tRNA synthetase
UTI89_C0556  1       19       1.040025   hypothetical protein
UTI89_C0557  2       23       0.151788   hypothetical protein
UTI89_C0558  2       19       -0.360405  hypothetical protein
UTI89_C0559  1       21       -2.970090  bifunctional 5,10-methylene-tetrahydrofolate
UTI89_C0560  5       31       -6.411140  hypothetical protein
UTI89_C0562  2       24       -4.852609  *prophage DLP12 integrase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0528  DHBDHDRGNASE  78     4e-19    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 77.8 bits (191), Expect = 4e-19
Identities = 49/212 (23%), Positives = 81/212 (38%), Gaps = 7/212 (3%)

Query: 16 KSVLITGCSSGIGLESALELKRQGFHVLAGCRKPDDVERMNS----MGFTGVLIDLDL-- 69
K ITG + GIG A L QG H+ A P+ +E++ S D+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 70 PESVDRAADEVIALTDNCLYGIFNNAGFGMYGPLSTISRAQMEQQFSANFFGAHQLTMRL 129
++D + + + N AG G + ++S + E FS N G + +
Sbjct: 69 SAAIDEITARIEREMGP-IDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 130 LPAMLPHGEGRIVMTSSVMGLISTPGRGAYAASKYALEAWSDALRMELRYSGIKVSLIEP 189
M+ G IV S + AYA+SK A ++ L +EL I+ +++ P
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 190 GPIRTRFTDNVNQTQSDKPVENPGIAARFTLG 221
G T ++ ++ G F G
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGSLETFKTG 219


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0530  PF05272  29     0.013    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.013
Identities = 12/20 (60%), Positives = 13/20 (65%)

Query: 41 LVGESGSGKSTLLAILAGLD 60
L G G GKSTL+ L GLD
Sbjct: 601 LEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0535  PF09025  28     0.019    YopR Core
		>PF09025#YopR Core

Length = 143

Score = 28.4 bits (63), Expect = 0.019
Identities = 17/61 (27%), Positives = 25/61 (40%), Gaps = 8/61 (13%)

Query: 126 EAVLIGQLECKSMVRMCAPLGSR--------LPLHASGAGKALLYPLAEEELMSIILQTG 177
+ + +LE K+M+R PLG + L G L LA EL +I G
Sbjct: 68 QGLEADRLELKAMLRAELPLGRQQQTFLLQLLGAVEHAPGGEYLAQLARRELQVLIPLNG 127

Query: 178 L 178
+
Sbjct: 128 M 128


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C0540  UREASE  56     1e-10    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 56.3 bits (136), Expect = 1e-10
Identities = 39/163 (23%), Positives = 60/163 (36%), Gaps = 32/163 (19%)

Query: 4 DLIIKNGTVILENEARVVDIAVKDGKIAAIG-------QD-----LGDAKDVMDASGLVV 51
D +I N ++ DI +KDG+IAAIG Q +G +V+ G +V
Sbjct: 69 DTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIV 128

Query: 52 SPGMVDAHTHISEPGRSHWEGYETGTRAAAKGGITTMIEMPLNQLPATVDRAS------- 104
+ G +D+H H P + A G+T M+ PA A+
Sbjct: 129 TAGGMDSHIHFICPQQIE---------EALMSGLTCMLGGGTG--PAHGTLATTCTPGPW 177

Query: 105 -IELKFDAAKGKLTIDAAQLGGLVSYNIDRLHELDEVGVVGFK 146
I +AA ++ A G + L E+ G K
Sbjct: 178 HIARMIEAADA-FPMNLAFAGKGNASLPGALVEMVLGGATSLK 219


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0549  CARBMTKINASE  383    e-136    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 383 bits (984), Expect = e-136
Identities = 125/310 (40%), Positives = 175/310 (56%), Gaps = 16/310 (5%)

Query: 2 KTLVVALGGNALLQRGEALTAKNQYRNIASAVPALARL-ARSYRLAIVHGNGPQVGLLAL 60
K +V+ALGGNAL QRG+ + + N+ +A + AR Y + I HGNGPQVG L L
Sbjct: 3 KRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLLL 62

Query: 61 QNLAWKE---VEPYPLDVLVAESQGMIGYMLAQGLSAQPQM----PPVTTVLTRIEVSPD 113
A + + P+DV A SQG IGYM+ Q L + + V T++T+ V +
Sbjct: 63 HMDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVDKN 122

Query: 114 DPAFLQPEKFIGPVYQPEEQEALEAAYGWQMKRD-GKYLRRVVASPQPRKILDSEAIELL 172
DPAF P K +GP Y E + L GW +K D G+ RRVV SP P+ +++E I+ L
Sbjct: 123 DPAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKL 182

Query: 173 LKEGHVVICSGGGGVPVAEDG---AGSEAVIDKDLAAALLAEQINADGLVILTDADAVYE 229
++ G +VI SGGGGVPV + G EAVIDKDLA LAE++NAD +ILTD +
Sbjct: 183 VERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGAAL 242

Query: 230 NWGTPQQRAIRRATPDELAPFAKAD----GSMGPKVTAVSGYVRSRGKPAWIGALSRIEE 285
+GT +++ +R +EL + + GSMGPKV A ++ G+ A I L + E
Sbjct: 243 YYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAIIAHLEKAVE 302

Query: 286 TLAGEAGTCI 295
L G+ GT +
Sbjct: 303 ALEGKTGTQV 312


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C0555  RTXTOXIND  29     0.030    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.4 bits (66), Expect = 0.030
Identities = 16/150 (10%), Positives = 44/150 (29%), Gaps = 8/150 (5%)

Query: 299 RSQLNYSEENLKQARAALERLYTALRGTDKTVAPAGGEAFEARFIEAMDDDFNTP----- 353
+ ++ +L QAR R R + P E F +++
Sbjct: 133 EADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIK 192

Query: 354 EAYSVLFDMAREVNRLKAEDMAAANAMASHLRKLSAVLGLLEQEPEAFLQSGAQADDSEV 413
E +S + + + A + + + + + + + + F +
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSS---LLHKQAI 249

Query: 414 AEIEALIQQRLDARKAKDWAAADAARDRLN 443
A+ L Q+ + + +++
Sbjct: 250 AKHAVLEQENKYVEAVNELRVYKSQLEQIE 279


11    UTI89_C0583    UTI89_C0617    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C0583  -2      12       3.117582   phosphopantetheinyltransferase component of
UTI89_C0584  -2      12       2.387949   outer membrane receptor FepA
UTI89_C0585  -1      13       2.444469   hypothetical protein
UTI89_C0586  -1      13       3.030585   enterobactin/ferric enterobactin esterase
UTI89_C0587  0       14       3.643734   hypothetical protein
UTI89_C0588  0       14       3.984958   enterobactin synthase subunit F
UTI89_C0589  1       14       3.506664   ferric enterobactin transport protein FepE
UTI89_C0590  0       14       5.504189   iron-enterobactin transporter ATP-binding
UTI89_C0591  0       16       5.641904   iron-enterobactin transporter permease
UTI89_C0592  -1      17       5.281073   iron-enterobactin transporter membrane protein
UTI89_C0593  -1      18       4.832811   enterobactin exporter EntS
UTI89_C0594  -2      18       4.671200   iron-enterobactin transporter periplasmic
UTI89_C0595  -2      22       5.037628   isochorismate synthase
UTI89_C0596  -2      24       5.053019   enterobactin synthase subunit E
UTI89_C0597  -1      21       4.836969   2,3-dihydro-2,3-dihydroxybenzoate synthetase
UTI89_C0598  -1      19       4.701196   2,3-dihydroxybenzoate-2,3-dehydrogenase
UTI89_C0599  0       19       3.464647   hypothetical protein
UTI89_C0600  0       16       1.642317   carbon starvation protein A
UTI89_C0601  0       16       -2.836573  hypothetical protein
UTI89_C0602  -1      15       -3.186807  hypothetical protein
UTI89_C0603  -2      16       -4.502932  aminotransferase
UTI89_C0604  -2      18       -4.529632  hypothetical protein
UTI89_C0605  -2      18       -4.404065  hypothetical protein
UTI89_C0606  -1      19       -3.671876  transcriptional regulator YbdO
UTI89_C0607  0       22       -0.558661  disulfide isomerase/thiol-disulfide oxidase
UTI89_C0608  0       23       0.107315   alkyl hydroperoxide reductase
UTI89_C0609  0       15       0.153225   alkyl hydroperoxide reductase
UTI89_C0610  0       12       0.412927   universal stress protein UspG
UTI89_C0611  0       12       1.659019   hypothetical protein
UTI89_C0612  -1      15       2.480000   nucleoside diphosphate kinase regulator
UTI89_C0613  -2      15       2.963275   ribonuclease I
UTI89_C0614  -2      18       3.895571   citrate DASS carrier/transporter
UTI89_C0615  -1      23       5.178986   triphosphoribosyl-dephospho-CoA synthase
UTI89_C0616  0       27       5.509256   2'-(5''-triphosphoribosyl)-3'-dephospho-CoA:apo-
UTI89_C0617  -1      26       4.015256   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0583  ENTSNTHTASED  269    7e-94    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 269 bits (690), Expect = 7e-94
Identities = 109/184 (59%), Positives = 132/184 (71%), Gaps = 1/184 (0%)

Query: 51 MKTTHTSLPFAGHTLHFVEFDPASFREQDLLWLPHYAQLQHAGRKRKTEHLAGRIAAIYA 110
M T+H LPFAGH LH V+FD +SFRE DLLWLPH+ +L+ AGRKRK EHLAGRIAA++A
Sbjct: 1 MLTSHFPLPFAGHRLHIVDFDASSFREHDLLWLPHHDRLRSAGRKRKAEHLAGRIAAVHA 60

Query: 111 LREYGYKCVPAIGELRQPVWPAGVYGSISHCGTTALAVVSRQPIGIDIEEIFSAQTAREL 170
LRE G + VP +G+ RQP+WP G++GSISHC TTALAV+SRQ IGIDIE+I S TA EL
Sbjct: 61 LREVGVRTVPGMGDKRQPLWPDGLFGSISHCATTALAVISRQRIGIDIEKIMSQHTATEL 120

Query: 171 TDNIITPAEHKRLADCGLAFPLALTLAFSAKESAFKA-SEIQAAQGFLDYQIISWNKQQI 229
+II E + L L FPLALTLAFSAKES +KA S+ GF ++ S I
Sbjct: 121 APSIIDSDERQILQASLLPFPLALTLAFSAKESVYKAFSDRVTLPGFNSAKVTSLTATHI 180

Query: 230 IIRL 233
+ L
Sbjct: 181 SLHL 184


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0593  TCRTETA       36     2e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.3 bits (84), Expect = 2e-04
Identities = 82/394 (20%), Positives = 146/394 (37%), Gaps = 38/394 (9%)

Query: 27 FISIVSLGLLGVAVPVQIQMMTHSTWQV---GLSVTLTGGAMFVGLMVGGVLADRYERKK 83
+ V +GL+ +P ++ + HS G+ + L F V G L+DR+ R+
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 84 VILLARGTCGIGFIGLCLNALL--PEPSLLAIYLLGLWDGFFASLGVTALLAATPALVGR 141
V+L + G ++ + P L +Y+ + G + G A A +
Sbjct: 75 VLL-------VSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVA-GAYIADITDG 126

Query: 142 ENLMQAGAITMLTVRLGSVISPMIGGLLLATGGVAWNYGLAAAGTFITLLPLLSLPALPP 201
+ + G V P++GGL+ GG + + AA L L LP
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLM---GGFSPHAPFFAAAALNGLNFLTGCFLLPE 183

Query: 202 PPQPREHPLK----SLLAGFRFLLASPLVGGIALLGGLLTMAS----AVRVLYPALADNW 253
+ PL+ + LA FR+ +V + + ++ + A+ V++ D +
Sbjct: 184 SHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG--EDRF 241

Query: 254 QMSAAQIGFLYAAIP-LGAAIGALTSGKLAHSVRPGLLMLLSTLG---AFLAIGLFGLMP 309
A IG AA L + A+ +G +A + ++L + ++ +
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 310 MWILGVVCLALFGWLSAVSSLLQYTMLQTQTPEAMLGRINGLWTAQNVTGDAIGAALLGG 369
M +V LA G ML Q E G++ G A +G L
Sbjct: 302 MAFPIMVLLASGGIGMPALQ----AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTA 357

Query: 370 LGAMMTPVASASASGFGLLIIGVLLLLVLVELRR 403
+ A + + +G+ + L LL L LRR
Sbjct: 358 IYA----ASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0594  FERRIBNDNGPP  63     2e-13    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 62.7 bits (152), Expect = 2e-13
Identities = 60/280 (21%), Positives = 100/280 (35%), Gaps = 35/280 (12%)

Query: 40 HTLESQPQRIVSTSVTLTGSLLAIDAPVIASGATTPNNRVADDQGFLRQWSKVAKERKLQ 99
H P RIV+ LLA+ VAD + R W E L
Sbjct: 29 HAAAIDPNRIVALEWLPVELLLALGIVPYG---------VADTINY-RLW---VSEPPLP 75

Query: 100 RLYIG-----EPSAEAVAAQMPDLILISATGGDSALALYDQLSTIAPTLIINYDDKS--- 151
I EP+ E + P ++ SA G S + L+ IAP N+ D
Sbjct: 76 DSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPL 131

Query: 152 --WQSLLTQLGEITGHEKQAAERIAQFDKQLAAAKEQIKLPPQPVTAIVYTAAAHSANLW 209
+ LT++ ++ + A +AQ++ + + K + + ++
Sbjct: 132 AMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVF 191

Query: 210 TPESAQGQMLEQLGFTLAKLPAGLNASQSQGKRHDIIQLGGENLAAGLNGESLFLFAGDQ 269
P S ++L++ G NA Q + + + LAA + + L +
Sbjct: 192 GPNSLFQEILDEYGIP--------NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNS 243

Query: 270 KDADAIYANPLLAHLPAVQNKQVYALGTETFRLDYYSAMQ 309
KD DA+ A PL +P V+ + + F SAM
Sbjct: 244 KDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMH 283


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0597  ISCHRISMTASE  444    e-161    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 444 bits (1142), Expect = e-161
Identities = 146/299 (48%), Positives = 195/299 (65%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQAYALPESHDIPQNKVDWAFEPQRAALLIHDMQDYFVSFWGENCPMMEQVIANI 60
MAIP +Q Y +P + D+PQNKV W +P RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRDYCKQHNIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVDRLTPDADDTV 120
L++ C Q IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P+ DD V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKESGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M+++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSRDEHLMSLKYVAGRSGRVVMTEELL------PAPIPASKA-----------ALREVIL 223
FS ++H M+L+Y AGR VMT+ LL PA + + A +R+ I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDESDEPFDDD-NLIDYGLDSVRMMALAARWRKVHGDIDFVMLAKNPTIDAWWKLLS 281
LL E+ E D +L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W KLL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0598  DHBDHDRGNASE  365    e-131    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 365 bits (937), Expect = e-131
Identities = 111/258 (43%), Positives = 150/258 (58%), Gaps = 20/258 (7%)

Query: 5 GKNVWITGAGKGIGYATALAFVEAGAKVTGFD---------------QAFTQEQYPFATE 49
GK +ITGA +GIG A A GA + D +A E +P
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP---- 63

Query: 50 VMDVADAAQVAQVCQRLLAETERLDVLVNAAGILRMGATDQLSKEDWQQTFAVNVGGAFN 109
DV D+A + ++ R+ E +D+LVN AG+LR G LS E+W+ TF+VN G FN
Sbjct: 64 -ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 110 LFQQTMNQFRRQRGGAIVTVASDAAHTPRIGMSAYGASKAALKSLALSVGLELAGSGVRC 169
+ +R G+IVTV S+ A PR M+AY +SKAA +GLELA +RC
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 170 NVVSPGSTDTDMQRTLWVSDDAEEQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDL 229
N+VSPGST+TDMQ +LW ++ EQ I+G E FK GIPL K+A+P +IA+ +LFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 230 ASHITLQDIVVDGGSTLG 247
A HIT+ ++ VDGG+TLG
Sbjct: 243 AGHITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0607  BCTLIPOCALIN  29     0.013    Bacterial lipocalin signature.
		>BCTLIPOCALIN#Bacterial lipocalin signature.

Length = 171

Score = 28.8 bits (64), Expect = 0.013
Identities = 18/98 (18%), Positives = 39/98 (39%), Gaps = 13/98 (13%)

Query: 30 QGITIIKTFDAPGGMKGYLGKYQDMGVTIYLTPDGKHAISG--YMYNEKGENLSNTLIEK 87
+ + + F+ YLGK+ ++ + G ++ + N+ G ++ N
Sbjct: 21 ESVKPVSDFEL----NNYLGKWYEVARLDHSFERGLSQVTAEYRVRNDGGISVLN----- 71

Query: 88 EIYAPAGREMWQRMEQSHWLLDGKKDAPVIVYVFADPF 125
Y+ + W+ E + ++G D + V F PF
Sbjct: 72 RGYSEE-KGEWKEAEGKAYFVNGSTDGYLKVSFFG-PF 107


12  UTI89_C0717  UTI89_C0735  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C0717  2       25       2.917373    succinate dehydrogenase cytochrome b556 large
UTI89_C0718  1       28       3.106549    succinate dehydrogenase cytochrome b556 small
UTI89_C0719  1       30       3.093141    succinate dehydrogenase flavoprotein subunit
UTI89_C0720  0       25       1.497201    succinate dehydrogenase iron-sulfur subunit
UTI89_C0721  0       24       0.622251    2-oxoglutarate dehydrogenase E1
UTI89_C0722  -1      24       -0.708207   dihydrolipoamide succinyltransferase
UTI89_C0723  0       22       -1.023291   succinyl-CoA synthetase subunit beta
UTI89_C0724  1       22       -0.997454   succinyl-CoA synthetase subunit alpha
UTI89_C0725  1       19       -1.357638   hypothetical protein
UTI89_C0726  2       23       -0.107561   hypothetical protein
UTI89_C0727  3       24       1.164869    cytochrome d terminal oxidase, polypeptide
UTI89_C0728  1       22       0.861676    hypothetical protein
UTI89_C0729  0       18       0.253082    cytochrome d terminal oxidase polypeptide
UTI89_C0730  7       15       -0.263783   hypothetical protein
UTI89_C0731  2       17       0.289332    hypothetical protein
UTI89_C0732  2       18       0.015251    acyl-CoA thioester hydrolase
UTI89_C0733  2       22       0.084283    colicin uptake protein TolQ
UTI89_C0734  3       22       -0.039146   colicin uptake protein TolR
UTI89_C0735  3       19       -0.194235   cell envelope integrity inner membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0720  TCRTETOQM     31     0.003    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 31.4 bits (71), Expect = 0.003
Identities = 11/41 (26%), Positives = 23/41 (56%), Gaps = 1/41 (2%)

Query: 14 VDDAPRMQDYTLEAEEGRDM-MLLDALIQLKEKDPSLSFRR 53
+++ + T+E + + MLLDAL+++ + DP L +
Sbjct: 339 IENPLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYV 379


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0725  SYCDCHAPRONE  28     0.037    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD

chaperone signature.
Length = 168

Score = 28.0 bits (62), Expect = 0.037
Identities = 18/65 (27%), Positives = 26/65 (40%)

Query: 255 AMQNSGDTQLARKYNREGEAVYKTGQLEQAIQLFQQATELDGNYGQAFSNLGLAYQKNGN 314
AM N + + Y++G+ E A ++FQ LD + F LG Q G
Sbjct: 26 AMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQ 85

Query: 315 IAEAI 319
AI
Sbjct: 86 YDLAI 90


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0735  IGASERPTASE   64     8e-13    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 63.5 bits (154), Expect = 8e-13
Identities = 35/203 (17%), Positives = 67/203 (33%), Gaps = 10/203 (4%)

Query: 99 EQERLKQLEKERLAAQEQKKQAEEAAKQAELKQKQAEEAAAKAAADAKAKAEADAKAAEE 158
E E+ Q QA+ + + ++ + A +E AE
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 159 AAKKAAADAKKKAEAEAAKA-----AVEAQKKAEAAAAALKK---KAEAAEAAAAEARKK 210
+ +++ K + +A A A EA+ +A + +E E E ++
Sbjct: 1044 SKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKET 1103

Query: 211 AATEAAEKAKAEAEKKAAAEKAAADKKAAADKKAAEKAAAEKAAADKKA--AAEKAAADK 268
A E EKAK E EK K + ++ + AE A + E +
Sbjct: 1104 ATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTN 1163

Query: 269 KAAAAKAAAEKAAAAKAAAEADD 291
A + A++ ++ +
Sbjct: 1164 TTADTEQPAKETSSNVEQPVTES 1186



Score = 57.0 bits (137), Expect = 1e-10
Identities = 30/219 (13%), Positives = 82/219 (37%), Gaps = 11/219 (5%)

Query: 68 QSQESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQA 127
Q+ S ++E+ ++ +E ++ + +K E+ A +
Sbjct: 1004 QADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKN--EQDATET 1061

Query: 128 ELKQKQ-AEEAAAKAAAD------AKAKAEADAKAAEEAAKKAAADAKKKAEAEAAKAAV 180
+ ++ A+EA + A+ A++ +E E + A + ++KA+ E K
Sbjct: 1062 TAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQE 1121

Query: 181 EAQKKAEAAAAALKKKAEAAEAAAAEARKKAATEAAEKAKAEAEKKAAAEKAAADKKAAA 240
+ ++ + ++++E + A AR+ T ++ +++ A E+ A + +
Sbjct: 1122 VPKVTSQVSPK--QEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNV 1179

Query: 241 DKKAAEKAAAEKAAADKKAAAEKAAADKKAAAAKAAAEK 279
++ E + + A + ++ K
Sbjct: 1180 EQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNK 1218



Score = 56.6 bits (136), Expect = 1e-10
Identities = 31/229 (13%), Positives = 70/229 (30%), Gaps = 9/229 (3%)

Query: 66 RMQSQESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAK 125
R ++E+ + + + Q+ E +E Q E + +EKE A E +K E
Sbjct: 1066 REVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKV 1125

Query: 126 QAELKQKQAEEAAAKAAADAKAKAEADAKAAEEAAKKAAADAKKKAEAEAAKAAVEAQKK 185
+++ KQ + + A+ + + +E + A + A+ + VE
Sbjct: 1126 TSQVSPKQEQSETVQPQAEPARENDP-TVNIKEPQSQTNTTADTEQPAKETSSNVEQPVT 1184

Query: 186 AEAAAAALKKKAEAAEAAAAEARKKAATEAAEKAKAEAEKKAAAEKAAADKKAAADKKAA 245
E E + + +++ + A
Sbjct: 1185 ESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDR 1244

Query: 246 EKAA--------AEKAAADKKAAAEKAAADKKAAAAKAAAEKAAAAKAA 286
A +D +A A+ A + A ++ ++ +
Sbjct: 1245 STVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLEMNNEGQ 1293



Score = 53.5 bits (128), Expect = 1e-09
Identities = 34/260 (13%), Positives = 80/260 (30%), Gaps = 9/260 (3%)

Query: 51 DAVMVDSGAVVEQYKRMQSQESSAKRSDEQRKMKEQQAAE-ELREKQAAEQER------L 103
D V A + ++ ++K+ + + EQ A E + ++ A++ +
Sbjct: 1021 DEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANT 1080

Query: 104 KQLEKERLAAQEQKKQAEEAAKQAELKQKQAEEAAAKAAADAKAKAEADAKAAEEAAKKA 163
+ E + ++ ++ Q K+ +K+ + K + +E ++
Sbjct: 1081 QTNEVAQSGSETKETQ-TTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETV 1139

Query: 164 AADAKKKAEAEAAKAAVEAQKKAEAAAAALKKKAEAAEAAAAEARKKAATEAAEKAKAEA 223
A+ E + E Q + A + E + +
Sbjct: 1140 QPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENP 1199

Query: 224 EKKAAAEKAAADKKAAADKKAAEKAAAEKAAADKKAAAEKAAADKKAAA-AKAAAEKAAA 282
E A +++K + ++ A ++ D+ A + A
Sbjct: 1200 ENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNA 1259

Query: 283 AKAAAEADDIFGELSSGKNA 302
+ A A F L+ GK
Sbjct: 1260 VLSDARAKAQFVALNVGKAV 1279


13  UTI89_C0770  UTI89_C0780  Y  N  N  Genomic Island
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C0770  0       13       3.868340    pectinesterase
UTI89_C0771  1       14       4.078344    kinase inhibitor protein
UTI89_C0772  -1      12       3.613268    adenosylmethionine-8-amino-7-oxononanoate
UTI89_C0773  -1      13       3.105349    biotin synthase
UTI89_C0774  -1      13       2.939598    8-amino-7-oxononanoate synthase
UTI89_C0775  0       13       2.003831    biotin biosynthesis protein BioC
UTI89_C0776  1       12       1.484976    dithiobiotin synthetase
UTI89_C0777  2       13       1.553671    excinuclease ABC subunit B
UTI89_C0778  -1      12       1.884575    hypothetical protein
UTI89_C0779  0       16       2.630522    hypothetical protein
UTI89_C0780  1       17       3.126538    molybdenum cofactor biosynthesis protein A
14  UTI89_C0790  UTI89_C0798  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C0790  -1      21       3.256005    cardiolipin synthase 2
UTI89_C0791  -1      22       3.614544    DNase
UTI89_C0792  -2      22       3.384618    hypothetical protein
UTI89_C0793  -1      20       3.695274    hypothetical protein
UTI89_C0794  -1      18       3.843879    hypothetical protein
UTI89_C0795  -1      17       3.567573    ABC transporter
UTI89_C0796  -1      14       3.453655    hypothetical protein
UTI89_C0797  -1      13       3.181579    DNA-binding transcriptional regulator
UTI89_C0798  0       13       3.204170    ATP-dependent RNA helicase RhlE
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0793  ABC2TRNSPORT  47     3e-08    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein

signature.
Length = 262

Score = 47.2 bits (112), Expect = 3e-08
Identities = 36/146 (24%), Positives = 63/146 (43%), Gaps = 5/146 (3%)

Query: 197 AREREQGTLDQLLVSPLTTWQIFIGKAVPALIVATFQATIVLAIGIWAYQIPFAGSLALF 256
R Q T + +L + L I +G+ A A IG+ A + + L+L
Sbjct: 92 GRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGA---GIGVVAAALGYTQWLSLL 148

Query: 257 YFTMVI--YGLSLVGFGLLISSLCSTQQQAFIGVFVFMMPAILLSGYVSPVENMPVWLQN 314
Y VI GL+ G+++++L + + + P + LSG V PV+ +P+ Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 315 LTWINPIRHFTDITKQIYLKDASLDI 340
P+ H D+ + I L +D+
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDV 234


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0795  PF05272       31     0.012    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.012
Identities = 20/90 (22%), Positives = 28/90 (31%), Gaps = 21/90 (23%)

Query: 298 TPRFEDAFIDLLGGAGTSESPLGAILHTVEGTPGETVIEAKELTKKFGDFAATDHVNFAV 357
PR E + +LG P + + + K HV +
Sbjct: 547 VPRLEKWLVHVLGKTPDDYKP-------------RRLRYLQLVGKYI----LMGHVARVM 589

Query: 358 KRGEIFG----LLGPNGAGKSTTFKMMCGL 383
+ G F L G G GKST + GL
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGL 619



Score = 29.3 bits (65), Expect = 0.048
Identities = 11/23 (47%), Positives = 13/23 (56%)

Query: 39 YVTGLVGPDGAGKTTLMRMLAGL 61
Y L G G GK+TL+ L GL
Sbjct: 597 YSVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0796  RTXTOXIND     62     7e-13    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 62.2 bits (151), Expect = 7e-13
Identities = 42/259 (16%), Positives = 92/259 (35%), Gaps = 25/259 (9%)

Query: 83 ALMQAKAGVSVAQAQYDLMLAGYRDEEIAQAAAAVKQAQAAYDYAQNFYNRQQGLWKSRT 142
Q + + +A+ +LA E + + + + L +
Sbjct: 201 QKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENK 260

Query: 143 ISA--NDLENARSSRDQAQATLKSAQDKLRQYRSGNREQ---DIAQAKASLEQAQAQLAQ 197
N+L +S +Q ++ + SA+++ + + + + Q ++ +LA+
Sbjct: 261 YVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAK 320

Query: 198 AELNLQDSTLIAPSDGTLLTRAV-EPGTVLNEGGTVFTVSLT-RPVWVRAYVDERNLDQA 255
E Q S + AP + V G V+ T+ + + V A V +++
Sbjct: 321 NEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFI 380

Query: 256 QPGRKVLLYTDGRPNKPYH---GQIGFVSPTAEFTPKTVETPDLRTDLVYRLRIVVT--- 309
G+ ++ + P Y G++ ++ A D R LV+ + I +
Sbjct: 381 NVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA--------IEDQRLGLVFNVIISIEENC 432

Query: 310 ----DADDALRQGMPVTVQ 324
+ + L GM VT +
Sbjct: 433 LSTGNKNIPLSSGMAVTAE 451


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0797  HTHTETR       72     9e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 72.4 bits (177), Expect = 9e-18
Identities = 32/220 (14%), Positives = 74/220 (33%), Gaps = 29/220 (13%)

Query: 13 KGEQAKKQLIAAALAQFGEYGMNATT-REIAAQAGQNIAAITYYFGSKEDLYLACAQWIA 71
+ ++ ++ ++ AL F + G+++T+ EIA AG AI ++F K DL+ +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 72 DFIGEQFRPHAEEAERLFAQPQPDRAAIRELILRACRNMIKLLTQDDTVNLSK---FISR 128
IGE E + P + +RE+++ + + + + +
Sbjct: 68 SNIGELEL---EYQAKFPGDP---LSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVG 121

Query: 129 EQLSPTAAYHLVHEQVISPLHSHLTRLIAAW---TGCDASDTRMILHTHALIGEILAFRL 185
E A + + + L I A +I+ I ++
Sbjct: 122 EMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIM--RGYISGLM---- 175

Query: 186 GKETILLRTGWTAFDEEKTELINQTVTCHIDLILQGLSQR 225
W + + + ++ ++L+
Sbjct: 176 --------ENWLFAPQSFD--LKKEARDYVAILLEMYLLC 205


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0798  SECA          30     0.026    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.026
Identities = 20/67 (29%), Positives = 34/67 (50%), Gaps = 4/67 (5%)

Query: 246 QQVLVFTRTKHGANHLAEQLNKDGIRSAAIHG-NKSQGARTRALADFKSGDIRVLVATDI 304
Q VLV T + + ++ +L K GI+ ++ + A A A + + V +AT++
Sbjct: 450 QPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQAGYPAA---VTIATNM 506

Query: 305 AARGLDI 311
A RG DI
Sbjct: 507 AGRGTDI 513


15  UTI89_C0826  UTI89_C0836  Y  N  N  Genomic Island
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C0826  -2      12       3.133054    formate acetyltransferase 3
UTI89_C0827  -1      11       3.111205    pyruvate formate-lyase 3 activating enzyme
UTI89_C0828  -1      13       3.026179    fructose-6-phosphate aldolase
UTI89_C0829  -1      13       3.016672    molybdopterin biosynthesis protein MoeB
UTI89_C0830  0       15       2.781799    molybdopterin biosynthesis protein MoeA
UTI89_C0831  1       15       -0.953450   L-asparaginase
UTI89_C0832  0       15       -2.639000   glutathione transporter ATP-binding protein
UTI89_C0833  0       12       -3.302922   hemin-binding lipoprotein
UTI89_C0834  0       11       -4.503627   ABC transporter permease
UTI89_C0835  0       10       -4.634163   transport system permease
UTI89_C0836  1       10       -5.001131   hypothetical protein
16  UTI89_C0913  UTI89_C0969  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C0913  0       19       3.538527    bacteriophage late gene regulator
UTI89_C0914  0       20       3.388973    tail sheath protein of prophage CP-933T
UTI89_C0915  0       19       2.895998    tail fiber component of prophage CP-933T
UTI89_C0916  -1      19       1.246045    prophage CP-933T hypothetical protein
UTI89_C0917  0       21       0.451660    phage tail protein
UTI89_C0918  0       21       -0.623033   phage tail protein
UTI89_C0919  -1      25       -3.888754   tail fiber protein of prophage CP-933T
UTI89_C0920  0       25       -3.253726   DNA-invertase
UTI89_C0921  0       22       -1.585152   tail fiber protein gpH
UTI89_C0922  0       22       0.044843    tail fiber assembly
UTI89_C0923  1       18       1.830447    tail fiber assembly protein
UTI89_C0924  2       19       3.588897    phage protein gpH
UTI89_C0925  4       28       5.849751    phage protein gpI
UTI89_C0926  2       27       6.134515    phage baseplate assembly protein
UTI89_C0927  2       28       5.518504    phage baseplate assembly protein
UTI89_C0928  0       28       5.397053    phage protein gp15
UTI89_C0929  1       28       4.888002    tail completion phage protein
UTI89_C0930  2       26       5.570736    tail completion phage protein
UTI89_C0931  -1      28       6.434448    hypothetical protein
UTI89_C0932  0       30       5.015771    endolysin
UTI89_C0933  0       29       5.123138    phage holin
UTI89_C0934  0       25       4.308706    phage tail protein
UTI89_C0935  -1      23       3.852487    capsid completion protein
UTI89_C0936  0       22       3.055720    hypothetical protein
UTI89_C0937  -1      21       1.994950    major capsid protein
UTI89_C0938  -1      18       1.291702    capsid scaffolding protein
UTI89_C0939  -1      20       0.430951    phage protein gpP
UTI89_C0940  0       23       -1.665155   hypothetical protein
UTI89_C0941  3       32       -4.527166   hypothetical protein
UTI89_C0942  3       33       -4.065922   hypothetical protein
UTI89_C0943  2       22       -1.370134   prophage CP-933R hypothetical protein
UTI89_C0944  1       23       2.974141    hypothetical protein
UTI89_C0945  1       25       3.148957    hypothetical protein
UTI89_C0946  0       26       3.625003    ATP-dependent specificity component of ClpP
UTI89_C0947  0       26       4.378470    prophage CP-933T stability/partitioning protein
UTI89_C0948  0       27       3.799097    prophage CP-933T stability/partitioning protein
UTI89_C0949  0       27       3.592680    prophage CP-933T replication protein
UTI89_C0950  2       27       0.448160    prophage CP-933T hypothetical protein
UTI89_C0951  2       27       0.564057    prophage CP-933T hypothetical protein
UTI89_C0952  1       28       0.519093    hypothetical protein
UTI89_C0953  2       31       -1.023781   serine protease
UTI89_C0954  2       31       -2.129777   hypothetical protein
UTI89_C0956  1       31       -2.599253   prophage CP-933T hypothetical protein
UTI89_C0957  0       32       -3.049888   hypothetical protein
UTI89_C0958  0       33       -5.946359   hypothetical protein
UTI89_C0959  -1      32       -5.674772   prophage CP-933T hypothetical protein
UTI89_C0960  -1      32       -5.737810   hypothetical protein
UTI89_C0961  1       29       -4.893719   hypothetical protein
UTI89_C0962  -1      16       -2.208795   hypothetical protein
UTI89_C0963  -1      16       -2.033968   hypothetical protein
UTI89_C0964  -1      15       -0.551088   hypothetical protein
UTI89_C0965  -2      15       -0.951474   bacteriophage WPhi phage protein C
UTI89_C0966  -1      15       -0.835293   phage integrase
UTI89_C0967  -1      14       -0.246303   anaerobic dimethyl sulfoxide reductase chain A
UTI89_C0968  0       13       -0.892354   anaerobic dimethyl sulfoxide reductase subunit
UTI89_C0969  2       27       -0.755529   anaerobic dimethyl sulfoxide reductase subunit
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0918  PF03544       30     0.039    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 29.9 bits (67), Expect = 0.039
Identities = 20/67 (29%), Positives = 25/67 (37%), Gaps = 1/67 (1%)

Query: 825 QERKIAQVSKPAPAINITPV-VPAPLPPALVPVVAASSRPVAEAIRSPVASVPVTSRNRE 883
E A P P + P P P PP PVV +P + PV V R+ +
Sbjct: 60 LEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVK 119

Query: 884 PVASGFG 890
PV S
Sbjct: 120 PVESRPA 126


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0924  FLAGELLIN     30     0.029    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 30.4 bits (68), Expect = 0.029
Identities = 32/236 (13%), Positives = 66/236 (27%), Gaps = 6/236 (2%)

Query: 121 GSGRAQTFRTILTVSSTATVALTVDNTMVMATVDYVDDKLKEHEQSRRHPDASLTAKGFV 180
SG T T TV V + L + +S + G +
Sbjct: 210 NSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAI 269

Query: 181 QLSSATNSVSETQAATPKAVKAAYDLANGKYTAQDASTTRKGLVQLSSATNSTSETQAAT 240
+ ++ K D T + + +++ + +
Sbjct: 270 KGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQS 329

Query: 241 PKAVKAAYDLANAKYTAQDATTAQKGIVQLSSATNSSSETLAATSKAVKAVMDETNKKAP 300
K V + N ++T D T + + A N+ T + + K
Sbjct: 330 SKNVYTSVV--NGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVT 387

Query: 301 LNSPALTGTPTTPTARQGTNNTQIASTAFVMAAIAALVDSSPDALNTLNELAAALG 356
L + T N A+ +A++ AL+ ++ + ++LG
Sbjct: 388 LAGKTMFIDKTASGVSTLINEDAAAAKKSTANPLASI----DSALSKVDAVRSSLG 439


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0930  ANTHRAXTOXNA  28     0.016    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 28.2 bits (62), Expect = 0.016
Identities = 12/44 (27%), Positives = 27/44 (61%), Gaps = 4/44 (9%)

Query: 79 LLNPERNQDIKFSAVI----NDDDSADLLFTLPLRERVRITRSS 118
+++ +++ D +F +I +D DS+DLLF+ +E++ + S
Sbjct: 181 IISKDKSLDPEFLNLIKSLSDDSDSSDLLFSQKFKEKLELNNKS 224


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0937  CHLAMIDIAOM6  29     0.038    Chlamydia cysteine-rich outer membrane protein 6 si...
		>CHLAMIDIAOM6#Chlamydia cysteine-rich outer membrane protein 6

signature.
Length = 547

Score = 28.9 bits (64), Expect = 0.038
Identities = 15/42 (35%), Positives = 24/42 (57%), Gaps = 2/42 (4%)

Query: 48 ASKESTEFTKRINVIGVTDQKGEKIL-LDT-TGPIARTNTSY 87
SKE+ EF+ + + D +GE IL DT T P++ T ++
Sbjct: 504 GSKETVEFSVTLKAVSAGDARGEAILSSDTLTVPVSDTENTH 545


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0953  PYOCINKILLER  29     0.021    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 29.0 bits (64), Expect = 0.021
Identities = 20/70 (28%), Positives = 27/70 (38%), Gaps = 11/70 (15%)

Query: 188 TANQKTLQREQEVKQREAEANMLRA------EAAGQ-----ADAIRTKAQAEADAIRLRG 236
+ Q Q+ R A + A AAG+ A + AQA +DAI + G
Sbjct: 230 AKRKAEEQARQQAAIRAANTYAMPANGSVVATAAGRGLIQVAQGAASLAQAISDAIAVLG 289

Query: 237 EALRQNPGVM 246
L P VM
Sbjct: 290 RVLASAPSVM 299


17  UTI89_C1065  UTI89_C1093  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C1065  2       15       0.929704    glucose-1-phosphatase/inositol phosphatase
UTI89_C1066  2       22       2.783913    hypothetical protein
UTI89_C1067  1       18       4.347806    TrpR binding protein WrbA
UTI89_C1068  1       23       4.471765    hypothetical protein
UTI89_C1069  0       21       4.855532    hypothetical protein
UTI89_C1070  -1      18       4.610056    4-hydroxyphenylacetate 3-monooxygenase small
UTI89_C1071  0       14       3.618569    hypothetical protein
UTI89_C1072  -1      11       4.027690    hypothetical protein
UTI89_C1073  -1      11       3.205630    hypothetical protein
UTI89_C1074  -1      12       3.036078    isochorismatase YcdL
UTI89_C1075  -1      13       2.829589    monooxygenase YcdM
UTI89_C1076  -1      15       2.328728    transcriptional regulator YcdC
UTI89_C1077  -1      15       2.549318    trifunctional transcriptional regulator/proline
UTI89_C1078  -1      15       0.902958    hypothetical protein
UTI89_C1079  -1      14       0.854636    sodium/proline symporter
UTI89_C1080  -2      12       -0.655455   hypothetical protein
UTI89_C1081  -2      17       -2.867575   hypothetical protein
UTI89_C1082  -2      21       -3.649602   hypothetical protein
UTI89_C1083  -1      26       -6.299457   hypothetical protein
UTI89_C1084  -1      30       -6.571659   PGA biosynthesis protein
UTI89_C1085  0       32       -8.760789   N-glycosyltransferase
UTI89_C1086  -1      35       -9.392083   outer membrane N-deacetylase
UTI89_C1087  0       34       -9.806350   outer membrane protein PgaA
UTI89_C1088  2       39       -11.989951  hypothetical protein
UTI89_C1089  4       38       -10.593350  P4 family integrase
UTI89_C1090  5       37       -12.091182  hypothetical protein
UTI89_C1091  4       29       -6.703430   hypothetical protein
UTI89_C1092  3       26       -5.842527   transposase
UTI89_C1093  2       19       -3.983406   prophage regulatory protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C1074  ISCHRISMTASE  76     2e-18    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 75.8 bits (186), Expect = 2e-18
Identities = 44/176 (25%), Positives = 70/176 (39%), Gaps = 23/176 (13%)

Query: 26 TFDPQQTALIVVDMQNAYATPGGYLDLAGFDVSTTRPVIANIQTAVTAARTAGMLIIWFQ 85
DP + L++ DMQN + +D S + ANI+ G+ +++
Sbjct: 25 VPDPNRAVLLIHDMQNYF------VDAFTAGASPVTELSANIRKLKNQCVQLGIPVVY-- 76

Query: 86 NGWDEQYVEAGGPGSPNYHKSNALKTMRNQPLLQGKLLAKGSWDYQLVDELVPQPGDIVL 145
PGS N L G L G ++ +++ EL P+ D+VL
Sbjct: 77 ---------TAQPGSQNPDDRALLTDF------WGPGLNSGPYEEKIITELAPEDDDLVL 121

Query: 146 PKPRYSGFFNTPLDSILRSRGIRHLVFTGIATNVCVESTLRDGFFLEYFGVVLEDA 201
K RYS F T L ++R G L+ TGI ++ T + F + + DA
Sbjct: 122 TKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDA 177


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C1076  HTHTETR       66     2e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 65.8 bits (160), Expect = 2e-15
Identities = 30/165 (18%), Positives = 62/165 (37%), Gaps = 8/165 (4%)

Query: 19 GKRSRAVSAKKKAILSAALDTFSQFGFHGTRLEQIAELAGVSKTNLLYYFPSKEALYIAV 78
K + ++ IL AL FSQ G T L +IA+ AGV++ + ++F K L+ +
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 79 LRQILDIWLAPLKAFREDF--APLAAIKEYIRLKLEVSRDYPQASRLFCM-----EMLAG 131
++ F PL+ ++E + LE + + L + E +
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGE 122

Query: 132 APLLMDELTGDLKALIDEKSALIAGWVKSGKL-APIDPQHLIFMI 175
++ D + +++ L A + + ++
Sbjct: 123 MAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIM 167


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C1087  ARGDEIMINASE  31     0.030    Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 30.6 bits (69), Expect = 0.030
Identities = 26/181 (14%), Positives = 59/181 (32%), Gaps = 22/181 (12%)

Query: 451 HRAAENELKKAEVIEPRNINLEVEQAWTALTLQEWQQA--AVLTHDVVEREPQDPGVV-R 507
A + A +++ + +E + + L ++ ++E E + +
Sbjct: 49 EVARQEHEVFASILKNNLVEIEYIEDLISEVLVSSVALENKFISQFILEAEIKTDFTINL 108

Query: 508 LK---RAVDVHNLAELRIAGSTGIDAEGPDSGKHDVDLTTIVYS---PPLKDNWRGFAGF 561
LK ++ + N+ I+G E + DL P+ + F
Sbjct: 109 LKDYFSSLTIDNMISKMISGVVT--EELKNYTSSLDDLVNGANLFIIDPMPNVL--FT-- 162

Query: 562 GYADGQFSEGKGIVRDWLAGVEWRSRNIWLEAEYAERVFNHEHKPGARLSGWYDFNDNWR 621
D S G G+ + + + R E +AE +F + + W + +
Sbjct: 163 --RDPFASIGNGVT---INKMFTKVRQ--RETIFAEYIFKYHPVYKENVPIWLNRWEEAS 215

Query: 622 I 622
+
Sbjct: 216 L 216


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C1093  HTHFIS        26     0.028    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 25.9 bits (57), Expect = 0.028
Identities = 6/15 (40%), Positives = 13/15 (86%)

Query: 21 SQLLGISRSTIYEKM 35
+ LLG++R+T+ +K+
Sbjct: 456 ADLLGLNRNTLRKKI 470


18  UTI89_C1104  UTI89_C1167  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C1104  5       31       -4.059549   hypothetical protein
UTI89_C1105  6       32       -4.388629   hypothetical protein
UTI89_C1106  7       34       -5.148308   hypothetical protein
UTI89_C1107  9       33       -2.728716   S fimbrial switch regulatory protein
UTI89_C1108  9       33       -2.664171   S fimbrial switch regulatory protein
UTI89_C1109  9       34       -2.994686   S fimbriae major subunit SfaA
UTI89_C1110  10      33       -2.711564   S fimbriae minor subunit SfaD
UTI89_C1111  10      32       -2.985287   S fimbriae periplasmic chaperone SfaE
UTI89_C1112  10      32       -3.456870   S fimbriae outer membrane usher
UTI89_C1113  5       24       -4.706089   S fimbriae minor subunit SfaG
UTI89_C1114  5       21       -3.786214   S fimbriae minor subunit SfaS
UTI89_C1115  3       17       -1.602270   S fimbriae minor subunit SfaH
UTI89_C1116  0       16       1.752643    hypothetical protein
UTI89_C1117  0       16       2.499169    regulatory protein
UTI89_C1118  0       15       3.205542    outer membrane receptor FepA
UTI89_C1119  1       17       4.309259    IroE protein
UTI89_C1120  2       19       3.958579    IroD protein
UTI89_C1121  2       20       3.586520    IroC protein
UTI89_C1122  4       22       1.373286    glucosyltransferase
UTI89_C1123  6       26       -1.417856   hypothetical protein
UTI89_C1124  7       27       -2.333277   hypothetical protein
UTI89_C1125  6       28       -3.824058   hypothetical protein
UTI89_C1127  6       26       -3.137399   *transposase
UTI89_C1128  5       27       -4.596199   hypothetical protein
UTI89_C1129  5       30       -6.869179   outer membrane heme/hemoglobin receptor
UTI89_C1130  6       38       -9.766445   hypothetical protein
UTI89_C1131  5       37       -9.963720   hypothetical protein
UTI89_C1132  4       41       -10.429605  hypothetical protein
UTI89_C1133  4       31       -6.874812   hypothetical protein
UTI89_C1134  9       20       0.850345    hypothetical protein
UTI89_C1135  9       22       2.471334    hypothetical protein
UTI89_C1136  9       23       4.113593    hypothetical protein
UTI89_C1137  9       24       4.811865    hypothetical protein
UTI89_C1138  9       25       5.294855    hypothetical protein
UTI89_C1139  9       25       5.160174    autotransporter
UTI89_C1140  8       27       4.807884    hypothetical protein
UTI89_C1141  8       27       4.654690    hypothetical protein
UTI89_C1142  9       27       4.870968    hypothetical protein
UTI89_C1143  8       26       4.629006    hypothetical protein
UTI89_C1144  8       28       4.978976    hypothetical protein
UTI89_C1145  7       26       4.385059    radC-like protein YeeS
UTI89_C1146  6       25       3.340966    hypothetical protein
UTI89_C1147  7       26       1.098352    hypothetical protein
UTI89_C1148  3       22       -0.191314   intergenic-region protein
UTI89_C1149  1       18       -1.201684   hypothetical protein
UTI89_C1150  1       15       -1.164106   hypothetical protein
UTI89_C1151  1       16       -1.103907   hypothetical protein
UTI89_C1152  0       14       -1.095925   hypothetical protein
UTI89_C1154  -2      16       -1.051526   *dehydrogenase
UTI89_C1155  -2      14       -2.123767   hydrolase
UTI89_C1156  -1      16       -3.453141   hypothetical protein
UTI89_C1157  0       19       -4.933336   hypothetical protein
UTI89_C1158  1       24       -7.399338   curli production protein CsgG
UTI89_C1159  1       32       -10.111969  curli assembly protein CsgF
UTI89_C1160  2       31       -8.389843   curli assembly protein CsgE
UTI89_C1161  3       35       -8.408771   DNA-binding transcriptional regulator CsgD
UTI89_C1162  6       36       -7.540781   hypothetical protein
UTI89_C1163  4       31       -5.358601   hypothetical protein
UTI89_C1164  0       22       -3.074832   curlin minor subunit
UTI89_C1165  0       18       -3.472492   cryptic curlin major subunit
UTI89_C1166  -1      21       -3.853366   autoagglutination protein
UTI89_C1167  -1      14       -3.062486   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1108  FIMREGULATRY  147  1e-49  Escherichia coli: P pili regulatory PapB protein si...
		>FIMREGULATRY#Escherichia coli: P pili regulatory PapB protein

signature.
Length = 104

Score = 147 bits (371), Expect = 1e-49
Identities = 84/102 (82%), Positives = 88/102 (86%)

Query: 1 MAQHEVITRGGDAFLLKLRESALSSGSMSEEQFFLLIGISSIHSDRVILAMKDYLVSGHS 60
MA HEVI+R G+AFLL +RES L GSMSE FFLLIGISSIHSDRVILAMKDYLV GHS
Sbjct: 1 MAHHEVISRSGNAFLLNIRESVLLPGSMSEMHFFLLIGISSIHSDRVILAMKDYLVGGHS 60

Query: 61 RKDVCEKYQMNNGYFSTTLGRLTRLNVLVARLAPYYTDSVSA 102
RK+VCEKYQMNNGYFSTTLGRL RLN L ARLAPYYTD SA
Sbjct: 61 RKEVCEKYQMNNGYFSTTLGRLIRLNALAARLAPYYTDESSA 102


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1110  FIMBRIALPAPE  28  0.010  Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein

signature.
Length = 173

Score = 28.5 bits (63), Expect = 0.010
Identities = 39/160 (24%), Positives = 64/160 (40%), Gaps = 23/160 (14%)

Query: 19 AALAGNHWHVMLPGGNMRFQGKIIAEACSLALSDRQMTVDMGQLSSNRFHAAGEYGDPVG 78
A L H H N+ F+GK+I AC++ + V+ G + +G G+
Sbjct: 15 AVLMSQHVHA---ADNLTFKGKLIIPACTV----QNAEVNWGDIEIQNLVQSG--GNQKD 65

Query: 79 FDIHLQDCSTVVSQRVGISFYGVSDIHEPELLSVEEENDASDGIAIALFNES----GELV 134
F + + ++ + +V I+ G + +L + DG+ I L+N + G V
Sbjct: 66 FTVDMNCPYSLGTMKVTITSNGQTG---NSILVPNTSTASGDGLLIYLYNSNNSGIGNAV 122

Query: 135 KLNQPPENWVHLTRGDMKLHMQARYKATHYPVTGGKANGQ 174
L +T G + AR K T Y G K N Q
Sbjct: 123 TLGSQ------VTPGKITGTAPAR-KITLYAKLGYKGNMQ 155


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1112  PF00577  960  0.0  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 960 bits (2482), Expect = 0.0
Identities = 545/861 (63%), Positives = 688/861 (79%), Gaps = 9/861 (1%)

Query: 41 RMKFNILPLAFFIGIIVSPAR------AELYFNPRFLSDDPDAVADLSAFTQGQELPPGV 94
K + + + + A AELYFNPRFL+DDP AVADLS F GQELPPG
Sbjct: 18 IRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGT 77

Query: 95 YRVDIYLNDTYISTRDVQFQMSQDGKQLAPCLSPEHMSAMGVNRYAVPGMERLPADTCTS 154
YRVDIYLN+ Y++TRDV F + + PCL+ +++MG+N +V GM L D C
Sbjct: 78 YRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVP 137

Query: 155 LNSMIQGATFRFDVGQQRLYLTVPQLYMSNQARGYIAPEYWDNGITAALLNYDFSGNRVR 214
L SMI AT + DVGQQRL LT+PQ +MSN+ARGYI PE WD GI A LLNY+FSGN V+
Sbjct: 138 LTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQ 197

Query: 215 DSYGGTSDYAYLNLKTGLNIGSWRLRDNTSWSYSAGKGYS--QNNWQHINTWLERDIVPL 272
+ GG S YAYLNL++GLNIG+WRLRDNT+WSY++ S +N WQHINTWLERDI+PL
Sbjct: 198 NRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPL 257

Query: 273 RSRLTMGDSYTRGDIFDGVNFRGIQLASDDNMVPDSQRGYAPTIHGISRGTSRISIRQNG 332
RSRLT+GD YT+GDIFDG+NFRG QLASDDNM+PDSQRG+AP IHGI+RGT++++I+QNG
Sbjct: 258 RSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNG 317

Query: 333 YEIYQSTLPPGPFEINDIYPAGSGGDLQVTLQEADGSVQRFNVPWSSVPVLQREGHLKYA 392
Y+IY ST+PPGPF INDIY AG+ GDLQVT++EADGS Q F VP+SSVP+LQREGH +Y+
Sbjct: 318 YDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYS 377

Query: 393 LSAGEFRSGGHQQDNPRFAEGTLKYGLPAGWTVYGGAWIAERYRAFNLGMGKNMGWLGAV 452
++AGE+RSG QQ+ PRF + TL +GLPAGWT+YGG +A+RYRAFN G+GKNMG LGA+
Sbjct: 378 ITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGAL 437

Query: 453 SLDATRANARLPDESRHDGQSYRFLYNKSLTETGTNIQLIGYRYSTRGYFSFADTAWKKM 512
S+D T+AN+ LPD+S+HDGQS RFLYNKSL E+GTNIQL+GYRYST GYF+FADT + +M
Sbjct: 438 SVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRM 497

Query: 513 SGYSVLTQDGVIQIQPKYTDYYNLAYNKRGRVQVSISQQTGESSTLYLSGSHQSYWGTDR 572
+GY++ TQDGVIQ++PK+TDYYNLAYNKRG++Q++++QQ G +STLYLSGSHQ+YWGT
Sbjct: 498 NGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSN 557

Query: 573 TDRQLNAGFNSSVNDISWSLNYSLSRNAWQHETDRILSFDVSIPFSHWMRSDSTSAWRNA 632
D Q AG N++ DI+W+L+YSL++NAWQ D++L+ +V+IPFSHW+RSDS S WR+A
Sbjct: 558 VDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHA 617

Query: 633 SARYSQTLEAHGQAASTAGLYGTSLGDNNLGYSIQSGYTRGGYEGSSKTGYASLNYRGGY 692
SA YS + + +G+ + AG+YGT L DNNL YS+Q+GY GG S TGYA+LNYRGGY
Sbjct: 618 SASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGY 677

Query: 693 GNASAGYSHSGGYRQLYYGLSGGILAHANGLTLSQPLGDTLILVRAPGASDTRIENQTGV 752
GNA+ GYSHS +QLYYG+SGG+LAHANG+TL QPL DT++LV+APGA D ++ENQTGV
Sbjct: 678 GNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGV 737

Query: 753 STDWRGYAVLPYATDYRENRVALDTNTLADNVDIENTVVSVVPTHGAVVRADYKTRVGVK 812
TDWRGYAVLPYAT+YRENRVALDTNTLADNVD++N V +VVPT GA+VRA++K RVG+K
Sbjct: 738 RTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIK 797

Query: 813 VLMTLMRNGKAVPFGSVVTARNGGS-SIAGENGQVYLSGMPLSGQVSVKWGSQTTDQCTA 871
+LMTL N K +PFG++VT+ + S I +NGQVYLSGMPL+G+V VKWG + C A
Sbjct: 798 LLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVA 857

Query: 872 DYKLPKESAGQILSHVTVSCR 892
+Y+LP ES Q+L+ ++ CR
Sbjct: 858 NYQLPPESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1136  PF05704  26  0.035  Capsular polysaccharide synthesis protein
		>PF05704#Capsular polysaccharide synthesis protein

Length = 307

Score = 26.0 bits (57), Expect = 0.035
Identities = 7/43 (16%), Positives = 14/43 (32%)

Query: 38 HFSLELFAGEPDPREKLTSDIVQPYRLHLLENWYNTDYDCNAW 80
F + + + V H+L+ N YD + +
Sbjct: 225 DFVSVMAVSKEYSKYWKEIPYVNNVNPHMLQYLGNLPYDNSMF 267


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1139  PRTACTNFAMLY  34  0.003  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 33.9 bits (77), Expect = 0.003
Identities = 167/827 (20%), Positives = 257/827 (31%), Gaps = 117/827 (14%)

Query: 38 VALSLAAVTSVPALAAD----TVVQAGETVNGGTLTNHDNQIVLGTANGMTISTG----- 88
+A++L A+ + PA AD ++V+ GE +G + D V TA+G TI
Sbjct: 19 LAMALGALGAAPAAHADWNNQSIVKTGERQHGIHIQGSDPGGVR-TASGTTIKVSGRQAQ 77

Query: 89 ---LEYGPDNEANTGGQWIQNGGIANNTTVTGGGLQRVNAGGSVSDTVISAGGGQSLQGQ 145
LE G +G ++++ G V AG V+D A G +
Sbjct: 78 GILLENPAAELQFRNGSVTSSGQLSDDGIRRFLGTVTVKAGKLVADHATLANVGDTWDDD 137

Query: 146 AVNTTLNGGEQWVHEGGIA---TGTVINEKGWQAVKSGAMATDTVVNTGAEGGPDAENGD 202
+ + G + G V E+G + D ++ GA E+
Sbjct: 138 GIALYVAGEQAQASIADSTLQGAGGVQIERGANVTVQRSAIVDGGLHIGALQSLQPEDLP 197

Query: 203 TGQTVYGDAVRTTINKNGRQIVAAEGTANTTVVYAGGDQTVHGHALDTTLNGGYQYVHNG 262
+ V D T V A G V + T+ G + G +
Sbjct: 198 PSRVVLRDTNVTA--------VPASGAPAAVSVLGASELTLDGGHITGGRAAGVAAMQGA 249

Query: 263 GTASDTVVNSDGWQIIKEGGLADFTTVNQKGKLQVNAGGTATNVTLTQGGALVTSTAATV 322
G GG V G + G L G V + ++V
Sbjct: 250 VVHLQRATIRRGDA--PAGGAVPGGAV-PGGAVPGGFGPGGFGPVL-DGWYGVDVSGSSV 305

Query: 323 TGSNRLGNFTVENGNADGVVLESGGRLDVLEGHSAWKTLVDDGGTLAVSAGGKATDVTMT 382
L VE A G + V G GG+L+ G +V T
Sbjct: 306 ----ELAQSIVE---APE----LGAAIRVGRGARV----TVSGGSLSAPHG----NVIET 346

Query: 383 SGGALIADSGATVE-----GTNASGKFSIDGISGQASGLLLENG----GSFTVNAGGLAS 433
G A A + G +A GK + + + L L G G
Sbjct: 347 GGARRFAPQAAPLSITLQAGAHAQGKALLYRVLPEPVKLTLTGGADAQGDIVATELPSIP 406

Query: 434 NTTVGHRGTLTLAAGGSLSGRTQLSKGASMVLNGDVVSTGDIVNAGEIRFDNQTTPDAAL 493
T++G + LA+ +G T+ S+ N V T + N G +R + + D
Sbjct: 407 GTSIG-PLDVALASQARWTGATRAVDSLSID-NATWVMTDN-SNVGALRLASDGSVDFQQ 463

Query: 494 SRAVAKGDSPVTFHKLTTSNLTGQGGTINMRVRLDGSNASDQLVINGGQATGKTWLAFTN 553
+ F LT + L G G M V D SD+LV+ A+G+ L N
Sbjct: 464 PAEAGR------FKVLTVNTLAGS-GLFRMNVFAD-LGLSDKLVVMQD-ASGQHRLWVRN 514

Query: 554 VGNSNLGVATSGQGIRVVDAQNGATTEEGAFALSRPLQAGAFNYTLNRDSDEDWYLRSEN 613
G+ S + +V G+ + G + Y L + + W L
Sbjct: 515 SGSE----PASANTLLLVQTPLGSAATFTLANKDGKVDIGTYRYRLAANGNGQWSLVGAK 570

Query: 614 AYRAEVPLY-----------------------ASMLTQAMDYDRILAGSRSHQSGVSGEN 650
A A P L+ A + G + E+
Sbjct: 571 APPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAGRELSAAANAAVNTGGVGLASTLWYAES 630

Query: 651 NSVRLSIQGGHLGHDNNGGIARGATPESNGSYGFVRLEGDLLRTEVAGMSL--------T 702
N++ + L D G RG R +VAG L
Sbjct: 631 NALSKRLGELRLNPDAGGAWGRGFAQRQQLDNRAGR----RFDQKVAGFELGADHAVAVA 686

Query: 703 TGVYGAAGHSSVDVKDDDGSRAGTVRDDAGSLGGYLHLVHTSSGLWADIVAQGTRHSMKA 762
G + G + D + G D+ +GGY + SG + D + +R
Sbjct: 687 GGRWHLGGLAGYTRGDRGFTGDGGGHTDSVHVGGYATYI-ADSGFYLDATLRASRLENDF 745

Query: 763 SSDNND-------FRARGWGWLGSLETGLPFSITDNLMLEPQLQYTW 802
+D +R G G SLE G F+ D LEPQ +
Sbjct: 746 KVAGSDGYAVKGKYRTHGVGA--SLEAGRRFTHADGWFLEPQAELAV 790


19  UTI89_C1192  UTI89_C1209  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
UTI89_C1192  2  15  1.327342  hypothetical protein
UTI89_C1193  2  14  1.446826  virulence factor
UTI89_C1194  0  11  0.907648  virulence factor
UTI89_C1195  1  16  0.886401  flagellar synthesis protein FlgN
UTI89_C1196  1  15  1.009272  anti-sigma28 factor FlgM
UTI89_C1197  1  15  2.272995  flagellar basal body P-ring biosynthesis protein
UTI89_C1198  2  16  2.433340  flagellar basal-body rod protein FlgB
UTI89_C1199  3  15  2.296513  flagellar basal body rod protein FlgC
UTI89_C1200  3  13  2.457132  flagellar basal body rod modification protein
UTI89_C1201  1  13  2.424744  flagellar hook protein FlgE
UTI89_C1202  -1  12  2.420118  flagellar basal body rod protein FlgF
UTI89_C1203  -1  9  1.250356  flagellar basal body rod protein FlgG
UTI89_C1204  0  13  2.268337  flagellar basal body L-ring protein
UTI89_C1205  0  13  2.053023  flagellar basal body P-ring biosynthesis protein
UTI89_C1206  1  13  1.737144  flagellar rod assembly protein/muramidase FlgJ
UTI89_C1207  1  13  1.289713  flagellar hook-associated protein FlgK
UTI89_C1208  3  15  1.222002  flagellar hook-associated protein FlgL
UTI89_C1209  4  17  1.622309  ribonuclease E
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1201  FLGHOOKAP1  41  6e-06  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.1 bits (96), Expect = 6e-06
Identities = 17/49 (34%), Positives = 29/49 (59%)

Query: 353 TLTNGALEASNVDLSKELVNMIVAQRNYQSNAQTIKTQDQILNTLVNLR 401
L+N S V+L +E N+ Q+ Y +NAQ ++T + I + L+N+R
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 36.9 bits (85), Expect = 1e-04
Identities = 22/56 (39%), Positives = 30/56 (53%), Gaps = 4/56 (7%)

Query: 6 AVSGLNAAATNLDVIGNNIANSATYGFKSGTASFAD----MFAGSKVGLGVKVAGI 57
A+SGLNAA L+ NNI++ G+ T A + AG VG GV V+G+
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1203  FLGHOOKAP1  44  4e-07  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 4e-07
Identities = 18/81 (22%), Positives = 36/81 (44%), Gaps = 14/81 (17%)

Query: 3 SSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQTTL 62
S + A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTL 47

Query: 63 PSGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 48 GAGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 21/41 (51%)

Query: 220 ETSNVNVAEELVNMIQVQRAYEINSKAVSTTDQMLQKLTQL 260
S VN+ EE N+ + Q+ Y N++ + T + + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1204  FLGLRINGFLGH  350  e-126  Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 350 bits (900), Expect = e-126
Identities = 232/232 (100%), Positives = 232/232 (100%)

Query: 6 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 65
MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY
Sbjct: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60

Query: 66 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 125
GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120

Query: 126 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 185
RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF
Sbjct: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180

Query: 186 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 237
SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM
Sbjct: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1205  FLGPRINGFLGI  427  e-152  Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 427 bits (1099), Expect = e-152
Identities = 157/363 (43%), Positives = 213/363 (58%), Gaps = 9/363 (2%)

Query: 5 FLSALILLLVTTAAQAERIRDLTSVQGVRQNSLIGYGLVVGLDGTGDQTTQTPFTTQTLN 64
F + L A RI+D+ S+Q R N LIGYGLVVGL GTGD +PFT Q++
Sbjct: 13 FSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMR 72

Query: 65 NMLSQLGITVPTGTNMQLKNVAAVMVTASLPPFGRQGQTIDVVVSSMGNAKSLRGGTLLM 124
ML LGIT G + KN+AAVMVTA+LPPF G +DV VSS+G+A SLRGG L+M
Sbjct: 73 AMLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIM 131

Query: 125 TPLKGVDSQVYALAQGNILVGGAGASAGGSSVQVNQLNGGRITNGAVIERELPSQFGVGN 184
T L G D Q+YA+AQG ++V G A +++ R+ NGA+IERELPS+F
Sbjct: 132 TSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSV 191

Query: 185 TLNLQLNDEDFSMAQQIADTINRVR----GYGSATALDARTIQVRVPSGNSSQVRFLADI 240
L LQL + DFS A ++AD +N G A D++ I V+ P + R +A+I
Sbjct: 192 NLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRV-ADLTRLMAEI 250

Query: 241 QNMQVNVTPQDAKVVINSRTGSVVMNREVTLDSCAVAQGNLSVTVNRQANVSQPDTPFGG 300
+N+ V T AKVVIN RTG++V+ +V + AV+ G L+V V V QP PF
Sbjct: 251 ENLTVE-TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSR 308

Query: 301 GQTVVTPQTQIDLRQSGGSLQSVRSSASLNNVVRALNALGATPMDLMSILQSMQSAGCLR 360
GQT V PQT I Q G + ++ L +V LN++G +++ILQ ++SAG L+
Sbjct: 309 GQTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQ 367

Query: 361 AKL 363
A+L
Sbjct: 368 AEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1206  FLGFLGJ  507  0.0  Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 507 bits (1306), Expect = 0.0
Identities = 310/313 (99%), Positives = 311/313 (99%)

Query: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60
MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG
Sbjct: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60

Query: 61 LFSSERTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120
LFSSE TRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET
Sbjct: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120

Query: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGNSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180
VVRYQNQALSQLVQKAVPRNYDDSLPG+SKAFLAQLSLPAQLASQQSGVPHHLILAQAAL
Sbjct: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180

Query: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240
ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL
Sbjct: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240

Query: 241 EALSDYVGLLTRNPRYAAVTTAVSAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300
EALSDYVGLLTRNPRYAAVTTA SAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK
Sbjct: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300

Query: 301 VSKTYSMNIDNLF 313
VSKTYSMNIDNLF
Sbjct: 301 VSKTYSMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1207  FLGHOOKAP1  682  0.0  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 682 bits (1761), Expect = 0.0
Identities = 545/546 (99%), Positives = 545/546 (99%)

Query: 2 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 61
SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 62 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 121
GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 122 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 181
SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 182 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 241
QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 242 RQLAAVPSSADPSRTTVAYVDRTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 301
RQLAAVPSSADPSRTTVAYVD TAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 302 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 361
ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 360

Query: 362 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 421
YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV
Sbjct: 361 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 420

Query: 422 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 481
NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN
Sbjct: 421 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 480

Query: 482 KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 541
KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD
Sbjct: 481 KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 540

Query: 542 ALINIR 547
ALINIR
Sbjct: 541 ALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1208  FLAGELLIN  46  8e-08  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 46.2 bits (109), Expect = 8e-08
Identities = 42/226 (18%), Positives = 80/226 (35%), Gaps = 9/226 (3%)

Query: 7 MMYQQNMRGITNSQAEWMKYGEQMSTGKRVVNPSDDPIAASQAVVLSQAQAQNSQYTLAR 66
++ Q N+ +S + + E++S+G R+ + DD + A + +Q +
Sbjct: 11 LLTQNNLNKSQSSLSSAI---ERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNA 67

Query: 67 TFATQKVSLEESVLSQVTTAIQNAQEKIVYASNGTLSDDDRASLATDIQGLRDQLLNLAN 126
E L+++ +Q +E V A+NGT SD D S+ +IQ +++ ++N
Sbjct: 68 NDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSN 127

Query: 127 TTDGNGRYIFAGYKTETAPFSEADGDYVGGTESIKQQVDASRSMVIGHTGDKIFDSITSN 186
T NG + + DG E+I + +G G + +
Sbjct: 128 QTQFNGVKVLSQDNQMKIQVGANDG------ETITIDLQKIDVKSLGLDGFNVNGPKEAT 181

Query: 187 AVAEPDGSASETNLFAMLDSAIAALKTPVADSEADKETAAAALDKT 232
+ T A + + TA DK
Sbjct: 182 VGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKV 227


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1209  IGASERPTASE  64  3e-12  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 63.9 bits (155), Expect = 3e-12
Identities = 41/226 (18%), Positives = 79/226 (34%), Gaps = 12/226 (5%)

Query: 590 PAEQSAPKAEAKPERQQDRR-----KPRQNNRRDRNERRDTRSERTEGSDNREENRRNRR 644
P+ S + A+ + N ++++++ D E +NR
Sbjct: 1008 PSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNRE 1067

Query: 645 QAQQQTAETRESRQQAEV------TEKARTTDEQQAPRRERSRRRNDDKRQAQQEAKALN 698
A++ + + + Q EV T++ +TT+ ++ E+ + + + Q+ K +
Sbjct: 1068 VAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPK-VT 1126

Query: 699 VEEQSVQETEQEERVRPVQPRRKQRQLNQKVRYEQSVAEEAVVAPVVEETVAAEPIVQEA 758
+ QE + + + R +N K Q+ P E + E V E+
Sbjct: 1127 SQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTES 1186

Query: 759 PAPRTELVKVPLPVVAQAAPEQQEENNADNRDNGGMPRRSRRSPRH 804
T V P A Q N+ + RRS RS H
Sbjct: 1187 TTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPH 1232



Score = 62.0 bits (150), Expect = 1e-11
Identities = 46/287 (16%), Positives = 91/287 (31%), Gaps = 35/287 (12%)

Query: 513 PSEEEFAERKRPEQPALATFAMPDVPPAPT-PAEPAAPVVAPAPKAATATPASPAQPGLL 571
P E+ + DVP P+ E A AP P A ATP+ +
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTE---- 1038

Query: 572 SRFFGALKALFSGGEETKPAEQSAPKAEAKPERQQDRRKP-RQNNRRDRNERRDTRSER- 629
AE S +++ + +QD + QN + + + ++
Sbjct: 1039 -----------------TVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQ 1081

Query: 630 -TEGSDNREENRRNRRQAQQQTAETRESRQQAEVTEKARTTDEQQAPRRERSRRRNDDKR 688
E + + E + + ++TA + + TEK + + + + + +
Sbjct: 1082 TNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQP 1141

Query: 689 QAQ---QEAKALNVEEQSVQETEQEERVRPVQPRRKQRQLNQKVRYEQSV--AEEAVVAP 743
QA+ + +N++E Q + +P + + Q V +V V P
Sbjct: 1142 QAEPARENDPTVNIKEPQSQTNTTADTEQPA--KETSSNVEQPVTESTTVNTGNSVVENP 1199

Query: 744 VVEETVAAEPIVQEAPAPRTELVKVPLPVVAQAAPEQQEENNADNRD 790
+P V + K ++ P E + D
Sbjct: 1200 ENTTPATTQPTVNSESS---NKPKNRHRRSVRSVPHNVEPATTSSND 1243


20  UTI89_C1265  UTI89_C1364  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
UTI89_C1265  0  25  -3.599843  hypothetical protein
UTI89_C1266  1  25  -4.983542  isocitrate dehydrogenase
UTI89_C1267  2  36  -6.506003  prophage lambda integrase
UTI89_C1268  4  37  -5.552491  excisionase
UTI89_C1269  3  32  -5.235492  hypothetical protein
UTI89_C1270  3  29  -3.346069  hypothetical protein
UTI89_C1271  4  26  -1.795084  hypothetical protein
UTI89_C1272  4  24  -0.980354  C4-type zinc finger protein
UTI89_C1273  4  24  -1.396123  hypothetical protein
UTI89_C1274  3  25  -1.797717  hypothetical protein
UTI89_C1275  4  25  -1.901929  prophage CP-933K exonuclease
UTI89_C1276  6  27  -4.901629  recombination protein Bet of prophage
UTI89_C1277  7  34  -8.701407  host-nuclease inhibitor protein Gam of
UTI89_C1278  6  38  -10.484793  hypothetical protein
UTI89_C1279  5  36  -8.972692  Kil protein
UTI89_C1280  5  39  -8.321444  lambda regulatory protein CIII
UTI89_C1281  7  41  -9.225811  bacteriophage lambda single-stranded DNA binding
UTI89_C1282  6  41  -9.528429  lambda ant-restriction protein
UTI89_C1283  8  40  -8.462365  superinfection exclusion protein B of prophage
UTI89_C1284  8  38  -7.755835  N protein
UTI89_C1285  8  32  -7.393715  hypothetical protein
UTI89_C1286  6  30  -5.918862  hypothetical protein
UTI89_C1287  4  27  -4.429329  hypothetical protein
UTI89_C1288  3  29  -2.478180  CI repressor of bacteriophage
UTI89_C1289  1  28  -1.472239  bacteriophage regulatory protein CII
UTI89_C1290  1  30  -1.340065  bacteriophage replication protein O
UTI89_C1291  -1  32  -1.658663  bacteriophage replication protein P
UTI89_C1292  2  37  -3.066182  prophage exclusion protein Ren
UTI89_C1293  4  35  -2.482667  NinB protein encoded within prophage
UTI89_C1294  4  35  -3.039546  DNA N-6-adenine-methyltransferase of
UTI89_C1295  4  34  -5.693673  NinE protein (phage 82- and lambda-like)
UTI89_C1296  2  27  -6.997402  hypothetical protein
UTI89_C1297  2  28  -6.138731  hypothetical protein
UTI89_C1298  2  27  -5.975798  endodeoxyribonuclease RUS
UTI89_C1299  3  27  -5.995068  hypothetical protein
UTI89_C1300  3  28  -6.662505  lambdoid prophage DLP12 antitermination protein
UTI89_C1301  3  28  -6.267148  porin
UTI89_C1302  1  28  -4.910749  hypothetical protein
UTI89_C1303  1  29  -4.561964  lysozyme from lambdoid prophage DLP12
UTI89_C1304  2  30  -4.460452  Rz endopeptidase from lambdoid prophage DLP12
UTI89_C1305  0  38  -6.809378  lambdoid prophage DLP12 Bor-like protein
UTI89_C1306  0  32  -4.496375  prophage TonB-like membrane protein
UTI89_C1307  2  20  -1.448595  hypothetical protein
UTI89_C1308  1  20  -0.920557  hypothetical protein
UTI89_C1309  1  22  0.861676  hypothetical protein
UTI89_C1310  1  23  2.844449  hypothetical protein
UTI89_C1311  1  23  3.776371  Qin prophage packaging protein NU1-like protein
UTI89_C1312  1  23  3.559084  prophage DNA packaging protein terminase large
UTI89_C1313  2  26  4.493258  prophage DNA packaging protein
UTI89_C1314  2  26  4.491342  prophage capsid protein
UTI89_C1315  2  24  4.537062  bacteriophage head-tail preconnector protein
UTI89_C1316  2  25  2.290248  bacteriophage capsid protein small subunit
UTI89_C1317  3  23  2.720858  prophage capsid protein gp7
UTI89_C1318  3  28  3.822820  hypothetical protein
UTI89_C1319  3  28  4.069848  prophage head-tail joining protein
UTI89_C1320  2  27  5.313864  prophage tail fiber component Z
UTI89_C1321  4  28  5.527704  prophage tail component
UTI89_C1322  4  28  5.653965  tail component of prophage CP-933X
UTI89_C1323  3  26  5.304253  prophage tail component
UTI89_C1324  3  27  5.942447  prophage tail component
UTI89_C1325  3  25  6.249010  prophage tail component
UTI89_C1326  5  26  5.055648  minor tail protein
UTI89_C1327  2  23  2.787840  hypothetical protein
UTI89_C1328  3  24  2.256815  prophage tail component
UTI89_C1329  3  25  2.120724  prophage tail component
UTI89_C1330  2  23  1.336080  prophage tail component
UTI89_C1331  2  21  0.420902  prophage tail component
UTI89_C1332  1  22  -2.342707  hypothetical protein
UTI89_C1333  2  16  -1.097691  hypothetical protein
UTI89_C1334  1  17  -1.018561  hypothetical protein
UTI89_C1335  1  18  -1.125011  hypothetical protein
UTI89_C1336  1  21  -4.010970  SitD protein
UTI89_C1337  1  21  -4.432662  SitC protein
UTI89_C1338  2  27  -6.925889  SitB protein
UTI89_C1339  2  30  -7.729028  iron transport protein, periplasmic-binding
UTI89_C1340  1  33  -9.182341  hypothetical protein
UTI89_C1341  1  33  -9.738454  hypothetical protein
UTI89_C1342  0  32  -7.748232  isocitrate dehydrogenase
UTI89_C1343  0  33  -8.389888  hypothetical protein
UTI89_C1344  1  34  -8.182445  transcriptional regulator YcgE
UTI89_C1345  0  32  -7.467585  hypothetical protein
UTI89_C1346  1  34  -7.111992  hypothetical protein
UTI89_C1347  3  35  -6.289807  hypothetical protein
UTI89_C1348  1  29  -4.883426  hypothetical protein
UTI89_C1349  0  25  -4.340518  hypothetical protein
UTI89_C1350  1  24  -2.229959  hypothetical protein
UTI89_C1351  1  29  -5.165953  hypothetical protein
UTI89_C1352  0  26  -5.056172  hypothetical protein
UTI89_C1353  1  26  -4.736576  hypothetical protein
UTI89_C1354  1  26  -6.075851  hypothetical protein
UTI89_C1355  0  23  -4.265488  hypothetical protein
UTI89_C1356  0  20  -3.935029  hypothetical protein
UTI89_C1357  0  20  -3.605092  hypothetical protein
UTI89_C1358  0  24  -4.782269  hypothetical protein
UTI89_C1359  -1  21  -5.140898  cell division topological specificity factor
UTI89_C1360  0  20  -5.170070  cell division inhibitor MinD
UTI89_C1361  -2  23  -4.651267  septum formation inhibitor
UTI89_C1362  -3  19  -5.691080  hypothetical protein
UTI89_C1363  -2  19  -5.933467  hypothetical protein
UTI89_C1364  -2  19  -4.933417  protein YcgK
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1281  UREASE  29  0.007  Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 28.6 bits (64), Expect = 0.007
Identities = 18/66 (27%), Positives = 26/66 (39%), Gaps = 7/66 (10%)

Query: 57 IMLAQHALLIAISSDLNAYGVVCEFDWN----DGNGQEGWPPMDGSEGIRITD---IDTS 109
+ LA L I + D +G +F DG GQ G+ IT+ +D
Sbjct: 22 VRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTREGGAVDTVITNALILDHW 81

Query: 110 GIFDSD 115
GI +D
Sbjct: 82 GIVKAD 87


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1291  FLGMOTORFLIG  27  0.043  Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 27.5 bits (61), Expect = 0.043
Identities = 17/77 (22%), Positives = 27/77 (35%), Gaps = 11/77 (14%)

Query: 2 KNIAAQMVNFDREQM-----------RRIANNMPEQYDEKPQVQQVAQIINGVFSQLLAT 50
N+A ++ DR +++A+ E Y V V +IIN +
Sbjct: 165 TNVARRIALMDRTSPEVVREVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKF 224

Query: 51 FPASLANRDQNELNEIR 67
SL D EI+
Sbjct: 225 IIESLEEEDPELAEEIK 241


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1301  ECOLIPORIN  508  0.0  E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 508 bits (1310), Expect = 0.0
Identities = 241/388 (62%), Positives = 280/388 (72%), Gaps = 33/388 (8%)

Query: 21 MKKLTVAISAVAASVLMAMSAQAAEIYNKDSNKLDLYGKVNAKHYFSSNDADDGDTTYAR 80
MK+ +A+ V ++L A +A AAEIYNKD NKLDLYGKV+ HYFS + + DGD TY R
Sbjct: 1 MKRKVLAL--VIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMR 58

Query: 81 LGFKGETQINDQLTGFGQWEYEFKGNRAESQGSSKDKTRLAFAGLKFGDYGSIDYGRNYG 140
+GFKGETQINDQLTG+GQWEY + N E +G++ TRLAFAGLKFGDYGS DYGRNYG
Sbjct: 59 VGFKGETQINDQLTGYGQWEYNVQANTTEGEGANS-WTRLAFAGLKFGDYGSFDYGRNYG 117

Query: 141 VAYDIGAWTDVLPEFGGDTWTQTDVFMTGRTTGVATYRNNDFFGLVDGLNFAAQYQGKND 200
V YD+ WTD+LPEFGGD++T D +MTGR GVATYRN DFFGLVDGLNFA QYQGKN+
Sbjct: 118 VLYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNE 177

Query: 201 R----------------TDVTEANGDGFGFSTTYEY-EGFGVGATYAKSDRTDGQVAYGK 243
D+ NGDGFG STTY+ GF GA Y SDRT+ QV G
Sbjct: 178 SQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGG 237

Query: 244 SKFNASGKNAEVWAAGLKYDANNIYLATTYSETQNMTVFG------NNHIANKAQNFEAV 297
+ A G A+ W AGLKYDANNIYLAT YSET+NMT +G + +ANK QNFE
Sbjct: 238 T--IAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVT 295

Query: 298 AQYQFDFGLRPSVAYLQSKGKDLGVH----GDRDLVKYVDVGATYYFNKNMSTFVDYKIN 353
AQYQFDFGLRP+V++L SKGKDL + D+DLVKY DVGATYYFNKN ST+VDYKIN
Sbjct: 296 AQYQFDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKIN 355

Query: 354 LID-DSKFTKTAGIDTDDIVAVGLVYQF 380
L+D D F K AGI TDDIVA+G+VYQF
Sbjct: 356 LLDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1305  PF06291  189  2e-66  Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 189 bits (482), Expect = 2e-66
Identities = 102/102 (100%), Positives = 102/102 (100%)

Query: 12 MQDNKMKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHHFFVSGIGQKKTVDAA 71
MQDNKMKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHHFFVSGIGQKKTVDAA
Sbjct: 1 MQDNKMKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHHFFVSGIGQKKTVDAA 60

Query: 72 KICGGAENVVKTETQQTFVNGLLGFITLGIYTPLEARVYCSQ 113
KICGGAENVVKTETQQTFVNGLLGFITLGIYTPLEARVYCSQ
Sbjct: 61 KICGGAENVVKTETQQTFVNGLLGFITLGIYTPLEARVYCSQ 102


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1306  TONBPROTEIN  69  2e-17  Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 68.9 bits (168), Expect = 2e-17
Identities = 33/82 (40%), Positives = 46/82 (56%)

Query: 41 ADEPRQLVTVYPRYPEYAAANYIKGLVEVKFDIGADGTVTRIVFLRSEPHNLFRDEVVKA 100
A PR L P+YP A A I+G V+VKFD+ DG V + L ++P N+F EV A
Sbjct: 150 ASGPRALSRNQPQYPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREVKNA 209

Query: 101 MAKWRFEKNRPCQGVKRQFIFT 122
M +WR+E +P G+ +F
Sbjct: 210 MRRWRYEPGKPGSGIVVNILFK 231


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1312  2FE2SRDCTASE  31  0.010  Ferric iron reductase signature.
		>2FE2SRDCTASE#Ferric iron reductase signature.

Length = 262

Score = 31.2 bits (70), Expect = 0.010
Identities = 10/41 (24%), Positives = 21/41 (51%), Gaps = 1/41 (2%)

Query: 310 TRDGLMFFSARGDEIPPPRSITFHIWTAYSPFTTWVQIVYD 350
R+ L+ F R DE P ++T W++ + ++ + + D
Sbjct: 36 HREHLLEF-IRLDEPAPLNAMTLAQWSSPNVLSSLLAVYSD 75


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
UTI89_C1325  GPOSANCHOR  41     2e-05    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 41.2 bits (96), Expect = 2e-05
Identities = 56/377 (14%), Positives = 125/377 (33%), Gaps = 36/377 (9%)

Query: 236 SGLTAMARQFHNVTAEQIAYVAQLQRSGDESGALQAANEAATKGFDDQTRRLKENMGTLE 295
S R+ +E+ + + +L+ + + + + L+ L
Sbjct: 95 SNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALA 154

Query: 296 TWADRTARAFKSMWDAVLDI-GRPDTAQEMLIKAEAAFKKADDIWNLRKDDYFVNDEARA 354
+A + + + T + EA + + + +
Sbjct: 155 ARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIK 214

Query: 355 RYWDDREKARLALEAARK-KAEQQSQQDKNAQQQSDTEASRLKYTEEAQKAYERLQTPLE 413
++ K + ++ + EA + + + L+ +
Sbjct: 215 TLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMN 274

Query: 414 KYTARQEELNKALKDGKILQADYNTLMAAAKKDYEATLKKPKQSGVKVSAGDRQEDSAHA 473
TA ++ + L+A+ L + A + + R D++
Sbjct: 275 FSTADSAKIKTLEAEKAALEAEKADLEHQ-SQVLNANRQSLR----------RDLDASRE 323

Query: 474 ALLTLQAELRTLEKHAGANEKISQQ-RRDL-------WKAESQFAVLEEAAQRRQLSAQE 525
A L+AE + LE+ +E Q RRDL + E++ LEE + + S Q
Sbjct: 324 AKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQS 383

Query: 526 KS--LLAHKDETLEYKRQLAALGDKVTYQERLNALAQQADKFAQQQRAKRAAIDAKSRGL 583
L A ++ + ++ L K+ E+LN +++ K ++++A
Sbjct: 384 LRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEESKKLTEKEKA------------ 431

Query: 584 TDRQAEREATEQRLKEQ 600
+ QA+ EA + LKE+
Sbjct: 432 -ELQAKLEAEAKALKEK 447


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C1330  PF06291  28     0.012    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 27.7 bits (61), Expect = 0.012
Identities = 13/40 (32%), Positives = 19/40 (47%), Gaps = 5/40 (12%)

Query: 122 MTGILFSLGASMVLGGVAQML-----APKARTPRTQTTDN 156
M +LFS +M++ G AQ P A TP+ T +
Sbjct: 6 MKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHH 45


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C1332  IGASERPTASE  41     2e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.8 bits (95), Expect = 2e-05
Identities = 27/132 (20%), Positives = 56/132 (42%), Gaps = 15/132 (11%)

Query: 123 SQSAAAAKKSETAAASSRNA--AKTSETNAGNSAKAAASSKTAAQNAATAAERSETNARA 180
S + A+ E A ++T+ET A NS + + + + Q+A + N
Sbjct: 1012 SNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQ---NREV 1068

Query: 181 SEEASADSEEASRRN--AESAAENAGVATTKAREAAADATKAGQKKDEALSAATRAEKAA 238
++EA ++ + ++ N A+S +E +E TK ++ A EK
Sbjct: 1069 AKEAKSNVKANTQTNEVAQSGSE--------TKETQTTETKETATVEKEEKAKVETEKTQ 1120

Query: 239 DRAEVAAEVTAE 250
+ +V ++V+ +
Sbjct: 1121 EVPKVTSQVSPK 1132


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits      Score  E-value  Comments
UTI89_C1339  adhesinb  329    e-115    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 329 bits (846), Expect = e-115
Identities = 90/296 (30%), Positives = 163/296 (55%), Gaps = 7/296 (2%)

Query: 9 MLLGGLALTCSIAFQASATEKFKVITTFTIIADMAKNVAGDAAEVSSITKPGAEIHEYQP 68
+G A + + + + K V+ T +IIAD+ KN+AGD + SI G + HEY+P
Sbjct: 13 AFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKNIAGDKINLHSIVPVGQDPHEYEP 72

Query: 69 TPGDIKRAQGAQLILANGMNLEL----WFQRFYQHLNGVPE---VIVSSGVTPVGITEGP 121
P D+K+ A LI NG+NLE WF + ++ VS GV + +
Sbjct: 73 LPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENAKKKENKDYYAVSEGVDVIYLEGQS 132

Query: 122 YEGKPNPHAWMSPDNALIYVDNIRDALIKYDPANAQTYQRNADTYKAKITQTLAPLRKQI 181
+GK +PHAW++ +N +IY NI L + DPAN +TY++N Y K++ +++
Sbjct: 133 EKGKEDPHAWLNLENGIIYAQNIAKRLSEKDPANKETYEKNLKAYVEKLSALDKEAKEKF 192

Query: 182 TELPENQRWMVTSEGAFSYLARDLGLKELYLWPINADQQGTPQQVRKVVDIVKKNHIPAV 241
+P ++ +VTSEG F Y ++ + Y+W IN +++GTP Q++ +V+ ++K +P++
Sbjct: 193 NNIPGEKKMIVTSEGCFKYFSKAYNVPSAYIWEINTEEEGTPDQIKTLVEKLRKTKVPSL 252

Query: 242 FSESTISDKPARQVARETGAHYGGVLYVDSLSTENGPVPTYIDLLKVTTSTLVQGI 297
F ES++ D+P + V+++T ++ DS++ + +Y ++K + +G+
Sbjct: 253 FVESSVDDRPMKTVSKDTNIPIYAKIFTDSVAEKGEEGDSYYSMMKYNLEKIAEGL 308


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C1357  PRTACTNFAMLY  42     9e-08    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 42.0 bits (98), Expect = 9e-08
Identities = 26/101 (25%), Positives = 45/101 (44%), Gaps = 1/101 (0%)

Query: 8 TRSIYRELGATLSYNMRLGNGMEIEPWLKAAVRKEFVDDNRVKVNSDGNFINDLSGRRGI 67
S+ LG + + L G +++P++KA+V +EF V N + +L G R
Sbjct: 811 GSSVLGRLGLEVGKRIELAGGRQVQPYIKASVLQEFDGAGTVHTNGIAH-RTELRGTRAE 869

Query: 68 YQAGIKASFSSTLSGHLGVGYSRGAGVESPWNAVVGVNWSF 108
G+ A+ S + YS+G + PW G +S+
Sbjct: 870 LGLGMAAALGRGHSLYASYEYSKGPKLAMPWTFHAGYRYSW 910


21  UTI89_C1425  UTI89_C1439  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C1425  -1      19       -3.630387  Tpn1330-like protein
UTI89_C1428  -1      19       -3.353936  **formyltetrahydrofolate deformylase
UTI89_C1429  -1      21       -3.387408  hypothetical protein
UTI89_C1430  -1      22       -3.630830  hypothetical protein
UTI89_C1431  -2      24       -5.329631  response regulator of RpoS
UTI89_C1432  -1      17       -3.447651  UTP-glucose-1-phosphate uridylyltransferase
UTI89_C1433  1       28       -2.010290  hypothetical protein
UTI89_C1434  0       26       -2.090353  global DNA-binding transcriptional dual
UTI89_C1435  0       23       -2.753182  hypothetical protein
UTI89_C1436  1       21       -2.921536  thymidine kinase
UTI89_C1437  0       19       -2.683935  transposase InsG for insertion sequence element
UTI89_C1438  -1      22       -3.045834  bifunctional acetaldehyde-CoA/alcohol
UTI89_C1439  -1      14       -3.397338  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits  Score  E-value  Comments
UTI89_C1429  SECA  57     2e-12    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 57.2 bits (138), Expect = 2e-12
Identities = 16/28 (57%), Positives = 20/28 (71%)

Query: 125 IDGTRPQFGRNDPCPCGSGKKFKKCCGQ 152
+ GRNDPCPCGSGKK+K+C G+
Sbjct: 872 AQTGERKVGRNDPCPCGSGKKYKQCHGR 899


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C1431  HTHFIS  87     4e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 87.2 bits (216), Expect = 4e-21
Identities = 40/152 (26%), Positives = 65/152 (42%), Gaps = 3/152 (1%)

Query: 10 ILIVEDEQVFRSLLDSWFSSLGATTVLAADGVDALELLGGFTPDLMICDIAMPRMNGLKL 69
IL+ +D+ R++L+ S G + ++ + DL++ D+ MP N L
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 70 LEHIRNSGDQTPVLVISATENMADIAKALRLGVEDVLLKPVKDLNRLREMVFACLYPSMF 129
L I+ + PVLV+SA KA G D L KP DL L ++ L +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF-DLTELIGIIGRAL--AEP 122

Query: 130 NSRVEEEERLFRDWDAMVDNPAAAAKLLQELQ 161
R + E +D +V AA ++ + L
Sbjct: 123 KRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLA 154


22  UTI89_C1448  UTI89_C1524  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C1448  -1      22       -3.401737  hypothetical protein
UTI89_C1449  -1      22       -3.705534  voltage-gated potassium channel
UTI89_C1450  1       17       -1.118474  hypothetical protein
UTI89_C1451  2       16       -1.489985  transport protein TonB
UTI89_C1452  1       17       -3.121076  hypothetical protein
UTI89_C1453  0       17       -3.662133  acyl-CoA thioester hydrolase
UTI89_C1454  0       17       -3.304685  intracellular septation protein A
UTI89_C1455  0       19       -3.260299  hypothetical protein
UTI89_C1456  1       22       -3.849450  outer membrane protein W
UTI89_C1457  2       23       -3.973196  prophage CP-933O integrase
UTI89_C1458  1       23       -3.633912  excisionase
UTI89_C1459  1       22       -3.709265  prophage CP-933O exonuclease VIII
UTI89_C1460  1       29       -5.461509  hypothetical protein
UTI89_C1461  1       29       -6.260405  regulator of cell division encoded by prophage
UTI89_C1462  3       31       -5.031181  hypothetical protein
UTI89_C1463  4       27       -3.929532  hypothetical protein
UTI89_C1464  3       27       -1.446016  hypothetical protein
UTI89_C1465  2       28       -1.586345  hypothetical protein
UTI89_C1466  1       30       -5.154929  prophage CP-933O repressor protein
UTI89_C1467  1       32       -6.447847  hypothetical protein
UTI89_C1468  0       32       -6.783676  hypothetical protein
UTI89_C1469  1       34       -7.223974  hypothetical protein
UTI89_C1470  3       37       -9.488973  prophage hypothetical protein
UTI89_C1471  0       29       -6.947411  hypothetical protein
UTI89_C1472  0       27       -3.624047  hypothetical protein
UTI89_C1473  0       21       -0.082210  hypothetical protein
UTI89_C1474  -1      21       0.730213   cryptic prophage CP-933M cell killing protein
UTI89_C1475  0       20       1.013209   hypothetical protein
UTI89_C1476  0       21       0.983386   hypothetical protein
UTI89_C1478  1       28       0.506601   endodeoxyribonuclease RusA
UTI89_C1479  3       28       0.242839   cryptic prophage CP-933M antitermination protein
UTI89_C1480  6       34       0.887382   hypothetical protein
UTI89_C1481  5       34       0.463846   prophage CP-933O DNA adenine methyltransferase
UTI89_C1484  8       37       -1.111241  **hypothetical protein
UTI89_C1485  5       32       -0.687285  hypothetical protein
UTI89_C1486  5       31       -1.878591  hypothetical protein
UTI89_C1487  4       26       -0.998443  lambdoid prophage DLP12 lysis protein S-like
UTI89_C1488  2       25       -4.924779  hypothetical protein
UTI89_C1489  3       24       -5.412913  hypothetical protein
UTI89_C1490  1       24       -6.137750  phage lysozyme
UTI89_C1491  1       29       -6.925355  hypothetical protein
UTI89_C1492  2       19       -3.024823  bacteriophage lambda cell lysis protein
UTI89_C1493  3       21       -3.163267  hypothetical protein
UTI89_C1494  2       23       1.089883   prophage hypothetical protein
UTI89_C1495  1       24       3.399444   hypothetical protein
UTI89_C1496  1       24       3.922569   prophage Qin DNA packaging protein NU1-like
UTI89_C1497  1       24       4.041303   prophage DNA packaging protein terminase large
UTI89_C1498  1       28       5.372220   prophage DNA packaging protein
UTI89_C1499  2       28       5.436254   prophage capsid protein
UTI89_C1500  3       26       5.011773   prophage capsid assembly protein
UTI89_C1501  4       27       2.538273   capsid protein small subunit
UTI89_C1502  3       25       2.425835   prophage capsid protein
UTI89_C1503  3       25       2.462110   hypothetical protein
UTI89_C1504  3       26       2.591026   prophage head-tail joining protein
UTI89_C1505  3       27       4.624169   minor tail protein
UTI89_C1506  3       27       5.186174   prophage tail component
UTI89_C1507  2       24       5.268054   tail fiber component V of prophage CP-933U
UTI89_C1508  3       26       5.452153   prophage tail component
UTI89_C1509  3       27       6.046748   tail component of prophage CP-933O
UTI89_C1510  3       26       6.093893   tail component of prophage CP-933O
UTI89_C1511  2       23       3.415392   minor tail protein
UTI89_C1512  3       25       4.238839   hypothetical protein
UTI89_C1513  3       24       4.209413   minor tail protein
UTI89_C1514  0       21       3.656603   prophage tail fiber component K
UTI89_C1515  0       20       3.744767   tail component of prophage CP-933K
UTI89_C1516  0       21       3.118503   hypothetical protein
UTI89_C1517  0       20       2.817431   prophage tail component
UTI89_C1518  1       25       1.081493   hypothetical protein
UTI89_C1519  2       27       0.561871   tail fiber protein
UTI89_C1520  2       29       -2.003370  hypothetical protein
UTI89_C1521  5       33       -6.785414  phage-related tail fiber assembly protein G
UTI89_C1522  4       34       -7.932531  hypothetical protein
UTI89_C1523  3       32       -6.762722  hypothetical protein
UTI89_C1524  0       23       -3.969706  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C1450  adhesinmafb  31     4e-04    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 31.2 bits (70), Expect = 4e-04
Identities = 16/57 (28%), Positives = 20/57 (35%), Gaps = 2/57 (3%)

Query: 41 GPMPAVDSNDPGAAGFTGSTVIAEFESLEAAQAWADADPYVAAGVYEHVSVKPFKKV 97
P+PA G GS E + EA W +P A V +V KV
Sbjct: 268 APLPA--EGKFAVIGGLGSVAGFEKNTREAVDRWIQENPNAAETVEAVFNVAAAAKV 322


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C1451  TONBPROTEIN  259    1e-89    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 259 bits (663), Expect = 1e-89
Identities = 234/239 (97%), Positives = 236/239 (98%), Gaps = 1/239 (0%)

Query: 18 MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVAPADLEPPQA 77
MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMV PADLEPPQA
Sbjct: 1 MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADLEPPQA 60

Query: 78 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQ-PKRDVKPVESR 136
VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKV++ PKRDVKPVESR
Sbjct: 61 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 120

Query: 137 PASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF 196
PASPFENTAPAR TSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF
Sbjct: 121 PASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF 180

Query: 197 DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ 255
DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ
Sbjct: 181 DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ 239


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C1461  STREPKINASE  29     0.002    Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 29.3 bits (65), Expect = 0.002
Identities = 12/36 (33%), Positives = 22/36 (61%)

Query: 17 TSPGSTRHRITKFIVEDAIMETLLPNVNTSEGCFEI 52
T G+ H++ K + AI E L+ NV++++ FE+
Sbjct: 91 TDSGAMSHKLEKADLLKAIQEQLIANVHSNDDYFEV 126


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C1474  HOKGEFTOXIC  64     3e-18    Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 64.5 bits (157), Expect = 3e-18
Identities = 18/46 (39%), Positives = 31/46 (67%)

Query: 23 QKAMLIALIVICLTVIVTALVTRKDLCEVRIRTGQTEVAVFTVYEP 68
+ +++ ++++CLT+++ +TRK LCE+R R G EVA F YE
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C1481  FbpA_PF05833  30     0.013    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 30.2 bits (68), Expect = 0.013
Identities = 13/56 (23%), Positives = 24/56 (42%), Gaps = 4/56 (7%)

Query: 204 ESDYLKLQA--LFARVAEEKHR-RGELEKLHHQLVDTYTSLN-RQYAELLSEYKHL 255
+SD LK ++ L V +R + + L++ L + Y ELL+ +
Sbjct: 293 KSDRLKSKSSDLQKIVMNNINRCTKKDKILNNTLKKCEDKDIFKLYGELLTANIYA 348


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C1507  INTIMIN  33     0.001    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 32.7 bits (74), Expect = 0.001
Identities = 23/119 (19%), Positives = 45/119 (37%), Gaps = 17/119 (14%)

Query: 134 KEVITRTVKVTNVGKPSVAEERSKITPVSAIKVTP-------------TSGTVAKGKTTT 180
++ IT TVKV KP +E + T + + + TS T K +
Sbjct: 675 QDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSA 734

Query: 181 LT--VSFEPESATDKTFRAVSADPSKATI--SVKDMTITVNGVATGKVQIPVVSGNGQF 235
V+ + ++ + F ++ D I + + + G+V + GNG++
Sbjct: 735 RVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKY 793


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
UTI89_C1510  GPOSANCHOR  38     2e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 37.7 bits (87), Expect = 2e-04
Identities = 38/231 (16%), Positives = 71/231 (30%), Gaps = 12/231 (5%)

Query: 377 TLQSDMEKAGELAARDRAERESSQLKYTGEAQKAYERLQTPLDKYTARQKELNKALKDGK 436
+ ++ + E D + + ++ + L+ AR+ +L KAL+
Sbjct: 109 SEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAM 168

Query: 437 ILQADYNTLMASAKKDYESTLKKPSGVKVSAGERQEDRAHAALLALETELRTLEKHSGVN 496
SAK K + + E+ + A A +++TLE
Sbjct: 169 NFSTAD-----SAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAAL 223

Query: 497 E---KISQQRRDLWEAESQYVVLKEAATKRQLSEQEKSLLAHEKETLEYKRQLAELGDKI 553
++ + S K + + + E EK KI
Sbjct: 224 AARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKI 283

Query: 554 E-HQKRLNELAQQAARFEQQQSAKQAAISAKARGLTDRQAQRESEEQRLRE 603
+ + L + A E Q A + R L A RE+++Q E
Sbjct: 284 KTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDL---DASREAKKQLEAE 331


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C1515  PF06291  27     0.032    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 26.5 bits (58), Expect = 0.032
Identities = 13/37 (35%), Positives = 17/37 (45%), Gaps = 5/37 (13%)

Query: 128 ILFSMGAAMTLGGVAQML-----APKARTPRTQTTDN 159
+LFS AM + G AQ P A TP+ T +
Sbjct: 9 MLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHH 45


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C1516  PHAGEIV  30     0.001    Gene IV protein signature.
		>PHAGEIV#Gene IV protein signature.

Length = 426

Score = 30.3 bits (68), Expect = 0.001
Identities = 15/49 (30%), Positives = 30/49 (61%), Gaps = 2/49 (4%)

Query: 35 KNIDELSGCISRQWAGNGTPITSLPIEN-GVSL-LVPQAMGGYDVVLDI 81
+N+ ++G ++ + A P ++ +N G+S+ + P AM G ++VLDI
Sbjct: 289 QNVPFITGRVTGESANVNNPFQTVERQNVGISMSVFPVAMAGGNIVLDI 337


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C1518  ENTEROVIROMP  138    4e-44    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 138 bits (350), Expect = 4e-44
Identities = 63/200 (31%), Positives = 98/200 (49%), Gaps = 30/200 (15%)

Query: 1 MRKVCAVILSAAICLSVSGAPAWASEHQSTLSAGYLHARTNAPGSDNLNGINVKYRYEFT 60
M+K+ + AA+ +G A ST++ GY + + + G N+KYRYE
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAA---TSTVTGGYAQSDAQGQMNK-MGGFNLKYRYEED 56

Query: 61 DA-LGLITSFSYANAEDEQKTHYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYSMAGV 119
++ LG+I SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV
Sbjct: 57 NSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 AYSRVSTFSGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVTIDLAYE 179
Y + T T+ HD S+ ++GAG+QFNP E+V +D +YE
Sbjct: 109 GYGKFQT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYE 151

Query: 180 GSGSGDWRTDAFIVGIGYRF 199
S +I G+GYRF
Sbjct: 152 QSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C1519  CHANLCOLICIN  46     8e-07    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 45.8 bits (108), Expect = 8e-07
Identities = 54/319 (16%), Positives = 116/319 (36%)

Query: 152 ARAASTSAGQAASSAQSASSSAGTASTKAREAAKSAAAAESSKSAAATSASAAKTSETNA 211
+ S S AA A + S+A T+A +AA++ AAAE+ A A + + +
Sbjct: 39 GKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALTQRLKDIV 98

Query: 212 AASQQSAATSASTATTKASEAATSARDASASKEAAKSSETNAASSASSAASSATAAANSA 271
+ + A+ +AT A + + AK+ E + ++ + A
Sbjct: 99 NEALRHNASRTPSATELAHANNAAMQAEDERLRLAKAEEKARKEAEAAEKAFQEAEQRRK 158

Query: 272 KAAKTSETNARSSETAAGQSASAAADSKTAAALSASAASTSAGQASASATAAGKSAESAA 331
+ + R + A + AA S+ A A+ + SA Q+ ++
Sbjct: 159 EIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSR 218

Query: 332 SSASTATTKAGEAAVQASAAARSASAAKTSKTNAKASETSAESSKTAAASSASSAASSAS 391
S+S A + + ++AK + + + S ++ A
Sbjct: 219 LSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRV 278

Query: 392 SASASKDEATRQASAAKGSATTASTKATEAAGSATAAAQSKSTAESAATRAETAAKRAED 451
A ++E +Q +A++ + T+ + + + +++ + AE K+A++
Sbjct: 279 GAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARVHEAEENLKKAQN 338

Query: 452 IASAVALEDASTTKKGIVQ 470
++DA Q
Sbjct: 339 NLLNSQIKDAVDATVSFYQ 357



Score = 36.6 bits (84), Expect = 5e-04
Identities = 66/358 (18%), Positives = 125/358 (34%), Gaps = 30/358 (8%)

Query: 313 AGQASASATAAGKSAESAASSASTATTKAGEAAVQASAAARSASAAKTSKTNAKASETSA 372
+G KS SAA A+ + A QA AAR+ +AA ++ A
Sbjct: 32 SGSGGGGGKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAA--------EAQAKA 83

Query: 373 ESSKTAAASSASSAASSASSASASKDEATRQASAAKGSATTASTKATEAAGSATAAAQSK 432
++++ A + A +AS+ + + + A+ A +A A+++
Sbjct: 84 KANRDALTQRLKDIVNEALRHNASRTPSATELA-------HANNAAMQAEDERLRLAKAE 136

Query: 433 STAESAATRAETAAKRAEDIASAVALEDASTTKKGIVQLSSATNSTSESLAATPKAVKAA 492
A A AE A + AE + E A T ++ ++L+ A +L+ KAV+ A
Sbjct: 137 EKARKEAEAAEKAFQEAEQRRKEIEREKAETERQ--LKLAEAEEKRLAALSEEAKAVEIA 194

Query: 493 YELANGKYTAQDATTAQKGIVQLSNATNSTSEMLAATPKSVKAAYDLANGKYTAQDATTA 552
+ + AQ +V++ + + L+++ + A GK +A
Sbjct: 195 ---------QKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASA 245

Query: 553 QKGIVQLSSATNSTSEMLAATPKSVKAAYDLANGKYTAQDAT-TAQKGIVQLSSATNSAS 611
+ + A P + ++ + A QK + + N +
Sbjct: 246 K---YKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASETRINRIN 302

Query: 612 ETLAATPKAVKAANNNANGRVPSARKVNGKALSADITLTPKDIGTLNSTTMSFSGGAG 669
+ KA+ +NN N + + A L I T+SF
Sbjct: 303 ADITQIQKAISQVSNNRNAGIARVHEAEENLKKAQNNLLNSQIKDAVDATVSFYQTLT 360



Score = 31.2 bits (70), Expect = 0.022
Identities = 52/321 (16%), Positives = 99/321 (30%), Gaps = 23/321 (7%)

Query: 114 SRNASAVAQNTAAAKKSASDASASASEAATHATDAAASARAASTSAGQAASSAQSASSSA 173
S++ S+ A + A +A A +AA A A A+A + + +
Sbjct: 43 SKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALTQRLKDIVNEAL 102

Query: 174 GTASTKAREAAKSAAAAESSKSAAATSASAAKTSETNAAASQQSAATSASTATTKASEAA 233
+++ A + A A ++ A AK + A + A KA + A
Sbjct: 103 RHNASRTPSATELAHANNAAMQAEDERLRLAK---------AEEKARKEAEAAEKAFQEA 153

Query: 234 TSARDASASKEAAKSSETNAAS------SASSAASSATAAANSAKAAKTSETNARSSE-T 286
R ++A + A +A S + A A +A SE E
Sbjct: 154 EQRRKEIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIK 213

Query: 287 AAGQSASAAADSKTAAALSASAASTSAGQASASATAAGKSAESAASSASTA-TTKAGEAA 345
S++ ++ A + + QASA + + + A+ + A
Sbjct: 214 TLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEA 273

Query: 346 VQASAAARSASAAKTSKTNAKASETSAESSKTAAASSASSAASSASSAS------ASKDE 399
+ A K + A + + ++ A S S+ +A A ++
Sbjct: 274 TRRRVGAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARVHEAEENL 333

Query: 400 ATRQASAAKGSATTASTKATE 420
Q + A
Sbjct: 334 KKAQNNLLNSQIKDAVDATVS 354


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C1522  LUXSPROTEIN  30     0.005    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 29.9 bits (67), Expect = 0.005
Identities = 17/66 (25%), Positives = 30/66 (45%), Gaps = 7/66 (10%)

Query: 40 TKEHLLPHFL-EHVGNNHLDI------GVGTGFYLTHVPESSLISLMDLNEASLNAASTR 92
T EHL F+ H+ + ++I G TGFY++ + S + D A++
Sbjct: 54 TLEHLYAGFMRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKV 113

Query: 93 AGESKI 98
++KI
Sbjct: 114 ENQNKI 119


23  UTI89_C1597  UTI89_C1610  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C1597  -1      19       -4.218331  murein peptide amidase A
UTI89_C1598  0       20       -5.141659  hypothetical protein
UTI89_C1599  -1      16       -4.050367  hypothetical protein
UTI89_C1600  -1      14       -3.789126  transcriptional regulator YcjZ
UTI89_C1601  -1      12       -3.103338  periplasmic murein peptide-binding protein
UTI89_C1602  -1      15       -3.383133  hypothetical protein
UTI89_C1603  -1      16       -2.523292  hypothetical protein
UTI89_C1604  -1      15       -2.963955  universal stress protein UspE
UTI89_C1605  1       24       -5.491481  fumarate/nitrate reduction transcriptional
UTI89_C1606  1       27       -5.318931  O-6-alkylguanine-DNA:cysteine-protein
UTI89_C1607  0       26       -5.817668  hypothetical protein
UTI89_C1608  -2      26       -4.801005  hypothetical protein
UTI89_C1609  -1      21       -3.583191  secretion protein
UTI89_C1610  -1      19       -3.169441  transporter
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C1609  RTXTOXIND  114    9e-31    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 114 bits (288), Expect = 9e-31
Identities = 61/414 (14%), Positives = 128/414 (30%), Gaps = 105/414 (25%)

Query: 11 VVAIGILLAGVVFFIW-WVSK--------GRFIQTTDDAYIGGNITTVASKVSGYISAIE 61
+VA I+ V+ FI + + G+ G + + + I
Sbjct: 59 LVAYFIMGFLVIAFILSVLGQVEIVATANGKLT-------HSGRSKEIKPIENSIVKEII 111

Query: 62 VRDNQSVKKGDIILRLDDRDYRANVARLEAKIKSSKANLESIQATI-------------- 107
V++ +SV+KGD++L+L A+ + ++ + ++ Q
Sbjct: 112 VKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLP 171

Query: 108 -------------AMQQSIIQSASETWQAVKHEEQKRLRD--------TERYEKLAQSAA 146
S+I+ TWQ K++++ L R + +
Sbjct: 172 DEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSR 231

Query: 147 ISQQIIDNAR-------FDYQQVAAKERKAANDFLVEKQRLAVLSAQEENVRASIEEVQA 199
+ + +D+ V +E K L V +Q E + + I +
Sbjct: 232 VEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEA----VNELRVYKSQLEQIESEILSAKE 287

Query: 200 ALTQ-----------------------------ALLDLEYTLVRAPIDGIVANRSAHT-G 229
+ +++RAP+ V HT G
Sbjct: 288 EYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEG 347

Query: 230 SWVEGGTSLVSLVPVSE-LWVDANYKENQIAGMKPGMKAEIRADILKGEVFH---GHIES 285
V +L+ +VP + L V A + I + G A I+ + + G +++
Sbjct: 348 GVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKN 407

Query: 286 LSPATGASFSLIPIENATGNFTKIVQRVPVRIAFDDAKELKQLLRPGLSVTVSV 339
++ + G ++ + K + L G++VT +
Sbjct: 408 INLDA-------IEDQRLGLVFNVIISIEENCLSTGNKNIP--LSSGMAVTAEI 452


ORFs having significant similarity with Known Virulence factors
LocusTagHitsScoreE-valueComments
UTI89_C1610TCRTETB1043e-26 Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 104 bits (261), Expect = 3e-26
Identities = 81/418 (19%), Positives = 171/418 (40%), Gaps = 21/418 (5%)

Query: 3 SMRKHIAFASMCIGLFIAQLDIQIVSSSLNEIGGGLSAGKDEMAWLQTSYLIAEIIVIPL 62
++R + +CI F + L+ +++ SL +I + W+ T++++ I +
Sbjct: 9 NLRHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAV 68

Query: 63 SGWLSRVFSTRWLFTLSAGIFTLMSIACGLAWN-IQIMIFFRALQGVAGASMIPLVFTTA 121
G LS + L I S+ + + ++I R +QG A+ LV
Sbjct: 69 YGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVV 128

Query: 122 FIYYQGKELGLAAAVVSALASLSPTLGPTLGGWITDNLDWRWLFYINILPGIYLVLSIPF 181
Y + G A ++ ++ ++ +GP +GG I + W +L I ++ ++++PF
Sbjct: 129 ARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMI----TIITVPF 184

Query: 182 LVNFDKPDLSLLKVADYPSIILLAMTLGCLEYTLEEGARWGWLDDNTILLTSVLALVSFI 241
L+ K ++ + D IIL+++ + +L +++++SF+
Sbjct: 185 LMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTS-YSISFL---------IVSVLSFL 234

Query: 242 LFAARTLKISNPIMDLHAFKDKYFTLGCFFSFSGGVGIFSTVYLIPVFLGQVRGLNAEEI 301
+F K+++P +D K+ F +G + V ++P + V L+ EI
Sbjct: 235 IFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEI 294

Query: 302 GFAVCTTG-IFQLFSVPFYFWLSKKINLQWLLMAGLGGFVFSMYL--FTPITHEWGWQEL 358
G + G + + L + ++L G+ S F T W + +
Sbjct: 295 GSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSW-FMTI 353

Query: 359 LFPQAIRGISQQFAMAPIVTLTLGGIPKERLKLASGVFNLTRNLGGASGIALCGSILN 416
+ + G+S F I T+ + ++ + N T L +GIA+ G +L+
Sbjct: 354 IIVFVLGGLS--FTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


24  UTI89_C1622  UTI89_C1627  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C1622  2       10       -0.788248  heat-inducible protein
UTI89_C1623  2       9        -0.594390  D-lactate dehydrogenase
UTI89_C1624  3       10       -0.677204  hypothetical protein
UTI89_C1625  4       12       -1.282399  hypothetical protein
UTI89_C1626  4       12       -1.763395  hypothetical protein
UTI89_C1627  4       11       -1.357387  EntS/YbdA MFS transporter
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C1622  PF06291  29     0.003    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 29.2 bits (65), Expect = 0.003
Identities = 16/51 (31%), Positives = 25/51 (49%), Gaps = 6/51 (11%)

Query: 5 MKKVAALVALSLLMAGC------VSSDKIAVTPEQLQHHRFVLESVNGKPV 49
MKK+ AL++L+ GC V + AVTP++ H F + + K
Sbjct: 6 MKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHHFFVSGIGQKKT 56


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C1627  ICENUCLEATIN  46     2e-06    Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 46.3 bits (109), Expect = 2e-06
Identities = 179/830 (21%), Positives = 267/830 (32%), Gaps = 18/830 (2%)

Query: 223 DNTTVNNTGKTIVDGKGATGTEIAGNNAVVNQDGELDVSGGGHGIDITGDSATVDNKGGM 282
D+T V G T G+ ++ G+ + +L +G G DS+ + G
Sbjct: 205 DSTLVAGYGSTQTAGEESSQMAGYGSTQTGMKGSDL-TAGYGSTGTAGDDSSLIAGYGST 263

Query: 283 TVTDPDSIGIQIDGDKAVVNKDGDSAISNGGTGTQVNGDEATVNNNGSTTVDGKDSTGTE 342
DS G K D G TGT D + + GST G++ST T
Sbjct: 264 QTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGT-AGADSSLIAGYGSTQTAGEESTQTA 322

Query: 343 INGDKAIVNNDGDSTILDGGTGTRITGDDAT-ANNSGNTTVDGQGSTGTEIAGNNAVVNQ 401
G D T G TGT GDD++ G+T G+ S+ T G+ Q
Sbjct: 323 GYGSTQTAQKGSDLTAGYGSTGT--AGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTA-Q 379

Query: 402 DGTLDVSGGGHGIDITGDSATVDNKGGMTVTDPDSIGIQRDGDKAVVNNDGDNAISNGGT 461
G+ +G G DS+ + G +S G D G T
Sbjct: 380 KGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGST 439

Query: 462 GTQVNGDEATVNNNGSTTVDGKDSTGTEINGDKAIVNNDGDSTILDGGTGTRITGDDATA 521
GT D + + GST G+DS+ T G D T G T T + +
Sbjct: 440 GT-AGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTST-AGYESSLI 497

Query: 522 NNSGNTTVDGQGSTGTEIAGNNAVVNQDGELDVSGGGHGIDITGDSATVDNKGGMTVTDP 581
G+T G GST T G+ + +L ++G G +S+ + G
Sbjct: 498 AGYGSTQTAGYGSTLTAGYGSTQTAQNESDL-ITGYGSTSTAGANSSLIAGYGSTQTASY 556

Query: 582 DSIGIQIDGDKAVVNNDGGSAISNGGTGTQINGDEATVNNNGNTTVDGQGSTGTEIAGNN 641
+S+ G G TGT D + + G+T S+ T G+
Sbjct: 557 NSVLTAGYGSTQTAREGSDLTAGYGSTGTA-GSDSSIIAGYGSTQTASYHSSLTAGYGST 615

Query: 642 AVVNQDGELDVSGGGHGIDITGDSATVDNKGGMTVTDPDSIGIQIDGDKAVVNNDGDSAI 701
+ L +G G DS+ + G +SI G D
Sbjct: 616 QTAREQSVL-TTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTA 674

Query: 702 SNGGTGTQVNGDEATVNNNGKTTVDGKDSTGTEINGDKAIVNNDGDSTILDGGTGTRITG 761
G T T D + + G T G +S T G D T G T T
Sbjct: 675 GYGSTST-AGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGAD 733

Query: 762 DDATANNSGNTTVDGQGSTGTEIAGNNAVVNQDGELDVSGGGHGIDITGDSATVDNKGGM 821
A G+T S+ T G+ + L +G G DS+ + G
Sbjct: 734 SSLIA-GYGSTQTASYHSSLTAGYGSTQTAREQSVL-TTGYGSTSTAGADSSLIAGYGST 791

Query: 822 TVTDPDSIGIQIDGDKAVVNNDGDNAISNGGTGTQVNGDEATVNNNGNTTVDGKDSTGTE 881
SI G D G T T D + + G+T G +S T
Sbjct: 792 QTAGYHSILTAGYGSTQTAQERSDLTTGYGSTST-AGADSSLIAGYGSTQTAGYNSILTA 850

Query: 882 INGDKAIVNNDGDSTILDGGTGTRITGDDATANNSGNTTVDGQGSTGTEIAGNNAVVNQD 941
G + D L G G+ T ++ +G + G AG +
Sbjct: 851 GYGSTQTAQENSD---LTTGYGSTSTAGYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQ 907

Query: 942 GELDVSGGGHGIDITGDSATVDNKGGMTVTDPDSIGIQIDGDKAVVNNDGDSAISNGGTG 1001
D++ G G +++ G T T + + + S + G+
Sbjct: 908 ENSDLTTGYGSTSTAGYESSLIAGYGSTQTASFKSTLMAGYGSSQTAREQSSLTAGYGST 967

Query: 1002 TQVNGDEATVNNNGNTTVDGKDSTGTEINGDKAIVNNDGDSTILDGGTGT 1051
+ D + + G+T G ST T G + T G T T
Sbjct: 968 SMAGYDSSLIAGYGSTQTAGYQSTLTAGYGSTQTAEHSSTLTAGYGSTAT 1017



Score = 44.0 bits (103), Expect = 1e-05
Identities = 184/848 (21%), Positives = 268/848 (31%), Gaps = 25/848 (2%)

Query: 350 VNNDGDSTILDGGTGTRITGDDATANNSGNTTVDGQ---GSTGTEIAGNNAVVNQDGTLD 406
V +D D+TI G T T + AT ++ + T Q G TE AG+++ +
Sbjct: 140 VTDDIDATIESGSTQPTQTIEIATYGSTLSGTHQSQLIAGYGSTETAGDSSTL------- 192

Query: 407 VSGGGHGIDITGDSATVDNKGGMTVTDPDSIGIQRDGDKAVVNNDGDNAISNGGTGTQVN 466
++G G DS V G +S + G D G TGT +
Sbjct: 193 IAGYGSTGTAGADSTLVAGYGSTQTAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGD 252

Query: 467 GDEATVNNNGSTTVDGKDSTGTEINGDKAIVNNDGDSTILDGGTGTRITGDDATANNSGN 526
D + + GST G+DS+ T G D T G TGT D + G+
Sbjct: 253 -DSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGT-AGADSSLIAGYGS 310

Query: 527 TTVDGQGSTGTEIAGNNAVVNQDGELDVSGGGHGIDITGDSATVDNKGGMTVTDPDSIGI 586
T G+ ST T G+ + +L G G GD +++ G T T + +
Sbjct: 311 TQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTG--TAGDDSSLIAGYGSTQTAGEDSSL 368

Query: 587 QIDGDKAVVNNDGGSAISNGGTGTQINGDEATVNNNGNTTVDGQGSTGTEIAGNNAVVNQ 646
G + G+ D + + G+T G+ ST T G+ +
Sbjct: 369 TAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQK 428

Query: 647 DGELDVSGGGHGIDITGDSATVDNKGGMTVTDPDSIGIQIDGDKAVVNNDGDSAISNGGT 706
+L G G DS+ + G DS G D G T
Sbjct: 429 GSDLTAGYGSTGT-AGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGST 487

Query: 707 GTQVNGDEATVNNNGKTTVDGKDSTGTEINGDKAIVNNDGDSTILDGGTGTRITGDDATA 766
T + + + G T G ST T G N+ D G T T + +
Sbjct: 488 ST-AGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTST-AGANSSLI 545

Query: 767 NNSGNTTVDGQGSTGTEIAGNNAVVNQDGELDVSGGGHGIDITGDSATVDNKGGMTVTDP 826
G+T S T G+ + +L G G DS+ + G
Sbjct: 546 AGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGT-AGSDSSIIAGYGSTQTASY 604

Query: 827 DSIGIQIDGDKAVVNNDGDNAISNGGTGTQVNGDEATVNNNGNTTVDGKDSTGTEINGDK 886
S G G T T D + + G+T G +S T G
Sbjct: 605 HSSLTAGYGSTQTAREQSVLTTGYGSTST-AGADSSLIAGYGSTQTAGYNSILTAGYGST 663

Query: 887 AIVNNDGDSTILDGGTGTRITGDDATANNSGNTTVDGQGSTGTEIAGNNAVVNQDGELDV 946
D T G T T D + G+T G S T G+ Q+G
Sbjct: 664 QTAQEGSDLTAGYGSTST-AGADSSLIAGYGSTQTAGYNSILTAGYGSTQTA-QEGSDLT 721

Query: 947 SGGGHGIDITGDSATVDNKGGMTVTDPDSIGIQIDGDKAVVNNDGDSAISNGGTGTQVNG 1006
SG G DS+ + G S G G T T
Sbjct: 722 SGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTST-AGA 780

Query: 1007 DEATVNNNGNTTVDGKDSTGTEINGDKAIVNNDGDSTILDGGTGTRITGDDATANNSGNT 1066
D + + G+T G S T G D T G T T D + G+T
Sbjct: 781 DSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYGSTSTA-GADSSLIAGYGST 839

Query: 1067 TVDGQGSTGTEIAGNNAVVNQDGELDVSGGGHGIDITGDSATVDNKGGMTVTDPDSIGIQ 1126
G S T G+ ++ +L +G G DS+ + G +SI
Sbjct: 840 QTAGYNSILTAGYGSTQTAQENSDL-TTGYGSTSTAGYDSSLIAGYGSTQTAGYNSILTA 898

Query: 1127 IDGDKAVVNNDGDNAISNGGTGTQVNGDEATVNNNGSTTVDGQGSTGTEIAGNNAVVNQD 1186
G + D G T T + + + GST ST G++ +
Sbjct: 899 GYGSTQTAQENSDLTTGYGSTST-AGYESSLIAGYGSTQTASFKSTLMAGYGSSQTAREQ 957

Query: 1187 GTLDVSGG 1194
+L G
Sbjct: 958 SSLTAGYG 965


25  UTI89_C1672  UTI89_C1678  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
UTI89_C1672  2  27  -0.587824  L-asparagine permease
UTI89_C1673  5  34  -3.431107  transferase
UTI89_C1674  5  34  -3.819299  hypothetical protein
UTI89_C1675  4  33  -2.985771  hypothetical protein
UTI89_C1676  3  34  -3.182120  hypothetical protein
UTI89_C1677  1  32  -7.890244  hypothetical protein
UTI89_C1678  0  21  -3.670261  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1676  ICENUCLEATIN  33  0.005  Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 33.2 bits (75), Expect = 0.005
Identities = 25/133 (18%), Positives = 53/133 (39%), Gaps = 8/133 (6%)

Query: 545 GHDQSITVANDRCITVRNDQTLQVTNDRTVSVSNDDGLYVRNDRKVTVEGKQEHKTTGNH 604
G +S + +R + + + Q R+ +S D + + +R + G +T G+
Sbjct: 1091 GP-ESTQITGNRSMLIAGKGSSQTAGYRSTLISGADSVQMAGERGKLIAGADSTQTAGDR 1149

Query: 605 VSLVEGKHSLVVKGDLARKVSGALGIKVDGDIVLESSSRISLKVGGSFVVIHSGGVDIVG 664
L+ G +S + GD ++ +G D +L + R L G + ++ ++G
Sbjct: 1150 SKLLAGNNSYLTAGDRSKLTAG-------NDCILMAGDRSKLTAGINSILTAGCRSKLIG 1202

Query: 665 PKISLNSGGSPGT 677
S + G
Sbjct: 1203 SNGSTLTAGENSV 1215



Score = 30.9 bits (69), Expect = 0.027
Identities = 15/69 (21%), Positives = 35/69 (50%)

Query: 567 QVTNDRTVSVSNDDGLYVRNDRKVTVEGKQEHKTTGNHVSLVEGKHSLVVKGDLARKVSG 626
Q+ + R+ ++ + + +R + + GK +T G +L+ G S+ + G+ + ++G
Sbjct: 1080 QIASHRSSLIAGPESTQITGNRSMLIAGKGSSQTAGYRSTLISGADSVQMAGERGKLIAG 1139

Query: 627 ALGIKVDGD 635
A + GD
Sbjct: 1140 ADSTQTAGD 1148


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1678  PF07299  28  0.010  Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 28.3 bits (63), Expect = 0.010
Identities = 16/51 (31%), Positives = 23/51 (45%), Gaps = 13/51 (25%)

Query: 61 LNDMYAFIPGDNYYFIKS------SGYKFVND-------KWFTLKSINNIF 98
+ M AFI D Y FIKS +G+ ND K ++ I ++F
Sbjct: 4 VIKMEAFIRSDQYNFIKSQAYILANGHATANDRGVIQALKSLAIEKIIHVF 54


26  UTI89_C1704  UTI89_C1736  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
UTI89_C1704  -2  13  -3.137851  hypothetical protein
UTI89_C1705  -2  12  -3.875953  lipoprotein YddW
UTI89_C1706  -2  13  -5.398196  amino acid antiporter
UTI89_C1707  -2  15  -6.049328  glutamate decarboxylase beta
UTI89_C1708  -2  20  -7.487294  zinc protease PqqL
UTI89_C1709  -2  21  -8.550131  hypothetical protein
UTI89_C1710  -1  23  -7.437382  ABC transporter ATP-binding protein
UTI89_C1711  0  26  -7.212064  hypothetical protein
UTI89_C1712  0  27  -6.423614  sulfatase YdeN
UTI89_C1713  1  27  -6.061897  transcriptional regulator YdeO
UTI89_C1714  2  27  -5.372794  hypothetical protein
UTI89_C1715  2  28  -5.495829  oxidoreductase
UTI89_C1716  3  34  -7.137137  Fml fimbrial adhesin FmlD
UTI89_C1717  2  33  -6.906481  outer membrane usher FmlC protein
UTI89_C1718  3  34  -6.176139  periplasmic chaperone FmlB protein
UTI89_C1719  3  35  -5.919913  Fml fimbriae subunit
UTI89_C1720  2  30  -5.940396  hypothetical protein
UTI89_C1721  1  28  -6.319378  hypothetical protein
UTI89_C1722  0  31  -7.400783  hypothetical protein
UTI89_C1723  1  31  -8.234417  hypothetical protein
UTI89_C1724  1  35  -8.932271  hypothetical protein
UTI89_C1725  1  37  -9.031159  phage-related secreted protein
UTI89_C1726  1  41  -10.229624  phage-related membrane protein
UTI89_C1727  3  45  -10.601494  hypothetical protein
UTI89_C1728  2  40  -10.332132  phage-related membrane protein
UTI89_C1729  2  39  -9.946524  hypothetical protein
UTI89_C1730  2  37  -9.343915  phage-related membrane protein
UTI89_C1731  4  42  -12.430145  phage-related membrane protein
UTI89_C1732  3  39  -11.845965  hypothetical protein
UTI89_C1733  2  31  -9.056902  hypothetical protein
UTI89_C1734  1  26  -7.318570  hypothetical protein
UTI89_C1735  -2  19  -4.912669  hypothetical protein
UTI89_C1736  -2  15  -4.431978  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1717  PF00577  561  0.0  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 561 bits (1446), Expect = 0.0
Identities = 284/494 (57%), Positives = 365/494 (73%), Gaps = 10/494 (2%)

Query: 15 QVLILPRFARLTFALGLATAVFPVDAEYYFNPRFLSNDLAESVDLSAFTKGREAPPGTYR 74
+ + F RL A A AE YFNPRFL++D DLS F G+E PPGTYR
Sbjct: 20 KHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYR 79

Query: 75 VDIYLNDEFMASRDITFIADDNNADLIPCLSTDLLVSLGIKKSALLDNKEHSADKHVPDN 134
VDIYLN+ +MA+RD+TF D+ ++PCL+ L S+G+ +++ + ++ +
Sbjct: 80 VDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASV-------SGMNLLAD 132

Query: 135 SACTPLQDRLADASSEFDVGQQHLSLSVPQIYVGRMAHGYVSPDLWEEGINAGLLNYSFN 194
AC PL + DA+++ DVGQQ L+L++PQ ++ A GY+ P+LW+ GINAGLLNY+F+
Sbjct: 133 DACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFS 192

Query: 195 GNSINNRSNHNAGKSNYAYLNLQSGINIGSWRLRDNSTWSYNSGSSNSSDSNKWQHINTS 254
GNS N G S+YAYLNLQSG+NIG+WRLRDN+TWSYNS S+S NKWQHINT
Sbjct: 193 GNS---VQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTW 249

Query: 255 AERDIIPLRSRLTVGDSYTDGDIFDSVNFRGLKINSTEAMLPDSQHGFAPVIHGIARGTA 314
ERDIIPLRSRLT+GD YT GDIFD +NFRG ++ S + MLPDSQ GFAPVIHGIARGTA
Sbjct: 250 LERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTA 309

Query: 315 QVSVKQNGYDVYQTTVPPGPFTIDDINSAANGGNLQVTIKEADGSIQTLYVPYSSVPVLQ 374
QV++KQNGYD+Y +TVPPGPFTI+DI +A N G+LQVTIKEADGS Q VPYSSVP+LQ
Sbjct: 310 QVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQ 369

Query: 375 RAGYTRYALAMGEYRSGNNLQSTPKFVQASLMHGLKGNWTPYGGMQIAEDYQAFNLGIGK 434
R G+TRY++ GEYRSGN Q P+F Q++L+HGL WT YGG Q+A+ Y+AFN GIGK
Sbjct: 370 REGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGK 429

Query: 435 DLGLFGAFSFDITQANTTLADDTRHSGQSVKFVYSKSFYQTGTNIQVAGYRYSTQGFYNL 494
++G GA S D+TQAN+TL DD++H GQSV+F+Y+KS ++GTNIQ+ GYRYST G++N
Sbjct: 430 NMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNF 489

Query: 495 SDSAYSRMSGYTDY 508
+D+ YSRM+GY
Sbjct: 490 ADTTYSRMNGYNIE 503


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1721  NUCEPIMERASE  35  3e-04  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 34.8 bits (80), Expect = 3e-04
Identities = 22/73 (30%), Positives = 34/73 (46%), Gaps = 10/73 (13%)

Query: 1 MRILVAGATGSIGIHVVNTAIAMGHQPVTL---------VRNRRKIKLLPRGTDIFY-GD 50
M+ LV GA G IG HV + GHQ V + + +++LL + F+ D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 51 VSIPETLTDLPKD 63
++ E +TDL
Sbjct: 61 LADREGMTDLFAS 73


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1725  BCTERIALGSPD  105  2e-26  Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 105 bits (262), Expect = 2e-26
Identities = 60/334 (17%), Positives = 130/334 (38%), Gaps = 47/334 (14%)

Query: 119 LSDILGGYVSGSFNNSGAVISDDSLKGSSGASNYINRTGDILVYYGTKEDIAILKTLVTS 178
L+ I S D ++ + + L+ + + L+ ++
Sbjct: 286 LTGISSTMQSEKQAAKPVAALDKNIIIKAHGQT------NALIVTAAPDVMNDLERVIAQ 339

Query: 179 LDTMSDEVVVSGYVFEVQ---------------------TSQSDGSGILLAAKILSDK-- 215
LD +V+V + EVQ T+ +A +K
Sbjct: 340 LDIRRPQVLVEAIIAEVQDADGLNLGIQWANKNAGMTQFTNSGLPISTAIAGANQYNKDG 399

Query: 216 FNISVGAAGLDNF----INIRTGSIDAIFNLLKTDSRFTVVSAPRLRVKNNASASFSVGS 271
S A+ L +F G+ + L + ++ +++ P + +N A+F+VG
Sbjct: 400 TVSSSLASALSSFNGIAAGFYQGNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQ 459

Query: 272 DVPVL-GSVTVNNNTTTQSVEYRSSGVLFNVTPSI-KSRTMDLKIQQQLSNFVTTETGVN 329
+VPVL GS T + + +VE ++ G+ V P I + ++ L+I+Q++S+ + +
Sbjct: 460 EVPVLTGSQTTSGDNIFNTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSSVADAASSTS 519

Query: 330 NS--PTLIKRDVTTEVSLADGDIILLGGLAEQKDSKASSGWSF----------FGSRTSE 377
+ T R V V + G+ +++GGL ++ S + F S + +
Sbjct: 520 SDLGATFNTRTVNNAVLVGSGETVVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKK 579

Query: 378 SNKTDIMVMLQVRKVDRSRATPRSAARSGELFRD 411
+K ++M+ ++ + ++++ F D
Sbjct: 580 VSKRNLMLFIRPTVIRDRDEYRQASSGQYTAFND 613


27  UTI89_C1747  UTI89_C1757  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
UTI89_C1747  0  16  -3.650121  sugar efflux transporter
UTI89_C1748  0  19  -6.162725  multiple drug resistance protein MarC
UTI89_C1749  0  22  -7.030119  DNA-binding transcriptional repressor MarR
UTI89_C1750  2  26  -7.797054  DNA-binding transcriptional activator MarA
UTI89_C1751  1  28  -8.032039  hypothetical protein
UTI89_C1752  2  26  -8.544069  6-phospho-beta-glucosidase
UTI89_C1753  1  26  -7.828664  outer membrane protein YieC
UTI89_C1754  2  20  -5.958137  PTS system cellobiose-specific transporter
UTI89_C1755  1  21  -6.035887  hypothetical protein
UTI89_C1756  -1  18  -3.844096  PTS system cellobiose-specific transporter
UTI89_C1757  -1  16  -3.166120  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1747  TCRTETB  55  3e-10  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 54.5 bits (131), Expect = 3e-10
Identities = 41/192 (21%), Positives = 84/192 (43%), Gaps = 8/192 (4%)

Query: 36 LSDIAHSFHMQTAQVGIMLTIYAWVVALMSLPFMLMTSQVERRKLLICLFVVFIASHVLS 95
L DIA+ F+ A + T + ++ + + ++ Q+ ++LL+ ++ V+
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 96 FLSWS-FTVLVISRIGVAFAHAIFWSITASLAIRMAPAGKRAQALSLIATGTALAMVLGL 154
F+ S F++L+++R A F ++ + R P R +A LI + A+ +G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 155 PLGRIVGQYFGWRMTFFAIGIGALITLLCLIKLLPLLPSEHSGSLKSLPLLFRRPALMSI 214
+G ++ Y W I + +IT+ L+KLL + LMS+
Sbjct: 157 AIGGMIAHYIHWSY-LLLIPMITIITVPFLMKLLK------KEVRIKGHFDIKGIILMSV 209

Query: 215 YLLTVVVVTAHY 226
++ ++ T Y
Sbjct: 210 GIVFFMLFTTSY 221


28  UTI89_C1880  UTI89_C1885  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
UTI89_C1880  2  15  -3.444410  hypothetical protein
UTI89_C1881  1  18  -4.744288  hypothetical protein
UTI89_C1882  0  15  -3.800285  transport protein YdiM
UTI89_C1883  -1  15  -4.096281  transport protein YdiN
UTI89_C1884  0  17  -3.761634  quinate/shikimate dehydrogenase
UTI89_C1885  -2  16  -3.263354  3-dehydroquinate dehydratase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1882  TCRTETA  29  0.027  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 29.4 bits (66), Expect = 0.027
Identities = 54/312 (17%), Positives = 105/312 (33%), Gaps = 18/312 (5%)

Query: 61 FAGLLSDRFGRRPFIMLGMCCYMAFFFGILHTNNIIIAYVFGFLAGMANSFLDAGTYPSL 120
G LSDRFGRRP +++ + + + + + Y+ +AG+ + A +
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATG-AVAGAYI 120

Query: 121 MEAFPRSPGTANI-LIKAFVSSGQFLLPLIISLLVWAELWFGWSFMIAAGIMFINALFLY 179
+ + + A G P++ L+ F AA + +N L
Sbjct: 121 ADITDGDERARHFGFMSACFGFGMVAGPVLGGLM--GGFSPHAPFFAAAALNGLNFLTGC 178

Query: 180 RCTFPPHPGRHLPV---IKKTTSSTEHRCSIIDLASYSLYGYISMATFYLVSQWLAQYGQ 236
H G P+ +S + +A+ +I + + +G+
Sbjct: 179 FLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGE 238

Query: 237 FVAGMSYTM-SIKLLSIYTVGSLLCVFITAPLIRNTVRPTTLL--MLYTFISFIALLTVC 293
T I L + + SL IT P+ L+ M+ +I L
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 294 LHPTFYVVIIFAFVIGFTSAGGVVQIGLTLMAERF--PYAKGKATGIYYSAGSIATFTIP 351
+ +++ ++GG+ L M R +G+ G + S+ + P
Sbjct: 299 RGWMAFPIMV------LLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGP 352

Query: 352 LITAHLSQRSIA 363
L+ + SI
Sbjct: 353 LLFTAIYAASIT 364


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1883  TCRTETB  31  0.008  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 31.4 bits (71), Expect = 0.008
Identities = 38/177 (21%), Positives = 75/177 (42%), Gaps = 9/177 (5%)

Query: 14 ILAVLCIYFSYFLHGISVITLAQNMTSLAEKFSTDNAGIAYLISGIGLGRLISILFFGVI 73
IL LCI F ++ + L ++ +A F+ A ++ + L I +G +
Sbjct: 15 ILIWLCIL--SFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKL 72

Query: 74 SDKFGRRAVILMAVIMY----LLFFFGIPACPNLTLAYCLAVCVGIANSALDTGGYPALM 129
SD+ G + ++L +I+ ++ F G L +A + A AL
Sbjct: 73 SDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARY- 131

Query: 130 ECFPKASGSAVILVKAMVSFGQMFYPMLVSYMLLNNIWYGYGLIIPGILFVLITLML 186
+ G A L+ ++V+ G+ P + M+ + I + Y L+IP I + + ++
Sbjct: 132 -IPKENRGKAFGLIGSIVAMGEGVGP-AIGGMIAHYIHWSYLLLIPMITIITVPFLM 186


29  UTI89_C1905  UTI89_C1913  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
UTI89_C1905  3  27  1.589190  integration host factor subunit alpha
UTI89_C1906  4  28  1.295967  phenylalanyl-tRNA synthetase subunit beta
UTI89_C1907  4  26  -1.210634  phenylalanyl-tRNA synthetase subunit alpha
UTI89_C1908  2  23  -3.989928  hypothetical protein
UTI89_C1909  1  21  -4.949072  50S ribosomal protein L20
UTI89_C1910  -1  16  -4.615551  50S ribosomal protein L35
UTI89_C1911  -1  14  -3.480664  translation initiation factor IF-3
UTI89_C1912  -2  17  -3.742884  threonyl-tRNA synthetase
UTI89_C1913  0  24  -5.047946  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1905  DNABINDINGHU  119  3e-39  Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 119 bits (301), Expect = 3e-39
Identities = 34/89 (38%), Positives = 55/89 (61%)

Query: 4 TKAEMSEYLFDKLGLSKRDAKELVELFFEEIRRALENGEQVKLSGFGNFDLRDKNQRPGR 63
K ++ + + L+K+D+ V+ F + L GE+V+L GFGNF++R++ R GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 64 NPKTGEDIPITARRVVTFRPGQKLKSRVE 92
NP+TGE+I I A +V F+ G+ LK V+
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAVK 91


30  UTI89_C1924  UTI89_C1940  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
UTI89_C1924  -1  15  -3.222377  cell division modulator
UTI89_C1925  -1  13  -3.191660  hydroperoxidase II
UTI89_C1926  1  18  -4.753697  hypothetical protein
UTI89_C1927  2  18  -5.502159  6-phospho-beta-glucosidase
UTI89_C1928  1  19  -5.405286  DNA-binding transcriptional regulator ChbR
UTI89_C1929  0  18  -3.412887  PTS system N,N'-diacetylchitobiose-specific
UTI89_C1930  1  17  -2.920677  PTS system N,N'-diacetylchitobiose-specific
UTI89_C1931  1  18  -2.618177  PTS system N,N'-diacetylchitobiose-specific
UTI89_C1932  -1  17  -1.838672  hypothetical protein
UTI89_C1933  0  14  -1.233094  DNA-binding transcriptional activator OsmE
UTI89_C1934  0  13  -0.414368  NAD synthetase
UTI89_C1935  0  13  0.941783  nucleotide excision repair endonuclease
UTI89_C1936  0  13  2.827746  hypothetical protein
UTI89_C1937  0  13  3.358352  hypothetical protein
UTI89_C1938  -1  12  3.699579  periplasmic protein
UTI89_C1939  -1  13  3.698605  succinylglutamate desuccinylase
UTI89_C1940  0  13  3.772122  succinylarginine dihydrolase
31  UTI89_C1963  UTI89_C1992  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
UTI89_C1963  -1  19  -4.331560  cytoplasmic asparaginase I
UTI89_C1964  -1  21  -5.222980  nicotinamidase/pyrazinamidase
UTI89_C1965  0  22  -5.906115  metabolite transport protein YdjE
UTI89_C1966  -1  22  -5.154117  DEOR-type transcriptional regulator
UTI89_C1967  0  20  -4.063805  oxidoreductase YdjG
UTI89_C1968  0  21  -4.425943  sugar kinase YdjH
UTI89_C1969  0  19  -3.931041  membrane protein YdjI
UTI89_C1970  -2  16  -3.018896  zinc-type alcohol dehydrogenase-like protein
UTI89_C1971  -2  16  -2.415962  metabolite transport protein YdjK
UTI89_C1972  -1  19  -2.207448  oxidoreductase
UTI89_C1973  0  23  -1.889646  hypothetical protein
UTI89_C1974  1  19  -1.494011  methionine sulfoxide reductase B
UTI89_C1975  1  17  -1.497142  glyceraldehyde-3-phosphate dehydrogenase A
UTI89_C1976  -1  10  -3.781969  hypothetical protein
UTI89_C1977  -1  12  -4.470730  hypothetical protein
UTI89_C1978  -1  12  -4.519145  hypothetical protein
UTI89_C1979  -1  15  -5.071889  hypothetical protein
UTI89_C1980  -2  18  -5.521083  hypothetical protein
UTI89_C1981  -2  22  -5.648982  hypothetical protein
UTI89_C1982  -1  21  -1.613438  hypothetical protein
UTI89_C1983  0  20  0.708733  hypothetical protein
UTI89_C1984  1  20  0.834949  hypothetical protein
UTI89_C1985  -1  20  -0.557052  hypothetical protein
UTI89_C1986  -1  20  -1.432016  transcriptional regulator YeaM
UTI89_C1987  0  20  -1.700435  amino acid/amine transport protein
UTI89_C1988  0  20  -3.745822  hypothetical protein
UTI89_C1990  0  22  -4.164437  hypothetical protein
UTI89_C1991  -2  20  -5.081476  hypothetical protein
UTI89_C1992  -1  20  -4.910580  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1964  ISCHRISMTASE  37  3e-05  Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 36.9 bits (85), Expect = 3e-05
Identities = 35/192 (18%), Positives = 55/192 (28%), Gaps = 58/192 (30%)

Query: 2 PPRALLLV-DLQNDFCAGGALAVPEGDSTVDVANRLIDWCQSRGEAVI-----ASQD--- 52
P RA+LL+ D+QN F +L + C G V+ SQ+
Sbjct: 28 PNRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCVQLGIPVVYTAQPGSQNPDD 87

Query: 53 -------WHPANHGSFASQHGVEPYTPGQLDGLPQTFWPDHCVQNSEGAQLHPLLKQKAI 105
W P + + + P D + T W
Sbjct: 88 RALLTDFWGPGLNSGPYEEKIITELAPEDDDLV-LTKW---------------------- 124

Query: 106 AAVFHKGENPLVDSYSAFFDNGRRQKTALDDWLRAHVINELIVMGLATDYCVKFTVLDAL 165
YSAF +T L + +R ++LI+ G+ T +A
Sbjct: 125 -------------RYSAFK------RTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAF 165

Query: 166 QLGYKVNVITDG 177
K + D
Sbjct: 166 MEDIKAFFVGDA 177


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1965  TCRTETB  40  2e-05  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 39.9 bits (93), Expect = 2e-05
Identities = 30/129 (23%), Positives = 50/129 (38%), Gaps = 1/129 (0%)

Query: 88 ALMFGYFIGSLTGGFIGDYFGRRRAFRINLLIVGIAATGAAFVPDMY-WLIFFRFLMGTG 146
A M + IG+ G + D G +R ++I + + LI RF+ G G
Sbjct: 57 AFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAG 116

Query: 147 MGALIMVGYASFTEFIPATVRGKWSARLSFVGNWSPMLSAAIGVVVIAFFSWRIMFLLGG 206
A + +IP RGK + + + AIG ++ + W + L+
Sbjct: 117 AAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPM 176

Query: 207 IGILLAWFL 215
I I+ FL
Sbjct: 177 ITIITVPFL 185


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1971  TCRTETB  31  0.011  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 31.0 bits (70), Expect = 0.011
Identities = 33/142 (23%), Positives = 48/142 (33%), Gaps = 23/142 (16%)

Query: 71 MFLGALVGGIIGDKTGRRNAFILYEAIHIASMVVGAFSPNMDF-LIACRFVMGVGLGALL 129
+G V G + D+ G + + I+ V+G + LI RF+ G G A
Sbjct: 62 FSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFP 121

Query: 130 VTLFAGFTEYMPGRNR----GTWSSRVSFIGNWSYPLCSLIAMGLTPLISA----EWNWR 181
+ Y+P NR G S V+ + G+ P I +W
Sbjct: 122 ALVMVVVARYIPKENRGKAFGLIGSIVA------------MGEGVGPAIGGMIAHYIHWS 169

Query: 182 VQLLIPAILSLIATALAWRYFP 203
LLIP I I T
Sbjct: 170 YLLLIPMI--TIITVPFLMKLL 189


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1976  INVEPROTEIN  29  0.021  Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 29.3 bits (65), Expect = 0.021
Identities = 18/81 (22%), Positives = 34/81 (41%), Gaps = 13/81 (16%)

Query: 165 ETTSALHTYFNVGDIAKVSVSGLGDRFIDKVNDAKED-----------VLTDGIQTFPDR 213
E ++AL + N D K S S L + F ++V + + V ++ F +
Sbjct: 57 EMSAALAQFRNRRDYEKKS-SNLSNSF-ERVLEDEALPKAKQILKLISVHGGALEDFLRQ 114

Query: 214 TDRVYLNPQDCSVINDEALNR 234
++ +P D ++ E L R
Sbjct: 115 ARSLFPDPSDLVLVLRELLRR 135


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1985  PRTACTNFAMLY  28  0.022  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 27.7 bits (61), Expect = 0.022
Identities = 18/61 (29%), Positives = 26/61 (42%)

Query: 49 QGLSIGIIILTIGVMAPIASGTLPPSTLIHSFLNWKSLVAIAVGVIVSWLGGRGVTLMGS 108
Q +I L IG + + LPPS ++ N ++ A VS LG +TL G
Sbjct: 174 QRSAIVDGGLHIGALQSLQPEDLPPSRVVLRDTNVTAVPASGAPAAVSVLGASELTLDGG 233

Query: 109 Q 109

Sbjct: 234 H 234


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1992  HTHTETR  30  6e-04  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 30.0 bits (67), Expect = 6e-04
Identities = 9/37 (24%), Positives = 17/37 (45%), Gaps = 5/37 (13%)

Query: 4 LSWIIFGLIAGILAKWIMPG-----KDGGGFFMTILL 35
+ I+ G I+G++ W+ K ++ ILL
Sbjct: 163 AAIIMRGYISGLMENWLFAPQSFDLKKEARDYVAILL 199


32  UTI89_C2018  UTI89_C2023  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
UTI89_C2018  -3  20  -3.911620  hypothetical protein
UTI89_C2019  -2  22  -3.936905  hypothetical protein
UTI89_C2020  1  21  -4.960136  23S rRNA methyltransferase
UTI89_C2021  2  26  -8.263462  cold shock-like protein CspC
UTI89_C2022  -1  20  -5.846199  hypothetical protein
UTI89_C2023  -3  18  -3.215510  hypothetical protein
33  UTI89_C2129  UTI89_C2277  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
UTI89_C2129  -1  18  -3.161821  hypothetical protein
UTI89_C2130  0  23  -4.571861  hypothetical protein
UTI89_C2131  4  33  -7.329553  hypothetical protein
UTI89_C2132  3  33  -6.705035  hypothetical protein
UTI89_C2133  0  24  -4.273078  porin
UTI89_C2134  -1  16  -1.819297  transcriptional regulator YbcM
UTI89_C2135  0  14  1.569486  kinase inhibitor
UTI89_C2136  -1  15  3.898514  multidrug efflux protein
UTI89_C2137  1  18  4.667041  flagellar hook-basal body protein FliE
UTI89_C2138  1  15  4.432539  flagellar MS-ring protein
UTI89_C2139  1  18  4.531555  flagellar motor switch protein G
UTI89_C2140  -1  17  3.829921  flagellar assembly protein H
UTI89_C2141  -2  19  3.516621  flagellum-specific ATP synthase
UTI89_C2142  -2  16  2.382533  flagellar biosynthesis chaperone
UTI89_C2143  -1  16  2.347199  flagellar hook-length control protein
UTI89_C2144  -3  21  1.868504  flagellar basal body protein FliL
UTI89_C2145  0  17  0.534763  flagellar motor switch protein FliM
UTI89_C2146  1  16  -2.378383  flagellar motor switch protein FliN
UTI89_C2147  0  17  -3.266588  flagellar biosynthesis protein FliO
UTI89_C2148  -1  19  -3.895273  flagellar biosynthesis protein FliP
UTI89_C2149  -1  22  -5.245238  flagellar biosynthesis protein FliQ
UTI89_C2150  -1  19  -5.508745  flagellar biosynthesis protein FliR
UTI89_C2151  0  21  -3.660132  capsular polysaccharide synthesis DNA-binding
UTI89_C2152  0  17  -0.026947  hypothetical protein
UTI89_C2153  -1  17  -0.222205  hypothetical protein
UTI89_C2154  -2  15  0.172231  hypothetical protein
UTI89_C2155  -1  16  0.805093  hypothetical protein
UTI89_C2156  -1  17  1.416047  mannosyl-3-phosphoglycerate phosphatase
UTI89_C2157  0  16  1.270347  hypothetical protein
UTI89_C2158  2  17  1.895936  hypothetical protein
UTI89_C2159  1  17  1.789243  hypothetical protein
UTI89_C2160  -1  11  -0.512167  hypothetical protein
UTI89_C2161  -1  11  -1.389146  DNA mismatch endonuclease, patch repair protein
UTI89_C2162  -2  14  -4.349465  DNA cytosine methylase
UTI89_C2163  -1  22  -5.997469  hypothetical protein
UTI89_C2164  0  27  -7.703241  hypothetical protein
UTI89_C2165  -2  24  -6.548645  outer membrane protein N
UTI89_C2166  -1  25  -5.670754  chaperone protein HchA
UTI89_C2167  0  30  -7.145200  2-component sensor protein
UTI89_C2168  1  28  -6.076989  transcriptional regulatory protein YedW
UTI89_C2169  1  27  -5.999007  hypothetical protein
UTI89_C2170  -1  21  -3.916381  sulfite oxidase subunit YedY
UTI89_C2171  -1  21  -2.789924  sulfite oxidase subunit YedZ
UTI89_C2172  -1  21  0.047209  hypothetical protein
UTI89_C2173  0  18  3.136594  hypothetical protein
UTI89_C2175  0  18  4.412927  *hypothetical protein
UTI89_C2177  1  21  5.528516  *integrase
UTI89_C2178  0  21  7.149178  salicylate synthase Irp9
UTI89_C2179  0  23  8.207765  hypothetical protein
UTI89_C2180  0  24  8.245176  ABC transporter protein
UTI89_C2181  0  24  8.334703  ABC transporter inner membrane protein
UTI89_C2182  0  24  8.477731  AraC family transcriptional regulator
UTI89_C2183  -1  24  8.292170  peptide synthetase-like protein
UTI89_C2184  -1  23  7.376158  HMWP1 nonribosomal peptide/polyketide synthase
UTI89_C2185  -1  21  4.695937  thiazolinyl-S-HMWP1 reductase
UTI89_C2186  -1  18  2.598027  hypothetical protein
UTI89_C2187  -1  18  1.492218  siderophore biosynthetic protein
UTI89_C2188  -1  20  -2.502609  pesticin receptor
UTI89_C2189  -1  26  -4.947038  hypothetical protein
UTI89_C2190  -2  24  -3.832922  hypothetical protein
UTI89_C2191  -2  26  -4.871102  hypothetical protein
UTI89_C2192  -2  30  -6.126399  hypothetical protein
UTI89_C2193  -1  28  -5.446301  hypothetical protein
UTI89_C2194  -2  25  -3.875544  shikimate transporter
UTI89_C2195  -1  28  -4.343141  AMP nucleosidase
UTI89_C2196  -1  25  -4.026192  hypothetical protein
UTI89_C2197  -1  25  -3.058168  hypothetical protein
UTI89_C2199  0  26  -3.309466  *hypothetical protein
UTI89_C2201  0  25  -3.735076  *transcriptional regulator Cbl
UTI89_C2202  1  23  -3.133412  nitrogen assimilation transcriptional regulator
UTI89_C2204  0  20  -1.836526  *prophage P4 integrase
UTI89_C2205  1  21  -0.153814  hypothetical protein
UTI89_C2206  1  20  0.083461  hypothetical protein
UTI89_C2207  1  21  1.166760  thioesterase
UTI89_C2208  2  19  2.764515  hypothetical protein
UTI89_C2209  2  18  3.555475  polyketide synthase
UTI89_C2210  2  18  4.104708  peptide synthetase
UTI89_C2211  2  17  4.371874  hypothetical protein
UTI89_C2212  2  17  4.620405  amidase
UTI89_C2213  2  16  4.437278  peptide synthetase
UTI89_C2214  2  16  4.048119  peptide synthetase
UTI89_C2215  1  16  2.822759  polyketide synthase
UTI89_C2216  1  17  2.267974  hypothetical protein
UTI89_C2217  2  19  1.504930  transacylase
UTI89_C2218  1  20  0.335515  acyl-CoA dehydrogenase
UTI89_C2219  1  19  0.211105  hypothetical protein
UTI89_C2220  1  19  0.276228  3-hydroxyacyl-CoA dehydrogenase
UTI89_C2221  1  18  0.667104  polyketide synthase
UTI89_C2222  1  19  -0.507887  peptide/polyketide synthase
UTI89_C2223  3  31  -8.387766  hypothetical protein
UTI89_C2224  0  27  -3.904409  transposase
UTI89_C2225  1  26  -2.842431  transposase
UTI89_C2226  1  28  -2.470227  transposase
UTI89_C2227  2  30  -2.431638  hypothetical protein
UTI89_C2228  2  29  -1.353987  hypothetical protein
UTI89_C2229  1  26  -3.295402  nicotinate-nucleotide--dimethylbenzimidazole
UTI89_C2230  1  26  -4.793306  cobalamin synthase
UTI89_C2231  3  29  -6.319040  adenosylcobinamide kinase
UTI89_C2232  5  28  -6.140163  hypothetical protein
UTI89_C2233  6  27  -5.058530  hypothetical protein
UTI89_C2234  7  29  -6.473193  outer membrane receptor for iron compound or
UTI89_C2235  4  28  -5.641898  hypothetical protein
UTI89_C2236  4  26  -5.155033  hypothetical protein
UTI89_C2237  3  24  -4.814074  hypothetical protein
UTI89_C2238  3  25  -5.300616  hypothetical protein
UTI89_C2239  2  30  -6.105263  hypothetical protein
UTI89_C2240  2  28  -5.491533  polysaccharide metabolism
UTI89_C2241  1  29  -5.718580  transferase
UTI89_C2242  2  29  -5.452379  hypothetical protein
UTI89_C2243  3  29  -5.189001  hypothetical protein
UTI89_C2244  3  28  -4.602921  hypothetical protein
UTI89_C2245  3  20  -2.626552  hypothetical protein
UTI89_C2246  2  21  0.086942  phosphotriesterase-related protein
UTI89_C2247  7  29  1.626641  transposase
UTI89_C2248  5  27  3.584694  hypothetical protein
UTI89_C2249  4  28  3.806168  hypothetical protein
UTI89_C2250  3  24  3.205255  hypothetical protein
UTI89_C2251  4  25  2.371909  hypothetical protein
UTI89_C2252  4  24  1.557647  hypothetical protein
UTI89_C2253  4  22  1.708084  transposase
UTI89_C2254  5  21  -0.032287  hypothetical protein
UTI89_C2255  6  24  -0.332020  hypothetical protein
UTI89_C2256  6  25  -0.324210  hypothetical protein
UTI89_C2257  5  25  -0.413309  transposase/IS protein
UTI89_C2258  2  25  0.894873  transposase
UTI89_C2259  0  21  1.998248  transposase
UTI89_C2260  -1  19  2.444863  hypothetical protein
UTI89_C2261  -2  16  2.524670  hypothetical protein
UTI89_C2262  -1  18  3.437583  hypothetical protein
UTI89_C2263  -2  17  2.669809  ABC transporter
UTI89_C2264  -1  16  2.395628  ABC transporter
UTI89_C2265  0  17  2.113203  hypothetical protein
UTI89_C2266  0  17  1.119774  TonB dependent receptor
UTI89_C2267  4  23  0.099009  hypothetical protein
UTI89_C2268  4  23  0.306457  hypothetical protein
UTI89_C2269  4  24  1.161662  hypothetical protein
UTI89_C2270  4  24  0.999976  hypothetical protein
UTI89_C2271  5  23  1.125624  hypothetical protein
UTI89_C2272  7  25  2.687987  hypothetical protein
UTI89_C2273  8  26  4.504640  hypothetical protein
UTI89_C2274  7  26  4.139799  hypothetical protein
UTI89_C2275  7  27  3.334204  hypothetical protein
UTI89_C2276  6  26  2.666188  hypothetical protein
UTI89_C2277  2  22  0.048122  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C2130  RTXTOXIND  30  0.017  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.017
Identities = 10/57 (17%), Positives = 17/57 (29%), Gaps = 2/57 (3%)

Query: 164 RFTLLPIFRIPVKMQKVSAASPLTQKPDQARRRF--RLGMLVFFGMLGWALLTAMNQ 218
R L R + + + A L + P R R M ++L +
Sbjct: 26 RKQLDTPVREKDENEFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEI 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C2131  PF01206  93  6e-29  SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 92.5 bits (230), Expect = 6e-29
Identities = 16/71 (22%), Positives = 37/71 (52%)

Query: 7 DYRLDMVGEPCPYPAVATLEAMPQLKKGEILEVVSDCPQSINNIPLDARNHGYTVLDIQQ 66
D LD G CP P + + + + GE+L V++ P S+ + ++ G+ +L+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 67 DGPTIRYLIQK 77
+ T + +++
Sbjct: 65 EDGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C2133  ECOLIPORIN  495  e-179  E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 495 bits (1277), Expect = e-179
Identities = 234/370 (63%), Positives = 271/370 (73%), Gaps = 31/370 (8%)

Query: 1 MSAQAAEIYNKDSNKLDLYGKVNAKHYFSSNDADDGDTTYVRLGFKGETQINDQLTGFGQ 60
+A AAEIYNKD NKLDLYGKV+ HYFS + + DGD TY+R+GFKGETQINDQLTG+GQ
Sbjct: 17 GAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVGFKGETQINDQLTGYGQ 76

Query: 61 WEYEFKGNRAESQGSSKDKTRLAFAGLKFGDYGSIDYGRNYGVAYDIGAWTDVLPEFGGD 120
WEY + N E +G++ TRLAFAGLKFGDYGS DYGRNYGV YD+ WTD+LPEFGGD
Sbjct: 77 WEYNVQANTTEGEGANS-WTRLAFAGLKFGDYGSFDYGRNYGVLYDVEGWTDMLPEFGGD 135

Query: 121 TWTQTDVFMTGRTTGVATYRNNDFFGLVDGLNFAAQYQGKNDR----------------T 164
++T D +MTGR GVATYRN DFFGLVDGLNFA QYQGKN+
Sbjct: 136 SYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQSADDVNIGTNNRNNGD 195

Query: 165 DVTEANGDGFGFSTTYEY-EGFGVGATYAKSDRTNDQVIYGNNSLNASGQNAEVWAAGLK 223
D+ NGDGFG STTY+ GF GA Y SDRTN+QV G A G A+ W AGLK
Sbjct: 196 DIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGGT--IAGGDKADAWTAGLK 253

Query: 224 YDANNIYLATTYSETQNMTVFG------NNHIANKAQNFEVVAQYQFDFGLRPSVAYLQS 277
YDANNIYLAT YSET+NMT +G + +ANK QNFEV AQYQFDFGLRP+V++L S
Sbjct: 254 YDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQYQFDFGLRPAVSFLMS 313

Query: 278 KGKDLG----AWGDQDLVEYIDVGATYYFNKNMSTFVDYKINLIDKSD-FTKASGVATDD 332
KGKDL D+DLV+Y DVGATYYFNKN ST+VDYKINL+D D F K +G++TDD
Sbjct: 314 KGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLLDDDDPFYKDAGISTDD 373

Query: 333 IVAVGLVYQF 342
IVA+G+VYQF
Sbjct: 374 IVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C2137  FLGHOOKFLIE  117  8e-38  Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 117 bits (293), Expect = 8e-38
Identities = 102/103 (99%), Positives = 102/103 (99%)

Query: 2 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTVARTQAEKFTL 61
SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQT ARTQAEKFTL
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 60

Query: 62 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 104
GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV
Sbjct: 61 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2138  FLGMRINGFLIF  750    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 750 bits (1938), Expect = 0.0
Identities = 476/555 (85%), Positives = 511/555 (92%), Gaps = 5/555 (0%)

Query: 3 ATAAQTKSLEWLNRLRANPKIPLIVAGSAAVAVMVALILWAKAPDYRTLFSNLSDQDGGA 62
+TA Q K LEWLNRLRANP+IPLIVAGSAAVA++VA++LWAK PDYRTLFSNLSDQDGGA
Sbjct: 5 STATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGA 64

Query: 63 IVSQLTQMNIPYRFSEASGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 122
IV+QLTQMNIPYRF+ SGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ
Sbjct: 65 IVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 124

Query: 123 FSEQVNYQRALEGELSRTIETIGPVKGARVHLAMPKPSLFVREQKSPSASVTVNLLPGRA 182
FSEQVNYQRALEGEL+RTIET+GPVK ARVHLAMPKPSLFVREQKSPSASVTV L PGRA
Sbjct: 125 FSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRA 184

Query: 183 LDEGQISAIVHLVSSAVAGLPPGNVTLVDQGGHLLTQSNTSGRDLNDAQLKYASDVEGRI 242
LDEGQISA+VHLVSSAVAGLPPGNVTLVDQ GHLLTQSNTSGRDLNDAQLK+A+DVE RI
Sbjct: 185 LDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRI 244

Query: 243 QRRIEAILSPIVGNGNIHAQVTAQLDFASKEQTEEQYRPNGDESHAALRSRQLNESEQSG 302
QRRIEAILSPIVGNGN+HAQVTAQLDFA+KEQTEE Y PNGD S A LRSRQLN SEQ G
Sbjct: 245 QRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVG 304

Query: 303 SGYPGGVPGALSNQPAPANNAPISTPPANQNNRQQ--QASTTSNS---GPRSTQRNETSN 357
+GYPGGVPGALSNQPAP N API+TPP NQ N Q Q ST++NS GPRSTQRNETSN
Sbjct: 305 AGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSN 364

Query: 358 YEVDRTIRHTKMNVGDVQRLSVAVVVNYKTLPDGKPLPLSNEQMKQIEALTREAMGFSEK 417
YEVDRTIRHTKMNVGD++RLSVAVVVNYKTL DGKPLPL+ +QMKQIE LTREAMGFS+K
Sbjct: 365 YEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDK 424

Query: 418 RGDSLNVVNSPFNSSDESGGALPFWQQQVFIDQLLAAGRWLLVLLVAWLLWRKAVRPQLT 477
RGD+LNVVNSPF++ D +GG LPFWQQQ FIDQLLAAGRWLLVL+VAW+LWRKAVRPQLT
Sbjct: 425 RGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLT 484

Query: 478 RRAEAVKTVQQQAQAREEVEDAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 537
RR E K Q+QAQ R+E E+AVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR
Sbjct: 485 RRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 544

Query: 538 VVALVIRQWINNDHE 552
VVALVIRQW++NDHE
Sbjct: 545 VVALVIRQWMSNDHE 559


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2139  FLGMOTORFLIG  341    e-119    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 341 bits (876), Expect = e-119
Identities = 117/329 (35%), Positives = 197/329 (59%), Gaps = 2/329 (0%)

Query: 1 MSNLTGTDKSVILLMTIGEDRAAEVFKHLSQREVQTLSAAMANVTQISNKQLTDVLAEFE 60
+S LTG K+ ILL++IG + +++VFK+LSQ E+++L+ +A + I+++ +VL EF+
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 61 QEAEQFAALNINANDYLRSVLVKALGEERAASLLEDILETRDTASGIETLNFMEPQSAAD 120
+ + DY R +L K+LG ++A ++ + L + + E + +P + +
Sbjct: 72 ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPANILN 130

Query: 121 LIRDEHPQIIATILVHLKRAQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLN 180
I+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 131 FIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLE 190

Query: 181 GLLDGQ-NLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENL 239
L + + GGV EIIN+ + E+ +I ++ E D ELA++I +MF+FE++
Sbjct: 191 KKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDI 250

Query: 240 VDVDDRSIQRLLQEVDSESLLIALKGAEQPLREKFLRNMSQRAADILRDDLANRGPVRLS 299
V +DDRSIQR+L+E+D + L ALK + P++EK +NMS+RAA +L++D+ GP R
Sbjct: 251 VLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRK 310

Query: 300 QVENEQKAILLIVRRLAETGEMVIGSGED 328
VE Q+ I+ ++R+L E GE+VI G +
Sbjct: 311 DVEESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2140  FLGFLIH       372    e-135    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 372 bits (957), Expect = e-135
Identities = 225/228 (98%), Positives = 227/228 (99%)

Query: 8 MSDNLPWKTWMPDDLAPPQAEFVPMVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 67
MSDNLPWKTW PDDLAPPQAEFVP+VEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI
Sbjct: 1 MSDNLPWKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60

Query: 68 AEGRQQGHEQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 127
AEGRQQGH+QGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL
Sbjct: 61 AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120

Query: 128 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 187
MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT
Sbjct: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180

Query: 188 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 235
LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV
Sbjct: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2142  FLGFLIJ       202    4e-70    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 202 bits (514), Expect = 4e-70
Identities = 145/147 (98%), Positives = 147/147 (100%)

Query: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60
MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 MTSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQKRQST 120
+TSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQ+RQST
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147
AALLAENRLDQKKMDEFAQRAAMRKPE
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2143  FLGHOOKFLIK   468    e-168    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 468 bits (1206), Expect = e-168
Identities = 366/375 (97%), Positives = 370/375 (98%)

Query: 1 MIRLAPLITANVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60
MIRLAPLITA+VDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK
Sbjct: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60

Query: 61 GEPLVSDIVSDAQQADLLIPVDETLPVINDEQSTSTPLTTAQTMTLAAVADKNTTKDEKA 120
GEPL+SDIVSDAQQA+LLIPVDET PVINDEQSTSTPLTTAQTM LAAVADKNTTKDEKA
Sbjct: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120

Query: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPAEKPTLFTKLTSAQLTTAQPDDAP 180
DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLP EKPTLFTKLTS QLTTAQPDDAP
Sbjct: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP 180

Query: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTADASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240
GTPAQPLTPLVAEAQSKAEVISTPSPVTA ASPLITPHQTQPLPTVAAPVLSAPLGSHEW
Sbjct: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240

Query: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMISPHQHVRAALEAA 300
QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQM+SPHQHVRAALEAA
Sbjct: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300

Query: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360
LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS
Sbjct: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360

Query: 361 LQGRVTGNSGVDIFA 375
LQGRVTGNSGVDIFA
Sbjct: 361 LQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2145  FLGMOTORFLIM  385    e-136    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 385 bits (989), Expect = e-136
Identities = 85/324 (26%), Positives = 148/324 (45%), Gaps = 10/324 (3%)

Query: 20 ILSQAEIDALLNGDS--EVKDEPTASVSGESDIRPYDPNTQRRVVRERLQALEIINERFA 77
+LSQ EID LL S + E +S I YD + +E+++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 78 RHFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSPSL 137
R L LR + V ++ Y EF R++P P+ L +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 138 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 197
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 198 EMQVKFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 255
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 256 NEDQNWRDNLVRQVQHSQLELVANFADISLRLSQILKLKPGDVLPIEKP---DRIIAHVD 312
+ + L ++ +++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 313 GVPVLTSQYGTLNGQYALRIEHLI 336
Q G + + A +I I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2146  FLGMOTORFLIN  210    5e-74    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 210 bits (537), Expect = 5e-74
Identities = 126/137 (91%), Positives = 134/137 (97%)

Query: 1 MSDMNNPADDNNGAMDDLWAEALSEQKSTSGKSAADAVFQQFGGGDVSGALQDIDLIMDI 60
MSDMNNP+D+N GA+DDLWA+AL+EQK+T+ KSAADAVFQQ GGGDVSGA+QDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2148  FLGBIOSNFLIP  335    e-119    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 335 bits (860), Expect = e-119
Identities = 242/245 (98%), Positives = 244/245 (99%)

Query: 1 MRRLLSVAPVLLWLVTPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60
MRRLLSVAPVLLWL+TPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60

Query: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFNEEK 120
MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPF+EEK
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120

Query: 121 ISMQEALEKGEQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180
ISMQEALEKG QPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK
Sbjct: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180

Query: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240
TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA
Sbjct: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240

Query: 241 QSFYS 245
QSFYS
Sbjct: 241 QSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2149  TYPE3IMQPROT  67     1e-18    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 67.1 bits (164), Expect = 1e-18
Identities = 22/78 (28%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALVTGLIISILQAATQINEMTLSFIPKIIAVFIAII 63
+ ++ G +A+ + L L+ +VA + GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2150  TYPE3IMRPROT  201    1e-66    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 201 bits (514), Expect = 1e-66
Identities = 257/261 (98%), Positives = 261/261 (100%)

Query: 1 MMQVTSDQWLSWLSLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60
M+QVTS+QWLSWL+LYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120
NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180
NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240
LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEMFNLLADIISELPLI 261
EHLFSE+FNLLADIISELPLI
Sbjct: 241 EHLFSEIFNLLADIISELPLI 261


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2162  PF05272       29     0.044    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.044
Identities = 20/62 (32%), Positives = 29/62 (46%), Gaps = 15/62 (24%)

Query: 320 AKYILTPVLWKYLYRYAKKHQARGNGFGYGMVYPNNPQSVTRTLSARYYKDGAEILIDRG 379
A+Y + PVLW Y+ R+ K + G+ VY +R +DG+E RG
Sbjct: 166 ARYQVGPVLWGYVVRFIK---SDGDKLTLPYVY------------SRSQRDGSEAWKWRG 210

Query: 380 WD 381
WD
Sbjct: 211 WD 212


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2163  CARBMTKINASE  33     8e-04    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 32.9 bits (75), Expect = 8e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 9/92 (9%)

Query: 46 AQKLAADDDVDMLVILTACYFHDIVSLAKDHPQRQRSSILAAEETRRLLREEFVQFPA-- 103
+KLA + + D+ +ILT + +L + Q + EE R+ E F A
Sbjct: 219 GEKLAEEVNADIFMILTDV---NGAALYYGTEKEQWLREVKVEELRKYYEEG--HFKAGS 273

Query: 104 --EKIEAVCHAIAAHSFSAQIAPLTTEAKIVQ 133
K+ A I A IA L + ++
Sbjct: 274 MGPKVLAAIRFIEWGGERAIIAHLEKAVEALE 305


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2165  ECOLIPORIN    444    e-158    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 444 bits (1144), Expect = e-158
Identities = 205/395 (51%), Positives = 256/395 (64%), Gaps = 36/395 (9%)

Query: 11 MKRKVLAMLVPALLVAGAANAAEIYNKDGNKVDFYGKMVGERIWSNTDDNNSENEDTSYA 70
MKRKVLA+++PALL AGAA+AAEIYNKDGNK+D YGK+ G +S D++S++ D +Y
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFS---DDSSKDGDQTYM 57

Query: 71 RFGVKGETQITSELTGFGQFEYNLDASKPEGE-NQEKTRLTFAGLKYNELGSFDYGRNYG 129
R G KGETQI +LTG+GQ+EYN+ A+ EGE TRL FAGLK+ + GSFDYGRNYG
Sbjct: 58 RVGFKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYG 117

Query: 130 VAYDAAAYTDMLVEWGGDSWASADNFMNGRTNGVATYRNYDFFGLVDGLDFAIQYQGKNS 189
V YD +TDML E+GGDS+ ADN+M GR NGVATYRN DFFGLVDGL+FA+QYQGKN
Sbjct: 118 VLYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNE 177

Query: 190 NRS----------------TKKQNGDGYALSVDYNI-NGFGIVGAYSKSDRTNDQVA--- 229
++S + NGDG+ +S Y+I GF AY+ SDRTN+QV
Sbjct: 178 SQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGG 237

Query: 230 -DGNGSNAELWSLAAKYDANNVYAVVMYGETRNMTPGSIDTGVADREGNTIMRDQLINET 288
G A+ W+ KYDANN+Y MY ETRNMTP + + + N+T
Sbjct: 238 TIAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYG--------KTDKGYDGGVANKT 289

Query: 289 QNFEAVVQYQFDFGLRPSLGYVYSKGKDIKGVPGHRYVDADRVNYIEVGTWYYFNKNMNV 348
QNFE QYQFDFGLRP++ ++ SKGKD+ D D V Y +VG YYFNKN +
Sbjct: 290 QNFEVTAQYQFDFGLRPAVSFLMSKGKDL-TYNNVNGDDKDLVKYADVGATYYFNKNFST 348

Query: 349 YTAYKFNMLDKDDA--AITGAAADDQFAVGIVYQF 381
Y YK N+LD DD G + DD A+G+VYQF
Sbjct: 349 YVDYKINLLDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2166  SUBTILISIN    28     0.038    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 28.3 bits (63), Expect = 0.038
Identities = 7/29 (24%), Positives = 14/29 (48%)

Query: 160 GLPESEDVAAALQWAIENDRFVISLCHGP 188
G + + + + +AIE +IS+ G
Sbjct: 122 GSGQYDWIIQGIYYAIEQKVDIISMSLGG 150


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2168  HTHFIS        83     2e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.6 bits (204), Expect = 2e-20
Identities = 30/117 (25%), Positives = 60/117 (51%), Gaps = 1/117 (0%)

Query: 2 KILLIEDNQRTQEWVTQGLSEAGYVIDAVSDGRDGLYLALKDDYALIILDIMLPGMDGWQ 61
IL+ +D+ + + Q LS AGY + S+ D L++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 ILQTLRTA-KQTPVICLTARDSVDDRVRGLDSGANDYLVKPFSFSELLARVRAQLRQ 117
+L ++ A PV+ ++A+++ ++ + GA DYL KPF +EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2183  ISCHRISMTASE  51     2e-08    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 51.2 bits (122), Expect = 2e-08
Identities = 22/70 (31%), Positives = 44/70 (62%)

Query: 28 QQLRERLIQELNLTPQQLHEESNLIQAGLDSIRLMRWLHWFRKNGYRLTLRELYAAPTLA 87
+ +R+++ + L TP+ + ++ +L+ GLDS+R+M + +R+ G +T EL PT+
Sbjct: 233 ENIRKQIAELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIE 292

Query: 88 AWNQLMLSRS 97
W +L+ +RS
Sbjct: 293 EWQKLLTTRS 302


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2184  DHBDHDRGNASE  46     1e-06    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 45.8 bits (108), Expect = 1e-06
Identities = 32/156 (20%), Positives = 55/156 (35%), Gaps = 19/156 (12%)

Query: 1561 LVTGAFGGLGRLAVNWLREKGARRIALLAPRVDESWLRDVEGGQTRVCR------CDVGD 1614
+TGA G+G L +GA + A + L V R DV D
Sbjct: 12 FITGAAQGIGEAVARTLASQGAH---IAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 1615 AGQLATVLDDLAAN-GGIAGAIHAAGVLADAPLQELDDHQLAAVFAVKAQAASQLLQTLR 1673
+ + + + G I ++ AGVL + L D + A F+V + +++
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 1674 NH-----DGRYLILYSSAAAT----LGAPGQSAHAL 1700
+ G + + S+ A + A S A
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAA 164


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2189  INTIMIN       75     2e-17    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 75.1 bits (184), Expect = 2e-17
Identities = 20/60 (33%), Positives = 29/60 (48%), Gaps = 3/60 (5%)

Query: 181 QQIASTSQLIGSLLAEDMNSEQAANIARGWASSQASGVMTDWLSRFGTARITLGVDEDFS 240
QQ AS + S +N + A + A G A +QAS + WL +GTA + L +F
Sbjct: 168 QQAASLGSQLQS---RSLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFD 224


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2191  INTIMIN       56     3e-10    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 55.8 bits (134), Expect = 3e-10
Identities = 62/263 (23%), Positives = 91/263 (34%), Gaps = 20/263 (7%)

Query: 175 IAVKAHVNDQFGNPVTHQPATFSAAPSSQMIISQNTVSTNTQGVAEVTMTPERNGSYTVK 234
I A V G + P +F+ S ++S N+ +TN G A VT+ ++ G V
Sbjct: 578 ITYTATVKKN-GVAQANVPVSFNIV-SGTAVLSANSANTNGSGKATVTLKSDKPGQVVVS 635

Query: 235 ASLANGASLEKQLEAI---DEKLTLTSSPLIGVNAPKGATLTATLT---SANGTPVEGQV 288
A A S I K ++T A T T PV Q
Sbjct: 636 AKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQE 695

Query: 289 INFSVTLEGATLSGGKVRTNSSGQAPVVLTSNKVGTYTVTASFHNGVTIQTQTTVKVTGN 348
+ F+ TL LS +T+++G A V LTS G V+A + V+
Sbjct: 696 VTFTTTL--GKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTT 753

Query: 349 PSTAHVASFIADPSTIAATNSDLSTLKATVEDGSGNL-IEGLTVYFALKSGSTTLTSLTA 407
+ I T ++ G NL G + +S + + S
Sbjct: 754 LTID------DGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIAS--- 804

Query: 408 VTDQNGIATTSVKGEITGSVTVS 430
V +G T KG T SV S
Sbjct: 805 VDASSGQVTLKEKGTTTISVISS 827



Score = 52.4 bits (125), Expect = 3e-09
Identities = 46/170 (27%), Positives = 65/170 (38%), Gaps = 7/170 (4%)

Query: 271 TLTATLTSANGTPVEGQVINFSVTLEGATLSGGKVRTNSSGQAPVVLTSNKVGTYTVTAS 330
T TAT+ NG ++F++ A LS TN SG+A V L S+K G V+A
Sbjct: 579 TYTATVKK-NGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAK 637

Query: 331 FHNGV-TIQTQTTVKVTGNPSTAHVASFIADPSTIAATNSDLSTLKATVEDGSGNLIEGL 389
+ + V + A + AD +T A D T V G +
Sbjct: 638 TAEMTSALNANAVIFVDQ--TKASITEIKADKTTAVANGQDAITYTVKVMKG-DKPVSNQ 694

Query: 390 TVYFALKSGSTTLTSLTAVTDQNGIATTSVKGEITGSVTVSAVTSAGGMQ 439
V F G + + T TD NG A ++ G VSA S +
Sbjct: 695 EVTFTTTLGKLSNS--TEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVD 742



Score = 51.2 bits (122), Expect = 7e-09
Identities = 51/233 (21%), Positives = 89/233 (38%), Gaps = 16/233 (6%)

Query: 13 AVTDADGKAKVTLKGTKAGAHTVTASMVGGKS--EQLVVNFTADTLTAQVNLNVTEDNFI 70
A T+ GKA VTLK K G V+A S V F T + + + +
Sbjct: 612 ANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAV 671

Query: 71 ANNIGMTRLQATVTDGNGNPVEGIKVNFRGTSVTLSSTSVETDDQVFAEILVTSTEVGLK 130
AN V PV +V F T LS+++ +TD +A++ +TST G
Sbjct: 672 ANGQDAITYTVKVMK-GDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKS 730

Query: 131 TVSASLADKPTEVISRLLN----AKVDVNSATI----TSQEIPEGQVMVAQDIAVKAHVN 182
VSA ++D +V + + +D + I ++P + Q + N
Sbjct: 731 LVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGN 790

Query: 183 DQFGNPVTHQPATFSAAPSSQMIISQNTVSTNTQGVAEVTMTPERNGSYTVKA 235
++ + A S Q+ + + +T + V + + +YT+
Sbjct: 791 GKYTWRSANPAIASVDASSGQVTLKEKGTTTIS-----VISSDNQTATYTIAT 838



Score = 40.1 bits (93), Expect = 2e-05
Identities = 35/213 (16%), Positives = 63/213 (29%), Gaps = 18/213 (8%)

Query: 4 NFTLSDGDKAVTDADGKAKVTLKGTKAGAHTVTASMVGGKSE--QLVVNFTADTLTAQVN 61
TD +G AKVTL T G V+A + + V F N
Sbjct: 701 TLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGN 760

Query: 62 LNVTEDNFIANNIGMTRLQATVTDGNGN-PVEGIKVNFRGTSVTLSSTSVETDDQVFAEI 120
+ + + + + G N G + S + SV+
Sbjct: 761 IEI-----VGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQ---- 811

Query: 121 LVTSTEVGLKTVSASLADKPTEVISRLLNAKVDVNSATITSQEIPEGQVMVAQDIAVKAH 180
VT E G T+S +D T + + + ++ + V ++
Sbjct: 812 -VTLKEKGTTTISVISSDNQT--ATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGG--- 865

Query: 181 VNDQFGNPVTHQPATFSAAPSSQMIISQNTVST 213
N + + + AA + S T+ +
Sbjct: 866 KLPSSQNELENVFKAWGAANKYEYYKSSQTIIS 898


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2192  INTIMIN       28     0.022    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 27.7 bits (61), Expect = 0.022
Identities = 22/129 (17%), Positives = 46/129 (35%), Gaps = 6/129 (4%)

Query: 11 KISAIDYSQNINGDYKATVTGGGEGIATLIPVLNGVHQAGLSTTIEFISAETRPMTGTVS 70
K+S + NG K T+T G + + ++ V + +EF G +
Sbjct: 704 KLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFF-TTLTIDDGNIE 762

Query: 71 VNSANLPTASFPSQGFTGAYYQLNNDNFAPGKTAADYSFSSSASWVGVDATGKVTFKNDG 130
+ + P+ L + G + ++ A ++G+VT K G
Sbjct: 763 IVGTGV-KGKLPTVWLQYGQVNL---KASGGNGKYTWRSANPAIASVDASSGQVTLKEKG 818

Query: 131 DSNTVIITA 139
+ T+ + +
Sbjct: 819 -TTTISVIS 826


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2194  TCRTETB       33     0.002    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 33.3 bits (76), Expect = 0.002
Identities = 39/259 (15%), Positives = 96/259 (37%), Gaps = 18/259 (6%)

Query: 79 LGGVIFGHFGDRLGRKRMLMLTVWMMGIATALIGILPSFSTIGWWAPILLVTLRAIQGFA 138
+G ++G D+LG KR+L+ + + + + + SF ++ I+ ++ A
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSL----LIMARFIQGAGAAA 119

Query: 139 VGGEWGGAALLSVESAPKNKK-AFYSSGVQVGYGVGLLLSTGLVSLISMMTTDEQFLSWG 197
+ + K S V +G GVG + + I
Sbjct: 120 FPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI------------H 167

Query: 198 WRIPFLFSIVLVLGALWVRNGMEESAEFEQQQHNQAAAKKRIPVIEALLRHPGAFLKIIA 257
W L ++ ++ ++ +++ + + + ++ +L + +
Sbjct: 168 WSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLI 227

Query: 258 LRLCELLTMYIVTAFALNYSTQNMGLPRELFLNIGLLVGGLSCLTIPCFAWLADRFGRRR 317
+ + L +++ + + GL + + IG+L GG+ T+ F + +
Sbjct: 228 VSVLSFL-IFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDV 286

Query: 318 VYITGALIGTLSAFPFFMA 336
++ A IG++ FP M+
Sbjct: 287 HQLSTAEIGSVIIFPGTMS 305


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2208  BICOMPNTOXIN  33     0.002    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 33.3 bits (76), Expect = 0.002
Identities = 6/41 (14%), Positives = 16/41 (39%)

Query: 303 LAADNRILYASGWFIDQNQGPYISHGGQNPNFSSCIALRPD 343
+ +F+ ++ P + G NP+F + ++
Sbjct: 210 VGYKPHSKDPRDYFVPDSELPPLVQSGFNPSFIATVSHEKG 250


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2210  ISCHRISMTASE  42     9e-06    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 42.3 bits (99), Expect = 9e-06
Identities = 20/87 (22%), Positives = 40/87 (45%), Gaps = 3/87 (3%)

Query: 956 DVRQMVATVRNTAPASGSER-LGDAAIRHSVRVCVEGALEQTEFDDNENLYVLGLDSIKS 1014
++ A V+ T+ +G + IR + ++ E + D E+L GLDS++
Sbjct: 209 QLQNAPADVQKTSANTGKKNVFTCENIRKQIAELLQETPE--DITDQEDLLDRGLDSVRI 266

Query: 1015 IQIAAQLRHHGWTMSAVQVMECGTVNA 1041
+ + Q R G ++ V++ E T+
Sbjct: 267 MTLVEQWRREGAEVTFVELAERPTIEE 293


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2222  DHBDHDRGNASE  51     2e-08    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 51.2 bits (122), Expect = 2e-08
Identities = 32/167 (19%), Positives = 58/167 (34%), Gaps = 7/167 (4%)

Query: 2191 IPGNVLWIIGGEKGIGRMIGEALAQREGVRVVLSSRTGYHHEAVQQDAL------DVIHC 2244
I G + +I G +GIG + LA +G + E V +
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLAS-QGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPA 64

Query: 2245 DVTQAEAVRACLATLLERYGRLDGVIFAADATTTLTLHQLSESALRDTLTVKERGTANVL 2304
DV + A+ A + G +D ++ A +H LS+ T +V G N
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 2305 HALAQRNLLDERLLLLFCNSLAAVNAEIGQTGYATASAYLDALAQQL 2351
++++ + ++ S A YA++ A + L
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCL 171


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2241  FLGPRINGFLGI  27     0.043    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 26.8 bits (59), Expect = 0.043
Identities = 12/30 (40%), Positives = 19/30 (63%)

Query: 1 MIIDESAGEVVIGANTRICHGAVIQGPVVI 30
++I+E G +VIGA+ RI AV G + +
Sbjct: 263 VVINERTGTIVIGADVRISRVAVSYGTLTV 292


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2263  PF05272       33     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.7 bits (74), Expect = 0.002
Identities = 12/22 (54%), Positives = 13/22 (59%)

Query: 32 VTVLLGPNGCGKSTLLRALAGL 53
VL G G GKSTL+ L GL
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGL 619


34  UTI89_C2291  UTI89_C2323  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C2291  -1      22       3.625944    antitoxin YefM
UTI89_C2292  -1      23       4.053860    ATP phosphoribosyltransferase
UTI89_C2293  0       23       3.880544    histidinol dehydrogenase
UTI89_C2294  -1      26       2.994399    histidinol-phosphate aminotransferase
UTI89_C2295  -2      19       0.977603    imidazole glycerol-phosphate
UTI89_C2296  -2      18       -1.458076   imidazole glycerol phosphate synthase subunit
UTI89_C2297  -1      16       -2.131378   1-(5-phosphoribosyl)-5-[(5-
UTI89_C2298  -1      19       -6.016892   imidazole glycerol phosphate synthase subunit
UTI89_C2299  0       28       -9.061001   bifunctional phosphoribosyl-AMP
UTI89_C2300  2       35       -11.704461  regulator of length of O-antigen component of
UTI89_C2301  3       39       -13.808062  UDP-glucose 6-dehydrogenase
UTI89_C2302  4       47       -16.119827  6-phosphogluconate dehydrogenase
UTI89_C2303  6       60       -19.582021  galactosyltransferase WbgM
UTI89_C2304  6       60       -19.249590  glycosyltransferase WbdM
UTI89_C2305  6       54       -17.596446  glycosyltransferase
UTI89_C2306  4       49       -14.700218  rhamnosyl transferase
UTI89_C2307  3       45       -12.690124  bacteriophage HK620 O-antigen modification
UTI89_C2308  1       41       -10.412235  O-antigen transporter
UTI89_C2309  -1      28       -6.615514   dTDP-4-dehydrorhamnose 3,5-epimerase
UTI89_C2310  -2      20       -5.152110   glucose-1-phosphate thymidylyltransferase
UTI89_C2311  -4      16       -3.083405   dTDP-4-dehydrorhamnose reductase
UTI89_C2312  -3      16       -2.000124   dTDP-glucose 4,6 dehydratase
UTI89_C2313  -3      18       -0.148263   hypothetical protein
UTI89_C2314  -2      18       0.861676    UTP-glucose-1-phosphate uridylyltransferase
UTI89_C2315  -1      21       1.537338    colanic acid biosynthesis protein
UTI89_C2316  0       24       2.990362    colanic acid biosynthesis glycosyltransferase
UTI89_C2317  0       25       3.021667    pyruvyl transferase
UTI89_C2318  0       24       3.294464    ribonucleoside diphosphage reductase 1 subunit
UTI89_C2319  0       25       3.366145    colanic acid exporter
UTI89_C2320  0       23       3.475002    UDP-glucose lipid carrier transferase
UTI89_C2321  0       24       3.807728    phosphomannomutase
UTI89_C2322  -1      24       3.776587    hypothetical protein
UTI89_C2323  -1      22       3.194441    mannose-1-phosphate guanylyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2304  PF07520       31     0.007    Virulence protein SrfB
		>PF07520#Virulence protein SrfB

Length = 1041

Score = 31.5 bits (71), Expect = 0.007
Identities = 9/44 (20%), Positives = 16/44 (36%), Gaps = 4/44 (9%)

Query: 160 AITMYNGIDTNKFKFDLLARREIRDGINIKNDDILLLAAGRLTL 203
+T Y G D + R+G + DD++ + L
Sbjct: 610 MVTTYRGEDNRVLHPEQT----FREGFRVAGDDLVHRVISAIVL 649


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2311  NUCEPIMERASE  49     1e-08    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 48.6 bits (116), Expect = 1e-08
Identities = 31/172 (18%), Positives = 67/172 (38%), Gaps = 29/172 (16%)

Query: 1 MNILLFGKTGQVGWELQRALAPLGN-LIALDVHSTDY--------------------CGD 39
M L+ G G +G+ + + L G+ ++ +D + Y D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 40 FSNPEGVAETVRSIRPDIIVNAAAHTAVDKAESEPEF---AQLLNATSVEAIAKAANEVG 96
++ EG+ + S + + + AV + P + L ++ + +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNK-IQ 119

Query: 97 AWVIHYSTDYVFPGTGEIPWQEEDATA-PLNVYGETKLAGEKALQEHCAKHL 147
+++ S+ V+ ++P+ +D+ P+++Y TK A E L H HL
Sbjct: 120 H-LLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANE--LMAHTYSHL 168


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2312  NUCEPIMERASE  180    4e-56    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 180 bits (459), Expect = 4e-56
Identities = 88/360 (24%), Positives = 148/360 (41%), Gaps = 48/360 (13%)

Query: 1 MKILVTGGAGFIGSAVVRHIINDTQDSVVNVDKLT--YAGNL-ESLADVSDSERYFFEHA 57
MK LVTG AGFIG V + ++ + VV +D L Y +L ++ ++ + F
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLL-EAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKI 59

Query: 58 DICDAAAMARIFAQHQPDAVMHLAAESHVDRSITGPAAFIETNIVGTYVLLEAARNYWSA 117
D+ D M +FA + V V S+ P A+ ++N+ G +LE R+
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHN--- 116

Query: 118 LDGDKKNSFRFHHISTDEVYGDLPHPDEVNNKEGLPLFTETTAYAPSSPYSASKASSDHL 177
+ S+ VYG +P T+ + P S Y+A+K +++ +
Sbjct: 117 ------KIQHLLYASSSSVYGLNRK---------MPFSTDDSVDHPVSLYAATKKANELM 161

Query: 178 VRAWKRTYGLPTIVTNCSNNYGPYHFPEKLIPLVILNALEGKGLPIYGKGDQIRDWLYVE 237
+ YGLP YGP+ P+ + LEGK + +Y G RD+ Y++
Sbjct: 162 AHTYSHLYGLPATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYID 221

Query: 238 D-------------HARALYTVVTEGKA-----GETYNIGGHNEKKNIDVVLTICDLLDE 279
D HA +TV T A YNIG + + +D + + D L
Sbjct: 222 DIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALG- 280

Query: 280 IVPKEKSYREQITYVADRPGHDRRYAIDAEKIGRELGWKPQETFESGIRKTVEWYLSNTK 339
+ +K+ +PG + D + + +G+ P+ T + G++ V WY K
Sbjct: 281 -IEAKKNMLPL------QPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYK 333


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C2318  PF05946  29     0.003    Toxin-coregulated pilus subunit TcpA
		>PF05946#Toxin-coregulated pilus subunit TcpA

Length = 199

Score = 28.7 bits (64), Expect = 0.003
Identities = 12/37 (32%), Positives = 15/37 (40%)

Query: 3 SVPATVAGCGVKRLIRPGGLGLDLVGLIRRASVASGT 39
S A G GV + I P LDL + + GT
Sbjct: 153 SAAAAETGVGVIKSIAPASKNLDLTNITHVEKLCKGT 189


35  UTI89_C2346  UTI89_C2368  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C2346  -3      13       3.300057    hypothetical protein
UTI89_C2347  -3      17       4.104327    hypothetical protein
UTI89_C2348  -3      17       4.085079    hypothetical protein
UTI89_C2349  -2      17       4.139385    multidrug efflux system subunit MdtA
UTI89_C2350  -2      18       3.957197    multidrug efflux system subunit MdtB
UTI89_C2351  -2      15       2.542238    multidrug efflux system subunit MdtC
UTI89_C2352  -1      14       -2.971807   multidrug efflux system protein MdtE
UTI89_C2353  0       22       -5.692215   signal transduction histidine-protein kinase
UTI89_C2354  0       31       -9.135856   DNA-binding transcriptional regulator BaeR
UTI89_C2355  0       25       -7.904406   hypothetical protein
UTI89_C2356  0       28       -8.426813   hypothetical protein
UTI89_C2357  0       26       -7.784157   hypothetical protein
UTI89_C2358  -1      20       -5.891473   hypothetical protein
UTI89_C2359  -1      18       -4.690523   hypothetical protein
UTI89_C2360  2       13       -1.820118   hypothetical protein
UTI89_C2361  3       20       -3.565690   hypothetical protein
UTI89_C2362  3       19       -3.274209   lipid kinase
UTI89_C2363  3       19       -3.411899   hypothetical protein
UTI89_C2364  3       20       -2.385448   galactitol-1-phosphate dehydrogenase
UTI89_C2365  3       18       -2.704958   PTS system galactitol-specific transporter
UTI89_C2366  1       17       -2.563192   PTS system galactitol-specific transporter
UTI89_C2367  1       15       -1.829354   PTS system galactitol-specific transporter
UTI89_C2368  2       13       -0.328800   tagatose 6-phosphate kinase GatZ
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C2349  RTXTOXIND  48     6e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 47.5 bits (113), Expect = 6e-08
Identities = 48/369 (13%), Positives = 105/369 (28%), Gaps = 87/369 (23%)

Query: 4 SYKSRWVIVIVVVIAAIAAFWFWQGRNDSQSAAPG-----ATKQAQQSPAGGR------- 51
S + R V ++ IA G+ + + A G + +
Sbjct: 54 SRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 52 --RGMRAG-PLA---PVQAATAVEQAVPRYLTGLGTITAANTVTVRSRVDG--QLMALHF 103
+R G L + A + L T ++ ++ +L
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 104 QEGQQVKAGDLLAEI------------DPSQFKVALAQAQGQLA-------KDKATLANA 144
Q V ++L Q ++ L + + + + +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 145 RRDLSRYQQLAKTNLVSRQELDAQQALVSETEGTIKADEASVA----------------- 187
+ L + L +++ + Q+ E ++ ++ +
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 188 --------------------------SAQLQLDWSRITAPVDGRV-GLKQVDVGNQISSG 220
+ + S I APV +V LK G +++
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 221 DTTGIVVITQTHPIDLVFTLPESDIATVVQAQKAGKPLVVEARDRTNSKKL-SEGTLLSL 279
+T +V++ + +++ + DI + Q A + VEA T L + ++L
Sbjct: 354 ETL-MVIVPEDDTLEVTALVQNKDIGFINVGQNA--IIKVEAFPYTRYGYLVGKVKNINL 410

Query: 280 DNQIDATTG 288
D D G
Sbjct: 411 DAIEDQRLG 419


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2350  ACRIFLAVINRP  920    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 920 bits (2379), Expect = 0.0
Identities = 300/1036 (28%), Positives = 513/1036 (49%), Gaps = 29/1036 (2%)

Query: 13 SRLFIMRPVATTLLMVAILLAGIIGYRALPVSALPEVDYPTIQVVTLYPGASPDVMTSAV 72
+ FI RP+ +L + +++AG + LPV+ P + P + V YPGA + V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 73 TAPLERQFGQMSGLKQMSSQS-SGGASVITLQFQLTLPLDVAEQEVQAAINAATNLLPSD 131
T +E+ + L MSS S S G+ ITL FQ D+A+ +VQ + AT LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 132 LPNPPVYSKVNPADPPIMTLAVTSTAMPMTQVE--DMVETRVAQKISQISGVGLVTLSGG 189
+ + S + +M S TQ + D V + V +S+++GVG V L G
Sbjct: 122 VQQQGI-SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 190 QRPAVRVKLNAQAIAALGLTSETVRTAITGANVNSAKGSLDGP------SRAVTLSANDQ 243
Q A+R+ L+A + LT V + N A G L G ++ A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 244 MQSAEEYRQLII-AYQNGAPIRLGDVATVEQGAENSWLGAWANKEQAIVMNVQRQPGANI 302
++ EE+ ++ + +G+ +RL DVA VE G EN + A N + A + ++ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 303 ISTADSIRQMLPQLTESLPKSVKVTVLSDRTTNIRASVDDTQFELMMAIALVVMIIYLFL 362
+ TA +I+ L +L P+ +KV D T ++ S+ + L AI LV +++YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 363 RNIPATIIPGVAVPLSLIGTFAVMVFLDFSINNLTLMALTIATGFVVDDAIVVIENISRY 422
+N+ AT+IP +AVP+ L+GTFA++ +SIN LT+ + +A G +VDDAIVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 423 I-EKGEKPLAAALKGAGEIGFTIISLTFSLIAVLIPLLFMGDIVGRLFREFAITLAVAIL 481
+ E P A K +I ++ + L AV IP+ F G G ++R+F+IT+ A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 482 ISAVVSLTLTPMMCARML---SQESLRKQNRFSRASEKMFDRIIAAYGRGLAKVLNHPWL 538
+S +V+L LTP +CA +L S E + F FD + Y + K+L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 539 TLSVALSTLLLSVLLWVFIPKGFFPVQDNGIIQGTLQAPQSSSFANMAQRQRQVADVILQ 598
L + + V+L++ +P F P +D G+ +Q P ++ + QV D L+
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 599 DPA--VQSLTSFVGVDGTNPSLNSARLQINLKPLDERDDR---VQKVIARLQTAVDKVPG 653
+ V+S+ + G + + N+ ++LKP +ER+ + VI R + + K+
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR- 658

Query: 654 VDLFLQPTQDLTIDTQVSRTQYQFTLQ---ATSLDALSTWVPQLMEKLQQLP-QLSDVSS 709
D F+ P I + T + F L DAL+ QL+ Q P L V
Sbjct: 659 -DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 710 DWQDKGLVAYVNVDRDSASRLGISMADVDNALYNAFGQRLISTIYTQANQYRVVLEHNTE 769
+ + + VD++ A LG+S++D++ + A G ++ + ++ ++ + +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 770 NTPGLAALDTIRLTSSDGGVVPLSSIAKIEQRFAPLSINHLDQFPVTTISFNVPDNYSLG 829
+D + + S++G +VP S+ + + + P I S G
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 830 DAVQAIMDTEKTLNLPVDITTQFQGSTLAFQSALGSTVWLIVAAVVAMYIVLGILYESFI 889
DA A+M+ + LP I + G + + + L+ + V +++ L LYES+
Sbjct: 838 DA-MALMENLAS-KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 890 HPITILSTLPTAGVGALLALMIAGSELDVIAIIGIILLIGIVKKNAIMMIDFALAAEREQ 949
P++++ +P VG LLA + + DV ++G++ IG+ KNAI++++FA ++
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 950 GMSPREAIYQACLLRFRPILMTTLAALLGALPLMLSTGVGAELRRPLGIGMVGGLIVSQV 1009
G EA A +R RPILMT+LA +LG LPL +S G G+ + +GIG++GG++ + +
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1010 LTLFTTPVIYLLFDRL 1025
L +F PV +++ R
Sbjct: 1016 LAIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2351  ACRIFLAVINRP  916    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 916 bits (2369), Expect = 0.0
Identities = 288/1035 (27%), Positives = 504/1035 (48%), Gaps = 36/1035 (3%)

Query: 6 LFIYRPVATILLSVAITLCGILGFRMLPVAPLPQVDFPVIMVSASLPGASPETMASSVAT 65
FI RP+ +L++ + + G L LPVA P + P + VSA+ PGA +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERSLGRIAGVSEMTSSS-SLGSTRIILQFDFDRDINGAARDVQAAINAAQSLLPSGMP 124
+E+++ I + M+S+S S GS I L F D + A VQ + A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKANPSDAPIMILTLTSDT--YSQGELYDFASTQLAPTISQIDGVGDVDVGGSSL 182
+ S + +M+ SD +Q ++ D+ ++ + T+S+++GVGDV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVGLNPQALFNQGVSLDDVRTAISNANVRKPQG------ALEDGTHRWQIQTNDELK 236
A+R+ L+ L ++ DV + N + G AL I K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAAEYQPLIIHYN-NGGAVRLGDVATVTDSVQDVRNAGMTNAKPAILLMIRKLPEANIIQ 295
E+ + + N +G VRL DVA V ++ N KPA L I+ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDSIRARLPELQSTIPAAIDLQIAQDRSPTIRASLEEVEQTLIISVALVILVVFLFLRS 355
T +I+A+L ELQ P + + D +P ++ S+ EV +TL ++ LV LV++LFL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATIIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHL- 414
RAT+IP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGMKPLQAALQGTREVGFTVLSMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIGIS 474
E + P +A + ++ ++ +++ L AVF+P+ GG G + R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LLVSLTLTPMMCGWMLKASKPREQKRLRGFG----RMLVALQQGYGKSLKWVLNHTRLVG 530
+LV+L LTP +C +LK + GF Y S+ +L T
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 VVLLGTIALNIWLYISIPKTFFPEQDTGVLMGGIQADQSISFQ----AMRGKLQDFMKII 586
++ +A + L++ +P +F PE+D GV + IQ + + + ++K
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 587 RD-DPAVDNVTGFT-GGSRVNSGMMFITLKPRGERS---ETAQQIIDRLRKKLAKEPGAN 641
+ +V V GF+ G N+GM F++LKP ER+ +A+ +I R + +L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 642 LFLMAVQDIRVGGRQANASYQYTLLSDDLAALREWEPKIRKKLATL-----PELADVNSD 696
+ + I G ++ L D + + R +L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 697 QEDNGAEMNLIYDRDTMARLGIDVQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRY 756
++ A+ L D++ LG+ + N ++ A G ++ K+ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 757 TQDISALEKMFVINNEGKAIPLSYFAKWQPANAPLSVNHQGLSAASTISFNLPTGKSLSD 816
++K++V + G+ +P S F + + I G S D
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 817 ASAAIDRAMTQLGVPSTVRGSFAGTAQVFQETMNSQVILIIAAIATVYIVLGILYESYVH 876
A A ++ ++L P+ + + G + + + N L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 877 PLTILSTLPSAGVGALLALQLFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRHGN 936
P++++ +P VG LLA LFN + ++G++ IG+ KNAI++V+FA +
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 937 LTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQLL 996
EA A +R RPI+MT+LA + G LPL +S G GS + +GI ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 997 TLYTTPVVYLFFDRL 1011
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031



Score = 81.4 bits (201), Expect = 1e-17
Identities = 76/448 (16%), Positives = 161/448 (35%), Gaps = 26/448 (5%)

Query: 592 VDNVTGFTGGS-RVNSGMMFITLKPRGERSETAQQIIDRLRKKLAKEPGANLFLMAVQDI 650
+DN+ + S S + +T + + Q+ ++L+ P + Q I
Sbjct: 72 IDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE----VQQQGI 127

Query: 651 RVGGRQANASYQYTLLSDDLAALREW-----EPKIRKKLATLPELADVNSDQEDNGAE-- 703
V ++ +SD+ ++ ++ L+ L + DV GA+
Sbjct: 128 SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQL----FGAQYA 183

Query: 704 MNLIYDRDTMARLGID----VQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRYTQD 759
M + D D + + + + + + T P Q + R+
Sbjct: 184 MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNP 243

Query: 760 ISALEKMFVINNEGKAIPLSYFAK--WQPANAPLSVNHQGLSAASTISFNLPTGKSLSDA 817
+ +N++G + L A+ N + G AA +L D
Sbjct: 244 EEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL-DT 302

Query: 818 SAAIDRAMTQL--GVPSTVRGSFA-GTAQVFQETMNSQVILIIAAIATVYIVLGILYESY 874
+ AI + +L P ++ + T Q +++ V + AI V++V+ + ++
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 875 VHPLTILSTLPSAGVGALLALQLFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRH 934
L +P +G L F + + + G++L IG++ +AI++V+
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 935 GNLTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQ 994
L P+EA ++ ++ + +P+ GG + + ITIV + +S
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 995 LLTLYTTPVVYLFFDRLRLRFSRKPKQA 1022
L+ L TP + + + K
Sbjct: 483 LVALILTPALCATLLKPVSAEHHENKGG 510


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C2352  TCRTETB  126    8e-34    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 126 bits (317), Expect = 8e-34
Identities = 97/429 (22%), Positives = 189/429 (44%), Gaps = 23/429 (5%)

Query: 20 FMQSLDTTIVNTALPSMAQSLGESPLHMHMVIVSYVLTVAVMLPASGWLADKVGVRNIFF 79
F L+ ++N +LP +A + P + V +++LT ++ G L+D++G++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 TAIVLFTLGSLFCALSGTLNELL-LARALQGVGGAMMVPVGRLTVMKIVPREQYMAAMTF 138
I++ GS+ + + LL +AR +QG G A + + V + +P+E A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQVGPLLGPALGGLLVEYASWHWIFLINIPVGIIGAIATLM-LMPNYTMQTRRFDL 197
+ +G +GPA+GG++ Y HW +L+ IP+ I + LM L+ FD+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 SGFLLLAVGMAVLTLALDGSKGTGFSPLAIAGLVAVGVVALVLYLLHAQNNNRALFSLKL 257
G +L++VG+ L F+ + V V++ ++++ H + L
Sbjct: 202 KGIILMSVGIVFFML---------FTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FRTRTFSLGLAGSFAGRIGSGMLPFMTPVFLQIGLGFSPFHAG-LMMIPMVLGSMGMKRI 316
+ F +G+ M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 VVQVVNRFGYRRVLVATTLGLSLVTLLFMTTALL----GWYYVLPFVLFLQGMVNSTRFS 372
+V+R G VL +G++ +++ F+T + L W+ + V L G+ S +
Sbjct: 313 GGILVDRRGPLYVL---NIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL--SFTKT 367

Query: 373 SMNTLTLKDLPDNLASSGNSLLSMIMQLSMSIGVTIAGLLLGLFGSQHVSVDSGTTQTVF 432
++T+ L A +G SLL+ LS G+ I G LL + + Q+ +
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTY 427

Query: 433 MYTWLSMAF 441
+Y+ L + F
Sbjct: 428 LYSNLLLLF 436


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2353  BCTERIALGSPF  33     0.002    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 32.9 bits (75), Expect = 0.002
Identities = 27/95 (28%), Positives = 35/95 (36%), Gaps = 20/95 (21%)

Query: 164 RQTSWLIVALSTLLAALATF------PLARGLLAPVKRLVDGTHKLAAGDFTTRVAPTIE 217
RQ + L+ A L AL P L+A V+ V H LA + P
Sbjct: 75 RQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD---AMKCFPGSF 131

Query: 218 DEL-----------GRLAEDFNQLASTLEKNQQMR 241
+ L G L N+LA E+ QQMR
Sbjct: 132 ERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C2354  HTHFIS  76     6e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.6 bits (186), Expect = 6e-18
Identities = 28/136 (20%), Positives = 65/136 (47%), Gaps = 1/136 (0%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLISHGDQVLPYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + S+ + ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDIPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVARVKTILRRCK 129
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L K
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 130 PQRELQQQDAESPLII 145
+ + D++ + +
Sbjct: 124 RRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C2361  LIPOLPP20  27     0.027    LPP20 lipoprotein precursor signature.
		>LIPOLPP20#LPP20 lipoprotein precursor signature.

Length = 175

Score = 26.6 bits (58), Expect = 0.027
Identities = 21/88 (23%), Positives = 42/88 (47%), Gaps = 11/88 (12%)

Query: 18 EGEMKKIAAISLISVFLMSGCAVHNDETSIGKFGLAYKSNIQ-------RKLDNQYYTEA 70
+ ++KKI +S+++ ++ GC+ H ++ I K AYK + L+ E
Sbjct: 2 KNQVKKILGMSVVAAMVIVGCS-HAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK 60

Query: 71 EASLARGRISGAENIVKNDAVHFCVTQG 98
+ + GR AE+++ N+ V + Q
Sbjct: 61 YSGVFLGR---AEDLITNNDVDYSTNQA 85


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2364  DHBDHDRGNASE  31     0.004    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 31.2 bits (70), Expect = 0.004
Identities = 21/92 (22%), Positives = 35/92 (38%), Gaps = 2/92 (2%)

Query: 156 AQGCENKNVIIIGAGT-IGLLAIQCAVALGAKSVTAIDISSEKLALAKSFGAMQTFNSLE 214
A+G E K I GA IG + + GA + A+D + EKL S + ++
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAH-IAAVDYNPEKLEKVVSSLKAEARHAEA 61

Query: 215 MSAPQMQGVLRELRFNQLILETAGVPQTVELA 246
A + ++ E + V +A
Sbjct: 62 FPADVRDSAAIDEITARIEREMGPIDILVNVA 93


36  UTI89_C2377  UTI89_C2383  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C2377  2       22       -2.792260   phosphomethylpyrimidine kinase
UTI89_C2378  2       24       -5.064838   hydroxyethylthiazole kinase
UTI89_C2379  2       26       -7.092005   hypothetical protein
UTI89_C2380  3       27       -7.534375   nickel/cobalt efflux protein RcnA
UTI89_C2381  3       31       -8.762519   hypothetical protein
UTI89_C2382  2       26       -7.063976   Yeh fimbrial adhesin YehA
UTI89_C2383  -1      12       -3.918213   outer membrane usher protein YehB
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2381  TYPE3OMGPROT  28     0.024    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 27.9 bits (62), Expect = 0.024
Identities = 13/42 (30%), Positives = 21/42 (50%), Gaps = 1/42 (2%)

Query: 66 KMLLGALLLVTSAAWAAPATAGSTNTSGISKYE-LSSFIADF 106
++L G LLL++S +WA ++K E L + DF
Sbjct: 11 RVLTGTLLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDF 52


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2382  BINARYTOXINB  28     0.043    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 28.5 bits (63), Expect = 0.043
Identities = 18/79 (22%), Positives = 34/79 (43%), Gaps = 8/79 (10%)

Query: 93 NITLSNNQ---TSFTSGYSVTVTPAASNAKVNVSAGGGGSVMINGVATLSSA-----SSS 144
NI LS N+ T T + T++ S ++ + S G + + + + S+S
Sbjct: 297 NIILSKNEDQSTQNTDSQTRTISKNTSTSRTHTSEVHGNAEVHASFFDIGGSVSAGFSNS 356

Query: 145 TRGSAAVQFLLCLLGGKSW 163
+ A+ L L G ++W
Sbjct: 357 NSSTVAIDHSLSLAGERTW 375


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C2383  PF00577  721    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 721 bits (1862), Expect = 0.0
Identities = 242/837 (28%), Positives = 387/837 (46%), Gaps = 32/837 (3%)

Query: 10 PPLASAIVALLIGIEAYAAEETFDTHFMIGGMKDQQVSNIRL--EDNQPLPGQYDIDIYV 67
A +AE F+ F+ Q V+++ + PG Y +DIY+
Sbjct: 27 FVRLFVACAFAAQAPLSSAELYFNPRFLADD--PQAVADLSRFENGQELPPGTYRVDIYL 84

Query: 68 NKQWRGKYEIIVKDNPQET----CLSREMIKRLGINTDS-----FASGKQCLTFKQLIQG 118
N + ++ E CL+R + +G+NT S + C+ +I
Sbjct: 85 NNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHD 144

Query: 119 GSYTWDIGVFRLDFSVPQAWVEELESGYVPPENWERGINAFYTSYYMSQYYSDYKASGNS 178
+ D+G RL+ ++PQA++ GY+PPE W+ GINA +Y S + GNS
Sbjct: 145 ATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNS 204

Query: 179 KSTYVRFNSGLNLLGWQLHSDASFSKTNNNPGV-----WKSNTLYLERGFAQLLGTLRVG 233
Y+ SGLN+ W+L + ++S +++ W+ +LER L L +G
Sbjct: 205 HYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLG 264

Query: 234 DMYTSSDIFDSVRFSGVRLFRDMQMLPNSKQNFTPRVQGIAQSNALVTIEQNGFVVYQKE 293
D YT DIFD + F G +L D MLP+S++ F P + GIA+ A VTI+QNG+ +Y
Sbjct: 265 DGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNST 324

Query: 294 VPPGPFAITDLQLAGGGADLDVSVKEADGSVTTYLVPYAAVPNMLQPGVSKYDFAAGRSH 353
VPPGPF I D+ AG DL V++KEADGS + VPY++VP + + G ++Y AG
Sbjct: 325 VPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYR 384

Query: 354 IEGASKQSD-FVQVGHQYGFNNLLTLYGGSMVANNYYAFTLGTGWNT-RIGAISVDATKS 411
A ++ F Q +G T+YGG+ +A+ Y AF G G N +GA+SVD T++
Sbjct: 385 SGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQA 444

Query: 412 HSKQDNGDVFDGQSYQIAYNKFVSQTSTRFGLAAWRYSSRDYRTFNDHVWANNKDNYRRD 471
+S + DGQS + YNK ++++ T L +RYS+ Y F D ++
Sbjct: 445 NSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIET 504

Query: 472 ENDIYDI----ADYYQNDFGRKNSFSANMSQSLPEGWGSVSLSTLWRDYWGRSGSSKDYQ 527
++ + + DYY + ++ ++Q L ++ LS + YWG S + +Q
Sbjct: 505 QDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQTYWGTSNVDEQFQ 563

Query: 528 LSYSNNLRRISYTLAASHAYDENHHE-EKRFNIFISIPFD--WGDDVTTPRRQIYMSNST 584
+ I++TL+ S + ++ + ++IPF D + R S S
Sbjct: 564 AGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSM 623

Query: 585 TFDDQGVASNNTGLSGTVGSRDQFNYGVNLSYQYQGN---ETTAGANLTWNAPVATVNGS 641
+ D G +N G+ GT+ + +Y V Y G+ +T A L + N
Sbjct: 624 SHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIG 683

Query: 642 YSQSSAYRQAGASVSGGIVAWSGGVNLANRLSETFAVMNAPGIKDAYVNGQKYRTTNRNG 701
YS S +Q VSGG++A + GV L L++T ++ APG KDA V Q T+ G
Sbjct: 684 YSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRG 743

Query: 702 VVVYDGMTPYRENYLMLDVSQSDSEAELRGNRKIAAPYRGAVVLVNFDTDQRKPWFIKAL 761
V T YREN + LD + +L P RGA+V F + + L
Sbjct: 744 YAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKA-RVGIKLLMTL 802

Query: 762 RADGQPLTFGYEVNDIHGHNIGVVGQGSQLFIRTNEVPPSVNVAIDKQQGLSCTITF 818
+ +PL FG V + G+V Q+++ + V V +++ C +
Sbjct: 803 THNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANY 859


37  UTI89_C2471  UTI89_C2483  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C2471  0       20       3.609710    transcriptional regulator NarP
UTI89_C2472  0       22       4.082881    cytochrome c-type biogenesis protein CcmH
UTI89_C2473  0       21       4.426143    disulfide oxidoreductase
UTI89_C2474  -1      18       4.427766    cytochrome c-type biogenesis protein CcmF
UTI89_C2475  -1      16       3.118665    cytochrome c-type biogenesis protein CcmE
UTI89_C2476  0       16       3.372028    heme exporter protein C
UTI89_C2477  0       15       3.271445    heme exporter protein C
UTI89_C2478  -1      18       4.121750    heme exporter protein B
UTI89_C2479  0       21       4.315804    cytochrome c biogenesis protein CcmA
UTI89_C2480  0       23       4.278636    cytochrome c-type protein NapC
UTI89_C2481  -1      22       4.645108    citrate reductase cytochrome c-type subunit
UTI89_C2482  -1      21       4.123294    quinol dehydrogenase membrane component
UTI89_C2483  0       23       3.439499    quinol dehydrogenase periplasmic component
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C2471  HTHFIS  65     2e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 64.9 bits (158), Expect = 2e-14
Identities = 22/113 (19%), Positives = 48/113 (42%), Gaps = 2/113 (1%)

Query: 19 VMIVDDHPLMRRGVRQLLELDSGFEVVAEAGDGASAIDLANRLDIDVILLDLNMKGMSGL 78
+++ DD +R + Q L +G++V + A+ D D+++ D+ M +
Sbjct: 6 ILVADDDAAIRTVLNQALS-RAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 79 DTLNALRRDGVTAQIIILTVSDASSDVFALIDAGADGYLLKDSDPEVLLEAIR 131
D L +++ +++++ + + GA YL K D L+ I
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


38  UTI89_C2538  UTI89_C2563  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C2538  2       14       1.152773    hypothetical protein
UTI89_C2539  2       13       2.405639    4-amino-4-deoxy-L-arabinose transferase
UTI89_C2540  1       14       3.582309    hypothetical protein
UTI89_C2541  -1      13       3.513877    hypothetical protein
UTI89_C2542  -1      15       4.075751    polymyxin B resistance protein PmrD
UTI89_C2543  -1      13       4.740151    O-succinylbenzoic acid--CoA ligase
UTI89_C2544  -1      12       4.081942    O-succinylbenzoate synthase
UTI89_C2545  -1      11       3.190685    naphthoate synthase
UTI89_C2546  -1      12       2.767474    hypothetical protein
UTI89_C2547  -1      11       2.203274    acyl-CoA thioester hydrolase
UTI89_C2548  -2      11       0.098036    2-succinyl-5-enolpyruvyl-6-hydroxy-3-
UTI89_C2549  -1      17       -2.509983   menaquinone-specific isochorismate synthase
UTI89_C2550  0       21       -3.952589   hypothetical protein
UTI89_C2551  -1      12       -1.923366   hypothetical protein
UTI89_C2552  -1      9        -0.656844   ribonuclease Z
UTI89_C2553  -2      14       0.709154    hypothetical protein
UTI89_C2554  -1      19       2.297806    aminopeptidase
UTI89_C2555  1       27       3.600233    hypothetical protein
UTI89_C2556  0       30       4.250263    NADH dehydrogenase subunit N
UTI89_C2557  0       31       3.723291    NADH dehydrogenase subunit M
UTI89_C2558  0       31       4.347851    NADH dehydrogenase subunit L
UTI89_C2559  -1      31       4.077623    NADH dehydrogenase subunit K
UTI89_C2560  0       31       4.136560    NADH dehydrogenase subunit J
UTI89_C2561  0       30       4.243135    NADH dehydrogenase subunit I
UTI89_C2562  0       29       4.007171    NADH dehydrogenase subunit H
UTI89_C2563  1       28       4.027369    NADH dehydrogenase subunit G
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2540  BCTERIALGSPC  28     0.007    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 28.0 bits (62), Expect = 0.007
Identities = 12/31 (38%), Positives = 18/31 (58%), Gaps = 1/31 (3%)

Query: 34 KHIVLWLGLALACLGLAMVLWLLVL-QNVPV 63
+ I+ +L + L C LAM+ W + L N PV
Sbjct: 15 RRILFYLLMLLFCQQLAMIFWRIGLPDNAPV 45


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2543  ACETATEKNASE  31     0.015    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 30.5 bits (69), Expect = 0.015
Identities = 19/124 (15%), Positives = 46/124 (37%), Gaps = 20/124 (16%)

Query: 339 EMHNGKLTIVG-----RLDNLFFSGGEGIQPEEVERVIAAHPAVLQVFIVPVADKEF--- 390
E +G + G +++ + + ++++ + H +++ + + + ++
Sbjct: 19 ESKDGNVLAKGLAERIGINDSLLTHNANGEKIKIKKDMKDHKDAIKLVLDALVNSDYGVI 78

Query: 391 ---------GHRPVAVVEYDQQSVDLDEWVKDKLARFQQPVRWLTLPPELKNGGIKISRQ 441
GHR V EY SV + + V + + L P + GIK Q
Sbjct: 79 KDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDC-IELAPLHNPANI--EGIKACTQ 135

Query: 442 ALKE 445
+ +
Sbjct: 136 IMPD 139


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2551  AUTOINDCRSYN  32     5e-04    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 32.1 bits (73), Expect = 5e-04
Identities = 13/74 (17%), Positives = 29/74 (39%), Gaps = 12/74 (16%)

Query: 1 MIDWQDLHHSDLSVSQLYALLQLRCAVFV--------VEQNCPYQDIDGDDLEGENRHIL 52
M++ D++H+ LS ++ L LR F + D ++ ++
Sbjct: 1 MLEIFDVNHTLLSETKSGELFTLRKETFKDRLNWAVQCTDGMEFDQYDNNN----TTYLF 56

Query: 53 GWHNGTLVAYARIL 66
G + T++ R +
Sbjct: 57 GIKDNTVICSLRFI 70


39  UTI89_C2629  UTI89_C2705  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C2629  0       18       -3.070891   long-chain fatty acid outer membrane
UTI89_C2630  2       28       -5.654377   VacJ lipoprotein
UTI89_C2631  3       31       -6.454700   hypothetical protein
UTI89_C2632  3       31       -6.746100   transport
UTI89_C2634  5       36       -7.486772   *prophage integrase
UTI89_C2635  6       39       -8.869172   hypothetical protein
UTI89_C2636  6       38       -8.810364   endo-alpha-sialidase
UTI89_C2637  7       41       -11.194952  antirepressor protein
UTI89_C2638  7       47       -7.615013   hypothetical protein
UTI89_C2639  6       44       -6.741694   regulatory protein
UTI89_C2640  6       42       -5.746786   hypothetical protein
UTI89_C2641  5       39       -4.582389   hypothetical protein
UTI89_C2642  5       37       -3.762020   hypothetical protein
UTI89_C2643  4       32       -2.418981   DNA transfer protein
UTI89_C2644  3       30       -2.326045   DNA transfer protein
UTI89_C2645  3       27       -1.544529   DNA transfer protein
UTI89_C2646  4       27       -1.618812   head assembly protein
UTI89_C2647  4       27       -1.489028   DNA stabilization protein
UTI89_C2648  4       26       -0.853229   DNA stabilization protein
UTI89_C2649  4       26       -0.767688   hypothetical protein
UTI89_C2650  4       26       -0.819649   hypothetical protein
UTI89_C2651  5       27       -0.782499   hypothetical protein
UTI89_C2652  5       26       -1.424530   scaffold protein
UTI89_C2653  5       26       -1.637759   portal protein
UTI89_C2654  4       28       -2.654608   terminase large subunit
UTI89_C2655  5       34       -3.761239   hypothetical protein
UTI89_C2656  4       30       -3.588702   hypothetical protein
UTI89_C2657  1       31       -2.627372   hypothetical protein
UTI89_C2658  3       34       -2.274706   hypothetical protein
UTI89_C2659  2       36       -1.937103   hypothetical protein
UTI89_C2660  2       36       -2.312927   lysin
UTI89_C2661  2       39       -2.179604   holin
UTI89_C2662  1       37       -2.784547   hypothetical protein
UTI89_C2663  0       35       -4.293629   bacteriophage ST64T antitermination protein
UTI89_C2664  -1      33       -3.743092   hypothetical protein
UTI89_C2665  -2      32       -3.546591   Holliday-junction resolvase
UTI89_C2666  -1      32       -4.044529   hypothetical protein
UTI89_C2667  1       29       -2.293368   DNA-binding protein Roi
UTI89_C2668  1       30       -1.729486   hypothetical protein
UTI89_C2669  3       31       -1.503915   bacteriophage lambda nin 60-like protein
UTI89_C2670  3       31       -1.548932   hypothetical protein
UTI89_C2671  3       30       -1.535317   bacteriophage HK97 gp56
UTI89_C2672  4       31       -1.832713   bacteriophage Nil2 gene P DnaB analogue
UTI89_C2673  6       33       -3.756412   hypothetical protein
UTI89_C2674  8       39       -4.336850   hypothetical protein
UTI89_C2675  6       38       -4.536091   bacteriophage 343 regulatory protein CII
UTI89_C2676  5       40       -6.423971   Cro protein
UTI89_C2677  5       41       -6.979457   repressor protein
UTI89_C2678  4       42       -8.242228   hypothetical protein
UTI89_C2679  7       46       -9.340913   phage regulatory protein N
UTI89_C2680  4       44       -10.195628  bacteriophage lambda restriction inhibitor
UTI89_C2681  6       37       -8.575237   hypothetical protein
UTI89_C2682  4       33       -7.157905   protease inhibitor FtsH
UTI89_C2683  3       29       -6.321536   kil protein of bacteriophage HK97
UTI89_C2684  3       26       -5.883250   hypothetical protein
UTI89_C2685  1       26       -4.595600   hypothetical protein
UTI89_C2686  1       25       -4.685215   bacteriophage HK97 gp40
UTI89_C2687  1       25       -4.424097   bacteriophage HK97 gp40
UTI89_C2688  1       25       -3.803310   hypothetical protein
UTI89_C2689  1       25       -4.285233   hypothetical protein
UTI89_C2690  0       20       -3.781181   hypothetical protein
UTI89_C2691  1       20       -3.574071   hypothetical protein
UTI89_C2692  1       17       -2.492769   hypothetical protein
UTI89_C2693  1       23       -4.337837   hypothetical protein
UTI89_C2694  1       24       -4.613509   hypothetical protein
UTI89_C2695  0       27       -5.496122   DNA-binding transcriptional regulator DsdC
UTI89_C2696  0       28       -6.373798   permease DsdX
UTI89_C2697  -1      32       -8.780348   D-serine dehydratase
UTI89_C2698  0       35       -9.734070   multidrug resistance protein Y
UTI89_C2699  0       35       -9.361663   hypothetical protein
UTI89_C2700  -1      34       -8.455566   multidrug resistance protein K
UTI89_C2701  0       31       -7.674365   DNA-binding transcriptional activator EvgA
UTI89_C2702  0       31       -7.257509   hybrid sensory histidine kinase in two-component
UTI89_C2703  2       30       -5.493892   hypothetical protein
UTI89_C2704  2       30       -5.284635   transporter YfdV
UTI89_C2705  0       25       -4.152924   oxalyl-CoA decarboxylase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2630  VACJLIPOPROT  407    e-148    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 407 bits (1048), Expect = e-148
Identities = 250/251 (99%), Positives = 251/251 (100%)

Query: 1 MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR 60
MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR
Sbjct: 1 MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR 60

Query: 61 DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM 120
DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM
Sbjct: 61 DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM 120

Query: 121 ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADSLYPVLSWLTWPM 180
ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMAD+LYPVLSWLTWPM
Sbjct: 121 ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADALYPVLSWLTWPM 180

Query: 181 SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA 240
SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA
Sbjct: 181 SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA 240

Query: 241 IQDDLKDIDSE 251
IQDDLKDIDSE
Sbjct: 241 IQDDLKDIDSE 251


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C2664  HTHFIS  26     0.006    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 26.3 bits (58), Expect = 0.006
Identities = 11/28 (39%), Positives = 14/28 (50%)

Query: 6 QTIPELLIQTRGNQTEVARMLSCARGTV 33
I L TRGNQ + A +L R T+
Sbjct: 439 PLILAALTATRGNQIKAADLLGLNRNTL 466


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C2674  PF05704  26     0.033    Capsular polysaccharide synthesis protein
		>PF05704#Capsular polysaccharide synthesis protein

Length = 307

Score = 25.6 bits (56), Expect = 0.033
Identities = 6/31 (19%), Positives = 19/31 (61%)

Query: 56 EGYTFIPNAFLEKLLKEDISVSQFNDVLKVF 86
+ + IP+ +++ + + + F+D+L++F
Sbjct: 111 KEWVDIPDFLIKRWQEGKMLDAWFSDILRLF 141


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C2677  PF07675  28  0.037  Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 27.8 bits (61), Expect = 0.037
Identities = 14/52 (26%), Positives = 24/52 (46%)

Query: 133 MTAPAGLSIPEGMIILVDPEVEPRNGKLVVAKLEGENEATFKKLVIDAGRKF 184
+T + IP G+ EP +GK+ +A G A + +AG+K+
Sbjct: 458 VTGQGEVVIPGGVYDYCITNPEPASGKMWIAGDGGNQPARYDDFAFEAGKKY 509


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C2698  TCRTETB  122  2e-32  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 122 bits (308), Expect = 2e-32
Identities = 92/404 (22%), Positives = 167/404 (41%), Gaps = 17/404 (4%)

Query: 19 VTIALSLATFMQMLDSTISNVAIPTISGFLGASTDEGTWVITSFGVANAIAIPVTGRLAQ 78
+ I L + +F +L+ + NV++P I+ WV T+F + +I V G+L+
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 79 RIGELRLFLLSVTFFSLSSLMCSLS-TNLDVLIFFRVVQGLMAGPLIPLSQSLLLRNYPP 137
++G RL L + S++ + + +LI R +QG A L ++ R P
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 138 EKRTFALALWSMTVIIAPICGPILGGYICDNFSWGWIFLINVPMGIIVLTLCLTLLKGRE 197
E R A L V + GP +GG I W +L+ +PM I+ L L +E
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKE 192

Query: 198 TETSPVKMNLPGLTLLVLGVGGLQIMLDKGRDLDWFNSSTIILLTVVSVISLISLVIWES 257
++ G+ L+ +G+ + ML F +S I +VSV+S + V
Sbjct: 193 VRIKG-HFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIR 241

Query: 258 TSENPILDLSLFKSRNFTIGIVSITCAYLFYSGAIVLMPQLLQETMGYNAIWAGLAYAPI 317
+P +D L K+ F IG++ + +G + ++P ++++ + G
Sbjct: 242 KVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFP 301

Query: 318 GIMPLLIS-PLIGRYGNKIDMRLLVTFSFLMYAVCYYWRSVTFMPTIDFTGIILPQFFQG 376
G M ++I + G ++ ++ +V + S T F II+ G
Sbjct: 302 GTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGG 361

Query: 377 FAVACFFLPLTTISFSGLPDNKFANASSMSNFFRTLSGSVGTSL 420
+ ++TI S L + S+ NF LS G ++
Sbjct: 362 LSFTK--TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C2700  RTXTOXIND  78  6e-18  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 78.3 bits (193), Expect = 6e-18
Identities = 62/413 (15%), Positives = 123/413 (29%), Gaps = 98/413 (23%)

Query: 13 RRKYFALLAVVLFIAFSGAYAYWSMELKDMISTDDAYVTGNADPISAQVSGSVTVVNHKD 72
RR ++ F+ + + + +G + I + V + K+
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKE 114

Query: 73 TNYVRQGDILVSLDKTDATIALNKA----------------------------------- 97
VR+GD+L+ L A K
Sbjct: 115 GESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEP 174

Query: 98 -----------------KNNLANIVRQTNKLYLQDKQYSAEVASARIQ---YQQSLEDYN 137
K + Q + L + AE + + Y+
Sbjct: 175 YFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEK 234

Query: 138 RRV----PLAKQGVISKEALEHTKDTLI----------SSKAALNAAIQAYKANKALVMN 183
R+ L + I+K A+ ++ + S + + I + K +
Sbjct: 235 SRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEE--YQLV 292

Query: 184 TPLNRQPQVIEAADATKE----------AWLALKRTDIKSPVTGYIAQRSVQ-VGETVSP 232
T L + + + T + + I++PV+ + Q V G V+
Sbjct: 293 TQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTT 352

Query: 233 GQSLMAVVPARQ-MWVNANFKETQLTDVRIGQSVNIISDLYGENVVFHGRVTGINMGTGN 291
++LM +VP + V A + + + +GQ+ I + F G +G
Sbjct: 353 AETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE------AFPYTRYGYLVGK-- 404

Query: 292 AFSLLPAQNATGNWIKIVQRVPVEVSLDPKELMEH----PLRIGLSMTATIDT 340
+ + +V V +S++ L PL G+++TA I T
Sbjct: 405 -VKNINLDAIEDQRLGLVFNVI--ISIEENCLSTGNKNIPLSSGMAVTAEIKT 454


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C2701  HTHFIS  49  3e-09  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.1 bits (117), Expect = 3e-09
Identities = 22/148 (14%), Positives = 53/148 (35%), Gaps = 31/148 (20%)

Query: 4 IIIDDHPLAIAAIRNLLIKNDIEILAELTEGGSAVQRVETLKPDIVIIDVDIPGVNGIQV 63
++ DD + L + ++ + + + + D+V+ DV +P N +
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 64 LETLRKRQYSGIIIIVSAKNDHFYGKHCADAGANGFVSKKEGMNNIIAAIEAAKNGYCYF 123
L ++K + ++++SA+N + AI+A++ G +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNT------------------------FMTAIKASEKGAYDY 101

Query: 124 ---PFSLNRFVGSLTSDQQKLDSLSKQE 148
PF L + + L ++
Sbjct: 102 LPKPFDLTE---LIGIIGRALAEPKRRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C2702  HTHFIS  79  2e-17  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.5 bits (196), Expect = 2e-17
Identities = 30/105 (28%), Positives = 51/105 (48%)

Query: 960 SILIADDHPTNRLLLKRQLNLLGYDVDEATDGVQALHKVSMQHYDLLITDVNMPNMDGFE 1019
+IL+ADD R +L + L+ GYDV ++ ++ DL++TDV MP+ + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 1020 LTRKLREQNSSLPIWGLTANAQANEREKGLNCGMNLCLFKPLTLD 1064
L ++++ LP+ ++A K G L KP L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


40  UTI89_C2744  UTI89_C2782  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
UTI89_C27442180.951894cell division protein ZipA
UTI89_C2745223-0.616498regulatory protein AlgP
UTI89_C2746124-2.005571sulfate transport protein CysZ
UTI89_C2747222-1.649315cysteine synthase A
UTI89_C2748121-1.631467hypothetical protein
UTI89_C2749-118-0.363519PTS system phosphohistidinoprotein-hexose
UTI89_C2750-2161.031465phosphoenolpyruvate-protein phosphotransferase
UTI89_C2751-2202.845030PTS system glucose-specific transporter subunit
UTI89_C2752-1243.914847pyridoxal kinase
UTI89_C2753-1264.062393hypothetical protein
UTI89_C2754-2254.173307cysteine synthase B
UTI89_C2755-1243.749541sulfate/thiosulfate transporter subunit
UTI89_C2756-1223.567909sulfate/thiosulfate transporter permease
UTI89_C27570203.012125sulfate/thiosulfate transporter subunit
UTI89_C2758-1182.789760thiosulfate transporter subunit
UTI89_C2759-2173.096406short chain dehydrogenase
UTI89_C2760-1172.769940N-acetylmuramic acid-6-phosphate etherase
UTI89_C27610152.599691PTS system N-acetylmuramic acid transporter
UTI89_C27622140.961801hypothetical protein
UTI89_C27632141.008094hypothetical protein
UTI89_C27641140.952029hypothetical protein
UTI89_C27651140.494376hypothetical protein
UTI89_C27661141.416461hypothetical protein
UTI89_C27670152.034808acetyltransferase
UTI89_C27680143.039216N-acetylmuramoyl-L-alanine amidase
UTI89_C2769-2173.868775coproporphyrinogen III oxidase
UTI89_C2770-2184.898838transcriptional regulator EutR
UTI89_C2771-2215.226756ethanolamine utilization protein EutK
UTI89_C2772-1215.467106ethanolamine utilization protein EutL
UTI89_C2773-1215.728669ethanolamine ammonia-lyase small subunit
UTI89_C27740225.857553regulatory subunit of ethanolamine
UTI89_C27752206.049351reactivating factor for ethanolamine ammonia
UTI89_C27761185.537442ethanolamine utilization transport protein EutH
UTI89_C27773196.161713ethanolamine utilization protein EutG
UTI89_C27782196.077822ethanolamine utilization protein EutJ
UTI89_C27792205.391293ethanolamine utilization protein EutE
UTI89_C27800184.201706detox protein
UTI89_C27811183.618167detox protein
UTI89_C27822173.163557phosphotransacetylase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C2744  PF03544  45  1e-07  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 45.4 bits (107), Expect = 1e-07
Identities = 28/127 (22%), Positives = 43/127 (33%), Gaps = 5/127 (3%)

Query: 85 VHRVNHAPANAQEHEAARPSPQHQYQPPYASAQPRQPVQQPPEAQVPPQHAPRPAQPVQQ 144
VH+V PA AQ +P P P V+ PE + P P+ A V +
Sbjct: 37 VHQVIELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIP-EPPKEAPVVIE 95

Query: 145 PVQQPAYQPQPEQPLQQPVSPQVASAPQPVHSAPQPAQQAFQPAEPVAAPQPEPVAEPAP 204
+ +P Q +PV S P + PA P ++ ++P
Sbjct: 96 KPK----PKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVT 151

Query: 205 VMDKPKR 211
+ R
Sbjct: 152 SVASGPR 158



Score = 38.0 bits (88), Expect = 3e-05
Identities = 22/111 (19%), Positives = 41/111 (36%), Gaps = 2/111 (1%)

Query: 91 APANAQEHEAARPSPQHQYQPPYASAQPRQPVQQPPEAQVPPQHAPRPAQPVQQPVQQPA 150
APA+ + +A +P P+ +P +P ++ P P+ P+P + V+QP
Sbjct: 56 APADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPK 115

Query: 151 YQPQPEQPLQQPVSPQVASAPQPVHS--APQPAQQAFQPAEPVAAPQPEPV 199
+P + A A + A + P A + +P
Sbjct: 116 RDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQ 166



Score = 33.8 bits (77), Expect = 7e-04
Identities = 23/80 (28%), Positives = 33/80 (41%), Gaps = 5/80 (6%)

Query: 136 PRPAQPVQQPVQQPAYQPQPEQPLQQPVSPQVASAPQPVHSAPQPAQQAFQPAEPVAAPQ 195
P PAQP+ + PA P+ Q P P V P+P P + + P+
Sbjct: 44 PAPAQPISVTMVAPADLEPPQAV-QPPPEPVVEPEPEPEPIPEPPKEAPVV----IEKPK 98

Query: 196 PEPVAEPAPVMDKPKRKEAV 215
P+P +P PV + K V
Sbjct: 99 PKPKPKPKPVKKVEQPKRDV 118


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C2750  PHPHTRNFRASE  750  0.0  Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 750 bits (1937), Expect = 0.0
Identities = 276/571 (48%), Positives = 386/571 (67%), Gaps = 2/571 (0%)

Query: 1 MISGILASPGIAFGKALLLKEDEIVIDRKKISADQVDQEVERFLSGRAKASAQLETIKTK 60
I+GI AS G+A KA + E + I++ I V E+E+ + K+ +L IK +
Sbjct: 4 KITGIAASSGVAIAKAFIHLEPNVDIEKTSI--TDVSTEIEKLTAALEKSKEELRAIKDQ 61

Query: 61 AGETFGEEKEAIFEGHIMLLEDEELEQEIIALIKDKHMTADAAAHEVIEGQASALEELDD 120
+ G +K IF H+++L+D EL I I+++ M A+ A EV + S E +D+
Sbjct: 62 TEASMGADKAEIFAAHLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDN 121

Query: 121 EYLKERAADVRDIGKRLLRNILGLKIIDLSAIQDEVILVAADLTPSETAQLNLKKVLGFI 180
EY+KERAAD+RD+ KR+L +++G++ L+ I +E +++A DLTPS+TAQLN + V GF
Sbjct: 122 EYMKERAADIRDVSKRVLGHLIGVETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFA 181

Query: 181 TDAGGRTSHTSIMARSLELPAIVGTGSVTSQVKNDDYLILDAVNNQVYVNPTNEVIDKMR 240
TD GGRTSH++IM+RSLE+PA+VGT VT ++++ D +I+D + V VNPT E +
Sbjct: 182 TDIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYE 241

Query: 241 AVQEQVASEKAELAKLKDLPAITLDGHQVEVCANIGTVRDVEGAERNGAEGVGLYRTEFL 300
+ +K E AKL P+ T DG VE+ ANIGT +DV+G NG EG+GLYRTEFL
Sbjct: 242 EKRAAFEKQKQEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFL 301

Query: 301 FMDRDTLPTEEEQFAAYKAVAEACGSQAVIVRTMDIGGDKELPYMNFPKEENPFLGWRAI 360
+MDRD LPTEEEQF AYK V + + V++RT+DIGGDKEL Y+ PKE NPFLG+RAI
Sbjct: 302 YMDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYLQLPKELNPFLGFRAI 361

Query: 361 RIAMDRKEILRDQLRAILRASAFGKLRIMFPMIISVEEVRALRKEIEIYKQELRDEGKAF 420
R+ +++++I R QLRA+LRAS +G L++MFPMI ++EE+R + ++ K +L EG
Sbjct: 362 RLCLEKQDIFRTQLRALLRASTYGNLKVMFPMIATLEELRQAKAIMQEEKDKLLSEGVDV 421

Query: 421 DESIEIGVMVETPAAATIARHLAKEVDFFSIGTNDLTQYTLAVDRGNDMISHLYQPMSPS 480
+SIE+G+MVE P+ A A AKEVDFFSIGTNDL QYT+A DR N+ +S+LYQP P+
Sbjct: 422 SDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPYHPA 481

Query: 481 VLNLIKQVIDASHAEGKWTGMCGELAGDERATLLLLGMGLDEFSMSAISIPRIKKIIRNT 540
+L L+ VI A+H+EGKW GMCGE+AGDE A LLLG+GLDEFSMSA SI + +
Sbjct: 482 ILRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQLLKL 541

Query: 541 NFEDAKVLAEQALAQPTTDELMTLVNKFIEE 571
+ E+ K A++AL T +E+ LV K +
Sbjct: 542 SKEELKPFAQKALMLDTAEEVEQLVKKTYLK 572


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C2755  PF05272  34  8e-04  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 34.3 bits (78), Expect = 8e-04
Identities = 11/33 (33%), Positives = 16/33 (48%)

Query: 30 MVALLGPSGSGKTTLLRIIAGLEHQTSGHIRFH 62
V L G G GK+TL+ + GL+ + H
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIG 630


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C2759  DHBDHDRGNASE  155  3e-48  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 155 bits (393), Expect = 3e-48
Identities = 95/255 (37%), Positives = 137/255 (53%), Gaps = 4/255 (1%)

Query: 26 LTGKTALITGALQGIGEGIARTFARHGANLILLDISPE-IEKLADELCGRGHRCTAVVAD 84
+ GK A ITGA QGIGE +ART A GA++ +D +PE +EK+ L A AD
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 85 VRDPASVAAAIKRAKEKEGRIDILVNNAGVCRLGSFLDMSDEDRDFHIDINIKGVWNVTK 144
VRD A++ R + + G IDILVN AGV R G +SDE+ + +N GV+N ++
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 145 AVLPEMIARKDGRIVMMSSVTGDMVADPGETAYALTKAAIVGLTKSLAVEYAQSGIRVNA 204
+V M+ R+ G IV + S V AYA +KAA V TK L +E A+ IR N
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAG-VPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 205 ICPGYVRTPMAESIARQSNPEDP--ESVLTEMAKAIPMRRLADPLEVGELAAFLASDESS 262
+ PG T M S+ N + + L IP+++LA P ++ + FL S ++
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAG 244

Query: 263 YLTGTQNVIDGGSTL 277
++T +DGG+TL
Sbjct: 245 HITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C2767  SACTRNSFRASE  32  6e-04  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 6e-04
Identities = 15/102 (14%), Positives = 38/102 (37%), Gaps = 4/102 (3%)

Query: 61 LRPWNDPEMDIERKMNHDVSLFLVAEVNGEVVG--TVMGGYDGHRGSAYYLGVHPEFRGR 118
+ + D +MD+ + FL + +G + ++G + V ++R +
Sbjct: 47 FKQYEDDDMDVSYVEEEGKAAFL-YYLENNCIGRIKIRSNWNG-YALIEDIAVAKDYRKK 104

Query: 119 GIANALLNRLEKKLIARGCPKIQINVPEDNDMVLGMYERLGY 160
G+ ALL++ + + + + N Y + +
Sbjct: 105 GVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C2778  SHAPEPROTEIN  51  2e-09  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 50.5 bits (121), Expect = 2e-09
Identities = 33/116 (28%), Positives = 50/116 (43%), Gaps = 9/116 (7%)

Query: 63 VRDGIVWDFFGAVTIVRRHLD-TLEQQFGRRFSHAATSFPPGTDP---RISINVLESAGL 118
++DG++ DFF +++ + F R P G R + AG
Sbjct: 76 MKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGA 135

Query: 119 EVSHVLDEPTAVA---DLLQLDNAG--VVDIGGGTTGIAIVKKGKVTYSADEATGG 169
+++EP A A L + G VVDIGGGTT +A++ V YS+ GG
Sbjct: 136 REVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGG 191


41  UTI89_C2816  UTI89_C2822  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
UTI89_C2816-211-3.002998phosphoribosylglycinamide formyltransferase
UTI89_C2817-311-3.482324polyphosphate kinase
UTI89_C2818-115-3.917576exopolyphosphatase
UTI89_C2819016-5.060784hypothetical protein
UTI89_C2820118-1.626212hypothetical protein
UTI89_C28212270.456669hypothetical protein
UTI89_C28222201.458780hypothetical protein
42  UTI89_C2946  UTI89_C3032  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
UTI89_C2946214-1.229227hypothetical protein
UTI89_C2947216-0.949687heat shock protein GrpE
UTI89_C2948317-1.404509hypothetical protein
UTI89_C2949215-0.941241inorganic polyphosphate/ATP-NAD kinase
UTI89_C2950218-1.009072recombination and repair protein
UTI89_C2951226-2.391369hypothetical protein
UTI89_C2952226-4.326922hypothetical protein
UTI89_C2953428-5.245238hypothetical protein
UTI89_C2954531-5.511399SsrA-binding protein
UTI89_C2955538-8.304990hypothetical protein
UTI89_C2956538-8.434291hypothetical protein
UTI89_C2957436-8.048271hypothetical protein
UTI89_C2958335-6.625614hypothetical protein
UTI89_C2959230-4.931974prophage CP-933H hypothetical protein
UTI89_C2960022-2.523971hypothetical protein
UTI89_C29613162.249178hypothetical protein
UTI89_C29625183.700089lambdoid prophage e14 tail fiber assembly
UTI89_C29634205.536767hypothetical protein
UTI89_C29645225.928795bacteriophage V tail protein
UTI89_C29653224.332294bacteriophage V tail protein
UTI89_C29663243.855022bacteriophage V tail protein
UTI89_C29673243.546803bacteriophage V tail protein
UTI89_C29681223.921625bacteriophage V tail protein
UTI89_C29691213.228309bacteriophage V tail/DNA circulation protein
UTI89_C29701222.913802bacteriophage V tail protein
UTI89_C29710244.472398hypothetical protein
UTI89_C29721244.844879hypothetical protein
UTI89_C29731235.327197bacteriophage V tail sheath protein
UTI89_C29741214.541330hypothetical protein
UTI89_C29751224.508232hypothetical protein
UTI89_C29760224.046974hypothetical protein
UTI89_C29771253.591431hypothetical protein
UTI89_C29782253.664833hypothetical protein
UTI89_C29792263.515853hypothetical protein
UTI89_C29801263.401722hypothetical protein
UTI89_C29811253.771872major capsid protein
UTI89_C29822264.256513pro-head protease
UTI89_C29830253.346732portal protein
UTI89_C2984-1212.045941hypothetical protein
UTI89_C29850201.205083bacteriophage V large terminase subunit
UTI89_C2986228-0.742074bacteriophage V small terminase subunit
UTI89_C2987230-5.251311hypothetical protein
UTI89_C2988432-6.559832hypothetical protein
UTI89_C2989326-2.748293hypothetical protein
UTI89_C2990224-2.303908endolysin of prophage CP-933X
UTI89_C2991222-1.477713holin protein of prophage CP-933X
UTI89_C2992222-0.830571hypothetical protein
UTI89_C29931211.216286lambdoid prophage Qin antitermination protein
UTI89_C29942222.551762hypothetical protein
UTI89_C29950211.919437hypothetical protein
UTI89_C29962232.214439bacteriophage V crossover junction
UTI89_C29972221.416587hypothetical protein
UTI89_C2998220-0.004124bacteriophage V DNA adenine methylase
UTI89_C2999323-1.789239hypothetical protein
UTI89_C3000324-3.314583hypothetical protein
UTI89_C3001424-5.174247hypothetical protein
UTI89_C3002221-5.524231hypothetical protein
UTI89_C3003232-10.206145hypothetical protein
UTI89_C3004233-9.847209phage repressor
UTI89_C3005236-10.807814hypothetical protein
UTI89_C3006135-10.997298hypothetical protein
UTI89_C3007442-11.564169hypothetical protein
UTI89_C3008545-12.534187hypothetical protein
UTI89_C3009228-6.642951hypothetical protein
UTI89_C3010-117-2.892092hypothetical protein
UTI89_C3011-1130.573570hypothetical protein
UTI89_C30120162.341728hypothetical protein
UTI89_C30132213.275653hypothetical protein
UTI89_C30142213.773849hypothetical protein
UTI89_C30152203.347846hydroxyglutarate oxidase
UTI89_C30162182.766964succinate-semialdehyde dehydrogenase I
UTI89_C30171171.8092334-aminobutyrate aminotransferase
UTI89_C3018118-0.300536gamma-aminobutyrate transporter
UTI89_C3019022-1.453573DNA-binding transcriptional regulator CsiR
UTI89_C3020121-2.877301LysM domain/BON superfamily protein
UTI89_C3021329-4.394863hypothetical protein
UTI89_C3022124-4.313991hypothetical protein
UTI89_C3023223-4.676193transcriptional regulator
UTI89_C3024223-4.342591hypothetical protein
UTI89_C3025-120-1.752144DNA binding protein, nucleoid-associated
UTI89_C3026020-1.206055hypothetical protein
UTI89_C3027018-0.173914hypothetical protein
UTI89_C30280180.494260hypothetical protein
UTI89_C30290161.641929hypothetical protein
UTI89_C30300111.134124hypothetical protein
UTI89_C30311150.267642hypothetical protein
UTI89_C3032214-0.389008hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C2951  BLACTAMASEA  26  0.032  Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 26.3 bits (58), Expect = 0.032
Identities = 23/87 (26%), Positives = 36/87 (41%), Gaps = 11/87 (12%)

Query: 4 KTLTAAAAVLLMLTAGCSTLERVVYRPDINQGNYLTANDVSKIRV--GMTQQQVAYALGT 61
K + AVL + AG LER ++ Q + + + VS+ + GMT ++ A
Sbjct: 69 KVV-LCGAVLARVDAGDEQLERKIH---YRQQDLVDYSPVSEKHLADGMTVGELCAA--A 122

Query: 62 PLMSDPFGTNTWFYVFRQQPGHEGVTQ 88
MSD N + G G+T
Sbjct: 123 ITMSDNSAANL---LLATVGGPAGLTA 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C2967  cloacin  28  0.029  Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 27.8 bits (61), Expect = 0.029
Identities = 15/33 (45%), Positives = 20/33 (60%), Gaps = 1/33 (3%)

Query: 50 PYGFTARANSGAEAVVLFPDGDRSHAVVVTVSD 82
P GFT N+ +AV+ FP +AV V+VSD
Sbjct: 258 PAGFTQGGNT-RDAVIRFPKDSGHNAVYVSVSD 289


43  UTI89_C3066  UTI89_C3088  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
UTI89_C30661143.001387PTS system glucitol/sorbitol-specific
UTI89_C30671152.705231sorbitol-6-phosphate dehydrogenase
UTI89_C30681162.867539DNA-binding transcriptional activator GutM
UTI89_C30691163.889113DNA-binding transcriptional repressor SrlR
UTI89_C30701164.318159D-arabinose 5-phosphate isomerase
UTI89_C30711173.761623anaerobic nitric oxide reductase transcription
UTI89_C30720173.642298anaerobic nitric oxide reductase
UTI89_C30730173.456788nitric oxide reductase
UTI89_C30740152.993459hydrogenase maturation protein HypF
UTI89_C3075-1161.422996electron transport protein HydN
UTI89_C30760161.516234ascBF operon repressor
UTI89_C30770172.326878PTS system cellobiose/arbutin/salicin-specific
UTI89_C30780181.749669cryptic 6-phospho-beta-glucosidase
UTI89_C30790243.280631hypothetical protein
UTI89_C3080-1294.726840hydrogenase 3 maturation protease
UTI89_C3081-1275.362497formate hydrogenlyase maturation protein HycH
UTI89_C3082-1275.482604formate hydrogenlyase subunit 7
UTI89_C30830264.946667formate hydrogenlyase complex iron-sulfur
UTI89_C30840244.821857formate hydrogenlyase subunit 5
UTI89_C30852224.694911membrane-spanning protein of formate
UTI89_C30862224.172334formate hydrogenlyase subunit 3
UTI89_C30872192.758727formate hydrogenlyase subunit 2
UTI89_C30881193.402751formate hydrogenlyase regulatory protein HycA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C3067  DHBDHDRGNASE  84  2e-21  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 83.9 bits (207), Expect = 2e-21
Identities = 67/257 (26%), Positives = 120/257 (46%), Gaps = 7/257 (2%)

Query: 3 QVAVVIGGGQTLGAFLCHGLAAEGYRVAVVDIQSDKAANVAQEINAEYGEGTAYGFGADA 62
++A + G Q +G + LA++G +A VD +K V + AE A F AD
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE--ARHAEAFPADV 66

Query: 63 TSEQSVLALSRGVDEIFGRVDLLVYSAGIAKAAFISDFQLGDFDRSLQVNLVGYFLCARE 122
++ ++ ++ G +D+LV AG+ + I +++ + VN G F +R
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 123 FSRLMIRDGIQGRIIQINSKSGKVGSKHNSGYSAAKFGGVGLTQSLALDLAEYGITVHSL 182
S+ M D G I+ + S V + Y+++K V T+ L L+LAEY I + +
Sbjct: 127 VSKYM-MDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 183 MLGNLLKSPMFQSL-LPQYATKLGIKPDQVEQYYIDKVPLKRGCDYQDVLNMLLFYASPK 241
G+ ++ M SL + + IK +E + +PLK+ D+ + +LF S +
Sbjct: 186 SPGS-TETDMQWSLWADENGAEQVIKGS-LETFKTG-IPLKKLAKPSDIADAVLFLVSGQ 242

Query: 242 ASYCTGQSINVTGGQVM 258
A + T ++ V GG +
Sbjct: 243 AGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C3069  ARGREPRESSOR  28  0.024  Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 27.9 bits (62), Expect = 0.024
Identities = 10/45 (22%), Positives = 18/45 (40%), Gaps = 5/45 (11%)

Query: 1 MKPRQRQAAILEYLQKQGKCSVEEL-----AQYFDTTGTTIRKDL 40
M QR I E + + +EL ++ T T+ +D+
Sbjct: 1 MNKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDI 45


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C3071  HTHFIS  372  e-126  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 372 bits (956), Expect = e-126
Identities = 125/388 (32%), Positives = 193/388 (49%), Gaps = 33/388 (8%)

Query: 149 IAALAAGALS----------NALLIEQLESQNMLPGDAAPFEAVKQTQMIGLSPGMTQLK 198
I A GA +I + ++ ++ ++G S M ++
Sbjct: 91 IKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIY 150

Query: 199 KEIEIVAASDLNVLISGETGTGKELVAKAIHEASPRAVNPLVYLNCAALPESVAESELFG 258
+ + + +DL ++I+GE+GTGKELVA+A+H+ R P V +N AA+P + ESELFG
Sbjct: 151 RVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFG 210

Query: 259 HVKGAFTGAISNRSGKFEMADNGTLFLDEIGELSLALQAKLLRVLQYGDIQRVGDDRSLR 318
H KGAFTGA + +G+FE A+ GTLFLDEIG++ + Q +LLRVLQ G+ VG +R
Sbjct: 211 HEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIR 270

Query: 319 VDVRVLAATNRDLREEVLAGRFRADLFHRLSVFPLSVPPLRERGDDVILLAGYFCEQCRL 378
DVR++AATN+DL++ + G FR DL++RL+V PL +PPLR+R +D+ L +F +Q
Sbjct: 271 SDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAE- 329

Query: 379 RLGLSRVVLSAGARNLLQHYNFPGNVRELEHAIHRAVVLSRATRSGDEVIL-----EAQH 433
+ GL A L++ + +PGNVRELE+ + R L E+I E
Sbjct: 330 KEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPD 389

Query: 434 FAFPEVTLPPPEAAAVPVVKQNLR-----------------EATEAFQRETIRQALAQNH 476
+ + V++N+R + I AL
Sbjct: 390 SPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATR 449

Query: 477 HNWAACARMLETDVANLHRLAKRLGLKD 504
N A +L + L + + LG+
Sbjct: 450 GNQIKAADLLGLNRNTLRKKIRELGVSV 477


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C3076  HTHTETR  29  0.017  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 29.2 bits (65), Expect = 0.017
Identities = 19/103 (18%), Positives = 34/103 (33%), Gaps = 8/103 (7%)

Query: 6 LIACKGMKMMTTMLEVAKRAGVSKATVSRVLSG-----NGYVSQETKDRVFQAVEESGYR 60
L + +G+ T++ E+AK AGV++ + + + +E
Sbjct: 23 LFSQQGVSS-TSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELELEYQAKF 81

Query: 61 PNLLARNLSAKSTQTLGLVVTNTLYHGIYFSELLFHAARMAEE 103
P L L VT + E++FH E
Sbjct: 82 PGDPLSVLREILIHVLESTVTEERRRLLM--EIIFHKCEFVGE 122


44  UTI89_C3104  UTI89_C3114  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
UTI89_C3104020-3.619188aldolase
UTI89_C3105-120-3.779914hypothetical protein
UTI89_C3106-317-3.471417inner membrane permease YgbN
UTI89_C3107-316-3.447013hypothetical protein
UTI89_C3108-317-0.659325hypothetical protein
UTI89_C3109-2150.568583hypothetical protein
UTI89_C3110-1141.277450hypothetical protein
UTI89_C3111-1121.208374RNA polymerase sigma factor RpoS
UTI89_C31121141.523795lipoprotein NlpD
UTI89_C31132172.214786hypothetical protein
UTI89_C31142162.079921protein-L-isoaspartate O-methyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C3112  RTXTOXIND  30  0.018  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.8 bits (67), Expect = 0.018
Identities = 16/84 (19%), Positives = 36/84 (42%), Gaps = 12/84 (14%)

Query: 292 IIATADGRVVYAGNALRGYGNLIIIKHNDDYLSAYAHNDTMLVREQQEVKAGQKIATMGS 351
I+ATA+G++ ++G + IK ++ ++V+E + V+ G + + +
Sbjct: 82 IVATANGKLTHSGRSK-------EIKP---IENSIV--KEIIVKEGESVRKGDVLLKLTA 129

Query: 352 TGTSSTRLHFEIRYKGKSVNPLRY 375
G + L + + RY
Sbjct: 130 LGAEADTLKTQSSLLQARLEQTRY 153


45  UTI89_C3126  UTI89_C3148  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
UTI89_C31260163.141674phosphoadenosine phosphosulfate reductase
UTI89_C31270142.814383sulfite reductase subunit beta
UTI89_C31280142.626240sulfite reductase subunit alpha
UTI89_C31290192.2321766-pyruvoyl tetrahydrobiopterin synthase
UTI89_C31302182.630771electron transfer flavoprotein-quinone
UTI89_C31311141.489214ferredoxin-like protein YgcO
UTI89_C31320130.319574anti-terminator regulatory protein
UTI89_C31331110.216212flavoprotein
UTI89_C3134111-0.673217transport protein
UTI89_C3135012-1.260716metabolite transport protein YgcS
UTI89_C3136012-2.311305hypothetical protein
UTI89_C3137215-3.795165oxidoreductase YgcW
UTI89_C3138115-3.580525hypothetical protein
UTI89_C3139018-3.868923sugar kinase YgcE
UTI89_C3140-125-5.053827hypothetical protein
UTI89_C3141129-5.720223hypothetical protein
UTI89_C3142124-4.258474hypothetical protein
UTI89_C3143021-2.510093hypothetical protein
UTI89_C3144022-1.939029hypothetical protein
UTI89_C3145023-1.960452hypothetical protein
UTI89_C3147027-1.200566hypothetical protein
UTI89_C3148231-0.760242phosphopyruvate hydratase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C3127  PF07675  30  0.021  Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 30.4 bits (68), Expect = 0.021
Identities = 20/92 (21%), Positives = 39/92 (42%), Gaps = 12/92 (13%)

Query: 206 ILGQTYLPRKFKTTVVIP---PQND--IDLHANDMNFVAIAENGKLVGFNLLVGGGLSIE 260
++ +P+ T +P PQN + A+ ++VAI+++G L G + G++
Sbjct: 240 VMPYRAMPKT--NTYTLPASLPQNQASYSIQASAGSYVAISKDGVLYGTGVANASGVATV 297

Query: 261 HGNK-----KTYARTASEFGYLPLEHTLAVAE 287
+ K Y + YLP+ + E
Sbjct: 298 NMTKQITENGNYDVVITRSNYLPVIKQIQAGE 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C3135  TCRTETB  36  2e-04  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 36.4 bits (84), Expect = 2e-04
Identities = 53/338 (15%), Positives = 123/338 (36%), Gaps = 34/338 (10%)

Query: 93 LGSLVLGWISDHIGRQKIFTFSFMLITLASFLQFFATTP-EHLIGLRILIGIGLGGDYSV 151
+G+ V G +SD +G +++ F ++ S + F + LI R + G G ++
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPAL 123

Query: 152 GHTLLAEFSPRRHRGVLLGAFSVVWT----VGYVLASIAGHHFISESPEAWRWLLASAAL 207
++A + P+ +RG G + VG + + H+ W +LL +
Sbjct: 124 VMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI------HWSYLLLIPMI 177

Query: 208 PALLITLLRWGTPESPRWLLRQGRFAEAHAIVHRYFGPHVLLGDEVATATHKHIKTLF-- 265
+ + L + R +G F I+ +L + + + L
Sbjct: 178 TIITVPFLMKLLKKEVR---IKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFL 234

Query: 266 -SSRYWRRTA--------FNSVFFVCLVIPWFVIYT----WLPTIAQTIGLEDALTASLM 312
++ R+ ++ F+ V+ +I+ ++ + + L+ + +
Sbjct: 235 IFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEI 294

Query: 313 LNALLIVGALLGLVLTHLLAHRRFLLGSFLLLTATLVVMACLPSGSSLTLLLFVLFSTTI 372
+ ++ G + ++ ++ G +L + ++ S F+L +T+
Sbjct: 295 GSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLS-----VSFLTASFLLETTSW 349

Query: 373 SAVSNLVGILPAESFPTDIRSLGVGFATAMSRLGAAVS 410
+V +L SF + S V + GA +S
Sbjct: 350 FMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMS 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C3137  DHBDHDRGNASE  109  1e-30  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 109 bits (274), Expect = 1e-30
Identities = 74/257 (28%), Positives = 117/257 (45%), Gaps = 11/257 (4%)

Query: 36 MDFFSLKGKTAIVTGGNSGLGQAFAMALAKAGANVFIPSFVKDNGETKEMIEK-QGVEVD 94
M+ ++GK A +TG G+G+A A LA GA++ + + E K + +
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 95 FMQVDITAEGAPQKIIAACCERFGTVDILVNNAGICKLNKVLDFGRADWDPMIDVNLTAA 154
D+ A +I A G +DILVN AG+ + + +W+ VN T
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 155 FELSYEAAKIMIPQKSGKIINICSLFSYLGGQWSPAYSATKHALAGFTKAYCDELGQYNI 214
F S +K M+ ++SG I+ + S + + AY+++K A FTK EL +YNI
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 215 QVNGISPGYYATDI--TLATRSNPETNQRVLDY-------IPANRWGDTQDLMGAAVFLA 265
+ N +SPG TD+ +L N Q + IP + D+ A +FL
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAE-QVIKGSLETFKTGIPLKKLAKPSDIADAVLFLV 239

Query: 266 SPASNYVNGHLLVVDGG 282
S + ++ H L VDGG
Sbjct: 240 SGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C3138  TCRTETA  31  0.006  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 31.3 bits (71), Expect = 0.006
Identities = 20/76 (26%), Positives = 34/76 (44%), Gaps = 1/76 (1%)

Query: 41 GFSNTEIGLIMSTFGIAAIIFYA-PSGVIADKFSHRKMITSAMIITGLLGLIMATYPPLW 99
+ T IG+ ++ FGI + A +G +A + R+ + MI G +++A W
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 100 VMLCIQVAFAITTILM 115
+ I V A I M
Sbjct: 302 MAFPIMVLLASGGIGM 317



Score = 30.6 bits (69), Expect = 0.012
Identities = 22/103 (21%), Positives = 45/103 (43%), Gaps = 8/103 (7%)

Query: 48 GLIMSTFGIAAIIFYAPSGVIADKFSHRKMITSAMIITGLLGLIMATYPPLWVMLCIQVA 107
G++++ + + G ++D+F R ++ ++ + IMAT P LWV+ ++
Sbjct: 46 GILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIV 105

Query: 108 FAITTILMLWSVSIKAASLLGD---HSEQGKIMGWMEGLRGVG 147
IT + A + + D E+ + G+M G G
Sbjct: 106 AGITG-----ATGAVAGAYIADITDGDERARHFGFMSACFGFG 143


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3142  56KDTSANTIGN  30     0.005    Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.

Length = 533

Score = 30.3 bits (68), Expect = 0.005
Identities = 19/76 (25%), Positives = 31/76 (40%), Gaps = 12/76 (15%)

Query: 30 NASWSEVLNQYQRRTDLIPNLVASIKGYSSHEQEVLEAVTLARSQANRASSDLQKTPGDE 89
+AS ++ ++ Q D + L S GY + + N+ + P +
Sbjct: 294 SASIEQIQSKIQELGDTLEELRDSFDGY------------INNAFVNQIHLNFVMPPQAQ 341

Query: 90 QKLQAWQQAQAQAQAQ 105
Q+ QQ QAQA AQ
Sbjct: 342 QQQGQGQQQQAQATAQ 357


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C3147  cloacin  34     7e-04    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 33.9 bits (77), Expect = 7e-04
Identities = 16/30 (53%), Positives = 18/30 (60%)

Query: 270 GSSSSSSGGGSSGGGFSGGGGSSGGGGASG 299
GS S GG SG G GG G+SGGG +G
Sbjct: 49 GSGSGIHWGGGSGHGNGGGNGNSGGGSGTG 78



Score = 31.2 bits (70), Expect = 0.005
Identities = 12/23 (52%), Positives = 14/23 (60%)

Query: 276 SGGGSSGGGFSGGGGSSGGGGAS 298
SG G+ GG + GGGS GG S
Sbjct: 60 SGHGNGGGNGNSGGGSGTGGNLS 82



Score = 29.7 bits (66), Expect = 0.015
Identities = 11/32 (34%), Positives = 14/32 (43%)

Query: 267 SRKGSSSSSSGGGSSGGGFSGGGGSSGGGGAS 298
S S G G G SGGG +GG ++
Sbjct: 52 SGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSA 83



Score = 29.3 bits (65), Expect = 0.020
Identities = 13/30 (43%), Positives = 16/30 (53%)

Query: 270 GSSSSSSGGGSSGGGFSGGGGSSGGGGASG 299
S ++ GGGS G GGG G GG +G
Sbjct: 40 SSENNPWGGGSGSGIHWGGGSGHGNGGGNG 69



Score = 28.5 bits (63), Expect = 0.035
Identities = 13/38 (34%), Positives = 16/38 (42%)

Query: 262 SKERASRKGSSSSSSGGGSSGGGFSGGGGSSGGGGASG 299
S E G S S G G +GGG + GGG+
Sbjct: 40 SSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGT 77



Score = 28.5 bits (63), Expect = 0.039
Identities = 12/30 (40%), Positives = 13/30 (43%)

Query: 267 SRKGSSSSSSGGGSSGGGFSGGGGSSGGGG 296
S G G +GGG GG SG GG
Sbjct: 50 SGSGIHWGGGSGHGNGGGNGNSGGGSGTGG 79


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3148  ANTHRAXTOXNA  29     0.038    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 29.3 bits (65), Expect = 0.038
Identities = 31/132 (23%), Positives = 51/132 (38%), Gaps = 9/132 (6%)

Query: 211 GYAPNLGSNAEALAVIAEAVKAAGYELGKDITLAMDCAASEFYKDGKYVLA-----GEGN 265
P L N + A+ +E K YE+GK I+L + + ++ + +
Sbjct: 147 RETPKLIINIKDYAINSEQSKEVYYEIGKGISLDIISKDKSLDPEFLNLIKSLSDDSDSS 206

Query: 266 KAFTSEEFTHFLEELTKQYPIVSIEDGLDESDW---DGFAYQTKVLG-DKIQLVGDDLFV 321
S++F LE K I I++ L E F+Y ++L D+F
Sbjct: 207 DLLFSQKFKEKLELNNKSIDINFIKENLTEFQHAFSLAFSYYFAPDHRTVLELYAPDMFE 266

Query: 322 TNTKILKEGIEK 333
K+ K G EK
Sbjct: 267 YMNKLEKGGFEK 278


46  UTI89_C3186  UTI89_C3214  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C3186  -2      19       4.212181    murein transglycosylase A
UTI89_C3190  -1      20       4.320775    ***hypothetical protein
UTI89_C3191  1       23       4.967940    hypothetical protein
UTI89_C3192  1       27       7.496674    hypothetical protein
UTI89_C3193  0       26       6.413050    hypothetical protein
UTI89_C3194  0       17       3.141170    hypothetical protein
UTI89_C3195  1       16       -0.023695   hypothetical protein
UTI89_C3196  1       15       -0.527143   secreted protein Hcp
UTI89_C3197  1       16       -0.710683   ClpB protein
UTI89_C3198  4       26       -6.160257   hypothetical protein
UTI89_C3199  2       20       -3.273321   hypothetical protein
UTI89_C3200  0       16       1.292524    hypothetical protein
UTI89_C3201  0       15       1.878913    hypothetical protein
UTI89_C3202  -1      14       1.224181    hypothetical protein
UTI89_C3203  -1      14       1.306560    hypothetical protein
UTI89_C3204  -2      18       3.423744    hypothetical protein
UTI89_C3205  0       21       0.354609    hypothetical protein
UTI89_C3206  2       24       -3.852047   hypothetical protein
UTI89_C3207  1       21       -2.719896   hypothetical protein
UTI89_C3208  1       21       -1.864955   hypothetical protein
UTI89_C3209  2       26       -5.006160   hypothetical protein
UTI89_C3210  2       26       -6.218814   hypothetical protein
UTI89_C3211  2       32       -10.518151  hypothetical protein
UTI89_C3212  1       32       -11.099328  2-hydroxyacid dehydrogenase
UTI89_C3213  0       25       -8.834431   phosphosugar isomerase
UTI89_C3214  -1      19       -6.708247   beta-cystathionase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
UTI89_C3194  OMPADOMAIN  81     1e-18    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 80.7 bits (199), Expect = 1e-18
Identities = 44/142 (30%), Positives = 63/142 (44%), Gaps = 14/142 (9%)

Query: 415 PEQKMEVTASLQVQTVRLDSMSLFDVGQARLKDGSTKVL---VDALVNIRAKPGWLILVA 471
+Q + L S LF+ +A LK L L N+ K G ++V
Sbjct: 200 VAPAPAPAPEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDG-SVVVL 258

Query: 472 GYTDATGDEKSNQQLSLRRAEAVRNWMLQTSDIPATCFAVQGLGESQPAATNDTPQGR-- 529
GYTD G + NQ LS RRA++V ++ L + IPA + +G+GES P N +
Sbjct: 259 GYTDRIGSDAYNQGLSERRAQSVVDY-LISKGIPADKISARGMGESNPVTGNTCDNVKQR 317

Query: 530 -------AVNRRVEISLVPRSD 544
A +RRVEI + D
Sbjct: 318 AALIDCLAPDRRVEIEVKGIKD 339


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C3197  HTHFIS  32     0.011    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.1 bits (73), Expect = 0.011
Identities = 35/189 (18%), Positives = 66/189 (34%), Gaps = 34/189 (17%)

Query: 512 IMTLRQEGTDSTELQQQLRTHQGFAPLLALDVDARAVATVVADWTG----IPLSSLL--- 564
+ + ++ +L +++ + P+L + + + A G +P L
Sbjct: 52 VTDVVMPDENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTEL 111

Query: 565 RDEQSDLLSMEQSLENR----------VVGQRPALCAIAQRL-RAAKTGLTPENGPQGVF 613
L+ + ++ +VG+ A+ I + L R +T LT
Sbjct: 112 IGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLT--------L 163

Query: 614 LLTGPSGTGKTETALTLADTLFGGEKSLITINLSEYQEPHTVSQLKGSPPGYVGYGQGGV 673
++TG SGTGK A L D + IN++ S+L G + G
Sbjct: 164 MITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGH--------EKGA 215

Query: 674 LTEAVRKRP 682
T A +
Sbjct: 216 FTGAQTRST 224


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C3204  PF00577  31     0.038    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 31.0 bits (70), Expect = 0.038
Identities = 15/71 (21%), Positives = 24/71 (33%), Gaps = 6/71 (8%)

Query: 274 LRLAHTLAERGIAHWQSVL---KPLLAGGAFSSLRLRGLMFSPPLAAVPEAAPHAWLPSP 330
+ +T ER I +S L G F + RG + +P +P
Sbjct: 243 WQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLP---DSQRGFAP 299

Query: 331 VWAGVTGDNAR 341
V G+ A+
Sbjct: 300 VIHGIARGTAQ 310


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3208  ANTHRAXTOXNA  29     0.010    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 29.3 bits (65), Expect = 0.010
Identities = 13/83 (15%), Positives = 35/83 (42%), Gaps = 9/83 (10%)

Query: 33 ESKSVASAVFYKQIKILHLDFFSR---------SALNTDAEDTPLSTMVHVWQLKTREDF 83
+ + V+Y+ K + LD S+ + + + ++D+ S ++ + K + +
Sbjct: 161 INSEQSKEVYYEIGKGISLDIISKDKSLDPEFLNLIKSLSDDSDSSDLLFSQKFKEKLEL 220

Query: 84 DKADYDTLFMQEEKTLEKDVLAK 106
+ D F++E T + +
Sbjct: 221 NNKSIDINFIKENLTEFQHAFSL 243


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C3210  RTXTOXIND  29     0.048    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.0 bits (65), Expect = 0.048
Identities = 34/215 (15%), Positives = 62/215 (28%), Gaps = 20/215 (9%)

Query: 235 WPIFMAGMVVMAGLGGTGLW-GWSQLNQPDALIQRIQLSVMPLP-QSLESGELAKLDVKD 292
P + +M L + Q+ ++ S + +E+ + ++ VK+
Sbjct: 56 RPRLV-AYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKE 114

Query: 293 -------KALLAQDRT-----IAASQMQLEQLNKLPARWPLEQGYRQLRQLDAL----WP 336
LL +Q L Q R+ + +L +L L P
Sbjct: 115 GESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEP 174

Query: 337 DNPQVRALNAQWRKQRELSALSAEALNGYAQAQSQLQRLSAQLDALDERKGRYLTGSELK 396
V S N Q + L + A+ + R RY S ++
Sbjct: 175 YFQNVSEEEVLRLTSLIKEQFSTWQ-NQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 397 TAVYGIRQSLKEPPLEELLRQLEEQKQTGEVSPTL 431
+ SL LE++ + E L
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNEL 268


47  UTI89_C3363  UTI89_C3383  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C3363  -1      15       -4.369662   polysialic acid capsule synthesis protein KpsE
UTI89_C3364  2       23       -7.705485   polysialic acid transport protein KpsD
UTI89_C3365  3       35       -11.373194  3-deoxy-manno-octulosonate cytidylyltransferase
UTI89_C3366  3       42       -14.159097  capsule polysaccharide export protein KpsC
UTI89_C3367  7       53       -19.078420  polysialic acid capsule synthesis protein KpsS
UTI89_C3368  10      61       -21.598016  poly-alpha-2,8 sialosyl sialyltransferase
UTI89_C3369  9       60       -20.539083  NeuE protein
UTI89_C3370  7       54       -17.806539  polysialic acid biosynthesis protein P7
UTI89_C3371  5       48       -15.285616  acylneuraminate cytidylyltransferase
UTI89_C3372  2       34       -9.763720   sialic acid synthase NeuB
UTI89_C3373  -1      22       -4.528066   sialic acid synthase NeuD
UTI89_C3374  -2      17       -1.051560   polysialic acid transport ATP-binding protein
UTI89_C3375  0       16       1.936157    polysialic acid transport protein KpsM
UTI89_C3376  1       19       4.276705    general secretion pathway protein YghD
UTI89_C3377  0       16       4.281829    GspL-like protein
UTI89_C3378  0       20       4.772751    hypothetical protein
UTI89_C3379  -1      19       5.232861    type II secretion protein GspJ
UTI89_C3380  -2      19       4.533228    type II secretion protein GspI
UTI89_C3381  -2      16       4.076501    type II secretion protein GspH
UTI89_C3382  -3      16       3.367246    type II secretion protein GspG
UTI89_C3383  -2      15       3.143292    type II secretion protein GspF
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3375  ABC2TRNSPORT  33     6e-04    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 33.4 bits (76), Expect = 6e-04
Identities = 29/125 (23%), Positives = 54/125 (43%), Gaps = 10/125 (8%)

Query: 137 ITNFLQLVLTWSLLIILS--CGVGLIF----MVVGKTFPEMQKVL---PILLKPLYFISC 187
+ L SLL L GL F MVV P + +++ P+ F+S
Sbjct: 135 VAAALGYTQWLSLLYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSG 194

Query: 188 IMFPLHSIPKQYWSYLLWNPLVHVVELSREAVMPGYISE-GVSLNYLAMFTLVTLFIGLA 246
+FP+ +P + + + PL H ++L R ++ + + + L ++ ++ F+ A
Sbjct: 195 AVFPVDQLPIVFQTAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTA 254

Query: 247 LYRTR 251
L R R
Sbjct: 255 LLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3380  BCTERIALGSPH  32     3e-04    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 32.2 bits (73), Expect = 3e-04
Identities = 13/24 (54%), Positives = 18/24 (75%)

Query: 2 KRGFTLLEVMLALAIFALAAMAVL 25
+RGFTLLE+ML L + ++A VL
Sbjct: 3 QRGFTLLEMMLILLLMGVSAGMVL 26


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3381  BCTERIALGSPH  77     3e-20    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 77.3 bits (190), Expect = 3e-20
Identities = 42/196 (21%), Positives = 71/196 (36%), Gaps = 41/196 (20%)

Query: 1 MPERGFTLLEIMLVIFLIGLASAGVVQTFATASEPPAKKAAQDFLTRFAQFKDRAVIEGQ 60
M +RGFTLLE+ML++ L+G+++ V+ F + + A + F + + R + GQ
Sbjct: 1 MRQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQ 60

Query: 61 TLGVLIDPPGYQFMQRRHGQWLPVSATRLSAQVTVPKQVQMLLQPGSDIWQKEYALELQR 120
GV + P +QF+ + P D W L L+
Sbjct: 61 FFGVSVHPDRWQFLVLEARDGADPA-------------------PADDGWSGYRWLPLRA 101

Query: 121 RRL----TLHDIELEL-----QKEAKKKTPQIRFSPFEPATPFTLRFYSAAQNACWAVKL 171
R+ ++ +L L + P + P TPF L L
Sbjct: 102 GRVATSGSIAGGKLNLAFAQGEAWTPGDNPDVLIFPGGEMTPFRLT-------------L 148

Query: 172 AHDGALSLNQCDERMP 187
++ N E +P
Sbjct: 149 GEAPGIAFNARGESLP 164


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3382  BCTERIALGSPG  218    2e-76    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 218 bits (556), Expect = 2e-76
Identities = 91/146 (62%), Positives = 109/146 (74%), Gaps = 3/146 (2%)

Query: 6 RTQKPRAGFTLLEVMVVIVILGVLASLVVPNLLGNKEKADRQKAISDIVALENALDMYRL 65
R + GFTLLE+MVVIVI+GVLASLVVPNL+GNKEKAD+QKA+SDIVALENALDMY+L
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKL 61

Query: 66 DNGRYPTTEQGLEALIQQPANMADARNYRTGGYIKRLPKDPWGNDYQYLSPGEKGLFDVY 125
DN YPTT QGLE+L++ P A NY GYIKRLP DPWGNDY ++PGE G +D+
Sbjct: 62 DNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLL 121

Query: 126 TLGADGQENGEGAGADIGNWNLQEFQ 151
+ G DG+ E DI NW L + +
Sbjct: 122 SAGPDGEMGTED---DITNWGLSKKK 144


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3383  BCTERIALGSPF  453    e-161    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 453 bits (1167), Expect = e-161
Identities = 226/406 (55%), Positives = 301/406 (74%), Gaps = 1/406 (0%)

Query: 1 MALFYYQALERNGRKTKGMIEADSARHARQLLRGKDLIPVHI-EARMNASAGGLLQRRRH 59
MA ++YQAL+ G+K +G EADSAR ARQLLR + L+P+ + E R + G
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 60 AHRRVATADLALFTRQLATLVQAAMPLETCLQAVSEQSEKLHVKSLGMALRSRIQEGYTL 119
R++T+DLAL TRQLATLV A+MPLE L AV++QSEK H+ L A+RS++ EG++L
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 120 SDSLREHPRVFDSLFCSMVAAGEKSGHLDVVLNRLADYTEQRQRLKSRLLQAMLYPLVLL 179
+D+++ P F+ L+C+MVAAGE SGHLD VLNRLADYTEQRQ+++SR+ QAM+YP VL
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 180 VVATGVVTILLTAVVPKIIEQFDHLGHALPASTRMLIAMSDALQASGVYWLAGLLGLLVL 239
VVA VV+ILL+ VVPK++EQF H+ ALP STR+L+ MSDA++ G + L LL +
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 240 GQRLLKNPAMRLRWDKTLLRLPVTGRVARGLNTARFSRTLSILTASSVPLLEGIQTAAAV 299
+ +L+ R+ + + LL LP+ GR+ARGLNTAR++RTLSIL AS+VPLL+ ++ + V
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 300 SANRYVEQQLLLAADRVREGSSLRAALADLRLFPPMMLYMIASGEQSGELETMLEQAAVN 359
+N Y +L LA D VREG SL AL LFPPMM +MIASGE+SGEL++MLE+AA N
Sbjct: 301 MSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADN 360

Query: 360 QEREFDTQVGLALGLFEPALVVMMAGVVLFIVIAILEPMLQLNNMV 405
Q+REF +Q+ LALGLFEP LVV MA VVLFIV+AIL+P+LQLN ++
Sbjct: 361 QDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


48  UTI89_C3435  UTI89_C3454  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C3435  -2      13       -3.021078   hypothetical protein
UTI89_C3436  0       22       -6.090009   regulator
UTI89_C3437  1       24       -5.998886   oxidoreductase YdfI
UTI89_C3438  0       20       -4.197134   zinc-type alcohol dehydrogenase-like protein
UTI89_C3439  0       18       -3.912181   ureidoglycolate dehydrogenase
UTI89_C3440  -1      19       -3.930914   c4-dicarboxylate transport system binding
UTI89_C3441  -1      11       -0.424842   hypothetical protein
UTI89_C3442  -2      11       0.550872    c4-dicarboxylate permease
UTI89_C3443  -2      13       1.977748    repressor protein for FtsI
UTI89_C3444  -1      12       1.060192    1-acyl-sn-glycerol-3-phosphate acyltransferase
UTI89_C3445  -1      13       1.412578    hypothetical protein
UTI89_C3446  -1      11       1.557983    DNA topoisomerase IV subunit A
UTI89_C3447  -2      15       -0.660169   binding protein
UTI89_C3448  -1      20       -4.060382   hypothetical protein
UTI89_C3449  0       25       -6.518029   hypothetical protein
UTI89_C3450  0       23       -6.286353   DNA-binding transcriptional regulator QseB
UTI89_C3451  1       24       -7.206169   sensor protein QseC
UTI89_C3452  2       31       -10.718410  hypothetical protein
UTI89_C3453  0       23       -7.004698   hypothetical protein
UTI89_C3454  -2      19       -3.769276   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3439  CHLAMIDIAOM6  33     0.002    Chlamydia cysteine-rich outer membrane protein 6 si...
		>CHLAMIDIAOM6#Chlamydia cysteine-rich outer membrane protein 6 signature.

Length = 547

Score = 32.7 bits (74), Expect = 0.002
Identities = 16/36 (44%), Positives = 20/36 (55%), Gaps = 1/36 (2%)

Query: 112 VSVKNTSHCGALSYFAEMITH-KGLVAIVMTQTDTC 146
V VK+ S CG + AE T+ KG+ A M DTC
Sbjct: 406 VVVKSCSDCGTCTSCAEATTYWKGVAATHMCVVDTC 441


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C3450  HTHFIS  90     6e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.5 bits (222), Expect = 6e-23
Identities = 30/129 (23%), Positives = 56/129 (43%)

Query: 2 RILLIEDDMLIGDGIKTGLSKMGFSVDWFTQGRQGKEALYSAPYDAVILDLTLPGMDGRD 61
IL+ +DD I + LS+ G+ V + + + D V+ D+ +P + D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 ILREWREKGQREPVLILTARDALAERVEGLRLGADDYLCKPFALIEVAARLEALMRRTNG 121
+L ++ PVL+++A++ ++ GA DYL KPF L E+ + +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 QASNELRHG 130
+ S
Sbjct: 125 RPSKLEDDS 133


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C3451  PF06580  36     3e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 35.6 bits (82), Expect = 3e-04
Identities = 37/176 (21%), Positives = 60/176 (34%), Gaps = 29/176 (16%)

Query: 284 DRATRLVDQLLTLSRLDSLDNLQDVAEIPLEDLLQSSVMDIYHTAQQANIDVRLTLNANG 343
+A ++ L L R SL ++ L D L +V+D Y + RL
Sbjct: 191 TKAREMLTSLSELMRY-SLRYSNA-RQVSLADEL--TVVDSYLQLASIQFEDRLQFENQI 246

Query: 344 IKRTGQ----PLLLSLLVRNLLDNAVRYSPQGSVVDVTLNADN----FIVRDNGPGVTPE 395
P+L+ LV N + + + PQG + + DN V + G
Sbjct: 247 NPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN 306

Query: 396 ALARIGERFYRPPGQTATGSGLGLSIV-QRIAKLHDMNVEFG-NAEQGGFEAKVSW 449
T +G GL V +R+ L+ + + +QG A V
Sbjct: 307 ---------------TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLI 347


49  UTI89_C3475  UTI89_C3489  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C3475  0       13       -3.177874   disulfide oxidoreductase
UTI89_C3476  0       12       -3.261097   hypothetical protein
UTI89_C3477  0       17       -5.535238   zinc transporter ZupT
UTI89_C3478  0       17       -5.701221   hypothetical protein
UTI89_C3479  0       17       -5.289823   fimbrial protein
UTI89_C3480  0       18       -5.198930   outer membrane usher protein YqiG
UTI89_C3481  -1      26       -6.134450   periplasmic chaperone YqiH
UTI89_C3482  -2      25       -4.642221   Yqi fimbrial adhesin
UTI89_C3483  0       15       -0.518670   3,4-dihydroxy-2-butanone 4-phosphate synthase
UTI89_C3484  1       9        1.214113    hypothetical protein
UTI89_C3485  0       9        1.724904    glycogen synthesis protein GlgS
UTI89_C3486  0       9        1.896507    hypothetical protein
UTI89_C3487  0       10       3.047160    hypothetical protein
UTI89_C3488  -1      13       3.673171    bifunctional heptose 7-phosphate kinase/heptose
UTI89_C3489  -1      14       3.355724    bifunctional glutamine-synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C3480  PF00577  688    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 688 bits (1777), Expect = 0.0
Identities = 233/878 (26%), Positives = 401/878 (45%), Gaps = 70/878 (7%)

Query: 14 HAIKNALSG------VVCSLLFVLPVH--AVEFNVDMIDAEDRENIDISRFEKKGYIPPG 65
H K+ L+G V C+ P+ + FN + + + D+SRFE +PPG
Sbjct: 17 HIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPG 76

Query: 66 RYLVRVQINKNMLPQTLILEWVKADNESGSLLCLTKENLTNFGLNTEFIESLQNIAGSEC 125
Y V + +N + + + D+E G + CLT+ L + GLNT + + +A C
Sbjct: 77 TYRVDIYLNNGYMATRDV-TFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDAC 135

Query: 126 LDLSQR-QELTTRLDKATMILSLSVPQAWLKYQATNWTPPEFWDTGIAGFILDYNVYASQ 184
+ L+ + T +LD L+L++PQA++ +A + PPE WD GI +L+YN +
Sbjct: 136 VPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNS 195

Query: 185 YAPHHGDSTQNVSSYGTLGFNLGAWRLRSDYQYNQNFADGRSVNRDS-EFARTYLFRPIP 243
G ++ G N+GAWRLR + ++ N +D S +++ + T+L R I
Sbjct: 196 VQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDII 255

Query: 244 SWSSKFTMGQYDLSSNLYDTFHFTGASLESDESMLPPDLQGYAPQITGIAQTNAKVTVAQ 303
S+ T+G +++D +F GA L SD++MLP +G+AP I GIA+ A+VT+ Q
Sbjct: 256 PLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQ 315

Query: 304 NGRVLYQTTVAPGPFTISDL-GQSFQGLLDVTVEEEDGRTSTFQVGSASIPYLTRKGQVR 362
NG +Y +TV PGPFTI+D+ G L VT++E DG T F V +S+P L R+G R
Sbjct: 316 NGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTR 375

Query: 363 YKTSLGKPTSVGHNDINNPFFWTAEASWGWLNNVSLYGGGMFTADDYQAITTGIGFNLNQ 422
Y + G+ S P F+ + G ++YGG AD Y+A GIG N+
Sbjct: 376 YSITAGEYRSGNAQQE-KPRFFQSTLLHGLPAGWTIYGGTQL-ADRYRAFNFGIGKNMGA 433

Query: 423 FGSLSFDVTGADASLQQQNSGNLRGYSYRFNYAKHFESTGSQITFAGYRFSDKDYVSMSE 482
G+LS D+T A+++L G S RF Y K +G+ I GYR+S Y + ++
Sbjct: 434 LGALSVDMTQANSTLPDD--SQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFAD 491

Query: 483 YLSSRNGDESID--------------------NEKESYVISLNQYFETLELNSYLNVTRN 522
SR +I+ N++ +++ Q YL+ +
Sbjct: 492 TTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRT-STLYLSGSHQ 550

Query: 523 TYWDS-ASNTNYSVSVSKNFDIGDFKGISASLAVSRIR--WDDDEENQYYFSFSLPL--- 576
TYW + + + ++ F+ I+ +L+ S + W + + ++P
Sbjct: 551 TYWGTSNVDEQFQAGLNTA-----FEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHW 605

Query: 577 --------QQNRNISYSMQRTGSSNTSQMISWYDS--SDRNNIWNISASATDDNIRDGEP 626
++ + SYSM + + + Y + D N +++ +
Sbjct: 606 LRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGS 665

Query: 627 TLRGSYQHYSPWGRLNINGSVQPNQYNSVTAGWYGSLTATRHGIALHDYSYGDNARMMVD 686
T + + +G NI S + + G G + A +G+ L ++ ++V
Sbjct: 666 TGYATLNYRGGYGNANIGYSHS-DDIKQLYYGVSGGVLAHANGVTLGQPL--NDTVVLVK 722

Query: 687 TDGISGIEINSNRTV-TNGLGIAVIPSLSNYTTSMLRVNNNDLPEGVDVENSVIRTTLTQ 745
G ++ + V T+ G AV+P + Y + + ++ N L + VD++N+V T+
Sbjct: 723 APGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTR 782

Query: 746 GAIGYAKLNATTGYQIVGVIRQENGRFPPLGVNVTDKATGKDVGLVAEDGFVYLSGIQEN 805
GAI A+ A G +++ + N + P G VT + + G+VA++G VYLSG+
Sbjct: 783 GAIVRAEFKARVGIKLLMTLTH-NNKPLPFGAMVTS-ESSQSSGIVADNGQVYLSGMPLA 840

Query: 806 STLHLTWGD---NTCEVT---PPNQSNISESAIILPCK 837
+ + WG+ C PP + + C+
Sbjct: 841 GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C3487  IGASERPTASE  50     2e-08    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 50.4 bits (120), Expect = 2e-08
Identities = 47/287 (16%), Positives = 92/287 (32%), Gaps = 16/287 (5%)

Query: 197 PNNAFDAEGLTKLTQETERRRRERNEVEQDVEVAVREKNRDALSRKLEIEQQEAFMTLEQ 256
N A+ + + E R A + + ++ E +QE+ +
Sbjct: 999 TPNNIQAD-VPSVPSNNEEIARVDEAPVPPPAPATPSETTETVA---ENSKQESKTVEKN 1054

Query: 257 EQQVKTRTAEQNAKIAAFEAERRREAE-QTRILAERQIQETEIDREQAVRSRKVEAEREV 315
EQ TA+ + A EA+ +A QT +A+ + E + + VE E +
Sbjct: 1055 EQDATETTAQN--REVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKA 1112

Query: 316 RIKEIEQQQVTEIANQTKSIAIAAKSEQ---QSQAEARANLALAEAVSAQQNVETTRQTA 372
+++ + Q+V ++ +Q +++ Q + E + + E S T Q A
Sbjct: 1113 KVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPA 1172

Query: 373 EADRAKQVALIAAAQDAET------KAVELTVRAKAEKEAAEMQAAAIVELAEATRKKGL 426
+ + + + T T +E + R
Sbjct: 1173 KETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPH 1232

Query: 427 AEAEAQRALNDAINVLSDEQTSLKFKLALLQALPAVIEKSVEPMKAI 473
A + ND V + TS L A ++ KA+
Sbjct: 1233 NVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAV 1279


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3488  LPSBIOSNTHSS  29     0.029    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 29.0 bits (65), Expect = 0.029
Identities = 10/37 (27%), Positives = 20/37 (54%)

Query: 347 GVFDILHAGHVSYLANARKLGDRLIVAVNSDASTKRL 383
G FD + GH+ + +L D++ VAV + + + +
Sbjct: 7 GSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPM 43


50  UTI89_C3634  UTI89_C3659  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C3634  2       16       -0.464120   3-deoxy-D-manno-octulosonate 8-phosphate
UTI89_C3635  1       17       -0.633139   hypothetical protein
UTI89_C3636  2       16       0.163307    lipopolysaccharide transport periplasmic protein
UTI89_C3637  3       17       0.386882    ABC transporter ATP-binding protein YhbG
UTI89_C3638  3       15       -0.156435   RNA polymerase factor sigma-54
UTI89_C3639  0       17       -0.456738   sigma(54) modulation protein
UTI89_C3640  -1      15       0.202035    PTS system transporter subunit IIA-like
UTI89_C3641  0       13       -0.233014   hypothetical protein
UTI89_C3642  -1      13       0.546990    phosphohistidinoprotein-hexose
UTI89_C3643  -1      13       0.272437    hypothetical protein
UTI89_C3644  -1      17       2.872361    monofunctional biosynthetic peptidoglycan
UTI89_C3645  -1      20       3.230448    isoprenoid biosynthesis protein with
UTI89_C3646  -1      20       2.989926    aerobic respiration control sensor protein ArcB
UTI89_C3647  -1      21       4.431394    hypothetical protein
UTI89_C3648  -1      20       4.510905    hypothetical protein
UTI89_C3649  -2      20       4.761974    glutamate synthase subunit alpha
UTI89_C3650  -2      13       3.632239    glutamate synthase subunit beta
UTI89_C3651  -2      12       3.233218    hypothetical protein
UTI89_C3652  -1      13       3.998868    N-acetylmannosamine kinase
UTI89_C3653  -1      17       3.346760    N-acetylmannosamine-6-phosphate 2-epimerase
UTI89_C3654  0       18       2.367417    sialic acid transporter
UTI89_C3655  3       22       1.341203    N-acetylneuraminate lyase
UTI89_C3656  4       28       1.064311    transcriptional regulator NanR
UTI89_C3657  1       19       0.679693    ClpXP protease specificity-enhancing factor
UTI89_C3658  1       20       0.575783    stringent starvation protein A
UTI89_C3659  2       21       0.650250    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits      Score  E-value  Comments
UTI89_C3635  MYCMG045  29     0.017    Hypothetical mycoplasma lipoprotein (MG045) signature.
		>MYCMG045#Hypothetical mycoplasma lipoprotein (MG045) signature.

Length = 483

Score = 28.5 bits (63), Expect = 0.017
Identities = 30/144 (20%), Positives = 53/144 (36%), Gaps = 17/144 (11%)

Query: 57 ALSYRLIAQHVEYYSDQAVSWFTQPVLTTFDKDKIPTWSVKADKAKLTNDRMLYLYGHVE 116
AL +I + + +A L D+ K +K D+ T+D YL G ++
Sbjct: 307 ALDLLVINKQQSNFQKEAHEIIFDLALDGADQTKEQL--IKTDEELGTDDEDFYLKGAMQ 364

Query: 117 ----VNALVPDSQLRRITT----------DNAQINLVTQDVTSEDLVTLYGTTFNSSGLK 162
VN + P + +T + + T +TSE Y T + K
Sbjct: 365 NFSYVNYVSPLKVISDPSTGIVSSKKNNAEMKSKQMSTDQMTSEKEFDYYTETLKALLEK 424

Query: 163 M-RGNLRSKNAELIEKVRTSYEIQ 185
L +L+E ++ +Y I+
Sbjct: 425 EDSAELNENEKKLVETIKKAYTIE 448


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C3646  HTHFIS  64     7e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 64.5 bits (157), Expect = 7e-13
Identities = 26/115 (22%), Positives = 45/115 (39%), Gaps = 4/115 (3%)

Query: 528 VLLVEDIELNVIVARSVLEKLGNSVDVAMTGKAALEMFKPGEYDLVLLDIQLPDMTGLDI 587
+L+ +D V L + G V + G+ DLV+ D+ +PD D+
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 588 SRELTKRYPREDLPPLVALTA-NVLKDKQEYLNAGMDDVLSKPLSVPALTAMIKK 641
+ K P P++ ++A N + G D L KP + L +I +
Sbjct: 66 LPRIKKARPD---LPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C3654  TCRTETB  59     1e-11    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 59.1 bits (143), Expect = 1e-11
Identities = 81/455 (17%), Positives = 158/455 (34%), Gaps = 46/455 (10%)

Query: 40 LLDGFDFVLIALVLTEVQGEFGLTTVQAASLISAAFISRWFGGLMLGAMGDRYGRRLAMV 99
+ +++ + L ++ +F + +A ++ G + G + D+ G + ++
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 100 TSIVLFSAGTLACGFAPGYITMFI-ARLVIGMGMAGEYGSSATYVIESWPKHLRNKASGF 158
I++ G++ + ++ I AR + G G A V PK R KA G
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 159 LISGFSVGAVVAAQVYSLVVPVWGWRALFFIGILPIIFALWLRKNIPEAEDWKEKHGGKA 218
+ S ++G V + ++ W L I ++ II +L K + +
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVR--------- 194

Query: 219 PVRTMVDILYRGEHRIANIVMTLAAATALWFCFAGNLQNAAIVAVLGLLCAAIFISFMVQ 278
+G I I++ + IV+VL L IF+ + +
Sbjct: 195 ---------IKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFL---IFVKHIRK 242

Query: 279 STGK----RWPTGVMLMVVVLFAFLYSWPIQA---LLPTYLKTDLAYDPHTVANVLFFSG 331
T + M+ VL + + ++P +K + +V+ F G
Sbjct: 243 VTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPG 302

Query: 332 -FGAAVGCCVGGFLGDWLGTRK-AYVCSLLASQLLIIPVFAIGGANVWVLGLLLFFQQML 389
+ +GG L D G + S + F + W + +++ F
Sbjct: 303 TMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLET-TSWFMTIIIVFVLGG 361

Query: 390 GQGIAGILPKLIGGYFDTDQRAAGLGFTYNVGALGGALAP-ILGALIA-----QRL---- 439
++ ++ + AG+ L I+G L++ QRL
Sbjct: 362 LSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPME 421

Query: 440 -DLGTALAS---LSFSLTFVVILLIGLDMPSRVQR 470
D T L S L FS V+ L+ L++ QR
Sbjct: 422 VDQSTYLYSNLLLLFSGIIVISWLVTLNVYKHSQR 456


51  UTI89_C3688  UTI89_C3694  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C3688  0       13       -3.353633   acetyl-CoA carboxylase biotin carboxyl carrier
UTI89_C3689  0       14       -4.123673   acetyl-CoA carboxylase biotin carboxylase
UTI89_C3690  0       22       -5.244204   hypothetical protein
UTI89_C3691  0       24       -6.291834   hypothetical protein
UTI89_C3692  0       25       -6.354903   ribose transport system permease RbsC
UTI89_C3693  0       21       -5.031555   ribose transport ATP-binding protein RbsA
UTI89_C3694  0       19       -3.954015   ribose ABC transporter
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C3688  RTXTOXIND  27     0.026    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 27.5 bits (61), Expect = 0.026
Identities = 8/27 (29%), Positives = 16/27 (59%)

Query: 127 IEADKSGTVKAILVESGQPVEFDEPLV 153
I+ ++ VK I+V+ G+ V + L+
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLL 125


52  UTI89_C3925  UTI89_C3934  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C3925  -1      15       -3.701439   thiosulfate sulfurtransferase
UTI89_C3926  0       20       -5.398064   glycerol-3-phosphate dehydrogenase
UTI89_C3927  1       38       -9.531603   hypothetical protein
UTI89_C3928  1       39       -9.867102   hypothetical protein
UTI89_C3929  2       42       -10.264775  hypothetical protein
UTI89_C3930  2       45       -11.395819  Auf fimbrial chaperone 2
UTI89_C3931  3       42       -10.319276  Auf fimbriae minor subunit AufE
UTI89_C3932  1       22       -6.365214   Auf fimbriae minor subunit AufD
UTI89_C3933  0       17       -4.399631   hypothetical protein
UTI89_C3934  0       14       -3.431136   outer membrane usher protein AufC
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C3934  PF00577  883    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 883 bits (2282), Expect = 0.0
Identities = 398/866 (45%), Positives = 570/866 (65%), Gaps = 28/866 (3%)

Query: 19 KRVVPLLLVIMPACSIA--------GMRFNPAFLSGDTEAVADLSRFEKGMTYLPGSYEV 70
R+ + + AC+ A + FNP FL+ D +AVADLSRFE G PG+Y V
Sbjct: 21 HRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRV 80

Query: 71 EVWVNDSPLLSRTVTFKADDANQ-LIPCLSLADLLSLGINKNALPEQALASSENSCLDLR 129
++++N+ + +R VTF D+ Q ++PCL+ A L S+G+N ++ L + +++C+ L
Sbjct: 81 DIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLA-DDACVPLT 139

Query: 130 IWFPDVHYMPELDAQRLKLTFPQAIIKRDARGYIPPEQWDNGITAFLLNYDFSGN--NDR 187
D ++ QRL LT PQA + ARGYIPPE WD GI A LLNY+FSGN +R
Sbjct: 140 SMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNR 199

Query: 188 GDYSSNNYYLNLRAGINIGAWRFRDYSTWSR-----GSNSAGKLEHISSTLQRVIIPFRS 242
+S+ YLNL++G+NIGAWR RD +TWS S S K +HI++ L+R IIP RS
Sbjct: 200 IGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRS 259

Query: 243 ELTLGDTWSSSDVFDSVSIRGIKLESDENMLPDSQSGFAPTVRGIAKSRAQVTIKQNGYV 302
LTLGD ++ D+FD ++ RG +L SD+NMLPDSQ GFAP + GIA+ AQVTIKQNGY
Sbjct: 260 RLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYD 319

Query: 303 IYQTYMPPGPFEISDLNPTSSAGDLEVTIKESDNSETVYTVPYAAVPILQREGHSKYSTT 362
IY + +PPGPF I+D+ ++GDL+VTIKE+D S ++TVPY++VP+LQREGH++YS T
Sbjct: 320 IYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSIT 379

Query: 363 VGQYRSNSYNQKSPYIFQGELIWGLPWDITAYGGAQFSEDYRALALGLGLNLGVFGATSF 422
G+YRS + Q+ P FQ L+ GLP T YGG Q ++ YRA G+G N+G GA S
Sbjct: 380 AGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSV 439

Query: 423 DVTQANSSLVDGSKHQGQSYRFLYSKSLVQTGTAFHIIGYRYSTQGFYTLSDTTYQQMSG 482
D+TQANS+L D S+H GQS RFLY+KSL ++GT ++GYRYST G++ +DTTY +M+G
Sbjct: 440 DMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNG 499

Query: 483 TVVDPKTLDDKDYVYNWNDFYNLRYSKRGKFQASVSQPFGNYGSMYLSASQQTYWNTDKK 542
++ + + + D+YNL Y+KRGK Q +V+Q G ++YLS S QTYW T
Sbjct: 500 YNIETQDGVIQVKPK-FTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNV 558

Query: 543 DSLYQVGYNTSIKGIYLNVAWNYSKSPGTN-ADKIVSLNVSLPISNWLSSTNDGRSSSNA 601
D +Q G NT+ + I ++++ +K+ D++++LNV++P S+WL S D +S
Sbjct: 559 DEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRS--DSKSQWRH 616

Query: 602 MTATYGYSQDNHGQVNQYTGVSGSLLEQHNLSYNIQHGFANQDNSSSGSVG---VNYRGA 658
+A+Y S D +G++ GV G+LLE +NLSY++Q G+A + +SGS G +NYRG
Sbjct: 617 ASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGG 676

Query: 659 YGSLNSAYSYDNEGNQQINYGISGALVVHENGLTLSQPLGETNVLIKAPGANNVDVQRGT 718
YG+ N YS+ + +Q+ YG+SG ++ H NG+TL QPL +T VL+KAPGA + V+ T
Sbjct: 677 YGNANIGYSHSD-DIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQT 735

Query: 719 GISTDWRGYAVVPYATEYRRNNISLDPMSMNMHTELDITSTEVIPGKGALVRAEFAAHIG 778
G+ TDWRGYAV+PYATEYR N ++LD ++ + +LD V+P +GA+VRAEF A +G
Sbjct: 736 GVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVG 795

Query: 779 IRGLFTVRYRNKSVPFGATASAQIKNSSQITGIVGDNGQLYLSGLPLEGVINIQWGDGVQ 838
I+ L T+ + NK +PFGA +++ SSQ +GIV DNGQ+YLSG+PL G + ++WG+
Sbjct: 796 IKLLMTLTHNNKPLPFGAMVTSE---SSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEEN 852

Query: 839 QKCQANYKLPETELDNPVSYATLECR 864
C ANY+LP ++ + ECR
Sbjct: 853 AHCVANYQLPPESQQQLLTQLSAECR 878


53  UTI89_C3947  UTI89_C4004  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C3947  0       16       -4.616339   hypothetical protein
UTI89_C3948  1       21       -6.094800   dehydrogenase
UTI89_C3949  1       17       -3.670286   hypothetical protein
UTI89_C3950  0       16       -3.100785   hypothetical protein
UTI89_C3951  1       14       -1.471521   acetyltransferase YhhY
UTI89_C3952  0       13       0.053051    hypothetical protein
UTI89_C3953  -1      16       2.513140    hypothetical protein
UTI89_C3954  -2      19       3.162765    gamma-glutamyltranspeptidase
UTI89_C3955  -2      23       3.069004    hypothetical protein
UTI89_C3956  -2      25       3.570330    cytoplasmic glycerophosphodiester
UTI89_C3957  -2      25       3.334400    glycerol-3-phosphate transporter ATP-binding
UTI89_C3958  -1      26       3.221732    glycerol-3-phosphate transporter membrane
UTI89_C3959  -2      26       3.691179    glycerol-3-phosphate transporter permease
UTI89_C3960  -2      23       3.231048    glycerol-3-phosphate transporter periplasmic
UTI89_C3961  -2      22       3.161021    leucine/isoleucine/valine transporter
UTI89_C3962  -2      19       2.913006    leucine/isoleucine/valine transporter
UTI89_C3963  -1      18       2.690988    leucine/isoleucine/valine transporter permease
UTI89_C3964  -1      20       1.880775    branched-chain amino acid transporter permease
UTI89_C3965  -1      16       -0.414220   leucine-specific binding protein
UTI89_C3966  -1      16       -4.052609   hypothetical protein
UTI89_C3967  0       18       -4.364050   acyltransferase
UTI89_C3968  1       20       -6.307257   hypothetical protein
UTI89_C3969  2       23       -7.954238   Leu/Ile/Val-binding protein
UTI89_C3970  4       34       -11.156797  phosphotransferase system protein
UTI89_C3971  2       24       -8.565964   phosphotransferase system protein
UTI89_C3972  -1      17       -5.806646   phosphotransferase system protein
UTI89_C3973  -2      14       -4.146727   phosphotransferase system protein
UTI89_C3974  0       12       -1.994981   phosphoglycerate dehydrogenase
UTI89_C3975  1       14       0.494388    dihydrodipicolinate synthase
UTI89_C3976  1       17       1.814057    RNA polymerase factor sigma-32
UTI89_C3977  2       14       1.567144    cell division protein FtsX
UTI89_C3978  2       14       2.097428    cell division protein FtsE
UTI89_C3979  1       13       3.423998    cell division protein FtsY
UTI89_C3980  -1      15       3.699811    16S rRNA m(2)G966-methyltransferase
UTI89_C3981  0       16       3.645395    hypothetical protein
UTI89_C3982  -1      16       3.650004    hypothetical protein
UTI89_C3983  -1      17       3.601402    hypothetical protein
UTI89_C3984  -1      14       3.528894    zinc/cadmium/mercury/lead-transporting ATPase
UTI89_C3985  1       16       0.906731    hypothetical protein
UTI89_C3986  0       15       1.578069    sulfur transfer protein SirA
UTI89_C3987  1       15       1.431393    hypothetical protein
UTI89_C3988  0       15       1.920785    hypothetical protein
UTI89_C3989  1       17       2.847071    major facilitator superfamily transporter
UTI89_C3990  -2      19       3.509295    hypothetical protein
UTI89_C3991  0       23       5.196531    holo-(acyl carrier protein) synthase 2
UTI89_C3992  -1      25       5.270847    hypothetical protein
UTI89_C3993  0       25       5.443937    nickel-binding periplasmic protein
UTI89_C3994  0       21       4.324592    nickel transporter permease NikB
UTI89_C3995  -2      19       3.012671    nickel transporter permease NikC
UTI89_C3996  0       18       0.804304    nickel transporter ATP-binding protein NikD
UTI89_C3997  2       18       -1.251636   nickel transporter ATP-binding protein NikE
UTI89_C3998  1       17       -1.269899   nickel responsive regulator
UTI89_C3999  2       18       -1.882524   regulator
UTI89_C4000  2       16       -0.891135   phosphotransferase system enzyme subunit
UTI89_C4001  1       14       0.086603    phosphotransferase system enzyme subunit
UTI89_C4002  0       16       1.977271    PTS system galactitol-specific transporter
UTI89_C4003  0       16       3.255711    xylulose kinase
UTI89_C4004  1       18       3.323552    phosphocarrier protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3951  SACTRNSFRASE  35     4e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 35.3 bits (81), Expect = 4e-05
Identities = 20/92 (21%), Positives = 32/92 (34%), Gaps = 16/92 (17%)

Query: 55 VACIDGIVVGHLTIDVQQRPRRSHVADFGICVDSRWKNRGVASALMREMID------MCD 108
+ ++ +G + I + + D + D R K GV +AL+ + I+ C
Sbjct: 69 LYYLENNCIGRIKIR-SNWNGYALIEDIAVAKDYRKK--GVGTALLHKAIEWAKENHFCG 125

Query: 109 NWLRVDRIELTVFVDNAPAIKVYKKFGFEIEG 140
L I N A Y K F I
Sbjct: 126 LMLETQDI-------NISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3954  NAFLGMOTY     32     0.005    Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 32.0 bits (72), Expect = 0.005
Identities = 27/80 (33%), Positives = 36/80 (45%), Gaps = 13/80 (16%)

Query: 272 RTPISGDYRGYQVYSMPPPSSGGIHIVQILNILENFDMQKYGF-GSADAMQIMAEAEKYA 330
R P+ G+ R + SMPPP G H +I N+ F Q G+ G A I++E EK
Sbjct: 77 RRPM-GETRNVSLISMPPPWRPGEHADRITNL--KFFKQFDGYVGGQTAWGILSELEKGR 133

Query: 331 YADRSEYLGDPDFVKVPWQA 350
Y P F WQ+
Sbjct: 134 Y---------PTFSYQDWQS 144


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3956  PF04619       28     0.020    Dr-family adhesin
		>PF04619#Dr-family adhesin

Length = 160

Score = 28.4 bits (63), Expect = 0.020
Identities = 12/60 (20%), Positives = 22/60 (36%), Gaps = 4/60 (6%)

Query: 29 VGAKYGHKMIEFDAKLSKDGEIFLLHDDNLERTSNGWGVAGELNWQD----LLRVDAGSW 84
+G ++ D + G+ FL+ D+N ++ W + D GSW
Sbjct: 70 LGCDARQVALKADTDNFEQGKFFLISDNNRDKLYVNIRPTDNSAWTTDNGVFYKNDVGSW 129


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3957  PF05272       29     0.042    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.9 bits (64), Expect = 0.042
Identities = 10/29 (34%), Positives = 16/29 (55%)

Query: 33 IVMVGPSGCGKSTLLRMVAGLERVTTGDI 61
+V+ G G GKSTL+ + GL+ +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHF 627


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3960  MALTOSEBP     39     2e-05    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 39.3 bits (91), Expect = 2e-05
Identities = 39/160 (24%), Positives = 66/160 (41%), Gaps = 14/160 (8%)

Query: 134 GHLLSQPFNSSTPVLYYNKDAFKKAGLDPEQPPKTWQDLADYAAKLKASGMKCGYASGWQ 193
G L++ P L YNKD PPKTW+++ +LKA G + +
Sbjct: 127 GKLIAYPIAVEALSLIYNKDLLP-------NPPKTWEEIPALDKELKAKGKSALMFNLQE 179

Query: 194 GWIQLENFSAWNGLPFASKNNGFDGTDAVLEF--NKPEQVKHIAMLEEMNKKGDFSYVGR 251
+ +A G F +N +D D ++ K + +++ + D Y
Sbjct: 180 PYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDY--- 236

Query: 252 KDESTEKFYNGDCAMTTASSGSLANIREYAKFNYGVGMMP 291
+ F G+ AMT + +NI + +K NYGV ++P
Sbjct: 237 -SIAEAAFNKGETAMTINGPWAWSNI-DTSKVNYGVTVLP 274


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3979  IGASERPTASE   52     7e-09    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 51.6 bits (123), Expect = 7e-09
Identities = 44/208 (21%), Positives = 67/208 (32%), Gaps = 21/208 (10%)

Query: 20 QTPEK-ETEVQNEQPVVEEIVQAQEPVKASEHAVEEQPQAHTEAEAETFAANVVEVTEQV 78
TP + +V + EEI + E T AE + VE EQ
Sbjct: 998 TTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQD 1057

Query: 79 AESEKAQ---------------PEAEVVAQPESVVEETPEPVAIEREELPLPEDVNAEAV 123
A AQ + VAQ S +ET E + E E
Sbjct: 1058 ATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVET- 1116

Query: 124 SPEEWQAEAETVEIVEAAEEEAAKEEITDEEPEAQALAAEVAEEA-VMVVSPAEEEQPVE 182
E E V + ++E ++ EP + +E + A+ EQP +
Sbjct: 1117 ---EKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAK 1173

Query: 183 EIAQEQEKPTKEGFFARLKRSLLKTKEN 210
E + E+P E S+++ EN
Sbjct: 1174 ETSSNVEQPVTESTTVNTGNSVVENPEN 1201



Score = 50.8 bits (121), Expect = 1e-08
Identities = 39/197 (19%), Positives = 66/197 (33%), Gaps = 10/197 (5%)

Query: 19 EQTPEKETEVQNEQPVVEEIVQAQEPVKASEHAVEEQPQ----AHTEAEAETFAANVVEV 74
E +T +NEQ E Q +E K ++ V+ Q A + +E + +
Sbjct: 1043 NSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE 1102

Query: 75 TEQVAESEKAQPEAEVVAQPESVVEETPEPVAIEREELPL--PEDVNAEAVSPEEWQAEA 132
T V + EKA+ E E + V + P P N V+ +E Q++
Sbjct: 1103 TATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQT 1162

Query: 133 ETVEIVEAAEEEAAKEEITDEEPEAQALAAEVAEEAVMVVSPAEEEQPVEEIAQEQEKPT 192
T A E+ AKE ++ E +V+ + +
Sbjct: 1163 NT----TADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNK 1218

Query: 193 KEGFFARLKRSLLKTKE 209
+ R RS+ E
Sbjct: 1219 PKNRHRRSVRSVPHNVE 1235



Score = 42.0 bits (98), Expect = 6e-06
Identities = 28/156 (17%), Positives = 45/156 (28%), Gaps = 14/156 (8%)

Query: 17 QKEQTPEKETEVQNEQPVVEEIVQAQEPVKASE------HAVEEQPQAHTEAEAETFAAN 70
Q +T E T + E+ VE + P S+ + QPQA E +
Sbjct: 1096 QTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNI 1155

Query: 71 VVEVTEQVAESEKAQPEAEVVAQPESVVEETPEPVAIEREELPLPEDVNAEAVSPEEWQA 130
++ ++ QP E + E V E+ V + PE+ P
Sbjct: 1156 KEPQSQTNTTADTEQPAKETSSNVEQPVTES-TTVNTGNSVVENPENTTPATTQPTVNSE 1214

Query: 131 EAETVEI-------VEAAEEEAAKEEITDEEPEAQA 159
+ + E A D A
Sbjct: 1215 SSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALC 1250



Score = 40.4 bits (94), Expect = 2e-05
Identities = 27/178 (15%), Positives = 53/178 (29%), Gaps = 9/178 (5%)

Query: 17 QKEQTPEKETEVQNEQPVVEEIVQAQEPVKASEHAVEEQPQAHTEAEAETFAANVVEVTE 76
+E E ++ V+ E E + +E E +A+ EV +
Sbjct: 1065 NREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPK 1124

Query: 77 QVAESEKAQPEAEVVAQPESVVEETPEPVAIEREELPLPEDVNAEAVSPEEWQAEAETVE 136
++ Q ++E V QP++ +P + +E + A+ P + ET
Sbjct: 1125 VTSQVSPKQEQSETV-QPQAEPARENDPT-VNIKEPQSQTNTTADTEQPAK-----ETSS 1177

Query: 137 IVEAAEEEAAKEEITDEEPEA--QALAAEVAEEAVMVVSPAEEEQPVEEIAQEQEKPT 192
VE E+ + E A S + + +
Sbjct: 1178 NVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVE 1235


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3986  PF01206       105    3e-34    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 105 bits (265), Expect = 3e-34
Identities = 24/72 (33%), Positives = 41/72 (56%)

Query: 9 DHTLDALGLRCPEPVMMVRKTVRNMQPGETLLIIADDPATTRDIPGFCTFMEHELVAKET 68
D +LDA GL CP P++ +KT+ M GE L ++A DP + +D F HEL+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 69 DGLPYRYLIRKG 80
+ Y + +++
Sbjct: 65 EDGTYHFRLKRA 76


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3989  TCRTETA       53     8e-10    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 53.3 bits (128), Expect = 8e-10
Identities = 80/398 (20%), Positives = 147/398 (36%), Gaps = 32/398 (8%)

Query: 13 LRLNLRIVSIVMFNFASYLTIGLPLAVLPGYVHDVM--GFSAFWAGLVISLQYFATLLSR 70
++ N ++ I+ + IGL + VLPG + D++ G++++L
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 71 PHAGRYADLLGPKKIVVFGLCGCFLSGLGYLTAGLTASLPVISLLLLCLGRVILGI-GQS 129
P G +D G + +++ L G + + Y L V L +GR++ GI G +
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAG---AAVDYAIMATAPFLWV-----LYIGRIVAGITGAT 112

Query: 130 FAGTGSTLWGVGVVGSL--HIGRVISWNGIVTYGAMAMGAPLGVVFYHWGGLQALALIIM 187
A G+ + + H G + + G +G +G H A AL +
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGL 172

Query: 188 GVALVAILLAIPRPTVK--ASKGKPLPFRAVLGRVWLYGMALALA-----SAGFGVIATF 240
LL + + P + + +A +A V A
Sbjct: 173 NFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAAL 232

Query: 241 ITLFYDAK-GWDGAAFALTLFSCAFVGT---RLLFPNGINRIGGLNVAMICFSVEIIGLL 296
+F + + WD ++L + + + ++ R+G M+ + G +
Sbjct: 233 WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYI 292

Query: 297 LVGVATMPWMAKIG-VLLAGAGFSLVFPALGVVAVKAVPQQNQGAALATYTVFMDLSLGV 355
L+ AT WMA VLLA G + PAL + + V ++ QG + L+ +
Sbjct: 293 LLAFATRGWMAFPIMVLLASGGIGM--PALQAMLSRQVDEERQGQLQGSLAALTSLT-SI 349

Query: 356 TGPLAGLVMSWAGVPV----IYLAAAGLVAIALLLTWR 389
GPL + A + ++A A L + L R
Sbjct: 350 VGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3994  BORPETOXINB   28     0.048    Bordetella pertussis toxin B subunit signature.
		>BORPETOXINB#Bordetella pertussis toxin B subunit signature.

Length = 226

Score = 27.7 bits (61), Expect = 0.048
Identities = 21/77 (27%), Positives = 32/77 (41%), Gaps = 10/77 (12%)

Query: 204 GQRHVTWARLRGLSDKQTERRHILRNASLPMITAVGMHIGELIGGTMIIENIFAWPGVG- 262
R +T A LRG D Q RH+ R S+ + G ++G GG +I++ PG
Sbjct: 53 KTRALTVAELRGSGDLQEYLRHVTRGWSIFALYD-GTYLGGEYGG--VIKD--GTPGGAF 107

Query: 263 ----RYAVSAIFNRDYP 275
+ + N P
Sbjct: 108 DLKTTFCIMTTRNTGQP 124


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3997  HTHFIS        30     0.008    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.2 bits (68), Expect = 0.008
Identities = 10/34 (29%), Positives = 19/34 (55%)

Query: 25 QAVLNNVSLALKSGETVALLGRSGCGKSTLARLL 58
Q + ++ +++ T+ + G SG GK +AR L
Sbjct: 147 QEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180


54  UTI89_C4015  UTI89_C4026  Y  N  N  Genomic Island
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C4015  -2      21       3.031604    inner membrane transporter YhiP
UTI89_C4016  -1      23       3.570862    methyltransferase
UTI89_C4017  -2      23       2.504393    oligopeptidase A
UTI89_C4018  0       18       0.852641    hypothetical protein
UTI89_C4019  1       16       -1.763349   glutathione reductase
UTI89_C4020  3       22       -6.522236   hypothetical protein
UTI89_C4021  3       21       -8.137633   arsenate reductase
UTI89_C4022  1       21       -6.925187   hypothetical protein
UTI89_C4023  -1      14       -4.402821   hypothetical protein
UTI89_C4024  -1      15       -4.356631   starvation induced outer membrane protein
UTI89_C4025  -1      16       -4.350394   hypothetical protein
UTI89_C4026  -2      15       -3.889646   transcriptional regulator YhiF
55  UTI89_C4117  UTI89_C4184  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C4117  2       19       -2.318524   2,3-diketo-L-gulonate reductase
UTI89_C4118  1       17       -2.674392   hypothetical protein
UTI89_C4119  -1      19       -0.210971   hypothetical protein
UTI89_C4120  -1      20       1.231920    2,3-diketo-L-gulonate TRAP transporter small
UTI89_C4121  -2      18       2.229346    hypothetical protein
UTI89_C4122  -2      20       3.727638    hypothetical protein
UTI89_C4123  -3      19       4.371135    ABC transporter periplasmic binding protein
UTI89_C4124  -3      19       4.782833    cryptic L-xylulose kinase
UTI89_C4125  -2      14       3.399588    3-keto-L-gulonate-6-phosphate decarboxylase
UTI89_C4126  -2      12       2.001411    L-xylulose 5-phosphate 3-epimerase
UTI89_C4127  -1      12       2.708565    L-ribulose-5-phosphate 4-epimerase
UTI89_C4128  -1      12       2.752955    hypothetical protein
UTI89_C4129  0       13       2.311344    aldehyde dehydrogenase
UTI89_C4130  0       12       2.093966    hypothetical protein
UTI89_C4131  0       12       2.635849    alcohol dehydrogenase
UTI89_C4132  -3      14       3.163920    selenocysteinyl-tRNA-specific translation
UTI89_C4133  -3      13       2.261863    selenocysteine synthase
UTI89_C4134  -3      14       1.359512    glutathione S-transferase
UTI89_C4135  -2      16       1.160746    hypothetical protein
UTI89_C4136  -1      19       0.861676    hypothetical protein
UTI89_C4137  -2      18       0.274250    mannitol-specific PTS system enzyme IIABC
UTI89_C4138  2       18       -0.331692   mannitol-1-phosphate 5-dehydrogenase
UTI89_C4139  3       17       0.182193    mannitol repressor protein
UTI89_C4140  3       17       0.490269    hypothetical protein
UTI89_C4141  2       16       1.102887    hypothetical protein
UTI89_C4142  2       15       1.319248    hypothetical protein
UTI89_C4143  2       16       1.684267    autotransport adhesin
UTI89_C4144  -2      14       3.584472    L-lactate permease
UTI89_C4145  -1      13       3.172469    DNA-binding transcriptional repressor LldR
UTI89_C4146  -1      14       2.827108    L-lactate dehydrogenase
UTI89_C4147  -2      14       2.054147    tRNA/rRNA methyltransferase YibK
UTI89_C4148  -1      15       1.322642    serine acetyltransferase
UTI89_C4149  0       17       1.040577    NAD(P)H-dependent glycerol-3-phosphate
UTI89_C4150  2       27       -1.636153   hypothetical protein
UTI89_C4151  4       21       -2.518736   preprotein translocase subunit SecB
UTI89_C4152  1       24       -0.473251   glutaredoxin 3
UTI89_C4153  3       19       1.583288    hypothetical protein
UTI89_C4154  2       18       1.639765    hypothetical protein
UTI89_C4155  0       14       0.145561    hypothetical protein
UTI89_C4156  -1      12       0.365645    hypothetical protein
UTI89_C4157  -1      12       0.662225    phosphoglyceromutase
UTI89_C4158  -1      11       0.855111    hypothetical protein
UTI89_C4159  2       10       -0.410323   hypothetical protein
UTI89_C4160  2       12       -0.066530   glycosyl transferase
UTI89_C4161  0       12       0.944492    hypothetical protein
UTI89_C4162  -1      14       -2.637636   L-threonine 3-dehydrogenase
UTI89_C4163  0       19       -5.693760   2-amino-3-ketobutyrate CoA ligase
UTI89_C4164  2       26       -8.792879   ADP-L-glycero-D-manno-heptose-6-epimerase
UTI89_C4165  3       32       -10.737713  ADP-heptose:LPS heptosyltransferase II
UTI89_C4166  3       39       -13.941375  ADP-heptose:LPS heptosyl transferase I
UTI89_C4167  3       41       -16.093192  lipid A-core surface polymer ligase
UTI89_C4168  3       34       -14.041896  beta1,3-glucosyltransferase
UTI89_C4169  2       28       -11.056950  UDP-galactose:(galactosyl) LPS
UTI89_C4170  2       26       -9.304075   lipopolysaccharide core biosynthesis protein
UTI89_C4171  2       22       -6.231151   lipopolysaccharide 1,2-glucosyltransferase
UTI89_C4172  2       17       -3.912616   lipopolysaccharide 1,3-galactosyltransferase
UTI89_C4173  1       13       -1.150505   lipopolysaccharide core biosynthesis protein
UTI89_C4174  0       15       -1.121504   lipopolysaccharide core biosynthesis
UTI89_C4175  -1      13       -0.917458   lipopolysaccharide core biosynthesis protein
UTI89_C4176  -1      12       0.558286    3-deoxy-D-manno-octulosonic-acid transferase
UTI89_C4177  0       18       -1.292519   phosphopantetheine adenylyltransferase
UTI89_C4178  1       18       -0.608608   formamidopyrimidine-DNA glycosylase
UTI89_C4179  2       20       -1.101001   hypothetical protein
UTI89_C4180  0       17       0.128435    50S ribosomal protein L33
UTI89_C4181  0       14       0.748211    50S ribosomal protein L28
UTI89_C4182  -1      13       1.412506    DNA repair protein RadC
UTI89_C4183  0       15       2.177137    bifunctional phosphopantothenoylcysteine
UTI89_C4184  2       13       1.258502    deoxyuridine 5'-triphosphate
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4132  TCRTETOQM     58     5e-11    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 58.3 bits (141), Expect = 5e-11
Identities = 44/147 (29%), Positives = 69/147 (46%), Gaps = 18/147 (12%)

Query: 3 IATAGHVDHGKTTLLQAI---TGV------------NADRLPEEKKRGMTIDLGYAYWPQ 47
I HVD GKTTL +++ +G D E++RG+TI G +
Sbjct: 6 IGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQW 65

Query: 48 PDGRVPGFIDVPGHEKFLSNMLAGVGGIDHALLVVACDDGVMAQTREHLAILQLTGNPML 107
+ +V ID PGH FL+ + + +D A+L+++ DGV AQTR L+ G P +
Sbjct: 66 ENTKV-NIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTI 124

Query: 108 TVALTKADRVDEARVDEVERQVKEVLR 134
+ K D+ + V + +KE L
Sbjct: 125 -FFINKIDQNG-IDLSTVYQDIKEKLS 149


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4135  RTXTOXIND     64     2e-13    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 64.5 bits (157), Expect = 2e-13
Identities = 56/314 (17%), Positives = 103/314 (32%), Gaps = 82/314 (26%)

Query: 66 ITPQVTGIVTEVTDKNNQLIQKGEVLFKLDPVR------------YQARVD--RLQA--- 108
I P IV E+ K + ++KG+VL KL + QAR++ R Q
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSR 158

Query: 109 ------------------------DLMTATHNIK----TLRAQLTEAQANTTQVSAERDR 140
+++ T IK T + Q + + N + AER
Sbjct: 159 SIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLT 218

Query: 141 LFKNYQRY----------LKGSQAAVNPFS---------ERDIDDARQNF---LAQDALV 178
+ RY L + ++ + E +A +Q +
Sbjct: 219 VLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQI 278

Query: 179 KGSVAE----QAQIQSQLDSMVNGE----QSQIVSLRAQLTEAKYNLEQTVIRAPSNGYV 230
+ + + + + + I L +L + + + +VIRAP + V
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKV 338

Query: 231 TQVLIR-PGTYAAALPLRPVMVFIPEQKRQIV-AQFRQNSLLRLKPGDDAEVVFNALPGQ 288
Q+ + G +MV +PE V A + + + G +A + A P
Sbjct: 339 QQLKVHTEGGVVT--TAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYT 396

Query: 289 VFH---GKLTSILP 299
+ GK+ +I
Sbjct: 397 RYGYLVGKVKNINL 410


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4143  PF03895       63     4e-14    Serum resistance protein DsrA.
		>PF03895#Serum resistance protein DsrA.

Length = 79

Score = 62.5 bits (152), Expect = 4e-14
Identities = 19/79 (24%), Positives = 36/79 (45%), Gaps = 2/79 (2%)

Query: 1506 ESKLSGGIASAMAMTGLPQAYTPGASMASIGGGTYNGESAVALGV-SMVSANGRWVYKLQ 1564
+L G+A+ A++ L Q G + S G Y ++A+A+GV S ++ +
Sbjct: 2 SKELQTGLANQSALSMLVQPNGVGKTSVSAAVGGYRDKTALAIGVGSRITDRFTAKAGVA 61

Query: 1565 GSTNSQGEYSAALGAGIQW 1583
+T + G S G ++
Sbjct: 62 FNTYN-GGMSYGASVGYEF 79


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4149  NUCEPIMERASE  29     0.020    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 29.4 bits (66), Expect = 0.020
Identities = 20/87 (22%), Positives = 30/87 (34%), Gaps = 13/87 (14%)

Query: 8 MTVI---GAGSYGTALAITLARNGHEVVLWGHD---PEHIATLERDRCNAAFLPDVPFPD 61
M + AG G ++ L GH+VV G D + +L++ R P F
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVV--GIDNLNDYYDVSLKQARLELLAQPGFQF-- 56

Query: 62 TLHLESDLATALAASRNILVVVPSHVF 88
+ DLA + VF
Sbjct: 57 ---HKIDLADREGMTDLFASGHFERVF 80


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4151  SECBCHAPRONE  240    1e-84    Bacterial protein-transport SecB chaperone protein ...
		>SECBCHAPRONE#Bacterial protein-transport SecB chaperone protein signature.

Length = 170

Score = 240 bits (614), Expect = 1e-84
Identities = 87/153 (56%), Positives = 117/153 (76%), Gaps = 4/153 (2%)

Query: 23 EQNNTEMTFQIQRIYTKDISFEAPNAPHVFQKDWQPEVKLDLDTASTQLADDVYEVVLRV 82
Q + QIQRIY KD+SFEAPN PH+FQ+DW+P++ DL T + Q+ DD+YEV L +
Sbjct: 12 TQATQQPVLQIQRIYVKDVSFEAPNLPHIFQQDWEPKLSFDLSTEAKQVGDDLYEVCLNI 71

Query: 83 TVTASLGEE--TAFLCEVQQGGIFSIAGIEGTQMAHCLGAYCPNILFPYARECITSMVSR 140
+V ++ AF+CEV+Q G+F+I+G+E QMAHCL + CPN+LFPYARE ++S+V+R
Sbjct: 72 SVETTMESSGDVAFICEVKQAGVFTISGLEEMQMAHCLTSQCPNMLFPYARELVSSLVNR 131

Query: 141 GTFPQLNLAPVNFDALFMNYL--QQQAGEGTEE 171
GTFP LNL+PVNFDALFM+YL Q+QA + TEE
Sbjct: 132 GTFPALNLSPVNFDALFMDYLQRQEQAEQTTEE 164


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4158  CHANLCOLICIN  36     2e-04    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 36.2 bits (83), Expect = 2e-04
Identities = 51/223 (22%), Positives = 74/223 (33%), Gaps = 22/223 (9%)

Query: 60 RAVRQKQQQRASLLAQLKKQEEAISEATRKLRETQNTLNQLNKQIDEMNASIAKLEQQKA 119
R + +++ R A K +EA RE T QL E A E+ KA
Sbjct: 131 RLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETERQLKLAEAEEKRLAALSEEAKA 190

Query: 120 ---AQERSLAAQLDAAFRQGEHTGIQLILSGEESQRGQRLQAYFGYLNQARQETIAQLKQ 176
AQ++ AAQ + GE + LS AR + L
Sbjct: 191 VEIAQKKLSAAQSEVVKMDGEIKTLNSRLSS---------------SIHARDAEMKTLAG 235

Query: 177 TREEVAMQRAELEEKQSEQQTLLYEQRAQQAKLTQALSERKKTLAGLESSIQQGQQQLSE 236
R E+A A+ +E + L + R++ AG +Q Q SE
Sbjct: 236 KRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASE 295

Query: 237 LRANESRLRNSIARAEAAAKARAEREAREAQAVRDRQKEATRK 279
R N R+ I + + A R A R + E K
Sbjct: 296 TRIN--RINADITQIQKAIS--QVSNNRNAGIARVHEAEENLK 334


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4164  NUCEPIMERASE  104    7e-28    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 104 bits (260), Expect = 7e-28
Identities = 77/348 (22%), Positives = 127/348 (36%), Gaps = 67/348 (19%)

Query: 2 IIVTGGAGFIGSNIVKALNDKGITDILVVDNLKD--------------GTKFVNLVDLDI 47
+VTG AGFIG ++ K L + G ++ +DNL D +D+
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 48 ADYMDKEDFLIQIMAGEEFGDVEAIFHEGACSSTTEWDGKYMMDNNYQYSK-------EL 100
AD + + + A F E +F + +Y ++N + Y+ +
Sbjct: 62 ADR----EGMTDLFASGHF---ERVFISPHRLAV-----RYSLENPHAYADSNLTGFLNI 109

Query: 101 LHYCLEREIP-FLYASSAATYGGRTSD-FIESREYEKPLNVYGYSKFLFDEYVRQILPEA 158
L C +I LYASS++ YG F + P+++Y +K +
Sbjct: 110 LEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLY 169

Query: 159 NSQIVGFRYFNVYGPREGHKGSMASVAFHLNTQLNNGESPKLFEGSENFKRDFVYVGDVA 218
G R+F VYGP + MA F + G+S ++ KRDF Y+ D+A
Sbjct: 170 GLPATGLRFFTVYGPWG--RPDMA--LFKFTKAMLEGKSIDVY-NYGKMKRDFTYIDDIA 224

Query: 219 DVNL------------WFLENGVSG-------IFNLGTGRAESFQAVADATLAY-HKKGQ 258
+ + W +E G ++N+G A + +
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAK 284

Query: 259 IEYIPFPDKLKGRYQAFTQADLTNLRAA-GYDKPFKTVAEGVTEYMAW 305
+P G T AD L G+ P TV +GV ++ W
Sbjct: 285 KNMLPLQ---PGDVL-ETSADTKALYEVIGF-TPETTVKDGVKNFVNW 327


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4177  LPSBIOSNTHSS  247    9e-88    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 247 bits (633), Expect = 9e-88
Identities = 78/154 (50%), Positives = 112/154 (72%)

Query: 5 AIYPGTFDPITNGHIDIVTRATQMFDHVILAIAASPSKKPMFTLEERVELAQQATAHLGN 64
AIYPG+FDPIT GH+DI+ R ++FD V +A+ +P+K+PMF+++ER+E +A AHL N
Sbjct: 3 AIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLPN 62

Query: 65 VEVVGFSDLMANFARNQHATVLIRGLRAVADFEYEMQLAHMNRHLMPELESVFLMPSKEW 124
+V F L N+AR + A ++RGLR ++DFE E+Q+A+ N+ L +LE+VFL S E+
Sbjct: 63 AQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTEY 122

Query: 125 SFISSSLVKEVARHQGDVTHFLPENVHQALMAKL 158
SF+SSSLVKEVAR G+V HF+P +V AL +
Sbjct: 123 SFLSSSLVKEVARFGGNVEHFVPSHVAAALYDQF 156


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4183  UREASE        29     0.030    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 29.3 bits (66), Expect = 0.030
Identities = 18/55 (32%), Positives = 22/55 (40%), Gaps = 15/55 (27%)

Query: 98 GHIELGKWADLVILAPA----------TADLIARVAAGMANDLVSTICLATPAPV 142
G +E+GK ADLV+ PA IA G N + TP PV
Sbjct: 424 GSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNA-----SIPTPQPV 473


56  UTI89_C4194  UTI89_C4225  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C4194  -2      14       3.400868    DNA-directed RNA polymerase subunit omega
UTI89_C4195  -2      13       3.230236    bifunctional (p)ppGpp synthetase II/
UTI89_C4196  -2      13       3.108581    tRNA guanosine-2'-O-methyltransferase
UTI89_C4197  -1      12       2.840078    ATP-dependent DNA helicase RecG
UTI89_C4198  -1      12       1.906973    glutamate transport protein
UTI89_C4199  -2      11       2.169123    transport protein
UTI89_C4200  -2      13       1.701608    hypothetical protein
UTI89_C4201  -1      14       0.748722    hypothetical protein
UTI89_C4202  -1      16       0.137516    hypothetical protein
UTI89_C4203  1       20       -3.258829   hypothetical protein
UTI89_C4204  -1      15       -2.723274   aldolase
UTI89_C4205  -1      14       -3.453675   PTS enzyme-II fructose
UTI89_C4206  -2      14       -3.863909   PTS system fructose-like transporter subunit
UTI89_C4207  -2      13       -3.902273   phosphotransferase system (PTS),
UTI89_C4208  -2      13       -3.574253   transcriptional antiterminator
UTI89_C4209  -2      8        -2.162862   alpha-xylosidase YicI
UTI89_C4210  -3      12       -3.657147   transporter
UTI89_C4212  0       18       -4.802104   *cytoplasmic protein
UTI89_C4213  0       19       -3.550484   hypothetical protein
UTI89_C4214  -1      19       -2.412358   transport protein YicL
UTI89_C4215  -1      19       -2.857144   cytoplasmic membrane lipoprotein-28
UTI89_C4216  -1      16       -0.716738   hypothetical protein
UTI89_C4217  -2      11       0.085326    hypothetical protein
UTI89_C4218  -1      12       1.044827    ribonucleoside transporter
UTI89_C4219  -2      12       1.679505    hypothetical protein
UTI89_C4220  -1      13       2.116974    hypothetical protein
UTI89_C4221  -1      15       3.220785    cryptic adenine deaminase
UTI89_C4222  0       17       3.658217    sugar phosphate antiporter
UTI89_C4223  1       16       3.960654    regulatory protein UhpC
UTI89_C4224  1       16       4.232778    sensory histidine kinase UhpB
UTI89_C4225  1       17       3.564547    DNA-binding transcriptional activator UhpA
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4197  SECA          40     4e-05    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 39.9 bits (93), Expect = 4e-05
Identities = 38/129 (29%), Positives = 56/129 (43%), Gaps = 18/129 (13%)

Query: 233 NLSMLALRAGAQRFHAQPLSANDALKNKLLAALPFKPTGAQARVVAEIERDM-ALDVPMM 291
LS L+ F A+ L + L+N + A A R ++ M DV ++
Sbjct: 37 KLSDEELKGKTAEFRAR-LEKGEVLENLIPEAF------AVVREASKRVFGMRHFDVQLL 89

Query: 292 ---RLVQGDV-----GSGKTLVAALAA-LRAVAHGKQVALMAPTELLAEQHANNFRNWFA 342
L + + G GKTL A L A L A+ GK V ++ + LA++ A N R F
Sbjct: 90 GGMVLNERCIAEMRTGEGKTLTATLPAYLNALT-GKGVHVVTVNDYLAQRDAENNRPLFE 148

Query: 343 PLGIEVGWL 351
LG+ VG
Sbjct: 149 FLGLTVGIN 157


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4208  PF08280       34     0.001    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 34.1 bits (78), Expect = 0.001
Identities = 79/491 (16%), Positives = 168/491 (34%), Gaps = 73/491 (14%)

Query: 7 RQNRLLRFLLPRREYTTIVTIAGYLNVSEKTIQRDLRLLEQWL-GQWRINVEKRAGAGVM 65
+ +L+ + I +A ++ + L + + ++KR M
Sbjct: 45 SKCQLVVLFF-KTSSLPITEVAEKTGLTFLQLNHYCEELNAFFPDSLSMTIQKR-----M 98

Query: 66 LSAENIADLLHLDHLLVAECEEIDGVMNNARRVKIASQLLSETPNETSISKLSERYFISG 125
+ H ++ + + ++ +++ + L+ + ++ + +F+S
Sbjct: 99 I-------SCQFTHP--SKETYLYQLYASSNVLQLLAFLIKNGSHSRPLTDFARSHFLSN 149

Query: 126 ASIVNDLRVIESWLAPLGLSLIRSPSGTHIEGSEGQVRQAMALLINGIINHNEPQGVVYS 185
+S + L L L S I G E ++R +ALL G+
Sbjct: 150 SSAYRMREALIPLLRNFELKL----SKNKIVGEEYRIRYLIALL-------YSKFGIKVY 198

Query: 186 RLDPGSYKALVHYFGEEEVLFVQSLLLDMENELSWSLGEPYYVNIFTHILIMMYRNTHGN 245
L K ++H F L S L LS E + F IL+ + H
Sbjct: 199 DLTQQD-KNIIHSF-----LSHSSTHLKTSPWLS----ESFS---FYDILLALSWKRHQF 245

Query: 246 ALSREEDQTRQYDENIF---NVASQMIHKIEQRIAHTLPDDEVWFIYQ-YIISSGVAIDG 301
+++ + + Q + +F ++ IE ++ ++Y YI ++
Sbjct: 246 SVTIPQTRIFQQLKKLFVYDSLKKSSRDIIETYCQLNFSAGDLDYLYLIYITANNSFASL 305

Query: 302 Q---KDVSIISHMQASNEA-RLITWRLITVFSDIVD---------CDFSEDSALYDGLLV 348
Q + + + N+ RL+ +IT+ ++ + FS+ S L++ L
Sbjct: 306 QWTPEHIRQCCQLFEENDTFRLLLNPIITLLPNLKEQKASLVKALMFFSK-SFLFN--LQ 362

Query: 349 HIKPLINRLNYRIHIRNPLLEDIKAELADVWRLTQYVVNQVFKTWGENAVSEDEVGYLTV 408
H P N + N L + + W + K G+ ++
Sbjct: 363 HFIPETNLFVSPYYKGNQKLYTSLKLIVEEW---------MAKLPGKRYLNHKHFHLFCH 413

Query: 409 HFQAAMERQIARKRVLLVCSTGIGTSHLLKSRILRAFPEWTI---VDVISAANLSQVLPD 465
+ + + V+ V S I +HLL R F + +I + N+ Q+
Sbjct: 414 YVEQILRNIQPPLVVVFVASNFI-NAHLLTDSFPRYFSDKSIDFHSYYLLQDNVYQIPDL 472

Query: 466 NIELIISTINL 476
+L+I+ L
Sbjct: 473 KPDLVITHSQL 483


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4218  TCRTETA       39     3e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 38.7 bits (90), Expect = 3e-05
Identities = 34/208 (16%), Positives = 72/208 (34%), Gaps = 13/208 (6%)

Query: 33 IIVEFLPVSLLTP----MAQDLGISEGVA---GQSVTVTAFVAMFASLFITQTIQATDRR 85
+ ++ + + L+ P + +DL S V G + + A + + + RR
Sbjct: 14 VALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRR 73

Query: 86 YVVILFAVLLTLSCLLVSFANSFSLLLIGRACLGLALGGFWAMSASLTMRLVPPRTVPKA 145
V+++ + +++ A +L IGR G+ G A++ + + +
Sbjct: 74 PVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDGDERARH 132

Query: 146 LSVIFGAVSIALVIAAPLGSFLGELIGWRNVFNAAAAMG----VLCIFWIIKSLPSLPGE 201
+ +V LG +G F AAAA+ + F + +S
Sbjct: 133 FGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRP 191

Query: 202 PSHQKQNTFRLLQRPGVMAGMIAIFMSF 229
+ N + M + A+ F
Sbjct: 192 LRREALNPLASFRWARGMTVVAALMAVF 219


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C4221  UREASE  38     1e-04    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 38.2 bits (89), Expect = 1e-04
Identities = 30/105 (28%), Positives = 43/105 (40%), Gaps = 17/105 (16%)

Query: 22 AVSRGDAVADYIIDNVSILDLINGGEISGPIVIKGRYIAGVG-AEYAD---------APA 71
V+R D +I N ILD + G + I +K IA +G A D P
Sbjct: 60 QVTREGGAVDTVITNALILD--HWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPG 117

Query: 72 LQRIDARGATAVPGFIDAHLHIESSMMTPVTFETATLPRGLTTVI 116
+ I G G +D+H+H + P E A L GLT ++
Sbjct: 118 TEVIAGEGKIVTAGGMDSHIH----FICPQQIEEA-LMSGLTCML 157


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4222  TCRTETB  34     0.001    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.1 bits (78), Expect = 0.001
Identities = 28/168 (16%), Positives = 61/168 (36%), Gaps = 17/168 (10%)

Query: 49 FNIAQNDMISTYGLSMTQLGMIGLGFSITYGVGKTLVSYYADGKNTKQFLPFMLILSAIC 108
N++ D+ + + + F +T+ +G + +D K+ L F +I++ C
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIIN--C 90

Query: 109 MLGFSASMGSGSVSLFLMIAFYALSGFFQSTGGSCSYSTI----TKWTPRRKRGTFLGFW 164
+G SL +M + F Q G + + + ++ P+ RG G
Sbjct: 91 FGSVIGFVGHSFFSLLIM------ARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLI 144

Query: 165 NISHNLGGAGAAGVALFGANYLFDGHVIGMFIFPSIIALIVGFIGLRY 212
+G + A+Y+ + + + P + I+ L
Sbjct: 145 GSIVAMGEGVGPAIGGMIAHYIHWSY---LLLIP--MITIITVPFLMK 187


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4223  TCRTETB  40     1e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 39.9 bits (93), Expect = 1e-05
Identities = 64/408 (15%), Positives = 135/408 (33%), Gaps = 60/408 (14%)

Query: 30 RHILLTIWLGYALFY--FTRKSFNAAVPEILANGVLSRSDIGLLATLFYITYGVSKFVSG 87
RH + IWL F+ N ++P+I + + + T F +T+ + V G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 88 IVSDRSNARYFMGIGLIATGIINILFGFSTSLWAFAVLWVLNAFFQGWGS---PVCARLL 144
+SD+ + + G+I +++ S F L ++ F QG G+ P ++
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHS---FFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 145 TAWY-SRTERGGWWALWNTAHNVGGALIPIVMAASALHYGWRAGMMIAGCMAIVVGIFLC 203
A Y + RG + L + +G + P + A + W ++ M ++ +
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIPMITIITVPF- 184

Query: 204 WRLRDRPQALGLPAVGEWRHDALEIAQQQEGAGLTRKEILTKYVLLNPYIWLLSFCYVLV 263
+ L +I G L I+ + Y VL
Sbjct: 185 --------LMKLLKKEVRIKGHFDIK----GIILMSVGIVFFMLFTTSYSISFLIVSVLS 232

Query: 264 YVV-----RAAINDWGN-----------LYMSEMLGVDLVTANTAVTMFELGGFIGALVA 307
+++ R + + + + + V ++ + + A
Sbjct: 233 FLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTA 292

Query: 308 GWGSDKLFNGNRGPMNLIFAAGILL-SVGSLWLMPFASYVMQATCFFTIGFFVFGPQMLI 366
GS +F G + + GIL+ G L+++ + + F T F + +
Sbjct: 293 EIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL-SVSFLTASFLLETTSWFM 351

Query: 367 ---------GMAAAECS---------HKEAAGAATGFVGLFAYLGASL 396
G++ + ++ AGA + ++L
Sbjct: 352 TIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGT 399


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4224  PF06580  40     2e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.8 bits (93), Expect = 2e-05
Identities = 28/142 (19%), Positives = 57/142 (40%), Gaps = 11/142 (7%)

Query: 365 LRPRQLDDLTLEQAIRSLMREMELEGRGIVSHLEWRIDESALSENQRVTLFRVCQEGLNN 424
LR ++L + + ++L L++ + + +V + Q + N
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPM-LVQTLVEN 266

Query: 425 IVKHA-----DASAVTLQGWQQDERLMLVIEDDGSGLPPDSGQ-HGFGLTGMRERVTALG 478
+KH + L+G + + + L +E+ GS ++ + G GL +RER+ L
Sbjct: 267 GIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLY 326

Query: 479 G---TLTISCLHG-TRVSVSLP 496
G + +S G V +P
Sbjct: 327 GTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C4225  HTHFIS  61     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.0 bits (148), Expect = 2e-13
Identities = 29/174 (16%), Positives = 59/174 (33%), Gaps = 20/174 (11%)

Query: 2 ITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGCGVQVCICDISMPDIS 61
T+ + DD +R+ Q L V + + + + D+ MPD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLELLSQLPK---GMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHTVATG 118
+LL ++ K + +++S ++ +A GA +L K ELI +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR---- 117

Query: 119 GCYLTPDIAIKLASGRQDPLTKRERQVAEKLAQG---MAVKEIAAELGLSPKTV 169
A+ R L + + + + + A L + T+
Sbjct: 118 --------ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTL 163


57  UTI89_C4246  UTI89_C4255  Y  N  N  Genomic Island
LocusTag     DNBias  CDNBias     %GCBias  Product
UTI89_C4246       2       22    2.478264  galactonate operon transcriptional repressor
UTI89_C4247       2       21    2.359915  sugar phosphatase
UTI89_C4248       3       20    2.583055  hypothetical protein
UTI89_C4249       4       23    2.747959  DNA gyrase subunit B
UTI89_C4250       3       15    2.626952  recombination protein F
UTI89_C4251       4       13    2.587413  DNA polymerase III subunit beta
UTI89_C4252       3       17    2.406650  chromosome replication initiator DnaA
UTI89_C4253       0       21    2.235396  hypothetical protein
UTI89_C4254      -1       19    3.679166  50S ribosomal protein L34
UTI89_C4255      -1       19    3.417621  ribonuclease P
58  UTI89_C4280  UTI89_C4290  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias     %GCBias  Product
UTI89_C4280       3       31    2.002708  phosphate ABC transporter periplasmic
UTI89_C4281       2       28    2.233069  glucosamine--fructose-6-phosphate
UTI89_C4282       4       34    2.279021  bifunctional N-acetylglucosamine-1-phosphate
UTI89_C4283       5       34    2.322083  hypothetical protein
UTI89_C4284       5       34    2.147235  ATP synthase F0F1 subunit epsilon
UTI89_C4285       5       31    2.063279  ATP synthase F0F1 subunit beta
UTI89_C4286       3       28    1.714555  ATP synthase F0F1 subunit gamma
UTI89_C4287       5       30    1.717996  ATP synthase F0F1 subunit alpha
UTI89_C4288       4       30    1.059425  hypothetical protein
UTI89_C4289       3       23    0.611050  ATP synthase F0F1 subunit delta
UTI89_C4290       2       22    0.065124  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C4282  RTXTOXINA  29     0.048    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.2 bits (65), Expect = 0.048
Identities = 23/80 (28%), Positives = 31/80 (38%), Gaps = 10/80 (12%)

Query: 367 LGDAEIGDNVNIGAGTITCNYDGANKFKTIIGDDVFVGSDTQLVAPVTVGKGATIAAGTT 426
LGD + D V + AG+ N G DV T G AT A T
Sbjct: 616 LGDGD--DKVFLSAGSA--NIYAGK------GHDVVYYDKTDTGYLTIDGTKATEAGNYT 665

Query: 427 VTRNVGENALAISRVPQTQK 446
VTR +G + + V + Q+
Sbjct: 666 VTRVLGGDVKVLQEVVKEQE 685


59  UTI89_C4323  UTI89_C4334  Y  N  N  Genomic Island
LocusTag     DNBias  CDNBias     %GCBias  Product
UTI89_C4323      -1       21    3.662797  hypothetical protein
UTI89_C4324      -1       20    4.363489  acetolactate synthase 2 catalytic subunit
UTI89_C4325       0       25    3.951891  acetolactate synthase 2 regulatory subunit
UTI89_C4326       0       26    4.008221  branched-chain amino acid aminotransferase
UTI89_C4327       1       23    3.894349  dihydroxy-acid dehydratase
UTI89_C4328       1       19    3.332562  threonine dehydratase
UTI89_C4329       0       15    2.128000  DNA-binding transcriptional regulator IlvY
UTI89_C4330      -1       14    1.294759  ketol-acid reductoisomerase
UTI89_C4331      -1       11    1.365992  peptidyl-prolyl cis-trans isomerase C
UTI89_C4332      -1       11    1.386886  ATP-dependent DNA helicase Rep
UTI89_C4333       0       14    0.053802  guanosine pentaphosphate phosphohydrolase
UTI89_C4334       2       18   -1.081400  ATP-dependent RNA helicase RhlB
60  UTI89_C4371  UTI89_C4380  Y  N  N  Genomic Island
LocusTag     DNBias  CDNBias     %GCBias  Product
UTI89_C4371      -2       22    4.833814  lipoprotein
UTI89_C4372      -2       18    3.725254  diaminopimelate epimerase
UTI89_C4373      -2       17    2.681927  hypothetical protein
UTI89_C4374      -2       18    2.326878  site-specific tyrosine recombinase XerC
UTI89_C4375      -2       15    0.606787  flavin mononucleotide phosphatase
UTI89_C4376      -1       11   -2.571650  DNA-dependent helicase II
UTI89_C4377      -1       13   -6.285593  hypothetical protein
UTI89_C4378      -1       13   -6.064731  hypothetical protein
UTI89_C4379      -1       12   -5.086760  magnesium/nickel/cobalt transporter CorA
UTI89_C4380       0       17   -6.628960  hypothetical protein
61  UTI89_C4428  UTI89_C4463  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias     %GCBias  Product
UTI89_C4428      -1       17    3.080485  3-octaprenyl-4-hydroxybenzoate carboxy-lyase
UTI89_C4429      -2       18    2.986130  FMN reductase
UTI89_C4430      -2       19    2.958949  3-ketoacyl-CoA thiolase
UTI89_C4431      -2       17    2.006077  multifunctional fatty acid oxidation complex
UTI89_C4432      -2       13    0.861676  proline dipeptidase
UTI89_C4433      -1       13    0.106438  hypothetical protein
UTI89_C4434      -1       14   -1.132737  potassium transporter
UTI89_C4435      -1       14   -2.568814  protoporphyrinogen oxidase
UTI89_C4443      -1       19   -3.823569  **molybdopterin-guanine dinucleotide biosynthesis
UTI89_C4444      -2       22   -6.345437  molybdopterin-guanine dinucleotide biosynthesis
UTI89_C4445      -2       21   -6.948484  hypothetical protein
UTI89_C4446      -1       22   -7.503919  serine/threonine protein kinase
UTI89_C4447      -2       14   -4.553958  periplasmic protein disulfide isomerase I
UTI89_C4448      -2       15   -4.657861  hypothetical protein
UTI89_C4449      -2       13   -3.657671  hypothetical protein
UTI89_C4450      -1       15   -1.348821  acyltransferase
UTI89_C4451       1       15    0.120212  hypothetical protein
UTI89_C4452       0       16    1.683053  DNA polymerase I
UTI89_C4453      -1       17    2.659567  hypothetical protein
UTI89_C4454       0       18    2.555493  ribosome biogenesis GTP-binding protein YsxC
UTI89_C4455       2       24    2.489163  hypothetical protein
UTI89_C4456       1       21    2.175243  coproporphyrinogen III oxidase
UTI89_C4457       1       18    1.811366  nitrogen regulation protein NR(I)
UTI89_C4458       0       16    0.973478  nitrogen regulation protein NR(II)
UTI89_C4459       0       19    0.513968  glutamine synthetase
UTI89_C4460      -2       14    0.460532  GTP-binding protein
UTI89_C4461       1       22   -0.519108  transcriptional regulator YihW
UTI89_C4462       1       22   -1.396388  sugar kinase YihV
UTI89_C4463       2       22   -1.450120  oxidoreductase YihU
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits  Score  E-value  Comments
UTI89_C4455  SECA  31     0.002    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 31.0 bits (70), Expect = 0.002
Identities = 11/71 (15%), Positives = 30/71 (42%)

Query: 14 AKARRKTREELDQEARDRKRQKKRRGHAPGSRAAGGNTTSGSKGQNAPKDPRIGSKTPIP 73
+K + + EE+++ + R+ + +R ++ + + + ++G P P
Sbjct: 827 SKVQVRMPEEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCP 886

Query: 74 LGVAEKVTKQH 84
G +K + H
Sbjct: 887 CGSGKKYKQCH 897


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C4457  HTHFIS  602    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 602 bits (1553), Expect = 0.0
Identities = 206/478 (43%), Positives = 299/478 (62%), Gaps = 11/478 (2%)

Query: 1 MQRGIVWVVDDDSSIRWVLERALAGAGLTCTTFENGAEVLEALASKTPDVLLSDIRMPGM 60
M + V DDD++IR VL +AL+ AG N A + +A+ D++++D+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGLALLKQIKQRHPMLPVIIMTAHSDLDAAVSAYQQGAFDYLPKPFDIDEAVALVERAIS 120
+ LL +IK+ P LPV++M+A + A+ A ++GA+DYLPKPFD+ E + ++ RA++
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 HYQEQQQPRNVQLNGPTTDIIGEAPAMQDVFRIIGRLSRSSISVLINGESGTGKELVAHA 180
+ + ++G + AMQ+++R++ RL ++ ++++I GESGTGKELVA A
Sbjct: 121 EPKRRPSKLEDDSQDGM-PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 181 LHRHSPRAKAPFIALNMAAIPKDLIESELFGHEKGAFTGANTIRQGRFEQADGGTLFLDE 240
LH + R PF+A+NMAAIP+DLIESELFGHEKGAFTGA T GRFEQA+GGTLFLDE
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 241 IGDMPLDVQTRLLRVLADGQFYRVGGYAPVKVDVRIIAATHQNLEQRVQEGKFREDLFHR 300
IGDMP+D QTRLLRVL G++ VGG P++ DVRI+AAT+++L+Q + +G FREDL++R
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 301 LNVIRVHLPPLRERREDIPRLARHFLQVAARELGVEAKLLHPETEAALTRLAWPGNVRQL 360
LNV+ + LPPLR+R EDIP L RHF+Q A +E G++ K E + WPGNVR+L
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 361 ENTCRWLTVMAAGQEVLIQDLPGELFESNVPESTSHMQPDSWATLLAQWADRALRS---- 416
EN R LT + + + + EL S + ++Q + +R
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 417 -----GHQNLLSEAQPELERTLLTTALRHTQGHKQEAARLLGWGRNTLTRKLKELGME 469
L E+E L+ AL T+G++ +AA LLG RNTL +K++ELG+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C4460  TCRTETOQM  180    4e-51    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 180 bits (458), Expect = 4e-51
Identities = 97/445 (21%), Positives = 170/445 (38%), Gaps = 81/445 (18%)

Query: 4 KLRNIAIIAHVDHGKTTLVDKLLQQSGTFDSRAETQE--RVMDSNDLEKERGITILAKNT 61
K+ NI ++AHVD GKTTL + LL SG + D+ LE++RGITI T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 62 AIKWNDYRINIVDTPGHADFGGEVERVMSMVDSVLLVVDAFDGPMPQTRFVTKKAFAYGL 121
+ +W + ++NI+DTPGH DF EV R +S++D +L++ A DG QTR + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 122 KPIVVINKVDRPGARPDWVVDQVFD-------------LFVNLDATDEQLD--------- 159
I INK+D+ G V + + L+ N+ T+
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 160 --------------------------------FPIVYASALNGIAGLDHEDMAEDMTPLY 187
FP+ + SA N I G+D+ L
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNI-GIDN---------LI 231

Query: 188 QAIVDHVPAPDVDLDGPFQMQISQLDYNSYVGVIGIGRIKRGKVKPNQQVTIIDSEGKTR 247
+ I + + ++ +++Y+ + R+ G + V I + E
Sbjct: 232 EVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-- 289

Query: 248 NAKVGKVLGHLGLERIETDLAEAGDIVAITGLGELNISDTVCDTQNVEALPALSVDEPTV 307
K+ ++ + E + D A +G+IV + L ++ + DT+ + + P +
Sbjct: 290 --KITEMYTSINGELCKIDKAYSGEIVILQNEF-LKLNSVLGDTKLLPQRERIENPLPLL 346

Query: 308 SMFFCVNTSPFCGKEGKFVTSRQILDRLNKELVHNVALRVEETEDADAFRVSGRGELHLS 367
+ + D L LR +S G++ +
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPL---------LRYYVDSATHEIILSFLGKVQME 397

Query: 368 VLIENMRRE-GFELAVSRPKVIFRE 391
V ++ + E+ + P VI+ E
Sbjct: 398 VTCALLQEKYHVEIEIKEPTVIYME 422



Score = 32.5 bits (74), Expect = 0.005
Identities = 13/75 (17%), Positives = 29/75 (38%), Gaps = 1/75 (1%)

Query: 398 EPYENVTLDVEEQHQGSVMQALGERKGDLKNMNPDGKGRVRLDYVIPSRGLIGFRSEFMT 457
EPY + + +++ + ++ + V L IP+R + +RS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKN-NEVILSGEIPARCIQEYRSDLTF 595

Query: 458 MTSGTGLLYSTFSHY 472
T+G + + Y
Sbjct: 596 FTNGRSVCLTELKGY 610


62  UTI89_C4517  UTI89_C4527  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias     %GCBias  Product
UTI89_C4517       2       12    3.247788  ATP-dependent protease peptidase subunit
UTI89_C4518       1       12    3.102727  essential cell division protein FtsN
UTI89_C4519       0       12    2.461340  DNA-binding transcriptional regulator CytR
UTI89_C4520       1       14    4.152262  primosome assembly protein PriA
UTI89_C4521      -2       12    1.329209  hypothetical protein
UTI89_C4522      -3       10   -0.472444  peptidoglycan peptidase
UTI89_C4523      -2       11   -2.488197  transcriptional repressor protein MetJ
UTI89_C4524      -1       12   -2.792659  cystathionine gamma-synthase
UTI89_C4525      -1       12   -3.675625  bifunctional aspartate kinase II/homoserine
UTI89_C4526       0       18   -6.357139  bifunctional nucleoside and albicidin
UTI89_C4527      -2       11   -3.201540  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C4518  IGASERPTASE  42     2e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 42.0 bits (98), Expect = 2e-06
Identities = 32/155 (20%), Positives = 64/155 (41%), Gaps = 5/155 (3%)

Query: 114 LTPEQRQLLEQMQADMRQQPTQLVEVPWNEQTPEQRQQTLQRQRQAQQLAEQQRLAQQSR 173
+ +QAD+ P+ E+ ++ P + +AE + Q+S+
Sbjct: 992 VDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSK--QESK 1049

Query: 174 TTEQSWQQQT-RTSQAAPVQAQPRQSKPASTQQPYQDLLQTPAHTTAQSKPQQAAPVARA 232
T E++ Q T T+Q V + + + A+TQ + T ++ ++ A V +
Sbjct: 1050 TVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKE 1109

Query: 233 ADAPKPTAEKKDERRWMVQCGSFRGAEQAETVRAQ 267
A T + ++ + Q + EQ+ETV+ Q
Sbjct: 1110 EKAKVETEKTQEVPKVTSQVSPKQ--EQSETVQPQ 1142


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
UTI89_C4526  CHANNELTSX  357    e-127    Nucleoside-specific channel-forming protein Tsx signa...
		>CHANNELTSX#Nucleoside-specific channel-forming protein Tsx signature.

Length = 294

Score = 357 bits (916), Expect = e-127
Identities = 171/262 (65%), Positives = 204/262 (77%), Gaps = 6/262 (2%)

Query: 30 WLHQSLNVIGRTDSRFGPRLTNDLYPEYTVAGRKDWFDFYGYVDLPKFFGVGSHYDVGIW 89
W HQS+NV+G +RFGP++ ND Y EY +KDWFDFYGY+D P FFG G+ GIW
Sbjct: 34 WWHQSVNVVGSYHTRFGPQIRNDTYLEYEAFAKKDWFDFYGYIDAPVFFG-GNSTAKGIW 92

Query: 90 DEGSPLFTEIEPRFSIDKLTGLNLAFGPFKEWFIANNYVYDMGDNQSSRQSTWYMGLGTD 149
++GSPLF EIEPRFSIDKLT +L+FGPFKEW+ ANNY+YDMG N S QSTWYMGLGTD
Sbjct: 93 NKGSPLFMEIEPRFSIDKLTNTDLSFGPFKEWYFANNYIYDMGRNDSQEQSTWYMGLGTD 152

Query: 150 IDTGLPIKLSANIYAKYQWQNYGAANENEWDGYRFKIKYSIPLTNLFGGRLVYNSFTNFD 209
IDTGLP+ LS N+YAKYQWQNYGA+NENEWDGYRFK+KY +PLT+L+GG L Y FTNFD
Sbjct: 153 IDTGLPMSLSLNVYAKYQWQNYGASNENEWDGYRFKVKYFVPLTDLWGGSLSYIGFTNFD 212

Query: 210 FGSDLADKSHNN-----KRTSNAIASSHILSLLYEHWKFAFTLRYFHNGGQWNAGEKVNF 264
+GSDL D + + RTSN+IASSHIL+L Y HW ++ RYFHNGGQW K+NF
Sbjct: 213 WGSDLGDDNFYDLNGKHARTSNSIASSHILALNYAHWHYSIVARYFHNGGQWADDAKLNF 272

Query: 265 GDGPFELKNTGWGTYTTIGYQF 286
GDGPF +++TGWG Y +GY F
Sbjct: 273 GDGPFSVRSTGWGGYFVVGYNF 294


63  UTI89_C4654  UTI89_C4663  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias     %GCBias  Product
UTI89_C4654      -1       20    3.331325  hypothetical protein
UTI89_C4655      -3       20    2.999449  hypothetical protein
UTI89_C4656      -3       20    3.100289  hypothetical protein
UTI89_C4657      -2       23    3.597285  acetate permease
UTI89_C4658      -2       22    4.053191  hypothetical protein
UTI89_C4659      -2       20    4.571920  acetyl-CoA synthetase
UTI89_C4660      -1       16    3.913602  cytochrome c552
UTI89_C4661       0       18    4.317140  cytochrome c nitrite reductase pentaheme
UTI89_C4662      -1       19    4.028747  NrfC protein
UTI89_C4663       0       18    3.207387  NrfD protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C4658  RTXTOXIND  27     0.020    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 26.7 bits (59), Expect = 0.020
Identities = 5/33 (15%), Positives = 13/33 (39%), Gaps = 1/33 (3%)

Query: 17 ELVEKR-QRFATILSIIMLAVYIGFILLIAFAP 48
EL+E R +++ ++ + +L
Sbjct: 47 ELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQ 79


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4662  VACJLIPOPROT  30     0.007    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 29.9 bits (67), Expect = 0.007
Identities = 6/21 (28%), Positives = 12/21 (57%)

Query: 179 FGNLDDPSSEISQLLRQKPTY 199
GNL++P+ ++ L+ P
Sbjct: 75 TGNLEEPAVMVNYFLQGDPYQ 95


64  UTI89_C4685  UTI89_C4697  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias     %GCBias  Product
UTI89_C4685       1       27    6.153114  ribose-5-phosphate isomerase B
UTI89_C4686       0       32    7.301237  hypothetical protein
UTI89_C4687      -2       32    7.755383  carbon-phosphorus lyase complex accessory
UTI89_C4688      -1       34    7.325168  aminoalkylphosphonic acid N-acetyltransferase
UTI89_C4689      -1       32    7.456581  ribose 1,5-bisphosphokinase
UTI89_C4690      -1       34    7.671117  PhnM protein
UTI89_C4691      -1       33    7.846654  phosphonates transport ATP-binding protein PhnL
UTI89_C4692      -1       34    8.242471  phosphonate C-P lyase system protein PhnK
UTI89_C4693      -1       35    8.368097  PhnJ protein
UTI89_C4694       0       36    7.770690  PhnI protein
UTI89_C4695       0       38    7.594247  carbon-phosphorus lyase complex subunit
UTI89_C4696       1       37    6.621345  PhnG protein
UTI89_C4697       2       34    4.848156  phosphonate metabolism transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4688  SACTRNSFRASE  33     3e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.6 bits (74), Expect = 3e-04
Identities = 20/84 (23%), Positives = 32/84 (38%), Gaps = 5/84 (5%)

Query: 50 HLALLDGEVVGMIGLHLQFHLHHVNWIGEIQELVVMPQARGLNVGSKLLAWAEEEARQAG 109
L L+ +G I + + N I+++ V R VG+ LL A E A++
Sbjct: 68 FLYYLENNCIGRIKIRSNW-----NGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENH 122

Query: 110 AEMTELSTNVKRHDAHRFYLREGY 133
L T A FY + +
Sbjct: 123 FCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4691  PF05272  30     0.012    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.012
Identities = 17/70 (24%), Positives = 25/70 (35%), Gaps = 8/70 (11%)

Query: 36 CVVLHGHSGSGKSTLLRSLYANYLPDEGQIQIKHGDEWVDLVTAPARKVVEI------RK 89
VVL G G GKSTL+ +L + I G + + + E+ R+
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIV--AYELSEMTAFRR 655

Query: 90 TTVGWVSQFL 99
V F
Sbjct: 656 ADAEAVKAFF 665


65  UTI89_C4717  UTI89_C4726  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias     %GCBias  Product
UTI89_C4717       0       19   -3.681906  hypothetical protein
UTI89_C4718       1       18   -4.458821  DNA-binding transcriptional activator DcuR
UTI89_C4719       2       20   -5.767642  sensory histidine kinase DcuS
UTI89_C4720       4       34  -10.107062  hypothetical protein
UTI89_C4721       3       24   -7.308013  hypothetical protein
UTI89_C4722      -1       18   -4.437179  hypothetical protein
UTI89_C4723      -1       18   -3.964881  hypothetical protein
UTI89_C4724       0       22   -3.045150  hypothetical protein
UTI89_C4725       0       18   -3.912501  hypothetical protein
UTI89_C4726       0       18   -3.495101  lysyl-tRNA synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C4718  HTHFIS  68     2e-15    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 68.3 bits (167), Expect = 2e-15
Identities = 31/109 (28%), Positives = 51/109 (46%), Gaps = 4/109 (3%)

Query: 4 VLIIDDDAMVAELNRRYVAQIPGFQCCGTASTLEKAKEIIFNSDTPIDLILLDIYMQKEN 63
+L+ DDDA + + + +++ G+ S I + DL++ D+ M EN
Sbjct: 6 ILVADDDAAIRTVLNQALSRA-GYDVR-ITSNAATLWRWI--AAGDGDLVVTDVVMPDEN 61

Query: 64 GLDLLPVLHNARCKSDVIVISSAADAVTIKDSLHYGVVDYLIKPFQASR 112
DLLP + AR V+V+S+ +T + G DYL KPF +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTE 110


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4719  PF06580  41     8e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.0 bits (96), Expect = 8e-06
Identities = 21/99 (21%), Positives = 38/99 (38%), Gaps = 18/99 (18%)

Query: 442 LIENALE-ALGP-EPGGEISVTLHYRHGWLHCEVNDDGPGIAPDKIDHIFDKGVSTKGSE 499
L+EN ++ + GG+I + +G + EV + G +
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN------------TKES 310

Query: 500 RGVGLALVKQQVENLGG---SIAVESEPGIFTQFFVQIP 535
G GL V+++++ L G I + + G V IP
Sbjct: 311 TGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4721  SACTRNSFRASE  26     0.012    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 26.4 bits (58), Expect = 0.012
Identities = 9/28 (32%), Positives = 16/28 (57%)

Query: 32 LAIIEHTDVDESLKGQGIGKQLVAKVVE 59
A+IE V + + +G+G L+ K +E
Sbjct: 89 YALIEDIAVAKDYRKKGVGTALLHKAIE 116


66  UTI89_C4759  UTI89_C4780  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias     %GCBias  Product
UTI89_C4759      -2       12    3.062488  hypothetical protein
UTI89_C4760      -2       11    2.183486  phosphatidylserine decarboxylase
UTI89_C4761      -2       13    2.482315  ribosome-associated GTPase
UTI89_C4762      -2       13    3.132708  oligoribonuclease
UTI89_C4766      -1       12    3.168012  ***electron transport protein YjeS
UTI89_C4767      -2       12    3.307514  hypothetical protein
UTI89_C4768      -1       14    2.548712  ATPase
UTI89_C4769       0       13    3.113129  N-acetylmuramoyl-L-alanine amidase
UTI89_C4770       1       14    2.784464  DNA mismatch repair protein
UTI89_C4771       2       18    1.917974  tRNA delta(2)-isopentenylpyrophosphate
UTI89_C4772       4       25    2.124222  RNA-binding protein Hfq
UTI89_C4773       4       23    2.018848  GTPase HflX
UTI89_C4774       4       23    2.387210  FtsH protease regulator HflK
UTI89_C4775       4       23    2.086099  FtsH protease regulator HflC
UTI89_C4776       2       19    1.235770  hypothetical protein
UTI89_C4777       3       17    1.164057  adenylosuccinate synthetase
UTI89_C4778       4       13    0.206148  transcriptional repressor NsrR
UTI89_C4779       4       13   -0.019463  exoribonuclease R
UTI89_C4780       2       17   -2.666260  23S rRNA (guanosine-2'-O-)-methyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
UTI89_C4759  GPOSANCHOR  53     4e-09    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 53.1 bits (127), Expect = 4e-09
Identities = 50/312 (16%), Positives = 106/312 (33%), Gaps = 18/312 (5%)

Query: 121 SRQAQQEQERAREIADSLNQLPQQQTDARRQLNEIERRLGTLTGNTPLNQAQNFALQSDS 180
+ ++ QERA + N L + +D ++ LT L+ +
Sbjct: 49 TDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTE---ELSNAKEKLRKND 105

Query: 181 ARLKALVDEL-ELAQLSANNRQELARLRSELAEKES--QQLDAYLQALRNQLNSQRQQEA 237
L ++ EL A+ + L + + + L+A AL + + ++
Sbjct: 106 KSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAAR-KADLEKAL 164

Query: 238 ERALESTEQLAESSADLPKDIVAQFKINRELSAALNQQAQRMDLVASQQRQAASQTLQVR 297
E A+ + + L + A EL AL +++ + ++ +
Sbjct: 165 EGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALA 224

Query: 298 QALNTLREQSQWLGSSNLLGEALRAQVARLPEMPKPQQLDTEMAQLRVQRLRYEDLLNKQ 357
L + L + A A++ L + L+ A+L +
Sbjct: 225 ARKADLEKA---LEGAMNFSTADSAKIKTLEA--EKAALEARQAELEKALEGAMNFSTAD 279

Query: 358 PLLRQIHQADGQPLTAE------QNRILEAQLRTQRELLNSLLQGGDTLLLELTKLKVSN 411
+ +A+ L AE Q+++L A ++ R L++ + L E KL+ N
Sbjct: 280 SAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQN 339

Query: 412 GQLEDALKEVNE 423
E + + +
Sbjct: 340 KISEASRQSLRR 351



Score = 40.4 bits (94), Expect = 4e-05
Identities = 52/257 (20%), Positives = 93/257 (36%), Gaps = 59/257 (22%)

Query: 20 ATAPDSKQITQELEQAKAAKPAQPEVVEALQSALNALEERKGSLERIKQYQEVIDNYPKL 79
A A + + LE A A ++ L++ ALE R+ LE+ +
Sbjct: 222 ALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAM------NF 275

Query: 80 SATLRAQLNNMRDEPRSVSPGMSNDALNQEILQISS--QLLDKSRQAQQEQERAREIADS 137
S A++ + E AL E + Q+L+ +RQ+ + A
Sbjct: 276 STADSAKIKTLEAE---------KAALEAEKADLEHQSQVLNANRQSLRRDLDA------ 320

Query: 138 LNQLPQQQTDARRQLNEIERRLGTLTGNTPLNQAQNFALQSDSARLKALVDELELAQLSA 197
+R ++E L N+ + QS L A
Sbjct: 321 ----------SREAKKQLEAEHQKL---EEQNKISEASRQSLRRDLDAS----------- 356

Query: 198 NNRQELARLRSEL--AEKESQQLDAYLQALRNQLNSQRQQEAERALESTEQLAESSADLP 255
R+ +L +E E++++ +A Q+LR L++ R EA++ +E + A S
Sbjct: 357 --REAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASR--EAKKQVEKALEEANSKLA-- 410

Query: 256 KDIVAQFKINRELSAAL 272
A K+N+EL +
Sbjct: 411 ----ALEKLNKELEESK 423


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits  Score  E-value  Comments
UTI89_C4773  SECA  32     0.005    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 31.8 bits (72), Expect = 0.005
Identities = 26/144 (18%), Positives = 54/144 (37%), Gaps = 6/144 (4%)

Query: 282 HVIDAADVRVQENIEAVNTVLEEIDAHEIPTLLVMNKIDMLDDFEPRIDRDEENK-PIRV 340
++D +DV N + IDA+ P L ++ + + R+ D + PI
Sbjct: 665 ELLDVSDVSETINSIREDVFKATIDAYIPPQSL--EEMWDIPGLQERLKNDFDLDLPIAE 722

Query: 341 WLSAQTGAGIPQLFQALTERLSGEVAQHTLRLPPQEGRLRSRFYQLQAIEKEWMEEDGSV 400
WL + L + + + + + + R + LQ ++ W E ++
Sbjct: 723 WLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAAM 782

Query: 401 SLQVRMPIVDWRRLCKQEPALIDY 424
+R I R +++P +Y
Sbjct: 783 D-YLRQGIH-LRGYAQKDP-KQEY 803


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4774  cloacin  32     0.006    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 31.6 bits (71), Expect = 0.006
Identities = 25/81 (30%), Positives = 30/81 (37%), Gaps = 10/81 (12%)

Query: 17 GSSKPGGNSEGNGNKGGRDQGPPDLDDIFRKLSKKLGGLGGGKGTGSGGGSSSQGP---- 72
S G +SE N GG G G GGG GTG G S+ P
Sbjct: 33 ASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTG-GNLSAVAAPVAFG 91

Query: 73 -----RPQLGGRVVTIAAAAI 88
P GG V+I+A A+
Sbjct: 92 FPALSTPGAGGLAVSISAGAL 112


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C4779  RTXTOXIND  31     0.028    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.028
Identities = 12/55 (21%), Positives = 24/55 (43%), Gaps = 1/55 (1%)

Query: 165 VVPDDSRLSFDILIPPDQIMGARMGFVVVVELTQRPTRRTKAV-GKIVEVLGDNM 218
+VP+D L L+ I +G ++++ P R + GK+ + D +
Sbjct: 359 IVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAI 413


67  UTI89_C4795  UTI89_C4803  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias     %GCBias  Product
UTI89_C4795      -1       24    3.195439  PTS system L-ascorbate-specific transporter
UTI89_C4796      -1       22    3.009026  3-keto-L-gulonate-6-phosphate decarboxylase
UTI89_C4797      -1       23    2.488704  L-xylulose 5-phosphate 3-epimerase
UTI89_C4798       1       26    0.819159  L-ribulose-5-phosphate 4-epimerase
UTI89_C4799       2       27   -2.098785  hypothetical protein
UTI89_C4800       2       31   -3.709752  30S ribosomal protein S6
UTI89_C4801       1       28   -4.398991  primosomal replication protein N
UTI89_C4802       1       25   -3.962748  30S ribosomal protein S18
UTI89_C4803       1       26   -3.910528  50S ribosomal protein L9
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4796  ECOLNEIPORIN  27     0.037    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 27.5 bits (61), Expect = 0.037
Identities = 6/19 (31%), Positives = 7/19 (36%), Gaps = 2/19 (10%)

Query: 105 FNGDVQI--ELTGYWTWEQ 121
F G + L W EQ
Sbjct: 62 FKGQEDLGNGLKAIWQVEQ 80


68  UTI89_C4873  UTI89_C4954  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C4873  -1      23       -3.620286   gluconate 5-dehydrogenase
UTI89_C4874  0       24       -4.148805   L-idonate 5-dehydrogenase
UTI89_C4875  1       23       -3.803863   D-gluconate kinase
UTI89_C4876  1       21       -3.575432   oxidoreductase
UTI89_C4878  2       26       -5.162560   *prophage P4 integrase
UTI89_C4879  2       23       -5.122229   hypothetical protein
UTI89_C4880  3       22       -4.686661   DNA helicase superfamily protein I
UTI89_C4881  5       27       -6.317944   hypothetical protein
UTI89_C4882  6       35       -9.476369   adhesin
UTI89_C4883  8       41       -10.662432  hypothetical protein
UTI89_C4884  8       41       -9.720334   hypothetical protein
UTI89_C4885  7       37       -7.873351   hypothetical protein
UTI89_C4886  7       38       -6.445052   regulator PapX protein
UTI89_C4887  6       38       -5.353513   p pilus adhesin PapG protein
UTI89_C4888  6       34       -1.444871   minor pilin subunit PapF
UTI89_C4889  5       33       -1.079716   minor pilin subunit PapE
UTI89_C4890  5       31       -0.901319   minor pilin subunit PapK
UTI89_C4891  6       33       -1.613973   PapJ protein
UTI89_C4892  6       31       -2.512788   periplasmic chaperone PapD protein
UTI89_C4893  5       31       -2.341498   outer membrane usher protein PapC
UTI89_C4894  7       43       -6.523517   minor pilin subunit PapH
UTI89_C4895  6       39       -6.169921   major pilin subunit PapA
UTI89_C4896  5       38       -5.328800   pap operon regulatory protein PapB
UTI89_C4897  4       35       -3.999091   pap operon regulatory protein PapI
UTI89_C4898  7       35       -2.348596   hypothetical protein
UTI89_C4899  8       34       -4.619614   hypothetical protein
UTI89_C4900  9       39       -6.705828   hypothetical protein
UTI89_C4901  8       41       -7.594954   hypothetical protein
UTI89_C4902  8       44       -8.110975   hypothetical protein
UTI89_C4903  8       44       -8.324321   hypothetical protein
UTI89_C4904  8       43       -7.527205   F17-like fimbrial adhesin subunit
UTI89_C4905  7       40       -7.182872   F17-like fimbrial usher
UTI89_C4906  5       36       -5.212182   F17-like fimbrial chaperone
UTI89_C4907  6       34       -4.023315   F17-like fimbrial subunit
UTI89_C4908  5       30       -2.608503   hypothetical protein
UTI89_C4909  6       30       -3.134548   FMN-dependent dehydrogenase
UTI89_C4910  4       24       -1.981300   hypothetical protein
UTI89_C4911  4       25       -1.292210   UidR transcriptional regulator
UTI89_C4912  6       24       -0.742602   hypothetical protein
UTI89_C4913  4       23       -0.404749   hypothetical protein
UTI89_C4914  4       21       -0.416158   hypothetical protein
UTI89_C4915  5       22       0.948178    hypothetical protein
UTI89_C4916  5       36       -9.564892   hypothetical protein
UTI89_C4917  6       39       -10.240330  hypothetical protein
UTI89_C4918  6       40       -10.814564  hypothetical protein
UTI89_C4919  7       42       -11.765675  transposase
UTI89_C4920  7       44       -12.347901  hypothetical protein
UTI89_C4921  7       44       -12.122674  cytotoxic necrotizing factor 1
UTI89_C4922  6       43       -11.414711  hypothetical protein
UTI89_C4923  6       42       -11.601264  hypothetical protein
UTI89_C4924  6       41       -11.521660  hemolysin D
UTI89_C4925  6       39       -11.153761  hemolysin secretion protein HlyB
UTI89_C4926  6       37       -10.429845  hemolysin A
UTI89_C4927  7       33       -8.181545   hemolysin C
UTI89_C4928  7       29       -6.792943   hypothetical protein
UTI89_C4929  6       30       -6.121352   hypothetical protein
UTI89_C4930  6       31       -5.916496   hypothetical protein
UTI89_C4931  5       39       -10.159326  response regulator
UTI89_C4932  8       49       -14.141669  hypothetical protein
UTI89_C4933  10      55       -16.207390  hypothetical protein
UTI89_C4934  10      56       -16.497013  hypothetical protein
UTI89_C4935  10      56       -16.170368  hypothetical protein
UTI89_C4936  8       53       -14.848139  hypothetical protein
UTI89_C4937  7       47       -12.683061  hypothetical protein
UTI89_C4938  3       33       -5.247837   hypothetical protein
UTI89_C4939  2       22       1.058570    hypothetical protein
UTI89_C4940  1       23       -0.614514   hypothetical protein
UTI89_C4941  4       19       3.232718    hypothetical protein
UTI89_C4942  5       20       3.373084    hypothetical protein
UTI89_C4943  6       20       3.420011    hypothetical protein
UTI89_C4944  5       20       3.362572    hypothetical protein
UTI89_C4945  5       19       3.134840    hypothetical protein
UTI89_C4946  6       20       3.638285    ShlA/HecA/FhaA exofamily protein
UTI89_C4947  8       24       2.771273    hypothetical protein
UTI89_C4948  4       22       0.029924    hypothetical protein
UTI89_C4949  1       18       -0.080784   hypothetical protein
UTI89_C4950  1       17       -0.882333   hypothetical protein
UTI89_C4952  1       16       -1.284534   hypothetical protein
UTI89_C4953  1       17       -1.980105   hypothetical protein
UTI89_C4954  2       19       -3.217466   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4873  DHBDHDRGNASE  144    1e-44    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 144 bits (365), Expect = 1e-44
Identities = 86/256 (33%), Positives = 133/256 (51%), Gaps = 8/256 (3%)

Query: 7 LAGKNILITGSAQGIGFLLATGLGKYGAQIIINDITAERAELAVEKLHQEGIQAVAAPFN 66
+ GK ITG+AQGIG +A L GA I D E+ E V L E A A P +
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 67 VTHKHEIDAAVEHIEKDIGPIDVLVNNAGIQRRHPFTEFPEQEWNDVIAVNQTAVFLVSQ 126
V ID IE+++GPID+LVN AG+ R ++EW +VN T VF S+
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 127 AVTRHMVERKAGKVINICSMQSELGRDTITPYAASKGAVKMLTRGMCVELARHNIQVNGI 186
+V+++M++R++G ++ + S + + R ++ YA+SK A M T+ + +ELA +NI+ N +
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 187 APGYFKTEMTKALVEDE--------AFTAWLCKRTPAARWGDPQELIGAAVFLSSKASDF 238
+PG +T+M +L DE P + P ++ A +FL S +
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 239 VNGHLLFVDGGMLVAV 254
+ H L VDGG + V
Sbjct: 246 ITMHNLCVDGGATLGV 261


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4879  SECA          27     0.007    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 27.1 bits (60), Expect = 0.007
Identities = 9/29 (31%), Positives = 14/29 (48%)

Query: 15 VQGSEHKHIQQYVRTIDFITVNTFFTVYN 43
V+ E IQ +T+ IT +F +Y
Sbjct: 357 VEAKEGVQIQNENQTLASITFQNYFRLYE 385


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4882  OMPADOMAIN    41     2e-06    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 41.1 bits (96), Expect = 2e-06
Identities = 50/246 (20%), Positives = 79/246 (32%), Gaps = 55/246 (22%)

Query: 6 MNKVFVVSVVAAACVFAVNAGAKEGKSGFYLTGKAGASVMSLSDQRFLSGDEEETSKYKG 65
M K + VA A FA A A + +Y K G S D F++
Sbjct: 1 MKKTAIAIAVALA-GFATVAQAAPKDNTWYTGAKLGWS--QYHDTGFIN---------NN 48

Query: 66 GDDHDTVFSGGIAVGYDFYPQFSIPVRTELEFYARGKADSKYNVDKDSWSGGYWRDDLKN 125
G H+ G GY P E+ + G+ K +V+ +
Sbjct: 49 GPTHENQLGAGAFGGYQVNPYVGF----EMGYDWLGRMPYKGSVE-----------NGAY 93

Query: 126 EVSVNTLMLNAYYDFRNDSAFTPWVSAGIGYARIHQKTTGISTWDYEYGSSGRESLSRSG 185
+ L Y +D Y R+ G W + S+ +G
Sbjct: 94 KAQGVQLTAKLGYPITDDLDI---------YTRL-----GGMVWRADTKSNVYGKNHDTG 139

Query: 186 SADNFAWSLGAGVRYDVTPDIALDLSYRYLDAGDSSVSYKDEWGDKYKSEVDVKSHDIML 245
+ F GV Y +TP+IA L Y++ + GD + + + L
Sbjct: 140 VSPVF----AGGVEYAITPEIATRLEYQWT----------NNIGDAHTIGTRPDNGMLSL 185

Query: 246 GMTYNF 251
G++Y F
Sbjct: 186 GVSYRF 191


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4887  PF03627       546    0.0      PapG
		>PF03627#PapG

Length = 336

Score = 546 bits (1407), Expect = 0.0
Identities = 191/339 (56%), Positives = 231/339 (68%), Gaps = 7/339 (2%)

Query: 1 MKKWLPAFLF-LSLSGCNDALAANQSTMFYSFNDNIYRPQLSVKVTDIVQFIVDINSASS 59
MKKW PA LF L +SG + A + +FYS + +V +T QFI +
Sbjct: 1 MKKWFPALLFSLCVSGESSAW---NNIVFYSLGNVNSYQGGNVVITQRPQFITSWRPGIA 57

Query: 60 TATLSYVACNGFTWTHGLYWSEYFAWLVVPKHV-SYNGYNIYLELQSRGSFSLD-AEDND 117
T T + GF Y+ EY AW+V PK V + NGY +++E+ ++GS+S + DND
Sbjct: 58 TVTWNQCNGPGFADGSWAYYREYIAWVVFPKKVMTKNGYPLFIEVHNKGSWSEENTGDND 117

Query: 118 NYYLTKGFAWDE-VNSSGRVCFDIGEKRSLAWSFGGVTLNARLPVDLPKGDYTFPVKFLR 176
+Y+ KG+ WDE +G +C GE L F + LP DLP GDY+ + +
Sbjct: 118 SYFFLKGYKWDERAFDAGNLCQKPGETTRLTEKFDDIIFKVALPADLPLGDYSVTIPYTS 177

Query: 177 GIQRNNYDYIGGRYKIPSSLMKTFPFNGTLNFSIKNTGGCRPSAQSLEINHGDLSINSAN 236
G+QR+ Y+G R+KIP ++ KT P + F KN GGCRPSAQSLEI HGDLSINSAN
Sbjct: 178 GMQRHFASYLGARFKIPYNVAKTLPRENEMLFLFKNIGGCRPSAQSLEIKHGDLSINSAN 237

Query: 237 NHYAAQTLSVSCDVPTNIRFFLLSNTTPAYSHGQQFSVGLGHGWDSIVSINGVDTGETTM 296
NHYAAQTLSVSCDVP NIRF LL NTTP YSHG++FSVGLGHGWDSIVS+NGVDTGETTM
Sbjct: 238 NHYAAQTLSVSCDVPANIRFMLLRNTTPTYSHGKKFSVGLGHGWDSIVSVNGVDTGETTM 297

Query: 297 RWYRAGTQNLTIGSRLYGESSKIQPGVLSGSATLLMILP 335
RWY+AGTQNLTIGSRLYGESSKIQPGVLSGSATLLMILP
Sbjct: 298 RWYKAGTQNLTIGSRLYGESSKIQPGVLSGSATLLMILP 336


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4888  FIMBRIALPAPF  292    e-105    Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.

Length = 167

Score = 292 bits (749), Expect = e-105
Identities = 165/167 (98%), Positives = 165/167 (98%)

Query: 1 MIRLSLFISLLLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGE 60
MIRLSLFISLLLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGE
Sbjct: 1 MIRLSLFISLLLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGE 60

Query: 61 VTKTISISCPYKSGSLWIKVTGNTMGGGQNNVLATNITHFGIALYQGKGMSTPLTLGNGS 120
VTK ISISCPYKSGSLWIKVTGNTMG GQNNVLATNITHFGIALYQGKGMSTPLTLGNGS
Sbjct: 61 VTKNISISCPYKSGSLWIKVTGNTMGVGQNNVLATNITHFGIALYQGKGMSTPLTLGNGS 120

Query: 121 GNGYRVTAGLDTARSTFTFTSVPFRNGSGILNGGDFRTTASMSMIYN 167
GNGYRVTAGLDTARSTFTFTSVPFRNGSGILNGGDFRTTASMSMIYN
Sbjct: 121 GNGYRVTAGLDTARSTFTFTSVPFRNGSGILNGGDFRTTASMSMIYN 167


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4889  FIMBRIALPAPE  310    e-112    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 310 bits (794), Expect = e-112
Identities = 172/173 (99%), Positives = 172/173 (99%)

Query: 1 MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVQNAEVNWGDIEIQNLVQSG 60
MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVQNAEVNWGDIEIQNLVQSG
Sbjct: 1 MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVQNAEVNWGDIEIQNLVQSG 60

Query: 61 GNQKDFTVDMNCPYSLGTMKVTITSNGQTGNSILVPNTSTASGDGLLIYLYNSNNSGIGN 120
GNQKDFTVDMNCPYSLGTMKVTITSNGQTGNSILVPNTSTASGDGLLIYLYNSNNSGIGN
Sbjct: 61 GNQKDFTVDMNCPYSLGTMKVTITSNGQTGNSILVPNTSTASGDGLLIYLYNSNNSGIGN 120

Query: 121 AVTLGSQFTPGKITGTAPARKITLYAKLGYKGNMQSLQAGTFSATATLVASYS 173
AVTLGSQ TPGKITGTAPARKITLYAKLGYKGNMQSLQAGTFSATATLVASYS
Sbjct: 121 AVTLGSQVTPGKITGTAPARKITLYAKLGYKGNMQSLQAGTFSATATLVASYS 173


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4893  PF00577       740    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 740 bits (1911), Expect = 0.0
Identities = 243/879 (27%), Positives = 361/879 (41%), Gaps = 67/879 (7%)

Query: 1 MKDRI-PFAVNNITCVILLSLFCNAASAVEFNTDVLDAADKKNIDFTRFSEAGYVLPGQY 59
K R+ F V + +++ + FN L + D +RF + PG Y
Sbjct: 19 RKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTY 78

Query: 60 LLDVIVNGQSISPASLQISFVEPALSGDKAEKKLPQACLTSDMVRLMGLTAESLDKVVYW 119
+D+ +N + A+ ++F CLT + MGL S+ +
Sbjct: 79 RVDIYLNNGYM--ATRDVTFNTGDSEQGI------VPCLTRAQLASMGLNTASVSGMNLL 130

Query: 120 HDGQCADF-HGLPGVDIRPDTGAGVLRINMPQAWLEYSDATWLPPSRWDDGIPGLMLDYN 178
D C + + D G L + +PQA++ ++PP WD GI +L+YN
Sbjct: 131 ADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYN 190

Query: 179 LNGTVSRNYQGGDSHQFSYNGTVGGNLGPWRLRADYQGSQEQSRYNGEKTTNRNFTWSRF 238
+G +N GG+SH N G N+G WRLR + S S + + +
Sbjct: 191 FSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSS--DSSSGSKNKWQHINT 248

Query: 239 YLFRAIPRWRANLTLGENNINSDIFRSWSYTGASLESDDRMLPPRLRGYAPQITGIAETN 298
+L R I R+ LTLG+ DIF ++ GA L SDD MLP RG+AP I GIA
Sbjct: 249 WLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGT 308

Query: 299 ARVVVSQQGRVLYDSMVPAGPFSIQDLD-SSVRGRLDVEVIEQNGRKKTFQVDTASVPYL 357
A+V + Q G +Y+S VP GPF+I D+ + G L V + E +G + F V +SVP L
Sbjct: 309 AQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLL 368

Query: 358 TRPGQVRYKLVSGRSRGYGHETEGPVFATGEASWGLSNQWSLYGGAVLAGDYNALAAGAG 417
R G RY + +G R + E P F GL W++YGG LA Y A G G
Sbjct: 369 QREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIG 428

Query: 418 WDLGMPGTLSADITQSVARIEGERTFQGKSWRLSYSKRFDNADADITFAGYRFSERNYMT 477
++G G LS D+TQ+ + + + G+S R Y+K + + +I GYR+S Y
Sbjct: 429 KNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFN 488

Query: 478 MEQYLNARYR--------------------NDYSSREKEMYTVTLNKNVADWNTSFNLQY 517
+R + + ++ +T+ + + + + L
Sbjct: 489 FADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTS-TLYLSG 547

Query: 518 SRQTYWDIRKTD-YYTVSVNRYFNVFGLQGVAVGLSASRSKYLGRD--NDSAYLRISVPL 574
S QTYW D + +N F + LS S +K + + L +++P
Sbjct: 548 SHQTYWGTSNVDEQFQAGLNTAFE-----DINWTLSYSLTKNAWQKGRDQMLALNVNIPF 602

Query: 575 GT------------GTASYSGSMSND-RYVNMAGYTDT-FNDGLDSYSLNAGLNSGGGLT 620
+ASYS S + R N+AG T D SYS+ G GG
Sbjct: 603 SHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGN 662

Query: 621 SQRQINAYYSHRSPLANLSANIASLQKGYTSFGVSASGGATITGKGAALHAGGMSGGTRL 680
S A ++R N + S SGG G L G T +
Sbjct: 663 SGSTGYATLNYRGGYGNANIG-YSHSDDIKQLYYGVSGGVLAHANGVTL--GQPLNDTVV 719

Query: 681 LVDTDGVGGVPVDGGQVV-TNRWGTGVVTDISSYYRNTTSVDLKRLPDDVEATRSVVESA 739
LV G V+ V T+ G V+ + Y N ++D L D+V+ +V
Sbjct: 720 LVKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVV 779

Query: 740 LTEGAIGYRKFSVLKGKRLFAILRLADGSQPPFGASVTSEKGRELGMVADEGLAWLSGVT 799
T GAI +F G +L L + PFGA VTSE + G+VAD G +LSG+
Sbjct: 780 PTRGAIVRAEFKARVGIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMP 838

Query: 800 PGETLSVNW--DGKIQCQVNVPETAISDQQLL----LPC 832
+ V W + C N S QQLL C
Sbjct: 839 LAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4894  FIMBRIALPAPE  29     0.006    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 29.2 bits (65), Expect = 0.006
Identities = 20/77 (25%), Positives = 41/77 (53%), Gaps = 12/77 (15%)

Query: 29 GMSLPEYWG----EEHVWWDGRAAFHGEVVRPACTLAMEDAWQIIDMGESPVRDL-QNGF 83
G+ LP G +HV F G+++ PACT+ + ++ G+ +++L Q+G
Sbjct: 6 GLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVQNAE----VNWGDIEIQNLVQSG- 60

Query: 84 SGPERKFSLRLRNCEFN 100
G ++ F++ + NC ++
Sbjct: 61 -GNQKDFTVDM-NCPYS 75


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4896  FIMREGULATRY  168    5e-58    Escherichia coli: P pili regulatory PapB protein si...
		>FIMREGULATRY#Escherichia coli: P pili regulatory PapB protein signature.

Length = 104

Score = 168 bits (426), Expect = 5e-58
Identities = 104/104 (100%), Positives = 104/104 (100%)

Query: 1 MAHHEVISRSGNAFLLNIRESVLLPGSMSEMHFFLLIGISSIHSDRVILAMKDYLVGGHS 60
MAHHEVISRSGNAFLLNIRESVLLPGSMSEMHFFLLIGISSIHSDRVILAMKDYLVGGHS
Sbjct: 1 MAHHEVISRSGNAFLLNIRESVLLPGSMSEMHFFLLIGISSIHSDRVILAMKDYLVGGHS 60

Query: 61 RKEVCEKYQMNNGYFSTTLGRLIRLNALAARLAPYYTDESSAFD 104
RKEVCEKYQMNNGYFSTTLGRLIRLNALAARLAPYYTDESSAFD
Sbjct: 61 RKEVCEKYQMNNGYFSTTLGRLIRLNALAARLAPYYTDESSAFD 104


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4905  PF00577       743    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 743 bits (1920), Expect = 0.0
Identities = 260/872 (29%), Positives = 419/872 (48%), Gaps = 47/872 (5%)

Query: 22 NLSCLIYCRCSLLLFAALGLTVTNHSF----AAEEAEFDSEFLHLDKGINAIDIRRFSHG 77
N CL + L F + ++ E F+ FL D D+ RF +G
Sbjct: 12 NTQCLHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADD-PQAVADLSRFENG 70

Query: 78 NPVPEGRYYSDIYVNNVWKGKADLQYLRTANTGAPTLCLTPELLS-----LIDLVKDTMS 132
+P G Y DIY+NN + D+ + + CLT L+ + +
Sbjct: 71 QELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLL 130

Query: 133 GNTSCFPASTGLSSARINFDLSTLRLNIEIPQALLNTRPRGYISPAQWQSGVPAAFINYD 192
+ +C P ++ + A D+ RLN+ IPQA ++ R RGYI P W G+ A +NY+
Sbjct: 131 ADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYN 190

Query: 193 ANYYQY-SSSGTSNEQTYLGLKAGFNLWGWALRHRGSESWNNSYPAG-----YQNIETSI 246
+ + G ++ YL L++G N+ W LR + S+N+S + +Q+I T +
Sbjct: 191 FSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWL 250

Query: 247 MHDLAPLRAQFTLGDFYTNGELMDSLSLRGVRLASDERMLPGSLRGYAPAVRGIANSNAK 306
D+ PLR++ TLGD YT G++ D ++ RG +LASD+ MLP S RG+AP + GIA A+
Sbjct: 251 ERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQ 310

Query: 307 VTIYQNAHILYETTVPAGPFVINDLYPSGYAGDLLVKITESNGQTRMFTVPFAAVAQLIR 366
VTI QN + +Y +TVP GPF IND+Y +G +GDL V I E++G T++FTVP+++V L R
Sbjct: 311 VTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQR 370

Query: 367 PGFSRWQMSVGKYR-YANKTYNDLIAQGTYQYGLTNDITLNSGLTTASGYTAGLAGLAFN 425
G +R+ ++ G+YR + Q T +GL T+ G A Y A G+ N
Sbjct: 371 EGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKN 430

Query: 426 T-PLGAIASDITLSRTAFRYSGVTRKGYSLHSSYSINIPASNTNITLAAYRYSSKDFYHL 484
LGA++ D+T + + G S+ Y+ ++ S TNI L YRYS+ +++
Sbjct: 431 MGALGALSVDMTQANSTLP-DDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNF 489

Query: 485 KDALSANHNAF-------IDDVSVKSTAFY----RPRNQFQISINQELGEKWGGMYLTGT 533
D + N + + V K T +Y R + Q+++ Q+LG +YL+G+
Sbjct: 490 ADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGS 548

Query: 534 TYNYWGHKGSRNEYQMGYSNFWKQLGYQIGLSQSRDNEQQRRDDRFYINFTLPLGG---- 589
YWG ++Q G + ++ + + + S +++ Q+ RD +N +P
Sbjct: 549 HQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRS 608

Query: 590 ----SVQSPVFSTVLNYSKEEKNSIQTSISGTGGEDNQFSYGIS-----GNSQENGPSGY 640
+ S +++ + + + GT EDN SY + G +G +GY
Sbjct: 609 DSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGY 668

Query: 641 AMNGGYRSPYVNITTTVGHDTQNNNQRSFGASGAVVAHPYGVTLSNDLSDTFAIIHAEGA 700
A YR Y N H + Q +G SG V+AH GVTL L+DT ++ A GA
Sbjct: 669 A-TLNYRGGYGNANIGYSHS-DDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGA 726

Query: 701 QGAVINNASGSRLDFWGNGVVPYVTPYEKNQISIDPSNLDLNVELSATEQEIIPRANSAT 760
+ A + N +G R D+ G V+PY T Y +N++++D + L NV+L ++P +
Sbjct: 727 KDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIV 786

Query: 761 LVKFDTKTGRSLLFDIRMSTGNPPPMASEVLDEHGQLAGYVAQAGKVFTRGLPEKGHLSV 820
+F + G LL + + P P + V E Q +G VA G+V+ G+P G + V
Sbjct: 787 RAEFKARVGIKLLMTLTHN-NKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQV 845

Query: 821 VWGPDNKDRCSFVYHVAHNKDDMQSQLVPVLC 852
WG + C Y + + C
Sbjct: 846 KWGEEENAHCVANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4909  PHAGEIV       30     0.020    Gene IV protein signature.
		>PHAGEIV#Gene IV protein signature.

Length = 426

Score = 29.9 bits (67), Expect = 0.020
Identities = 18/88 (20%), Positives = 35/88 (39%), Gaps = 10/88 (11%)

Query: 249 FNQKVELTPADI-EFVK---KITGLPVIVKGILRGEDAVVAIDAGADAI------QVSNH 298
F Q +E+ + + +FV K TG VIV ++G V + D + + + +
Sbjct: 20 FAQVIEMNNSSLRDFVTWYSKQTGESVIVSPDVKGTVTVYSSDVKPENLRDFFISVLRAN 79

Query: 299 GGRQIDGVPSAISQLQEVAARVGHKVPV 326
+ +PS I + ++P
Sbjct: 80 NFDMVGSIPSIIQKYNPNNQDYIDELPS 107


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4911  HTHTETR       79     9e-20    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 78.9 bits (194), Expect = 9e-20
Identities = 38/162 (23%), Positives = 59/162 (36%), Gaps = 11/162 (6%)

Query: 77 ARKTRSCSPEKTARTRQQIARAALEEFSAQGFARASISNISKRAGVAKGTVYNYFPTKEL 136
ARKT+ ++ TRQ I AL FS QG + S+ I+K AGV +G +Y +F K
Sbjct: 2 ARKTK----QEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSD 57

Query: 137 LFEAVLKE----FIATVRTELESSPRRNGETVKAYLLRVMLPAIRKIDDASTGRARIAHL 192
LF + + P ++ L+ +L + + I H
Sbjct: 58 LFSEIWELSESNIGELELEYQAKFPGDPLSVLREILIH-VLESTVTEERRRLLMEIIFHK 116

Query: 193 VMTEGSRFPVIAQAYLREIHQPLQQAMTQLIQEAASAGELKA 234
G V R + + Q ++ A L A
Sbjct: 117 CEFVGEMAVVQQAQ--RNLCLESYDRIEQTLKHCIEAKMLPA 156


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4915  PF02370       31     0.007    M protein repeat
		>PF02370#M protein repeat

Length = 168

Score = 30.8 bits (69), Expect = 0.007
Identities = 20/103 (19%), Positives = 44/103 (42%), Gaps = 3/103 (2%)

Query: 18 EQAEALRQKDQQLSLVEETEAFLRSALARAEEKIEEEERETEHLRAQIEKLRRMLFGTRS 77
+ +++ R+ D Q + LR + ++KIEE E+E + + + E+ + R
Sbjct: 38 DSSDSKRENDPQYRALMGENQDLRKREGQYQDKIEELEKERKEKQERPERREKFE---RQ 94

Query: 78 EKLRREVEQAEALLNQRRQDSDRYSGWEDDPQVPRQLRQSRHR 120
+ + EQ + +++Q + Q+ RQ +R
Sbjct: 95 HQDKHYQEQQKKHQQEQQQLEAEKQKLAKEKQISDASRQGLNR 137


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4918  RTXTOXIND     26     0.033    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 25.9 bits (57), Expect = 0.033
Identities = 15/80 (18%), Positives = 27/80 (33%), Gaps = 6/80 (7%)

Query: 17 PEFRNEALKLAERIGVAAAARELSLYESQLYAWRSKQQQ-----QMSSSERESELAAENV 71
PE + + + R SL + Q W++++ Q +ER + LA N
Sbjct: 166 PELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARIN- 224

Query: 72 RLKRQLAEQAEELAILQKAA 91
R + + L
Sbjct: 225 RYENLSRVEKSRLDDFSSLL 244


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4924  RTXTOXIND     597    0.0      Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 597 bits (1541), Expect = 0.0
Identities = 462/478 (96%), Positives = 468/478 (97%)

Query: 1 MKTWLMGFSEFLLRYKLVWSETWKIRKQLDTPVREKDENEFLPAHLELIETPVSRRPRLV 60
MKTWLMGFSEFLLRYKLVWSETWKIRKQLDTPVREKDENEFLPAHLELIETPVSRRPRLV
Sbjct: 1 MKTWLMGFSEFLLRYKLVWSETWKIRKQLDTPVREKDENEFLPAHLELIETPVSRRPRLV 60

Query: 61 AYFIMGFLVIAVILSVLGQVEIVATANGKLTLSGRSKEIKPIENSIVKEIIVKEGESVRK 120
AYFIMGFLVIA ILSVLGQVEIVATANGKLT SGRSKEIKPIENSIVKEIIVKEGESVRK
Sbjct: 61 AYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRK 120

Query: 121 GDVLLKLTALGAEADTLKTQSSLLQTRLEQTRYQILSRSIELNKLPELKLPDEPYFQNVS 180
GDVLLKLTALGAEADTLKTQSSLLQ RLEQTRYQILSRSIELNKLPELKLPDEPYFQNVS
Sbjct: 121 GDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVS 180

Query: 181 EEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTILARINRYENLSRVEKSRLDDF 240
EEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLT+LARINRYENLSRVEKSRLDDF
Sbjct: 181 EEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDF 240

Query: 241 RSLLHKQAIAKHAVLEQENKYVEAANELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEI 300
SLLHKQAIAKHAVLEQENKYVEA NELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEI
Sbjct: 241 SSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEI 300

Query: 301 LDKLRQTTDSIELLTLELEKNEERQQASVIRAPVSGKVQQLKVHTEGGVVTTAETLMVIV 360
LDKLRQTTD+I LLTLEL KNEERQQASVIRAPVS KVQQLKVHTEGGVVTTAETLMVIV
Sbjct: 301 LDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIV 360

Query: 361 PEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQKLGL 420
PEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQ+LGL
Sbjct: 361 PEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGL 420

Query: 421 VFNVIVSVEENDLSTGNKHIPLSSGMAVTAEIKTGMRSVISYLLSPLEESVTESLHER 478
VFNVI+S+EEN LSTGNK+IPLSSGMAVTAEIKTGMRSVISYLLSPLEESVTESL ER
Sbjct: 421 VFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRSVISYLLSPLEESVTESLRER 478


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4926  RTXTOXINA     1478   0.0      Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 1478 bits (3828), Expect = 0.0
Identities = 977/1024 (95%), Positives = 995/1024 (97%)

Query: 1 MPTITTAQIKSTLQSAKQSAANKLHSAGQSTKDALKKAAEQTRNAGNRLILLIPKDYKGQ 60
M TITTAQIKSTLQSAKQSAANKLHSAGQSTKDALKKAAEQTRNAGNRLILLIPKDYKGQ
Sbjct: 1 MTTITTAQIKSTLQSAKQSAANKLHSAGQSTKDALKKAAEQTRNAGNRLILLIPKDYKGQ 60

Query: 61 GSSLNDLVRTADELGIEVQYDEKNGTAITKQVFGTAEKLIGLTERGVTIFAPQLDKLLQK 120
GSSLNDLVRTADELGIEVQYDEKNGTAITKQVFGTAEKLIGLTERGVTIFAPQLDKLLQK
Sbjct: 61 GSSLNDLVRTADELGIEVQYDEKNGTAITKQVFGTAEKLIGLTERGVTIFAPQLDKLLQK 120

Query: 121 YQKAGNKLGGSAENIGDNLGKAGSVLSTFQNFLGTALSSMKIDELIKKQKSGSNVSSSEL 180
YQKAGN LGG AENIGDNLGKAG +LSTFQNFLGTALSSMKIDELIKKQKSG NVSSSEL
Sbjct: 121 YQKAGNILGGGAENIGDNLGKAGGILSTFQNFLGTALSSMKIDELIKKQKSGGNVSSSEL 180

Query: 181 AKASIELINQLVDTAASINNNVNSFSQQLNKLGSVLSNTKHLNGVGNKLQNLPNLDNIGA 240
AKASIELINQLVDT AS+NNNVNSFSQQLN LGSVLSNTKHLNGVGNKLQNLPNLDNIGA
Sbjct: 181 AKASIELINQLVDTVASLNNNVNSFSQQLNTLGSVLSNTKHLNGVGNKLQNLPNLDNIGA 240

Query: 241 GLDTVSGILSAISASFILSNADADTGTKAAAGVELTTKVLGNVGKGISQYIIAQRAAQGL 300
GLDTVSGILSAISASFILSNADADT TKAAAGVELTTKVLGNVGKGISQYIIAQRAAQGL
Sbjct: 241 GLDTVSGILSAISASFILSNADADTRTKAAAGVELTTKVLGNVGKGISQYIIAQRAAQGL 300

Query: 301 STSAAAAGLIASVVTLAISPLSFLSIADKFKRANKIEEYSQRFKKLGYDGDSLLAAFHKE 360
STSAAAAGLIAS VTLAISPLSFLSIADKFKRANKIEEYSQRFKKLGYDGDSLLAAFHKE
Sbjct: 301 STSAAAAGLIASAVTLAISPLSFLSIADKFKRANKIEEYSQRFKKLGYDGDSLLAAFHKE 360

Query: 361 TGAIDASLTTISTVLASVSSGISAAATTSLVGAPVSALVGAVTGIISGILEASKQAMFEH 420
TGAIDASLTTISTVLASVSSGISAAATTSLVGAPVSALVGAVTGIISGILEASKQAMFEH
Sbjct: 361 TGAIDASLTTISTVLASVSSGISAAATTSLVGAPVSALVGAVTGIISGILEASKQAMFEH 420

Query: 421 VASKMADVIAEWEKKHGKNYFENGYDARHAAFLEDNFKILSQYNKEYSVERSVLITQQHW 480
VASKMADVIAEWEKKHGKNYFENGYDARHAAFLEDNFKILSQYNKEYSVERSVLITQQHW
Sbjct: 421 VASKMADVIAEWEKKHGKNYFENGYDARHAAFLEDNFKILSQYNKEYSVERSVLITQQHW 480

Query: 481 DMLIGELASVTRNGDKTLSGKSYIDYYEEGKRLERRPKEFQQQIFDPLKGNIDLSDSKSS 540
D LIGELA VTRNGDKTLSGKSYIDYYEEGKRLE++ EFQ+Q+FDPLKGNIDLSDSKSS
Sbjct: 481 DTLIGELAGVTRNGDKTLSGKSYIDYYEEGKRLEKKXDEFQKQVFDPLKGNIDLSDSKSS 540

Query: 541 TLLKFVTPLLTPGEEIRERRQSGKYEYITELLVKGVDKWTVKGVQDKGSVYDYSNLIQHA 600
TLLKFVTPLLTPGEEIRERRQSGKYEYITELLVKGVDKWTVKGVQDKG+VYDYSNLIQHA
Sbjct: 541 TLLKFVTPLLTPGEEIRERRQSGKYEYITELLVKGVDKWTVKGVQDKGAVYDYSNLIQHA 600

Query: 601 SVGNNQYREIRIESHLGDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATE 660
SVGNNQYREIRIESHLGDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATE
Sbjct: 601 SVGNNQYREIRIESHLGDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATE 660

Query: 661 AGNYTVTRVLGGDVKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVE 720
AGNYTVTRVLGGDVKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVE
Sbjct: 661 AGNYTVTRVLGGDVKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVE 720

Query: 721 ELIGTTRADKFFGSKFTDIFHGADGDDHIEGNDGNDRLYGDKGNDTLRGGNGDDQLYGGD 780
ELIGTTRADKFFGSKFTDIFHGADGDD IEGNDGNDRLYGDKGNDTL GGNGDDQLYGGD
Sbjct: 721 ELIGTTRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGD 780

Query: 781 GNDKLIGGTGNNYLNGGDGDDELQVQGNSLAKNVLSGGKGNDKLYGSEGADLLDGGEGND 840
GNDKLIG GNNYLNGGDGDDE QVQGNSLAKNVL GGKGNDKLYGSEGADLLDGGEG+D
Sbjct: 781 GNDKLIGVAGNNYLNGGDGDDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDD 840

Query: 841 LLKGGYGNDIYRYLSGYGHHIIDDEGGKDDKLSLADIDFRDVAFKREGNDLIMYKAEGNV 900
LLKGGYGNDIYRYLSGYGHHIIDD+GGK+DKLSLADIDFRDVAFKREGNDLIMYK EGNV
Sbjct: 841 LLKGGYGNDIYRYLSGYGHHIIDDDGGKEDKLSLADIDFRDVAFKREGNDLIMYKGEGNV 900

Query: 901 LSIGHKNGITFKNWFEKESDDLSNHQIEQIFDKDGRVITPDSLKKAFEYQQSNNKVSYVY 960
LSIGHKNGITF+NWFEKES D+SNH+IEQIFDK GR+ITPDSLKKA EYQQ NNK SYVY
Sbjct: 901 LSIGHKNGITFRNWFEKESGDISNHEIEQIFDKSGRIITPDSLKKALEYQQRNNKASYVY 960

Query: 961 GHDASTYGSQDNLNPLINEISKIISAAGNFDVKEERSAASLLQLSGNASDFSYGRNSITL 1020
G+DA YGSQ +LNPLINEISKIISAAG+FDVKEER+AASLLQLSGNASDFSYGRNSITL
Sbjct: 961 GNDALAYGSQGDLNPLINEISKIISAAGSFDVKEERTAASLLQLSGNASDFSYGRNSITL 1020

Query: 1021 TASA 1024
T SA
Sbjct: 1021 TTSA 1024


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4927  RTXTOXINC     318    e-115    Gram-negative bacterial RTX toxin-activating protein C...
		>RTXTOXINC#Gram-negative bacterial RTX toxin-activating protein C signature.

Length = 170

Score = 318 bits (817), Expect = e-115
Identities = 162/170 (95%), Positives = 167/170 (98%)

Query: 45 MNMNNPLEVLGHVSWLWASSPLHRNWPVSLFAINVLPAIRANQYALLTRDNYPVAYCSWA 104
MN+N PLE+LGHVSWLWASSPLHRNWPVSLFAINVLPAI+ANQY LLTRD+YPVAYCSWA
Sbjct: 1 MNINKPLEILGHVSWLWASSPLHRNWPVSLFAINVLPAIQANQYVLLTRDDYPVAYCSWA 60

Query: 105 NLSLENEIKYLNDVTSLVAEDWTSGDRKWFIDWIAPFGDNGALYKYMRKKFPDELFRAIR 164
NLSLENEIKYLNDVTSLVAEDWTSGDRKWFIDWIAPFGDNGALYKYMRKKFPDELFRAIR
Sbjct: 61 NLSLENEIKYLNDVTSLVAEDWTSGDRKWFIDWIAPFGDNGALYKYMRKKFPDELFRAIR 120

Query: 165 VDPKTHVGKVSEFHGGKIDKQLANKIFKQYHHELITEVKNKTDFNFSLTG 214
VDPKTHVGKVSEFHGGKIDKQLANKIFKQYHHELITEVK K+DFNFSLTG
Sbjct: 121 VDPKTHVGKVSEFHGGKIDKQLANKIFKQYHHELITEVKRKSDFNFSLTG 170


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4931  HTHFIS        90     9e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.5 bits (222), Expect = 9e-23
Identities = 35/129 (27%), Positives = 60/129 (46%)

Query: 7 KILLMEDDYDIAALLRLNLQDEGYQIVHEADGARARLLLDKQTWDAVILDLMLPNVNGLE 66
IL+ +DD I +L L GY + ++ A + D V+ D+++P+ N +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 67 ICRYIRQMTRYLPVIIISARTSETHRVLGLEMGADDYLPKPFSIPELIARIKALFRRQEA 126
+ I++ LPV+++SA+ + + E GA DYLPKPF + ELI I +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 127 MGQNILLAG 135
+
Sbjct: 125 RPSKLEDDS 133


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4932  PF06580       42     3e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 42.2 bits (99), Expect = 3e-06
Identities = 24/137 (17%), Positives = 54/137 (39%), Gaps = 24/137 (17%)

Query: 360 LSIETRRLQLRIMMSHSLPLIRADISMIERVITNLLDNAVRH----TPPEGSIRLKVWQE 415
L + + + + R+ + + D+ + ++ L++N ++H P G I LK ++
Sbjct: 229 LQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKD 288

Query: 416 DNRLHVEVADSGPGLTEDMRTHLFRRASVLCHEPSEEPRGGLGLLIVRRMLVLHGGD--- 472
+ + +EV ++G + + + G GL VR L + G
Sbjct: 289 NGTVTLEVENTGSLALK-----------------NTKESTGTGLQNVRERLQMLYGTEAQ 331

Query: 473 IRLTDSTTGACFRFFLP 489
I+L++ +P
Sbjct: 332 IKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4946  PF05860       76     7e-18    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 75.6 bits (186), Expect = 7e-18
Identities = 25/139 (17%), Positives = 45/139 (32%), Gaps = 26/139 (18%)

Query: 32 AVITPQNGA---GMDKAANGVPVVNIATPNGAGISHNRFTDYNVGKEGLILNNATGKLNP 88
A ITP ++ T G+ + H+ F +++V G N
Sbjct: 1 AQITPDTTLPINSNITTEGNTRIIERGTQAGSNLFHS-FQEFSVPTSGTAFFNNPT---- 55

Query: 89 TQLGGLIQNNPNLKAGGEAKGIINEVTGGNRSLLQGYTEVAGKAANVMVANPYGITCDGC 148
+ II+ VTGG+ S + G A N+ + NP GI
Sbjct: 56 -----------------NIQNIISRVTGGSVSNIDGLIRANATA-NLFLINPNGIIFGQN 97

Query: 149 GFINTPHATLTTGKPVMNA 167
++ + + + +
Sbjct: 98 ARLDIGGSFVGSTANRLKF 116


69  UTI89_C4968  UTI89_C5012  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C4968  -1      18       -3.321078   hypothetical protein
UTI89_C4969  -1      18       -3.452172   hypothetical protein
UTI89_C4970  0       17       -2.056910   hypothetical protein
UTI89_C4971  0       18       -1.784294   hypothetical protein
UTI89_C4972  0       19       -3.976392   hypothetical protein
UTI89_C4973  1       22       -4.540050   hypothetical protein
UTI89_C4974  2       24       -3.882764   hypothetical protein
UTI89_C4975  2       25       -4.265052   hypothetical protein
UTI89_C4976  2       26       -4.975406   hypothetical protein
UTI89_C4977  3       27       -5.226403   hypothetical protein
UTI89_C4978  3       30       -6.932381   Na+/H+ antiporter
UTI89_C4979  4       27       -5.326499   hypothetical protein
UTI89_C4980  4       28       -6.726833   hypothetical protein
UTI89_C4981  5       29       -6.395093   hypothetical protein
UTI89_C4982  6       32       -7.426253   hypothetical protein
UTI89_C4983  4       29       -6.712231   hypothetical protein
UTI89_C4984  4       28       -1.397367   hypothetical protein
UTI89_C4985  4       26       -0.574534   hypothetical protein
UTI89_C4986  5       26       0.920915    hypothetical protein
UTI89_C4987  5       28       1.814057    hypothetical protein
UTI89_C4988  6       28       2.705857    hypothetical protein
UTI89_C4989  7       31       4.668246    hypothetical protein
UTI89_C4990  8       30       4.728561    hypothetical protein
UTI89_C4991  8       29       4.897288    hypothetical protein
UTI89_C4992  8       27       4.722680    hypothetical protein
UTI89_C4993  7       27       4.338810    hypothetical protein
UTI89_C4994  7       28       3.192398    hypothetical protein
UTI89_C4995  9       26       1.880392    hypothetical protein
UTI89_C4996  7       26       0.332576    hypothetical protein
UTI89_C4997  4       37       -10.590314  hypothetical protein
UTI89_C4998  4       40       -11.629039  hypothetical protein
UTI89_C4999  2       40       -11.199606  hypothetical protein
UTI89_C5000  1       37       -9.906234   hypothetical protein
UTI89_C5001  1       35       -9.590555   hypothetical protein
UTI89_C5002  1       34       -9.881849   hypothetical protein
UTI89_C5003  1       26       -6.229421   hypothetical protein
UTI89_C5004  1       26       -6.551307   hypothetical protein
UTI89_C5005  1       30       -7.060115   hypothetical protein
UTI89_C5006  0       30       -7.172297   N-acetylneuraminic acid mutarotase
UTI89_C5007  1       30       -6.028747   hypothetical protein
UTI89_C5008  0       31       -5.604659   hypothetical protein
UTI89_C5009  1       27       -3.999331   tyrosine recombinase
UTI89_C5010  1       26       -3.345760   tyrosine recombinase
UTI89_C5011  1       24       -2.885104   type 1 fimbriae major subunit FimA
UTI89_C5012  3       24       -2.771587   FimI fimbrial protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4974  HTHTETR       56     5e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 55.8 bits (134), Expect = 5e-12
Identities = 25/108 (23%), Positives = 50/108 (46%)

Query: 20 YQQLLESAAMIAGRDGIAALSLNAVAREAGVSKGGLLHHFPNKQALIYALFARLLAIMEE 79
Q +L+ A + + G+++ SL +A+ AGV++G + HF +K L ++ + + E
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGE 72

Query: 80 AIAALMQKDNISYGRFTRAYVNYLSALTDTQESRQLMVLSLAMPDEPV 127
K R + ++ T T+E R+L++ + E V
Sbjct: 73 LELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120


70  UTI89_C5023  UTI89_C5033  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C5023  1       23       -3.910487   dihydroxyacetone kinase subunit DhaL
UTI89_C5024  1       23       -3.533193   dihydroxyacetone kinase subunit DhaK
UTI89_C5025  1       22       -3.480867   glycerol dehydrogenase
UTI89_C5026  1       25       -3.994240   transporter CgxT
UTI89_C5027  1       27       -3.831299   dihydrolipoamide dehydrogenase CdlD
UTI89_C5028  0       31       -6.527976   carnitine transporter CniT
UTI89_C5029  0       31       -6.798779   glycerate kinase GclK
UTI89_C5030  1       33       -7.699534   3-hydroxyisobutyrate dehydrogenase
UTI89_C5031  1       33       -8.060710   regulatory protein GclR
UTI89_C5032  -1      28       -6.873543   glyoxylate carboligase
UTI89_C5033  -2      22       -5.651447   DNA-binding transcriptional regulator DhaR
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C5026  TCRTETA       37     1e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.7 bits (85), Expect = 1e-04
Identities = 58/324 (17%), Positives = 106/324 (32%), Gaps = 25/324 (7%)

Query: 39 TGATNAELGFLMTAYGLVNFLLYLPGGWAADRFSARKLMTFSLISTGISGFYYATFPSYT 98
+ A G L+ Y L+ F G +DRF R ++ SL + AT P
Sbjct: 38 SNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLW 97

Query: 99 MICLLHALWAVTTVFTFWAVCVRIIRTLGTSEEQGRLYGYWFLGKGLTSIVLGFLSVPVF 158
++ + + +T T AV I + +E+ R +G+ + G ++ PV
Sbjct: 98 VLYIGRIVAGITGA-TG-AVAGAYIADITDGDERARHFGF--MS---ACFGFGMVAGPVL 150

Query: 159 AKFGEGVDGLRATIIFYSVVTILAGVLAWFVCQDETHSEDKANFRLADMAF-----VLKM 213
G A + + L + F+ + E + R A M
Sbjct: 151 GGLMGGF-SPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGM 209

Query: 214 PTVWLAGVVTFCMWSI-YIGFGMVTPYLTQILHMGESEVAVASILRAYVLFAMGGLIGGQ 272
V V F M + + + + H + + ++ + L
Sbjct: 210 TVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGIS----LAAFGILHSLAQAM 265

Query: 273 LADRCASRTRFMIYAFIGMIVFTTVYFFLP--GESRYVTIALANMVALGVFIYSANAVFF 330
+ A+R +GMI T Y L + + + G+ + + A+
Sbjct: 266 ITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLS 325

Query: 331 SIIDEVRIPAKVTGTAAGLISLLT 354
+DE R G G ++ LT
Sbjct: 326 RQVDEER-----QGQLQGSLAALT 344


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C5033  HTHFIS  228    1e-69    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 228 bits (582), Expect = 1e-69
Identities = 82/355 (23%), Positives = 157/355 (44%), Gaps = 41/355 (11%)

Query: 327 RKIAQQQISTNANFTFDSLHAASGGMKQVLLIARRAIKSISPILINGEEGVGKLSLAMAI 386
K ++ ++ L S M+++ + R +++ ++I GE G GK +A A+
Sbjct: 122 PKRRPSKLEDDSQ-DGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180

Query: 387 HNESEQRDGPFISVDCQMLSPENILHELLGSDVG-------PSPSKFELAHNGTLYLDKV 439
H+ ++R+GPF++++ + + I EL G + G S +FE A GTL+LD++
Sbjct: 181 HDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEI 240

Query: 440 EYLSGEVQSVLLKVLKTGLVTRSDSHRLIPVRFRLITCTSSSLREYVQQGAFSRQLYYEI 499
+ + Q+ LL+VL+ G T I R++ T+ L++ + QG F LYY +
Sbjct: 241 GDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRL 300

Query: 500 SMNEIEIPPLRKRREDLKQMIDDIIDKYQERTRKKMTITPDANSVLLEYRWPGNISEFKN 559
++ + +PPLR R ED+ ++ + + ++ +A ++ + WPGN+ E +N
Sbjct: 301 NVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELEN 360

Query: 560 RMEKVFINCNRLVLGLENIPLDIRQN-----NSSGDDDIPHLT----------------- 597
+ ++ + V+ E I ++R L+
Sbjct: 361 LVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFG 420

Query: 598 -----------SLAELEMQAIEHTCRVCEWNLTKAAEVLKIGRTTLWRKLKIYNL 641
LAE+E I N KAA++L + R TL +K++ +
Sbjct: 421 DALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


71  UTI89_C5070  UTI89_C5141  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C5070  1       23       -3.232400  hypothetical protein
UTI89_C5071  0       17       -1.684021  hypothetical protein
UTI89_C5072  0       16       0.432676   DNA-binding transcriptional activator BglJ
UTI89_C5073  0       18       1.763312   ferric iron reductase involved in ferric
UTI89_C5074  0       17       1.800908   hypothetical protein
UTI89_C5078  -1      19       2.522212   ***hypothetical protein
UTI89_C5079  -2      16       -0.078947  16S ribosomal RNA m2G1207 methyltransferase
UTI89_C5081  1       27       -6.373897  DNA polymerase III subunit psi
UTI89_C5082  0       26       -6.468485  ribosomal-protein-alanine N-acetyltransferase
UTI89_C5083  0       29       -6.842012  nucleotidase
UTI89_C5084  1       34       -7.718702  peptide chain release factor 3
UTI89_C5085  1       33       -7.482130  integrase-like protein
UTI89_C5086  0       30       -7.260342  hypothetical protein
UTI89_C5087  1       19       -3.045640  hypothetical protein
UTI89_C5088  1       22       -3.604695  hypothetical protein
UTI89_C5089  1       21       -3.385943  hypothetical protein
UTI89_C5090  1       21       -4.906566  hypothetical protein
UTI89_C5091  1       20       -4.920637  hypothetical protein
UTI89_C5092  2       23       -4.273970  hypothetical protein
UTI89_C5093  3       23       -3.225093  hypothetical protein
UTI89_C5094  2       25       -3.364294  hypothetical protein
UTI89_C5095  5       26       -3.221059  regulatory protein
UTI89_C5096  4       26       -1.290965  hypothetical protein
UTI89_C5097  4       24       -0.377372  hypothetical protein
UTI89_C5098  4       23       -0.021455  antirepressor
UTI89_C5099  4       26       1.470415   hypothetical protein
UTI89_C5100  4       25       0.906793   replication protein
UTI89_C5101  2       22       2.305861   hypothetical protein
UTI89_C5102  1       20       1.298623   DNA adenine methylase
UTI89_C5103  1       22       -0.887507  hypothetical protein
UTI89_C5104  1       22       -1.441757  hypothetical protein
UTI89_C5105  1       23       -2.599523  hypothetical protein
UTI89_C5106  3       25       -2.850189  hypothetical protein
UTI89_C5107  3       27       -5.222980  Q protein
UTI89_C5108  2       26       -3.959139  hypothetical protein
UTI89_C5109  2       25       -2.436707  prophage CP-933O hypothetical protein
UTI89_C5110  1       26       -2.526699  hypothetical protein
UTI89_C5111  2       28       -1.607459  lambdoid prophage DLP12 lysis protein S-like
UTI89_C5112  3       19       0.875429   bacteriophage lambda lysozyme-like protein
UTI89_C5113  4       21       1.421900   endopeptidase Rz of prophage CP-933K
UTI89_C5114  4       20       2.238992   hypothetical protein
UTI89_C5115  4       19       1.727850   hypothetical protein
UTI89_C5116  4       20       1.950527   hypothetical protein
UTI89_C5117  4       20       2.062858   prophage prophage DNA packaging protein
UTI89_C5118  3       22       1.852817   hypothetical protein
UTI89_C5119  4       21       1.784112   portal protein
UTI89_C5120  5       20       1.251114   protease/scaffold protein
UTI89_C5121  6       28       2.260422   hypothetical protein
UTI89_C5122  6       29       2.770611   prophage CP-933K hypothetical protein
UTI89_C5123  4       26       2.339420   tail component of prophage CP-933K
UTI89_C5124  4       27       2.065056   tail component of prophage CP-933K
UTI89_C5125  4       26       2.278420   tail component of prophage CP-933K
UTI89_C5126  4       28       2.951396   tail component of prophage CP-933K
UTI89_C5127  3       26       3.506365   minor tail protein
UTI89_C5128  3       23       4.066188   tail length tape measure protein
UTI89_C5129  3       25       4.991020   minor tail protein
UTI89_C5130  0       21       3.001594   minor tail protein
UTI89_C5131  -1      20       2.997518   prophage tail fiber component K
UTI89_C5132  -1      19       2.038776   tail component of prophage CP-933K
UTI89_C5133  -2      18       0.521039   prophage tail component
UTI89_C5134  0       24       -2.585558  prophage protein
UTI89_C5135  0       28       -5.602050  hypothetical protein
UTI89_C5136  1       32       -8.789970  hypothetical protein
UTI89_C5137  2       32       -9.722401  hypothetical protein
UTI89_C5138  2       33       -9.622770  hypothetical protein
UTI89_C5139  1       20       -5.826650  hypothetical protein
UTI89_C5140  2       16       -5.609390  hypothetical protein
UTI89_C5141  2       15       -2.755231  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C5073  2FE2SRDCTASE  481    e-177    Ferric iron reductase signature.
		>2FE2SRDCTASE#Ferric iron reductase signature.

Length = 262

Score = 481 bits (1238), Expect = e-177
Identities = 249/262 (95%), Positives = 253/262 (96%)

Query: 1 MAYRSAPLYEDVIWRTHLQPQDAGLAQAVRATIAEHREHLLEFIRLDEPAPLNAMTLAQW 60
MAYRSAPLYEDVIWRTHLQPQD LAQAVRATIA+HREHLLEFIRLDEPAPLNAMTLAQW
Sbjct: 1 MAYRSAPLYEDVIWRTHLQPQDPTLAQAVRATIAKHREHLLEFIRLDEPAPLNAMTLAQW 60

Query: 61 SSPNALSSLLAVYSDHIYRNQPLMIRENKPLISLWAQWYIGLMVPPLMLALLTQEKALDV 120
SSPN LSSLLAVYSDHIYRNQP+MIRENKPLISLWAQWYIGLMVPPLMLALLTQEKALDV
Sbjct: 61 SSPNVLSSLLAVYSDHIYRNQPMMIRENKPLISLWAQWYIGLMVPPLMLALLTQEKALDV 120

Query: 121 SPEHVHVEFHETGRAACFWVDVCEDKNATLHSPQQRMETLISQALVPVLQALEATGEING 180
SPEH H EFHETGR ACFWVDVCEDKNAT HSPQ RMETLISQALVPV+QALEATGEING
Sbjct: 121 SPEHFHAEFHETGRVACFWVDVCEDKNATPHSPQHRMETLISQALVPVVQALEATGEING 180

Query: 181 KLIWSNTGYLINWYLTEMKQLLGEATVESLRYALFFEKTLTTGEDNPLWRTVVLRDGLLV 240
KLIWSNTGYLINWYLTEMKQLLGEATVESLR+ALFFEKTLT GEDNPLWRTVVLRDGLLV
Sbjct: 181 KLIWSNTGYLINWYLTEMKQLLGEATVESLRHALFFEKTLTNGEDNPLWRTVVLRDGLLV 240

Query: 241 RRTCCQRYRLPDVQQCGDCTLK 262
RRTCCQRYRLPDVQQCGDCTLK
Sbjct: 241 RRTCCQRYRLPDVQQCGDCTLK 262


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C5082  SACTRNSFRASE  55     4e-12    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 54.6 bits (131), Expect = 4e-12
Identities = 23/80 (28%), Positives = 35/80 (43%), Gaps = 1/80 (1%)

Query: 62 DEATLFNIAVDPDYQRQGLGRALLEHLIDELEKRGVATLWLEVRASNAAAIALYESLGFN 121
A + +IAV DY+++G+G ALL I+ ++ L LE + N +A Y F
Sbjct: 88 GYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFI 147

Query: 122 EATIRRNYYPTTDG-REDAI 140
+ Y E AI
Sbjct: 148 IGAVDTMLYSNFPTANEIAI 167


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
UTI89_C5097  FLGHOOKAP1  27     0.038    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 27.2 bits (60), Expect = 0.038
Identities = 11/47 (23%), Positives = 24/47 (51%), Gaps = 3/47 (6%)

Query: 76 AVAQSAGGV---FVSLPEIEEVENADINQRLLEVIEQIGSYSKQIRS 119
A+ + G+ F + + ++ +N + ++QI +Y+KQI S
Sbjct: 131 ALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIAS 177


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C5128  cloacin  44     2e-06    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 44.3 bits (104), Expect = 2e-06
Identities = 34/142 (23%), Positives = 63/142 (44%), Gaps = 4/142 (2%)

Query: 519 DQQRLNELQEKKRQKDLQDAK--EQAERNYQEQQKRRNAENAALNRMNETEAARHQREIA 576
DQ + + +E +RQ++ E AERNY+ + N N + R E +A Q +
Sbjct: 294 DQVKQRQDEENRRQQEWDATHPVEAAERNYERARAELNQANEDVARNQERQAKAVQVYNS 353

Query: 577 RINAMQYADQTVRDA-AIQRENERYEKALASGKKKTRETRNDEATRLLLQYSQQQAQVEG 635
R + + A++T+ DA A ++ R+ +G + + +A R + +QA +
Sbjct: 354 RKSELDAANKTLADAIAEIKQFNRFAHDPMAGGHRMWQMAGLKAQRAQTDVNNKQAAFDA 413

Query: 636 QIAAARQSAGIATERMTEAHKQ 657
A + A A E+ K+
Sbjct: 414 -AAKEKSDADAALSSAMESRKK 434


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C5134  ENTEROVIROMP  145    9e-47    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 145 bits (368), Expect = 9e-47
Identities = 63/200 (31%), Positives = 101/200 (50%), Gaps = 30/200 (15%)

Query: 1 MRKLCAVILSAVVWLVAAGTPASAAEHQSTLSAGYLQTHTYMPGSDDLKGINVKYRYEFT 60
M+K+ + A V AGT +A ST++ GY Q+ + + G N+KYRYE
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAA---TSTVTGGYAQSDAQGQMNK-MGGFNLKYRYEED 56

Query: 61 DT-LGLVTSFSYAGYKNRQLTRYSDTRWHKDSVRNRWFSVMAGPSVRVNEWFSAYAMAGM 119
++ LG++ SF+Y K+R + + N+++ + AGP+ R+N+W S Y + G+
Sbjct: 57 NSPLGVIGSFTYT-EKSRTASSGDYNK-------NQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 AYSRVSTFSGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVAIDIAYE 179
Y + T T+ HD S+ ++GAG+QFNP E+VA+D +YE
Sbjct: 109 GYGKFQT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYE 151

Query: 180 GSGSGDWRTDGFIVGVGYKF 199
S +I GVGY+F
Sbjct: 152 QSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C5135  IGASERPTASE  47     4e-07    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 46.6 bits (110), Expect = 4e-07
Identities = 39/233 (16%), Positives = 85/233 (36%), Gaps = 9/233 (3%)

Query: 101 PEALRRFEEMVEEAARNAEAASQSAAAAKKSETAAASSKNAAKTSETNAANSAQAAATSQ 160
+ + E+ E + E ++ + K E + A E + + + +
Sbjct: 1103 TATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQT 1162

Query: 161 TASANSATAAKKSETNAKNSETAAKTSETNAK-----SSQTAAKTSETNAKASETAAKNS 215
+A++ AK++ +N + T + T T + T A T T S KN
Sbjct: 1163 NTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNR 1222

Query: 216 QVAAAQSESAAAGSATSAAGSATAAANSQKAAKTSETNAKSSQTAAKTSET--NAKASET 273
+ +S AT+++ + A ++ TNA S AK N + +
Sbjct: 1223 HRRSVRSVPHNVEPATTSSNDRSTVA--LCDLTSTNTNAVLSDARAKAQFVALNVGKAVS 1280

Query: 274 AAKSSQDAAAQSESAAASSASEAAASATASANSQKAAKTSETNAKASETTAAN 326
S + + + S + + ++S + ++K+++T +T + N
Sbjct: 1281 QHISQLEMNNEGQYNVWVSNTSMNKNYSSSQYRRFSSKSTQTQLGWDQTISNN 1333



Score = 43.5 bits (102), Expect = 4e-06
Identities = 34/165 (20%), Positives = 55/165 (33%), Gaps = 18/165 (10%)

Query: 209 ETAAKNSQVAAAQSESAAAGSA--TSAAGSATAAANSQKAAKTSETNAKSSQTAAKTSET 266
E +N V + A S + A +A A S+T +E
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 267 NAKASETAAKSSQDA------------AAQSESAAASSASEAAASATASANSQ----KAA 310
+ + S+T K+ QDA A+S A + +E A S + + +Q K
Sbjct: 1044 SKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKET 1103

Query: 311 KTSETNAKASETTAANSAKASAASQTAAKASEDAAREYASQAADP 355
T E KA T SQ + K + + ++ A
Sbjct: 1104 ATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARE 1148


72  UTI89_C5152  UTI89_C5157  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C5152  0       22       3.309384   deoxyribose-phosphate aldolase
UTI89_C5153  0       19       3.503883   thymidine phosphorylase
UTI89_C5154  -2      19       3.771058   phosphopentomutase
UTI89_C5155  -2      16       4.548765   purine nucleoside phosphorylase
UTI89_C5156  -2      16       3.895886   hypothetical protein
UTI89_C5157  -2      17       3.116659   lipoate-protein ligase A
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C5156  SSPAMPROTEIN  30     0.007    Salmonella surface presentation of antigen gene typ...
		>SSPAMPROTEIN#Salmonella surface presentation of antigen gene type M signature.

Length = 147

Score = 30.4 bits (68), Expect = 0.007
Identities = 11/16 (68%), Positives = 12/16 (75%)

Query: 157 WLVGEGNYQRWITAQR 172
WL EGNYQRWI Q+
Sbjct: 113 WLRKEGNYQRWIIRQK 128


73  UTI89_C0415  UTI89_C0421  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C0415  1       17       1.590837   fructokinase
UTI89_C0416  1       15       1.351166   MFS transport protein AraJ
UTI89_C0417  2       16       1.662886   exonuclease SbcC
UTI89_C0418  -2      13       1.976863   exonuclease SbcD
UTI89_C0419  -2      17       1.753634   hypothetical protein
UTI89_C0420  -2      15       2.262110   transcriptional regulator PhoB
UTI89_C0421  -1      13       2.141124   phosphate regulon sensor protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0415  ACETATEKNASE  30     0.015    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 29.8 bits (67), Expect = 0.015
Identities = 17/69 (24%), Positives = 29/69 (42%), Gaps = 10/69 (14%)

Query: 187 FISGTGFATDYRRLSGHALKGSEIIRLVEESDPVAELALRRYELRLAKSLAHVVNILDP- 245
+G ++D+R L A + D A+LAL + R+ K++ +
Sbjct: 273 VYGISGISSDFRDLEDAAF---------KNGDKRAQLALNVFAYRVKKTIGSYAAAMGGV 323

Query: 246 DVIVLGGGM 254
DVIV G+
Sbjct: 324 DVIVFTAGI 332


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0416  TCRTETA  51     4e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 51.0 bits (122), Expect = 4e-09
Identities = 74/356 (20%), Positives = 126/356 (35%), Gaps = 35/356 (9%)

Query: 33 ILSLALGTFGLGMAEFGIMGVLTELAHNVGISIPAAGH---MISYYALGVVVGAPIIALF 89
+ ++AL G+G+ IM VL L ++ S H +++ YAL AP++
Sbjct: 11 LSTVALDAVGIGL----IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 90 SSRYSLKHILLFLVALCVIGNAMFTLSSSYLMLAIGRLVSGFPHGAFFGVGAIVLSKIIK 149
S R+ + +LL +A + A+ + +L IGR+V+G GA + I
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIAD--IT 124

Query: 150 PGKVTAAVAGMVSGMTVANLLGIPLGTYLSQEFSWRYTFLLIAVFNIAVMASVYFWVPDI 209
G A G +S ++ P+ L FS F A N + F +P+
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 210 RDEAKGKLREQ----------FHFLRSPAPWLI--FAATMFGNAGVFAWFSYVKPYMMFI 257
+ LR + + A + F + G W +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG------E 238

Query: 258 SGFSETAMTFIMMLVGLGM---VLGNMLSGRISGRYSPLRIAAVTDFIIVLALLMLFFFG 314
F A T + L G+ + M++G ++ R R + ++L F
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 315 GMKTTSLIFAFICCAGLFALSAPLQILLLQNAKGGELLGAAGGQIAF--NLGSAVG 368
I + G+ LQ +L + E G G +A +L S VG
Sbjct: 299 RGWMAFPIMVLLASGGIG--MPALQAMLSRQV-DEERQGQLQGSLAALTSLTSIVG 351


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C0417  IGASERPTASE  39     2e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.5 bits (89), Expect = 2e-04
Identities = 40/264 (15%), Positives = 81/264 (30%), Gaps = 11/264 (4%)

Query: 162 LNAKPKERAELLEELTGTEIYGQISAMVFEQHKSARTELEKLQAQASGVALLTPEQVQSL 221
A P E E + E + Q S V + + A + + A Q+
Sbjct: 1029 APATPSETTETVAENSK-----QESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTN 1083

Query: 222 TASLQVLTDEEKQLITAQQQEQQSLNWLTRLD-ELQQEGSRRQQALQQALAEEEKAQPQL 280
+ +E Q ++ +++ E QE + + + E QPQ
Sbjct: 1084 EVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQA 1143

Query: 281 AALSLAQPARNLRPHWE---RIAEYSTALAHTRQQIEEVNTRLQSTMALRASIRHHAAKQ 337
P N++ A+ T +E+ T + + + +
Sbjct: 1144 EPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTT 1203

Query: 338 SAELQQQQQSLNAWLQEHDRLRQWNNELAGWRAQFSQQTSDREHLRQWQQQLTHAEQKLN 397
A Q S ++ ++ R + + ++DR + T+ L+
Sbjct: 1204 PATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPA-TTSSNDRSTVALCDLTSTNTNAVLS 1262

Query: 398 ALAAITLTLTADEVASALAQHAEQ 421
A + + V A++QH Q
Sbjct: 1263 DARAKAQFVALN-VGKAVSQHISQ 1285



Score = 33.9 bits (77), Expect = 0.005
Identities = 27/139 (19%), Positives = 54/139 (38%), Gaps = 13/139 (9%)

Query: 738 QQDVLAAQSLQKAQAQFDTALQASVFDDQQAFLAALMDEQTLTQLEQLKQNLENQRRQAQ 797
Q DV + S + A+ D A A E T T E KQ + + Q
Sbjct: 1004 QADVPSVPSNNEEIARVDEAPVPPP-------APATPSETTETVAENSKQESKTVEKNEQ 1056

Query: 798 TLVTQTAETLTQHQQHRPGGLSLTVTVEQIQQELAQTHQKLRENTTSQGEIRQQLKQDAD 857
TA+ ++ + V E+AQ+ + +E T++ + ++++
Sbjct: 1057 DATETTAQNREVAKEAKS-----NVKANTQTNEVAQSGSETKETQTTETKETATVEKEEK 1111

Query: 858 NRQQQQTLMQQIAQMTQQV 876
+ + + Q++ ++T QV
Sbjct: 1112 AKVETEK-TQEVPKVTSQV 1129


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C0420  HTHFIS  95     1e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.9 bits (236), Expect = 1e-24
Identities = 33/149 (22%), Positives = 62/149 (41%), Gaps = 9/149 (6%)

Query: 4 RILVVEDEAPIREMVCFVLEQNGFQPVEAEDYDSAVNQLNEPWPDLILLDWMLPGGSGIQ 63
ILV +D+A IR ++ L + G+ + + + DL++ D ++P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 FIKHLKRESMTRDIPVVMLTARGEEEDRVRGLETGADDYITKPFSPKELVARIKAVMRRI 123
+ +K+ D+PV++++A+ ++ E GA DY+ KPF EL+ I +
Sbjct: 65 LLPRIKKARP--DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA-- 120

Query: 124 SPMAVEEVIEMQGLSLDPTSHRVMAGEEP 152
E L D + G
Sbjct: 121 -----EPKRRPSKLEDDSQDGMPLVGRSA 144


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0421  PF06580  34     0.001    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.1 bits (78), Expect = 0.001
Identities = 19/105 (18%), Positives = 33/105 (31%), Gaps = 26/105 (24%)

Query: 325 LVYNAVNH----TPEGTHITVRWQRVPHGAEFSVEDNGPGIAPEHIPRLTERFYRVDKAR 380
LV N + H P+G I ++ + VE+ G
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN---------------- 306

Query: 381 SRQTGGSGLGLAIVKHAVNH---HESRLNIESTVGKGTRFSFVIP 422
+G GL V+ + E+++ + GK +IP
Sbjct: 307 --TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM-VLIP 348


74  UTI89_C0466  UTI89_C0474  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C0466  1       22       0.278630   ATP-dependent protease ATP-binding subunit ClpX
UTI89_C0467  0       20       0.227231   DNA-binding ATP-dependent protease La
UTI89_C0468  -1      14       0.263757   transcriptional regulator HU subunit beta
UTI89_C0469  -2      13       0.213650   peptidyl-prolyl cis-trans isomerase D
UTI89_C0470  -2      17       -0.395067  hypothetical protein
UTI89_C0471  -1      14       0.018671   hypothetical protein
UTI89_C0472  0       13       1.442350   queuosine biosynthesis protein QueC
UTI89_C0473  0       13       1.471457   hypothetical protein
UTI89_C0474  0       13       2.103912   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C0466  HTHFIS  29     0.043    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.043
Identities = 16/73 (21%), Positives = 29/73 (39%), Gaps = 13/73 (17%)

Query: 60 ERSALPTPHEIRNHLDDYVIGQEQAKKVLAVAVYNHYKRLRNGDTSNGVELGKSNILLIG 119
E P+ E + ++G+ A + +Y RL D +++ G
Sbjct: 121 EPKRRPSKLEDDSQDGMPLVGRSAAMQ----EIYRVLARLMQTD---------LTLMITG 167

Query: 120 PTGSGKTLLAETL 132
+G+GK L+A L
Sbjct: 168 ESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
UTI89_C0467  GPOSANCHOR  34     0.002    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 34.3 bits (78), Expect = 0.002
Identities = 33/144 (22%), Positives = 67/144 (46%), Gaps = 12/144 (8%)

Query: 195 KQSVLEMSDVNERLEYLMAMMESEIDLLQVEKRIRNRVKKQMEKSQREYYLNEQMKAIQK 254
++ + L A QV R +++ ++ S+ +Q++A +
Sbjct: 277 TADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAK---KQLEAEHQ 333

Query: 255 ELGEMDDAPD-ENEALKRKIDAAKMPKEAKEKAEAELQKLKMMSPMS-AEATVVRGYIDW 312
+L E + + ++L+R +DA++ EAK++ EAE QKL+ + +S A +R +D
Sbjct: 334 KLEEQNKISEASRQSLRRDLDASR---EAKKQLEAEHQKLEEQNKISEASRQSLRRDLDA 390

Query: 313 MVQVPWNARSKVKKDLRQAQEILD 336
+ A+ +V+K L +A L
Sbjct: 391 SRE----AKKQVEKALEEANSKLA 410


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0468  DNABINDINGHU  117    3e-38    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 117 bits (294), Expect = 3e-38
Identities = 49/88 (55%), Positives = 67/88 (76%)

Query: 2 NKSQLIDKIAAGADISKAAAGRALDAIIASVTESLKEGDDVALVGFGTFAVKERAARTGR 61
NK LI K+A +++K + A+DA+ ++V+ L +G+ V L+GFG F V+ERAAR GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGKEITIAAAKVPSFRAGKALKDAV 89
NPQTG+EI I A+KVP+F+AGKALKDAV
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0471  PF08280  27     0.021    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 27.1 bits (60), Expect = 0.021
Identities = 24/138 (17%), Positives = 41/138 (29%), Gaps = 20/138 (14%)

Query: 1 MQTQIKVRGYHLDVYQHVNNARYL-------EFLEEARWHGLENSDSFHWMTAH------ 47
+Q I + Y N Y E++ + N FH +
Sbjct: 361 LQHFIPETNLFVSPYYKGNQKLYTSLKLIVEEWMAKLPGKRYLNHKHFHLFCHYVEQILR 420

Query: 48 ------NIAFVVVN-ININYRRPAVLSDLLTITSQLQQLNGKSGILSQVITLEPEGQVVA 100
+ FV N IN + + + + Q+ L+P+ +
Sbjct: 421 NIQPPLVVVFVASNFINAHLLTDSFPRYFSDKSIDFHSYYLLQDNVYQIPDLKPDLVITH 480

Query: 101 DALITFVCIDLKTQKALA 118
LI FV +L A+A
Sbjct: 481 SQLIPFVHHELTKGIAVA 498


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C0474  HTHFIS  29     0.020    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.020
Identities = 12/64 (18%), Positives = 24/64 (37%), Gaps = 10/64 (15%)

Query: 197 LTVLTQHLGLSLRDCMAFGDAMNDREMLGSVGSGFIMGN----------AMPQLRAELPH 246
TVL Q L + D +A + + ++ + +P+++ P
Sbjct: 16 RTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLLPRIKKARPD 75

Query: 247 LPVI 250
LPV+
Sbjct: 76 LPVL 79


75  UTI89_C0484  UTI89_C0497  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C0484  -1      12       -1.699668  hypothetical protein
UTI89_C0485  2       17       -0.543747  hypothetical protein
UTI89_C0486  1       16       -0.743330  maltose O-acetyltransferase
UTI89_C0487  1       15       -0.146596  hemolysin expression-modulating protein
UTI89_C0488  1       15       0.093627   hypothetical protein
UTI89_C0489  1       16       0.963931   acridine efflux pump
UTI89_C0490  1       12       0.438245   acriflavine resistance protein A
UTI89_C0491  1       15       0.301304   DNA-binding transcriptional repressor AcrR
UTI89_C0492  3       16       2.390898   potassium efflux protein KefA
UTI89_C0493  4       17       3.902411   hypothetical protein
UTI89_C0494  4       17       3.868293   primosomal replication protein N''
UTI89_C0495  2       18       3.675070   hypothetical protein
UTI89_C0496  4       24       3.148087   adenine phosphoribosyltransferase
UTI89_C0497  4       25       2.889216   DNA polymerase III subunits gamma and tau
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0484  BCTERIALGSPF  29     0.035    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 29.4 bits (66), Expect = 0.035
Identities = 31/137 (22%), Positives = 54/137 (39%), Gaps = 24/137 (17%)

Query: 247 IWLPLGLVIGLLAAMFVLRILRRIQSPHHRLQDAIENRDICVHYQPIVSLANGKIVGAEA 306
W+ L L+ G +A +LR R+ + + P++ G+I
Sbjct: 228 PWMLLALLAGFMAFRVMLR------QEKRRVS-----FHRRLLHLPLI----GRIARGLN 272

Query: 307 LARWPQTDGSWLSPDSFIPLAQQTGLS-EPLTLLIIRSVFEDMGDWLRQHPQQHISINLE 365
AR+ +T + S +PL Q +S + ++ R D +R+ H + LE
Sbjct: 273 TARYARTLSILNA--SAVPLLQAMRISGDVMSNDYARHRLSLATDAVREGVSLHKA--LE 328

Query: 366 STVLTSEKIPQLLREMI 382
T L P ++R MI
Sbjct: 329 QTAL----FPPMMRHMI 341


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0489  ACRIFLAVINRP  1369   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1369 bits (3546), Expect = 0.0
Identities = 802/1033 (77%), Positives = 915/1033 (88%), Gaps = 1/1033 (0%)

Query: 1 MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISASYPGADAKTVQDT 60
M NFFI RPIFAWV+AII+M+AG LAIL+LPVAQYPTIAPPAV++SA+YPGADA+TVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQ 120
VTQVIEQNMNGIDNLMYMSS SDS G+V ITLTF+SGTD DIAQVQVQNKLQLA PLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVSVEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDAISRTSGVGDVQLFGS 180
EVQQQG+SVEKSSSS+LMV G ++ + TQ+DISDYVA+N+KD +SR +GVGDVQLFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWMNPNELNKFQLTPVDVITAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQTRL 240
QYAMRIW++ + LNK++LTPVDVI +K QN Q+AAGQLGGTP + GQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 TSTEEFGKILLKVNQDGSRVLLRDVAKIELGGENYDIIAEFNGQPASGLGIKLATGANAL 300
+ EEFGK+ L+VN DGS V L+DVA++ELGGENY++IA NG+PA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTAAAIRAELAKMEPFFPSGLKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFLQ 360
DTA AI+A+LA+++PFFP G+K++YPYDTTPFV++SIHEVVKTL EAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTFA+LAAFG+SINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 AEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGSTGAIYRQFSITIVSAMAL 480
E+ LPPKEAT KSM QIQGALVGIAMVLSAVF+PMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATMLKPIAKGDHGEGKKGFFGWFNRMFEKSTHHYTDSVGGILRSTGR 540
SVLVALILTPALCAT+LKP++ H E K GFFGWFN F+ S +HYT+SVG IL STGR
Sbjct: 481 SVLVALILTPALCATLLKPVSAE-HHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 541 YLVLYLIIVVGMAYLFVRLPSSFLPDEDQGVFMTMVQLPAGATQERTQKVLNEVTHYYLT 600
YL++Y +IV GM LF+RLPSSFLP+EDQGVF+TM+QLPAGATQERTQKVL++VT YYL
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 601 KEKNNVESVFAVNGFGFAGRGQNTGIAFVSLKDWADRPGEENKVEAITMRATRAFSQIKD 660
EK NVESVF VNGF F+G+ QN G+AFVSLK W +R G+EN EA+ RA +I+D
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 661 AMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLLAEAAKHPDMLTSVRPNG 720
V FN+PAIVELGTATGFDFELIDQAGLGH+ LTQARNQLL AA+HP L SVRPNG
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 721 LEDTPQFKIDIDQEKAQALGVSINDINTTLGAAWGGSYVNDFIDRGRVKKVYVMSEAKYR 780
LEDT QFK+++DQEKAQALGVS++DIN T+ A GG+YVNDFIDRGRVKK+YV ++AK+R
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFR 779

Query: 781 MLPDDIGDWYVRAADGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILGQAAPGKSTGEA 840
MLP+D+ YVR+A+G+MVPFSAF++S W YGSPRLERYNGLPSMEI G+AAPG S+G+A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 841 MELMEQLASKLPTGVGYDWTGMSYQERLSGNQAPSLYAISLIVVFLCLAALYESWSIPFS 900
M LME LASKLP G+GYDWTGMSYQERLSGNQAP+L AIS +VVFLCLAALYESWSIP S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 901 VMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGL 960
VMLVVPLG++G LLAAT NDVYF VGLLTTIGLSAKNAILIVEFAKDLM+KEGKG+
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 961 IEATLDAVRMRLRPILMTSLAFILGVMPLVISTGAGSGAQNAVGTGVMGGMVTATVLAIF 1020
+EATL AVRMRLRPILMTSLAFILGV+PL IS GAGSGAQNAVG GVMGGMV+AT+LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1021 FVPVFFVVVRRRF 1033
FVPVFFVV+RR F
Sbjct: 1020 FVPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C0490  RTXTOXIND  44     6e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.4 bits (105), Expect = 6e-07
Identities = 33/212 (15%), Positives = 71/212 (33%), Gaps = 23/212 (10%)

Query: 112 TYQAAYDSAKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTA 171
+ Y A +L + + Q+ Q +++ ++ L +Q +
Sbjct: 256 EQENKYVEAVNELR--VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 172 AKAAVETARINLAYTKVTSPISGRIGKSNV-TEGALVQNGQATALATVQQLDPIYVDVTQ 230
+ + + +P+S ++ + V TEG +V + T + V + D + V
Sbjct: 314 LTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTALV 372

Query: 231 SSNDFLRLKQELA----------NGTLKQENGKAKVSLITSDGIKFPQDGTLEFSDVTVD 280
+ D + KV I D I+ + G + ++++
Sbjct: 373 QNKDIGFINVGQNAIIKVEAFPYTRYGYLV---GKVKNINLDAIEDQRLGLVFNVIISIE 429

Query: 281 QTTGSITLRAIFPNPDHTLLPGMFVRARLEEG 312
+ S + I L GM V A ++ G
Sbjct: 430 ENCLSTGNKNIP------LSSGMAVTAEIKTG 455



Score = 34.4 bits (79), Expect = 7e-04
Identities = 24/125 (19%), Positives = 43/125 (34%), Gaps = 13/125 (10%)

Query: 61 PLQITTELPGR-TSAYRIAEVRPQVSGIILKRNFKEGSDIEAGVSLYQIDPATYQAAYDS 119
++I G+ T + R E++P + I+ + KEG + G L ++ +A
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA---- 134

Query: 120 AKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTAAKAAVETA 179
D K Q++ A+L RYQ L E ++
Sbjct: 135 ---DTLKTQSSLLQARLEQTRYQILS-----RSIELNKLPELKLPDEPYFQNVSEEEVLR 186

Query: 180 RINLA 184
+L
Sbjct: 187 LTSLI 191


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0491  HTHTETR  222    5e-76    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 222 bits (567), Expect = 5e-76
Identities = 215/215 (100%), Positives = 215/215 (100%)

Query: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60
MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120
EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180
GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215
APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C0492  RTXTOXIND  32     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.017
Identities = 19/125 (15%), Positives = 40/125 (32%), Gaps = 6/125 (4%)

Query: 28 QNTAFARASSNGDLPTKADLQAQLDSLNKQKDLSAQDKLVQQDLTDTLATLDKIDRVKEE 87
N RA L + + L L+ + A L++ ++ E
Sbjct: 207 LNLDKKRAERLTVLARINRYENLSRVEKSR--LDDFSSLLHKQAIAKHAVLEQENKYVEA 264

Query: 88 TVQLRQKVAEAPEKMRQATAALTALSDVDND--EETRKIL--STLSLRQLETRVAQALDD 143
+LR ++ + + +A V E L +T ++ L +A+ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEER 324

Query: 144 LQNAQ 148
Q +
Sbjct: 325 QQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C0497  IGASERPTASE  39     9e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.5 bits (89), Expect = 9e-05
Identities = 40/251 (15%), Positives = 78/251 (31%), Gaps = 31/251 (12%)

Query: 404 PLPETTSQVLAARQQLQRVQGATKAKKSEPAA----ATRARPVNNAALERLASVTDRVQA 459
P E +Q + + + P+ AR + A + A T
Sbjct: 983 PEVEKRNQTVDTTN----ITTPNNIQADVPSVPSNNEEIARV-DEAPVPPPAPATPSETT 1037

Query: 460 RPVPSALEKAPAKKEAYRWKATTPVMQQKE--------VVATPKALKKA---LEHEKTPE 508
V ++ E AT Q +E V A + + A E ++T
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 509 LAAKLAA---------EAIERDPWAAQVSQLSLPKLVEQVALNAWKE-ESDNAVCLHLRS 558
K A E+ +V+ PK + + E +N ++++
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 559 SQRHLNNRGAQQKLAEALST-LKGSTVELTIVEDDNPAVRTPLEWRQAIYEEKLAQARES 617
Q N ++ A+ S+ ++ E T V N V P A + + +
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 618 IIADNNIQTLR 628
+ + +++R
Sbjct: 1218 KPKNRHRRSVR 1228


76  UTI89_C0565  UTI89_C0575  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C0565  -2      11       -0.180282  hypothetical protein
UTI89_C0566  -1      10       0.805800   outer membrane protease
UTI89_C0567  0       10       1.569066   hypothetical protein
UTI89_C0568  0       11       1.406344   bacteriophage N4 receptor, outer membrane
UTI89_C0569  0       14       1.033770   bacteriophage N4 adsorption protein B
UTI89_C0570  0       20       2.237879   sensor kinase CusS
UTI89_C0571  0       20       2.422785   DNA-binding transcriptional activator CusR
UTI89_C0572  -1      19       1.493874   copper/silver efflux system outer membrane
UTI89_C0573  -1      19       1.649387   periplasmic copper-binding protein
UTI89_C0574  -1      19       1.709929   copper/silver efflux system membrane fusion
UTI89_C0575  -2      18       1.145699   cation efflux system protein CusA
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C0565  LUXSPROTEIN  31     0.002    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 31.4 bits (71), Expect = 0.002
Identities = 18/66 (27%), Positives = 30/66 (45%), Gaps = 7/66 (10%)

Query: 40 TKEHLLPHFL-EHLGNNHLDI------GVGTGFYLTHVPESSLISLMDLNEASLNAASTR 92
T EHL F+ HL + ++I G TGFY++ + S + D A++
Sbjct: 54 TLEHLYAGFMRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKV 113

Query: 93 AGESKI 98
++KI
Sbjct: 114 ENQNKI 119


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C0566  OMPTIN  528    0.0      Omptin serine protease signature.
		>OMPTIN#Omptin serine protease signature.

Length = 317

Score = 528 bits (1360), Expect = 0.0
Identities = 313/317 (98%), Positives = 316/317 (99%)

Query: 1 MRAKLLGIVLTPPIAISSFASTETLSFTPDNINADISLGTLSGKTKERVYLAEEGGRKVS 60
MRAKLLGIVLT PIAISSFASTETLSFTPDNINADISLGTLSGKTKERVYLAEEGGRKVS
Sbjct: 1 MRAKLLGIVLTTPIAISSFASTETLSFTPDNINADISLGTLSGKTKERVYLAEEGGRKVS 60

Query: 61 QLDWKFNNAAIIKGAINWDLMPQISIGAAGWTTLGSRGGNMVDQDWMDSSNPGTWTDESR 120
QLDWKFNNAAIIKGAINWDLMPQISIGAAGWTTLGSRGGNMVDQDWMDSSNPGTWTDESR
Sbjct: 61 QLDWKFNNAAIIKGAINWDLMPQISIGAAGWTTLGSRGGNMVDQDWMDSSNPGTWTDESR 120

Query: 121 HPDTQLNYANEFDLNIKGWLLNEPNYRLGLMAGYQESRYSFTARGGSYIYSSEEGFRDDI 180
HPDTQLNYANEFDLNIKGWLLNEPNYRLGLMAGYQESRYSFTARGGSYIYSSEEGFRDDI
Sbjct: 121 HPDTQLNYANEFDLNIKGWLLNEPNYRLGLMAGYQESRYSFTARGGSYIYSSEEGFRDDI 180

Query: 181 GSFPNGERAIGYKQRFKIPYIGLTGSYRYEDFELGGTFKYSGWVEASDNDEHYDPGKRIT 240
GSFPNGERAIGYKQRFK+PYIGLTGSYRYEDFELGGTFKYSGWVE+SDNDEHYDPGKRIT
Sbjct: 181 GSFPNGERAIGYKQRFKMPYIGLTGSYRYEDFELGGTFKYSGWVESSDNDEHYDPGKRIT 240

Query: 241 YRSKVKDQNYYSVAVNAGYYVTPNAKVYVEGAWNRVTNKKGNTSLYDHNDNTSDYSKNGA 300
YRSKVKDQNYYSVAVNAGYYVTPNAKVYVEGAWNRVTNKKGNTSLYDHN+NTSDYSKNGA
Sbjct: 241 YRSKVKDQNYYSVAVNAGYYVTPNAKVYVEGAWNRVTNKKGNTSLYDHNNNTSDYSKNGA 300

Query: 301 GIENYNFITTAGLKYTF 317
GIENYNFITTAGLKYTF
Sbjct: 301 GIENYNFITTAGLKYTF 317


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0570  PF06580  31     0.007    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.4 bits (71), Expect = 0.007
Identities = 29/184 (15%), Positives = 67/184 (36%), Gaps = 34/184 (18%)

Query: 306 EELTRMAKMVSDML-FLAQADNNQLIPEKKMLNLADEVGKVFDFFEALAEDR-GVELRFV 363
+ M +S+++ + + N + + LADE+ V + + LA + L+F
Sbjct: 191 TKAREMLTSLSELMRYSLRYSNARQVS------LADELTVVDSYLQ-LASIQFEDRLQFE 243

Query: 364 GDECQVAGDPLMLRRALSNLLSNALRY----TPTGETIVVRCQTVDHLVQVTVENPGTPI 419
D + + L+ N +++ P G I+++ + V + VEN G+
Sbjct: 244 NQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLA 303

Query: 420 APEHLPRLFDRFYRVDPSRQRKGEGSGIGLAIVK---SIVVAHKGTVAVTSDVRGTRFVI 476
E +G GL V+ ++ + + ++ ++
Sbjct: 304 LKN------------------TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 477 ILPA 480
++P
Sbjct: 346 LIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C0571  HTHFIS  86     2e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 85.7 bits (212), Expect = 2e-21
Identities = 35/117 (29%), Positives = 62/117 (52%)

Query: 2 KLLIVEDEKKTGEYLTKGLTEAGFVVDLADNGLNGYHLAMTGDYDLIILDIMLPDVNGWD 61
+L+ +D+ L + L+ AG+ V + N + GD DL++ D+++PD N +D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 IVRMLRSANKGMPILLLTALGTIEHRVKGLELGADDYLVKPFAFAELLARVRTLLRR 118
++ ++ A +P+L+++A T +K E GA DYL KPF EL+ + L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C0572  RTXTOXIND  38     5e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 38.3 bits (89), Expect = 5e-05
Identities = 25/189 (13%), Positives = 60/189 (31%), Gaps = 13/189 (6%)

Query: 254 QAQTVNSDSLQSVKLPA-GLPSQILLQRPDIMEAEHALM-----AANANIGAARAAFFPS 307
+ +S + +K + +I+++ + + L+ A A+ ++
Sbjct: 87 NGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQS----- 141

Query: 308 ISLTSGISTASSDLSSLFNASSGMWNFIPKIEIPIFNAGRNQANLDIAEIRQQQSVVNYE 367
SL + + + + P F + L + + ++Q
Sbjct: 142 -SLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQN 200

Query: 368 QKIQNAFKEVADALALRQSLNDQISAQQRYLASLQITLQRARALYQHGAVSYLEVLDAER 427
QK Q + A R ++ +I+ + + L +L A++ VL+ E
Sbjct: 201 QKYQ-KELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQEN 259

Query: 428 SLFATRQTL 436
L
Sbjct: 260 KYVEAVNEL 268


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0575  ACRIFLAVINRP  695    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 695 bits (1795), Expect = 0.0
Identities = 213/1059 (20%), Positives = 440/1059 (41%), Gaps = 54/1059 (5%)

Query: 1 MIEWIIRRSVANRFLVLMGALFLSIWGTWTIINTPVDALPDLSDVQVIIKTSYPGQAPQI 60
M + IRR + A+ L + G I+ PV P ++ V + +YPG Q
Sbjct: 1 MANFFIRR----PIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQT 56

Query: 61 VENQVTYPLTTTMLSVPGAKTVRGFSQ-FGDSYVYVIFEDGTDPYWARSRVLEYLNQVQG 119
V++ VT + M + + S G + + F+ GTDP A+ +V L
Sbjct: 57 VQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATP 116

Query: 120 KLPAGVSAELGP-DATGVGWIYEYALVDRSGKHDLADLRSLQDWFLKYELKTIPDVAEVA 178
LP V + + + ++ V + D+ +K L + V +V
Sbjct: 117 LLPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQ 176

Query: 179 SVGGVVKEYQVVIDPQRLAQYGISLAEVKSALDASNQEAGGSSIELA------EAEYMVR 232
G ++ +D L +Y ++ +V + L N + + + +
Sbjct: 177 LFGAQ-YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASII 235

Query: 233 ASGYLQTLDDFNHIVLKASENGVPVYLRDVAKVQVGPEMRRGIAELNGEGEVAGGVVILR 292
A + ++F + L+ + +G V L+DVA+V++G E IA +NG+ AG + L
Sbjct: 236 AQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGK-PAAGLGIKLA 294

Query: 293 SGKNAREVIAAVKDKLETLKSSLPEGVEIVTTYDRSQLIDRAIDNLSGKLLEEFIVVAVV 352
+G NA + A+K KL L+ P+G++++ YD + + +I + L E ++V +V
Sbjct: 295 TGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLV 354

Query: 353 CALFLWHVRSALVAIISLPLGLCIAFIVMHFQGLNANIMSLGGIAIAVGAMVDAAIVMIE 412
LFL ++R+ L+ I++P+ L F ++ G + N +++ G+ +A+G +VD AIV++E
Sbjct: 355 MYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVE 414

Query: 413 NAHKRLEEWQHQHPDATLDNKTRWQVITDASVEVGPALFISLLIITLSFIPIFTLEGQEG 472
N + + E D + + ++ AL ++++ FIP+ G G
Sbjct: 415 NVERVMME----------DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTG 464

Query: 473 RLFGPLAFTKTYAMAGAALLAIVVIPILMGYWIRGKIPPESSNPLNRF----------LI 522
++ + T AMA + L+A+++ P L ++ + E F +
Sbjct: 465 AIYRQFSITIVSAMALSVLVALILTPALCATLLKP-VSAEHHENKGGFFGWFNTTFDHSV 523

Query: 523 RVYHPLLLKVLHWPKTTLLVAALSVLTVLWPLNKVGGEFLPQINEGDLLYMPSTLPGISA 582
Y + K+L LL+ AL V ++ ++ FLP+ ++G L M G +
Sbjct: 524 NHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQ 583

Query: 583 AEAASMLQKTDKLIM--SVPEVARVFGKTGKAETATDSAPLEMVETTIQLKPQDQW-RPG 639
+L + + V VF G + + + LKP ++
Sbjct: 584 ERTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQN---AGMAFVSLKPWEERNGDE 640

Query: 640 MTMDKIIEELDNTVRLPGLANLWVPPIRNRIDMLSTGIKSPIGIKVSGTVLADI-DTMAE 698
+ + +I + + + +++ + I +G + +
Sbjct: 641 NSAEAVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQ 700

Query: 699 QIEEVARTVPGVASALAERLEGGRYINVEINREKAARYGMTVADVQLFVTSAVGGAMVGE 758
+ A+ + S LE +E+++EKA G++++D+ +++A+GG V +
Sbjct: 701 LLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVND 760

Query: 759 TVEGIARYPINLRYPQSWRDSPQALRQLPILTPMKQQITLADVADVKVSTGPSMLKTENA 818
++ + ++ +R P+ + +L + + + + + G L+ N
Sbjct: 761 FIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNG 820

Query: 819 RPTSWIYIDARDRDMVSVVHDLQKAIAEKVQLKPGTSVAFSGQFELLERANHKLKLMVPM 878
P+ I +A L + +A K L G ++G + ++ +V +
Sbjct: 821 LPSMEIQGEAAPGTSSGDAMALMENLASK--LPAGIGYDWTGMSYQERLSGNQAPALVAI 878

Query: 879 TLMIIFVLLYLAFRRVGEALLIISSVPFALVGGIWLLWWMGFHLSVAMGTGFIALAGVAA 938
+ +++F+ L + + ++ VP +VG + V G + G++A
Sbjct: 879 SFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSA 938

Query: 939 EFGVVMLMYLRHAIEAEPSLNNPQTFSEQKLDEALYHGAVLRVRPKAMTVAVIIAGLLPI 998
+ ++++ + + +E E + + EA +R+RP MT I G+LP+
Sbjct: 939 KNAILIVEFAKDLMEKE----------GKGVVEATLMAVRMRLRPILMTSLAFILGVLPL 988

Query: 999 LWGTGAGSEVMSRIAAPMIGGMITAPLLSLFIIPAAYKL 1037
GAGS + + ++GGM++A LL++F +P + +
Sbjct: 989 AISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVV 1027


77  UTI89_C0593  UTI89_C0598  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias   Product
UTI89_C0593  -1      18       4.832811  enterobactin exporter EntS
UTI89_C0594  -2      18       4.671200  iron-enterobactin transporter periplasmic
UTI89_C0595  -2      22       5.037628  isochorismate synthase
UTI89_C0596  -2      24       5.053019  enterobactin synthase subunit E
UTI89_C0597  -1      21       4.836969  2,3-dihydro-2,3-dihydroxybenzoate synthetase
UTI89_C0598  -1      19       4.701196  2,3-dihydroxybenzoate-2,3-dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0593  TCRTETA  36     2e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.3 bits (84), Expect = 2e-04
Identities = 82/394 (20%), Positives = 146/394 (37%), Gaps = 38/394 (9%)

Query: 27 FISIVSLGLLGVAVPVQIQMMTHSTWQV---GLSVTLTGGAMFVGLMVGGVLADRYERKK 83
+ V +GL+ +P ++ + HS G+ + L F V G L+DR+ R+
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 84 VILLARGTCGIGFIGLCLNALL--PEPSLLAIYLLGLWDGFFASLGVTALLAATPALVGR 141
V+L + G ++ + P L +Y+ + G + G A A +
Sbjct: 75 VLL-------VSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVA-GAYIADITDG 126

Query: 142 ENLMQAGAITMLTVRLGSVISPMIGGLLLATGGVAWNYGLAAAGTFITLLPLLSLPALPP 201
+ + G V P++GGL+ GG + + AA L L LP
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLM---GGFSPHAPFFAAAALNGLNFLTGCFLLPE 183

Query: 202 PPQPREHPLK----SLLAGFRFLLASPLVGGIALLGGLLTMAS----AVRVLYPALADNW 253
+ PL+ + LA FR+ +V + + ++ + A+ V++ D +
Sbjct: 184 SHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG--EDRF 241

Query: 254 QMSAAQIGFLYAAIP-LGAAIGALTSGKLAHSVRPGLLMLLSTLG---AFLAIGLFGLMP 309
A IG AA L + A+ +G +A + ++L + ++ +
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 310 MWILGVVCLALFGWLSAVSSLLQYTMLQTQTPEAMLGRINGLWTAQNVTGDAIGAALLGG 369
M +V LA G ML Q E G++ G A +G L
Sbjct: 302 MAFPIMVLLASGGIGMPALQ----AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTA 357

Query: 370 LGAMMTPVASASASGFGLLIIGVLLLLVLVELRR 403
+ A + + +G+ + L LL L LRR
Sbjct: 358 IYA----ASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0594  FERRIBNDNGPP  63     2e-13    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 62.7 bits (152), Expect = 2e-13
Identities = 60/280 (21%), Positives = 100/280 (35%), Gaps = 35/280 (12%)

Query: 40 HTLESQPQRIVSTSVTLTGSLLAIDAPVIASGATTPNNRVADDQGFLRQWSKVAKERKLQ 99
H P RIV+ LLA+ VAD + R W E L
Sbjct: 29 HAAAIDPNRIVALEWLPVELLLALGIVPYG---------VADTINY-RLW---VSEPPLP 75

Query: 100 RLYIG-----EPSAEAVAAQMPDLILISATGGDSALALYDQLSTIAPTLIINYDDKS--- 151
I EP+ E + P ++ SA G S + L+ IAP N+ D
Sbjct: 76 DSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPL 131

Query: 152 --WQSLLTQLGEITGHEKQAAERIAQFDKQLAAAKEQIKLPPQPVTAIVYTAAAHSANLW 209
+ LT++ ++ + A +AQ++ + + K + + ++
Sbjct: 132 AMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVF 191

Query: 210 TPESAQGQMLEQLGFTLAKLPAGLNASQSQGKRHDIIQLGGENLAAGLNGESLFLFAGDQ 269
P S ++L++ G NA Q + + + LAA + + L +
Sbjct: 192 GPNSLFQEILDEYGIP--------NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNS 243

Query: 270 KDADAIYANPLLAHLPAVQNKQVYALGTETFRLDYYSAMQ 309
KD DA+ A PL +P V+ + + F SAM
Sbjct: 244 KDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMH 283


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0597  ISCHRISMTASE  444    e-161    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 444 bits (1142), Expect = e-161
Identities = 146/299 (48%), Positives = 195/299 (65%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQAYALPESHDIPQNKVDWAFEPQRAALLIHDMQDYFVSFWGENCPMMEQVIANI 60
MAIP +Q Y +P + D+PQNKV W +P RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRDYCKQHNIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVDRLTPDADDTV 120
L++ C Q IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P+ DD V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKESGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M+++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSRDEHLMSLKYVAGRSGRVVMTEELL------PAPIPASKA-----------ALREVIL 223
FS ++H M+L+Y AGR VMT+ LL PA + + A +R+ I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDESDEPFDDD-NLIDYGLDSVRMMALAARWRKVHGDIDFVMLAKNPTIDAWWKLLS 281
LL E+ E D +L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W KLL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0598  DHBDHDRGNASE  365    e-131    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 365 bits (937), Expect = e-131
Identities = 111/258 (43%), Positives = 150/258 (58%), Gaps = 20/258 (7%)

Query: 5 GKNVWITGAGKGIGYATALAFVEAGAKVTGFD---------------QAFTQEQYPFATE 49
GK +ITGA +GIG A A GA + D +A E +P
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP---- 63

Query: 50 VMDVADAAQVAQVCQRLLAETERLDVLVNAAGILRMGATDQLSKEDWQQTFAVNVGGAFN 109
DV D+A + ++ R+ E +D+LVN AG+LR G LS E+W+ TF+VN G FN
Sbjct: 64 -ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 110 LFQQTMNQFRRQRGGAIVTVASDAAHTPRIGMSAYGASKAALKSLALSVGLELAGSGVRC 169
+ +R G+IVTV S+ A PR M+AY +SKAA +GLELA +RC
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 170 NVVSPGSTDTDMQRTLWVSDDAEEQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDL 229
N+VSPGST+TDMQ +LW ++ EQ I+G E FK GIPL K+A+P +IA+ +LFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 230 ASHITLQDIVVDGGSTLG 247
A HIT+ ++ VDGG+TLG
Sbjct: 243 AGHITMHNLCVDGGATLG 260


78  UTI89_C0793  UTI89_C0798  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias   Product
UTI89_C0793  -1      20       3.695274  hypothetical protein
UTI89_C0794  -1      18       3.843879  hypothetical protein
UTI89_C0795  -1      17       3.567573  ABC transporter
UTI89_C0796  -1      14       3.453655  hypothetical protein
UTI89_C0797  -1      13       3.181579  DNA-binding transcriptional regulator
UTI89_C0798  0       13       3.204170  ATP-dependent RNA helicase RhlE
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0793  ABC2TRNSPORT  47     3e-08    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 47.2 bits (112), Expect = 3e-08
Identities = 36/146 (24%), Positives = 63/146 (43%), Gaps = 5/146 (3%)

Query: 197 AREREQGTLDQLLVSPLTTWQIFIGKAVPALIVATFQATIVLAIGIWAYQIPFAGSLALF 256
R Q T + +L + L I +G+ A A IG+ A + + L+L
Sbjct: 92 GRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGA---GIGVVAAALGYTQWLSLL 148

Query: 257 YFTMVI--YGLSLVGFGLLISSLCSTQQQAFIGVFVFMMPAILLSGYVSPVENMPVWLQN 314
Y VI GL+ G+++++L + + + P + LSG V PV+ +P+ Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 315 LTWINPIRHFTDITKQIYLKDASLDI 340
P+ H D+ + I L +D+
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDV 234


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0795  PF05272  31     0.012    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.012
Identities = 20/90 (22%), Positives = 28/90 (31%), Gaps = 21/90 (23%)

Query: 298 TPRFEDAFIDLLGGAGTSESPLGAILHTVEGTPGETVIEAKELTKKFGDFAATDHVNFAV 357
PR E + +LG P + + + K HV +
Sbjct: 547 VPRLEKWLVHVLGKTPDDYKP-------------RRLRYLQLVGKYI----LMGHVARVM 589

Query: 358 KRGEIFG----LLGPNGAGKSTTFKMMCGL 383
+ G F L G G GKST + GL
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGL 619



Score = 29.3 bits (65), Expect = 0.048
Identities = 11/23 (47%), Positives = 13/23 (56%)

Query: 39 YVTGLVGPDGAGKTTLMRMLAGL 61
Y L G G GK+TL+ L GL
Sbjct: 597 YSVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C0796  RTXTOXIND  62     7e-13    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 62.2 bits (151), Expect = 7e-13
Identities = 42/259 (16%), Positives = 92/259 (35%), Gaps = 25/259 (9%)

Query: 83 ALMQAKAGVSVAQAQYDLMLAGYRDEEIAQAAAAVKQAQAAYDYAQNFYNRQQGLWKSRT 142
Q + + +A+ +LA E + + + + L +
Sbjct: 201 QKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENK 260

Query: 143 ISA--NDLENARSSRDQAQATLKSAQDKLRQYRSGNREQ---DIAQAKASLEQAQAQLAQ 197
N+L +S +Q ++ + SA+++ + + + + Q ++ +LA+
Sbjct: 261 YVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAK 320

Query: 198 AELNLQDSTLIAPSDGTLLTRAV-EPGTVLNEGGTVFTVSLT-RPVWVRAYVDERNLDQA 255
E Q S + AP + V G V+ T+ + + V A V +++
Sbjct: 321 NEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFI 380

Query: 256 QPGRKVLLYTDGRPNKPYH---GQIGFVSPTAEFTPKTVETPDLRTDLVYRLRIVVT--- 309
G+ ++ + P Y G++ ++ A D R LV+ + I +
Sbjct: 381 NVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA--------IEDQRLGLVFNVIISIEENC 432

Query: 310 ----DADDALRQGMPVTVQ 324
+ + L GM VT +
Sbjct: 433 LSTGNKNIPLSSGMAVTAE 451


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0797  HTHTETR  72     9e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 72.4 bits (177), Expect = 9e-18
Identities = 32/220 (14%), Positives = 74/220 (33%), Gaps = 29/220 (13%)

Query: 13 KGEQAKKQLIAAALAQFGEYGMNATT-REIAAQAGQNIAAITYYFGSKEDLYLACAQWIA 71
+ ++ ++ ++ AL F + G+++T+ EIA AG AI ++F K DL+ +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 72 DFIGEQFRPHAEEAERLFAQPQPDRAAIRELILRACRNMIKLLTQDDTVNLSK---FISR 128
IGE E + P + +RE+++ + + + + +
Sbjct: 68 SNIGELEL---EYQAKFPGDP---LSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVG 121

Query: 129 EQLSPTAAYHLVHEQVISPLHSHLTRLIAAW---TGCDASDTRMILHTHALIGEILAFRL 185
E A + + + L I A +I+ I ++
Sbjct: 122 EMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIM--RGYISGLM---- 175

Query: 186 GKETILLRTGWTAFDEEKTELINQTVTCHIDLILQGLSQR 225
W + + + ++ ++L+
Sbjct: 176 --------ENWLFAPQSFD--LKKEARDYVAILLEMYLLC 205


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits  Score  E-value  Comments
UTI89_C0798  SECA  30     0.026    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.026
Identities = 20/67 (29%), Positives = 34/67 (50%), Gaps = 4/67 (5%)

Query: 246 QQVLVFTRTKHGANHLAEQLNKDGIRSAAIHG-NKSQGARTRALADFKSGDIRVLVATDI 304
Q VLV T + + ++ +L K GI+ ++ + A A A + + V +AT++
Sbjct: 450 QPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQAGYPAA---VTIATNM 506

Query: 305 AARGLDI 311
A RG DI
Sbjct: 507 AGRGTDI 513


79  UTI89_C0842  UTI89_C0850  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C0842  1       15       -1.001930  D-alanyl-D-alanine carboxypeptidase fraction C
UTI89_C0843  1       16       -0.484516  DNA-binding transcriptional repressor DeoR
UTI89_C0844  0       14       -0.442996  undecaprenyl pyrophosphate phosphatase
UTI89_C0845  0       13       -0.344608  MdfA/Cmr MFS multidrug transporter
UTI89_C0846  -1      14       -0.807711  hypothetical protein
UTI89_C0847  -1      15       -1.117670  hypothetical protein
UTI89_C0848  -2      14       -0.339236  DEOR-type transcriptional regulator
UTI89_C0849  -1      12       -0.775686  DEOR-type transcriptional regulator
UTI89_C0850  0       12       0.322084   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C0842  BLACTAMASEA  43     8e-07    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 43.2 bits (102), Expect = 8e-07
Identities = 41/201 (20%), Positives = 64/201 (31%), Gaps = 34/201 (16%)

Query: 23 AFLFLFAPTAFAAEQTVEAPSVDARAW----------ILMDYASGKVLAEGNADEKLDPA 72
+ L A A P + I MD ASG+ L ADE+
Sbjct: 7 CIISLLATLPLAV-HASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADERFPMM 65

Query: 73 SLTKIMTSYVVGQALKADKIKLTDMVTVGKDAWATGNPALRGSSVMFLKPGDQVSVADLN 132
S K++ V + A +L + + +P V D ++V +L
Sbjct: 66 STFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSP------VSEKHLADGMTVGELC 119

Query: 133 KGVIIQSGNDACIALADYVAGSQESFIGLMNGYAKKLGLTNTT---FQTVHGLDAPGQF- 188
I S N A L V G + + +++G T ++T PG
Sbjct: 120 AAAITMSDNSAANLLLATVGGPAG-----LTAFLRQIGDNVTRLDRWETELNEALPGDAR 174

Query: 189 --STARDMA------LLGKAL 201
+T MA L + L
Sbjct: 175 DTTTPASMAATLRKLLTSQRL 195


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0845  TCRTETA  41     6e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 41.0 bits (96), Expect = 6e-06
Identities = 60/269 (22%), Positives = 109/269 (40%), Gaps = 23/269 (8%)

Query: 81 LLGPLSDRIGRRPVMLAGVVWFIVTCLAILLAQNIEQFTLLRFLQGISLCFIGAVGYAAI 140
+LG LSDR GRRPV+L + V + A + + R + GI+ GAV A I
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGA-TGAVAGAYI 120

Query: 141 QESFEEAVCIKITALMANVALIAPLLGPLVG---AAWIHVLPWEGMFVLFAALAAISFFG 197
+ + + M+ + GP++G + P F AAL ++F
Sbjct: 121 ADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAP----FFAAAALNGLNFLT 176

Query: 198 LQRAMPETATRIGEKLSLKELGRDYKLVLKNG-RFVAGALALGFVSLPLLAWIAQSP--I 254
+PE+ L + L G VA +A+ F ++ + Q P +
Sbjct: 177 GCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFF----IMQLVGQVPAAL 232

Query: 255 IIITGEQLSSYEYGLLQVPI--FGAL--IAGNLLLARLTSRRTVRSLIIMGGWPIMIGLL 310
+I GE ++ + + + FG L +A ++ + +R R +++G G +
Sbjct: 233 WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYI 292

Query: 311 VAAAATVISSHAYLWMTAGLSIYAFGIGL 339
+ A AT ++ + + + GIG+
Sbjct: 293 LLAFAT----RGWMAFPIMVLLASGGIGM 317


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0848  TCRTETB  34     0.001    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 33.7 bits (77), Expect = 0.001
Identities = 34/150 (22%), Positives = 65/150 (43%), Gaps = 6/150 (4%)

Query: 239 LLIGVVVLAMAFAEGSANDWL-PLLMVDGHGFSP-TSGSLIYAGFTLGMTVGRFTGGWFI 296
+IGV+ + F + + P +M D H S GS+I T+ + + + GG +
Sbjct: 258 FMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILV 317

Query: 297 DRYSRVAVVR-ASALM--GALGIGLIIFVDSAWVA-GVSVVLWGLGASLGFPLTISAASD 352
DR + V+ + L ++ S ++ + VL GL + TI ++S
Sbjct: 318 DRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSL 377

Query: 353 TGPDAPTRVSVVATTGYLAFLVGPPLLGYL 382
+A +S++ T +L+ G ++G L
Sbjct: 378 KQQEAGAGMSLLNFTSFLSEGTGIAIVGGL 407


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0849  HTHTETR  52     2e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 51.6 bits (123), Expect = 2e-10
Identities = 14/81 (17%), Positives = 31/81 (38%)

Query: 12 RRANDPQRREKIIQATLEAVKLYGIHAVTHRKIATLAGVPLGSMTYYFSGIDELLLEAFS 71
+ + R+ I+ L G+ + + +IA AGV G++ ++F +L E +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 72 RFTEIMSRQYQAFFSDVSDAP 92
+ + + P
Sbjct: 65 LSESNIGELELEYQAKFPGDP 85


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0850  TCRTETA  32     0.006    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 32.1 bits (73), Expect = 0.006
Identities = 21/106 (19%), Positives = 34/106 (32%), Gaps = 6/106 (5%)

Query: 394 LMIGMITFQFSTFSFGMGNAAGLLFAGIML-GFMRANHPTFG-YIPQ--GALSMVKEFGL 449
L++ + +L+ G ++ G A G YI + FG
Sbjct: 76 LLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGF 135

Query: 450 MVFMAGVGLSAGSGINNGLGAIGGQM--LIAGLIVSLVPVVICFLF 493
M G G+ AG + +G A + L + CFL
Sbjct: 136 MSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLL 181


80  UTI89_C0867  UTI89_C0872  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias   Product
UTI89_C0867  -1      13       2.924503  arginine transporter ATP-binding subunit
UTI89_C0868  -1      13       3.442845  lipoprotein
UTI89_C0869  -1      14       3.117665  hypothetical protein
UTI89_C0870  -2      13       3.220527  N-acetylmuramoyl-L-alanine amidase
UTI89_C0871  -3      14       2.978078  hypothetical protein
UTI89_C0872  -3      12       2.275338  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C0867  PF05272  30     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.010
Identities = 9/18 (50%), Positives = 12/18 (66%)

Query: 31 LVLLGPSGAGKSSLLRVL 48
+VL G G GKS+L+ L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
UTI89_C0870  ECOLIPORIN  29     0.027    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 28.7 bits (64), Expect = 0.027
Identities = 20/54 (37%), Positives = 26/54 (48%), Gaps = 9/54 (16%)

Query: 2 RRVFWLVAAALLLAGCTGEKGIVEKEGYQLDTRRQAQAAYPRIKVLVIHYTADD 55
R+V LV ALL AG I K+G +LD Y ++ L HY +DD
Sbjct: 3 RKVLALVIPALLAAGAAHAAEIYNKDGNKLDL-------YGKVDGL--HYFSDD 47


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0871  NUCEPIMERASE  75     2e-17    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 75.2 bits (185), Expect = 2e-17
Identities = 70/363 (19%), Positives = 123/363 (33%), Gaps = 65/363 (17%)

Query: 22 MKVLVTGATSGLGRNAVEFLCQKGISVRA---------TGRNEAMGKLLEKMGAEFVPAD 72
MK LVTGA +G + + L + G V +A +LL + G +F D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 73 LTELVSSQAKVMLAGIDTLWHCS-------SFTSPWGTQQAFDLANVRATRRLGEWAVAW 125
L + + ++ S +P A+ +N+ + E
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPH----AYADSNLTGFLNILEGCRHN 116

Query: 126 GVRNFIHISSPSLYFDYHHHRDIKEDFRPHRFANEFARSKAASEEVINMLSQANPQTRFT 185
+++ ++ SS S+Y + D + +A +K A+E + + S T
Sbjct: 117 KIQHLLYASSSSVYGL-NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSH-LYGLPAT 174

Query: 186 ILRPQSLFGPHDK--VFIPRLAHMMHHYGSILLPHGGSALVDMTYYENAVHAMWLASQEA 243
LR +++GP + + + + M SI + + G D TY ++ A+
Sbjct: 175 GLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVI 234

Query: 244 CDKLPS--------------GRVYNITNGEHRTLRSIVQKLIDELNIDCRIRSVPYPMLD 289
RVYNI N L +Q L D L I+ + +P D
Sbjct: 235 PHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGD 294

Query: 290 MIARSMERLGRKSAKEPPLTHYGVSKLNFDFTLDITRAQEELGYQPVLTLDEGIEKTAAW 349
+ T D E +G+ P T+ +G++ W
Sbjct: 295 V----------------LETS-----------ADTKALYEVIGFTPETTVKDGVKNFVNW 327

Query: 350 LRD 352
RD
Sbjct: 328 YRD 330


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C0872  NUCEPIMERASE  55     3e-10    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 54.8 bits (132), Expect = 3e-10
Identities = 29/125 (23%), Positives = 52/125 (41%), Gaps = 17/125 (13%)

Query: 29 RILVLGASGYIGQHLVRTLSQQGHQLLA---------AARHVDRLAKLQLANVSCHKVDL 79
+ LV GA+G+IG H+ + L + GHQ++ + RL L HK+DL
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 80 SWPDNLPALLQD--IDTVYFLVH------SMGEGGDFIAQERQVALNVRDALREVPVKQL 131
+ + + L + V+ H S+ + LN+ + R ++ L
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 132 IFLSS 136
++ SS
Sbjct: 122 LYASS 126


81  UTI89_C1201  UTI89_C1209  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias   Product
UTI89_C1201  1       13       2.424744  flagellar hook protein FlgE
UTI89_C1202  -1      12       2.420118  flagellar basal body rod protein FlgF
UTI89_C1203  -1      9        1.250356  flagellar basal body rod protein FlgG
UTI89_C1204  0       13       2.268337  flagellar basal body L-ring protein
UTI89_C1205  0       13       2.053023  flagellar basal body P-ring biosynthesis protein
UTI89_C1206  1       13       1.737144  flagellar rod assembly protein/muramidase FlgJ
UTI89_C1207  1       13       1.289713  flagellar hook-associated protein FlgK
UTI89_C1208  3       15       1.222002  flagellar hook-associated protein FlgL
UTI89_C1209  4       17       1.622309  ribonuclease E
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
UTI89_C1201  FLGHOOKAP1  41     6e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.1 bits (96), Expect = 6e-06
Identities = 17/49 (34%), Positives = 29/49 (59%)

Query: 353 TLTNGALEASNVDLSKELVNMIVAQRNYQSNAQTIKTQDQILNTLVNLR 401
L+N S V+L +E N+ Q+ Y +NAQ ++T + I + L+N+R
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 36.9 bits (85), Expect = 1e-04
Identities = 22/56 (39%), Positives = 30/56 (53%), Gaps = 4/56 (7%)

Query: 6 AVSGLNAAATNLDVIGNNIANSATYGFKSGTASFAD----MFAGSKVGLGVKVAGI 57
A+SGLNAA L+ NNI++ G+ T A + AG VG GV V+G+
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
UTI89_C1203  FLGHOOKAP1  44     4e-07    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 4e-07
Identities = 18/81 (22%), Positives = 36/81 (44%), Gaps = 14/81 (17%)

Query: 3 SSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQTTL 62
S + A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTL 47

Query: 63 PSGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 48 GAGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 21/41 (51%)

Query: 220 ETSNVNVAEELVNMIQVQRAYEINSKAVSTTDQMLQKLTQL 260
S VN+ EE N+ + Q+ Y N++ + T + + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C1204  FLGLRINGFLGH  350    e-126    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 350 bits (900), Expect = e-126
Identities = 232/232 (100%), Positives = 232/232 (100%)

Query: 6 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 65
MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY
Sbjct: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60

Query: 66 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 125
GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120

Query: 126 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 185
RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF
Sbjct: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180

Query: 186 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 237
SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM
Sbjct: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1205  FLGPRINGFLGI  427  e-152  Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 427 bits (1099), Expect = e-152
Identities = 157/363 (43%), Positives = 213/363 (58%), Gaps = 9/363 (2%)

Query: 5 FLSALILLLVTTAAQAERIRDLTSVQGVRQNSLIGYGLVVGLDGTGDQTTQTPFTTQTLN 64
F + L A RI+D+ S+Q R N LIGYGLVVGL GTGD +PFT Q++
Sbjct: 13 FSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMR 72

Query: 65 NMLSQLGITVPTGTNMQLKNVAAVMVTASLPPFGRQGQTIDVVVSSMGNAKSLRGGTLLM 124
ML LGIT G + KN+AAVMVTA+LPPF G +DV VSS+G+A SLRGG L+M
Sbjct: 73 AMLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIM 131

Query: 125 TPLKGVDSQVYALAQGNILVGGAGASAGGSSVQVNQLNGGRITNGAVIERELPSQFGVGN 184
T L G D Q+YA+AQG ++V G A +++ R+ NGA+IERELPS+F
Sbjct: 132 TSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSV 191

Query: 185 TLNLQLNDEDFSMAQQIADTINRVR----GYGSATALDARTIQVRVPSGNSSQVRFLADI 240
L LQL + DFS A ++AD +N G A D++ I V+ P + R +A+I
Sbjct: 192 NLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRV-ADLTRLMAEI 250

Query: 241 QNMQVNVTPQDAKVVINSRTGSVVMNREVTLDSCAVAQGNLSVTVNRQANVSQPDTPFGG 300
+N+ V T AKVVIN RTG++V+ +V + AV+ G L+V V V QP PF
Sbjct: 251 ENLTVE-TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSR 308

Query: 301 GQTVVTPQTQIDLRQSGGSLQSVRSSASLNNVVRALNALGATPMDLMSILQSMQSAGCLR 360
GQT V PQT I Q G + ++ L +V LN++G +++ILQ ++SAG L+
Sbjct: 309 GQTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQ 367

Query: 361 AKL 363
A+L
Sbjct: 368 AEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1206  FLGFLGJ  507  0.0  Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 507 bits (1306), Expect = 0.0
Identities = 310/313 (99%), Positives = 311/313 (99%)

Query: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60
MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG
Sbjct: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60

Query: 61 LFSSERTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120
LFSSE TRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET
Sbjct: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120

Query: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGNSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180
VVRYQNQALSQLVQKAVPRNYDDSLPG+SKAFLAQLSLPAQLASQQSGVPHHLILAQAAL
Sbjct: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180

Query: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240
ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL
Sbjct: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240

Query: 241 EALSDYVGLLTRNPRYAAVTTAVSAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300
EALSDYVGLLTRNPRYAAVTTA SAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK
Sbjct: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300

Query: 301 VSKTYSMNIDNLF 313
VSKTYSMNIDNLF
Sbjct: 301 VSKTYSMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1207  FLGHOOKAP1  682  0.0  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 682 bits (1761), Expect = 0.0
Identities = 545/546 (99%), Positives = 545/546 (99%)

Query: 2 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 61
SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 62 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 121
GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 122 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 181
SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 182 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 241
QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 242 RQLAAVPSSADPSRTTVAYVDRTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 301
RQLAAVPSSADPSRTTVAYVD TAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 302 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 361
ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 360

Query: 362 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 421
YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV
Sbjct: 361 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 420

Query: 422 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 481
NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN
Sbjct: 421 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 480

Query: 482 KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 541
KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD
Sbjct: 481 KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 540

Query: 542 ALINIR 547
ALINIR
Sbjct: 541 ALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1208  FLAGELLIN  46  8e-08  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 46.2 bits (109), Expect = 8e-08
Identities = 42/226 (18%), Positives = 80/226 (35%), Gaps = 9/226 (3%)

Query: 7 MMYQQNMRGITNSQAEWMKYGEQMSTGKRVVNPSDDPIAASQAVVLSQAQAQNSQYTLAR 66
++ Q N+ +S + + E++S+G R+ + DD + A + +Q +
Sbjct: 11 LLTQNNLNKSQSSLSSAI---ERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNA 67

Query: 67 TFATQKVSLEESVLSQVTTAIQNAQEKIVYASNGTLSDDDRASLATDIQGLRDQLLNLAN 126
E L+++ +Q +E V A+NGT SD D S+ +IQ +++ ++N
Sbjct: 68 NDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSN 127

Query: 127 TTDGNGRYIFAGYKTETAPFSEADGDYVGGTESIKQQVDASRSMVIGHTGDKIFDSITSN 186
T NG + + DG E+I + +G G + +
Sbjct: 128 QTQFNGVKVLSQDNQMKIQVGANDG------ETITIDLQKIDVKSLGLDGFNVNGPKEAT 181

Query: 187 AVAEPDGSASETNLFAMLDSAIAALKTPVADSEADKETAAAALDKT 232
+ T A + + TA DK
Sbjct: 182 VGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKV 227


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1209  IGASERPTASE  64  3e-12  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 63.9 bits (155), Expect = 3e-12
Identities = 41/226 (18%), Positives = 79/226 (34%), Gaps = 12/226 (5%)

Query: 590 PAEQSAPKAEAKPERQQDRR-----KPRQNNRRDRNERRDTRSERTEGSDNREENRRNRR 644
P+ S + A+ + N ++++++ D E +NR
Sbjct: 1008 PSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNRE 1067

Query: 645 QAQQQTAETRESRQQAEV------TEKARTTDEQQAPRRERSRRRNDDKRQAQQEAKALN 698
A++ + + + Q EV T++ +TT+ ++ E+ + + + Q+ K +
Sbjct: 1068 VAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPK-VT 1126

Query: 699 VEEQSVQETEQEERVRPVQPRRKQRQLNQKVRYEQSVAEEAVVAPVVEETVAAEPIVQEA 758
+ QE + + + R +N K Q+ P E + E V E+
Sbjct: 1127 SQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTES 1186

Query: 759 PAPRTELVKVPLPVVAQAAPEQQEENNADNRDNGGMPRRSRRSPRH 804
T V P A Q N+ + RRS RS H
Sbjct: 1187 TTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPH 1232



Score = 62.0 bits (150), Expect = 1e-11
Identities = 46/287 (16%), Positives = 91/287 (31%), Gaps = 35/287 (12%)

Query: 513 PSEEEFAERKRPEQPALATFAMPDVPPAPT-PAEPAAPVVAPAPKAATATPASPAQPGLL 571
P E+ + DVP P+ E A AP P A ATP+ +
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTE---- 1038

Query: 572 SRFFGALKALFSGGEETKPAEQSAPKAEAKPERQQDRRKP-RQNNRRDRNERRDTRSER- 629
AE S +++ + +QD + QN + + + ++
Sbjct: 1039 -----------------TVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQ 1081

Query: 630 -TEGSDNREENRRNRRQAQQQTAETRESRQQAEVTEKARTTDEQQAPRRERSRRRNDDKR 688
E + + E + + ++TA + + TEK + + + + + +
Sbjct: 1082 TNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQP 1141

Query: 689 QAQ---QEAKALNVEEQSVQETEQEERVRPVQPRRKQRQLNQKVRYEQSV--AEEAVVAP 743
QA+ + +N++E Q + +P + + Q V +V V P
Sbjct: 1142 QAEPARENDPTVNIKEPQSQTNTTADTEQPA--KETSSNVEQPVTESTTVNTGNSVVENP 1199

Query: 744 VVEETVAAEPIVQEAPAPRTELVKVPLPVVAQAAPEQQEENNADNRD 790
+P V + K ++ P E + D
Sbjct: 1200 ENTTPATTQPTVNSESS---NKPKNRHRRSVRSVPHNVEPATTSSND 1243


82  UTI89_C1416  UTI89_C1420  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C1416  -1      13       1.609095   hypothetical protein
UTI89_C1417  -1      20       2.028631   transcriptional regulator NarL
UTI89_C1418  -1      21       2.349772   nitrate/nitrite sensor protein NarX
UTI89_C1419  -1      27       2.612706   hypothetical protein
UTI89_C1420  -1      23       2.018358   nitrite extrusion protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1416  INTIMIN  256  1e-78  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 256 bits (656), Expect = 1e-78
Identities = 119/378 (31%), Positives = 195/378 (51%), Gaps = 21/378 (5%)

Query: 79 GEQAKAFALGKVRDALSQQVNQHVESWLSPWGNASVDVKVDNEGHFTGSRGSWFVPLQDN 138
G+ AK ALG + Q + +++WL +G A V+++ N F GS + +P D+
Sbjct: 184 GDYAKDTALGIAGN----QASSQLQAWLQHYGTAEVNLQSGNN--FDGSSLDFLLPFYDS 237

Query: 139 DRYLTWSQLGLTQQDDGLVSNVGVGQRWARGSWLVGYNTFYDNLLDENLQRAGFGAEAWG 198
++ L + Q+G D +N+G GQR+ ++GYN F D + R G G E W
Sbjct: 238 EKMLAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDFSGDNTRLGIGGEYWR 297

Query: 199 EYLRLSANFYQPFAAWHE--QTATQEQRMARGYDLTARMRMPFYQHLNTSVSVEQYFGDR 256
+Y + S N Y + WHE ++R A G+D+ +P Y L + EQY+GD
Sbjct: 298 DYFKSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDN 357

Query: 257 VDLFNSGTGYHNPVALSLGLNYTPVPLVTVTAQHKQGESGENQNNLGLNLNYRFGVPLKK 316
V LFNS NP A ++G+NYTP+PLVT+ ++ G EN + Y+F P +
Sbjct: 358 VALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQ 417

Query: 317 QLSAGEVAESQSLRGSRYDNPQRNNLPTLEYRQRKTLTVFLATPPWDLKPGETVPLKLQI 376
Q+ V E ++L GSRYD QRNN LEY+++ L++ + + T ++L +
Sbjct: 418 QIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNI-PHDINGTERSTQKIQLIV 476

Query: 377 RSRYGIRQLIWQGDTQILS-----LTPGAQANSEEGWTLIMPDWQNGEGASNHWRLSVVV 431
+S+YG+ +++W D+ + S G+Q S + + I+P + +G SN ++++
Sbjct: 477 KSKYGLDRIVWD-DSALRSQGGQIQHSGSQ--SAQDYQAILPAYV--QGGSNVYKVTARA 531

Query: 432 EDNQGQRVSSNEITLTLV 449
D G SSN + LT+
Sbjct: 532 YDRNGN--SSNNVLLTIT 547


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1417  HTHFIS  75  8e-18  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.3 bits (185), Expect = 8e-18
Identities = 32/117 (27%), Positives = 56/117 (47%), Gaps = 2/117 (1%)

Query: 29 ATILLIDDHPMLRTGVKQLISMAPDITVVGEASNGEQGIELAESLDPDLILLDLNMPGMN 88
ATIL+ DD +RT + Q +S A + SN + D DL++ D+ MP N
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRI--TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 89 GLETLDKLREKSLSGRIVVFSVSNHEEDVVTALKRGADGYLLKDMEPEDLLKALHQA 145
+ L ++++ ++V S N + A ++GA YL K + +L+ + +A
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1418  PF06580  53  2e-09  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 52.9 bits (127), Expect = 2e-09
Identities = 36/172 (20%), Positives = 73/172 (42%), Gaps = 23/172 (13%)

Query: 424 PESSRELLSQIRNELNASWVQLRELLTTFRLQLTEPGLRPALEASCEEYSAKFGFPVKLD 483
P +RE+L+ + + S + +LT +++ + S +F ++ +
Sbjct: 190 PTKAREMLTSLSELMRYSLRYSNARQVSLADELT------VVDSYLQLASIQFEDRLQFE 243

Query: 484 YQLPPRL----VPSHQAIHLLQIAREALSNALKH-----SQASEVVVTVAQNDNQVKLTV 534
Q+ P + VP L+Q E N +KH Q ++++ +++ V L V
Sbjct: 244 NQINPAIMDVQVPPM----LVQTLVE---NGIKHGIAQLPQGGKILLKGTKDNGTVTLEV 296

Query: 535 QDNGCGVPENAIRSNHYGMIIMRDRAQSLRG-DCRVRRRESGGTEVVVTFIP 585
++ G +N S G+ +R+R Q L G + +++ E G + IP
Sbjct: 297 ENTGSLALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1420  ACRIFLAVINRP  31  0.011  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 31.0 bits (70), Expect = 0.011
Identities = 35/166 (21%), Positives = 60/166 (36%), Gaps = 22/166 (13%)

Query: 260 IMSLLYLATFGSFIGFSAGFAMLSKTQFPDVQILQYAFFGPFIGALARSA---GGALSDR 316
I+S + L+ + I A A L K + + FFG F S ++
Sbjct: 474 IVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKI 533

Query: 317 LGGTRVTLVNFILMAIFSGLLFLTLPTD----GQGGSFMAFFAVFLALFLTAGLGSGSTF 372
LG T L+ + L+ +LFL LP+ G F+ L +G+T
Sbjct: 534 LGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTM----------IQLPAGATQ 583

Query: 373 QMISVIFRKLTMDRVKAEGGSDER-----AMREAATDTAAALGFIS 413
+ + ++T +K E + E + A + F+S
Sbjct: 584 ERTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVS 629


83  UTI89_C1515  UTI89_C1522  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C1515  0       20       3.744767   tail component of prophage CP-933K
UTI89_C1516  0       21       3.118503   hypothetical protein
UTI89_C1517  0       20       2.817431   prophage tail component
UTI89_C1518  1       25       1.081493   hypothetical protein
UTI89_C1519  2       27       0.561871   tail fiber protein
UTI89_C1520  2       29       -2.003370  hypothetical protein
UTI89_C1521  5       33       -6.785414  phage-related tail fiber assembly protein G
UTI89_C1522  4       34       -7.932531  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1515  PF06291  27  0.032  Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 26.5 bits (58), Expect = 0.032
Identities = 13/37 (35%), Positives = 17/37 (45%), Gaps = 5/37 (13%)

Query: 128 ILFSMGAAMTLGGVAQML-----APKARTPRTQTTDN 159
+LFS AM + G AQ P A TP+ T +
Sbjct: 9 MLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHH 45


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1516  PHAGEIV  30  0.001  Gene IV protein signature.
		>PHAGEIV#Gene IV protein signature.

Length = 426

Score = 30.3 bits (68), Expect = 0.001
Identities = 15/49 (30%), Positives = 30/49 (61%), Gaps = 2/49 (4%)

Query: 35 KNIDELSGCISRQWAGNGTPITSLPIEN-GVSL-LVPQAMGGYDVVLDI 81
+N+ ++G ++ + A P ++ +N G+S+ + P AM G ++VLDI
Sbjct: 289 QNVPFITGRVTGESANVNNPFQTVERQNVGISMSVFPVAMAGGNIVLDI 337


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1518  ENTEROVIROMP  138  4e-44  Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 138 bits (350), Expect = 4e-44
Identities = 63/200 (31%), Positives = 98/200 (49%), Gaps = 30/200 (15%)

Query: 1 MRKVCAVILSAAICLSVSGAPAWASEHQSTLSAGYLHARTNAPGSDNLNGINVKYRYEFT 60
M+K+ + AA+ +G A ST++ GY + + + G N+KYRYE
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAA---TSTVTGGYAQSDAQGQMNK-MGGFNLKYRYEED 56

Query: 61 DA-LGLITSFSYANAEDEQKTHYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYSMAGV 119
++ LG+I SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV
Sbjct: 57 NSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 AYSRVSTFSGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVTIDLAYE 179
Y + T T+ HD S+ ++GAG+QFNP E+V +D +YE
Sbjct: 109 GYGKFQT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYE 151

Query: 180 GSGSGDWRTDAFIVGIGYRF 199
S +I G+GYRF
Sbjct: 152 QSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1519  CHANLCOLICIN  46  8e-07  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 45.8 bits (108), Expect = 8e-07
Identities = 54/319 (16%), Positives = 116/319 (36%)

Query: 152 ARAASTSAGQAASSAQSASSSAGTASTKAREAAKSAAAAESSKSAAATSASAAKTSETNA 211
+ S S AA A + S+A T+A +AA++ AAAE+ A A + + +
Sbjct: 39 GKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALTQRLKDIV 98

Query: 212 AASQQSAATSASTATTKASEAATSARDASASKEAAKSSETNAASSASSAASSATAAANSA 271
+ + A+ +AT A + + AK+ E + ++ + A
Sbjct: 99 NEALRHNASRTPSATELAHANNAAMQAEDERLRLAKAEEKARKEAEAAEKAFQEAEQRRK 158

Query: 272 KAAKTSETNARSSETAAGQSASAAADSKTAAALSASAASTSAGQASASATAAGKSAESAA 331
+ + R + A + AA S+ A A+ + SA Q+ ++
Sbjct: 159 EIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSR 218

Query: 332 SSASTATTKAGEAAVQASAAARSASAAKTSKTNAKASETSAESSKTAAASSASSAASSAS 391
S+S A + + ++AK + + + S ++ A
Sbjct: 219 LSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRV 278

Query: 392 SASASKDEATRQASAAKGSATTASTKATEAAGSATAAAQSKSTAESAATRAETAAKRAED 451
A ++E +Q +A++ + T+ + + + +++ + AE K+A++
Sbjct: 279 GAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARVHEAEENLKKAQN 338

Query: 452 IASAVALEDASTTKKGIVQ 470
++DA Q
Sbjct: 339 NLLNSQIKDAVDATVSFYQ 357



Score = 36.6 bits (84), Expect = 5e-04
Identities = 66/358 (18%), Positives = 125/358 (34%), Gaps = 30/358 (8%)

Query: 313 AGQASASATAAGKSAESAASSASTATTKAGEAAVQASAAARSASAAKTSKTNAKASETSA 372
+G KS SAA A+ + A QA AAR+ +AA ++ A
Sbjct: 32 SGSGGGGGKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAA--------EAQAKA 83

Query: 373 ESSKTAAASSASSAASSASSASASKDEATRQASAAKGSATTASTKATEAAGSATAAAQSK 432
++++ A + A +AS+ + + + A+ A +A A+++
Sbjct: 84 KANRDALTQRLKDIVNEALRHNASRTPSATELA-------HANNAAMQAEDERLRLAKAE 136

Query: 433 STAESAATRAETAAKRAEDIASAVALEDASTTKKGIVQLSSATNSTSESLAATPKAVKAA 492
A A AE A + AE + E A T ++ ++L+ A +L+ KAV+ A
Sbjct: 137 EKARKEAEAAEKAFQEAEQRRKEIEREKAETERQ--LKLAEAEEKRLAALSEEAKAVEIA 194

Query: 493 YELANGKYTAQDATTAQKGIVQLSNATNSTSEMLAATPKSVKAAYDLANGKYTAQDATTA 552
+ + AQ +V++ + + L+++ + A GK +A
Sbjct: 195 ---------QKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASA 245

Query: 553 QKGIVQLSSATNSTSEMLAATPKSVKAAYDLANGKYTAQDAT-TAQKGIVQLSSATNSAS 611
+ + A P + ++ + A QK + + N +
Sbjct: 246 K---YKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASETRINRIN 302

Query: 612 ETLAATPKAVKAANNNANGRVPSARKVNGKALSADITLTPKDIGTLNSTTMSFSGGAG 669
+ KA+ +NN N + + A L I T+SF
Sbjct: 303 ADITQIQKAISQVSNNRNAGIARVHEAEENLKKAQNNLLNSQIKDAVDATVSFYQTLT 360



Score = 31.2 bits (70), Expect = 0.022
Identities = 52/321 (16%), Positives = 99/321 (30%), Gaps = 23/321 (7%)

Query: 114 SRNASAVAQNTAAAKKSASDASASASEAATHATDAAASARAASTSAGQAASSAQSASSSA 173
S++ S+ A + A +A A +AA A A A+A + + +
Sbjct: 43 SKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALTQRLKDIVNEAL 102

Query: 174 GTASTKAREAAKSAAAAESSKSAAATSASAAKTSETNAAASQQSAATSASTATTKASEAA 233
+++ A + A A ++ A AK + A + A KA + A
Sbjct: 103 RHNASRTPSATELAHANNAAMQAEDERLRLAK---------AEEKARKEAEAAEKAFQEA 153

Query: 234 TSARDASASKEAAKSSETNAAS------SASSAASSATAAANSAKAAKTSETNARSSE-T 286
R ++A + A +A S + A A +A SE E
Sbjct: 154 EQRRKEIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIK 213

Query: 287 AAGQSASAAADSKTAAALSASAASTSAGQASASATAAGKSAESAASSASTA-TTKAGEAA 345
S++ ++ A + + QASA + + + A+ + A
Sbjct: 214 TLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEA 273

Query: 346 VQASAAARSASAAKTSKTNAKASETSAESSKTAAASSASSAASSASSAS------ASKDE 399
+ A K + A + + ++ A S S+ +A A ++
Sbjct: 274 TRRRVGAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARVHEAEENL 333

Query: 400 ATRQASAAKGSATTASTKATE 420
Q + A
Sbjct: 334 KKAQNNLLNSQIKDAVDATVS 354


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1522  LUXSPROTEIN  30  0.005  Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 29.9 bits (67), Expect = 0.005
Identities = 17/66 (25%), Positives = 30/66 (45%), Gaps = 7/66 (10%)

Query: 40 TKEHLLPHFL-EHVGNNHLDI------GVGTGFYLTHVPESSLISLMDLNEASLNAASTR 92
T EHL F+ H+ + ++I G TGFY++ + S + D A++
Sbjct: 54 TLEHLYAGFMRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKV 113

Query: 93 AGESKI 98
++KI
Sbjct: 114 ENQNKI 119


84  UTI89_C1558  UTI89_C1568  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C1558  -1      14       -0.789019  RNase II stability modulator
UTI89_C1559  -2      12       0.213425   exoribonuclease II
UTI89_C1560  -2      11       0.252235   hypothetical protein
UTI89_C1561  -1      11       -0.053675  enoyl-ACP reductase
UTI89_C1562  -2      13       -0.226068  transcriptional repressor
UTI89_C1563  -2      12       0.172388   acriflavine resistance protein A
UTI89_C1564  -2      12       0.861676   acriflavine resistance protein B
UTI89_C1565  -2      13       0.764824   outer membrane channel protein
UTI89_C1566  -2      14       0.762240   membrane transport protein
UTI89_C1567  -2      13       1.175875   peptide transport system ATP-binding protein
UTI89_C1568  -2      13       1.173973   peptide ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1558  PF08280  30  0.043  M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 29.8 bits (67), Expect = 0.043
Identities = 21/105 (20%), Positives = 36/105 (34%), Gaps = 2/105 (1%)

Query: 526 PIDVELTESCLIENDELALSVIQQFSRLGAQVHLDDFGTGYSSLSQLARFPIDAIKLDQV 585
P+ V S I L S + FS + + ++ Q+ D +
Sbjct: 425 PLVVVFVASNFINAHLLTDSFPRYFS--DKSIDFHSYYLLQDNVYQIPDLKPDLVITHSQ 482

Query: 586 FVRDIHKQPVSQSLVRAIVAVAQALNLQVIAEGVESAKEDAFLTK 630
+ +H + V I L++Q + V+ K A LTK
Sbjct: 483 LIPFVHHELTKGIAVAEISFDESILSIQELMYQVKEEKFQADLTK 527


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1561  DHBDHDRGNASE  50  1e-09  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 50.4 bits (120), Expect = 1e-09
Identities = 51/260 (19%), Positives = 98/260 (37%), Gaps = 22/260 (8%)

Query: 4 LSGKRILVTGVASKLSIAYGIAQAMHREGAEL-AFTYQNDKLKGRVEEFAAQLGSDIVLQ 62
+ GK +TG A I +A+ + +GA + A Y +KL+ V A+
Sbjct: 6 IEGKIAFITGAAQ--GIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 CDVAEDASIDTMFAELGKVWPKFDGFVHSIGF---APGDQLDGDYVNAVTREGFKIAHDI 119
DV + A+ID + A + + D V+ G L + A F +
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEAT----FSVN--- 116

Query: 120 SSYSFVAMAKACRSMLNP-GSALLTLSYLGAERAIPNYNVMGLAKASLEANVRYMANAMG 178
S+ F A + M++ +++T+ A + +KA+ + + +
Sbjct: 117 STGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELA 176

Query: 179 PEGVRVNAISAGPIRTLAASGI--------KDFRKMLAHCEAVTPIRRTVTIEDVGNSAA 230
+R N +S G T + + + L + P+++ D+ ++
Sbjct: 177 EYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVL 236

Query: 231 FLCSDLSAGISGEVVHVDGG 250
FL S + I+ + VDGG
Sbjct: 237 FLVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1562  HTHTETR  57  4e-12  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 56.9 bits (137), Expect = 4e-12
Identities = 17/65 (26%), Positives = 33/65 (50%)

Query: 34 MTSKLEIRHKQRQDEIINAARRCFRRCGFHAASMSQIASEAQLSVGQIYRYFANKDAIIE 93
M K + ++ + I++ A R F + G + S+ +IA A ++ G IY +F +K +
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 94 EMVRR 98
E+
Sbjct: 61 EIWEL 65


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1563  RTXTOXIND  48  4e-08  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 47.9 bits (114), Expect = 4e-08
Identities = 19/70 (27%), Positives = 39/70 (55%), Gaps = 1/70 (1%)

Query: 52 PVSVVSELTGR-TSAALSAEVRPQVGGIIQKRLFKEGDLVKAGQPLYQIDAASYQAAWNE 110
V +V+ G+ T + S E++P I+++ + KEG+ V+ G L ++ A +A +
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLK 138

Query: 111 ARAALQQAQA 120
+++L QA+
Sbjct: 139 TQSSLLQARL 148



Score = 30.6 bits (69), Expect = 0.010
Identities = 15/116 (12%), Positives = 32/116 (27%), Gaps = 9/116 (7%)

Query: 94 QPLYQIDAASYQAAWN--EARAALQQAQALVKADCQKAQRYTRLVKENGVSQQDADDAQS 151
L A + A + K+ ++ + KE Q ++
Sbjct: 241 SSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEE--YQLVTQLFKN 298

Query: 152 TCAQDKASVEAKKAALET----ARINLDWTTVTAPISGRI-GISSVTPGALVTASQ 202
L + + AP+S ++ + T G +VT ++
Sbjct: 299 EILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE 354


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1564  ACRIFLAVINRP  1161  0.0  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1161 bits (3004), Expect = 0.0
Identities = 583/1033 (56%), Positives = 760/1033 (73%), Gaps = 6/1033 (0%)

Query: 3 SRFFVRRPVFAWVIAILIMLAGILAIRTLPVAQYPDVAPPTIKISATYTGASAETLENSV 62
+ FF+RRP+FAWV+AI++M+AG LAI LPVAQYP +APP + +SA Y GA A+T++++V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 63 TQVIEQQLTGLDNLLYFSSTSSSDGSVSINVTFEQGTDPDTAQVQVQNKIQQAESRLPSE 122
TQVIEQ + G+DNL+Y SSTS S GSV+I +TF+ GTDPD AQVQVQNK+Q A LP E
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 123 VQQTGVTVEKSQSNFLLIAAVYDTTDKASSSDIADWLVSNVQDPLARVEGVGSLQVFGAE 182
VQQ G++VEKS S++L++A + DI+D++ SNV+D L+R+ GVG +Q+FGA+
Sbjct: 122 VQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQ 181

Query: 183 YAMRIWLDPAKLASYSLMPSDVQSAIEAQNVQVTAGKIGALPSPNTQQLTATVRAQSRLQ 242
YAMRIWLD L Y L P DV + ++ QN Q+ AG++G P+ QQL A++ AQ+R +
Sbjct: 182 YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 243 TVDQFKNIIVKSQSDGAVVRIKDVARVEMGSEDYTAIGKLNGHPSAGVAVMLSPGANALN 302
++F + ++ SDG+VVR+KDVARVE+G E+Y I ++NG P+AG+ + L+ GANAL+
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 303 TATLVKDKIAEFQRNMPQGYDIAYPKDSTEFIKISVEDVIQTLFEAIVLVVCVMYLFLQN 362
TA +K K+AE Q PQG + YP D+T F+++S+ +V++TLFEAI+LV VMYLFLQN
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 363 LRATLIPALAVPVVLLGTFGVLALFGYSINTLTLFAMVLAIGLLVDDAIVVVENVERIMR 422
+RATLIP +AVPVVLLGTF +LA FGYSINTLT+F MVLAIGLLVDDAIVVVENVER+M
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 423 DEGLPAREATEKSMGEISGALVAIALVLSAVFLPMAFFGGSTGVIYRQFSITIISAMLLS 482
++ LP +EATEKSM +I GALV IA+VLSAVF+PMAFFGGSTG IYRQFSITI+SAM LS
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 483 VVVALTLTPALCGSVL----QHVPPHKKGFFGAFNRFYRRTEDKYQRGVIYVLRRAARTM 538
V+VAL LTPALC ++L +K GFFG FN + + + Y V +L R +
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 539 GLYVVLGGGMALMMWKLPGSFLPTEDQGEIMVQYTLPAGATAARTAEVNRQIVDWFLINE 598
+Y ++ GM ++ +LP SFLP EDQG + LPAGAT RT +V Q+ D++L NE
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 599 KANTDVIFTVDGFSFSGSGQNTGMAFVSLKNWSQRKGAENTAQAIALRATKELGTIRDAT 658
KAN + +FTV+GFSFSG QN GMAFVSLK W +R G EN+A+A+ RA ELG IRD
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 659 VFAMTPPAVDGLGQSNGFTFELLANGGADRETLLQMRNQLIEKANQSP-ELHSVRANDLP 717
V PA+ LG + GF FEL+ G + L Q RNQL+ A Q P L SVR N L
Sbjct: 662 VIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLE 721

Query: 718 QMPQLQVDIDSNKAVSLGLSLNDVTDTLSSAWGGTYVNDFIDRGRVKKVYIQGDSEFRSA 777
Q ++++D KA +LG+SL+D+ T+S+A GGTYVNDFIDRGRVKK+Y+Q D++FR
Sbjct: 722 DTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRML 781

Query: 778 PSDLGKWFVRGSDNAMTPFSAFATTRWLYGPERLVRYNGSAAYEIQGENATGFSSGDAMT 837
P D+ K +VR ++ M PFSAF T+ W+YG RL RYNG + EIQGE A G SSGDAM
Sbjct: 782 PEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMA 841

Query: 838 KMEELANSLPAGTTWAWSGLSLQEKLASGQALSLYAVSILVVFLCLAALYESWSVPFSVI 897
ME LA+ LPAG + W+G+S QE+L+ QA +L A+S +VVFLCLAALYESWS+P SV+
Sbjct: 842 LMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVM 901

Query: 898 LVIPLGLLGAALAAWMRDLNNDVYFQVALLTTIGLSSKNAILIVEFA-EAAVAEGYSLSR 956
LV+PLG++G LAA + + NDVYF V LLTTIGLS+KNAILIVEFA + EG +
Sbjct: 902 LVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVE 961

Query: 957 AALRAAQTRLRPIIMTSLAFIAGVMPLAIATGAGANSRIAIGTGIIGGTLTATLLAIFFV 1016
A L A + RLRPI+MTSLAFI GV+PLAI+ GAG+ ++ A+G G++GG ++ATLLAIFFV
Sbjct: 962 ATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFV 1021

Query: 1017 PLFFVLVKRLFAG 1029
P+FFV+++R F G
Sbjct: 1022 PVFFVVIRRCFKG 1034



Score = 75.3 bits (185), Expect = 1e-15
Identities = 53/330 (16%), Positives = 117/330 (35%), Gaps = 19/330 (5%)

Query: 721 QLQVDIDSNKAVSLGLSLNDVTDTLSSA----WGGTYVNDFIDRGRVKKVYIQGDSEFRS 776
+++ +D++ L+ DV + L G G+ I + F++
Sbjct: 183 AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKN 242

Query: 777 APSDLGKWFVRGSDN-AMTPFSAFATTRWLYGPER--LVRYNGSAA-----YEIQGENAT 828
P + GK +R + + ++ A L G + R NG A G NA
Sbjct: 243 -PEEFGKVTLRVNSDGSVVRLKDVARVE-LGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 829 GFSSGDAMTKMEELANSLPAG--TTWAWSGLSLQEKLASGQALSLYAVSILVVFLCLAAL 886
+ K+ EL P G + + + +L+ +LV + +
Sbjct: 301 DTAKA-IKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLV-MYLF 358

Query: 887 YESWSVPFSVILVIPLGLLGAALAAWMRDLNNDVYFQVALLTTIGLSSKNAILIVEFAEA 946
++ + +P+ LLG + + ++ IGL +AI++VE E
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVER 418

Query: 947 AVAEGYSLSRAALRAAQTRLR-PIIMTSLAFIAGVMPLAIATGAGANSRIAIGTGIIGGT 1005
+ E + A + ++++ ++ ++ A +P+A G+ I+
Sbjct: 419 VMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAM 478

Query: 1006 LTATLLAIFFVPLFFVLVKRLFAGKPRRQE 1035
+ L+A+ P + + + + +
Sbjct: 479 ALSVLVALILTPALCATLLKPVSAEHHENK 508


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1565  RTXTOXIND  29  0.048  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.0 bits (65), Expect = 0.048
Identities = 24/166 (14%), Positives = 49/166 (29%), Gaps = 11/166 (6%)

Query: 70 DVQKAIADIDSARALYGQTNASLFPTVNAALSSTRSRSLANGTGTTAEADGTVSSYTLDL 129
A AD ++ Q +RS L + + + +
Sbjct: 128 TALGAEADTLKTQSSLLQARL----EQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEE 183

Query: 130 FGRNQSLSRAARETWLASEFTAQNTRLTLIAEISTAWLTLAADNSNLALAKETMASAENS 189
R SL + TW ++ + AE T + + + ++
Sbjct: 184 VLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTV-------LARINRYENLSRVEKSR 236

Query: 190 LKIIQRQQQVGTAAATDVSEAMSVYQQARASVASYQTQVMQDKNAL 235
L A V E + Y +A + Y++Q+ Q ++ +
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEI 282


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
UTI89_C1566  TCRTETA  68  1e-14  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 67.9 bits (166), Expect = 1e-14
Identities = 68/312 (21%), Positives = 114/312 (36%), Gaps = 18/312 (5%)

Query: 5 SLSWALILGLLAGIGPMCTDLYLPALPEMSEQLAATTTITQLTLTASLIGLGVGQLLFGP 64
L L L +G L +P LP + L + +T L + Q P
Sbjct: 6 PLIVILSTVALDAVG---IGLIMPVLPGLLRDLVHSNDVTA-HYGILLALYALMQFACAP 61

Query: 65 ----LSDKIGRKRPLILSLLLFIVSSILCATTNNIYWLVVWRFIQGIAGAGGSVLSRSIA 120
LSD+ GR+ L++SL V + AT ++ L + R + GI GA G+V IA
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIA 121

Query: 121 RDKYQGVTLTQFFALLMTVNGLAPVLSPVLGGYIVSTFDWRTLFWVMAEISTVLLLGCLL 180
D G + F + G V PVLGG + F F+ A ++ + L
Sbjct: 122 -DITDGDERARHFGFMSACFGFGMVAGPVLGGLM-GGFSPHAPFFAAAALNGLNFLTGCF 179

Query: 181 FINETLPENKRGSSL----LLTGRSVVQNRRFMRFCLIQSFMLAGLFAYIGSSSFVL--Q 234
+ E+ +R L + + + F + L + ++ +V+ +
Sbjct: 180 LLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFF-IMQLVGQVPAALWVIFGE 238

Query: 235 KEFGFSPMQFSLVFGLNGI-GLIIASWIFSRLARRINAMTLLRGGLIAAILCALLTVLCA 293
F + + GI + + I +A R+ L G+IA +L
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 294 WVQLPIPALVAL 305
+ P +V L
Sbjct: 299 RGWMAFPIMVLL 310


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C1568  HTHFIS  31     0.007    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.0 bits (70), Expect = 0.007
Identities = 9/16 (56%), Positives = 14/16 (87%)

Query: 38 LVGESGSGKSLIAKAI 53
+ GESG+GK L+A+A+
Sbjct: 165 ITGESGTGKELVARAL 180


85  UTI89_C2086  UTI89_C2093  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C2086  -1      9        1.208556   chemotaxis regulatory protein CheY
UTI89_C2087  0       9        1.461985   chemotaxis-specific methylesterase
UTI89_C2088  -1      9        1.183427   chemotaxis methyltransferase CheR
UTI89_C2089  -1      11       0.707997   methyl-accepting chemotaxis protein II
UTI89_C2090  -1      11       0.347315   purine-binding chemotaxis protein
UTI89_C2091  -1      13       -0.250533  chemotaxis protein CheA
UTI89_C2092  0       14       -1.551765  flagellar motor protein MotB
UTI89_C2093  0       13       -1.871079  flagellar motor protein MotA
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C2086  HTHFIS  88     9e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.3 bits (219), Expect = 9e-24
Identities = 30/105 (28%), Positives = 51/105 (48%), Gaps = 3/105 (2%)

Query: 7 KFLVVDDFSTMRRIVRNLLKELGFNNVEEAEDGLDALNKLQAGGYGFVISDWNMPNMDGL 66
LV DD + +R ++ L G++ V + + AG V++D MP+ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYD-VRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 67 ELLKTIRADGAMSALPVLMVTAEAKKENIIAAAQAGASGYVVKPF 111
+LL I+ LPVL+++A+ I A++ GA Y+ KPF
Sbjct: 64 DLLPRIKKARPD--LPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C2087  HTHFIS  66     3e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 66.4 bits (162), Expect = 3e-14
Identities = 35/188 (18%), Positives = 73/188 (38%), Gaps = 23/188 (12%)

Query: 1 MSKIRVLSVDDSALMRQIMTEIINSHSDMEMVATAPDPLVARDLIKKFNPDVLTLDVEMP 60
M+ +L DD A +R ++ + ++ V + I + D++ DV MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 RMDGLDFLEKLMRLRPMPVVMVSSLTGKGS-EVTLRALELGAIDFVTKPQLGIREGMLAY 119
+ D L ++ + RP V+V ++ + + ++A E GA D++ KP + E +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLV--MSAQNTFMTAIKASEKGAYDYLPKP-FDLTELIGII 115

Query: 120 SEMIAEKVRTAAKASLAAHKPLSVPTTLKAGPLLSSEKLIAIGASTGGTEAIRHVLQPLP 179
+AE R +K + + + +G S E R + + +
Sbjct: 116 GRALAEPKRRPSKLEDDSQDGMPL-----------------VGRSAAMQEIYRVLARLMQ 158

Query: 180 LSSPALLI 187
++
Sbjct: 159 TDLTLMIT 166


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C2091  PF06580  40     2e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 40.2 bits (94), Expect = 2e-05
Identities = 22/151 (14%), Positives = 48/151 (31%), Gaps = 52/151 (34%)

Query: 379 ELDKSLIERIIDPLT--HLVRNSLDHGIELPEKRLAAGKNSVGNLILSAEHQGGNICIEV 436
+++ ++++ + P+ LV N + HGI G ++L G + +EV
Sbjct: 245 QINPAIMDVQVPPMLVQTLVENGIKHGIA--------QLPQGGKILLKGTKDNGTVTLEV 296

Query: 437 TDDGAGLNRERILAKAASQGLTVSENMSDDEVAMLIFAPGFSTAEQVTDVSGRGVGMDVV 496
+ G+ + G G+ V
Sbjct: 297 ENTGSLALKNTK--------------------------------------ESTGTGLQNV 318

Query: 497 KRNIQEMGG---HVEIQSMQGTGTTIRILLP 524
+ +Q + G +++ QG +L+P
Sbjct: 319 RERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C2092  PF05272  31     0.009    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.8 bits (69), Expect = 0.009
Identities = 22/93 (23%), Positives = 35/93 (37%), Gaps = 11/93 (11%)

Query: 46 LISISSPKELIQIAEYFRTPLATAVTGGDRISNSESPIPGGGDDYTQSQGEVNKQPNIEE 105
L +SSP A P + G + ++ PGGGDD GE +++
Sbjct: 384 LADVSSPTAAAGGAGGGEPPKKRDPSAG---AGTDPGGPGGGDD-----GEDPFGEWLDD 435

Query: 106 LKKRM---EQSRLRKLRGDLDQLIESDPKLRAL 135
R+ + L+ R L + + S P L
Sbjct: 436 EVARLRLRGRWLLKPRRAALIEALRSAPALAGC 468


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C2093  PF05844  33     0.001    YopD protein
		>PF05844#YopD protein

Length = 295

Score = 32.7 bits (74), Expect = 0.001
Identities = 12/28 (42%), Positives = 22/28 (78%), Gaps = 2/28 (7%)

Query: 76 MDLLALLYRLMAKSRQMGMFSLERDIEN 103
++LL +L+R+ K+R++G+ L+RD EN
Sbjct: 74 VELLLILFRIAQKARELGV--LQRDNEN 99


86  UTI89_C2124  UTI89_C2150  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C2124  0       15       -1.339697  flagellin
UTI89_C2125  -2      16       0.340970   flagellar capping protein
UTI89_C2126  -2      14       0.178929   flagellar protein FliS
UTI89_C2127  -1      13       0.558350   flagellar biosynthesis protein FliT
UTI89_C2128  -1      14       -1.013625  cytoplasmic alpha-amylase
UTI89_C2129  -1      18       -3.161821  hypothetical protein
UTI89_C2130  0       23       -4.571861  hypothetical protein
UTI89_C2131  4       33       -7.329553  hypothetical protein
UTI89_C2132  3       33       -6.705035  hypothetical protein
UTI89_C2133  0       24       -4.273078  porin
UTI89_C2134  -1      16       -1.819297  transcriptional regulator YbcM
UTI89_C2135  0       14       1.569486   kinase inhibitor
UTI89_C2136  -1      15       3.898514   multidrug efflux protein
UTI89_C2137  1       18       4.667041   flagellar hook-basal body protein FliE
UTI89_C2138  1       15       4.432539   flagellar MS-ring protein
UTI89_C2139  1       18       4.531555   flagellar motor switch protein G
UTI89_C2140  -1      17       3.829921   flagellar assembly protein H
UTI89_C2141  -2      19       3.516621   flagellum-specific ATP synthase
UTI89_C2142  -2      16       2.382533   flagellar biosynthesis chaperone
UTI89_C2143  -1      16       2.347199   flagellar hook-length control protein
UTI89_C2144  -3      21       1.868504   flagellar basal body protein FliL
UTI89_C2145  0       17       0.534763   flagellar motor switch protein FliM
UTI89_C2146  1       16       -2.378383  flagellar motor switch protein FliN
UTI89_C2147  0       17       -3.266588  flagellar biosynthesis protein FliO
UTI89_C2148  -1      19       -3.895273  flagellar biosynthesis protein FliP
UTI89_C2149  -1      22       -5.245238  flagellar biosynthesis protein FliQ
UTI89_C2150  -1      19       -5.508745  flagellar biosynthesis protein FliR
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C2124  FLAGELLIN  230    1e-70    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 230 bits (587), Expect = 1e-70
Identities = 252/583 (43%), Positives = 306/583 (52%), Gaps = 76/583 (13%)

Query: 2 AQVINTNSLSLITQNNINKNQSALSSSIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 61
AQVINTNSLSL+TQNN+NK+QS+LSS+IERLSSGLRINSAKDDAAGQAIANRFTSNIKGL
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 TQAARNANDGISVAQTTEGALSEINNNLQRIRELTVQASTGTNSDSDLDSIQDEIKSRLD 121
TQA+RNANDGIS+AQTTEGAL+EINNNLQR+REL+VQA+ GTNSDSDL SIQDEI+ RL+
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 EIDRVSGQTQFNGVNVLAKDGSMKIQVGANDGETITIDLKKIDSDTLGLNGFNVNGKGTI 181
EIDRVS QTQFNGV VL++D MKIQVGANDGETITIDL+KID +LGL+GFNVNG
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNG---- 176

Query: 182 TNKAATVSDLTSAGAKLNTTTGLYDLKTENTLLTTDAAFDKLGNGDKVTVGGVDYTYNAK 241
+ + + TG D + NA
Sbjct: 177 ----PKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAA 232

Query: 242 SGDFTTTKSTAGTGVDAAAQATDSAKKRDALAATLHADVGKSVNGSYTTKDGTVSFETDS 301
+G TT + T VD +A +A A K G D
Sbjct: 233 NGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIA------------GAIKGGKEGDTFDY 280

Query: 302 AGNITIGGSQAYVDDAGNLTTNNAGSAAKADMKALLKAASEGSDGASLTFNGTEYTIAKA 361
G ++ D G ++T
Sbjct: 281 KGVTFTIDTKTGNDGNGKVSTT-------------------------------------- 302

Query: 362 TPATTSPVAPLIPGGITYQATVSKDVVLSETKAAAATSSITFNSGVLSKTIGFTAGESSD 421
I G + AA SS + V++ F ++
Sbjct: 303 -----------INGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNE 351

Query: 422 AAKSYVDDKGGITNVADYTVSYSVNKDNGSVTVAGYASATDTNKDYAPAIGTAVNVNSAG 481
+AK + A+ +
Sbjct: 352 SAKLSDLEANNAVKGE-------SKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVST 404

Query: 482 KITTETTSAGSATTNPLAALDDAISSIDKFRSSLGAIQNRLDSAVTNLNNTTTNLSEAQS 541
I + +A +T NPLA++D A+S +D RSSLGAIQNR DSA+TNL NT TNL+ A+S
Sbjct: 405 LINEDAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARS 464

Query: 542 RIQDADYATEVSNMSKAQIIQQAGNSVLAKANQVPQQVLSLLQ 584
RI+DADYATEVSNMSKAQI+QQAG SVLA+ANQVPQ VLSLL+
Sbjct: 465 RIEDADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2125  TYPE3OMBPROT  33     0.003    Type III secretion system outer membrane B protein ...
		>TYPE3OMBPROT#Type III secretion system outer membrane B protein

family signature.
Length = 538

Score = 32.7 bits (74), Expect = 0.003
Identities = 24/72 (33%), Positives = 37/72 (51%), Gaps = 2/72 (2%)

Query: 214 NGMEVSVAAQNAQLTVNNVAIENSSNTISNALENITLNLNDVTTGNQTLTITQDTSKAQT 273
N E +VAA+N + + A+ + +S AL T++L V+T LT T T ++
Sbjct: 236 NSSERAVAARNKAEELVSAALYSRPELLSQALSGKTVDLKIVSTS--LLTPTSLTGGEES 293

Query: 274 AIKDWVNAYNSL 285
+KD VNA L
Sbjct: 294 MLKDQVNALKGL 305


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C2130  RTXTOXIND  30     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 30.2 bits (68), Expect = 0.017
Identities = 10/57 (17%), Positives = 17/57 (29%), Gaps = 2/57 (3%)

Query: 164 RFTLLPIFRIPVKMQKVSAASPLTQKPDQARRRF--RLGMLVFFGMLGWALLTAMNQ 218
R L R + + + A L + P R R M ++L +
Sbjct: 26 RKQLDTPVREKDENEFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEI 82


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C2131  PF01206  93     6e-29    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 92.5 bits (230), Expect = 6e-29
Identities = 16/71 (22%), Positives = 37/71 (52%)

Query: 7 DYRLDMVGEPCPYPAVATLEAMPQLKKGEILEVVSDCPQSINNIPLDARNHGYTVLDIQQ 66
D LD G CP P + + + + GE+L V++ P S+ + ++ G+ +L+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 67 DGPTIRYLIQK 77
+ T + +++
Sbjct: 65 EDGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
UTI89_C2133  ECOLIPORIN  495    e-179    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 495 bits (1277), Expect = e-179
Identities = 234/370 (63%), Positives = 271/370 (73%), Gaps = 31/370 (8%)

Query: 1 MSAQAAEIYNKDSNKLDLYGKVNAKHYFSSNDADDGDTTYVRLGFKGETQINDQLTGFGQ 60
+A AAEIYNKD NKLDLYGKV+ HYFS + + DGD TY+R+GFKGETQINDQLTG+GQ
Sbjct: 17 GAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVGFKGETQINDQLTGYGQ 76

Query: 61 WEYEFKGNRAESQGSSKDKTRLAFAGLKFGDYGSIDYGRNYGVAYDIGAWTDVLPEFGGD 120
WEY + N E +G++ TRLAFAGLKFGDYGS DYGRNYGV YD+ WTD+LPEFGGD
Sbjct: 77 WEYNVQANTTEGEGANS-WTRLAFAGLKFGDYGSFDYGRNYGVLYDVEGWTDMLPEFGGD 135

Query: 121 TWTQTDVFMTGRTTGVATYRNNDFFGLVDGLNFAAQYQGKNDR----------------T 164
++T D +MTGR GVATYRN DFFGLVDGLNFA QYQGKN+
Sbjct: 136 SYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQSADDVNIGTNNRNNGD 195

Query: 165 DVTEANGDGFGFSTTYEY-EGFGVGATYAKSDRTNDQVIYGNNSLNASGQNAEVWAAGLK 223
D+ NGDGFG STTY+ GF GA Y SDRTN+QV G A G A+ W AGLK
Sbjct: 196 DIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGGT--IAGGDKADAWTAGLK 253

Query: 224 YDANNIYLATTYSETQNMTVFG------NNHIANKAQNFEVVAQYQFDFGLRPSVAYLQS 277
YDANNIYLAT YSET+NMT +G + +ANK QNFEV AQYQFDFGLRP+V++L S
Sbjct: 254 YDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQYQFDFGLRPAVSFLMS 313

Query: 278 KGKDLG----AWGDQDLVEYIDVGATYYFNKNMSTFVDYKINLIDKSD-FTKASGVATDD 332
KGKDL D+DLV+Y DVGATYYFNKN ST+VDYKINL+D D F K +G++TDD
Sbjct: 314 KGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLLDDDDPFYKDAGISTDD 373

Query: 333 IVAVGLVYQF 342
IVA+G+VYQF
Sbjct: 374 IVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C2137  FLGHOOKFLIE  117    8e-38    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE

signature.
Length = 103

Score = 117 bits (293), Expect = 8e-38
Identities = 102/103 (99%), Positives = 102/103 (99%)

Query: 2 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTVARTQAEKFTL 61
SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQT ARTQAEKFTL
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 60

Query: 62 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 104
GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV
Sbjct: 61 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2138  FLGMRINGFLIF  750    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 750 bits (1938), Expect = 0.0
Identities = 476/555 (85%), Positives = 511/555 (92%), Gaps = 5/555 (0%)

Query: 3 ATAAQTKSLEWLNRLRANPKIPLIVAGSAAVAVMVALILWAKAPDYRTLFSNLSDQDGGA 62
+TA Q K LEWLNRLRANP+IPLIVAGSAAVA++VA++LWAK PDYRTLFSNLSDQDGGA
Sbjct: 5 STATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGA 64

Query: 63 IVSQLTQMNIPYRFSEASGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 122
IV+QLTQMNIPYRF+ SGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ
Sbjct: 65 IVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 124

Query: 123 FSEQVNYQRALEGELSRTIETIGPVKGARVHLAMPKPSLFVREQKSPSASVTVNLLPGRA 182
FSEQVNYQRALEGEL+RTIET+GPVK ARVHLAMPKPSLFVREQKSPSASVTV L PGRA
Sbjct: 125 FSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRA 184

Query: 183 LDEGQISAIVHLVSSAVAGLPPGNVTLVDQGGHLLTQSNTSGRDLNDAQLKYASDVEGRI 242
LDEGQISA+VHLVSSAVAGLPPGNVTLVDQ GHLLTQSNTSGRDLNDAQLK+A+DVE RI
Sbjct: 185 LDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRI 244

Query: 243 QRRIEAILSPIVGNGNIHAQVTAQLDFASKEQTEEQYRPNGDESHAALRSRQLNESEQSG 302
QRRIEAILSPIVGNGN+HAQVTAQLDFA+KEQTEE Y PNGD S A LRSRQLN SEQ G
Sbjct: 245 QRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVG 304

Query: 303 SGYPGGVPGALSNQPAPANNAPISTPPANQNNRQQ--QASTTSNS---GPRSTQRNETSN 357
+GYPGGVPGALSNQPAP N API+TPP NQ N Q Q ST++NS GPRSTQRNETSN
Sbjct: 305 AGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSN 364

Query: 358 YEVDRTIRHTKMNVGDVQRLSVAVVVNYKTLPDGKPLPLSNEQMKQIEALTREAMGFSEK 417
YEVDRTIRHTKMNVGD++RLSVAVVVNYKTL DGKPLPL+ +QMKQIE LTREAMGFS+K
Sbjct: 365 YEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDK 424

Query: 418 RGDSLNVVNSPFNSSDESGGALPFWQQQVFIDQLLAAGRWLLVLLVAWLLWRKAVRPQLT 477
RGD+LNVVNSPF++ D +GG LPFWQQQ FIDQLLAAGRWLLVL+VAW+LWRKAVRPQLT
Sbjct: 425 RGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLT 484

Query: 478 RRAEAVKTVQQQAQAREEVEDAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 537
RR E K Q+QAQ R+E E+AVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR
Sbjct: 485 RRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 544

Query: 538 VVALVIRQWINNDHE 552
VVALVIRQW++NDHE
Sbjct: 545 VVALVIRQWMSNDHE 559


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2139  FLGMOTORFLIG  341    e-119    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 341 bits (876), Expect = e-119
Identities = 117/329 (35%), Positives = 197/329 (59%), Gaps = 2/329 (0%)

Query: 1 MSNLTGTDKSVILLMTIGEDRAAEVFKHLSQREVQTLSAAMANVTQISNKQLTDVLAEFE 60
+S LTG K+ ILL++IG + +++VFK+LSQ E+++L+ +A + I+++ +VL EF+
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 61 QEAEQFAALNINANDYLRSVLVKALGEERAASLLEDILETRDTASGIETLNFMEPQSAAD 120
+ + DY R +L K+LG ++A ++ + L + + E + +P + +
Sbjct: 72 ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPANILN 130

Query: 121 LIRDEHPQIIATILVHLKRAQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLN 180
I+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 131 FIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLE 190

Query: 181 GLLDGQ-NLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENL 239
L + + GGV EIIN+ + E+ +I ++ E D ELA++I +MF+FE++
Sbjct: 191 KKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDI 250

Query: 240 VDVDDRSIQRLLQEVDSESLLIALKGAEQPLREKFLRNMSQRAADILRDDLANRGPVRLS 299
V +DDRSIQR+L+E+D + L ALK + P++EK +NMS+RAA +L++D+ GP R
Sbjct: 251 VLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRK 310

Query: 300 QVENEQKAILLIVRRLAETGEMVIGSGED 328
VE Q+ I+ ++R+L E GE+VI G +
Sbjct: 311 DVEESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C2140  FLGFLIH  372    e-135    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 372 bits (957), Expect = e-135
Identities = 225/228 (98%), Positives = 227/228 (99%)

Query: 8 MSDNLPWKTWMPDDLAPPQAEFVPMVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 67
MSDNLPWKTW PDDLAPPQAEFVP+VEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI
Sbjct: 1 MSDNLPWKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60

Query: 68 AEGRQQGHEQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 127
AEGRQQGH+QGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL
Sbjct: 61 AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120

Query: 128 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 187
MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT
Sbjct: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180

Query: 188 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 235
LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV
Sbjct: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C2142  FLGFLIJ  202    4e-70    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 202 bits (514), Expect = 4e-70
Identities = 145/147 (98%), Positives = 147/147 (100%)

Query: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60
MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 MTSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQKRQST 120
+TSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQ+RQST
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147
AALLAENRLDQKKMDEFAQRAAMRKPE
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C2143  FLGHOOKFLIK  468    e-168    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 468 bits (1206), Expect = e-168
Identities = 366/375 (97%), Positives = 370/375 (98%)

Query: 1 MIRLAPLITANVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60
MIRLAPLITA+VDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK
Sbjct: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60

Query: 61 GEPLVSDIVSDAQQADLLIPVDETLPVINDEQSTSTPLTTAQTMTLAAVADKNTTKDEKA 120
GEPL+SDIVSDAQQA+LLIPVDET PVINDEQSTSTPLTTAQTM LAAVADKNTTKDEKA
Sbjct: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120

Query: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPAEKPTLFTKLTSAQLTTAQPDDAP 180
DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLP EKPTLFTKLTS QLTTAQPDDAP
Sbjct: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP 180

Query: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTADASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240
GTPAQPLTPLVAEAQSKAEVISTPSPVTA ASPLITPHQTQPLPTVAAPVLSAPLGSHEW
Sbjct: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240

Query: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMISPHQHVRAALEAA 300
QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQM+SPHQHVRAALEAA
Sbjct: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300

Query: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360
LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS
Sbjct: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360

Query: 361 LQGRVTGNSGVDIFA 375
LQGRVTGNSGVDIFA
Sbjct: 361 LQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2145  FLGMOTORFLIM  385    e-136    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 385 bits (989), Expect = e-136
Identities = 85/324 (26%), Positives = 148/324 (45%), Gaps = 10/324 (3%)

Query: 20 ILSQAEIDALLNGDS--EVKDEPTASVSGESDIRPYDPNTQRRVVRERLQALEIINERFA 77
+LSQ EID LL S + E +S I YD + +E+++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 78 RHFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSPSL 137
R L LR + V ++ Y EF R++P P+ L +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 138 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 197
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 198 EMQVKFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 255
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 256 NEDQNWRDNLVRQVQHSQLELVANFADISLRLSQILKLKPGDVLPIEKP---DRIIAHVD 312
+ + L ++ +++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 313 GVPVLTSQYGTLNGQYALRIEHLI 336
Q G + + A +I I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2146  FLGMOTORFLIN  210    5e-74    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 210 bits (537), Expect = 5e-74
Identities = 126/137 (91%), Positives = 134/137 (97%)

Query: 1 MSDMNNPADDNNGAMDDLWAEALSEQKSTSGKSAADAVFQQFGGGDVSGALQDIDLIMDI 60
MSDMNNP+D+N GA+DDLWA+AL+EQK+T+ KSAADAVFQQ GGGDVSGA+QDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2148  FLGBIOSNFLIP  335    e-119    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP

signature.
Length = 245

Score = 335 bits (860), Expect = e-119
Identities = 242/245 (98%), Positives = 244/245 (99%)

Query: 1 MRRLLSVAPVLLWLVTPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60
MRRLLSVAPVLLWL+TPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60

Query: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFNEEK 120
MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPF+EEK
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120

Query: 121 ISMQEALEKGEQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180
ISMQEALEKG QPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK
Sbjct: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180

Query: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240
TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA
Sbjct: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240

Query: 241 QSFYS 245
QSFYS
Sbjct: 241 QSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2149  TYPE3IMQPROT  67     1e-18    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein

family signature.
Length = 86

Score = 67.1 bits (164), Expect = 1e-18
Identities = 22/78 (28%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALVTGLIISILQAATQINEMTLSFIPKIIAVFIAII 63
+ ++ G +A+ + L L+ +VA + GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2150  TYPE3IMRPROT  201    1e-66    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein

family signature.
Length = 261

Score = 201 bits (514), Expect = 1e-66
Identities = 257/261 (98%), Positives = 261/261 (100%)

Query: 1 MMQVTSDQWLSWLSLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60
M+QVTS+QWLSWL+LYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120
NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180
NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240
LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEMFNLLADIISELPLI 261
EHLFSE+FNLLADIISELPLI
Sbjct: 241 EHLFSEIFNLLADIISELPLI 261


87  UTI89_C2162  UTI89_C2168  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C2162  -2      14       -4.349465  DNA cytosine methylase
UTI89_C2163  -1      22       -5.997469  hypothetical protein
UTI89_C2164  0       27       -7.703241  hypothetical protein
UTI89_C2165  -2      24       -6.548645  outer membrane protein N
UTI89_C2166  -1      25       -5.670754  chaperone protein HchA
UTI89_C2167  0       30       -7.145200  2-component sensor protein
UTI89_C2168  1       28       -6.076989  transcriptional regulatory protein YedW
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C2162  PF05272  29     0.044    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.044
Identities = 20/62 (32%), Positives = 29/62 (46%), Gaps = 15/62 (24%)

Query: 320 AKYILTPVLWKYLYRYAKKHQARGNGFGYGMVYPNNPQSVTRTLSARYYKDGAEILIDRG 379
A+Y + PVLW Y+ R+ K + G+ VY +R +DG+E RG
Sbjct: 166 ARYQVGPVLWGYVVRFIK---SDGDKLTLPYVY------------SRSQRDGSEAWKWRG 210

Query: 380 WD 381
WD
Sbjct: 211 WD 212


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2163  CARBMTKINASE  33     8e-04    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 32.9 bits (75), Expect = 8e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 9/92 (9%)

Query: 46 AQKLAADDDVDMLVILTACYFHDIVSLAKDHPQRQRSSILAAEETRRLLREEFVQFPA-- 103
+KLA + + D+ +ILT + +L + Q + EE R+ E F A
Sbjct: 219 GEKLAEEVNADIFMILTDV---NGAALYYGTEKEQWLREVKVEELRKYYEEG--HFKAGS 273

Query: 104 --EKIEAVCHAIAAHSFSAQIAPLTTEAKIVQ 133
K+ A I A IA L + ++
Sbjct: 274 MGPKVLAAIRFIEWGGERAIIAHLEKAVEALE 305


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
UTI89_C2165  ECOLIPORIN  444    e-158    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 444 bits (1144), Expect = e-158
Identities = 205/395 (51%), Positives = 256/395 (64%), Gaps = 36/395 (9%)

Query: 11 MKRKVLAMLVPALLVAGAANAAEIYNKDGNKVDFYGKMVGERIWSNTDDNNSENEDTSYA 70
MKRKVLA+++PALL AGAA+AAEIYNKDGNK+D YGK+ G +S D++S++ D +Y
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFS---DDSSKDGDQTYM 57

Query: 71 RFGVKGETQITSELTGFGQFEYNLDASKPEGE-NQEKTRLTFAGLKYNELGSFDYGRNYG 129
R G KGETQI +LTG+GQ+EYN+ A+ EGE TRL FAGLK+ + GSFDYGRNYG
Sbjct: 58 RVGFKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYG 117

Query: 130 VAYDAAAYTDMLVEWGGDSWASADNFMNGRTNGVATYRNYDFFGLVDGLDFAIQYQGKNS 189
V YD +TDML E+GGDS+ ADN+M GR NGVATYRN DFFGLVDGL+FA+QYQGKN
Sbjct: 118 VLYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNE 177

Query: 190 NRS----------------TKKQNGDGYALSVDYNI-NGFGIVGAYSKSDRTNDQVA--- 229
++S + NGDG+ +S Y+I GF AY+ SDRTN+QV
Sbjct: 178 SQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGG 237

Query: 230 -DGNGSNAELWSLAAKYDANNVYAVVMYGETRNMTPGSIDTGVADREGNTIMRDQLINET 288
G A+ W+ KYDANN+Y MY ETRNMTP + + + N+T
Sbjct: 238 TIAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYG--------KTDKGYDGGVANKT 289

Query: 289 QNFEAVVQYQFDFGLRPSLGYVYSKGKDIKGVPGHRYVDADRVNYIEVGTWYYFNKNMNV 348
QNFE QYQFDFGLRP++ ++ SKGKD+ D D V Y +VG YYFNKN +
Sbjct: 290 QNFEVTAQYQFDFGLRPAVSFLMSKGKDL-TYNNVNGDDKDLVKYADVGATYYFNKNFST 348

Query: 349 YTAYKFNMLDKDDA--AITGAAADDQFAVGIVYQF 381
Y YK N+LD DD G + DD A+G+VYQF
Sbjct: 349 YVDYKINLLDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
UTI89_C2166  SUBTILISIN  28     0.038    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 28.3 bits (63), Expect = 0.038
Identities = 7/29 (24%), Positives = 14/29 (48%)

Query: 160 GLPESEDVAAALQWAIENDRFVISLCHGP 188
G + + + + +AIE +IS+ G
Sbjct: 122 GSGQYDWIIQGIYYAIEQKVDIISMSLGG 150


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C2168  HTHFIS  83     2e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.6 bits (204), Expect = 2e-20
Identities = 30/117 (25%), Positives = 60/117 (51%), Gaps = 1/117 (0%)

Query: 2 KILLIEDNQRTQEWVTQGLSEAGYVIDAVSDGRDGLYLALKDDYALIILDIMLPGMDGWQ 61
IL+ +D+ + + Q LS AGY + S+ D L++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 ILQTLRTA-KQTPVICLTARDSVDDRVRGLDSGANDYLVKPFSFSELLARVRAQLRQ 117
+L ++ A PV+ ++A+++ ++ + GA DYL KPF +EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


88  UTI89_C2189  UTI89_C2194  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C2189  -1      26       -4.947038  hypothetical protein
UTI89_C2190  -2      24       -3.832922  hypothetical protein
UTI89_C2191  -2      26       -4.871102  hypothetical protein
UTI89_C2192  -2      30       -6.126399  hypothetical protein
UTI89_C2193  -1      28       -5.446301  hypothetical protein
UTI89_C2194  -2      25       -3.875544  shikimate transporter
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C2189  INTIMIN  75     2e-17    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 75.1 bits (184), Expect = 2e-17
Identities = 20/60 (33%), Positives = 29/60 (48%), Gaps = 3/60 (5%)

Query: 181 QQIASTSQLIGSLLAEDMNSEQAANIARGWASSQASGVMTDWLSRFGTARITLGVDEDFS 240
QQ AS + S +N + A + A G A +QAS + WL +GTA + L +F
Sbjct: 168 QQAASLGSQLQS---RSLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFD 224


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C2191  INTIMIN  56     3e-10    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 55.8 bits (134), Expect = 3e-10
Identities = 62/263 (23%), Positives = 91/263 (34%), Gaps = 20/263 (7%)

Query: 175 IAVKAHVNDQFGNPVTHQPATFSAAPSSQMIISQNTVSTNTQGVAEVTMTPERNGSYTVK 234
I A V G + P +F+ S ++S N+ +TN G A VT+ ++ G V
Sbjct: 578 ITYTATVKKN-GVAQANVPVSFNIV-SGTAVLSANSANTNGSGKATVTLKSDKPGQVVVS 635

Query: 235 ASLANGASLEKQLEAI---DEKLTLTSSPLIGVNAPKGATLTATLT---SANGTPVEGQV 288
A A S I K ++T A T T PV Q
Sbjct: 636 AKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQE 695

Query: 289 INFSVTLEGATLSGGKVRTNSSGQAPVVLTSNKVGTYTVTASFHNGVTIQTQTTVKVTGN 348
+ F+ TL LS +T+++G A V LTS G V+A + V+
Sbjct: 696 VTFTTTL--GKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTT 753

Query: 349 PSTAHVASFIADPSTIAATNSDLSTLKATVEDGSGNL-IEGLTVYFALKSGSTTLTSLTA 407
+ I T ++ G NL G + +S + + S
Sbjct: 754 LTID------DGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIAS--- 804

Query: 408 VTDQNGIATTSVKGEITGSVTVS 430
V +G T KG T SV S
Sbjct: 805 VDASSGQVTLKEKGTTTISVISS 827



Score = 52.4 bits (125), Expect = 3e-09
Identities = 46/170 (27%), Positives = 65/170 (38%), Gaps = 7/170 (4%)

Query: 271 TLTATLTSANGTPVEGQVINFSVTLEGATLSGGKVRTNSSGQAPVVLTSNKVGTYTVTAS 330
T TAT+ NG ++F++ A LS TN SG+A V L S+K G V+A
Sbjct: 579 TYTATVKK-NGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAK 637

Query: 331 FHNGV-TIQTQTTVKVTGNPSTAHVASFIADPSTIAATNSDLSTLKATVEDGSGNLIEGL 389
+ + V + A + AD +T A D T V G +
Sbjct: 638 TAEMTSALNANAVIFVDQ--TKASITEIKADKTTAVANGQDAITYTVKVMKG-DKPVSNQ 694

Query: 390 TVYFALKSGSTTLTSLTAVTDQNGIATTSVKGEITGSVTVSAVTSAGGMQ 439
V F G + + T TD NG A ++ G VSA S +
Sbjct: 695 EVTFTTTLGKLSNS--TEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVD 742



Score = 51.2 bits (122), Expect = 7e-09
Identities = 51/233 (21%), Positives = 89/233 (38%), Gaps = 16/233 (6%)

Query: 13 AVTDADGKAKVTLKGTKAGAHTVTASMVGGKS--EQLVVNFTADTLTAQVNLNVTEDNFI 70
A T+ GKA VTLK K G V+A S V F T + + + +
Sbjct: 612 ANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAV 671

Query: 71 ANNIGMTRLQATVTDGNGNPVEGIKVNFRGTSVTLSSTSVETDDQVFAEILVTSTEVGLK 130
AN V PV +V F T LS+++ +TD +A++ +TST G
Sbjct: 672 ANGQDAITYTVKVMK-GDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKS 730

Query: 131 TVSASLADKPTEVISRLLN----AKVDVNSATI----TSQEIPEGQVMVAQDIAVKAHVN 182
VSA ++D +V + + +D + I ++P + Q + N
Sbjct: 731 LVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGN 790

Query: 183 DQFGNPVTHQPATFSAAPSSQMIISQNTVSTNTQGVAEVTMTPERNGSYTVKA 235
++ + A S Q+ + + +T + V + + +YT+
Sbjct: 791 GKYTWRSANPAIASVDASSGQVTLKEKGTTTIS-----VISSDNQTATYTIAT 838



Score = 40.1 bits (93), Expect = 2e-05
Identities = 35/213 (16%), Positives = 63/213 (29%), Gaps = 18/213 (8%)

Query: 4 NFTLSDGDKAVTDADGKAKVTLKGTKAGAHTVTASMVGGKSE--QLVVNFTADTLTAQVN 61
TD +G AKVTL T G V+A + + V F N
Sbjct: 701 TLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGN 760

Query: 62 LNVTEDNFIANNIGMTRLQATVTDGNGN-PVEGIKVNFRGTSVTLSSTSVETDDQVFAEI 120
+ + + + + G N G + S + SV+
Sbjct: 761 IEI-----VGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQ---- 811

Query: 121 LVTSTEVGLKTVSASLADKPTEVISRLLNAKVDVNSATITSQEIPEGQVMVAQDIAVKAH 180
VT E G T+S +D T + + + ++ + V ++
Sbjct: 812 -VTLKEKGTTTISVISSDNQT--ATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGG--- 865

Query: 181 VNDQFGNPVTHQPATFSAAPSSQMIISQNTVST 213
N + + + AA + S T+ +
Sbjct: 866 KLPSSQNELENVFKAWGAANKYEYYKSSQTIIS 898


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C2192  INTIMIN  28     0.022    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 27.7 bits (61), Expect = 0.022
Identities = 22/129 (17%), Positives = 46/129 (35%), Gaps = 6/129 (4%)

Query: 11 KISAIDYSQNINGDYKATVTGGGEGIATLIPVLNGVHQAGLSTTIEFISAETRPMTGTVS 70
K+S + NG K T+T G + + ++ V + +EF G +
Sbjct: 704 KLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFF-TTLTIDDGNIE 762

Query: 71 VNSANLPTASFPSQGFTGAYYQLNNDNFAPGKTAADYSFSSSASWVGVDATGKVTFKNDG 130
+ + P+ L + G + ++ A ++G+VT K G
Sbjct: 763 IVGTGV-KGKLPTVWLQYGQVNL---KASGGNGKYTWRSANPAIASVDASSGQVTLKEKG 818

Query: 131 DSNTVIITA 139
+ T+ + +
Sbjct: 819 -TTTISVIS 826


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C2194  TCRTETB  33     0.002    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 33.3 bits (76), Expect = 0.002
Identities = 39/259 (15%), Positives = 96/259 (37%), Gaps = 18/259 (6%)

Query: 79 LGGVIFGHFGDRLGRKRMLMLTVWMMGIATALIGILPSFSTIGWWAPILLVTLRAIQGFA 138
+G ++G D+LG KR+L+ + + + + + SF ++ I+ ++ A
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSL----LIMARFIQGAGAAA 119

Query: 139 VGGEWGGAALLSVESAPKNKK-AFYSSGVQVGYGVGLLLSTGLVSLISMMTTDEQFLSWG 197
+ + K S V +G GVG + + I
Sbjct: 120 FPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI------------H 167

Query: 198 WRIPFLFSIVLVLGALWVRNGMEESAEFEQQQHNQAAAKKRIPVIEALLRHPGAFLKIIA 257
W L ++ ++ ++ +++ + + + ++ +L + +
Sbjct: 168 WSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLI 227

Query: 258 LRLCELLTMYIVTAFALNYSTQNMGLPRELFLNIGLLVGGLSCLTIPCFAWLADRFGRRR 317
+ + L +++ + + GL + + IG+L GG+ T+ F + +
Sbjct: 228 VSVLSFL-IFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDV 286

Query: 318 VYITGALIGTLSAFPFFMA 336
++ A IG++ FP M+
Sbjct: 287 HQLSTAEIGSVIIFPGTMS 305


89  UTI89_C2345  UTI89_C2354  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C2345  -3      13       2.944224   chaperone
UTI89_C2346  -3      13       3.300057   hypothetical protein
UTI89_C2347  -3      17       4.104327   hypothetical protein
UTI89_C2348  -3      17       4.085079   hypothetical protein
UTI89_C2349  -2      17       4.139385   multidrug efflux system subunit MdtA
UTI89_C2350  -2      18       3.957197   multidrug efflux system subunit MdtB
UTI89_C2351  -2      15       2.542238   multidrug efflux system subunit MdtC
UTI89_C2352  -1      14       -2.971807  multidrug efflux system protein MdtE
UTI89_C2353  0       22       -5.692215  signal transduction histidine-protein kinase
UTI89_C2354  0       31       -9.135856  DNA-binding transcriptional regulator BaeR
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2345  SHAPEPROTEIN  49     2e-08    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 49.0 bits (117), Expect = 2e-08
Identities = 32/129 (24%), Positives = 58/129 (44%), Gaps = 20/129 (15%)

Query: 132 AMMLH-IRQQAQAQLPEAITQAVIGRPINFQGLGSDEANTQAQGILERAAKRAGFKDVVF 190
M+ H I+Q + ++ P+ + + + I E +A+ AG ++V
Sbjct: 89 KMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQV-------ERRAIRE-SAQGAGAREVFL 140

Query: 191 QYEPVAAGLDYEATLQEEKRVLVVDIGGGTTDCSLLLMGPQWRSRLDREASLLGHSGCRI 250
EP+AA + + E +VVDIGGGTT+ +++ + ++ S RI
Sbjct: 141 IEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLN-----------GVVYSSSVRI 189

Query: 251 GGNDLDIAL 259
GG+ D A+
Sbjct: 190 GGDRFDEAI 198



Score = 35.1 bits (81), Expect = 5e-04
Identities = 32/137 (23%), Positives = 56/137 (40%), Gaps = 23/137 (16%)

Query: 332 RLSYRLV---RSAEESKIALSSV--AETRASLPFISDELAT------LISQQGLESALNQ 380
R +Y + +AE K + S + + LA ++ + AL +
Sbjct: 203 RRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQE 262

Query: 381 PLARILEQVQLALDNAQEKPDV--------IYLTGGSARSPLIKKALAEQLPGIPIAGGD 432
PL I+ V +AL+ Q P++ + LTGG A + + L E+ GIP+ +
Sbjct: 263 PLTGIVSAVMVALE--QCPPELASDISERGMVLTGGGALLRNLDRLLMEET-GIPVVVAE 319

Query: 433 D-FGSVTAGLARWAEVV 448
D V G + E++
Sbjct: 320 DPLTCVARGGGKALEMI 336


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C2349  RTXTOXIND  48     6e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 47.5 bits (113), Expect = 6e-08
Identities = 48/369 (13%), Positives = 105/369 (28%), Gaps = 87/369 (23%)

Query: 4 SYKSRWVIVIVVVIAAIAAFWFWQGRNDSQSAAPG-----ATKQAQQSPAGGR------- 51
S + R V ++ IA G+ + + A G + +
Sbjct: 54 SRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 52 --RGMRAG-PLA---PVQAATAVEQAVPRYLTGLGTITAANTVTVRSRVDG--QLMALHF 103
+R G L + A + L T ++ ++ +L
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 104 QEGQQVKAGDLLAEI------------DPSQFKVALAQAQGQLA-------KDKATLANA 144
Q V ++L Q ++ L + + + + +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 145 RRDLSRYQQLAKTNLVSRQELDAQQALVSETEGTIKADEASVA----------------- 187
+ L + L +++ + Q+ E ++ ++ +
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 188 --------------------------SAQLQLDWSRITAPVDGRV-GLKQVDVGNQISSG 220
+ + S I APV +V LK G +++
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 221 DTTGIVVITQTHPIDLVFTLPESDIATVVQAQKAGKPLVVEARDRTNSKKL-SEGTLLSL 279
+T +V++ + +++ + DI + Q A + VEA T L + ++L
Sbjct: 354 ETL-MVIVPEDDTLEVTALVQNKDIGFINVGQNA--IIKVEAFPYTRYGYLVGKVKNINL 410

Query: 280 DNQIDATTG 288
D D G
Sbjct: 411 DAIEDQRLG 419


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2350  ACRIFLAVINRP  920    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 920 bits (2379), Expect = 0.0
Identities = 300/1036 (28%), Positives = 513/1036 (49%), Gaps = 29/1036 (2%)

Query: 13 SRLFIMRPVATTLLMVAILLAGIIGYRALPVSALPEVDYPTIQVVTLYPGASPDVMTSAV 72
+ FI RP+ +L + +++AG + LPV+ P + P + V YPGA + V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 73 TAPLERQFGQMSGLKQMSSQS-SGGASVITLQFQLTLPLDVAEQEVQAAINAATNLLPSD 131
T +E+ + L MSS S S G+ ITL FQ D+A+ +VQ + AT LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 132 LPNPPVYSKVNPADPPIMTLAVTSTAMPMTQVE--DMVETRVAQKISQISGVGLVTLSGG 189
+ + S + +M S TQ + D V + V +S+++GVG V L G
Sbjct: 122 VQQQGI-SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 190 QRPAVRVKLNAQAIAALGLTSETVRTAITGANVNSAKGSLDGP------SRAVTLSANDQ 243
Q A+R+ L+A + LT V + N A G L G ++ A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 244 MQSAEEYRQLII-AYQNGAPIRLGDVATVEQGAENSWLGAWANKEQAIVMNVQRQPGANI 302
++ EE+ ++ + +G+ +RL DVA VE G EN + A N + A + ++ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 303 ISTADSIRQMLPQLTESLPKSVKVTVLSDRTTNIRASVDDTQFELMMAIALVVMIIYLFL 362
+ TA +I+ L +L P+ +KV D T ++ S+ + L AI LV +++YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 363 RNIPATIIPGVAVPLSLIGTFAVMVFLDFSINNLTLMALTIATGFVVDDAIVVIENISRY 422
+N+ AT+IP +AVP+ L+GTFA++ +SIN LT+ + +A G +VDDAIVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 423 I-EKGEKPLAAALKGAGEIGFTIISLTFSLIAVLIPLLFMGDIVGRLFREFAITLAVAIL 481
+ E P A K +I ++ + L AV IP+ F G G ++R+F+IT+ A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 482 ISAVVSLTLTPMMCARML---SQESLRKQNRFSRASEKMFDRIIAAYGRGLAKVLNHPWL 538
+S +V+L LTP +CA +L S E + F FD + Y + K+L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 539 TLSVALSTLLLSVLLWVFIPKGFFPVQDNGIIQGTLQAPQSSSFANMAQRQRQVADVILQ 598
L + + V+L++ +P F P +D G+ +Q P ++ + QV D L+
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 599 DPA--VQSLTSFVGVDGTNPSLNSARLQINLKPLDERDDR---VQKVIARLQTAVDKVPG 653
+ V+S+ + G + + N+ ++LKP +ER+ + VI R + + K+
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR- 658

Query: 654 VDLFLQPTQDLTIDTQVSRTQYQFTLQ---ATSLDALSTWVPQLMEKLQQLP-QLSDVSS 709
D F+ P I + T + F L DAL+ QL+ Q P L V
Sbjct: 659 -DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 710 DWQDKGLVAYVNVDRDSASRLGISMADVDNALYNAFGQRLISTIYTQANQYRVVLEHNTE 769
+ + + VD++ A LG+S++D++ + A G ++ + ++ ++ + +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 770 NTPGLAALDTIRLTSSDGGVVPLSSIAKIEQRFAPLSINHLDQFPVTTISFNVPDNYSLG 829
+D + + S++G +VP S+ + + + P I S G
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 830 DAVQAIMDTEKTLNLPVDITTQFQGSTLAFQSALGSTVWLIVAAVVAMYIVLGILYESFI 889
DA A+M+ + LP I + G + + + L+ + V +++ L LYES+
Sbjct: 838 DA-MALMENLAS-KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 890 HPITILSTLPTAGVGALLALMIAGSELDVIAIIGIILLIGIVKKNAIMMIDFALAAEREQ 949
P++++ +P VG LLA + + DV ++G++ IG+ KNAI++++FA ++
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 950 GMSPREAIYQACLLRFRPILMTTLAALLGALPLMLSTGVGAELRRPLGIGMVGGLIVSQV 1009
G EA A +R RPILMT+LA +LG LPL +S G G+ + +GIG++GG++ + +
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1010 LTLFTTPVIYLLFDRL 1025
L +F PV +++ R
Sbjct: 1016 LAIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2351  ACRIFLAVINRP  916    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 916 bits (2369), Expect = 0.0
Identities = 288/1035 (27%), Positives = 504/1035 (48%), Gaps = 36/1035 (3%)

Query: 6 LFIYRPVATILLSVAITLCGILGFRMLPVAPLPQVDFPVIMVSASLPGASPETMASSVAT 65
FI RP+ +L++ + + G L LPVA P + P + VSA+ PGA +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERSLGRIAGVSEMTSSS-SLGSTRIILQFDFDRDINGAARDVQAAINAAQSLLPSGMP 124
+E+++ I + M+S+S S GS I L F D + A VQ + A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKANPSDAPIMILTLTSDT--YSQGELYDFASTQLAPTISQIDGVGDVDVGGSSL 182
+ S + +M+ SD +Q ++ D+ ++ + T+S+++GVGDV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVGLNPQALFNQGVSLDDVRTAISNANVRKPQG------ALEDGTHRWQIQTNDELK 236
A+R+ L+ L ++ DV + N + G AL I K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAAEYQPLIIHYN-NGGAVRLGDVATVTDSVQDVRNAGMTNAKPAILLMIRKLPEANIIQ 295
E+ + + N +G VRL DVA V ++ N KPA L I+ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDSIRARLPELQSTIPAAIDLQIAQDRSPTIRASLEEVEQTLIISVALVILVVFLFLRS 355
T +I+A+L ELQ P + + D +P ++ S+ EV +TL ++ LV LV++LFL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATIIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHL- 414
RAT+IP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGMKPLQAALQGTREVGFTVLSMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIGIS 474
E + P +A + ++ ++ +++ L AVF+P+ GG G + R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LLVSLTLTPMMCGWMLKASKPREQKRLRGFG----RMLVALQQGYGKSLKWVLNHTRLVG 530
+LV+L LTP +C +LK + GF Y S+ +L T
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 VVLLGTIALNIWLYISIPKTFFPEQDTGVLMGGIQADQSISFQ----AMRGKLQDFMKII 586
++ +A + L++ +P +F PE+D GV + IQ + + + ++K
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 587 RD-DPAVDNVTGFT-GGSRVNSGMMFITLKPRGERS---ETAQQIIDRLRKKLAKEPGAN 641
+ +V V GF+ G N+GM F++LKP ER+ +A+ +I R + +L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 642 LFLMAVQDIRVGGRQANASYQYTLLSDDLAALREWEPKIRKKLATL-----PELADVNSD 696
+ + I G ++ L D + + R +L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 697 QEDNGAEMNLIYDRDTMARLGIDVQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRY 756
++ A+ L D++ LG+ + N ++ A G ++ K+ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 757 TQDISALEKMFVINNEGKAIPLSYFAKWQPANAPLSVNHQGLSAASTISFNLPTGKSLSD 816
++K++V + G+ +P S F + + I G S D
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 817 ASAAIDRAMTQLGVPSTVRGSFAGTAQVFQETMNSQVILIIAAIATVYIVLGILYESYVH 876
A A ++ ++L P+ + + G + + + N L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 877 PLTILSTLPSAGVGALLALQLFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRHGN 936
P++++ +P VG LLA LFN + ++G++ IG+ KNAI++V+FA +
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 937 LTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQLL 996
EA A +R RPI+MT+LA + G LPL +S G GS + +GI ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 997 TLYTTPVVYLFFDRL 1011
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031



Score = 81.4 bits (201), Expect = 1e-17
Identities = 76/448 (16%), Positives = 161/448 (35%), Gaps = 26/448 (5%)

Query: 592 VDNVTGFTGGS-RVNSGMMFITLKPRGERSETAQQIIDRLRKKLAKEPGANLFLMAVQDI 650
+DN+ + S S + +T + + Q+ ++L+ P + Q I
Sbjct: 72 IDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE----VQQQGI 127

Query: 651 RVGGRQANASYQYTLLSDDLAALREW-----EPKIRKKLATLPELADVNSDQEDNGAE-- 703
V ++ +SD+ ++ ++ L+ L + DV GA+
Sbjct: 128 SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQL----FGAQYA 183

Query: 704 MNLIYDRDTMARLGID----VQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRYTQD 759
M + D D + + + + + + T P Q + R+
Sbjct: 184 MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNP 243

Query: 760 ISALEKMFVINNEGKAIPLSYFAK--WQPANAPLSVNHQGLSAASTISFNLPTGKSLSDA 817
+ +N++G + L A+ N + G AA +L D
Sbjct: 244 EEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL-DT 302

Query: 818 SAAIDRAMTQL--GVPSTVRGSFA-GTAQVFQETMNSQVILIIAAIATVYIVLGILYESY 874
+ AI + +L P ++ + T Q +++ V + AI V++V+ + ++
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 875 VHPLTILSTLPSAGVGALLALQLFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRH 934
L +P +G L F + + + G++L IG++ +AI++V+
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 935 GNLTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQ 994
L P+EA ++ ++ + +P+ GG + + ITIV + +S
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 995 LLTLYTTPVVYLFFDRLRLRFSRKPKQA 1022
L+ L TP + + + K
Sbjct: 483 LVALILTPALCATLLKPVSAEHHENKGG 510


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C2352  TCRTETB  126    8e-34    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 126 bits (317), Expect = 8e-34
Identities = 97/429 (22%), Positives = 189/429 (44%), Gaps = 23/429 (5%)

Query: 20 FMQSLDTTIVNTALPSMAQSLGESPLHMHMVIVSYVLTVAVMLPASGWLADKVGVRNIFF 79
F L+ ++N +LP +A + P + V +++LT ++ G L+D++G++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 TAIVLFTLGSLFCALSGTLNELL-LARALQGVGGAMMVPVGRLTVMKIVPREQYMAAMTF 138
I++ GS+ + + LL +AR +QG G A + + V + +P+E A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQVGPLLGPALGGLLVEYASWHWIFLINIPVGIIGAIATLM-LMPNYTMQTRRFDL 197
+ +G +GPA+GG++ Y HW +L+ IP+ I + LM L+ FD+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 SGFLLLAVGMAVLTLALDGSKGTGFSPLAIAGLVAVGVVALVLYLLHAQNNNRALFSLKL 257
G +L++VG+ L F+ + V V++ ++++ H + L
Sbjct: 202 KGIILMSVGIVFFML---------FTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FRTRTFSLGLAGSFAGRIGSGMLPFMTPVFLQIGLGFSPFHAG-LMMIPMVLGSMGMKRI 316
+ F +G+ M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 VVQVVNRFGYRRVLVATTLGLSLVTLLFMTTALL----GWYYVLPFVLFLQGMVNSTRFS 372
+V+R G VL +G++ +++ F+T + L W+ + V L G+ S +
Sbjct: 313 GGILVDRRGPLYVL---NIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL--SFTKT 367

Query: 373 SMNTLTLKDLPDNLASSGNSLLSMIMQLSMSIGVTIAGLLLGLFGSQHVSVDSGTTQTVF 432
++T+ L A +G SLL+ LS G+ I G LL + + Q+ +
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTY 427

Query: 433 MYTWLSMAF 441
+Y+ L + F
Sbjct: 428 LYSNLLLLF 436


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2353  BCTERIALGSPF  33     0.002    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 32.9 bits (75), Expect = 0.002
Identities = 27/95 (28%), Positives = 35/95 (36%), Gaps = 20/95 (21%)

Query: 164 RQTSWLIVALSTLLAALATF------PLARGLLAPVKRLVDGTHKLAAGDFTTRVAPTIE 217
RQ + L+ A L AL P L+A V+ V H LA + P
Sbjct: 75 RQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD---AMKCFPGSF 131

Query: 218 DEL-----------GRLAEDFNQLASTLEKNQQMR 241
+ L G L N+LA E+ QQMR
Sbjct: 132 ERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C2354  HTHFIS  76     6e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.6 bits (186), Expect = 6e-18
Identities = 28/136 (20%), Positives = 65/136 (47%), Gaps = 1/136 (0%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLISHGDQVLPYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + S+ + ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDIPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVARVKTILRRCK 129
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L K
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 130 PQRELQQQDAESPLII 145
+ + D++ + +
Sbjct: 124 RRPSKLEDDSQDGMPL 139


90  UTI89_C2408  UTI89_C2413  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias   Product
UTI89_C2408  1       16       2.111179  D-alanyl-D-alanine endopeptidase
UTI89_C2409  1       17       2.211761  hypothetical protein
UTI89_C2410  1       17       2.148600  hypothetical protein
UTI89_C2411  1       15       2.303354  acetoin dehydrogenase
UTI89_C2412  -1      15       1.370897  multidrug resistance outer membrane protein
UTI89_C2413  1       13       0.121095  tRNA-dihydrouridine synthase C
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C2408  BLACTAMASEA  44     4e-07    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 43.6 bits (103), Expect = 4e-07
Identities = 42/195 (21%), Positives = 77/195 (39%), Gaps = 18/195 (9%)

Query: 4 MPKFRVSLFSLALMLAVPFAPQAVAKTVAATTASQPEIASGSAMI-VDLNTNKVIYSNHP 62
M R+ + SL + +P A A + + S+ +++ MI +DL + + + +
Sbjct: 1 MRYIRLCIISL--LATLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRA 58

Query: 63 DLVRPIASISKLMTAMVVLDARLPLDEKLKVDISQTPEMKGVYSRV---RLNSEISRKDM 119
D P+ S K++ VL DE+L+ I + YS V L ++ ++
Sbjct: 59 DERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGEL 118

Query: 120 LLLALMSSENRAAASLAHHYPGGYKAFIKAMNAKAKSLGMNNTRFV--EPTGLS-----V 172
A+ S+N +AA+L GG + A + +G N TR E
Sbjct: 119 CAAAITMSDN-SAANLLLATVGG----PAGLTAFLRQIGDNVTRLDRWETELNEALPGDA 173

Query: 173 HNVSTARDLTKLLIA 187
+ +T + L
Sbjct: 174 RDTTTPASMAATLRK 188


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2410  BCTERIALGSPF  29     0.017    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 28.6 bits (64), Expect = 0.017
Identities = 5/33 (15%), Positives = 16/33 (48%), Gaps = 2/33 (6%)

Query: 164 WLHNLDQHLKHW-VWLILVVVL-VVGVRWWLKR 194
L + ++ + W++L ++ + R L++
Sbjct: 215 VLMGMSDAVRTFGPWMLLALLAGFMAFRVMLRQ 247


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2411  DHBDHDRGNASE  112    4e-32    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 112 bits (281), Expect = 4e-32
Identities = 70/253 (27%), Positives = 116/253 (45%), Gaps = 12/253 (4%)

Query: 3 QVAIITASDSGIGKECALLLAQQGFDIGITWHSDEEGAKDTARKVVSHGVRAEIVQLDLG 62
++A IT + GIG+ A LA QG I + E + + + AE D+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHI-AAVDYNPEKLEKVVSSLKAEARHAEAFPADVR 67

Query: 63 NLPEGAQALEKLIQRLWRIDVLVNNAGAMTKAPFLDMAFDEWRKIFTVDVDGAFLCSQIA 122
+ + ++ + + ID+LVN AG + ++ +EW F+V+ G F S+
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 123 ARQMVKQGQGGRIINITSVHEHTPLPDASAYTAAKHALGGLTKAMALELVRHKILVNAVA 182
++ M+ + + G I+ + S P +AY ++K A TK + LEL + I N V+
Sbjct: 128 SKYMMDR-RSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 183 PGAIATPMN-----GMDGGD--VKPDAEP---SIPLRRFGTTHEIASLVAWLCSEGANYT 232
PG+ T M +G + +K E IPL++ +IA V +L S A +
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 233 TGQSLIVDGGFML 245
T +L VDGG L
Sbjct: 247 TMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C2413  SHAPEPROTEIN  30     0.018    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 29.7 bits (67), Expect = 0.018
Identities = 32/127 (25%), Positives = 53/127 (41%), Gaps = 5/127 (3%)

Query: 144 GAKAMREAVPAHLPVSVKVRLGWDSGEK-KFEIADAVQQAGATELVVHGRTKEQGY-RAE 201
G EA+ ++ + +G + E+ K EI A E+ V GR +G R
Sbjct: 190 GGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGF 249

Query: 202 HIDWQAIGE-IRQRLNIPVIANGEIWDWQSAQQCMAISGCDAVMIGRGALNIPNLSRVVK 260
++ I E +++ L V A + + IS V+ G GAL + NL R++
Sbjct: 250 TLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGAL-LRNLDRLL- 307

Query: 261 YNEPRMP 267
E +P
Sbjct: 308 MEETGIP 314


91  UTI89_C2496  UTI89_C2502  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C2496  1       14       -1.710284  hypothetical protein
UTI89_C2497  -1      11       -1.946383  porin
UTI89_C2498  -1      11       -1.850729  phosphotransfer intermediate protein in
UTI89_C2499  -1      12       -1.574935  transcriptional regulator RcsB
UTI89_C2500  -1      13       -0.784657  hybrid sensory kinase in two-component
UTI89_C2501  -1      16       -0.064756  sensory histidine kinase AtoS
UTI89_C2502  -1      13       0.828260   acetoacetate metabolism regulatory protein AtoC
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C2496  INTIMIN  29     0.037    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 28.9 bits (64), Expect = 0.037
Identities = 31/174 (17%), Positives = 63/174 (36%), Gaps = 7/174 (4%)

Query: 70 NIFQDIFVVVATTQVFTFRLQVSQGRTQTEVELVLSNSFEVLCF---VRPTQGTYASCVV 126
N V++ QV QV + ++ E + + V+ A+ V
Sbjct: 540 NNVLLTITVLSNGQVVD---QVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPV 596

Query: 127 GLGVLSSQVDVVSVVFQTTSVGFSTVAVTDVQRAVLIVSTFGAGDRTTDTEAFVIISDRS 186
++S + + T G +TV + + ++VS A + VI D++
Sbjct: 597 SFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQT 656

Query: 187 ANAVAVLTQSTTTVVGHAFAAYATVFTLVLNSKVQAVNQTEEVGVTVGREAVTT 240
++ + TT V + A ++ K + NQ T+G+ + +T
Sbjct: 657 KASITEIKADKTTAVANGQDAITYTVKVMKGDKPVS-NQEVTFTTTLGKLSNST 709


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
UTI89_C2497  ECOLIPORIN  534    0.0      E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 534 bits (1377), Expect = 0.0
Identities = 254/383 (66%), Positives = 293/383 (76%), Gaps = 20/383 (5%)

Query: 1 MKVKVLSLLVPALLVAGAANAAEVYNKDGNKLDLYGKVDGLHYFSDDKSVDGDQTYMRLG 60
MK KVL+L++PALL AGAA+AAE+YNKDGNKLDLYGKVDGLHYFSDD S DGDQTYMR+G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 FKGETQVTDQLTGYGQWEYQIQGNSAENE-NNSWTRVAFAGLKFQDVGSFDYGRNYGVVY 119
FKGETQ+ DQLTGYGQWEY +Q N+ E E NSWTR+AFAGLKF D GSFDYGRNYGV+Y
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYGVLY 120

Query: 120 DVTSWTDVLPEFGGDTYG-SDNFMQQRGNGFATYRNTDFFGLVDGLNFAVQYQGKNGSVS 178
DV WTD+LPEFGGD+Y +DN+M R NG ATYRNTDFFGLVDGLNFA+QYQGKN S S
Sbjct: 121 DVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQS 180

Query: 179 ------GEGMTNNGRGALRQNGDGVGGSITYDY-EGFGIGGAISSSKRTDDQN-SPLYIG 230
G NNG NGDG G S TYD GF G A ++S RT++Q + I
Sbjct: 181 ADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGGTIA 240

Query: 231 NGDRAETYTGGLKYDANNIYLAAQYTQTYNATRVGSL------GWANKAQNFEAVAQYQF 284
GD+A+ +T GLKYDANNIYLA Y++T N T G G ANK QNFE AQYQF
Sbjct: 241 GGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQYQF 300

Query: 285 DFGLRPSLAYLQSKGKNLGR---GYDDEDILKYVDVGATYYFNKNMSTYVDYKINLLD-D 340
DFGLRP++++L SKGK+L DD+D++KY DVGATYYFNKN STYVDYKINLLD D
Sbjct: 301 DFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLLDDD 360

Query: 341 NQFTRDAGINTDNIVALGLVYQF 363
+ F +DAGI+TD+IVALG+VYQF
Sbjct: 361 DPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C2499  HTHFIS  49     7e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 48.7 bits (116), Expect = 7e-09
Identities = 26/145 (17%), Positives = 60/145 (41%), Gaps = 20/145 (13%)

Query: 16 MNNMNVIIADDHPIVLFGIRKSLEQIEWVNVVGEFEDSTALINNLPKLDAHVLITDLSMP 75
M +++ADD + + ++L + + + ++ L + D +++TD+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRI--TSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 76 GDKYGDGITLIKYIKRHFPSLSIIVLTMNNNPAILSAVLDLDIEGIVLKQGA------PT 129
+ L+ IK+ P L ++V++ N +A+ ++GA P
Sbjct: 59 D---ENAFDLLPRIKKARPDLPVLVMSAQNTFM--TAIKA-------SEKGAYDYLPKPF 106

Query: 130 DLPKALAALQKGKKFTPESVSRLLE 154
DL + + + + S+L +
Sbjct: 107 DLTELIGIIGRALAEPKRRPSKLED 131


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C2500  HTHFIS  79     2e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.5 bits (196), Expect = 2e-17
Identities = 29/106 (27%), Positives = 48/106 (45%)

Query: 827 ILVVDDHPINRSLLADQLGSLGYQCKTANDGVDALNVLNKNHIDIVLSDVNMPNMDGYRL 886
ILV DD R++L L GY + ++ + D+V++DV MP+ + + L
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 887 TQRIRQLGLTLPVIGVTANALAEEKQRCLESGMDSCLSKPVTLDVI 932
RI++ LPV+ ++A + E G L KP L +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTEL 111


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C2502  HTHFIS  562    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 562 bits (1449), Expect = 0.0
Identities = 181/484 (37%), Positives = 269/484 (55%), Gaps = 35/484 (7%)

Query: 1 MTAINRILIVDDEDNVRRMLSTAFALQGFETHCANNGRTALHLFADIHPDVVLMDIRMPE 60
MT IL+ DD+ +R +L+ A + G++ +N T A D+V+ D+ MP+
Sbjct: 1 MTGA-TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPD 59

Query: 61 MDGIKALKEMRSHETRTPVILMTAYAEVETAVEALRCGAFDYVIKPFDLDELNLIVQRAL 120
+ L ++ PV++M+A TA++A GA+DY+ KPFDL EL I+ RAL
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 121 QLQSMKKEIRHLHQALSTSWQWGH-ILTNSPAMMDICKDTAKIALSQASVLISGESGTGK 179
+ L Q G ++ S AM +I + A++ + +++I+GESGTGK
Sbjct: 120 AEP------KRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGK 173

Query: 180 ELIARAIHYNSRRAKGPFIKVNCAALPESLLESELFGHEKGAFTGAQTLRQGLFERANEG 239
EL+ARA+H +R GPF+ +N AA+P L+ESELFGHEKGAFTGAQT G FE+A G
Sbjct: 174 ELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGG 233

Query: 240 TLLLDEIGEMPLVLQAKLLRILQEREFERIGGHQTIKVDIRIIAATNRDLQAMVKEGTFR 299
TL LDEIG+MP+ Q +LLR+LQ+ E+ +GG I+ D+RI+AATN+DL+ + +G FR
Sbjct: 234 TLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFR 293

Query: 300 EDLFYRLNVIHLILPPLRDRREDISLLANHFLQKFSSENQRDIIDIDPMAMSLLTAWSWP 359
EDL+YRLNV+ L LPPLRDR EDI L HF+Q+ E + D A+ L+ A WP
Sbjct: 294 EDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLD-VKRFDQEALELMKAHPWP 352

Query: 360 GNIRELSNVIERAVVMNSGPIIFSEDLPPQIRQPV---------CNAGEAKTAPVGERN- 409
GN+REL N++ R + +I E + ++R + +G + E N
Sbjct: 353 GNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENM 412

Query: 410 ----------------LKEEIKRVEKRIIMEVLEQQEGNRTRTALMLGISRRALMYKLQE 453
+ +E +I+ L GN+ + A +LG++R L K++E
Sbjct: 413 RQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRE 472

Query: 454 YGID 457
G+
Sbjct: 473 LGVS 476


92  UTI89_C2698  UTI89_C2702  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C2698  0       35       -9.734070  multidrug resistance protein Y
UTI89_C2699  0       35       -9.361663  hypothetical protein
UTI89_C2700  -1      34       -8.455566  multidrug resistance protein K
UTI89_C2701  0       31       -7.674365  DNA-binding transcriptional activator EvgA
UTI89_C2702  0       31       -7.257509  hybrid sensory histidine kinase in two-component
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C2698  TCRTETB  122    2e-32    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 122 bits (308), Expect = 2e-32
Identities = 92/404 (22%), Positives = 167/404 (41%), Gaps = 17/404 (4%)

Query: 19 VTIALSLATFMQMLDSTISNVAIPTISGFLGASTDEGTWVITSFGVANAIAIPVTGRLAQ 78
+ I L + +F +L+ + NV++P I+ WV T+F + +I V G+L+
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 79 RIGELRLFLLSVTFFSLSSLMCSLS-TNLDVLIFFRVVQGLMAGPLIPLSQSLLLRNYPP 137
++G RL L + S++ + + +LI R +QG A L ++ R P
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 138 EKRTFALALWSMTVIIAPICGPILGGYICDNFSWGWIFLINVPMGIIVLTLCLTLLKGRE 197
E R A L V + GP +GG I W +L+ +PM I+ L L +E
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKE 192

Query: 198 TETSPVKMNLPGLTLLVLGVGGLQIMLDKGRDLDWFNSSTIILLTVVSVISLISLVIWES 257
++ G+ L+ +G+ + ML F +S I +VSV+S + V
Sbjct: 193 VRIKG-HFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIR 241

Query: 258 TSENPILDLSLFKSRNFTIGIVSITCAYLFYSGAIVLMPQLLQETMGYNAIWAGLAYAPI 317
+P +D L K+ F IG++ + +G + ++P ++++ + G
Sbjct: 242 KVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFP 301

Query: 318 GIMPLLIS-PLIGRYGNKIDMRLLVTFSFLMYAVCYYWRSVTFMPTIDFTGIILPQFFQG 376
G M ++I + G ++ ++ +V + S T F II+ G
Sbjct: 302 GTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGG 361

Query: 377 FAVACFFLPLTTISFSGLPDNKFANASSMSNFFRTLSGSVGTSL 420
+ ++TI S L + S+ NF LS G ++
Sbjct: 362 LSFTK--TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C2700  RTXTOXIND  78     6e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 78.3 bits (193), Expect = 6e-18
Identities = 62/413 (15%), Positives = 123/413 (29%), Gaps = 98/413 (23%)

Query: 13 RRKYFALLAVVLFIAFSGAYAYWSMELKDMISTDDAYVTGNADPISAQVSGSVTVVNHKD 72
RR ++ F+ + + + +G + I + V + K+
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKE 114

Query: 73 TNYVRQGDILVSLDKTDATIALNKA----------------------------------- 97
VR+GD+L+ L A K
Sbjct: 115 GESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEP 174

Query: 98 -----------------KNNLANIVRQTNKLYLQDKQYSAEVASARIQ---YQQSLEDYN 137
K + Q + L + AE + + Y+
Sbjct: 175 YFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEK 234

Query: 138 RRV----PLAKQGVISKEALEHTKDTLI----------SSKAALNAAIQAYKANKALVMN 183
R+ L + I+K A+ ++ + S + + I + K +
Sbjct: 235 SRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEE--YQLV 292

Query: 184 TPLNRQPQVIEAADATKE----------AWLALKRTDIKSPVTGYIAQRSVQ-VGETVSP 232
T L + + + T + + I++PV+ + Q V G V+
Sbjct: 293 TQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTT 352

Query: 233 GQSLMAVVPARQ-MWVNANFKETQLTDVRIGQSVNIISDLYGENVVFHGRVTGINMGTGN 291
++LM +VP + V A + + + +GQ+ I + F G +G
Sbjct: 353 AETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE------AFPYTRYGYLVGK-- 404

Query: 292 AFSLLPAQNATGNWIKIVQRVPVEVSLDPKELMEH----PLRIGLSMTATIDT 340
+ + +V V +S++ L PL G+++TA I T
Sbjct: 405 -VKNINLDAIEDQRLGLVFNVI--ISIEENCLSTGNKNIPLSSGMAVTAEIKT 454


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C2701  HTHFIS  49     3e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.1 bits (117), Expect = 3e-09
Identities = 22/148 (14%), Positives = 53/148 (35%), Gaps = 31/148 (20%)

Query: 4 IIIDDHPLAIAAIRNLLIKNDIEILAELTEGGSAVQRVETLKPDIVIIDVDIPGVNGIQV 63
++ DD + L + ++ + + + + D+V+ DV +P N +
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 64 LETLRKRQYSGIIIIVSAKNDHFYGKHCADAGANGFVSKKEGMNNIIAAIEAAKNGYCYF 123
L ++K + ++++SA+N + AI+A++ G +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNT------------------------FMTAIKASEKGAYDY 101

Query: 124 ---PFSLNRFVGSLTSDQQKLDSLSKQE 148
PF L + + L ++
Sbjct: 102 LPKPFDLTE---LIGIIGRALAEPKRRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C2702  HTHFIS  79     2e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.5 bits (196), Expect = 2e-17
Identities = 30/105 (28%), Positives = 51/105 (48%)

Query: 960 SILIADDHPTNRLLLKRQLNLLGYDVDEATDGVQALHKVSMQHYDLLITDVNMPNMDGFE 1019
+IL+ADD R +L + L+ GYDV ++ ++ DL++TDV MP+ + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 1020 LTRKLREQNSSLPIWGLTANAQANEREKGLNCGMNLCLFKPLTLD 1064
L ++++ LP+ ++A K G L KP L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


93  UTI89_C3041  UTI89_C3049  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C3041  -1      12       2.527438   transport protein
UTI89_C3042  -2      14       2.087541   hypothetical protein
UTI89_C3043  -2      13       2.105516   hypothetical protein
UTI89_C3044  -2      12       1.899756   transcriptional repressor MprA
UTI89_C3045  -2      13       1.747662   multidrug resistance protein A
UTI89_C3046  -2      11       1.469993   multidrug resistance protein B
UTI89_C3047  -1      11       0.566614   hypothetical protein
UTI89_C3048  -2      11       0.426695   hypothetical protein
UTI89_C3049  0       15       1.080112   S-ribosylhomocysteinase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C3041  TCRTETB  44     7e-07    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 44.1 bits (104), Expect = 7e-07
Identities = 32/165 (19%), Positives = 70/165 (42%), Gaps = 2/165 (1%)

Query: 34 LDTIARNFSLSASSAGFIVTAAQLGYAAGLLFLVPLGDMFERRRLIVSMTLLAAGGMLIT 93
L IA +F+ +S ++ TA L ++ G L D +RL++ ++ G +I
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 94 ASSQSLA-MMILGTALTGLFSVVAQILVPLA-ATLASPDKRGKVVGTIMSGLLLGILLAR 151
S ++I+ + G + LV + A + RGK G I S + +G +
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 152 TVAGLLANLGGWRTVFWVASMLMALMALALWRGLPQMKSETHLNY 196
+ G++A+ W + + + + + + +++ + H +
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C3044  PF05272  28     0.018    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.018
Identities = 23/94 (24%), Positives = 36/94 (38%), Gaps = 12/94 (12%)

Query: 23 PYQEILLTRLCMHMQSKLLENRNKMLKAQGINETLFMALITLESQENHSIQPSELSCALG 82
P QE+ L + + L R A+G + + T + ++L ALG
Sbjct: 756 PEQELRLVETGVQGRLWALLTREGAPAAEGAAQKGYSVNTTFVTI-------ADLVQALG 808

Query: 83 -----SSRTNATRIADELEKRGWIERRESDNDRR 111
SS ++ D L + GW RE+ RR
Sbjct: 809 ADPGKSSPMLEGQVRDWLNENGWEYLRETSGQRR 842


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C3045  RTXTOXIND  79     3e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 79.1 bits (195), Expect = 3e-18
Identities = 64/412 (15%), Positives = 120/412 (29%), Gaps = 97/412 (23%)

Query: 29 LLLTLLFIIIAVAIGIYWFLVLRHFEETDDA----YVAGNQIQIMSQVSGSVTKVWADNT 84
L FI+ + I VL E A +G +I + V ++
Sbjct: 57 PRLVAYFIMGFLVIAFILS-VLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEG 115

Query: 85 DFVKEGDVLVTLDPTDARQAFEKA------------------------------------ 108
+ V++GDVL+ L A K
Sbjct: 116 ESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPY 175

Query: 109 ----------------KTALASSVRQTHQLMINSKQLQANIEVQKIALAKA-------QS 145
K ++ Q +Q +N + +A + + +S
Sbjct: 176 FQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKS 235

Query: 146 DYNRRVPLGNANLIGREELQHARDAVTSAQAQLDVAIQQYNANQAMILGTKLEDQPAVQQ 205
+ L + I + + + A +L V Q ++ IL K E Q Q
Sbjct: 236 RLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQL 295

Query: 206 AATEVRN------------------AWLALERTRIVSPMTGYVSRRAVQ-PGAQISPTTP 246
E+ + + + I +P++ V + V G ++
Sbjct: 296 FKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAET 355

Query: 247 LMAVVPA-TNMWVDANFKETQIANMRIGQPVTITTDIYGDDVKY---TGKVVGLDMGTGS 302
LM +VP + V A + I + +GQ I + + +Y GKV + +
Sbjct: 356 LMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAF-PYTRYGYLVGKVKNI-----N 409

Query: 303 AFSLLPAQNATGNWIKVVQRLPVRIELDQKQLEQYPLRIGLSTLVSVNTTNR 354
++ G V+ + + PL G++ + T R
Sbjct: 410 LDAIE--DQRLGLVFNVIISIEENCLST--GNKNIPLSSGMAVTAEIKTGMR 457


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C3046  TCRTETB  132    1e-35    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 132 bits (333), Expect = 1e-35
Identities = 97/405 (23%), Positives = 169/405 (41%), Gaps = 23/405 (5%)

Query: 20 IALSLATFMQVLDSTIANVAIPTIAGNLGSSLSQGTWVITSFGVANAISIPLTGWLAKRV 79
I L + +F VL+ + NV++P IA + + WV T+F + +I + G L+ ++
Sbjct: 17 IWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQL 76

Query: 80 GEVKLFLWSTIAFAIASWACGVS-SSLNMLIFFRVIQGIVAGPLIPLSQSLLLNNYPPAK 138
G +L L+ I S V S ++LI R IQG A L ++ P
Sbjct: 77 GIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKEN 136

Query: 139 RSIALALWSMTVIVAPICGPILGGYISDNYHWGWIFFINVPIGVAVVLMTLQTLRGRETR 198
R A L V + GP +GG I+ HW + + +P+ + + L L +E R
Sbjct: 137 RGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKEVR 194

Query: 199 TERRRIDAVGLALLVIGIGSLQIMLDRGKELDWFSSQEIIILTVVAVVAICFLIVWELTD 258
+ D G+ L+ +GI + ML F++ I +V+V++ +
Sbjct: 195 I-KGHFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIRKV 243

Query: 259 DNPIVDLSLFKSRNFTIGCLCISLAYMLYFGAIVLLPQLLQEVYGYTATWAGLASAPVGI 318
+P VD L K+ F IG LC + + G + ++P ++++V+ + G G
Sbjct: 244 TDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGT 303

Query: 319 IPVILS-PIIGRFAHKLDMRRLVTFSFIMYAVCFYWRAYTFEPGMDFGASAWPQFIQGF- 376
+ VI+ I G + ++ +V F ++ S + I F
Sbjct: 304 MSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFL-----LETTSWFMTIIIVFV 358

Query: 377 --AVACFFMPLTTITLSGLPPERLAAASSLSNFTRTLAGSIGTSI 419
++ ++TI S L + A SL NFT L+ G +I
Sbjct: 359 LGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C3049  LUXSPROTEIN  292    e-105    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 292 bits (750), Expect = e-105
Identities = 131/170 (77%), Positives = 148/170 (87%)

Query: 2 PLLDSFTVDHTRMEAPAVRVAKTMNTPHGDAITVFDLRFCVPNKEVMPERGIHTLEHLFA 61
PLLDSFTVDHTRM APAVRVAKTM TP GD ITVFDLRF PNK+++ E+GIHTLEHL+A
Sbjct: 1 PLLDSFTVDHTRMNAPAVRVAKTMQTPKGDTITVFDLRFTAPNKDILSEKGIHTLEHLYA 60

Query: 62 GFMRNHLNGNGVEIIDISPMGCRTGFYMSLIGTPDEQRVADAWKAAMEDVLKVQDQNQIP 121
GFMRNHLNG+ VEIIDISPMGCRTGFYMSLIGTP EQ+VADAW AAMEDVLKV++QN+IP
Sbjct: 61 GFMRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKVENQNKIP 120

Query: 122 ELNVYQCGTYQMHSLQEAQDIARSILERDVRINSNEELALPKEKLQELHI 171
ELN YQCGT MHSL EA+ IA++ILE V +N N+ELALP+ L+EL I
Sbjct: 121 ELNEYQCGTAAMHSLDEAKQIAKNILEVGVAVNKNDELALPESMLRELRI 170
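
The Expect value for this hit is printed as `e-105`, with no leading mantissa, so it cannot be passed directly to a float parser. The sketch below assumes that such a string means 1e-105 and that a hit is retained when its E-value does not exceed the RPSBlast threshold of 0.05 listed under the input parameters. Both the comparison rule and the helper names `parse_evalue` and `passes_threshold` are assumptions made for illustration.

```python
def parse_evalue(text):
    """Convert an Expect string from the report into a float."""
    text = text.strip()
    if text.startswith("e"):
        # Assumption: a bare exponent such as "e-105" stands for 1e-105.
        text = "1" + text
    return float(text)


def passes_threshold(text, threshold=0.05):
    """Assumed rule: keep a hit when its E-value is at most the threshold."""
    return parse_evalue(text) <= threshold


# Expect strings copied from hits in this report.
for raw in ["e-105", "2e-32", "0.018", "0.0"]:
    print(raw, parse_evalue(raw), passes_threshold(raw))
```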


94  UTI89_C3135  UTI89_C3142  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C3135  0       12       -1.260716  metabolite transport protein YgcS
UTI89_C3136  0       12       -2.311305  hypothetical protein
UTI89_C3137  2       15       -3.795165  oxidoreductase YgcW
UTI89_C3138  1       15       -3.580525  hypothetical protein
UTI89_C3139  0       18       -3.868923  sugar kinase YgcE
UTI89_C3140  -1      25       -5.053827  hypothetical protein
UTI89_C3141  1       29       -5.720223  hypothetical protein
UTI89_C3142  1       24       -4.258474  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C3135  TCRTETB  36     2e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 36.4 bits (84), Expect = 2e-04
Identities = 53/338 (15%), Positives = 123/338 (36%), Gaps = 34/338 (10%)

Query: 93 LGSLVLGWISDHIGRQKIFTFSFMLITLASFLQFFATTP-EHLIGLRILIGIGLGGDYSV 151
+G+ V G +SD +G +++ F ++ S + F + LI R + G G ++
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPAL 123

Query: 152 GHTLLAEFSPRRHRGVLLGAFSVVWT----VGYVLASIAGHHFISESPEAWRWLLASAAL 207
++A + P+ +RG G + VG + + H+ W +LL +
Sbjct: 124 VMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI------HWSYLLLIPMI 177

Query: 208 PALLITLLRWGTPESPRWLLRQGRFAEAHAIVHRYFGPHVLLGDEVATATHKHIKTLF-- 265
+ + L + R +G F I+ +L + + + L
Sbjct: 178 TIITVPFLMKLLKKEVR---IKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFL 234

Query: 266 -SSRYWRRTA--------FNSVFFVCLVIPWFVIYT----WLPTIAQTIGLEDALTASLM 312
++ R+ ++ F+ V+ +I+ ++ + + L+ + +
Sbjct: 235 IFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEI 294

Query: 313 LNALLIVGALLGLVLTHLLAHRRFLLGSFLLLTATLVVMACLPSGSSLTLLLFVLFSTTI 372
+ ++ G + ++ ++ G +L + ++ S F+L +T+
Sbjct: 295 GSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLS-----VSFLTASFLLETTSW 349

Query: 373 SAVSNLVGILPAESFPTDIRSLGVGFATAMSRLGAAVS 410
+V +L SF + S V + GA +S
Sbjct: 350 FMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMS 387


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3137  DHBDHDRGNASE  109    1e-30    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 109 bits (274), Expect = 1e-30
Identities = 74/257 (28%), Positives = 117/257 (45%), Gaps = 11/257 (4%)

Query: 36 MDFFSLKGKTAIVTGGNSGLGQAFAMALAKAGANVFIPSFVKDNGETKEMIEK-QGVEVD 94
M+ ++GK A +TG G+G+A A LA GA++ + + E K + +
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 95 FMQVDITAEGAPQKIIAACCERFGTVDILVNNAGICKLNKVLDFGRADWDPMIDVNLTAA 154
D+ A +I A G +DILVN AG+ + + +W+ VN T
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 155 FELSYEAAKIMIPQKSGKIINICSLFSYLGGQWSPAYSATKHALAGFTKAYCDELGQYNI 214
F S +K M+ ++SG I+ + S + + AY+++K A FTK EL +YNI
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 215 QVNGISPGYYATDI--TLATRSNPETNQRVLDY-------IPANRWGDTQDLMGAAVFLA 265
+ N +SPG TD+ +L N Q + IP + D+ A +FL
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAE-QVIKGSLETFKTGIPLKKLAKPSDIADAVLFLV 239

Query: 266 SPASNYVNGHLLVVDGG 282
S + ++ H L VDGG
Sbjct: 240 SGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C3138  TCRTETA  31     0.006    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 31.3 bits (71), Expect = 0.006
Identities = 20/76 (26%), Positives = 34/76 (44%), Gaps = 1/76 (1%)

Query: 41 GFSNTEIGLIMSTFGIAAIIFYA-PSGVIADKFSHRKMITSAMIITGLLGLIMATYPPLW 99
+ T IG+ ++ FGI + A +G +A + R+ + MI G +++A W
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 100 VMLCIQVAFAITTILM 115
+ I V A I M
Sbjct: 302 MAFPIMVLLASGGIGM 317



Score = 30.6 bits (69), Expect = 0.012
Identities = 22/103 (21%), Positives = 45/103 (43%), Gaps = 8/103 (7%)

Query: 48 GLIMSTFGIAAIIFYAPSGVIADKFSHRKMITSAMIITGLLGLIMATYPPLWVMLCIQVA 107
G++++ + + G ++D+F R ++ ++ + IMAT P LWV+ ++
Sbjct: 46 GILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIV 105

Query: 108 FAITTILMLWSVSIKAASLLGD---HSEQGKIMGWMEGLRGVG 147
IT + A + + D E+ + G+M G G
Sbjct: 106 AGITG-----ATGAVAGAYIADITDGDERARHFGFMSACFGFG 143


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3142  56KDTSANTIGN  30     0.005    Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.

Length = 533

Score = 30.3 bits (68), Expect = 0.005
Identities = 19/76 (25%), Positives = 31/76 (40%), Gaps = 12/76 (15%)

Query: 30 NASWSEVLNQYQRRTDLIPNLVASIKGYSSHEQEVLEAVTLARSQANRASSDLQKTPGDE 89
+AS ++ ++ Q D + L S GY + + N+ + P +
Sbjct: 294 SASIEQIQSKIQELGDTLEELRDSFDGY------------INNAFVNQIHLNFVMPPQAQ 341

Query: 90 QKLQAWQQAQAQAQAQ 105
Q+ QQ QAQA AQ
Sbjct: 342 QQQGQGQQQQAQATAQ 357


95  UTI89_C3224  UTI89_C3231  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C3224  0       14       1.224803   hypothetical protein
UTI89_C3225  -2      11       1.479795   hypothetical protein
UTI89_C3226  -3      10       1.095103   hypothetical protein
UTI89_C3227  -3      10       0.799650   hypothetical protein
UTI89_C3229  -3      8        0.807625   thymidylate synthase
UTI89_C3230  -2      9        0.795806   prolipoprotein diacylglyceryl transferase
UTI89_C3231  -2      12       0.094326   fused phosphoenolpyruvate-protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3224  BCTERIALGSPH  29     0.002    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 29.1 bits (65), Expect = 0.002
Identities = 27/114 (23%), Positives = 43/114 (37%), Gaps = 29/114 (25%)

Query: 8 QQGFSLPEVMLAMVLMVMIVTA----------------LSGFQRTLMNSLASRNQYQQLW 51
Q+GF+L E+ML ++LM + L+ F+ L Q Q +
Sbjct: 3 QRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQFF 62

Query: 52 -----RHGWQ--QTQLRAISPPA----NWQVNRMQTSQAGCVSISVTLVSPGGR 94
WQ + R + PA W R +AG V+ S ++ GG+
Sbjct: 63 GVSVHPDRWQFLVLEARDGADPAPADDGWSGYRWLPLRAGRVATSGSI--AGGK 114


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3225  BCTERIALGSPG  27     0.016    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 27.2 bits (60), Expect = 0.016
Identities = 13/83 (15%), Positives = 38/83 (45%), Gaps = 7/83 (8%)

Query: 1 MNREKGVSSLALVLMLLILGSLLL----QGMSQQDRSFASRVSMESQSLRRQAIVQSALE 56
++++G + L ++++++I+G L M ++++ + + +L A+ L+
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVAL-ENALDMYKLD 62

Query: 57 WGKMHSWQTQPAVQCLLYAATGA 79
+ T ++ L+ A T
Sbjct: 63 NHHYPT--TNQGLESLVEAPTLP 83


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3226  PilS_PF08805  27     0.030    PilS N terminal
		>PilS_PF08805#PilS N terminal

Length = 185

Score = 27.2 bits (60), Expect = 0.030
Identities = 12/50 (24%), Positives = 25/50 (50%)

Query: 1 MSARRNRRMPVKEQGFSLLEVLIAMAISSVLLLGAARFLPALQRESLTNT 50
S+ RR +++G +L+EVL+ + + VL A + +Q ++
Sbjct: 13 FSSLSARRKKEQDKGATLMEVLLVVGVIVVLAASAYKLYSMVQSNIQSSN 62


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3227  BCTERIALGSPG  29     0.003    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 29.5 bits (66), Expect = 0.003
Identities = 9/24 (37%), Positives = 18/24 (75%)

Query: 1 MKTQRGYTLIETLVAMLILVMLSA 24
QRG+TL+E +V ++I+ +L++
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLAS 27


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3231  PHPHTRNFRASE  611    0.0      Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 611 bits (1577), Expect = 0.0
Identities = 189/571 (33%), Positives = 314/571 (54%), Gaps = 7/571 (1%)

Query: 168 QTRIRALPAAPGVAIAEGWQDATLPLMEQVYQASTLDPALERERLTGALEEAANEFRRYS 227
+I + A+ GVAIA+ + + + + S D + E E+LT ALE++ E R
Sbjct: 2 HHKITGIAASSGVAIAKAFIHLEPNV--DIEKTSITDVSTEIEKLTAALEKSKEELRAIK 59

Query: 228 KRFAAGAQKETAAIFDLYSHLLSDTRLRRELFAEVDKGSV-AEWAVKTVIEKFAEQFAAL 286
+ A + A IF + +L D L + +++ + AE+A+K V + F F ++
Sbjct: 60 DQTEASMGADKAEIFAAHLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESM 119

Query: 287 SDNYLKERAGDLRALGQRLLFHLDDANQGPNAW-PERFILVADELSATTLAELPQDRLVG 345
+ Y+KERA D+R + +R+L HL G A E +++A++L+ + A+L + + G
Sbjct: 120 DNEYMKERAADIRDVSKRVLGHLIGVETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKG 179

Query: 346 VVVRDGAANSHAAIMVRALGIPTVMGA-DIQPSVLHRRTLIVDGYRGELLVDPEPVLLQE 404
G SH+AIM R+L IP V+G ++ + H +IVDG G ++V+P ++
Sbjct: 180 FATDIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKA 239

Query: 405 YQRLISEEIELSRLAEDDVNLPAQLKSGERIKVMLNAGLSPEHEEKLGSRIDGIGLYRTE 464
Y+ + + + V P+ K G +++ N G + + L + +GIGLYRTE
Sbjct: 240 YEEKRAAFEKQKQEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTE 299

Query: 465 IPFMLQSGFPSEEEQVAQYQGMLQMFNDKPVTLRTLDVGADKQLPYMPISEE-NPCLGWR 523
+M + P+EEEQ Y+ ++Q + KPV +RTLD+G DK+L Y+ + +E NP LG+R
Sbjct: 300 FLYMDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYLQLPKELNPFLGFR 359

Query: 524 GIRITLDQPEIFLIQVRAMLRANAATGNLNILLPMVTSLDEVDEARRLIERAGREVEEMI 583
IR+ L++ +IF Q+RA+LRA + GNL ++ PM+ +L+E+ +A+ +++ ++
Sbjct: 360 AIRLCLEKQDIFRTQLRALLRA-STYGNLKVMFPMIATLEELRQAKAIMQEEKDKLLSEG 418

Query: 584 GYEIPKPRIGIMLEVPSMVFMLPHLAKRVDFISVGTNDLTQYILAVDRNNTRVANIYDSL 643
+GIM+E+PS AK VDF S+GTNDL QY +A DR N RV+ +Y
Sbjct: 419 VDVSDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPY 478

Query: 644 HPAMLRALAMIAREAEIHGIDLRLCGEMAGDPMCVAILIGLGYRHLSMNGRSVARVKYLL 703
HPA+LR + M+ + A G + +CGEMAGD + + +L+GLG SM+ S+ + L
Sbjct: 479 HPAILRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQL 538

Query: 704 RRIDFAEAENLAQRSLEAQLATEVRHQVAAF 734
++ E + AQ++L A EV V
Sbjct: 539 LKLSKEELKPFAQKALMLDTAEEVEQLVKKT 569


96  UTI89_C3375  UTI89_C3389  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C3375  0       16       1.936157   polysialic acid transport protein KpsM
UTI89_C3376  1       19       4.276705   general secretion pathway protein YghD
UTI89_C3377  0       16       4.281829   GspL-like protein
UTI89_C3378  0       20       4.772751   hypothetical protein
UTI89_C3379  -1      19       5.232861   type II secretion protein GspJ
UTI89_C3380  -2      19       4.533228   type II secretion protein GspI
UTI89_C3381  -2      16       4.076501   type II secretion protein GspH
UTI89_C3382  -3      16       3.367246   type II secretion protein GspG
UTI89_C3383  -2      15       3.143292   type II secretion protein GspF
UTI89_C3384  -1      12       1.312582   type II secretion protein GspE
UTI89_C3385  -2      12       -0.183271  type II secretion protein GspD
UTI89_C3386  -2      12       -0.566231  type II secretion protein GspC
UTI89_C3387  -3      12       -0.136672  lipoprotein
UTI89_C3388  -3      13       0.243303   prepilin peptidase A
UTI89_C3389  -2      13       0.955600   lipoprotein AcfD-like
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3375  ABC2TRNSPORT  33     6e-04    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 33.4 bits (76), Expect = 6e-04
Identities = 29/125 (23%), Positives = 54/125 (43%), Gaps = 10/125 (8%)

Query: 137 ITNFLQLVLTWSLLIILS--CGVGLIF----MVVGKTFPEMQKVL---PILLKPLYFISC 187
+ L SLL L GL F MVV P + +++ P+ F+S
Sbjct: 135 VAAALGYTQWLSLLYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSG 194

Query: 188 IMFPLHSIPKQYWSYLLWNPLVHVVELSREAVMPGYISE-GVSLNYLAMFTLVTLFIGLA 246
+FP+ +P + + + PL H ++L R ++ + + + L ++ ++ F+ A
Sbjct: 195 AVFPVDQLPIVFQTAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTA 254

Query: 247 LYRTR 251
L R R
Sbjct: 255 LLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3380  BCTERIALGSPH  32     3e-04    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 32.2 bits (73), Expect = 3e-04
Identities = 13/24 (54%), Positives = 18/24 (75%)

Query: 2 KRGFTLLEVMLALAIFALAAMAVL 25
+RGFTLLE+ML L + ++A VL
Sbjct: 3 QRGFTLLEMMLILLLMGVSAGMVL 26


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3381  BCTERIALGSPH  77     3e-20    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 77.3 bits (190), Expect = 3e-20
Identities = 42/196 (21%), Positives = 71/196 (36%), Gaps = 41/196 (20%)

Query: 1 MPERGFTLLEIMLVIFLIGLASAGVVQTFATASEPPAKKAAQDFLTRFAQFKDRAVIEGQ 60
M +RGFTLLE+ML++ L+G+++ V+ F + + A + F + + R + GQ
Sbjct: 1 MRQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQ 60

Query: 61 TLGVLIDPPGYQFMQRRHGQWLPVSATRLSAQVTVPKQVQMLLQPGSDIWQKEYALELQR 120
GV + P +QF+ + P D W L L+
Sbjct: 61 FFGVSVHPDRWQFLVLEARDGADPA-------------------PADDGWSGYRWLPLRA 101

Query: 121 RRL----TLHDIELEL-----QKEAKKKTPQIRFSPFEPATPFTLRFYSAAQNACWAVKL 171
R+ ++ +L L + P + P TPF L L
Sbjct: 102 GRVATSGSIAGGKLNLAFAQGEAWTPGDNPDVLIFPGGEMTPFRLT-------------L 148

Query: 172 AHDGALSLNQCDERMP 187
++ N E +P
Sbjct: 149 GEAPGIAFNARGESLP 164


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3382  BCTERIALGSPG  218    2e-76    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 218 bits (556), Expect = 2e-76
Identities = 91/146 (62%), Positives = 109/146 (74%), Gaps = 3/146 (2%)

Query: 6 RTQKPRAGFTLLEVMVVIVILGVLASLVVPNLLGNKEKADRQKAISDIVALENALDMYRL 65
R + GFTLLE+MVVIVI+GVLASLVVPNL+GNKEKAD+QKA+SDIVALENALDMY+L
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKL 61

Query: 66 DNGRYPTTEQGLEALIQQPANMADARNYRTGGYIKRLPKDPWGNDYQYLSPGEKGLFDVY 125
DN YPTT QGLE+L++ P A NY GYIKRLP DPWGNDY ++PGE G +D+
Sbjct: 62 DNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLL 121

Query: 126 TLGADGQENGEGAGADIGNWNLQEFQ 151
+ G DG+ E DI NW L + +
Sbjct: 122 SAGPDGEMGTED---DITNWGLSKKK 144


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3383  BCTERIALGSPF  453    e-161    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 453 bits (1167), Expect = e-161
Identities = 226/406 (55%), Positives = 301/406 (74%), Gaps = 1/406 (0%)

Query: 1 MALFYYQALERNGRKTKGMIEADSARHARQLLRGKDLIPVHI-EARMNASAGGLLQRRRH 59
MA ++YQAL+ G+K +G EADSAR ARQLLR + L+P+ + E R + G
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 60 AHRRVATADLALFTRQLATLVQAAMPLETCLQAVSEQSEKLHVKSLGMALRSRIQEGYTL 119
R++T+DLAL TRQLATLV A+MPLE L AV++QSEK H+ L A+RS++ EG++L
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 120 SDSLREHPRVFDSLFCSMVAAGEKSGHLDVVLNRLADYTEQRQRLKSRLLQAMLYPLVLL 179
+D+++ P F+ L+C+MVAAGE SGHLD VLNRLADYTEQRQ+++SR+ QAM+YP VL
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 180 VVATGVVTILLTAVVPKIIEQFDHLGHALPASTRMLIAMSDALQASGVYWLAGLLGLLVL 239
VVA VV+ILL+ VVPK++EQF H+ ALP STR+L+ MSDA++ G + L LL +
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 240 GQRLLKNPAMRLRWDKTLLRLPVTGRVARGLNTARFSRTLSILTASSVPLLEGIQTAAAV 299
+ +L+ R+ + + LL LP+ GR+ARGLNTAR++RTLSIL AS+VPLL+ ++ + V
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 300 SANRYVEQQLLLAADRVREGSSLRAALADLRLFPPMMLYMIASGEQSGELETMLEQAAVN 359
+N Y +L LA D VREG SL AL LFPPMM +MIASGE+SGEL++MLE+AA N
Sbjct: 301 MSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADN 360

Query: 360 QEREFDTQVGLALGLFEPALVVMMAGVVLFIVIAILEPMLQLNNMV 405
Q+REF +Q+ LALGLFEP LVV MA VVLFIV+AIL+P+LQLN ++
Sbjct: 361 QDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3385  BCTERIALGSPD  576    0.0      Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 576 bits (1486), Expect = 0.0
Identities = 296/668 (44%), Positives = 430/668 (64%), Gaps = 34/668 (5%)

Query: 24 LLPLMLAAALCSSPVWAEEATFTANFKDTDLKSFIETVGANLNKTIIMGPGVQGKVSIRT 83
L L++ AAL P AEE F+A+FK TD++ FI TV NLNKT+I+ P V+G +++R+
Sbjct: 11 SLTLLIFAALLFRPAAAEE--FSASFKGTDIQEFINTVSKNLNKTVIIDPSVRGTITVRS 68

Query: 84 MTPLNERQYYQLFLNLLEAQGYAVVPMENDVLKVVKSSAAKVEPLPLVGEGSDNYAGDEM 143
LNE QYYQ FL++L+ G+AV+ M N VLKVV+S AK +P+ + + GDE+
Sbjct: 69 YDMLNEEQYYQFFLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAPG-IGDEV 127

Query: 144 VTKVVPVRNVSVRELAPILRQMIDSAGSGNVVNYDPSNVIMLTGRASVVERLTEVIQRVD 203
VT+VVP+ NV+ R+LAP+LRQ+ D+AG G+VV+Y+PSNV+++TGRA+V++RL +++RVD
Sbjct: 128 VTRVVPLTNVAARDLAPLLRQLNDNAGVGSVVHYEPSNVLLMTGRAAVIKRLLTIVERVD 187

Query: 204 HAGNRTEEVIPLDNASASEIARVLESLTKNSGENQ-PATLKSQIVADERTNSVIVSGDPA 262
+AG+R+ +PL ASA+++ +++ L K++ ++ P ++ + +VADERTN+V+VSG+P
Sbjct: 188 NAGDRSVVTVPLSWASAADVVKLVTELNKDTSKSALPGSMVANVVADERTNAVLVSGEPN 247

Query: 263 TRDKMRRLIRRLDSEMERSGNSQVFYLKYSKAEDLVDVLKQVSGTLTAAKEEAEGTVGSG 322
+R ++ +I++LD + GN++V YLKY+KA DLV+VL +S T+ + K+ A+ +
Sbjct: 248 SRQRIIAMIKQLDRQQATQGNTKVIYLKYAKASDLVEVLTGISSTMQSEKQAAKPV-AAL 306

Query: 323 REVVSIAASKHSNALIVTAPQDIMQSLQSVIEQLDIRRAQVHVEALIVEVAEGSNINFGV 382
+ + I A +NALIVTA D+M L+ VI QLDIRR QV VEA+I EV + +N G+
Sbjct: 307 DKNIIIKAHGQTNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLNLGI 366

Query: 383 QWGSKDAGLMQFANGTQIPIGTLGAAISAAKPQKGSTVISENGATTINPDTNGDLST-LA 441
QW +K+AG+ QF N + +PI T A + +G +S+ LA
Sbjct: 367 QWANKNAGMTQFTN-SGLPISTAIAG-------------------ANQYNKDGTVSSSLA 406

Query: 442 QLLSGFSGTAVGVVKGDWMALVQAVKNDSSSNVLSTPSITTLDNQEAFFMVGQDVPVLTG 501
LS F+G A G +G+W L+ A+ + + +++L+TPSI TLDN EA F VGQ+VPVLTG
Sbjct: 407 SALSSFNGIAAGFYQGNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTG 466

Query: 502 STVGSNNSNPFNTVERKKVGIMLKVTPQINEGNAVQMVIEQEVSKVEGQTS-----LDVV 556
S S N FNTVERK VGI LKV PQINEG++V + IEQEVS V S L
Sbjct: 467 SQTTSG-DNIFNTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSSVADAASSTSSDLGAT 525

Query: 557 FGERKLKTTVLANDGELIVLGGLMDDQAGESVAKVPLLGDIPVIGNLFKSTADKKEKRNL 616
F R + VL GE +V+GGL+D ++ KVPLLGDIPVIG LF+ST+ K KRNL
Sbjct: 526 FNTRTVNNAVLVGSGETVVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNL 585

Query: 617 MVFIRPTILRDGMAADGVSQRKYNYMRAEQIYR--DEQGLSLMPHTAQPILPAQNQALPP 674
M+FIRPT++RD S +Y Q + E +++ I P Q+ A
Sbjct: 586 MLFIRPTVIRDRDEYRQASSGQYTAFNDAQSKQRGKENNDAMLNQDLLEIYPRQDTAAFR 645

Query: 675 EVRAFLNA 682
+V A ++A
Sbjct: 646 QVSAAIDA 653


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3386  BCTERIALGSPC  116    3e-33    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 116 bits (292), Expect = 3e-33
Identities = 70/282 (24%), Positives = 115/282 (40%), Gaps = 38/282 (13%)

Query: 1 MFWLMLLIISAKMAYSLWRYFSFSAEYTAVSSSVN-KPLRADAKPFDKNDVQLVSQQNWF 59
+F+L++L+ ++A WR A SSV P +A +P ND L F
Sbjct: 18 LFYLLMLLFCQQLAMIFWR---IGLPDNAPVSSVQITPAQARQQPVTLNDFTL------F 68

Query: 60 GKY-QPVAAPV-KQPESAPVAETRLNVVLRGIAFG---ARPGVVIEEGGKQQVYLQGERL 114
G + A + + + + LN+ L G+ G +R +I + +Q E +
Sbjct: 69 GVSPEKNKAGALDASQMSNLPPSTLNLSLTGVMAGDDDSRSIAIISKDNEQFSRGVNEEV 128

Query: 115 GSHNAVIEEINRDHVMLRYQGKMERLSLAEEERPPVAVTSKKAASDEAKQAVAEPVVSAP 174
+NA I I D V+L+YQG+ E L L +E + SD A
Sbjct: 129 PGYNAKIVSIRPDRVVLQYQGRYEVLGLYSQE---------DSGSDGVPGAQVN------ 173

Query: 175 VEIPAAVRQALAKDPQKIFNYIQLTPVRKEG-IVGYAVKPGADRSLFDASGFREGDIAIA 233
Q + + +Y+ +P+ + + GY + PG F G ++ D+A+A
Sbjct: 174 -------EQLQQRASTTMSDYVSFSPIMNDNKLQGYRLNPGPKSDSFYRVGLQDNDMAVA 226

Query: 234 LNQQDFTDPRAMIALMRQLPSMDSIQLTVLRKGARYDISIAL 275
LN D D M ++ + + LTV R G R DI +
Sbjct: 227 LNGLDLRDAEQAKKAMERMADVHNFTLTVERDGQRQDIYMEF 268


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3388  PREPILNPTASE  277    1e-95    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 277 bits (710), Expect = 1e-95
Identities = 111/274 (40%), Positives = 150/274 (54%), Gaps = 12/274 (4%)

Query: 1 MLFDVFQQYPAAMPILATVGGLIIGSFLNVVIWRYPIML-RQQMAEFHGETPSTQSKI-- 57
+L ++ P L + L+IGSFLNVVI R PIML R+ AE+ +
Sbjct: 3 LLLELAHGLPWLYFSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDE 62

Query: 58 ---SLALPRSHCPHCQQTIRVRDNIPLLSWLMLKGRCRDCQAKISKRYPLVELLTALAFL 114
+L +PRS CPHC I +NIPLLSWL L+GRCR CQA IS RYPLVELLTAL +
Sbjct: 63 PPYNLMVPRSCCPHCNHPITALENIPLLSWLWLRGRCRGCQAPISARYPLVELLTALLSV 122

Query: 115 LASLVWPESGWGLAVMILSAWLIAASIIDLDNQWLPDVFTQGVLWTGLIAAWAQQSPLTL 174
++ LA ++L+ L+A + IDLD LPD T +LW GL+ ++L
Sbjct: 123 AVAMTLAPGWGTLAALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNL-LGGFVSL 181

Query: 175 QDAVTGVLVGFITFYSLRWIAGIVLRKEALGMGDVLLFAALGGWVGPLSLPNVALIASCC 234
DAV G + G++ +SL W ++ KE +G GD L AALG W+G +LP V L++S
Sbjct: 182 GDAVIGAMAGYLVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLV 241

Query: 235 GLIYAVI-----TKRGSTTLPFGPCLSLGGIATL 263
G + S +PFGP L++ G L
Sbjct: 242 GAFMGIGLILLRNHHQSKPIPFGPYLAIAGWIAL 275


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C3389  PF03544  48     1e-07    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 48.0 bits (114), Expect = 1e-07
Identities = 29/107 (27%), Positives = 41/107 (38%), Gaps = 8/107 (7%)

Query: 46 PEVKPDPTPTPEPTPEPTPDPEPTPDPTPD-PEPTPEPEPEPVPTKTGYLTLGGSQRVTG 104
V+P P P EP PEP P PEP + +P P+P+P+P P K ++
Sbjct: 64 QAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKK-------VEQPKR 116

Query: 105 ATCNGESSDGFTFTPGNTVSCVVGSTTIATFNTQSEAARSLRAVDKV 151
ES F + T AT + A RA+ +
Sbjct: 117 DVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRN 163



Score = 41.1 bits (96), Expect = 2e-05
Identities = 14/87 (16%), Positives = 21/87 (24%), Gaps = 1/87 (1%)

Query: 35 TPSVDSGSGTLPEVKPDPTPTPEPTPEPTPDPEPTPDPTPDPEPTPEPEPEPVPT-KTGY 93
P PE P+P E P+ + +PV +
Sbjct: 69 PPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASP 128

Query: 94 LTLGGSQRVTGATCNGESSDGFTFTPG 120
R T +T +S T
Sbjct: 129 FENTAPARPTSSTATAATSKPVTSVAS 155



Score = 38.8 bits (90), Expect = 9e-05
Identities = 16/58 (27%), Positives = 22/58 (37%), Gaps = 1/58 (1%)

Query: 31 SSSDTPSVDSGSGTLPEVKPDPTPTPEPTPEPTPDPEPTPDPTPDPEPTPEPEPEPVP 88
S + + + + P P P PEP +P P+PEP PEP E
Sbjct: 36 SVHQVIELPAPAQPISVTMVAPADLEPP-QAVQPPPEPVVEPEPEPEPIPEPPKEAPV 92



Score = 36.9 bits (85), Expect = 4e-04
Identities = 16/46 (34%), Positives = 19/46 (41%)

Query: 46 PEVKPDPTPTPEPTPEPTPDPEPTPDPTPDPEPTPEPEPEPVPTKT 91
P T EP +P P+P +PEP PEP PEP
Sbjct: 46 PAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAP 91



Score = 36.1 bits (83), Expect = 6e-04
Identities = 17/40 (42%), Positives = 17/40 (42%)

Query: 50 PDPTPTPEPTPEPTPDPEPTPDPTPDPEPTPEPEPEPVPT 89
P P T D EP P PEP EPEPEP P
Sbjct: 44 PAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPI 83


97  UTI89_C3665  UTI89_C3672  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C3665  -2      15       -1.056925  serine endoprotease
UTI89_C3666  -2      13       -1.009472  serine endoprotease
UTI89_C3667  0       14       -0.884835  malate dehydrogenase
UTI89_C3668  -1      13       -1.135030  arginine repressor ArgR
UTI89_C3669  -2      13       -0.535067  hypothetical protein
UTI89_C3670  -1      13       0.653126   hypothetical protein
UTI89_C3671  -3      12       1.464995   p-hydroxybenzoic acid efflux subunit AaeB
UTI89_C3672  -1      11       1.656954   p-hydroxybenzoic acid efflux subunit AaeA
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3665  V8PROTEASE    73     3e-16    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 73.1 bits (179), Expect = 3e-16
Identities = 30/184 (16%), Positives = 63/184 (34%), Gaps = 38/184 (20%)

Query: 104 GLGSGVIINANKGYVLTNNHVINQAQKISIQL------------NDGREFDAKLIGSDDQ 151
+ SGV++ + +LTN HV++ L +G ++ +
Sbjct: 102 FIASGVVVGKDT--LLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGE 159

Query: 152 SDIALLQIQN-------PSKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISALG 204
D+A+++ + ++++ + +V G P ++ +
Sbjct: 160 GDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKP-------VATMW 212

Query: 205 RSGLNLEGLEN-FIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSN 263
S + L+ +Q D S GNSG + N E+IGI+ G+
Sbjct: 213 ESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHWG---------GVPNEFNGA 263

Query: 264 MART 267
+
Sbjct: 264 VFIN 267


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3666  V8PROTEASE    53     8e-10    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 52.7 bits (126), Expect = 8e-10
Identities = 31/160 (19%), Positives = 59/160 (36%), Gaps = 26/160 (16%)

Query: 77 RTLGSGVIMDQRGYIITNKHVINDADQIIVALQ------------DGRVFEALLVGSDSL 124
+ SGV++ + ++TNKHV++ AL+ +G +
Sbjct: 101 TFIASGVVVG-KDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGE 159

Query: 125 TDLAVLKI-------NATGGLPTIPINARRVPHIGDVVLAIGNPYNLGQTITQGIISATG 177
DLA++K + + ++ + + G P + T + G
Sbjct: 160 GDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGD-KPVATMW--ESKG 216

Query: 178 RIGLNPTGRQNFLQTDASINHGNSGGALVNSLGELMGINT 217
+I + +Q D S GNSG + N E++GI+
Sbjct: 217 KI---TYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHW 253


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3667  DHBDHDRGNASE  28     0.036    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 28.5 bits (63), Expect = 0.036
Identities = 37/167 (22%), Positives = 61/167 (36%), Gaps = 27/167 (16%)

Query: 25 VAVLGAAGGIGQALALLLKTQLPSGSELSLYDIAPVTPGVAVDLSHIPTAVKIKGFSGED 84
+ GAA GIG+A+A L G+ ++ D P V S A + F +
Sbjct: 11 AFITGAAQGIGEAVARTL---ASQGAHIAAVDYNP-EKLEKVVSSLKAEARHAEAFPADV 66

Query: 85 ATPA------------LEGADVVLISAGVARK------PGMDRSDLFNVNAGIVKNLVQQ 126
A + D+++ AGV R + F+VN+ V N +
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 127 VAKTCPK----ACIGIITNPVNTT-VAIAAEVLKKAGVYDKNKLFGV 168
V+K + + + +NP ++AA KA K G+
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGL 173


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3668  ARGREPRESSOR  168    9e-57    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 168 bits (428), Expect = 9e-57
Identities = 44/141 (31%), Positives = 71/141 (50%), Gaps = 5/141 (3%)

Query: 15 KALLKEEKFSSQGEIVAALQEQGFDNINQSKVSRMLTKFGAVRTRNAKMEMVYCLPAELG 74
+ ++ + +Q E+V L++ G+ N+ Q+ VSR + + V+ Y LPA+
Sbjct: 11 REIITANEIETQDELVDILKKDGY-NVTQATVSRDIKELHLVKVPTNNGSYKYSLPADQR 69

Query: 75 VPTTSSPLKNLV---LDIDYNDAVVVIHTSPGAAQLIARLLDSLGKAEGILGTIAGDDTI 131
S ++L+ + ID ++V+ T PG AQ I L+D+L E I+GTI GDDTI
Sbjct: 70 FNPLSKLKRSLMDAFVKIDSASHLIVLKTMPGNAQAIGALMDNLDWEE-IMGTICGDDTI 128

Query: 132 FTTPANGFTVKELYEAILELF 152
K + + ILEL
Sbjct: 129 LIICRTHDDTKVVQKKILELL 149


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3672  RTXTOXIND     54     2e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 54.4 bits (131), Expect = 2e-10
Identities = 29/163 (17%), Positives = 59/163 (36%), Gaps = 16/163 (9%)

Query: 6 RKFSRTAITVVLVILAFIAIFNAWVYYTE----SPWTRDARFSADVVAIAPDVSGLITQV 61
SR V I+ F+ I + + S I P + ++ ++
Sbjct: 51 TPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEI 110

Query: 62 NVHDNQLVKKGQVLFTIDQPR-------YQKALEEAQADVAYYQVLAQEKRQEAGRRNRL 114
V + + V+KG VL + Q +L +A+ + YQ+L++ E + L
Sbjct: 111 IVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRS--IELNKLPEL 168

Query: 115 GVQAMSREEIDQANNVL---QTVLHQLAKAQATRDLAKLDLER 154
+ + VL + Q + Q + +L+L++
Sbjct: 169 KLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDK 211



Score = 51.4 bits (123), Expect = 2e-09
Identities = 28/147 (19%), Positives = 54/147 (36%), Gaps = 15/147 (10%)

Query: 100 LAQEKRQEAGRRNRLGVQ-AMSREEIDQANNVLQT-VLHQLAKAQAT-------RDLAKL 150
E R + ++ + ++EE + + +L +L + +
Sbjct: 264 AVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEE 323

Query: 151 DLERTVIRAPADGWVTNLNVYT-GEFITRGSTAVALVKQNSFY-VLAYMEETKLEGVRPG 208
+ +VIRAP V L V+T G +T T + +V ++ V A ++ + + G
Sbjct: 324 RQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVG 383

Query: 209 YRAEIT----PLGSNKVLKGTVDSVAA 231
A I P L G V ++
Sbjct: 384 QNAIIKVEAFPYTRYGYLVGKVKNINL 410


98  UTI89_C3703  UTI89_C3708  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C3703  -2      10       -1.923076  Fis family transcriptional regulator
UTI89_C3704  -2      11       -1.631455  methyltransferase
UTI89_C3705  -3      12       -1.433083  hypothetical protein
UTI89_C3706  -3      13       -1.220930  DNA-binding transcriptional regulator EnvR
UTI89_C3707  -2      13       -0.508787  acriflavine resistance protein E
UTI89_C3708  -2      14       -1.162085  acriflavine resistance protein F
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3703  DNABINDNGFIS  157    3e-54    DNA-binding protein FIS signature.
		>DNABINDNGFIS#DNA-binding protein FIS signature.

Length = 98

Score = 157 bits (399), Expect = 3e-54
Identities = 98/98 (100%), Positives = 98/98 (100%)

Query: 1 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ 60
MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ
Sbjct: 1 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ 60

Query: 61 PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98
PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN
Sbjct: 61 PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3706  HTHTETR       127    7e-39    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 127 bits (321), Expect = 7e-39
Identities = 77/209 (36%), Positives = 122/209 (58%), Gaps = 3/209 (1%)

Query: 1 MAKRTKAEALKTRQELIETAIAQFAQHGVSKTTLNDIADAANVTRGAIYWHFENKTQLFN 60
MA++TK EA +TRQ +++ A+ F+Q GVS T+L +IA AA VTRGAIYWHF++K+ LF+
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EMW-LQQPSLRELIQDHLTAGLEHDPFQQLREKLIVGLQYIAKIPRQQALLKILYHKCEF 119
E+W L + ++ EL + A DP LRE LI L+ R++ L++I++HKCEF
Sbjct: 61 EIWELSESNIGELELE-YQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 120 NDEM-LAEGVIREKMGFNPQTLREVLQACQQQGCVANNLDLDVVMIIIDGAFSGIVQNWL 178
EM + + R + + + L+ C + + +L II+ G SG+++NWL
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWL 179

Query: 179 MNMAGYDLYKQAPALVDNVLRMFMPDENI 207
+DL K+A V +L M++ +
Sbjct: 180 FAPQSFDLKKEARDYVAILLEMYLLCPTL 208


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3707  RTXTOXIND     43     1e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 43.3 bits (102), Expect = 1e-06
Identities = 38/217 (17%), Positives = 70/217 (32%), Gaps = 38/217 (17%)

Query: 98 ATYQASYDSAKGELAKSEAAAAIAHLTVKRYVPLVGTKYISQQEYDQAIADA-RQADAAV 156
K +L + E+ A + Q + I D RQ +
Sbjct: 262 VEAVNELRVYKSQLEQIESEILSAKEEYQLV----------TQLFKNEILDKLRQTTDNI 311

Query: 157 IAAKATVESARINLAYTKVTAPISGRIGK-STVTEGALVTNGQTTELATVQQLDPIYVDV 215
+ + + AP+S ++ + TEG +VT +T + V + D + V
Sbjct: 312 GLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL-MVIVPEDDTLEVTA 370

Query: 216 TQSSND--FMRLKQSVEQGNLHKENATSNVELVMENGQTYP-LKGTLQ--FSDVTVDEST 270
+ D F+ + Q+ +++ Y L G ++ D D+
Sbjct: 371 LVQNKDIGFINVGQNAI------------IKVEAFPYTRYGYLVGKVKNINLDAIEDQRL 418

Query: 271 GSIT--LRAV------FPNPQHTLLPGMFVRARIDEG 299
G + + ++ N L GM V A I G
Sbjct: 419 GLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTG 455



Score = 34.4 bits (79), Expect = 7e-04
Identities = 22/127 (17%), Positives = 43/127 (33%), Gaps = 13/127 (10%)

Query: 46 TAPLEVKTELPGR-TNAYRIAEVRPQVSGIVLNRNFTEGSDVQAGQSLYQIDPATYQASY 104
+E+ G+ T++ R E++P + IV EG V+ G L ++ +A
Sbjct: 77 LGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA-- 134

Query: 105 DSAKGELAKSEAAAAIAHLTVKRYVPLVGTKYISQQEYDQAIADARQADAAVIAAKATVE 164
+ K++++ A L RY L E ++ +
Sbjct: 135 -----DTLKTQSSLLQARLEQTRYQIL-----SRSIELNKLPELKLPDEPYFQNVSEEEV 184

Query: 165 SARINLA 171
+L
Sbjct: 185 LRLTSLI 191



Score = 29.0 bits (65), Expect = 0.031
Identities = 11/34 (32%), Positives = 15/34 (44%), Gaps = 1/34 (2%)

Query: 65 AEVRPQVSGIVLNRN-FTEGSDVQAGQSLYQIDP 97
+ +R VS V TEG V ++L I P
Sbjct: 328 SVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVP 361


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3708  ACRIFLAVINRP  1405   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1405 bits (3639), Expect = 0.0
Identities = 1029/1034 (99%), Positives = 1032/1034 (99%)

Query: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60
MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120
VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180
EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRL 240
QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300
KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIQEVVKTLFEAIMLVFLVMYLFLQ 360
DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSI EVVKTLFEAIMLVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 MEDKLPPREATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480
MEDKLPP+EATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540
SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600
LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERSGDENSAEAVIHRAKMELGKIRDG 660
EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEER+GDENSAEAVIHRAKMELGKIRDG
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720
FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKVYVQADAKFRM 780
EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKK+YVQADAKFRM
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840
LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900
ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960
MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV
Sbjct: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960

Query: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020
EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 1021 VPVFFVVIRRCFKG 1034
VPVFFVVIRRCFKG
Sbjct: 1021 VPVFFVVIRRCFKG 1034


99  UTI89_C3779  UTI89_C3791  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C3779  -1      19       -1.581698  general secretion pathway protein C
UTI89_C3780  -1      17       -0.565236  general secretion pathway protein D
UTI89_C3781  0       21       -0.207448  general secretion pathway protein E
UTI89_C3782  0       23       -0.941241  general secretion pathway protein F
UTI89_C3783  1       24       -1.592056  general secretion pathway protein G
UTI89_C3784  2       22       -1.950931  general secretion pathway protein H
UTI89_C3785  1       23       -2.510516  general secretion pathway protein I
UTI89_C3786  2       22       -3.140281  general secretion pathway protein J
UTI89_C3787  0       23       -3.488175  general secretion pathway protein K
UTI89_C3788  -1      19       -2.833682  general secretion pathway protein L
UTI89_C3789  -2      21       -3.562699  general secretion pathway protein M
UTI89_C3790  -2      23       -2.177049  type 4 prepilin-like proteins leader peptide
UTI89_C3791  -1      25       -2.657738  bacterioferritin
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3779  BCTERIALGSPC  84     4e-21    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 83.9 bits (207), Expect = 4e-21
Identities = 53/200 (26%), Positives = 94/200 (47%), Gaps = 15/200 (7%)

Query: 59 EFSLAALWRNENHAGVKDANPVAVNQETPKLSIALNGIVLTSNDETSFVLINEGNEQKRY 118
+F+L + +N AG DA N L+++L G++ +D S +I++ NEQ
Sbjct: 64 DFTLFGVSPEKNKAGALDA-SQMSNLPPSTLNLSLTGVMAGDDDSRSIAIISKDNEQFSR 122

Query: 119 SLNEALESAPGT--FIRKINKTSVVFETHGHYEKVTLH-------PGLP--DIIKQPDSE 167
+NE + PG I I VV + G YE + L+ G+P + +Q
Sbjct: 123 GVNEEV---PGYNAKIVSIRPDRVVLQYQGRYEVLGLYSQEDSGSDGVPGAQVNEQLQQR 179

Query: 168 NQNVLADYIIATPIRDGEQIYGLRLNPRKGLNAFTTSLLQPGDIALRINNLSLTHPDEVS 227
++DY+ +PI + ++ G RLNP ++F LQ D+A+ +N L L ++
Sbjct: 180 ASTTMSDYVSFSPIMNDNKLQGYRLNPGPKSDSFYRVGLQDNDMAVALNGLDLRDAEQAK 239

Query: 228 QALSLLLTQQSAQFTIRRNG 247
+A+ + + T+ R+G
Sbjct: 240 KAMERMADVHNFTLTVERDG 259


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3780  BCTERIALGSPD  716    0.0      Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 716 bits (1850), Expect = 0.0
Identities = 344/629 (54%), Positives = 466/629 (74%), Gaps = 11/629 (1%)

Query: 11 ITCCLLAALLMPCAGHAENEQYGANFNNADIRQFVEIVGQHLGKTILIDPSVQGTISVRS 70
+T + AALL A E++ A+F DI++F+ V ++L KT++IDPSV+GTI+VRS
Sbjct: 12 LTLLIFAALLF---RPAAAEEFSASFKGTDIQEFINTVSKNLNKTVIIDPSVRGTITVRS 68

Query: 71 NDTFSQQEYYQFFLSILDLYGYSVITLDNGFLKVVRSANVKTSPGMIADSSRPGVGDELV 130
D ++++YYQFFLS+LD+YG++VI ++NG LKVVRS + KT+ +A + PG+GDE+V
Sbjct: 69 YDMLNEEQYYQFFLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAPGIGDEVV 128

Query: 131 TRIVPLENVPARDLAPLLRQMMDAGSVGNVVHYEPSNVLILTGRASTINKLIEVIKRVDV 190
TR+VPL NV ARDLAPLLRQ+ D VG+VVHYEPSNVL++TGRA+ I +L+ +++RVD
Sbjct: 129 TRVVPLTNVAARDLAPLLRQLNDNAGVGSVVHYEPSNVLLMTGRAAVIKRLLTIVERVDN 188

Query: 191 IGTEKQQIIHLEYASAEDLAEILNQLISESHGKSQMPALLSAKIVADKRTNSLIISGPEK 250
G + L +ASA D+ +++ +L ++ KS +P + A +VAD+RTN++++SG
Sbjct: 189 AGDRSVVTVPLSWASAADVVKLVTEL-NKDTSKSALPGSMVANVVADERTNAVLVSGEPN 247

Query: 251 ARQRITSLLKSLDVEESEEGNTRVYYLKYAKATNLVEVLTGVSEKLKDEKGNSRKPSSTS 310
+RQRI +++K LD +++ +GNT+V YLKYAKA++LVEVLTG+S ++ EK ++ +
Sbjct: 248 SRQRIIAMIKQLDRQQATQGNTKVIYLKYAKASDLVEVLTGISSTMQSEKQAAK--PVAA 305

Query: 311 AMDNVAITADEQTNSLVITADQSVQEKLATVIARLDIRRAQVLVEAIIVEVQDGNGLNLG 370
N+ I A QTN+L++TA V L VIA+LDIRR QVLVEAII EVQD +GLNLG
Sbjct: 306 LDKNIIIKAHGQTNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLNLG 365

Query: 371 VQWANKNVGAQQFTNTGLPVFNAAQGVADYKKNGGITSANPAWDMFSAYNGMAAGFFNGD 430
+QWANKN G QFTN+GLP+ A G Y K+G ++S+ S++NG+AAGF+ G+
Sbjct: 366 IQWANKNAGMTQFTNSGLPISTAIAGANQYNKDGTVSSSLA--SALSSFNGIAAGFYQGN 423

Query: 431 WGVLLTALASNNKNDILATPSIVTLDNKLASFNVGQDVPVLSGSQTTSGDNVFNTVERKT 490
W +LLTAL+S+ KNDILATPSIVTLDN A+FNVGQ+VPVL+GSQTTSGDN+FNTVERKT
Sbjct: 424 WAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTGSQTTSGDNIFNTVERKT 483

Query: 491 VGTKLKVTPQVNEGDAVLLEIEQEVSSVD---SSSNSTLGPTFNTRTIQNAVLVKTGETV 547
VG KLKV PQ+NEGD+VLLEIEQEVSSV SS++S LG TFNTRT+ NAVLV +GETV
Sbjct: 484 VGIKLKVKPQINEGDSVLLEIEQEVSSVADAASSTSSDLGATFNTRTVNNAVLVGSGETV 543

Query: 548 VLGGLLDDFSKEQVSKVPLLGDIPLVGQLFRYTSTERAKRNLMVFIRPTIIRDDDVYRSL 607
V+GGLLD + KVPLLGDIP++G LFR TS + +KRNLM+FIRPT+IRD D YR
Sbjct: 544 VVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVIRDRDEYRQA 603

Query: 608 SKEKYTRYRQEQQLRIDGKSKALIGSEDL 636
S +YT + Q + ++ + ++DL
Sbjct: 604 SSGQYTAFNDAQSKQRGKENNDAMLNQDL 632


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3782  BCTERIALGSPF  512    0.0      Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 512 bits (1321), Expect = 0.0
Identities = 195/405 (48%), Positives = 283/405 (69%), Gaps = 8/405 (1%)

Query: 2 NYRYRAMTQDGQKLQGIIDANDERQARLRLREEGLFLLDIRPQK-------SSGVKTRRP 54
Y Y+A+ G+K +G +A+ RQAR LRE GL L + + S+G+ RR
Sbjct: 3 QYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLRRK 62

Query: 55 -RISHSELTLFTRQLATLSAAALPLEESLAVIGQQSSNNRLADVLNQVRSAILEGHPLSD 113
R+S S+L L TRQLATL AA++PLEE+L + +QS L+ ++ VRS ++EGH L+D
Sbjct: 63 IRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD 122

Query: 114 ALQHFPTLFDSLYRTLVKAGEKSGLLAPVLEKLADYNENRQKIRSKLIQSLIYPCMLTTV 173
A++ FP F+ LY +V AGE SG L VL +LADY E RQ++RS++ Q++IYPC+LT V
Sbjct: 123 AMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLTVV 182

Query: 174 AIVVVIILLTAVVPKITEQFVHMKQQLPLSTRILLGLSDTLQRTGPTLLATVFIVAVGFW 233
AI VV ILL+ VVPK+ EQF+HMKQ LPLSTR+L+G+SD ++ GP +L + + F
Sbjct: 183 AIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMAFR 242

Query: 234 LWLKRGNNRHRFHAMLLRVALIGPLICAINSARYLRTLSILQSSGVPLLDGMNLSTESLN 293
+ L++ R FH LL + LIG + +N+ARY RTLSIL +S VPLL M +S + ++
Sbjct: 243 VMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDVMS 302

Query: 294 NLEIRQRLANAAENVRQGNSIHLSLEQTAIFPPMMLYMVASGEKSGQLGTLMVRAADNQE 353
N R RL+ A + VR+G S+H +LEQTA+FPPMM +M+ASGE+SG+L +++ RAADNQ+
Sbjct: 303 NDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADNQD 362

Query: 354 TLQQNRIALTLSIFEPALIITMALIVLFIVVSVLQPLLQLNSMIN 398
+++ L L +FEP L+++MA +VLFIV+++LQP+LQLN++++
Sbjct: 363 REFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLMS 407


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3783  BCTERIALGSPG  249    1e-88    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 249 bits (636), Expect = 1e-88
Identities = 144/145 (99%), Positives = 144/145 (99%)

Query: 1 MRATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYK 60
MRATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYK
Sbjct: 1 MRATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYK 60

Query: 61 LDNHRYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDL 120
LDNH YPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDL
Sbjct: 61 LDNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDL 120

Query: 121 LSAGPDGEMGTEDDITNWGLSKKKK 145
LSAGPDGEMGTEDDITNWGLSKKKK
Sbjct: 121 LSAGPDGEMGTEDDITNWGLSKKKK 145


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3784  BCTERIALGSPH  142    9e-46    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 142 bits (358), Expect = 9e-46
Identities = 50/154 (32%), Positives = 76/154 (49%), Gaps = 18/154 (11%)

Query: 3 QQRGFTLLEMMLVLALVAITASVVLFTYGREDAASTRARETAARFTAALELAIDRATLSG 62
+QRGFTLLEMML+L L+ ++A +VL + + A +T ARF A L R +G
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFP--ASRDDSAAQTLARFEAQLRFVQQRGLQTG 59

Query: 63 QPVGIHFSDSAWRIMV----PGKTP-------SAWRWVPLQEDAADESKNDWGEELSIQL 111
Q G+ W+ +V G P S +RW+PL+ S + G +L++
Sbjct: 60 QFFGVSVHPDRWQFLVLEARDGADPAPADDGWSGYRWLPLRAGRVATSGSIAGGKLNLAF 119

Query: 112 ---QPFKPDDSNQPQVVILADGQITPFSLLMANA 142
+ + P D P V+I G++TPF L + A
Sbjct: 120 AQGEAWTPGD--NPDVLIFPGGEMTPFRLTLGEA 151


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3785  BCTERIALGSPG  31     9e-04    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 31.0 bits (70), Expect = 9e-04
Identities = 18/91 (19%), Positives = 42/91 (46%), Gaps = 4/91 (4%)

Query: 14 MNKQSGMTLLEVLLAMSIFTAVALTLMSSMQGQ--RTAIERMRNETLALWIADNQLQSQD 71
+KQ G TLLE+++ + I +A ++ ++ G + ++ ++ +AL A + + D
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYK-LD 62

Query: 72 SFDEENTSSSGKELINGEELINGEEWNWRSD 102
+ T+ + L+ L N+ +
Sbjct: 63 NHHYPTTNQGLESLVEAPTL-PPLAANYNKE 92


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3786  BCTERIALGSPH  34     2e-04    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 33.8 bits (77), Expect = 2e-04
Identities = 12/47 (25%), Positives = 25/47 (53%), Gaps = 2/47 (4%)

Query: 4 RQQGFTLLEVMAALAIFSMLSVLAFMIFSQASELHQRSQKEIQQFNQ 50
RQ+GFTLLE+M L + + + + + F + + + + + +F
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFPASRD--DSAAQTLARFEA 46


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3790  PREPILNPTASE  152    1e-47    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 152 bits (386), Expect = 1e-47
Identities = 88/262 (33%), Positives = 119/262 (45%), Gaps = 47/262 (17%)

Query: 5 LPLFILVGFIAGYFVNVMAYHL---------------SPLEDKTALTFRQVLVH------ 43
L L + G F+NV+ + L +D+ L+
Sbjct: 16 FSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDEPPYNLMVPRSCCP 75

Query: 44 FWQKKYAWHDTVPLI-------------------------LCVAAAIACALAPFTPIVTG 78
+ +PL+ L ++A A+ T
Sbjct: 76 HCNHPITALENIPLLSWLWLRGRCRGCQAPISARYPLVELLTALLSVAVAMTLAPGWGTL 135

Query: 79 ALFLYFCFALTLSVIDFRTQLLPDKLTLPLLWLGLVFNAQSGLIDLHDAVYGAVAGYGVL 138
A L + L+ ID LLPD+LTLPLLW GL+FN G + L DAV GA+AGY VL
Sbjct: 136 AALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDAVIGAMAGYLVL 195

Query: 139 WCVYWGVWLVCHKEGLGYGDFKLLAAAGAWCGWQTLPMILLIASLGGIGYAIVSQLLQRR 198
W +YW L+ KEG+GYGDFKLLAA GAW GWQ LP++LL++SL G I LL+
Sbjct: 196 WSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLILLRNH 255

Query: 199 TITT-IAFGPWLALGSMINLGY 219
+ I FGP+LA+ I L +
Sbjct: 256 HQSKPIPFGPYLAIAGWIALLW 277


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3791  HELNAPAPROT   35     3e-05    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 35.2 bits (81), Expect = 3e-05
Identities = 28/150 (18%), Positives = 59/150 (39%), Gaps = 24/150 (16%)

Query: 5 TKVINYLNKLLGNE---LVAINQYFLHARMFKNWGLKRLNDVEYHESIDEM-----KHAD 56
T V N LN L N ++++ +W +K + HE +E+ + D
Sbjct: 11 TLVENSLNTQLSNWFLLYSKLHRF--------HWYVKGPHFFTLHEKFEELYDHAAETVD 62

Query: 57 RYIERILFLEGLPN--LQDLGKL------NIGEDVEEMLRSDLALELDGAKNLREAIGYA 108
ER+L + G P +++ + EM+++ + + + IG A
Sbjct: 63 TIAERLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVNDYKQISSESKFVIGLA 122

Query: 109 DSVHDYVSRDMMIEILRDEEGHIDWLETEL 138
+ D + D+ + ++ + E + L + L
Sbjct: 123 EENQDNATADLFVGLIEEVEKQVWMLSSYL 152


100  UTI89_C3848  UTI89_C3858  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C3848  0       17       0.050025   hypothetical protein
UTI89_C3849  0       17       1.551214   FKBP-type peptidylprolyl isomerase
UTI89_C3850  0       15       3.153444   hypothetical protein
UTI89_C3851  -1      14       2.979903   FKBP-type peptidylprolyl isomerase
UTI89_C3852  -1      15       2.730462   hypothetical protein
UTI89_C3853  -2      14       2.747203   glutathione-regulated potassium-efflux system
UTI89_C3854  -1      17       2.572180   glutathione-regulated potassium-efflux system
UTI89_C3855  -1      18       1.798260   ABC transporter ATP-binding protein
UTI89_C3856  -2      12       0.888872   hydrolase
UTI89_C3857  -2      13       1.186650   hypothetical protein
UTI89_C3858  -2      13       1.258502   phosphoribulokinase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3848  ACRIFLAVINRP  29     0.023    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.7 bits (64), Expect = 0.023
Identities = 14/62 (22%), Positives = 29/62 (46%), Gaps = 1/62 (1%)

Query: 164 ASSVEDLVTQTLEFTIEEVNADRNV-SNNAKNRQIVLNLYEKGIFDIKDAINQVADRLNI 222
A +V+D VTQ +E + ++ + S + + + L + D A QV ++L +
Sbjct: 54 AQTVQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQL 113

Query: 223 SK 224
+
Sbjct: 114 AT 115


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3849  INFPOTNTIATR  132    5e-40    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 132 bits (334), Expect = 5e-40
Identities = 79/226 (34%), Positives = 124/226 (54%), Gaps = 9/226 (3%)

Query: 28 AAKPATTADSKAAFKNDDQKSAYALGASLGRYMENSLKEQEKLGIKLDKDQLIAGVQDAF 87
A A A + D K +Y++GA LG K + GI ++ D L G+QD
Sbjct: 14 AMSTAMAATDATSLTTDKDKLSYSIGADLG-------KNFKNQGIDINPDVLAKGMQDGM 66

Query: 88 A-DKSKLSDQEIEQTLQAFEARVKSSAQAKMEKDAADNEAKGKEYREKFAKEKGVKTSST 146
+ + L++++++ L F+ + + A+ K A +N+AKG + + G+ +
Sbjct: 67 SGAQLILTEEQMKDVLSKFQKDLMAKRSAEFNKKAEENKAKGDAFLSANKSKPGIVVLPS 126

Query: 147 GLVYQVVEAGKGEAPKDSDTVVVNYKGTLIDGKEFDNSYTRGEPLSFRLDGVIPGWTEGL 206
GL Y++++AG G P SDTV V Y GTLIDG FD++ G+P +F++ VIPGWTE L
Sbjct: 127 GLQYKIIDAGTGAKPGKSDTVTVEYTGTLIDGTVFDSTEKAGKPATFQVSQVIPGWTEAL 186

Query: 207 KNIKKGGKIKLVIPPELAYGKAGVPG-IPPNSTLVFDVELLDVKPA 251
+ + G ++ +P +LAYG V G I PN TL+F + L+ VK A
Sbjct: 187 QLMPAGSTWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISVKKA 232


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3853  60KDINNERMP   31     0.021    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 30.7 bits (69), Expect = 0.021
Identities = 13/69 (18%), Positives = 29/69 (42%), Gaps = 6/69 (8%)

Query: 261 TAIDPFKGLLLG---LFFISVGMSLNLGVLYTHL-LWVVISVVVLVAVKILVLYLLARLY 316
A+ P L + L+FIS + L +++ + W +++ V+ ++ L
Sbjct: 318 AAVAPHLDLTVDYGWLWFISQPLFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKA-- 375

Query: 317 GVRSSERMQ 325
S +M+
Sbjct: 376 QYTSMAKMR 384


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3854  ISCHRISMTASE  32     0.001    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 31.9 bits (72), Expect = 0.001
Identities = 32/135 (23%), Positives = 51/135 (37%), Gaps = 16/135 (11%)

Query: 12 YAHPESQDSVANRVLLKPATQLSNVTVHDLYAHYPDFFIDIPREQALLREHEVIVFQH-- 69
Y P + D N+V P + + +HD+ ++ D F L + +
Sbjct: 9 YQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCV 68

Query: 70 ----PLYTYSCPALLKEWLDRVLSRGFASGPGGNQLAGKYWRSVITTGEPESA------Y 119
P+ + P DR L F GPG N +G Y +IT PE +
Sbjct: 69 QLGIPVVYTAQPGSQNP-DDRALLTDFW-GPGLN--SGPYEEKIITELAPEDDDLVLTKW 124

Query: 120 RYDALNRYPMSDVLR 134
RY A R + +++R
Sbjct: 125 RYSAFKRTNLLEMMR 139


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3855  GPOSANCHOR    33     0.004    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.7 bits (74), Expect = 0.004
Identities = 28/152 (18%), Positives = 54/152 (35%), Gaps = 22/152 (14%)

Query: 504 KVEPFDGDLEDYQQWLSDVQKQENQTDEAPKENANSAQARKDQKRREAELRAQTQPLRKE 563
+ D + ++ E + + ++ R+ +R R + L E
Sbjct: 272 AMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAE 331

Query: 564 IARLEKEME---------------------KLNAQLAQAEEKLGDSELYDQSRKAELTAC 602
+LE++ + +L A+ + EE+ SE QS + +L A
Sbjct: 332 HQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDAS 391

Query: 603 LQQQASAKSGLEECEMAWLEAQEQLEQMLLEG 634
+ + + LEE L A E+L + L E
Sbjct: 392 REAKKQVEKALEEANSK-LAALEKLNKELEES 422



Score = 32.0 bits (72), Expect = 0.008
Identities = 13/125 (10%), Positives = 39/125 (31%), Gaps = 7/125 (5%)

Query: 513 EDYQQWLSDVQKQENQTDEAPKENANSAQARKDQKRREAELRAQTQPLRKEIARLEKEME 572
+ + ++ + E A A + D ++ + +++
Sbjct: 127 KALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFST-------ADSAKIK 179

Query: 573 KLNAQLAQAEEKLGDSELYDQSRKAELTACLQQQASAKSGLEECEMAWLEAQEQLEQMLL 632
L A+ A E + + E + TA + + ++ + ++ LE +
Sbjct: 180 TLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMN 239

Query: 633 EGQSN 637
++
Sbjct: 240 FSTAD 244


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3858  PF07299       32     0.002    Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 31.8 bits (72), Expect = 0.002
Identities = 10/46 (21%), Positives = 21/46 (45%), Gaps = 2/46 (4%)

Query: 71 PEANDFGLLEQTFIEYGQSGKGKSRKYLHTYDEAVPWNQVPGTFTP 116
P+ + + E ++ KG SRK++ ++ + + GTF
Sbjct: 112 PDMEELDMKELSY--LSWIDKGSSRKFIIAKNDKNKFVGLQGTFQS 155


101  UTI89_C3951  UTI89_C3960  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C3951  1       14       -1.471521  acetyltransferase YhhY
UTI89_C3952  0       13       0.053051   hypothetical protein
UTI89_C3953  -1      16       2.513140   hypothetical protein
UTI89_C3954  -2      19       3.162765   gamma-glutamyltranspeptidase
UTI89_C3955  -2      23       3.069004   hypothetical protein
UTI89_C3956  -2      25       3.570330   cytoplasmic glycerophosphodiester
UTI89_C3957  -2      25       3.334400   glycerol-3-phosphate transporter ATP-binding
UTI89_C3958  -1      26       3.221732   glycerol-3-phosphate transporter membrane
UTI89_C3959  -2      26       3.691179   glycerol-3-phosphate transporter permease
UTI89_C3960  -2      23       3.231048   glycerol-3-phosphate transporter periplasmic
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C3951  SACTRNSFRASE  35     4e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 35.3 bits (81), Expect = 4e-05
Identities = 20/92 (21%), Positives = 32/92 (34%), Gaps = 16/92 (17%)

Query: 55 VACIDGIVVGHLTIDVQQRPRRSHVADFGICVDSRWKNRGVASALMREMID------MCD 108
+ ++ +G + I + + D + D R K GV +AL+ + I+ C
Sbjct: 69 LYYLENNCIGRIKIR-SNWNGYALIEDIAVAKDYRKK--GVGTALLHKAIEWAKENHFCG 125

Query: 109 NWLRVDRIELTVFVDNAPAIKVYKKFGFEIEG 140
L I N A Y K F I
Sbjct: 126 LMLETQDI-------NISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C3954  NAFLGMOTY  32     0.005    Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 32.0 bits (72), Expect = 0.005
Identities = 27/80 (33%), Positives = 36/80 (45%), Gaps = 13/80 (16%)

Query: 272 RTPISGDYRGYQVYSMPPPSSGGIHIVQILNILENFDMQKYGF-GSADAMQIMAEAEKYA 330
R P+ G+ R + SMPPP G H +I N+ F Q G+ G A I++E EK
Sbjct: 77 RRPM-GETRNVSLISMPPPWRPGEHADRITNL--KFFKQFDGYVGGQTAWGILSELEKGR 133

Query: 331 YADRSEYLGDPDFVKVPWQA 350
Y P F WQ+
Sbjct: 134 Y---------PTFSYQDWQS 144


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C3956  PF04619  28     0.020    Dr-family adhesin
		>PF04619#Dr-family adhesin

Length = 160

Score = 28.4 bits (63), Expect = 0.020
Identities = 12/60 (20%), Positives = 22/60 (36%), Gaps = 4/60 (6%)

Query: 29 VGAKYGHKMIEFDAKLSKDGEIFLLHDDNLERTSNGWGVAGELNWQD----LLRVDAGSW 84
+G ++ D + G+ FL+ D+N ++ W + D GSW
Sbjct: 70 LGCDARQVALKADTDNFEQGKFFLISDNNRDKLYVNIRPTDNSAWTTDNGVFYKNDVGSW 129


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C3957  PF05272  29     0.042    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.9 bits (64), Expect = 0.042
Identities = 10/29 (34%), Positives = 16/29 (55%)

Query: 33 IVMVGPSGCGKSTLLRMVAGLERVTTGDI 61
+V+ G G GKSTL+ + GL+ +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHF 627


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C3960  MALTOSEBP  39     2e-05    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 39.3 bits (91), Expect = 2e-05
Identities = 39/160 (24%), Positives = 66/160 (41%), Gaps = 14/160 (8%)

Query: 134 GHLLSQPFNSSTPVLYYNKDAFKKAGLDPEQPPKTWQDLADYAAKLKASGMKCGYASGWQ 193
G L++ P L YNKD PPKTW+++ +LKA G + +
Sbjct: 127 GKLIAYPIAVEALSLIYNKDLLP-------NPPKTWEEIPALDKELKAKGKSALMFNLQE 179

Query: 194 GWIQLENFSAWNGLPFASKNNGFDGTDAVLEF--NKPEQVKHIAMLEEMNKKGDFSYVGR 251
+ +A G F +N +D D ++ K + +++ + D Y
Sbjct: 180 PYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDY--- 236

Query: 252 KDESTEKFYNGDCAMTTASSGSLANIREYAKFNYGVGMMP 291
+ F G+ AMT + +NI + +K NYGV ++P
Sbjct: 237 -SIAEAAFNKGETAMTINGPWAWSNI-DTSKVNYGVTVLP 274


102  UTI89_C4006  UTI89_C4011  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C4006  -2      12       2.505575   hypothetical protein
UTI89_C4007  -3      9        2.372862   ABC transporter ATP-binding protein
UTI89_C4008  -1      10       0.680913   hypothetical protein
UTI89_C4009  0       11       -0.961575  hypothetical protein
UTI89_C4010  0       10       -0.719276  hypothetical protein
UTI89_C4011  -1      14       1.944156   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4006  ABC2TRNSPORT  50     3e-09    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 50.3 bits (120), Expect = 3e-09
Identities = 41/171 (23%), Positives = 73/171 (42%), Gaps = 7/171 (4%)

Query: 228 REREHGTVEHLLVMPITPFEIMMAKV-WSMGLVVLVVSGLSLVLMVKGVLGVPIEGSIPL 286
R T E +L + +I++ ++ W+ L +G+ +V G + L
Sbjct: 93 RMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGY----TQWLSLL 148

Query: 287 FMLGV-ALSLFATTSIGIFMGTIARSMPQLGLLVILVLLPLQMLSGGSTPRESMPQMVQD 345
+ L V AL+ A S+G+ + +A S LV+ P+ LSG P + +P + Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 346 IMLTMPTTHFVSLAQAILYRGAGFEIVWPQFLTLMAIGGAFF-TIALLRFR 395
+P +H + L + I+ ++ + I FF + ALLR R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4007  PF05272  30     0.046    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.046
Identities = 9/26 (34%), Positives = 14/26 (53%)

Query: 37 ARCMVGLIGPDGVGKSSLLSLISGAR 62
V L G G+GKS+L++ + G
Sbjct: 595 FDYSVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C4008  RTXTOXIND  83     9e-20    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 83.3 bits (206), Expect = 9e-20
Identities = 71/408 (17%), Positives = 138/408 (33%), Gaps = 81/408 (19%)

Query: 6 RHLAWWGVGALAVAAVVAWWLLRPAGVP-EGFAVSNGRIEATEVDIASKIAGRIDTILVK 64
R +A++ +G L +A +++ G +GR + I + I+VK
Sbjct: 58 RLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKE----IKPIENSIVKEIIVK 113

Query: 65 EGQFVREGEVLAKMDTRV----------------LQEQRLEAI----------------- 91
EG+ VR+G+VL K+ L++ R + +
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 92 -------------------AQIKEAQSAVAAAQALLEQRQSETRAAQSLVNQRQAELDSV 132
Q Q+ + L+++++E + +N+ +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 133 AKRHTRSRSLAQRGAISAQQLDDDRAAAESARAALESAKAQVSASKAAIEAARTNIIQ-- 190
R SL + AI+ + + A L K+Q+ ++ I +A+
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 191 -----------AQTRVEAAQATERRIAADID--DSELKAPRDGRV-QYRVAEPGEVLAAG 236
QT T + S ++AP +V Q +V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 237 GRVLNMVDLSDVY-MTFFLPTEQAGTLKLGGEARLILDAAPDLRIPATISFVASVAQFTP 295
++ +V D +T + + G + +G A + ++A P R V V
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYG---YLVGKVKNINL 410

Query: 296 KTVETSDERLKLMFRVKARIPPELLQQHLEYV--KTGLPGVAWVRVNE 341
+E D+RL L+F V I L + + +G+ A ++
Sbjct: 411 DAIE--DQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
UTI89_C4011  ALARACEMASE  29     0.023    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 29.4 bits (66), Expect = 0.023
Identities = 26/109 (23%), Positives = 42/109 (38%), Gaps = 24/109 (22%)

Query: 215 VITAENGIVFRENLLFTHRGLSGPAVLQISSYWQPGEFVSINLLPDVDLETFL--NEQRN 272
++ E I RE RG GP +L + ++ + + + L T + N Q
Sbjct: 58 LLNLEEAITLRE------RGWKGP-ILMLEGFFHAQD---LEIYDQHRLTTCVHSNWQLK 107

Query: 273 AHPNQSLKNTLAVHL------------PKRLVERLQQLGQIPDVSLKQL 309
A N LK L ++L P R++ QQL + +V L
Sbjct: 108 ALQNARLKAPLDIYLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTL 156


103  UTI89_C4218  UTI89_C4229  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias   Product
UTI89_C4218  -1      12       1.044827  ribonucleoside transporter
UTI89_C4219  -2      12       1.679505  hypothetical protein
UTI89_C4220  -1      13       2.116974  hypothetical protein
UTI89_C4221  -1      15       3.220785  cryptic adenine deaminase
UTI89_C4222  0       17       3.658217  sugar phosphate antiporter
UTI89_C4223  1       16       3.960654  regulatory protein UhpC
UTI89_C4224  1       16       4.232778  sensory histidine kinase UhpB
UTI89_C4225  1       17       3.564547  DNA-binding transcriptional activator UhpA
UTI89_C4226  1       15       2.707528  acetolactate synthase 1 regulatory subunit
UTI89_C4227  1       15       2.809149  acetolactate synthase catalytic subunit
UTI89_C4228  -1      18       1.796315  hypothetical protein
UTI89_C4229  -1      17       0.205542  multidrug resistance protein D
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4218  TCRTETA  39     3e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 38.7 bits (90), Expect = 3e-05
Identities = 34/208 (16%), Positives = 72/208 (34%), Gaps = 13/208 (6%)

Query: 33 IIVEFLPVSLLTP----MAQDLGISEGVA---GQSVTVTAFVAMFASLFITQTIQATDRR 85
+ ++ + + L+ P + +DL S V G + + A + + + RR
Sbjct: 14 VALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRR 73

Query: 86 YVVILFAVLLTLSCLLVSFANSFSLLLIGRACLGLALGGFWAMSASLTMRLVPPRTVPKA 145
V+++ + +++ A +L IGR G+ G A++ + + +
Sbjct: 74 PVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDGDERARH 132

Query: 146 LSVIFGAVSIALVIAAPLGSFLGELIGWRNVFNAAAAMG----VLCIFWIIKSLPSLPGE 201
+ +V LG +G F AAAA+ + F + +S
Sbjct: 133 FGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRP 191

Query: 202 PSHQKQNTFRLLQRPGVMAGMIAIFMSF 229
+ N + M + A+ F
Sbjct: 192 LRREALNPLASFRWARGMTVVAALMAVF 219


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C4221  UREASE  38     1e-04    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 38.2 bits (89), Expect = 1e-04
Identities = 30/105 (28%), Positives = 43/105 (40%), Gaps = 17/105 (16%)

Query: 22 AVSRGDAVADYIIDNVSILDLINGGEISGPIVIKGRYIAGVG-AEYAD---------APA 71
V+R D +I N ILD + G + I +K IA +G A D P
Sbjct: 60 QVTREGGAVDTVITNALILD--HWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPG 117

Query: 72 LQRIDARGATAVPGFIDAHLHIESSMMTPVTFETATLPRGLTTVI 116
+ I G G +D+H+H + P E A L GLT ++
Sbjct: 118 TEVIAGEGKIVTAGGMDSHIH----FICPQQIEEA-LMSGLTCML 157


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4222  TCRTETB  34     0.001    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.1 bits (78), Expect = 0.001
Identities = 28/168 (16%), Positives = 61/168 (36%), Gaps = 17/168 (10%)

Query: 49 FNIAQNDMISTYGLSMTQLGMIGLGFSITYGVGKTLVSYYADGKNTKQFLPFMLILSAIC 108
N++ D+ + + + F +T+ +G + +D K+ L F +I++ C
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIIN--C 90

Query: 109 MLGFSASMGSGSVSLFLMIAFYALSGFFQSTGGSCSYSTI----TKWTPRRKRGTFLGFW 164
+G SL +M + F Q G + + + ++ P+ RG G
Sbjct: 91 FGSVIGFVGHSFFSLLIM------ARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLI 144

Query: 165 NISHNLGGAGAAGVALFGANYLFDGHVIGMFIFPSIIALIVGFIGLRY 212
+G + A+Y+ + + + P + I+ L
Sbjct: 145 GSIVAMGEGVGPAIGGMIAHYIHWSY---LLLIP--MITIITVPFLMK 187


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4223  TCRTETB  40     1e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 39.9 bits (93), Expect = 1e-05
Identities = 64/408 (15%), Positives = 135/408 (33%), Gaps = 60/408 (14%)

Query: 30 RHILLTIWLGYALFY--FTRKSFNAAVPEILANGVLSRSDIGLLATLFYITYGVSKFVSG 87
RH + IWL F+ N ++P+I + + + T F +T+ + V G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 88 IVSDRSNARYFMGIGLIATGIINILFGFSTSLWAFAVLWVLNAFFQGWGS---PVCARLL 144
+SD+ + + G+I +++ S F L ++ F QG G+ P ++
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHS---FFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 145 TAWY-SRTERGGWWALWNTAHNVGGALIPIVMAASALHYGWRAGMMIAGCMAIVVGIFLC 203
A Y + RG + L + +G + P + A + W ++ M ++ +
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIPMITIITVPF- 184

Query: 204 WRLRDRPQALGLPAVGEWRHDALEIAQQQEGAGLTRKEILTKYVLLNPYIWLLSFCYVLV 263
+ L +I G L I+ + Y VL
Sbjct: 185 --------LMKLLKKEVRIKGHFDIK----GIILMSVGIVFFMLFTTSYSISFLIVSVLS 232

Query: 264 YVV-----RAAINDWGN-----------LYMSEMLGVDLVTANTAVTMFELGGFIGALVA 307
+++ R + + + + + V ++ + + A
Sbjct: 233 FLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTA 292

Query: 308 GWGSDKLFNGNRGPMNLIFAAGILL-SVGSLWLMPFASYVMQATCFFTIGFFVFGPQMLI 366
GS +F G + + GIL+ G L+++ + + F T F + +
Sbjct: 293 EIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL-SVSFLTASFLLETTSWFM 351

Query: 367 ---------GMAAAECS---------HKEAAGAATGFVGLFAYLGASL 396
G++ + ++ AGA + ++L
Sbjct: 352 TIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGT 399


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4224  PF06580  40     2e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.8 bits (93), Expect = 2e-05
Identities = 28/142 (19%), Positives = 57/142 (40%), Gaps = 11/142 (7%)

Query: 365 LRPRQLDDLTLEQAIRSLMREMELEGRGIVSHLEWRIDESALSENQRVTLFRVCQEGLNN 424
LR ++L + + ++L L++ + + +V + Q + N
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPM-LVQTLVEN 266

Query: 425 IVKHA-----DASAVTLQGWQQDERLMLVIEDDGSGLPPDSGQ-HGFGLTGMRERVTALG 478
+KH + L+G + + + L +E+ GS ++ + G GL +RER+ L
Sbjct: 267 GIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLY 326

Query: 479 G---TLTISCLHG-TRVSVSLP 496
G + +S G V +P
Sbjct: 327 GTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C4225  HTHFIS  61     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.0 bits (148), Expect = 2e-13
Identities = 29/174 (16%), Positives = 59/174 (33%), Gaps = 20/174 (11%)

Query: 2 ITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGCGVQVCICDISMPDIS 61
T+ + DD +R+ Q L V + + + + D+ MPD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLELLSQLPK---GMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHTVATG 118
+LL ++ K + +++S ++ +A GA +L K ELI +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR---- 117

Query: 119 GCYLTPDIAIKLASGRQDPLTKRERQVAEKLAQG---MAVKEIAAELGLSPKTV 169
A+ R L + + + + + A L + T+
Sbjct: 118 --------ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTL 163


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4229  TCRTETB  59     1e-11    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 59.1 bits (143), Expect = 1e-11
Identities = 41/184 (22%), Positives = 80/184 (43%), Gaps = 1/184 (0%)

Query: 7 RNVNLLLMLVLLVAVGQMAQTIYIPAIADMARDLNVREGAVQSVMGAYLLTYGVSQLFYG 66
R+ +L+ L +L + + + ++ D+A D N + V A++LT+ + YG
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 67 PISDRVGRRPVILVGMSIFMLATLVA-VTTSSLMVLIAASAMQGMGTGVGGVMARTLPRD 125
+SD++G + ++L G+ I +++ V S +LI A +QG G + +
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVAR 130

Query: 126 LYERTQLRHANSLLNMGILVSPLLAPLIGGLLDTMWNWRACYLFLLVLCAGVTFSMARWM 185
+ A L+ + + + P IGG++ +W L ++ V F M
Sbjct: 131 YIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLK 190

Query: 186 PETR 189
E R
Sbjct: 191 KEVR 194


104  UTI89_C4557  UTI89_C4577  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C4557  -2      13       -0.821154  citrate permease
UTI89_C4558  -2      15       -1.856734  DNA-binding transcriptional repressor FabR
UTI89_C4559  -2      17       -1.962801  hypothetical protein
UTI89_C4560  -2      15       -1.740789  tRNA (uracil-5-)-methyltransferase
UTI89_C4561  -2      15       -1.454362  vitamin B12/cobalamin outer membrane
UTI89_C4562  -1      13       0.161697   glutamate racemase
UTI89_C4570  -1      12       0.439232   *hypothetical protein
UTI89_C4571  -2      13       1.277927   hypothetical protein
UTI89_C4572  -1      17       2.330684   homoserine O-succinyltransferase
UTI89_C4573  -1      17       2.001406   malate synthase
UTI89_C4574  0       16       1.523576   isocitrate lyase
UTI89_C4575  0       14       1.575714   bifunctional isocitrate dehydrogenase
UTI89_C4576  0       14       1.819315   transcriptional repressor IclR
UTI89_C4577  -1      14       1.456657   B12-dependent methionine synthase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4557  TCRTETA  44     6e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 44.4 bits (105), Expect = 6e-07
Identities = 30/158 (18%), Positives = 58/158 (36%), Gaps = 29/158 (18%)

Query: 54 ADHNVALLLAFATFGVSFFMRPLGGIIVGAWADRFGRKPAMVFTIALMSLGTLMIGIAPT 113
H LL +A M+ ++GA +DRFGR+P ++ ++A ++ ++ AP
Sbjct: 42 TAHYGILLALYA------LMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPF 95

Query: 114 YETAGYWGTATLVLARLIQGVAAGGEVGASMSLLVESAPANRRGFYSSWSLATQGLATTF 173
L + R++ G+ G + + + + + R + + A G
Sbjct: 96 LW--------VLYIGRIVAGI-TGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVA 146

Query: 174 GGVVALGLSAWLPFATGSETVMAEWGWRVPFFIGVLLA 211
G V+ +M + PFF L
Sbjct: 147 GPVLG--------------GLMGGFSPHAPFFAAAALN 170


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4558  HTHTETR  54     6e-11    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 53.9 bits (129), Expect = 6e-11
Identities = 29/143 (20%), Positives = 63/143 (44%), Gaps = 3/143 (2%)

Query: 19 VMGVRAQQKEKTRRSLVEAAFSQLSAERSFASLSLREVAREAGIAPTSFYRHFRDVDELG 78
+ Q+ ++TR+ +++ A +L +++ +S SL E+A+ AG+ + Y HF+D +L
Sbjct: 1 MARKTKQEAQETRQHILDVA-LRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLF 59

Query: 79 LTMVDESGLMLRQLMRQ-ARQRIAKGGSVIRTSVSTFMEFIGNNPNAFRLL-LRERSGTS 136
+ + S + +L + + SV+R + +E L+ +
Sbjct: 60 SEIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 137 AAFRAAVAREIQHFIAELADYLE 159
A V + ++ E D +E
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIE 142


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4570  SHAPEPROTEIN  32     6e-04    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 31.7 bits (72), Expect = 6e-04
Identities = 23/62 (37%), Positives = 32/62 (51%), Gaps = 9/62 (14%)

Query: 49 IANFFVAEKVLQDLVLQLHPRSTWHSFLPAKRMDIVVSALEMNEGGLSQVEERILHEVVA 108
IA+FFV EK+LQ + Q+H S P+ R+ + V G +QVE R + E
Sbjct: 81 IADFFVTEKMLQHFIKQVHSNSF---MRPSPRVLVCVPV------GATQVERRAIRESAQ 131

Query: 109 GA 110
GA
Sbjct: 132 GA 133


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4571  SACTRNSFRASE  34     1e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 33.8 bits (77), Expect = 1e-04
Identities = 16/54 (29%), Positives = 21/54 (38%), Gaps = 5/54 (9%)

Query: 78 IDPDVRGCGVGRMLVKHALSMAPE-----LTTNVNEQNEQAVGFYKKVGFKVTG 126
+ D R GVG L+ A+ A E L + N A FY K F +
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4574  BINARYTOXINB  32     0.004    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 32.3 bits (73), Expect = 0.004
Identities = 14/58 (24%), Positives = 23/58 (39%)

Query: 294 ETSTPDLELARRFAQAIHAKYPGKLLAYNCSPSFNWQKNLDDKTIASFQQQLSDMGYK 351
ET+ PD+ L A P L Y + N D +T + + QL+++
Sbjct: 544 ETTKPDMTLKEALKIAFGFNEPNGNLQYQGKDITEFDFNFDQQTSQNIKNQLAELNAT 601


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4577  BCTERIALGSPD  32     0.018    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 31.8 bits (72), Expect = 0.018
Identities = 19/87 (21%), Positives = 37/87 (42%), Gaps = 17/87 (19%)

Query: 343 SGLEPLNIGEDSLFVNVGERTN---VTGSA----KFKRLIKEEKYSEALDVARQQVENGA 395
+P+ + ++ + +TN VT + +R+I + LD+ R QV A
Sbjct: 298 QAAKPVAALDKNIIIKAHGQTNALIVTAAPDVMNDLERVIAQ------LDIRRPQVLVEA 351

Query: 396 QIIDINMDEGMLDAEAAMVRFLNLIAG 422
I ++ D L+ +++ N AG
Sbjct: 352 IIAEVQ-DADGLNLG---IQWANKNAG 374


105  UTI89_C4700  UTI89_C4707  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C4700  -1      14       0.650535   phosphonate/organophosphate ester transporter
UTI89_C4701  -2      15       0.234321   hypothetical protein
UTI89_C4702  -2      16       0.459845   hypothetical protein
UTI89_C4703  -2      15       0.313379   hypothetical protein
UTI89_C4704  -2      14       0.948833   hypothetical protein
UTI89_C4705  -2      13       -0.779318  proline/glycine betaine transporter
UTI89_C4706  -2      18       -0.114423  sensor protein BasS/PmrB
UTI89_C4707  -2      17       -0.602806  DNA-binding transcriptional regulator BasR
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4700  PF05272  29     0.017    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.017
Identities = 12/22 (54%), Positives = 13/22 (59%)

Query: 32 MVALLGPSGSGKSTLLRHLSGL 53
V L G G GKSTL+ L GL
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4705  TCRTETA  44     9e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 44.0 bits (104), Expect = 9e-07
Identities = 57/290 (19%), Positives = 105/290 (36%), Gaps = 55/290 (18%)

Query: 96 FFGMLGDKYGRQKILAITIVIMSISTFCIGLIPSYDTIGIWAPILLLICKMAQGFSVGGE 155
G L D++GR+ +L +++ ++ + P +W +L I ++ G + G
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPF-----LW---VLYIGRIVAGIT-GAT 112

Query: 156 YTGASIFVAEYSPDRKR----GFMGSWLDFGSIAGFVLGAGVVVLISTIVGEENFLDWGW 211
A ++A+ + +R GFM + FG +AG VLG G++ S
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLG-GLMGGFSP------------ 159

Query: 212 RIPFFIALPLGIIGLYLRHALEETPAFQQHVDKLEQGDREGLQDGPKVSFKEIATKHWRS 271
PFF A L + L K E+ P SF+ W
Sbjct: 160 HAPFFAAAALNGLNFLTGCFLLPESH------KGERRPLRREALNPLASFR------WAR 207

Query: 272 LLTCIGLVIATNVTYYML----LTYMPSYLSHNLHYS-EDHGVLIIIAIMIGMLFVQPVM 326
+T + ++A ++ + H+ G+ + ++ L +
Sbjct: 208 GMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMIT 267

Query: 327 GLLSDRFGRRPFVLLG----SVALFVLA--------IPAFILINSNVIGL 364
G ++ R G R ++LG +LA P +L+ S IG+
Sbjct: 268 GPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGM 317



Score = 39.4 bits (92), Expect = 2e-05
Identities = 39/164 (23%), Positives = 73/164 (44%), Gaps = 16/164 (9%)

Query: 297 LSHNLHYSEDHGVLI-IIAIMIGMLFVQPVMGLLSDRFGRRPFVLLGSVALFVLAIPAFI 355
L H+ + +G+L+ + A+M PV+G LSDRFGRRP +L+ L A+ I
Sbjct: 35 LVHSNDVTAHYGILLALYALM--QFACAPVLGALSDRFGRRPVLLVS---LAGAAVDYAI 89

Query: 356 LINSNVIGLIFAGLLMLAVILNCFTGVMASTLPAMFPTHIR---YSALAAAFNISVLVAG 412
+ + + +++ G ++A I V + + + R + ++A F +VAG
Sbjct: 90 MATAPFLWVLYIG-RIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFG-MVAG 147

Query: 413 LTPTLAAWLVESSQNLMMPAYYLMVVAVVGLITG-VTMKETANR 455
P L + S + P + + + +TG + E+
Sbjct: 148 --PVLGGLMGGFSPH--APFFAAAALNGLNFLTGCFLLPESHKG 187


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4706  PF06580  37     7e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 37.2 bits (86), Expect = 7e-05
Identities = 39/182 (21%), Positives = 80/182 (43%), Gaps = 34/182 (18%)

Query: 184 ARLDQMMESVSQLLQLARAGQSFSSGNYQHVKLLEDV-ILPSYDELSTML--DQRQQTLL 240
+ +M+ S+S+L++ S N + V L +++ ++ SY +L+++ D+ Q
Sbjct: 191 TKAREMLTSLSELMR-----YSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 241 LPESAADITVQGDATLLRMLLRNLVENAHRY----SPQGSNIMIKLQEDGGAV-MAVEDE 295
+ + D+ V ML++ LVEN ++ PQG I++K +D G V + VE+
Sbjct: 246 INPAIMDVQV------PPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENT 299

Query: 296 GPGIDESKCGELSKAFVRMDSRYGGIGLGLSIV-SRITQLHHGQFFLQNRQETSGTRAWI 354
G + + G GL V R+ L+ + ++ ++ A +
Sbjct: 300 GSLA--------------LKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 355 RL 356
+
Sbjct: 346 LI 347


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C4707  HTHFIS  92     1e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.8 bits (228), Expect = 1e-23
Identities = 42/121 (34%), Positives = 60/121 (49%)

Query: 2 KILIVEDDTLLLQGLILAAQTEGYACDGVSTARMAEQSLEAGHYSLVVLDLGLPDEDGLH 61
IL+ +DD + L A GY S A + + AG LVV D+ +PDE+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 FLARIRQKKYTLPVLILTARDTLTDKIAGLDVGADDYLVKPFALEELHARIRALLRRHNN 121
L RI++ + LPVL+++A++T I + GA DYL KPF L EL I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 Q 122
+
Sbjct: 125 R 125


106  UTI89_C4882  UTI89_C4896  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C4882  6       35       -9.476369   adhesin
UTI89_C4883  8       41       -10.662432  hypothetical protein
UTI89_C4884  8       41       -9.720334   hypothetical protein
UTI89_C4885  7       37       -7.873351   hypothetical protein
UTI89_C4886  7       38       -6.445052   regulator PapX protein
UTI89_C4887  6       38       -5.353513   p pilus adhesin PapG protein
UTI89_C4888  6       34       -1.444871   minor pilin subunit PapF
UTI89_C4889  5       33       -1.079716   minor pilin subunit PapE
UTI89_C4890  5       31       -0.901319   minor pilin subunit PapK
UTI89_C4891  6       33       -1.613973   PapJ protein
UTI89_C4892  6       31       -2.512788   periplasmid chaperone PapD protein
UTI89_C4893  5       31       -2.341498   outer membrane usher protein PapC
UTI89_C4894  7       43       -6.523517   minor pilin subunit PapH
UTI89_C4895  6       39       -6.169921   major pilin subunit PapA
UTI89_C4896  5       38       -5.328800   pap operon regulatory protein PapB
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
UTI89_C4882  OMPADOMAIN  41     2e-06    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 41.1 bits (96), Expect = 2e-06
Identities = 50/246 (20%), Positives = 79/246 (32%), Gaps = 55/246 (22%)

Query: 6 MNKVFVVSVVAAACVFAVNAGAKEGKSGFYLTGKAGASVMSLSDQRFLSGDEEETSKYKG 65
M K + VA A FA A A + +Y K G S D F++
Sbjct: 1 MKKTAIAIAVALA-GFATVAQAAPKDNTWYTGAKLGWS--QYHDTGFIN---------NN 48

Query: 66 GDDHDTVFSGGIAVGYDFYPQFSIPVRTELEFYARGKADSKYNVDKDSWSGGYWRDDLKN 125
G H+ G GY P E+ + G+ K +V+ +
Sbjct: 49 GPTHENQLGAGAFGGYQVNPYVGF----EMGYDWLGRMPYKGSVE-----------NGAY 93

Query: 126 EVSVNTLMLNAYYDFRNDSAFTPWVSAGIGYARIHQKTTGISTWDYEYGSSGRESLSRSG 185
+ L Y +D Y R+ G W + S+ +G
Sbjct: 94 KAQGVQLTAKLGYPITDDLDI---------YTRL-----GGMVWRADTKSNVYGKNHDTG 139

Query: 186 SADNFAWSLGAGVRYDVTPDIALDLSYRYLDAGDSSVSYKDEWGDKYKSEVDVKSHDIML 245
+ F GV Y +TP+IA L Y++ + GD + + + L
Sbjct: 140 VSPVF----AGGVEYAITPEIATRLEYQWT----------NNIGDAHTIGTRPDNGMLSL 185

Query: 246 GMTYNF 251
G++Y F
Sbjct: 186 GVSYRF 191


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4887  PF03627  546    0.0      PapG
		>PF03627#PapG

Length = 336

Score = 546 bits (1407), Expect = 0.0
Identities = 191/339 (56%), Positives = 231/339 (68%), Gaps = 7/339 (2%)

Query: 1 MKKWLPAFLF-LSLSGCNDALAANQSTMFYSFNDNIYRPQLSVKVTDIVQFIVDINSASS 59
MKKW PA LF L +SG + A + +FYS + +V +T QFI +
Sbjct: 1 MKKWFPALLFSLCVSGESSAW---NNIVFYSLGNVNSYQGGNVVITQRPQFITSWRPGIA 57

Query: 60 TATLSYVACNGFTWTHGLYWSEYFAWLVVPKHV-SYNGYNIYLELQSRGSFSLD-AEDND 117
T T + GF Y+ EY AW+V PK V + NGY +++E+ ++GS+S + DND
Sbjct: 58 TVTWNQCNGPGFADGSWAYYREYIAWVVFPKKVMTKNGYPLFIEVHNKGSWSEENTGDND 117

Query: 118 NYYLTKGFAWDE-VNSSGRVCFDIGEKRSLAWSFGGVTLNARLPVDLPKGDYTFPVKFLR 176
+Y+ KG+ WDE +G +C GE L F + LP DLP GDY+ + +
Sbjct: 118 SYFFLKGYKWDERAFDAGNLCQKPGETTRLTEKFDDIIFKVALPADLPLGDYSVTIPYTS 177

Query: 177 GIQRNNYDYIGGRYKIPSSLMKTFPFNGTLNFSIKNTGGCRPSAQSLEINHGDLSINSAN 236
G+QR+ Y+G R+KIP ++ KT P + F KN GGCRPSAQSLEI HGDLSINSAN
Sbjct: 178 GMQRHFASYLGARFKIPYNVAKTLPRENEMLFLFKNIGGCRPSAQSLEIKHGDLSINSAN 237

Query: 237 NHYAAQTLSVSCDVPTNIRFFLLSNTTPAYSHGQQFSVGLGHGWDSIVSINGVDTGETTM 296
NHYAAQTLSVSCDVP NIRF LL NTTP YSHG++FSVGLGHGWDSIVS+NGVDTGETTM
Sbjct: 238 NHYAAQTLSVSCDVPANIRFMLLRNTTPTYSHGKKFSVGLGHGWDSIVSVNGVDTGETTM 297

Query: 297 RWYRAGTQNLTIGSRLYGESSKIQPGVLSGSATLLMILP 335
RWY+AGTQNLTIGSRLYGESSKIQPGVLSGSATLLMILP
Sbjct: 298 RWYKAGTQNLTIGSRLYGESSKIQPGVLSGSATLLMILP 336


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4888  FIMBRIALPAPF  292    e-105    Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.

Length = 167

Score = 292 bits (749), Expect = e-105
Identities = 165/167 (98%), Positives = 165/167 (98%)

Query: 1 MIRLSLFISLLLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGE 60
MIRLSLFISLLLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGE
Sbjct: 1 MIRLSLFISLLLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGE 60

Query: 61 VTKTISISCPYKSGSLWIKVTGNTMGGGQNNVLATNITHFGIALYQGKGMSTPLTLGNGS 120
VTK ISISCPYKSGSLWIKVTGNTMG GQNNVLATNITHFGIALYQGKGMSTPLTLGNGS
Sbjct: 61 VTKNISISCPYKSGSLWIKVTGNTMGVGQNNVLATNITHFGIALYQGKGMSTPLTLGNGS 120

Query: 121 GNGYRVTAGLDTARSTFTFTSVPFRNGSGILNGGDFRTTASMSMIYN 167
GNGYRVTAGLDTARSTFTFTSVPFRNGSGILNGGDFRTTASMSMIYN
Sbjct: 121 GNGYRVTAGLDTARSTFTFTSVPFRNGSGILNGGDFRTTASMSMIYN 167


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4889  FIMBRIALPAPE  310    e-112    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 310 bits (794), Expect = e-112
Identities = 172/173 (99%), Positives = 172/173 (99%)

Query: 1 MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVQNAEVNWGDIEIQNLVQSG 60
MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVQNAEVNWGDIEIQNLVQSG
Sbjct: 1 MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVQNAEVNWGDIEIQNLVQSG 60

Query: 61 GNQKDFTVDMNCPYSLGTMKVTITSNGQTGNSILVPNTSTASGDGLLIYLYNSNNSGIGN 120
GNQKDFTVDMNCPYSLGTMKVTITSNGQTGNSILVPNTSTASGDGLLIYLYNSNNSGIGN
Sbjct: 61 GNQKDFTVDMNCPYSLGTMKVTITSNGQTGNSILVPNTSTASGDGLLIYLYNSNNSGIGN 120

Query: 121 AVTLGSQFTPGKITGTAPARKITLYAKLGYKGNMQSLQAGTFSATATLVASYS 173
AVTLGSQ TPGKITGTAPARKITLYAKLGYKGNMQSLQAGTFSATATLVASYS
Sbjct: 121 AVTLGSQVTPGKITGTAPARKITLYAKLGYKGNMQSLQAGTFSATATLVASYS 173


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4893  PF00577  740    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 740 bits (1911), Expect = 0.0
Identities = 243/879 (27%), Positives = 361/879 (41%), Gaps = 67/879 (7%)

Query: 1 MKDRI-PFAVNNITCVILLSLFCNAASAVEFNTDVLDAADKKNIDFTRFSEAGYVLPGQY 59
K R+ F V + +++ + FN L + D +RF + PG Y
Sbjct: 19 RKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTY 78

Query: 60 LLDVIVNGQSISPASLQISFVEPALSGDKAEKKLPQACLTSDMVRLMGLTAESLDKVVYW 119
+D+ +N + A+ ++F CLT + MGL S+ +
Sbjct: 79 RVDIYLNNGYM--ATRDVTFNTGDSEQGI------VPCLTRAQLASMGLNTASVSGMNLL 130

Query: 120 HDGQCADF-HGLPGVDIRPDTGAGVLRINMPQAWLEYSDATWLPPSRWDDGIPGLMLDYN 178
D C + + D G L + +PQA++ ++PP WD GI +L+YN
Sbjct: 131 ADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYN 190

Query: 179 LNGTVSRNYQGGDSHQFSYNGTVGGNLGPWRLRADYQGSQEQSRYNGEKTTNRNFTWSRF 238
+G +N GG+SH N G N+G WRLR + S S + + +
Sbjct: 191 FSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSS--DSSSGSKNKWQHINT 248

Query: 239 YLFRAIPRWRANLTLGENNINSDIFRSWSYTGASLESDDRMLPPRLRGYAPQITGIAETN 298
+L R I R+ LTLG+ DIF ++ GA L SDD MLP RG+AP I GIA
Sbjct: 249 WLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGT 308

Query: 299 ARVVVSQQGRVLYDSMVPAGPFSIQDLD-SSVRGRLDVEVIEQNGRKKTFQVDTASVPYL 357
A+V + Q G +Y+S VP GPF+I D+ + G L V + E +G + F V +SVP L
Sbjct: 309 AQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLL 368

Query: 358 TRPGQVRYKLVSGRSRGYGHETEGPVFATGEASWGLSNQWSLYGGAVLAGDYNALAAGAG 417
R G RY + +G R + E P F GL W++YGG LA Y A G G
Sbjct: 369 QREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIG 428

Query: 418 WDLGMPGTLSADITQSVARIEGERTFQGKSWRLSYSKRFDNADADITFAGYRFSERNYMT 477
++G G LS D+TQ+ + + + G+S R Y+K + + +I GYR+S Y
Sbjct: 429 KNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFN 488

Query: 478 MEQYLNARYR--------------------NDYSSREKEMYTVTLNKNVADWNTSFNLQY 517
+R + + ++ +T+ + + + + L
Sbjct: 489 FADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTS-TLYLSG 547

Query: 518 SRQTYWDIRKTD-YYTVSVNRYFNVFGLQGVAVGLSASRSKYLGRD--NDSAYLRISVPL 574
S QTYW D + +N F + LS S +K + + L +++P
Sbjct: 548 SHQTYWGTSNVDEQFQAGLNTAFE-----DINWTLSYSLTKNAWQKGRDQMLALNVNIPF 602

Query: 575 GT------------GTASYSGSMSND-RYVNMAGYTDT-FNDGLDSYSLNAGLNSGGGLT 620
+ASYS S + R N+AG T D SYS+ G GG
Sbjct: 603 SHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGN 662

Query: 621 SQRQINAYYSHRSPLANLSANIASLQKGYTSFGVSASGGATITGKGAALHAGGMSGGTRL 680
S A ++R N + S SGG G L G T +
Sbjct: 663 SGSTGYATLNYRGGYGNANIG-YSHSDDIKQLYYGVSGGVLAHANGVTL--GQPLNDTVV 719

Query: 681 LVDTDGVGGVPVDGGQVV-TNRWGTGVVTDISSYYRNTTSVDLKRLPDDVEATRSVVESA 739
LV G V+ V T+ G V+ + Y N ++D L D+V+ +V
Sbjct: 720 LVKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVV 779

Query: 740 LTEGAIGYRKFSVLKGKRLFAILRLADGSQPPFGASVTSEKGRELGMVADEGLAWLSGVT 799
T GAI +F G +L L + PFGA VTSE + G+VAD G +LSG+
Sbjct: 780 PTRGAIVRAEFKARVGIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMP 838

Query: 800 PGETLSVNW--DGKIQCQVNVPETAISDQQLL----LPC 832
+ V W + C N S QQLL C
Sbjct: 839 LAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4894  FIMBRIALPAPE  29     0.006    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 29.2 bits (65), Expect = 0.006
Identities = 20/77 (25%), Positives = 41/77 (53%), Gaps = 12/77 (15%)

Query: 29 GMSLPEYWG----EEHVWWDGRAAFHGEVVRPACTLAMEDAWQIIDMGESPVRDL-QNGF 83
G+ LP G +HV F G+++ PACT+ + ++ G+ +++L Q+G
Sbjct: 6 GLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVQNAE----VNWGDIEIQNLVQSG- 60

Query: 84 SGPERKFSLRLRNCEFN 100
G ++ F++ + NC ++
Sbjct: 61 -GNQKDFTVDM-NCPYS 75


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C4896  FIMREGULATRY  168    5e-58    Escherichia coli: P pili regulatory PapB protein si...
		>FIMREGULATRY#Escherichia coli: P pili regulatory PapB protein signature.

Length = 104

Score = 168 bits (426), Expect = 5e-58
Identities = 104/104 (100%), Positives = 104/104 (100%)

Query: 1 MAHHEVISRSGNAFLLNIRESVLLPGSMSEMHFFLLIGISSIHSDRVILAMKDYLVGGHS 60
MAHHEVISRSGNAFLLNIRESVLLPGSMSEMHFFLLIGISSIHSDRVILAMKDYLVGGHS
Sbjct: 1 MAHHEVISRSGNAFLLNIRESVLLPGSMSEMHFFLLIGISSIHSDRVILAMKDYLVGGHS 60

Query: 61 RKEVCEKYQMNNGYFSTTLGRLIRLNALAARLAPYYTDESSAFD 104
RKEVCEKYQMNNGYFSTTLGRLIRLNALAARLAPYYTDESSAFD
Sbjct: 61 RKEVCEKYQMNNGYFSTTLGRLIRLNALAARLAPYYTDESSAFD 104


107  UTI89_C4924  UTI89_C4932  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
UTI89_C4924  6       41       -11.521660  hemolysin D
UTI89_C4925  6       39       -11.153761  hemolysin secretion protein HlyB
UTI89_C4926  6       37       -10.429845  hemolysin A
UTI89_C4927  7       33       -8.181545   hemolysin C
UTI89_C4928  7       29       -6.792943   hypothetical protein
UTI89_C4929  6       30       -6.121352   hypothetical protein
UTI89_C4930  6       31       -5.916496   hypothetical protein
UTI89_C4931  5       39       -10.159326  response regulator
UTI89_C4932  8       49       -14.141669  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C4924  RTXTOXIND  597    0.0      Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 597 bits (1541), Expect = 0.0
Identities = 462/478 (96%), Positives = 468/478 (97%)

Query: 1 MKTWLMGFSEFLLRYKLVWSETWKIRKQLDTPVREKDENEFLPAHLELIETPVSRRPRLV 60
MKTWLMGFSEFLLRYKLVWSETWKIRKQLDTPVREKDENEFLPAHLELIETPVSRRPRLV
Sbjct: 1 MKTWLMGFSEFLLRYKLVWSETWKIRKQLDTPVREKDENEFLPAHLELIETPVSRRPRLV 60

Query: 61 AYFIMGFLVIAVILSVLGQVEIVATANGKLTLSGRSKEIKPIENSIVKEIIVKEGESVRK 120
AYFIMGFLVIA ILSVLGQVEIVATANGKLT SGRSKEIKPIENSIVKEIIVKEGESVRK
Sbjct: 61 AYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRK 120

Query: 121 GDVLLKLTALGAEADTLKTQSSLLQTRLEQTRYQILSRSIELNKLPELKLPDEPYFQNVS 180
GDVLLKLTALGAEADTLKTQSSLLQ RLEQTRYQILSRSIELNKLPELKLPDEPYFQNVS
Sbjct: 121 GDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVS 180

Query: 181 EEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTILARINRYENLSRVEKSRLDDF 240
EEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLT+LARINRYENLSRVEKSRLDDF
Sbjct: 181 EEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDF 240

Query: 241 RSLLHKQAIAKHAVLEQENKYVEAANELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEI 300
SLLHKQAIAKHAVLEQENKYVEA NELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEI
Sbjct: 241 SSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEI 300

Query: 301 LDKLRQTTDSIELLTLELEKNEERQQASVIRAPVSGKVQQLKVHTEGGVVTTAETLMVIV 360
LDKLRQTTD+I LLTLEL KNEERQQASVIRAPVS KVQQLKVHTEGGVVTTAETLMVIV
Sbjct: 301 LDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIV 360

Query: 361 PEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQKLGL 420
PEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQ+LGL
Sbjct: 361 PEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGL 420

Query: 421 VFNVIVSVEENDLSTGNKHIPLSSGMAVTAEIKTGMRSVISYLLSPLEESVTESLHER 478
VFNVI+S+EEN LSTGNK+IPLSSGMAVTAEIKTGMRSVISYLLSPLEESVTESL ER
Sbjct: 421 VFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRSVISYLLSPLEESVTESLRER 478


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C4926  RTXTOXINA  1478   0.0      Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 1478 bits (3828), Expect = 0.0
Identities = 977/1024 (95%), Positives = 995/1024 (97%)

Query: 1 MPTITTAQIKSTLQSAKQSAANKLHSAGQSTKDALKKAAEQTRNAGNRLILLIPKDYKGQ 60
M TITTAQIKSTLQSAKQSAANKLHSAGQSTKDALKKAAEQTRNAGNRLILLIPKDYKGQ
Sbjct: 1 MTTITTAQIKSTLQSAKQSAANKLHSAGQSTKDALKKAAEQTRNAGNRLILLIPKDYKGQ 60

Query: 61 GSSLNDLVRTADELGIEVQYDEKNGTAITKQVFGTAEKLIGLTERGVTIFAPQLDKLLQK 120
GSSLNDLVRTADELGIEVQYDEKNGTAITKQVFGTAEKLIGLTERGVTIFAPQLDKLLQK
Sbjct: 61 GSSLNDLVRTADELGIEVQYDEKNGTAITKQVFGTAEKLIGLTERGVTIFAPQLDKLLQK 120

Query: 121 YQKAGNKLGGSAENIGDNLGKAGSVLSTFQNFLGTALSSMKIDELIKKQKSGSNVSSSEL 180
YQKAGN LGG AENIGDNLGKAG +LSTFQNFLGTALSSMKIDELIKKQKSG NVSSSEL
Sbjct: 121 YQKAGNILGGGAENIGDNLGKAGGILSTFQNFLGTALSSMKIDELIKKQKSGGNVSSSEL 180

Query: 181 AKASIELINQLVDTAASINNNVNSFSQQLNKLGSVLSNTKHLNGVGNKLQNLPNLDNIGA 240
AKASIELINQLVDT AS+NNNVNSFSQQLN LGSVLSNTKHLNGVGNKLQNLPNLDNIGA
Sbjct: 181 AKASIELINQLVDTVASLNNNVNSFSQQLNTLGSVLSNTKHLNGVGNKLQNLPNLDNIGA 240

Query: 241 GLDTVSGILSAISASFILSNADADTGTKAAAGVELTTKVLGNVGKGISQYIIAQRAAQGL 300
GLDTVSGILSAISASFILSNADADT TKAAAGVELTTKVLGNVGKGISQYIIAQRAAQGL
Sbjct: 241 GLDTVSGILSAISASFILSNADADTRTKAAAGVELTTKVLGNVGKGISQYIIAQRAAQGL 300

Query: 301 STSAAAAGLIASVVTLAISPLSFLSIADKFKRANKIEEYSQRFKKLGYDGDSLLAAFHKE 360
STSAAAAGLIAS VTLAISPLSFLSIADKFKRANKIEEYSQRFKKLGYDGDSLLAAFHKE
Sbjct: 301 STSAAAAGLIASAVTLAISPLSFLSIADKFKRANKIEEYSQRFKKLGYDGDSLLAAFHKE 360

Query: 361 TGAIDASLTTISTVLASVSSGISAAATTSLVGAPVSALVGAVTGIISGILEASKQAMFEH 420
TGAIDASLTTISTVLASVSSGISAAATTSLVGAPVSALVGAVTGIISGILEASKQAMFEH
Sbjct: 361 TGAIDASLTTISTVLASVSSGISAAATTSLVGAPVSALVGAVTGIISGILEASKQAMFEH 420

Query: 421 VASKMADVIAEWEKKHGKNYFENGYDARHAAFLEDNFKILSQYNKEYSVERSVLITQQHW 480
VASKMADVIAEWEKKHGKNYFENGYDARHAAFLEDNFKILSQYNKEYSVERSVLITQQHW
Sbjct: 421 VASKMADVIAEWEKKHGKNYFENGYDARHAAFLEDNFKILSQYNKEYSVERSVLITQQHW 480

Query: 481 DMLIGELASVTRNGDKTLSGKSYIDYYEEGKRLERRPKEFQQQIFDPLKGNIDLSDSKSS 540
D LIGELA VTRNGDKTLSGKSYIDYYEEGKRLE++ EFQ+Q+FDPLKGNIDLSDSKSS
Sbjct: 481 DTLIGELAGVTRNGDKTLSGKSYIDYYEEGKRLEKKXDEFQKQVFDPLKGNIDLSDSKSS 540

Query: 541 TLLKFVTPLLTPGEEIRERRQSGKYEYITELLVKGVDKWTVKGVQDKGSVYDYSNLIQHA 600
TLLKFVTPLLTPGEEIRERRQSGKYEYITELLVKGVDKWTVKGVQDKG+VYDYSNLIQHA
Sbjct: 541 TLLKFVTPLLTPGEEIRERRQSGKYEYITELLVKGVDKWTVKGVQDKGAVYDYSNLIQHA 600

Query: 601 SVGNNQYREIRIESHLGDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATE 660
SVGNNQYREIRIESHLGDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATE
Sbjct: 601 SVGNNQYREIRIESHLGDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATE 660

Query: 661 AGNYTVTRVLGGDVKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVE 720
AGNYTVTRVLGGDVKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVE
Sbjct: 661 AGNYTVTRVLGGDVKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVE 720

Query: 721 ELIGTTRADKFFGSKFTDIFHGADGDDHIEGNDGNDRLYGDKGNDTLRGGNGDDQLYGGD 780
ELIGTTRADKFFGSKFTDIFHGADGDD IEGNDGNDRLYGDKGNDTL GGNGDDQLYGGD
Sbjct: 721 ELIGTTRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGD 780

Query: 781 GNDKLIGGTGNNYLNGGDGDDELQVQGNSLAKNVLSGGKGNDKLYGSEGADLLDGGEGND 840
GNDKLIG GNNYLNGGDGDDE QVQGNSLAKNVL GGKGNDKLYGSEGADLLDGGEG+D
Sbjct: 781 GNDKLIGVAGNNYLNGGDGDDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDD 840

Query: 841 LLKGGYGNDIYRYLSGYGHHIIDDEGGKDDKLSLADIDFRDVAFKREGNDLIMYKAEGNV 900
LLKGGYGNDIYRYLSGYGHHIIDD+GGK+DKLSLADIDFRDVAFKREGNDLIMYK EGNV
Sbjct: 841 LLKGGYGNDIYRYLSGYGHHIIDDDGGKEDKLSLADIDFRDVAFKREGNDLIMYKGEGNV 900

Query: 901 LSIGHKNGITFKNWFEKESDDLSNHQIEQIFDKDGRVITPDSLKKAFEYQQSNNKVSYVY 960
LSIGHKNGITF+NWFEKES D+SNH+IEQIFDK GR+ITPDSLKKA EYQQ NNK SYVY
Sbjct: 901 LSIGHKNGITFRNWFEKESGDISNHEIEQIFDKSGRIITPDSLKKALEYQQRNNKASYVY 960

Query: 961 GHDASTYGSQDNLNPLINEISKIISAAGNFDVKEERSAASLLQLSGNASDFSYGRNSITL 1020
G+DA YGSQ +LNPLINEISKIISAAG+FDVKEER+AASLLQLSGNASDFSYGRNSITL
Sbjct: 961 GNDALAYGSQGDLNPLINEISKIISAAGSFDVKEERTAASLLQLSGNASDFSYGRNSITL 1020

Query: 1021 TASA 1024
T SA
Sbjct: 1021 TTSA 1024


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
UTI89_C4927  RTXTOXINC  318    e-115    Gram-negative bacterial RTX toxin-activating protein C...
		>RTXTOXINC#Gram-negative bacterial RTX toxin-activating protein C signature.

Length = 170

Score = 318 bits (817), Expect = e-115
Identities = 162/170 (95%), Positives = 167/170 (98%)

Query: 45 MNMNNPLEVLGHVSWLWASSPLHRNWPVSLFAINVLPAIRANQYALLTRDNYPVAYCSWA 104
MN+N PLE+LGHVSWLWASSPLHRNWPVSLFAINVLPAI+ANQY LLTRD+YPVAYCSWA
Sbjct: 1 MNINKPLEILGHVSWLWASSPLHRNWPVSLFAINVLPAIQANQYVLLTRDDYPVAYCSWA 60

Query: 105 NLSLENEIKYLNDVTSLVAEDWTSGDRKWFIDWIAPFGDNGALYKYMRKKFPDELFRAIR 164
NLSLENEIKYLNDVTSLVAEDWTSGDRKWFIDWIAPFGDNGALYKYMRKKFPDELFRAIR
Sbjct: 61 NLSLENEIKYLNDVTSLVAEDWTSGDRKWFIDWIAPFGDNGALYKYMRKKFPDELFRAIR 120

Query: 165 VDPKTHVGKVSEFHGGKIDKQLANKIFKQYHHELITEVKNKTDFNFSLTG 214
VDPKTHVGKVSEFHGGKIDKQLANKIFKQYHHELITEVK K+DFNFSLTG
Sbjct: 121 VDPKTHVGKVSEFHGGKIDKQLANKIFKQYHHELITEVKRKSDFNFSLTG 170


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C4931  HTHFIS  90     9e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.5 bits (222), Expect = 9e-23
Identities = 35/129 (27%), Positives = 60/129 (46%)

Query: 7 KILLMEDDYDIAALLRLNLQDEGYQIVHEADGARARLLLDKQTWDAVILDLMLPNVNGLE 66
IL+ +DD I +L L GY + ++ A + D V+ D+++P+ N +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 67 ICRYIRQMTRYLPVIIISARTSETHRVLGLEMGADDYLPKPFSIPELIARIKALFRRQEA 126
+ I++ LPV+++SA+ + + E GA DYLPKPF + ELI I +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 127 MGQNILLAG 135
+
Sbjct: 125 RPSKLEDDS 133


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C4932  PF06580  42     3e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 42.2 bits (99), Expect = 3e-06
Identities = 24/137 (17%), Positives = 54/137 (39%), Gaps = 24/137 (17%)

Query: 360 LSIETRRLQLRIMMSHSLPLIRADISMIERVITNLLDNAVRH----TPPEGSIRLKVWQE 415
L + + + + R+ + + D+ + ++ L++N ++H P G I LK ++
Sbjct: 229 LQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKD 288

Query: 416 DNRLHVEVADSGPGLTEDMRTHLFRRASVLCHEPSEEPRGGLGLLIVRRMLVLHGGD--- 472
+ + +EV ++G + + + G GL VR L + G
Sbjct: 289 NGTVTLEVENTGSLALK-----------------NTKESTGTGLQNVRERLQMLYGTEAQ 331

Query: 473 IRLTDSTTGACFRFFLP 489
I+L++ +P
Sbjct: 332 IKLSEKQGKVNAMVLIP 348


108  UTI89_C5168  UTI89_C5174  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
UTI89_C5168  -2      15       0.898306   phosphoglycerate mutase
UTI89_C5169  -1      13       0.316499   right origin-binding protein
UTI89_C5170  0       14       -0.060943  hypothetical protein
UTI89_C5171                              DNA-binding response regulator CreB
UTI89_C5172                              sensory histidine kinase CreC
UTI89_C5173                              hypothetical protein
UTI89_C5174                              two-component response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
UTI89_C5168  VACCYTOTOXIN  29     0.017    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 28.8 bits (64), Expect = 0.017
Identities = 14/45 (31%), Positives = 20/45 (44%), Gaps = 4/45 (8%)

Query: 145 PLLVSHGIALGCLVSTILGLPAWAERRLRLRNCSISRVDYQESLW 189
P +V GIA G V T+ GL W ++ N D + +W
Sbjct: 42 PAIVG-GIATGAAVGTVSGLLGWGLKQAEEAN---KTPDKPDKVW 82


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C5171  HTHFIS  90     9e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.5 bits (222), Expect = 9e-23
Identities = 34/139 (24%), Positives = 61/139 (43%)

Query: 1 MQRETVWLVEDEQGIADTLVYMLQQEGFDVEVFERGLPVLDKARQQVPDVMILDVGLPDI 60
M T+ + +D+ I L L + G+DV + + D+++ DV +PD
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 SGFELCRQLLALHPALPVLFLTARSEEVDRLLGLEIGADDYVAKPFSPREVCARVRTLLR 120
+ F+L ++ P LPVL ++A++ + + E GA DY+ KPF E+ + L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 RVKKFSTPSPVIRIGHFEL 139
K+ + L
Sbjct: 121 EPKRRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
UTI89_C5172  PF06580  32     0.006    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.8 bits (72), Expect = 0.006
Identities = 41/182 (22%), Positives = 72/182 (39%), Gaps = 40/182 (21%)

Query: 312 LRQARLENRQEVVLTAVDVAALFR---RVSEARTVQLAE--KNITLHVM--------PTE 358
+R LE+ + ++ L R R S AR V LA+ + ++ +
Sbjct: 182 IRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQ 241

Query: 359 VNVASEPALLEQALGNLL-----DNA----IDFTPESGCITLSAEVDQEYVTLKVLDTGS 409
PA+++ + +L +N I P+ G I L D VTL+V +TGS
Sbjct: 242 FENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGS 301

Query: 410 GIPDYALSRIFERFYSLPRANGQKSSGLGLAFVSE-VARLFNGEVTLR-NVQEGGVLASL 467
N ++S+G GL V E + L+ E ++ + ++G V A +
Sbjct: 302 LALK----------------NTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 468 RL 469
+
Sbjct: 346 LI 347


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
UTI89_C5174  HTHFIS  82     4e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 81.8 bits (202), Expect = 4e-20
Identities = 30/122 (24%), Positives = 60/122 (49%), Gaps = 1/122 (0%)

Query: 1 MQTPHILIVEDELVTRNTLKSIFEAEGYDVFEATDGAEMHQILSEYDINLVIMDINLPGK 60
M IL+ +D+ R L GYDV ++ A + + ++ D +LV+ D+ +P +
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 NGLLLARELRE-QANVALMFLTGRDNEVDKILGLEIGADDYITKPFNPRELTIRARNLLS 119
N L +++ + ++ ++ ++ ++ + I E GA DY+ KPF+ EL L+
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 RT 121

Sbjct: 121 EP 122



 
Contact Sachin Pundhir for Bugs/Comments.