PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: Escherichia_coli_CFT073.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
 
C) Potential GIs and PAIs in AE014075 (download)
S.No  Start  End    Bias  Virulence  Insertion elements  Prediction
1     c0015  c0023  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag  DNBias  CDNBias  %GCBias     Product
c0015     2       30       0.184233    Hypothetical protein yaaH
c0016     1       26       -1.173319   Hypothetical protein yaaW
c0017     -1      22       -2.313863   Hypothetical protein yaaI precursor
c0018     -1      20       -3.073264   Putative glutamate dehydrogenase
c0019     -2      19       -3.429336   Chaperone protein dnaK
c0020     -2      19       -4.400166   Chaperone protein dnaJ
c0021     0       24       -6.049081   Putative conserved protein
c0022     0       23       -5.237722   Hypothetical protein
c0023     0       22       -4.555444   Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0016     PF07201       30     0.006    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 30.2 bits (68), Expect = 0.006
Identities = 9/51 (17%), Positives = 24/51 (47%)

Query: 138 LHAVDARVNELEELLPLLMKDKLLAKGVSHLLSSQLTRILRTHAAMSVLGH 188
+ V+ +VN+ +P L + + +++ +S L +S + + A +
Sbjct: 80 VSDVEEQVNQYLSKVPELEQKQNVSELLSLLSNSPNISLSQLKAYLEGKSE 130
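
The percentages in the "Identities / Positives / Gaps" summary lines can be recomputed from the raw counts. The following is a minimal sketch, not part of PredictBias: the helper names `parse_summary` and `truncated_percent` are hypothetical, and the rounding rule (integer truncation) is an assumption inferred from the values shown on this page, not taken from the tool's documentation.

```python
import re

# Matches "label = n/m (p%)" fields as they appear in the alignment summaries.
FIELD = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_summary(line):
    """Return {label: (count, length, reported_percent)} for one summary line."""
    return {
        m.group(1): (int(m.group(2)), int(m.group(3)), int(m.group(4)))
        for m in FIELD.finditer(line)
    }


def truncated_percent(count, length):
    """Integer percentage rounded down (assumed rounding rule, see lead-in)."""
    return count * 100 // length


# Summary line copied from the c0016 / PF07201 hit above.
summary = parse_summary("Identities = 9/51 (17%), Positives = 24/51 (47%)")
for label, (count, length, reported) in summary.items():
    # Compare the recomputed percentage with the one printed in the report.
    print(label, truncated_percent(count, length) == reported)
```

A check like this is mainly useful for spotting rows that were damaged when the page was converted to text: a mismatch between the counts and the reported percentage suggests a corrupted line.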


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0019     SHAPEPROTEIN  142    7e-40    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 142 bits (361), Expect = 7e-40
Identities = 83/387 (21%), Positives = 149/387 (38%), Gaps = 84/387 (21%)

Query: 5 IGIDLGTTNSCVAIMDGTTPRVLENAEGDRTTPSIIAYTQDGET------LVGQPAKRQA 58
+ IDLGT N+ + + + E PS++A QD VG AK+
Sbjct: 13 LSIDLGTANTLIYVKGQGIV-LNE--------PSVVAIRQDRAGSPKSVAAVGHDAKQML 63

Query: 59 VTNPQNTLFAIKRLIGRRFQDEEVQRDVSIMPFKIIAADNGDAWVEVKGQKMAPPQISAE 118
P N + AI+ + +D I F + +
Sbjct: 64 GRTPGN-IAAIRPM-----------KDGVIADFFVTEK------------------MLQH 93

Query: 119 VLKKMKKTAEDYLGEPVTEAVITVPAYFNDAQRQATKDAGRIAGLEVKRIINEPTAAALA 178
+K++ + P ++ VP +R+A +++ + AG +I EP AAA+
Sbjct: 94 FIKQVHS---NSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIG 150

Query: 179 YGL--DKGTGNRTIAVYDLGGGTFDISIIEIDEVDGEKTFEVLATNGDTHLGGEDFDSRL 236
GL + TG+ V D+GGGT ++++I ++ V + +GG+ FD +
Sbjct: 151 AGLPVSEATGS---MVVDIGGGTTEVAVISLNGV---------VYSSSVRIGGDRFDEAI 198

Query: 237 INYLVEEFKKDQGIDLRNDPLAMQRLKEAAEKAKIELSSA----QQTDVNLPYITADATG 292
INY+ + G + AE+ K E+ SA + ++ +
Sbjct: 199 INYVRRNYGSLIG-------------EATAERIKHEIGSAYPGDEVREIEVRGRNLAEGV 245

Query: 293 PKHMNIKVTRAKLESLVEDLVNRSIEPLKVALQD-AGLSVSDIDD--VILVGGQTRMPMV 349
P+ + + LE+L E + + + VAL+ SDI + ++L GG + +
Sbjct: 246 PRGFTLN-SNEILEALQEP-LTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNL 303

Query: 350 QKKVAEFFGKEPRKDVNPDEAVAIGAA 376
+ + E G +P VA G
Sbjct: 304 DRLLMEETGIPVVVAEDPLTCVARGGG 330


2     c0069  c0083  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag  DNBias  CDNBias  %GCBias     Product
c0069     -2      19       3.259008    Ribosomal large subunit pseudouridine synthase
c0070     -3      20       3.901451    RNA polymerase associated protein
c0071     -3      17       3.724388    DNA polymerase II
c0072     -4      17       3.398484    Transposase
c0073     -2      17       3.311702    L-ribulose-5-phosphate 4-epimerase
c0074     -2      13       0.748616    L-arabinose isomerase
c0075     1       20       -2.708149   L-ribulokinase
c0076     2       28       -6.562227   Arabinose operon Regulatory protein
c0077     2       30       -6.017303   Hypothetical protein
c0078     1       18       -1.377101   Hypothetical protein
c0079     1       14       -0.288022   Hypothetical protein
c0080     0       14       2.487318    Hypothetical protein
c0081     1       15       3.397143    Hypothetical protein yabI
c0082     1       16       3.104813    Thiamine transport ATP-binding protein thiQ
c0083     2       16       2.782491    Thiamine transport system permease protein thiP
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0080     SECYTRNLCASE  27     0.042    Preprotein translocase SecY subunit signature.
		>SECYTRNLCASE#Preprotein translocase SecY subunit signature.

Length = 437

Score = 26.6 bits (59), Expect = 0.042
Identities = 13/29 (44%), Positives = 17/29 (58%), Gaps = 1/29 (3%)

Query: 2 SKYIYILLSF-LVLFFIFFYAYISLMSKE 29
IYI+ F L++FF FFY IS +E
Sbjct: 314 DHPIYIVTYFLLIVFFAFFYVAISFNPEE 342


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0083     PF06580       31     0.014    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 30.6 bits (69), Expect = 0.014
Identities = 17/80 (21%), Positives = 27/80 (33%), Gaps = 5/80 (6%)

Query: 4 RRQPLIPGWLIPGVSAATLVVAVALAAFLALWWNAPQGDWVAVWQDS-YLWHVVRFSFWQ 62
R GWL + L V A +W+ A ++W+ ++
Sbjct: 60 RSFIKRQGWLKLNMGQIILRVLPACVVIGMVWFVAN----TSIWRLLAFINTKPVAFTLP 115

Query: 63 AFLSAQLSVVPAIFLARALY 82
LS +VV F+ LY
Sbjct: 116 LALSIIFNVVVVTFMWSLLY 135


3     c0130  c0141  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag  DNBias  CDNBias  %GCBias     Product
c0130     1       18       -4.390923   AmpE protein
c0131     3       21       -5.051009   Aromatic amino acid transport protein aroP
c0133     2       25       -5.802544   Hypothetical protein
c0134     0       19       -1.388651   Hypothetical protein
c0136     1       18       -0.948544   Hypothetical protein
c0137     4       25       0.920235    Hypothetical protein
c0138     4       32       1.484912    Unknown protein of IS629 encoded within
c0139     3       25       1.762771    Putative Transposase for IS629
c0140     2       30       2.242634    Pyruvate dehydrogenase complex repressor
c0141     2       31       2.076162    Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0137     PF04605       26     0.019    Virulence-associated protein D (VapD)
		>PF04605#Virulence-associated protein D (VapD)

Length = 125

Score = 26.0 bits (57), Expect = 0.019
Identities = 13/34 (38%), Positives = 20/34 (58%)

Query: 1 MYNFKDKIEDYTEREFIELLGEFTNPTGDNAQLK 34
Y+ K+ I+D ++F + L EFT T N +LK
Sbjct: 88 QYSLKETIQDLCAKDFHQKLKEFTEKTPKNQKLK 121


4     c0161  c0188  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag  DNBias  CDNBias  %GCBias     Product
c0161     0       20       -3.442137   Hypothetical protein
c0162     0       20       -4.117505   Hypothetical protein yadD
c0163     3       24       -5.101464   Hypothetical protein
c0164     3       26       -5.636171   Pantoate--beta-alanine ligase
c0165     3       28       -6.742750   3-methyl-2-oxobutanoate
c0166     3       32       -7.991245   Hypothetical fimbrial-like protein yadC
c0167     2       31       -7.660489   Protein yadK
c0168     2       31       -7.994574   Hypothetical protein yadL precursor
c0169     2       31       -8.164059   Hypothetical protein yadM precursor
c0170     1       28       -7.384391   Outer membrane usher protein htrE precursor
c0171     0       21       -3.133332   Chaperone protein ecpD precursor
c0172     -1      16       -0.997921   Hypothetical fimbrial-like protein yadN
c0173     -2      15       0.628156    Hypothetical protein
c0174     -1      14       1.030314    Hypothetical protein
c0175     0       14       2.098698    2-amino-4-hydroxy-6-
c0176     0       14       3.534896    Poly(A) polymerase
c0177     1       16       3.007265    Hypothetical protein yadB
c0178     0       17       3.301794    DnaK suppressor protein
c0179     0       17       3.061849    Sugar fermentation stimulation protein A
c0180     0       14       2.483898    2'-5' RNA ligase
c0181     -1      14       2.582955    ATP-dependent helicase hrpB
c0182     -2      15       1.737249    Hypothetical protein
c0183     -2      17       3.321866    Penicillin-binding protein 1B
c0184     -2      15       3.456078    Hypothetical protein
c0185     -1      14       3.530132    Ferrichrome-iron receptor precursor
c0186     0       16       4.609017    Ferrichrome transport ATP-binding protein fhuC
c0187     0       15       4.343782    Ferrichrome-binding periplasmic protein
c0188     0       15       4.335000    Ferrichrome transport system permease protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0165     FLGMRINGFLIF  29     0.018    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 29.2 bits (65), Expect = 0.018
Identities = 27/100 (27%), Positives = 40/100 (40%), Gaps = 22/100 (22%)

Query: 110 MVKIEGGEWL----VETVQMLTERAVPVCGHLGLTPQSVNIFGGYKVQGRGDEAGDRLL- 164
V +E G L + V L AV GL P +V + D++G LL
Sbjct: 176 TVTLEPGRALDEGQISAVVHLVSSAVA-----GLPPGNVTLV---------DQSG-HLLT 220

Query: 165 -SDALALEAAGAQLLVLECVPVELAKRITEALAIPVIGIG 203
S+ + AQL V + +RI L+ P++G G
Sbjct: 221 QSNTSGRDLNDAQLKFANDVESRIQRRIEAILS-PIVGNG 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0170     PF00577       807    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 807 bits (2085), Expect = 0.0
Identities = 263/869 (30%), Positives = 429/869 (49%), Gaps = 40/869 (4%)

Query: 12 IATFCALLYSNSALCAELVEYDHTFLMGKDASNIDLSRYTEGNPTLPGIYDVSVYVNDQP 71
+ CA AE + ++ FL + DLSR+ G PG Y V +Y+N+
Sbjct: 30 LFVACAFAAQAPLSSAE-LYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGY 88

Query: 72 IMSQSIAFAVIEGKKNAQACITQKNLLQFHISSPDKNSEKAILLKRDDDLGDCLNLAEMI 131
+ ++ + F + ++ C+T+ L +++ + + C+ L MI
Sbjct: 89 MATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLL------ADDACVPLTSMI 142

Query: 132 PQSSIRYDVNDQRLDIDVPQAWIMKNYQNYVDPSLWENGINAAMLSYNLNGYHSESP-GR 190
++ + DV QRL++ +PQA++ + Y+ P LW+ GINA +L+YN +G ++ G
Sbjct: 143 HDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGG 202

Query: 191 TNDSIYAAFNGGINLGAWRLRASGNYNWMTNVHS-----DYDFQNRYLQRDLASLRSQLV 245
+ Y G+N+GAWRLR + +++ ++ S + N +L+RD+ LRS+L
Sbjct: 203 NSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLT 262

Query: 246 IGESYTTGETFDSVSIRGIRLYSDSRMLPPVLASFAPIIHGVANTNAKVTVMQNGYKIYE 305
+G+ YT G+ FD ++ RG +L SD MLP FAP+IHG+A A+VT+ QNGY IY
Sbjct: 263 LGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYN 322

Query: 306 TTVPPGAFAIDDLSPSGYGSDLIVTIEEADGTKRTFSQPFSSVVQMLRPGVGRWDISAGQ 365
+TVPPG F I+D+ +G DL VTI+EADG+ + F+ P+SSV + R G R+ I+AG+
Sbjct: 323 STVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGE 382

Query: 366 VLKD-SIQDEPNLFQASYYYGLNNYLTGYTGIQLTDNNYTAGLLGLGMNT-PVGAFSVDV 423
+ Q++P FQ++ +GL T Y G QL D Y A G+G N +GA SVD+
Sbjct: 383 YRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLAD-RYRAFNFGIGKNMGALGALSVDM 441

Query: 424 THSNVSIPDDKTYQGQSYRISWNKLFENTSTSLNIAAYRYSTQHYLGLNDALTLIDEVEH 483
T +N ++PDD + GQS R +NK + T++ + YRYST Y D +
Sbjct: 442 TQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYN 501

Query: 484 PE-----QDLEPKSMRNYSRM---KNQVTVSINQPLKFEKKDYGSFYLSGSWSDYWASGQ 535
E ++PK Y+ + ++ +++ Q L + YLSGS YW +
Sbjct: 502 IETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQL----GRTSTLYLSGSHQTYWGTSN 557

Query: 536 NSTNYSIGYSNSASWGSYSISAQRSLNE-DGQTDDSIYLSFTIPIENLLGTEHRSS-GFQ 593
+ G + + ++++S + N D + L+ IP + L ++ +S
Sbjct: 558 VDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHA 617

Query: 594 SIDTQLNSDFKGNNQLNISSSGYSDT-NRISYSVNTGYMMNKSSDDLSYIGGYASYESPW 652
S ++ D G G N +SYSV TGY + S +Y +
Sbjct: 618 SASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGY 677

Query: 653 GTLSGSASASSDNSRQFSLNTDGGFVLHSGGLTFSNDSFSDSDTLAVIQAPGAKGARINY 712
G + S S D +Q GG + H+ G+T +DT+ +++APGAK A++
Sbjct: 678 GNANIGYSHSDDI-KQLYYGVSGGVLAHANGVTLGQPL---NDTVVLVKAPGAKDAKVEN 733

Query: 713 GNST-VDRWGYGVTSALSPYHENRIALDINDLENDVELKSTSTVAVPRQGAVVFADFETV 771
D GY V + Y ENR+ALD N L ++V+L + VP +GA+V A+F+
Sbjct: 734 QTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKAR 793

Query: 772 QGQSAIMNIVRSDGKNIPFAADIYDEQNNIIGNVGQGGQAFVRGIEQEGNIRITWIEEGK 831
G +M + + K +PF A + E + G V GQ ++ G+ G +++ W EE
Sbjct: 794 VGIKLLMT-LTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEEN 852

Query: 832 PVSCFAHYQQNTTSEKIAQSIILNGLRCQ 860
C A+YQ S++ Q + C+
Sbjct: 853 A-HCVANYQLPPESQQ--QLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0187     FERRIBNDNGPP  509    0.0      Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 509 bits (1312), Expect = 0.0
Identities = 293/296 (98%), Positives = 294/296 (99%)

Query: 2 MSGLPLISRRRLLTAMALSPLLWQMNTAHAAVIDPNRIVALEWLPVELLLALGIVPYGVA 61
MSGLPLISRRRLLTAMALSPLLWQMNTAHAA IDPNRIVALEWLPVELLLALGIVPYGVA
Sbjct: 1 MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA 60

Query: 62 DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR 121
DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR
Sbjct: 61 DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR 120

Query: 122 GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAHYEDFIRSMKPRFVKRGARPLLLT 181
GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLA YEDFIRSMKPRFVKRGARPLLLT
Sbjct: 121 GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLT 180

Query: 182 TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH 241
TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH
Sbjct: 181 TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH 240

Query: 242 DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRILDNAIGGKA 297
DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVR+LDNAIGGKA
Sbjct: 241 DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVLDNAIGGKA 296


5     c0249  c0365  Y     Y          Y                   Pathogenicity Island (biased-composition)

LocusTag  DNBias  CDNBias  %GCBias     Product
c0249     -1      26       -3.794754   Probable hydroxyacylglutathione hydrolase
c0250     -1      23       -1.759738   Hypothetical protein yafS
c0251     0       25       -2.163845   Ribonuclease HI
c0252     1       26       -1.388651   DNA polymerase III, epsilon chain
c0253     4       21       -0.259833   *Hypothetical protein
c0254     4       18       0.199482    Hypothetical protein
c0255     4       18       0.802441    Hypothetical protein
c0256     4       22       -0.740423   Hypothetical protein
c0257     4       21       -0.005827   Unknown in ISEc8
c0258     2       21       0.348289    Unknown in ISEc8
c0259     1       20       -0.441345   Hypothetical protein
c0260     2       24       -0.685948   Hypothetical protein
c0261     2       24       -0.467677   Hypothetical protein
c0262     3       26       1.982997    Hypothetical protein
c0263     3       27       2.124675    Putative Transposase
c0264     6       28       2.808750    Hypothetical protein
c0265     6       26       3.231107    Hypothetical protein
c0268     6       29       4.020409    Hypothetical protein
c0269     8       29       4.841583    Hypothetical protein yeeV
c0270     8       28       4.797535    Hypothetical protein yeeU
c0271     9       28       4.986754    Hypothetical protein
c0272     9       26       4.865144    Hypothetical protein yeeT
c0273     8       26       5.082844    Putative radC-like protein yeeS
c0274     7       26       4.661954    Hypothetical protein
c0275     4       22       2.691045    Hypothetical protein yafX
c0276     3       23       1.525834    Hypothetical protein
c0277     2       20       -0.854952   Hypothetical protein
c0278     3       20       -4.265736   Hypothetical protein yafZ
c0279     3       18       -4.401834   Hypothetical protein ykfF
c0280     2       20       -3.151222   Hypothetical protein
c0281     3       19       -1.485102   Hypothetical protein
c0282     4       20       -1.057400   Hypothetical protein
c0283     5       22       0.225605    Hypothetical protein
c0284     6       25       1.771522    Conserved hypothetical protein
c0285     6       22       0.190975    Conserved hypothetical protein
c0286     6       21       -1.480478   Conserved hypothetical protein
c0287     7       27       -5.837158   Hypothetical protein
c0288     6       31       -6.997625   Hypothetical protein
c0289     1       19       -3.843699   Hypothetical protein
c0290     -1      16       -3.164352   Hypothetical protein
c0291     -2      17       -1.569043   Hypothetical protein
c0292     -1      17       -0.118372   Hypothetical protein
c0293     -1      16       1.409403    Hypothetical protein
c0294     0       17       2.235289    Hypothetical protein
c0295     1       20       2.093539    Hypothetical protein
c0296     0       19       1.572432    Hypothetical protein
c0297     0       24       -0.602024   Hypothetical protein
c0298     7       32       -4.225530   Hypothetical protein
c0299     6       33       -2.596380   Hypothetical protein
c0300     9       24       0.821875    Hypothetical protein
c0301     10      24       1.540642    Hypothetical protein
c0302     6       21       -2.177747   Hypothetical protein
c0303     5       20       -2.248250   Hypothetical protein
c0304     5       21       -2.251396   Hypothetical protein
c0305     5       24       -1.735493   Hypothetical protein
c0306     4       26       -2.601484   Conserved hypothetical protein
c0307     5       25       -3.163342   Hypothetical protein
c0308     8       29       -2.761697   Haemolysin expression modulating protein
c0309     8       29       -2.564134   Conserved hypothetical protein
c0310     8       30       -1.989252   Hypothetical protein
c0311     9       34       -3.416034   Hypothetical protein
c0312     9       32       -3.902617   Hypothetical protein
c0313     6       33       -6.495776   Hypothetical protein
c0314     5       32       -8.870473   Hypothetical protein
c0315     5       34       -9.635876   Hypothetical protein
c0316     4       36       -11.332366  Hypothetical protein
c0317     5       39       -12.650523  Conserved hypothetical protein
c0318     5       43       -13.534001  Putative sugar-phosphate isomerase
c0319     6       42       -13.693178  Putative oligogalacturonide lyase
c0320     6       44       -14.587699  Hypothetical protein
c0321     7       45       -15.033875  Gluconate 5-dehydrogenase
c0322     8       45       -15.223186  Putative oligogalacturonide transporter
c0323     7       45       -15.374588  Putative exopolygalacturonate lyase
c0324     7       42       -15.143925  Hypothetical protein
c0325     5       40       -12.714071  Hypothetical protein
c0326     3       37       -10.520841  Hypothetical protein
c0327     1       28       -8.161508   Conserved hypothetical protein
c0328     0       26       -6.656589   Hypothetical protein
c0329     -1      23       -5.714421   Hypothetical protein
c0330     -1      22       -5.768017   Putative deoxyribose operon repressor
c0331     -1      22       -5.691265   Putative ribokinase
c0332     -1      20       -4.583718   Putative L-fucose permease
c0333     -2      18       -3.824456   Putative cytoplasmic protein
c0334     -2      19       -3.536261   Putative integral membrane protein
c0335     -1      17       -2.846754   Hypothetical protein
c0336     0       18       -3.227731   PTS system, mannitol (Cryptic)-specific IIA
c0337     4       19       2.846773    Putative conserved protein
c0338     5       20       3.308484    Hypothetical protein
c0339     4       19       2.991184    Hypothetical protein
c0340     4       21       1.902140    Conserved hypothetical protein
c0341     4       20       1.698008    Hypothetical protein
c0345     4       21       1.797624    Putative member of ShlA/HecA/FhaA exoprotein
c0348     4       23       -1.906450   Hypothetical protein
c0349     4       22       -1.533021   Putative Transposase within prophage
c0350     6       25       -1.852878   Pic serine protease precursor
c0351     3       26       -4.108091   Hypothetical protein
c0352     4       27       -2.928805   Partial Transposase
c0354     5       32       -4.344230   Unknown protein of IS629 encoded within
c0355     5       32       -3.546771   Hypothetical protein
c0357     7       32       -3.691166   Hypothetical protein
c0358     7       30       -3.117230   Hypothetical protein
c0359     6       31       -3.884195   Hypothetical protein
c0360     6       32       -4.701044   Hypothetical protein
c0361     7       31       -4.768164   Putative cytoplasmic membrane export protein
c0362     7       31       -5.824186   Putative membrane spanning export protein
c0363     7       31       -6.448175   Putative RTX family exoprotein A gene
c0364     2       33       -11.006199  Hypothetical protein
c0365     0       27       -6.994370   Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0249     BINARYTOXINB  34     5e-04    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 34.3 bits (78), Expect = 5e-04
Identities = 12/55 (21%), Positives = 28/55 (50%), Gaps = 4/55 (7%)

Query: 186 NDYYRKVKELRAKNQITLPVILKNERQINVFLRT----EDIDLINVINEETLLQQ 236
+ ++ EL A N T+ +K ++N+ +R D + I V +E+++++
Sbjct: 589 QNIKNQLAELNATNIYTVLDKIKLNAKMNILIRDKRFHYDRNNIAVGADESVVKE 643


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0258     PF06580       33     6e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.3 bits (76), Expect = 6e-04
Identities = 9/74 (12%), Positives = 29/74 (39%), Gaps = 8/74 (10%)

Query: 16 LRQKDQQLSLVEETKTFLRSALARAEEKIEEDEREIEHLRA--QIEKLRRMLFGTCSEKL 73
L + ++ +R +L + + E+ + + Q+ ++ F ++L
Sbjct: 187 LEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQ---FE---DRL 240

Query: 74 RREVEQAEALLKQR 87
+ E + A++ +
Sbjct: 241 QFENQINPAIMDVQ 254


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0268     HTHFIS        28     0.034    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 28.3 bits (63), Expect = 0.034
Identities = 8/39 (20%), Positives = 18/39 (46%), Gaps = 1/39 (2%)

Query: 84 NGAQFRQLCETTDWVDAGE-NVLLFGASGLGKSHLAAAI 121
A +++ + + +++ G SG GK +A A+
Sbjct: 142 RSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0297     PF05272       33     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.7 bits (74), Expect = 0.002
Identities = 12/22 (54%), Positives = 13/22 (59%)

Query: 32 VTVLLGPNGCGKSTLLRALAGL 53
VL G G GKSTL+ L GL
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0321     DHBDHDRGNASE  132    1e-39    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 132 bits (333), Expect = 1e-39
Identities = 77/262 (29%), Positives = 131/262 (50%), Gaps = 14/262 (5%)

Query: 4 INMFSLKGKNALVTGASYGIGFEIAKALSNAGATIIFNDVIEENIKKGLAAYKENNINAH 63
+N ++GK A +TGA+ GIG +A+ L++ GA I D E ++K +++ K +A
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 64 GYLFDITDEQSVKENIKKIESDVGIIDILVNNAGIIQRKPMLETTAADYRKIIDIDLTGQ 123
+ D+ D ++ E +IE ++G IDILVN AG+++ + + ++ ++ TG
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 124 FIMAKAVLPSMIKKGHGKIINICSMMSELGRETVVGYASAKGGLKMLTKNICSEFGEKNI 183
F +++V M+ + G I+ + S + + R ++ YAS+K M TK + E E NI
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 184 QCNGIGPGYIAT--------PQTAPLREIQPDGSRHPFDQFIISKTPAGRWGKPDDLAGA 235
+CN + PG T + + I+ + F P + KP D+A A
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAEQVIKGSL-----ETFKTG-IPLKKLAKPSDIADA 234

Query: 236 AIFLASDASNFINGHILYVDGG 257
+FL S + I H L VDGG
Sbjct: 235 VLFLVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0345     PF05860       74     2e-17    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 74.1 bits (182), Expect = 2e-17
Identities = 25/139 (17%), Positives = 44/139 (31%), Gaps = 26/139 (18%)

Query: 32 AVITPQNGA---GMDKAANGVPVVNIATPNGAGISHNRFTDYNVGKEGLILNNATGKLNP 88
A ITP ++ T G+ + H+ F +++V G N
Sbjct: 1 AQITPDTTLPINSNITTEGNTRIIERGTQAGSNLFHS-FQEFSVPTSGTAFFNNPT---- 55

Query: 89 TQLGGLIQNNPNLKAGGEAKGIINEVTGGKRSLLQGYTEVAGKAANVMVANPYGITCDGC 148
+ II+ VTGG S + G A N+ + NP GI
Sbjct: 56 -----------------NIQNIISRVTGGSVSNIDGLIRANATA-NLFLINPNGIIFGQN 97

Query: 149 GFINTPHATLTTGKPVMNA 167
++ + + + +
Sbjct: 98 ARLDIGGSFVGSTANRLKF 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0350     IGASERPTASE   738    0.0      IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 738 bits (1906), Expect = 0.0
Identities = 260/862 (30%), Positives = 401/862 (46%), Gaps = 116/862 (13%)

Query: 53 SQAGIVRSDIAYQIYRDFAENKGLFVPGATDIPVYDKDGKLVGRL--DKAPMADFSSVSS 110
++A +VR D+ YQI+RDFAENKG F GAT++ V DK+ K +G + PM DFS V
Sbjct: 23 TEAALVRDDVDYQIFRDFAENKGKFSVGATNVLVKDKNNKDLGTALPNGIPMIDFSVVDV 82

Query: 111 N-GVATLVSPQYIVSVKH-NGGYQSVSFGN------------------GKNTYSLVDRNN 150
+ +ATL++PQY+V VKH + G + FGN +N Y V++N
Sbjct: 83 DKRIATLINPQYVVGVKHVSNGVSELHFGNLNGNMNNGNAKAHRDVSSEENRYFSVEKNE 142

Query: 151 HSSV-----------------DFHAPRLNKLVTEVIPSAITSEGTKANAYKDTERYTAFY 193
+ + D++ PRL+K VTEV P ++ + A Y D +Y AF
Sbjct: 143 YPTKLNGKTVTTEDQTQKRREDYYMPRLDKFVTEVAPIEASTASSDAGTYNDQNKYPAFV 202

Query: 194 RVGSGTQYTKDKDGNLVKVAGGYAFKTGGTTGVPLISDATIVSNPGQTYNPVNG------ 247
R+GSG+Q+ K N + + V I P + + NG
Sbjct: 203 RLGSGSQFIYKKGDNYSLILNNHEVGGNNLKLVGDAYTYGIAGTPYKVNHENNGLIGFGN 262

Query: 248 ---------------PLPDYGAPGDSGSPLFAYDEQQKKWVIVAVLRAYAGINGAT-NWW 291
PL +Y GDSGSPLF YD ++ KW+ + +AG N + W
Sbjct: 263 SKEEHSDPKGILSQDPLTNYAVLGDSGSPLFVYDREKGKWLFLGSYDFWAGYNKKSWQEW 322

Query: 292 NVIPTDYLNQVMQDDFDAPVDFVSGLPPLNWTYDKTSGTGTLSQGSKNWTMHGQKDNDLN 351
N+ + + V+ D + + +W+ + + T T + S N + KD N
Sbjct: 323 NIYKSQFTKDVLNKD--SAGSLIGSKTDYSWSSNGKTSTITGGEKSLNVDLADGKDKP-N 379

Query: 352 AGKNLVFSGQNGAIVLKDSVTQGAGYLEFKDSYTVSAES-GKTWTGAGIITDKGTNVTWK 410
GK++ F G +G + L +++ QGAG L F+ Y V S TW GAG+ +G VTWK
Sbjct: 380 HGKSVTFEG-SGTLTLNNNIDQGAGGLFFEGDYEVKGTSDNTTWKGAGVSVAEGKTVTWK 438

Query: 411 VNGVAGDNLHKLGEGTLTINGTGVNPGGLKTGDGTVVLNQQADTAGNVQAFSSVNLASGR 470
V+ D L K+G+GTL + GTG N G LK GDGTV+L QQ + +G AF+SV + SGR
Sbjct: 439 VHNPQYDRLAKIGKGTLIVEGTGDNKGSLKVGDGTVILKQQTNGSGQ-HAFASVGIVSGR 497

Query: 471 PTVVLGDARQVNPDNISWGYRGGKLDLNGNAVTFTRLQAADYGAVITN-NAQQKSRLLLD 529
T+VL D +QV+P++I +G+RGG+LDLNGN++TF ++ D GA + N N S + +
Sbjct: 498 STLVLNDDKQVDPNSIYFGFRGGRLDLNGNSLTFDHIRNIDDGARLVNHNMTNASNITIT 557

Query: 530 LKAQDT--------NVSVP-IGSISPFGGTGTPGNLYSMILNGQTRFYILKSASYGNTLW 580
++ T N+ P + F G LY + L T + + K AS + L
Sbjct: 558 GESLITDPNTITPYNIDAPDEDNPYAFRRIKDGGQLY-LNLENYTYYALRKGASTRSELP 616

Query: 581 GNSLNDPAQWEFVGTDKNKAVQTVKDRILAGRAKQPVIF----HGQLTGNMDVTIPQLPG 636
NS W ++G ++A + V + I R + G+ GN++VT
Sbjct: 617 KNSGESNENWLYMGKTSDEAKRNVMNHINNERMNGFNGYFGEEEGKNNGNLNVTFKGKSE 676

Query: 637 GRKVILDGSVNLPEGTLSEDSGTLIFQGHPVIHA-SVSGSAPVSLN-----------QKD 684
+ +L G NL G L+ + GTL G P HA ++G + + + D
Sbjct: 677 QNRFLLTGGTNL-NGDLTVEKGTLFLSGRPTPHARDIAGISSTKKDPHFAENNEVVVEDD 735

Query: 685 WENRQFIMKTLSLK-DADFHLSRN-ASLNSDIKSDNS---HITLGSDRVFVDKNDGTGNY 739
W NR F T+++ +A + RN A++ S+I + N HI + ++D TG
Sbjct: 736 WINRNFKATTMNVTGNASLYSGRNVANITSNITASNKAQVHIGYKTGDTVCVRSDYTGY- 794

Query: 740 VILEEGTSVPDTVNDR-------SQYEGNITLDHNSTLDIGSR--FTGGIEAYDSAVSIT 790
T D ++D+ + GN+ L ++ +G F +S V +T
Sbjct: 795 -----VTCTTDKLSDKALNSFNPTNLRGNVNLTESANFVLGKANLFGTIQSRGNSQVRLT 849

Query: 791 SPDVLLTAPGAFAGSSLTVHDG 812
+ G L + +G
Sbjct: 850 E-NSHWHLTGNSDVHQLDLANG 870



Score = 80.1 bits (197), Expect = 4e-17
Identities = 41/154 (26%), Positives = 65/154 (42%), Gaps = 18/154 (11%)

Query: 864 DNATLEITRGAHASGDIHASAASTVTIGSDTPAELASAETTASAFAG--------SLLEG 915
+ A + I + + + VT +D ++ A + G + + G
Sbjct: 771 NKAQVHIGYKTGDTVCVRSDYTGYVTCTTDKLSDKALNSFNPTNLRGNVNLTESANFVLG 830

Query: 916 YNAAFNGAITGGRADVSM-HNALWTLGGDSAIHTLTVRNSRI------SSEGDRTFRTLT 968
F + G + V + N+ W L G+S +H L + N I +S + TLT
Sbjct: 831 KANLFGTIQSRGNSQVRLTENSHWHLTGNSDVHQLDLANGHIHLNSADNSNNVTKYNTLT 890

Query: 969 VNKLDATGSDFVLRTDLKN--ADKINVTEKATGS 1000
VN L GS + L TDL N DK+ VT+ ATG+
Sbjct: 891 VNSLSGNGSFYYL-TDLSNKQGDKVVVTKSATGN 923


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0362     RTXTOXIND     234    2e-74    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 234 bits (598), Expect = 2e-74
Identities = 88/430 (20%), Positives = 179/430 (41%), Gaps = 64/430 (14%)

Query: 45 LMVIFLLVLVVTFVIWAWNSPLDEVTRGQGSIIPGSREQVIQTLDPGILKTLEVREGDIV 104
L+ F++ +V I + ++ V G + R + I+ ++ I+K + V+EG+ V
Sbjct: 59 LVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESV 118

Query: 105 EKGQVLLTLDDTRSSAMLRESEARVNNLEAVRARLRAEAYS------ESLTFPDD-VPAD 157
KG VLL L + A ++++ + + R + + S L PD+ +
Sbjct: 119 RKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQN 178

Query: 158 LRERESTVYR--------LRKTELAQS-----------------IAGLKQSKALLDKEIA 192
+ E E + + Q I + + +
Sbjct: 179 VSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLD 238

Query: 193 MTRPIVREGAMSEVELLRMQRQSAEL---------------------QLQMDEKQNKYLT 231
++ + A+++ +L + + E + + +
Sbjct: 239 DFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKN 298

Query: 232 EAGAELVKTEAELAQAKENMAGRADPVERSRIRAPLRGIVKNIRVNTLGGVVSAGQDIME 291
E +L +T + +A + + S IRAP+ V+ ++V+T GGVV+ + +M
Sbjct: 299 EILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMV 358

Query: 292 IIPLEDQLLIEAYINPRDVAYVRTGMPALVKLTAYDYAIYGGLDGVVTLVSPDTLRDQKR 351
I+P +D L + A + +D+ ++ G A++K+ A+ Y YG L G V ++ D + DQ+
Sbjct: 359 IVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQR- 417

Query: 352 PGDLKLDPNEAYYRVLVTTSNNYLTDRNGKILPVIPGMIASVDIKTGQKSVFQYLIKPIT 411
+ V+++ N L+ K +P+ GM + +IKTG +SV YL+ P+
Sbjct: 418 --------LGLVFNVIISIEENCLS-TGNKNIPLSSGMAVTAEIKTGMRSVISYLLSPLE 468

Query: 412 -RMKQALQER 420
+ ++L+ER
Sbjct: 469 ESVTESLRER 478


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0363     CABNDNGRPT    48     3e-07    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 47.7 bits (113), Expect = 3e-07
Identities = 32/192 (16%), Positives = 58/192 (30%), Gaps = 25/192 (13%)

Query: 1327 NNIDTATDVNTVTIYGSVSGEILGGYGSDNITVTKNLTGNISVGDNADTLTAGWIYGGST 1386
N+ T T + + + S + ++ DT
Sbjct: 260 ANMTTRTGDSVYGFNSNTDRDFYTATDSSKALI-----FSVWDAGGTDTFDFSGYSNNQR 314

Query: 1387 VSMGDGNDT--------VTITDGAYNTTISLGAGDDVFDATSGVMGDSAYATVVNGEDGN 1438
+++ +G+ + V+I G G+G+D+ S ++ G GN
Sbjct: 315 INLNEGSFSDVGGLKGNVSIAHGVTIENAIGGSGNDILVGNSA-------DNILQGGAGN 367

Query: 1439 DTFKLGTIAKNLTIDAGAGDDIVVLTKDYDSTSSGN---QGYINGGEGSDTLVLTGTISV 1495
D G A T+ GAG D V DST + + G + D +
Sbjct: 368 DVLYGG--AGADTLYGGAGRDTFVYGSGQDSTVAAYDWIADFQKGIDKIDLSAFRNEGQL 425

Query: 1496 NLAAGKNEGIAG 1507
+ + G
Sbjct: 426 SFVQDQFTGKGQ 437


6     c0390  c0435  Y     Y          Y                   Pathogenicity Island (biased-composition)

LocusTag  DNBias  CDNBias  %GCBias     Product
c0390     3       20       -3.997810   Gamma-glutamyl phosphate reductase
c0391     8       26       -5.769861   *CP4-like integrase
c0392     7       25       -5.715843   Hypothetical protein
c0393     7       25       -3.942815   Haemoglobin protease
c0394     2       30       -7.956308   Hypothetical protein
c0395     1       23       -3.680016   Hypothetical protein
c0396     0       17       -1.291185   Insertion element IS1 1/2/3/5/6 protein insA
c0397     0       18       0.992301    InsB protein
c0398     0       19       1.444760    Hypothetical protein
c0399     2       19       1.430513    Hypothetical protein yagU
c0400     2       20       1.117945    Hypothetical protein yagV precursor
c0401     1       21       0.865777    Hypothetical protein yagW
c0402     2       20       0.690803    Hypothetical protein yagX precursor
c0403     4       20       -3.163437   Hypothetical protein yagY precursor
c0404     4       23       -6.454730   Hypothetical protein yagZ precursor
c0405     2       31       -7.685674   Hypothetical protein ykgK
c0406     1       35       -7.349299   Putative 50S ribosomal protein L36
c0407     2       31       -5.978700   50S ribosomal protein L31 type B-1
c0408     1       28       -4.905837   Hypothetical protein
c0409     0       25       -4.172264   Putative oxidoreductase
c0410     0       18       -1.583681   Hypothetical protein ycjY
c0411     0       19       -1.807973   Putative LysR-like transcriptional regulator
c0412     -1      17       -1.648891   Hypothetical transcriptional regulator ycjZ
c0413     -1      17       -2.026045   Putative aldo/keto reductase
c0414     -1      19       -2.746293   2,5-diketo-D-gluconic acid reductase A
c0415     -2      22       -3.483369   Putative adhesin
c0416     1       28       -6.664725   Hypothetical transcriptional regulator ykgA
c0417     1       27       -5.467905   2,5-diketo-D-gluconic acid reductase A
c0418     1       24       -3.916207   Hypothetical protein ykgB
c0419     0       22       -3.004483   Hypothetical protein ykgI precursor
c0420     -1      21       -3.692448   Probable pyridine nucleotide-disulfide
c0421     -1      19       -3.786159   Hypothetical transcriptional regulator ykgD
c0422     0       19       -4.103550   Hypothetical protein ykgE
c0423     0       21       -4.339422   Putative electron transport protein ykgF
c0424     2       24       -5.025347   Hypothetical protein ykgG
c0425     4       27       -6.325505   Hypothetical protein ykgH
c0426     0       16       -2.161806   Hypothetical protein
c0427     -1      13       1.546011    Conserved hypothetical protein
c0428     0       14       2.678244    Conserved hypothetical protein
c0429     0       19       3.595418    Conserved hypothetical protein
c0430     0       16       2.316528    Type 1 fimbriae Regulatory protein fimB
c0431     0       19       2.746209    Choline dehydrogenase
c0432     -1      17       1.983827    Betaine aldehyde dehydrogenase
c0433     -2      15       0.136585    Regulatory protein betI
c0434     0       15       -0.991165   High-affinity choline transport protein
c0435     1       20       -3.971807   Hypothetical protein yahA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0393     IGASERPTASE   614    0.0      IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 614 bits (1585), Expect = 0.0
Identities = 267/886 (30%), Positives = 407/886 (45%), Gaps = 123/886 (13%)

Query: 42 SLALSALLPTVAGASTVGGNNPYQTYRDFAENKGQFQAGATNIPIFNNKGELVGHL--DK 99
+L ++ L A+ V + YQ +RDFAENKG+F GATN+ + + + +G +
Sbjct: 12 ALTVAYALTPYTEAALVRDDVDYQIFRDFAENKGKFSVGATNVLVKDKNNKDLGTALPNG 71

Query: 100 APMVDFSSVNVSSNPGVATLINPQYIASVKH-NKGYQSVSFG------------------ 140
PM+DFS V + +ATLINPQY+ VKH + G + FG
Sbjct: 72 IPMIDFSVV--DVDKRIATLINPQYVVGVKHVSNGVSELHFGNLNGNMNNGNAKAHRDVS 129

Query: 141 DGQNSYHIVDRNEHSSS-----------------DLHTPRLDKLVTEVAPATVTSSST-- 181
+N Y V++NE+ + D + PRLDK VTEVAP +++S+
Sbjct: 130 SEENRYFSVEKNEYPTKLNGKTVTTEDQTQKRREDYYMPRLDKFVTEVAPIEASTASSDA 189

Query: 182 ADILNPSKYSAFYRAGSGSQYIQDSQGKRHWVTGGYGYLTGGILPTSFFYH--------- 232
+ +KY AF R GSGSQ+I + + + Y
Sbjct: 190 GTYNDQNKYPAFVRLGSGSQFIYKKGDNYSLILNNHEVGGNNLKLVGDAYTYGIAGTPYK 249

Query: 233 --GSDGIQLYMGGNIHDHSI---------LPSFGEAGDSGSPLFGWNTAKGQWELVGVYS 281
+ + G + +HS L ++ GDSGSPLF ++ KG+W +G Y
Sbjct: 250 VNHENNGLIGFGNSKEEHSDPKGILSQDPLTNYAVLGDSGSPLFVYDREKGKWLFLGSYD 309

Query: 282 ---GVGGGTNLIYSLIPQSFLSQIYSEDNDAPVFFNASSGAPLQWKFDSSTGTGSLKQGS 338
G + +++ F + ++D+ + + + + S+ T ++ G
Sbjct: 310 FWAGYNKKSWQEWNIYKSQFTKDVLNKDSAGSLIGSK-----TDYSWSSNGKTSTITGGE 364

Query: 339 DEYAMHGQKGSDL-NAGKNLTFLGHNGQIDLENSVTQGAGSLTFTDDYTVT-TSNGSTWT 396
+ G D N GK++TF G +G + L N++ QGAG L F DY V TS+ +TW
Sbjct: 365 KSLNVDLADGKDKPNHGKSVTFEG-SGTLTLNNNIDQGAGGLFFEGDYEVKGTSDNTTWK 423

Query: 397 GAGIIVDKDASVNWQVNGVKGDNLHKIGEGTLVVQGTGVNEGGLKVGDGTVVLNQQADSS 456
GAG+ V + +V W+V+ + D L KIG+GTL+V+GTG N+G LKVGDGTV+L QQ + S
Sbjct: 424 GAGVSVAEGKTVTWKVHNPQYDRLAKIGKGTLIVEGTGDNKGSLKVGDGTVILKQQTNGS 483

Query: 457 GHVQAFSSVNIASGRPTVVLADNQQVNPDNISWGYRGGVLDVNGNDLTFHKLNAADYGAT 516
G AF+SV I SGR T+VL D++QV+P++I +G+RGG LD+NGN LTF + D GA
Sbjct: 484 GQ-HAFASVGIVSGRSTLVLNDDKQVDPNSIYFGFRGGRLDLNGNSLTFDHIRNIDDGAR 542

Query: 517 LGNS-SDKTANITLD---YQTRPADVKV---------NEWSSSNRGTVGSLYIYNNPYTH 563
L N +NIT+ T P + N ++ G LY+ YT
Sbjct: 543 LVNHNMTNASNITITGESLITDPNTITPYNIDAPDEDNPYAFRRIKDGGQLYLNLENYT- 601

Query: 564 TVDYFILK--TSSYGWFP-TGQVSNEHWEYVGHDQNSAQALLANRINNK------GYLY- 613
Y+ L+ S+ P SNE+W Y+G + A+ + N INN+ GY
Sbjct: 602 ---YYALRKGASTRSELPKNSGESNENWLYMGKTSDEAKRNVMNHINNERMNGFNGYFGE 658

Query: 614 -HGKLLGNINFSNKATPGTTGALVMDGSANMSGTFTQENGRLTIQGHPVIHASTSQSIAN 672
GK GN+N + K ++ G N++G T E G L + G P HA + IA
Sbjct: 659 EEGKNNGNLNVTFKGKSE-QNRFLLTGGTNLNGDLTVEKGTLFLSGRPTPHA---RDIAG 714

Query: 673 TVSSLGDNSVLTQPTSFTQDDWENRTFSFGSLVLK-DTDFGLGRN-ATLNTTIQADNSS- 729
S+ D +DDW NR F ++ + + GRN A + + I A N +
Sbjct: 715 ISSTKKDPHFAENNEVVVEDDWINRNFKATTMNVTGNASLYSGRNVANITSNITASNKAQ 774

Query: 730 ----VTLGDSRVFIDKKDGQGTAFTLEEGTSVATKDADKSVFNGTVNLDNQS--VLNINE 783
GD+ G T T ++ + A + + G VNL + VL
Sbjct: 775 VHIGYKTGDTVCVRSDYTGYVTC-TTDKLSDKALNSFNPTNLRGNVNLTESANFVLGKAN 833

Query: 784 IFNGGIQANNSTVNISSDS-------AVLENSTLTSTALNLNKGAN 822
+F NS V ++ +S + + L + ++LN N
Sbjct: 834 LFGTIQSRGNSQVRLTENSHWHLTGNSDVHQLDLANGHIHLNSADN 879



Score = 53.9 bits (129), Expect = 5e-09
Identities = 65/329 (19%), Positives = 121/329 (36%), Gaps = 54/329 (16%)

Query: 760 KDADKSVFNGTVNLDNQSVLNINEIFNGGIQANNSTVNISSDSAVLENSTLTSTALNLNK 819
K +D++ N +++N+ + N F NN +N++ +N L + NLN
Sbjct: 631 KTSDEAKRNVMNHINNERMNGFNGYFGEEEGKNNGNLNVTFKGKSEQNRFLLTGGTNLNG 690

Query: 820 GANVLASQSFVSDGPVNISDATLSLNSRPDEVSHTLLPVYDYAGSW----------NLKG 869
V F+S P + ++S + W N+ G
Sbjct: 691 DLTVEKGTLFLSGRPTPHARDIAGISSTKKDPHFAENNEVVVEDDWINRNFKATTMNVTG 750

Query: 870 DDARLNVGPYSMLSGNINVQDKGTVTLG--------------GEGELSPDLTLQNQMLYS 915
+ + + + ++ NI +K V +G G + D L ++ L S
Sbjct: 751 NASLYSGRNVANITSNITASNKAQVHIGYKTGDTVCVRSDYTGYVTCTTD-KLSDKALNS 809

Query: 916 LFN-----------------GYRNTWSGSLNAPDATVSMT-DTQWSMNGNSTAGNMKLNR 957
G N + + ++ V +T ++ W + GNS + L
Sbjct: 810 FNPTNLRGNVNLTESANFVLGKANLFGTIQSRGNSQVRLTENSHWHLTGNSDVHQLDLAN 869

Query: 958 TIVGFNGGTSS-----FTTLTTDNLDAVQSAFVMRTDL--NKADKLVINKSATGHDNSIW 1010
+ N +S + TLT ++L +F TDL + DK+V+ KSATG+
Sbjct: 870 GHIHLNSADNSNNVTKYNTLTVNSLSG-NGSFYYLTDLSNKQGDKVVVTKSATGNFTLQV 928

Query: 1011 VNFLKKPSDKDTLDIPLVSAPEATADNLF 1039
+ +P+ ++ L A +A D+L
Sbjct: 929 ADKTGEPNHN---ELTLFDASKAQRDHLN 954


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0402     PF00577  63     5e-12    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 62.6 bits (152), Expect = 5e-12
Identities = 29/247 (11%), Positives = 73/247 (29%), Gaps = 23/247 (9%)

Query: 487 TLNLNSLWSKLGTFSISYNDDRRYNSHYYTADYYQSVYSGTFGSLGLRAGIQRYNNGDSS 546
L + + T +S + Y + +Q+ + F + N
Sbjct: 530 QLTVTQQLGRTSTLYLSG-SHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQK 588

Query: 547 ANTGKYIALDLSLPLGNWFSAGMTHQNGYTMANLSARKQFDEGT------------IRTV 594
+ +AL++++P +W + Q + A+ S + +
Sbjct: 589 -GRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNL 647

Query: 595 GANLSRAISGDTGDDKTLSGGAYAQFDARYASGTLNVNSAADGYINTNLTANGSVGWQGK 654
++ +G + +G A + Y + + S +D +G V
Sbjct: 648 SYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGY-SHSDDIKQLYYGVSGGVLAHAN 706

Query: 655 NIAASGRTDGNAGVIFDTGLEN---DGQISAKINGRIFPLNGKRNYLPLSPYGRYEVELQ 711
+ + ++ G ++ + Q + + R G + Y V L
Sbjct: 707 GVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWR-----GYAVLPYATEYRENRVALD 761

Query: 712 NSKNSLD 718
+ + +
Sbjct: 762 TNTLADN 768


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0415     INTIMIN  548    e-178    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 548 bits (1413), Expect = e-178
Identities = 233/832 (28%), Positives = 355/832 (42%), Gaps = 78/832 (9%)

Query: 41 PVMAARAQHAVQPRLSMENTTVTADNNVEKNVASLAANAGTFLSSQPDS-----DATRNF 95
P++AA +L+ + VT N + + AA L SQ S D ++
Sbjct: 131 PLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSRSLNGDYAKDT 190

Query: 96 ITGMATAKANQEIQEWLGKYGTARVKLNVDKKFSLKDSSLEMLYPIYDTPTNMLFTQGAI 155
G+A +A+ ++Q WL YGTA V L F SSL+ L P YD+ + F Q
Sbjct: 191 ALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFD--GSSLDFLLPFYDSEKMLAFGQVGA 248

Query: 156 HRTDDRTQSNIGFGWRHFSENDWMAGVNTFIDHDLSRSHTRIGVGAEYWRDYLKLSANGY 215
D R +N+G G R F + M G N FID D S +TR+G+G EYWRDY K S NGY
Sbjct: 249 RYIDSRFTANLGAGQRFF-LPENMLGYNVFIDQDFSGDNTRLGIGGEYWRDYFKSSVNGY 307

Query: 216 IRASGWKKSPDVEDYQERPANGWDIRAEGYLPAWPQLGASLMYEQYYGDEVGLFGKDKRQ 275
R SGW +S + +DY ERPANG+DIR GYLP++P LGA LMYEQYYGD V LF DK Q
Sbjct: 308 FRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDNVALFNSDKLQ 367

Query: 276 KDPHAITAEVNYTPVPLLTLSAGHKQGKSGENDTRFGLEVNYRIGEPLEKQLDTDSIRER 335
+P A T VNYTP+PL+T+ ++ G END + ++ Y+ +P +Q++ + E
Sbjct: 368 SNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQQIEPQYVNEL 427

Query: 336 RMLAGSRYDLVERNNNIVLEYRKSEVIRIALPERIEGKGGQTVSLGLVVSKATHGLKNVQ 395
R L+GSRYDLV+RNNNI+LEY+K +++ + +P I G T + L+V K+ +GL +
Sbjct: 428 RTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTERSTQKIQLIV-KSKYGLDRIV 486

Query: 396 WEAPSLLAAGGKITGQG----NQWQVTLPAYQAGKDNYYAISAIAYDNKGNASKRVQTEV 451
W+ +L + GG+I G +Q LPAY G N Y ++A AYD GN+S V +
Sbjct: 487 WDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTI 546

Query: 452 VISGAGMSADRTALTLDGQSRIQMLANGNEQKPLVLSLR----DAEGQPVTGMKDQIKTE 507
+ G D+ +T + A+G E +++ PV+
Sbjct: 547 TVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVS--------- 597

Query: 508 LTFKPAGNIVTRTLKATKSQAKPTLGEFTETEAGVYQSVFTTGTQSGEATITVSVDDMSK 567
NIV+ T + + A +G + G+ ++ +M+
Sbjct: 598 ------FNIVSGTAVLSANSAN-------TNGSGKATVTLKSDK-PGQVVVSAKTAEMTS 643

Query: 568 TVTAELRATMMDVANSTLSANEPSGDVVADGQQAYTLTLTAVDSEGNPVTGEASRLRLVP 627
+ A + S VA+GQ A T T+ V PV+ +
Sbjct: 644 ALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVK-VMKGDKPVSNQEVTF---- 698

Query: 628 QDTNGVTVGAIS--EIKPGVYSATVSSTRAGNVVVRAFSEQYQLGTLQQTLKFVAGP--- 682
T + + G T++ST G +V A + ++F
Sbjct: 699 -TTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTID 757

Query: 683 ----------LDAAHSSITLNPDK---PVVGGTVTAIWTAKDANDNPVTGLNPDAPSLSG 729
+ ++ L + GG W + + V + +L
Sbjct: 758 DGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDA-SSGQVTLKE 816

Query: 730 AAAAGSTASGWTDNGDGTWTAQISLGTTAGELDVMPKLNGQDAAANAAKVTVVADALSSN 789
+ +DN T+T T L V L S+
Sbjct: 817 KGTTTISVIS-SDNQTATYTIA-----TPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSS 870

Query: 790 QSKV-------SVAEDHVKAGESTTVTLVAKDAHGNAISGLSLSASLTGTAS 834
Q+++ A + S T+ + +A SG++ + L
Sbjct: 871 QNELENVFKAWGAANKYEYYKSSQTIISWVQQTAQDAKSGVASTYDLVKQNP 922



Score = 74.0 bits (181), Expect = 4e-15
Identities = 74/345 (21%), Positives = 113/345 (32%), Gaps = 42/345 (12%)

Query: 905 KTTTELTFTVK----DAYGNPVTGLKPDAPVFSGAASTGSERPSAGNWTEKGNGVYVSTL 960
T TVK PV+ + SG A SA + G+G TL
Sbjct: 575 TEAITYTATVKKNGVAQANVPVSFN-----IVSGTAV-----LSANSANTNGSGKATVTL 624

Query: 961 TLGSAAGQLSVMPRVNGQNAVAQPLVLNVAGDASKAEIRDMTVKVNNQLANGQSANQITL 1020
+ +A+ V+ V D +KA I ++ +ANGQ A IT
Sbjct: 625 KSDKPGQVVVSAKTAEMTSALNANAVIFV--DQTKASITEIKADKTTAVANGQDA--ITY 680

Query: 1021 TV-VDSYGNPLQGQEVTLTLPQGVTSKTGNTVTTNAAGKVDIELMSTVAGELEIEASVKN 1079
TV V P+ QEVT T G S + T T+ G + L ST G+ + A V +
Sbjct: 681 TVKVMKGDKPVSNQEVTFTTTLGKLSNS--TEKTDTNGYAKVTLTSTTPGKSLVSARVSD 738

Query: 1080 SQKTVKVKFKADFSTGQASLEVDAA-AQKVANGKDAFTLTATVK-DQYGNLLPGAVVVFN 1137
VK F +L +D + V G T ++ Q G +
Sbjct: 739 VAVDVKAPEVEFF----TTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYT 794

Query: 1138 LPRGVKPLADGNIMVNADKEGKAELKVVSVTAGTYEITASAGNDQPSNAQSVTFVADKTT 1197
A+ I G+ LK GT I+ + ++Q T+
Sbjct: 795 W-----RSANPAIASVDASSGQVTLK----EKGTTTISVISSDNQT-----ATYTIATPN 840

Query: 1198 ATISSIEVIGNRAVADGKTKQTYKVTVTDANNNLLKDSEVTLTAS 1242
+ I + D ++ N L++ A+
Sbjct: 841 SLI-VPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAA 884



Score = 54.7 bits (131), Expect = 2e-09
Identities = 46/249 (18%), Positives = 82/249 (32%), Gaps = 24/249 (9%)

Query: 1168 TAGTYEITASA----GNDQPSNAQSVTFVADKTTAT---ISSIEVIGNRAVADGKTKQTY 1220
+ Y++TA A GN + ++T +++ ++ A ADG TY
Sbjct: 521 GSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITY 580

Query: 1221 KVTVTDANNNLLKDSEVTLTASPENLVLTPNGTATTNEQGQAIFTATTTVAATYTLTAKV 1280
TV S + +A TN G+A T + ++AK
Sbjct: 581 TATVKKNGVAQANVPVSFNIVS--GTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAK- 637

Query: 1281 EQADGQESTKTAESKFVADDKNAVLAASPERVDSLVADGKTTATLTVTLMSGVNPVGGTM 1340
A+ + FV K ++ ++ + VA+G+ T TV +M G PV
Sbjct: 638 -TAEMTSALNANAVIFVDQTKASITEIKADK-TTAVANGQDAITYTVKVMKGDKPVSNQE 695

Query: 1341 WVDIEA--PEGVTEADYQFLPSKNDHFASGKITRTFSTNKPGTYTFTFNSLTYGGYEMKP 1398
+ K D +G T ++ PG + ++ ++K
Sbjct: 696 VTFTTTLGKLSNSTE-------KTD--TNGYAKVTLTSTTPGKSLVS-ARVSDVAVDVKA 745

Query: 1399 VTVTINAVP 1407
V
Sbjct: 746 PEVEFFTTL 754



Score = 47.0 bits (111), Expect = 5e-07
Identities = 56/368 (15%), Positives = 104/368 (28%), Gaps = 56/368 (15%)

Query: 779 VTVVADALSSNQSKV---SVAEDHVKAGESTTVTLVA------KDAHGNAISGLSLSASL 829
+TV+++ +Q V + + KA + +T A +S +S
Sbjct: 546 ITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVSG-- 603

Query: 830 TGTASEGATVSSWTEKGDGSYVAT--LTTGGKTGELRVMPLFNGQPAATEAAQLTVIAGE 887
A +S+ + +GS AT L + + A A + +
Sbjct: 604 ------TAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALN--ANAVIFVDQ 655

Query: 888 MSSANSTLVADNKTPTVKTTTELTFTVKDAY-GNPVTGLKPDAPVFSGAASTGSERPSAG 946
++ + + AD T +T+TVK PV+ + +T + S
Sbjct: 656 TKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEV-------TFTTTLGKLSNS 708

Query: 947 NWTEKGNGVYVSTLTLGSAAGQLSVMPRVNGQN-AVAQPLVLNVAG---DASKAEIRDMT 1002
NG TLT + G+ V RV+ V P V D EI
Sbjct: 709 TEKTDTNGYAKVTLT-STTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEI---- 763

Query: 1003 VKVNNQLANGQSANQITLTV-VDSYGNPLQGQEVTLTLPQGVTSKTGNTVTTNAAGKVDI 1061
+ G T+ + G T ++ ++G+V +
Sbjct: 764 ------VGTGVKGKLPTVWLQYGQVNLKASGGNGKYTW---RSANPAIASVDASSGQVTL 814

Query: 1062 ELMSTVAGELEIEASVKNSQKTVKVKFKADFSTGQASLEVDAAAQKVANGKDAFTLTATV 1121
+ G I ++Q + + + +
Sbjct: 815 K----EKGTTTISVISSDNQ---TATYTIATPNSLIVPNMSKRVT-YNDAVNTCKNFGGK 866

Query: 1122 KDQYGNLL 1129
N L
Sbjct: 867 LPSSQNEL 874


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0416     HTHTETR  28     0.027    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 28.4 bits (63), Expect = 0.027
Identities = 12/42 (28%), Positives = 19/42 (45%)

Query: 14 RQKILQQLLEWIECNLEHPISIEDIAQKSGYSRRNIQLLFRN 55
RQ IL L S+ +IA+ +G +R I F++
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKD 54


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0426     PRTACTNFAMLY  122    1e-30    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 122 bits (307), Expect = 1e-30
Identities = 112/509 (22%), Positives = 187/509 (36%), Gaps = 73/509 (14%)

Query: 330 LDINLSDSSVWKGKVSGAGDASVSLQNGSVWNVTGSSTVDALAVKDSTVNITKATVNTGT 389
LD+ L+ + W G S+S+ N W +T +S V AL + + G
Sbjct: 413 LDVALASQARWTGATRAVD--SLSIDNA-TWVMTDNSNVGALRLASDGSVDFQQPAEAGR 469

Query: 390 FA-------SQNGTLI----VDASSENTLDISGKASGDLRVY---------SAGSLDLIN 429
F + +G D + L + ASG R++ SA +L L+
Sbjct: 470 FKVLTVNTLAGSGLFRMNVFADLGLSDKLVVMQDASGQHRLWVRNSGSEPASANTLLLVQ 529

Query: 430 EQ----TAFISTGKDSTLKATGTTEGGLYQYDLTQGADGNFYFVKNTHK----------- 474
F KD G + G Y+Y L +G + V
Sbjct: 530 TPLGSAATFTLANKD------GKVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGP 583

Query: 475 -----------------------ASNASSVIQAMA-AAPANVANLQADTLSARQDAVRLS 510
++ A++ + + + +++ LS R +RL+
Sbjct: 584 QPPQPPQPQPEAPAPQPPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRLN 643

Query: 511 ENDKGGVWIQYFGGKQKHTTAGNASYDLDVNGVMLGGDTRFMTEDGSWLAGVAMSSAKGD 570
D GG W + F +Q+ +D V G LG D G W G +GD
Sbjct: 644 P-DAGGAWGRGFAQRQQLDNRAGRRFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGD 702

Query: 571 MT-TMQSKGDTEGYSFHAYLSRQYNNGIFIDTAAQFGHYSNTADVRLMNGGGTIKADFNT 629
T G T+ Y + ++G ++D + N V +G +K + T
Sbjct: 703 RGFTGDGGGHTDSVHVGGYATYIADSGFYLDATLRASRLENDFKVAGSDGY-AVKGKYRT 761

Query: 630 NGFGAMVKGGYTWKDGNGLFIQPYAKLSALTLEGVDYQL-NGVDVHSDSYNSVLGEAGTR 688
+G GA ++ G + +G F++P A+L+ G Y+ NG+ V + +SVLG G
Sbjct: 762 HGVGASLEAGRRFTHADGWFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLE 821

Query: 689 VGYDFAVGNA-TVKPYLNLAALNEFSDGNKVRLGDESVNASIDGAAFRVGAGVQADITKN 747
VG + V+PY+ + L EF V + + G +G G+ A + +
Sbjct: 822 VGKRIELAGGRQVQPYIKASVLQEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRG 881

Query: 748 MGAYASLDYTKGDDIENPLQGVVGINVTW 776
YAS +Y+KG + P G +W
Sbjct: 882 HSLYASYEYSKGPKLAMPWTFHAGYRYSW 910


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0433     HTHTETR  63     2e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 62.7 bits (152), Expect = 2e-14
Identities = 31/172 (18%), Positives = 58/172 (33%), Gaps = 15/172 (8%)

Query: 16 RRRQLIDATLEAINEVGMHDATIAQIARRAGVSTGIISHYFRDKNGLLEATMRDITSQLR 75
R+ ++D L ++ G+ ++ +IA+ AGV+ G I +F+DK+ L S +
Sbjct: 12 TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIG 71

Query: 76 DAVLNRLHALPQGSAERRLQAIVGGNFDETQVSSAAMKAWLAFWASSMHQP-------ML 128
+ L P L+ I+ + T V+ + +
Sbjct: 72 ELELEYQAKFPGDPLS-VLREILIHVLEST-VTEERRRLLMEIIFHKCEFVGEMAVVQQA 129

Query: 129 YRLQQVSSRRLLSNLVSEFRRE---LPRQQAQEAGYGLAALIDGL---WLRA 174
R + S + + + A + I GL WL A
Sbjct: 130 QRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFA 181


7  c0496  c0505  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c0496     -1      27       -3.859915  Hypothetical protein yaiA
c0497     0       15       -2.415507  Hypothetical protein
c0498     1       19       -0.805900  AroM protein
c0499     2       19       -0.183083  Hypothetical protein yaiE
c0500     3       17       0.994377   Hypothetical protein
c0501     2       15       1.266160   Hypothetical protein ykiA
c0502     2       15       1.291642   Recombination associated protein rdgC
c0503     1       16       1.610681   Hypothetical protein yajF
c0504     1       15       1.533511   Protein araJ precursor
c0505     2       15       1.808628   Exonuclease sbcC
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0503     ACETATEKNASE  29     0.020    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 29.4 bits (66), Expect = 0.020
Identities = 17/69 (24%), Positives = 29/69 (42%), Gaps = 10/69 (14%)

Query: 233 FISGTGFATDYRRLSGHALKGSEIIRLVEESDPVAELALRRYELRLAKSLAHVVNILDP- 291
+G ++D+R L A + D A+LAL + R+ K++ +
Sbjct: 273 VYGISGISSDFRDLEDAAF---------KNGDKRAQLALNVFAYRVKKTIGSYAAAMGGV 323

Query: 292 DVIVLGGGM 300
DVIV G+
Sbjct: 324 DVIVFTAGI 332


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0504     TCRTETA  51     3e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 51.4 bits (123), Expect = 3e-09
Identities = 73/356 (20%), Positives = 126/356 (35%), Gaps = 35/356 (9%)

Query: 33 ILSLALGTFGLGMAEFGIMGVLTELAHNVGISIPAAGH---MISYYALGVVVGAPIIALF 89
+ ++AL G+G+ IM VL L ++ S H +++ YAL AP++
Sbjct: 11 LSTVALDAVGIGL----IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 90 SSRYSLKHILLFLVALCVIGNAMFTLSSSYLMLAIGRLVSGFPHGAFFGVGAIVLSKIIK 149
S R+ + +LL +A + A+ + +L IGR+V+G GA + I
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIAD--IT 124

Query: 150 PGKVTAAVAGMVSGMTVANLLGIPLGTYLSQEFSWRYTFLLIAVFNIAVMASVYFWVPDI 209
G A G +S ++ P+ L FS F A N + F +P+
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 210 RDEAKGKLREQ----------FHFLRSPAPWLI--FAATMFGNAGVFAWFSYVKPYMMFI 257
+ LR + + A + F + G W +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG------E 238

Query: 258 SGFSETAMTFIMMLVGLGM---VLGNVLSGRISGRYSPLRIAAVTDFIIVLALLMLFFFG 314
F A T + L G+ + +++G ++ R R + ++L F
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 315 GMKTTSLIFAFICCAGLFALSAPLQILLLQNAKGGELLGAAGGQIAF--NLGSAVG 368
I + G+ LQ +L + E G G +A +L S VG
Sbjct: 299 RGWMAFPIMVLLASGGIG--MPALQAMLSRQV-DEERQGQLQGSLAALTSLTSIVG 351


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
c0505     IGASERPTASE  40     4e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.4 bits (94), Expect = 4e-05
Identities = 40/264 (15%), Positives = 81/264 (30%), Gaps = 11/264 (4%)

Query: 162 LNAKPKERAELLEELTGTEIYGKISAMVFEQHKSARTELEKLQAQASGVALLTPEQVQSL 221
A P E E + E + E S V + + A + + A Q+
Sbjct: 1029 APATPSETTETVAENSKQE-----SKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTN 1083

Query: 222 TASLQVLTDEEKQLLTAQQQEQQSLNWLTRLD-ELQQEASRRQQALQQALAEEEKAQPQL 280
+ +E Q ++ +++ E QE + + + E QPQ
Sbjct: 1084 EVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQA 1143

Query: 281 AALSLAQPARNLRPHWE---RIAEHSAALAHTRQQIEEVNTRLQSTMALRASIRHHAAKQ 337
P N++ A+ T +E+ T + + + +
Sbjct: 1144 EPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTT 1203

Query: 338 SAELQQQQQSLNAWLQEHDRFRQWNNELAGWRAQFSQQTSDREHLRQWQQQLTHAEQKLN 397
A Q S ++ ++ R + + ++DR + T+ L+
Sbjct: 1204 PATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPA-TTSSNDRSTVALCDLTSTNTNAVLS 1262

Query: 398 ALAAITLTLTADEVASALAQHAEQ 421
A + + V A++QH Q
Sbjct: 1263 DARAKAQFVALN-VGKAVSQHISQ 1285



Score = 33.5 bits (76), Expect = 0.005
Identities = 27/139 (19%), Positives = 54/139 (38%), Gaps = 13/139 (9%)

Query: 738 QQDVLAAQSLQKAQAQFDTALQASVFDDQQAFLAALMDEQTLTQLEQLKQNLENQRRQAQ 797
Q DV + S + A+ D A A E T T E KQ + + Q
Sbjct: 1004 QADVPSVPSNNEEIARVDEAPVPPP-------APATPSETTETVAENSKQESKTVEKNEQ 1056

Query: 798 TLVTQTAETLTQHQQHRPGGLSLTVTVEQIQQELAQTHQKLRENTTSQGEIRQQLKQDAD 857
TA+ ++ + V E+AQ+ + +E T++ + ++++
Sbjct: 1057 DATETTAQNREVAKEAKS-----NVKANTQTNEVAQSGSETKETQTTETKETATVEKEEK 1111

Query: 858 NRQQQQTLMQQIAQMTQQV 876
+ + + Q++ ++T QV
Sbjct: 1112 AKVETEK-TQEVPKVTSQV 1129


8  c0546  c0552  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c0546     4       27       -1.990277  Hypothetical lipoprotein yajG precursor
c0547     9       29       -1.550989  Hypothetical protein
c0548     7       28       -1.269262  BolA protein
c0549     6       27       -0.743111  Hypothetical protein
c0550     6       26       0.014932   Hypothetical protein
c0551     5       27       0.123966   Trigger factor
c0552     3       23       0.302544   Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0546     PF06291  29     0.006    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 28.9 bits (64), Expect = 0.006
Identities = 12/37 (32%), Positives = 19/37 (51%)

Query: 34 NMFKKILFPLVALFMLAGCAKPPTTIEVSPTITLPQQ 70
N KK+LF ++ GCA+ T+ PT P++
Sbjct: 4 NKMKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKE 40


9  c0573  c0591  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c0573     0       18       -3.338858  Hypothetical protein ybaZ
c0574     1       18       -4.895217  Hypothetical protein ybaA
c0575     -1      12       -1.584729  Hypothetical protein ylaB
c0576     1       17       -0.396588  Hypothetical protein ylaC
c0577     1       16       -0.564209  Maltose O-acetyltransferase
c0578     1       18       -0.586789  Haemolysin expression modulating protein
c0579     1       16       0.094945   Hypothetical protein ybaJ
c0580     1       16       0.515455   Acriflavine resistance protein B
c0581     2       12       0.366598   Acriflavine resistance protein A precursor
c0582     2       15       -0.034723  Potential acrAB operon repressor
c0583     2       18       0.898931   Hypothetical protein
c0584     3       16       2.492824   Potassium efflux system kefA
c0585     3       17       3.932634   Hypothetical protein ybaM
c0586     3       17       3.902926   Primosomal replication protein N
c0587     2       18       3.664353   Hypothetical protein ybaN
c0588     4       24       3.156804   Adenine phosphoribosyltransferase
c0589     4       25       2.848637   DNA polymerase III subunit tau
c0590     3       26       1.690167   Conserved hypothetical protein
c0591     2       22       1.581646   Hypothetical protein ybaB
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0575     BCTERIALGSPF  29     0.034    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 29.4 bits (66), Expect = 0.034
Identities = 31/137 (22%), Positives = 54/137 (39%), Gaps = 24/137 (17%)

Query: 247 IWLPLGLVIGLLAAMFVLRILRRIQSPHHRLQDAIENRDICVHYQPIVSLANGKIVGAEA 306
W+ L L+ G +A +LR R+ + + P++ G+I
Sbjct: 228 PWMLLALLAGFMAFRVMLR------QEKRRVS-----FHRRLLHLPLI----GRIARGLN 272

Query: 307 LARWPQTDGSWLSPDSFIPLAQQTGLS-EPLTLLIIRSVFEDMGDWLRQHPQQHISINLE 365
AR+ +T + S +PL Q +S + ++ R D +R+ H + LE
Sbjct: 273 TARYARTLSILNA--SAVPLLQAMRISGDVMSNDYARHRLSLATDAVREGVSLHKA--LE 328

Query: 366 STVLTSEKIPQLLREMI 382
T L P ++R MI
Sbjct: 329 QTAL----FPPMMRHMI 341


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0580     ACRIFLAVINRP  1368   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1368 bits (3543), Expect = 0.0
Identities = 802/1033 (77%), Positives = 916/1033 (88%), Gaps = 1/1033 (0%)

Query: 1 MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISASYPGADAKTVQDT 60
M NFFI RPIFAWV+AII+M+AG LAIL+LPVAQYPTIAPPAV++SA+YPGADA+TVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQ 120
VTQVIEQNMNGIDNLMYMSS SDS G+V ITLTF+SGTD DIAQVQVQNKLQLA PLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVSVEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDAISRTSGVGDVQLFGS 180
EVQQQG+SVEKSSSS+LMV G ++ + TQ+DISDYVA+N+KD +SR +GVGDVQLFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWMNPNELNKFQLTPVDVITAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQTRL 240
QYAMRIW++ + LNK++LTPVDVI +K QN Q+AAGQLGGTP + GQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 TSTEEFGKILLKVNQDGSRVLLRDVAKIELGGENYDIIAEFNGQPASGLGIKLATGANAL 300
+ EEFGK+ L+VN DGS V L+DVA++ELGGENY++IA NG+PA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTAAAIRAELAKMEPFFPSGLKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFLQ 360
DTA AI+A+LA+++PFFP G+K++YPYDTTPFV++SIHEVVKTL EAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTFA+LAAFG+SINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 AEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGSTGAIYRQFSITIVSAMAL 480
E+ LPPKEAT KSM QIQGALVGIAMVLSAVF+PMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATMLKPIAKGDHGEGKKGFFGWFNRMFEKSTHHYTDSVGGILRSTGR 540
SVLVALILTPALCAT+LKP++ H E K GFFGWFN F+ S +HYT+SVG IL STGR
Sbjct: 481 SVLVALILTPALCATLLKPVSAE-HHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 541 YLVLYLIIVVGMAYLFVRLPSSFLPDEDQGVFMTMVQLPAGATQERTQKVLNEVTNYYLT 600
YL++Y +IV GM LF+RLPSSFLP+EDQGVF+TM+QLPAGATQERTQKVL++VT+YYL
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 601 KEKNNVESVFAVNGFGFAGRGQNTGIAFVSLKDWADRPGEENKVEAITMRATRAFSQIKD 660
EK NVESVF VNGF F+G+ QN G+AFVSLK W +R G+EN EA+ RA +I+D
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 661 AMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLLAEAAKHPDMLTSVRPNG 720
V FN+PAIVELGTATGFDFELIDQAGLGH+ LTQARNQLL AA+HP L SVRPNG
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 721 LEDTPQFKIDIDQEKAQALGVSINDINTTLGAAWGGSYVNDFIDRGRVKKVYVMSEAKYR 780
LEDT QFK+++DQEKAQALGVS++DIN T+ A GG+YVNDFIDRGRVKK+YV ++AK+R
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFR 779

Query: 781 MLPDDIGDWYVRAADGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILGQAAPGKSTGEA 840
MLP+D+ YVR+A+G+MVPFSAF++S W YGSPRLERYNGLPSMEI G+AAPG S+G+A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 841 MELMEQLASKLPTGVGYDWTGMSYQERLSGNQAPSLYAISLIVVFLCLAALYESWSIPFS 900
M LME LASKLP G+GYDWTGMSYQERLSGNQAP+L AIS +VVFLCLAALYESWSIP S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 901 VMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGL 960
VMLVVPLG++G LLAAT NDVYF VGLLTTIGLSAKNAILIVEFAKDLM+KEGKG+
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 961 IEATLDAVRMRLRPILMTSLAFILGVMPLVISTGAGSGAQNAVGTGVMGGMVTATVLAIF 1020
+EATL AVRMRLRPILMTSLAFILGV+PL IS GAGSGAQNAVG GVMGGMV+AT+LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1021 FVPVFFVVVRRRF 1033
FVPVFFVV+RR F
Sbjct: 1020 FVPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c0581     RTXTOXIND  44     6e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.4 bits (105), Expect = 6e-07
Identities = 33/212 (15%), Positives = 71/212 (33%), Gaps = 23/212 (10%)

Query: 112 TYQAAYDSAKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTA 171
+ Y A +L + + Q+ Q +++ ++ L +Q +
Sbjct: 256 EQENKYVEAVNELR--VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 172 AKAAVETARINLAYTKVTSPISGRIGKSNV-TEGALVQNGQATALATVQQLDPIYVDVTQ 230
+ + + +P+S ++ + V TEG +V + T + V + D + V
Sbjct: 314 LTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTALV 372

Query: 231 SSNDFLRLKQELA----------NGTLKQENGKAKVSLITSDGIKFPQDGTLEFSDVTVD 280
+ D + KV I D I+ + G + ++++
Sbjct: 373 QNKDIGFINVGQNAIIKVEAFPYTRYGYLV---GKVKNINLDAIEDQRLGLVFNVIISIE 429

Query: 281 QTTGSITLRAIFPNPDHTLLPGMFVRARLEEG 312
+ S + I L GM V A ++ G
Sbjct: 430 ENCLSTGNKNIP------LSSGMAVTAEIKTG 455



Score = 34.4 bits (79), Expect = 7e-04
Identities = 24/125 (19%), Positives = 43/125 (34%), Gaps = 13/125 (10%)

Query: 61 PLQITTELPGR-TSAYRIAEVRPQVSGIILKRNFKEGSDIEAGVSLYQIDPATYQAAYDS 119
++I G+ T + R E++P + I+ + KEG + G L ++ +A
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA---- 134

Query: 120 AKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTAAKAAVETA 179
D K Q++ A+L RYQ L E ++
Sbjct: 135 ---DTLKTQSSLLQARLEQTRYQILS-----RSIELNKLPELKLPDEPYFQNVSEEEVLR 186

Query: 180 RINLA 184
+L
Sbjct: 187 LTSLI 191


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0582     HTHTETR  222    5e-76    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 222 bits (567), Expect = 5e-76
Identities = 215/215 (100%), Positives = 215/215 (100%)

Query: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60
MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120
EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180
GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215
APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c0584     RTXTOXIND  32     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.017
Identities = 19/125 (15%), Positives = 40/125 (32%), Gaps = 6/125 (4%)

Query: 28 QNTAFARASSNGDLPTKADLQAQLDSLNKQKDLSAQDKLVQQDLTDTLATLDKIDRVKEE 87
N RA L + + L L+ + A L++ ++ E
Sbjct: 207 LNLDKKRAERLTVLARINRYENLSRVEKSR--LDDFSSLLHKQAIAKHAVLEQENKYVEA 264

Query: 88 TVQLRQKVAEAPEKMRQATAALTALSDVDND--EETRKIL--STLSLRQLETRVAQALDD 143
+LR ++ + + +A V E L +T ++ L +A+ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEER 324

Query: 144 LQNAQ 148
Q +
Sbjct: 325 QQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
c0589     IGASERPTASE  39     9e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.5 bits (89), Expect = 9e-05
Identities = 40/251 (15%), Positives = 78/251 (31%), Gaps = 31/251 (12%)

Query: 404 PLPETTSQVLAARQQLQRVQGATKAKKSEPAA----ATRARPVNNAALERLASVTDRVQA 459
P E +Q + + + P+ AR + A + A T
Sbjct: 983 PEVEKRNQTVDTTN----ITTPNNIQADVPSVPSNNEEIARV-DEAPVPPPAPATPSETT 1037

Query: 460 RPVPSALEKAPAKKEAYRWKATTPVMQQKE--------VVATPKALKKA---LEHEKTPE 508
V ++ E AT Q +E V A + + A E ++T
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 509 LAAKLAA---------EAIERDPWAAQVSQLSLPKLVEQVALNAWKE-ESDNAVCLHLRS 558
K A E+ +V+ PK + + E +N ++++
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 559 SQRHLNNRGAQQKLAEALST-LKGSTVELTIVEDDNPAVRTPLEWRQAIYEEKLAQARES 617
Q N ++ A+ S+ ++ E T V N V P A + + +
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 618 IIADNNIQTLR 628
+ + +++R
Sbjct: 1218 KPKNRHRRSVR 1228


10  c0601  c0646  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c0601     -1      13       3.226976   Protein ybaK
c0602     0       14       3.662337   Hypothetical protein ybaP
c0603     -2      12       3.286964   Hypothetical protein ybaQ
c0604     -2      14       4.020320   Copper-transporting P-type ATPase
c0605     -3      15       2.307624   Probable glutaminase ybaS
c0606     -2      16       1.250322   Hypothetical transport protein ybaT
c0607     -1      14       1.329096   Transcriptional regulator cueR
c0608     -1      13       1.290458   Hypothetical protein
c0609     -1      16       1.487255   Hypothetical protein ybbJ
c0610     -1      14       1.367594   Hypothetical protein ybbK
c0611     1       16       3.569825   Hypothetical ABC transporter ATP-binding protein
c0613     1       17       4.592657   Hypothetical protein ybbN
c0614     0       16       4.234026   Hypothetical oxidoreductase ybbO
c0615     1       17       4.036378   Acyl-CoA thioesterase I precursor
c0616     -1      16       4.002570   Hypothetical ABC transporter ATP-binding protein
c0617     -2      14       3.207897   Hypothetical protein ybbP
c0618     -3      16       0.333714   Hypothetical protein ybbB
c0619     -3      16       -0.415415  Hypothetical transcriptional regulator ybbS
c0620     -1      17       -1.452325  Ureidoglycolate hydrolase
c0621     -1      16       -1.200130  Negative regulator of allantoin and glyoxylate
c0622     0       16       -1.355984  Glyoxylate carboligase
c0623     1       17       -1.006421  Hydroxypyruvate isomerase
c0624     1       16       -0.783638  2-hydroxy-3-oxopropionate reductase
c0625     2       14       -0.616477  Putative allantoin permease
c0626     2       13       0.347460   Allantoinase
c0627     3       17       1.271366   Putative purine permease ybbY
c0628     2       16       2.279410   Glycerate kinase 1
c0629     2       14       2.473050   Hypothetical protein ylbA
c0630     1       16       3.749942   Allantoate amidohydrolase
c0631     1       16       4.634126   Ureidoglycolate dehydrogenase
c0632     2       17       5.382182   Protein fdrA
c0633     2       17       4.993902   Hypothetical protein ylbE
c0634     1       15       4.428430   Hypothetical protein ylbF
c0635     0       13       2.586343   Carbamate kinase
c0636     2       15       0.929057   Phosphoribosylaminoimidazole carboxylase ATPase
c0637     2       17       0.487251   Phosphoribosylaminoimidazole carboxylase
c0638     2       15       -0.243049  Hypothetical protein
c0639     3       17       0.466005   UDP-2,3-diacylglucosamine hydrolase
c0640     3       19       0.741858   Conserved hypothetical protein
c0641     3       23       1.320028   Peptidyl-prolyl cis-trans isomerase B
c0642     1       18       1.686062   Cysteinyl-tRNA synthetase
c0643     2       16       0.773511   Hypothetical protein ybcI
c0644     1       17       -2.164795  Hypothetical protein ybcJ
c0645     0       20       -4.292691  FolD bifunctional protein
c0646     2       22       -5.179378  Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
c0605     BLACTAMASEA  29     0.020    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 29.0 bits (65), Expect = 0.020
Identities = 11/43 (25%), Positives = 19/43 (44%)

Query: 38 GQLAAVAIVTSDGNVYSAGDSDYRFALESISKVCTLALALEDV 80
G++ + + + G +A +D RF + S KV L V
Sbjct: 38 GRVGMIEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARV 80


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0614     DHBDHDRGNASE  78     5e-19    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 77.8 bits (191), Expect = 5e-19
Identities = 49/212 (23%), Positives = 81/212 (38%), Gaps = 7/212 (3%)

Query: 16 KSVLITGCSSGIGLESALELKRQGFHVLAGCRKPDDVERMNS----MGFT--GVLIDLDS 69
K ITG + GIG A L QG H+ A P+ +E++ S D+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 70 PESVDRAADEVIALTDNCLYGIFNNAGFGMYGPLSTISRAQMEQQFSANFFGAHQLTMRL 129
++D + + + N AG G + ++S + E FS N G + +
Sbjct: 69 SAAIDEITARIEREMGP-IDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 130 LPAMLPHGEGRIVMTSSVMGLISTPGRGAYAASKYALEAWSDALRMELRHSGIKVSLIEP 189
M+ G IV S + AYA+SK A ++ L +EL I+ +++ P
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 190 GPIRTRFTDNVNQTQSDKPVENPGIAARFTLG 221
G T ++ ++ G F G
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGSLETFKTG 219


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0616     PF05272  30     0.013    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.013
Identities = 12/20 (60%), Positives = 13/20 (65%)

Query: 41 LVGESGSGKSTLLAILAGLD 60
L G G GKSTL+ L GLD
Sbjct: 601 LEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0621     PF09025  28     0.020    YopR Core
		>PF09025#YopR Core

Length = 143

Score = 28.1 bits (62), Expect = 0.020
Identities = 17/61 (27%), Positives = 25/61 (40%), Gaps = 8/61 (13%)

Query: 126 EAVLIGQLECKSMVRMCAPLGSR--------LPLHASGAGKALLYPLAEEELMSIILQTG 177
+ + +LE K+M+R PLG + L G L LA EL +I G
Sbjct: 68 QGLEADRLELKAMLRAELPLGRQQQTFLLQLLGAVEHAPGGEYLAQLARRELQVLIPLNG 127

Query: 178 L 178
+
Sbjct: 128 M 128


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c0626     UREASE  56     1e-10    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 56.3 bits (136), Expect = 1e-10
Identities = 39/163 (23%), Positives = 60/163 (36%), Gaps = 32/163 (19%)

Query: 4 DLIIKNGTVILENEARVVDIAVKDGKIAAIG-------QD-----LGDAKDVMDASGLVV 51
D +I N ++ DI +KDG+IAAIG Q +G +V+ G +V
Sbjct: 69 DTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIV 128

Query: 52 SPGMVDAHTHISEPGRSHWEGYETGTRAAAKGGITTMIEMPLNQLPATVDRAS------- 104
+ G +D+H H P + A G+T M+ PA A+
Sbjct: 129 TAGGMDSHIHFICPQQIE---------EALMSGLTCMLGGGTG--PAHGTLATTCTPGPW 177

Query: 105 -IELKFDAAKGKLTIDAAQLGGLVSYNIDRLHELDEVGVVGFK 146
I +AA ++ A G + L E+ G K
Sbjct: 178 HIARMIEAADA-FPMNLAFAGKGNASLPGALVEMVLGGATSLK 219


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0635     CARBMTKINASE  386    e-138    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 386 bits (994), Expect = e-138
Identities = 126/310 (40%), Positives = 176/310 (56%), Gaps = 16/310 (5%)

Query: 2 KTLVVALGGNALLQRGEALTAENQYRNIASAVPALARL-ARSYRLAIVHGNGPQVGLLAL 60
K +V+ALGGNAL QRG+ + E N+ +A + AR Y + I HGNGPQVG L L
Sbjct: 3 KRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLLL 62

Query: 61 QNLAWKE---VEPYPLDVLVAESQGMIGYMLAQSLSAQPQM----PHVTTVLTRIEVSPD 113
A + + P+DV A SQG IGYM+ Q+L + + V T++T+ V +
Sbjct: 63 HMDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVDKN 122

Query: 114 DPAFLQPEKFIGPVYQPEEQEALEATYGWQMKRD-GKYLRRVVASPQPRKILDSEAIELL 172
DPAF P K +GP Y E + L GW +K D G+ RRVV SP P+ +++E I+ L
Sbjct: 123 DPAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKL 182

Query: 173 LKEGHVVICSGGGGVPVTEDG---AGSEAVIDKDLAAALLAEQINADGLVILTDADAVYE 229
++ G +VI SGGGGVPV + G EAVIDKDLA LAE++NAD +ILTD +
Sbjct: 183 VERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGAAL 242

Query: 230 NWGTPQQRAIRHATPDELAPFAKAD----GSMGPKVTAVSGYVRSRGKPAWIGALSRIEE 285
+GT +++ +R +EL + + GSMGPKV A ++ G+ A I L + E
Sbjct: 243 YYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAIIAHLEKAVE 302

Query: 286 TLAGEAGTCI 295
L G+ GT +
Sbjct: 303 ALEGKTGTQV 312


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c0642     RTXTOXIND  29     0.030    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.4 bits (66), Expect = 0.030
Identities = 16/150 (10%), Positives = 44/150 (29%), Gaps = 8/150 (5%)

Query: 299 RSQLNYSEENLKQARAALERLYTALRGTDKTVAPAGGEAFEARFIEAMDDDFNTP----- 353
+ ++ +L QAR R R + P E F +++
Sbjct: 133 EADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIK 192

Query: 354 EAYSVLFDMAREVNRLKAEDMAAANAMASHLRKLSAVLGLLEQEPEAFLQSGAQADDSEV 413
E +S + + + A + + + + + + + + F +
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSS---LLHKQAI 249

Query: 414 AEIEALIQQRLDARKAKDWAAADAARDRLN 443
A+ L Q+ + + +++
Sbjct: 250 AKHAVLEQENKYVEAVNELRVYKSQLEQIE 279


11  c0668  c0703  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
c0668         -2       13    3.094434  4'-phosphopantetheinyl transferase entD
c0669         -2       12    2.291739  Ferrienterobactin receptor precursor
c0670         -1       13    2.236890  Hypothetical protein
c0671         -1       13    2.821939  Enterochelin esterase
c0672          0       12    3.377681  Conserved hypothetical protein
c0673          0       14    3.752666  Enterobactin synthetase component F
c0674          0       12    3.313035  Ferric enterobactin transport protein fepE
c0675          0       14    5.458022  Ferric enterobactin transport ATP-binding
c0676          0       16    5.644269  Ferric enterobactin transport system permease
c0677         -1       17    5.324514  Ferric enterobactin transport system permease
c0678         -2       18    4.887816  Hypothetical membrane protein P43
c0679         -2       19    4.732561  Ferrienterobactin-binding periplasmic protein
c0680         -2       23    5.197138  Isochorismate synthase entC
c0681         -2       25    5.251052  Enterobactin synthetase component E
c0682         -1       22    5.059982  Isochorismatase
c0683         -1       20    4.936808  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase
c0684         -1       20    3.720316  Hypothetical protein ybdB
c0685         -1       16    1.944682  Carbon starvation protein A
c0686         -1       17   -2.591400  Hypothetical protein yjiX
c0687         -2       15   -3.056173  Hypothetical oxidoreductase ybdH
c0688         -3       17   -4.460323  Hypothetical aminotransferase ybdL
c0689         -1       22   -6.081830  Hypothetical protein ybdM
c0690         -1       19   -4.232883  Hypothetical protein ybdN
c0691          0       21   -4.132304  Hypothetical transcriptional regulator ybdO
c0692          0       23   -1.307801  Thiol:disulfide interchange protein dsbG
c0693          0       26    0.022486  Hypothetical protein
c0694          0       24    0.332235  Alkyl hydroperoxide reductase C22 protein
c0695          0       16    0.361641  Alkyl hydroperoxide reductase subunit F
c0696          0       13    0.792140  Unknown protein from 2D-page
c0697          0       14    2.055424  Conserved hypothetical protein
c0698         -1       16    2.827780  Regulator of nucleoside diphosphate kinase
c0699         -2       17    3.259363  Ribonuclease I precursor
c0700         -2       18    4.084420  Citrate carrier/transporter
c0701         -2       23    5.372802  2-(5''-triphosphoribosyl)-3'-dephosphocoenzyme-A
c0702         -1       25    4.734588  Apo-citrate lyase phosphoribosyl-dephospho-CoA
c0703         -1       19    3.030442  Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0668     ENTSNTHTASED  267    7e-93    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 267 bits (684), Expect = 7e-93
Identities = 108/183 (59%), Positives = 132/183 (72%), Gaps = 1/183 (0%)

Query: 51 MKTTHTSLPFAGHTLHFVEFDPASFREQDLLWLPHYAQLQHAGRKRKTEHLAGRIAAIYA 110
M T+H LPFAGH LH V+FD +SFRE DLLWLPH+ +L+ AGRKRK EHLAGRIAA++A
Sbjct: 1 MLTSHFPLPFAGHRLHIVDFDASSFREHDLLWLPHHDRLRSAGRKRKAEHLAGRIAAVHA 60

Query: 111 LREYGYKCVPAIGELRQPVWPAGVYGSISHCGTTALAVVSRQPIGIDIEEIFSAQTAREL 170
LRE G + VP +G+ RQP+WP G++GSISHC TTALAV+SRQ IGIDIE+I S TA EL
Sbjct: 61 LREVGVRTVPGMGDKRQPLWPDGLFGSISHCATTALAVISRQRIGIDIEKIMSQHTATEL 120

Query: 171 TDNIITPAEHKRLADCGLAFPLALTLAFSAKESAFKA-SEIQTDADFLDYQIISRNKQQV 229
+II E + L L FPLALTLAFSAKES +KA S+ T F ++ S +
Sbjct: 121 APSIIDSDERQILQASLLPFPLALTLAFSAKESVYKAFSDRVTLPGFNSAKVTSLTATHI 180

Query: 230 IIH 232
+H
Sbjct: 181 SLH 183


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0678     TCRTETA  36     2e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.3 bits (84), Expect = 2e-04
Identities = 82/394 (20%), Positives = 146/394 (37%), Gaps = 38/394 (9%)

Query: 27 FISIVSLGLLGVAVPVQIQMMTHSTWQV---GLSVTLTGGAMFVGLMVGGVLADRYERKK 83
+ V +GL+ +P ++ + HS G+ + L F V G L+DR+ R+
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 84 VILLARGTCGIGFIGLCLNALL--PEPSLLAIYLLGLWDGFFASLGVTALLAATPALVGR 141
V+L + G ++ + P L +Y+ + G + G A A +
Sbjct: 75 VLL-------VSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVA-GAYIADITDG 126

Query: 142 ENLMQAGAITMLTVRLGSVISPMIGGLLLATGGVAWNYGLAAAGTFITLLPLLSLPALPP 201
+ + G V P++GGL+ GG + + AA L L LP
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLM---GGFSPHAPFFAAAALNGLNFLTGCFLLPE 183

Query: 202 PPQPREHPLK----SLLAGFRFLLASPLVGGIALLGGLLTMAS----AVRVLYPALADNW 253
+ PL+ + LA FR+ +V + + ++ + A+ V++ D +
Sbjct: 184 SHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG--EDRF 241

Query: 254 QMSAAQIGFLYAAIP-LGAAIGALTSGKLAHSVRPGLLMLLSTLG---AFLAIGLFGLMP 309
A IG AA L + A+ +G +A + ++L + ++ +
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 310 MWILGVVCLALFGWLSAVSSLLQYTMLQTQTPEAMLGRINGLWTAQNVTGDAIGAALLGG 369
M +V LA G ML Q E G++ G A +G L
Sbjct: 302 MAFPIMVLLASGGIGMPALQ----AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTA 357

Query: 370 LGAMMTPVASASASGFGLLIIGVLLLLVLVELRR 403
+ A + + +G+ + L LL L LRR
Sbjct: 358 IYA----ASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0679     FERRIBNDNGPP  63     2e-13    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 62.7 bits (152), Expect = 2e-13
Identities = 60/280 (21%), Positives = 100/280 (35%), Gaps = 35/280 (12%)

Query: 40 HTLESQPQRIVSTSVTLTGSLLAIDAPVIASGATTPNNRVADDQGFLRQWSKVAKERKLQ 99
H P RIV+ LLA+ VAD + R W E L
Sbjct: 29 HAAAIDPNRIVALEWLPVELLLALGIVPYG---------VADTINY-RLW---VSEPPLP 75

Query: 100 RLYIG-----EPSAEAVAAQMPDLILISATGGDSALALYDQLSTIAPTLIINYDDKS--- 151
I EP+ E + P ++ SA G S + L+ IAP N+ D
Sbjct: 76 DSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPL 131

Query: 152 --WQSLLTQLGEITGHEKQAAERIAQFDKQLAAAKEQIKLPPQPVTAIVYTAAAHSANLW 209
+ LT++ ++ + A +AQ++ + + K + + ++
Sbjct: 132 AMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVF 191

Query: 210 TPESAQGQMLEQLGFTLAKLPAGLNASQSQGKRHDIIQLGGENLAAGLNGESLFLFAGDQ 269
P S ++L++ G NA Q + + + LAA + + L +
Sbjct: 192 GPNSLFQEILDEYGIP--------NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNS 243

Query: 270 KDADAIYANPLLAHLPAVQNKQVYALGTETFRLDYYSAMQ 309
KD DA+ A PL +P V+ + + F SAM
Sbjct: 244 KDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMH 283


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0682     ISCHRISMTASE  443    e-161    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 443 bits (1141), Expect = e-161
Identities = 146/299 (48%), Positives = 195/299 (65%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQAYALPESHDIPQNKVDWAFEPQRAALLIHDMQDYFVSFWGENCPMMEQVIANI 60
MAIP +Q Y +P + D+PQNKV W +P RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRDYCKQHNIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVDRLTPDADDTV 120
L++ C Q IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P+ DD V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKESGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M+++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSRDEHLMSLKYVAGRSGRVVMTEELL------PAPIPASKAE-----------LREVIL 223
FS ++H M+L+Y AGR VMT+ LL PA + + A +R+ I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDESDEPFDDD-NLIDYGLDSVRMMALAARWRKVHGDIDFVMLAKNPTIDAWWKLLS 281
LL E+ E D +L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W KLL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0683     DHBDHDRGNASE  362    e-130    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 362 bits (930), Expect = e-130
Identities = 110/258 (42%), Positives = 150/258 (58%), Gaps = 20/258 (7%)

Query: 5 GKNVWVTGAGKGIGYATAMAFVEAGAKVTGFD---------------QAFTQEQYPFATE 49
GK ++TGA +GIG A A GA + D +A E +P
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP---- 63

Query: 50 VMDVADAAQVAQVCQRLLAETERLDVLVNAAGILRMGATDQLSKEDWQQTFAVNVGGAFN 109
DV D+A + ++ R+ E +D+LVN AG+LR G LS E+W+ TF+VN G FN
Sbjct: 64 -ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 110 LFQQTMNQFRHQRGGAIVTVASDAAHTPRIGMSAYGASKAALKSLALSVGLELAGSGVRC 169
+ +R G+IVTV S+ A PR M+AY +SKAA +GLELA +RC
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 170 NVVSPGSTDTDMQRTLWVSDDAEEQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDL 229
N+VSPGST+TDMQ +LW ++ EQ I+G E FK GIPL K+A+P +IA+ +LFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 230 ASHITLQDIVVDGGSTLG 247
A HIT+ ++ VDGG+TLG
Sbjct: 243 AGHITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0692     BCTLIPOCALIN  29     0.017    Bacterial lipocalin signature.
		>BCTLIPOCALIN#Bacterial lipocalin signature.

Length = 171

Score = 28.8 bits (64), Expect = 0.017
Identities = 18/98 (18%), Positives = 39/98 (39%), Gaps = 13/98 (13%)

Query: 50 QGITIIKTFDAPGGMKGYLGKYQDMGVTIYLTPDGKHAISG--YMYNEKGENLSNTLIEK 107
+ + + F+ YLGK+ ++ + G ++ + N+ G ++ N
Sbjct: 21 ESVKPVSDFEL----NNYLGKWYEVARLDHSFERGLSQVTAEYRVRNDGGISVLN----- 71

Query: 108 EIYAPAGREMWQRMEQSHWLLDGKKDAPVIVYVFADPF 145
Y+ + W+ E + ++G D + V F PF
Sbjct: 72 RGYSEE-KGEWKEAEGKAYFVNGSTDGYLKVSFFG-PF 107


12  c0778  c0790  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
c0778         -2       17    4.524093  Hypothetical protein
c0779         -2       17    4.593605  KDP operon transcriptional Regulatory protein
c0780         -2       16    4.373581  Sensor protein kdpD
c0781          0       15    2.952124  Potassium-transporting ATPase C chain
c0782         -1       15    2.825434  Potassium-transporting ATPase B chain
c0783         -1       15    1.668118  Potassium-transporting ATPase A chain
c0784         -1       16    1.064262  Hypothetical protein
c0785         -2       16    1.849046  Hypothetical protein ybfA precursor
c0786         -2       15    2.362437  Hypothetical protein ybgA
c0787         -1       15    3.016709  Deoxyribodipyrimidine photolyase
c0788         -1       15    2.906689  Hypothetical transporter ybgH
c0789          0       16    3.911167  Hypothetical protein ybgI
c0790         -1       15    3.193559  Hypothetical protein ybgJ
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c0779     HTHFIS  92     8e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 92.2 bits (229), Expect = 8e-24
Identities = 35/125 (28%), Positives = 58/125 (46%), Gaps = 1/125 (0%)

Query: 2 TNVLIVEDEQAIRRFLRTALEGDGMRVFEAETLQRGLLEAATRKPDLIILDLGLPDGDGI 61
+L+ +D+ AIR L AL G V A DL++ D+ +PD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 EFIRDLRQWSA-VPVIVLSARSEESDKIAALDAGADDYLSKPFGIGELQARLRVALRRHS 120
+ + +++ +PV+V+SA++ I A + GA DYL KPF + EL + AL
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 121 ATTTP 125
+
Sbjct: 124 RRPSK 128


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0780     PF06580  33     0.007    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.5 bits (74), Expect = 0.007
Identities = 10/48 (20%), Positives = 21/48 (43%), Gaps = 4/48 (8%)

Query: 786 LLENAVKYAGAQAE----IGINAHVEGENLQLDVWDNGPGLPPGQEQT 829
L+EN +K+ AQ I + + + L+V + G +++
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKES 310


13  c0799  c0819  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
c0799          1       26    3.410503  Hypothetical protein
c0800          1       29    3.299433  Succinate dehydrogenase hydrophobic membrane
c0801          1       30    3.305608  Succinate dehydrogenase flavoprotein subunit
c0802          1       26    1.882619  Succinate dehydrogenase iron-sulfur protein
c0803          0       25    1.005868  2-oxoglutarate dehydrogenase E1 component
c0804         -2       22   -1.196835  Dihydrolipoamide succinyltransferase component
c0805         -2       21   -3.308010  Succinyl-CoA synthetase beta chain
c0806         -1       20   -3.350896  Succinyl-CoA synthetase alpha chain
c0807          0       19   -2.808577  Hypothetical protein
c0808          2       23   -0.580026  Hypothetical protein
c0809          2       25    0.933753  Hypothetical protein
c0810          3       24    0.986189  Hypothetical protein
c0811          1       23    1.025142  Cytochrome D ubiquinol oxidase subunit I
c0812          1       22    0.929230  Cytochrome D ubiquinol oxidase subunit II
c0813          2       21    0.526648  Hypothetical protein
c0814          2       18    0.333913  Protein ybgE
c0815          2       19    0.033854  Protein ybgC
c0816          2       22    0.085272  TolQ protein
c0817          3       22    0.020968  TolR protein
c0818          4       22   -0.160123  TolA protein
c0819          2       23   -0.989532  TolB protein precursor
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c0802     TCRTETOQM  31     0.003    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 31.4 bits (71), Expect = 0.003
Identities = 11/41 (26%), Positives = 23/41 (56%), Gaps = 1/41 (2%)

Query: 14 VDDAPRMQDYTLEAEEGRDM-MLLDALIQLKEKDPSLSFRR 53
+++ + T+E + + MLLDAL+++ + DP L +
Sbjct: 339 IENPLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYV 379


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c0804     RTXTOXIND  30     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.017
Identities = 27/196 (13%), Positives = 56/196 (28%), Gaps = 12/196 (6%)

Query: 48 EVPASADGILDAVLEDEGTTVTSRQILGRLREGNSAGKETSAKSE-EKASTPAQRQQASL 106
E+ + I+ ++ EG +V +L +L + +S +A R Q
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 107 EEQNNDAL----SPAIRRLLAEHNLDASAIKGTGVGGRLTRED----VEKHLAKAPAKES 158
+ L P + + T ++ E +L K A+
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERL 217

Query: 159 APAAAAPAAQPTLAARSEKRVPMTRLRKRVA---ERLLEAKNSTAMLTTFNEVNMKPIMD 215
A + + + L + A +LE +N V +
Sbjct: 218 TVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQ 277

Query: 216 LRKQYGEAFEKRHGIR 231
+ + A E+ +
Sbjct: 278 IESEILSAKEEYQLVT 293


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0807     SYCDCHAPRONE  28     0.025    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 28.4 bits (63), Expect = 0.025
Identities = 18/65 (27%), Positives = 26/65 (40%)

Query: 255 AMQNSGDTQLARKYNREGEAVYKTGQLEQAIQLFQQATELDGNYGQAFSNLGLAYQKNGN 314
AM N + + Y++G+ E A ++FQ LD + F LG Q G
Sbjct: 26 AMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQ 85

Query: 315 IAEAI 319
AI
Sbjct: 86 YDLAI 90


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
c0818     IGASERPTASE  59     2e-11    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 58.9 bits (142), Expect = 2e-11
Identities = 33/199 (16%), Positives = 68/199 (34%), Gaps = 8/199 (4%)

Query: 99 EQERLKQLEKERLAAQEQKKQAEEAAKQAELKQKQAEEAAAKAAADAKAKAEADAKAAEE 158
E E+ Q QA+ + E A A ++ E
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVP----SNNEEIARVDEAPVPPPAPATPSETTET 1039

Query: 159 AAK--KAAADAKKKAEAEAAKAAVEAQKKAEAAAAALKKKAEAAEAA--AAEARKKAATE 214
A+ K + +K E +A + + ++ A+ A + +K + E A +E ++ TE
Sbjct: 1040 VAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTE 1099

Query: 215 AAEKAKAEAEKKAAAEKAAADKKAAAEKAAADKKAAEKAAAEKAAADKKAAAEKAAADKK 274
E A E E+KA E + + K+ + +A ++ + +
Sbjct: 1100 TKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQ 1159

Query: 275 AAAAKAAAEKAAAAKAAAE 293
+ A + A + ++
Sbjct: 1160 SQTNTTADTEQPAKETSSN 1178



Score = 56.6 bits (136), Expect = 1e-10
Identities = 30/236 (12%), Positives = 85/236 (36%), Gaps = 11/236 (4%)

Query: 68 QSQESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQA 127
Q+ S ++E+ ++ +E ++ + +K E+ A +
Sbjct: 1004 QADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKN--EQDATET 1061

Query: 128 ELKQKQ-AEEAAAKAAAD------AKAKAEADAKAAEEAAKKAAADAKKKAEAEAAKAAV 180
+ ++ A+EA + A+ A++ +E E + A + ++KA+ E K
Sbjct: 1062 TAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQE 1121

Query: 181 EAQKKAEAAAAALKKKAEAAEAAAAEARKKAATEAAEKAKAEAEKKAAAEKAAADKKAAA 240
+ ++ + ++++E + A AR+ T ++ +++ A E+ A + +
Sbjct: 1122 VPKVTSQVSPK--QEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNV 1179

Query: 241 EKAAADKKAAEKAAAEKAAADKKAAAEKAAADKKAAAAKAAAEKAAAAKAAAEADD 296
E+ + + + A ++ K + ++ +
Sbjct: 1180 EQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVE 1235



Score = 56.2 bits (135), Expect = 2e-10
Identities = 30/229 (13%), Positives = 72/229 (31%), Gaps = 4/229 (1%)

Query: 66 RMQSQESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAK 125
R ++E+ + + + Q+ E +E Q E + +EKE A E +K E
Sbjct: 1066 REVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKV 1125

Query: 126 QAELKQKQAEEAAAKAAADAKAKAEADAKAAEEAAKKAAADAKKKAEAEAAKAAVEAQKK 185
+++ KQ + + A+ + + +E + A + A+ + VE
Sbjct: 1126 TSQVSPKQEQSETVQPQAEPARENDP-TVNIKEPQSQTNTTADTEQPAKETSSNVEQPVT 1184

Query: 186 AEAAAAALKKKAEAAEAAAAEARKKAATEAAEKAKAEAEKKAAAEKAAADKKAAA---EK 242
E E + + +++ + A ++
Sbjct: 1185 ESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDR 1244

Query: 243 AAADKKAAEKAAAEKAAADKKAAAEKAAADKKAAAAKAAAEKAAAAKAA 291
+ +D +A A+ A + A ++ ++ +
Sbjct: 1245 STVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLEMNNEGQ 1293



Score = 55.8 bits (134), Expect = 2e-10
Identities = 32/265 (12%), Positives = 86/265 (32%), Gaps = 14/265 (5%)

Query: 51 DAVMVDSGAVVEQYKRMQSQESSAKRSDEQRKMKEQQAAE-ELREKQAAEQER------L 103
D V A + ++ ++K+ + + EQ A E + ++ A++ +
Sbjct: 1021 DEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANT 1080

Query: 104 KQLEKERLAAQEQKKQAEEAAKQAELKQKQAEEAAAKAAADAKAKAEADAKAAEEAAKKA 163
+ E + ++ ++ Q K+ +K+ KA + + E ++ + K+
Sbjct: 1081 QTNEVAQSGSETKETQ-TTETKETATVEKE-----EKAKVETEKTQEVPKVTSQVSPKQE 1134

Query: 164 AADA-KKKAEAEAAKAAVEAQKKAEAAAAALKKKAEAAEAAAAEARKKAATEAAEKAKAE 222
++ + +AE K+ ++ + A+ ++ +
Sbjct: 1135 QSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNS 1194

Query: 223 AEKKAAAEKAAADKKAAAEKAAADKKAAEKAAAEKAAADKKAAAEKAAADKKAAAAKAAA 282
+ A + +++ K + + + + A + A +
Sbjct: 1195 VVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTS 1254

Query: 283 EKAAAAKAAAEADDIFGELSSGKNA 307
A + A A F L+ GK
Sbjct: 1255 TNTNAVLSDARAKAQFVALNVGKAV 1279


14  c0872  c0880  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
c0872         -1       22    3.449052  Hypothetical protein ybhO
c0873         -1       22    3.810051  Hypothetical protein ybhP
c0874         -2       22    3.515243  Hypothetical protein ybhQ
c0875         -1       21    3.840107  Hypothetical protein ybhR
c0876         -1       21    4.243760  Hypothetical protein ybhS
c0877         -1       17    3.912646  Hypothetical ABC transporter ATP-binding protein
c0878         -1       15    3.404899  Hypothetical membrane protein ybhG
c0879          0       13    2.971970  Hypothetical transcriptional regulator ybiH
c0880          0       14    3.312630  Putative ATP-dependent RNA helicase rhlE
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0875     ABC2TRNSPORT  46     9e-08    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 46.1 bits (109), Expect = 9e-08
Identities = 35/146 (23%), Positives = 63/146 (43%), Gaps = 5/146 (3%)

Query: 197 AREREQGTLDQLLVSPLTTWQIFIGKAVPALIVATFQATIVLAIGIWAYQIPFAGSLALF 256
R Q T + +L + L I +G+ A A IG+ A + + L+L
Sbjct: 92 GRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGA---GIGVVAAALGYTQWLSLL 148

Query: 257 YFTMVV--YGLSLVGFGLLISSLCSTQQQAFIGVFVFMMPAILLSGYVSPVENMPMWLQN 314
Y V+ GL+ G+++++L + + + P + LSG V PV+ +P+ Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 315 LTWINPIRHFTDITKQIYLKDASLDI 340
P+ H D+ + I L +D+
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDV 234


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0877     PF05272  31     0.012    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.012
Identities = 20/90 (22%), Positives = 28/90 (31%), Gaps = 21/90 (23%)

Query: 298 TPRFEDAFIDLLGGAGTSESPLGAILHTVGGTPGETVIEAKELTKKFGDFAATDHVNFAV 357
PR E + +LG P + + + K HV +
Sbjct: 547 VPRLEKWLVHVLGKTPDDYKP-------------RRLRYLQLVGKYI----LMGHVARVM 589

Query: 358 KRGEIFG----LLGPNGAGKSTTFKMMCGL 383
+ G F L G G GKST + GL
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGL 619



Score = 29.3 bits (65), Expect = 0.047
Identities = 11/23 (47%), Positives = 13/23 (56%)

Query: 39 YVTGLVGPDGAGKTTLMRMLAGL 61
Y L G G GK+TL+ L GL
Sbjct: 597 YSVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c0878     RTXTOXIND  62     6e-13    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 62.2 bits (151), Expect = 6e-13
Identities = 42/259 (16%), Positives = 92/259 (35%), Gaps = 25/259 (9%)

Query: 83 ALMQAKAGVSVAQAQYDLMLAGYRDEEIAQAAAAVKQAQAAYDYAQNFYNRQQGLWKSRT 142
Q + + +A+ +LA E + + + + L +
Sbjct: 201 QKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENK 260

Query: 143 ISA--NDLENARSSRDQAQATLKSAQDKLRQYRSGNREQ---DIAQAKASLEQAQAQLAQ 197
N+L +S +Q ++ + SA+++ + + + + Q ++ +LA+
Sbjct: 261 YVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAK 320

Query: 198 AELNLQDSTLVAPSDGTLLTRAV-EPGTVLNEGGTVFTVSLT-RPVWVRAYVDERNLDQA 255
E Q S + AP + V G V+ T+ + + V A V +++
Sbjct: 321 NEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFI 380

Query: 256 QPGRKVLLYTDGRPNKPYH---GQIGFVSPTAEFTPKTVETPDLRTDLVYRLRIVVT--- 309
G+ ++ + P Y G++ ++ A D R LV+ + I +
Sbjct: 381 NVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA--------IEDQRLGLVFNVIISIEENC 432

Query: 310 ----DADDALRQGMPVTVQ 324
+ + L GM VT +
Sbjct: 433 LSTGNKNIPLSSGMAVTAE 451


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0879     HTHTETR  73     7e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 72.7 bits (178), Expect = 7e-18
Identities = 33/214 (15%), Positives = 78/214 (36%), Gaps = 17/214 (7%)

Query: 13 KGEQAKKQLIAAALAQFGEYGMNATT-REIAAQAGQNIAAITYYFGSKEDLYLACAQWIA 71
+ ++ ++ ++ AL F + G+++T+ EIA AG AI ++F K DL+ +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 72 DFIGEQFRPHAEEAERLFAQPQPDRAAIRELILRACKNMIKLLTQDDTVNLSKFISREQL 131
IGE E + P + +RE+++ ++ + + + + F E +
Sbjct: 68 SNIGELEL---EYQAKFPGDP---LSVLREILIHVLESTVTEERRRLLMEII-FHKCEFV 120

Query: 132 SPTAAYHLVHEQVISPLHSHLTRLIAAWTGCDANDTRMILHTHALIGEILAFRLGKETIL 191
A + + + + + +A L T + + G
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKH--CIEAKMLPADLMTRRAAIIMRGYISG----- 173

Query: 192 LRTGWTAFDEEKTELINQTVTCHIDLILQGLSQR 225
L W + + + ++ ++L+
Sbjct: 174 LMENWLFAPQSFD--LKKEARDYVAILLEMYLLC 205


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c0880     SECA  30     0.026    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.026
Identities = 20/67 (29%), Positives = 34/67 (50%), Gaps = 4/67 (5%)

Query: 246 QQVLVFTRTKHGANHLAEQLNKDGIRSAAIHG-NKSQGARTRALADFKSGDIRVLVATDI 304
Q VLV T + + ++ +L K GI+ ++ + A A A + + V +AT++
Sbjct: 450 QPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQAGYPAA---VTIATNM 506

Query: 305 AARGLDI 311
A RG DI
Sbjct: 507 AGRGTDI 513


15  c0908  c0918  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias     %GCBias  Product
c0908         -2       13    3.301288  Putative formate acetyltransferase 3
c0909         -1       12    3.285396  Putative pyruvate formate-lyase 3 activating
c0910         -2       14    3.053043  Fructose-6-phosphate aldolase 1
c0911         -2       14    3.183620  Molybdopterin biosynthesis protein moeB
c0912         -1       15    2.876441  Molybdopterin biosynthesis protein moeA
c0913          0       16   -0.936344  Putative L-asparaginase precursor
c0914          0       15   -2.641949  Hypothetical ABC transporter ATP-binding protein
c0915          0       13   -3.275665  Putative binding protein yliB precursor
c0916         -1       11   -4.380104  Hypothetical ABC transporter permease protein
c0917          0       10   -4.523300  Hypothetical ABC transporter permease protein
c0918          0       10   -4.889348  Hypothetical protein yliE
16  c0928  c0975  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
c0928         -2       19   -4.612827  Hypothetical protein ybjH precursor
c0929         -1       23   -6.353761  Protein ybjI
c0930         -1       27   -6.637048  Hypothetical protein ybjJ
c0931          1       32   -8.601536  Hypothetical protein ybjK
c0932          2       34   -9.392704  Integrase for prophage
c0933          3       39  -11.036087  Hypothetical protein
c0934          3       35   -7.251298  Hypothetical protein
c0935          5       31   -2.372551  Putative regulator for prophage
c0936          2       23   -2.620817  Hypothetical protein
c0937          1       16   -0.211898  Hypothetical protein
c0938          0       17   -0.527599  Hypothetical protein
c0939          1       16   -0.538793  Hypothetical protein
c0940          0       18   -2.403508  Hypothetical protein ybiI
c0941          1       23   -5.463502  DNA adenine methylase
c0942          1       20   -4.796586  Putative replication protein for prophage
c0943          0       19   -4.360080  Hypothetical protein
c0944         -1       18   -4.150108  Hypothetical protein
c0945         -1       18   -2.618374  Hypothetical protein
c0946          0       20    0.157080  Hypothetical protein
c0947          2       29    3.955419  Probable capsid portal protein
c0948          4       31    5.332776  Terminase, ATPase subunit
c0949          5       26    5.666980  Hypothetical protein
c0950          6       28    6.071026  Putative capsid scaffolding protein
c0951          5       28    6.036063  Hypothetical protein
c0952          4       30    6.261089  Major capsid protein
c0953          3       25    5.701361  Terminase, endonuclease subunit
c0954          3       26    5.494466  Putative capsid completion protein
c0955          2       24    4.316047  Probable phage tail protein
c0956          4       21    4.076780  Possible secretory protein
c0957          5       19    4.533335  Fels-2 prophage: probable prophage lysozyme
c0958          4       20    4.432555  Putative membrane protein
c0959          5       23    5.405404  Putative Regulatory protein
c0960          4       23    5.054180  Putative phage tail protein
c0961          4       23    5.072754  Hypothetical protein
c0962          3       22    4.855966  Putative phage tail protein
c0963          1       21    2.856210  Putative Phage baseplate assembly protein
c0964          2       23    0.508926  Phage baseplate assembly protein
c0965          3       23   -4.301850  Phage baseplate assembly protein
c0966          2       19   -3.912191  Putative phage tail protein
c0967          2       19   -4.147272  Hypothetical protein
c0968          1       19   -4.346721  Probable variable tail fibre protein
c0969          1       19   -5.157618  Hypothetical protein yfdK
c0970          1       20   -3.073768  Hypothetical protein
c0971          3       17    4.592292  Probable major tail sheath protein
c0972          2       18    4.295357  Putative tail fiber component of prophage
c0973          1       18    4.057236  Hypothetical protein
c0974          2       18    3.811643  Putative phage tail protein
c0975         -1       15    3.028931  Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0930     TCRTETB  34     0.001    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 33.7 bits (77), Expect = 0.001
Identities = 34/150 (22%), Positives = 65/150 (43%), Gaps = 6/150 (4%)

Query: 239 LLIGVVVLAMAFAEGSANDWL-PLLMVDGHGFSP-TSGSLIYAGFTLGMTVGRFIGGWFI 296
+IGV+ + F + + P +M D H S GS+I T+ + + +IGG +
Sbjct: 258 FMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILV 317

Query: 297 DHYSRVAVVR-ASALM--GALGIGLIIFVDSAWVA-GVSVVLWGLGASLGFPLTISAASD 352
D + V+ + L ++ S ++ + VL GL + TI ++S
Sbjct: 318 DRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSL 377

Query: 353 TGPDAPTRVSVVATTGYLAFLVGPPLLGYL 382
+A +S++ T +L+ G ++G L
Sbjct: 378 KQQEAGAGMSLLNFTSFLSEGTGIAIVGGL 407


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0931     HTHTETR  52     1e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 52.3 bits (125), Expect = 1e-10
Identities = 14/81 (17%), Positives = 31/81 (38%)

Query: 12 RRANDPQRREKIIQATLEAVKLYGIHAVTHRKIAALAGVPLGSMTYYFSGIDELLLEAFS 71
+ + R+ I+ L G+ + + +IA AGV G++ ++F +L E +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 72 RFTEIMSRQYQAFFSDVSDAP 92
+ + + P
Sbjct: 65 LSESNIGELELEYQAKFPGDP 85


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0942     DNABINDNGFIS  30     0.012    DNA-binding protein FIS signature.
		>DNABINDNGFIS#DNA-binding protein FIS signature.

Length = 98

Score = 29.6 bits (66), Expect = 0.012
Identities = 21/72 (29%), Positives = 37/72 (51%), Gaps = 9/72 (12%)

Query: 100 QQRFNPDMVILADVNAQPSHISKPLMQRIE-----YFSSL-GRP--KAYSRYLRETIKPC 151
+QR N D++ ++ VN+Q KPL ++ YF+ L G+ Y L E +P
Sbjct: 3 EQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQPL 62

Query: 152 LER-LEHIRDSQ 162
L+ +++ R +Q
Sbjct: 63 LDMVMQYTRGNQ 74


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c0959  60KDINNERMP  27  0.033  60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 26.8 bits (59), Expect = 0.033
Identities = 21/80 (26%), Positives = 32/80 (40%), Gaps = 14/80 (17%)

Query: 3 RLLLVVLALLLAALGWQTWR-------LADASQTISTQSDELQSKSQALAKSNSQLIS-- 53
R LLV+ L ++ + WQ W A + +T + + A +LIS
Sbjct: 5 RNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASGQGKLISVK 64

Query: 54 -----LSILTETNNREQARL 68
L+I T + EQA L
Sbjct: 65 TDVLDLTINTRGGDVEQALL 84


17  c1109  c1122  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
c1109  -1  18  3.312204  Hypothetical protein
c1110  -1  20  3.129291  Hypothetical protein yccA
c1111  -1  21  4.144946  Hypothetical protein
c1112  -1  20  4.264615*  Hypothetical protein
c1113  -1  19  4.136563  Hydrogenase-1 small chain precursor
c1114  -1  18  3.686027  Hydrogenase-1 large chain
c1115  0  17  2.861588  Probable Ni/Fe-hydrogenase 1 B-type cytochrome
c1116  -1  15  3.008077  Hydrogenase 1 maturation protease
c1117  -2  15  2.243244  Hydrogenase-1 operon protein hyaE
c1118  -2  15  1.711029  Hydrogenase-1 operon protein hyaF
c1119  -2  14  0.807402  Cytochrome BD-II oxidase subunit I
c1120  -2  16  -0.730363  Cytochrome BD-II oxidase subunit II
c1121  0  19  -2.712737  Periplasmic appA protein precursor
c1122  4  21  -5.555318  Cold shock-like protein cspH
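
Rows of the per-ORF table can be split into fields with a fixed-width pattern, whether or not whitespace separates the columns. The sketch below is illustrative only: `OrfRow` and `parse_row` are hypothetical names, and the pattern assumes what the rows in this report suggest, namely a five-character locus tag, a single-digit DNBias with optional sign, a two-digit CDNBias, and a %GCBias printed with six decimals. Trailing asterisks are kept as an uninterpreted flag.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: c#### | DNBias (one digit, optional sign) |
# CDNBias (two digits) | %GCBias (six decimals) | optional '*' flags | product.
ROW = re.compile(
    r"^(c\d{4})\s*(-?\d)\s*(\d{2})\s*(-?\d+\.\d{6})(\**)\s*(.*)$"
)


class OrfRow(NamedTuple):
    locus_tag: str
    dn_bias: int
    cdn_bias: int
    gc_bias: float
    flags: str
    product: str


def parse_row(line: str) -> Optional[OrfRow]:
    """Split one table row into typed fields; return None for other lines."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    tag, dn, cdn, gc, flags, product = m.groups()
    return OrfRow(tag, int(dn), int(cdn), float(gc), flags, product.strip())


if __name__ == "__main__":
    for text in (
        "c1112-1204.264615*Hypothetical protein",
        "c1122  4  21  -5.555318  Cold shock-like protein cspH",
        "LocusTagDNBiasCDNBias%GCBiasProduct",
    ):
        print(parse_row(text))
```

If a genome produced a DNBias or CDNBias outside these widths, the split would be ambiguous and the pattern would need to be revisited.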
18  c1138  c1147  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c1138  2  17  1.172677  Hypothetical protein
c1139  1  16  2.452495  Hypothetical protein yccJ
c1140  1  16  3.342062  Flavoprotein wrbA
c1141  -1  18  4.230552  Hypothetical protein
c1142  -1  18  4.389469  Hypothetical protein
c1143  -1  17  4.589973  Putative purine permease ycdG
c1144  -1  18  4.852583  Putative flavin:NADH reductase ycdH
c1145  0  14  3.817546  Putative NADH dehydrogenase/NAD(P)H
c1146  -2  11  3.996554  Hypothetical protein ycdJ
c1147  -1  11  3.140410  Hypothetical protein ycdK
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1143  TCRTETB  29  0.049  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 28.7 bits (64), Expect = 0.049
Identities = 20/114 (17%), Positives = 34/114 (29%), Gaps = 11/114 (9%)

Query: 50 PFAQTAVMGVQHAVAMFGATVLMPILMGLDPNLSIFMSGIGTLL--------FFFITGGR 101
PF + G + G ++P +M LS G + F +I G
Sbjct: 257 PFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGIL 316

Query: 102 VPSYLGSSAAFVGVVIAATGFNGQGINPNISIALGGIIACGLVYTVIGLVVMKI 155
V +GV + F + + +V+ + GL K
Sbjct: 317 VDRRGPLYVLNIGVTFLSVSFLTASFLLETTSW---FMTIIIVFVLGGLSFTKT 367


19  c1157  c1308  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c1157  -2  19  -3.457617  Hypothetical protein ycdB precursor
c1158  -2  24  -4.813119  Hypothetical protein
c1159  -1  26  -6.145571  PhoH protein
c1160  -1  29  -6.441034  Hypothetical protein ycdP
c1161  -1  32  -8.600984  Hypothetical protein ycdQ
c1162  -1  34  -9.206633  Hypothetical lipoprotein ycdR precursor
c1163  0  33  -9.555184  Hypothetical protein ycdS precursor
c1164  2  37  -11.794742  Hypothetical protein ycdT
c1165  3  37  -10.586987  Putative P4-family integrase
c1166  4  35  -12.179249  Conserved hypothetical protein
c1167  3  28  -6.773266  Hypothetical protein
c1168  3  25  -5.911105  Unknown in IS1N
c1169  2  19  -4.176911  Prophage CP4-57 Regulatory protein alpA
c1170  1  17  -2.513827  Hypothetical protein yfjI
c1171  -1  19  0.229572  Conserved hypothetical protein
c1172  -2  20  1.559906  Hypothetical protein
c1173  -2  18  1.217410  Hypothetical protein
c1174  -2  18  0.954462  Hypothetical protein
c1175  -1  17  2.998586  Putative aminotransferase
c1176  0  14  3.365536  Hypothetical protein
c1183  0  16  4.072822  Hypothetical protein
c1184  0  18  5.437654  Conserved hypothetical protein
c1185  0  22  6.003452  Hypothetical protein
c1186  1  22  6.445450  Putative beta-ketoacyl-ACP synthase
c1187  1  25  6.613388  3-oxoacyl-[acyl-carrier protein] reductase
c1188  1  25  6.509861  Conserved hypothetical protein
c1189  2  24  6.339634  Putative 3-oxoacyl-[ACP] synthase
c1190  2  25  5.892560  Conserved hypothetical protein
c1191  1  22  5.427869  Conserved hypothetical protein
c1192  -2  14  2.068796  Conserved hypothetical protein
c1193  -2  15  1.762095  Hypothetical protein
c1194  0  15  1.342830  Putative enzyme
c1195  2  22  -1.710140  Hypothetical protein
c1196  0  21  -1.797148  Conserved hypothetical protein
c1197  1  20  -1.104068  Putative enzyme
c1198  2  17  -1.455478  Conserved hypothetical protein
c1199  2  18  -2.685887  Putative acyl carrier protein
c1200  2  14  0.778763  Putative acyl carrier protein
c1201  3  16  1.498713  Putative phospholipid biosynthesis
c1202  4  18  1.507102  Conserved hypothetical protein
c1203  6  21  1.469071  Putative O-methyltransferase
c1204  6  23  2.262284  Conserved hypothetical protein
c1205  6  22  2.894874  Hypothetical protein
c1206  5  26  1.843811  Hypothetical protein
c1207  5  29  0.555546  Hypothetical protein
c1208  7  29  0.010840  Hypothetical protein
c1209  5  25  -0.671497  Hypothetical protein
c1210  5  29  -0.060149  Hypothetical protein
c1213  4  30  -0.455721  Hypothetical protein
c1214  4  30  -0.858453  Cea protein
c1215  3  33  -3.983461  Entry exclusion protein 2
c1216  4  34  -4.887001  Hypothetical protein
c1217  4  35  -5.447201  Hypothetical protein
c1218  3  32  -3.446622  Hypothetical protein
c1219  2  30  -3.517317  Putative Transposase
c1220  2  32  -4.758817  Phospho-2-dehydro-3-deoxyheptonate aldolase,
c1221  4  31  -4.540109  Hypothetical protein
c1222  5  32  -4.352266  Hypothetical protein
c1224  5  29  -5.376785  Transposase insF for insertion sequence
c1225  6  32  -6.758176  Transposase insE for insertion sequence
c1227  7  36  -8.235710  MchB protein
c1229  7  35  -9.080959  MchC protein
c1230  6  35  -8.497828  MchD protein
c1231  5  33  -7.558864  Microcin H47 secretion protein
c1232  4  33  -7.152092  Probable microcin H47 secretion ATP-binding
c1233  5  38  -8.596330  Hypothetical protein
c1234  4  34  -5.780419  Conserved hypothetical protein
c1235  4  31  -3.208471  Conserved hypothetical protein
c1236  5  32  -4.465054  Conserved hypothetical protein
c1237  7  31  -2.199248  Putative F1C and S fimbrial switch Regulatory
c1238  8  32  -2.152658  Putative F1C and S fimbrial switch Regulatory
c1239  9  31  -2.457144  F1C major fimbrial subunit precursor
c1240  9  32  -2.204737  Putative minor F1C fimbrial subunit precursor
c1241  9  32  -2.582027  F1C periplasmic chaperone
c1242  9  31  -3.109349  F1C fimbrial usher
c1243  7  29  -6.486959  F1C minor fimbrial subunit F precursor
c1244  6  32  -7.587612  F1C minor fimbrial subunit protein G presursor
c1245  5  23  -5.141932  F1C Putative fimbrial adhesin precursor
c1246  3  18  -4.382822  Hypothetical protein
c1247  2  15  -2.380521  Putative Regulatory protein
c1248  0  14  1.739027  Hypothetical protein
c1249  0  15  3.007883  Hypothetical protein
c1250  0  14  3.403865  Siderophore receptor IroN
c1251  0  16  4.225984  IroE protein
c1252  1  18  4.340302  Ferric enterochelin esterase
c1253  1  17  3.395443  ATP binding cassette (ABC) transporter homolog
c1254  3  23  -1.851171  Putative glucosyltransferase
c1255  5  30  -3.981244  Hypothetical protein
c1256  5  29  -2.550510  Hypothetical protein
c1257  4  24  -0.398036  Putative conserved protein
c1258  2  22  0.036903  Hypothetical protein
c1259  4  22  1.745984  Hypothetical protein
c1260  5  24  0.620328*  Putative Transposase
c1261  5  25  -0.845627  Hypothetical protein
c1262  6  26  -3.174051  Putative Transposase for IS629
c1263  6  26  -3.970528  Unknown protein of IS629 encoded within
c1264  5  28  -5.302596  Hypothetical protein
c1265  5  30  -6.831001  Outer membrane heme/hemoglobin receptor
c1266  6  34  -9.182245  Hypothetical protein
c1267  4  26  -6.183809  Hypothetical protein
c1268  6  22  0.917578  Hypothetical protein
c1269  6  22  0.825478  Hypothetical protein
c1270  7  23  1.987353  Hypothetical protein
c1271  8  24  3.046086  Hypothetical protein
c1272  8  24  3.835630  Hypothetical protein yeeP
c1273  8  25  3.621491  Antigen 43 precursor
c1274  6  24  2.131635  Hypothetical protein yeeR
c1275  7  26  3.216397  Hypothetical protein
c1276  8  27  4.616975  Hypothetical protein
c1277  8  28  5.080732  Hypothetical protein
c1278  8  27  4.937569  Hypothetical protein yafZ
c1279  8  27  4.982724  Hypothetical protein
c1280  8  27  5.054479  Hypothetical protein yafX
c1281  8  29  5.518256  Hypothetical protein
c1282  7  32  5.301996  Putative radC-like protein yeeS
c1284  6  30  4.548402  Hypothetical protein yeeT
c1285  6  31  3.941788  Hypothetical protein
c1286  3  29  0.933330  Hypothetical protein
c1287  5  24  -1.233251  Hypothetical protein
c1288  5  22  -1.818879  Hypothetical protein yeeV
c1289  4  22  -3.308586  Unknown protein encoded within prophage
c1290  3  21  -3.355525  Conserved hypothetical protein
c1291  1  18  -3.133852  Hypothetical protein
c1292  0  16  -1.948513  Conserved hypothetical protein
c1293  -1  17  -0.653127  Hypothetical protein
c1294  -2  16  -0.591203*  Hypothetical protein
c1295  -2  15  -0.978509  Putative 2-hydroxyacid dehydrogenase ycdW
c1296  -2  15  -2.108283  Hypothetical protein ycdX precursor
c1297  -1  17  -3.440614  Hypothetical protein ycdY
c1298  0  19  -4.937255  Hypothetical protein ycdZ
c1299  1  23  -7.385536  Curli production assembly/transport component
c1300  0  32  -10.026450  Curli production assembly/transport component
c1301  1  31  -8.183693  Curli production assembly/transport component
c1302  2  35  -8.155227  Probable csgAB operon transcriptional Regulatory
c1303  5  36  -7.642006  Hypothetical protein
c1304  3  32  -5.422339  Hypothetical protein
c1305  -1  23  -3.345354  Minor curlin subunit precursor
c1306  -1  19  -3.866444  Major curlin subunit precursor
c1307  -1  22  -4.353577  Putative curli production protein csgC
c1308  -1  15  -3.452694  Hypothetical protein ymdA precursor
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1163  ARGDEIMINASE  31  0.030  Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 30.6 bits (69), Expect = 0.030
Identities = 26/181 (14%), Positives = 59/181 (32%), Gaps = 22/181 (12%)

Query: 451 HRAAENELKKAEVIEPRNINLEVEQAWTALTLQEWQQA--AVLTHDVVEREPQDPGVV-R 507
A + A +++ + +E + + L ++ ++E E + +
Sbjct: 49 EVARQEHEVFASILKNNLVEIEYIEDLISEVLVSSVALENKFISQFILEAEIKTDFTINL 108

Query: 508 LK---RAVDVHNLAELRIAGSTGIDAEGPDSGKHDVDLTTIVYS---PPLKDNWRGFAGF 561
LK ++ + N+ I+G E + DL P+ + F
Sbjct: 109 LKDYFSSLTIDNMISKMISGVVT--EELKNYTSSLDDLVNGANLFIIDPMPNVL--FT-- 162

Query: 562 GYADGQFSEGKGIVRDWLAGVEWRSRNIWLEAEYAERVFNHEHKPGARLSGWYDFNDNWR 621
D S G G+ + + + R E +AE +F + + W + +
Sbjct: 163 --RDPFASIGNGVT---INKMFTKVRQ--RETIFAEYIFKYHPVYKENVPIWLNRWEEAS 215

Query: 622 I 622
+
Sbjct: 216 L 216


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1169  HTHFIS  26  0.025  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 25.6 bits (56), Expect = 0.025
Identities = 6/15 (40%), Positives = 13/15 (86%)

Query: 5 SQLLGISRSTIYEKM 19
+ LLG++R+T+ +K+
Sbjct: 456 ADLLGLNRNTLRKKI 470


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1187  DHBDHDRGNASE  92  2e-24  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 91.7 bits (227), Expect = 2e-24
Identities = 62/251 (24%), Positives = 119/251 (47%), Gaps = 15/251 (5%)

Query: 3 RSVLVTGASKGIGRAIACQLAADGFNI-GVHYHRDAAGAQETLNAIVANGGNGRLLSFDV 61
+ +TGA++GIG A+A LA+ G +I V Y+ + + ++ A + DV
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVS--SLKAEARHAEAFPADV 66

Query: 62 ANREQCREVLEHEIAQHGAWYGVVSNAGIARDAAFPALSNDDWDAVIHTNLDSFYNVIQP 121
+ E+ + G +V+ AG+ R +LS+++W+A N +N +
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 122 CIMPMIGARQGGRIITLSSVSGVMGNRGQVNYSAAKAGIIGATKALAIELAKRKITVNCI 181
+ + R+ G I+T+ S + Y+++KA + TK L +ELA+ I N +
Sbjct: 127 -VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 182 APGLIDTGMIEM-------EESALKEAMSM----IPMKRMGQAEEVAGLASYLMSDIAGY 230
+PG +T M E +K ++ IP+K++ + ++A +L+S AG+
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 231 VTRQVISINGG 241
+T + ++GG
Sbjct: 246 ITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1191  ACRIFLAVINRP  49  6e-08  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 49.1 bits (117), Expect = 6e-08
Identities = 37/167 (22%), Positives = 77/167 (46%), Gaps = 12/167 (7%)

Query: 246 YSDYASQQAKQDISTLGVATLLGVILLIVAVFRSLRPLLLCVISIGIGALAGTVATLLIF 305
+ + + + TL A +L V L++ +++R L+ I++ + L GT A L F
Sbjct: 329 TTPFVQLSIHEVVKTLFEAIML-VFLVMYLFLQNMRATLIPTIAVPV-VLLGTFAILAAF 386

Query: 306 G-ELHLMTLVMSMSVIGISADYTLYYL--TERMVHGNDVSPWQ----SLAKVRNALLLAL 358
G ++ +T+ + IG+ D + + ER++ + + P + S+++++ AL+
Sbjct: 387 GYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIA 446

Query: 359 LTTVAAYL-IMMLAPFPGI--RQMAIFAAIGLSASCLTVLFWHPWLC 402
+ A ++ + G RQ +I ++ S L L P LC
Sbjct: 447 MVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALC 493



Score = 37.9 bits (88), Expect = 2e-04
Identities = 34/199 (17%), Positives = 69/199 (34%), Gaps = 31/199 (15%)

Query: 592 LVPVEGVKNSAALQEISSYYPCGIAWV---DRKNTFDELFALYRYVLTGLLLVALAVIAC 648
L + +K A L E+ ++P G+ + D +++ V T + L +
Sbjct: 300 LDTAKAIK--AKLAELQPFFPQGMKVLYPYDTTPFVQL--SIHEVVKTLFEAIMLVFLVM 355

Query: 649 GAVARLGWRKGFISLVPSVLSLGCGLAVLAISGQAVNLFSLLALVLVLGIGI-------- 700
+ R I + + L A+LA G ++N ++ +VL +G+ +
Sbjct: 356 YLFLQ-NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVE 414

Query: 701 NYTLFFSNPRGTPLT-----------SLLAITLAMLTTLLTLGMLVFSATQAISSFGIVL 749
N + P +L+ I + + + + S F I +
Sbjct: 415 NVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITI 474

Query: 750 VSGI----FTAFLLSPLAM 764
VS + A +L+P
Sbjct: 475 VSAMALSVLVALILTPALC 493


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1206  FLAGELLIN  31  0.008  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 31.2 bits (70), Expect = 0.008
Identities = 30/281 (10%), Positives = 68/281 (24%), Gaps = 1/281 (0%)

Query: 38 LTGTGGIGFTIGSSKTSHDRREAGTTQSQSASTIGSTAGNVSITAGKQAHISGSDVIANR 97
L+ + +G++ + +S G S +V
Sbjct: 137 LSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYD 196

Query: 98 DISITGDSVVVDPGHDRRTVDEKFEQKKSGLTVALSGTVGSAINNAVTSAQETKESSDSR 157
++ + VD D + V + + + +A + +++ S
Sbjct: 197 TYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTTKST 256

Query: 158 LKALQATKTALSGVQAGQAATMASATGDPNATGVSLSLTTQKSKSQQHSESDTVSGSTLN 217
+A A + ++ G+ G S +
Sbjct: 257 AGTAEAKAIAGA-IKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADI 315

Query: 218 AGNNLSVVATGKNRGDNRGDIVIAGSQLKAGGNTSLDAANDILLSGAANTQKTTGRNSSS 277
+V A N V+ G + A L + A ++ + +
Sbjct: 316 TAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGA 375

Query: 278 GGGVGVSIGAGKGAGISAFASVNAAKGREKGNGTEWTETTT 318
+ AG + F A+ N +
Sbjct: 376 EYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKS 416


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1214  CHANLCOLICIN  132  1e-40  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 132 bits (334), Expect = 1e-40
Identities = 67/72 (93%), Positives = 67/72 (93%)

Query: 1 METAVAYYKDGVPYDDKGQVIITLLNDNPDGSGSGSGGGTGGSNSESSAAIHATAKWSTA 60
METAVAYYKDGVPYDDKGQVIITLLN PDGSGSG GGG GGS SESSAAIHATAKWSTA
Sbjct: 1 METAVAYYKDGVPYDDKGQVIITLLNGTPDGSGSGGGGGKGGSKSESSAAIHATAKWSTA 60

Query: 61 QLKKTQAEQAAR 72
QLKKTQAEQAAR
Sbjct: 61 QLKKTQAEQAAR 72


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1230  RTXTOXINC  52  4e-11  Gram-negative bacterial RTX toxin-activating protein C...
		>RTXTOXINC#Gram-negative bacterial RTX toxin-activating protein C signature.

Length = 170

Score = 51.8 bits (124), Expect = 4e-11
Identities = 23/77 (29%), Positives = 34/77 (44%), Gaps = 2/77 (2%)

Query: 48 AIQHNQIKFLFDSRGFPLAYITWAYLEADTEARLLRDPEFRLHPSEWNEDGRIWILDFCC 107
AIQ NQ L +P+AY +WA L + E + L D L +W R W +D+
Sbjct: 38 AIQANQYVLLTRD-DYPVAYCSWANLSLENEIKYLNDVT-SLVAEDWTSGDRKWFIDWIA 95

Query: 108 KPGFGRKVIDYLIQLQP 124
G + Y+ + P
Sbjct: 96 PFGDNGALYKYMRKKFP 112


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1231  RTXTOXIND  128  6e-35  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 128 bits (324), Expect = 6e-35
Identities = 85/432 (19%), Positives = 163/432 (37%), Gaps = 59/432 (13%)

Query: 27 WLIMLGSIVFITAFLMFIIVGTYSRRVNVSGEVTTWPRAVNIYSGVQGFVVRQFVHEGQL 86
+ IM F+ + ++G +G++T R+ I V V EG+
Sbjct: 62 YFIMG----FLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGES 117

Query: 87 IKKGDPVYLIDISKST------------------RSGIVTDNHRRDIENQLVRVDKIISR 128
++KGD + + + R I++ + + +L D+ +
Sbjct: 118 VRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQ 177

Query: 129 LEESKKIT---------LDTLEKQRLQYTDAFRRSS-------DIIQRAEEGIKIMKNNM 172
+++ T + Q+ Q + I R E ++ K+ +
Sbjct: 178 NVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRL 237

Query: 173 ENYRNYQTKGLINKDQLTNQVALYYQQQNNLLSLSGQNEQNALQITTLESQIQTQAADFD 232
+++ + K I K + Q Y + N L Q EQ +I + + + Q F
Sbjct: 238 DDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFK 297

Query: 233 NRIY----QMELQRYELQKELV-NTDVEGEIIIRALTDGKVDSLSV-TVGQMVNPGDNLL 286
N I Q L EL N + + +IRA KV L V T G +V + L+
Sbjct: 298 NEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLM 357

Query: 287 QVIPENIENYYLILWVPNDAVPYISAGDKVNIRYEAFPAEKFGQFSATVKTISRTPASTQ 346
++PE+ + + V N + +I+ G I+ EAFP ++G VK I+ + +
Sbjct: 358 VIVPED-DTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNIN--LDAIE 414

Query: 347 EMLTYKGAPQNTPGASVPWYKVIAMPEKQIIRYDEKNLPLENGMKAESTLFLEKRRIYQW 406
+ G + + +I++ E + KN+PL +GM + + R + +
Sbjct: 415 D---------QRLG--LVFNVIISIEENCLST-GNKNIPLSSGMAVTAEIKTGMRSVISY 462

Query: 407 MLSPFYDMKHSA 418
+LSP + +
Sbjct: 463 LLSPLEESVTES 474


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1238  FIMREGULATRY  146  2e-49  Escherichia coli: P pili regulatory PapB protein si...
		>FIMREGULATRY#Escherichia coli: P pili regulatory PapB protein signature.

Length = 104

Score = 146 bits (370), Expect = 2e-49
Identities = 84/102 (82%), Positives = 88/102 (86%)

Query: 1 MAQHEVITRGGDAFLLKLRESALSSGSMSEEQFFLLIGISSIHSDRVILAMKDYLVSGHS 60
MA HEVI+R G+AFLL +RES L GSMSE FFLLIGISSIHSDRVILAMKDYLV GHS
Sbjct: 1 MAHHEVISRSGNAFLLNIRESVLLPGSMSEMHFFLLIGISSIHSDRVILAMKDYLVGGHS 60

Query: 61 RKDVCEKYQMNNGYFSTTLGRLTRLNVLVARLAPYYTDSVSA 102
RK+VCEKYQMNNGYFSTTLGRL RLN L ARLAPYYTD SA
Sbjct: 61 RKEVCEKYQMNNGYFSTTLGRLIRLNALAARLAPYYTDESSA 102


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1240  FIMBRIALPAPE  30  0.003  Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 30.4 bits (68), Expect = 0.003
Identities = 39/160 (24%), Positives = 65/160 (40%), Gaps = 23/160 (14%)

Query: 11 AALAGNHWHVMLPGGNMRFQGKIIAEACSLALSDRQMTVDMGQLSSNRFHAAGEYGDSVG 70
A L H H N+ F+GK+I AC++ + V+ G + +G G+
Sbjct: 15 AVLMSQHVHA---ADNLTFKGKLIIPACTV----QNAEVNWGDIEIQNLVQSG--GNQKD 65

Query: 71 FDIHLQGCSTVVSQRVGISFYGVSDIHEPELLSVDEENDASDGIAIALFNES----GELV 126
F + + ++ + +V I+ G + +L + + DG+ I L+N + G V
Sbjct: 66 FTVDMNCPYSLGTMKVTITSNGQTG---NSILVPNTSTASGDGLLIYLYNSNNSGIGNAV 122

Query: 127 KLNQPPENWVHLTRGDMKLHMQARYKATHYPVTGGKANGQ 166
L +T G + AR K T Y G K N Q
Sbjct: 123 TLGSQ------VTPGKITGTAPAR-KITLYAKLGYKGNMQ 155


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1242  PF00577  963  0.0  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 963 bits (2490), Expect = 0.0
Identities = 546/861 (63%), Positives = 689/861 (80%), Gaps = 9/861 (1%)

Query: 41 RMKFNILPLAFFIGIIVSPAR------AELYFNPRFLSDDPDAVADLSAFTQGQELPPGV 94
K + + + + A AELYFNPRFL+DDP AVADLS F GQELPPG
Sbjct: 18 IRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGT 77

Query: 95 YRVDIYLNDTYISTRDVQFQMSQDGKQLAPCLSPEHMSAMGVNRYAVPGMERLPADTCTS 154
YRVDIYLN+ Y++TRDV F + + PCL+ +++MG+N +V GM L D C
Sbjct: 78 YRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVP 137

Query: 155 LNSMIQGATFRFDVGQQRLYLTVPQLYMSNQARGYIAPEYWDNGITAALLNYDFSGNRVR 214
L SMI AT + DVGQQRL LT+PQ +MSN+ARGYI PE WD GI A LLNY+FSGN V+
Sbjct: 138 LTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQ 197

Query: 215 DSYGGTSDYAYLNLKTGLNIGSWRLRDNTSWSYSAGKGYS--QNNWQHINTWLERDIVSL 272
+ GG S YAYLNL++GLNIG+WRLRDNT+WSY++ S +N WQHINTWLERDI+ L
Sbjct: 198 NRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPL 257

Query: 273 RSRLTMGDSYTRGDIFDGVNFRGIQLASDDNMVPDSQRGYAPTIHGISRGTSRISIRQNG 332
RSRLT+GD YT+GDIFDG+NFRG QLASDDNM+PDSQRG+AP IHGI+RGT++++I+QNG
Sbjct: 258 RSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNG 317

Query: 333 YEIYQSTLPPGPFEINDIYPAGSGGDLQVTLQEADGSVQRFNVPWSSVPVLQREGHLKYA 392
Y+IY ST+PPGPF INDIY AG+ GDLQVT++EADGS Q F VP+SSVP+LQREGH +Y+
Sbjct: 318 YDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYS 377

Query: 393 LSAGEFRSGGHQQDNPRFAEGTLKYGLPAGWTVYGGAWIAERYRAFNLGVGKNMGWLGAV 452
++AGE+RSG QQ+ PRF + TL +GLPAGWT+YGG +A+RYRAFN G+GKNMG LGA+
Sbjct: 378 ITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGAL 437

Query: 453 SLDATRANARLPDESRHDGQSYRFLYNKSLTETGTNIQLIGYRYSTRGYFSFADTAWKKM 512
S+D T+AN+ LPD+S+HDGQS RFLYNKSL E+GTNIQL+GYRYST GYF+FADT + +M
Sbjct: 438 SVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRM 497

Query: 513 SGYSVLTQDGVIQIQPKYTDYYNLAYNKRGRVQVSISQQTGESSTLYLSGSHQSYWGTDR 572
+GY++ TQDGVIQ++PK+TDYYNLAYNKRG++Q++++QQ G +STLYLSGSHQ+YWGT
Sbjct: 498 NGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSN 557

Query: 573 TDRQLNAGFNSSVNDISWSLNYSLSRNAWQHETDRILSFDVSIPFSHWMRSDSTSAWRNA 632
D Q AG N++ DI+W+L+YSL++NAWQ D++L+ +V+IPFSHW+RSDS S WR+A
Sbjct: 558 VDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHA 617

Query: 633 SARYSQTLEAHGQAASTAGLYGTLLGDNNLGYSIQSGYTRGGYEGSSKTGYASLNYRGGY 692
SA YS + + +G+ + AG+YGTLL DNNL YS+Q+GY GG S TGYA+LNYRGGY
Sbjct: 618 SASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGY 677

Query: 693 GNASAGYSHSGGYRQLYYGLSGGILAHANGLTLSQPLGDTLILVRAPGASDTRIENQTGV 752
GNA+ GYSHS +QLYYG+SGG+LAHANG+TL QPL DT++LV+APGA D ++ENQTGV
Sbjct: 678 GNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGV 737

Query: 753 STDWRGYAVLPYATDYRENRVALDTNTLADNVDIENTVVSVVPTHGAVVRADYKTRVGVK 812
TDWRGYAVLPYAT+YRENRVALDTNTLADNVD++N V +VVPT GA+VRA++K RVG+K
Sbjct: 738 RTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIK 797

Query: 813 VLMTLMRNGKAVPFGSVVTARNGGS-SIAGENGQVYLSGMPLSGQVSVKWGSQTTDQCTA 871
+LMTL N K +PFG++VT+ + S I +NGQVYLSGMPL+G+V VKWG + C A
Sbjct: 798 LLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVA 857

Query: 872 DYKLPKESAGQILSHVTASCR 892
+Y+LP ES Q+L+ ++A CR
Sbjct: 858 NYQLPPESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1273  IGASERPTASE  44  3e-06  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 44.3 bits (104), Expect = 3e-06
Identities = 54/248 (21%), Positives = 93/248 (37%), Gaps = 25/248 (10%)

Query: 501 AGTTTLNNGATFTLAGKTVNNDTLTIREGD-ALLQGGALTGNGRVEKS-GSGTLTVSNTT 558
A T + A+ G+ V N T I + A + G TG+ +S +G +T +
Sbjct: 743 ATTMNVTGNASLYS-GRNVANITSNITASNKAQVHIGYKTGDTVCVRSDYTGYVTCTTDK 801

Query: 559 LTQKAVNLNEGTLTLNDSTVTTDIIAHRGTALKLTGSTVLNGAIDPTN--VTLTSGATWN 616
L+ KA+N + N + + ++ L + + N V LT + W+
Sbjct: 802 LSDKALN------SFNPTNLRGNVNLTESANFVLGKANLFGTIQSRGNSQVRLTENSHWH 855

Query: 617 IPDNATVQSVVDDLSHAGQIHFTSARTGKFVPT--TLQVKNLNGQNGTISLRVRPDMAQN 674
+ N+ V + H IH SA V TL V +L+G S D++
Sbjct: 856 LTGNSDVHQLDLANGH---IHLNSADNSNNVTKYNTLTVNSLSGNG---SFYYLTDLSNK 909

Query: 675 NADRLVIDGGRATGKTILNLVNAGNSGTGLATTGKGIQVVEAINGATTEEGAFVQGNMLQ 734
D++V+ ATG L + + + + +A + GN +
Sbjct: 910 QGDKVVVT-KSATGNFTLQVAD-----KTGEPNHNELTLFDASKAQRDHLNVSLVGNTVD 963

Query: 735 AGAFNYTL 742
GA+ Y L
Sbjct: 964 LGAWKYKL 971


20  c1395  c1490  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c1395  -2  16  -3.614813  CobB protein
c1396  -2  18  -4.223418  Hypothetical protein ycfZ
c1397  -1  18  -3.247688  Hypothetical protein ymfA
c1398  -1  20  -3.427829  Spermidine/putrescine-binding periplasmic
c1399  0  25  -4.268621  Spermidine/putrescine transport system permease
c1400  2  29  -5.127639  Prophage lambda integrase
c1401  2  26  -4.757717  Putative excisionase for prophage
c1402  1  25  -4.758273  Exodeoxyribonuclease VIII
c1403  1  33  -7.847378  Hypothetical protein
c1404  1  34  -6.005239  Hypothetical protein ydfD
c1405  1  30  -6.766073  Division inhibition protein dicB
c1406  3  24  -4.488712  Hypothetical protein
c1407  2  25  -1.489907  Hypothetical protein ydfC
c1408  3  24  -1.182890  Hypothetical protein ydfB
c1409  3  26  -1.662999  Hypothetical protein ydfA
c1410  4  25  -1.832385  Hypothetical protein
c1411  3  24  -1.349891  Unknown protein encoded by cryptic prophage
c1412  2  26  -2.010702  Hypothetical protein
c1413  1  27  -7.244507  Unknown protein encoded within prophage
c1414  2  35  -10.137661  Hypothetical protein
c1415  3  39  -12.714071  Hypothetical protein
c1416  3  35  -10.069207  Hypothetical protein
c1417  4  34  -7.872550  Hypothetical protein
c1418  1  21  -2.278859  Hypothetical protein
c1419  -1  20  -0.390647  Hypothetical protein
c1420  -1  20  0.457802  Hypothetical protein
c1421  -1  20  0.634896  Hypothetical protein
c1422  -1  22  -0.388312  Hypothetical protein
c1423  0  23  -0.440702  Hypothetical protein ydfU
c1424  2  30  -4.566520  Hypothetical protein
c1425  3  29  -6.444831  Hypothetical protein
c1426  5  29  -6.153876  Antitermination protein Q homolog of cryptic
c1427  6  33  -8.129094  Hypothetical protein
c1428  9  37  -6.995410***  Hypothetical protein
c1429  7  36  -6.570169  Enhancing lycopene biosynthesis protein 1
c1430  6  35  -4.406484  Hypothetical protein
c1431  3  30  -1.442940  Hypothetical protein
c1432  1  24  -1.529894  Hypothetical protein
c1433  3  23  -1.790542  Lysis protein S homolog from lambdoid prophage
c1434  2  23  -2.174214  Hypothetical protein ydfR
c1435  2  27  -5.124283  Hypothetical protein
c1436  2  29  -4.878031  Probable lysozyme from lambdoid prophage Qin
c1437  6  33  -6.645253  Putative Rz endopeptidase from lambdoid prophage
c1438  6  33  -8.358348  Hypothetical protein
c1439  5  30  -7.492288  Hypothetical protein
c1440  4  20  -3.477540  Hypothetical protein ydfO
c1441  4  20  -0.786623  Hypothetical protein
c1442  3  24  1.166252  Unknown protein encoded within prophage
c1443  2  24  3.491267  Hypothetical protein
c1444  1  23  3.785454  Prophage Qin DNA packaging protein NU1 homolog
c1445  1  23  4.698671  Putative DNA packaging protein of prophage;
c1446  0  26  5.584643  Putative DNA packaging protein of prophage
c1447  1  26  5.581046  Putative capsid protein of prophage
c1448  3  25  6.171359  Putative capsid assembly protein of prophage
c1449  3  25  3.906233  Hypothetical protein
c1450  3  27  3.347851  Putative capsid protein of prophage
c1451  2  24  2.477328  Putative capsid protein of prophage
c1452  2  23  2.336461  Hypothetical protein
c1453  1  22  1.918992  Putative head-tail joining protein of prophage
c1454  2  24  1.730442  Putative tail component of prophage
c1455  3  24  4.397585  Putative tail component of prophage
c1456  2  25  4.763997  Putative tail fiber component V of prophage
c1457  3  28  5.627858  Hypothetical protein
c1458  3  26  6.405663  Putative tail component of prophage
c1459  4  28  6.890216  Putative tail component of prophage
c1460  3  28  7.343176  Putative tail component of prophage
c1461  6  32  6.783864  Putative tail fiber component M of prophage
c1462  6  30  6.389127  Putative tail component of prophage
c1463  3  25  4.021378  Hypothetical protein
c1464  2  26  3.741517  Putative tail fiber component K of prophage
c1465  2  25  3.101640  Putative tail assembly protein of cryptic
c1466  2  24  2.550743  Putative tail component of prophage
c1467  0  27  -0.978272  Putative Lom-like outer membrane protein of
c1468  0  30  -1.524891  Hypothetical protein
c1469  2  33  -2.779727  Hypothetical protein
c1470  3  32  -3.805193  Hypothetical protein
c1471  1  20  -2.816120  Hypothetical protein
c1472  -1  18  -2.451242  Hypothetical protein
c1473  -1  17  -3.471984  Hypothetical protein
c1474  -3  12  -3.026157  Hypothetical protein ybcY precursor
c1475  -2  12  -0.870674  Hypothetical protein ylcE
c1476  -2  13  -1.328265  Spermidine/putrescine transport system permease
c1477  -1  12  -1.553668  Spermidine/putrescine transport ATP-binding
c1478  -3  15  -1.709639  Hypothetical protein
c1479  -3  14  -0.908946  Peptidase T
c1480  -4  19  -1.508555  Hypothetical protein ycfD
c1481  2  30  -3.954747  Hypothetical protein
c1482  2  30  -2.834932  Hypothetical protein
c1483  1  30  -2.673269  Putative integrase of prophage
c1484  2  31  -3.503079  Unknown protein encoded by prophage
c1485  2  28  -3.817886  Hypothetical protein
c1486  1  28  -3.141976  Hypothetical protein
c1487  0  30  -3.836203  Hypothetical protein
c1488  1  30  -4.255696  Hypothetical protein
c1489  3  30  -3.882568  Hypothetical protein
c1490  2  24  -0.944426  Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1398  CHLAMIDIAOMP  28  0.043  Chlamydia major outer membrane protein signature.
		>CHLAMIDIAOMP#Chlamydia major outer membrane protein signature.

Length = 393

Score = 28.4 bits (63), Expect = 0.043
Identities = 19/67 (28%), Positives = 28/67 (41%), Gaps = 8/67 (11%)

Query: 137 GVNGDAVDPKSVTSWADL------WKPEYKGSLLLTDDAREVFQMALRKLGYSGNTTDPK 190
G GD DP T+W D + ++ +L D + FQM + +GN T P
Sbjct: 42 GFGGDPCDP--CTTWCDAISMRMGYYGDFVFDRVLKTDVNKEFQMGDKPTSTTGNATAPT 99

Query: 191 EIEAAYN 197
+ A N
Sbjct: 100 TLTAREN 106


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1405  STREPKINASE  29  0.004  Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 28.5 bits (63), Expect = 0.004
Identities = 11/36 (30%), Positives = 21/36 (58%)

Query: 17 TSPGGTRHRITKLIVEGAIMKTLLPNVNTSEGCFEI 52
T G H++ K + AI + L+ NV++++ FE+
Sbjct: 91 TDSGAMSHKLEKADLLKAIQEQLIANVHSNDDYFEV 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1456  INTIMIN  31  0.006  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 30.8 bits (69), Expect = 0.006
Identities = 26/120 (21%), Positives = 44/120 (36%), Gaps = 20/120 (16%)

Query: 134 KEVITRTVKVTNVGKPSVAEERSKITPVTAIKVTPTGTVEKGKTT-----------TLTV 182
++ IT TVKV KP +E + T + + + T G ++
Sbjct: 675 QDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSA 734

Query: 183 TVEPENATDK----TFRA-ISADPSKATI---SVKDMTITVTGVKDGKVSIPVISGNGQF 234
V K F ++ D I VK TV ++ G+V++ GNG++
Sbjct: 735 RVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVW-LQYGQVNLKASGGNGKY 793


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1460  GPOSANCHOR  43  6e-06  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 42.7 bits (100), Expect = 6e-06
Identities = 55/377 (14%), Positives = 123/377 (32%), Gaps = 36/377 (9%)

Query: 236 SGLIAMAKQFHNVTAEQIAYVAQLQRSGDETGALQAANEAATKGFDDQTRRLKENMGTLE 295
S ++ +E+ + + +L+ + + + + L+ L
Sbjct: 95 SNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALA 154

Query: 296 TWADRTARAFKSMWDAVLDI-GRPDTAQEMLIKAEAAFKKADDIWNLRKDDYFVNDEARA 354
+A + + + T + EA + + + +
Sbjct: 155 ARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIK 214

Query: 355 RYWDDREKARLALEAARK-KAEQQTQQDKNAQQQSDTEASRLKYTEEAQRAYERLQTPLE 413
++ K ++ + EA + + L+ +
Sbjct: 215 TLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMN 274

Query: 414 KYTARQEELNKALKDGKILQADYNTLMAAAKKDYEATLKKPKQSGVKVSAGDRQEDSAHA 473
TA ++ + L+A+ L + A + + R D++
Sbjct: 275 FSTADSAKIKTLEAEKAALEAEKADLEHQ-SQVLNANRQSLR----------RDLDASRE 323

Query: 474 ALLTLQAELRTLEKHAGANEKISQQ-RRDL-------WKAESQFAVLEEAAQRRQLSAQE 525
A L+AE + LE+ +E Q RRDL + E++ LEE + + S Q
Sbjct: 324 AKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQS 383

Query: 526 KS--LLAHKDETLEYKRQLAALGDKVTYQERLNALAQQADKFAQQQRAKRAAIDAKSRGL 583
L A ++ + ++ L K+ E+LN +++ K ++++A
Sbjct: 384 LRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEESKKLTEKEKA------------ 431

Query: 584 TDRQAEREATEQRLKEQ 600
+ QA+ EA + LKE+
Sbjct: 432 -ELQAKLEAEAKALKEK 447


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1467  ENTEROVIROMP  148  8e-48  Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 148 bits (375), Expect = 8e-48
Identities = 66/200 (33%), Positives = 100/200 (50%), Gaps = 30/200 (15%)

Query: 1 MRKLCAVILSAVVWLVAAGTPASAAEHQSTLSAGYLQTHTDMPGSDNLNGINVKYRYEFT 60
M+K+ + A V AGT +A ST++ GY Q+ + + G N+KYRYE
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAA---TSTVTGGYAQSDAQGQMNK-MGGFNLKYRYEED 56

Query: 61 DA-LGLITSFSYANAEDEQKTHYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYAMAGV 119
++ LG+I SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV
Sbjct: 57 NSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 AYSRVSTFSGDYLRVTDNKGKTHDVLTGSDDDRHSNTSLAWGAGVQFNPTESVTIDIAYE 179
Y + T T+ HD S+ ++GAG+QFNP E+V +D +YE
Sbjct: 109 GYGKFQT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYE 151

Query: 180 GSGSGDWRTDAFIVGIGYRF 199
S +I G+GYRF
Sbjct: 152 QSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1474     LUXSPROTEIN   31     0.002    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 31.4 bits (71), Expect = 0.002
Identities = 18/66 (27%), Positives = 30/66 (45%), Gaps = 7/66 (10%)

Query: 40 TKEHLLPHFL-EHLGNNHLDI------GVGTGFYLTHVPESSLISLMDLNEASLNAASTR 92
T EHL F+ HL + ++I G TGFY++ + S + D A++
Sbjct: 54 TLEHLYAGFMRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKV 113

Query: 93 AGESKI 98
++KI
Sbjct: 114 ENQNKI 119


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1477     PF05272       30     0.017    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.017
Identities = 10/36 (27%), Positives = 19/36 (52%), Gaps = 1/36 (2%)

Query: 46 LTLLGPSGCGKTTVLRLIAGLE-TVDSGRIMLDNED 80
+ L G G GK+T++ + GL+ D+ + +D
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKD 634


21  c1514  c1626  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c1514     -1      22       -3.198175   Hypothetical protein
c1515     0       21       -3.331097   Hypothetical protein
c1516     0       18       -1.975933   Hypothetical protein
c1517     0       18       -2.689952   Isocitrate dehydrogenase (NADP)
c1518     1       24       -3.508112   Hypothetical protein
c1519     0       21       -2.695576   Prophage lambda integrase
c1520     1       19       -0.608509   Unknown protein of IS629 encoded within
c1521     2       23       -1.028468   Putative Transposase for IS629
c1522     0       25       -3.049168   Hypothetical protein
c1523     1       27       -3.029616   Unknown protein encoded by bacteriophage
c1524     3       25       -1.525825   Hypothetical protein
c1525     6       27       0.175975    Hypothetical protein
c1526     6       26       -0.049135   Hypothetical protein
c1527     6       27       -0.042551   Hypothetical protein
c1528     6       28       0.317834    Hypothetical protein
c1529     4       27       0.822365    Hypothetical protein
c1530     2       27       0.261948    Hypothetical protein
c1531     0       24       -0.816484   Hypothetical protein
c1532     0       25       -0.387473   Unknown protein encoded within prophage
c1533     2       26       -0.726947   Hypothetical protein
c1534     2       25       -1.451862   Putative exonuclease encoded by prophage
c1535     3       27       -2.707983   Putative conserved protein
c1536     6       27       -4.889173   Putative recombination protein Bet of prophage
c1537     7       34       -9.101217   Putative host-nuclease inhibitor protein Gam of
c1538     6       39       -10.868205  Hypothetical protein
c1539     5       42       -10.918913  Hypothetical protein
c1540     7       41       -11.167951  Lambda Regulatory protein CIII
c1541     8       39       -10.397660  Putative single-stranded DNA binding protein of
c1542     7       39       -9.808251   Lambda ant-restriction protein
c1543     6       37       -9.371283   Putative superinfection exclusion protein B of
c1544     6       33       -7.759424   Hypothetical protein
c1545     4       26       -4.991840   Hypothetical protein
c1546     2       27       -3.092058   Repressor protein
c1547     0       23       -2.354835   Hypothetical protein
c1548     1       22       -0.302970   Putative Regulatory protein CII of
c1549     1       22       0.047160    Putative replication protein O of bacteriophage
c1550     0       25       1.290477    Putative replication protein P of bacteriophage
c1551     0       25       -0.300932   Putative exclusion protein ren of prophage
c1552     1       24       0.048721    Unknown protein of IS629 encoded within
c1553     1       26       -0.474908   Putative Transposase for IS629
c1554     2       30       -3.260309   Unknown protein encoded within prophage
c1555     1       25       -5.665637   Putative DNA N-6-adenine-methyltransferase of
c1556     1       26       -6.841800   Hypothetical protein
c1557     1       27       -5.881405   Crossover junction endodeoxyribonuclease rusA
c1558     2       27       -5.864443   Hypothetical protein
c1559     2       28       -6.531880   Antitermination protein Q homolog from lambdoid
c1560     2       27       -6.103779   Outer membrane porin protein nmpC precursor
c1561     1       28       -4.731674   Lysis protein S homolog from lambdoid prophage
c1562     1       27       -4.712137   Probable lysozyme from lambdoid prophage DLP12
c1563     1       26       -3.982311   Putative Rz endopeptidase from lambdoid prophage
c1564     2       20       -1.711594   Bor protein homolog from lambdoid prophage DLP12
c1565     1       21       -0.367051   Partial tonB-like membrane protein encoded
c1566     1       21       1.737697    Hypothetical protein
c1567     1       21       3.693973    Hypothetical protein
c1568     0       21       3.934974    Prophage Qin DNA packaging protein NU1 homolog
c1569     0       22       4.557159    Putative DNA packaging protein of prophage;
c1570     1       23       4.872477    Putative DNA packaging protein of prophage
c1571     1       24       4.691789    Putative capsid protein of prophage
c1572     3       22       4.717937    Putative capsid assembly protein of prophage
c1573     2       23       2.706039    Hypothetical protein
c1574     1       24       2.389127    Putative capsid protein of prophage
c1575     2       24       1.762680    Putative capsid protein of prophage
c1576     2       25       3.505864    Hypothetical protein
c1577     2       27       3.590252    Putative head-tail joining protein of prophage
c1578     1       28       3.549621    Putative tail fiber component Z of prophage
c1579     3       27       5.284814    Putative tail component of prophage
c1580     3       27       5.466533    Tail protein
c1581     3       26       5.435647    Hypothetical protein
c1583     3       26       5.793068    Putative tail component of prophage
c1584     3       25       6.193459    Putative tail component of prophage
c1585     3       25       5.531094    Putative tail component of prophage
c1586     2       23       3.485683    Hypothetical protein
c1587     1       24       3.436262    Putative tail component of prophage
c1588     1       24       2.781550    Putative tail component of prophage
c1589     1       23       1.645091    Putative tail component of prophage
c1590     1       24       1.067489    Putative tail component of prophage
c1591     1       31       -2.503893   Hypothetical protein
c1592     1       24       -2.463920   Hypothetical protein
c1593     2       16       -2.166539   Hypothetical protein
c1594     1       16       -1.023277   Hypothetical protein
c1595     0       17       -0.868951   Hypothetical protein
c1596     0       19       -3.772700   Hypothetical protein
c1597     0       19       -4.208808   SitD protein
c1598     -1      22       -5.739284   SitC protein
c1599     -1      27       -6.659534   SitB protein
c1600     0       31       -8.094380   SitA protein
c1601     0       33       -9.522213   Hypothetical protein
c1602     0       32       -7.482927   Hypothetical protein
c1603     0       33       -8.092095   Hypothetical protein ycgX
c1604     1       34       -8.030782   Hypothetical transcriptional regulator ycgE
c1605     0       32       -9.040854   Hypothetical protein
c1606     1       33       -9.372391   Hypothetical protein ycgF
c1607     0       27       -8.284457   Hypothetical protein ycgZ
c1608     -1      23       -8.022150   Hypothetical protein
c1609     -1      25       -6.440916   Hypothetical protein ymgB
c1610     -1      23       -6.248089   Conserved hypothetical protein
c1611     0       26       -4.925547   Hypothetical protein
c1615     1       27       -4.553208   Putative conserved protein
c1616     1       27       -5.945226   Hypothetical protein ymgD precursor
c1617     0       24       -4.087515   Conserved hypothetical protein
c1618     0       21       -3.567828   Hypothetical protein
c1619     0       21       -3.205397   Putative conserved protein
c1620     0       24       -4.457804   Hypothetical protein
c1621     -1      22       -4.829511   Cell division topological specificity factor
c1622     0       21       -4.862335   Septum site-determining protein minD
c1623     -2      23       -4.336371   Septum site-determining protein minC
c1624     -3      18       -5.387839   Hypothetical protein
c1625     -2      19       -5.699338   Hypothetical protein ycgJ precursor
c1626     -2      18       -4.737561   Protein ycgK precursor
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1515     CHANNELTSX    27     0.001    Nucleoside-specific channel-forming protein Tsx signa...
		>CHANNELTSX#Nucleoside-specific channel-forming protein Tsx signature.

Length = 294

Score = 27.3 bits (60), Expect = 0.001
Identities = 10/24 (41%), Positives = 15/24 (62%)

Query: 17 HGIFRDYELPHYSLITRIFHDGKQ 40
H + +Y HYS++ R FH+G Q
Sbjct: 240 HILALNYAHWHYSIVARYFHNGGQ 263


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1529     HTHFIS        28     0.034    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 28.3 bits (63), Expect = 0.034
Identities = 8/39 (20%), Positives = 18/39 (46%), Gaps = 1/39 (2%)

Query: 84 NGAQFRQLCETTDWVDAGE-NVLLFGASGLGKSHLAAAI 121
A +++ + + +++ G SG GK +A A+
Sbjct: 142 RSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1541     UREASE        29     0.007    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 28.6 bits (64), Expect = 0.007
Identities = 18/66 (27%), Positives = 26/66 (39%), Gaps = 7/66 (10%)

Query: 57 IMLAQHALLIAISSDLNAYGVVCEFDWN----DGNGQEGWPPMDGSEGIRITD---IDTS 109
+ LA L I + D +G +F DG GQ G+ IT+ +D
Sbjct: 22 VRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTREGGAVDTVITNALILDHW 81

Query: 110 GIFDSD 115
GI +D
Sbjct: 82 GIVKAD 87


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1546     FIMBRIALPAPF  27     0.030    Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.

Length = 167

Score = 26.6 bits (58), Expect = 0.030
Identities = 11/42 (26%), Positives = 22/42 (52%)

Query: 13 VQAGMFSPELRTFTKGDAERLVSTTKKASDSAFWLEVEGNSM 54
V G +PE ++G+ + +S + + W++V GN+M
Sbjct: 44 VDFGNINPEHVDNSRGEVTKNISISCPYKSGSLWIKVTGNTM 85


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1550     FLGMOTORFLIG  27     0.043    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 27.5 bits (61), Expect = 0.043
Identities = 17/77 (22%), Positives = 27/77 (35%), Gaps = 11/77 (14%)

Query: 2 KNIAAQMVNFDREQM-----------RRIANNMPEQYDEKPQVQQVAQIINGVFSQLLAT 50
N+A ++ DR +++A+ E Y V V +IIN +
Sbjct: 165 TNVARRIALMDRTSPEVVREVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKF 224

Query: 51 FPASLANRDQNELNEIR 67
SL D EI+
Sbjct: 225 IIESLEEEDPELAEEIK 241


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1560     ECOLIPORIN    509    0.0      E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 509 bits (1312), Expect = 0.0
Identities = 241/388 (62%), Positives = 280/388 (72%), Gaps = 33/388 (8%)

Query: 21 MKKLTVAISAVAASVLMAMSAQAAEIYNKDSNKLDLYGKVNAKHYFSSNDADDGDTTYAR 80
MK+ +A+ V ++L A +A AAEIYNKD NKLDLYGKV+ HYFS + + DGD TY R
Sbjct: 1 MKRKVLAL--VIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMR 58

Query: 81 LGFKGETQINDQLTGFGQWEYEFKGNRAESQGSSKDKTRLAFAGLKFGDYGSIDYGRNYG 140
+GFKGETQINDQLTG+GQWEY + N E +G++ TRLAFAGLKFGDYGS DYGRNYG
Sbjct: 59 VGFKGETQINDQLTGYGQWEYNVQANTTEGEGANS-WTRLAFAGLKFGDYGSFDYGRNYG 117

Query: 141 VAYDIGAWTDVLPEFGGDTWTQTDVFMTGRTTGVATYRNNDFFGLVDGLNFAAQYQGKND 200
V YD+ WTD+LPEFGGD++T D +MTGR GVATYRN DFFGLVDGLNFA QYQGKN+
Sbjct: 118 VLYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNE 177

Query: 201 R----------------TDVTEANGDGFGFSTTYEY-EGFGVGATYAKSDRTDGQVAYGK 243
D+ NGDGFG STTY+ GF GA Y SDRT+ QV G
Sbjct: 178 SQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGG 237

Query: 244 SKFNASGKNAEVWAAGLKYDANNIYLATTYSETQNMTVFG------NNHIANKAQNFEAV 297
+ A G A+ W AGLKYDANNIYLAT YSET+NMT +G + +ANK QNFE
Sbjct: 238 T--IAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVT 295

Query: 298 AQYQFDFGLRPSVAYLQSKGKDLGVH----GDRDLVKYVDVGATYYFNKNMSTFVDYKIN 353
AQYQFDFGLRP+V++L SKGKDL + D+DLVKY DVGATYYFNKN ST+VDYKIN
Sbjct: 296 AQYQFDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKIN 355

Query: 354 LID-DSKFTKTAGIDTDDIVAVGLVYQF 380
L+D D F K AGI TDDIVA+G+VYQF
Sbjct: 356 LLDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1564     PF06291       189    2e-66    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 189 bits (482), Expect = 2e-66
Identities = 102/102 (100%), Positives = 102/102 (100%)

Query: 12 MQDNKMKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHHFFVSGIGQKKTVDAA 71
MQDNKMKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHHFFVSGIGQKKTVDAA
Sbjct: 1 MQDNKMKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHHFFVSGIGQKKTVDAA 60

Query: 72 KICGGAENVVKTETQQTFVNGLLGFITLGIYTPLEARVYCSQ 113
KICGGAENVVKTETQQTFVNGLLGFITLGIYTPLEARVYCSQ
Sbjct: 61 KICGGAENVVKTETQQTFVNGLLGFITLGIYTPLEARVYCSQ 102


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1565     TONBPROTEIN   69     2e-17    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 68.9 bits (168), Expect = 2e-17
Identities = 33/82 (40%), Positives = 46/82 (56%)

Query: 41 ADEPRQLVTVYPRYPEYAAANYIKGLVEVKFDIGADGTVTRIVFLRSEPHNLFRDEVVKA 100
A PR L P+YP A A I+G V+VKFD+ DG V + L ++P N+F EV A
Sbjct: 150 ASGPRALSRNQPQYPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREVKNA 209

Query: 101 MAKWRFEKNRPCQGVKRQFIFT 122
M +WR+E +P G+ +F
Sbjct: 210 MRRWRYEPGKPGSGIVVNILFK 231


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1569     2FE2SRDCTASE  31     0.010    Ferric iron reductase signature.
		>2FE2SRDCTASE#Ferric iron reductase signature.

Length = 262

Score = 31.2 bits (70), Expect = 0.010
Identities = 10/41 (24%), Positives = 21/41 (51%), Gaps = 1/41 (2%)

Query: 310 TRDGLMFFSARGDEIPPPRSITFHIWTAYSPFTTWVQIVYD 350
R+ L+ F R DE P ++T W++ + ++ + + D
Sbjct: 36 HREHLLEF-IRLDEPAPLNAMTLAQWSSPNVLSSLLAVYSD 75


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1581     LIPPROTEIN48  31     0.002    Mycoplasma P48 major surface lipoprotein signature.
		>LIPPROTEIN48#Mycoplasma P48 major surface lipoprotein signature.

Length = 428

Score = 30.7 bits (69), Expect = 0.002
Identities = 12/52 (23%), Positives = 24/52 (46%), Gaps = 1/52 (1%)

Query: 2 TVISATAASSP-LPDTTGMLTLPAATPFTVMVIPLTDTVAFVLSADTARKLL 52
TVI+ +S+P + L A P T + L + +V+ D+ + ++
Sbjct: 250 TVINNVLSSTPADVKYNPHVILSVAGPATFETVRLANKGQYVIGVDSDQGMI 301


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1585     GPOSANCHOR    41     2e-05    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 41.2 bits (96), Expect = 2e-05
Identities = 56/377 (14%), Positives = 125/377 (33%), Gaps = 36/377 (9%)

Query: 236 SGLTAMARQFHNVTAEQIAYVAQLQRSGDESGALQAANEAATKGFDDQTRRLKENMGTLE 295
S R+ +E+ + + +L+ + + + + L+ L
Sbjct: 95 SNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALA 154

Query: 296 TWADRTARAFKSMWDAVLDI-GRPDTAQEMLIKAEAAFKKADDIWNLRKDDYFVNDEARA 354
+A + + + T + EA + + + +
Sbjct: 155 ARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIK 214

Query: 355 RYWDDREKARLALEAARK-KAEQQSQQDKNAQQQSDTEASRLKYTEEAQKAYERLQTPLE 413
++ K + ++ + EA + + + L+ +
Sbjct: 215 TLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMN 274

Query: 414 KYTARQEELNKALKDGKILQADYNTLMAAAKKDYEATLKKPKQSGVKVSAGDRQEDSAHA 473
TA ++ + L+A+ L + A + + R D++
Sbjct: 275 FSTADSAKIKTLEAEKAALEAEKADLEHQ-SQVLNANRQSLR----------RDLDASRE 323

Query: 474 ALLTLQAELRTLEKHAGANEKISQQ-RRDL-------WKAESQFAVLEEAAQRRQLSAQE 525
A L+AE + LE+ +E Q RRDL + E++ LEE + + S Q
Sbjct: 324 AKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQS 383

Query: 526 KS--LLAHKDETLEYKRQLAALGDKVTYQERLNALAQQADKFAQQQRAKRAAIDAKSRGL 583
L A ++ + ++ L K+ E+LN +++ K ++++A
Sbjct: 384 LRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEESKKLTEKEKA------------ 431

Query: 584 TDRQAEREATEQRLKEQ 600
+ QA+ EA + LKE+
Sbjct: 432 -ELQAKLEAEAKALKEK 447


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1589     PF06291       28     0.012    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 27.7 bits (61), Expect = 0.012
Identities = 13/40 (32%), Positives = 19/40 (47%), Gaps = 5/40 (12%)

Query: 122 MTGILFSLGASMVLGGVAQML-----APKARTPRTQTTDN 156
M +LFS +M++ G AQ P A TP+ T +
Sbjct: 6 MKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHH 45


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1591     IGASERPTASE   41     1e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.8 bits (95), Expect = 1e-05
Identities = 27/132 (20%), Positives = 56/132 (42%), Gaps = 15/132 (11%)

Query: 123 SQSAAAAKKSETAAASSRNA--AKTSETNAGNSAKAAASSKTAAQNAATAAERSETNARA 180
S + A+ E A ++T+ET A NS + + + + Q+A + N
Sbjct: 1012 SNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQ---NREV 1068

Query: 181 SEEASADSEEASRRN--AESAAENAGVATTKAREAAADATKAGQKKDEALSAATRAEKAA 238
++EA ++ + ++ N A+S +E +E TK ++ A EK
Sbjct: 1069 AKEAKSNVKANTQTNEVAQSGSE--------TKETQTTETKETATVEKEEKAKVETEKTQ 1120

Query: 239 DRAEVAAEVTAE 250
+ +V ++V+ +
Sbjct: 1121 EVPKVTSQVSPK 1132


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1600     adhesinb      297    e-103    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 297 bits (762), Expect = e-103
Identities = 85/262 (32%), Positives = 147/262 (56%), Gaps = 7/262 (2%)

Query: 9 MLLGGLALTCSIAFQASATEKFKVITTFTIIADMAKNVAGDAAEVSSITKPGAEIHEYQP 68
+G A + + + + K V+ T +IIAD+ KN+AGD + SI G + HEY+P
Sbjct: 13 AFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKNIAGDKINLHSIVPVGQDPHEYEP 72

Query: 69 TPGDIKRAQGAQLILANGMNLEL----WFQRFYQHLNGVPE---VIVSSGVTPVGITEGP 121
P D+K+ A LI NG+NLE WF + ++ VS GV + +
Sbjct: 73 LPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENAKKKENKDYYAVSEGVDVIYLEGQS 132

Query: 122 YEGKPNPHAWMSPDNALIYVDNIRDALIKYDPANAQTYQRNADTYKAKITQTLAPLRKQI 181
+GK +PHAW++ +N +IY NI L + DPAN +TY++N Y K++ +++
Sbjct: 133 EKGKEDPHAWLNLENGIIYAQNIAKRLSEKDPANKETYEKNLKAYVEKLSALDKEAKEKF 192

Query: 182 TELPENQRWMVTSEGAFSYLARDLGLKELYLWPINADQQGTPQQVRKVVDIVKKNHIPAV 241
+P ++ +VTSEG F Y ++ + Y+W IN +++GTP Q++ +V+ ++K +P++
Sbjct: 193 NNIPGEKKMIVTSEGCFKYFSKAYNVPSAYIWEINTEEEGTPDQIKTLVEKLRKTKVPSL 252

Query: 242 FSESTISDKPARQVARETGAHY 263
F ES++ D+P + V+++T
Sbjct: 253 FVESSVDDRPMKTVSKDTNIPI 274


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1619     PRTACTNFAMLY  43     5e-08    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 42.7 bits (100), Expect = 5e-08
Identities = 26/101 (25%), Positives = 45/101 (44%), Gaps = 1/101 (0%)

Query: 8 TRSIYRELGATLSYNMRLGNGMEIEPWLKAAVRKEFVDDNRVKVNSDGNFVNDLSGRRGI 67
S+ LG + + L G +++P++KA+V +EF V N + +L G R
Sbjct: 811 GSSVLGRLGLEVGKRIELAGGRQVQPYIKASVLQEFDGAGTVHTNGIAH-RTELRGTRAE 869

Query: 68 YQAGIKASFSSTLSGHLGVGYSRGAGVESPWNAVVGVNWSF 108
G+ A+ S + YS+G + PW G +S+
Sbjct: 870 LGLGMAAALGRGHSLYASYEYSKGPKLAMPWTFHAGYRYSW 910


22  c1714  c1725  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c1714     -1      21       -3.455258   Hypothetical protein
c1715     -1      20       -3.738015   Putative potassium channel protein
c1716     0       16       -1.259201   Protein yciI
c1717     1       16       -1.154322   TonB protein
c1718     0       19       -3.339174   Hypothetical protein
c1719     -1      21       -4.804072   Putative acyl-CoA thioester hydrolase yciA
c1720     -2      15       -3.234684   Probable intracellular septation protein
c1721     -2      11       -1.332760   Hypothetical protein yciC
c1722     -1      12       0.555011    Outer membrane protein W precursor
c1723     0       13       1.390477    Protein yciE
c1724     0       14       2.403607    Protein yciF
c1725     0       15       3.318101    Tryptophan synthase alpha chain
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1716     adhesinmafb   32     5e-04    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 32.0 bits (72), Expect = 5e-04
Identities = 16/57 (28%), Positives = 20/57 (35%), Gaps = 2/57 (3%)

Query: 73 GPMPAVDSNDPGAAGFTGSTVIAEFESLEAAQAWADADPYVAAGVYEHVSVKPFKKV 129
P+PA G GS E + EA W +P A V +V KV
Sbjct: 268 APLPA--EGKFAVIGGLGSVAGFEKNTREAVDRWIQENPNAAETVEAVFNVAAAAKV 322


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1717     TONBPROTEIN   259    1e-89    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 259 bits (663), Expect = 1e-89
Identities = 234/239 (97%), Positives = 236/239 (98%), Gaps = 1/239 (0%)

Query: 18 MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVAPADLEPPQA 77
MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMV PADLEPPQA
Sbjct: 1 MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADLEPPQA 60

Query: 78 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQ-PKRDVKPVESR 136
VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKV++ PKRDVKPVESR
Sbjct: 61 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 120

Query: 137 PASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF 196
PASPFENTAPAR TSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF
Sbjct: 121 PASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF 180

Query: 197 DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ 255
DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ
Sbjct: 181 DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ 239


23  c1772  c1777  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c1772     3       16       -1.697630   Hypothetical protein ymjA
c1773     3       15       -1.040283   Psp operon transcriptional activator
c1774     3       15       -1.657075   Phage shock protein A
c1775     -1      21       -4.741421   Phage shock protein B
c1776     0       21       -5.453294   Hypothetical protein
c1777     -1      16       -3.755659   Phage shock protein C
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1773     HTHFIS        342    e-118    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 342 bits (880), Expect = e-118
Identities = 125/341 (36%), Positives = 182/341 (53%), Gaps = 23/341 (6%)

Query: 11 DNLLGEANSFLEVLEQVSHLAPLDKPVLIIGERGTGKELIASRLHYLSSRWQGPFISLNC 70
L+G + + E+ ++ L D ++I GE GTGKEL+A LH R GPF+++N
Sbjct: 137 MPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINM 196

Query: 71 AALNENLLDSELFGHEAGAFTGAQKRHPGRFERADGGTLFFDELATAPMMVQEKLLRVIE 130
AA+ +L++SELFGHE GAFTGAQ R GRFE+A+GGTLF DE+ PM Q +LLRV++
Sbjct: 197 AAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQ 256

Query: 131 YGELERVGGSQPLQVNVRLVCATNADLPAMVNEGTFRADLLDRLAFDVVQLPPLRERESD 190
GE VGG P++ +VR+V ATN DL +N+G FR DL RL ++LPPLR+R D
Sbjct: 257 QGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAED 316

Query: 191 IMLMAEHFAIQMCREIKLPLFPGFTEHARETLLNYRWPGNIRELKNVVERSVYRHGTSDY 250
I + HF Q +E F + A E + + WPGN+REL+N+V R +
Sbjct: 317 IPDLVRHFVQQAEKEGLDVK--RFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVI 374

Query: 251 PLDDIIID---PFKRRPSEEAIAVSENTSLPTLPLD------------------LREFQM 289
+ I + P E+A A S + S+ +
Sbjct: 375 TREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLA 434

Query: 290 QQEKELLQLSLQQGKYNQKRAAELLGLTYHQFRALLKKHQI 330
+ E L+ +L + NQ +AA+LLGL + R +++ +
Sbjct: 435 EMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1775     MPTASEINHBTR  25     0.030    Metalloprotease inhibitor signature.
		>MPTASEINHBTR#Metalloprotease inhibitor signature.

Length = 122

Score = 24.6 bits (53), Expect = 0.030
Identities = 7/43 (16%), Positives = 17/43 (39%)

Query: 30 SGRSELSQSEQQRLAQLADEAKRMRERIQALESILDAEHPNWR 72
+G+ + + A A++A + + E L + +W
Sbjct: 37 AGQLGIEATGSGVCAGPAEQANALAGDVACAEQWLGDKPVSWS 79


24  c1797  c1817  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c1797     -1      18       -3.641783   Hypothetical protein ycjG
c1800     -1      19       -4.973976   Conserved hypothetical protein
c1801     -1      16       -3.849677   Hypothetical protein ycjY
c1802     -1      14       -3.733800   Hypothetical transcriptional regulator ycjZ
c1803     -1      12       -3.080308   Periplasmic murein peptide-binding protein
c1804     -2      16       -3.503575   Hypothetical protein ynaI
c1805     -2      16       -2.644649   Hypothetical protein ynaJ
c1806     -2      17       -2.891155   Protein ydaA
c1807     0       24       -4.544470   Fumarate and nitrate reduction Regulatory
c1808     2       32       -5.857097   Methylated-DNA--protein-cysteine
c1809     1       28       -5.477485   Hypothetical protein
c1810     -2      26       -4.835944   Hypothetical protein
c1811     -1      23       -3.400614   Hypothetical protein
c1812     1       20       -1.989537   Hypothetical protein
c1813     0       16       -1.388651   Hypothetical protein
c1814     0       16       -1.095611   Hypothetical protein ydaL
c1815     -1      16       -2.366809   Hypothetical protein ydaM
c1816     -1      16       -2.024954   Hypothetical protein ydaN
c1817     -2      16       -3.149290   ATP-independent RNA helicase dbpA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1811     RTXTOXIND     55     2e-12    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 55.2 bits (133), Expect = 2e-12
Identities = 18/89 (20%), Positives = 37/89 (41%), Gaps = 16/89 (17%)

Query: 12 VVAIGILLAGVVFFIW-WVSK--------GRFIQTTDDAYIGGNITTVASKVSGYISAIE 62
+VA I+ V+ FI + + G+ G + + + I
Sbjct: 59 LVAYFIMGFLVIAFILSVLGQVEIVATANGKLT-------HSGRSKEIKPIENSIVKEII 111

Query: 63 VRDNQSVKKGDIILRLDDRDYRANVARLE 91
V++ +SV+KGD++L+L A+ + +
Sbjct: 112 VKEGESVRKGDVLLKLTALGAEADTLKTQ 140


25  c1879  c1892  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c1879     4       29       0.917774    Hypothetical protein
c1880     4       25       -2.611795   Putative conserved protein
c1881     5       24       -3.318476   Hypothetical protein
c1882     5       25       -2.703966   Hypothetical protein
c1883     5       32       -0.482641   Conserved hypothetical protein
c1884     5       27       -4.056901   Hypothetical protein
c1885     5       30       -4.914452   Hypothetical protein
c1886     5       33       -4.635159   Hypothetical protein
c1887     4       36       -5.835919   Hypothetical protein
c1888     5       38       -6.725914   Conserved hypothetical protein
c1889     4       44       -15.687376  Hypothetical protein
c1890     3       37       -11.562680  Hypothetical protein
c1891     1       29       -7.119185   Hypothetical protein
c1892     0       22       -3.539636   Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1892     PF07299       28     0.010    Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 28.3 bits (63), Expect = 0.010
Identities = 16/51 (31%), Positives = 23/51 (45%), Gaps = 13/51 (25%)

Query: 61 LNDMYAFIPGDNYYFIKS------SGYKFVND-------KWFTLKSINNIF 98
+ M AFI D Y FIKS +G+ ND K ++ I ++F
Sbjct: 4 VIKMEAFIRSDQYNFIKSQAYILANGHATANDRGVIQALKSLAIEKIIHVF 54


26  c1913  c1936  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c1913     0       18       -3.153357   30S ribosomal protein S22
c1914     -1      14       -1.522627   Bdm protein
c1915     -1      11       -1.699335   Hypothetical protein
c1916     -2      11       -0.708056   Osmotically inducible protein C
c1918     -3      12       -1.941519   Putative sensor kinase
c1919     -2      12       -2.953403   Putative conserved protein
c1920     -2      11       -3.664834   Hypothetical lipoprotein yddW precursor
c1921     -2      13       -5.149753   Amino acid antiporter
c1922     -2      15       -5.838446   Glutamate decarboxylase beta
c1923     -2      19       -7.251967   Probable zinc protease pqqL
c1924     -2      20       -8.305697   Hypothetical protein yddB
c1925     -2      23       -7.332349   Hypothetical ABC transporter ATP-binding protein
c1926     -1      26       -7.109832   Hypothetical protein ydeM
c1927     0       26       -6.227867   Putative sulfatase ydeN precursor
c1928     0       26       -5.878054   Hypothetical transcriptional regulator ydeO
c1929     2       28       -5.640063   Conserved hypothetical protein
c1930     2       28       -5.477944   Hypothetical protein ydeP
c1931     4       33       -7.067976   Hypothetical fimbrial-like protein ydeQ
c1932     2       32       -6.731306   Hypothetical fimbrial-like protein ydeR
c1933     1       30       -5.966172   Hypothetical fimbrial-like protein ydeS
c1934     1       29       -5.634470   Outer membrane usher protein fimD precursor
c1935     -1      26       -4.454018   Chaperone protein fimC precursor
c1936     -1      26       -4.277540   Type-1 fimbrial protein, A chain precursor
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1915     60KDINNERMP   23     0.050    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 23.0 bits (49), Expect = 0.050
Identities = 13/39 (33%), Positives = 21/39 (53%), Gaps = 3/39 (7%)

Query: 8 SLELYISGYLLWKFNLSAGIIFSPDFIRIILLILVLFLT 46
S+EL + + LW +LSA P +I IL+ + +F
Sbjct: 443 SVELRQAPFALWIHDLSAQ---DPYYILPILMGVTMFFI 478


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1932     FIMBRIALPAPF  32     7e-04    Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.

Length = 167

Score = 31.6 bits (71), Expect = 7e-04
Identities = 28/93 (30%), Positives = 46/93 (49%), Gaps = 7/93 (7%)

Query: 16 LFTATLQAADVTITVNGRVVAKPCTIQT-KEANVNLGDLYTRNLQQPGSASGWHNITLSL 74
L T+ ADV I + G V PCTI + V+ G++ N + ++ G +S+
Sbjct: 11 LLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNI---NPEHVDNSRGEVTKNISI 67

Query: 75 TDCPIETSAVTAIVTGSTDNTGYYKNEGTAENI 107
+ CP ++ ++ VTG+T G +N A NI
Sbjct: 68 S-CPYKSGSLWIKVTGNTMGVG--QNNVLATNI 97


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1934     PF00577       945    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 945 bits (2443), Expect = 0.0
Identities = 498/869 (57%), Positives = 652/869 (75%), Gaps = 10/869 (1%)

Query: 15 QVLILPRFARLTFALGLATAVFPVDAEYYFNPRFLSNDLAESVDLSAFTKGREAPPGTYR 74
+ + F RL A A AE YFNPRFL++D DLS F G+E PPGTYR
Sbjct: 20 KHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYR 79

Query: 75 VDIYLNDEFMASRDITFIADDNNADLIPCLSTDLLVSLGIKKSALLDNKEHSADKHVPDN 134
VDIYLN+ +MA+RD+TF D+ ++PCL+ L S+G+ +++ + ++ +
Sbjct: 80 VDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASV-------SGMNLLAD 132

Query: 135 SACTPLQDRLADASSEFDVGQQHLSLSVPQIYVGRMARGYVSPDLWEEGINAGLLNYSFN 194
AC PL + DA+++ DVGQQ L+L++PQ ++ ARGY+ P+LW+ GINAGLLNY+F+
Sbjct: 133 DACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFS 192

Query: 195 GNSINNRSNHNAGKSNYAYLNLQSGINIGSWRLRDNSTWSYNSGSSNSSDSNKWQHINTS 254
GNS N G S+YAYLNLQSG+NIG+WRLRDN+TWSYNS S+S NKWQHINT
Sbjct: 193 GNS---VQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTW 249

Query: 255 AERDIIPLRSRLTVGDSYTDGDIFDSVNFRGLKINSTEAMLPDSQHGFAPVIHGIARGTA 314
ERDIIPLRSRLT+GD YT GDIFD +NFRG ++ S + MLPDSQ GFAPVIHGIARGTA
Sbjct: 250 LERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTA 309

Query: 315 QVSVKQNGYDVYQTTVPPGPFTIDDINSAANGGDLQVTIKEADGSIQTLYVPYSSVPVLQ 374
QV++KQNGYD+Y +TVPPGPFTI+DI +A N GDLQVTIKEADGS Q VPYSSVP+LQ
Sbjct: 310 QVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQ 369

Query: 375 RAGYTRYALAMGEYRSGNNLQSTPKFVQASLMHGLKGNWTPYGGMQIAEDYQAFNLGIGK 434
R G+TRY++ GEYRSGN Q P+F Q++L+HGL WT YGG Q+A+ Y+AFN GIGK
Sbjct: 370 REGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGK 429

Query: 435 DLGLFGAFSFDITQANTTLADDTRHSGQSVKSVYSKSFYQTGTNIQVAGYRYSTQGFYNL 494
++G GA S D+TQAN+TL DD++H GQSV+ +Y+KS ++GTNIQ+ GYRYST G++N
Sbjct: 430 NMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNF 489

Query: 495 SDSAYSRMSGYTVKPPTGDTSEQTLFIDYFNLFYSKRGQEQISISQQLGNYGTTFFSASR 554
+D+ YSRM+GY ++ G + F DY+NL Y+KRG+ Q++++QQLG T + S S
Sbjct: 490 ADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSH 549

Query: 555 QSYWNTSRSDQQISFGLNVPFGDITTSLNYSYSNNIWQNDRDHLLAFTLNVPFSHWMRTD 614
Q+YW TS D+Q GLN F DI +L+YS + N WQ RD +LA +N+PFSHW+R+D
Sbjct: 550 QTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSD 609

Query: 615 SQSAFHNSNASYSMSNDLKGGMTNLSGVYGTLLPDNNLNYSVQVGNTHGGNTSSGTSGYS 674
S+S + +++ASYSMS+DL G MTNL+GVYGTLL DNNL+YSVQ G GG+ +SG++GY+
Sbjct: 610 SKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYA 669

Query: 675 SLNYRGAYGNTNVGYSRSGDSSQIYYGMSGGIIAHADGITFGQPLGDTMVLVKAPGADNV 734
+LNYRG YGN N+GYS S D Q+YYG+SGG++AHA+G+T GQPL DT+VLVKAPGA +
Sbjct: 670 TLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDA 729

Query: 735 KIENQTGIHTDWRGYAILPFATEYRENRVALNANSLADNVELDETVVTVIPTHGAIARAT 794
K+ENQTG+ TDWRGYA+LP+ATEYRENRVAL+ N+LADNV+LD V V+PT GAI RA
Sbjct: 730 KVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAE 789

Query: 795 FNVQIGGKVLMTLKYGNKSVPFGAIVTHGENKNGSIVAENGQVYLTGLPQSGKLQVSWGK 854
F ++G K+LMTL + NK +PFGA+VT +++ IVA+NGQVYL+G+P +GK+QV WG+
Sbjct: 790 FKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGE 849

Query: 855 DKNSNCIVEYKLPEVSPGTLLNQQTAICR 883
++N++C+ Y+LP S LL Q +A CR
Sbjct: 850 EENAHCVANYQLPPESQQQLLTQLSAECR 878


27  c1950  c1970  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c1950     -1      15       -3.585568   Sugar efflux transporter
c1951     0       19       -6.109148   Multiple antibiotic resistance protein marC
c1952     0       22       -6.959091   Multiple antibiotic resistance protein marR
c1953     2       25       -7.714955   Multiple antibiotic resistance protein marA
c1954     1       27       -7.920183   Multiple antibiotic resistance protein marB
c1955     1       26       -8.413444   6-phospho-beta-glucosidase bglA
c1956     0       25       -7.644566   Putative outer membrane protein yieC precursor
c1957     1       20       -5.755964   PTS system, cellobiose-specific IIA component
c1958     1       20       -5.803970   Putative conserved protein
c1959     -1      17       -3.599452   PTS system, cellobiose-specific IIB component
c1960     -1      15       -2.977038   Putative conserved protein
c1961     -2      14       -1.944460   Probable amino acid metabolite efflux pump
c1962     -2      15       -1.884377   Hypothetical protein ydeE
c1963     -2      16       -1.968432   Hypothetical protein ydeH
c1964     -1      17       -1.964932   Peptidyl-dipeptidase dcp
c1965     -2      17       -3.053508   Probable oxidoreductase ydfG
c1966     -3      17       -2.819123   Hypothetical transcriptional regulator ydfH
c1967     -2      18       -2.898985   Hypothetical protein ydfZ
c1968     -2      17       -2.914776   Hypothetical oxidoreductase ydfI
c1969     -2      13       -3.378853   Hypothetical metabolite transport protein ydfJ
c1970     -3      16       -3.562564   Starvation sensing protein rspB
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1950  TCRTETB  55  3e-10  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 54.5 bits (131), Expect = 3e-10
Identities = 41/192 (21%), Positives = 84/192 (43%), Gaps = 8/192 (4%)

Query: 36 LSDIAHSFHMQTAQVGIMLTIYAWVVALMSLPFMLMTSQVERRKLLICLFVVFIASHVLS 95
L DIA+ F+ A + T + ++ + + ++ Q+ ++LL+ ++ V+
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 96 FLSWS-FTVLVISRIGVAFAHAIFWSITASLAIRMAPAGKRAQALSLIATGTALAMVLGL 154
F+ S F++L+++R A F ++ + R P R +A LI + A+ +G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 155 PLGRIVGQYFGWRMTFFAIGIGALITLLCLIKLLPLLPSEHSGSLKSLPLLFRRPALMSI 214
+G ++ Y W I + +IT+ L+KLL + LMS+
Sbjct: 157 AIGGMIAHYIHWSY-LLLIPMITIITVPFLMKLLK------KEVRIKGHFDIKGIILMSV 209

Query: 215 YLLTVVVVTAHY 226
++ ++ T Y
Sbjct: 210 GIVFFMLFTTSY 221


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1962  TCRTETA  42  3e-06  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 41.7 bits (98), Expect = 3e-06
Identities = 42/239 (17%), Positives = 82/239 (34%), Gaps = 18/239 (7%)

Query: 7 RSTSALLASSLLLTIGRGATLPFMTIYLSRQYSLSVDLI---GYAMTIALTIGVVFSLGF 63
R +L++ L +G G +P + L R S D+ G + + + +
Sbjct: 5 RPLIVILSTVALDAVGIGLIMPVLPGLL-RDLVHSNDVTAHYGILLALYALMQFACAPVL 63

Query: 64 GILADKFDKKRYMLMAITAFASGFIAIPLVNNVTLVVLFFALINCAYSVFATVLKAWFAD 123
G L+D+F ++ +L+++ A + + + ++ + + + A A+ AD
Sbjct: 64 GALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAG-AYIAD 122

Query: 124 NLSSTSKTKIFSINYTMLNIGWTIGPPLGTLLVMQSINLPFWLAAICSAFPMLFIQIWVK 183
+ + F G GP LG L+ S + PF+ AA + L +
Sbjct: 123 ITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLP 182

Query: 184 RSEK---------IIATETGSVWSPKVLLQDKALLWFTCSGFLASFVSGAFASCISQYV 233
S K + W + A L F+ V A+ +
Sbjct: 183 ESHKGERRPLRREALNPLASFRW--ARGMTVVAALMAV--FFIMQLVGQVPAALWVIFG 237



Score = 32.5 bits (74), Expect = 0.002
Identities = 21/155 (13%), Positives = 60/155 (38%), Gaps = 2/155 (1%)

Query: 7 RSTSALLASSLLLTIGRGATLPFMTIYLSRQYSLSVDLIGYAMTIALTIGVVF-SLGFGI 65
+AL+A ++ + I+ ++ IG ++ + + ++ G
Sbjct: 210 TVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGP 269

Query: 66 LADKFDKKRYMLMAITAFASGFIAIPLVNNVTLVVLFFALINCAYSVFATVLKAWFADNL 125
+A + ++R +++ + A +G+I + + L+ + L+A + +
Sbjct: 270 VAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASG-GIGMPALQAMLSRQV 328

Query: 126 SSTSKTKIFSINYTMLNIGWTIGPPLGTLLVMQSI 160
+ ++ + ++ +GP L T + SI
Sbjct: 329 DEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASI 363


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1965  DHBDHDRGNASE  100  9e-28  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 100 bits (251), Expect = 9e-28
Identities = 70/244 (28%), Positives = 114/244 (46%), Gaps = 16/244 (6%)

Query: 7 IVLVTGATAGFGECITRRFIQQGHKVIATGRRQERLQELKDELGDNLYIAQ---LDVRNR 63
I +TGA G GE + R QG + A E+L+++ L A+ DVR+
Sbjct: 10 IAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDS 69

Query: 64 AAIEEMLASLPAEWSNIDILVNNAGLALGMEPAHKASIEDWETMIDTNNKGLVYMTRAVL 123
AAI+E+ A + E IDILVN AG+ L H S E+WE N+ G+ +R+V
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGV-LRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 124 PGMVERNHGHIINIGSTAGSWPYAGGNVYGATKAFVRQFSLNLRTDLHGTAVRVTDIEPG 183
M++R G I+ +GS P Y ++KA F+ L +L +R + PG
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 184 LVGGTEFSNVRFKGDDGKAE------KTYQNTVALT----PEDVSEAV-WWVSTLPAHVN 232
T+ + ++G + +T++ + L P D+++AV + VS H+
Sbjct: 189 ST-ETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHIT 247

Query: 233 INTL 236
++ L
Sbjct: 248 MHNL 251


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c1969  TCRTETB  49  3e-08  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 48.7 bits (116), Expect = 3e-08
Identities = 33/118 (27%), Positives = 55/118 (46%), Gaps = 16/118 (13%)

Query: 72 VGAFIFGKMGDRIGRKKVLFITITMMGICTTLIGVLPTYAQIGVFAPILLVTLRIIQGLG 131
+G ++GK+ D++G K++L I + + + V ++ + + A R IQG G
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMA-------RFIQGAG 116

Query: 132 AGAEISGAGTMLAEYAPKGKR----GIISSFVAMGTNCGTLSATAI-----WAFMFFI 180
A A + ++A Y PK R G+I S VAMG G I W+++ I
Sbjct: 117 AAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLI 174


28  c2083  c2088  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c2083  2  15  -3.331610  Hypothetical protein ydiK
c2084  0  18  -4.580839  Hypothetical protein ydiL
c2085  0  15  -3.637386  Hypothetical transport protein ydiM
c2086  -1  16  -3.914362  Hypothetical transport protein ydiN
c2087  -1  17  -3.554683  Hypothetical shikimate 5-dehydrogenase-like
c2088  -2  16  -3.117864  3-dehydroquinate dehydratase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2085  TCRTETA  29  0.027  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 29.4 bits (66), Expect = 0.027
Identities = 54/312 (17%), Positives = 105/312 (33%), Gaps = 18/312 (5%)

Query: 61 FAGLLSDRFGRRPFIMLGMCCYMAFFFGILHTNNIIIAYVFGFLAGMANSFLDAGTYPSL 120
G LSDRFGRRP +++ + + + + + Y+ +AG+ + A +
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATG-AVAGAYI 120

Query: 121 MEAFPRSPGTANI-LIKAFVSSGQFLLPLIISLLVWAELWFGWSFMIAAGIMFINALFLY 179
+ + + A G P++ L+ F AA + +N L
Sbjct: 121 ADITDGDERARHFGFMSACFGFGMVAGPVLGGLM--GGFSPHAPFFAAAALNGLNFLTGC 178

Query: 180 RCTFPPHPGRHLPV---IKKTTSSTEHRCSIIDLASYSLYGYISMATFYLVSQWLAQYGQ 236
H G P+ +S + +A+ +I + + +G+
Sbjct: 179 FLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGE 238

Query: 237 FVAGMSYTM-SIKLLSIYTVGSLLCVFITAPLIRNTVRPTTLL--MLYTFISFIALLTVC 293
T I L + + SL IT P+ L+ M+ +I L
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 294 LHPTFYVVIIFAFVIGFTSAGGVVQIGLTLMAERF--PYAKGKATGIYYSAGSIATFTIP 351
+ +++ ++GG+ L M R +G+ G + S+ + P
Sbjct: 299 RGWMAFPIMV------LLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGP 352

Query: 352 LITAHLSQRSIA 363
L+ + SI
Sbjct: 353 LLFTAIYAASIT 364


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2086  TCRTETB  31  0.008  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 31.4 bits (71), Expect = 0.008
Identities = 38/177 (21%), Positives = 75/177 (42%), Gaps = 9/177 (5%)

Query: 14 ILAVLCIYFSYFLHGISVITLAQNMTSLAEKFSTDNAGIAYLISGIGLGRLISILFFGVI 73
IL LCI F ++ + L ++ +A F+ A ++ + L I +G +
Sbjct: 15 ILIWLCIL--SFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKL 72

Query: 74 SDKFGRRAVILMAVIMY----LLFFFGIPACPNLTLAYCLAVCVGIANSALDTGGYPALM 129
SD+ G + ++L +I+ ++ F G L +A + A AL
Sbjct: 73 SDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARY- 131

Query: 130 ECFPKASGSAVILVKAMVSFGQMFYPMLVSYMLLNNIWYGYGLIIPGILFVLITLML 186
+ G A L+ ++V+ G+ P + M+ + I + Y L+IP I + + ++
Sbjct: 132 -IPKENRGKAFGLIGSIVAMGEGVGP-AIGGMIAHYIHWSYLLLIPMITIITVPFLM 186


29  c2108  c2118  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c2108  5  24  -0.917146  Hypothetical protein
c2109  7  24  -1.297576  Integration host factor alpha-subunit
c2112  4  24  -1.594412  Phenylalanyl-tRNA synthetase alpha chain
c5495  3  21  -4.246557  Phenylalanyl-tRNA synthetase operon leader
c2113  2  19  -4.816724  50S ribosomal protein L20
c2114  1  17  -5.730825  Hypothetical protein
c2115  -1  17  -5.207835  Translation initiation factor IF-3
c2116  -3  16  -3.670993  Threonyl-tRNA synthetase
c2117  -1  26  -5.164161  Putative conserved protein
c2118  -1  23  -4.023904  Putative conserved protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2109  DNABINDINGHU  119  3e-39  Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 119 bits (301), Expect = 3e-39
Identities = 34/89 (38%), Positives = 55/89 (61%)

Query: 4 TKAEMSEYLFDKLGLSKRDAKELVELFFEEIRRALENGEQVKLSGFGNFDLRDKNQRPGR 63
K ++ + + L+K+D+ V+ F + L GE+V+L GFGNF++R++ R GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 64 NPKTGEDIPITARRVVTFRPGQKLKSRVE 92
NP+TGE+I I A +V F+ G+ LK V+
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAVK 91


30  c2130  c2145  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
c2130  -1  15  -3.040614  Hypothetical protein
c2131  -1  13  -3.143414  Catalase HPII
c2132  0  18  -4.702143  Hypothetical protein ydjC
c2133  0  17  -5.111595  6-phospho-beta-glucosidase
c2134  -1  16  -4.494318  Cel operon repressor
c2135  0  18  -2.514446  PTS system, cellobiose-specific IIA component
c2136  1  16  -2.234559  PTS system, cellobiose-specific IIC component
c2137  0  16  -1.702578  PTS system, cellobiose-specific IIB component
c2138  0  14  -1.020703  Osmotically inducible lipoprotein E precursor
c2139  0  13  -0.213959  NH(3)-dependent NAD(+) synthetase
c2140  0  13  1.113945  Hypothetical protein ydjQ
c2141  0  13  3.013865  Hypothetical protein ydjR
c2142  1  14  3.484490  Hypothetical protein
c2143  0  13  3.830831  Spheroplast protein Y precursor
c2144  0  14  3.829875  Succinylglutamate desuccinylase
c2145  0  14  3.877146  Succinylarginine dihydrolase
31  c2154  c2202  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c2154  1  15  3.304398  Hypothetical protein ynjA
c2155  1  15  3.115854  Protein ynjB
c2156  1  16  2.813891  Hypothetical ABC transporter permease protein
c2157  -2  13  1.871752  Hypothetical ABC transporter ATP-binding protein
c2158  -1  15  -1.160572  Putative thiosulfate sulfurtransferase ynjE
c2159  -2  21  -3.191097  Hypothetical protein ynjF
c2160  -2  25  -5.537932  CTP pyrophosphohydrolase
c2161  -3  18  -3.911274  Hypothetical protein ynjH precursor
c2162  -2  14  -3.050795  NADP-specific glutamate dehydrogenase
c2163  -2  17  -3.458044  Hypothetical protein
c2164  -2  13  -2.096011  Chaperone protein hscC
c2165  -1  9  -0.264097  Hypothetical protein
c2166  -2  10  1.605496  DNA topoisomerase III
c2167  -2  11  1.102419  Selenide,water dikinase
c2168  -2  13  -1.134131  Protein ydjA
c2169  -3  14  -2.594903  Hypothetical protein
c2170  -3  13  -2.642008  Protease IV
c2171  -1  20  -4.312628  L-asparaginase I
c2172  -1  21  -5.169269  Pyrazinamidase/nicotinamidase
c2173  0  22  -5.792107  Hypothetical metabolite transport protein ydjE
c2174  -1  22  -5.023492  Hypothetical transcriptional regulator ydjF
c2175  0  23  -4.371500  Hypothetical oxidoreductase ydjG
c2176  0  21  -3.989384  Hypothetical sugar kinase ydjH
c2177  0  20  -3.686426  Hypothetical protein ydjI
c2178  -1  18  -3.132310  Hypothetical zinc-type alcohol
c2179  -2  18  -2.382482  Hypothetical metabolite transport protein ydjK
c2180  1  25  -2.349450  Hypothetical protein
c2181  -1  20  -2.162845  Hypothetical zinc-type alcohol
c2182  0  24  -1.944207  Hypothetical protein yeaC
c2183  1  19  -1.548664  Peptide methionine sulfoxide reductase msrB
c2184  1  17  -1.528831  Glyceraldehyde 3-phosphate dehydrogenase A
c2185  -1  10  -3.775516  Unknown protein from 2D-page
c2186  -1  11  -4.352585  Hypothetical protein yeaE
c2187  -1  12  -4.441034  MltA-interacting protein precursor
c2188  -1  14  -4.969742  Hypothetical protein yeaG
c2189  -2  18  -5.390458  Hypothetical protein yeaH
c2190  -2  22  -5.557782  Hypothetical protein yeaI
c2191  -2  20  -1.608362  Hypothetical protein yeaJ
c2192  -1  20  0.552678  Hypothetical protein yeaK
c2193  0  20  0.748895  Conserved hypothetical protein
c2194  -2  20  -0.608509  Hypothetical protein yeaL
c2195  -1  20  -1.438254  Hypothetical transcriptional regulator yeaM
c2196  -1  20  -1.673680  Hypothetical transport protein yeaN
c2197  0  21  -3.659431  Hypothetical protein yeaO
c2198  -1  19  -4.054504  Hypothetical protein yoaF
c2199  0  21  -4.394115  Hypothetical protein yeaP
c2200  -1  22  -4.516706  Hypothetical protein yeaQ
c2201  0  23  -4.158604  Hypothetical protein
c2202  2  22  -3.867990  Hypothetical protein yoaG
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2164  SHAPEPROTEIN  104  1e-26  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 104 bits (261), Expect = 1e-26
Identities = 80/373 (21%), Positives = 138/373 (36%), Gaps = 89/373 (23%)

Query: 3 IGIDLGTTNSLAAVWRNGQSELIPNALGKFLTPSVVCVDEDG------MVLTGEAARDLQ 56
+ IDLGT N+L V G ++ N PSVV + +D + G A
Sbjct: 13 LSIDLGTANTLIYVKGQG---IVLN------EPSVVAIRQDRAGSPKSVAAVGHDA---- 59

Query: 57 LIKPQNCASNFKRMMGTS-------KTLKLG--GREFRAEELSSLILRQLKEDAENYLGE 107
K+M+G + + +K G F E++ ++Q+ ++
Sbjct: 60 -----------KQMLGRTPGNIAAIRPMKDGVIADFFVTEKMLQHFIKQVHSNS---FMR 105

Query: 108 EVTEAVISVPAYFGDMQRKATKAAATMAGLNVERLINEPTAAALAYGLHNKDDEHQFLVF 167
++ VP ++R+A + +A AG LI EP AAA+ GL + +V
Sbjct: 106 PSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGS-MVV 164

Query: 168 DLGGGTFDVSILELFDNIMEVRAS-AGDNFLGGEDIVDILIDAYCSRRDLPENIEWREPT 226
D+GGGT +V+++ L + GD F E I++ + Y E T
Sbjct: 165 DIGGGTTEVAVISLNGVVYSSSVRIGGDRF--DEAIINYVRRNY--------GSLIGEAT 214

Query: 227 LQRHLRIEAERVKRVLS--VRDEATFSVEIEGRRYYWHL-------TTEKFEFL---LQT 274
AER+K + + +E+ GR + + E E L L
Sbjct: 215 --------AERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQEPLTG 266

Query: 275 FFERIHMPLER-------AIRDAKINFSQLDQVVLVGGTTRMPLIRKLVTRLFGRIPAMH 327
+ + LE+ I + + VL GG + + +L+ G +
Sbjct: 267 IVSAVMVALEQCPPELASDISERGM--------VLTGGGALLRNLDRLLMEETGIPVVVA 318

Query: 328 LNPDEVIAQGAAI 340
+P +A+G
Sbjct: 319 EDPLTCVARGGGK 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2172  ISCHRISMTASE  37  3e-05  Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 36.9 bits (85), Expect = 3e-05
Identities = 35/192 (18%), Positives = 55/192 (28%), Gaps = 58/192 (30%)

Query: 8 PPRALLLV-DLQNDFCAGGALAVPEGDSTVDVANRLIDWCQSRGEAVI-----ASQD--- 58
P RA+LL+ D+QN F +L + C G V+ SQ+
Sbjct: 28 PNRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCVQLGIPVVYTAQPGSQNPDD 87

Query: 59 -------WHPANHGSFASQHGVEPYTPGQLDGLPQTFWPDHCVQNSEGAQLHPLLKQKAI 111
W P + + + P D + T W
Sbjct: 88 RALLTDFWGPGLNSGPYEEKIITELAPEDDDLV-LTKW---------------------- 124

Query: 112 AAVFHKGENPLVDSYSAFFDNGRRQKTALDDWLRAHVINELIVMGLATDYCVKFTVLDAL 171
YSAF +T L + +R ++LI+ G+ T +A
Sbjct: 125 -------------RYSAFK------RTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAF 165

Query: 172 QLGYKVNVITDG 183
K + D
Sbjct: 166 MEDIKAFFVGDA 177


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2173  TCRTETB  39  4e-05  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 38.7 bits (90), Expect = 4e-05
Identities = 32/130 (24%), Positives = 51/130 (39%), Gaps = 3/130 (2%)

Query: 88 ALMFGYFIGSLTGGFIGDYFGRRRAFRINLLIVGIAATGAAFVPDMH--WLIFFRFLMGT 145
A M + IG+ G + D G +R ++I + FV LI RF+ G
Sbjct: 57 AFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSV-IGFVGHSFFSLLIMARFIQGA 115

Query: 146 GMGALIMVGYASFTEFIPATVRGKWSARLSFVGNWSPMLSAAIGVVVIAFFSWRIMFLLG 205
G A + +IP RGK + + + AIG ++ + W + L+
Sbjct: 116 GAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIP 175

Query: 206 GIGILLAWFL 215
I I+ FL
Sbjct: 176 MITIITVPFL 185


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2179  TCRTETB  31  0.011  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 31.0 bits (70), Expect = 0.011
Identities = 33/142 (23%), Positives = 48/142 (33%), Gaps = 23/142 (16%)

Query: 71 MFLGALVGGIIGDKTGRRNAFILYEAIHIASMVVGAFSPNMDF-LIACRFVMGVGLGALL 129
+G V G + D+ G + + I+ V+G + LI RF+ G G A
Sbjct: 62 FSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFP 121

Query: 130 VTLFAGFTEYMPGRNR----GTWSSRVSFIGNWSYPLCSLIAMGLTPLISA----EWNWR 181
+ Y+P NR G S V+ + G+ P I +W
Sbjct: 122 ALVMVVVARYIPKENRGKAFGLIGSIVA------------MGEGVGPAIGGMIAHYIHWS 169

Query: 182 VQLLIPAILSLIATALAWRYFP 203
LLIP I I T
Sbjct: 170 YLLLIPMI--TIITVPFLMKLL 189


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2185  INVEPROTEIN  29  0.021  Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 28.9 bits (64), Expect = 0.021
Identities = 18/81 (22%), Positives = 34/81 (41%), Gaps = 13/81 (16%)

Query: 165 ETTSALHTYFNVGDIAKVSVSGLGDRFIDKVNDAKED-----------VLTDGIQTFPDR 213
E ++AL + N D K S S L + F ++V + + V ++ F +
Sbjct: 57 EMSAALAQFRNRRDYEKKS-SNLSNSF-ERVLEDEALPKAKQILKLISVHGGALEDFLRQ 114

Query: 214 TDRVYLNPQDCSVINDEALNR 234
++ +P D ++ E L R
Sbjct: 115 ARSLFPDPSDLVLVLRELLRR 135


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2194  PRTACTNFAMLY  28  0.022  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 27.7 bits (61), Expect = 0.022
Identities = 18/61 (29%), Positives = 26/61 (42%)

Query: 49 QGLSIGIIILTIGVMAPIASGTLPPSTLIHSFLNWKSLVAIAVGVIVSWLGGRGVTLMGS 108
Q +I L IG + + LPPS ++ N ++ A VS LG +TL G
Sbjct: 174 QRSAIVDGGLHIGALQSLQPEDLPPSRVVLRDTNVTAVPASGAPAAVSVLGASELTLDGG 233

Query: 109 Q 109

Sbjct: 234 H 234


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2200  HTHTETR  30  6e-04  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 30.0 bits (67), Expect = 6e-04
Identities = 9/37 (24%), Positives = 17/37 (45%), Gaps = 5/37 (13%)

Query: 4 LSWIIFGLIAGILAKWIMPG-----KDGGGFFMTILL 35
+ I+ G I+G++ W+ K ++ ILL
Sbjct: 163 AAIIMRGYISGLMENWLFAPQSFDLKKEARDYVAILL 199


32  c2344  c2510  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c2344  -2  20  -4.279807  Hypothetical protein yedE
c2345  2  34  -7.556630  Hypothetical protein yedF
c2346  2  35  -7.868458  Hypothetical protein yedK
c2347  7  40  -9.579669  Hypothetical protein
c2348  5  36  -8.114841  Outer membrane porin protein nmpC precursor
c2349  -2  22  -3.544039  Hypothetical transcriptional regulator ybcM
c2350  -2  12  0.473813  Protein ybcL precursor
c2351  1  15  2.444025  Hypothetical protein
c2352  1  15  3.163863  EmrE protein
c2353  1  18  4.915408  Flagellar hook-basal body complex protein fliE
c2354  2  18  4.772370  Flagellar M-ring protein
c2355  2  16  4.847908  Flagellar motor switch protein fliG
c2356  0  14  4.563023  Hypothetical protein
c2357  -2  16  3.960546  Flagellar assembly protein fliH
c2358  -2  18  3.708785  Flagellum-specific ATP synthase
c2359  -2  16  2.642357  Flagellar fliJ protein
c2360  -2  16  2.790023  Flagellar hook-length control protein
c2361  -2  22  2.331857  Flagellar fliL protein
c2362  0  18  0.968669  Flagellar motor switch protein fliM
c2363  3  17  0.593876  Flagellar motor switch protein fliN
c2364  2  17  -2.549982  Flagellar protein fliO
c2365  1  17  -3.284859  Flagellar biosynthetic protein fliP precursor
c2366  1  19  -4.785175  Flagellar biosynthetic protein fliQ
c2367  2  19  -5.081266  Flagellar biosynthetic protein fliR
c2368  0  20  -3.499105  Hypothetical protein
c2369  -2  16  -2.116680  Colanic acid capsular biosynthesis activation
c2370  -2  16  -0.230633  DsrB protein
c2371  -2  16  -0.169855  Hypothetical protein
c2372  -2  14  0.397503  Hypothetical protein yodD
c2373  -2  15  1.028491  Hypothetical protein yedP
c2374  -1  18  1.571276  Hypothetical protein yedQ
c2375  1  18  1.622810  Hypothetical protein
c2376  2  17  1.966469  Hypothetical protein yodC
c2377  2  17  1.690038  Hypothetical protein yedI
c2378  -1  11  -0.658797  Hypothetical transport protein yedA
c2379  -2  14  -1.748697  Very short patch repair protein
c2380  -2  14  -2.669061  DNA-cytosine methyltransferase
c2381  -3  22  -6.295698  Hypothetical protein yedJ
c2382  -1  29  -7.909471  Hypothetical protein yedR
c2383  -1  30  -8.477498  Outer membrane protein N precursor
c2384  -2  27  -6.477051  Hypothetical protein
c2385  -1  26  -6.033135  Protein yedU
c2386  0  30  -7.413496  Putative sensor-like histidine kinase yedV
c2387  0  24  -4.905275  Probable transcriptional Regulatory protein
c2388  0  32  -8.997726  Transthyretin-like protein precursor
c2389  0  34  -8.615790  Hypothetical protein yedY
c2390  1  39  -9.923311  Hypothetical protein yedZ
c2391  2  43  -10.952856  Hypothetical protein yodA
c2392  3  43  -10.179700  Putative P4-family integrase
c2393  3  46  -12.846510  Hypothetical protein
c2394  4  43  -11.134574  PilV-like protein
c2395  4  46  -14.267928  Putative type IV pilin protein precursor
c2396  6  44  -13.939562  Hypothetical protein
c2397  5  43  -13.910933  Hypothetical protein
c2398  7  46  -16.558484  Hypothetical protein
c2399  7  45  -15.666610  Hypothetical protein
c2400  5  47  -16.075696  Hypothetical protein
c2401  6  40  -12.101934  Hypothetical protein
c2402  6  44  -14.294961  Hypothetical protein
c2403  7  44  -14.844821  Hypothetical protein
c2404  8  48  -15.911527  Hypothetical protein
c2405  7  47  -15.782590  Hypothetical protein
c2406  9  41  -13.681260  Hypothetical protein
c2407  8  48  -14.945182  Hypothetical protein
c2408  8  46  -14.229600  Hypothetical protein
c2409  6  44  -13.366586  Hypothetical protein
c2410  4  37  -11.606764  Hypothetical protein
c2411  6  30  -9.573352  DNA-binding protein H-NS
c2412  5  32  -9.222830  Hypothetical protein
c2413  2  23  -6.667811  Hypothetical protein
c2414  0  20  -3.311728  Hypothetical protein
c2415  -1  18  0.106079  Hypothetical protein
c2416  0  16  2.590020  Hypothetical protein
c2417  0  18  4.442554  *Hypothetical protein yeeI
c2418  1  21  5.548804  *Prophage P4 integrase
c2419  -1  21  7.050550  Putative anthranilate synthase
c2420  0  22  7.001052  Putative cytoplasmic transmembrane protein
c2421  -1  21  6.632457  Putative ABC transporter protein
c2422  -1  22  7.055453  Putative inner membrane ABC-transporter
c2423  -1  24  7.881866  Putative AraC type regulator
c2424  -1  24  8.336335  Putative peptide synthetase
c2425  -1  24  8.307166  Hypothetical protein
c2426  -1  24  8.702465  Putative peptide synthetase
c2427  0  24  9.105493  Putative peptide/polyketide synthetase protein
c2428  0  24  8.914633  Hypothetical protein
c2429  1  25  8.456482  Hypothetical protein
c2430  0  24  7.810483  Hypothetical protein
c2431  -2  25  6.289495  Hypothetical protein
c2432  -3  21  4.066066  Putative thioesterase
c2433  -3  22  3.597816  Enterobactin synthetase component E
c2434  0  20  1.009359  Putative salicyl-AMP ligase
c2435  0  19  0.173549  Hypothetical protein
c2436  0  20  -1.244973  Putative pesticin receptor precursor
c2437  1  30  -6.197999  Hypothetical protein
c2438  -2  27  -4.372258  Hypothetical protein
c2439  -2  25  -3.957230  Hypothetical protein
c2440  -3  30  -5.664592  Hypothetical protein
c2441  -2  30  -5.234805  Hypothetical protein
c2442  -3  26  -4.431699  Hypothetical protein
c2443  -2  25  -3.288442  Shikimate transporter
c2444  -2  23  -3.153357  AMP nucleosidase
c2445  -1  26  -4.102984  Hypothetical protein yeeN
c2446  -1  24  -3.306640  *Nitrogen assimilation Regulatory protein nac
c2447  -1  23  -2.923043  Transcriptional regulator cbl
c2448  -1  20  -1.807164  *Hypothetical protein yeeO
c2449  -1  18  -0.724090  *Prophage P4 integrase
c2450  0  20  0.179976  Hypothetical protein
c2451  1  20  0.801233  Putative thioesterase
c2452  1  21  1.739095  Hypothetical protein
c2453  2  20  3.289282  Putative polyketide synthase
c2454  2  18  4.057282  Hypothetical protein
c2455  2  17  4.264108  Putative peptide synthetase
c2456  2  17  4.498286  Hypothetical protein
c2457  2  17  4.712281  Putative amidase
c2458  2  16  4.689145  Putative peptide synthetase
c2459  2  16  4.333810  Putative peptide synthetase
c2460  2  15  3.781787  Putative polyketide synthase
c2461  2  16  2.597563  Hypothetical protein
c2462  1  23  -1.482059  Hypothetical protein
c2463  0  19  0.675606  Putative transacylase
c2464  1  20  1.811571  Putative acyl-coa dehydrogenase
c2465  2  19  1.226304  Hypothetical protein
c2466  1  19  0.452208  Hypothetical protein
c2467  1  19  0.576908  Putative 3-hydroxyacyl-CoA dehydrogenase
c2468  1  18  0.937352  Putative polyketide synthase
c2469  1  19  0.186325  Putative polyketide synthase
c2470  0  21  -2.831206  Putative peptide/polyketide synthase
c2471  2  30  -8.224589  Hypothetical protein
c2472  -1  27  -3.744519  Transposase
c2473  0  25  -2.686361  Transposase
c2474  1  28  -2.314577  Transposase
c2475  1  29  -2.301013  Hypothetical protein
c2476  2  28  -1.223362  Protein erfK/srfK precursor
c2477  1  25  -3.164777  Nicotinate-nucleotide--dimethylbenzimidazole
c2478  0  24  -4.151437  Cobalamin [5'-phosphate] synthase
c2479  4  26  -4.743284  Cobalamin biosynthesis protein cobU
c2480  6  26  -4.499762  Hypothetical protein
c2481  6  28  -6.066324  Hypothetical protein
c2482  5  27  -5.904070  Putative outer membrane receptor for iron
c2483  4  25  -5.294522  Hypothetical protein
c2484  2  26  -6.304166  Hypothetical protein ybdM
c2485  2  29  -7.758437  Hypothetical protein ybdN
c2486  1  33  -8.571524  Hypothetical protein
c2489  0  29  -6.555881  Putative transferase
c2490  1  30  -6.480403  Hypothetical protein yaiO
c2492  2  32  -6.023286  Putative carbohydrate kinase
c2493  3  26  -4.192278  Hypothetical protein
c2494  2  24  -2.092876  Hypothetical protein
c2495  4  21  -0.035744  Putative phosphotriesterase-related protein
c2496  6  25  1.478321  Hypothetical protein
c2497  6  26  1.274488  Transposase
c2498  4  26  3.746825  Hypothetical protein
c2499  4  29  4.247337  Hypothetical protein
c2500  2  27  4.478880  Hypothetical protein
c2501  4  30  3.627933  Hypothetical protein
c2502  4  28  3.647581  Hypothetical protein
c2503  4  23  3.890456  Transposase
c2504  5  25  1.665450  Hypothetical protein
c2505  5  25  0.637226  Hypothetical protein
c2506  6  25  0.716755  Hypothetical protein
c2508  6  25  0.693046  Hypothetical protein
c2509  5  21  0.100044  Insertion sequence ATP-binding protein
c2510  3  21  0.609563  Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2344  RTXTOXIND  30  0.018  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.018
Identities = 10/57 (17%), Positives = 17/57 (29%), Gaps = 2/57 (3%)

Query: 175 RFTLLPIFRIPVKMQKVSAASPLTQKPDQARRRF--RLGMLVFFGMLGWALLTAMNQ 229
R L R + + + A L + P R R M ++L +
Sbjct: 26 RKQLDTPVREKDENEFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEI 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2345  PF01206  93  6e-29  SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 92.5 bits (230), Expect = 6e-29
Identities = 16/71 (22%), Positives = 37/71 (52%)

Query: 7 DYRLDMVGEPCPYPAVATLEAMPQLKKGEILEVVSDCPQSINNIPLDARNHGYTVLDIQQ 66
D LD G CP P + + + + GE+L V++ P S+ + ++ G+ +L+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 67 DGPTIRYLIQK 77
+ T + +++
Sbjct: 65 EDGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2348  ECOLIPORIN  495  e-179  E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 495 bits (1277), Expect = e-179
Identities = 234/370 (63%), Positives = 271/370 (73%), Gaps = 31/370 (8%)

Query: 1 MSAQAAEIYNKDSNKLDLYGKVNAKHYFSSNDADDGDTTYVRLGFKGETQINDQLTGFGQ 60
+A AAEIYNKD NKLDLYGKV+ HYFS + + DGD TY+R+GFKGETQINDQLTG+GQ
Sbjct: 17 GAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVGFKGETQINDQLTGYGQ 76

Query: 61 WEYEFKGNRAESQGSSKDKTRLAFAGLKFGDYGSIDYGRNYGVAYDIGAWTDVLPEFGGD 120
WEY + N E +G++ TRLAFAGLKFGDYGS DYGRNYGV YD+ WTD+LPEFGGD
Sbjct: 77 WEYNVQANTTEGEGANS-WTRLAFAGLKFGDYGSFDYGRNYGVLYDVEGWTDMLPEFGGD 135

Query: 121 TWTQTDVFMTGRTTGVATYRNNDFFGLVDGLNFAAQYQGKNDR----------------T 164
++T D +MTGR GVATYRN DFFGLVDGLNFA QYQGKN+
Sbjct: 136 SYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQSADDVNIGTNNRNNGD 195

Query: 165 DVTEANGDGFGFSTTYEY-EGFGVGATYAKSDRTNDQVIYGNNSLNASGQNAEVWAAGLK 223
D+ NGDGFG STTY+ GF GA Y SDRTN+QV G A G A+ W AGLK
Sbjct: 196 DIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGGT--IAGGDKADAWTAGLK 253

Query: 224 YDANNIYLATTYSETQNMTVFG------NNHIANKAQNFEVVAQYQFDFGLRPSVAYLQS 277
YDANNIYLAT YSET+NMT +G + +ANK QNFEV AQYQFDFGLRP+V++L S
Sbjct: 254 YDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQYQFDFGLRPAVSFLMS 313

Query: 278 KGKDLG----AWGDQDLVEYIDVGATYYFNKNMSTFVDYKINLIDKSD-FTKASGVATDD 332
KGKDL D+DLV+Y DVGATYYFNKN ST+VDYKINL+D D F K +G++TDD
Sbjct: 314 KGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLLDDDDPFYKDAGISTDD 373

Query: 333 IVAVGLVYQF 342
IVA+G+VYQF
Sbjct: 374 IVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2353  FLGHOOKFLIE  120  6e-39  Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 120 bits (301), Expect = 6e-39
Identities = 102/103 (99%), Positives = 102/103 (99%)

Query: 12 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTVARTQAEKFTL 71
SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQT ARTQAEKFTL
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 60

Query: 72 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 114
GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV
Sbjct: 61 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2354  FLGMRINGFLIF  751  0.0  Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 751 bits (1940), Expect = 0.0
Identities = 476/555 (85%), Positives = 512/555 (92%), Gaps = 5/555 (0%)

Query: 3 ATAAQTKSLEWLNRLRANPKIPLIVAGSAAVAVMVALILWAKAPDYRTLFSNLSDQDGGA 62
+TA Q K LEWLNRLRANP+IPLIVAGSAAVA++VA++LWAK PDYRTLFSNLSDQDGGA
Sbjct: 5 STATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGA 64

Query: 63 IVSQLTQMNIPYRFSEASGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 122
IV+QLTQMNIPYRF+ SGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ
Sbjct: 65 IVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 124

Query: 123 FSEQVNYQRALEGELSRTIETIGPVKGARVHLAMPKPSLFVREQKSPSASVTVNLLPGRA 182
FSEQVNYQRALEGEL+RTIET+GPVK ARVHLAMPKPSLFVREQKSPSASVTV L PGRA
Sbjct: 125 FSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRA 184

Query: 183 LDEGQISAIVHLVSSAVAGLPPGNVTLVDQGGHLLTQSNTSGRDLNDAQLKYASDVEGRI 242
LDEGQISA+VHLVSSAVAGLPPGNVTLVDQ GHLLTQSNTSGRDLNDAQLK+A+DVE RI
Sbjct: 185 LDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRI 244

Query: 243 QRRIEAILSPIVGNGNIHAQVTAQLDFASKEQTEEQYRPNGDESHAALRSRQLNESEQSG 302
QRRIEAILSPIVGNGN+HAQVTAQLDFA+KEQTEE Y PNGD S A LRSRQLN SEQ G
Sbjct: 245 QRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVG 304

Query: 303 SGYPGGVPGALSNQPAPANNAPISTPPANQNNRQQ--QASTTSNS---GPRSTQRNETSN 357
+GYPGGVPGALSNQPAP N API+TPP NQ N Q Q ST++NS GPRSTQRNETSN
Sbjct: 305 AGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSN 364

Query: 358 YEVDRTIRHTKMNVGDVQRLSVAVVVNYKTLPDGKPLPLSNEQMKQIEALTREAMGFSEK 417
YEVDRTIRHTKMNVGD++RLSVAVVVNYKTL DGKPLPL+ +QMKQIE LTREAMGFS+K
Sbjct: 365 YEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDK 424

Query: 418 RGDSLNVVNSPFNSSDESGGALPFWQQQAFIDQLLAAGRWLLVLLVAWLLWRKAVRPQLT 477
RGD+LNVVNSPF++ D +GG LPFWQQQ+FIDQLLAAGRWLLVL+VAW+LWRKAVRPQLT
Sbjct: 425 RGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLT 484

Query: 478 RRAEAVKTVQQQAQAREEVEDAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 537
RR E K Q+QAQ R+E E+AVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR
Sbjct: 485 RRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 544

Query: 538 VVALVIRQWINNDHE 552
VVALVIRQW++NDHE
Sbjct: 545 VVALVIRQWMSNDHE 559


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2355  FLGMOTORFLIG  341  e-119  Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 341 bits (876), Expect = e-119
Identities = 117/329 (35%), Positives = 197/329 (59%), Gaps = 2/329 (0%)

Query: 1 MSNLTGTDKSVILLMTIGEDRAAEVFKHLSQREVQTLSAAMANVTQISNKQLTDVLAEFE 60
+S LTG K+ ILL++IG + +++VFK+LSQ E+++L+ +A + I+++ +VL EF+
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 61 QEAEQFAALNINANDYLRSVLVKALGEERAASLLEDILETRDTASGIETLNFMEPQSAAD 120
+ + DY R +L K+LG ++A ++ + L + + E + +P + +
Sbjct: 72 ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPANILN 130

Query: 121 LIRDEHPQIIATILVHLKRAQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLN 180
I+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 131 FIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLE 190

Query: 181 GLLDGQ-NLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENL 239
L + + GGV EIIN+ + E+ +I ++ E D ELA++I +MF+FE++
Sbjct: 191 KKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDI 250

Query: 240 VDVDDRSIQRLLQEVDSESLLIALKGAEQPLREKFLRNMSQRAADILRDDLANRGPVRLS 299
V +DDRSIQR+L+E+D + L ALK + P++EK +NMS+RAA +L++D+ GP R
Sbjct: 251 VLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRK 310

Query: 300 QVENEQKAILLIVRRLAETGEMVIGSGED 328
VE Q+ I+ ++R+L E GE+VI G +
Sbjct: 311 DVEESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2357     FLGFLIH       371    e-134    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 371 bits (954), Expect = e-134
Identities = 221/228 (96%), Positives = 224/228 (98%)

Query: 8 MSDNLPWKTWTPDDLAPPPAEFVPMAESEETIIEEVEPSLEQQLAQLQMQAHEQGYQAGI 67
MSDNLPWKTWTPDDLAPP AEFVP+ E EETIIEE EPSLEQQLAQLQMQAHEQGYQAGI
Sbjct: 1 MSDNLPWKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60

Query: 68 AEGRQQGHEQGYQEGLAQGLEQGLAEAKAQQAPIHARMQQLVSEFQTTLDALDSVIASRL 127
AEGRQQGH+QGYQEGLAQGLEQGLAEAK+QQAPIHARMQQLVSEFQTTLDALDSVIASRL
Sbjct: 61 AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120

Query: 128 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 187
MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT
Sbjct: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180

Query: 188 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 235
LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV
Sbjct: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2359     FLGFLIJ       202    2e-70    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 202 bits (515), Expect = 2e-70
Identities = 146/147 (99%), Positives = 147/147 (100%)

Query: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60
MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 MTSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120
+TSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147
AALLAENRLDQKKMDEFAQRAAMRKPE
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2360     FLGHOOKFLIK   469    e-168    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 469 bits (1207), Expect = e-168
Identities = 365/375 (97%), Positives = 369/375 (98%)

Query: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTAK 60
MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPT K
Sbjct: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60

Query: 61 GEPLVSDILSDAQQADLLIPVDETPPVINDEQSTLTPLTTAQTMTLAAVADKNTTKDEKA 120
GEPL+SDI+SDAQQA+LLIPVDETPPVINDEQST TPLTTAQTM LAAVADKNTTKDEKA
Sbjct: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120

Query: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPAEKPTLFTKLTSAQLTTAQPDDAP 180
DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLP EKPTLFTKLTS QLTTAQPDDAP
Sbjct: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP 180

Query: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTADASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240
GTPAQPLTPLVAEAQSKAEVISTPSPVTA ASPLITPHQTQPLPTVAAPVLSAPLGSHEW
Sbjct: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240

Query: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMISPHQHVRAALEAA 300
QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQM+SPHQHVRAALEAA
Sbjct: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300

Query: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360
LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS
Sbjct: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360

Query: 361 LQGRVTGNSGVDIFA 375
LQGRVTGNSGVDIFA
Sbjct: 361 LQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2362     FLGMOTORFLIM  385    e-136    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 385 bits (989), Expect = e-136
Identities = 85/324 (26%), Positives = 148/324 (45%), Gaps = 10/324 (3%)

Query: 20 ILSQAEIDALLNGDS--EVKDEPTASVSGESDIRPYDPNTQRRVVRERLQALEIINERFA 77
+LSQ EID LL S + E +S I YD + +E+++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 78 RHFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSPSL 137
R L LR + V ++ Y EF R++P P+ L +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 138 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 197
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 198 EMQVKFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 255
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 256 NEDQNWRDNLVRQVQHSQLELVANFADISLRLSQILKLKPGDVLPIEKP---DRIIAHVD 312
+ + L ++ +++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 313 GVPVLTSQYGTLNGQYALRIEHLI 336
Q G + + A +I I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2363     FLGMOTORFLIN  209    2e-73    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 209 bits (534), Expect = 2e-73
Identities = 124/137 (90%), Positives = 132/137 (96%)

Query: 1 MSDMNNPADDNNGAMDDLWAEALSEQKSTSGKSATDAVFQQFGGGDVSGTLQDIDLIMDI 60
MSDMNNP+D+N GA+DDLWA+AL+EQK+T+ KSA DAVFQQ GGGDVSG +QDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2365     FLGBIOSNFLIP  333    e-119    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 333 bits (856), Expect = e-119
Identities = 244/245 (99%), Positives = 244/245 (99%)

Query: 1 MRRLFSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60
MRRL SVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60

Query: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120
MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120

Query: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180
ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK
Sbjct: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180

Query: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240
TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA
Sbjct: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240

Query: 241 QSFYS 245
QSFYS
Sbjct: 241 QSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2366     TYPE3IMQPROT  67     1e-18    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 67.1 bits (164), Expect = 1e-18
Identities = 22/78 (28%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALVTGLIISILQAATQINEMTLSFIPKIIAVFIAII 63
+ ++ G +A+ + L L+ +VA + GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2367     TYPE3IMRPROT  202    6e-67    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 202 bits (516), Expect = 6e-67
Identities = 256/261 (98%), Positives = 259/261 (99%)

Query: 1 MMQVTSDQWLSWLSLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60
M+QVTS+QWLSWL+LYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPGSHL 120
NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDP SHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGSEPLNSNAFLALTKAGSLIF 180
NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIG EPLNSNAFLALTKAGSLIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240
LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEIFNLLADIISELPLI 261
EHLFSEIFNLLADIISELPLI
Sbjct: 241 EHLFSEIFNLLADIISELPLI 261


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2380     PF05272       29     0.045    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.045
Identities = 20/62 (32%), Positives = 29/62 (46%), Gaps = 15/62 (24%)

Query: 320 AKYILTPVLWKYLYRYAKKHQARGNGFGYGMVYPNNPQSVTRTLSARYYKDGAEILIDRG 379
A+Y + PVLW Y+ R+ K + G+ VY +R +DG+E RG
Sbjct: 166 ARYQVGPVLWGYVVRFIK---SDGDKLTLPYVY------------SRSQRDGSEAWKWRG 210

Query: 380 WD 381
WD
Sbjct: 211 WD 212


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2381     CARBMTKINASE  35     2e-04    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 34.8 bits (80), Expect = 2e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 9/92 (9%)

Query: 46 AQKLAADDDVDMLVILTACYFHDIVSLAKNHPQRQRSSILAAEETRRLLREEFVQFPA-- 103
+KLA + + D+ +ILT + +L + Q + EE R+ E F A
Sbjct: 219 GEKLAEEVNADIFMILTDV---NGAALYYGTEKEQWLREVKVEELRKYYEEG--HFKAGS 273

Query: 104 --EKIEAVCHAIAAHSFSAQIAPLTTEAKIVQ 133
K+ A I A IA L + ++
Sbjct: 274 MGPKVLAAIRFIEWGGERAIIAHLEKAVEALE 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2383     ECOLIPORIN    418    e-148    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 418 bits (1075), Expect = e-148
Identities = 199/388 (51%), Positives = 249/388 (64%), Gaps = 41/388 (10%)

Query: 24 MKRKVLAMLVPALLVAGAANAAEIYNKNGNKVELYGKMVGERILTDRENGEKGDNSQDTS 83
MKRKVLA+++PALL AGAA+AAEIYNK+GNK++LYGK+ G +D ++ + GD +
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSD-DSSKDGD----QT 55

Query: 84 YARVGVKGETQINPELTGYGQFELDLEASNRHNPDQ---TRLAYAGLSYKDFGSFDYGRN 140
Y RVG KGETQIN +LTGYGQ+E +++A+ TRLA+AGL + D+GSFDYGRN
Sbjct: 56 YMRVGFKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRN 115

Query: 141 VGVAYDAEAFTDMFVEWGGDSWAGTDLFMTNRTNGVATYRNTDFFGMVEGLNFALQYQGK 200
GV YD E +TDM E+GGDS+ D +MT R NGVATYRNTDFFG+V+GLNFALQYQGK
Sbjct: 116 YGVLYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGK 175

Query: 201 NEGTGNY----------------KANGDGHGLSATYTID-GFSFAGAYANSDRTDWQSGD 243
NE NGDG G+S TY I GFS AY SDRT+ Q
Sbjct: 176 NESQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNA 235

Query: 244 GK----GERAEVWALSTKYDANNVYAAVMYGESHNM-------NSDDGDVVNKTQNFEAV 292
G G++A+ W KYDANN+Y A MY E+ NM DG V NKTQNFE
Sbjct: 236 GGTIAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVT 295

Query: 293 LQYQFDFGLRPSIGYSYSKALDVA----GYKDSDRLNYIEIGTWYYFNKNMNVYTAYQIN 348
QYQFDFGLRP++ + SK D+ D D + Y ++G YYFNKN + Y Y+IN
Sbjct: 296 AQYQFDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKIN 355

Query: 349 LLDKSD-YVLAHGLNTDDQLAVGIVYQF 375
LLD D + G++TDD +A+G+VYQF
Sbjct: 356 LLDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2386     PF06580       39     4e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.7 bits (90), Expect = 4e-05
Identities = 38/195 (19%), Positives = 74/195 (37%), Gaps = 34/195 (17%)

Query: 261 TLSQIRSIAEYQKTIAGN-IEELENISRLTENILFLARADKNNVLVKLDALSLNKEVENL 319
L+ IR++ T A + L + R + L ++ V SL E+ +
Sbjct: 178 ALNNIRALILEDPTKAREMLTSLSELMRYS-----LRYSNARQV-------SLADELTVV 225

Query: 320 LDYL--EYLSDEKEIRFKVECNQQIFADKI---LLQRMLSNLIVNAIRYSPEKSRIHITS 374
YL + E ++F+ + N I ++ L+Q ++ N I + I P+ +I +
Sbjct: 226 DSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKG 285

Query: 375 FLDANGSLNIDIASPGTKINEPEKLFRRFWRGDNSRHSVGQGLGLSLVKA-IAELHGGSA 433
D NG++ +++ + G+ + K G GL V+ + L+G A
Sbjct: 286 TKD-NGTVTLEVENTGSLALKNTKE--------------STGTGLQNVRERLQMLYGTEA 330

Query: 434 TYHYLSKHNVFRITL 448
K +
Sbjct: 331 QIKLSEKQGKVNAMV 345


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2387     HTHFIS        84     9e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.5 bits (209), Expect = 9e-21
Identities = 30/117 (25%), Positives = 60/117 (51%), Gaps = 1/117 (0%)

Query: 39 KILLIEDNQRTQEWVTQGLSEAGYVIDAVSDGRDGLYLALKDDYALIILDIMLPGMDGWQ 98
IL+ +D+ + + Q LS AGY + S+ D L++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 99 ILQTLRTA-KQTPVICLTARDSVDDRVRGLDSGANDYLVKPFSFSELLARVRAQLRQ 154
+L ++ A PV+ ++A+++ ++ + GA DYL KPF +EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2394     BCTERIALGSPH  33     0.002    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 32.6 bits (74), Expect = 0.002
Identities = 19/86 (22%), Positives = 38/86 (44%), Gaps = 8/86 (9%)

Query: 2 IKKKGFTLLEVTIVL---GIGTLIAFMKFQDMRNDQEAVLADNVGTQIKQLGE--AVNRY 56
++++GFTLLE+ ++L G+ + + F R+D A Q++ + +
Sbjct: 1 MRQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQ 60

Query: 57 ---ISIRYDKISTLSSSNNQSSDPGP 79
+S+ D+ L +DP P
Sbjct: 61 FFGVSVHPDRWQFLVLEARDGADPAP 86


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2395     PilS_PF08805  73     8e-19    PilS N terminal
		>PilS_PF08805#PilS N terminal

Length = 185

Score = 73.4 bits (180), Expect = 8e-19
Identities = 47/179 (26%), Positives = 84/179 (46%), Gaps = 17/179 (9%)

Query: 7 KRKSKKGFSLLELLLVLGIIAALVVAAFIVYPKVQASQRAQAESNNIATIQAGVKALYTS 66
K++ KG +L+E+LLV+G+I L +A+ +Y VQ++ ++ E NN+ T+ A +K+L
Sbjct: 21 KKEQDKGATLMEVLLVVGVIVVLAASAYKLYSMVQSNIQSSNEQNNVLTVIANMKSLKFQ 80

Query: 67 AS-SFTGLTNTVAVQAKIFPDNMLSGSGTAAKPINAFKGNVTLAATATGPSSATGSSFTI 125
+ + T+ + P +M+ T A N + G+VT+ S+ SF +
Sbjct: 81 GRYTDSNYIKTL-YAQGLLPSDMI-ADTTGASAKNPWGGSVTITT------SSDKYSFNV 132

Query: 126 TYDNVPAAECVKIATAAAGNFYITTVGTKVVKAAGGTLDVAATAAACTNATSNTLVFTS 184
NVP C+ + A + + T +AA + SNTL F++
Sbjct: 133 VEANVPQKNCMAMVNA--------LRSSSAISKINNTSTSTVSAATVCASDSNTLTFST 183


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2401     PF06291       27     0.025    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 26.9 bits (59), Expect = 0.025
Identities = 9/23 (39%), Positives = 15/23 (65%)

Query: 13 DKQMKKILLVAGAALALAGCGEK 35
D +MKK+L A A+ + GC ++
Sbjct: 3 DNKMKKMLFSAALAMLITGCAQQ 25


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2412     TACYTOLYSIN   30     0.002    Bacterial thiol-activated pore-forming cytolysin sig...
		>TACYTOLYSIN#Bacterial thiol-activated pore-forming cytolysin signature.

Length = 574

Score = 30.3 bits (68), Expect = 0.002
Identities = 13/53 (24%), Positives = 22/53 (41%), Gaps = 12/53 (22%)

Query: 77 WFFTWKD------TGIQ-PGTAFVSSVVAGICFGVLMAAYHWWRKVVN--NLP 120
W W T I + ++A C G+ A+ WWRKV++ ++
Sbjct: 502 WDNNWYSKTSPFSTVIPLGANSRNIRIMARECTGL---AWEWWRKVIDERDVK 551


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2413     SHAPEPROTEIN  25     0.034    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 25.1 bits (55), Expect = 0.034
Identities = 8/25 (32%), Positives = 14/25 (56%)

Query: 35 SVVNERREEYYQEIGEKKAHKLKMK 59
+++N R Y IGE A ++K +
Sbjct: 197 AIINYVRRNYGSLIGEATAERIKHE 221


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2424     ISCHRISMTASE  52     1e-08    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 51.6 bits (123), Expect = 1e-08
Identities = 22/70 (31%), Positives = 44/70 (62%)

Query: 28 QQLRERLIQELNLTPQQLHEESNLIQAGLDSIRLMRWLHWFRKNGYRLTLRELYAAPTLA 87
+ +R+++ + L TP+ + ++ +L+ GLDS+R+M + +R+ G +T EL PT+
Sbjct: 233 ENIRKQIAELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIE 292

Query: 88 AWNQLMLSRS 97
W +L+ +RS
Sbjct: 293 EWQKLLTTRS 302


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2437     INTIMIN       75     2e-17    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 75.5 bits (185), Expect = 2e-17
Identities = 20/60 (33%), Positives = 29/60 (48%), Gaps = 3/60 (5%)

Query: 181 QQIASTSQLIGSLLAEDMNSEQAANIARGWASSQASGVMTDWLSRFGTARITLGVDEDFS 240
QQ AS + S +N + A + A G A +QAS + WL +GTA + L +F
Sbjct: 168 QQAASLGSQLQS---RSLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFD 224


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2439     INTIMIN       55     3e-10    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 55.5 bits (133), Expect = 3e-10
Identities = 62/263 (23%), Positives = 91/263 (34%), Gaps = 20/263 (7%)

Query: 175 IAVKAHVNDQFGNPVTHQPATFSAAPSSQMIISQNTVSTNTQGVAEVTMTPERNGSYTVK 234
I A V G + P +F+ S ++S N+ +TN G A VT+ ++ G V
Sbjct: 578 ITYTATVKKN-GVAQANVPVSFNIV-SGTAVLSANSANTNGSGKATVTLKSDKPGQVVVS 635

Query: 235 ASLANGASLEKQLEAI---DEKLTLTSSPLIGVNAPKGATLTATLT---SANGTPVEGQV 288
A A S I K ++T A T T PV Q
Sbjct: 636 AKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQE 695

Query: 289 INFSVTLEGATLSGGKVRTNSSGQAPVVLTSNKVGTYTVTASFHNGVTIQTQTTVKVTGN 348
+ F+ TL LS +T+++G A V LTS G V+A + V+
Sbjct: 696 VTFTTTL--GKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTT 753

Query: 349 PSTAHVASFIADPSTIAATNSDLSTLKATVEDGSGNL-IEGPTVYFALKSGSTTLTSLTA 407
+ I T ++ G NL G + +S + + S
Sbjct: 754 LTID------DGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIAS--- 804

Query: 408 VTDQNGIATTSVKGEITGSVTVS 430
V +G T KG T SV S
Sbjct: 805 VDASSGQVTLKEKGTTTISVISS 827



Score = 52.8 bits (126), Expect = 2e-09
Identities = 46/170 (27%), Positives = 65/170 (38%), Gaps = 7/170 (4%)

Query: 271 TLTATLTSANGTPVEGQVINFSVTLEGATLSGGKVRTNSSGQAPVVLTSNKVGTYTVTAS 330
T TAT+ NG ++F++ A LS TN SG+A V L S+K G V+A
Sbjct: 579 TYTATVKK-NGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAK 637

Query: 331 FHNGV-TIQTQTTVKVTGNPSTAHVASFIADPSTIAATNSDLSTLKATVEDGSGNLIEGP 389
+ + V + A + AD +T A D T V G +
Sbjct: 638 TAEMTSALNANAVIFVDQ--TKASITEIKADKTTAVANGQDAITYTVKVMKG-DKPVSNQ 694

Query: 390 TVYFALKSGSTTLTSLTAVTDQNGIATTSVKGEITGSVTVSAVTSAGGMQ 439
V F G + + T TD NG A ++ G VSA S +
Sbjct: 695 EVTFTTTLGKLSNS--TEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVD 742



Score = 51.2 bits (122), Expect = 7e-09
Identities = 51/233 (21%), Positives = 89/233 (38%), Gaps = 16/233 (6%)

Query: 13 AVTDADGKAKVTLKGTKAGAHTVTASMVGGKS--EQLVVNFTADTLTAQVNLNVTEDNFI 70
A T+ GKA VTLK K G V+A S V F T + + + +
Sbjct: 612 ANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAV 671

Query: 71 ANNIGMTRLQATVTDGNGNPVEGIKVNFRGTSVTLSSTSVETDDQVFAEILVTSTEVGLK 130
AN V PV +V F T LS+++ +TD +A++ +TST G
Sbjct: 672 ANGQDAITYTVKVMK-GDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKS 730

Query: 131 TVSASLADKPTEVISRLLN----AKVDVNSATI----TSQEIPEGQVMVAQDIAVKAHVN 182
VSA ++D +V + + +D + I ++P + Q + N
Sbjct: 731 LVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGN 790

Query: 183 DQFGNPVTHQPATFSAAPSSQMIISQNTVSTNTQGVAEVTMTPERNGSYTVKA 235
++ + A S Q+ + + +T + V + + +YT+
Sbjct: 791 GKYTWRSANPAIASVDASSGQVTLKEKGTTTIS-----VISSDNQTATYTIAT 838



Score = 40.1 bits (93), Expect = 2e-05
Identities = 35/213 (16%), Positives = 63/213 (29%), Gaps = 18/213 (8%)

Query: 4 NFTLSDGDKAVTDADGKAKVTLKGTKAGAHTVTASMVGGKSE--QLVVNFTADTLTAQVN 61
TD +G AKVTL T G V+A + + V F N
Sbjct: 701 TLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGN 760

Query: 62 LNVTEDNFIANNIGMTRLQATVTDGNGN-PVEGIKVNFRGTSVTLSSTSVETDDQVFAEI 120
+ + + + + G N G + S + SV+
Sbjct: 761 IEI-----VGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQ---- 811

Query: 121 LVTSTEVGLKTVSASLADKPTEVISRLLNAKVDVNSATITSQEIPEGQVMVAQDIAVKAH 180
VT E G T+S +D T + + + ++ + V ++
Sbjct: 812 -VTLKEKGTTTISVISSDNQT--ATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGG--- 865

Query: 181 VNDQFGNPVTHQPATFSAAPSSQMIISQNTVST 213
N + + + AA + S T+ +
Sbjct: 866 KLPSSQNELENVFKAWGAANKYEYYKSSQTIIS 898


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2440     INTIMIN       28     0.022    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 27.7 bits (61), Expect = 0.022
Identities = 22/129 (17%), Positives = 46/129 (35%), Gaps = 6/129 (4%)

Query: 11 KISAIDYSQNINGDYKATVTGGGEGIATLIPVLNGVHQAGLSTTIEFISAETRPMTGTVS 70
K+S + NG K T+T G + + ++ V + +EF G +
Sbjct: 704 KLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFF-TTLTIDDGNIE 762

Query: 71 VNSANLPTASFPSQGFTGAYYQLNNDNFAPGKTAADYSFSSSASWVGVDATGKVTFKNDG 130
+ + P+ L + G + ++ A ++G+VT K G
Sbjct: 763 IVGTGV-KGKLPTVWLQYGQVNL---KASGGNGKYTWRSANPAIASVDASSGQVTLKEKG 818

Query: 131 DSNTVIITA 139
+ T+ + +
Sbjct: 819 -TTTISVIS 826


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2443     TCRTETB       33     0.002    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 33.3 bits (76), Expect = 0.002
Identities = 39/259 (15%), Positives = 96/259 (37%), Gaps = 18/259 (6%)

Query: 79 LGGVIFGHFGDRLGRKRMLMLTVWMMGIATALIGILPSFSTIGWWAPILLVTLRAIQGFA 138
+G ++G D+LG KR+L+ + + + + + SF ++ I+ ++ A
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSL----LIMARFIQGAGAAA 119

Query: 139 VGGEWGGAALLSVESAPKNKK-AFYSSGVQVGYGVGLLLSTGLVSLISMMTTDEQFLSWG 197
+ + K S V +G GVG + + I
Sbjct: 120 FPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI------------H 167

Query: 198 WRIPFLFSIVLVLGALWVRNGMEESAEFEQQQHNQAAAKKRIPVIEALLRHPGAFLKIIA 257
W L ++ ++ ++ +++ + + + ++ +L + +
Sbjct: 168 WSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLI 227

Query: 258 LRLCELLTMYIVTAFALNYSTQNMGLPRELFLNIGLLVGGLSCLTIPCFAWLADRFGRRR 317
+ + L +++ + + GL + + IG+L GG+ T+ F + +
Sbjct: 228 VSVLSFL-IFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDV 286

Query: 318 VYITGALIGTLSAFPFFMA 336
++ A IG++ FP M+
Sbjct: 287 HQLSTAEIGSVIIFPGTMS 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2452     BICOMPNTOXIN  33     0.002    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 33.3 bits (76), Expect = 0.002
Identities = 6/41 (14%), Positives = 16/41 (39%)

Query: 303 LAADNRILYASGWFIDQNQGPYISHGGQNPNFSSCIALRPD 343
+ +F+ ++ P + G NP+F + ++
Sbjct: 210 VGYKPHSKDPRDYFVPDSELPPLVQSGFNPSFIATVSHEKG 250


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2455     ISCHRISMTASE  42     6e-06    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 42.3 bits (99), Expect = 6e-06
Identities = 20/87 (22%), Positives = 40/87 (45%), Gaps = 3/87 (3%)

Query: 956 DVRQMVATVRNTAPASGSER-LGDAAIRHSVRVCVEGALEQTEFDDNENLYVLGLDSIKS 1014
++ A V+ T+ +G + IR + ++ E + D E+L GLDS++
Sbjct: 209 QLQNAPADVQKTSANTGKKNVFTCENIRKQIAELLQETPE--DITDQEDLLDRGLDSVRI 266

Query: 1015 IQIAAQLRHHGWTMSAVQVMECGTVNA 1041
+ + Q R G ++ V++ E T+
Sbjct: 267 MTLVEQWRREGAEVTFVELAERPTIEE 293


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2469     DHBDHDRGNASE  51     1e-08    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 51.2 bits (122), Expect = 1e-08
Identities = 32/167 (19%), Positives = 58/167 (34%), Gaps = 7/167 (4%)

Query: 334 IPGNVLWIIGGEKGIGRMIGEALAQREGVRVVLSSRTGYHHEAVQQDAL------DVIHC 387
I G + +I G +GIG + LA +G + E V +
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLAS-QGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPA 64

Query: 388 DVTQAEAVRACLATLLERYGRLDGVIFAADATTTLTLHQLSESALRDTLTVKERGTANVL 447
DV + A+ A + G +D ++ A +H LS+ T +V G N
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 448 HALAQRNLLDERLLLLFCNSLAAVNAEIGQTGYATASAYLDALAQQL 494
++++ + ++ S A YA++ A + L
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCL 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2489     FLGPRINGFLGI  27     0.043    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 26.8 bits (59), Expect = 0.043
Identities = 12/30 (40%), Positives = 19/30 (63%)

Query: 1 MIIDESAGEVVIGANTRICHGAVIQGPVVI 30
++I+E G +VIGA+ RI AV G + +
Sbjct: 263 VVINERTGTIVIGADVRISRVAVSYGTLTV 292


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2509     HTHFIS        28     0.034    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 28.3 bits (63), Expect = 0.034
Identities = 8/39 (20%), Positives = 18/39 (46%), Gaps = 1/39 (2%)

Query: 84 NGAQFRQLCETTDWVDAGE-NVLLFGASGLGKSHLAAAI 121
A +++ + + +++ G SG GK +A A+
Sbjct: 142 RSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180


33  c2520  c2529  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias     Product
c2520     3       22       -0.185722   Conserved hypothetical protein
c2521     2       22       0.492490    Hypothetical protein
c2522     3       21       0.730938    Conserved hypothetical protein
c2523     3       23       0.749347    Conserved hypothetical protein
c2524     4       23       0.730478    Hypothetical protein
c2525     6       25       2.436486    Hypothetical protein
c2526     9       27       4.536505    Hypothetical protein yafZ
c2527     8       27       4.084904    Hypothetical protein yafX
c2528     7       26       3.825502    Hypothetical protein
c2529     4       26       2.104363    Putative radC-like protein yeeS
34  c2545  c2575  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c2545     -1      22       3.391545    Hypothetical protein yefM
c5496     -1      24       3.973494    His operon leader peptide
c2546     -1      24       4.149324    ATP phosphoribosyltransferase
c2547     -1      24       3.957261    Histidinol dehydrogenase
c2548     -1      27       3.063333    Histidinol-phosphate aminotransferase
c2549     -2      19       1.066613    Histidine biosynthesis bifunctional protein
c2550     -2      18       -1.205052   Imidazole glycerol phosphate synthase subunit
c2551     -2      17       -1.773401   1-(5-phosphoribosyl)-5-[(5-
c2552     -2      16       -1.372898   Imidazole glycerol phosphate synthase subunit
c2553     -2      15       -4.143884   Histidine biosynthesis bifunctional protein
c2554     -1      22       -7.280773   Chain length determinant protein
c2555     1       25       -8.641676   UDP-glucose 6-dehydrogenase
c2556     2       32       -11.081984  6-phosphogluconate dehydrogenase,
c2557     3       41       -14.055875  Phosphomannomutase
c2558     7       56       -18.444831  Mannose-1-phosphate guanylyltransferase
c2559     7       58       -19.968757  Hypothetical protein
c2560     8       58       -19.754173  UDP-glucose 4-epimerase
c2561     7       60       -20.503529  Hypothetical protein
c2562     5       49       -16.609147  Glycosyl transferase
c2563     3       39       -13.607491  Glycosyl transferase
c2564     2       28       -9.514449   Hypothetical protein
c2565     -1      19       -4.566720   Hypothetical protein
c2566     -1      15       -2.827834   Hypothetical protein
c2567     -1      19       1.001606    UTP--glucose-1-phosphate uridylyltransferase
c2568     -1      22       1.553962    Colanic acid biosynthesis protein wcaM
c2569     0       25       2.853512    Putative colanic acid biosynthesis glycosyl
c2570     0       25       3.116230    Colanic acid biosynthesis protein wcaK
c2571     0       25       3.384300    Lipopolysaccharide biosynthesis protein wzxC
c2572     0       23       3.605627    Putative colanic biosynthesis UDP-glucose lipid
c2573     0       24       3.973515    Phosphomannomutase
c2574     -1      24       3.907212    Hypothetical protein
c2575     -1      22       3.307768    Mannose-1-phosphate guanylyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2560     NUCEPIMERASE  174    3e-54    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 174 bits (444), Expect = 3e-54
Identities = 74/351 (21%), Positives = 143/351 (40%), Gaps = 40/351 (11%)

Query: 1 MNILVTGGAGYIGSHTAIELLNAGHEIIVLDNFSNASYKCIEK---IKEITRRDFITITG 57
M LVTG AG+IG H + LL AGH+++ +DN N Y K ++ + + F
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNL-NDYYDVSLKQARLELLAQPGFQFHKI 59

Query: 58 DAGCRKTLSAIFEKHAIDIVIHFAGFKSVSESKSEPLKYYQNNVGVTITLLQVMEEYRIK 117
D R+ ++ +F + V +V S P Y +N+ + +L+ +I+
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ 119

Query: 118 KFIFSSSATVYGEPEIIPIPETAKIGGTTNPYGTSKYFVEKILEDVSSTGKLDIICLRYF 177
+++SS++VYG +P + + Y +K E + S L LR+F
Sbjct: 120 HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRFF 179

Query: 178 NPVGAHSSGKIGEAPSGIPNNLVPYLL--DVASGKRDKLFIYGNDYPTNDGTGVRDFIHV 235
G P G P ++ + + GK ++ N G RDF ++
Sbjct: 180 TVYG----------PWGRP-DMALFKFTKAMLEGKSIDVY--------NYGKMKRDFTYI 220

Query: 236 VDLAKGHLAAMNYL---------------SINSGYNIFNLGTGKGYSVLELITTFEKLTN 280
D+A+ + + + + + Y ++N+G +++ I E
Sbjct: 221 DDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALG 280

Query: 281 IKVNKSFIERRAGDVASCWADADKANSLLDWQAEQTLEQMLLDSWRWKKNY 331
I+ K+ + + GDV AD ++ + E T++ + + W +++
Sbjct: 281 IEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDF 331


35  c2597  c2636  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c2597     -3      12       3.088008    Hypothetical protein yegI
c2598     -3      16       3.805905    Hypothetical protein yegK
c2599     -3      16       3.811782    Hypothetical protein yegL
c2600     -2      17       3.894588    Hypothetical protein yegM precursor
c2601     -2      18       3.860459    Hypothetical protein yegN
c2602     -2      15       2.620287    Hypothetical protein yegO
c2603     -2      13       -2.841182   Hypothetical transport protein yegB
c2604     -1      21       -5.561590   Sensor protein baeS
c2605     -1      31       -9.057045   Transcriptional Regulatory protein baeR
c2606     -1      24       -7.804890   Hypothetical protein yegP
c2607     -1      26       -8.031105   Hypothetical protein
c2608     0       28       -8.449365   Hypothetical protein
c2609     -2      21       -6.008158   Hypothetical protein
c2610     -1      18       -5.090005   Hypothetical protein
c2611     1       12       -2.091330   Putative protease yegQ
c2612     3       19       -2.949777   Hypothetical protein
c2613     3       20       -3.060642   Hypothetical protein yegR
c2614     3       20       -2.956668   Hypothetical protein yegS
c2615     3       20       -2.972274   Hypothetical protein
c2616     2       20       -2.101714   Galactitol-1-phosphate 5-dehydrogenase
c2617     1       18       -2.446181   PTS system, galactitol-specific IIC component
c2618     -1      16       -2.533859   PTS system, galactitol-specific IIB component
c2619     -1      17       -2.478350   PTS system, galactitol-specific IIA component
c2620     -1      11       -1.523682   Putative tagatose 6-phosphate kinase gatZ
c2621     -1      13       -0.251826   Tagatose-bisphosphate aldolase gatY
c2622     0       12       0.608964    Transposase
c2623     1       13       1.173991    Fructose-bisphosphate aldolase class I
c2624     2       14       0.512375    Hypothetical protein
c2625     1       13       1.233571    Putative nucleoside transporter yegT
c2626     1       15       2.135492    Hypothetical protein yegU
c2627     -1      16       1.540504    Hypothetical sugar kinase yegV
c2628     -1      15       0.476233    Hypothetical transcriptional regulator yegW
c2629     -2      16       -0.285801   Hypothetical protein yegX
c2630     2       21       -2.590914   Phosphomethylpyrimidine kinase
c2631     2       24       -4.883682   Hydroxyethylthiazole kinase
c2632     2       25       -6.961380   Hypothetical protein yohL
c2633     2       26       -7.387338   Hypothetical protein yohM
c2634     2       29       -8.294155   Hypothetical protein yohN precursor
c2635     2       30       -8.580494   Hypothetical protein yehA precursor
c2636     0       22       -5.692583   Hypothetical outer membrane usher protein yehB
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2600     RTXTOXIND      52     4e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 51.8 bits (124), Expect = 4e-09
Identities = 48/369 (13%), Positives = 106/369 (28%), Gaps = 87/369 (23%)

Query: 53 SYKSRWVIVIVVVIAAIAAFWFWQGRNDSQSAAPG-----ATKQAQQSPAGG-------R 100
S + R V ++ IA G+ + + A G + + +
Sbjct: 54 SRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 101 RGMRA---DPLA---PVQAATAVEQAVPRYLTGLGTITAANTVTVRSRVDG--QLMALHF 152
G D L + A + L T ++ ++ +L
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 153 QEGQQVKAGDLLAEI------------DPSQFKVALAQAQGQLA-------KDKATLANA 193
Q V ++L Q ++ L + + + + +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 194 RRDLARYQQLAKTNLVSRQELDAQQALVSETEGTIKADEASVA----------------- 236
+ L + L +++ + Q+ E ++ ++ +
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 237 --------------------------SAQLQLDWSRITAPVDGRV-GLKQVDVGNQISSG 269
+ + S I APV +V LK G +++
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 270 DTTGIVVITQTHPIDLVFTLPESDIATVVQAQKAGKPLVVEAWDRTNSKKL-SEGTLLSL 328
+T +V++ + +++ + DI + Q A + VEA+ T L + ++L
Sbjct: 354 ETL-MVIVPEDDTLEVTALVQNKDIGFINVGQNA--IIKVEAFPYTRYGYLVGKVKNINL 410

Query: 329 DNQIDATTG 337
D D G
Sbjct: 411 DAIEDQRLG 419


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2601     ACRIFLAVINRP   920    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 920 bits (2379), Expect = 0.0
Identities = 300/1036 (28%), Positives = 513/1036 (49%), Gaps = 29/1036 (2%)

Query: 13 SRLFIMRPVATTLLMVAILLAGIIGYRALPVSALPEVDYPTIQVVTLYPGASPDVMTSAV 72
+ FI RP+ +L + +++AG + LPV+ P + P + V YPGA + V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 73 TAPLERQFGQMSGLKQMSSQS-SGGASVITLQFQLTLPLDVAEQEVQAAINAATNLLPSD 131
T +E+ + L MSS S S G+ ITL FQ D+A+ +VQ + AT LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 132 LPNPPVYSKVNPADPPIMTLAVTSTAMPMTQVE--DMVETRVAQKISQISGVGLVTLSGG 189
+ + S + +M S TQ + D V + V +S+++GVG V L G
Sbjct: 122 VQQQGI-SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 190 QRPAVRVKLNAQAIAALGLTSETVRTAITGANVNSAKGSLDGP------SRAVTLSANDQ 243
Q A+R+ L+A + LT V + N A G L G ++ A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 244 MQSAEEYRQLII-AYQNGAPIRLGDVATVEQGAENSWLGAWANKEQAIVMNVQRQPGANI 302
++ EE+ ++ + +G+ +RL DVA VE G EN + A N + A + ++ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 303 ISTADSIRQMLPQLTESLPKSVKVTVLSDRTTNIRASVDDTQFELMMAIALVVMIIYLFL 362
+ TA +I+ L +L P+ +KV D T ++ S+ + L AI LV +++YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 363 RNIPATIIPGVAVPLSLIGTFAVMVFLDFSINNLTLMALTIATGFVVDDAIVVIENISRY 422
+N+ AT+IP +AVP+ L+GTFA++ +SIN LT+ + +A G +VDDAIVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 423 I-EKGEKPLAAALKGAGEIGFTIISLTFSLIAVLIPLLFMGDIVGRLFREFAITLAVAIL 481
+ E P A K +I ++ + L AV IP+ F G G ++R+F+IT+ A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 482 ISAVVSLTLTPMMCARML---SQESLRKQNRFSRASEKMFDRIIAAYGRGLAKVLNHPWL 538
+S +V+L LTP +CA +L S E + F FD + Y + K+L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 539 TLSVALSTLLLSVLLWVFIPKGFFPVQDNGIIQGTLQAPQSSSFANMAQRQRQVADVILQ 598
L + + V+L++ +P F P +D G+ +Q P ++ + QV D L+
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 599 DPA--VQSLTSFVGVDGTNPSLNSARLQINLKPLDERDDR---VQKVIARLQTAVDKVPG 653
+ V+S+ + G + + N+ ++LKP +ER+ + VI R + + K+
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR- 658

Query: 654 VDLFLQPTQDLTIDTQVSRTQYQFTLQ---ATSLDALSTWVPQLMEKLQQLP-QLSDVSS 709
D F+ P I + T + F L DAL+ QL+ Q P L V
Sbjct: 659 -DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 710 DWQDKGLVAYVNVDRDSASRLGISMADVDNALYNAFGQRLISTIYTQANQYRVVLEHNTE 769
+ + + VD++ A LG+S++D++ + A G ++ + ++ ++ + +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 770 NTPGLAALDTIRLTSSDGGVVPLSSIAKIEQRFAPLSINHLDQFPVTTISFNVPDNYSLG 829
+D + + S++G +VP S+ + + + P I S G
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 830 DAVQAIMDTEKTLNLPVDITTQFQGSTLAFQSALGSTVWLIVAAVVAMYIVLGILYESFI 889
DA A+M+ + LP I + G + + + L+ + V +++ L LYES+
Sbjct: 838 DA-MALMENLAS-KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 890 HPITILSTLPTAGVGALLALMIAGSELDVIAIIGIILLIGIVKKNAIMMIDFALAAEREQ 949
P++++ +P VG LLA + + DV ++G++ IG+ KNAI++++FA ++
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 950 GMSPREAIYQACLLRFRPILMTTLAALLGALPLMLSTGVGAELRRPLGIGMVGGLIVSQV 1009
G EA A +R RPILMT+LA +LG LPL +S G G+ + +GIG++GG++ + +
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1010 LTLFTTPVIYLLFDRL 1025
L +F PV +++ R
Sbjct: 1016 LAIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2602     ACRIFLAVINRP   915    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 915 bits (2367), Expect = 0.0
Identities = 288/1035 (27%), Positives = 503/1035 (48%), Gaps = 36/1035 (3%)

Query: 6 LFIYRPVATILLSVAITLCGILGFRMLPVAPLPQVDFPVIMVSASLPGASPETMASSVAT 65
FI RP+ +L++ + + G L LPVA P + P + VSA+ PGA +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERSLGRIAGVSEMTSSS-SLGSTRIILQFDFDRDINGAARDVQAAINAAQSLLPSGMP 124
+E+++ I + M+S+S S GS I L F D + A VQ + A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKANPSDAPIMILTLTSDT--YSQGELYDFASTQLAPTISQIDGVGDVDVGGSSL 182
+ S + +M+ SD +Q ++ D+ ++ + T+S+++GVGDV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVGLNPQALFNQGVSLDDVRTAISNANVRKPQG------ALEDGTHRWQIQTNDELK 236
A+R+ L+ L ++ DV + N + G AL I K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAAEYQPLIIHYN-NGGAVRLGDVATVTDSVQDVRNAGMTNAKPAILLMIRKLPEANIIQ 295
E+ + + N +G VRL DVA V ++ N KPA L I+ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDSIRARLPELQSTIPAAIDLQIAQDRSPTIRASLEEVEQTLIISVALVILVVFLFLRS 355
T +I+A+L ELQ P + + D +P ++ S+ EV +TL ++ LV LV++LFL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATIIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHL- 414
RAT+IP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGMKPLQAALQGTREVGFTVLSMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIGIS 474
E + P +A + ++ ++ +++ L AVF+P+ GG G + R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LLVSLTLTPMMCGWMLKASKPREQKRLRGFG----RMLVALQQGYGKSLKWVLNHTRLVG 530
+LV+L LTP +C +LK + GF Y S+ +L T
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 AVLLGTIALNIWLYISIPKTFFPEQDTGVLMGGIQADQSISFQ----AMRGKLQDFMKII 586
+ +A + L++ +P +F PE+D GV + IQ + + + ++K
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 587 RD-DPAVDNVTGFT-GGSRVNSGMMFITLKPRGERS---ETAQQIIDRLRKKLAKEPGAN 641
+ +V V GF+ G N+GM F++LKP ER+ +A+ +I R + +L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 642 LFLMAVQDIRVGGRQANASYQYTLLSDDLAALREWEPKIRKKLATL-----PELADVNSD 696
+ + I G ++ L D + + R +L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 697 QEDNGAEMNLIYDRDTMARLGIDVQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRY 756
++ A+ L D++ LG+ + N ++ A G ++ K+ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 757 TQDISALEKMFVINNEGKAIPLSYFAKWQPANAPLSVNHQGLSAASTISFNLPTGKSLSD 816
++K++V + G+ +P S F + + I G S D
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 817 ASAAIDRAMTQLGVPSTVRGSFAGTAQVFQETMNSQVILIIAAIATVYIVLGILYESYVH 876
A A ++ ++L P+ + + G + + + N L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 877 PLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRHGN 936
P++++ +P VG LLA LFN + ++G++ IG+ KNAI++V+FA +
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 937 LTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQLL 996
EA A +R RPI+MT+LA + G LPL +S G GS + +GI ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 997 TLYTTPVVYLFFDRL 1011
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031



Score = 80.7 bits (199), Expect = 2e-17
Identities = 76/448 (16%), Positives = 161/448 (35%), Gaps = 26/448 (5%)

Query: 592 VDNVTGFTGGS-RVNSGMMFITLKPRGERSETAQQIIDRLRKKLAKEPGANLFLMAVQDI 650
+DN+ + S S + +T + + Q+ ++L+ P + Q I
Sbjct: 72 IDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE----VQQQGI 127

Query: 651 RVGGRQANASYQYTLLSDDLAALREW-----EPKIRKKLATLPELADVNSDQEDNGAE-- 703
V ++ +SD+ ++ ++ L+ L + DV GA+
Sbjct: 128 SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQL----FGAQYA 183

Query: 704 MNLIYDRDTMARLGID----VQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRYTQD 759
M + D D + + + + + + T P Q + R+
Sbjct: 184 MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNP 243

Query: 760 ISALEKMFVINNEGKAIPLSYFAK--WQPANAPLSVNHQGLSAASTISFNLPTGKSLSDA 817
+ +N++G + L A+ N + G AA +L D
Sbjct: 244 EEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL-DT 302

Query: 818 SAAIDRAMTQL--GVPSTVRGSFA-GTAQVFQETMNSQVILIIAAIATVYIVLGILYESY 874
+ AI + +L P ++ + T Q +++ V + AI V++V+ + ++
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 875 VHPLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRH 934
L +P +G L F + + + G++L IG++ +AI++V+
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 935 GNLTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQ 994
L P+EA ++ ++ + +P+ GG + + ITIV + +S
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 995 LLTLYTTPVVYLFFDRLRLRFSRKPKQA 1022
L+ L TP + + + K
Sbjct: 483 LVALILTPALCATLLKPVSAEHHENKGG 510


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2603     TCRTETB        126    9e-34    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 126 bits (317), Expect = 9e-34
Identities = 98/435 (22%), Positives = 191/435 (43%), Gaps = 25/435 (5%)

Query: 20 FMQSLDTTIVNTALPSMAQSLGESPLHMHMVIVSYVLTVAVMLPASGWLADKVGVRNIFF 79
F L+ ++N +LP +A + P + V +++LT ++ G L+D++G++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 TAIVLFTLGSLFCALSGTLNELL-LARALQGVGGAMMVPVGRLTVMKIVPREQYMAAMTF 138
I++ GS+ + + LL +AR +QG G A + + V + +P+E A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQIGPLLGPALGGLLVEYASWHWIFLINIPVGIIGAITTLM-LMPNYTMQTRRFDL 197
+ +G +GPA+GG++ Y HW +L+ IP+ I + LM L+ FD+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 SGFLLLAVGMAVLTLALDGSKGTGFSPLAIAGLVAVGVVALVLYLLHAQNNNRALFSLKL 257
G +L++VG+ L F+ + V V++ ++++ H + L
Sbjct: 202 KGIILMSVGIVFFML---------FTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FRTRTFSLGLAGSFAGRIGSGMLPFMTPVFLQIGFGFSPFHAG-LMMIPMVLGSMGMKRI 316
+ F +G+ M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 VVQVVNRFGYRRVLVATTLGLSLVTLLFMTTALL----GWYYVLPFVLFLQGMVNSTRFS 372
+V+R G VL +G++ +++ F+T + L W+ + V L G+ S +
Sbjct: 313 GGILVDRRGPLYVL---NIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL--SFTKT 367

Query: 373 SMNTLTLKDLPDNLASSGNSLLSMIMQLSMSIGVTIAGLLLGLFGSQHVSVDSGTTQTVF 432
++T+ L A +G SLL+ LS G+ I G LL + + Q+ +
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTY 427

Query: 433 MYT--WLSMASIIAL 445
+Y+ L + II +
Sbjct: 428 LYSNLLLLFSGIIVI 442


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2604     BCTERIALGSPF   34     0.001    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 34.0 bits (78), Expect = 0.001
Identities = 28/95 (29%), Positives = 36/95 (37%), Gaps = 20/95 (21%)

Query: 164 RQTSWLIVALSTLLAALATF------PLARGLLAPVKRLVDGTHKLAAGDFTTRVAPTSE 217
RQ + L+ A L AL P L+A V+ V H LA + P S
Sbjct: 75 RQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD---AMKCFPGSF 131

Query: 218 DEL-----------GRLAEDFNQLASTLEKNQQMR 241
+ L G L N+LA E+ QQMR
Sbjct: 132 ERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2605     HTHFIS         76     6e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.6 bits (186), Expect = 6e-18
Identities = 28/136 (20%), Positives = 65/136 (47%), Gaps = 1/136 (0%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLISHGDQVLPYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + S+ + ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDIPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVARVKTILRRCK 129
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L K
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 130 PQRELQQQDAESPLII 145
+ + D++ + +
Sbjct: 124 RRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2613     LIPOLPP20      25     0.046    LPP20 lipoprotein precursor signature.
		>LIPOLPP20#LPP20 lipoprotein precursor signature.

Length = 175

Score = 25.5 bits (55), Expect = 0.046
Identities = 13/38 (34%), Positives = 24/38 (63%), Gaps = 1/38 (2%)

Query: 3 EGKVKKIAAISLISVFLMSGCAVHNDETSIGKFGLAYK 40
+ +VKKI +S+++ ++ GC+ H ++ I K AYK
Sbjct: 2 KNQVKKILGMSVVAAMVIVGCS-HAPKSGISKSNKAYK 38


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2616     DHBDHDRGNASE   33     0.001    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 33.5 bits (76), Expect = 0.001
Identities = 22/92 (23%), Positives = 35/92 (38%), Gaps = 2/92 (2%)

Query: 156 AQGCENKNVIIIGAGT-IGLLAIQCAVALGAKSVTAIDISSEKLALAKSFGAMQTFNSRE 214
A+G E K I GA IG + + GA + A+D + EKL S + ++
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAH-IAAVDYNPEKLEKVVSSLKAEARHAEA 61

Query: 215 MSAPQIQGVLRDLRFNQLILETAGVPQTVELA 246
A D ++ E + V +A
Sbjct: 62 FPADVRDSAAIDEITARIEREMGPIDILVNVA 93


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2625     TCRTETA        35     6e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.8 bits (80), Expect = 6e-04
Identities = 53/268 (19%), Positives = 89/268 (33%), Gaps = 17/268 (6%)

Query: 29 LSKSGFSAGEIGWSYACTAIAAILSPILVGSITDRFFSAQKVLAVLMFAGAVLMYFAAQQ 88
L S G A A+ ++G+++DRF ++ + ++ AGA + Y
Sbjct: 35 LVHSNDVTAHYGILLALYALMQFACAPVLGALSDRF--GRRPVLLVSLAGAAVDYAI--- 89

Query: 89 TTFAGFFPLLLAYSLTYMPTIALTNSIAFANVPDVERDFPRIRVMGTIG-WIASGLACGF 147
A F +L + T A T ++A A + D+ R R G + G+ G
Sbjct: 90 MATAPFLWVLYIGRIVAGITGA-TGAVAGAYIADITDGDERARHFGFMSACFGFGMVAG- 147

Query: 148 LPQMLGY-ADISPTNIPLLITAGSSALLGVFAFFLPDTPPKSTGKMDIKVMLGLDALILL 206
P + G SP + P A + L + FL K + + L A
Sbjct: 148 -PVLGGLMGGFSP-HAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRW 205

Query: 207 RDKD------FLVFFFCSFLFAMPLAFYYIFANGYLTEVGMKNATGWMTLGQFSEIFFML 260
VFF + +P A + IF G + +
Sbjct: 206 ARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAM 265

Query: 261 ALPFFTKRFGIKKVLLLGLVTAAIRYGF 288
R G ++ L+LG++ Y
Sbjct: 266 ITGPVAARLGERRALMLGMIADGTGYIL 293



Score = 34.0 bits (78), Expect = 0.001
Identities = 32/153 (20%), Positives = 53/153 (34%), Gaps = 20/153 (13%)

Query: 253 FSEIFFMLALPFFTKRFGIKKVLLLGLVTAAIRYGFFIYGSADEYFTYALLFLGILLHGV 312
+ L + RFG + VLL+ L AA+ Y +L++G ++ G+
Sbjct: 54 LMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAP-----FLWVLYIGRIVAGI 108

Query: 313 SYDFYYVTAYIYVDKKAPVHMRTAAQGLITLCCQGFGSLLGYRLGGVMMEKMFAYQEPVN 372
+ V D R G ++ C GFG + G LGG+M F+ P
Sbjct: 109 TGATGAVAGAYIAD-ITDGDERARHFGFMS-ACFGFGMVAGPVLGGLMGG--FSPHAP-- 162

Query: 373 GLTFNWAGMWTFGAVMIAIIAVLFMIFFRESDN 405
+ A + + + ES
Sbjct: 163 ---------FFAAAALNGLNFLTGCFLLPESHK 186



Score = 28.6 bits (64), Expect = 0.048
Identities = 22/114 (19%), Positives = 43/114 (37%), Gaps = 4/114 (3%)

Query: 7 LSFMMFVEWFIWGAWFVPLWLWL----SKSGFSAGEIGWSYACTAIAAILSPILVGSITD 62
++ +M V + + VP LW+ + + A IG S A I L+ ++
Sbjct: 212 VAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVA 271

Query: 63 RFFSAQKVLAVLMFAGAVLMYFAAQQTTFAGFFPLLLAYSLTYMPTIALTNSIA 116
++ L + M A A T FP+++ + + AL ++
Sbjct: 272 ARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLS 325


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2634     TYPE3OMGPROT   28     0.024    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 27.9 bits (62), Expect = 0.024
Identities = 13/42 (30%), Positives = 21/42 (50%), Gaps = 1/42 (2%)

Query: 66 KMLLGALLLVTSAAWAAPATAGSTNTSGISKYE-LSSFIADF 106
++L G LLL++S +WA ++K E L + DF
Sbjct: 11 RVLTGTLLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDF 52


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2635     BINARYTOXINB   29     0.037    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 28.9 bits (64), Expect = 0.037
Identities = 18/79 (22%), Positives = 34/79 (43%), Gaps = 8/79 (10%)

Query: 93 NITLSNNQ---TSFTSGYSVTVTPAASNAKVNVSAGGGGSVMINGVATLSSA-----SSS 144
NI LS N+ T T + T++ S ++ + S G + + + + S+S
Sbjct: 297 NIILSKNEDQSTQNTDSQTRTISKNTSTSRTHTSEVHGNAEVHASFFDIGGSVSAGFSNS 356

Query: 145 TRGSAAVQFLLCLLGGKSW 163
+ A+ L L G ++W
Sbjct: 357 NSSTVAIDHSLSLAGERTW 375


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2636     PF00577        601    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 601 bits (1551), Expect = 0.0
Identities = 222/843 (26%), Positives = 361/843 (42%), Gaps = 93/843 (11%)

Query: 7 LRMTPLASAI---VALLIGIEAYAAEETFDTHFMIGGMKDQQVSNIRL--EDNQPLPGPY 61
R+ + A +AE F+ F+ Q V+++ + PG Y
Sbjct: 21 HRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADD--PQAVADLSRFENGQELPPGTY 78

Query: 62 DIDIYVNKQWRGKYEIIVKDNPQET----CLSREMIKRLGINTDS-----FASGKQCLTF 112
+DIY+N + ++ E CL+R + +G+NT S + C+
Sbjct: 79 RVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPL 138

Query: 113 KQLIQGGSYTWDIGVFRLDFSVPQAWVEDLESGYVPPENWERGINAFYTSYYVSQYYSDY 172
+I + D+G RL+ ++PQA++ + GY+PPE W+ GINA +Y S
Sbjct: 139 TSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQN 198

Query: 173 KASGNSKSTYVRFNSGLNLQGWQLHSDASFSKTNNNPGV-----WKSNTLYLERGFAQLL 227
+ GNS Y+ SGLN+ W+L + ++S +++ W+ +LER L
Sbjct: 199 RIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLR 258

Query: 228 GTLRVGDMYTSSDIFDSVRFSGVRLFRDMQMLPNSKQNFTPRVQGIAQSNALVTIEQNGF 287
L +GD YT DIFD + F G +L D MLP+S++ F P + GIA+ A VTI+QNG+
Sbjct: 259 SRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGY 318

Query: 288 VVYQKEVPPGPFAITDLQLAGGGADLDVSVKEADGSVTTYLVPYAAVPNMLQPGVSKYDF 347
+Y VPPGPF I D+ AG DL V++KEADGS + VPY++VP + + G ++Y
Sbjct: 319 DIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSI 378

Query: 348 AAGRSHIEGASKQSD-FVQAGYQYGFNNLLTLYGGSMVANNYYAFTLGTGWNT-RIGAIS 405
AG A ++ F Q+ +G T+YGG+ +A+ Y AF G G N +GA+S
Sbjct: 379 TAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALS 438

Query: 406 VDATKSHSIQDNGDVFDGQSYQIAYNKFVSQTSTRFGLAAWRYSSRDYRTFNDHVWANNK 465
VD T+++S + DGQS + YNK ++++ T L +RYS+ Y F D ++
Sbjct: 439 VDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMN 498

Query: 466 DNYRRDENDIYDI----ADYYQNDFGRKNSFSANMSQSLPEGWGSVS--------WGDDV 513
++ + + DYY + ++ ++Q L ++ WG
Sbjct: 499 GYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQTYWGTSN 557

Query: 514 TTPRRQIYMSNSTT----------FDDQGVASNNTGLS-------GTVGSRDQFNYGVNL 556
+ Q ++ + + + L+ D + +
Sbjct: 558 VDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHA 617

Query: 557 SYQYQGNETTAGA---------------NLTWNAPVATVNG------------------- 582
S Y + G NL+++ G
Sbjct: 618 SASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGY 677

Query: 583 -----SYSQSSAYRQAGASVSGGIVAWSGGVNLANRLSETFAVMNAPGIKDAYVNGQKYR 637
YS S +Q VSGG++A + GV L L++T ++ APG KDA V Q
Sbjct: 678 GNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGV 737

Query: 638 TTNRNGVVVYDGMTPYRENYLMLDVSQSDSEAELRGNRKIAAPYRGAVVLVNFDTDQRKP 697
T+ G V T YREN + LD + +L P RGA+V F +
Sbjct: 738 RTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKA-RVGI 796

Query: 698 WFIKALRADGQPLTFGYEVNDIHGHNIGVVGQGSQLFIRTNEVPPSVNVAIDKQQGLSCT 757
+ L + +PL FG V + G+V Q+++ + V V +++ C
Sbjct: 797 KLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCV 856

Query: 758 ITF 760
+
Sbjct: 857 ANY 859


36  c2728  c2744  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c2728      0      18        3.325154  Hypothetical protein
c2729     -1      19        3.245878  Hypothetical protein
c2730      0      21        3.638771  Nitrate/nitrite response regulator protein narP
c2731      0      23        4.093674  Cytochrome c-type biogenesis protein ccmH
c2732      0      22        4.513450  Thiol:disulfide interchange protein dsbE
c2733      0      20        4.579745  Cytochrome c-type biogenesis protein ccmF
c2734     -2      17        3.398901  Cytochrome c-type biogenesis protein ccmE
c2735     -1      14        2.893756  Heme exporter protein D
c2736     -2      13        3.010058  Heme exporter protein C
c2737     -2      14        3.271155  Heme exporter protein B
c2738     -2      14        3.247737  Heme exporter protein A
c2739     -1      17        3.981277  Cytochrome c-type protein napC
c2740     -1      20        4.473418  Hypothetical protein
c2741     -1      22        4.911818  Diheme cytochrome c napB precursor
c2742     -1      21        4.790074  Hypothetical protein
c2743     -1      22        4.173598  Ferredoxin-type protein napH
c2744      0      22        3.263881  Ferredoxin-type protein napG
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2729     PERTACTIN      27     0.024    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 27.0 bits (59), Expect = 0.024
Identities = 15/43 (34%), Positives = 26/43 (60%), Gaps = 2/43 (4%)

Query: 40 VFAVIEKGGLLEV--KATGDFKIFVTDTGARPAAGDNLTLVTT 80
VFA + L V A+G +++V ++G+ PA+G+ + LV T
Sbjct: 484 VFADLGLSDKLVVMRDASGQHRLWVRNSGSEPASGNTMLLVQT 526


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2730     HTHFIS         65     2e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 64.9 bits (158), Expect = 2e-14
Identities = 22/113 (19%), Positives = 48/113 (42%), Gaps = 2/113 (1%)

Query: 19 VMIVDDHPLMRRGVRQLLELDSGFEVVAEAGDGASAIDLANRLDIDVILLDLNMKGMSGL 78
+++ DD +R + Q L +G++V + A+ D D+++ D+ M +
Sbjct: 6 ILVADDDAAIRTVLNQALS-RAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 79 DTLNALRRDGVTAQIIILTVSDASSDVFALIDAGADGYLLKDSDPEVLLEAIR 131
D L +++ +++++ + + GA YL K D L+ I
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


37  c2799  c2824  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c2799      2      13        2.365615  Hypothetical protein yfbI
c2800      1      15        3.690608  Conserved hypothetical protein
c2801     -1      13        3.644502  Hypothetical protein yfbJ
c2802     -1      15        4.186640  Polymyxin B resistance protein pmrD
c2803     -1      13        4.855693  O-succinylbenzoic acid--CoA ligase
c2804     -1      12        4.227787  O-succinylbenzoate-CoA synthase
c2805     -1      11        3.405870  Naphthoate synthase
c2806     -1      12        2.879973  Hypothetical protein
c2807     -1      11        2.316045  Hypothetical protein yfbB
c2808     -2      11        0.213475  Menaquinone biosynthesis protein menD
c2809     -1      15       -2.007463  Menaquinone-specific isochorismate synthase
c2810      0      19       -3.527750  ElaB protein
c2811      0      19       -3.938443  Protein elaA
c2812     -1      11       -1.559361  Protein elaC
c2813     -2      11       -0.409211  Hypothetical protein yfbK
c2814     -1      21        2.546977  Hypothetical protein
c2815      0      22        2.917421  Hypothetical protein yfbL
c2816      0      27        3.779308  Hypothetical protein yfbM
c2817      0      30        4.412986  NADH dehydrogenase I chain N
c2818      0      32        3.906080  NADH dehydrogenase I chain M
c2819     -1      32        4.392195  NADH dehydrogenase I chain L
c2820     -1      31        4.146233  NADH dehydrogenase I chain K
c2821     -1      31        4.207017  NADH dehydrogenase I chain J
c2822      0      31        4.323102  NADH dehydrogenase I chain I
c2823      0      30        4.150271  NADH dehydrogenase I chain H
c2824      0      29        4.157994  NADH dehydrogenase I chain G
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2800     BCTERIALGSPC   28     0.007    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 28.0 bits (62), Expect = 0.007
Identities = 12/31 (38%), Positives = 18/31 (58%), Gaps = 1/31 (3%)

Query: 34 KHIVLWLGLALACLGLAMVLWLLVL-QNVPV 63
+ I+ +L + L C LAM+ W + L N PV
Sbjct: 15 RRILFYLLMLLFCQQLAMIFWRIGLPDNAPV 45


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2803     ALARACEMASE    31     0.013    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 30.5 bits (69), Expect = 0.013
Identities = 32/193 (16%), Positives = 59/193 (30%), Gaps = 37/193 (19%)

Query: 268 GYGLTEFASTVCAKEADGLADVGSPL----PGREVKIVNDEVWLRAASMAEGYWRNGQRV 323
G+G+ S + A + L ++ + G + I+ E + A + + R+
Sbjct: 40 GHGIERIWSAIGATDGFALLNLEEAITLRERGWKGPILMLEGFFHAQDLEIY---DQHRL 96

Query: 324 PLVNDEGWYATRDRGEMYNGKLTI-------VGRLDNLFFSGGEGIQPEEVERVIAAHPA 376
W + L I + RL G QP+ V V A
Sbjct: 97 TTCVHSNWQLKALQNARLKAPLDIYLKVNSGMNRL---------GFQPDRVLTVWQQLRA 147

Query: 377 VLQVFIVPVADKEFGHRPVAVVEYDQQSVDLDEWVKDKLARFQQPVRWLTLPPELKNGGI 436
+ V + + H A + + + +AR +Q L L N
Sbjct: 148 MANVGEMTL----MSHFAEA---------EHPDGISGAMARIEQAAEGLECRRSLSNSAA 194

Query: 437 KISRQALK-EWVK 448
+ +WV+
Sbjct: 195 TLWHPEAHFDWVR 207


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2811     AUTOINDCRSYN   32     5e-04    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 32.1 bits (73), Expect = 5e-04
Identities = 13/74 (17%), Positives = 29/74 (39%), Gaps = 12/74 (16%)

Query: 1 MIDWQDLHHSDLSVSQLYALLQLRCAVFV--------VEQNCPYQDIDGDDLEGENRHIL 52
M++ D++H+ LS ++ L LR F + D ++ ++
Sbjct: 1 MLEIFDVNHTLLSETKSGELFTLRKETFKDRLNWAVQCTDGMEFDQYDNNN----TTYLF 56

Query: 53 GWHNGTLVAYARIL 66
G + T++ R +
Sbjct: 57 GIKDNTVICSLRFI 70


38  c2888  c2910  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c2888     -1      18       -3.921391  Hypothetical protein yfcZ
c2889      1      16       -4.227836  Long-chain fatty acid transport protein
c2890      4      23       -3.004028  VacJ lipoprotein precursor
c2891      4      24       -3.855628  Hypothetical protein
c2892      4      25       -4.591812  Hypothetical protein yfdC
c2893      5      29       -5.516827  *Hypothetical protein
c2894      5      27       -5.017870  Hypothetical protein ydeU
c2895      4      26       -4.727305  yapH homolog
c2896      1      27       -6.680823  Hypothetical protein
c2897      1      28       -6.671670  Type 1 fimbriae Regulatory protein fimB
c2898      1      26       -5.779819  Type 1 fimbriae Regulatory protein fimB
c2899      0      26       -5.448384  D-serine deaminase activator
c2900     -1      27       -6.258986  DsdX permease
c2901     -2      32       -8.754618  D-serine dehydratase
c2902     -1      35       -9.721984  Multidrug resistance protein Y
c2903     -1      35       -9.358557  Hypothetical protein
c2904     -1      34       -8.401045  Multidrug resistance protein K
c2905     -1      31       -7.597590  Positive transcription regulator evgA
c2906     -1      31       -7.209354  Sensor protein evgS precursor
c2907      1      30       -5.440447  Hypothetical protein yfdE
c2908      1      30       -5.135107  Hypothetical protein yfdV
c2909      1      28       -4.811692  Probable oxalyl-CoA decarboxylase
c2910     -2      21       -3.911327  Hypothetical protein yfdW
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2890     VACJLIPOPROT   407    e-148    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 407 bits (1048), Expect = e-148
Identities = 250/251 (99%), Positives = 251/251 (100%)

Query: 1 MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR 60
MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR
Sbjct: 1 MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR 60

Query: 61 DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM 120
DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM
Sbjct: 61 DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM 120

Query: 121 ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADSLYPVLSWLTWPM 180
ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMAD+LYPVLSWLTWPM
Sbjct: 121 ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADALYPVLSWLTWPM 180

Query: 181 SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA 240
SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA
Sbjct: 181 SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA 240

Query: 241 IQDDLKDIDSE 251
IQDDLKDIDSE
Sbjct: 241 IQDDLKDIDSE 251


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2894     PRTACTNFAMLY   54     4e-10    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 53.5 bits (128), Expect = 4e-10
Identities = 49/235 (20%), Positives = 86/235 (36%), Gaps = 27/235 (11%)

Query: 47 WTDGQDRLQQGIMAGYGNEKSSTTSSLSGYKSKGAINGYSTGLYGTWQQNDGNDNGAYVD 106
R G +AGY T G+ + GY+T + D+G Y+D
Sbjct: 683 VAVAGGRWHLGGLAGYTRGDRGFTGDGGGHTDSVHVGGYATYI---------ADSGFYLD 733

Query: 107 TWIQYGWFNN--TVNGEKLAAESWKSR--GFTGSVEAGYTFKAGEFTGSQGSHYDWYIQP 162
++ N V G A K R G S+EAG F W+++P
Sbjct: 734 ATLRASRLENDFKVAGSDGYAVKGKYRTHGVGASLEAGRRFTH---------ADGWFLEP 784

Query: 163 QSQITWMNVRASEHTEKNGTKVQLSGDGNIQSRLGVRTYLKGKSASDDNKAHQFEPFVEV 222
Q+++ + NG +V+ G ++ RLG+ GK + Q +P+++
Sbjct: 785 QAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEV---GKRI-ELAGGRQVQPYIKA 840

Query: 223 NWIHNTRSWG-VKMDNTALSQDGATNIAEVKTGVQGKLSDNLNVWGNVGVQAGDK 276
+ + G V + A + AE+ G+ L +++ + G K
Sbjct: 841 SVLQEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRGHSLYASYEYSKGPK 895


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2895     PRTACTNFAMLY   55     5e-09    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 54.7 bits (131), Expect = 5e-09
Identities = 44/129 (34%), Positives = 59/129 (45%), Gaps = 12/129 (9%)

Query: 2182 TLTVNGDYTGGGTLIINTVLGDDTSTTDKLIVTGNTSGDTGVVVNNVRGQGAQTADGIEI 2241
LTVN G G +N D +DKL+V + SG + V N G +A+ + +
Sbjct: 472 VLTVN-TLAGSGLFRMNVFA--DLGLSDKLVVMQDASGQHRLWVRNS-GSEPASANTLLL 527

Query: 2242 VHVGGQSDGNFRLQN---RAVAGAWEYFLHKGNAGGTDGNWYLR-SELPPEPQPQPQPQP 2297
V S F L N + G + Y L A +G W L ++ PP P+P PQP P
Sbjct: 528 VQTPLGSAATFTLANKDGKVDIGTYRYRL----AANGNGQWSLVGAKAPPAPKPAPQPGP 583

Query: 2298 QPQPQPQPQ 2306
QP PQPQ
Sbjct: 584 QPPQPPQPQ 592


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c2902     TCRTETB        122    2e-32    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 122 bits (308), Expect = 2e-32
Identities = 92/404 (22%), Positives = 167/404 (41%), Gaps = 17/404 (4%)

Query: 19 VTIALSLATFMQMLDSTISNVAIPTISGFLGASTDEGTWVITSFGVANAIAIPVTGRLAQ 78
+ I L + +F +L+ + NV++P I+ WV T+F + +I V G+L+
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 79 RIGELRLFLLSVTFFSLSSLMCSLS-TNLDVLIFFRVVQGLMAGPLIPLSQSLLLRNYPP 137
++G RL L + S++ + + +LI R +QG A L ++ R P
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 138 EKRTFALALWSMTVIIAPICGPILGGYICDNFSWGWIFLINVPMGIIVLTLCLTLLKGRE 197
E R A L V + GP +GG I W +L+ +PM I+ L L +E
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKE 192

Query: 198 TETSPVKMNLPGLTLLVLGVGGLQIMLDKGRDLDWFNSSTIILLTVVSVISLISLVIWES 257
++ G+ L+ +G+ + ML F +S I +VSV+S + V
Sbjct: 193 VRIKG-HFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIR 241

Query: 258 TSENPILDLSLFKSRNFTIGIVSITCAYLFYSGAIVLMPQLLQETMGYNAIWAGLAYAPI 317
+P +D L K+ F IG++ + +G + ++P ++++ + G
Sbjct: 242 KVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFP 301

Query: 318 GIMPLLIS-PLIGRYGNKIDMRLLVTFSFLMYAVCYYWRSVTFMPTIDFTGIILPQFFQG 376
G M ++I + G ++ ++ +V + S T F II+ G
Sbjct: 302 GTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGG 361

Query: 377 FAVACFFLPLTTISFSGLPDNKFANASSMSNFFRTLSGSVGTSL 420
+ ++TI S L + S+ NF LS G ++
Sbjct: 362 LSFTK--TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2904  RTXTOXIND  78  6e-18  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 78.3 bits (193), Expect = 6e-18
Identities = 62/413 (15%), Positives = 123/413 (29%), Gaps = 98/413 (23%)

Query: 13 RRKYFALLAVVLFIAFSGAYAYWSMELKDMISTDDAYVTGNADPISAQVSGSVTVVNHKD 72
RR ++ F+ + + + +G + I + V + K+
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKE 114

Query: 73 TNYVRQGDILVSLDKTDATIALNKA----------------------------------- 97
VR+GD+L+ L A K
Sbjct: 115 GESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEP 174

Query: 98 -----------------KNNLANIVRQTNKLYLQDKQYSAEVASARIQ---YQQSLEDYN 137
K + Q + L + AE + + Y+
Sbjct: 175 YFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEK 234

Query: 138 RRV----PLAKQGVISKEALEHTKDTLI----------SSKAALNAAIQAYKANKALVMN 183
R+ L + I+K A+ ++ + S + + I + K +
Sbjct: 235 SRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEE--YQLV 292

Query: 184 TPLNRQPQVIEAADATKE----------AWLALKRTDIKSPVTGYIAQRSVQ-VGETVSP 232
T L + + + T + + I++PV+ + Q V G V+
Sbjct: 293 TQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTT 352

Query: 233 GQSLMAVVPARQ-MWVNANFKETQLTDVRIGQSVNIISDLYGENVVFHGRVTGINMGTGN 291
++LM +VP + V A + + + +GQ+ I + F G +G
Sbjct: 353 AETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE------AFPYTRYGYLVGK-- 404

Query: 292 AFSLLPAQNATGNWIKIVQRVPVEVSLDPKELMEH----PLRIGLSMTATIDT 340
+ + +V V +S++ L PL G+++TA I T
Sbjct: 405 -VKNINLDAIEDQRLGLVFNVI--ISIEENCLSTGNKNIPLSSGMAVTAEIKT 454


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2905  HTHFIS  49  3e-09  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.1 bits (117), Expect = 3e-09
Identities = 22/148 (14%), Positives = 53/148 (35%), Gaps = 31/148 (20%)

Query: 4 IIIDDHPLAIAAIRNLLIKNDIEILAELTEGGSAVQRVETLKPDIVIIDVDIPGVNGIQV 63
++ DD + L + ++ + + + + D+V+ DV +P N +
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 64 LETLRKRQYSGIIIIVSAKNDHFYGKHCADAGANGFVSKKEGMNNIIAAIEAAKNGYCYF 123
L ++K + ++++SA+N + AI+A++ G +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNT------------------------FMTAIKASEKGAYDY 101

Query: 124 ---PFSLNRFVGSLTSDQQKLDSLSKQE 148
PF L + + L ++
Sbjct: 102 LPKPFDLTE---LIGIIGRALAEPKRRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2906  HTHFIS  79  2e-17  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.5 bits (196), Expect = 2e-17
Identities = 30/105 (28%), Positives = 51/105 (48%)

Query: 960 SILIADDHPTNRLLLKRQLNLLGYDVDEATDGVQALHKVSMQHYDLLITDVNMPNMDGFE 1019
+IL+ADD R +L + L+ GYDV ++ ++ DL++TDV MP+ + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 1020 LTRKLREQNSSLPIWGLTANAQANEREKGLNCGMNLCLFKPLTLD 1064
L ++++ LP+ ++A K G L KP L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


39  c2952  c2960  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c2952     -2      21       3.040632    PTS system, glucose-specific IIA component
c2953     -1      25       4.106188    Pyridoxine kinase
c2954     -1      26       4.193018    Hypothetical protein yfeK precursor
c2955     -1      25       4.161041    Cysteine synthase B
c2956     -1      24       3.814171    Sulfate transport ATP-binding protein cysA
c2957     -1      22       3.605513    Putative conserved protein
c2958     0       19       3.027303    Sulfate transport system permease protein cysT
c2959     -2      17       2.754686    Thiosulfate-binding protein precursor
c2960     -2      17       3.072245    Oxidoreductase ucpA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2956  PF05272  34  8e-04  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 34.3 bits (78), Expect = 8e-04
Identities = 11/33 (33%), Positives = 16/33 (48%)

Query: 30 MVALLGPSGSGKTTLLRIIAGLEHQTSGHIRFH 62
V L G G GK+TL+ + GL+ + H
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIG 630


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2960  DHBDHDRGNASE  156  4e-48  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 156 bits (395), Expect = 4e-48
Identities = 95/255 (37%), Positives = 137/255 (53%), Gaps = 4/255 (1%)

Query: 56 LTGKTALITGALQGIGEGIARTFARHGANLILLDISPE-IEKLADELCGRGHRCTAVVAD 114
+ GK A ITGA QGIGE +ART A GA++ +D +PE +EK+ L A AD
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 115 VRDPASVAAAIKRAKEKEGRIDILVNNAGVCRLGSFLDMSDEDRDFHIDINIKGVWNVTK 174
VRD A++ R + + G IDILVN AGV R G +SDE+ + +N GV+N ++
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 175 AVLPEMIARKDGRIVMMSSVTGDMVADPGETAYALTKAAIVGLTKSLAVEYAQSGIRVNA 234
+V M+ R+ G IV + S V AYA +KAA V TK L +E A+ IR N
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAG-VPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 235 ICPGYVRTPMAESIARQSNPEDP--ESVLTEMAKAIPMRRLADPLEVGELAAFLASDESS 292
+ PG T M S+ N + + L IP+++LA P ++ + FL S ++
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAG 244

Query: 293 YLTGTQNVIDGGSTL 307
++T +DGG+TL
Sbjct: 245 HITMHNLCVDGGATL 259


40  c2969  c2983  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c2969     0       15       3.149354    Probable N-acetylmuramoyl-L-alanine amidase amiA
c2970     -2      18       3.924954    Coproporphyrinogen III oxidase, aerobic
c2971     -2      19       4.961366    Ethanolamine operon Regulatory protein
c2972     -2      23       5.357381    Ethanolamine utilization protein eutK precursor
c2973     -2      22       5.582930    Ethanolamine utilization protein eutL
c2974     -1      22       5.873717    Ethanolamine ammonia-lyase light chain
c2975     0       22       6.001603    Ethanolamine ammonia-lyase heavy chain
c2976     1       21       6.289780    Ethanolamine utilization protein eutA
c2977     1       20       5.799968    Ethanolamine utilization protein eutH
c2978     3       20       6.311957    Ethanolamine utilization protein eutG
c2979     2       19       6.165766    Ethanolamine utilization protein eutJ
c2980     2       21       5.477972    Ethanolamine utilization protein eutE
c2981     0       18       4.277189    Ethanolamine utilization protein eutN
c2982     1       18       3.642040    Ethanolamine utilization protein eutM precursor
c2983     2       18       3.259038    Ethanolamine utilization protein eutD
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c2979  SHAPEPROTEIN  51  2e-09  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 50.5 bits (121), Expect = 2e-09
Identities = 33/116 (28%), Positives = 50/116 (43%), Gaps = 9/116 (7%)

Query: 63 VRDGIVWDFFGAVTIVRRHLD-TLEQQFGRRFSHAATSFPPGTDP---RISINVLESAGL 118
++DG++ DFF +++ + F R P G R + AG
Sbjct: 76 MKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGA 135

Query: 119 EVSHVLDEPTAVA---DLLQLDNAG--VVDIGGGTTGIAIVKKGKVTYSADEATGG 169
+++EP A A L + G VVDIGGGTT +A++ V YS+ GG
Sbjct: 136 REVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGG 191


41  c3047  c3053  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c3047     1       18       3.006735    Protein sseB
c3048     0       21       4.123160    Peptidase B
c3049     3       19       2.973587    Hypothetical protein yfhJ
c3050     3       19       2.692724    Ferredoxin, 2Fe-2S
c3051     3       22       2.459086    Chaperone protein hscA
c3052     4       26       0.382545    Chaperone protein hscB
c3053     2       25       0.805407    Protein yfhF
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c3047  STREPKINASE  29  0.022  Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 28.9 bits (64), Expect = 0.022
Identities = 27/120 (22%), Positives = 52/120 (43%), Gaps = 21/120 (17%)

Query: 130 GNSLSSQEVLEGGESLILSE-----VAEPPVQMIDSLTTLFKTIKPVKRAFICLIKESEE 184
G++++SQE+L +S++ + E ++ +F+TI P+ + F +K E+
Sbjct: 217 GDTITSQELLAQAQSILNKNHPGYTIYERDSSIVTHDNDIFRTILPMDQEFTYRVKNREQ 276

Query: 185 A-QPNLLIGIEADGDIEEIIQAAGSVATDTLPGDEPIDICQVKKGEKGISHFITEHIAPF 243
A + N G+ + + ++I V +KKGEK F H+ F
Sbjct: 277 AYRINKKSGLNEEINNTDLISEKYYV---------------LKKGEKPYDPFDRSHLKLF 321


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c3051  SHAPEPROTEIN  115  3e-30  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 115 bits (289), Expect = 3e-30
Identities = 81/371 (21%), Positives = 144/371 (38%), Gaps = 74/371 (19%)

Query: 41 GIDLGTTNSLVATVRSGQAETLADHEGRHLLPSVVHYQQQGHS-------VGYDARTNAA 93
IDLGT N+L+ G + +E PSVV +Q VG+DA+
Sbjct: 14 SIDLGTANTLIYVKGQG----IVLNE-----PSVVAIRQDRAGSPKSVAAVGHDAK-QML 63

Query: 94 LDTANTISSVKRLMGRSLADIQQRYPHLPYQFQASENGLPMIETAAGLLNPVRVSADILK 153
T I++++ + +AD V+ +L+
Sbjct: 64 GRTPGNIAAIRPMKDGVIADF-------------------------------FVTEKMLQ 92

Query: 154 ALAARATEALAGE-LDGVVITVPAYFDDAQRQGTKDAARLAGLHVLRLLNEPTAAAIAYG 212
+ V++ VP +R+ +++A+ AG + L+ EP AAAI G
Sbjct: 93 HFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAG 152

Query: 213 LDSGQEGVIAVYDLGGGTFDISILRLSRGVFEVLATGGDSALGGDDFDHLLADYIREQAG 272
L + V D+GGGT +++++ L+ V +GGD FD + +Y+R G
Sbjct: 153 LPVSEATGSMVVDIGGGTTEVAVISLNGVV-----YSSSVRIGGDRFDEAIINYVRRNYG 207

Query: 273 --IPDRSDNRVQRELLDAAIAAKIALSDADSVTVNVAG---WQG-----EISREQFNELI 322
I + + R++ E+ A + + V G +G ++ + E +
Sbjct: 208 SLIGEATAERIKHEI-------GSAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEAL 260

Query: 323 APLVKRTLLACRRALKDAGVE-ADEVLE--VVVVGGSTRVPLVRERVGEFFGRPPLTSID 379
+ + A AL+ E A ++ E +V+ GG + + + E G P + + D
Sbjct: 261 QEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLLMEETGIPVVVAED 320

Query: 380 PDKVVAIGAAI 390
P VA G
Sbjct: 321 PLTCVARGGGK 331


42  c3134  c3225  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c3134     2       15       -0.987045   Hypothetical protein
c3135     2       16       -0.769533   GrpE protein
c3136     3       18       -1.324892   Hypothetical protein
c3137     2       15       -0.858786   Probable inorganic polyphosphate/ATP-NAD kinase
c3138     2       20       -3.391512   DNA repair protein recN
c3139     2       26       -5.970615   Small protein A
c3140     2       28       -7.544807   Protein yfjF
c3141     2       30       -8.007796   Hypothetical protein yfjG
c3142     1       31       -7.306132   SsrA-binding protein
c3143     2       36       -8.299220   Hypothetical protein
c3144     0       31       -6.944207   DNA-damage-inducible protein I
c3145     -1      30       -6.047163   Hypothetical protein ydfK
c3146     0       29       -4.666855   Putative DNA-invertase from lambdoid prophage
c3147     0       26       -4.469883   Hypothetical protein
c3148     -1      24       -4.122437   Hypothetical protein
c3149     -2      18       0.296742    Hypothetical protein
c3150     -2      18       1.144089    Hypothetical protein
c3151     1       23       4.115225    Hypothetical protein
c3152     2       24       4.142452    Hypothetical protein
c3153     3       24       4.856888    Putative outer membrane protein of prophage
c3154     2       25       5.166657    Putative tail component of prophage
c3155     1       25       6.235554    Putative tail component of prophage
c3156     1       26       6.094887    Putative tail fiber component K of prophage
c3157     1       25       5.669949    Hypothetical protein
c3158     2       27       5.986828    Putative tail component of prophage
c3159     2       27       5.943055    Putative tail component of prophage
c3160     1       26       5.734849    Putative tail component of prophage
c3161     2       26       4.662735    Putative tail component of prophage
c3162     3       25       4.831768    Putative tail component of prophage
c3163     2       23       5.112332    Putative tail component of prophage
c3164     2       26       5.051106    Putative tail component of prophage
c3165     2       25       6.806693    Putative tail fiber component Z of prophage
c3166     4       25       7.127727    Putative head-tail joining protein of prophage
c3167     4       26       6.969469    Putative DNA packaging protein of prophage
c3168     1       24       5.835102    Putative capsid protein of prophage
c3169     1       24       5.548911    Putative head-DNA stabilization protein of
c3170     1       24       4.982252    Putative capsid protein of prophage
c3171     0       23       2.443896    Putative capsid structural protein of prophage
c3172     2       21       -0.308154   Putative head-tail joining protein of prophage
c3173     3       19       -1.279182   Putative DNA packaging protein of prophage
c3174     4       26       -5.746243   Prophage QSR' DNA packaging protein NU1 homolog
c3175     2       26       -5.769861   Hypothetical protein
c3176     2       29       -7.018380   GnsB protein
c3177     2       29       -5.532549   Cold shock-like protein cspI
c3178     1       31       -5.934106   Hypothetical protein
c3179     1       31       -6.114331   Hypothetical protein ydfP precursor
c3180     2       31       -6.059564   Probable lysozyme from lambdoid prophage Qin
c3181     3       32       -6.616030   Unknown protein encoded by prophage
c3182     2       29       -2.757978   Lysis protein S homolog from lambdoid prophage
c3183     1       23       -2.089734   Hypothetical protein
c3184     2       23       -0.505259   Cold shock-like protein cspB
c3185     2       22       0.955982    Cold shock-like protein cspF
c3186     3       23       1.463259    Antitermination protein Q homolog from lambdoid
c3187     1       21       2.391806    Hypothetical protein ydfU
c3188     0       22       1.481631    Hypothetical protein
c3189     1       22       1.881641    Hypothetical protein
c3190     2       22       0.443050    Hypothetical protein
c3191     2       21       -0.734302   Hypothetical protein yfdN
c3192     1       22       -3.110263   Unknown protein encoded by cryptic prophage
c3193     4       32       -6.060533   Hypothetical protein
c3194     4       32       -4.530291   Hypothetical protein
c3195     4       32       -3.089946   Hypothetical protein ymfL
c3196     4       28       -3.721136   Hypothetical protein
c3197     3       25       -3.703466   Putative repressor protein of prophage
c3198     1       28       -7.217167   Hypothetical protein
c3199     0       28       -7.249000   Hypothetical protein
c3200     0       31       -8.670634   Hypothetical protein
c3201     2       34       -10.910923  Hypothetical protein yfdQ
c3202     0       26       -8.930318   Hypothetical protein yfdR
c3203     -1      19       -5.839663   Hypothetical protein
c3204     -1      11       -0.712865   Hypothetical protein
c3205     1       17       2.312388    Hypothetical protein
c3206     1       18       2.575055    Hypothetical protein
c3207     1       22       4.171538    Hypothetical protein ygaT
c3208     1       20       3.756864    Hypothetical protein ygaF
c3209     1       18       3.151860    Succinate-semialdehyde dehydrogenase (NADP+)
c3210     0       18       2.313738    4-aminobutyrate aminotransferase
c3211     1       19       0.061966    GABA permease
c3212     -1      21       -1.410381   Hypothetical transcriptional regulator ygaE
c3213     0       20       -2.894824   Unknown protein from 2D-page
c3214     1       27       -3.468323   Hypothetical protein
c3215     3       27       -5.328045   Hypothetical protein yqaE
c3216     0       25       -5.123839   Hypothetical transcriptional regulator ygaV
c3217     2       27       -5.970726   Hypothetical protein ygaP
c3218     2       23       -4.355928   DNA-binding protein stpA
c3219     -1      23       -2.028973   Hypothetical protein
c3220     -1      20       -1.210484   Hypothetical protein
c3221     -1      18       -0.015368   Hypothetical protein ygaW
c3222     0       17       0.996815    Hypothetical protein ygaC
c3223     0       10       1.320312    Hypothetical protein ygaM
c3224     -1      11       0.501014    Conserved hypothetical protein
c3225     2       12       -0.132001   Conserved hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c3139  BLACTAMASEA  26  0.032  Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 26.3 bits (58), Expect = 0.032
Identities = 23/87 (26%), Positives = 36/87 (41%), Gaps = 11/87 (12%)

Query: 4 KTLTAAAAVLLMLTAGCSTLERVVYRPDINQGNYLTANDVSKIRV--GMTQQQVAYALGT 61
K + AVL + AG LER ++ Q + + + VS+ + GMT ++ A
Sbjct: 69 KVV-LCGAVLARVDAGDEQLERKIH---YRQQDLVDYSPVSEKHLADGMTVGELCAA--A 122

Query: 62 PLMSDPFGTNTWFYVFRQQPGHEGVTQ 88
MSD N + G G+T
Sbjct: 123 ITMSDNSAANL---LLATVGGPAGLTA 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c3150  IGASERPTASE  46  5e-07  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 46.2 bits (109), Expect = 5e-07
Identities = 33/165 (20%), Positives = 54/165 (32%), Gaps = 15/165 (9%)

Query: 219 ETAAKNSQVAAAQSESAAAGSA--TSATGSATAAANSQKAAKTSETNAKSSQTAAKTSET 276
E +N V + A S + A +A A S+T +E
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 277 NAKASETAAKNSQDA------------AAQSESAAAGSASAAASSASASANSQKAAKTSE 324
+ + S+T KN QDA A+S A + A S S + +Q
Sbjct: 1044 SKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKET 1103

Query: 325 TNAKASETAAANSAKASAASQTAAKASEDAAREYASQ-AAEPYKQ 368
+ E A + K + ++ S + Q AEP ++
Sbjct: 1104 ATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARE 1148



Score = 44.7 bits (105), Expect = 2e-06
Identities = 32/194 (16%), Positives = 65/194 (33%), Gaps = 11/194 (5%)

Query: 111 PEALRRFEEMVEEAARNAEAASQSAAAAKKSETAAASSKNAAKTSETNAANSAQAAATSQ 170
+ + E+ E + E ++ + K E + A E + + + +
Sbjct: 1103 TATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQT 1162

Query: 171 TASENSATAAKKSETNAKNSETAAKTSETNAK-----SSQTAAKTSETNAKASETAAKNS 225
+ ++ AK++ +N + T + T T + T A T T S KN
Sbjct: 1163 NTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNR 1222

Query: 226 QVAAAQSESAAAGSATSATGSATAAANSQKAAKTSETNAKSSQTAAKTS----ETNAKAS 281
+ +S AT+++ + A ++ TNA S AK S
Sbjct: 1223 HRRSVRSVPHNVEPATTSSNDRSTVA--LCDLTSTNTNAVLSDARAKAQFVALNVGKAVS 1280

Query: 282 ETAAKNSQDAAAQS 295
+ ++ + Q
Sbjct: 1281 QHISQLEMNNEGQY 1294


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c3153  ENTEROVIROMP  147  2e-47  Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 147 bits (372), Expect = 2e-47
Identities = 65/200 (32%), Positives = 101/200 (50%), Gaps = 30/200 (15%)

Query: 1 MRKLCAVILSAVVWLVAAGTPASAAEHQSTLSAGYLQTHTDMPGSDDLKGINVKYRYEFT 60
M+K+ + A V AGT +A ST++ GY Q+ + + G N+KYRYE
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAA---TSTVTGGYAQSDAQGQMNK-MGGFNLKYRYEED 56

Query: 61 DT-LGLVTSFSYANAKDEQKTHYSDTRWHEDSVRNRWFSMMAGPSVRVNEWFSAYAMAGM 119
++ LG++ SF+Y T S T D +N+++ + AGP+ R+N+W S Y + G+
Sbjct: 57 NSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 AYSRVSTFSGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVAIDLAYE 179
Y + T T+ HD S+ ++GAG+QFNP E+VA+D +YE
Sbjct: 109 GYGKFQT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYE 151

Query: 180 GSGSGDWRTDGFIVGVGYKF 199
S +I GVGY+F
Sbjct: 152 QSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c3155  PF06291  28  0.012  Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 28.1 bits (62), Expect = 0.012
Identities = 13/40 (32%), Positives = 19/40 (47%), Gaps = 5/40 (12%)

Query: 122 MTGILFSLGASMVLGGVAQML-----APKARTPRTQTTDN 156
M +LFS +M++ G AQ P A TP+ T +
Sbjct: 6 MKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHH 45


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c3160  GPOSANCHOR  41  2e-05  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 40.8 bits (95), Expect = 2e-05
Identities = 57/377 (15%), Positives = 125/377 (33%), Gaps = 36/377 (9%)

Query: 236 SGLTAMARQFHNVTAEQIAYVAQLQRSGDEAGALQAANEAATKGFDDQTRRLKENMGTLE 295
S R+ +E+ + + +L+ + + + + L+ L
Sbjct: 95 SNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALA 154

Query: 296 TWADRTARAFKSMWDAVLDI-GRPDTAQEMLIKAEAAFKKADDIWNLRKDDYFVNDEARA 354
+A + + + T + EA + + + +
Sbjct: 155 ARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIK 214

Query: 355 RYWDDREKARLALKAARK-KAEQQSQQDKNAQQQSDTEASRLKYTEEAQKAYKRLQTPLE 413
++ K + ++ + EA + + K L+ +
Sbjct: 215 TLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMN 274

Query: 414 KYTARQEELNKALKDGKILQADYNTLMAAAKKDYEATLKKPKQSGVKVSAGDRQEDSAHA 473
TA ++ + L+A+ L + A + + R D++
Sbjct: 275 FSTADSAKIKTLEAEKAALEAEKADLEHQ-SQVLNANRQSLR----------RDLDASRE 323

Query: 474 ALLTLQAELRTLEKHAGANEKISQQ-RRDL-------WKAESQFAVLEEAAQRRQLSAQE 525
A L+AE + LE+ +E Q RRDL + E++ LEE + + S Q
Sbjct: 324 AKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQS 383

Query: 526 KS--LLAHKDETLEYKRQLAALGDKVTYQERLNALAQQADKFAQQQRAKRAAIEAKSRGL 583
L A ++ + ++ L K+ E+LN +++ K ++++A
Sbjct: 384 LRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEESKKLTEKEKA------------ 431

Query: 584 TDRQAEREATEQRLKEQ 600
+ QA+ EA + LKE+
Sbjct: 432 -ELQAKLEAEAKALKEK 447


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c3163  INTIMIN  30  0.011  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 30.0 bits (67), Expect = 0.011
Identities = 32/202 (15%), Positives = 61/202 (30%), Gaps = 29/202 (14%)

Query: 76 DWTATGQGQKSAGDTSFT----LAWMPGEQGQQALLAWFNEGDTRAYKIRFPNGTVDVFR 131
G G+ + S + + AL + A I +
Sbjct: 611 SANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSAL-------NANAV-IFVDQTKASITE 662

Query: 132 GWVSSIGKAVTAKEVITRTVKVTNVGRPSMAEDRSTVTAATGMTVTPASTSVVKGQSTTL 191
++ IT TVKV +P ++ + T ++ + T TL
Sbjct: 663 IKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTL 722

Query: 192 T---------------VAFQPEGATDKSFRAVSADKTKATVSVSGMTITVKG--VAAGKV 234
T VA + + F ++ D + +G+ + + G+V
Sbjct: 723 TSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQV 782

Query: 235 NIPVVSGNGEFAAVAEINVTAS 256
N+ GNG++ + AS
Sbjct: 783 NLKASGGNGKYTWRSANPAIAS 804


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c3174  PF04183  29  0.007  IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 29.5 bits (66), Expect = 0.007
Identities = 15/91 (16%), Positives = 33/91 (36%), Gaps = 17/91 (18%)

Query: 76 DLHPGTLEFERHRLTRAQATAQELKN--AKESAEVVETAFCTFVLSRIAREISSILD--G 131
D G + + + QE+++ ++ SA+ + T + R IS ++ G
Sbjct: 442 DFQ-GDMRLVKEEFPEMDSLPQEVRDVTSRLSADYLIHDLQTGHFVTVLRFISPLMVRLG 500

Query: 132 IP------------LSVQRRFPELDNRHIDF 150
+P ++ P++ R F
Sbjct: 501 VPERRFYQLLAAVLSDYMKKHPQMSERFALF 531


43  c3261  c3285  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c3261     0       13       3.430703    Glucitol operon repressor
c3262     1       13       3.873122    GutQ protein
c3263     1       13       3.390006    Hypothetical sigma-54-dependent transcriptional
c3264     0       13       2.259114    Hypothetical protein
c3267     1       15       2.796464    FlRd-NAD(+) reductase
c3268     0       14       1.974466    Hydrogenase maturation protein hypF
c3269     0       21       -5.059286   Electron transport protein hydN
c3270     2       29       -7.604867   Hypothetical protein
c3271     0       24       -5.938141   Hypothetical protein
c3272     -2      20       -3.545514   Hypothetical protein
c3273     -3      15       -2.667721   Hypothetical protein ygjM
c3274     -1      16       -0.022531   Hypothetical protein
c3275     -1      18       1.634458    Hypothetical protein
c3276     -1      24       3.390053    Hypothetical protein
c3277     -1      30       4.961049    Hydrogenase 3 maturation protease
c3278     -1      28       5.525468    Formate hydrogenlyase maturation protein hycH
c3279     -1      28       5.644557    Formate hydrogenlyase subunit 7
c3280     0       27       5.011481    Formate hydrogenlyase subunit 6
c3281     0       25       4.867707    Formate hydrogenlyase subunit 5 precursor
c3282     1       23       4.706982    Formate hydrogenlyase subunit 4
c3283     2       22       4.144229    Formate hydrogenlyase subunit 3
c3284     2       20       2.889352    Formate hydrogenlyase subunit 2
c3285     1       20       3.411544    Formate hydrogenlyase Regulatory protein hycA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c3261  ARGREPRESSOR  28  0.024  Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 27.9 bits (62), Expect = 0.024
Identities = 10/45 (22%), Positives = 18/45 (40%), Gaps = 5/45 (11%)

Query: 1 MKPRQRQAAILEYLQKQGKCSVEEL-----AQYFDTTGTTIRKDL 40
M QR I E + + +EL ++ T T+ +D+
Sbjct: 1 MNKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDI 45


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c3263  HTHFIS  377  e-128  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 377 bits (969), Expect = e-128
Identities = 125/388 (32%), Positives = 194/388 (50%), Gaps = 33/388 (8%)

Query: 174 IAALAAGALS----------NALLIEQLESQNMLPGEAAPFEAVKQTQMIGLSPGMTQLK 223
I A GA +I + ++ ++ ++G S M ++
Sbjct: 91 IKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIY 150

Query: 224 KEIEIVAASDLNVLISGETGTGKELVAKAIHEASPRAVNPLVYLNCAALPESVAESELFG 283
+ + + +DL ++I+GE+GTGKELVA+A+H+ R P V +N AA+P + ESELFG
Sbjct: 151 RVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFG 210

Query: 284 HVKGAFTGAISNRSGKFEMADNGTLFLDEIGELSLALQAKLLRVLQYGDIQRVGDDRSLR 343
H KGAFTGA + +G+FE A+ GTLFLDEIG++ + Q +LLRVLQ G+ VG +R
Sbjct: 211 HEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIR 270

Query: 344 VDVRVLAATNRDLREEVLAGRFRADLFHRLSVFPLSVPPLRERGDDVILLAGYFCEQCRL 403
DVR++AATN+DL++ + G FR DL++RL+V PL +PPLR+R +D+ L +F +Q
Sbjct: 271 SDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAE- 329

Query: 404 RLGLSRVVLSAGARNLLQHYNFPGNVRELEHAIHRAVVLARATRSGDEVIL-----EAQH 458
+ GL A L++ + +PGNVRELE+ + R L E+I E
Sbjct: 330 KEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPD 389

Query: 459 FAFPEVTLPPPEVAAVPVVKQNLR-----------------EATEAFQRETIRQALAQNH 501
+ ++ V++N+R + I AL
Sbjct: 390 SPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATR 449

Query: 502 HNWAACARMLETDVANLHRLAKRLGLKD 529
N A +L + L + + LG+
Sbjct: 450 GNQIKAADLLGLNRNTLRKKIRELGVSV 477


44  c3299  c3310  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c3299     0       20       -3.510667   Hypothetical aldolase class II protein ygbL
c3300     -2      20       -3.627126   Hypothetical protein ygbM
c3301     -3      18       -3.856916   Hypothetical permease ygbN
c3302     -2      19       -5.058928   Hypothetical protein
c3303     -3      16       -0.716136   Hypothetical protein
c3304     -2      16       0.398620    Hypothetical protein
c3305     -2      16       1.085752    Hypothetical protein
c3306     -1      16       1.533427    Hypothetical protein
c3307     0       15       1.296468    Putative conserved protein
c3308     1       15       1.631012    Lipoprotein nlpD precursor
c3309     2       18       2.319390    Hypothetical protein
c3310     2       17       2.160383    Protein-L-isoaspartate O-methyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c3308  RTXTOXIND  30  0.019  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.8 bits (67), Expect = 0.019
Identities = 16/84 (19%), Positives = 36/84 (42%), Gaps = 12/84 (14%)

Query: 292 IIATADGRVVYAGNALRGYGNLIIIKHNDDYLSAYAHNDTMLVREQQEVKAGQKIATMGS 351
I+ATA+G++ ++G + IK ++ ++V+E + V+ G + + +
Sbjct: 82 IVATANGKLTHSGRSK-------EIKP---IENSIV--KEIIVKEGESVRKGDVLLKLTA 129

Query: 352 TGTSSTRLHFEIRYKGKSVNPLRY 375
G + L + + RY
Sbjct: 130 LGAEADTLKTQSSLLQARLEQTRY 153


45  c3321  c3339  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c3321     -1      17       3.462866    Phosphoadenosine phosphosulfate reductase
c3322     -1      15       3.893257    Sulfite reductase [NADPH] hemoprotein
c3323     0       13       3.823075    Sulfite reductase [NADPH] flavoprotein
c3324     0       16       2.933779    Putative 6-pyruvoyl tetrahydrobiopterin
c3325     -1      13       2.592785    Probable electron transfer flavoprotein-quinone
c3326     -1      10       0.856963    Ferredoxin-like protein ygcO
c3328     0       10       0.085679    Putative electron transfer flavoprotein subunit
c3329     0       11       -0.796854   Putative electron transfer flavoprotein subunit
c3330     -1      11       -1.367687   Hypothetical metabolite transport protein ygcS
c3331     -1      11       -2.504304   Putative conserved protein
c3332     1       15       -3.988771   Hypothetical oxidoreductase ygcW
c3333     1       16       -3.651540   Hypothetical protein yqcE
c3334     -1      19       -4.038224   Hypothetical sugar kinase ygcE
c3335     -1      26       -5.220526   Hypothetical protein ygcF
c3336     -1      29       -6.025933   Hypothetical protein
c3337     0       28       -6.008621   Hypothetical protein
c3338     0       26       -4.085836   Hypothetical protein
c3339     2       25       -3.158739   Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c3322  PF07675  30  0.021  Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 30.4 bits (68), Expect = 0.021
Identities = 20/92 (21%), Positives = 39/92 (42%), Gaps = 12/92 (13%)

Query: 206 ILGQTYLPRKFKTTVVIP---PQND--IDLHANDMNFVAIAENGKLVGFNLLVGGGLSIE 260
++ +P+ T +P PQN + A+ ++VAI+++G L G + G++
Sbjct: 240 VMPYRAMPKT--NTYTLPASLPQNQASYSIQASAGSYVAISKDGVLYGTGVANASGVATV 297

Query: 261 HGNK-----KTYARTASEFGYLPLEHTLAVAE 287
+ K Y + YLP+ + E
Sbjct: 298 NMTKQITENGNYDVVITRSNYLPVIKQIQAGE 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c3330  TCRTETB  36  3e-04  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 36.0 bits (83), Expect = 3e-04
Identities = 53/338 (15%), Positives = 123/338 (36%), Gaps = 34/338 (10%)

Query: 93 LGSLVLGWISDHIGRQKIFTFSFMLITLASFLQFFATTP-EHLIGLRILIGIGLGGDYSV 151
+G+ V G +SD +G +++ F ++ S + F + LI R + G G ++
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPAL 123

Query: 152 GHTLLAEFSPRRHRGILLGAFSVVWT----VGYVLASIAGHHFISESPEAWRWLLASAAL 207
++A + P+ +RG G + VG + + H+ W +LL +
Sbjct: 124 VMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI------HWSYLLLIPMI 177

Query: 208 PALLITLLRWGTPESPRWLLRQGRFAEAHAIVHRYFGPHVLLGDEVATATHKHIKTLF-- 265
+ + L + R +G F I+ +L + + + L
Sbjct: 178 TIITVPFLMKLLKKEVR---IKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFL 234

Query: 266 -SSRYWRRTA--------FNSVFFVCLVIPWFVIYT----WLPTIAQTIGLEDALTASLM 312
++ R+ ++ F+ V+ +I+ ++ + + L+ + +
Sbjct: 235 IFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEI 294

Query: 313 LNALLIVGALLGLVLTHLLAHRRFLLGSFLLLTATLVVMACLPSGSSLTLLLFVLFSTTI 372
+ ++ G + ++ ++ G +L + ++ S F+L +T+
Sbjct: 295 GSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLS-----VSFLTASFLLETTSW 349

Query: 373 SAVSNLVGILPAESFPTDIRSLGVGFATAMSRLGAAVS 410
+V +L SF + S V + GA +S
Sbjct: 350 FMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMS 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c3332  DHBDHDRGNASE  107  1e-29  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 107 bits (267), Expect = 1e-29
Identities = 73/257 (28%), Positives = 117/257 (45%), Gaps = 11/257 (4%)

Query: 36 MDFFSLKGKTAIVTGGNSGLGQAFAMALAKAGANVFIPSFVKDNGETKEMIEK-QGVEVD 94
M+ ++GK A +TG G+G+A A LA GA++ + + E K + +
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 95 FMQVDITAEGAPQKIIAACCERFGTVDILVNNAGICKLNKVLDFGRADWDPMIDVNLTAA 154
D+ A +I A G +DILVN AG+ + + +W+ VN T
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 155 FELSYEAAKIMIPQKSGKIINICSLFSYLGGQWSPAYSATKHALAGFTKAYCDELGQYNI 214
F S +K M+ ++SG I+ + S + + AY+++K A FTK EL +YNI
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 215 QVNGIAPGYYATDI--TLATRSNPETNQRVLDH-------IPANRWGDTQDLMGAAVFLA 265
+ N ++PG TD+ +L N Q + IP + D+ A +FL
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAE-QVIKGSLETFKTGIPLKKLAKPSDIADAVLFLV 239

Query: 266 SPASNYVNGHLLVVDGG 282
S + ++ H L VDGG
Sbjct: 240 SGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c3333  TCRTETA  29  0.036  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 29.0 bits (65), Expect = 0.036
Identities = 22/103 (21%), Positives = 45/103 (43%), Gaps = 8/103 (7%)

Query: 48 GLIMSTFGIAAIILYAPSGVIADKFSHRKMITSAMIITGLLGLIMATYPPLWVMLCIQVA 107
G++++ + + G ++D+F R ++ ++ + IMAT P LWV+ ++
Sbjct: 46 GILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIV 105

Query: 108 FAITTILMLWSVSIKAASLLGD---HSEQGKIMGWMEGLRGVG 147
IT + A + + D E+ + G+M G G
Sbjct: 106 AGITG-----ATGAVAGAYIADITDGDERARHFGFMSACFGFG 143


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c3337  56KDTSANTIGN  28  0.041  Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.

Length = 533

Score = 27.6 bits (61), Expect = 0.041
Identities = 17/74 (22%), Positives = 29/74 (39%), Gaps = 12/74 (16%)

Query: 30 NASWSEVLNQYQRRTDLIPNLVASIKGYSSHEQEVLEAVTLARSQANRASSDLQKTPGDE 89
+AS ++ ++ Q D + L S GY + + N+ + P +
Sbjct: 294 SASIEQIQSKIQELGDTLEELRDSFDGY------------INNAFVNQIHLNFVMPPQAQ 341

Query: 90 QKLQAWQQAQAQVT 103
Q+ QQ QAQ T
Sbjct: 342 QQQGQGQQQQAQAT 355


46  c3384  c3407  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c3384     -3      18       4.045828    Membrane-bound lytic murein transglycosylase A
c3385     -1      20       4.530426    ***Hypothetical protein
c3386     0       23       5.098565    Conserved hypothetical protein
c3387     2       26       7.523245    Hypothetical protein
c3388     0       25       6.551767    Hypothetical protein
c3389     -1      23       5.352857    Hypothetical protein
c3390     -1      17       3.079103    Hypothetical protein
c3391     -1      18       3.168326    Secreted protein Hcp
c3392     0       16       2.888451    ClpB protein
c3393     1       16       3.149758    Hypothetical protein
c3394     0       18       4.240698    Hypothetical protein
c3395     -1      19       5.466399    Hypothetical protein
c3396     -1      22       6.421044    Hypothetical protein
c3397     -1      22       6.458530    Hypothetical protein
c3398     -1      24       6.961966    Hypothetical protein
c3399     -2      23       6.112430    Hypothetical protein
c3400     -2      18       2.877888    Hypothetical protein
c3401     0       20       -1.635565   Hypothetical protein
c3402     2       25       -4.790229   Hypothetical protein
c3403     1       26       -5.882744   Hypothetical protein
c3404     1       32       -10.198976  Hypothetical protein
c3405     1       32       -10.778849  2-hydroxyacid dehydrogenase
c3406     -1      24       -8.583837   Phosphosugar isomerase
c3407     -1      19       -6.576897   Beta-cystathionase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3389     OMPADOMAIN    81     1e-18    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 80.7 bits (199), Expect = 1e-18
Identities = 44/142 (30%), Positives = 63/142 (44%), Gaps = 14/142 (9%)

Query: 415 PEQKMEVTASLQVQTVRLDSMSLFDVGQARLKDGSTKVL---VDALVNIRAKPGWLILVA 471
+Q + L S LF+ +A LK L L N+ K G ++V
Sbjct: 200 VAPAPAPAPEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDG-SVVVL 258

Query: 472 GYTDATGDEKSNQQLSLRRAEAVRNWMLQTSDIPATCFAVQGLGESQPAATNDTPQGR-- 529
GYTD G + NQ LS RRA++V ++ L + IPA + +G+GES P N +
Sbjct: 259 GYTDRIGSDAYNQGLSERRAQSVVDY-LISKGIPADKISARGMGESNPVTGNTCDNVKQR 317

Query: 530 -------AVNRRVEISLVPRSD 544
A +RRVEI + D
Sbjct: 318 AALIDCLAPDRRVEIEVKGIKD 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3392     HTHFIS        36     7e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 36.0 bits (83), Expect = 7e-04
Identities = 36/190 (18%), Positives = 66/190 (34%), Gaps = 36/190 (18%)

Query: 512 IMTLRQEGTDSTELQQQLRTHQGFAPLLALDVDARAVATVVADWTGIPLSSLLK------ 565
+ + ++ +L +++ + P+L + + + A G L K
Sbjct: 52 VTDVVMPDENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGA-YDYLPKPFDLTE 110

Query: 566 --DEQSDLLSMEQSLENR----------VVGQSPALCAIAQRL-RAAKTGLTPENGPQGV 612
L+ + ++ +VG+S A+ I + L R +T LT
Sbjct: 111 LIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLT-------- 162

Query: 613 FLLTGPSGTGKTETALTLADTLFGGEKSLITINLSEYQEPHTVSQLKGSPPGYVGYGQGG 672
++TG SGTGK A L D + IN++ S+L G + G
Sbjct: 163 LMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGH--------EKG 214

Query: 673 VLTEAVRKRP 682
T A +
Sbjct: 215 AFTGAQTRST 224


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3401     ANTHRAXTOXNA  29     0.010    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 29.3 bits (65), Expect = 0.010
Identities = 13/83 (15%), Positives = 35/83 (42%), Gaps = 9/83 (10%)

Query: 33 ESKSVASAVFYKQIKILHLDFFSR---------SALNTDAEDTPLSTMVHVWQLKTREDF 83
+ + V+Y+ K + LD S+ + + + ++D+ S ++ + K + +
Sbjct: 161 INSEQSKEVYYEIGKGISLDIISKDKSLDPEFLNLIKSLSDDSDSSDLLFSQKFKEKLEL 220

Query: 84 DKADYDTLFMQEEKTLEKDVLAK 106
+ D F++E T + +
Sbjct: 221 NNKSIDINFIKENLTEFQHAFSL 243


47  c3554  c3696  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c3554     2       21       -3.122691   Hypothetical protein
c3555     3       22       -3.123000   Hypothetical protein yqgA
c3556     4       25       -2.991215   *Prophage P4 integrase
c3557     4       22       -1.844491   ShiA homolog
c3558     3       24       2.586244    Hypothetical protein
c3559     4       24       1.018871    Hypothetical protein
c3560     3       25       0.867828    Unknown protein encoded by ISEc8 within
c3561     4       24       0.480612    Unknown protein encoded by ISEc8
c3562     4       26       -1.509716   Hypothetical protein
c3563     5       25       -2.881033   Unknown protein encoded by ISEc8 within
c3564     7       33       -8.025412   Hypothetical protein
c3565     6       36       -10.154830  Putative response regulator
c3566     6       36       -10.842081  Hypothetical protein
c3567     7       39       -11.434840  Hypothetical protein
c3568     6       40       -11.653960  Hypothetical protein
c3569     6       41       -11.633538  Hemolysin C
c3570     5       38       -10.051226  Hemolysin A
c3571     4       35       -9.499168   Hypothetical protein
c3572     4       34       -8.768794   Hypothetical protein
c3573     4       31       -7.416628   Hemolysin B
c3574     4       31       -3.610873   Hemolysin D
c3575     5       30       2.010472    Transposase insF for insertion sequence
c3576     5       31       3.279951    Unknown in IS
c3577     4       30       0.424821    Unknown protein encoded by ISEc8 within
c3578     4       30       -2.568138   Unknown protein encoded by ISEc8 within
c3579     5       36       -4.348754   Hypothetical protein
c3580     7       38       -7.227493   Hypothetical protein
c3581     7       39       -7.074572   Hypothetical protein
c3582     7       38       -5.967696   PapX protein
c3583     7       39       -3.470348   PapG protein
c3584     7       39       -2.560297   PapF protein
c3585     5       33       -0.463753   PapE protein
c3586     5       31       0.518206    PapK protein
c3587     5       30       -0.808310   Hypothetical protein
c3588     5       30       -1.398392   PapJ protein
c3589     4       26       -1.723957   PapD protein
c3590     4       25       -0.756002   PapC protein
c3591     3       29       -2.245794   PapH protein
c3592     2       24       -1.388651   PapA protein
c3593     0       24       0.803130    PapI protein
c3594     1       24       0.212612    Putative Transposase
c3595     1       29       -2.517469   Transposase
c3596     3       31       -5.518257   Hypothetical protein in IS
c3597     4       32       -5.877540   Transposase
c3598     5       33       -6.748416   Hypothetical protein
c3599     6       32       -5.484139   Hypothetical protein
c3600     5       29       -4.505182   Hypothetical protein
c3601     6       25       -3.516311   Hypothetical protein
c3602     6       28       0.982825    Hypothetical protein
c3603     6       27       1.241447    Hypothetical protein
c3604     7       27       1.379311    Hypothetical protein
c3605     8       25       0.843404    Hypothetical protein
c3606     6       25       1.275601    Hypothetical protein
c3607     6       25       1.524385    Hypothetical protein
c3608     5       26       1.115782    Hypothetical protein
c3609     6       26       0.995925    Hypothetical protein
c3610     5       23       1.660012    Putative receptor
c3611     4       20       2.441936    Transposase insD for insertion element
c3612     4       19       1.369624    Transposase insC for insertion element
c3613     3       20       -0.166932   Hypothetical protein
c3614     3       26       -5.526973   Hypothetical protein
c3615     3       26       -5.810420   Unknown in ISEc8
c3616     5       32       -8.471547   Unknown in ISEc8
c3617     4       33       -8.964409   Unknown in putative ISEc8
c3618     3       24       -6.169013   Hypothetical protein
c3619     4       23       -5.780543   secreted auto transpoter toxin
c3620     3       16       -0.317528   Hypothetical protein
c3621     2       17       0.665313    Hypothetical protein
c3622     2       16       1.262726    Hypothetical protein
c3623     3       18       1.910240    IutA protein
c3624     3       17       2.269886    IucD protein
c3625     4       19       2.030152    IucC protein
c3626     5       22       1.145911    IucB protein
c3627     6       22       -0.272986   IucA protein
c3628     10      38       -5.100195   shiF protein
c3629     13      48       -10.788882  Hypothetical protein
c3630     13      49       -12.303972  Hypothetical protein yjhS precursor
c3631     14      51       -12.858185  Hypothetical protein
c3632     11      51       -11.883889  Hypothetical protein
c3633     11      52       -11.598861  Hypothetical protein
c3634     10      52       -11.492451  Hypothetical protein yjhT precursor
c3635     9       52       -10.752270  Hypothetical protein yjhA precursor
c3636     8       49       -9.147093   Hypothetical protein
c3637     6       44       -7.991127   Putative sialic acid transporter
c3638     4       32       -4.170644   Hypothetical protein yhcI
c3639     5       24       -1.928275   N-acetylneuraminate lyase subunit
c3640     5       25       4.744889    Unknown in ISEc8
c3641     5       26       5.158968    Unknown in ISEc8
c3642     4       27       5.464809    Hypothetical protein
c3643     4       28       6.013237    Unknown in ISEc8
c3644     2       29       6.230397    Hypothetical protein
c3645     2       29       5.912643    Unknown protein encoded by ISEc8 within
c3646     5       26       2.794271    Hypothetical protein
c3647     3       25       -0.656809   Hypothetical protein
c3648     4       25       -2.200579   Hypothetical protein
c3649     5       26       -1.253150   Haemolysin expression modulating protein
c3650     7       23       2.630154    Hypothetical protein
c3651     7       23       2.656074    Hypothetical protein
c3652     7       23       2.886442    Hypothetical protein yfjI
c3653     7       23       4.925290    Hypothetical protein
c3654     6       24       5.270847    Hypothetical protein yeeP
c3655     6       24       4.890419    Antigen 43 precursor
c3656     3       25       3.252880    Hypothetical protein
c3657     2       25       4.026554    Unknown in ISEc8
c3658     3       26       4.018339    Hypothetical protein
c3659     4       23       2.893756    Unknown protein encoded by ISEc8
c3660     5       23       2.377375    Unknown protein encoded by ISEc8
c3661     5       23       2.342386    Hypothetical protein
c3662     4       22       2.263897    Unknown protein encoded by ISEc8
c3663     5       22       1.892531    Hypothetical protein
c3664     5       23       2.042409    Hypothetical protein yeeR
c3665     6       27       3.843907    Hypothetical protein
c3666     6       28       5.348258    Hypothetical protein ykfF
c3667     7       30       4.718324    Hypothetical protein
c3668     7       29       4.458344    Hypothetical protein yafZ
c3669     8       29       4.463328    Hypothetical protein yfjX
c3670     7       30       4.615838    Hypothetical protein
c3671     7       29       4.430670    Putative radC-like protein yeeS
c3672     6       28       3.869999    Hypothetical protein
c3673     5       29       4.117135    Hypothetical protein yeeT
c3674     5       31       3.827922    Hypothetical protein
c3675     6       26       1.377455    Hypothetical protein
c3676     6       27       -0.160972   Hypothetical protein yeeU
c3677     5       23       -4.101829   Hypothetical protein yeeV
c3678     6       24       -5.408259   Conserved hypothetical protein
c3679     6       25       -6.614174   Conserved hypothetical protein
c3680     6       27       -7.655018   Conserved hypothetical protein
c3681     1       27       -7.201788   Hypothetical protein
c3682     0       19       -5.694207   Hypothetical protein
c3683     -1      11       -2.191035   Hypothetical protein
c3684     -2      12       -1.061453   Hypothetical protein
c3685     -2      9        -0.560494   Hypothetical protein
c3686     -3      10       -1.000462   Hypothetical protein yrbH
c3687     -2      12       -3.246177   KpsE protein
c3688     1       20       -6.062578   KpsD protein
c3689     3       39       -12.574333  3-deoxy-manno-octulosonate cytidylyltransferase
c3690     4       44       -14.724889  KpsC protein
c3691     7       53       -18.445718  KpsS protein
c3692     8       57       -19.934463  Hypothetical protein
c3693     6       52       -18.560177  Hypothetical protein
c3694     5       47       -16.400090  Hypothetical protein
c3695     0       35       -11.452944  Hypothetical protein
c3696     1       22       -4.130354   Putative glycerol-3-phosphate
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3564     PF06580       42     3e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 42.2 bits (99), Expect = 3e-06
Identities = 24/137 (17%), Positives = 54/137 (39%), Gaps = 24/137 (17%)

Query: 360 LSIETRRLQLRIMMSHSLPLIRADISMIERVITNLLDNAVRH----TPPEGSIRLKVWQE 415
L + + + + R+ + + D+ + ++ L++N ++H P G I LK ++
Sbjct: 229 LQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKD 288

Query: 416 DNRLHVEVADSGPGLTEDMRTHLFRRASVLCHEPSEEPRGGLGLLIVRRMLVLHGGD--- 472
+ + +EV ++G + + + G GL VR L + G
Sbjct: 289 NGTVTLEVENTGSLALK-----------------NTKESTGTGLQNVRERLQMLYGTEAQ 331

Query: 473 IRLTDSTTGACFRFFLP 489
I+L++ +P
Sbjct: 332 IKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3565     HTHFIS        90     1e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.5 bits (222), Expect = 1e-22
Identities = 35/129 (27%), Positives = 60/129 (46%)

Query: 7 KILLMEDDYDIAALLRLNLQDEGYQIVHEADGARARLLLDKQTWDAVILDLMLPNVNGLE 66
IL+ +DD I +L L GY + ++ A + D V+ D+++P+ N +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 67 ICRYIRQMTRYLPVIIISARTSETHRVLGLEMGADDYLPKPFSIPELIARIKALFRRQEA 126
+ I++ LPV+++SA+ + + E GA DYLPKPF + ELI I +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 127 MGQNILLAG 135
+
Sbjct: 125 RPSKLEDDS 133


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3569     RTXTOXINC     317    e-114    Gram-negative bacterial RTX toxin-activating protein C...
		>RTXTOXINC#Gram-negative bacterial RTX toxin-activating protein C signature.

Length = 170

Score = 317 bits (815), Expect = e-114
Identities = 161/170 (94%), Positives = 167/170 (98%)

Query: 55 MNMNNPLEVLGHVSWLWASSPLHRNWPVSLFAINVLPAIRANQYALLTRDNYPVAYCSWA 114
MN+N PLE+LGHVSWLWASSPLHRNWPVSLFAINVLPAI+ANQY LLTRD+YPVAYCSWA
Sbjct: 1 MNINKPLEILGHVSWLWASSPLHRNWPVSLFAINVLPAIQANQYVLLTRDDYPVAYCSWA 60

Query: 115 NLSLENEIKYLNDVTSLVAEDWTSGDRKWFIDWIAPFGDNGALYKYMRKKFPDELFRAIR 174
NLSLENEIKYLNDVTSLVAEDWTSGDRKWFIDWIAPFGDNGALYKYMRKKFPDELFRAIR
Sbjct: 61 NLSLENEIKYLNDVTSLVAEDWTSGDRKWFIDWIAPFGDNGALYKYMRKKFPDELFRAIR 120

Query: 175 VDPKTHVGKVSEFHGGKIDKQLANKIFKQYYHELITEVKNKTDFNFSLTG 224
VDPKTHVGKVSEFHGGKIDKQLANKIFKQY+HELITEVK K+DFNFSLTG
Sbjct: 121 VDPKTHVGKVSEFHGGKIDKQLANKIFKQYHHELITEVKRKSDFNFSLTG 170


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3570     RTXTOXINA     1484   0.0      Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 1484 bits (3843), Expect = 0.0
Identities = 988/1024 (96%), Positives = 1001/1024 (97%)

Query: 1 MPTITTAQIKSTLQSAKQSSANKLHSAGQSTKDALKKAAEQTRNAGNRLILLIPKDYKGQ 60
M TITTAQIKSTLQSAKQS+ANKLHSAGQSTKDALKKAAEQTRNAGNRLILLIPKDYKGQ
Sbjct: 1 MTTITTAQIKSTLQSAKQSAANKLHSAGQSTKDALKKAAEQTRNAGNRLILLIPKDYKGQ 60

Query: 61 GSSLNDLVRTADELGIEVQYDEKNGTAITKQVFGTAEKLIGLTERGVTIFAPQLDKLLQK 120
GSSLNDLVRTADELGIEVQYDEKNGTAITKQVFGTAEKLIGLTERGVTIFAPQLDKLLQK
Sbjct: 61 GSSLNDLVRTADELGIEVQYDEKNGTAITKQVFGTAEKLIGLTERGVTIFAPQLDKLLQK 120

Query: 121 YQKAGNKLGGSAENIGDNLGKAGSVLSTFQNFLGTALSSMKIDELIKRQKSGSNVSSSEL 180
YQKAGN LGG AENIGDNLGKAG +LSTFQNFLGTALSSMKIDELIK+QKSG NVSSSEL
Sbjct: 121 YQKAGNILGGGAENIGDNLGKAGGILSTFQNFLGTALSSMKIDELIKKQKSGGNVSSSEL 180

Query: 181 AKASIELINQLVDTAASINNNVNSFSQQLNKLGSVLSNTKHLTGVGNKLQNLPNLDNIGA 240
AKASIELINQLVDT AS+NNNVNSFSQQLN LGSVLSNTKHL GVGNKLQNLPNLDNIGA
Sbjct: 181 AKASIELINQLVDTVASLNNNVNSFSQQLNTLGSVLSNTKHLNGVGNKLQNLPNLDNIGA 240

Query: 241 GLDTVSGILSAISASFILSNADADTGTKAAAGVELTTKVLGNVGKGISQYIIAQRAAQGL 300
GLDTVSGILSAISASFILSNADADT TKAAAGVELTTKVLGNVGKGISQYIIAQRAAQGL
Sbjct: 241 GLDTVSGILSAISASFILSNADADTRTKAAAGVELTTKVLGNVGKGISQYIIAQRAAQGL 300

Query: 301 STSAAAAGLIASVVTLAISPLSFLSIADKFKRANKIEEYSQRFKKLGYDGDSLLAAFHKE 360
STSAAAAGLIAS VTLAISPLSFLSIADKFKRANKIEEYSQRFKKLGYDGDSLLAAFHKE
Sbjct: 301 STSAAAAGLIASAVTLAISPLSFLSIADKFKRANKIEEYSQRFKKLGYDGDSLLAAFHKE 360

Query: 361 TGAIDASLTTISTVLASVSSGISAAATTSLVGAPVSALVGAVTGIISGILEASKQAMFEH 420
TGAIDASLTTISTVLASVSSGISAAATTSLVGAPVSALVGAVTGIISGILEASKQAMFEH
Sbjct: 361 TGAIDASLTTISTVLASVSSGISAAATTSLVGAPVSALVGAVTGIISGILEASKQAMFEH 420

Query: 421 VASKMADVIAEWEKKHGKNYFENGYDARHAAFLEDNFKILSQYNKEYSVERSVLITQQHW 480
VASKMADVIAEWEKKHGKNYFENGYDARHAAFLEDNFKILSQYNKEYSVERSVLITQQHW
Sbjct: 421 VASKMADVIAEWEKKHGKNYFENGYDARHAAFLEDNFKILSQYNKEYSVERSVLITQQHW 480

Query: 481 DTLIGELAGVTRNGDKTLSGKSYIDYYEEGKRLEKKPDEFQKQVFDPLKGNIDLSDSKSS 540
DTLIGELAGVTRNGDKTLSGKSYIDYYEEGKRLEKK DEFQKQVFDPLKGNIDLSDSKSS
Sbjct: 481 DTLIGELAGVTRNGDKTLSGKSYIDYYEEGKRLEKKXDEFQKQVFDPLKGNIDLSDSKSS 540

Query: 541 TLLKFVTPLLTPGEEIRERRQSGKYEYITELLVKGVDKWTVKGVQDKGSVYDYSNLIQHA 600
TLLKFVTPLLTPGEEIRERRQSGKYEYITELLVKGVDKWTVKGVQDKG+VYDYSNLIQHA
Sbjct: 541 TLLKFVTPLLTPGEEIRERRQSGKYEYITELLVKGVDKWTVKGVQDKGAVYDYSNLIQHA 600

Query: 601 SVGNNQYREIRIESHLGDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATE 660
SVGNNQYREIRIESHLGDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATE
Sbjct: 601 SVGNNQYREIRIESHLGDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATE 660

Query: 661 AGNYTVTRVLGGDVKILQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVE 720
AGNYTVTRVLGGDVK+LQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVE
Sbjct: 661 AGNYTVTRVLGGDVKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVE 720

Query: 721 ELIGTTRADKFFGSKFTDIFHGADGDDHIEGNDGNDRLYGDKGNDTLRGGNGDDQLYGGD 780
ELIGTTRADKFFGSKFTDIFHGADGDD IEGNDGNDRLYGDKGNDTL GGNGDDQLYGGD
Sbjct: 721 ELIGTTRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGD 780

Query: 781 GNDKLIGGTGNNYLNGGDGDDELQVQGNSLAKNVLSGGKGNDKLYGSEGADLLDGGEGND 840
GNDKLIG GNNYLNGGDGDDE QVQGNSLAKNVL GGKGNDKLYGSEGADLLDGGEG+D
Sbjct: 781 GNDKLIGVAGNNYLNGGDGDDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDD 840

Query: 841 LLKGGYGNDIYRYLSGYGHHIIDDDGGKDDKLSLADIDFRDVAFRREGNDLIMYKAEGNV 900
LLKGGYGNDIYRYLSGYGHHIIDDDGGK+DKLSLADIDFRDVAF+REGNDLIMYK EGNV
Sbjct: 841 LLKGGYGNDIYRYLSGYGHHIIDDDGGKEDKLSLADIDFRDVAFKREGNDLIMYKGEGNV 900

Query: 901 LSIGHKNGITFRNWFEKESGDISNHQIEQIFDKDGRVITPDSLKKALEYQQSNNKASYVY 960
LSIGHKNGITFRNWFEKESGDISNH+IEQIFDK GR+ITPDSLKKALEYQQ NNKASYVY
Sbjct: 901 LSIGHKNGITFRNWFEKESGDISNHEIEQIFDKSGRIITPDSLKKALEYQQRNNKASYVY 960

Query: 961 GNDALAYGSQDNLNPLINEISKIISAAGNFDVKEERAAASLLQLSGNASDFSYGRNSITL 1020
GNDALAYGSQ +LNPLINEISKIISAAG+FDVKEER AASLLQLSGNASDFSYGRNSITL
Sbjct: 961 GNDALAYGSQGDLNPLINEISKIISAAGSFDVKEERTAASLLQLSGNASDFSYGRNSITL 1020

Query: 1021 TASA 1024
T SA
Sbjct: 1021 TTSA 1024


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3574     RTXTOXIND     599    0.0      Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 599 bits (1547), Expect = 0.0
Identities = 463/478 (96%), Positives = 468/478 (97%)

Query: 1 MKTWLMGFSEFLLRYKLVWSETWKIRKQLDTPVREKDENEFLPAHLELIETPVSRRPRLV 60
MKTWLMGFSEFLLRYKLVWSETWKIRKQLDTPVREKDENEFLPAHLELIETPVSRRPRLV
Sbjct: 1 MKTWLMGFSEFLLRYKLVWSETWKIRKQLDTPVREKDENEFLPAHLELIETPVSRRPRLV 60

Query: 61 AYFIMGFLVIAVILSVLGQVEIVATANGKLTLSGRSKEIKPIENSIVKEIIVKEGESVRK 120
AYFIMGFLVIA ILSVLGQVEIVATANGKLT SGRSKEIKPIENSIVKEIIVKEGESVRK
Sbjct: 61 AYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRK 120

Query: 121 GDVLLKLTALGAEADTLKTQSSLLQTRLEQTRYQILSRSIELNKLPELKLPDEPYFQNVS 180
GDVLLKLTALGAEADTLKTQSSLLQ RLEQTRYQILSRSIELNKLPELKLPDEPYFQNVS
Sbjct: 121 GDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVS 180

Query: 181 EEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTILARINRYENLSRVEKSRLDDF 240
EEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLT+LARINRYENLSRVEKSRLDDF
Sbjct: 181 EEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDF 240

Query: 241 RSLLHKQAIAKHAVLEQENKYVEAANELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEI 300
SLLHKQAIAKHAVLEQENKYVEA NELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEI
Sbjct: 241 SSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEI 300

Query: 301 LDKLRQTTDNIELLTLELEKNEERQQASVIRAPVSGKVQQLKVHTEGGVVTTAETLMVIV 360
LDKLRQTTDNI LLTLEL KNEERQQASVIRAPVS KVQQLKVHTEGGVVTTAETLMVIV
Sbjct: 301 LDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIV 360

Query: 361 PEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQKLGL 420
PEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQ+LGL
Sbjct: 361 PEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGL 420

Query: 421 VFNVIVSVEENDLSTGNKHIPLSSGMAVTAEIKTGMRSVISYLLSPLEESVTESLHER 478
VFNVI+S+EEN LSTGNK+IPLSSGMAVTAEIKTGMRSVISYLLSPLEESVTESL ER
Sbjct: 421 VFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRSVISYLLSPLEESVTESLRER 478


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3583     PF03627       603    0.0      PapG
		>PF03627#PapG

Length = 336

Score = 603 bits (1555), Expect = 0.0
Identities = 333/336 (99%), Positives = 335/336 (99%)

Query: 1 MKKWFPALLFSLCVSGESSAWNNIVFYSLGNVNSYQGGNVVITQRPQFITSWRPGIATVT 60
MKKWFPALLFSLCVSGESSAWNNIVFYSLGNVNSYQGGNVVITQRPQFITSWRPGIATVT
Sbjct: 1 MKKWFPALLFSLCVSGESSAWNNIVFYSLGNVNSYQGGNVVITQRPQFITSWRPGIATVT 60

Query: 61 WNQCNGPEFADGSWAYYREYIAWVVFPKKVMTKNGYPLFIEVHNKGSWSEENTGDNDSYF 120
WNQCNGP FADGSWAYYREYIAWVVFPKKVMTKNGYPLFIEVHNKGSWSEENTGDNDSYF
Sbjct: 61 WNQCNGPGFADGSWAYYREYIAWVVFPKKVMTKNGYPLFIEVHNKGSWSEENTGDNDSYF 120

Query: 121 FLKGYKWDERAFDAGNLCQKPGETTRLTEKFNDIIFKVALPADLPLGDYSVTIPYTSGIQ 180
FLKGYKWDERAFDAGNLCQKPGETTRLTEKF+DIIFKVALPADLPLGDYSVTIPYTSG+Q
Sbjct: 121 FLKGYKWDERAFDAGNLCQKPGETTRLTEKFDDIIFKVALPADLPLGDYSVTIPYTSGMQ 180

Query: 181 RHFASYLGARFKIPYNVAKTLPRENEMLFLFKNIGGCRPSAQSLEIKHGDLSINSANNHY 240
RHFASYLGARFKIPYNVAKTLPRENEMLFLFKNIGGCRPSAQSLEIKHGDLSINSANNHY
Sbjct: 181 RHFASYLGARFKIPYNVAKTLPRENEMLFLFKNIGGCRPSAQSLEIKHGDLSINSANNHY 240

Query: 241 AAQTLSVSCDVPANIRFMLLRNTTPTYSHGKKFSVGLGHGWDSIVSVNGVDTGETTMRWY 300
AAQTLSVSCDVPANIRFMLLRNTTPTYSHGKKFSVGLGHGWDSIVSVNGVDTGETTMRWY
Sbjct: 241 AAQTLSVSCDVPANIRFMLLRNTTPTYSHGKKFSVGLGHGWDSIVSVNGVDTGETTMRWY 300

Query: 301 KAGTQNLTIGSRLYGESSKIQPGVLSGSATLLMILP 336
KAGTQNLTIGSRLYGESSKIQPGVLSGSATLLMILP
Sbjct: 301 KAGTQNLTIGSRLYGESSKIQPGVLSGSATLLMILP 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3584     FIMBRIALPAPF  267    5e-95    Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.

Length = 167

Score = 267 bits (683), Expect = 5e-95
Identities = 155/167 (92%), Positives = 156/167 (93%), Gaps = 1/167 (0%)

Query: 11 MARLSLFISLLLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGE 70
M RLSLFISLLLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGE
Sbjct: 1 MIRLSLFISLLLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGE 60

Query: 71 VTKTISISCTYKSGSPWIKVTGNAMA-GQTNVLATNIANFGIALYQGKGMSTPLTLGNGS 129
VTK ISISC YKSGS WIKVTGN M GQ NVLATNI +FGIALYQGKGMSTPLTLGNGS
Sbjct: 61 VTKNISISCPYKSGSLWIKVTGNTMGVGQNNVLATNITHFGIALYQGKGMSTPLTLGNGS 120

Query: 130 GNGYRVTAGLDTARSTFTFTSVPFRNGSRTLNGGDFRTTASMSMIYN 176
GNGYRVTAGLDTARSTFTFTSVPFRNGS LNGGDFRTTASMSMIYN
Sbjct: 121 GNGYRVTAGLDTARSTFTFTSVPFRNGSGILNGGDFRTTASMSMIYN 167


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3585     FIMBRIALPAPE  306    e-110    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 306 bits (784), Expect = e-110
Identities = 128/173 (73%), Positives = 145/173 (83%)

Query: 7 MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVTKAEVDWGNVEIQTLSQNG 66
MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTV AEV+WG++EIQ L Q+G
Sbjct: 1 MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVQNAEVNWGDIEIQNLVQSG 60

Query: 67 NHEKEFTVNMQCPYHLGTMKVTITATNTYNNAILVQNTSNTSSDGVLVYLYNSNAGNIGT 126
++K+FTV+M CPY LGTMKVTIT+ N+ILV NTS S DG+L+YLYNSN IG
Sbjct: 61 GNQKDFTVDMNCPYSLGTMKVTITSNGQTGNSILVPNTSTASGDGLLIYLYNSNNSGIGN 120

Query: 127 AITLGTPFTPGKITGNNADRTISLHAKLGYKGNMQSLKAGDFSATATLVASYS 179
A+TLG+ TPGKITG R I+L+AKLGYKGNMQSL+AG FSATATLVASYS
Sbjct: 121 AVTLGSQVTPGKITGTAPARKITLYAKLGYKGNMQSLQAGTFSATATLVASYS 173


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3590     PF00577       742    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 742 bits (1917), Expect = 0.0
Identities = 243/882 (27%), Positives = 362/882 (41%), Gaps = 67/882 (7%)

Query: 2 MRGMKDRI-PFAVNNITCVILLSLFCNAASAVEFNTDVLDAADKKNIDFTRFSEAGYVLP 60
+ K R+ F V + +++ + FN L + D +RF + P
Sbjct: 16 LHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPP 75

Query: 61 GQYLLDVIVNGQSISPASLQISFVEPQSSGDKAEKKLPQACLTSDMVRLMGLTAESLDKV 120
G Y +D+ +N + A+ ++F S CLT + MGL S+ +
Sbjct: 76 GTYRVDIYLNNGYM--ATRDVTFNTGDSEQGI------VPCLTRAQLASMGLNTASVSGM 127

Query: 121 VYWHDGQCADF-HGLPGVDIRPDTGAGVLRINMPQAWLEYSDATWLPPSRWDDGIPGLML 179
D C + + D G L + +PQA++ ++PP WD GI +L
Sbjct: 128 NLLADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLL 187

Query: 180 DYNLNGTVSRNYQGGDSHQFSYNGTVGGNLGPWRLRADYQGSQEQSRYNGEKTTNRNFTW 239
+YN +G +N GG+SH N G N+G WRLR + S S + + +
Sbjct: 188 NYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSS--DSSSGSKNKWQH 245

Query: 240 SRFYLFRAIPRWRANLTLGENNINSDIFRSWSYTGASLESDDRMLPPRLRGYAPQITGIA 299
+L R I R+ LTLG+ DIF ++ GA L SDD MLP RG+AP I GIA
Sbjct: 246 INTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIA 305

Query: 300 ETNARVVVSQQGRVLYDSMVPAGPFSIQDLD-SSVRGRLDVEVIEQNGRKKTFQVDTASV 358
A+V + Q G +Y+S VP GPF+I D+ + G L V + E +G + F V +SV
Sbjct: 306 RGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSV 365

Query: 359 PYLTRPGQVRYKLVSGRSRGYGHETEGPVFATGEASWGLSNQWSLYGGAVLAGDYNALAA 418
P L R G RY + +G R + E P F GL W++YGG LA Y A
Sbjct: 366 PLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNF 425

Query: 419 GAGWDLGVPGTLSADITQSVARIEGERTFQGKSWRLSYSKRFDNADADITFAGYRFSERN 478
G G ++G G LS D+TQ+ + + + G+S R Y+K + + +I GYR+S
Sbjct: 426 GIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSG 485

Query: 479 YMTMEQYLNARYR--------------------NDYSSREKEMYTVTLNKNVADWNTSFN 518
Y +R + + ++ +T+ + + + +
Sbjct: 486 YFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTS-TLY 544

Query: 519 LQYSRQTYWDIRKTD-YYTVSVNRYFNVFGLQGVAVGLSASRSKYLGRD--NDSAYLRIS 575
L S QTYW D + +N F + LS S +K + + L ++
Sbjct: 545 LSGSHQTYWGTSNVDEQFQAGLNTAFE-----DINWTLSYSLTKNAWQKGRDQMLALNVN 599

Query: 576 VPLGT------------GTASYSGSMSND-RYVNMAGYTDM-FNDGLDSYSLNAGLNSGG 621
+P +ASYS S + R N+AG D SYS+ G GG
Sbjct: 600 IPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGG 659

Query: 622 GLTSQRQINAYYSHRSPLANLSANIASLQKGYTSFGVSASGGATITGKGAALHAGGMSGG 681
S A ++R N + S SGG G L G
Sbjct: 660 DGNSGSTGYATLNYRGGYGNANIG-YSHSDDIKQLYYGVSGGVLAHANGVTL--GQPLND 716

Query: 682 TRLLVDTDGVGGVPVDGGQVV-TNRWGTGVVTDISSYYRNTTSVDLKRLPDDVEATRSVV 740
T +LV G V+ V T+ G V+ + Y N ++D L D+V+ +V
Sbjct: 717 TVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVA 776

Query: 741 ESALTEGAIGYRKFSVLKGKRLFAILRLADGSQPPFGASVTSEKGRELGMVADEGLAWLS 800
T GAI +F G +L L + PFGA VTSE + G+VAD G +LS
Sbjct: 777 NVVPTRGAIVRAEFKARVGIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQVYLS 835

Query: 801 GVTPGETLSVNW--DGKIQCQVNVPETAISDQQLL----LPC 836
G+ + V W + C N S QQLL C
Sbjct: 836 GMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3591     FIMBRIALPAPE  32     0.001    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 31.5 bits (71), Expect = 0.001
Identities = 41/173 (23%), Positives = 75/173 (43%), Gaps = 29/173 (16%)

Query: 29 GMSLPEYWG----EEHVWWDGRAAFHGEVVRPACTLAMEDAWQIIDMGETPVRDL-QNGF 83
G+ LP G +HV F G+++ PACT+ + ++ G+ +++L Q+G
Sbjct: 6 GLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVQNAE----VNWGDIEIQNLVQSG- 60

Query: 84 SGPERKFSLRLRNCEFNSQGGNLFSDSRIRVTFDGVRGET---PDKFNLSGQAKGINLQI 140
G ++ F++ + NC ++ ++ +T +G G + P+ SG I L
Sbjct: 61 -GNQKDFTVDM-NCPYS------LGTMKVTITSNGQTGNSILVPNTSTASGDGLLIYLYN 112

Query: 141 ADARGNIARAGKV-MPAIP--LTGNEEALDYTLRIVR----NGKKLEAGNYFA 186
++ I A + P +TG A TL N + L+AG + A
Sbjct: 113 SNN-SGIGNAVTLGSQVTPGKITGTAPARKITLYAKLGYKGNMQSLQAGTFSA 164


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3600     HTHTETR       65     2e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 65.0 bits (158), Expect = 2e-15
Identities = 34/197 (17%), Positives = 66/197 (33%), Gaps = 13/197 (6%)

Query: 1 MSTYENILRITDTLIQQRGFLGFSYADLEKEIGIRKASIHHHFPRKTDLGIAYCQYKTEV 60
T ++IL + L Q+G S ++ K G+ + +I+ HF K+DL +
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 61 FDRLNITLHN--VSSGVQRLRTYLDA-FAGCAERGEMCGIYAMLSDSHQFSPELQ--EAV 115
L + + LR L + ++ +F E+ +
Sbjct: 70 IGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQA 129

Query: 116 SQLAHQEI-QMIKDIITSGQNSGEFKTVLLPDELAVIVCSTLKGALMLNRLPPHDTYSGT 174
+ E I+ + + L+ A+I+ + G LM N L ++
Sbjct: 130 QRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISG-LMENWLFAPQSFDLK 188

Query: 175 ------VNALIKMLDTR 185
V L++M
Sbjct: 189 KEARDYVAILLEMYLLC 205


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3603     ALARACEMASE   30     0.003    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 29.7 bits (67), Expect = 0.003
Identities = 10/51 (19%), Positives = 20/51 (39%), Gaps = 2/51 (3%)

Query: 34 TEDIAVTDAWLRAHPDDVEVIPLCRIQPRQYSHEQHIEELRAALARMLHMH 84
I + + + H D+E+ R+ +S+ Q A L L ++
Sbjct: 73 KGPILMLEGFF--HAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAPLDIY 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3608     HTHFIS        28     0.034    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 28.3 bits (63), Expect = 0.034
Identities = 8/39 (20%), Positives = 18/39 (46%), Gaps = 1/39 (2%)

Query: 84 NGAQFRQLCETTDWVDAGE-NVLLFGASGLGKSHLAAAI 121
A +++ + + +++ G SG GK +A A+
Sbjct: 142 RSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3615     PF06580       33     0.002    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.9 bits (75), Expect = 0.002
Identities = 10/78 (12%), Positives = 31/78 (39%), Gaps = 8/78 (10%)

Query: 19 QAEALRQKDQQLSLVEETEAFLRSALARAEEKIEEDEREIEHLRA--QIEKLRRMLFGTR 76
+A L + ++ +R +L + + E+ + + Q+ ++ F
Sbjct: 183 RALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQ---FE-- 237

Query: 77 SEKLRREVEQAEALLKQR 94
++L+ E + A++ +
Sbjct: 238 -DRLQFENQINPAIMDVQ 254


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3619     IGASERPTASE   286    5e-81    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 286 bits (733), Expect = 5e-81
Identities = 192/900 (21%), Positives = 322/900 (35%), Gaps = 215/900 (23%)

Query: 35 NRKLVATMLSLAVAGTVNA---ANIDISNVWARDYLDLAQNKGIFQPGATDVTITLKNGD 91
N+K ++L VA + A + +V + + D A+NKG F GAT+V + KN
Sbjct: 3 NKKFKLNFIALTVAYALTPYTEAALVRDDVDYQIFRDFAENKGKFSVGATNVLVKDKNNK 62

Query: 92 KF--SFHN-LSIPDFSGAAAS-GAATAIGGSYSVTVAH-----------------NKKNP 130
+ N + + DFS AT I Y V V H N N
Sbjct: 63 DLGTALPNGIPMIDFSVVDVDKRIATLINPQYVVGVKHVSNGVSELHFGNLNGNMNNGNA 122

Query: 131 QAAETQVYAQSSYRVVDRRNSN-------------------DFEIQRLNKFVVETVGATP 171
+A ++ Y V++ D+ + RL+KFV E
Sbjct: 123 KAHRDVSSEENRYFSVEKNEYPTKLNGKTVTTEDQTQKRREDYYMPRLDKFVTEVAPIEA 182

Query: 172 AETNPTTYSDALERYGIVTSDGSKKIIGFRAGSGGTSFI----NGESKISTNSAYS---- 223
+ + +D +K R GSG + FI + S I N
Sbjct: 183 STAS---------SDAGTYNDQNKYPAFVRLGSG-SQFIYKKGDNYSLILNNHEVGGNNL 232

Query: 224 -------HDLLSASLFEVTQWDSYGMMIYKNDKT-------------FRNLEIFGDSGSG 263
++ + ++V ++ G++ + N K N + GDSGS
Sbjct: 233 KLVGDAYTYGIAGTPYKVN-HENNGLIGFGNSKEEHSDPKGILSQDPLTNYAVLGDSGSP 291

Query: 264 AYLYDNKLEKWVLVGTTHGIASVNGDQLTWITKYNDKLVSELKDTYS----------HKI 313
++YD + KW+ +G+ A N Y + ++ + S +
Sbjct: 292 LFVYDREKGKWLFLGSYDFWAGYNKKSWQEWNIYKSQFTKDVLNKDSAGSLIGSKTDYSW 351

Query: 314 NLNGNNVTIKNTDITLHQNNADTTGTQEKITKDKDIVFTNGGDVLFKDNLDFGSGGIIFD 373
+ NG TI + +L N D ++K K + F G + +N+D G+GG+ F+
Sbjct: 352 SSNGKTSTITGGEKSL---NVDLADGKDKPNHGKSVTFEGSGTLTLNNNIDQGAGGLFFE 408

Query: 374 EGHEYNINGQGFTFKGAGIDIGKESIVNWNALYSSDDVLHKIGPGTLNVQKKQG--ANIK 431
+E T+KGAG+ + + V W D L KIG GTL V+ ++K
Sbjct: 409 GDYEVKGTSDNTTWKGAGVSVAEGKTVTWKVHNPQYDRLAKIGKGTLIVEGTGDNKGSLK 468

Query: 432 IGEGNVILNEEG------TFNNIYLASGNGKVILNKDNSLGNDQYAGIFFTKRGGTLDLN 485
+G+G VIL ++ F ++ + SG ++LN D + + I+F RGG LDLN
Sbjct: 469 VGDGTVILKQQTNGSGQHAFASVGIVSGRSTLVLNDDKQVDPN---SIYFGFRGGRLDLN 525

Query: 486 GHNQTFTRIAATDDGTTITNSDTTKEAVLAINNEDSYIYHGNINGNIKLTHNINSQD--- 542
G++ TF I DDG + N + T + + I E + N +NI++ D
Sbjct: 526 GNSLTFDHIRNIDDGARLVNHNMTNASNITITGESLI-----TDPNTITPYNIDAPDEDN 580

Query: 543 ------KKTNAKL----------ILDGSVNTKNDVEV----SNASLTMQGHATEHAI--- 579
K +L L +T++++ SN + G ++ A
Sbjct: 581 PYAFRRIKDGGQLYLNLENYTYYALRKGASTRSELPKNSGESNENWLYMGKTSDEAKRNV 640

Query: 580 ------------------------------FRSSANHCSLVFLCGTD------------- 596
F+ + + GT+
Sbjct: 641 MNHINNERMNGFNGYFGEEEGKNNGNLNVTFKGKSEQNRFLLTGGTNLNGDLTVEKGTLF 700

Query: 597 ----WVTVLKETESSYNKKFNSDYKSNNQQTSFDQPDWKTGVFKFDTLHLN-NADFSISR 651
++ + K + + NN + DW FK T+++ NA R
Sbjct: 701 LSGRPTPHARDIAGISSTKKDPHFAENN--EVVVEDDWINRNFKATTMNVTGNASLYSGR 758

Query: 652 N-ANVEGNISA-NKSAITIGDKN-------------VYIDNLAGKNITNNGFDFKQ---- 692
N AN+ NI+A NK+ + IG K V + N F+
Sbjct: 759 NVANITSNITASNKAQVHIGYKTGDTVCVRSDYTGYVTCTTDKLSDKALNSFNPTNLRGN 818

Query: 693 ---TISTNLSIGETKFTGGI-TAHNSQIAIGDQAVVTLNGATF-----LDNTPISIDKGA 743
T S N +G+ G I + NSQ+ + + + L G + L N I ++
Sbjct: 819 VNLTESANFVLGKANLFGTIQSRGNSQVRLTENSHWHLTGNSDVHQLDLANGHIHLNSAD 878



Score = 40.0 bits (93), Expect = 6e-05
Identities = 52/204 (25%), Positives = 89/204 (43%), Gaps = 42/204 (20%)

Query: 792 GNANFI-ARNMASVTGNIYADDAATITLGQPETETPTISSAYQAW--------AETLLYG 842
GNA+ RN+A++T NI A + A + +G +T + S Y + ++ L
Sbjct: 750 GNASLYSGRNVANITSNITASNKAQVHIGYKTGDTVCVRSDYTGYVTCTTDKLSDKALNS 809

Query: 843 FD-TAYRGAITAPK--------------------ATVSMN-NAIWHLNSQSSINRLETKD 880
F+ T RG + + + V + N+ WHL S +++L+ +
Sbjct: 810 FNPTNLRGNVNLTESANFVLGKANLFGTIQSRGNSQVRLTENSHWHLTGNSDVHQLDLAN 869

Query: 881 SMVRFTGDNG-----KFTTLTVNNLTIDDSAFVLRANLA--QADQLVVNKSLSGKNNLLL 933
+ + K+ TLTVN+L+ + S + L +L+ Q D++VV KS +G L +
Sbjct: 870 GHIHLNSADNSNNVTKYNTLTVNSLSGNGSFYYL-TDLSNKQGDKVVVTKSATGNFTLQV 928

Query: 934 VDFIEKNGNSNGLNIDLVSAPKGT 957
D K G N + L A K
Sbjct: 929 AD---KTGEPNHNELTLFDASKAQ 949


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3625     PF04183       816    0.0      IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 816 bits (2109), Expect = 0.0
Identities = 566/580 (97%), Positives = 571/580 (98%)

Query: 1 MNHKDWDFVNRRLVAKMLSEMEYEQVFHAESQGDDHYCINLPGAQWRFIAERGIWGWLWI 60
MNHKDWD VNRRLVAKMLSE+EYEQVFHAESQGDD YCINLPGAQWRFIAERGIWGWLWI
Sbjct: 1 MNHKDWDLVNRRLVAKMLSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIWGWLWI 60

Query: 61 DAQTLRCTDEPVLAQTLLMQLKPVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASD 120
DAQTLRC DEPVLAQTLLMQLK VLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASD
Sbjct: 61 DAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASD 120

Query: 121 LINLDADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRC 180
LINL+ADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRC
Sbjct: 121 LINLNADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRC 180

Query: 181 DNDLDIQQLLTAAMDPQEFTRFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG 240
DN++DI QLLTAAMDPQEF RFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG
Sbjct: 181 DNEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG 240

Query: 241 RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR 300
RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR
Sbjct: 241 RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR 300

Query: 301 WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK 360
WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK
Sbjct: 301 WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK 360

Query: 361 PDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI 420
PDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI
Sbjct: 361 PDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI 420

Query: 421 AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEAFPEMDSLPQEVRDVTSRLSADYLIHDL 480
AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKE FPEMDSLPQEVRDVTSRLSADYLIHDL
Sbjct: 421 AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVTSRLSADYLIHDL 480

Query: 481 QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMNKHPQMSERFALFSLFRPQIIR 540
QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYM KHPQMSERFALFSLFRPQIIR
Sbjct: 481 QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMKKHPQMSERFALFSLFRPQIIR 540

Query: 541 VVLNPVKLTWPDQDGGSRMLPNYLENLQNPLWLVTQEYES 580
VVLNPVKLTWPD DGGSRMLPNYLE+LQNPLWLVTQEYES
Sbjct: 541 VVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWLVTQEYES 580
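
The statistics lines that precede each alignment (Score, Expect, Identities, Positives and, where present, Gaps) follow a fixed pattern, so they can be read programmatically. The following is a minimal sketch only, not part of PredictBias; the function names `parse_hit_stats` and `percent` are hypothetical, and it assumes that an Expect value printed as `e-112` abbreviates `1e-112`.

```python
import re

# Hypothetical helpers for reading the per-hit statistics lines of this report.
# Matches e.g. "Score = 816 bits (2109), Expect = 0.0"
SCORE_RE = re.compile(
    r"Score\s*=\s*(?P<bits>[\d.]+)\s*bits\s*\((?P<raw>\d+)\),\s*"
    r"Expect\s*=\s*(?P<evalue>\S+)"
)
# Matches each "Name = a/b" pair, e.g. "Identities = 566/580 (97%)"
FRACTION_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_hit_stats(score_line, identity_line):
    """Return bit score, raw score, E-value and the a/b fractions as a dict."""
    match = SCORE_RE.search(score_line)
    if match is None:
        raise ValueError("not a Score/Expect line: %r" % score_line)

    evalue_text = match.group("evalue")
    # Assumption: a bare exponent such as "e-112" stands for "1e-112".
    if evalue_text.startswith("e"):
        evalue_text = "1" + evalue_text

    stats = {
        "bits": float(match.group("bits")),
        "raw_score": int(match.group("raw")),
        "evalue": float(evalue_text),
    }
    for name, numerator, denominator in FRACTION_RE.findall(identity_line):
        stats[name.lower()] = (int(numerator), int(denominator))
    return stats


def percent(fraction):
    """Convert an (a, b) pair such as (566, 580) into a percentage."""
    numerator, denominator = fraction
    return 100.0 * numerator / denominator
```

For example, `parse_hit_stats("Score = 816 bits (2109), Expect = 0.0", "Identities = 566/580 (97%), Positives = 571/580 (98%)")` yields the fractions for the c3625 hit, and `percent` recomputes the percentages that the report prints as whole numbers.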


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c3627     PF04183  340    e-112    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 340 bits (874), Expect = e-112
Identities = 104/480 (21%), Positives = 177/480 (36%), Gaps = 46/480 (9%)

Query: 58 ELIIPLDEQKSLHFRVAYFSPTQHHRF-----AFPARLVTASGSYPVDFTTLSRLIIDKL 112
E + + Q + + P RF + + A D L++ ++ +L
Sbjct: 24 EQVFHAESQGDDRYCIN--LPGAQWRFIAERGIWGWLWIDAQTLRCADEPVLAQTLLMQL 81

Query: 113 RHQLFLPVPLCETFHQRVLESHAHTQQAIDARHDWTALREKALNFGEAEQALLTGHAFHP 172
+ L + Q + + Q + AR +A LN + Q LL+GH
Sbjct: 82 KQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASDLINLNA-DRLQCLLSGHPKFV 140

Query: 173 APKSHEPFNRREAERYLPDMAPHFPLRWFSVDKTQIAGES-LHLNLQQRLTRFAAENAPQ 231
K + + ERY P+ A F L W +V + + +++ Q LT A PQ
Sbjct: 141 FNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEMDIHQLLT---AAMDPQ 197

Query: 232 LLNELS--------DNQWLF-PLHPWQGEYLLQQGWCQALVAKGLIKDLGEAGTSWLPTT 282
S D+ WL P+HPWQ + + + A+G + LGE G WL
Sbjct: 198 EFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADF-AEGRMVSLGEFGDQWLAQQ 256

Query: 283 SSRSLYCATSRD--MIKFSLSVRLTNSIRTLSVKEVKRGMRLARLAQ----TDGWQMLQ- 335
S R+L A+ R IK L++ T+ R + + + G +R Q TD +
Sbjct: 257 SLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWLQQVFATDATLVQSG 316

Query: 336 ---VRFPTFRVMQEDGWAGLLDLNGNIMQESLFALRENLLVDQPKSQTNVLVSLTQAAPD 392
+ P + +G+A L + REN ++ VL++ +
Sbjct: 317 AVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLKPDESPVLMATLMECDE 376

Query: 393 GGDSLLVSAVKRLSDRLGITVQQAAHAWVDAYCQQVLKPLFTAEADYGLVLLAHQQNILV 452
L + + DR G+ A W+ + V+ PL+ YG+ L+AH QNI +
Sbjct: 377 NNQPLAGAYI----DRSGLD----AETWLTQLFRVVVVPLYHLLCRYGVALIAHGQNITL 428

Query: 453 QMLGDLPVGFIYRDCQGSAFMPHATDWLDSIGEAQAENIFTHEQLLRYFPYYLLVNSTFA 512
M +P + +D QG M + + E + L++
Sbjct: 429 AMKEGVPQRVLLKDFQGD--MRLVKEEFPEMDSLPQE----VRDVTSRLSADYLIHDLQT 482


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c3628     TCRTETA  48     5e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 47.5 bits (113), Expect = 5e-08
Identities = 81/375 (21%), Positives = 135/375 (36%), Gaps = 41/375 (10%)

Query: 20 FSAGLLGIGQNGLLVVLPVLVIQTNLSLSV---WAALLMLGSMLFLPSSPWWGKQISRTG 76
+ L +G ++ VLP L+ S V + LL L +++ +P G R G
Sbjct: 12 STVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFG 71

Query: 77 SKPVVLWALGGYGISFTLLGLGSVLMATSAITTAVGLGILIIARIAYGLTVSAMVPACQV 136
+PV+L +L G + + ++ L +L I RI G+T + A
Sbjct: 72 RRPVLLVSLAGAAVDYAIMATAPFLW------------VLYIGRIVAGITGATGAVAGAY 119

Query: 137 WALQRAGEGNRMAALATISSGLSCGRLFGPLCAAAMLAIHPLAPLGLLMAAPVLALLMLL 196
A R +S+ G + GP+ M P AP A L L
Sbjct: 120 IA-DITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGC 178

Query: 197 RL------PGTPPQPTPEYKSVSLKRDCLPYLLCAILLAAAVSMMQLGLSPAL------T 244
L P ++ R + A L+A M +G PA
Sbjct: 179 FLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGE 238

Query: 245 RQFATDTTAISQQVAWLLGLSAVAALIAQFGVLRPQRLTPVALLLSAGVLMSGGLAIMLS 304
+F D T I +A L ++A + G + + AL+L +G + + +
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMI-TGPVAARLGERRALMLGMIADGTGYILLAFA 297

Query: 305 EQLWLFYPGCAVLSFGAALATPAYQLLLNDKLADGAGAGWLATSHTLGYGLCALLVPLVS 364
+ W+ +P +L+ G + PA Q +L+ + D G L L L S
Sbjct: 298 TRGWMAFPIMVLLASG-GIGMPALQAMLS-RQVDEERQGQLQ----------GSLAALTS 345

Query: 365 KTGVAIALIMAALFA 379
T + L+ A++A
Sbjct: 346 LTSIVGPLLFTAIYA 360


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3635     OUTRMMBRANEA  30     0.012    Outer membrane protein A signature.
		>OUTRMMBRANEA#Outer membrane protein A signature.

Length = 346

Score = 29.5 bits (66), Expect = 0.012
Identities = 24/135 (17%), Positives = 46/135 (34%), Gaps = 24/135 (17%)

Query: 125 YQATPDLNLALRYRYDWKAFRQTDLDGNRARNDQHQFDGYVTYKINDSWLFAWQTTVYTK 184
YQ P + + Y + + + ++ + Q + Y I D +YT+
Sbjct: 64 YQVNPYVGFEMGYDWLGRMPYKGSVENGAYKAQGVQLTAKLGYPITDDL------DIYTR 117

Query: 185 V---------NNFKYGNHKKTATENAFVL--QYKMSPVFTPYIEYDYLDKQGVYKGKDNK 233
+ + YG + T F +Y ++P +EY + + G +
Sbjct: 118 LGGMVWRADTKSNVYGKNHDTGVSPVFAGGVEYAITPEIATRLEYQWTNNIGDAHTIGTR 177

Query: 234 HEN-------SYRVG 241
+N SYR G
Sbjct: 178 PDNGMLSLGVSYRFG 192


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c3637     TCRTETB  64     5e-13    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 63.7 bits (155), Expect = 5e-13
Identities = 64/401 (15%), Positives = 145/401 (36%), Gaps = 34/401 (8%)

Query: 43 LLDGFDFVLISLVLTEVQHEFGLTTIEAASLISAAFISRWFGGLAIGALSDKMGRRMAMV 102
+ +++++ L ++ ++F + +A ++ G G LSD++G + ++
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 103 LSIVLFSLGTLACGLAPGYAVMFI-ARIVIGLGMAGEYGSSVTYVIESWPVHLRNKASGF 161
I++ G++ + + + I AR + G G A + V P R KA G
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 162 LISGFSIGGGLAAQVYSIVVPLWGWRSLFFVGMLPILFAFYLRKNLPESDDWQKRQQENK 221
+ S ++G G+ + ++ W L + M+ I+ +L K L +
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKE----------- 192

Query: 222 PVRTMVDILYREKNKYINILLSCIAFACLYVCFSGVTANAALITVMALCCAAVFIS---- 277
VR + I+L + + T+ + ++++ +F+
Sbjct: 193 -VRI------KGHFDIKGIILMSVGIVFFML---FTTSYSISFLIVSVLSFLIFVKHIRK 242

Query: 278 ----FIYQGMGKRWPTGIMLMLVVMFCFLYGWPLQAFLPTWLKVDMQYSPETVALIFMLA 333
F+ G+GK P ++ +L F + +P +K Q S + + +
Sbjct: 243 VTDPFVDPGLGKNIPF-MIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFP 301

Query: 334 G-FGSAAGSCIGGFMGDWLGTRK-AYVISLLIGQLVIIPVFLVDRDYVWLLGLLIFTQQV 391
G IGG + D G + + + FL++ ++ +++F
Sbjct: 302 GTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLG- 360

Query: 392 FGQGIGALVPKIISGYFNVEQRAAGLGFIYNVGSLGGACAP 432
++ I+S ++ AG+ + L
Sbjct: 361 GLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGI 401


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c3638     PF03309  32     0.002    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 32.0 bits (73), Expect = 0.002
Identities = 16/67 (23%), Positives = 27/67 (40%), Gaps = 4/67 (5%)

Query: 1 MITLAVDIGGTKISAALISDD---GSFLLKKQISTPHERCPDEMTGALRLLVSEMKGTAE 57
M+ LA+D+ T LIS + + +I T E DE+ + L+ +
Sbjct: 1 ML-LAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTEPEVTADELALTIDGLIGDDAERLT 59

Query: 58 RFAVAST 64
+ ST
Sbjct: 60 GASGLST 66


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c3643     PF02370  32     0.002    M protein repeat
		>PF02370#M protein repeat

Length = 168

Score = 31.6 bits (71), Expect = 0.002
Identities = 20/103 (19%), Positives = 47/103 (45%), Gaps = 3/103 (2%)

Query: 18 EQAEALRQRDLQLSLVEETEAFLRSALARAEEKIEEEEREIEYLRAQIEKLRRMLFGTRS 77
+ +++ R+ D Q + LR + ++KIEE E+E + + + E+ + +
Sbjct: 38 DSSDSKRENDPQYRALMGENQDLRKREGQYQDKIEELEKERKEKQERPERREKFERQHQD 97

Query: 78 EKLQREVEQAEAQLKQREQESDRYSGREDDPQVPRQLRQSRHR 120
+ Q + ++ + + +Q E E + + Q+ RQ +R
Sbjct: 98 KHYQEQQKKHQQEQQQLEAEKQKL---AKEKQISDASRQGLNR 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3655     PRTACTNFAMLY  42     2e-05    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 41.6 bits (97), Expect = 2e-05
Identities = 87/417 (20%), Positives = 139/417 (33%), Gaps = 46/417 (11%)

Query: 311 QNTGGALVTSTAATVTGTNRLGAFSVVAGKADNVVLENGGRLDVLSGHTATNTRVDDGGT 370
++ + V VT GA + V+ + + +GG + G A +
Sbjct: 194 EDLPPSRVVLRDTNVTAVPASGAPAAVSVLGASELTLDGGH--ITGGRAAGVAAMQGAVV 251

Query: 371 ----LDIRNGGAATTVSMGNGGVLLADSGAAVSGTRSDGKAFSIGGGQA----DALMLEK 422
IR G A ++ G V G AV G G + G +E
Sbjct: 252 HLQRATIRRGDAPAGGAVPGGAVP----GGAVPGGFGPGGFGPVLDGWYGVDVSGSSVEL 307

Query: 423 GSSFTLNAGDTATDTTVNGGLFTARGGTLAGTTTLNNGAILTLSGKTV---NNDTLTIR- 478
S A G T GG+L+ G ++ G L+I
Sbjct: 308 AQSIVEAPELGAAIRVGRGARVTVSGGSLSAPH----GNVIETGGARRFAPQAAPLSITL 363

Query: 479 EGDALLQGGSLTGNGSVEKSGSGTLTVSNTTLTQKAVNLNEGTLTLNDSTVTTDVIAQRG 538
+ A QG +L E LT++ Q + E S DV
Sbjct: 364 QAGAHAQGKALLYRVLPEPV---KLTLTGGADAQGDIVATELPSIPGTSIGPLDVALASQ 420

Query: 539 TALKLTGSTVLNGAIDPTNVTLASDATWNIPDNATVQSVVDDLSHAGQIHF-TSSRTGTF 597
GA + +ATW + DN+ V ++ L+ G + F + G F
Sbjct: 421 ARWT--------GATRAVDSLSIDNATWVMTDNSNVGAL--RLASDGSVDFQQPAEAGRF 470

Query: 598 VPATLKVKNLNGQNGTISLRVRPDMAQNNADRLVIDGGRATGKTILNLVNAGNSASGLAT 657
L V L G +G + V D+ + D+LV+ A+G+ L + N+G+ + T
Sbjct: 471 --KVLTVNTLAG-SGLFRMNVFADLGLS--DKLVVMQD-ASGQHRLWVRNSGSEPASANT 524

Query: 658 SGKGIQVVEAINGATTEEGAFVQGNRLQAGAFNYSLNRDSDESWYLRSENAYRAEVP 714
+ V + A T A ++ G + Y L + + W L A A P
Sbjct: 525 L---LLVQTPLGSAATFTLANK-DGKVDIGTYRYRLAANGNGQWSLVGAKAPPAPKP 577


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3662     CHANLCOLICIN  32     0.004    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 32.0 bits (72), Expect = 0.004
Identities = 25/91 (27%), Positives = 42/91 (46%), Gaps = 5/91 (5%)

Query: 4 SLAHENARLRALLQTQQDTIRQMAEYNRLLSQRVAAYASEINRLKALVAKLQRMQFGKSS 63
+ A A AL Q +D + + +N + A N A+ A+ +R++ K+
Sbjct: 79 AQAKAKANRDALTQRLKDIVNEALRHNASRTPSATELAHANN--AAMQAEDERLRLAKAE 136

Query: 64 EKLR---AKTERQIQEAQERISALQEEMAET 91
EK R E+ QEA++R ++ E AET
Sbjct: 137 EKARKEAEAAEKAFQEAEQRRKEIEREKAET 167


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3694     SYCDCHAPRONE  45     5e-07    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 44.5 bits (105), Expect = 5e-07
Identities = 17/90 (18%), Positives = 33/90 (36%), Gaps = 7/90 (7%)

Query: 238 AFHQY-KGRWAKAIDAYEKHTSINPLCAELYYRLGLSYDRCYQWDKAAENYRKALSLDEN 296
AF+QY G++ A ++ ++ + + LG Q+D A +Y +D
Sbjct: 43 AFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIK 102

Query: 297 HPYWHYRLGFVL------ERSQKYLDAAVA 320
P + + L ++ L A
Sbjct: 103 EPRFPFHAAECLLQKGELAEAESGLFLAQE 132



Score = 33.0 bits (75), Expect = 0.004
Identities = 14/83 (16%), Positives = 32/83 (38%), Gaps = 1/83 (1%)

Query: 265 ELYYRLGLSYDRCYQWDKAAENYRKALSLDENHPYWHYRLGFVLERSQKYLDAAVAYQFA 324
E Y L + + +++ A + ++ LD + LG + +Y A +Y +
Sbjct: 37 EQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYG 96

Query: 325 AQSNIKHISIWYYRSGYAYAKAN 347
A +IK + + + +
Sbjct: 97 AIMDIK-EPRFPFHAAECLLQKG 118



Score = 29.9 bits (67), Expect = 0.044
Identities = 16/84 (19%), Positives = 31/84 (36%), Gaps = 2/84 (2%)

Query: 15 EYINGIS--LYKKKEWEKALLFFEKSIIKKTKHAESYFKAGICNLKLHRYEEAFKYISKA 72
+ G+ +++ A+ + I K F A C L+ EA + A
Sbjct: 71 RFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFLA 130

Query: 73 LELEPSNIQWKEQLEQCARHLDKL 96
EL ++KE + + L+ +
Sbjct: 131 QELIADKTEFKELSTRVSSMLEAI 154


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3696     LPSBIOSNTHSS  46     6e-09    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 45.6 bits (108), Expect = 6e-09
Identities = 23/92 (25%), Positives = 46/92 (50%), Gaps = 7/92 (7%)

Query: 7 KVITFGTFDVLHIGHINILKRAKKMGDYLIVGVSSDYLNFSKKQRYPVYPETERLEIIR- 65
I G+FD + GH++I++R ++ D + V V N +K+ P++ ERLE I
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLR---NPNKQ---PMFSVQERLEQIAK 55

Query: 66 SLKFVDEVFIEESLELKGEYIKKFKADILVMG 97
++ + ++ L Y ++ +A ++ G
Sbjct: 56 AIAHLPNAQVDSFEGLTVNYARQRQAGAILRG 87


48  c3750  c3767  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c3750     0       23       -5.959384  Putative regulator
c3751     1       23       -5.742296  Hypothetical oxidoreductase ydfI
c3752     0       19       -3.890649  Hypothetical zinc-type alcohol
c3753     -1      18       -3.630954  Ureidoglycolate dehydrogenase
c3754     -1      18       -3.624542  Putative c4-dicarboxylate transport system
c3755     -2      11       -0.262947  Hypothetical protein
c3756     -2      11       0.707868   c4-dicarboxylate permease
c3757     -2      13       2.004206   Protein sufI precursor
c3758     -2      12       0.999746   1-acyl-sn-glycerol-3-phosphate acyltransferase
c3759     -2      13       1.331898   Hypothetical protein
c3760     -2      12       1.455813   Topoisomerase IV subunit A
c3761     -2      16       -1.874451  Putative binding protein ygiS precursor
c3762     0       21       -4.765243  Hypothetical protein ygiV
c3763     0       20       -5.050623  Protein ygiW precursor
c3764     0       20       -4.988651  Transcriptional Regulatory protein qseB
c3765     -1      20       -4.800091  Sensor protein qseC
c3766     -1      26       -6.271561  Hypothetical protein
c3767     -2      22       -3.284166  Conserved hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c3761     PF07675  30     0.043    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 29.7 bits (66), Expect = 0.043
Identities = 24/112 (21%), Positives = 47/112 (41%), Gaps = 16/112 (14%)

Query: 35 SAAPLYAADVPANTPLAPQQVFRYNNHSDPGTLDPQKVEENTAAQIVL------------ 82
+ PL+ +N A + + ++DP + Q + ++V+
Sbjct: 421 ATGPLFTGTASSNLYSANFE-YLTPANADP-VVTTQNIIVTGQGEVVIPGGVYDYCITNP 478

Query: 83 DLFEGLVWMDGEGQVQPAQAERWEILDGGKRYIFHLRSGLQWSDGQPLTAED 134
+ G +W+ G+G QPA+ + + + GK+Y F +R DG + ED
Sbjct: 479 EPASGKMWIAGDGGNQPARYDDF-AFEAGKKYTFTMRR-AGMGDGTDMEVED 528


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c3764     HTHFIS  90     5e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.9 bits (223), Expect = 5e-23
Identities = 30/129 (23%), Positives = 56/129 (43%)

Query: 2 RILLIEDDMLIGDGIKTGLSKMGFRVDWFTQGRQGKEALYSAPYDAVILDLTLPGMDGRD 61
IL+ +DD I + LS+ G+ V + + + D V+ D+ +P + D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 ILREWREKGQREPVLILTARDALAERVEGLRLGADDYLCKPFALIEVAARLEALMRRTNG 121
+L ++ PVL+++A++ ++ GA DYL KPF L E+ + +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 QASNELRHG 130
+ S
Sbjct: 125 RPSKLEDDS 133


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c3765     PF06580  40     1e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 40.2 bits (94), Expect = 1e-05
Identities = 38/176 (21%), Positives = 62/176 (35%), Gaps = 29/176 (16%)

Query: 284 DRATRLVDQLLTLSRLDSLDNLQDVAEIPLEDLLQSSVMDIYHTAQQAKIDVRLTLKANG 343
+A ++ L L R SL ++ L D L +V+D Y + + RL +
Sbjct: 191 TKAREMLTSLSELMRY-SLRYSNA-RQVSLADEL--TVVDSYLQLASIQFEDRLQFENQI 246

Query: 344 IKRTGQ----PLLLSLLVRNLLDNAVRYSPQGSVVDVTLNADN----FIVRDNGPGVTPE 395
P+L+ LV N + + + PQG + + DN V + G
Sbjct: 247 NPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN 306

Query: 396 ALARIGERFYRPPGQTATGSGLGLSIV-QRIAKLHGMNVDFG-NAEQGGFEAKVSW 449
T +G GL V +R+ L+G + +QG A V
Sbjct: 307 ---------------TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLI 347


49  c3787  c3794  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c3787     0       14       -3.118972  Putative disulfide isomerase
c3788     0       13       -3.218890  Hypothetical protein ygiD
c3789     0       17       -5.200843  Zinc transporter zupT
c3790     0       17       -5.378524  Conserved hypothetical protein
c3791     0       17       -4.938897  Hypothetical fimbrial-like protein ygiL
c3792     -1      17       -4.851854  Hypothetical outer membrane usher protein yqiG
c3793     -3      25       -5.592979  Hypothetical fimbrial chaperone yqiH precursor
c3794     -3      23       -4.270527  Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c3792     PF00577  689    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 689 bits (1780), Expect = 0.0
Identities = 234/878 (26%), Positives = 401/878 (45%), Gaps = 70/878 (7%)

Query: 14 HAIKNALSG------VVCSLLFVLPVH--AVEFNVDMIDAEDRENIDISRFEKKGYIPPG 65
H K+ L+G V C+ P+ + FN + + + D+SRFE +PPG
Sbjct: 17 HIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPG 76

Query: 66 RYLVRVQINKNMLPQTLILEWVKADNESGSLLCLTKENLTNFGLNTEFIESLQNIAGSEC 125
Y V + +N + + + D+E G + CLT+ L + GLNT + + +A C
Sbjct: 77 TYRVDIYLNNGYMATRDV-TFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDAC 135

Query: 126 LDLSQR-QELTTRLDKATMILSLSVPQAWLKYQATNWTPPEFWDTGITGFILDYNVYASQ 184
+ L+ + T +LD L+L++PQA++ +A + PPE WD GI +L+YN +
Sbjct: 136 VPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNS 195

Query: 185 YAPHHGDSTQNVSSYGTLGFNLGAWRLRSDYQYNQNFADGRSVNRDS-EFARTYLFRPIP 243
G ++ G N+GAWRLR + ++ N +D S +++ + T+L R I
Sbjct: 196 VQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDII 255

Query: 244 SWSSKFTMGQYDLSSNLYDTFHFTGASLESDESMLPPDLQGYAPQITGIAQTNAKVTVAQ 303
S+ T+G +++D +F GA L SD++MLP +G+AP I GIA+ A+VT+ Q
Sbjct: 256 PLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQ 315

Query: 304 NGRVLYQTTVAPGPFTISDL-GQSFQGQLDVTVEEEDGRTSTFQVGSASIPYLTRKGQVR 362
NG +Y +TV PGPFTI+D+ G L VT++E DG T F V +S+P L R+G R
Sbjct: 316 NGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTR 375

Query: 363 YKTSLGKPTSVGHNDINNPFFWTAEASWGWLNNVSLYGGGMFTADDYQAITTGIGFNLNQ 422
Y + G+ S P F+ + G ++YGG AD Y+A GIG N+
Sbjct: 376 YSITAGEYRSGNAQQE-KPRFFQSTLLHGLPAGWTIYGGTQL-ADRYRAFNFGIGKNMGA 433

Query: 423 FGSLSFDVTGADASLQQQNSGNLRGYSYRFNYAKHFESTGSQITFAGYRFSDKDYVSMSE 482
G+LS D+T A+++L G S RF Y K +G+ I GYR+S Y + ++
Sbjct: 434 LGALSVDMTQANSTLPDD--SQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFAD 491

Query: 483 YLSSRNGDESID--------------------NEKESYVISLNQYFETLELNSYLNVTRN 522
SR +I+ N++ +++ Q YL+ +
Sbjct: 492 TTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRT-STLYLSGSHQ 550

Query: 523 TYWDS-ASNTNYSVSVSKNFDIGDFKGISASLAVSRIR--WDDDEENQYYFSFSLPL--- 576
TYW + + + ++ F+ I+ +L+ S + W + + ++P
Sbjct: 551 TYWGTSNVDEQFQAGLNTA-----FEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHW 605

Query: 577 --------QQNRNISYSMQRTGSSNTSQMISWYDS--SDRNNIWNISASATDDNIRDGEP 626
++ + SYSM + + + Y + D N +++ +
Sbjct: 606 LRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGS 665

Query: 627 TLRGSYQHYSPWGRLNINGSVQPNQYNSVTAGWYGSLTATRHGVALHDYSYGDNARMMVD 686
T + + +G NI S + + G G + A +GV L ++ ++V
Sbjct: 666 TGYATLNYRGGYGNANIGYSHS-DDIKQLYYGVSGGVLAHANGVTLGQPL--NDTVVLVK 722

Query: 687 TDGISGIEINSNRTV-TNGLGIAVIPSLSNYTTSMLRVNNNDLPEGVDVENSVIRTTLTQ 745
G ++ + V T+ G AV+P + Y + + ++ N L + VD++N+V T+
Sbjct: 723 APGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTR 782

Query: 746 GAIGYAKLNATTGYQIVGVIRQENGRFPPLGVNVTDKATGKDVGLVAEDGFVYLSGIQEN 805
GAI A+ A G +++ + N + P G VT + + G+VA++G VYLSG+
Sbjct: 783 GAIVRAEFKARVGIKLLMTLTH-NNKPLPFGAMVTS-ESSQSSGIVADNGQVYLSGMPLA 840

Query: 806 SILHLTWGD---NTCEVT---PPNQSNISESAIILPCK 837
+ + WG+ C PP + + C+
Sbjct: 841 GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878


50  c3854  c3876  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c3854     2       21       0.717576   Hypothetical protein yqjB
c3855     0       22       0.163860   Protein yqjC precursor
c3856     -3      20       2.255448   Hypothetical protein yqjD
c3857     -2      18       1.983405   Hypothetical protein yqjE
c3858     -1      17       1.675465   Hypothetical protein yqjK
c3859     0       18       1.209716   Hypothetical protein yqjF
c3862     0       19       3.083571   Hypothetical protein yhaH
c3863     -1      17       3.009952   Hypothetical transcriptional regulator yhaJ
c3864     -1      15       2.345939   Hypothetical protein yhaK
c3865     0       16       2.049394   Hypothetical protein
c3866     0       16       1.791745   Hypothetical protein yhaL
c3867     -1      16       1.033619   Conserved hypothetical protein
c3870     0       15       -0.645551  L-serine dehydratase 1
c3871     -1      14       -2.459732  TdcF protein
c3872     -1      13       -3.746053  Keto-acid formate acetyltransferase
c3873     0       19       -7.106600  Putative conserved protein
c3874     -1      15       -4.726005  Threonine/serine transporter
c3875     -1      18       -4.140743  Threonine dehydratase catabolic
c3876     -2      15       -3.019463  Tdc operon transcriptional activator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3873     ACETATEKNASE  49     3e-11    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 49.0 bits (117), Expect = 3e-11
Identities = 15/42 (35%), Positives = 23/42 (54%), Gaps = 2/42 (4%)

Query: 1 MNNRSNSFGERIVSSENARVICAVIPTNEEKMIALDAIHLGK 42
N E I+S+ +++V V+PTNEE MIA D + +
Sbjct: 358 KNKVRGE--EAIISTADSKVNVMVVPTNEEYMIAKDTEKIVE 397


51  c3904  c3924  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c3904     0       20       3.822374   Hypothetical protein yraP precursor
c3905     0       21       3.356719   Hypothetical protein yraQ
c3906     0       18       3.061611   Hypothetical protein yraR
c3907     0       19       3.355875   Hypothetical protein yhbP
c3908     -2      18       3.700051   Hypothetical protein yhbQ
c3909     -2      19       3.318582   Hypothetical acetyltransferase yhbS
c3910     -2      19       2.808880   Hypothetical protein yhbT
c3911     -1      18       3.390350   Putative protease yhbU precursor
c3912     0       20       2.914514   Hypothetical protein yhbV
c3913     -1      18       1.681280   Hypothetical protein yhbW
c3914     1       22       1.411833   Tryptophan-specific transport protein
c3915     3       29       1.406708   Hypothetical protein
c3916     4       30       1.631048   Cold-shock DEAD-box protein A
c3917     3       26       0.990171   Hypothetical protein
c3918     3       26       1.051774   Lipoprotein nlpI precursor
c3919     4       32       1.695653   Hypothetical protein
c3920     5       34       1.592226   Polyribonucleotide nucleotidyltransferase
c3921     5       30       1.487760   30S ribosomal protein S15
c3922     4       29       1.312059   tRNA pseudouridine synthase B
c3923     5       28       0.842778   Ribosome-binding factor A
c3924     3       26       0.886115   Translation initiation factor IF-2
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3906     NUCEPIMERASE  29     0.014    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 29.0 bits (65), Expect = 0.014
Identities = 8/22 (36%), Positives = 13/22 (59%)

Query: 4 VLITGATGLVGGHLLRMLINEP 25
L+TGA G +G H+ + L+
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAG 24


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c3924     TCRTETOQM  73     2e-15    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 73.4 bits (180), Expect = 2e-15
Identities = 70/313 (22%), Positives = 110/313 (35%), Gaps = 77/313 (24%)

Query: 396 IMGHVDHGKTSLLDYI-----RSTKVASGEAG-------------GITQHIGAYHVETEN 437
++ HVD GKT+L + + T++ S + G GIT G + EN
Sbjct: 8 VLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWEN 67

Query: 438 GMITFLDTPGHAAFTSMRARGAQATDIVVLVVAADDGVMPQTIEAIQHAKAAGVPVVVAV 497
+ +DTPGH F + R D +L+++A DGV QT + G+P + +
Sbjct: 68 TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFI 127

Query: 498 NKIDKPEADPDRV----KNELSQYGI-----------------LPEEWG----------- 525
NKID+ D V K +LS + E+W
Sbjct: 128 NKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGNDDLLE 187

Query: 526 ---------------GESQFV---------HVSAKAGTGIDELLDAILLQAEVLELKAVR 561
ES H SAK GID L++ I +
Sbjct: 188 KYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVIT--NKFYSSTHRG 245

Query: 562 KGMASGAVIESFLDKGRGPVATVLVREGTLHKGDIVL-CGFEYGRVRAMRNELGQEVLEA 620
+ G V + + R +A + + G LH D V E ++ M + E+ +
Sbjct: 246 QSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKIKITEMYTSINGELCKI 305

Query: 621 GPSIPVEILGLSG 633
+ EI+ L
Sbjct: 306 DKAYSGEIVILQN 318


52  c3960  c3983  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c3960     2       16       0.407620   Protein yhbN precursor
c3961     3       17       0.639697   Probable ABC transporter ATP-binding protein
c3962     3       15       0.099221   RNA polymerase sigma-54 factor
c3963     0       17       -0.386830  Probable sigma(54) modulation protein
c3964     -1      15       0.332660   Nitrogen regulatory IIA protein
c3965     0       13       -0.157124  Hypothetical protein yhbJ
c3966     -2      13       0.641205   Phosphocarrier protein NPr
c3967     -2      13       0.365735   Hypothetical protein yrbL
c3968     -1      17       3.022797   Monofunctional biosynthetic peptidoglycan
c3969     -1      20       3.388993   Enhancing lycopene biosynthesis protein 2
c3970     -1      20       3.139169   Aerobic respiration control sensor protein arcB
c3971     -1      21       4.617086   Hypothetical protein yhcC
c3972     -1      21       4.721650   Hypothetical protein
c3973     -2      21       4.919201   Glutamate synthase [NADPH] large chain
c3974     -2      14       3.813369   Glutamate synthase [NADPH] small chain
c3975     -2      13       3.420308   Hypothetical protein yhcH
c3976     -2      14       4.223021   Hypothetical protein yhcI
c3977     -1      18       3.556187   Hypothetical protein yhcJ
c3978     -1      18       2.520001   Putative sialic acid transporter
c3979     3       22       1.588582   N-acetylneuraminate lyase subunit
c3980     3       27       0.884076   Hypothetical transcriptional regulator yhcK
c3981     1       19       0.301740   Stringent starvation protein B
c3982     1       19       0.059961   Stringent starvation protein A
c3983     2       20       0.052082   Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c3970     HTHFIS  64     7e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 64.5 bits (157), Expect = 7e-13
Identities = 26/115 (22%), Positives = 45/115 (39%), Gaps = 4/115 (3%)

Query: 528 VLLVEDIELNVIVARSVLEKLGNSVDVAMTGKAALEMFKPGEYDLVLLDIQLPDMTGLDI 587
+L+ +D V L + G V + G+ DLV+ D+ +PD D+
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 588 SRALTKRYPREDLPPLVALTA-NVLKDKQEYLNAGMDDVLSKPLSVPALTAMIKK 641
+ K P P++ ++A N + G D L KP + L +I +
Sbjct: 66 LPRIKKARPD---LPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c3978     TCRTETB  57     1e-10    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 56.8 bits (137), Expect = 1e-10
Identities = 65/408 (15%), Positives = 139/408 (34%), Gaps = 33/408 (8%)

Query: 40 LLDGFDFVLIALVLTEVQGEFGLTTVQAASLISAAFISRWFGGLMLGAMGDRYGRRLAMV 99
+ +++ + L ++ +F + +A ++ G + G + D+ G + ++
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 100 TSIVLFSAGTLACGFAPGYITMFI-ARLVIGMGMAGEYGSSATYVIESWPKHLRNKASGF 158
I++ G++ + ++ I AR + G G A V PK R KA G
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 159 LISGFSVGAVVAAQVYSLVVPVWGWRALFFIGILPIIFALWLRKNIPEAEDWKEKHGGKA 218
+ S ++G V + ++ W L I ++ II +L K + +
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVR--------- 194

Query: 219 PVRTMVDILYRGEHRIANIVMTLAAATALWFCFAGNLQNAAIVAVLGLLCAAIFISFMVQ 278
+G I I++ + IV+VL L IF+ + +
Sbjct: 195 ---------IKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFL---IFVKHIRK 242

Query: 279 STGK----RWPTGVMLMVVVLFAFLYSWPIQA---LLPTYLKTDLAYDPHTVANVLFFSG 331
T + M+ VL + + ++P +K + +V+ F G
Sbjct: 243 VTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPG 302

Query: 332 -FGAAVGCCVGGFLGEWLGTRK-AYVCSLLASQLLIIPVFAIGGANVWVLGLLLFFQQML 389
+ +GG L + G + S + F + W + +++ F
Sbjct: 303 TMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLET-TSWFMTIIIVFVLGG 361

Query: 390 GQGISGILPKLIGGYFDTDQRAAGLGFTYNVGALGGALAP-ILGALIA 436
++ ++ + AG+ L I+G L++
Sbjct: 362 LSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


53  c4011  c4025  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c4011     0       13       -3.228325  Biotin carboxyl carrier protein of acetyl-CoA
c4012     0       14       -4.052676  Biotin carboxylase
c4013     0       22       -5.234805  Hypothetical protein
c4014     0       24       -6.285126  Hypothetical protein
c4015     -1      25       -6.389607  Ribose transport system permease protein rbsC
c4016     -1      21       -4.969603  Ribose transport ATP-binding protein rbsA
c4017     -1      20       -3.948892  Putative ribose ABC transporter
c4018     -2      12       -0.893601  Tagatose-bisphosphate aldolase gatY
c4019     -2      12       0.295513   Hypothetical protein
c4020     -2      13       0.557981   Hypothetical protein
c4021     -3      11       1.256346   Hypothetical protein
c4022     -4      13       1.729305   Hypothetical protein yhdT
c4023     -4      13       0.633845   Sodium/pantothenate symporter
c4024     -1      19       -1.181059  Ribosomal protein L11 methyltransferase
c4025     -2      19       -3.751480  Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c4011     RTXTOXIND  27     0.026    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 27.5 bits (61), Expect = 0.026
Identities = 8/27 (29%), Positives = 16/27 (59%)

Query: 127 IEADKSGTVKAILVESGQPVEFDEPLV 153
I+ ++ VK I+V+ G+ V + L+
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLL 125


54  c4100  c4123  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c4100     2       21       -1.781385  Putative general secretion pathway protein H
c4101     1       22       -2.355176  Probable general secretion pathway protein I
c4102     2       22       -3.062291  Probable general secretion pathway protein J
c4103     0       23       -3.396837  Probable general secretion pathway protein K
c4104     -2      19       -2.720582  Probable general secretion pathway protein L
c4105     -2      21       -3.432074  Putative general secretion pathway protein M
c4106     -2      24       -2.105299  Type 4 prepilin-like proteins leader peptide
c4107     1       35       -1.744807  Bacterioferritin
c4108     2       37       -1.322833  Bacterioferritin-associated ferredoxin
c4109     3       39       -1.255093  Probable bifunctional chitinase/lysozyme
c4110     7       48       -1.265498  Hypothetical protein
c4111     8       51       -0.397975  Elongation factor Tu
c4112     6       44       -1.079569  Elongation factor G
c4113     7       32       -1.449851  Hypothetical protein
c4114     7       29       -0.577165  30S ribosomal protein S7
c4115     4       23       -1.557380  Hypothetical protein
c4116     3       23       -1.852534  30S ribosomal protein S12
c4117     3       18       -1.781228  Hypothetical protein yheL
c4118     3       24       -0.792838  Hypothetical protein yheM
c4119     2       24       -1.015263  Hypothetical protein yheN
c4120     0       17       0.226469   Hypothetical protein yheO
c4121     0       17       1.777418   FKBP-type peptidyl-prolyl cis-trans isomerase
c4122     -1      15       3.321905   SlyX protein
c4123     -2      14       3.143369   FKBP-type peptidyl-prolyl cis-trans isomerase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4100     BCTERIALGSPH  141    2e-45    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 141 bits (356), Expect = 2e-45
Identities = 49/153 (32%), Positives = 76/153 (49%), Gaps = 18/153 (11%)

Query: 3 QQRGFTLLEMMLVLALVAITASVVLFTYGREDAANTRARETAARFTAALELAIDRATLSG 62
+QRGFTLLEMML+L L+ ++A +VL + + + A +T ARF A L R +G
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFP--ASRDDSAAQTLARFEAQLRFVQQRGLQTG 59

Query: 63 QPVGIHFSDSAWRIMV----PGKTP-------SAWRWVPLQEDAADESKNDWGEELSIQL 111
Q G+ W+ +V G P S +RW+PL+ S + G +L++
Sbjct: 60 QFFGVSVHPDRWQFLVLEARDGADPAPADDGWSGYRWLPLRAGRVATSGSIAGGKLNLAF 119

Query: 112 ---QPFKPDDSNQPQVVILADGQITPFSLLMAN 141
+ + P D P V+I G++TPF L +
Sbjct: 120 AQGEAWTPGD--NPDVLIFPGGEMTPFRLTLGE 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4101     BCTERIALGSPG  30     0.002    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 29.9 bits (67), Expect = 0.002
Identities = 17/90 (18%), Positives = 41/90 (45%), Gaps = 8/90 (8%)

Query: 14 MNKQSGMTLLEVLLAMSIFTAVALTLMSSMQGQ--RNAIERMRNETLALWIADNQLQSQD 71
+KQ G TLLE+++ + I +A ++ ++ G + ++ ++ +AL A + + D
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYK-LD 62

Query: 72 SFGEENTSSSGKELING-----EEWNWRSD 96
+ T+ + L+ N+ +
Sbjct: 63 NHHYPTTNQGLESLVEAPTLPPLAANYNKE 92


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4102  BCTERIALGSPG  34  1e-04  Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 34.1 bits (78), Expect = 1e-04
Identities = 14/45 (31%), Positives = 27/45 (60%), Gaps = 5/45 (11%)

Query: 3 NRQQGFTLLEVMAALAIFSMLSVLAFMIFSQVSELHQRSQKEIQK 47
++Q+GFTLLE+M + I + VLA ++ + + + + + QK
Sbjct: 5 DKQRGFTLLEIMVVIVI---IGVLASLVVPNL--MGNKEKADKQK 44


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4106  PREPILNPTASE  151  4e-47  Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 151 bits (383), Expect = 4e-47
Identities = 88/262 (33%), Positives = 118/262 (45%), Gaps = 47/262 (17%)

Query: 5 LPLFILVGFIAGYFVNVMAYHL---------------SPLEDKTALTFRQVLVH------ 43
L L + G F+NV+ + L +D+ L+
Sbjct: 16 FSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDEPPYNLMVPRSCCP 75

Query: 44 FWQKKYAWHDTVPLI-------------------------LCVAAAIACALAPFTPIVTG 78
+ +PL+ L ++A A+ T
Sbjct: 76 HCNHPITALENIPLLSWLWLRGRCRGCQAPISARYPLVELLTALLSVAVAMTLAPGWGTL 135

Query: 79 ALFLYFCFALTLSVIDFRTQLLPDKLTLPLLWLGLVFNAQSGLIDLHDAVYGAVAGYGVL 138
A L + L+ ID LLPD+LTLPLLW GL+FN G + L DAV GA+AGY VL
Sbjct: 136 AALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDAVIGAMAGYLVL 195

Query: 139 WCVYWGVWLVCHKEGLGYGDFKLLAAAGAWCGWQTLPMILLIASLGGIGYAIVSQLLQRR 198
W +YW L+ KEG+GYGDFKLLAA GAW GWQ LP++LL++SL G I LL+
Sbjct: 196 WSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLILLRNH 255

Query: 199 TIPT-IAFGPWLALGSMINLGY 219
I FGP+LA+ I L +
Sbjct: 256 HQSKPIPFGPYLAIAGWIALLW 277


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4107  HELNAPAPROT  35  3e-05  Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 35.2 bits (81), Expect = 3e-05
Identities = 28/150 (18%), Positives = 59/150 (39%), Gaps = 24/150 (16%)

Query: 5 TKVINYLNKLLGNE---LVAINQYFLHARMFKNWGLKRLNDVEYHESIDEM-----KHAD 56
T V N LN L N ++++ +W +K + HE +E+ + D
Sbjct: 11 TLVENSLNTQLSNWFLLYSKLHRF--------HWYVKGPHFFTLHEKFEELYDHAAETVD 62

Query: 57 RYIERILFLEGLPN--LQDLGKL------NIGEDVEEMLRSDLALELDGAKNLREAIGYA 108
ER+L + G P +++ + EM+++ + + + IG A
Sbjct: 63 TIAERLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVNDYKQISSESKFVIGLA 122

Query: 109 DSVHDYVSRDMMIEILRDEEGHIDWLETEL 138
+ D + D+ + ++ + E + L + L
Sbjct: 123 EENQDNATADLFVGLIEEVEKQVWMLSSYL 152


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4111  TCRTETOQM  80  4e-18  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 79.5 bits (196), Expect = 4e-18
Identities = 57/198 (28%), Positives = 87/198 (43%), Gaps = 13/198 (6%)

Query: 28 VNVGTIGHVDHGKTTLTAAI------TTVLAKTYGGAARAFDQIDNAPEEKARGITINTS 81
+N+G + HVD GKTTLT ++ T L G R DN E+ RGITI T
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRT----DNTLLERQRGITIQTG 59

Query: 82 HVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGPMPQTREHILLGRQV 141
+ +D PGH D++ + + +DGAIL+++A DG QTR R++
Sbjct: 60 ITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKM 119

Query: 142 GVPYIIVFLNKCDMVDDEELLELVEMEVRELLSQYDFPGDDTPIVRGSALKALEGDAEWE 201
G+P I F+NK D + L V +++E LS + + +W+
Sbjct: 120 GIP-TIFFINKIDQNGID--LSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWD 176

Query: 202 AKILELAGFLDSYIPEPE 219
I L+ Y+
Sbjct: 177 TVIEGNDDLLEKYMSGKS 194


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4112  TCRTETOQM  613  0.0  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 613 bits (1583), Expect = 0.0
Identities = 178/698 (25%), Positives = 304/698 (43%), Gaps = 81/698 (11%)

Query: 9 RYRNIGISAHIDAGKTTTTERILFYTGVNHKIGEVHDGAATMDWMEQEQERGITITSAAT 68
+ NIG+ AH+DAGKTT TE +L+ +G ++G V G D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 TAFWSGMAKQYEPHRINIIDTPGHVDFTIEVERSMRVLDGAVMVYCAVGGVQPQSETVWR 128
+ W ++NIIDTPGH+DF EV RS+ VLDGA+++ A GVQ Q+ ++
Sbjct: 62 SFQWEN-------TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFH 114

Query: 129 QANKYKVPRIAFVNKMDRMGANFLKVVNQIKTRLGANPVPLQLAIGAEEHFTGVVDLVKM 188
K +P I F+NK+D+ G + V IK +L A V Q V M
Sbjct: 115 ALRKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQ----------KVELYPNM 164

Query: 189 KAINWNDADQGVTFEYEDIPADMVELANEWHQNLIESAAEASEELMEKYLGGEELTEAEI 248
N+ +++Q ++ E +++L+EKY+ G+ L E+
Sbjct: 165 CVTNFTESEQ------------------------WDTVIEGNDDLLEKYMSGKSLEALEL 200

Query: 249 KGALRQRVLNNEIILVTCGSAFKNKGVQAMLDAVIDYLPSPVDVPAINGILDDGKDTPAE 308
+ R N + V GSA N G+ +++ + + S
Sbjct: 201 EQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH----------------- 243

Query: 309 RHASDDEPFSALAFKIATDPFVGNLTFFRVYSGVVNSGDTVLNSVKAARERFGRIVQMHA 368
FKI L + R+YSGV++ D+V S K + + +
Sbjct: 244 ---RGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEK-EKIKITEMYTSIN 299

Query: 369 NKREEIKEVRAGDIAAAIG----LKDVTTGDTLCDPDAPIILERMEFPEPVISIAVEPKT 424
+ +I + +G+I L V GDT P ER+E P P++ VEP
Sbjct: 300 GELCKIDKAYSGEIVILQNEFLKLNSV-LGDTKLLPQR----ERIENPLPLLQTTVEPSK 354

Query: 425 KADQEKMGLALGRLAKEDPSFRVWTDEESNQTIIAGMGELHLDIIVDRMKREFNVEANVG 484
+E + AL ++ DP R + D +++ I++ +G++ +++ ++ +++VE +
Sbjct: 355 PQQREMLLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIK 414

Query: 485 KPQVAYRETIRQKVTDVEGKHAKQSGGRGQYGHVVIDMYPLEPGSNPKGYEFINDIKGGV 544
+P V Y E +K E + + + + + PL GS G ++ + + G
Sbjct: 415 EPTVIYMERPLKK---AEYTIHIEVPPNPFWASIGLSVSPLPLGS---GMQYESSVSLGY 468

Query: 545 IPGEYIPAVDKGIQEQLKAGPLAGYPVVDMGIRLHFGSYHDVDSSELAFKLAASIAFKEG 604
+ + AV +GI+ + G L G+ V D I +G Y+ S+ F++ A I ++
Sbjct: 469 LNQSFQNAVMEGIRYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQV 527

Query: 605 FKKAKPVLLEPIMKVEVETPEENTGDVIGDLSRRRGMLKGQESEVTGVKIHAEVPLSEMF 664
KKA LLEP + ++ P+E D + + + + V + E+P +
Sbjct: 528 LKKAGTELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQ 587

Query: 665 GYATQLRSLTKGRASYTMEFLKYDEAPSNVAQAVIEAR 702
Y + L T GR+ E Y + V + R
Sbjct: 588 EYRSDLTFFTNGRSVCLTELKGYHVT---TGEPVCQPR 622


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4120  ACRIFLAVINRP  29  0.023  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.7 bits (64), Expect = 0.023
Identities = 14/62 (22%), Positives = 29/62 (46%), Gaps = 1/62 (1%)

Query: 164 ASSVEDLVTQTLEFTIEEVNADRNV-SNNAKNRQIVLNLYEKGIFDIKDAINQVADRLNI 222
A +V+D VTQ +E + ++ + S + + + L + D A QV ++L +
Sbjct: 54 AQTVQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQL 113

Query: 223 SK 224
+
Sbjct: 114 AT 115


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4121  INFPOTNTIATR  132  5e-40  Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 132 bits (334), Expect = 5e-40
Identities = 79/226 (34%), Positives = 124/226 (54%), Gaps = 9/226 (3%)

Query: 28 AAKPATTADSKAAFKNDDQKSAYALGASLGRYMENSLKEQEKLGIKLDKDQLIAGVQDAF 87
A A A + D K +Y++GA LG K + GI ++ D L G+QD
Sbjct: 14 AMSTAMAATDATSLTTDKDKLSYSIGADLG-------KNFKNQGIDINPDVLAKGMQDGM 66

Query: 88 A-DKSKLSDQEIEQTLQAFEARVKSSAQAKMEKDAADNEAKGKEYREKFAKEKGVKTSST 146
+ + L++++++ L F+ + + A+ K A +N+AKG + + G+ +
Sbjct: 67 SGAQLILTEEQMKDVLSKFQKDLMAKRSAEFNKKAEENKAKGDAFLSANKSKPGIVVLPS 126

Query: 147 GLVYQVVEAGKGEAPKDSDTVVVNYKGTLIDGKEFDNSYTRGEPLSFRLDGVIPGWTEGL 206
GL Y++++AG G P SDTV V Y GTLIDG FD++ G+P +F++ VIPGWTE L
Sbjct: 127 GLQYKIIDAGTGAKPGKSDTVTVEYTGTLIDGTVFDSTEKAGKPATFQVSQVIPGWTEAL 186

Query: 207 KNIKKGGKIKLVIPPELAYGKAGVPG-IPPNSTLVFDVELLDVKPA 251
+ + G ++ +P +LAYG V G I PN TL+F + L+ VK A
Sbjct: 187 QLMPAGSTWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISVKKA 232


55  c4202  c4212  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
c4202          1       17   -4.106042  Thiosulfate sulfurtransferase glpE
c4203          1       23   -5.737681  Aerobic glycerol-3-phosphate dehydrogenase
c4204          3       39  -10.238606  Conserved hypothetical protein
c4205          3       42  -10.495420  Hypothetical protein
c4206          3       45  -11.719006  Hypothetical protein
c4207          3       47  -11.474719  Putative fimbrial adhesin precursor
c4208          2       45  -11.452250  Putative fimbrial chaperone precursor
c4209          2       42  -10.250254  Putative minor fimbrial subunit precursor
c4210          0       22   -6.281553  Putative minor fimbrial subunit precursor
c4211          0       17   -4.281315  Hypothetical protein
c4212          0       14   -3.314625  Hypothetical outer membrane usher protein ycbS
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4212  PF00577  884  0.0  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 884 bits (2287), Expect = 0.0
Identities = 398/866 (45%), Positives = 568/866 (65%), Gaps = 28/866 (3%)

Query: 19 KRVVPLLLVIMPACSIA--------GMRFNPAFLSGDTEAVADLSRFEKGMTYLPGSYEV 70
R+ + + AC+ A + FNP FL+ D +AVADLSRFE G PG+Y V
Sbjct: 21 HRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRV 80

Query: 71 EVWVNDSPLLSRTVTFKADD-ENQLIPCLSLADLLSLGINKNALPEQALASSENSCLDLR 129
++++N+ + +R VTF D E ++PCL+ A L S+G+N ++ L + +++C+ L
Sbjct: 81 DIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLA-DDACVPLT 139

Query: 130 IWFPDVHYMPELDAQRLKLTFPQAIIKRDARGYIPPEQWDNGITAFLLNYDFSGN--NDR 187
D ++ QRL LT PQA + ARGYIPPE WD GI A LLNY+FSGN +R
Sbjct: 140 SMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNR 199

Query: 188 GDYSSNNYYLNLRAGINIGAWRFRDYSTWSR-----GSNSAGKLEHISSTLQRVIIPFRS 242
+S+ YLNL++G+NIGAWR RD +TWS S S K +HI++ L+R IIP RS
Sbjct: 200 IGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRS 259

Query: 243 ELTLGDTWSSSDVFDSVSIRGIKLESDENMLPDSQSGFAPTVRGIAKSRAQVTIKQNGYV 302
LTLGD ++ D+FD ++ RG +L SD+NMLPDSQ GFAP + GIA+ AQVTIKQNGY
Sbjct: 260 RLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYD 319

Query: 303 IYQTYMPPGPFEISDLNPTSSAGDLEVTIKESDNSETVYTVPYAAVPILQREGHLKYSTT 362
IY + +PPGPF I+D+ ++GDL+VTIKE+D S ++TVPY++VP+LQREGH +YS T
Sbjct: 320 IYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSIT 379

Query: 363 VGQYRSNSYNQKSPYVFQGELIWGLPWDITAYGGAQFSEDYRALALGLGLNLGVFGATSF 422
G+YRS + Q+ P FQ L+ GLP T YGG Q ++ YRA G+G N+G GA S
Sbjct: 380 AGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSV 439

Query: 423 DVTQANSSLVDGSKHQGQSYRFLYSKSLVQTGTAFHIIGYRYSTQGFYTLSDTTYQQMSG 482
D+TQANS+L D S+H GQS RFLY+KSL ++GT ++GYRYST G++ +DTTY +M+G
Sbjct: 440 DMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNG 499

Query: 483 TVVDPKTLDDKDYVYNWNDFYNLRYSKRGKFQASVSQPFGNYGSMYLSASQQTYWNTDKK 542
++ + + + D+YNL Y+KRGK Q +V+Q G ++YLS S QTYW T
Sbjct: 500 YNIETQDGVIQVKPK-FTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNV 558

Query: 543 DSLYQVGYNTSIKGIYLNVAWNYSKSPGTN-ADKIVSLNVSLPISNWLSSTNDGRSSSNA 601
D +Q G NT+ + I ++++ +K+ D++++LNV++P S+WL S D +S
Sbjct: 559 DEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRS--DSKSQWRH 616

Query: 602 MTATYGYSQDNHGQVNQYTGVSGSLLEQHNLSYNIQHGFANQDNSSSGSVG---VNYRGA 658
+A+Y S D +G++ GV G+LLE +NLSY++Q G+A + +SGS G +NYRG
Sbjct: 617 ASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGG 676

Query: 659 YGSLNSAYSYDNEGNQQINYGISGALVVHENGLTLSQPLGETNVLIKAPGANNVDVQRGT 718
YG+ N YS+ + +Q+ YG+SG ++ H NG+TL QPL +T VL+KAPGA + V+ T
Sbjct: 677 YGNANIGYSHSD-DIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQT 735

Query: 719 GISTDWRGYAVVPYATEYRRNNISLDPMSMNMHTELDITSTEVIPGKGALVRAEFAAHIG 778
G+ TDWRGYAV+PYATEYR N ++LD ++ + +LD V+P +GA+VRAEF A +G
Sbjct: 736 GVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVG 795

Query: 779 IRGLFTVRYRNKSVPFGATASAQIKNSSQITGIVGDNGQLYLSGLPLEGVINIQWGDGVQ 838
I+ L T+ + NK +PFGA +++ SSQ +GIV DNGQ+YLSG+PL G + ++WG+
Sbjct: 796 IKLLMTLTHNNKPLPFGAMVTSE---SSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEEN 852

Query: 839 QKCQANYKLPETELDNPVSYATLECR 864
C ANY+LP ++ + ECR
Sbjct: 853 AHCVANYQLPPESQQQLLTQLSAECR 878


56  c4226  c4244  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
c4226         -1       16   -4.100840  Hypothetical protein
c4227          1       19   -6.308660  Gluconate utilization system GNT-I
c4228          2       23   -8.683563  Protein yhhW
c4229          2       19   -6.112577  Putative oxidoreductase yhhX
c4230          3       22   -6.851032  Hypothetical protein
c4233          2       19   -5.119688  Hypothetical protein yhhZ
c4234          0       14    0.285294  Hypothetical protein yrhA
c4235         -2       16    2.681116  Hypothetical protein yrhB
c4236         -2       19    3.305085  Gamma-glutamyltranspeptidase precursor
c4237         -2       22    3.156804  Hypothetical protein yhhA precursor
c4238         -2       22    3.068367  Glycerophosphoryl diester phosphodiesterase
c4239         -2       25    3.124200  SN-glycerol-3-phosphate transport ATP-binding
c4240         -1       24    2.363754  SN-glycerol-3-phosphate transport system
c4241         -1       25    3.338301  SN-glycerol-3-phosphate transport system
c4242         -2       26    3.564729  Glycerol-3-phosphate-binding periplasmic protein
c4243         -3       23    3.666404  Hypothetical protein
c4244         -3       23    3.321928  High-affinity branched-chain amino acid
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4236  NAFLGMOTY  32  0.005  Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 32.0 bits (72), Expect = 0.005
Identities = 27/80 (33%), Positives = 36/80 (45%), Gaps = 13/80 (16%)

Query: 272 RTPISGDYRGYQVYSMPPPSSGGIHIVQILNILENFDMQKYGF-GSADAMQIMAEAEKYA 330
R P+ G+ R + SMPPP G H +I N+ F Q G+ G A I++E EK
Sbjct: 77 RRPM-GETRNVSLISMPPPWRPGEHADRITNL--KFFKQFDGYVGGQTAWGILSELEKGR 133

Query: 331 YADRSEYLGDPDFVKVPWQA 350
Y P F WQ+
Sbjct: 134 Y---------PTFSYQDWQS 144


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4238  PF04619  29  0.014  Dr-family adhesin
		>PF04619#Dr-family adhesin

Length = 160

Score = 28.7 bits (64), Expect = 0.014
Identities = 12/60 (20%), Positives = 23/60 (38%), Gaps = 4/60 (6%)

Query: 29 VGAKYGHKMIEFDAKLSKDGEIFLLHDDNLERTSNGWGVAGDLNWQD----LLRVDAGSW 84
+G ++ D + G+ FL+ D+N ++ + W + D GSW
Sbjct: 70 LGCDARQVALKADTDNFEQGKFFLISDNNRDKLYVNIRPTDNSAWTTDNGVFYKNDVGSW 129


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4239  PF05272  28  0.046  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.046
Identities = 10/29 (34%), Positives = 16/29 (55%)

Query: 46 IVMVGPSGCGKSTLLRMVAGLERVTTGDI 74
+V+ G G GKSTL+ + GL+ +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHF 627


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4242  MALTOSEBP  39  2e-05  Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 39.3 bits (91), Expect = 2e-05
Identities = 39/160 (24%), Positives = 66/160 (41%), Gaps = 14/160 (8%)

Query: 134 GHLLSQPFNSSTPVLYYNKDAFKKAGLDPEQPPKTWQDLADYAAKLKASGMKCGYASGWQ 193
G L++ P L YNKD PPKTW+++ +LKA G + +
Sbjct: 127 GKLIAYPIAVEALSLIYNKDLLP-------NPPKTWEEIPALDKELKAKGKSALMFNLQE 179

Query: 194 GWIQLENFSAWNGLPFASKNNGFDGTDAVLEF--NKPEQVKHIAMLEEMNKKGDFSYVGR 251
+ +A G F +N +D D ++ K + +++ + D Y
Sbjct: 180 PYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDY--- 236

Query: 252 KDESTEKFYNGDCAMTTASSGSLANIREYAKFNYGVGMMP 291
+ F G+ AMT + +NI + +K NYGV ++P
Sbjct: 237 -SIAEAAFNKGETAMTINGPWAWSNI-DTSKVNYGVTVLP 274


57  c4255  c4283  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
c4255          2       14    1.712643  Cell division protein ftsX
c4256          3       14    2.078911  Cell division ATP-binding protein ftsE
c4257          2       13    3.500598  Cell division protein ftsY
c4258          0       17    3.911210  Putative methylase yhhF
c4259          0       16    3.518693  Hypothetical protein yhhL
c4260         -1       16    3.494576  Hypothetical protein yhhM
c4261         -1       15    3.691563  Hypothetical protein yhhN
c4262          0       14    2.700325  Lead, cadmium, zinc and mercury transporting
c4263          0       14    1.242928  SirA protein
c4264         -1       15    1.663475  Hypothetical protein yhhQ
c4265          0       16    2.514147  DcrB protein precursor
c4266         -1       18    3.687932  Hypothetical protein yhhS
c4267         -1       20    4.332486  Hypothetical protein yhhT
c4268         -1       25    5.647983  4'-phosphopantetheinyl transferase acpT
c4269         -1       25    5.480718  Nickel-binding periplasmic protein precursor
c4270          3       28    6.680847  Nickel transport system permease protein nikB
c4271         -1       22    3.752042  Nickel transport system permease protein nikC
c4272          0       20    1.389127  Nickel transport ATP-binding protein nikD
c4273          1       24   -0.987255  Nickel transport ATP-binding protein nikE
c4274          2       20   -2.650618  Nickel responsive regulator
c4275          2       17   -1.277392  Hypothetical protein
c4276          3       17   -1.682892  Putative regulator
c4277          2       17   -0.687151  Putative phosphotransferase system enzyme
c4278          3       15   -0.597431  Putative phosphotransferase system enzyme
c4279          2       14   -0.452822  PTS system, galactitol-specific IIC component
c4280          0       15    0.700485  Putative xylulose kinase
c4281          1       18    2.194906  Hypothetical protein
c4282          0       17    3.342480  Hypothetical protein
c4283          0       17    3.470133  Putative phosphocarrier protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4257  IGASERPTASE  50  1e-08  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 50.4 bits (120), Expect = 1e-08
Identities = 44/208 (21%), Positives = 73/208 (35%), Gaps = 21/208 (10%)

Query: 20 QTPEK-ETEVQNEQPVVEEIVQAQE----------PVKASEHAVEEQPQ-AHTEAKAETF 67
TP + +V + EEI + E P + +E E Q + T K E
Sbjct: 998 TTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQD 1057

Query: 68 AADVVEVTEQVAESEK----AQPEAEVVAQPEPVVEETPEPVAIEREELPLSEDVNAEAV 123
A + +VA+ K A + VAQ +ET E + E E
Sbjct: 1058 ATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVET- 1116

Query: 124 SPEEWQAEAETVEIVEAAEEEAAKEEITDEEPEAQALAAEVAEEA-VMVVSPAEEDQPVE 182
E E V + ++E ++ EP + +E + A+ +QP +
Sbjct: 1117 ---EKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAK 1173

Query: 183 EIAQEQEKPTKEGFFARLKRSLLKTKEN 210
E + E+P E S+++ EN
Sbjct: 1174 ETSSNVEQPVTESTTVNTGNSVVENPEN 1201



Score = 50.1 bits (119), Expect = 2e-08
Identities = 38/180 (21%), Positives = 61/180 (33%), Gaps = 11/180 (6%)

Query: 19 EQTPEKETEVQNEQPVVEEIVQAQEPVKASEHAVEEQPQAHTEAKAET-FAADVVEVTEQ 77
TP + TE E E + A+E + + EAK+ EV +
Sbjct: 1030 PATPSETTETVAENSKQESKTVEKNEQDATE-TTAQNREVAKEAKSNVKANTQTNEVAQS 1088

Query: 78 VAESEKAQPEAEVVAQPEPVVEETPEPVAIEREELPLSEDVNAEAVSPEEWQAEAETVEI 137
+E+++ Q E E E +E E+ V ++ VSP++ Q+E +
Sbjct: 1089 GSETKETQTT----ETKETATVEKEEKAKVETEKTQEVPKVTSQ-VSPKQEQSETVQPQ- 1142

Query: 138 VEAAEEEAAKEEITDEEPEAQALAAEVA---EEAVMVVSPAEEDQPVEEIAQEQEKPTKE 194
E A E I + + + A E + V P E V E P
Sbjct: 1143 AEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENT 1202



Score = 42.0 bits (98), Expect = 5e-06
Identities = 27/161 (16%), Positives = 56/161 (34%), Gaps = 14/161 (8%)

Query: 17 QKEQTPEKETEVQNEQPVVEEIVQA-QEPVKASEHAVEEQPQAHTEAKAETFAADVVEVT 75
+E E ++ V+ E+ Q+ E + +E E KA+ EV
Sbjct: 1065 NREVAKEAKSNVKAN-TQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVP 1123

Query: 76 EQVAE-------SEKAQPEAEVVAQPEPVV--EETPEPVAIEREELPLSEDVNAEAVSPE 126
+ ++ SE QP+AE + +P V +E + +++ ++ P
Sbjct: 1124 KVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPV 1183

Query: 127 EWQAEAETVEIVEAAEEEAAKEEITDEEPEAQALAAEVAEE 167
E+ TV + E +P + ++ +
Sbjct: 1184 T---ESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKN 1221



Score = 30.4 bits (68), Expect = 0.019
Identities = 21/117 (17%), Positives = 38/117 (32%), Gaps = 3/117 (2%)

Query: 17 QKEQTPEKETEVQNEQPVVEEIVQAQEPVKASEHAVEEQPQAHTEAKAETFAADVVEVTE 76
++ +T + + E E I + Q + A EQP T + E + V
Sbjct: 1134 EQSETVQPQAEPARENDPTVNIKEPQSQ--TNTTADTEQPAKETSSNVEQPVTESTTVNT 1191

Query: 77 QVAESEKAQPEAEVVAQPEPVVEETPEPVAIEREEL-PLSEDVNAEAVSPEEWQAEA 132
+ E + QP E + +P R + + +V S + A
Sbjct: 1192 GNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVA 1248


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4263  PF01206  105  3e-34  SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 105 bits (265), Expect = 3e-34
Identities = 24/72 (33%), Positives = 41/72 (56%)

Query: 9 DHTLDALGLRCPEPVMMVRKTVRNMQPGETLLIIADDPATTRDIPGFCTFMEHELVAKET 68
D +LDA GL CP P++ +KT+ M GE L ++A DP + +D F HEL+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 69 DGLPYRYLIRKG 80
+ Y + +++
Sbjct: 65 EDGTYHFRLKRA 76


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4266  TCRTETA  55  3e-10  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 54.8 bits (132), Expect = 3e-10
Identities = 80/398 (20%), Positives = 147/398 (36%), Gaps = 32/398 (8%)

Query: 27 LRLNLRIVSIVMFNFASYLTIGLPLAVLPGYVHDVM--GFSAFWAGLVISLQYFATLLSR 84
++ N ++ I+ + IGL + VLPG + D++ G++++L
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 85 PHAGRYADLLGPKKIVVFGLCGCFLSGLGYLTAGLTASLPVISLLLLCLGRVILGI-GQS 143
P G +D G + +++ L G + + Y L V L +GR++ GI G +
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAG---AAVDYAIMATAPFLWV-----LYIGRIVAGITGAT 112

Query: 144 FAGTGSTLWGVGVVGSL--HIGRVISWNGIVTYGAMAMGAPLGVVFYHWGGLQALALIIM 201
A G+ + + H G + + G +G +G H A AL +
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGL 172

Query: 202 GVALVAILLAIPRPTVK--ASKGKPLPFRAVLGRVWLYGMALALA-----SAGFGVIATF 254
LL + + P + + +A +A V A
Sbjct: 173 NFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAAL 232

Query: 255 ITLFYDAK-GWDGAAFALTLFSCAFVGT---RLLFPNGINRIGGLNVAMICFSVEIIGLL 310
+F + + WD ++L + + + ++ R+G M+ + G +
Sbjct: 233 WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYI 292

Query: 311 LVGVATMPWMAKIG-VLLAGAGFSLVFPALGVVAVKAVPQQNQGAALATYTVFMDLSLGV 369
L+ AT WMA VLLA G + PAL + + V ++ QG + L+ +
Sbjct: 293 LLAFATRGWMAFPIMVLLASGGIGM--PALQAMLSRQVDEERQGQLQGSLAALTSLT-SI 349

Query: 370 TGPLAGLVMSWAGVPV----IYLAAAGLVAIALLLTWR 403
GPL + A + ++A A L + L R
Sbjct: 350 VGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4270  BORPETOXINB  28  0.046  Bordetella pertussis toxin B subunit signature.
		>BORPETOXINB#Bordetella pertussis toxin B subunit signature.

Length = 226

Score = 28.1 bits (62), Expect = 0.046
Identities = 21/77 (27%), Positives = 32/77 (41%), Gaps = 10/77 (12%)

Query: 204 GQRHVTWARLRGLSDKQTERRHILRNASLPMITAVGMHIGELIGGTMIIENIFAWPGVG- 262
R +T A LRG D Q RH+ R S+ + G ++G GG +I++ PG
Sbjct: 53 KTRALTVAELRGSGDLQEYLRHVTRGWSIFALYD-GTYLGGEYGG--VIKD--GTPGGAF 107

Query: 263 ----RYAVSAIFNRDYP 275
+ + N P
Sbjct: 108 DLKTTFCIMTTRNTGQP 124


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4273  HTHFIS  30  0.008  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.2 bits (68), Expect = 0.008
Identities = 10/34 (29%), Positives = 19/34 (55%)

Query: 25 QAVLNNVSLALKSGETVALLGRSGCGKSTLARLL 58
Q + ++ +++ T+ + G SG GK +AR L
Sbjct: 147 QEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180


58  c4295  c4306  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias     %GCBias  Product
c4295         -3       21    3.097676  Hypothetical transporter yhiP
c4296         -2       23    3.566776  Hypothetical protein yhiQ
c4297         -2       23    2.556122  Oligopeptidase A
c4298          0       19    0.888389  Hypothetical protein yhiR
c4299          1       17   -1.632724  Glutathione reductase
c4300          3       21   -6.213991  Hypothetical protein
c4301          3       20   -7.910390  Arsenate reductase
c4302          1       20   -6.570095  Conserved hypothetical protein
c4303         -1       13   -4.093477  Putative conserved protein
c4304         -2       14   -4.023786  Outer membrane protein slp precursor
c4305         -2       15   -4.046376  Hypothetical protein
c4306         -3       14   -3.605891  Hypothetical transcriptional regulator yhiF
59  c4371  c4386  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
c4371         -2       24   -3.135376  Hypothetical protein
c4372         -2       20   -2.923043  2-ketogluconate reductase
c4373          2       20   -0.163536  Hypothetical protein yiaF
c4374          2       26    1.224502  Hypothetical protein
c4375          1       26    1.322846  Hypothetical protein yiaG
c4376         -1       23    0.486091  Hypothetical protein
c4377         -1       22    0.213133  Cold shock protein cspA
c4378         -2       18   -0.399037  Glycyl-tRNA synthetase beta chain
c4379          0       15   -1.388651  Glycyl-tRNA synthetase alpha chain
c4380          0       17   -2.457680  Conserved hypothetical protein
c4381          0       16   -3.465385  Hypothetical protein yiaH
c4382         -2       16   -3.400145  Hypothetical protein yiaA
c4383         -2       15   -2.009636  Hypothetical protein yiaB
c4384         -2       15   -1.773266  Xylulose kinase
c4385         -2       14   -2.894927  Xylose isomerase
c4386         -2       18   -3.244201  D-xylose-binding periplasmic protein precursor
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4380  VACJLIPOPROT  26  0.003  VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 26.4 bits (58), Expect = 0.003
Identities = 10/31 (32%), Positives = 16/31 (51%), Gaps = 3/31 (9%)

Query: 9 AMALIVLVGCSTPPPVQKAQRVKGDPLRSLN 39
A+ +LVGC++ Q+ + DPL N
Sbjct: 9 ALGTTLLVGCASSGTDQQGRS---DPLEGFN 36


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4382  FLGBIOSNFLIP  27  0.027  Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 27.1 bits (60), Expect = 0.027
Identities = 19/66 (28%), Positives = 26/66 (39%), Gaps = 1/66 (1%)

Query: 78 MTCLTVFIISVALLLVGLWNATLLLSEKGFYGLAFFLSLFGAVAVQKNIRDAGINPPKET 137
MT T II LL L + + GLA FL+ F V I P E
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAP-PNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEE 119

Query: 138 QVTQEE 143
+++ +E
Sbjct: 120 KISMQE 125


60  c4396  c4453  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
c4396          2       19   -2.187899  Hypothetical oxidoreductase yiaK
c4397          1       16   -2.520193  Hypothetical protein yiaL
c4398         -2       18   -0.042424  Hypothetical protein
c4399         -1       20    1.362545  Hypothetical protein yiaM
c4400         -2       18    2.415098  Hypothetical protein yiaN
c4401         -2       20    3.940567  Hypothetical protein
c4402         -3       19    4.574819  Putative ABC transporter Periplasmic binding
c4403         -3       19    4.978849  Cryptic L-xylulose kinase
c4404         -2       16    3.614730  Probable hexulose-6-phosphate synthase
c4405         -2       12    2.210100  Putative hexulose-6-phosphate isomerase
c4406         -1       11    2.825596  Probable sugar isomerase sgbE
c4407         -1       11    2.858741  Hypothetical protein
c4408          0       12    2.429347  Aldehyde dehydrogenase B
c4409         -1       12    2.197677  Hypothetical protein
c4410         -1       11    2.704916  Probable alcohol dehydrogenase
c4411         -3       14    3.294545  Selenocysteine-specific elongation factor
c4412         -3       13    2.422144  L-seryl-tRNA(Sec) selenium transferase
c4413         -3       14    1.540642  Hypothetical GST-like protein yibF
c4414         -2       16    1.327465  Hypothetical protein yibH
c4415         -2       18    0.947798  Hypothetical protein yibI
c4416         -1       20    0.843955  PTS system, mannitol-specific IIABC component
c4417          0       14   -1.013936  Mannitol-1-phosphate 5-dehydrogenase
c4418          1       14   -5.472809  Mannitol operon repressor
c4419          4       23   -0.782210  Hypothetical protein
c4420          4       21    0.196997  Hypothetical protein
c4421          3       21    0.434872  Hypothetical protein yibL
c4422          2       18    0.890768  Hypothetical protein
c4423          2       18    1.265214  Conserved hypothetical protein
c4424          2       18    2.152591  Putative adhesin
c4425         -2       14    3.531028  L-lactate permease
c4426          0       13    3.031412  Putative L-lactate dehydrogenase operon
c4427         -2       15    2.714424  L-lactate dehydrogenase
c4428         -1       14    2.017267  Hypothetical tRNA/rRNA methyltransferase yibK
c4429         -1       15    1.399012  Serine acetyltransferase
c4430          0       18    1.398355  Glycerol-3-phosphate dehydrogenase (NAD(P)+)
c4431          2       27   -1.224563  Hypothetical protein
c4432          4       20   -2.066158  Protein-export protein secB
c4433          1       24   -0.263694  Glutaredoxin 3
c4434          3       19    1.735552  Hypothetical protein
c4435          2       18    1.719428  Hypothetical protein
c4436          0       14    0.240695  Hypothetical protein yibN
c4437         -1       11    0.463201  Hypothetical protein
c4438         -1       12    0.748101  2,3-bisphosphoglycerate-independent
c4439         -1       11    0.970418  Hypothetical protein yibP
c4440          1       11   -0.279698  Hypothetical protein yibQ precursor
c4441          2       13    0.080059  Putative glycosyl transferase yibD
c4442          0       13    1.091220  Hypothetical protein
c4443         -2       14   -2.460412  Threonine 3-dehydrogenase
c4444          0       19   -5.516231  2-amino-3-ketobutyrate coenzyme A ligase
c4445          1       26   -8.581971  ADP-L-glycero-D-manno-heptose-6-epimerase
c4446          2       31  -10.556988  ADP-heptose--LPS heptosyltransferase II
c4447          2       39  -13.777063  Lipopolysaccharide heptosyltransferase-1
c4448          3       41  -15.979309  Lipid A-core, surface polymer ligase
c4449          2       34  -13.856983  Putative beta1,3-glucosyltransferase
c4450          2       28  -10.891033  UDP-galactose:(galactosyl) LPS
c4451          2       26   -9.138344  Lipopolysaccharide core biosynthesis protein
c4452          2       22   -6.100526  Lipopolysaccharide 1,2-glucosyltransferase
c4453          2       18   -3.747305  Lipopolysaccharide 1,3-galactosyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4411  TCRTETOQM  58  5e-11  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 58.3 bits (141), Expect = 5e-11
Identities = 44/147 (29%), Positives = 69/147 (46%), Gaps = 18/147 (12%)

Query: 3 IATAGHVDHGKTTLLQAI---TGV------------NADRLPEEKKRGMTIDLGYAYWPQ 47
I HVD GKTTL +++ +G D E++RG+TI G +
Sbjct: 6 IGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQW 65

Query: 48 PDGRVPGFIDVPGHEKFLSNMLAGVGGIDHALLVVACDDGVMAQTREHLAILQLTGNPML 107
+ +V ID PGH FL+ + + +D A+L+++ DGV AQTR L+ G P +
Sbjct: 66 ENTKV-NIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTI 124

Query: 108 TVALTKADRVDEARVDEVERQVKEVLR 134
+ K D+ + V + +KE L
Sbjct: 125 -FFINKIDQNG-IDLSTVYQDIKEKLS 149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4414  RTXTOXIND  64  2e-13  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 64.5 bits (157), Expect = 2e-13
Identities = 56/314 (17%), Positives = 103/314 (32%), Gaps = 82/314 (26%)

Query: 66 ITPQVTGIVTEVTDKNNQLIQKGEVLFKLDPVR------------YQARVD--RLQA--- 108
I P IV E+ K + ++KG+VL KL + QAR++ R Q
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSR 158

Query: 109 ------------------------DLMTATHNIK----TLRAQLTEAQANTTQVSAERDR 140
+++ T IK T + Q + + N + AER
Sbjct: 159 SIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLT 218

Query: 141 LFKNYQRY----------LKGSQAAVNPFS---------ERDIDDARQNF---LAQDALV 178
+ RY L + ++ + E +A +Q +
Sbjct: 219 VLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQI 278

Query: 179 KGSVAE----QAQIQSQLDSMVNGE----QSQIVSLRAQLTEAKYNLEQTVIRAPSNGYV 230
+ + + + + + I L +L + + + +VIRAP + V
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKV 338

Query: 231 TQVLIR-PGTYAAALPLRPVMVFIPEQKRQIV-AQFRQNSLLRLKPGDDAEVVFNALPGQ 288
Q+ + G +MV +PE V A + + + G +A + A P
Sbjct: 339 QQLKVHTEGGVVT--TAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYT 396

Query: 289 VFH---GKLTSILP 299
+ GK+ +I
Sbjct: 397 RYGYLVGKVKNINL 410


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4424  PF03895  63  3e-14  Serum resistance protein DsrA.
		>PF03895#Serum resistance protein DsrA.

Length = 79

Score = 63.3 bits (154), Expect = 3e-14
Identities = 19/79 (24%), Positives = 36/79 (45%), Gaps = 2/79 (2%)

Query: 1701 ESKLSGGIASAMAMTGLPQAYTPGASMASIGGGTYNGESAVALGV-SMVSANGRWVYKLQ 1759
+L G+A+ A++ L Q G + S G Y ++A+A+GV S ++ +
Sbjct: 2 SKELQTGLANQSALSMLVQPNGVGKTSVSAAVGGYRDKTALAIGVGSRITDRFTAKAGVA 61

Query: 1760 GSTNSQGEYSAALGAGIQW 1778
+T + G S G ++
Sbjct: 62 FNTYN-GGMSYGASVGYEF 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4430  NUCEPIMERASE  29  0.020  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 29.4 bits (66), Expect = 0.020
Identities = 20/87 (22%), Positives = 30/87 (34%), Gaps = 13/87 (14%)

Query: 8 MTVI---GAGSYGTALAITLARNGHEVVLWGHD---PEHIATLERDRCNAAFLPDVPFPD 61
M + AG G ++ L GH+VV G D + +L++ R P F
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVV--GIDNLNDYYDVSLKQARLELLAQPGFQF-- 56

Query: 62 TLHLESDLATALAASRNILVVVPSHVF 88
+ DLA + VF
Sbjct: 57 ---HKIDLADREGMTDLFASGHFERVF 80


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4432  SECBCHAPRONE  240  2e-84  Bacterial protein-transport SecB chaperone protein ...
		>SECBCHAPRONE#Bacterial protein-transport SecB chaperone protein signature.

Length = 170

Score = 240 bits (613), Expect = 2e-84
Identities = 87/153 (56%), Positives = 117/153 (76%), Gaps = 4/153 (2%)

Query: 23 EQNNTEMTFQIQRIYTKDISFEAPNAPHVFQKDWQPEVKLDLDTASSQLADDVYEVVLRV 82
Q + QIQRIY KD+SFEAPN PH+FQ+DW+P++ DL T + Q+ DD+YEV L +
Sbjct: 12 TQATQQPVLQIQRIYVKDVSFEAPNLPHIFQQDWEPKLSFDLSTEAKQVGDDLYEVCLNI 71

Query: 83 TVTASLGEE--TAFLCEVQQGGIFSIAGIEGTQMAHCLGAYCPNILFPYARECITSMVSR 140
+V ++ AF+CEV+Q G+F+I+G+E QMAHCL + CPN+LFPYARE ++S+V+R
Sbjct: 72 SVETTMESSGDVAFICEVKQAGVFTISGLEEMQMAHCLTSQCPNMLFPYARELVSSLVNR 131

Query: 141 GTFPQLNLAPVNFDALFMNYL--QQQAGEGTEE 171
GTFP LNL+PVNFDALFM+YL Q+QA + TEE
Sbjct: 132 GTFPALNLSPVNFDALFMDYLQRQEQAEQTTEE 164


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4439  CHANLCOLICIN  36  2e-04  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 36.2 bits (83), Expect = 2e-04
Identities = 51/223 (22%), Positives = 74/223 (33%), Gaps = 22/223 (9%)

Query: 60 RAVRQKQQQRASLLAQLKKQEEAISEATRKLRETQNTLNQLNKQIDEMNASIAKLEQQKA 119
R + +++ R A K +EA RE T QL E A E+ KA
Sbjct: 131 RLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETERQLKLAEAEEKRLAALSEEAKA 190

Query: 120 ---AQERSLAAQLDAAFRQGEHTGIQLILSGEESQRGQRLQAYFGYLNQARQETIAQLKQ 176
AQ++ AAQ + GE + LS AR + L
Sbjct: 191 VEIAQKKLSAAQSEVVKMDGEIKTLNSRLSS---------------SIHARDAEMKTLAG 235

Query: 177 TREEVAMQRAELEEKQSEQQTLLYEQRAQQAKLTQALSERKKTLAGLESSIQQGQQQLSE 236
R E+A A+ +E + L + R++ AG +Q Q SE
Sbjct: 236 KRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASE 295

Query: 237 LRANESRLRNSIARAEAAAKARAEREAREAQAVRDRQKEATRK 279
R N R+ I + + A R A R + E K
Sbjct: 296 TRIN--RINADITQIQKAIS--QVSNNRNAGIARVHEAEENLK 334


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4445  NUCEPIMERASE  104  7e-28  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 104 bits (260), Expect = 7e-28
Identities = 77/348 (22%), Positives = 127/348 (36%), Gaps = 67/348 (19%)

Query: 2 IIVTGGAGFIGSNIVKALNDKGITDILVVDNLKD--------------GTKFVNLVDLDI 47
+VTG AGFIG ++ K L + G ++ +DNL D +D+
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 48 ADYMDKEDFLIQIMAGEEFGDVEAIFHEGACSSTTEWDGKYMMDNNYQYSK-------EL 100
AD + + + A F E +F + +Y ++N + Y+ +
Sbjct: 62 ADR----EGMTDLFASGHF---ERVFISPHRLAV-----RYSLENPHAYADSNLTGFLNI 109

Query: 101 LHYCLEREIP-FLYASSAATYGGRTSD-FIESREYEKPLNVYGYSKFLFDEYVRQILPEA 158
L C +I LYASS++ YG F + P+++Y +K +
Sbjct: 110 LEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLY 169

Query: 159 NSQIVGFRYFNVYGPREGHKGSMASVAFHLNTQLNNGESPKLFEGSENFKRDFVYVGDVA 218
G R+F VYGP + MA F + G+S ++ KRDF Y+ D+A
Sbjct: 170 GLPATGLRFFTVYGPWG--RPDMA--LFKFTKAMLEGKSIDVY-NYGKMKRDFTYIDDIA 224

Query: 219 DVNL------------WFLENGVSG-------IFNLGTGRAESFQAVADATLAY-HKKGQ 258
+ + W +E G ++N+G A + +
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAK 284

Query: 259 IEYIPFPDKLKGRYQAFTQADLTNLRAA-GYDKPFKTVAEGVTEYMAW 305
+P G T AD L G+ P TV +GV ++ W
Sbjct: 285 KNMLPLQ---PGDVL-ETSADTKALYEVIGF-TPETTVKDGVKNFVNW 327


61  c4474  c4519  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c4474-2163.619686DNA-directed RNA polymerase omega chain
c4475-2143.468183Guanosine-3',5'-bis(Diphosphate)
c4476-1153.327850tRNA (Guanosine-2'-O-)-methyltransferase
c4477-1143.032867ATP-dependent DNA helicase recG
c4478-1131.994044Sodium/glutamate symport carrier protein
c4479-2122.131398Putative purine permease yicE
c4480-2131.623397Hypothetical protein yicH
c4481-2130.546430Conserved hypothetical protein
c4482-115-0.039079Hypothetical protein yajF
c4483019-3.450308Putative aldolase
c4484-215-2.712357Putative aldolase
c4485-113-3.364881Putative PTS enzyme-ii fructose
c4486-214-4.410891PTS system, fructose-like-2 IIB component 1
c4487-214-4.255367Putative phosphotransferase system (PTS),
c4488-214-4.279416Putative transcriptional Antiterminator
c4489-214-3.562564Putative family 31 glucosidase yicI
c4490127-6.220253Hypothetical symporter yicJ
c4491335-7.620691*Putative prophage integrase
c4492337-8.068288ShiA homolog
c4493239-7.893612Hypothetical protein
c4494238-7.491158Putative transcriptional regulator
c4495331-6.200797Hexuronate transporter
c4496218-3.406195Putative glucosidase
c4497215-2.760124Putative glucosidase
c4498116-1.242167IS1 protein InsB
c4499115-1.437148Hypothetical protein
c4500215-1.333542Putative amino acid antiporter
c4501115-1.746805Putative conserved protein
c4502119-2.535750Putative antiporter
c4503225-2.147664Insertion element IS1 1/2/3/5/6 protein insA
c4504426-2.977082Insertion element IS1 1/5/6 protein insB
c4505426-4.162670Hypothetical protein
c4506429-5.131855Transposase insI for insertion sequence element
c4507532-5.913904Transposase insF for insertion sequence
c4508637-6.562475Conserved hypothetical protein
c4509738-7.175367Hypothetical protein
c4510639-7.082121Hypothetical protein
c4511641-6.360058Hypothetical protein
c4512638-6.011371Hypothetical protein
c4513534-4.987136Hypothetical protein
c4514629-4.801935Hypothetical protein
c4517424-0.266856Hypothetical protein
c45184230.184682Conserved hypothetical protein
c45192160.880520Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4477  SECA  40  3e-05  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 40.2 bits (94), Expect = 3e-05
Identities = 38/129 (29%), Positives = 56/129 (43%), Gaps = 18/129 (13%)

Query: 244 NLSMLALRAGAQRFHAQPLSANDALKNKLLAALPFKPTGAQARVVAEIERDM-ALDVPMM 302
LS L+ F A+ L + L+N + A A R ++ M DV ++
Sbjct: 37 KLSDEELKGKTAEFRAR-LEKGEVLENLIPEAF------AVVREASKRVFGMRHFDVQLL 89

Query: 303 ---RLVQGDV-----GSGKTLVAALAA-LRAIAHGKQVALMAPTELLAEQHANNFRNWFA 353
L + + G GKTL A L A L A+ GK V ++ + LA++ A N R F
Sbjct: 90 GGMVLNERCIAEMRTGEGKTLTATLPAYLNALT-GKGVHVVTVNDYLAQRDAENNRPLFE 148

Query: 354 PLGIEVGWL 362
LG+ VG
Sbjct: 149 FLGLTVGIN 157


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4488  PF08280  33  0.004  M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 32.9 bits (75), Expect = 0.004
Identities = 23/164 (14%), Positives = 58/164 (35%), Gaps = 20/164 (12%)

Query: 22 RQNRLLRFLLPRREYTTIVTIAGYLNVSEKTIQRDLRLLEQWL-GQWRINVEKRAGAGVM 80
+ +L+ + I +A ++ + L + + ++KR M
Sbjct: 45 SKCQLVVLFF-KTSSLPITEVAEKTGLTFLQLNHYCEELNAFFPDSLSMTIQKR-----M 98

Query: 81 LSAENIADLLHLDHLLVAECEEIDCVMNNARRVKIASQLLSETPNETSISKLSERYFISG 140
+ H ++ + + ++ +++ + L+ + ++ + +F+S
Sbjct: 99 I-------SCQFTHP--SKETYLYQLYASSNVLQLLAFLIKNGSHSRPLTDFARSHFLSN 149

Query: 141 ASIVNDLRVIESWLAPLGLSLIRSPSGTHIEGSEGQVRQAMALL 184
+S + L L L S I G E ++R +ALL
Sbjct: 150 SSAYRMREALIPLLRNFELKL----SKNKIVGEEYRIRYLIALL 189


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4495  TCRTETB  46  3e-07  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 45.6 bits (108), Expect = 3e-07
Identities = 37/176 (21%), Positives = 68/176 (38%), Gaps = 5/176 (2%)

Query: 39 QLRWWMLILFLMGVTVNYITRNSLGILAPELKDSLGITTEQYSWIVGGFQLAYTLFQPLC 98
Q+ W+ IL V + L + P++ + +W+ F L +++ +
Sbjct: 14 QILIWLCILSFFSV----LNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVY 69

Query: 99 GWLIDVIGLKIGFMICASLWGIACLLHAGAGSWIQLALLRFFMGGAEASATPA-NAKIIG 157
G L D +G+K + + ++ S+ L ++ F+ GA A+A PA ++
Sbjct: 70 GKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVA 129

Query: 158 EWFPKSERPVAAGWAGVGFSIGAMLAPPIIYFAHASFGWQGAFMFTGALAILWVFL 213
+ PK R A G G ++G + P I W + I FL
Sbjct: 130 RYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFL 185


62  c4539  c4581  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c45392302.993285Hypothetical protein
c4540223-1.826361Putative maturase-related protein
c4541016-5.347343Putative maturase-related protein
c4542-215-4.522762Hypothetical protein
c4543-214-4.172063Hypothetical protein
c4544-214-4.956913Hypothetical protein
c4545-116-5.690715Hypothetical protein
c4546-117-4.721984Hypothetical protein
c4547022-3.212550S-adenosylmethionine synthetase
c4548230-4.854654Hypothetical protein
c4549432-5.379782Hypothetical protein
c4550530-3.829536Hypothetical protein
c4551727-1.651118Hypothetical protein
c45529261.380735Transposase insC for insertion element
c45537250.836370Hypothetical protein
c45546240.772239Hypothetical protein
c4555526-0.302197Conserved hypothetical protein
c4556527-6.376025Conserved hypothetical protein
c4557529-7.573547Hypothetical protein
c4558637-11.001710Hypothetical protein
c4559633-9.703002Hypothetical protein
c4560534-10.980024Hypothetical protein
c4561432-8.924196Hypothetical protein
c4562428-4.251070Hypothetical protein
c4563428-2.687121Hypothetical protein
c4564428-1.615053Conserved hypothetical protein
c4565324-0.443909Hypothetical protein
c45665241.083438Hypothetical transcriptional regulator yfjR
c45675262.006730Hypothetical protein
c45686272.840205Hypothetical protein
c45696284.204569Hypothetical protein ykfF
c45707284.494735Hypothetical protein yafZ
c45717264.097633Hypothetical protein yfjX
c45726253.711648Hypothetical protein
c45736251.842455Putative radC-like protein yeeS
c45746230.334105Hypothetical protein yeeT
c4575524-1.068138Hypothetical protein yeeU
c4576321-5.282732Hypothetical protein yeeV
c4577019-5.681580Conserved hypothetical protein
c4578-121-6.074397Conserved hypothetical protein
c4579-124-7.723730Hypothetical protein
c4580025-7.903201Hypothetical protein
c4581020-4.803754Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4545  PF06580  203  4e-64  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 203 bits (519), Expect = 4e-64
Identities = 56/214 (26%), Positives = 101/214 (47%), Gaps = 14/214 (6%)

Query: 203 RKRVEIERSLHEAEFKALSYQINPHFLFNVLNTIGRLAFLEDAQRTETMVHDFSDMMRYL 262
+ ++ EA+ AL QINPHF+FN LN I L LED + M+ S++MRY
Sbjct: 149 IDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALI-LEDPTKAREMLTSLSELMRYS 207

Query: 263 LRKNSHGLITLRNEINYVNNYMSIQKVRMRDRFDYLCDIPEKYLDVVCPFLILQPLVENF 322
LR ++ ++L +E+ V++Y+ + ++ DR + I +DV P +++Q LVEN
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENG 267

Query: 323 FNYVVEPRDSNSHLLIRATDDGLNVIIEVTDNGDGIAPDTINRILSGDQKLQKGSIGINN 382
+ + +L++ T D V +EV + G +T + G+ N
Sbjct: 268 IKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTK----------ESTGTGLQN 317

Query: 383 IKNRLKLLFGESYGLEIMSPNKPRMGTTIKLRFP 416
++ RL++L+G +++ + P
Sbjct: 318 VRERLQMLYGTEAQIKLSEKQG---KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4546  HTHFIS  59  6e-12  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 59.5 bits (144), Expect = 6e-12
Identities = 29/139 (20%), Positives = 53/139 (38%), Gaps = 3/139 (2%)

Query: 3 TIVIVEDEPIELESLRQIISQCVENAAIHEASTGKKAIHLIDQLSQIDMILVDINIPLPN 62
TI++ +D+ L Q +S+ + S I D+++ D+ +P N
Sbjct: 5 TILVADDDAAIRTVLNQALSR--AGYDVRITSNAATLWRWIAA-GDGDLVVTDVVMPDEN 61

Query: 63 GKQVIEYLKKKNSDTKIIVITANDDFDIVRSMYNLKVDDYLLKPVKKCILTDTIKKTLAF 122
++ +KK D ++V++A + F DYL KP L I + LA
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 123 DEGENEKSRALKQKVFAMI 141
+ K Q ++
Sbjct: 122 PKRRPSKLEDDSQDGMPLV 140


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4550  HTHTETR  72  4e-18  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 72.4 bits (177), Expect = 4e-18
Identities = 33/63 (52%), Positives = 41/63 (65%)

Query: 10 RHTKFAAEETRKQILDVAEFCFCETGFSKTTLEMIAARAGCTRGAIYWYFNEKKDLLRQV 69
R TK A+ETR+ ILDVA F + G S T+L IA AG TRGAIYW+F +K DL ++
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 70 IER 72
E
Sbjct: 63 WEL 65


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4551  ACRIFLAVINRP  411  e-137  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 411 bits (1057), Expect = e-137
Identities = 202/339 (59%), Positives = 270/339 (79%), Gaps = 1/339 (0%)

Query: 1 MIQARNQLLAEAAKSPA-LNMVRPNGMNDEPQFQILIDDEKVQAFKLSMSDVDNIMSAAW 59
+ QARNQLL AA+ PA L VRPNG+ D QF++ +D EK QA +S+SD++ +S A
Sbjct: 694 LTQARNQLLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTAL 753

Query: 60 GSMYVNDFNDRGRVKKVYIQGEPGSRISPQDFDKWYVRNSDGDMVSFASFATGKWIYGSP 119
G YVNDF DRGRVKK+Y+Q + R+ P+D DK YVR+++G+MV F++F T W+YGSP
Sbjct: 754 GGTYVNDFIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSP 813

Query: 120 KLEQYNGISAVEILGEPAPGYSSGDAMKAIEDIAARLPEGFHISWTGLSFEERLSGSQAP 179
+LE+YNG+ ++EI GE APG SSGDAM +E++A++LP G WTG+S++ERLSG+QAP
Sbjct: 814 RLERYNGLPSMEIQGEAAPGTSSGDAMALMENLASKLPAGIGYDWTGMSYQERLSGNQAP 873

Query: 180 ALYALSLLIVFLCLAALYESWSIPFSVMLVVPLGVLGAVCATLLRGLGNDVFFQVGLLTT 239
AL A+S ++VFLCLAALYESWSIP SVMLVVPLG++G + A L NDV+F VGLLTT
Sbjct: 874 ALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTT 933

Query: 240 IGLSAKNAILIVEFARELHEKEGLSIKEAAVEAARVRLRPIIMTSLAFVMGVIPLAVSTG 299
IGLSAKNAILIVEFA++L EKEG + EA + A R+RLRPI+MTSLAF++GV+PLA+S G
Sbjct: 934 IGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNG 993

Query: 300 ASSGSKHAIGTGVVGGMITATILAIFYIPLFYMLIAGFF 338
A SG+++A+G GV+GGM++AT+LAIF++P+F+++I F
Sbjct: 994 AGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRCF 1032



Score = 77.2 bits (190), Expect = 1e-17
Identities = 64/322 (19%), Positives = 125/322 (38%), Gaps = 21/322 (6%)

Query: 29 EPQFQILIDDEKVQAFKLSMSDVDNIMSAA----WGSMYVNDFNDRGRVKKVYIQGEPGS 84
+ +I +D + + +KL+ DV N + G+ I +
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQ-TR 239

Query: 85 RISPQDFDKWYVR-NSDGDMVSFASFAT---GKWIYGSPKLEQYNGISAVEILGEPAPGY 140
+P++F K +R NSDG +V A G Y + + NG A + + A G
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNV--IARINGKPAAGLGIKLATGA 297

Query: 141 SSGDAMKAI----EDIAARLPEGFHISW---TGLSFEERLSGSQAPALYALSLLIVFLCL 193
++ D KAI ++ P+G + + T + + A+ ++VFL +
Sbjct: 298 NALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAI--MLVFLVM 355

Query: 194 AALYESWSIPFSVMLVVPLGVLGAVCATLLRGLGNDVFFQVGLLTTIGLSAKNAILIVEF 253
++ + VP+ +LG G + G++ IGL +AI++VE
Sbjct: 356 YLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVEN 415

Query: 254 ARELHEKEGLSIKEAAVEAARVRLRPIIMTSLAFVMGVIPLAVSTGASSGSKHAIGTGVV 313
+ ++ L KEA ++ ++ ++ IP+A G++ +V
Sbjct: 416 VERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIV 475

Query: 314 GGMITATILAIFYIP-LFYMLI 334
M + ++A+ P L L+
Sbjct: 476 SAMALSVLVALILTPALCATLL 497


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4552  PF06704  25  0.047  DspF/AvrF protein
		>PF06704#DspF/AvrF protein

Length = 129

Score = 25.2 bits (55), Expect = 0.047
Identities = 20/85 (23%), Positives = 39/85 (45%), Gaps = 6/85 (7%)

Query: 28 RTTQEKIAIVQQSFEPGMTVSLVARQHGVAASQLFLWRKQYQEGSLTAVAA-GEQVVPAS 86
+ + + +S + SL A Q+GV A L+ Q E ++ + E V+
Sbjct: 2 NNSPTDFSRLIKSLGAQLGTSLTA-QNGVCA----LYDSQDNEAAVIEMPDHSEMVIFHC 56

Query: 87 ELAAAMKQIKELQRLLNKTPDVSRL 111
+ + + +LQ+LL+ DV+R+
Sbjct: 57 RVGRSPDRAADLQKLLSLNFDVARM 81


63  c4637  c4643  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
c4637-113-3.041200Hypothetical protein yieG
c4638012-4.218529Hypothetical protein yieH
c4639014-3.860805Hypothetical protein yieI
c4640113-3.643595Hypothetical protein yieK
c4641014-4.311771Hypothetical protein yieL
c4642016-4.978268Putative outer membrane protein yieC precursor
c4643-114-3.5558936-phospho-beta-glucosidase bglB
64  c4653  c4662  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c46532312.014411Phosphate-binding periplasmic protein precursor
c46542282.287361Glucosamine--fructose-6-phosphate
c46554352.280625GlmU protein
c46565342.267614Hypothetical protein
c46575342.184446ATP synthase epsilon chain
c46585322.090061ATP synthase beta chain
c46594281.725091ATP synthase gamma chain
c46605311.694922ATP synthase alpha chain
c46613301.078418Hypothetical protein
c46623230.679018ATP synthase delta chain
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4655  RTXTOXINA  29  0.049  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.2 bits (65), Expect = 0.049
Identities = 23/80 (28%), Positives = 31/80 (38%), Gaps = 10/80 (12%)

Query: 367 LGDAEIGDNVNIGAGTITCNYDGANKFKTIIGDDVFVGSDTQLVAPVTVGKGATIAAGTT 426
LGD + D V + AG+ N G DV T G AT A T
Sbjct: 616 LGDGD--DKVFLSAGSA--NIYAGK------GHDVVYYDKTDTGYLTIDGTKATEAGNYT 665

Query: 427 VTRNVGENALAISRVPQTQK 446
VTR +G + + V + Q+
Sbjct: 666 VTRVLGGDVKVLQEVVKEQE 685


65  c4689  c4693  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
c4689-1213.162283Hypothetical protein
c5500-1224.424569IlvGMEDA operon leader peptide
c4690-1214.397905Acetohydroxy acid synthase II
c4691-1253.893529Acetolactate synthase isozyme II small subunit
c4692-1263.935808Branched-chain amino acid aminotransferase
c46930223.705280Dihydroxy-acid dehydratase
66  c4729  c4738  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
c4729-2214.669507Conserved hypothetical protein
c4730-2183.571477Diaminopimelate epimerase
c4731-3172.550450Hypothetical protein yigA
c4732-2172.238573Integrase/recombinase xerC
c4733-2140.556618Hypothetical protein yigB
c4734-110-2.582585DNA helicase II
c4735-213-6.088979Conserved hypothetical protein
c4736-113-5.821872Conserved hypothetical protein
c4737-111-4.853563Magnesium and cobalt transport protein corA
c4738016-6.447224Conserved hypothetical protein
67  c4790  c4822  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c4790-1173.0570993-octaprenyl-4-hydroxybenzoate carboxy-lyase
c4791-2182.937199NAD(P)H-flavin reductase
c4792-2182.8924483-ketoacyl-CoA thiolase
c4793-2171.776838Fatty oxidation complex alpha subunit
c4794-2130.952783Xaa-Pro dipeptidase
c4795-1130.162602Hypothetical protein yigZ
c4796-113-1.047587Trk system potassium uptake protein trkH
c4797-214-2.438189Protoporphyrinogen oxidase
c4800-119-3.661378**Molybdopterin-guanine dinucleotide biosynthesis
c4801-221-6.020307Molybdopterin-guanine dinucleotide biosynthesis
c4802-221-6.728673Protein yihD
c4803-222-7.305544Hypothetical protein yihE
c4804021-8.392271Thiol:disulfide interchange protein dsbA
c4805020-7.477315Hypothetical protein
c4806020-6.972265Hypothetical protein yihF
c4807018-5.499464Hypothetical protein yihG
c4808320-3.314949Hypothetical protein
c48113180.182387Hypothetical protein
c48122171.907018Probable GTP-binding protein engB
c48130152.272481Hypothetical protein
c4814-1172.356950Hypothetical protein
c48152242.575242Hypothetical protein yihI
c48161222.186875Oxygen-independent coproporphyrinogen III
c48171190.881024Nitrogen regulation protein NR(I)
c4818019-1.057251Nitrogen regulation protein NR(II)
c4819118-1.839876Glutamine synthetase
c4820014-2.531464GTP-binding protein typA/BipA
c4821-121-4.115924Hypothetical transcriptional regulator yihL
c4822-121-3.505586Hypothetical protein yihM
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4815  SECA  31  0.002  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 31.0 bits (70), Expect = 0.002
Identities = 11/71 (15%), Positives = 30/71 (42%)

Query: 14 AKARRKTREELDQEARDRKRQKKRRGHAPGSRAAGGNTTSGSKGQNAPKDPRIGSKTPIP 73
+K + + EE+++ + R+ + +R ++ + + + ++G P P
Sbjct: 827 SKVQVRMPEEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCP 886

Query: 74 LGVAEKVTKQH 84
G +K + H
Sbjct: 887 CGSGKKYKQCH 897


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4817  HTHFIS  601  0.0  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 601 bits (1550), Expect = 0.0
Identities = 206/478 (43%), Positives = 299/478 (62%), Gaps = 11/478 (2%)

Query: 21 MQRGIVWVVDDDSSIRWVLERALAGAGLTCTTFENGAEVLEALASKTPDVLLSDIRMPGM 80
M + V DDD++IR VL +AL+ AG N A + +A+ D++++D+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 81 DGLALLKQIKQRHPMLPVIIMTAHSDLDAAVSAYQQGAFDYLPKPFDIDEAVALVERAIS 140
+ LL +IK+ P LPV++M+A + A+ A ++GA+DYLPKPFD+ E + ++ RA++
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 141 HYQEQQQPRNVQLNGPTTDIIGEAPAMQDVFRIIGRLSRSSISVLINGESGTGKELVAHA 200
+ + ++G + AMQ+++R++ RL ++ ++++I GESGTGKELVA A
Sbjct: 121 EPKRRPSKLEDDSQDGM-PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 201 LHRHSPRAKAPFIALNMAAIPKDLIESELFGHEKGAFTGANTIRQGRFEQADGGTLFLDE 260
LH + R PF+A+NMAAIP+DLIESELFGHEKGAFTGA T GRFEQA+GGTLFLDE
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 261 IGDMPLDVQTRLLRVLADGQFYRVGGYAPVKVDVRIIAATHQNLEQRVLEGKFREDLFHR 320
IGDMP+D QTRLLRVL G++ VGG P++ DVRI+AAT+++L+Q + +G FREDL++R
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 321 LNVIRVHLPPLRERREDIPRLARHFLQVAARELGVEAKLLHPETEAALTRLAWPGNVRQL 380
LNV+ + LPPLR+R EDIP L RHF+Q A +E G++ K E + WPGNVR+L
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 381 ENTCRWLTVMAAGQEVLIQDLPGELFESNVPESTSHMQPDSWATLLAQWADRALRS---- 436
EN R LT + + + + EL S + ++Q + +R
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 437 -----GHQNLLSEAQPELERTLLTTALRHTQGHKQEAARLLGWGRNTLTRKLKELGME 489
L E+E L+ AL T+G++ +AA LLG RNTL +K++ELG+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4820  TCRTETOQM  180  4e-51  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 180 bits (458), Expect = 4e-51
Identities = 97/445 (21%), Positives = 170/445 (38%), Gaps = 81/445 (18%)

Query: 4 KLRNIAIIAHVDHGKTTLVDKLLQQSGTFDSRAETQE--RVMDSNDLEKERGITILAKNT 61
K+ NI ++AHVD GKTTL + LL SG + D+ LE++RGITI T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 62 AIKWNDYRINIVDTPGHADFGGEVERVMSMVDSVLLVVDAFDGPMPQTRFVTKKAFAYGL 121
+ +W + ++NI+DTPGH DF EV R +S++D +L++ A DG QTR + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 122 KPIVVINKVDRPGARPDWVVDQVFD-------------LFVNLDATDEQLD--------- 159
I INK+D+ G V + + L+ N+ T+
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 160 --------------------------------FPIVYASALNGIAGLDHEDMAEDMTPLY 187
FP+ + SA N I G+D+ L
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNI-GIDN---------LI 231

Query: 188 QAIVDHVPAPDVDLDGPFQMQISQLDYNSYVGVIGIGRIKRGKVKPNQQVTIIDSEGKTR 247
+ I + + ++ +++Y+ + R+ G + V I + E
Sbjct: 232 EVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-- 289

Query: 248 NAKVGKVLGHLGLERIETDLAEAGDIVAITGLGELNISDTVCDTQNVEALPALSVDEPTV 307
K+ ++ + E + D A +G+IV + L ++ + DT+ + + P +
Sbjct: 290 --KITEMYTSINGELCKIDKAYSGEIVILQNEF-LKLNSVLGDTKLLPQRERIENPLPLL 346

Query: 308 SMFFCVNTSPFCGKEGKFVTSRQILDRLNKELVHNVALRVEETEDADAFRVSGRGELHLS 367
+ + D L LR +S G++ +
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPL---------LRYYVDSATHEIILSFLGKVQME 397

Query: 368 VLIENMRRE-GFELAVSRPKVIFRE 391
V ++ + E+ + P VI+ E
Sbjct: 398 VTCALLQEKYHVEIEIKEPTVIYME 422



Score = 32.5 bits (74), Expect = 0.005
Identities = 13/75 (17%), Positives = 29/75 (38%), Gaps = 1/75 (1%)

Query: 398 EPYENVTLDVEEQHQGSVMQALGERKGDLKNMNPDGKGRVRLDYVIPSRGLIGFRSEFMT 457
EPY + + +++ + ++ + V L IP+R + +RS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKN-NEVILSGEIPARCIQEYRSDLTF 595

Query: 458 MTSGTGLLYSTFSHY 472
T+G + + Y
Sbjct: 596 FTNGRSVCLTELKGY 610


68  c4883  c4895  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c48830133.1374861,4-dihydroxy-2-naphthoate
c48841143.107436ATP-dependent hsl protease ATP-binding subunit
c48852133.360356ATP-dependent protease hslV
c48861133.141827Cell division protein ftsN
c4887-1122.470816Transcriptional repressor cytR
c48880154.199380Primosomal protein N'
c4889-2121.338622Hypothetical protein
c4890-310-0.465786Hypothetical protein yiiX precursor
c4891-211-2.430699Met repressor
c4892-212-2.707109Cystathionine gamma-synthase
c4893-112-3.610509AKII-HDII protein
c4894-118-6.331542Nucleoside-specific channel-forming protein tsx
c4895-211-3.115478Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4884  HTHFIS  30  0.018  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.2 bits (68), Expect = 0.018
Identities = 11/36 (30%), Positives = 18/36 (50%), Gaps = 3/36 (8%)

Query: 49 TPKNILMIGPTGVGKTEIAR---RLAKLANAPFIKV 81
T +++ G +G GK +AR K N PF+ +
Sbjct: 159 TDLTLMITGESGTGKELVARALHDYGKRRNGPFVAI 194


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4886  IGASERPTASE  41  5e-06  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 41.2 bits (96), Expect = 5e-06
Identities = 32/155 (20%), Positives = 64/155 (41%), Gaps = 5/155 (3%)

Query: 114 LTPEQRQLLEQMQADMRQQPTQLVEVPWNEQTPEQRQQTLQRQRQAQQLAEQQRLVQQSR 173
+ +QAD+ P+ E+ ++ P + +AE + Q+S+
Sbjct: 992 VDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSK--QESK 1049

Query: 174 TTEQSWQQQT-RTSQAAPVQAQPRQSKPASTQQPYQDLLQTPAHTTAQSKPQQAAPVARV 232
T E++ Q T T+Q V + + + A+TQ + T ++ ++ A V +
Sbjct: 1050 TVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKE 1109

Query: 233 ADAPKPTAEKKDERRWMVQCGSFRGAEQAETVRAQ 267
A T + ++ + Q + EQ+ETV+ Q
Sbjct: 1110 EKAKVETEKTQEVPKVTSQVSPKQ--EQSETVQPQ 1142


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c4894  CHANNELTSX  357  e-127  Nucleoside-specific channel-forming protein Tsx signa...
		>CHANNELTSX#Nucleoside-specific channel-forming protein Tsx signature.

Length = 294

Score = 357 bits (916), Expect = e-127
Identities = 171/262 (65%), Positives = 204/262 (77%), Gaps = 6/262 (2%)

Query: 30 WLHQSLNVIGRTDSRFGPRLTNDLYPEYTVAGRKDWFDFYGYVDLPKFFGVGSHYDVGIW 89
W HQS+NV+G +RFGP++ ND Y EY +KDWFDFYGY+D P FFG G+ GIW
Sbjct: 34 WWHQSVNVVGSYHTRFGPQIRNDTYLEYEAFAKKDWFDFYGYIDAPVFFG-GNSTAKGIW 92

Query: 90 DEGSPLFTEIEPRFSIDKLTGLNLAFGPFKEWFIANNYVYDMGDNQSSRQSTWYMGLGTD 149
++GSPLF EIEPRFSIDKLT +L+FGPFKEW+ ANNY+YDMG N S QSTWYMGLGTD
Sbjct: 93 NKGSPLFMEIEPRFSIDKLTNTDLSFGPFKEWYFANNYIYDMGRNDSQEQSTWYMGLGTD 152

Query: 150 IDTGLPIKLSANIYAKYQWQNYGAANENEWDGYRFKIKYSIPLTNLFGGRLVYNSFTNFD 209
IDTGLP+ LS N+YAKYQWQNYGA+NENEWDGYRFK+KY +PLT+L+GG L Y FTNFD
Sbjct: 153 IDTGLPMSLSLNVYAKYQWQNYGASNENEWDGYRFKVKYFVPLTDLWGGSLSYIGFTNFD 212

Query: 210 FGSDLADKSHNN-----KRTSNAIASSHILSLLYEHWKFAFTLRYFHNGGQWNAGEKVNF 264
+GSDL D + + RTSN+IASSHIL+L Y HW ++ RYFHNGGQW K+NF
Sbjct: 213 WGSDLGDDNFYDLNGKHARTSNSIASSHILALNYAHWHYSIVARYFHNGGQWADDAKLNF 272

Query: 265 GDGPFELKNTGWGTYTTIGYQF 286
GDGPF +++TGWG Y +GY F
Sbjct: 273 GDGPFSVRSTGWGGYFVVGYNF 294


69  c5036  c5050  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c5036120-3.614700Succinyl-CoA synthetase beta chain
c5037222-4.519162Succinyl-CoA synthetase alpha chain
c5038224-5.505911Putative membrane-bound protein
c5039130-7.396280Putative lactate dehydrogenase
c5040-131-8.617427Putative c4-dicarboxylate transport
c5041128-8.206161Putative transport sensor protein
c5042122-4.075140Hypothetical protein
c50431181.271501Hypothetical protein
c50440182.420587Hypothetical protein
c50450172.828216Class B acid phosphatase precursor
c50460203.105973Hypothetical protein yjbQ
c5047-1140.491956Protein yjbR
c5048-1140.476697Excinuclease ABC subunit A
c5049-119-2.666184Single-strand binding protein
c5050022-4.155704Hypothetical protein yjcB
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5040  HTHFIS  443  e-155  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 443 bits (1142), Expect = e-155
Identities = 158/485 (32%), Positives = 230/485 (47%), Gaps = 48/485 (9%)

Query: 6 SEITIVYIEDSDDVRFACEQTLTLAGYRVISCCDAEHSIPLIQSQANIIILTDVRLPGIS 65
+ TI+ +D +R Q L+ AGY V +A I + +++TDV +P +
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 66 GLELLSYINEMDSKIPVILITGHGDVEMAVDAMRNGAFDFIEKPSSSDKLLSIIARAVEK 125
+LL I + +PV++++ A+ A GA+D++ KP +L+ II RA+ +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 126 RRLVLENQQLLANLQQENGPVLIGRSPQMQQLRKMILNVADTGADVLIYGETGCGKEVVA 185
+ + + ++G L+GRS MQ++ +++ + T ++I GE+G GKE+VA
Sbjct: 122 PKRRPSKLEDDS----QDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVA 177

Query: 186 RMLHHWSTRRQGQFVALNCAGLPETLFESEIFGHEAGAFTGAVKKRIGKIEHANGGTLFL 245
R LH + RR G FVA+N A +P L ESE+FGHE GAFTGA + G+ E A GGTLFL
Sbjct: 178 RALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFL 237

Query: 246 DEIEGMPSGMQVKLLRVLQERTIERLGANQLIPVNCRVIAATKEDLLRRSEEHLFRLDLY 305
DEI MP Q +LLRVLQ+ +G I + R++AAT +DL + + LFR DLY
Sbjct: 238 DEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLY 297

Query: 306 YRLNVVSLNIPPLRQRREDIPELFYWFASQAAQKYNRPLPDISPMLLAWLQSQSWPGNVR 365
YRLNVV L +PPLR R EDIP+L F QA K + L +++ WPGNVR
Sbjct: 298 YRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAE-KEGLDVKRFDQEALELMKAHPWPGNVR 356

Query: 366 ELKHNAERFVL----------------GLLTHHQPVPMTQQEESGLTAC----------- 398
EL++ R P+ L+
Sbjct: 357 ELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYF 416

Query: 399 ----------------IDAFEKKLIEDMLRQTEGQVSLTARLLQLPRKTLYDKLNKHQIQ 442
+ E LI L T G A LL L R TL K+ + +
Sbjct: 417 ASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476

Query: 443 PQVYR 447

Sbjct: 477 VYRSS 481


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5049  PERTACTIN  27  0.048  Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 27.0 bits (59), Expect = 0.048
Identities = 15/50 (30%), Positives = 19/50 (38%), Gaps = 4/50 (8%)

Query: 119 GGAPAGGNIGGGQPQGGWGQPQQPQGGNQFSG----GAQSRPQQSAPAAP 164
G APAGG + GG GG + + G + QS AP
Sbjct: 261 GDAPAGGAVPGGAVPGGAVPGGFGPLLDGWYGVDVSDSTVDLAQSIVEAP 310


70  c5059  c5069  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c5059-1213.435655Hypothetical protein
c5060-2203.132422Hypothetical protein
c5061-3212.849993Hypothetical protein
c5062-2232.938977Putative symporter yjcG
c5063-1223.274913Hypothetical protein yjcH
c5064-2233.994372Acetyl-coenzyme A synthetase
c5065-1173.955552Hypothetical protein
c5066-2174.166905Cytochrome c552 precursor
c5067-1194.612171Cytochrome c-type protein nrfB precursor
c5068-1194.362761NrfC protein
c5069-1183.476161NrfD protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5063  RTXTOXIND  27  0.019  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 26.7 bits (59), Expect = 0.019
Identities = 5/33 (15%), Positives = 13/33 (39%), Gaps = 1/33 (3%)

Query: 18 ELVEKR-QRFATILSIIMLAVYIGFILLIAFAP 49
EL+E R +++ ++ + +L
Sbjct: 47 ELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQ 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5068  VACJLIPOPROT  30  0.007  VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 29.9 bits (67), Expect = 0.007
Identities = 6/21 (28%), Positives = 12/21 (57%)

Query: 179 FGNLDDPSSEISQLLRQKPTY 199
GNL++P+ ++ L+ P
Sbjct: 75 TGNLEEPAVMVNYFLQGDPYQ 95


71  c5095  c5108  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c5095  0  21  4.076033  Protein rpiR
c5096  1  27  6.522741  Ribose 5-phosphate isomerase B
c5097  0  31  7.555914  Conserved hypothetical protein
c5098  0  34  7.562945  PhnP protein
c5099  0  35  7.841966  PhnO protein
c5101  -1  35  8.133783  PhnM protein
c5102  -1  34  8.225454  Phosphonates transport ATP-binding protein phnL
c5103  -1  35  8.637647  Phosphonates transport ATP-binding protein phnK
c5104  -1  37  8.823608  PhnJ protein
c5105  0  36  7.986349  PhnI protein
c5106  0  37  7.430888  PhnH protein
c5107  1  36  6.448303  PhnG protein
c5108  1  33  4.723620  Probable transcriptional regulator phnF
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5099  SACTRNSFRASE  33  3e-04  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.6 bits (74), Expect = 3e-04
Identities = 20/84 (23%), Positives = 32/84 (38%), Gaps = 5/84 (5%)

Query: 50 HLALLDGEVVGMIGLHLQFHLHHVNWIGEIQELVVMPQARGLNVGSKLLAWAEEEARQAG 109
L L+ +G I + + N I+++ V R VG+ LL A E A++
Sbjct: 68 FLYYLENNCIGRIKIRSNW-----NGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENH 122

Query: 110 AEMTELSTNVKRHDAHRFYLREGY 133
L T A FY + +
Sbjct: 123 FCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5102  PF05272  29  0.013  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.013
Identities = 17/70 (24%), Positives = 25/70 (35%), Gaps = 8/70 (11%)

Query: 36 CVVLHGHSGSGKSTLLRSLYANYLPDEGQIQIKHGDEWVDLVTAPARKVVEI------RK 89
VVL G G GKSTL+ +L + I G + + + E+ R+
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIV--AYELSEMTAFRR 655

Query: 90 TTVGWVSQFL 99
V F
Sbjct: 656 ADAEAVKAFF 665


72  c5129  c5217  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c5129  0  16  -3.102937  Hypothetical protein
c5130  0  17  -3.508279  Transcriptional Regulatory protein dcuR
c5131  1  18  -4.701280  Sensor protein dcuS
c5132  3  31  -8.622694  Hypothetical protein
c5133  2  21  -6.069502  Hypothetical protein yjdI
c5134  -2  16  -3.847262  Hypothetical protein yjdJ
c5135  -2  17  -3.598503  Hypothetical protein
c5136  -1  23  -3.018607  Hypothetical protein ydcX
c5137  -1  19  -3.877948  Hypothetical protein
c5138  0  19  -3.926722  Lysyl-tRNA synthetase, heat inducible
c5139  2  17  -2.743615  Hypothetical transporter yjdL
c5140  3  22  -2.830128  Lysine decarboxylase, inducible
c5141  4  20  -2.355080  Probable cadaverine/lysine antiporter
c5142  4  19  -2.407345  Transcriptional activator cadC
c5143  7  27  0.969204  Hypothetical protein
c5144  6  25  1.497464  Hypothetical protein
c5145  3  26  1.548286  Hypothetical protein
c5146  6  25  1.355660  Conserved hypothetical protein
c5147  6  27  3.422111  Conserved hypothetical protein
c5148  5  25  4.668356  Unknown protein encoded within prophage
c5149  7  27  4.972058  Hypothetical protein yeeV
c5150  8  27  4.822769  Hypothetical protein yeeU
c5151  7  27  4.714635  Hypothetical protein yeeT
c5152  7  28  4.456372  Putative radC-like protein yeeS
c5153  6  27  2.550743  Hypothetical protein
c5154  6  26  1.524138  Hypothetical protein
c5155  6  25  0.668765  Hypothetical protein yfjX
c5156  4  25  -0.729114  Hypothetical protein yafZ
c5157  4  27  -1.406067  Hypothetical protein ykfF
c5158  4  23  -3.268350  Hypothetical protein
c5159  4  25  -6.220356  Hypothetical protein
c5160  4  25  -6.569374  Hypothetical transcriptional regulator yfjR
c5161  4  25  -7.239552  Hypothetical protein
c5162  2  22  -4.721984  Putative conserved protein
c5163  2  23  -5.914446  Hypothetical protein
c5164  1  23  -5.991826  Hypothetical protein
c5165  1  18  -2.260283  Hypothetical protein
c5166  2  18  -2.223249  Partial Transposase
c5167  2  19  -1.528707  Putative Transposase for IS629
c5168  5  23  -3.594533  Unknown protein of IS629 encoded within
c5169  5  27  -6.704972  Hypothetical protein
c5170  5  26  -6.105632  Hypothetical protein
c5171  4  23  -4.955819  Hypothetical protein
c5172  4  21  -4.293850  Hypothetical protein
c5173  4  22  -3.639674  Hypothetical protein
c5174  4  24  -4.471503  Putative iron-regulated outer membrane virulence
c5175  4  23  -2.586908  Hypothetical protein
c5176  4  27  -3.717082  Putative Transposase within prophage
c5177  5  32  -3.486644  Hypothetical protein in IS
c5178  6  33  -2.976530  Putative Transposase for IS629
c5179  7  38  -3.417978  PapG protein
c5180  6  39  -2.531750  PapF protein
c5181  5  33  -0.450700  PapE protein
c5182  4  31  0.489031  PapK protein
c5183  5  31  -0.866435  Hypothetical protein
c5184  4  31  -1.407677  PapJ protein
c5185  4  30  -2.180318  PapD protein
c5186  4  28  -1.344325  PapC protein
c5187  6  32  -4.008344  PapH protein
c5188  5  24  -2.476999  PapA protein
c5189  5  22  0.270117  PapI protein
c5190  4  21  -0.640148  Hypothetical protein
c5191  4  23  -0.305202  Hypothetical protein
c5192  4  21  0.003754  Conserved hypothetical protein
c5193  3  22  -0.326666  Hypothetical protein
c5194  2  27  -4.742310  Hypothetical protein
c5195  3  27  -6.463355  Hypothetical protein
c5196  0  21  -5.769316  Transposase insC for insertion element
c5197  -1  22  -4.817079  Transposase insD for insertion element
c5198  -1  21  -4.804105  Hypothetical protein
c5199  0  22  -4.399203  Hypothetical protein
c5200  0  22  -4.063364  Hypothetical protein
c5201  0  22  -3.390392  Transporter protein
c5202  2  23  -2.699117  Regulatory protein
c5203  5  24  -2.390472  Regulatory protein
c5204  6  25  -1.772897  Transport activator
c5205  7  30  -4.267056  Conserved hypothetical protein
c5206  7  35  -6.380655  Hypothetical protein
c5207  7  37  -6.999096  Hypothetical protein
c5208  5  32  -5.660612  Hypothetical protein ybdM
c5209  3  32  -6.010677  Hypothetical protein ybdN
c5210  3  33  -6.407590  Hypothetical protein
c5211  2  29  -4.385764  Hypothetical protein
c5212  1  19  -1.189765  Hypothetical protein
c5213  -1  15  -0.143045  Putative Transposase for IS629
c5214  0  15  -0.629938  Unknown protein of IS629 encoded within
c5215  0  14  -0.499380  Conserved hypothetical protein
c5216  1  18  -0.052315  Prophage P4 integrase
c5217  2  22  -0.160275  *Protein yjdC
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5130  HTHFIS  68  2e-15  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 68.3 bits (167), Expect = 2e-15
Identities = 31/109 (28%), Positives = 51/109 (46%), Gaps = 4/109 (3%)

Query: 4 VLIIDDDAMVAELNRRYVAQIPGFQCCGTASTLEKAKEIIFNSDTPIDLILLDIYMQKEN 63
+L+ DDDA + + + +++ G+ S I + DL++ D+ M EN
Sbjct: 6 ILVADDDAAIRTVLNQALSRA-GYDVR-ITSNAATLWRWI--AAGDGDLVVTDVVMPDEN 61

Query: 64 GLDLLPVLHNARCKSDVIVISSAADAVTIKDSLHYGVVDYLIKPFQASR 112
DLLP + AR V+V+S+ +T + G DYL KPF +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTE 110


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5131  PF06580  41  8e-06  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.0 bits (96), Expect = 8e-06
Identities = 21/99 (21%), Positives = 38/99 (38%), Gaps = 18/99 (18%)

Query: 442 LIENALE-ALGP-EPGGEISVTLHYRHGWLHCEVNDDGPGIAPDKIDHIFDKGVSTKGSE 499
L+EN ++ + GG+I + +G + EV + G +
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN------------TKES 310

Query: 500 RGVGLALVKQQVENLGG---SIAVESEPGIFTQFFVQIP 535
G GL V+++++ L G I + + G V IP
Sbjct: 311 TGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5134  SACTRNSFRASE  26  0.012  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 26.4 bits (58), Expect = 0.012
Identities = 9/28 (32%), Positives = 16/28 (57%)

Query: 32 LAIIEHTDVDESLKGQGIGKQLVAKVVE 59
A+IE V + + +G+G L+ K +E
Sbjct: 89 YALIEDIAVAKDYRKKGVGTALLHKAIE 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5139  TCRTETA  30  0.028  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 29.8 bits (67), Expect = 0.028
Identities = 36/190 (18%), Positives = 66/190 (34%), Gaps = 14/190 (7%)

Query: 44 NHAISLFSAYA-SLVYVTPILGGWLADRLLGNRTAVIAGALLMTLGHVVLGIDTNSTFSL 102
H L + YA P+LG +DR G R ++ + + ++ + L
Sbjct: 43 AHYGILLALYALMQFACAPVLGAL-SDRF-GRRPVLLVSLAGAAVDYAIMAT-APFLWVL 99

Query: 103 YLALAIIICGYGLFKSNISCLLGELYDEND-HRRDGGFSLLYAAGNIGSIAAPIACGLAA 161
Y+ + G+ + + + D D R F + A G +A P+ GL
Sbjct: 100 YIGRIV----AGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMG 155

Query: 162 QWYGWHVGFALAGGGMFIGLLIFLSGHRHFQSTRSMDKKALTSVKF-ALPVWSWLVVMLC 220
+ H F A + L FL+G + +++ L L + W M
Sbjct: 156 G-FSPHAPFFAAA---ALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTV 211

Query: 221 LAPVFFTLLL 230
+A + +
Sbjct: 212 VAALMAVFFI 221


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5142  SYCDCHAPRONE  37  8e-05  Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.
Length = 168

Score = 36.8 bits (85), Expect = 8e-05
Identities = 16/97 (16%), Positives = 36/97 (37%), Gaps = 7/97 (7%)

Query: 391 PLDEKQLAALNTEIDNIVTLPELNNLS-----IIYQIKAVSALVKGKTDESYQAINTGID 445
++ A+ + + T+ LN +S +Y + A + GK +++++
Sbjct: 6 TDTQEYQLAMESFLKGGGTIAMLNEISSDTLEQLYSL-AFNQYQSGKYEDAHKVFQALCV 64

Query: 446 LEMSWLNYVL-LGKVYEMKGMNREAADAYLTAFNLRP 481
L+ + L LG + G A +Y +
Sbjct: 65 LDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDI 101


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5145  HTHFIS  28  0.034  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 28.3 bits (63), Expect = 0.034
Identities = 8/39 (20%), Positives = 18/39 (46%), Gaps = 1/39 (2%)

Query: 84 NGAQFRQLCETTDWVDAGE-NVLLFGASGLGKSHLAAAI 121
A +++ + + +++ G SG GK +A A+
Sbjct: 142 RSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5179  PF03627  601  0.0  PapG
		>PF03627#PapG

Length = 336

Score = 601 bits (1550), Expect = 0.0
Identities = 333/336 (99%), Positives = 334/336 (99%)

Query: 1 MKKWFPALLFSLCVSGESSAWNNIVFYSLGNVNSYQGGNVVITQRPQFITSWRPGIATVT 60
MKKWFPALLFSLCVSGESSAWNNIVFYSLGNVNSYQGGNVVITQRPQFITSWRPGIATVT
Sbjct: 1 MKKWFPALLFSLCVSGESSAWNNIVFYSLGNVNSYQGGNVVITQRPQFITSWRPGIATVT 60

Query: 61 WNQCNGPEFADGSWAYYREYIAWVVFPKKVMTQNGYPLFIEVHNKGSWSEENTGDNDSYF 120
WNQCNGP FADGSWAYYREYIAWVVFPKKVMT+NGYPLFIEVHNKGSWSEENTGDNDSYF
Sbjct: 61 WNQCNGPGFADGSWAYYREYIAWVVFPKKVMTKNGYPLFIEVHNKGSWSEENTGDNDSYF 120

Query: 121 FLKGYKWDERAFDAGNLCQKPGETTRLTEKFDDIIFKVALPADLPLGDYSVKIPYTSGMQ 180
FLKGYKWDERAFDAGNLCQKPGETTRLTEKFDDIIFKVALPADLPLGDYSV IPYTSGMQ
Sbjct: 121 FLKGYKWDERAFDAGNLCQKPGETTRLTEKFDDIIFKVALPADLPLGDYSVTIPYTSGMQ 180

Query: 181 RHFASYLGARFKIPYNVAKTLPRENEMLFLFKNIGGCRPSAQSLEIKHGDLSINSANNHY 240
RHFASYLGARFKIPYNVAKTLPRENEMLFLFKNIGGCRPSAQSLEIKHGDLSINSANNHY
Sbjct: 181 RHFASYLGARFKIPYNVAKTLPRENEMLFLFKNIGGCRPSAQSLEIKHGDLSINSANNHY 240

Query: 241 AAQTLSVSCDVPANIRFMLLRNTTPTYSHGKKFSVGLGHGWDSIVSVNGVDTGETTMRWY 300
AAQTLSVSCDVPANIRFMLLRNTTPTYSHGKKFSVGLGHGWDSIVSVNGVDTGETTMRWY
Sbjct: 241 AAQTLSVSCDVPANIRFMLLRNTTPTYSHGKKFSVGLGHGWDSIVSVNGVDTGETTMRWY 300

Query: 301 KAGTQNLTIGSRLYGESSKIQPGVLSGSATLLMILP 336
KAGTQNLTIGSRLYGESSKIQPGVLSGSATLLMILP
Sbjct: 301 KAGTQNLTIGSRLYGESSKIQPGVLSGSATLLMILP 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5180  FIMBRIALPAPF  267  5e-95  Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.
Length = 167

Score = 267 bits (683), Expect = 5e-95
Identities = 155/167 (92%), Positives = 156/167 (93%), Gaps = 1/167 (0%)

Query: 11 MARLSLFISLLLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGE 70
M RLSLFISLLLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGE
Sbjct: 1 MIRLSLFISLLLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGE 60

Query: 71 VTKTISISCTYKSGSPWIKVTGNAMA-GQTNVLATNIANFGIALYQGKGMSTPLTLGNGS 129
VTK ISISC YKSGS WIKVTGN M GQ NVLATNI +FGIALYQGKGMSTPLTLGNGS
Sbjct: 61 VTKNISISCPYKSGSLWIKVTGNTMGVGQNNVLATNITHFGIALYQGKGMSTPLTLGNGS 120

Query: 130 GNGYRVTAGLDTARSTFTFTSVPFRNGSRTLNGGDFRTTASMSMIYN 176
GNGYRVTAGLDTARSTFTFTSVPFRNGS LNGGDFRTTASMSMIYN
Sbjct: 121 GNGYRVTAGLDTARSTFTFTSVPFRNGSGILNGGDFRTTASMSMIYN 167


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5181  FIMBRIALPAPE  306  e-110  Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.
Length = 173

Score = 306 bits (784), Expect = e-110
Identities = 128/173 (73%), Positives = 145/173 (83%)

Query: 7 MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVTKAEVDWGNVEIQTLSQNG 66
MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTV AEV+WG++EIQ L Q+G
Sbjct: 1 MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVQNAEVNWGDIEIQNLVQSG 60

Query: 67 NHEKEFTVNMQCPYHLGTMKVTITATNTYNNAILVQNTSNTSSDGVLVYLYNSNAGNIGT 126
++K+FTV+M CPY LGTMKVTIT+ N+ILV NTS S DG+L+YLYNSN IG
Sbjct: 61 GNQKDFTVDMNCPYSLGTMKVTITSNGQTGNSILVPNTSTASGDGLLIYLYNSNNSGIGN 120

Query: 127 AITLGTPFTPGKITGNNADRTISLHAKLGYKGNMQSLKAGDFSATATLVASYS 179
A+TLG+ TPGKITG R I+L+AKLGYKGNMQSL+AG FSATATLVASYS
Sbjct: 121 AVTLGSQVTPGKITGTAPARKITLYAKLGYKGNMQSLQAGTFSATATLVASYS 173


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5186  PF00577  742  0.0  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 742 bits (1918), Expect = 0.0
Identities = 243/882 (27%), Positives = 362/882 (41%), Gaps = 67/882 (7%)

Query: 5 MRGMKDRI-PFAVNNITCVILLSLFCNAASAVEFNTDVLDAADKKNIDFTRFSEAGYVLP 63
+ K R+ F V + +++ + FN L + D +RF + P
Sbjct: 16 LHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPP 75

Query: 64 GQYLLDVIVNGQSISPASLQISFVEPQSSGDKAEKKLPQACLTSDMVRLMGLTAESLDKV 123
G Y +D+ +N + A+ ++F S CLT + MGL S+ +
Sbjct: 76 GTYRVDIYLNNGYM--ATRDVTFNTGDSEQGI------VPCLTRAQLASMGLNTASVSGM 127

Query: 124 VYWHDGQCADF-HGLPGVDIRPDTGAGVLRINMPQAWLEYSDATWLPPSRWDDGIPGLML 182
D C + + D G L + +PQA++ ++PP WD GI +L
Sbjct: 128 NLLADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLL 187

Query: 183 DYNLNGTVSRNYQGGDSHQFSYNGTVGGNLGPWRLRADYQGSQEQSRYNGEKTTNRNFTW 242
+YN +G +N GG+SH N G N+G WRLR + S S + + +
Sbjct: 188 NYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSS--DSSSGSKNKWQH 245

Query: 243 SRFYLFRAIPRWRANLTLGENNINSDIFRSWSYTGASLESDDRMLPPRLRGYAPQITGIA 302
+L R I R+ LTLG+ DIF ++ GA L SDD MLP RG+AP I GIA
Sbjct: 246 INTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIA 305

Query: 303 ETNARVVVSQQGRVLYDSMVPAGPFSIQDLD-SSVRGRLDVEVIEQNGRKKTFQVDTASV 361
A+V + Q G +Y+S VP GPF+I D+ + G L V + E +G + F V +SV
Sbjct: 306 RGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSV 365

Query: 362 PYLTRPGQVRYKLVSGRSRGYGHETEGPVFATGEASWGLSNQWSLYGGAVLAGDYNALAA 421
P L R G RY + +G R + E P F GL W++YGG LA Y A
Sbjct: 366 PLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNF 425

Query: 422 GAGWDLGVPGTLSADITQSVARIEGERTFQGKSWRLSYSKRFDNADADITFAGYRFSERN 481
G G ++G G LS D+TQ+ + + + G+S R Y+K + + +I GYR+S
Sbjct: 426 GIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSG 485

Query: 482 YMTMEQYLNARYR--------------------NDYSSREKEMYTVTLNKNVADWNTSFN 521
Y +R + + ++ +T+ + + + +
Sbjct: 486 YFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTS-TLY 544

Query: 522 LQYSRQTYWDIRKTD-YYTVSVNRYFNVFGLQGVAVGLSASRSKYLGRD--NDSAYLRIS 578
L S QTYW D + +N F + LS S +K + + L ++
Sbjct: 545 LSGSHQTYWGTSNVDEQFQAGLNTAFE-----DINWTLSYSLTKNAWQKGRDQMLALNVN 599

Query: 579 VPLGT------------GTASYSGSMSND-RYVNMAGYTDM-FNDGLDSYSLNAGLNSGG 624
+P +ASYS S + R N+AG D SYS+ G GG
Sbjct: 600 IPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGG 659

Query: 625 GLTSQRQINAYYSHRSPLANLSANIASLQKGYTSFGVSASGGATITGKGAALHAGGMSGG 684
S A ++R N + S SGG G L G
Sbjct: 660 DGNSGSTGYATLNYRGGYGNANIG-YSHSDDIKQLYYGVSGGVLAHANGVTL--GQPLND 716

Query: 685 TRLLVDTDGVGGVPVDGGQVV-TNRWGTGVVTDISSYYRNTTSVDLKRLPDDVEATRSVV 743
T +LV G V+ V T+ G V+ + Y N ++D L D+V+ +V
Sbjct: 717 TVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVA 776

Query: 744 ESALTEGAIGYRKFSVLKGKRLFAILRLADGSQPPFGASVTSEKGRELGMVADEGLAWLS 803
T GAI +F G +L L + PFGA VTSE + G+VAD G +LS
Sbjct: 777 NVVPTRGAIVRAEFKARVGIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQVYLS 835

Query: 804 GVTPGETLSVNW--DGKIQCQVNVPETAISDQQLL----LPC 839
G+ + V W + C N S QQLL C
Sbjct: 836 GMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5187  FIMBRIALPAPE  32  0.001  Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.
Length = 173

Score = 31.5 bits (71), Expect = 0.001
Identities = 41/173 (23%), Positives = 75/173 (43%), Gaps = 29/173 (16%)

Query: 29 GMSLPEYWG----EEHVWWDGRAAFHGEVVRPACTLAMEDAWQIIDMGETPVRDL-QNGF 83
G+ LP G +HV F G+++ PACT+ + ++ G+ +++L Q+G
Sbjct: 6 GLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVQNAE----VNWGDIEIQNLVQSG- 60

Query: 84 SGPERKFSLRLRNCEFNSQGGNLFSDSRIRVTFDGVRGET---PDKFNLSGQAKGINLQI 140
G ++ F++ + NC ++ ++ +T +G G + P+ SG I L
Sbjct: 61 -GNQKDFTVDM-NCPYS------LGTMKVTITSNGQTGNSILVPNTSTASGDGLLIYLYN 112

Query: 141 ADARGNIARAGKV-MPAIP--LTGNEEALDYTLRIVR----NGKKLEAGNYFA 186
++ I A + P +TG A TL N + L+AG + A
Sbjct: 113 SNN-SGIGNAVTLGSQVTPGKITGTAPARKITLYAKLGYKGNMQSLQAGTFSA 164


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5201  TCRTETA  39  3e-05  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.0 bits (91), Expect = 3e-05
Identities = 66/387 (17%), Positives = 130/387 (33%), Gaps = 39/387 (10%)

Query: 52 TPYLKEQLDLSATQI---GVLSSCMLIAYGISKGVMSSLADKASPKVFMACGLVLCAIVN 108
P L L S G+L + + V+ +L+D+ + + L A+
Sbjct: 28 LPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDY 87

Query: 109 VGLGFSTAFWIFAALVILNGLFQGMGVGPSFITIANWFPRRERGRVGAFWNISHNIGGGI 168
+ + W+ I+ G+ G IA+ ER R F +S G G+
Sbjct: 88 AIMATAPFLWVLYIGRIVAGITGATGAVAG-AYIADITDGDERAR--HFGFMSACFGFGM 144

Query: 169 VA-PIVGAAFALLGSEHWQSASYIVPACVAIVFAVIVLILGKGSPHQEGLPSLEEMMPEE 227
VA P++G A + A + + + L +PE
Sbjct: 145 VAGPVLGGLMGGFSPH----APFFAAAALNGLNFLTGCFL----------------LPE- 183

Query: 228 KVVLNTRQTVKAPENMSAFQIFCTYVLRNKNAWYVSLVDVFVYMVRFGMISWLPIYLLTV 287
+ + + P A ++ +L+ VF M G + +
Sbjct: 184 -----SHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGE 238

Query: 288 KHFSKEQMSVAFLFFEWA---AIPSTLLAGWLSDKLFKGRRMPLAMICMALIFICLIGYW 344
F + ++ + ++ ++ G ++ +L + R + L MI +I L
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 345 KSESLFMVTIFAAIVGCLIYVPQFLASVQTMEIVPSFAVGSAVGLRGFMSYIFGASLGTS 404
+ F + + A G + Q + S Q E GS L ++ I G L T+
Sbjct: 299 RGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTS-LTSIVGPLLFTA 357

Query: 405 LFGIMVDHIGWHGGFYLLGCGIICCII 431
++ + W+G ++ G + +
Sbjct: 358 IYAASITT--WNGWAWIAGAALYLLCL 382


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5204  HTHFIS  240  7e-77  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 240 bits (615), Expect = 7e-77
Identities = 113/479 (23%), Positives = 191/479 (39%), Gaps = 83/479 (17%)

Query: 10 SILLIDDDADVLDAYTQLLEQSGYRVFACNNPFEAQAWIQPDWPGIVLSDVCMPGCSGID 69
+IL+ DDDA + Q L ++GY V +N WI +V++DV MP + D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 70 LMMLFHQDDQQLPILLITGHGDVPMAVDAVKKGAWDFLQKPVDPGKLLSLVEEALRQRQS 129
L+ + LP+L+++ A+ A +KGA+D+L KP D +L+ ++ AL + +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 130 IIARRQYCQQTLQVELIGRSEWINQYRRRLQQLSETDIAVWLYGAPGTGRMTGARYLHQF 189
++ + Q L+GRS + + R L +L +TD+ + + G GTG+ AR LH +
Sbjct: 125 RPSKLEDDSQDGM-PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDY 183

Query: 190 GRNAQGEFVYRELTPDNAPQLND------------------------FIALAQGGTLVLS 225
G+ G FV N + A+GGTL L
Sbjct: 184 GKRRNGPFV-----AINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLD 238

Query: 226 HPEHLTREQQYHLVQ-LQSQEHRP----------FRLIGIGDTSLVELAASNHIIAELYY 274
+ + Q L++ LQ E+ R++ + L + +LYY
Sbjct: 239 EIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYY 298

Query: 275 CFAMTQIACLPLTQRPDDIEPLFRHYLCKACQRLNHPVPEVGKEMLKEMMRRMWPNNVRE 334
+ + PL R +DI L RH++ +A + V +E L+ M WP NVRE
Sbjct: 299 RLNVVPLRLPPLRDRAEDIPDLVRHFVQQAE-KEGLDVKRFDQEALELMKAHPWPGNVRE 357

Query: 335 LANAAE--------------------------------LFTVGVLPLAE---------TA 353
L N G L +++ A
Sbjct: 358 LENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFA 417

Query: 354 NPLMHVGTPTPLDRRVEDAERQIITEALNIHQGRINEVAEYLQIPRKKLYLRMKKYGLS 412
+ + DR + + E +I AL +G + A+ L + R L ++++ G+S
Sbjct: 418 SFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5217  HTHTETR  47  1e-08  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 46.9 bits (111), Expect = 1e-08
Identities = 30/199 (15%), Positives = 57/199 (28%), Gaps = 14/199 (7%)

Query: 1 MGKTEENSVQ-REDVLGEALKLLELQGIANTTLEMVAERVDYPLDELRRFWPDKEAILYD 59
KT++ + + R+ +L AL+L QG+++T+L +A+ + + DK + +
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 60 ALRYLSQQIDVWRRQLMLDETQTAEQKLLARYQALSECVKNNRYPGCLFIAACTFYPDPG 119
I + L + E + F+
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLEST--VTEERRRLLMEIIFHKCEF 119

Query: 120 H----PIHQLADQQKSAAYDFTHELLTT-------LEVDDPAMVAKQMELVLEGCLSRML 168
+ Q +YD + L A M + G + L
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWL 179

Query: 169 VNRSQADVDTAHRLAEDIL 187
D+ R IL
Sbjct: 180 FAPQSFDLKKEARDYVAIL 198


73  c5249  c5269  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c5249  -2  13  3.168365  Oligoribonuclease
c5250  -2  12  3.234567  ***Putative electron transport protein yjeS
c5251  -2  12  3.350138  Hypothetical protein yjeF
c5252  -1  14  2.586514  Hypothetical protein yjeE
c5253  0  13  3.119136  N-acetylmuramoyl-L-alanine amidase amiB
c5254  1  15  2.870024  DNA mismatch repair protein mutL
c5255  2  19  2.028615  tRNA delta(2)-isopentenylpyrophosphate
c5256  4  25  2.198793  Hfq protein
c5257  4  23  2.094618  GTP-binding protein hflX
c5258  4  24  2.628136  HflK protein
c5259  4  24  2.468771  HflC protein
c5260  2  20  1.616222  Hypothetical protein yjeT
c5261  3  19  1.524292  Adenylosuccinate synthetase
c5262  4  14  0.693526  Hypothetical protein yjeB
c5263  4  14  0.521168  Ribonuclease R
c5264  2  15  -2.355547  Hypothetical tRNA/rRNA methyltransferase yjfH
c5265  2  20  -4.987364  Hypothetical protein yjfI
c5266  3  20  -3.488740  Hypothetical protein yjfJ precursor
c5267  6  22  -4.089381  Hypothetical protein yjfK
c5268  3  22  -2.338565  Hypothetical protein yjfL
c5269  2  20  0.628156  Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5257  SECA  32  0.005  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 31.8 bits (72), Expect = 0.005
Identities = 26/144 (18%), Positives = 54/144 (37%), Gaps = 6/144 (4%)

Query: 282 HVIDAADVRVQENIEAVNTVLEEIDAHEIPTLLVMNKIDMLDDFEPRIDRDEENK-PIRV 340
++D +DV N + IDA+ P L ++ + + R+ D + PI
Sbjct: 665 ELLDVSDVSETINSIREDVFKATIDAYIPPQSL--EEMWDIPGLQERLKNDFDLDLPIAE 722

Query: 341 WLSAQTGAGIPQLFQALTERLSGEVAQHTLRLPPQEGRLRSRFYQLQAIEKEWMEEDGSV 400
WL + L + + + + + + R + LQ ++ W E ++
Sbjct: 723 WLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAAM 782

Query: 401 SLQVRMPIVDWRRLCKQEPALIDY 424
+R I R +++P +Y
Sbjct: 783 D-YLRQGIH-LRGYAQKDP-KQEY 803


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5258  cloacin  32  0.006  Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 31.6 bits (71), Expect = 0.006
Identities = 25/81 (30%), Positives = 30/81 (37%), Gaps = 10/81 (12%)

Query: 17 GSSKPGGNSEGNGNKGGRDQGPPDLDDIFRKLSKKLGGLGGGKGTGSGGGSSSQGP---- 72
S G +SE N GG G G GGG GTG G S+ P
Sbjct: 33 ASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTG-GNLSAVAAPVAFG 91

Query: 73 -----RPQLGGRVVTIAAAAI 88
P GG V+I+A A+
Sbjct: 92 FPALSTPGAGGLAVSISAGAL 112


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5263  RTXTOXIND  31  0.027  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 30.6 bits (69), Expect = 0.027
Identities = 12/55 (21%), Positives = 24/55 (43%), Gaps = 1/55 (1%)

Query: 179 VVPDDSRLSFDILIPPDQIMGARMGFVVVVELTQRPTRRTKAV-GKIVEVLGDNM 232
+VP+D L L+ I +G ++++ P R + GK+ + D +
Sbjct: 359 IVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAI 413


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5266  PHPHTRNFRASE  32  0.001  Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.
Length = 572

Score = 32.4 bits (74), Expect = 0.001
Identities = 22/122 (18%), Positives = 47/122 (38%), Gaps = 12/122 (9%)

Query: 85 VNPSLINEVAEEIARLENLITAEEQVLSNLEVSRDGVEKAVTATAQRIAQFEQQMEVVKA 144
+ + I +V+ EI +L A E+ L +D E ++ A I F + V+
Sbjct: 29 IEKTSITDVSTEIEKLT---AALEKSKEELRAIKDQTEASMGADKAEI--FAAHLLVLDD 83

Query: 145 TEAMQRAQQAVTTSTVGASSSVSTAAESLKRLQTRQAERQARLDAAAQLEKVADGRDLDE 204
E + + + + A ++ ++ +D E+ AD RD+ +
Sbjct: 84 PELVDGIKGKIENEQMNAEYALKEVSD-------MFVSMFESMDNEYMKERAADIRDVSK 136

Query: 205 KL 206
++
Sbjct: 137 RV 138


74  c5283  c5294  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c5283  -1  26  3.871682  Unknown pentitol phosphotransferase enzyme II, B
c5284  -2  23  3.733300  Unknown pentitol phosphotransferase enzyme II, A
c5285  -2  24  3.806694  Probable hexulose-6-phosphate synthase
c5286  -2  22  3.452978  Hypothetical protein
c5287  -1  23  3.099867  Putative hexulose-6-phosphate isomerase
c5288  1  28  1.043566  Probable sugar isomerase sgaE
c5289  6  38  -0.916316  Hypothetical protein yjfY precursor
c5290  5  34  -1.450152  Hypothetical protein
c5291  4  32  -2.693986  30S ribosomal protein S6
c5292  3  30  -4.344230  30S ribosomal protein S18
c5293  1  27  -4.855011  Hypothetical protein
c5294  -2  24  -3.109022  50S ribosomal protein L9
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5285  ECOLNEIPORIN  28  0.032  E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 27.8 bits (62), Expect = 0.032
Identities = 6/19 (31%), Positives = 7/19 (36%), Gaps = 2/19 (10%)

Query: 118 FNGDVQI--ELTGYWTWEQ 134
F G + L W EQ
Sbjct: 62 FKGQEDLGNGLKAIWQVEQ 80


75  c5371  c5396  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c5371  2  24  -2.036374  *Prophage P4 integrase
c5372  3  28  -1.782469  Hypothetical protein
c5373  5  32  -0.480347  Hypothetical protein
c5374  6  35  1.725091  Hypothetical protein
c5375  5  36  1.844145  Hypothetical protein
c5376  5  35  1.459772  Hypothetical protein
c5377  4  30  -0.416834  Hypothetical protein
c5378  3  26  -1.307271  Hypothetical protein
c5379  2  20  -4.151082  Hypothetical protein
c5380  1  27  -8.733916  Hypothetical protein
c5381  2  31  -10.542467  Hypothetical protein
c5382  0  31  -8.868477  Hypothetical protein
c5383  0  30  -7.499314  Hypothetical protein
c5384  -1  33  -7.676710  Hypothetical protein
c5385  0  32  -8.297934  Hypothetical protein
c5386  0  30  -7.428011  Hypothetical protein
c5387  0  30  -6.907415  Hypothetical protein yjhS precursor
c5388  -1  30  -7.128723  Hypothetical protein yjhT precursor
c5389  1  29  -6.187502  Hypothetical protein yjhA precursor
c5390  0  30  -5.727731  Hypothetical protein
c5391  1  26  -4.024839  Type 1 fimbriae Regulatory protein fimB
c5392  0  25  -3.374467  Type 1 fimbriae Regulatory protein fimE
c5393  1  24  -3.186795  Type-1 fimbrial protein, A chain precursor
c5394  1  24  -2.927113  Fimbrin-like protein fimI precursor
c5395  2  22  -2.785196  Chaperone protein fimC precursor
c5396  3  15  -1.141126  Outer membrane usher protein fimD precursor
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
c5396  PF00577  1098  0.0  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 1098 bits (2842), Expect = 0.0
Identities = 877/878 (99%), Positives = 877/878 (99%)

Query: 1 MSYLNLRLYQRNTQCLHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQA 60
MSYLNLRLYQRNTQCLHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQA
Sbjct: 1 MSYLNLRLYQRNTQCLHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQA 60

Query: 61 VADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLN 120
VADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLN
Sbjct: 61 VADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLN 120

Query: 121 TASVSGMNLLADDACVPLTSMIHDATAHLDVGQQRLNLTIPQAFMSNRARGYIPPELWDP 180
TASVSGMNLLADDACVPLTSMIHDATA LDVGQQRLNLTIPQAFMSNRARGYIPPELWDP
Sbjct: 121 TASVSGMNLLADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDP 180

Query: 181 GINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSK 240
GINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSK
Sbjct: 181 GINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSK 240

Query: 241 NKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPV 300
NKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPV
Sbjct: 241 NKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPV 300

Query: 301 IHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTV 360
IHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTV
Sbjct: 301 IHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTV 360

Query: 361 PYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRY 420
PYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRY
Sbjct: 361 PYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRY 420

Query: 421 RAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYR 480
RAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYR
Sbjct: 421 RAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYR 480

Query: 481 YSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRT 540
YSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRT
Sbjct: 481 YSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRT 540

Query: 541 STLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNI 600
STLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNI
Sbjct: 541 STLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNI 600

Query: 601 PFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGD 660
PFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGD
Sbjct: 601 PFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGD 660

Query: 661 GNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVL 720
GNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVL
Sbjct: 661 GNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVL 720

Query: 721 VKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP 780
VKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP
Sbjct: 721 VKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP 780

Query: 781 TRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLA 840
TRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLA
Sbjct: 781 TRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLA 840

Query: 841 GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878
GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR
Sbjct: 841 GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878


76  c5408  c5432  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
c5408  2  19  -0.088261  Isoaspartyl dipeptidase
c5409  3  21  -1.483271  Hypothetical protein yjiG
c5410  1  20  -0.703036  Hypothetical protein yjiH
c5411  1  19  -0.319132  Hypothetical protein
c5414  2  18  0.312389  Hypothetical protein yjiJ
c5415  2  16  -0.696130  Hypothetical protein yjiK
c5416  2  15  -1.756353  Hypothetical protein
c5417  1  14  -1.603561  Hypothetical protein yjiL
c5418  0  17  -4.481767  Hypothetical protein yjiM
c5419  0  16  -4.810939  Hypothetical protein yjiN
c5420  -1  13  -4.895709  Hypothetical protein yfcI
c5421  0  19  -7.295506  Hypothetical protein
c5422  -1  15  -6.166085  Hypothetical protein yjiW
c5423  -1  14  -6.440389  Putative restriction modification enzyme S
c5424  -2  13  -2.761331  Putative restriction modification enzyme M
c5425  -3  12  -1.968647  Putative restriction modification enzyme R
c5426  -3  12  -1.771000  Conserved hypothetical protein
c5427  -3  20  2.322431  Hypothetical protein yjiA
c5428  -3  17  1.258899  Hypothetical protein yjiX
c5429  -2  16  0.961776  Hypothetical protein yjiY
c5430  -1  14  -1.685681  Methyl-accepting chemotaxis protein I
c5431  0  19  -3.586350  Hypothetical protein
c5432  0  20  -3.763948  Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c5408     UREASE  35     4e-04    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 35.5 bits (82), Expect = 4e-04
Identities = 30/129 (23%), Positives = 48/129 (37%), Gaps = 33/129 (25%)

Query: 26 CDVLIANGKIIAVASNIPSDIVPDCT--------VVDLSGQILCPGFIDQHVHLIGG--- 74
D+ + +G+I A+ D+ P T V+ G+I+ G +D H+H I
Sbjct: 86 ADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIVTAGGMDSHIHFICPQQI 145

Query: 75 ------------GGEAGP------TTRTP-EVALSRLTEA--GITSVVGLLGTDSISRHP 113
GG GP TT TP ++R+ EA + G + S P
Sbjct: 146 EEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIARMIEAADAFPMNLAFAGKGNAS-LP 204

Query: 114 ESLLAKTRA 122
+L+
Sbjct: 205 GALVEMVLG 213


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c5414     TCRTETA  29     0.027    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 29.4 bits (66), Expect = 0.027
Identities = 58/312 (18%), Positives = 106/312 (33%), Gaps = 16/312 (5%)

Query: 82 RPFLLASALASGLLILAMAWLPPFILVLLIRVLAGVASAGMLIFGSTLIMQHTRHPFVLA 141
RP LL S + + MA P ++ + R++AG+ A + G+ +
Sbjct: 73 RPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARH 132

Query: 142 ALFSGVGIGIALSNEYVLAGLHFDLSSQTLWQGAGALSGMMLIALTLLMP-SKKHAIAPM 200
F G + VL GL S + A AL+G+ + L+P S K P+
Sbjct: 133 FGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPL 192

Query: 201 PLAKTEQQIMSWW---------LLAILYGLAGFGYIIVATYLPLMAKDAGSPLLTAHLWT 251
W L+A+ + + G + A ++ T + +
Sbjct: 193 RREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGI-S 251

Query: 252 LVGLSIVPGCFGWLWA---AKRWGALPCLTANLLVQAIS-VLLTLASDSPLLLIISSLGF 307
L I+ + A R G L ++ +LL A+ + I L
Sbjct: 252 LAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVL-L 310

Query: 308 GGTFMGTTSLVMTIARQLSVPGNLNLLGFVTLIYGIGQILGPALTSMLGNGTSALASATL 367
+G +L ++RQ+ L G + + + I+GP L + + + +
Sbjct: 311 ASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTWNGWA 370

Query: 368 CGAAALFIAALI 379
A A +
Sbjct: 371 WIAGAALYLLCL 382


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c5415     ADHESNFAMILY  29     0.026    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 29.1 bits (65), Expect = 0.026
Identities = 10/45 (22%), Positives = 17/45 (37%)

Query: 54 LFVIVAVCTFFVQSCARKSNHAASFQNYHATIDGKEIAGITNNIS 98
+++ + + +CA S Q IA IT NI+
Sbjct: 6 TLLVLFLSAIILVACASGKKDTTSGQKLKVVATNSIIADITKNIA 50


77  c5443  c5454  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c5443     1       22       -3.215329  Hypothetical protein yjjP
c5444     2       24       -4.190069  Hypothetical protein yjjQ
c5445     1       17       -0.876815  Transcriptional activator protein bglJ
c5446     2       15       1.394999   Ferric iron reductase protein fhuF
c5447     2       16       2.245129   Hypothetical protein
c5448     -1      19       3.217126   Hypothetical protein
c5449     -1      17       3.410311   ***Hypothetical protein
c5450     0       17       3.428305   Ribosomal RNA small subunit methyltransferase C
c5451     -1      17       2.941414   Hypothetical protein
c5452     1       19       1.990068   DNA polymerase III, psi subunit
c5453     2       20       1.049512   Ribosomal-protein-alanine acetyltransferase
c5454     2       21       0.942176   Hypothetical protein yjjG
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c5446     2FE2SRDCTASE  482    e-177    Ferric iron reductase signature.
		>2FE2SRDCTASE#Ferric iron reductase signature.

Length = 262

Score = 482 bits (1243), Expect = e-177
Identities = 250/262 (95%), Positives = 253/262 (96%)

Query: 1 MAYRSAPLYEDVIWRTHLQPQDAGLAQAVRATIAEHREHLLEFIRLDEPAPLNAMTLAQW 60
MAYRSAPLYEDVIWRTHLQPQD LAQAVRATIA+HREHLLEFIRLDEPAPLNAMTLAQW
Sbjct: 1 MAYRSAPLYEDVIWRTHLQPQDPTLAQAVRATIAKHREHLLEFIRLDEPAPLNAMTLAQW 60

Query: 61 SSPNALSSLLAVYSDHIYRNQPLMIRENKPLISLWAQWYIGLMVPPLMLALLTQEKALDV 120
SSPN LSSLLAVYSDHIYRNQP+MIRENKPLISLWAQWYIGLMVPPLMLALLTQEKALDV
Sbjct: 61 SSPNVLSSLLAVYSDHIYRNQPMMIRENKPLISLWAQWYIGLMVPPLMLALLTQEKALDV 120

Query: 121 SPEHVHVEFHETGRAACFWVDVCEDKNATLHSPQQRMETLISQALVPVVQALEATGEING 180
SPEH H EFHETGR ACFWVDVCEDKNAT HSPQ RMETLISQALVPVVQALEATGEING
Sbjct: 121 SPEHFHAEFHETGRVACFWVDVCEDKNATPHSPQHRMETLISQALVPVVQALEATGEING 180

Query: 181 KLIWSNTGYLINWYLTEMKQLLGEATVESLRYALFFEKTLTTGEDNPLWRTVVLRDGLLV 240
KLIWSNTGYLINWYLTEMKQLLGEATVESLR+ALFFEKTLT GEDNPLWRTVVLRDGLLV
Sbjct: 181 KLIWSNTGYLINWYLTEMKQLLGEATVESLRHALFFEKTLTNGEDNPLWRTVVLRDGLLV 240

Query: 241 RRTCCQRYRLPDVQQCGDCTLK 262
RRTCCQRYRLPDVQQCGDCTLK
Sbjct: 241 RRTCCQRYRLPDVQQCGDCTLK 262


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c5453     SACTRNSFRASE  55     4e-12    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 54.6 bits (131), Expect = 4e-12
Identities = 23/80 (28%), Positives = 35/80 (43%), Gaps = 1/80 (1%)

Query: 62 DEATLFNIAVDPDYQRQGLGRALLEHLIDELEKRGVATLWLEVRASNAAAIALYESLGFN 121
A + +IAV DY+++G+G ALL I+ ++ L LE + N +A Y F
Sbjct: 88 GYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFI 147

Query: 122 EATIRRNYYPTTDG-REDAI 140
+ Y E AI
Sbjct: 148 IGAVDTMLYSNFPTANEIAI 167


78  c0503  c0509  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
c0503     1       16       1.610681  Hypothetical protein yajF
c0504     1       15       1.533511  Protein araJ precursor
c0505     2       15       1.808628  Exonuclease sbcC
c0506     -2      14       2.003125  Nuclease sbcCD subunit D
c0507     -2      17       1.843977  Hypothetical protein
c0508     -2      15       2.407771  Phosphate regulon transcriptional Regulatory
c0509     -1      14       2.202258  Phosphate regulon sensor protein phoR
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0503     ACETATEKNASE  29     0.020    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 29.4 bits (66), Expect = 0.020
Identities = 17/69 (24%), Positives = 29/69 (42%), Gaps = 10/69 (14%)

Query: 233 FISGTGFATDYRRLSGHALKGSEIIRLVEESDPVAELALRRYELRLAKSLAHVVNILDP- 291
+G ++D+R L A + D A+LAL + R+ K++ +
Sbjct: 273 VYGISGISSDFRDLEDAAF---------KNGDKRAQLALNVFAYRVKKTIGSYAAAMGGV 323

Query: 292 DVIVLGGGM 300
DVIV G+
Sbjct: 324 DVIVFTAGI 332


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0504     TCRTETA  51     3e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 51.4 bits (123), Expect = 3e-09
Identities = 73/356 (20%), Positives = 126/356 (35%), Gaps = 35/356 (9%)

Query: 33 ILSLALGTFGLGMAEFGIMGVLTELAHNVGISIPAAGH---MISYYALGVVVGAPIIALF 89
+ ++AL G+G+ IM VL L ++ S H +++ YAL AP++
Sbjct: 11 LSTVALDAVGIGL----IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 90 SSRYSLKHILLFLVALCVIGNAMFTLSSSYLMLAIGRLVSGFPHGAFFGVGAIVLSKIIK 149
S R+ + +LL +A + A+ + +L IGR+V+G GA + I
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIAD--IT 124

Query: 150 PGKVTAAVAGMVSGMTVANLLGIPLGTYLSQEFSWRYTFLLIAVFNIAVMASVYFWVPDI 209
G A G +S ++ P+ L FS F A N + F +P+
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 210 RDEAKGKLREQ----------FHFLRSPAPWLI--FAATMFGNAGVFAWFSYVKPYMMFI 257
+ LR + + A + F + G W +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG------E 238

Query: 258 SGFSETAMTFIMMLVGLGM---VLGNVLSGRISGRYSPLRIAAVTDFIIVLALLMLFFFG 314
F A T + L G+ + +++G ++ R R + ++L F
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 315 GMKTTSLIFAFICCAGLFALSAPLQILLLQNAKGGELLGAAGGQIAF--NLGSAVG 368
I + G+ LQ +L + E G G +A +L S VG
Sbjct: 299 RGWMAFPIMVLLASGGIG--MPALQAMLSRQV-DEERQGQLQGSLAALTSLTSIVG 351


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
c0505     IGASERPTASE  40     4e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.4 bits (94), Expect = 4e-05
Identities = 40/264 (15%), Positives = 81/264 (30%), Gaps = 11/264 (4%)

Query: 162 LNAKPKERAELLEELTGTEIYGKISAMVFEQHKSARTELEKLQAQASGVALLTPEQVQSL 221
A P E E + E + E S V + + A + + A Q+
Sbjct: 1029 APATPSETTETVAENSKQE-----SKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTN 1083

Query: 222 TASLQVLTDEEKQLLTAQQQEQQSLNWLTRLD-ELQQEASRRQQALQQALAEEEKAQPQL 280
+ +E Q ++ +++ E QE + + + E QPQ
Sbjct: 1084 EVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQA 1143

Query: 281 AALSLAQPARNLRPHWE---RIAEHSAALAHTRQQIEEVNTRLQSTMALRASIRHHAAKQ 337
P N++ A+ T +E+ T + + + +
Sbjct: 1144 EPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTT 1203

Query: 338 SAELQQQQQSLNAWLQEHDRFRQWNNELAGWRAQFSQQTSDREHLRQWQQQLTHAEQKLN 397
A Q S ++ ++ R + + ++DR + T+ L+
Sbjct: 1204 PATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPA-TTSSNDRSTVALCDLTSTNTNAVLS 1262

Query: 398 ALAAITLTLTADEVASALAQHAEQ 421
A + + V A++QH Q
Sbjct: 1263 DARAKAQFVALN-VGKAVSQHISQ 1285



Score = 33.5 bits (76), Expect = 0.005
Identities = 27/139 (19%), Positives = 54/139 (38%), Gaps = 13/139 (9%)

Query: 738 QQDVLAAQSLQKAQAQFDTALQASVFDDQQAFLAALMDEQTLTQLEQLKQNLENQRRQAQ 797
Q DV + S + A+ D A A E T T E KQ + + Q
Sbjct: 1004 QADVPSVPSNNEEIARVDEAPVPPP-------APATPSETTETVAENSKQESKTVEKNEQ 1056

Query: 798 TLVTQTAETLTQHQQHRPGGLSLTVTVEQIQQELAQTHQKLRENTTSQGEIRQQLKQDAD 857
TA+ ++ + V E+AQ+ + +E T++ + ++++
Sbjct: 1057 DATETTAQNREVAKEAKS-----NVKANTQTNEVAQSGSETKETQTTETKETATVEKEEK 1111

Query: 858 NRQQQQTLMQQIAQMTQQV 876
+ + + Q++ ++T QV
Sbjct: 1112 AKVETEK-TQEVPKVTSQV 1129


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
c0506     FRAGILYSIN  30     0.022    Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.

Length = 405

Score = 29.7 bits (66), Expect = 0.022
Identities = 13/70 (18%), Positives = 23/70 (32%), Gaps = 4/70 (5%)

Query: 149 KQQHLLAAITDYYQQHYADACKLRGDQPLPIIATGHLTTVGASKSDAVRDIYIGTLDAFP 208
K+ ++ I ++Y + + + I T D + + I A
Sbjct: 135 KEAQMMNEIAEFYAAPFKKTRAINEKEAFECI-YDSRTRSA--GKD-IVSVKINIDKAKK 190

Query: 209 AQNFPPADYI 218
N P DYI
Sbjct: 191 ILNLPECDYI 200


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c0508     HTHFIS  95     1e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.5 bits (235), Expect = 1e-24
Identities = 32/149 (21%), Positives = 61/149 (40%), Gaps = 9/149 (6%)

Query: 4 RILVVEDEVPIREMVCFVLEQNGFQPVEAEDYDSAVNQLNEPWPDLILLDWMLPGGSGIQ 63
ILV +D+ IR ++ L + G+ + + + DL++ D ++P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 FIKHLKRESMTRDIPVVMLTARGEEEDRVRGLETGADDYITKPFSPKELVARIKAVMRRI 123
+ +K+ D+PV++++A+ ++ E GA DY+ KPF EL+ I +
Sbjct: 65 LLPRIKKARP--DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA-- 120

Query: 124 SPMAVEEVIEMQGLSLDPTSHRVMAGEEP 152
E L D + G
Sbjct: 121 -----EPKRRPSKLEDDSQDGMPLVGRSA 144


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0509     PF06580  34     0.001    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.1 bits (78), Expect = 0.001
Identities = 19/105 (18%), Positives = 33/105 (31%), Gaps = 26/105 (24%)

Query: 325 LVYNAVNH----TPEGTHITVRWQRVPHGAEFSVEDNGPGIAPEHIPRLTERFYRVDKAR 380
LV N + H P+G I ++ + VE+ G
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN---------------- 306

Query: 381 SRQTGGSGLGLAIVKHAVNH---HESRLNIESTVGKGTRFSFVIP 422
+G GL V+ + E+++ + GK +IP
Sbjct: 307 --TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM-VLIP 348


79  c0554  c0562  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c0554     1       22       0.333394   ATP-dependent Clp protease ATP-binding subunit
c0555     0       20       0.324589   ATP-dependent protease La
c0556     -1      14       0.450688   DNA-binding protein HU-beta
c0557     -2      13       0.429222   Peptidyl-prolyl cis-trans isomerase D
c0558     -2      19       -0.349747  Hypothetical protein ybaV precursor
c0559     -3      17       -0.911855  Hypothetical protein ybaW
c0560     -2      14       0.195134   Hypothetical protein ybaX
c0561     0       13       1.333137   Hypothetical protein ybaE
c0562     0       14       1.024780   Cof protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c0554     HTHFIS  29     0.043    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.043
Identities = 16/73 (21%), Positives = 29/73 (39%), Gaps = 13/73 (17%)

Query: 60 ERSALPTPHEIRNHLDDYVIGQEQAKKVLAVAVYNHYKRLRNGDTSNGVELGKSNILLIG 119
E P+ E + ++G+ A + +Y RL D +++ G
Sbjct: 121 EPKRRPSKLEDDSQDGMPLVGRSAAMQ----EIYRVLARLMQTD---------LTLMITG 167

Query: 120 PTGSGKTLLAETL 132
+G+GK L+A L
Sbjct: 168 ESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
c0555     GPOSANCHOR  34     0.002    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 34.3 bits (78), Expect = 0.002
Identities = 33/144 (22%), Positives = 67/144 (46%), Gaps = 12/144 (8%)

Query: 195 KQSVLEMSDVNERLEYLMAMMESEIDLLQVEKRIRNRVKKQMEKSQREYYLNEQMKAIQK 254
++ + L A QV R +++ ++ S+ +Q++A +
Sbjct: 277 TADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAK---KQLEAEHQ 333

Query: 255 ELGEMDDAPD-ENEALKRKIDAAKMPKEAKEKAEAELQKLKMMSPMS-AEATVVRGYIDW 312
+L E + + ++L+R +DA++ EAK++ EAE QKL+ + +S A +R +D
Sbjct: 334 KLEEQNKISEASRQSLRRDLDASR---EAKKQLEAEHQKLEEQNKISEASRQSLRRDLDA 390

Query: 313 MVQVPWNARSKVKKDLRQAQEILD 336
+ A+ +V+K L +A L
Sbjct: 391 SRE----AKKQVEKALEEANSKLA 410


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0556     DNABINDINGHU  117    3e-38    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 117 bits (294), Expect = 3e-38
Identities = 49/88 (55%), Positives = 67/88 (76%)

Query: 2 NKSQLIDKIAAGADISKAAAGRALDAIIASVTESLKEGDDVALVGFGTFAVKERAARTGR 61
NK LI K+A +++K + A+DA+ ++V+ L +G+ V L+GFG F V+ERAAR GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGKEITIAAAKVPSFRAGKALKDAV 89
NPQTG+EI I A+KVP+F+AGKALKDAV
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0559     PF08280  27     0.021    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 27.1 bits (60), Expect = 0.021
Identities = 24/138 (17%), Positives = 41/138 (29%), Gaps = 20/138 (14%)

Query: 1 MQTQIKVRGYHLDVYQHVNNARYL-------EFLEEARWHGLENSDSFHWMTAH------ 47
+Q I + Y N Y E++ + N FH +
Sbjct: 361 LQHFIPETNLFVSPYYKGNQKLYTSLKLIVEEWMAKLPGKRYLNHKHFHLFCHYVEQILR 420

Query: 48 ------NIAFVVVN-ININYRRPAVLSDLLTITSQLQQLNGKSGILSQVITLEPEGQVVA 100
+ FV N IN + + + + Q+ L+P+ +
Sbjct: 421 NIQPPLVVVFVASNFINAHLLTDSFPRYFSDKSIDFHSYYLLQDNVYQIPDLKPDLVITH 480

Query: 101 DALITFVCIDLKTQKALA 118
LI FV +L A+A
Sbjct: 481 SQLIPFVHHELTKGIAVA 498


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c0562     HTHFIS  29     0.019    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.019
Identities = 12/64 (18%), Positives = 24/64 (37%), Gaps = 10/64 (15%)

Query: 197 LTVLTQHLGLSLRDCMAFGDAMNDREMLGSVGSGFIMGN----------AMPQLRAELPH 246
TVL Q L + D +A + + ++ + +P+++ P
Sbjct: 16 RTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLLPRIKKARPD 75

Query: 247 LPVI 250
LPV+
Sbjct: 76 LPVL 79


80  c0575  c0584  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c0575     -1      12       -1.584729  Hypothetical protein ylaB
c0576     1       17       -0.396588  Hypothetical protein ylaC
c0577     1       16       -0.564209  Maltose O-acetyltransferase
c0578     1       18       -0.586789  Haemolysin expression modulating protein
c0579     1       16       0.094945   Hypothetical protein ybaJ
c0580     1       16       0.515455   Acriflavine resistance protein B
c0581     2       12       0.366598   Acriflavine resistance protein A precursor
c0582     2       15       -0.034723  Potential acrAB operon repressor
c0583     2       18       0.898931   Hypothetical protein
c0584     3       16       2.492824   Potassium efflux system kefA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0575     BCTERIALGSPF  29     0.034    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 29.4 bits (66), Expect = 0.034
Identities = 31/137 (22%), Positives = 54/137 (39%), Gaps = 24/137 (17%)

Query: 247 IWLPLGLVIGLLAAMFVLRILRRIQSPHHRLQDAIENRDICVHYQPIVSLANGKIVGAEA 306
W+ L L+ G +A +LR R+ + + P++ G+I
Sbjct: 228 PWMLLALLAGFMAFRVMLR------QEKRRVS-----FHRRLLHLPLI----GRIARGLN 272

Query: 307 LARWPQTDGSWLSPDSFIPLAQQTGLS-EPLTLLIIRSVFEDMGDWLRQHPQQHISINLE 365
AR+ +T + S +PL Q +S + ++ R D +R+ H + LE
Sbjct: 273 TARYARTLSILNA--SAVPLLQAMRISGDVMSNDYARHRLSLATDAVREGVSLHKA--LE 328

Query: 366 STVLTSEKIPQLLREMI 382
T L P ++R MI
Sbjct: 329 QTAL----FPPMMRHMI 341


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0580     ACRIFLAVINRP  1368   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1368 bits (3543), Expect = 0.0
Identities = 802/1033 (77%), Positives = 916/1033 (88%), Gaps = 1/1033 (0%)

Query: 1 MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISASYPGADAKTVQDT 60
M NFFI RPIFAWV+AII+M+AG LAIL+LPVAQYPTIAPPAV++SA+YPGADA+TVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQ 120
VTQVIEQNMNGIDNLMYMSS SDS G+V ITLTF+SGTD DIAQVQVQNKLQLA PLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVSVEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDAISRTSGVGDVQLFGS 180
EVQQQG+SVEKSSSS+LMV G ++ + TQ+DISDYVA+N+KD +SR +GVGDVQLFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWMNPNELNKFQLTPVDVITAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQTRL 240
QYAMRIW++ + LNK++LTPVDVI +K QN Q+AAGQLGGTP + GQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 TSTEEFGKILLKVNQDGSRVLLRDVAKIELGGENYDIIAEFNGQPASGLGIKLATGANAL 300
+ EEFGK+ L+VN DGS V L+DVA++ELGGENY++IA NG+PA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTAAAIRAELAKMEPFFPSGLKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFLQ 360
DTA AI+A+LA+++PFFP G+K++YPYDTTPFV++SIHEVVKTL EAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTFA+LAAFG+SINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 AEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGSTGAIYRQFSITIVSAMAL 480
E+ LPPKEAT KSM QIQGALVGIAMVLSAVF+PMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATMLKPIAKGDHGEGKKGFFGWFNRMFEKSTHHYTDSVGGILRSTGR 540
SVLVALILTPALCAT+LKP++ H E K GFFGWFN F+ S +HYT+SVG IL STGR
Sbjct: 481 SVLVALILTPALCATLLKPVSAE-HHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 541 YLVLYLIIVVGMAYLFVRLPSSFLPDEDQGVFMTMVQLPAGATQERTQKVLNEVTNYYLT 600
YL++Y +IV GM LF+RLPSSFLP+EDQGVF+TM+QLPAGATQERTQKVL++VT+YYL
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 601 KEKNNVESVFAVNGFGFAGRGQNTGIAFVSLKDWADRPGEENKVEAITMRATRAFSQIKD 660
EK NVESVF VNGF F+G+ QN G+AFVSLK W +R G+EN EA+ RA +I+D
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 661 AMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLLAEAAKHPDMLTSVRPNG 720
V FN+PAIVELGTATGFDFELIDQAGLGH+ LTQARNQLL AA+HP L SVRPNG
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 721 LEDTPQFKIDIDQEKAQALGVSINDINTTLGAAWGGSYVNDFIDRGRVKKVYVMSEAKYR 780
LEDT QFK+++DQEKAQALGVS++DIN T+ A GG+YVNDFIDRGRVKK+YV ++AK+R
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFR 779

Query: 781 MLPDDIGDWYVRAADGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILGQAAPGKSTGEA 840
MLP+D+ YVR+A+G+MVPFSAF++S W YGSPRLERYNGLPSMEI G+AAPG S+G+A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 841 MELMEQLASKLPTGVGYDWTGMSYQERLSGNQAPSLYAISLIVVFLCLAALYESWSIPFS 900
M LME LASKLP G+GYDWTGMSYQERLSGNQAP+L AIS +VVFLCLAALYESWSIP S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 901 VMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGL 960
VMLVVPLG++G LLAAT NDVYF VGLLTTIGLSAKNAILIVEFAKDLM+KEGKG+
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 961 IEATLDAVRMRLRPILMTSLAFILGVMPLVISTGAGSGAQNAVGTGVMGGMVTATVLAIF 1020
+EATL AVRMRLRPILMTSLAFILGV+PL IS GAGSGAQNAVG GVMGGMV+AT+LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1021 FVPVFFVVVRRRF 1033
FVPVFFVV+RR F
Sbjct: 1020 FVPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c0581     RTXTOXIND  44     6e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.4 bits (105), Expect = 6e-07
Identities = 33/212 (15%), Positives = 71/212 (33%), Gaps = 23/212 (10%)

Query: 112 TYQAAYDSAKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTA 171
+ Y A +L + + Q+ Q +++ ++ L +Q +
Sbjct: 256 EQENKYVEAVNELR--VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 172 AKAAVETARINLAYTKVTSPISGRIGKSNV-TEGALVQNGQATALATVQQLDPIYVDVTQ 230
+ + + +P+S ++ + V TEG +V + T + V + D + V
Sbjct: 314 LTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTALV 372

Query: 231 SSNDFLRLKQELA----------NGTLKQENGKAKVSLITSDGIKFPQDGTLEFSDVTVD 280
+ D + KV I D I+ + G + ++++
Sbjct: 373 QNKDIGFINVGQNAIIKVEAFPYTRYGYLV---GKVKNINLDAIEDQRLGLVFNVIISIE 429

Query: 281 QTTGSITLRAIFPNPDHTLLPGMFVRARLEEG 312
+ S + I L GM V A ++ G
Sbjct: 430 ENCLSTGNKNIP------LSSGMAVTAEIKTG 455



Score = 34.4 bits (79), Expect = 7e-04
Identities = 24/125 (19%), Positives = 43/125 (34%), Gaps = 13/125 (10%)

Query: 61 PLQITTELPGR-TSAYRIAEVRPQVSGIILKRNFKEGSDIEAGVSLYQIDPATYQAAYDS 119
++I G+ T + R E++P + I+ + KEG + G L ++ +A
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA---- 134

Query: 120 AKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTAAKAAVETA 179
D K Q++ A+L RYQ L E ++
Sbjct: 135 ---DTLKTQSSLLQARLEQTRYQILS-----RSIELNKLPELKLPDEPYFQNVSEEEVLR 186

Query: 180 RINLA 184
+L
Sbjct: 187 LTSLI 191


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0582     HTHTETR  222    5e-76    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 222 bits (567), Expect = 5e-76
Identities = 215/215 (100%), Positives = 215/215 (100%)

Query: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60
MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120
EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180
GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215
APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c0584     RTXTOXIND  32     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.017
Identities = 19/125 (15%), Positives = 40/125 (32%), Gaps = 6/125 (4%)

Query: 28 QNTAFARASSNGDLPTKADLQAQLDSLNKQKDLSAQDKLVQQDLTDTLATLDKIDRVKEE 87
N RA L + + L L+ + A L++ ++ E
Sbjct: 207 LNLDKKRAERLTVLARINRYENLSRVEKSR--LDDFSSLLHKQAIAKHAVLEQENKYVEA 264

Query: 88 TVQLRQKVAEAPEKMRQATAALTALSDVDND--EETRKIL--STLSLRQLETRVAQALDD 143
+LR ++ + + +A V E L +T ++ L +A+ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEER 324

Query: 144 LQNAQ 148
Q +
Sbjct: 325 QQASV 329


81  c0651  c0661  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c0651     -2      11       -0.011195  Hypothetical protein ybcY precursor
c0652     -2      10       0.961004   Protease VII precursor
c0653     0       10       1.771006   Hypothetical protein ybcH precursor
c0654     -1      11       1.600238   Bacteriophage N4 adsorption protein A precursor
c0655     0       14       1.266951   Bacteriophage N4 adsorption protein B
c0656     0       21       2.441577   Sensor kinase cusS
c0657     0       21       2.599379   Transcriptional Regulatory protein cusR
c0658     -1      20       1.645288   Probable outer membrane lipoprotein cusC
c0659     -1      20       1.777642   Hypothetical protein cusX precursor
c0660     -1      20       1.900089   Putative copper efflux system protein cusB
c0661     -2      19       1.302248   Putative cation efflux system protein cusA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
c0651     LUXSPROTEIN  31     0.002    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 31.4 bits (71), Expect = 0.002
Identities = 18/66 (27%), Positives = 30/66 (45%), Gaps = 7/66 (10%)

Query: 40 TKEHLLPHFL-EHLGNNHLDI------GVGTGFYLTHVPESSLISLMDLNEASLNAASTR 92
T EHL F+ HL + ++I G TGFY++ + S + D A++
Sbjct: 54 TLEHLYAGFMRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKV 113

Query: 93 AGESKI 98
++KI
Sbjct: 114 ENQNKI 119


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c0652     OMPTIN  469    e-171    Omptin serine protease signature.
		>OMPTIN#Omptin serine protease signature.

Length = 317

Score = 469 bits (1207), Expect = e-171
Identities = 272/276 (98%), Positives = 275/276 (99%)

Query: 1 MRAKLLGIVLTTPIAISSFASTETLSFTPDNINADISLGTLSGKTKERVYLAEEGGRKVS 60
MRAKLLGIVLTTPIAISSFASTETLSFTPDNINADISLGTLSGKTKERVYLAEEGGRKVS
Sbjct: 1 MRAKLLGIVLTTPIAISSFASTETLSFTPDNINADISLGTLSGKTKERVYLAEEGGRKVS 60

Query: 61 QLDWKFNNAAIIKGAINWDLMPQISIGAAGWTTLGSRGGNMVDRDWMDSSNPGTWTDESR 120
QLDWKFNNAAIIKGAINWDLMPQISIGAAGWTTLGSRGGNMVD+DWMDSSNPGTWTDESR
Sbjct: 61 QLDWKFNNAAIIKGAINWDLMPQISIGAAGWTTLGSRGGNMVDQDWMDSSNPGTWTDESR 120

Query: 121 HPDTQLNYANEFDLNIKGWLLNEPNYRLGLMAGYQESRYSFTARGGSYIYSSEEGFRDDI 180
HPDTQLNYANEFDLNIKGWLLNEPNYRLGLMAGYQESRYSFTARGGSYIYSSEEGFRDDI
Sbjct: 121 HPDTQLNYANEFDLNIKGWLLNEPNYRLGLMAGYQESRYSFTARGGSYIYSSEEGFRDDI 180

Query: 181 GSFPNGERAIGYKQRFKMPYIGLTGSYRYEDFELGGTFKYSGWVEAFDNDEHYDPGKRIT 240
GSFPNGERAIGYKQRFKMPYIGLTGSYRYEDFELGGTFKYSGWVE+ DNDEHYDPGKRIT
Sbjct: 181 GSFPNGERAIGYKQRFKMPYIGLTGSYRYEDFELGGTFKYSGWVESSDNDEHYDPGKRIT 240

Query: 241 YRSKVKDQNYYSVAVNAGYYVTPNAKVYIEGAWNRV 276
YRSKVKDQNYYSVAVNAGYYVTPNAKVY+EGAWNRV
Sbjct: 241 YRSKVKDQNYYSVAVNAGYYVTPNAKVYVEGAWNRV 276


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c0656     PF06580  31     0.007    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.4 bits (71), Expect = 0.007
Identities = 29/184 (15%), Positives = 67/184 (36%), Gaps = 34/184 (18%)

Query: 306 EELTRMAKMVSDML-FLAQADNNQLIPEKKMLNLADEVGKVFDFFEALAEDR-GVELRFV 363
+ M +S+++ + + N + + LADE+ V + + LA + L+F
Sbjct: 191 TKAREMLTSLSELMRYSLRYSNARQVS------LADELTVVDSYLQ-LASIQFEDRLQFE 243

Query: 364 GDECQVAGDPLMLRRALSNLLSNALRY----TPTGETIVVRCQTVDHLVQVTVENPGTPI 419
D + + L+ N +++ P G I+++ + V + VEN G+
Sbjct: 244 NQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLA 303

Query: 420 APEHLPRLFDRFYRVDPSRQRKGEGSGIGLAIVK---SIVVAHKGTVAVTSDVRGTRFVI 476
E +G GL V+ ++ + + ++ ++
Sbjct: 304 LKN------------------TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 477 ILPA 480
++P
Sbjct: 346 LIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c0657     HTHFIS  86     2e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 85.7 bits (212), Expect = 2e-21
Identities = 35/117 (29%), Positives = 62/117 (52%)

Query: 2 KLLIVEDEKKTGEYLTKGLTEAGFVVDLADNGLNGYHLAMTGDYDLIILDIMLPDVNGWD 61
+L+ +D+ L + L+ AG+ V + N + GD DL++ D+++PD N +D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 IVRMLRSANKGMPILLLTALGTIEHRVKGLELGADDYLVKPFAFAELLARVRTLLRR 118
++ ++ A +P+L+++A T +K E GA DYL KPF EL+ + L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c0658     RTXTOXIND  39     3e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.0 bits (91), Expect = 3e-05
Identities = 25/189 (13%), Positives = 60/189 (31%), Gaps = 13/189 (6%)

Query: 254 QAQTVNSDSLQSVKLPA-GLPSQILLQRPDIMEAEHALM-----AANANIGAARAAFFPS 307
+ +S + +K + +I+++ + + L+ A A+ ++
Sbjct: 87 NGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQS----- 141

Query: 308 ISLTSGISTASSDLSSLFNASSGMWNFIPKIEIPIFNAGRNQANLDIAEIRQQQSVVNYE 367
SL + + + + P F + L + + ++Q
Sbjct: 142 -SLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQN 200

Query: 368 QKIQNAFKEVADALALRQSLNDQISAQQRYLASLQITLQRARTLYQHGAVSYLEVLDAER 427
QK Q + A R ++ +I+ + + L +L A++ VL+ E
Sbjct: 201 QKYQ-KELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQEN 259

Query: 428 SLFATRQTL 436
L
Sbjct: 260 KYVEAVNEL 268


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c0661     ACRIFLAVINRP  599    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 599 bits (1546), Expect = 0.0
Identities = 198/1048 (18%), Positives = 415/1048 (39%), Gaps = 90/1048 (8%)

Query: 1 MIEWIIRRSVANRFLVLMGALFLSIWGTWTIINTPVDALPDLSDVQVIIKTSYPGQAPQI 60
M + IRR + A+ L + G I+ PV P ++ V + +YPG Q
Sbjct: 1 MANFFIRR----PIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQT 56

Query: 61 VENQVTYPLTTTMLSVPGAKTVRGFSQ-FGDSYVYVIFEDGTDPYWARSRVLEYLNQVQG 119
V++ VT + M + + S G + + F+ GTDP A+ +V L
Sbjct: 57 VQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATP 116

Query: 120 KLPAGVSAELGP-DATGVGWIYEYALVDRSGKHDLADLRSLQDWFLKYELKTIPDVAEVA 178
LP V + + + ++ V + D+ +K L + V +V
Sbjct: 117 LLPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQ 176

Query: 179 SVGGVVKEYQVVIDPQRLAQYGISLAEVKSALDASNQEAGGSSIELA------EAEYMVR 232
G ++ +D L +Y ++ +V + L N + + + +
Sbjct: 177 LFGAQ-YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASII 235

Query: 233 ASGYLQTLDDFNHIVLKASENGVPVYLRDVAKVQVGPEMRRGIAELNGEGEVAGGVVILR 292
A + ++F + L+ + +G V L+DVA+V++G E IA +NG+ AG + L
Sbjct: 236 AQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGK-PAAGLGIKLA 294

Query: 293 SGKNAREVIAAVKDKLETLKSSLPEGVEIVTTYDRSQLIDRAIDNLSGKLLEEFIVVAVV 352
+G NA + A+K KL L+ P+G++++ YD + + +I + L E ++V +V
Sbjct: 295 TGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLV 354

Query: 353 CALFLWHVRSALVAIISLPLGLCIAFIVMHFQGLNANIMSLGGIAIAVGAMVDAAIVMIE 412
LFL ++R+ L+ I++P+ L F ++ G + N +++ G+ +A+G +VD AIV++E
Sbjct: 355 MYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVE 414

Query: 413 NAHKRLEEWQHQHPDATLDNKTRWQVITDASVEVGPALFISL------------------ 454
N + + E + +AT + ++ Q V A+FI +
Sbjct: 415 NVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITI 474

Query: 455 -------LIITLSFIPI-----------------------------FTLEGQEGRLFGPL 478
+++ L P ++ + L
Sbjct: 475 VSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKIL 534

Query: 479 AFTKTYAMAGA---ALLAIVVIPILMGYWIRGKEGDLLYMPSTLPGISAAEAASMLQKTD 535
T Y + A A + ++ + + + +G L M G + +L +
Sbjct: 535 GSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVT 594

Query: 536 KLIM--SVPEVARVFGKTGKAETATDSAPLEMVETTIQLKPQDQW-RPGMTMDKIIEELD 592
+ V VF G + + + LKP ++ + + +I
Sbjct: 595 DYYLKNEKANVESVFTVNGFSFSGQAQN---AGMAFVSLKPWEERNGDENSAEAVIHRAK 651

Query: 593 NTVRLPGLANLWVPPIRNRIDMLSTGIKSPIGIKVSGTVLADI-DTMAEQIEEVARTVPG 651
+ + + +++ + I +G + + + A+
Sbjct: 652 MELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPAS 711

Query: 652 VASALAERLEGGRYINVEINREKAARYGMTVADVQLFVTSAVGGAMVGETVEGIARYPIN 711
+ S LE +E+++EKA G++++D+ +++A+GG V + ++ +
Sbjct: 712 LVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLY 771

Query: 712 LRYPQSWRDSPQALRQLPILTPMKQQITLADVADVKVSTGPSMLKTENARPTSWIYIDAR 771
++ +R P+ + +L + + + + + G L+ N P+ I +A
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAA 831

Query: 772 DRDMVSVVHDLQKAIAEKVQLKPGTSVAFSGQFELLERANHKLKLMVPMTLMIIFVLLYL 831
L + +A K L G ++G + ++ +V ++ +++F+ L
Sbjct: 832 PGTSSGDAMALMENLASK--LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAA 889

Query: 832 AFRRVGEALLIISSVPFALVGGIWLLWWMGFHLSVATGTGFIALAGVAAEFGVVMLMYLR 891
+ + ++ VP +VG + V G + G++A+ ++++ + +
Sbjct: 890 LYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAK 949

Query: 892 HAIEAEPSLNNPQTFSEQKLDEALYHGAVLRVRPKAMTVAVIIAGLLPILWGTGAGSEVM 951
+E E + + EA +R+RP MT I G+LP+ GAGS
Sbjct: 950 DLMEKE----------GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQ 999

Query: 952 SRIAAPMIGGMITAPLLSLFIIPAAYKL 979
+ + ++GGM++A LL++F +P + +
Sbjct: 1000 NAVGIGVMGGMVSATLLAIFFVPVFFVV 1027


82  c0678  c0683  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c0678     -2      18       4.887816   Hypothetical membrane protein P43
c0679     -2      19       4.732561   Ferrienterobactin-binding periplasmic protein
c0680     -2      23       5.197138   Isochorismate synthase entC
c0681     -2      25       5.251052   Enterobactin synthetase component E
c0682     -1      22       5.059982   Isochorismatase
c0683     -1      20       4.936808   2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c0678     TCRTETA        36     2e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.3 bits (84), Expect = 2e-04
Identities = 82/394 (20%), Positives = 146/394 (37%), Gaps = 38/394 (9%)

Query: 27 FISIVSLGLLGVAVPVQIQMMTHSTWQV---GLSVTLTGGAMFVGLMVGGVLADRYERKK 83
+ V +GL+ +P ++ + HS G+ + L F V G L+DR+ R+
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 84 VILLARGTCGIGFIGLCLNALL--PEPSLLAIYLLGLWDGFFASLGVTALLAATPALVGR 141
V+L + G ++ + P L +Y+ + G + G A A +
Sbjct: 75 VLL-------VSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVA-GAYIADITDG 126

Query: 142 ENLMQAGAITMLTVRLGSVISPMIGGLLLATGGVAWNYGLAAAGTFITLLPLLSLPALPP 201
+ + G V P++GGL+ GG + + AA L L LP
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLM---GGFSPHAPFFAAAALNGLNFLTGCFLLPE 183

Query: 202 PPQPREHPLK----SLLAGFRFLLASPLVGGIALLGGLLTMAS----AVRVLYPALADNW 253
+ PL+ + LA FR+ +V + + ++ + A+ V++ D +
Sbjct: 184 SHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG--EDRF 241

Query: 254 QMSAAQIGFLYAAIP-LGAAIGALTSGKLAHSVRPGLLMLLSTLG---AFLAIGLFGLMP 309
A IG AA L + A+ +G +A + ++L + ++ +
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 310 MWILGVVCLALFGWLSAVSSLLQYTMLQTQTPEAMLGRINGLWTAQNVTGDAIGAALLGG 369
M +V LA G ML Q E G++ G A +G L
Sbjct: 302 MAFPIMVLLASGGIGMPALQ----AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTA 357

Query: 370 LGAMMTPVASASASGFGLLIIGVLLLLVLVELRR 403
+ A + + +G+ + L LL L LRR
Sbjct: 358 IYA----ASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c0679     FERRIBNDNGPP   63     2e-13    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 62.7 bits (152), Expect = 2e-13
Identities = 60/280 (21%), Positives = 100/280 (35%), Gaps = 35/280 (12%)

Query: 40 HTLESQPQRIVSTSVTLTGSLLAIDAPVIASGATTPNNRVADDQGFLRQWSKVAKERKLQ 99
H P RIV+ LLA+ VAD + R W E L
Sbjct: 29 HAAAIDPNRIVALEWLPVELLLALGIVPYG---------VADTINY-RLW---VSEPPLP 75

Query: 100 RLYIG-----EPSAEAVAAQMPDLILISATGGDSALALYDQLSTIAPTLIINYDDKS--- 151
I EP+ E + P ++ SA G S + L+ IAP N+ D
Sbjct: 76 DSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPL 131

Query: 152 --WQSLLTQLGEITGHEKQAAERIAQFDKQLAAAKEQIKLPPQPVTAIVYTAAAHSANLW 209
+ LT++ ++ + A +AQ++ + + K + + ++
Sbjct: 132 AMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVF 191

Query: 210 TPESAQGQMLEQLGFTLAKLPAGLNASQSQGKRHDIIQLGGENLAAGLNGESLFLFAGDQ 269
P S ++L++ G NA Q + + + LAA + + L +
Sbjct: 192 GPNSLFQEILDEYGIP--------NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNS 243

Query: 270 KDADAIYANPLLAHLPAVQNKQVYALGTETFRLDYYSAMQ 309
KD DA+ A PL +P V+ + + F SAM
Sbjct: 244 KDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMH 283


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c0682     ISCHRISMTASE   443    e-161    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 443 bits (1141), Expect = e-161
Identities = 146/299 (48%), Positives = 195/299 (65%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQAYALPESHDIPQNKVDWAFEPQRAALLIHDMQDYFVSFWGENCPMMEQVIANI 60
MAIP +Q Y +P + D+PQNKV W +P RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRDYCKQHNIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVDRLTPDADDTV 120
L++ C Q IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P+ DD V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKESGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M+++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSRDEHLMSLKYVAGRSGRVVMTEELL------PAPIPASKAE-----------LREVIL 223
FS ++H M+L+Y AGR VMT+ LL PA + + A +R+ I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDESDEPFDDD-NLIDYGLDSVRMMALAARWRKVHGDIDFVMLAKNPTIDAWWKLLS 281
LL E+ E D +L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W KLL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c0683     DHBDHDRGNASE   362    e-130    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 362 bits (930), Expect = e-130
Identities = 110/258 (42%), Positives = 150/258 (58%), Gaps = 20/258 (7%)

Query: 5 GKNVWVTGAGKGIGYATAMAFVEAGAKVTGFD---------------QAFTQEQYPFATE 49
GK ++TGA +GIG A A GA + D +A E +P
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP---- 63

Query: 50 VMDVADAAQVAQVCQRLLAETERLDVLVNAAGILRMGATDQLSKEDWQQTFAVNVGGAFN 109
DV D+A + ++ R+ E +D+LVN AG+LR G LS E+W+ TF+VN G FN
Sbjct: 64 -ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 110 LFQQTMNQFRHQRGGAIVTVASDAAHTPRIGMSAYGASKAALKSLALSVGLELAGSGVRC 169
+ +R G+IVTV S+ A PR M+AY +SKAA +GLELA +RC
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 170 NVVSPGSTDTDMQRTLWVSDDAEEQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDL 229
N+VSPGST+TDMQ +LW ++ EQ I+G E FK GIPL K+A+P +IA+ +LFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 230 ASHITLQDIVVDGGSTLG 247
A HIT+ ++ VDGG+TLG
Sbjct: 243 AGHITMHNLCVDGGATLG 260


83  c0875  c0880  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c0875     -1      21       3.840107   Hypothetical protein ybhR
c0876     -1      21       4.243760   Hypothetical protein ybhS
c0877     -1      17       3.912646   Hypothetical ABC transporter ATP-binding protein
c0878     -1      15       3.404899   Hypothetical membrane protein ybhG
c0879     0       13       2.971970   Hypothetical transcriptional regulator ybiH
c0880     0       14       3.312630   Putative ATP-dependent RNA helicase rhlE
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c0875     ABC2TRNSPORT   46     9e-08    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 46.1 bits (109), Expect = 9e-08
Identities = 35/146 (23%), Positives = 63/146 (43%), Gaps = 5/146 (3%)

Query: 197 AREREQGTLDQLLVSPLTTWQIFIGKAVPALIVATFQATIVLAIGIWAYQIPFAGSLALF 256
R Q T + +L + L I +G+ A A IG+ A + + L+L
Sbjct: 92 GRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGA---GIGVVAAALGYTQWLSLL 148

Query: 257 YFTMVV--YGLSLVGFGLLISSLCSTQQQAFIGVFVFMMPAILLSGYVSPVENMPMWLQN 314
Y V+ GL+ G+++++L + + + P + LSG V PV+ +P+ Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 315 LTWINPIRHFTDITKQIYLKDASLDI 340
P+ H D+ + I L +D+
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDV 234


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c0877     PF05272        31     0.012    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.012
Identities = 20/90 (22%), Positives = 28/90 (31%), Gaps = 21/90 (23%)

Query: 298 TPRFEDAFIDLLGGAGTSESPLGAILHTVGGTPGETVIEAKELTKKFGDFAATDHVNFAV 357
PR E + +LG P + + + K HV +
Sbjct: 547 VPRLEKWLVHVLGKTPDDYKP-------------RRLRYLQLVGKYI----LMGHVARVM 589

Query: 358 KRGEIFG----LLGPNGAGKSTTFKMMCGL 383
+ G F L G G GKST + GL
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGL 619



Score = 29.3 bits (65), Expect = 0.047
Identities = 11/23 (47%), Positives = 13/23 (56%)

Query: 39 YVTGLVGPDGAGKTTLMRMLAGL 61
Y L G G GK+TL+ L GL
Sbjct: 597 YSVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c0878     RTXTOXIND      62     6e-13    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 62.2 bits (151), Expect = 6e-13
Identities = 42/259 (16%), Positives = 92/259 (35%), Gaps = 25/259 (9%)

Query: 83 ALMQAKAGVSVAQAQYDLMLAGYRDEEIAQAAAAVKQAQAAYDYAQNFYNRQQGLWKSRT 142
Q + + +A+ +LA E + + + + L +
Sbjct: 201 QKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENK 260

Query: 143 ISA--NDLENARSSRDQAQATLKSAQDKLRQYRSGNREQ---DIAQAKASLEQAQAQLAQ 197
N+L +S +Q ++ + SA+++ + + + + Q ++ +LA+
Sbjct: 261 YVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAK 320

Query: 198 AELNLQDSTLVAPSDGTLLTRAV-EPGTVLNEGGTVFTVSLT-RPVWVRAYVDERNLDQA 255
E Q S + AP + V G V+ T+ + + V A V +++
Sbjct: 321 NEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFI 380

Query: 256 QPGRKVLLYTDGRPNKPYH---GQIGFVSPTAEFTPKTVETPDLRTDLVYRLRIVVT--- 309
G+ ++ + P Y G++ ++ A D R LV+ + I +
Sbjct: 381 NVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA--------IEDQRLGLVFNVIISIEENC 432

Query: 310 ----DADDALRQGMPVTVQ 324
+ + L GM VT +
Sbjct: 433 LSTGNKNIPLSSGMAVTAE 451


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c0879     HTHTETR        73     7e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 72.7 bits (178), Expect = 7e-18
Identities = 33/214 (15%), Positives = 78/214 (36%), Gaps = 17/214 (7%)

Query: 13 KGEQAKKQLIAAALAQFGEYGMNATT-REIAAQAGQNIAAITYYFGSKEDLYLACAQWIA 71
+ ++ ++ ++ AL F + G+++T+ EIA AG AI ++F K DL+ +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 72 DFIGEQFRPHAEEAERLFAQPQPDRAAIRELILRACKNMIKLLTQDDTVNLSKFISREQL 131
IGE E + P + +RE+++ ++ + + + + F E +
Sbjct: 68 SNIGELEL---EYQAKFPGDP---LSVLREILIHVLESTVTEERRRLLMEII-FHKCEFV 120

Query: 132 SPTAAYHLVHEQVISPLHSHLTRLIAAWTGCDANDTRMILHTHALIGEILAFRLGKETIL 191
A + + + + + +A L T + + G
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKH--CIEAKMLPADLMTRRAAIIMRGYISG----- 173

Query: 192 LRTGWTAFDEEKTELINQTVTCHIDLILQGLSQR 225
L W + + + ++ ++L+
Sbjct: 174 LMENWLFAPQSFD--LKKEARDYVAILLEMYLLC 205


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c0880     SECA           30     0.026    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.026
Identities = 20/67 (29%), Positives = 34/67 (50%), Gaps = 4/67 (5%)

Query: 246 QQVLVFTRTKHGANHLAEQLNKDGIRSAAIHG-NKSQGARTRALADFKSGDIRVLVATDI 304
Q VLV T + + ++ +L K GI+ ++ + A A A + + V +AT++
Sbjct: 450 QPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQAGYPAA---VTIATNM 506

Query: 305 AARGLDI 311
A RG DI
Sbjct: 507 AGRGTDI 513


84  c0924  c0931  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c0924     0       15       -0.772116  Penicillin-binding protein 6 precursor
c0925     1       15       -0.217480  Deoxyribose operon repressor
c0926     0       14       -0.108201  Hypothetical protein ybjG
c0927     -1      13       -0.930461  Multidrug translocase mdfA
c0928     -2      19       -4.612827  Hypothetical protein ybjH precursor
c0929     -1      23       -6.353761  Protein ybjI
c0930     -1      27       -6.637048  Hypothetical protein ybjJ
c0931     1       32       -8.601536  Hypothetical protein ybjK
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c0924     BLACTAMASEA    43     8e-07    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 43.2 bits (102), Expect = 8e-07
Identities = 41/201 (20%), Positives = 64/201 (31%), Gaps = 34/201 (16%)

Query: 23 AFLFLFAPTAFAAEQTVEAPSVDARAW----------ILMDYASGKVLAEGNADEKLDPA 72
+ L A A P + I MD ASG+ L ADE+
Sbjct: 7 CIISLLATLPLAV-HASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADERFPMM 65

Query: 73 SLTKIMTSYVVGQALKADKIKLTDMVTVGKDAWATGNPALRGSSVMFLKPGDQVSVADLN 132
S K++ V + A +L + + +P V D ++V +L
Sbjct: 66 STFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSP------VSEKHLADGMTVGELC 119

Query: 133 KGVIIQSGNDACIALADYVAGSQESFIGLMNGYAKKLGLTNTT---FQTVHGLDAPGQF- 188
I S N A L V G + + +++G T ++T PG
Sbjct: 120 AAAITMSDNSAANLLLATVGGPAG-----LTAFLRQIGDNVTRLDRWETELNEALPGDAR 174

Query: 189 --STARDMA------LLGKAL 201
+T MA L + L
Sbjct: 175 DTTTPASMAATLRKLLTSQRL 195


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c0927     TCRTETA        41     6e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 41.0 bits (96), Expect = 6e-06
Identities = 60/269 (22%), Positives = 109/269 (40%), Gaps = 23/269 (8%)

Query: 81 LLGPLSDRIGRRPVMLAGVVWFIVTCLAILLAQNIEQFTLLRFLQGISLCFIGAVGYAAI 140
+LG LSDR GRRPV+L + V + A + + R + GI+ GAV A I
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGA-TGAVAGAYI 120

Query: 141 QESFEEAVCIKITALMANVALIAPLLGPLVG---AAWIHVLPWEGMFVLFAALAAISFFG 197
+ + + M+ + GP++G + P F AAL ++F
Sbjct: 121 ADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAP----FFAAAALNGLNFLT 176

Query: 198 LQRAMPETATRIGEKLSLKELGRDYKLVLKNG-RFVAGALALGFVSLPLLAWIAQSP--I 254
+PE+ L + L G VA +A+ F ++ + Q P +
Sbjct: 177 GCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFF----IMQLVGQVPAAL 232

Query: 255 IIITGEQLSSYEYGLLQVPI--FGAL--IAGNLLLARLTSRRTVRSLIIMGGWPIMIGLL 310
+I GE ++ + + + FG L +A ++ + +R R +++G G +
Sbjct: 233 WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYI 292

Query: 311 VAAAATVISSHAYLWMTAGLSIYAFGIGL 339
+ A AT ++ + + + GIG+
Sbjct: 293 LLAFAT----RGWMAFPIMVLLASGGIGM 317


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c0930     TCRTETB        34     0.001    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 33.7 bits (77), Expect = 0.001
Identities = 34/150 (22%), Positives = 65/150 (43%), Gaps = 6/150 (4%)

Query: 239 LLIGVVVLAMAFAEGSANDWL-PLLMVDGHGFSP-TSGSLIYAGFTLGMTVGRFIGGWFI 296
+IGV+ + F + + P +M D H S GS+I T+ + + +IGG +
Sbjct: 258 FMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILV 317

Query: 297 DHYSRVAVVR-ASALM--GALGIGLIIFVDSAWVA-GVSVVLWGLGASLGFPLTISAASD 352
D + V+ + L ++ S ++ + VL GL + TI ++S
Sbjct: 318 DRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSL 377

Query: 353 TGPDAPTRVSVVATTGYLAFLVGPPLLGYL 382
+A +S++ T +L+ G ++G L
Sbjct: 378 KQQEAGAGMSLLNFTSFLSEGTGIAIVGGL 407


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c0931     HTHTETR        52     1e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 52.3 bits (125), Expect = 1e-10
Identities = 14/81 (17%), Positives = 31/81 (38%)

Query: 12 RRANDPQRREKIIQATLEAVKLYGIHAVTHRKIAALAGVPLGSMTYYFSGIDELLLEAFS 71
+ + R+ I+ L G+ + + +IA AGV G++ ++F +L E +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 72 RFTEIMSRQYQAFFSDVSDAP 92
+ + + P
Sbjct: 65 LSESNIGELELEYQAKFPGDP 85


85  c0997  c1002  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c0997     0       12       2.841666   Arginine transport ATP-binding protein artP
c0998     0       12       3.781804   Putative lipoprotein ybjP precursor
c0999     0       13       3.480524   Hypothetical protein ybjQ
c1000     -1      12       3.590252   Probable N-acetylmuramoyl-L-alanine amidase
c1001     -3      15       3.335461   Hypothetical protein ybjS
c1002     -3      13       2.805396   Hypothetical protein ybjT
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c0997     PF05272        30     0.009    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.009
Identities = 9/18 (50%), Positives = 12/18 (66%)

Query: 64 LVLLGPSGAGKSSLLRVL 81
+VL G G GKS+L+ L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c1000     ECOLIPORIN     29     0.023    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 28.7 bits (64), Expect = 0.023
Identities = 20/54 (37%), Positives = 26/54 (48%), Gaps = 9/54 (16%)

Query: 2 RRVFWLVAAALLLAGCTGEKGIVEKEGYQLDTRRQAQAAYPRIKVLVIHYTADD 55
R+V LV ALL AG I K+G +LD Y ++ L HY +DD
Sbjct: 3 RKVLALVIPALLAAGAAHAAEIYNKDGNKLDL-------YGKVDGL--HYFSDD 47


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c1001     NUCEPIMERASE   75     2e-17    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 75.2 bits (185), Expect = 2e-17
Identities = 70/363 (19%), Positives = 123/363 (33%), Gaps = 65/363 (17%)

Query: 22 MKVLVTGATSGLGRNAVEFLCQKGISVRA---------TGRNEAMGKLLEKMGAEFVPAD 72
MK LVTGA +G + + L + G V +A +LL + G +F D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 73 LTELVSSQAKVMLAGIDTLWHCS-------SFTSPWGTQQAFDLANVRATRRLGEWAVAW 125
L + + ++ S +P A+ +N+ + E
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPH----AYADSNLTGFLNILEGCRHN 116

Query: 126 GVRNFIHISSPSLYFDYHHHRDIKEDFRPHRFANEFARSKAASEEVINMLSQANPQTRFT 185
+++ ++ SS S+Y + D + +A +K A+E + + S T
Sbjct: 117 KIQHLLYASSSSVYGL-NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSH-LYGLPAT 174

Query: 186 ILRPQSLFGPHDK--VFIPRLAHMMHHYGSILLPHGGSALVDMTYYENAVHAMWLASQEA 243
LR +++GP + + + + M SI + + G D TY ++ A+
Sbjct: 175 GLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVI 234

Query: 244 CDKLPS--------------GRVYNITNGEHRTLRSIVQKLIDELNIDCRIRSVPYPMLD 289
RVYNI N L +Q L D L I+ + +P D
Sbjct: 235 PHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGD 294

Query: 290 MIARSMERLGRKSAKEPPLTHYGVSKLNFDFTLDITRAQEELGYQPVLTLDEGIEKTAAW 349
+ T D E +G+ P T+ +G++ W
Sbjct: 295 V----------------LETS-----------ADTKALYEVIGFTPETTVKDGVKNFVNW 327

Query: 350 LRD 352
RD
Sbjct: 328 YRD 330


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c1002     NUCEPIMERASE   56     1e-10    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 55.9 bits (135), Expect = 1e-10
Identities = 29/125 (23%), Positives = 52/125 (41%), Gaps = 17/125 (13%)

Query: 29 RILVLGASGYIGQHLVRTLSQQGHQILA---------AARHVDRLAKLQLANVSCHKVDL 79
+ LV GA+G+IG H+ + L + GHQ++ + RL L HK+DL
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 80 SWPDNLPALLQD--IDTVYFLVH------SMGEGGDFIAQERQVALNVRDALREVPVKQL 131
+ + + L + V+ H S+ + LN+ + R ++ L
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 132 IFLSS 136
++ SS
Sbjct: 122 LYASS 126


86  c1345  c1353  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c1345     0       14       2.624364   Flagellar hook protein flgE
c1346     -1      13       2.564448   Flagellar basal-body rod protein flgF
c1347     -2      10       1.432361   Flagellar basal-body rod protein flgG
c1348     -1      14       2.432373   Flagellar L-ring protein precursor
c1349     0       14       2.219221   Flagellar P-ring protein precursor
c1350     1       14       1.892773   Peptidoglycan hydrolase flgJ
c1351     1       14       1.446168   Flagellar hook-associated protein 1
c1352     2       15       1.421240   Flagellar hook-associated protein 3
c1353     3       16       1.440887   Ribonuclease E
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c1345     FLGHOOKAP1     41     4e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.5 bits (97), Expect = 4e-06
Identities = 17/49 (34%), Positives = 29/49 (59%)

Query: 353 TLTNGALEASNVDLSKELVNMIVAQRNYQSNAQTIKTQDQILNTLVNLR 401
L+N S V+L +E N+ Q+ Y +NAQ ++T + I + L+N+R
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 37.2 bits (86), Expect = 1e-04
Identities = 22/56 (39%), Positives = 30/56 (53%), Gaps = 4/56 (7%)

Query: 6 AVSGLNAAATNLDVIGNNIANSATYGFKSGTASFAD----MFAGSKVGLGVKVAGI 57
A+SGLNAA L+ NNI++ G+ T A + AG VG GV V+G+
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c1347     FLGHOOKAP1     44     4e-07    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 4e-07
Identities = 18/81 (22%), Positives = 36/81 (44%), Gaps = 14/81 (17%)

Query: 3 SSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQTTL 62
S + A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTL 47

Query: 63 PSGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 48 GAGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 21/41 (51%)

Query: 220 ETSNVNVAEELVNMIQVQRAYEINSKAVSTTDQMLQKLTQL 260
S VN+ EE N+ + Q+ Y N++ + T + + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c1348     FLGLRINGFLGH   350    e-126    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 350 bits (898), Expect = e-126
Identities = 231/232 (99%), Positives = 232/232 (100%)

Query: 6 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 65
MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY
Sbjct: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60

Query: 66 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGVFGNA 125
GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQG+FGNA
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120

Query: 126 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 185
RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF
Sbjct: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180

Query: 186 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 237
SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM
Sbjct: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c1349     FLGPRINGFLGI   425    e-151    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 425 bits (1094), Expect = e-151
Identities = 156/363 (42%), Positives = 212/363 (58%), Gaps = 9/363 (2%)

Query: 5 FLSALILLLVTTAVQAERIRDLTSVQGVRQNSLIGYGLVVGLDGTGDQTTQTPFTTQTLN 64
F + L RI+D+ S+Q R N LIGYGLVVGL GTGD +PFT Q++
Sbjct: 13 FSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMR 72

Query: 65 NMLSQLGITVPTGTNMQLKNVAAVMVTASLPPFGRQGQTIDVVVSSMGNAKSLRGGTLLM 124
ML LGIT G + KN+AAVMVTA+LPPF G +DV VSS+G+A SLRGG L+M
Sbjct: 73 AMLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIM 131

Query: 125 TPLKGVDSQVYALAQGNILVGGAGASAGGSSVQVNQLNGGRITNGAVIERELPSQFGVGN 184
T L G D Q+YA+AQG ++V G A +++ R+ NGA+IERELPS+F
Sbjct: 132 TSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSV 191

Query: 185 TLNLQLNDEDFSMAQQIADTINRVR----GYGSATALDARTIQVRVPSGNSSQVRFLADI 240
L LQL + DFS A ++AD +N G A D++ I V+ P + R +A+I
Sbjct: 192 NLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRV-ADLTRLMAEI 250

Query: 241 QNMQVNVTPQDAKVVINSRTGSVVMNREVTLDSCAVAQGNLSVTVNRQANVSQPDTPFGG 300
+N+ V T AKVVIN RTG++V+ +V + AV+ G L+V V V QP PF
Sbjct: 251 ENLTVE-TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSR 308

Query: 301 GQTVVTPQTQIDLRQSGGSLQSVRSSASLNNVVRALNALGATPMDLMSILQSMQSAGCLR 360
GQT V PQT I Q G + ++ L +V LN++G +++ILQ ++SAG L+
Sbjct: 309 GQTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQ 367

Query: 361 AKL 363
A+L
Sbjct: 368 AEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c1350     FLGFLGJ        510    0.0      Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 510 bits (1315), Expect = 0.0
Identities = 311/313 (99%), Positives = 312/313 (99%)

Query: 11 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 70
MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG
Sbjct: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60

Query: 71 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEEPTPAAPMKFPLET 130
LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEE TPAAPMKFPLET
Sbjct: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120

Query: 131 VVRYQNQALSQLVQKAVPRNYDDSLPGNSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 190
VVRYQNQALSQLVQKAVPRNYDDSLPG+SKAFLAQLSLPAQLASQQSGVPHHLILAQAAL
Sbjct: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180

Query: 191 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 250
ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL
Sbjct: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240

Query: 251 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 310
EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK
Sbjct: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300

Query: 311 VSKTYSMNIDNLF 323
VSKTYSMNIDNLF
Sbjct: 301 VSKTYSMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c1351     FLGHOOKAP1     683    0.0      Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 683 bits (1763), Expect = 0.0
Identities = 544/546 (99%), Positives = 546/546 (100%)

Query: 2 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 61
SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 62 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 121
GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 122 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 181
SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 182 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 241
QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 242 RQLAAVPSTADPSRTTVAYIDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 301
RQLAAVPS+ADPSRTTVAY+DGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 302 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 361
ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 360

Query: 362 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 421
YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV
Sbjct: 361 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 420

Query: 422 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 481
NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN
Sbjct: 421 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 480

Query: 482 KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 541
KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD
Sbjct: 481 KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 540

Query: 542 ALINIR 547
ALINIR
Sbjct: 541 ALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c1352     FLAGELLIN      46     8e-08    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 46.2 bits (109), Expect = 8e-08
Identities = 42/226 (18%), Positives = 80/226 (35%), Gaps = 9/226 (3%)

Query: 7 MMYQQNMRGITNSQAEWMKYGEQMSTGKRVVNPSDDPIAASQAVVLSQAQAQNSQYTLAR 66
++ Q N+ +S + + E++S+G R+ + DD + A + +Q +
Sbjct: 11 LLTQNNLNKSQSSLSSAI---ERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNA 67

Query: 67 TFATQKVSLEESVLSQVTTAIQNAQEKIVYASNGTLSDDDRASLATDIQGLRDQLLNLAN 126
E L+++ +Q +E V A+NGT SD D S+ +IQ +++ ++N
Sbjct: 68 NDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSN 127

Query: 127 TTDGNGRYIFAGYKTETAPFSEADGDYVGGTESIKQQVDASRSMVIGHTGDKIFDSITSN 186
T NG + + DG E+I + +G G + +
Sbjct: 128 QTQFNGVKVLSQDNQMKIQVGANDG------ETITIDLQKIDVKSLGLDGFNVNGPKEAT 181

Query: 187 AVAEPDGSASETNLFAMLDSAIAALKTPVADSEADKETAAAALDKT 232
+ T A + + TA DK
Sbjct: 182 VGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKV 227


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c1353     IGASERPTASE    68     2e-13    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 67.8 bits (165), Expect = 2e-13
Identities = 61/298 (20%), Positives = 93/298 (31%), Gaps = 54/298 (18%)

Query: 513 PSEEEFAERKRPEQPALATFAMPDVPPAPTPAEPAAPVVAPAPKSAPATPAAPAQPGLLS 572
P E+ + DVP P+ E A AP P APA P
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIA-----RVDEAPVPPPAPATP---- 1033

Query: 573 RFFGALKALFSGGEETKPSEQPTPKAEAKPERQQDRRKPRQNNRRDRNERRDPRSERTEG 632
SE AE + + K Q
Sbjct: 1034 ------------------SETTETVAENSKQESKTVEKNEQ------------------- 1056

Query: 633 SDNREENRRNRRQAQQQTAETRESRQQAEV------TEKARTTDEQQTPRRERSRRRNDD 686
D E +NR A++ + + + Q EV T++ +TT+ ++T E+ + +
Sbjct: 1057 -DATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVE 1115

Query: 687 KRQAQQEVKALNVEEQSVQETEQEERVRPVQPRRKQRQLNQKVRYEQSVAEEAVVAPVVE 746
+ QEV + + QE + + + R +N K Q+ P E
Sbjct: 1116 TEK-TQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKE 1174

Query: 747 ETVAAEPIVQEAPAPRTELVKVPLPVVAQTAPEQPEENNADNRDNGGMPRRSRRSPRH 804
+ E V E+ T V P A QP N+ + RRS RS H
Sbjct: 1175 TSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPH 1232


87  c1680  c1684  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c1680     -2      14       1.739720   Hypothetical protein ychO
c1681     -2      20       2.180501   Nitrate/nitrite response regulator protein narL
c1682     -2      21       2.480397   Nitrate/nitrite sensor protein narX
c1683     -2      27       2.755395   Hypothetical protein
c1684     -2      23       2.846591   Nitrite extrusion protein 1
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits           Score  E-value  Comments
c1680     INTIMIN        254    8e-78    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 254 bits (650), Expect = 8e-78
Identities = 118/378 (31%), Positives = 194/378 (51%), Gaps = 21/378 (5%)

Query: 79 GEQAKAFALGKVRDALSQQVNQHVESWLSPWGNASVDVKVDNEGHFTGSRGSWFVPLQDN 138
G+ AK ALG + Q + +++WL +G A V+++ N F GS + +P D+
Sbjct: 184 GDYAKDTALGIAGN----QASSQLQAWLQHYGTAEVNLQSGNN--FDGSSLDFLLPFYDS 237

Query: 139 DRYLTWSQLSLTQQDDGLVSNVGVGQRWARGSWLVGYNTFYDNLLDENLQRAGFGAEAWG 198
++ L + Q+ D +N+G GQR+ ++GYN F D + R G G E W
Sbjct: 238 EKMLAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDFSGDNTRLGIGGEYWR 297

Query: 199 EYLRLSANFYQPFAAWHE--QTATQEQRMARGYDLTARMRMPFYQHLNTSVSVEQYFGDR 256
+Y + S N Y + WHE ++R A G+D+ +P Y L + EQY+GD
Sbjct: 298 DYFKSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDN 357

Query: 257 VDLFNSGTGYHNPVALSLGLNYTPVPLVTVTAQHKQGESGENQNNLGLNLNYRFGVPLKK 316
V LFNS NP A ++G+NYTP+PLVT+ ++ G EN + Y+F P +
Sbjct: 358 VALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQ 417

Query: 317 QLSAGEVAESQSLRGSRYDNPQRNNLPTLEYRQRKTLTVFLATPPWDLKPGETVPLKLQI 376
Q+ V E ++L GSRYD QRNN LEY+++ L++ + + T ++L +
Sbjct: 418 QIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNI-PHDINGTERSTQKIQLIV 476

Query: 377 RSRYGIRQLIWQGDTQILS-----LTPGAQANSEEGWTLIMPDWQNGEGASNHWRLSVVV 431
+S+YG+ +++W D+ + S G+Q S + + I+P + +G SN ++++
Sbjct: 477 KSKYGLDRIVWD-DSALRSQGGQIQHSGSQ--SAQDYQAILPAYV--QGGSNVYKVTARA 531

Query: 432 EDNQGQRVSSNEITLTLV 449
D G SSN + LT+
Sbjct: 532 YDRNGN--SSNNVLLTIT 547


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1681     HTHFIS        75     8e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.3 bits (185), Expect = 8e-18
Identities = 32/117 (27%), Positives = 56/117 (47%), Gaps = 2/117 (1%)

Query: 29 ATILLIDDHPMLRTGVKQLISMAPDITVVGEASNGEQGIELAESLDPDLILLDLNMPGMN 88
ATIL+ DD +RT + Q +S A + SN + D DL++ D+ MP N
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRI--TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 89 GLETLDKLREKSLSGRIVVFSVSNHEEDVVTALKRGADGYLLKDMEPEDLLKALHQA 145
+ L ++++ ++V S N + A ++GA YL K + +L+ + +A
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1682     PF06580       53     2e-09    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 52.9 bits (127), Expect = 2e-09
Identities = 36/172 (20%), Positives = 73/172 (42%), Gaps = 23/172 (13%)

Query: 424 PESSRELLSQIRNELNASWVQLRELLTTFRLQLTEPGLRPALEASCEEYSAKFGFPVKLD 483
P +RE+L+ + + S + +LT +++ + S +F ++ +
Sbjct: 190 PTKAREMLTSLSELMRYSLRYSNARQVSLADELT------VVDSYLQLASIQFEDRLQFE 243

Query: 484 YQLPPRL----VPSHQAIHLLQIAREALSNALKH-----SQASEVVVTVAQNDNQVKLTV 534
Q+ P + VP L+Q E N +KH Q ++++ +++ V L V
Sbjct: 244 NQINPAIMDVQVPPM----LVQTLVE---NGIKHGIAQLPQGGKILLKGTKDNGTVTLEV 296

Query: 535 QDNGCGVPENAIRSNHYGMIIMRDRAQSLRG-DCRVRRRESGGTEVVVTFIP 585
++ G +N S G+ +R+R Q L G + +++ E G + IP
Sbjct: 297 ENTGSLALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1684     ACRIFLAVINRP  31     0.012    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 31.0 bits (70), Expect = 0.012
Identities = 35/166 (21%), Positives = 60/166 (36%), Gaps = 22/166 (13%)

Query: 260 IMSLLYLATFGSFIGFSAGFAMLSKTQFPDVQILQYAFFGPFIGALARSA---GGALSDR 316
I+S + L+ + I A A L K + + FFG F S ++
Sbjct: 474 IVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKI 533

Query: 317 LGGTRVTLVNFILMAIFSGLLFLTLPTD----GQGGSFMAFFAVFLALFLTAGLGSGSTF 372
LG T L+ + L+ +LFL LP+ G F+ L +G+T
Sbjct: 534 LGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTM----------IQLPAGATQ 583

Query: 373 QMISVIFRKLTMDRVKAEGGSDER-----AMREAATDTAAALGFIS 413
+ + ++T +K E + E + A + F+S
Sbjct: 584 ERTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVS 629


88    c1756    c1768    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c1756     -1      14       -0.748868  Hypothetical protein yciR
c1757     -1      16       0.044767   Exoribonuclease II
c1758     -3      17       -0.691659  Hypothetical protein yciW
c1759     -3      11       -0.024475  Enoyl-[acyl-carrier-protein] reductase (NADH)
c1760     -2      12       0.291892   Putative transcriptional repressor
c1761     -1      13       0.458428   Acriflavine resistance protein A precursor
c1762     -2      13       0.615568   Hypothetical protein
c1763     -1      13       0.541877   Hypothetical protein
c1764     -2      12       1.066787   Acriflavine resistance protein B
c1765     -2      13       0.927733   Partial Putative outer membrane channel protein
c1766     -2      14       0.908333   Putative membrane transport protein
c1767     -2      13       1.324527   Peptide transport system ATP-binding protein
c1768     -3      13       1.306273   Peptide transport system ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1756     PF08280       30     0.043    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 29.8 bits (67), Expect = 0.043
Identities = 21/105 (20%), Positives = 36/105 (34%), Gaps = 2/105 (1%)

Query: 526 PIDVELTESCLIENDELALSVIQQFSRLGAQVHLDDFGTGYSSLSQLARFPIDAIKLDQV 585
P+ V S I L S + FS + + ++ Q+ D +
Sbjct: 425 PLVVVFVASNFINAHLLTDSFPRYFS--DKSIDFHSYYLLQDNVYQIPDLKPDLVITHSQ 482

Query: 586 FVRDIHKQPVSQSLVRAIVAVAQALNLQVIAEGVESAKEDAFLTK 630
+ +H + V I L++Q + V+ K A LTK
Sbjct: 483 LIPFVHHELTKGIAVAEISFDESILSIQELMYQVKEEKFQADLTK 527


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1759     DHBDHDRGNASE  49     4e-09    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 49.3 bits (117), Expect = 4e-09
Identities = 50/260 (19%), Positives = 97/260 (37%), Gaps = 22/260 (8%)

Query: 4 LSGKRILVTGVASKLSIAYGIAQAMHREGAEL-AFTYQNDKLKGRVEEFAAQLGSDIVLQ 62
+ GK +TG A I +A+ + +GA + A Y +KL+ V A+
Sbjct: 6 IEGKIAFITGAAQ--GIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 CDVAEDTSIDTMFAELGKVWPKFDGFVHSIGF---APGDQLDGDYVNAVTREGFKIAHDI 119
DV + +ID + A + + D V+ G L + A F +
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEAT----FSVN--- 116

Query: 120 SSYSFVAMAKACRSMLNP-GSALLTLSYLGAERAIPNYNVMGLAKASLEANVRYMANAMG 178
S+ F A + M++ +++T+ A + +KA+ + + +
Sbjct: 117 STGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELA 176

Query: 179 PEGVRVNAISAGPIRTLAASGI--------KDFRKMLAHCEAVTPIRRTVTIEDVGNSAA 230
+R N +S G T + + + L + P+++ D+ ++
Sbjct: 177 EYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVL 236

Query: 231 FLCSDLSAGISGEVVHVDGG 250
FL S + I+ + VDGG
Sbjct: 237 FLVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1760     HTHTETR       57     4e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 56.9 bits (137), Expect = 4e-12
Identities = 17/65 (26%), Positives = 33/65 (50%)

Query: 34 MTSKLEIRHKQRQDEIINAARRCFRRCGFHAASMSQIASEAQLSVGQIYRYFANKDAIIE 93
M K + ++ + I++ A R F + G + S+ +IA A ++ G IY +F +K +
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 94 EMVRR 98
E+
Sbjct: 61 EIWEL 65


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1761     RTXTOXIND     48     4e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 47.9 bits (114), Expect = 4e-08
Identities = 19/70 (27%), Positives = 39/70 (55%), Gaps = 1/70 (1%)

Query: 52 PVSVVSELTGR-TSAALSAEVRPQVGGIIQKRLFKEGDLVKAGQPLYQIDAASYQAAWNE 110
V +V+ G+ T + S E++P I+++ + KEG+ V+ G L ++ A +A +
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLK 138

Query: 111 ARAALQQAQA 120
+++L QA+
Sbjct: 139 TQSSLLQARL 148



Score = 30.6 bits (69), Expect = 0.010
Identities = 15/116 (12%), Positives = 32/116 (27%), Gaps = 9/116 (7%)

Query: 94 QPLYQIDAASYQAAWN--EARAALQQAQALVKADCQKAQRYTRLVKENGVSQQDADDAQS 151
L A + A + K+ ++ + KE Q ++
Sbjct: 241 SSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEE--YQLVTQLFKN 298

Query: 152 TCAQDKASVEAKKAALET----ARINLDWTTVTAPISGRI-GISSVTPGALVTASQ 202
L + + AP+S ++ + T G +VT ++
Sbjct: 299 EILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE 354


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1764     ACRIFLAVINRP  1124   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1124 bits (2908), Expect = 0.0
Identities = 565/1005 (56%), Positives = 736/1005 (73%), Gaps = 6/1005 (0%)

Query: 1 MPVAQYPDVAPPTIKISATYTGASAETLENSVTQVIEQQLTGLDNLLYFSSTSSSDGSVS 60
+PVAQYP +APP + +SA Y GA A+T++++VTQVIEQ + G+DNL+Y SSTS S GSV+
Sbjct: 30 LPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVT 89

Query: 61 INVTFEQGTDPDTAQVQVQNKIQQAESRLPSEVQQTGVTVEKSQSNFLLIAAVYDTTDKA 120
I +TF+ GTDPD AQVQVQNK+Q A LP EVQQ G++VEKS S++L++A
Sbjct: 90 ITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQQQGISVEKSSSSYLMVAGFVSDNPGT 149

Query: 121 SSSDIADWLVSNVQDPLARVEGVGSLQVFGAEYAMRIWLDPAKLASYSLMPSDVQSAIEA 180
+ DI+D++ SNV+D L+R+ GVG +Q+FGA+YAMRIWLD L Y L P DV + ++
Sbjct: 150 TQDDISDYVASNVKDTLSRLNGVGDVQLFGAQYAMRIWLDADLLNKYKLTPVDVINQLKV 209

Query: 181 QNVQVTAGKIGALPSPNTQQLTATVRAQSRLQTVDQFKNIIVKSQSDGAVVRIKDVARVE 240
QN Q+ AG++G P+ QQL A++ AQ+R + ++F + ++ SDG+VVR+KDVARVE
Sbjct: 210 QNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVE 269

Query: 241 MGSEDYTAIGKLNGHPSAGVAVMLSPGANALNTATLVKDKIAEFQRNMPQGYDIAYPKDS 300
+G E+Y I ++NG P+AG+ + L+ GANAL+TA +K K+AE Q PQG + YP D+
Sbjct: 270 LGGENYNVIARINGKPAAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDT 329

Query: 301 TEFIKISVEDVIQTLFEAIVLVVCVMYLFLQNLRATLIPALAVPVVLLGTFGVLALFGYS 360
T F+++S+ +V++TLFEAI+LV VMYLFLQN+RATLIP +AVPVVLLGTF +LA FGYS
Sbjct: 330 TPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYS 389

Query: 361 INTLTLFAMVLAIGLLVDDAIVVVENVERIMRDEGLPAREATEKSMGEISGALVAIALVL 420
INTLT+F MVLAIGLLVDDAIVVVENVER+M ++ LP +EATEKSM +I GALV IA+VL
Sbjct: 390 INTLTMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVL 449

Query: 421 SAVFLPMAFFGGSTGVIYRQFSITIISAMLLSVVVALTLTPALCGSVL----QHVPPHKK 476
SAVF+PMAFFGGSTG IYRQFSITI+SAM LSV+VAL LTPALC ++L +K
Sbjct: 450 SAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKG 509

Query: 477 GFFGAFNRFYRRTEDKYQRGVIYVLRRAARTMGLYVVLGGGMALMMWKLPGSFLPTEDQG 536
GFFG FN + + + Y V +L R + +Y ++ GM ++ +LP SFLP EDQG
Sbjct: 510 GFFGWFNTTFDHSVNHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQG 569

Query: 537 EIMVQYTLPAGATAARTAEVNRQIVDWFLINEKANTDVIFTVDGFSFSGSGQNTGMAFVS 596
+ LPAGAT RT +V Q+ D++L NEKAN + +FTV+GFSFSG QN GMAFVS
Sbjct: 570 VFLTMIQLPAGATQERTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVS 629

Query: 597 LKNWSQRKGAENTAQAIALRATKELGTIRDATVFAMTPPAVDGLGQSNGFTFELLANGGT 656
LK W +R G EN+A+A+ RA ELG IRD V PA+ LG + GF FEL+ G
Sbjct: 630 LKPWEERNGDENSAEAVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGL 689

Query: 657 DRETLLQMRNQLIEKANQSP-ELHSVRANDLPQMPQLQVDIDSNKAVSLGLSLNDVTDTL 715
+ L Q RNQL+ A Q P L SVR N L Q ++++D KA +LG+SL+D+ T+
Sbjct: 690 GHDALTQARNQLLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTI 749

Query: 716 SSAWGGTYVNDFIDRGRVKKVYIQGDSEFRSAPSDLGKWFVRGSDNAMTPFSAFATTRWL 775
S+A GGTYVNDFIDRGRVKK+Y+Q D++FR P D+ K +VR ++ M PFSAF T+ W+
Sbjct: 750 STALGGTYVNDFIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWV 809

Query: 776 YGPERLVRYNGSAAYEIQGENATGFSSGDAMTKMEELANSLPAGTTWAWSGLSLQEKLAS 835
YG RL RYNG + EIQGE A G SSGDAM ME LA+ LPAG + W+G+S QE+L+
Sbjct: 810 YGSPRLERYNGLPSMEIQGEAAPGTSSGDAMALMENLASKLPAGIGYDWTGMSYQERLSG 869

Query: 836 GQALSLYAVSILVVFLCLAALYESWSVPFSVILVIPLGLLGAALAAWMRDLNNDVYFQVA 895
QA +L A+S +VVFLCLAALYESWS+P SV+LV+PLG++G LAA + + NDVYF V
Sbjct: 870 NQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVG 929

Query: 896 LLTTIGLSSKNAILIVEFA-EAAVAEGYSLSRAALRAAQTRLRPIIMTSLAFIAGVMPLA 954
LLTTIGLS+KNAILIVEFA + EG + A L A + RLRPI+MTSLAFI GV+PLA
Sbjct: 930 LLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLA 989

Query: 955 IATGAGANSRIAIGTGIIGGTLTATLLAIFFVPLFFVLVKRLFAG 999
I+ GAG+ ++ A+G G++GG ++ATLLAIFFVP+FFV+++R F G
Sbjct: 990 ISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRCFKG 1034



Score = 75.6 bits (186), Expect = 6e-16
Identities = 53/330 (16%), Positives = 117/330 (35%), Gaps = 19/330 (5%)

Query: 691 QLQVDIDSNKAVSLGLSLNDVTDTLSSA----WGGTYVNDFIDRGRVKKVYIQGDSEFRS 746
+++ +D++ L+ DV + L G G+ I + F++
Sbjct: 183 AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKN 242

Query: 747 APSDLGKWFVRGSDN-AMTPFSAFATTRWLYGPER--LVRYNGSAA-----YEIQGENAT 798
P + GK +R + + ++ A L G + R NG A G NA
Sbjct: 243 -PEEFGKVTLRVNSDGSVVRLKDVARVE-LGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 799 GFSSGDAMTKMEELANSLPAG--TTWAWSGLSLQEKLASGQALSLYAVSILVVFLCLAAL 856
+ K+ EL P G + + + +L+ +LV + +
Sbjct: 301 DTAKA-IKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLV-MYLF 358

Query: 857 YESWSVPFSVILVIPLGLLGAALAAWMRDLNNDVYFQVALLTTIGLSSKNAILIVEFAEA 916
++ + +P+ LLG + + ++ IGL +AI++VE E
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVER 418

Query: 917 AVAEGYSLSRAALRAAQTRLR-PIIMTSLAFIAGVMPLAIATGAGANSRIAIGTGIIGGT 975
+ E + A + ++++ ++ ++ A +P+A G+ I+
Sbjct: 419 VMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAM 478

Query: 976 LTATLLAIFFVPLFFVLVKRLFAGKPRRQE 1005
+ L+A+ P + + + + +
Sbjct: 479 ALSVLVALILTPALCATLLKPVSAEHHENK 508


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1765     RTXTOXIND     29     0.048    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.0 bits (65), Expect = 0.048
Identities = 24/166 (14%), Positives = 49/166 (29%), Gaps = 11/166 (6%)

Query: 70 DVQKAIADIDSARALYGQTNASLFPTVNAALSSTRSRSLANGTGTTAEADGTVSSYTLDL 129
A AD ++ Q +RS L + + + +
Sbjct: 128 TALGAEADTLKTQSSLLQARL----EQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEE 183

Query: 130 FGRNQSLSRAARETWLASEFTAQNTRLTLIAEISTAWLTLAADNSNLALAKETMASAENS 189
R SL + TW ++ + AE T + + + ++
Sbjct: 184 VLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTV-------LARINRYENLSRVEKSR 236

Query: 190 LKIIQRQQQVGTAAATDVSEAMSVYQQARASVASYQTQVMQDKNAL 235
L A V E + Y +A + Y++Q+ Q ++ +
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEI 282


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1766     TCRTETA       68     9e-15    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 68.3 bits (167), Expect = 9e-15
Identities = 68/312 (21%), Positives = 114/312 (36%), Gaps = 18/312 (5%)

Query: 5 SLSWALILGLLAGIGPMCTDLYLPALPEMSEQLAATTTITQLTLTASLIGLGVGQLLFGP 64
L L L +G L +P LP + L + +T L + Q P
Sbjct: 6 PLIVILSTVALDAVG---IGLIMPVLPGLLRDLVHSNDVTA-HYGILLALYALMQFACAP 61

Query: 65 ----LSDKIGRKRPLILSLLLFIVSSILCATTNNIYWLVVWRFIQGIAGAGGSVLSRSIA 120
LSD+ GR+ L++SL V + AT ++ L + R + GI GA G+V IA
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIA 121

Query: 121 RDKYQGVTLTQFFALLMTVNGLAPVLSPVLGGYIVSTFDWRTLFWVMAEISTVLLLGCLL 180
D G + F + G V PVLGG + F F+ A ++ + L
Sbjct: 122 -DITDGDERARHFGFMSACFGFGMVAGPVLGGLM-GGFSPHAPFFAAAALNGLNFLTGCF 179

Query: 181 FINETLPENKRGSSL----LLTGRSVVQNRRFMRFCLIQSFMLAGLFAYIGSSSFVL--Q 234
+ E+ +R L + + + F + L + ++ +V+ +
Sbjct: 180 LLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFF-IMQLVGQVPAALWVIFGE 238

Query: 235 KEFGFSPMQFSLVFGLNGI-GLIIASWIFSRLARRINAMTLLRGGLIAAILCALLTVLCA 293
F + + GI + + I +A R+ L G+IA +L
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 294 WIQLPIPALVAL 305
+ P +V L
Sbjct: 299 RGWMAFPIMVLL 310


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c1768     HTHFIS        31     0.007    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.0 bits (70), Expect = 0.007
Identities = 9/16 (56%), Positives = 14/16 (87%)

Query: 38 LVGESGSGKSLIAKAI 53
+ GESG+GK L+A+A+
Sbjct: 165 ITGESGTGKELVARAL 180
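
The Query/Sbjct rows of an alignment carry the start and end coordinates of the matched region, so the aligned ranges can be recovered from the text. The following is a minimal Python sketch, not part of PredictBias; the function name `hsp_ranges` is hypothetical, and it assumes each row has the form "label, start, residues, end" as in the alignment above, with `-` marking gaps.

```python
import re

# One alignment row: label, start coordinate, aligned residues, end coordinate.
ROW_RE = re.compile(r"^(Query|Sbjct):\s*(\d+)\s+(\S+)\s+(\d+)$")

def hsp_ranges(lines):
    """Return {'Query': (start, end), 'Sbjct': (start, end)} for one alignment."""
    ranges = {}
    for line in lines:
        m = ROW_RE.match(line.strip())
        if m is None:
            continue  # match line between Query and Sbjct, or a blank line
        label, seq = m.group(1), m.group(3)
        start, end = int(m.group(2)), int(m.group(4))
        # Gap characters do not consume residues, so drop them before checking.
        residues = len(seq.replace("-", ""))
        if residues != end - start + 1:
            raise ValueError("coordinates disagree with residues: %r" % line)
        # Keep the start of the first block and the end of the latest block.
        first = ranges[label][0] if label in ranges else start
        ranges[label] = (first, end)
    return ranges

rows = [
    "Query: 38 LVGESGSGKSLIAKAI 53",
    "+ GESG+GK L+A+A+",
    "Sbjct: 165 ITGESGTGKELVARAL 180",
]
print(hsp_ranges(rows))  # {'Query': (38, 53), 'Sbjct': (165, 180)}
```

The residue-count check is a cheap guard against rows damaged during text extraction: a row whose coordinates and residues disagree raises an error instead of silently yielding a wrong range.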


89    c2297    c2305    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c2297     1       12       1.137916   Chemotaxis protein cheY
c2298     0       11       1.479939   Protein-glutamate methylesterase
c2299     0       11       1.284228   Chemotaxis protein methyltransferase
c2300     0       11       0.857443   Hypothetical protein
c2301     -1      11       0.778016   Methyl-accepting chemotaxis protein II
c2302     -1      12       0.823039   Chemotaxis protein cheW
c2303     -1      13       0.420292   Chemotaxis protein cheA
c2304     -1      17       -1.178441  Chemotaxis motB protein
c2305     0       15       -2.105357  Chemotaxis motA protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2297     HTHFIS        88     9e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.3 bits (219), Expect = 9e-24
Identities = 30/105 (28%), Positives = 51/105 (48%), Gaps = 3/105 (2%)

Query: 7 KFLVVDDFSTMRRIVRNLLKELGFNNVEEAEDGLDALNKLQAGGYGFVISDWNMPNMDGL 66
LV DD + +R ++ L G++ V + + AG V++D MP+ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYD-VRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 67 ELLKTIRADGAMSALPVLMVTAEAKKENIIAAAQAGASGYVVKPF 111
+LL I+ LPVL+++A+ I A++ GA Y+ KPF
Sbjct: 64 DLLPRIKKARPD--LPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2298     HTHFIS        65     9e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 65.2 bits (159), Expect = 9e-14
Identities = 35/188 (18%), Positives = 72/188 (38%), Gaps = 23/188 (12%)

Query: 1 MSKIRVLSVDDSALMRQIMTEIINSHSDMEMVATAPDPLVARDLIKKFNPDVLTLDVEMP 60
M+ +L DD A +R ++ + ++ V + I + D++ DV MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 RMDGLDFLEKLMRLRPMPVVMVSSLTGKGS-EVTLRALELGAIDFVTKPQLGIREGMLAY 119
+ D L ++ + RP V+V ++ + + ++A E GA D++ KP + E +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLV--MSAQNTFMTAIKASEKGAYDYLPKP-FDLTELIGII 115

Query: 120 SEMIAEKVRTAAKASLAAHKPLSAPTTLKAGPLLSSEKLIAIGASTGGTEAIRHVLQPLP 179
+AE R +K + + +G S E R + + +
Sbjct: 116 GRALAEPKRRPSKLEDDSQDGMP-----------------LVGRSAAMQEIYRVLARLMQ 158

Query: 180 LSSPALLI 187
++
Sbjct: 159 TDLTLMIT 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2303     PF06580       43     3e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 42.9 bits (101), Expect = 3e-06
Identities = 23/151 (15%), Positives = 49/151 (32%), Gaps = 52/151 (34%)

Query: 379 ELDKSLIERIIDPLT--HLVRNSLDHGIELPEKRLAAGKNSVGNLILSAEHQGGNICIEV 436
+++ ++++ + P+ LV N + HGI G ++L G + +EV
Sbjct: 245 QINPAIMDVQVPPMLVQTLVENGIKHGIA--------QLPQGGKILLKGTKDNGTVTLEV 296

Query: 437 TDDGAGLNRERILAKAASQGLTVSENMSDDEVAMLIFAPGFSTAEQVTDVSGRGVGMDVV 496
+ G+ + G G+ V
Sbjct: 297 ENTGSLALKNTK--------------------------------------ESTGTGLQNV 318

Query: 497 KRNIQEMGG---HVEIQSKQGTGTTIRILLP 524
+ +Q + G +++ KQG +L+P
Sbjct: 319 RERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2304     PF05272       31     0.009    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.8 bits (69), Expect = 0.009
Identities = 22/93 (23%), Positives = 35/93 (37%), Gaps = 11/93 (11%)

Query: 46 LISISSPKELIQIAEYFRTPLATAVTGGDRISNSESPIPGGGDDYTQSQGEVNKQPNIEE 105
L +SSP A P + G + ++ PGGGDD GE +++
Sbjct: 384 LADVSSPTAAAGGAGGGEPPKKRDPSAG---AGTDPGGPGGGDD-----GEDPFGEWLDD 435

Query: 106 LKKRM---EQSRLRKLRGDLDQLIESDPKLRAL 135
R+ + L+ R L + + S P L
Sbjct: 436 EVARLRLRGRWLLKPRRAALIEALRSAPALAGC 468


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2305     PF05844       33     0.001    YopD protein
		>PF05844#YopD protein

Length = 295

Score = 33.1 bits (75), Expect = 0.001
Identities = 12/28 (42%), Positives = 22/28 (78%), Gaps = 2/28 (7%)

Query: 76 MDLLALLYRLMAKSRQMGMFSLERDIEN 103
++LL +L+R+ K+R++G+ L+RD EN
Sbjct: 74 VELLLILFRIAQKARELGV--LQRDNEN 99


90    c2338    c2345    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c2338     -1      13       -1.465221  Flagellin
c2339     -2      16       0.450688   Flagellar hook-associated protein 2
c2340     -2      14       0.259965   Flagellar protein fliS
c2341     -1      13       0.662108   Flagellar protein fliT
c2342     -1      14       0.089651   Cytoplasmic alpha-amylase
c2343     -1      18       -1.134300  Hypothetical lipoprotein yedD precursor
c2344     -2      20       -4.279807  Hypothetical protein yedE
c2345     2       34       -7.556630  Hypothetical protein yedF
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2338     FLAGELLIN     181    2e-52    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 181 bits (460), Expect = 2e-52
Identities = 186/469 (39%), Positives = 243/469 (51%), Gaps = 11/469 (2%)

Query: 2 AQVINTNSLSLITQNNINKNQSALSSSIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 61
AQVINTNSLSL+TQNN+NK+QS+LSS+IERLSSGLRINSAKDDAAGQAIANRFTSNIKGL
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 TQAARNANDGISVAQTTEGALSEINNNLQRIRELTVQASTGTNSDSDLDSIQDEIKSRLD 121
TQA+RNANDGIS+AQTTEGAL+EINNNLQR+REL+VQA+ GTNSDSDL SIQDEI+ RL+
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 EIDRVSGQTQFNGVNVLAKDGSMKIQVGANDGQTITIDLKKIDSDTLGLNGFNVNGSGTI 181
EIDRVS QTQFNGV VL++D MKIQVGANDG+TITIDL+KID +LGL+GFNVNG
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEA 180

Query: 182 ANKAATISDLTAAKMDAATNTITTTNNALTASKALDQLKDGDTVTIKADAAQTATVYTYN 241
S D + + V A N
Sbjct: 181 TVGDLKSSFKNVTGYDTYAVGANKYRVDVNSG----------AVVTDTTAPTVPDKVYVN 230

Query: 242 ASAGNFSFSNVSNNTSAKAGDVAASLLPPAGQTASGVYKAASGEVNFDVDANGKITIGGQ 301
A+ G + + NNT+ S + + G+ D G
Sbjct: 231 AANGQLTTDDAENNTAVDLFKTTKSTA-GTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDT 289

Query: 302 EAYLTSDGNLTTNDAGGATAATLDGLFKKAGDGQSIGFNKTASVTMGGTTYNFKTGADAG 361
+ +G ++T G T+ + A + + + +V F
Sbjct: 290 KTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTK 349

Query: 362 AATANAGVSFTDTASKETVLNKVATAKQGTAVAANGDTSATITYKSGVQTYQAVFAAGDG 421
+A + A K V A+ A + T A T +
Sbjct: 350 NESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINED 409

Query: 422 TASAKYADNTDVSNATATYTDADGEMTTIGSYTTKYSIDANNGKVTVDS 470
A+AK + +++ + + D +++G+ ++ N TV +
Sbjct: 410 AAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTN 458



Score = 95.1 bits (236), Expect = 8e-23
Identities = 95/371 (25%), Positives = 139/371 (37%), Gaps = 7/371 (1%)

Query: 224 TVTIKADAAQTATVYTYNASAGNFSFSNVSNNTSAKAGDVAASLLPPAGQTASGVYKAAS 283
+ + A+ +T T+ + + + L + + +G A
Sbjct: 144 KIQVGANDGETITIDLQKIDVKS---LGLDGFNVNGPKEATVGDLKSSFKNVTGYDTYAV 200

Query: 284 GEVNFDVDANGKITIGGQEAYLTSDGNLTTNDAGGATAATLDGLFKKAGDGQSIGFNKTA 343
G + VD N + A D G T + + TA
Sbjct: 201 GANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTTKSTAGTA 260

Query: 344 SVTMGGTTYNFKTGADAGAATANAGVSFTDTASKETVLNKVATAKQGTAVAANGDTSATI 403
D T T + + + T+
Sbjct: 261 EAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAA 320

Query: 404 TYKSGVQTYQAVFAAGDGTASAKYADNTDVSNATATYTDADGEMTTIGSYTTKYSIDANN 463
+ + T+ D + +D E +K +++
Sbjct: 321 NVDA----ATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAE 376

Query: 464 GKVTVDSGTGTGKYAPKVGAEVYVSANGTLTTDATSEGTVTKDPLKALDEAISSIDKFRS 523
T + + + DA + T +PL ++D A+S +D RS
Sbjct: 377 YTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKSTANPLASIDSALSKVDAVRS 436

Query: 524 SLGAIQNRLDSAVTNLNNTTTNLSEAQSRIQDADYATEVSNMSKAQIIQQAGNSVLAKAN 583
SLGAIQNR DSA+TNL NT TNL+ A+SRI+DADYATEVSNMSKAQI+QQAG SVLA+AN
Sbjct: 437 SLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSNMSKAQILQQAGTSVLAQAN 496

Query: 584 QVPQQVLSLLQ 594
QVPQ VLSLL+
Sbjct: 497 QVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2339     TYPE3OMBPROT  33     0.003    Type III secretion system outer membrane B protein ...
		>TYPE3OMBPROT#Type III secretion system outer membrane B protein family signature.

Length = 538

Score = 32.7 bits (74), Expect = 0.003
Identities = 24/72 (33%), Positives = 37/72 (51%), Gaps = 2/72 (2%)

Query: 214 NGMEVSVAAQNAQLTVNNVAIENSSNTISDALENITLNLNDVTTGNQTLTITQDTSKAQT 273
N E +VAA+N + + A+ + +S AL T++L V+T LT T T ++
Sbjct: 236 NSSERAVAARNKAEELVSAALYSRPELLSQALSGKTVDLKIVSTS--LLTPTSLTGGEES 293

Query: 274 AIKDWVNAYNSL 285
+KD VNA L
Sbjct: 294 MLKDQVNALKGL 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2344     RTXTOXIND     30     0.018    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.018
Identities = 10/57 (17%), Positives = 17/57 (29%), Gaps = 2/57 (3%)

Query: 175 RFTLLPIFRIPVKMQKVSAASPLTQKPDQARRRF--RLGMLVFFGMLGWALLTAMNQ 229
R L R + + + A L + P R R M ++L +
Sbjct: 26 RKQLDTPVREKDENEFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEI 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2345     PF01206       93     6e-29    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 92.5 bits (230), Expect = 6e-29
Identities = 16/71 (22%), Positives = 37/71 (52%)

Query: 7 DYRLDMVGEPCPYPAVATLEAMPQLKKGEILEVVSDCPQSINNIPLDARNHGYTVLDIQQ 66
D LD G CP P + + + + GE+L V++ P S+ + ++ G+ +L+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 67 DGPTIRYLIQK 77
+ T + +++
Sbjct: 65 EDGTYHFRLKR 75


91    c2348    c2367    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c2348     5       36       -8.114841  Outer membrane porin protein nmpC precursor
c2349     -2      22       -3.544039  Hypothetical transcriptional regulator ybcM
c2350     -2      12       0.473813   Protein ybcL precursor
c2351     1       15       2.444025   Hypothetical protein
c2352     1       15       3.163863   EmrE protein
c2353     1       18       4.915408   Flagellar hook-basal body complex protein fliE
c2354     2       18       4.772370   Flagellar M-ring protein
c2355     2       16       4.847908   Flagellar motor switch protein fliG
c2356     0       14       4.563023   Hypothetical protein
c2357     -2      16       3.960546   Flagellar assembly protein fliH
c2358     -2      18       3.708785   Flagellum-specific ATP synthase
c2359     -2      16       2.642357   Flagellar fliJ protein
c2360     -2      16       2.790023   Flagellar hook-length control protein
c2361     -2      22       2.331857   Flagellar fliL protein
c2362     0       18       0.968669   Flagellar motor switch protein fliM
c2363     3       17       0.593876   Flagellar motor switch protein fliN
c2364     2       17       -2.549982  Flagellar protein fliO
c2365     1       17       -3.284859  Flagellar biosynthetic protein fliP precursor
c2366     1       19       -4.785175  Flagellar biosynthetic protein fliQ
c2367     2       19       -5.081266  Flagellar biosynthetic protein fliR
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2348     ECOLIPORIN    495    e-179    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 495 bits (1277), Expect = e-179
Identities = 234/370 (63%), Positives = 271/370 (73%), Gaps = 31/370 (8%)

Query: 1 MSAQAAEIYNKDSNKLDLYGKVNAKHYFSSNDADDGDTTYVRLGFKGETQINDQLTGFGQ 60
+A AAEIYNKD NKLDLYGKV+ HYFS + + DGD TY+R+GFKGETQINDQLTG+GQ
Sbjct: 17 GAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVGFKGETQINDQLTGYGQ 76

Query: 61 WEYEFKGNRAESQGSSKDKTRLAFAGLKFGDYGSIDYGRNYGVAYDIGAWTDVLPEFGGD 120
WEY + N E +G++ TRLAFAGLKFGDYGS DYGRNYGV YD+ WTD+LPEFGGD
Sbjct: 77 WEYNVQANTTEGEGANS-WTRLAFAGLKFGDYGSFDYGRNYGVLYDVEGWTDMLPEFGGD 135

Query: 121 TWTQTDVFMTGRTTGVATYRNNDFFGLVDGLNFAAQYQGKNDR----------------T 164
++T D +MTGR GVATYRN DFFGLVDGLNFA QYQGKN+
Sbjct: 136 SYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQSADDVNIGTNNRNNGD 195

Query: 165 DVTEANGDGFGFSTTYEY-EGFGVGATYAKSDRTNDQVIYGNNSLNASGQNAEVWAAGLK 223
D+ NGDGFG STTY+ GF GA Y SDRTN+QV G A G A+ W AGLK
Sbjct: 196 DIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGGT--IAGGDKADAWTAGLK 253

Query: 224 YDANNIYLATTYSETQNMTVFG------NNHIANKAQNFEVVAQYQFDFGLRPSVAYLQS 277
YDANNIYLAT YSET+NMT +G + +ANK QNFEV AQYQFDFGLRP+V++L S
Sbjct: 254 YDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQYQFDFGLRPAVSFLMS 313

Query: 278 KGKDLG----AWGDQDLVEYIDVGATYYFNKNMSTFVDYKINLIDKSD-FTKASGVATDD 332
KGKDL D+DLV+Y DVGATYYFNKN ST+VDYKINL+D D F K +G++TDD
Sbjct: 314 KGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLLDDDDPFYKDAGISTDD 373

Query: 333 IVAVGLVYQF 342
IVA+G+VYQF
Sbjct: 374 IVALGMVYQF 383
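
The two statistics lines that precede each alignment (for example `Score = 495 bits (1277), Expect = e-179` for c2348) can be read programmatically. The following is a minimal Python sketch, not part of PredictBias; the names `parse_expect`, `parse_hsp_stats` and `truncated_percent` are hypothetical. It assumes that an E-value printed as `e-179` abbreviates `1e-179`, and that the printed percentages are truncated rather than rounded, which is consistent with `Gaps = 21/378 (5%)` reported for the c1680 hit.

```python
import re

# Patterns for the two statistics lines that precede each alignment.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
PAIR_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")

def parse_expect(text):
    # Assumption: "e-179" stands for 1e-179; float() needs the leading digit.
    return float("1" + text if text.startswith("e") else text)

def parse_hsp_stats(score_line, identity_line):
    m = SCORE_RE.search(score_line)
    if m is None:
        raise ValueError("not a Score/Expect line: %r" % score_line)
    stats = {
        "bits": float(m.group(1)),
        "raw_score": int(m.group(2)),
        "expect": parse_expect(m.group(3)),
    }
    # Gaps is absent for ungapped alignments, so collect whatever is present.
    for name, num, den in PAIR_RE.findall(identity_line):
        stats[name.lower()] = (int(num), int(den))
    return stats

def truncated_percent(pair):
    # Integer division reproduces the truncated percentages in the report.
    num, den = pair
    return 100 * num // den

stats = parse_hsp_stats(
    "Score = 495 bits (1277), Expect = e-179",
    "Identities = 234/370 (63%), Positives = 271/370 (73%), Gaps = 31/370 (8%)",
)
print(stats["expect"], truncated_percent(stats["identities"]))
```

Keeping the numerator and denominator as integers, rather than the printed percentage, lets a downstream filter apply its own identity or coverage cutoff without losing precision.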


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2353     FLGHOOKFLIE   120    6e-39    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 120 bits (301), Expect = 6e-39
Identities = 102/103 (99%), Positives = 102/103 (99%)

Query: 12 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTVARTQAEKFTL 71
SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQT ARTQAEKFTL
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 60

Query: 72 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 114
GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV
Sbjct: 61 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2354     FLGMRINGFLIF  751    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 751 bits (1940), Expect = 0.0
Identities = 476/555 (85%), Positives = 512/555 (92%), Gaps = 5/555 (0%)

Query: 3 ATAAQTKSLEWLNRLRANPKIPLIVAGSAAVAVMVALILWAKAPDYRTLFSNLSDQDGGA 62
+TA Q K LEWLNRLRANP+IPLIVAGSAAVA++VA++LWAK PDYRTLFSNLSDQDGGA
Sbjct: 5 STATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGA 64

Query: 63 IVSQLTQMNIPYRFSEASGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 122
IV+QLTQMNIPYRF+ SGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ
Sbjct: 65 IVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 124

Query: 123 FSEQVNYQRALEGELSRTIETIGPVKGARVHLAMPKPSLFVREQKSPSASVTVNLLPGRA 182
FSEQVNYQRALEGEL+RTIET+GPVK ARVHLAMPKPSLFVREQKSPSASVTV L PGRA
Sbjct: 125 FSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRA 184

Query: 183 LDEGQISAIVHLVSSAVAGLPPGNVTLVDQGGHLLTQSNTSGRDLNDAQLKYASDVEGRI 242
LDEGQISA+VHLVSSAVAGLPPGNVTLVDQ GHLLTQSNTSGRDLNDAQLK+A+DVE RI
Sbjct: 185 LDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRI 244

Query: 243 QRRIEAILSPIVGNGNIHAQVTAQLDFASKEQTEEQYRPNGDESHAALRSRQLNESEQSG 302
QRRIEAILSPIVGNGN+HAQVTAQLDFA+KEQTEE Y PNGD S A LRSRQLN SEQ G
Sbjct: 245 QRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVG 304

Query: 303 SGYPGGVPGALSNQPAPANNAPISTPPANQNNRQQ--QASTTSNS---GPRSTQRNETSN 357
+GYPGGVPGALSNQPAP N API+TPP NQ N Q Q ST++NS GPRSTQRNETSN
Sbjct: 305 AGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSN 364

Query: 358 YEVDRTIRHTKMNVGDVQRLSVAVVVNYKTLPDGKPLPLSNEQMKQIEALTREAMGFSEK 417
YEVDRTIRHTKMNVGD++RLSVAVVVNYKTL DGKPLPL+ +QMKQIE LTREAMGFS+K
Sbjct: 365 YEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDK 424

Query: 418 RGDSLNVVNSPFNSSDESGGALPFWQQQAFIDQLLAAGRWLLVLLVAWLLWRKAVRPQLT 477
RGD+LNVVNSPF++ D +GG LPFWQQQ+FIDQLLAAGRWLLVL+VAW+LWRKAVRPQLT
Sbjct: 425 RGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLT 484

Query: 478 RRAEAVKTVQQQAQAREEVEDAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 537
RR E K Q+QAQ R+E E+AVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR
Sbjct: 485 RRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 544

Query: 538 VVALVIRQWINNDHE 552
VVALVIRQW++NDHE
Sbjct: 545 VVALVIRQWMSNDHE 559


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2355     FLGMOTORFLIG  341    e-119    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 341 bits (876), Expect = e-119
Identities = 117/329 (35%), Positives = 197/329 (59%), Gaps = 2/329 (0%)

Query: 1 MSNLTGTDKSVILLMTIGEDRAAEVFKHLSQREVQTLSAAMANVTQISNKQLTDVLAEFE 60
+S LTG K+ ILL++IG + +++VFK+LSQ E+++L+ +A + I+++ +VL EF+
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 61 QEAEQFAALNINANDYLRSVLVKALGEERAASLLEDILETRDTASGIETLNFMEPQSAAD 120
+ + DY R +L K+LG ++A ++ + L + + E + +P + +
Sbjct: 72 ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPANILN 130

Query: 121 LIRDEHPQIIATILVHLKRAQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLN 180
I+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 131 FIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLE 190

Query: 181 GLLDGQ-NLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENL 239
L + + GGV EIIN+ + E+ +I ++ E D ELA++I +MF+FE++
Sbjct: 191 KKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDI 250

Query: 240 VDVDDRSIQRLLQEVDSESLLIALKGAEQPLREKFLRNMSQRAADILRDDLANRGPVRLS 299
V +DDRSIQR+L+E+D + L ALK + P++EK +NMS+RAA +L++D+ GP R
Sbjct: 251 VLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRK 310

Query: 300 QVENEQKAILLIVRRLAETGEMVIGSGED 328
VE Q+ I+ ++R+L E GE+VI G +
Sbjct: 311 DVEESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2357     FLGFLIH       371    e-134    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 371 bits (954), Expect = e-134
Identities = 221/228 (96%), Positives = 224/228 (98%)

Query: 8 MSDNLPWKTWTPDDLAPPPAEFVPMAESEETIIEEVEPSLEQQLAQLQMQAHEQGYQAGI 67
MSDNLPWKTWTPDDLAPP AEFVP+ E EETIIEE EPSLEQQLAQLQMQAHEQGYQAGI
Sbjct: 1 MSDNLPWKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60

Query: 68 AEGRQQGHEQGYQEGLAQGLEQGLAEAKAQQAPIHARMQQLVSEFQTTLDALDSVIASRL 127
AEGRQQGH+QGYQEGLAQGLEQGLAEAK+QQAPIHARMQQLVSEFQTTLDALDSVIASRL
Sbjct: 61 AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120

Query: 128 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 187
MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT
Sbjct: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180

Query: 188 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 235
LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV
Sbjct: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c2359     FLGFLIJ  202    2e-70    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 202 bits (515), Expect = 2e-70
Identities = 146/147 (99%), Positives = 147/147 (100%)

Query: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60
MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 MTSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120
+TSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147
AALLAENRLDQKKMDEFAQRAAMRKPE
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
c2360     FLGHOOKFLIK  469    e-168    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 469 bits (1207), Expect = e-168
Identities = 365/375 (97%), Positives = 369/375 (98%)

Query: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTAK 60
MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPT K
Sbjct: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60

Query: 61 GEPLVSDILSDAQQADLLIPVDETPPVINDEQSTLTPLTTAQTMTLAAVADKNTTKDEKA 120
GEPL+SDI+SDAQQA+LLIPVDETPPVINDEQST TPLTTAQTM LAAVADKNTTKDEKA
Sbjct: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120

Query: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPAEKPTLFTKLTSAQLTTAQPDDAP 180
DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLP EKPTLFTKLTS QLTTAQPDDAP
Sbjct: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP 180

Query: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTADASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240
GTPAQPLTPLVAEAQSKAEVISTPSPVTA ASPLITPHQTQPLPTVAAPVLSAPLGSHEW
Sbjct: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240

Query: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMISPHQHVRAALEAA 300
QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQM+SPHQHVRAALEAA
Sbjct: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300

Query: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360
LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS
Sbjct: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360

Query: 361 LQGRVTGNSGVDIFA 375
LQGRVTGNSGVDIFA
Sbjct: 361 LQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2362     FLGMOTORFLIM  385    e-136    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 385 bits (989), Expect = e-136
Identities = 85/324 (26%), Positives = 148/324 (45%), Gaps = 10/324 (3%)

Query: 20 ILSQAEIDALLNGDS--EVKDEPTASVSGESDIRPYDPNTQRRVVRERLQALEIINERFA 77
+LSQ EID LL S + E +S I YD + +E+++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 78 RHFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSPSL 137
R L LR + V ++ Y EF R++P P+ L +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 138 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 197
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 198 EMQVKFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 255
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 256 NEDQNWRDNLVRQVQHSQLELVANFADISLRLSQILKLKPGDVLPIEKP---DRIIAHVD 312
+ + L ++ +++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 313 GVPVLTSQYGTLNGQYALRIEHLI 336
Q G + + A +I I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2363     FLGMOTORFLIN  209    2e-73    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 209 bits (534), Expect = 2e-73
Identities = 124/137 (90%), Positives = 132/137 (96%)

Query: 1 MSDMNNPADDNNGAMDDLWAEALSEQKSTSGKSATDAVFQQFGGGDVSGTLQDIDLIMDI 60
MSDMNNP+D+N GA+DDLWA+AL+EQK+T+ KSA DAVFQQ GGGDVSG +QDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2365     FLGBIOSNFLIP  333    e-119    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 333 bits (856), Expect = e-119
Identities = 244/245 (99%), Positives = 244/245 (99%)

Query: 1 MRRLFSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60
MRRL SVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60

Query: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120
MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120

Query: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180
ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK
Sbjct: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180

Query: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240
TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA
Sbjct: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240

Query: 241 QSFYS 245
QSFYS
Sbjct: 241 QSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2366     TYPE3IMQPROT  67     1e-18    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 67.1 bits (164), Expect = 1e-18
Identities = 22/78 (28%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALVTGLIISILQAATQINEMTLSFIPKIIAVFIAII 63
+ ++ G +A+ + L L+ +VA + GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2367     TYPE3IMRPROT  202    6e-67    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 202 bits (516), Expect = 6e-67
Identities = 256/261 (98%), Positives = 259/261 (99%)

Query: 1 MMQVTSDQWLSWLSLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60
M+QVTS+QWLSWL+LYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPGSHL 120
NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDP SHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGSEPLNSNAFLALTKAGSLIF 180
NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIG EPLNSNAFLALTKAGSLIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240
LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEIFNLLADIISELPLI 261
EHLFSEIFNLLADIISELPLI
Sbjct: 241 EHLFSEIFNLLADIISELPLI 261


92  c2380  c2387  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c2380     -2      14       -2.669061  DNA-cytosine methyltransferase
c2381     -3      22       -6.295698  Hypothetical protein yedJ
c2382     -1      29       -7.909471  Hypothetical protein yedR
c2383     -1      30       -8.477498  Outer membrane protein N precursor
c2384     -2      27       -6.477051  Hypothetical protein
c2385     -1      26       -6.033135  Protein yedU
c2386     0       30       -7.413496  Putative sensor-like histidine kinase yedV
c2387     0       24       -4.905275  Probable transcriptional Regulatory protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c2380     PF05272  29     0.045    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.045
Identities = 20/62 (32%), Positives = 29/62 (46%), Gaps = 15/62 (24%)

Query: 320 AKYILTPVLWKYLYRYAKKHQARGNGFGYGMVYPNNPQSVTRTLSARYYKDGAEILIDRG 379
A+Y + PVLW Y+ R+ K + G+ VY +R +DG+E RG
Sbjct: 166 ARYQVGPVLWGYVVRFIK---SDGDKLTLPYVY------------SRSQRDGSEAWKWRG 210

Query: 380 WD 381
WD
Sbjct: 211 WD 212


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2381     CARBMTKINASE  35     2e-04    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 34.8 bits (80), Expect = 2e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 9/92 (9%)

Query: 46 AQKLAADDDVDMLVILTACYFHDIVSLAKNHPQRQRSSILAAEETRRLLREEFVQFPA-- 103
+KLA + + D+ +ILT + +L + Q + EE R+ E F A
Sbjct: 219 GEKLAEEVNADIFMILTDV---NGAALYYGTEKEQWLREVKVEELRKYYEEG--HFKAGS 273

Query: 104 --EKIEAVCHAIAAHSFSAQIAPLTTEAKIVQ 133
K+ A I A IA L + ++
Sbjct: 274 MGPKVLAAIRFIEWGGERAIIAHLEKAVEALE 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
c2383     ECOLIPORIN  418    e-148    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 418 bits (1075), Expect = e-148
Identities = 199/388 (51%), Positives = 249/388 (64%), Gaps = 41/388 (10%)

Query: 24 MKRKVLAMLVPALLVAGAANAAEIYNKNGNKVELYGKMVGERILTDRENGEKGDNSQDTS 83
MKRKVLA+++PALL AGAA+AAEIYNK+GNK++LYGK+ G +D ++ + GD +
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSD-DSSKDGD----QT 55

Query: 84 YARVGVKGETQINPELTGYGQFELDLEASNRHNPDQ---TRLAYAGLSYKDFGSFDYGRN 140
Y RVG KGETQIN +LTGYGQ+E +++A+ TRLA+AGL + D+GSFDYGRN
Sbjct: 56 YMRVGFKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRN 115

Query: 141 VGVAYDAEAFTDMFVEWGGDSWAGTDLFMTNRTNGVATYRNTDFFGMVEGLNFALQYQGK 200
GV YD E +TDM E+GGDS+ D +MT R NGVATYRNTDFFG+V+GLNFALQYQGK
Sbjct: 116 YGVLYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGK 175

Query: 201 NEGTGNY----------------KANGDGHGLSATYTID-GFSFAGAYANSDRTDWQSGD 243
NE NGDG G+S TY I GFS AY SDRT+ Q
Sbjct: 176 NESQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNA 235

Query: 244 GK----GERAEVWALSTKYDANNVYAAVMYGESHNM-------NSDDGDVVNKTQNFEAV 292
G G++A+ W KYDANN+Y A MY E+ NM DG V NKTQNFE
Sbjct: 236 GGTIAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVT 295

Query: 293 LQYQFDFGLRPSIGYSYSKALDVA----GYKDSDRLNYIEIGTWYYFNKNMNVYTAYQIN 348
QYQFDFGLRP++ + SK D+ D D + Y ++G YYFNKN + Y Y+IN
Sbjct: 296 AQYQFDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKIN 355

Query: 349 LLDKSD-YVLAHGLNTDDQLAVGIVYQF 375
LLD D + G++TDD +A+G+VYQF
Sbjct: 356 LLDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c2386     PF06580  39     4e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.7 bits (90), Expect = 4e-05
Identities = 38/195 (19%), Positives = 74/195 (37%), Gaps = 34/195 (17%)

Query: 261 TLSQIRSIAEYQKTIAGN-IEELENISRLTENILFLARADKNNVLVKLDALSLNKEVENL 319
L+ IR++ T A + L + R + L ++ V SL E+ +
Sbjct: 178 ALNNIRALILEDPTKAREMLTSLSELMRYS-----LRYSNARQV-------SLADELTVV 225

Query: 320 LDYL--EYLSDEKEIRFKVECNQQIFADKI---LLQRMLSNLIVNAIRYSPEKSRIHITS 374
YL + E ++F+ + N I ++ L+Q ++ N I + I P+ +I +
Sbjct: 226 DSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKG 285

Query: 375 FLDANGSLNIDIASPGTKINEPEKLFRRFWRGDNSRHSVGQGLGLSLVKA-IAELHGGSA 433
D NG++ +++ + G+ + K G GL V+ + L+G A
Sbjct: 286 TKD-NGTVTLEVENTGSLALKNTKE--------------STGTGLQNVRERLQMLYGTEA 330

Query: 434 TYHYLSKHNVFRITL 448
K +
Sbjct: 331 QIKLSEKQGKVNAMV 345


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c2387     HTHFIS  84     9e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.5 bits (209), Expect = 9e-21
Identities = 30/117 (25%), Positives = 60/117 (51%), Gaps = 1/117 (0%)

Query: 39 KILLIEDNQRTQEWVTQGLSEAGYVIDAVSDGRDGLYLALKDDYALIILDIMLPGMDGWQ 98
IL+ +D+ + + Q LS AGY + S+ D L++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 99 ILQTLRTA-KQTPVICLTARDSVDDRVRGLDSGANDYLVKPFSFSELLARVRAQLRQ 154
+L ++ A PV+ ++A+++ ++ + GA DYL KPF +EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


93  c2437  c2443  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c2437     1       30       -6.197999  Hypothetical protein
c2438     -2      27       -4.372258  Hypothetical protein
c2439     -2      25       -3.957230  Hypothetical protein
c2440     -3      30       -5.664592  Hypothetical protein
c2441     -2      30       -5.234805  Hypothetical protein
c2442     -3      26       -4.431699  Hypothetical protein
c2443     -2      25       -3.288442  Shikimate transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c2437     INTIMIN  75     2e-17    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 75.5 bits (185), Expect = 2e-17
Identities = 20/60 (33%), Positives = 29/60 (48%), Gaps = 3/60 (5%)

Query: 181 QQIASTSQLIGSLLAEDMNSEQAANIARGWASSQASGVMTDWLSRFGTARITLGVDEDFS 240
QQ AS + S +N + A + A G A +QAS + WL +GTA + L +F
Sbjct: 168 QQAASLGSQLQS---RSLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFD 224


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c2439     INTIMIN  55     3e-10    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 55.5 bits (133), Expect = 3e-10
Identities = 62/263 (23%), Positives = 91/263 (34%), Gaps = 20/263 (7%)

Query: 175 IAVKAHVNDQFGNPVTHQPATFSAAPSSQMIISQNTVSTNTQGVAEVTMTPERNGSYTVK 234
I A V G + P +F+ S ++S N+ +TN G A VT+ ++ G V
Sbjct: 578 ITYTATVKKN-GVAQANVPVSFNIV-SGTAVLSANSANTNGSGKATVTLKSDKPGQVVVS 635

Query: 235 ASLANGASLEKQLEAI---DEKLTLTSSPLIGVNAPKGATLTATLT---SANGTPVEGQV 288
A A S I K ++T A T T PV Q
Sbjct: 636 AKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQE 695

Query: 289 INFSVTLEGATLSGGKVRTNSSGQAPVVLTSNKVGTYTVTASFHNGVTIQTQTTVKVTGN 348
+ F+ TL LS +T+++G A V LTS G V+A + V+
Sbjct: 696 VTFTTTL--GKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTT 753

Query: 349 PSTAHVASFIADPSTIAATNSDLSTLKATVEDGSGNL-IEGPTVYFALKSGSTTLTSLTA 407
+ I T ++ G NL G + +S + + S
Sbjct: 754 LTID------DGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIAS--- 804

Query: 408 VTDQNGIATTSVKGEITGSVTVS 430
V +G T KG T SV S
Sbjct: 805 VDASSGQVTLKEKGTTTISVISS 827



Score = 52.8 bits (126), Expect = 2e-09
Identities = 46/170 (27%), Positives = 65/170 (38%), Gaps = 7/170 (4%)

Query: 271 TLTATLTSANGTPVEGQVINFSVTLEGATLSGGKVRTNSSGQAPVVLTSNKVGTYTVTAS 330
T TAT+ NG ++F++ A LS TN SG+A V L S+K G V+A
Sbjct: 579 TYTATVKK-NGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAK 637

Query: 331 FHNGV-TIQTQTTVKVTGNPSTAHVASFIADPSTIAATNSDLSTLKATVEDGSGNLIEGP 389
+ + V + A + AD +T A D T V G +
Sbjct: 638 TAEMTSALNANAVIFVDQ--TKASITEIKADKTTAVANGQDAITYTVKVMKG-DKPVSNQ 694

Query: 390 TVYFALKSGSTTLTSLTAVTDQNGIATTSVKGEITGSVTVSAVTSAGGMQ 439
V F G + + T TD NG A ++ G VSA S +
Sbjct: 695 EVTFTTTLGKLSNS--TEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVD 742



Score = 51.2 bits (122), Expect = 7e-09
Identities = 51/233 (21%), Positives = 89/233 (38%), Gaps = 16/233 (6%)

Query: 13 AVTDADGKAKVTLKGTKAGAHTVTASMVGGKS--EQLVVNFTADTLTAQVNLNVTEDNFI 70
A T+ GKA VTLK K G V+A S V F T + + + +
Sbjct: 612 ANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAV 671

Query: 71 ANNIGMTRLQATVTDGNGNPVEGIKVNFRGTSVTLSSTSVETDDQVFAEILVTSTEVGLK 130
AN V PV +V F T LS+++ +TD +A++ +TST G
Sbjct: 672 ANGQDAITYTVKVMK-GDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKS 730

Query: 131 TVSASLADKPTEVISRLLN----AKVDVNSATI----TSQEIPEGQVMVAQDIAVKAHVN 182
VSA ++D +V + + +D + I ++P + Q + N
Sbjct: 731 LVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGN 790

Query: 183 DQFGNPVTHQPATFSAAPSSQMIISQNTVSTNTQGVAEVTMTPERNGSYTVKA 235
++ + A S Q+ + + +T + V + + +YT+
Sbjct: 791 GKYTWRSANPAIASVDASSGQVTLKEKGTTTIS-----VISSDNQTATYTIAT 838



Score = 40.1 bits (93), Expect = 2e-05
Identities = 35/213 (16%), Positives = 63/213 (29%), Gaps = 18/213 (8%)

Query: 4 NFTLSDGDKAVTDADGKAKVTLKGTKAGAHTVTASMVGGKSE--QLVVNFTADTLTAQVN 61
TD +G AKVTL T G V+A + + V F N
Sbjct: 701 TLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGN 760

Query: 62 LNVTEDNFIANNIGMTRLQATVTDGNGN-PVEGIKVNFRGTSVTLSSTSVETDDQVFAEI 120
+ + + + + G N G + S + SV+
Sbjct: 761 IEI-----VGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQ---- 811

Query: 121 LVTSTEVGLKTVSASLADKPTEVISRLLNAKVDVNSATITSQEIPEGQVMVAQDIAVKAH 180
VT E G T+S +D T + + + ++ + V ++
Sbjct: 812 -VTLKEKGTTTISVISSDNQT--ATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGG--- 865

Query: 181 VNDQFGNPVTHQPATFSAAPSSQMIISQNTVST 213
N + + + AA + S T+ +
Sbjct: 866 KLPSSQNELENVFKAWGAANKYEYYKSSQTIIS 898


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c2440     INTIMIN  28     0.022    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 27.7 bits (61), Expect = 0.022
Identities = 22/129 (17%), Positives = 46/129 (35%), Gaps = 6/129 (4%)

Query: 11 KISAIDYSQNINGDYKATVTGGGEGIATLIPVLNGVHQAGLSTTIEFISAETRPMTGTVS 70
K+S + NG K T+T G + + ++ V + +EF G +
Sbjct: 704 KLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFF-TTLTIDDGNIE 762

Query: 71 VNSANLPTASFPSQGFTGAYYQLNNDNFAPGKTAADYSFSSSASWVGVDATGKVTFKNDG 130
+ + P+ L + G + ++ A ++G+VT K G
Sbjct: 763 IVGTGV-KGKLPTVWLQYGQVNL---KASGGNGKYTWRSANPAIASVDASSGQVTLKEKG 818

Query: 131 DSNTVIITA 139
+ T+ + +
Sbjct: 819 -TTTISVIS 826


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c2443     TCRTETB  33     0.002    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 33.3 bits (76), Expect = 0.002
Identities = 39/259 (15%), Positives = 96/259 (37%), Gaps = 18/259 (6%)

Query: 79 LGGVIFGHFGDRLGRKRMLMLTVWMMGIATALIGILPSFSTIGWWAPILLVTLRAIQGFA 138
+G ++G D+LG KR+L+ + + + + + SF ++ I+ ++ A
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSL----LIMARFIQGAGAAA 119

Query: 139 VGGEWGGAALLSVESAPKNKK-AFYSSGVQVGYGVGLLLSTGLVSLISMMTTDEQFLSWG 197
+ + K S V +G GVG + + I
Sbjct: 120 FPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI------------H 167

Query: 198 WRIPFLFSIVLVLGALWVRNGMEESAEFEQQQHNQAAAKKRIPVIEALLRHPGAFLKIIA 257
W L ++ ++ ++ +++ + + + ++ +L + +
Sbjct: 168 WSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLI 227

Query: 258 LRLCELLTMYIVTAFALNYSTQNMGLPRELFLNIGLLVGGLSCLTIPCFAWLADRFGRRR 317
+ + L +++ + + GL + + IG+L GG+ T+ F + +
Sbjct: 228 VSVLSFL-IFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDV 286

Query: 318 VYITGALIGTLSAFPFFMA 336
++ A IG++ FP M+
Sbjct: 287 HQLSTAEIGSVIIFPGTMS 305


94  c2596  c2605  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c2596     -3      12       2.716228   Hypothetical chaperone protein yegD
c2597     -3      12       3.088008   Hypothetical protein yegI
c2598     -3      16       3.805905   Hypothetical protein yegK
c2599     -3      16       3.811782   Hypothetical protein yegL
c2600     -2      17       3.894588   Hypothetical protein yegM precursor
c2601     -2      18       3.860459   Hypothetical protein yegN
c2602     -2      15       2.620287   Hypothetical protein yegO
c2603     -2      13       -2.841182  Hypothetical transport protein yegB
c2604     -1      21       -5.561590  Sensor protein baeS
c2605     -1      31       -9.057045  Transcriptional Regulatory protein baeR
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2596     SHAPEPROTEIN  49     1e-08    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 49.4 bits (118), Expect = 1e-08
Identities = 32/129 (24%), Positives = 58/129 (44%), Gaps = 20/129 (15%)

Query: 153 AMMLH-IRQQAQAQLPEAITQAVIGRPINFQGLGGDEANTQAQGILERAAKRAGFKDVVF 211
M+ H I+Q + ++ P+ + + + I E +A+ AG ++V
Sbjct: 89 KMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQV-------ERRAIRE-SAQGAGAREVFL 140

Query: 212 QYEPVAAGLDYEATLQEEKRVLVVDIGGGTTDCSLLLMGPQWRSRLDREASLLGHSGCRI 271
EP+AA + + E +VVDIGGGTT+ +++ + ++ S RI
Sbjct: 141 IEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLN-----------GVVYSSSVRI 189

Query: 272 GGNDLDIAL 280
GG+ D A+
Sbjct: 190 GGDRFDEAI 198



Score = 36.7 bits (85), Expect = 1e-04
Identities = 33/137 (24%), Positives = 56/137 (40%), Gaps = 23/137 (16%)

Query: 353 RLSYRLV---RSAEESKIALSSV--AETRASLPFISDELAT------LISQQGLESALSQ 401
R +Y + +AE K + S + + LA ++ + AL +
Sbjct: 203 RRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQE 262

Query: 402 PLARILEQVQLALDNAQEKPDV--------IYLTGGSARSPLIKKALAEQLPGIPVAGGD 453
PL I+ V +AL+ Q P++ + LTGG A + + L E+ GIPV +
Sbjct: 263 PLTGIVSAVMVALE--QCPPELASDISERGMVLTGGGALLRNLDRLLMEET-GIPVVVAE 319

Query: 454 D-FGSVTAGLARWAEVV 469
D V G + E++
Sbjct: 320 DPLTCVARGGGKALEMI 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c2600     RTXTOXIND  52     4e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 51.8 bits (124), Expect = 4e-09
Identities = 48/369 (13%), Positives = 106/369 (28%), Gaps = 87/369 (23%)

Query: 53 SYKSRWVIVIVVVIAAIAAFWFWQGRNDSQSAAPG-----ATKQAQQSPAGG-------R 100
S + R V ++ IA G+ + + A G + + +
Sbjct: 54 SRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 101 RGMRA---DPLA---PVQAATAVEQAVPRYLTGLGTITAANTVTVRSRVDG--QLMALHF 152
G D L + A + L T ++ ++ +L
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 153 QEGQQVKAGDLLAEI------------DPSQFKVALAQAQGQLA-------KDKATLANA 193
Q V ++L Q ++ L + + + + +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 194 RRDLARYQQLAKTNLVSRQELDAQQALVSETEGTIKADEASVA----------------- 236
+ L + L +++ + Q+ E ++ ++ +
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 237 --------------------------SAQLQLDWSRITAPVDGRV-GLKQVDVGNQISSG 269
+ + S I APV +V LK G +++
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 270 DTTGIVVITQTHPIDLVFTLPESDIATVVQAQKAGKPLVVEAWDRTNSKKL-SEGTLLSL 328
+T +V++ + +++ + DI + Q A + VEA+ T L + ++L
Sbjct: 354 ETL-MVIVPEDDTLEVTALVQNKDIGFINVGQNA--IIKVEAFPYTRYGYLVGKVKNINL 410

Query: 329 DNQIDATTG 337
D D G
Sbjct: 411 DAIEDQRLG 419


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2601     ACRIFLAVINRP  920    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 920 bits (2379), Expect = 0.0
Identities = 300/1036 (28%), Positives = 513/1036 (49%), Gaps = 29/1036 (2%)

Query: 13 SRLFIMRPVATTLLMVAILLAGIIGYRALPVSALPEVDYPTIQVVTLYPGASPDVMTSAV 72
+ FI RP+ +L + +++AG + LPV+ P + P + V YPGA + V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 73 TAPLERQFGQMSGLKQMSSQS-SGGASVITLQFQLTLPLDVAEQEVQAAINAATNLLPSD 131
T +E+ + L MSS S S G+ ITL FQ D+A+ +VQ + AT LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 132 LPNPPVYSKVNPADPPIMTLAVTSTAMPMTQVE--DMVETRVAQKISQISGVGLVTLSGG 189
+ + S + +M S TQ + D V + V +S+++GVG V L G
Sbjct: 122 VQQQGI-SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 190 QRPAVRVKLNAQAIAALGLTSETVRTAITGANVNSAKGSLDGP------SRAVTLSANDQ 243
Q A+R+ L+A + LT V + N A G L G ++ A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 244 MQSAEEYRQLII-AYQNGAPIRLGDVATVEQGAENSWLGAWANKEQAIVMNVQRQPGANI 302
++ EE+ ++ + +G+ +RL DVA VE G EN + A N + A + ++ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 303 ISTADSIRQMLPQLTESLPKSVKVTVLSDRTTNIRASVDDTQFELMMAIALVVMIIYLFL 362
+ TA +I+ L +L P+ +KV D T ++ S+ + L AI LV +++YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 363 RNIPATIIPGVAVPLSLIGTFAVMVFLDFSINNLTLMALTIATGFVVDDAIVVIENISRY 422
+N+ AT+IP +AVP+ L+GTFA++ +SIN LT+ + +A G +VDDAIVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 423 I-EKGEKPLAAALKGAGEIGFTIISLTFSLIAVLIPLLFMGDIVGRLFREFAITLAVAIL 481
+ E P A K +I ++ + L AV IP+ F G G ++R+F+IT+ A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 482 ISAVVSLTLTPMMCARML---SQESLRKQNRFSRASEKMFDRIIAAYGRGLAKVLNHPWL 538
+S +V+L LTP +CA +L S E + F FD + Y + K+L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 539 TLSVALSTLLLSVLLWVFIPKGFFPVQDNGIIQGTLQAPQSSSFANMAQRQRQVADVILQ 598
L + + V+L++ +P F P +D G+ +Q P ++ + QV D L+
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 599 DPA--VQSLTSFVGVDGTNPSLNSARLQINLKPLDERDDR---VQKVIARLQTAVDKVPG 653
+ V+S+ + G + + N+ ++LKP +ER+ + VI R + + K+
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR- 658

Query: 654 VDLFLQPTQDLTIDTQVSRTQYQFTLQ---ATSLDALSTWVPQLMEKLQQLP-QLSDVSS 709
D F+ P I + T + F L DAL+ QL+ Q P L V
Sbjct: 659 -DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 710 DWQDKGLVAYVNVDRDSASRLGISMADVDNALYNAFGQRLISTIYTQANQYRVVLEHNTE 769
+ + + VD++ A LG+S++D++ + A G ++ + ++ ++ + +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 770 NTPGLAALDTIRLTSSDGGVVPLSSIAKIEQRFAPLSINHLDQFPVTTISFNVPDNYSLG 829
+D + + S++G +VP S+ + + + P I S G
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 830 DAVQAIMDTEKTLNLPVDITTQFQGSTLAFQSALGSTVWLIVAAVVAMYIVLGILYESFI 889
DA A+M+ + LP I + G + + + L+ + V +++ L LYES+
Sbjct: 838 DA-MALMENLAS-KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 890 HPITILSTLPTAGVGALLALMIAGSELDVIAIIGIILLIGIVKKNAIMMIDFALAAEREQ 949
P++++ +P VG LLA + + DV ++G++ IG+ KNAI++++FA ++
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 950 GMSPREAIYQACLLRFRPILMTTLAALLGALPLMLSTGVGAELRRPLGIGMVGGLIVSQV 1009
G EA A +R RPILMT+LA +LG LPL +S G G+ + +GIG++GG++ + +
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1010 LTLFTTPVIYLLFDRL 1025
L +F PV +++ R
Sbjct: 1016 LAIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2602     ACRIFLAVINRP  915    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 915 bits (2367), Expect = 0.0
Identities = 288/1035 (27%), Positives = 503/1035 (48%), Gaps = 36/1035 (3%)

Query: 6 LFIYRPVATILLSVAITLCGILGFRMLPVAPLPQVDFPVIMVSASLPGASPETMASSVAT 65
FI RP+ +L++ + + G L LPVA P + P + VSA+ PGA +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERSLGRIAGVSEMTSSS-SLGSTRIILQFDFDRDINGAARDVQAAINAAQSLLPSGMP 124
+E+++ I + M+S+S S GS I L F D + A VQ + A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKANPSDAPIMILTLTSDT--YSQGELYDFASTQLAPTISQIDGVGDVDVGGSSL 182
+ S + +M+ SD +Q ++ D+ ++ + T+S+++GVGDV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVGLNPQALFNQGVSLDDVRTAISNANVRKPQG------ALEDGTHRWQIQTNDELK 236
A+R+ L+ L ++ DV + N + G AL I K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAAEYQPLIIHYN-NGGAVRLGDVATVTDSVQDVRNAGMTNAKPAILLMIRKLPEANIIQ 295
E+ + + N +G VRL DVA V ++ N KPA L I+ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDSIRARLPELQSTIPAAIDLQIAQDRSPTIRASLEEVEQTLIISVALVILVVFLFLRS 355
T +I+A+L ELQ P + + D +P ++ S+ EV +TL ++ LV LV++LFL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATIIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHL- 414
RAT+IP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGMKPLQAALQGTREVGFTVLSMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIGIS 474
E + P +A + ++ ++ +++ L AVF+P+ GG G + R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LLVSLTLTPMMCGWMLKASKPREQKRLRGFG----RMLVALQQGYGKSLKWVLNHTRLVG 530
+LV+L LTP +C +LK + GF Y S+ +L T
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 AVLLGTIALNIWLYISIPKTFFPEQDTGVLMGGIQADQSISFQ----AMRGKLQDFMKII 586
+ +A + L++ +P +F PE+D GV + IQ + + + ++K
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 587 RD-DPAVDNVTGFT-GGSRVNSGMMFITLKPRGERS---ETAQQIIDRLRKKLAKEPGAN 641
+ +V V GF+ G N+GM F++LKP ER+ +A+ +I R + +L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 642 LFLMAVQDIRVGGRQANASYQYTLLSDDLAALREWEPKIRKKLATL-----PELADVNSD 696
+ + I G ++ L D + + R +L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 697 QEDNGAEMNLIYDRDTMARLGIDVQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRY 756
++ A+ L D++ LG+ + N ++ A G ++ K+ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 757 TQDISALEKMFVINNEGKAIPLSYFAKWQPANAPLSVNHQGLSAASTISFNLPTGKSLSD 816
++K++V + G+ +P S F + + I G S D
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 817 ASAAIDRAMTQLGVPSTVRGSFAGTAQVFQETMNSQVILIIAAIATVYIVLGILYESYVH 876
A A ++ ++L P+ + + G + + + N L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 877 PLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRHGN 936
P++++ +P VG LLA LFN + ++G++ IG+ KNAI++V+FA +
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 937 LTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQLL 996
EA A +R RPI+MT+LA + G LPL +S G GS + +GI ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 997 TLYTTPVVYLFFDRL 1011
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031



Score = 80.7 bits (199), Expect = 2e-17
Identities = 76/448 (16%), Positives = 161/448 (35%), Gaps = 26/448 (5%)

Query: 592 VDNVTGFTGGS-RVNSGMMFITLKPRGERSETAQQIIDRLRKKLAKEPGANLFLMAVQDI 650
+DN+ + S S + +T + + Q+ ++L+ P + Q I
Sbjct: 72 IDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE----VQQQGI 127

Query: 651 RVGGRQANASYQYTLLSDDLAALREW-----EPKIRKKLATLPELADVNSDQEDNGAE-- 703
V ++ +SD+ ++ ++ L+ L + DV GA+
Sbjct: 128 SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQL----FGAQYA 183

Query: 704 MNLIYDRDTMARLGID----VQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRYTQD 759
M + D D + + + + + + T P Q + R+
Sbjct: 184 MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNP 243

Query: 760 ISALEKMFVINNEGKAIPLSYFAK--WQPANAPLSVNHQGLSAASTISFNLPTGKSLSDA 817
+ +N++G + L A+ N + G AA +L D
Sbjct: 244 EEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL-DT 302

Query: 818 SAAIDRAMTQL--GVPSTVRGSFA-GTAQVFQETMNSQVILIIAAIATVYIVLGILYESY 874
+ AI + +L P ++ + T Q +++ V + AI V++V+ + ++
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 875 VHPLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRH 934
L +P +G L F + + + G++L IG++ +AI++V+
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 935 GNLTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQ 994
L P+EA ++ ++ + +P+ GG + + ITIV + +S
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 995 LLTLYTTPVVYLFFDRLRLRFSRKPKQA 1022
L+ L TP + + + K
Sbjct: 483 LVALILTPALCATLLKPVSAEHHENKGG 510


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c2603     TCRTETB  126    9e-34    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 126 bits (317), Expect = 9e-34
Identities = 98/435 (22%), Positives = 191/435 (43%), Gaps = 25/435 (5%)

Query: 20 FMQSLDTTIVNTALPSMAQSLGESPLHMHMVIVSYVLTVAVMLPASGWLADKVGVRNIFF 79
F L+ ++N +LP +A + P + V +++LT ++ G L+D++G++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 TAIVLFTLGSLFCALSGTLNELL-LARALQGVGGAMMVPVGRLTVMKIVPREQYMAAMTF 138
I++ GS+ + + LL +AR +QG G A + + V + +P+E A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQIGPLLGPALGGLLVEYASWHWIFLINIPVGIIGAITTLM-LMPNYTMQTRRFDL 197
+ +G +GPA+GG++ Y HW +L+ IP+ I + LM L+ FD+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 SGFLLLAVGMAVLTLALDGSKGTGFSPLAIAGLVAVGVVALVLYLLHAQNNNRALFSLKL 257
G +L++VG+ L F+ + V V++ ++++ H + L
Sbjct: 202 KGIILMSVGIVFFML---------FTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FRTRTFSLGLAGSFAGRIGSGMLPFMTPVFLQIGFGFSPFHAG-LMMIPMVLGSMGMKRI 316
+ F +G+ M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 VVQVVNRFGYRRVLVATTLGLSLVTLLFMTTALL----GWYYVLPFVLFLQGMVNSTRFS 372
+V+R G VL +G++ +++ F+T + L W+ + V L G+ S +
Sbjct: 313 GGILVDRRGPLYVL---NIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL--SFTKT 367

Query: 373 SMNTLTLKDLPDNLASSGNSLLSMIMQLSMSIGVTIAGLLLGLFGSQHVSVDSGTTQTVF 432
++T+ L A +G SLL+ LS G+ I G LL + + Q+ +
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTY 427

Query: 433 MYT--WLSMASIIAL 445
+Y+ L + II +
Sbjct: 428 LYSNLLLLFSGIIVI 442


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2604     BCTERIALGSPF  34     0.001    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F

signature.
Length = 408

Score = 34.0 bits (78), Expect = 0.001
Identities = 28/95 (29%), Positives = 36/95 (37%), Gaps = 20/95 (21%)

Query: 164 RQTSWLIVALSTLLAALATF------PLARGLLAPVKRLVDGTHKLAAGDFTTRVAPTSE 217
RQ + L+ A L AL P L+A V+ V H LA + P S
Sbjct: 75 RQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD---AMKCFPGSF 131

Query: 218 DEL-----------GRLAEDFNQLASTLEKNQQMR 241
+ L G L N+LA E+ QQMR
Sbjct: 132 ERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c2605     HTHFIS  76     6e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.6 bits (186), Expect = 6e-18
Identities = 28/136 (20%), Positives = 65/136 (47%), Gaps = 1/136 (0%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLISHGDQVLPYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + S+ + ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDIPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVARVKTILRRCK 129
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L K
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 130 PQRELQQQDAESPLII 145
+ + D++ + +
Sbjct: 124 RRPSKLEDDSQDGMPL 139


95  c2665  c2672  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c2665     0       19       1.556463   Penicillin-binding protein 7 precursor
c2666     1       19       2.655570   Hypothetical protein
c2667     0       18       2.741638   Hypothetical protein yohC
c2668     1       18       2.177783   Hypothetical protein yohD
c2669     1       18       2.077375   Hypothetical protein
c2670     0       16       2.113095   Putative conserved protein
c2671     0       15       2.055580   Putative channel/filament proteins
c2672     0       15       0.800942   Hypothetical protein yohI
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
c2665     BLACTAMASEA  44     5e-07    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 43.6 bits (103), Expect = 5e-07
Identities = 42/195 (21%), Positives = 76/195 (38%), Gaps = 18/195 (9%)

Query: 4 MPKFRVSLFSLALMLAVPFAPQAVAKTAAATTASQPEIASGSAMI-VDLNTNKVIYSNHP 62
M R+ + SL + +P A A + S+ +++ MI +DL + + + +
Sbjct: 1 MRYIRLCIISL--LATLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRA 58

Query: 63 DLVRPIASISKLMTAMVVLDARLPLDEKLKVDISQTPEMKGVYSRV---RLNSEISRKDM 119
D P+ S K++ VL DE+L+ I + YS V L ++ ++
Sbjct: 59 DERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGEL 118

Query: 120 LLLALMSSENRAAASLAHHYPGGYKAFIKAMNAKAKSLGMNNTRFV--EPTGLS-----V 172
A+ S+N +AA+L GG + A + +G N TR E
Sbjct: 119 CAAAITMSDN-SAANLLLATVGG----PAGLTAFLRQIGDNVTRLDRWETELNEALPGDA 173

Query: 173 HNVSTARDLTKLLIA 187
+ +T + L
Sbjct: 174 RDTTTPASMAATLRK 188


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2668     BCTERIALGSPF  29     0.015    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F

signature.
Length = 408

Score = 28.6 bits (64), Expect = 0.015
Identities = 5/34 (14%), Positives = 16/34 (47%), Gaps = 2/34 (5%)

Query: 164 WLHNLDQHLKHW-VWLILVVVL-VVGVRWWLKRS 195
L + ++ + W++L ++ + R L++
Sbjct: 215 VLMGMSDAVRTFGPWMLLALLAGFMAFRVMLRQE 248


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2669     DHBDHDRGNASE  64     2e-15    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 63.9 bits (155), Expect = 2e-15
Identities = 39/128 (30%), Positives = 59/128 (46%), Gaps = 10/128 (7%)

Query: 3 KQGQGGRIINITSVHEHTPLPDASAYTAAKHALGGLTKAMALELVRHKILVNAVAPGAIA 62
+ G I+ + S P +AY ++K A TK + LEL + I N V+PG+
Sbjct: 132 MDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGSTE 191

Query: 63 TPMN-----GMDGGD--VKPDAEP---SIPLRRFGTTHEIASLVVWLCSEGANYTTGQSL 112
T M +G + +K E IPL++ +IA V++L S A + T +L
Sbjct: 192 TDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHITMHNL 251

Query: 113 IVDGGFML 120
VDGG L
Sbjct: 252 CVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2670     DHBDHDRGNASE  36     9e-06    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 35.8 bits (82), Expect = 9e-06
Identities = 25/98 (25%), Positives = 44/98 (44%), Gaps = 1/98 (1%)

Query: 3 QVAIITASDSGIGKECALLLAQQGFDIGITWHSDEEGAKDTAREVVSHGVRAEIVQLDLG 62
++A IT + GIG+ A LA QG I ++ E+ K + AE D+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKA-EARHAEAFPADVR 67

Query: 63 NLPEGAQALEKLIQRLGRIDVLVNNAGAMTKAPFLDMA 100
+ + ++ + +G ID+LVN AG + ++
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLS 105


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2672     SHAPEPROTEIN  30     0.018    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein

signature.
Length = 347

Score = 29.7 bits (67), Expect = 0.018
Identities = 32/127 (25%), Positives = 53/127 (41%), Gaps = 5/127 (3%)

Query: 144 GAKAMREAVPAHLPVSVKVRLGWDSGEK-KFEIADAVQQAGATELVVHGRTKEQGY-RAE 201
G EA+ ++ + +G + E+ K EI A E+ V GR +G R
Sbjct: 190 GGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGF 249

Query: 202 HIDWQAIGE-IRQRLNIPVIANGEIWDWQSAQQCMAISGCDAVMIGRGALNIPNLSRVVK 260
++ I E +++ L V A + + IS V+ G GAL + NL R++
Sbjct: 250 TLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGAL-LRNLDRLL- 307

Query: 261 YNEPRMP 267
E +P
Sbjct: 308 MEETGIP 314


96  c2726  c2730  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c2726     0       13       1.239139   Hypothetical protein yejM
c2727     0       16       1.917692   *Hypothetical protein
c2728     0       18       3.325154   Hypothetical protein
c2729     -1      19       3.245878   Hypothetical protein
c2730     0       21       3.638771   Nitrate/nitrite response regulator protein narP
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
c2726     IGASERPTASE  30     0.027    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.027
Identities = 19/70 (27%), Positives = 28/70 (40%), Gaps = 6/70 (8%)

Query: 503 LHVSTPASEYSQGQ-DLF---NPQRRHYWVTAADNDTLAITTPKKTLVLNNNGKYRTYNL 558
L V+ E + + LF QR H V+ +T+ + K L N NG+Y YN
Sbjct: 926 LQVADKTGEPNHNELTLFDASKAQRDHLNVSLV-GNTVDLGAWKYKLR-NVNGRYDLYNP 983

Query: 559 RGERVKDEKP 568
E+
Sbjct: 984 EVEKRNQTVD 993


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c2727     PRTACTNFAMLY  26     0.005    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 26.2 bits (57), Expect = 0.005
Identities = 10/35 (28%), Positives = 14/35 (40%)

Query: 2 WSSAINSRNNVTTDAGAGFEQTLTGLTLGIDSRFS 36
W R + AG F+Q + G LG D +
Sbjct: 650 WGRGFAQRQQLDNRAGRRFDQKVAGFELGADHAVA 684


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c2729     PERTACTIN  27     0.024    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 27.0 bits (59), Expect = 0.024
Identities = 15/43 (34%), Positives = 26/43 (60%), Gaps = 2/43 (4%)

Query: 40 VFAVIEKGGLLEV--KATGDFKIFVTDTGARPAAGDNLTLVTT 80
VFA + L V A+G +++V ++G+ PA+G+ + LV T
Sbjct: 484 VFADLGLSDKLVVMRDASGQHRLWVRNSGSEPASGNTMLLVQT 526


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c2730     HTHFIS  65     2e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 64.9 bits (158), Expect = 2e-14
Identities = 22/113 (19%), Positives = 48/113 (42%), Gaps = 2/113 (1%)

Query: 19 VMIVDDHPLMRRGVRQLLELDSGFEVVAEAGDGASAIDLANRLDIDVILLDLNMKGMSGL 78
+++ DD +R + Q L +G++V + A+ D D+++ D+ M +
Sbjct: 6 ILVADDDAAIRTVLNQALS-RAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 79 DTLNALRRDGVTAQIIILTVSDASSDVFALIDAGADGYLLKDSDPEVLLEAIR 131
D L +++ +++++ + + GA YL K D L+ I
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


97  c2758  c2763  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c2758     -1      11       -1.776468  Outer membrane protein C precursor
c2759     -1      11       -1.660740  Putative sensor-like histidine kinase yojN
c2760     -1      12       -1.394835  Capsular synthesis regulator component B
c2761     -1      13       -0.608118  Sensor protein rcsC
c2762     -1      16       0.051678   Sensor protein atoS
c2763     -1      14       0.892051   Acetoacetate metabolism Regulatory protein atoC
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
c2758     ECOLIPORIN  539    0.0      E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 539 bits (1390), Expect = 0.0
Identities = 258/386 (66%), Positives = 295/386 (76%), Gaps = 14/386 (3%)

Query: 1 MKVKVLSLLVPALLVAGAANAAEVYNKDGNKLDLYGKVDGLHYFSDDKSVDGDQTYMRLG 60
MK KVL+L++PALL AGAA+AAE+YNKDGNKLDLYGKVDGLHYFSDD S DGDQTYMR+G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 FKGETQVTDQLTGYGQWEYQIQGNAPESE-NNSWTRVAFAGLKFQDIGSFDYGRNYGVVY 119
FKGETQ+ DQLTGYGQWEY +Q N E E NSWTR+AFAGLKF D GSFDYGRNYGV+Y
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYGVLY 120

Query: 120 DVTSWTDVLPEFGGDTYG-SDNFMQQRGNGFATYRNTDFFGLVDGLNFAVQYQGQNGSVS 178
DV WTD+LPEFGGD+Y +DN+M R NG ATYRNTDFFGLVDGLNFA+QYQG+N S S
Sbjct: 121 DVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQS 180

Query: 179 GENDPDFTGHGITNNGRKALRQNGDGVGGSITYDY-EGFGVGAAVSSSKRTWDQNNT-GL 236
++ G NNG NGDG G S TYD GF GAA ++S RT +Q N G
Sbjct: 181 ADDVN--IGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGGT 238

Query: 237 IGTGDRAETYTGGLKYDANNIYLAAQYTQTYNATRVGSL------GWANKAQNFEAVAQY 290
I GD+A+ +T GLKYDANNIYLA Y++T N T G G ANK QNFE AQY
Sbjct: 239 IAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQY 298

Query: 291 QFDFGLRPSVAYLQSKGKNLGVVAGRNYDDEDILKYVDVGATYYFNKNMSTYVDYKINLL 350
QFDFGLRP+V++L SKGK+L N DD+D++KY DVGATYYFNKN STYVDYKINLL
Sbjct: 299 QFDFGLRPAVSFLMSKGKDLT-YNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLL 357

Query: 351 D-DNQFTRAAGINTDDIVALGLVYQF 375
D D+ F + AGI+TDDIVALG+VYQF
Sbjct: 358 DDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c2760     HTHFIS  49     7e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 48.7 bits (116), Expect = 7e-09
Identities = 26/145 (17%), Positives = 60/145 (41%), Gaps = 20/145 (13%)

Query: 16 MNNMNVIIADDHPIVLFGIRKSLEQIEWVNVVGEFEDSTALINNLPKLDAHVLITDLSMP 75
M +++ADD + + ++L + + + ++ L + D +++TD+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRI--TSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 76 GDKYGDGITLIKYIKRHFPSLSIIVLTMNNNPAILSAVLDLDIEGIVLKQGA------PT 129
+ L+ IK+ P L ++V++ N +A+ ++GA P
Sbjct: 59 D---ENAFDLLPRIKKARPDLPVLVMSAQNTFM--TAIKA-------SEKGAYDYLPKPF 106

Query: 130 DLPKALAALQKGKKFTPESVSRLLE 154
DL + + + + S+L +
Sbjct: 107 DLTELIGIIGRALAEPKRRPSKLED 131


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c2761     HTHFIS  81     6e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 81.0 bits (200), Expect = 6e-18
Identities = 29/106 (27%), Positives = 47/106 (44%)

Query: 827 ILVVDDHPINRRLLADQLGSLGYQCKTANDGVDALNVLNKNHIDIVLSDVNMPNMDGYRL 886
ILV DD R +L L GY + ++ + D+V++DV MP+ + + L
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 887 TQRIRQLGLTLPVIGVTANALAEEKQRCLESGMDSCLSKPVTLDVI 932
RI++ LPV+ ++A + E G L KP L +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTEL 111


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c2763     HTHFIS  562    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 562 bits (1449), Expect = 0.0
Identities = 181/484 (37%), Positives = 269/484 (55%), Gaps = 35/484 (7%)

Query: 1 MTAINRILIVDDEDNVRRMLSTAFALQGFETHCANNGRTALHLFADIHPDVVLMDIRMPE 60
MT IL+ DD+ +R +L+ A + G++ +N T A D+V+ D+ MP+
Sbjct: 1 MTGA-TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPD 59

Query: 61 MDGIKALKEMRSHETRTPVILMTAYAEVETAVEALRCGAFDYVIKPFDLDELNLIVQRAL 120
+ L ++ PV++M+A TA++A GA+DY+ KPFDL EL I+ RAL
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 121 QLQSMKKEIRHLHQALSTSWQWGH-ILTNSPAMMDICKDTAKIALSQASVLISGESGTGK 179
+ L Q G ++ S AM +I + A++ + +++I+GESGTGK
Sbjct: 120 AEP------KRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGK 173

Query: 180 ELIARAIHYNSRRAKGPFIKVNCAALPESLLESELFGHEKGAFTGAQTLRQGLFERANEG 239
EL+ARA+H +R GPF+ +N AA+P L+ESELFGHEKGAFTGAQT G FE+A G
Sbjct: 174 ELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGG 233

Query: 240 TLLLDEIGEMPLVLQAKLLRILQEREFERIGGHQTIKVDIRIIAATNRDLQAMVKEGTFR 299
TL LDEIG+MP+ Q +LLR+LQ+ E+ +GG I+ D+RI+AATN+DL+ + +G FR
Sbjct: 234 TLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFR 293

Query: 300 EDLFYRLNVIHLILPPLRDRREDISLLANHFLQKFSSENQRDIIDIDPMAMSLLTAWSWP 359
EDL+YRLNV+ L LPPLRDR EDI L HF+Q+ E + D A+ L+ A WP
Sbjct: 294 EDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLD-VKRFDQEALELMKAHPWP 352

Query: 360 GNIRELSNVIERAVVMNSGPIIFSEDLPPQIRQPV---------CNAGEAKTAPVGERN- 409
GN+REL N++ R + +I E + ++R + +G + E N
Sbjct: 353 GNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENM 412

Query: 410 ----------------LKEEIKRVEKRIIMEVLEQQEGNRTRTALMLGISRRALMYKLQE 453
+ +E +I+ L GN+ + A +LG++R L K++E
Sbjct: 413 RQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRE 472

Query: 454 YGID 457
G+
Sbjct: 473 LGVS 476


98  c2902  c2906  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c2902     -1      35       -9.721984  Multidrug resistance protein Y
c2903     -1      35       -9.358557  Hypothetical protein
c2904     -1      34       -8.401045  Multidrug resistance protein K
c2905     -1      31       -7.597590  Positive transcription regulator evgA
c2906     -1      31       -7.209354  Sensor protein evgS precursor
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c2902     TCRTETB  122    2e-32    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 122 bits (308), Expect = 2e-32
Identities = 92/404 (22%), Positives = 167/404 (41%), Gaps = 17/404 (4%)

Query: 19 VTIALSLATFMQMLDSTISNVAIPTISGFLGASTDEGTWVITSFGVANAIAIPVTGRLAQ 78
+ I L + +F +L+ + NV++P I+ WV T+F + +I V G+L+
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 79 RIGELRLFLLSVTFFSLSSLMCSLS-TNLDVLIFFRVVQGLMAGPLIPLSQSLLLRNYPP 137
++G RL L + S++ + + +LI R +QG A L ++ R P
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 138 EKRTFALALWSMTVIIAPICGPILGGYICDNFSWGWIFLINVPMGIIVLTLCLTLLKGRE 197
E R A L V + GP +GG I W +L+ +PM I+ L L +E
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKE 192

Query: 198 TETSPVKMNLPGLTLLVLGVGGLQIMLDKGRDLDWFNSSTIILLTVVSVISLISLVIWES 257
++ G+ L+ +G+ + ML F +S I +VSV+S + V
Sbjct: 193 VRIKG-HFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIR 241

Query: 258 TSENPILDLSLFKSRNFTIGIVSITCAYLFYSGAIVLMPQLLQETMGYNAIWAGLAYAPI 317
+P +D L K+ F IG++ + +G + ++P ++++ + G
Sbjct: 242 KVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFP 301

Query: 318 GIMPLLIS-PLIGRYGNKIDMRLLVTFSFLMYAVCYYWRSVTFMPTIDFTGIILPQFFQG 376
G M ++I + G ++ ++ +V + S T F II+ G
Sbjct: 302 GTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGG 361

Query: 377 FAVACFFLPLTTISFSGLPDNKFANASSMSNFFRTLSGSVGTSL 420
+ ++TI S L + S+ NF LS G ++
Sbjct: 362 LSFTK--TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c2904     RTXTOXIND  78     6e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 78.3 bits (193), Expect = 6e-18
Identities = 62/413 (15%), Positives = 123/413 (29%), Gaps = 98/413 (23%)

Query: 13 RRKYFALLAVVLFIAFSGAYAYWSMELKDMISTDDAYVTGNADPISAQVSGSVTVVNHKD 72
RR ++ F+ + + + +G + I + V + K+
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKE 114

Query: 73 TNYVRQGDILVSLDKTDATIALNKA----------------------------------- 97
VR+GD+L+ L A K
Sbjct: 115 GESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEP 174

Query: 98 -----------------KNNLANIVRQTNKLYLQDKQYSAEVASARIQ---YQQSLEDYN 137
K + Q + L + AE + + Y+
Sbjct: 175 YFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEK 234

Query: 138 RRV----PLAKQGVISKEALEHTKDTLI----------SSKAALNAAIQAYKANKALVMN 183
R+ L + I+K A+ ++ + S + + I + K +
Sbjct: 235 SRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEE--YQLV 292

Query: 184 TPLNRQPQVIEAADATKE----------AWLALKRTDIKSPVTGYIAQRSVQ-VGETVSP 232
T L + + + T + + I++PV+ + Q V G V+
Sbjct: 293 TQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTT 352

Query: 233 GQSLMAVVPARQ-MWVNANFKETQLTDVRIGQSVNIISDLYGENVVFHGRVTGINMGTGN 291
++LM +VP + V A + + + +GQ+ I + F G +G
Sbjct: 353 AETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE------AFPYTRYGYLVGK-- 404

Query: 292 AFSLLPAQNATGNWIKIVQRVPVEVSLDPKELMEH----PLRIGLSMTATIDT 340
+ + +V V +S++ L PL G+++TA I T
Sbjct: 405 -VKNINLDAIEDQRLGLVFNVI--ISIEENCLSTGNKNIPLSSGMAVTAEIKT 454


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c2905     HTHFIS  49     3e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.1 bits (117), Expect = 3e-09
Identities = 22/148 (14%), Positives = 53/148 (35%), Gaps = 31/148 (20%)

Query: 4 IIIDDHPLAIAAIRNLLIKNDIEILAELTEGGSAVQRVETLKPDIVIIDVDIPGVNGIQV 63
++ DD + L + ++ + + + + D+V+ DV +P N +
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 64 LETLRKRQYSGIIIIVSAKNDHFYGKHCADAGANGFVSKKEGMNNIIAAIEAAKNGYCYF 123
L ++K + ++++SA+N + AI+A++ G +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNT------------------------FMTAIKASEKGAYDY 101

Query: 124 ---PFSLNRFVGSLTSDQQKLDSLSKQE 148
PF L + + L ++
Sbjct: 102 LPKPFDLTE---LIGIIGRALAEPKRRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c2906     HTHFIS  79     2e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.5 bits (196), Expect = 2e-17
Identities = 30/105 (28%), Positives = 51/105 (48%)

Query: 960 SILIADDHPTNRLLLKRQLNLLGYDVDEATDGVQALHKVSMQHYDLLITDVNMPNMDGFE 1019
+IL+ADD R +L + L+ GYDV ++ ++ DL++TDV MP+ + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 1020 LTRKLREQNSSLPIWGLTANAQANEREKGLNCGMNLCLFKPLTLD 1064
L ++++ LP+ ++A K G L KP L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


99  c3234  c3244  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c3234     -2      12       2.748593   Putative transport protein
c3235     -3      14       2.391967   Hypothetical protein ygaZ
c3236     -3      16       2.574133   Hypothetical protein ygaH
c3237     -3      15       2.372564   Transcriptional repressor mprA
c3238     -2      15       2.618247   Multidrug resistance protein A
c3239     -2      16       2.555011   Multidrug resistance protein B
c3240     -1      14       0.871832   Hypothetical protein
c3241     -1      14       0.603977   Hypothetical protein
c3242     -1      12       0.739009   Hypothetical protein
c3243     -1      13       0.143382   Hypothetical protein
c3244     -1      14       -0.427397  Autoinducer-2 production protein luxS
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c3234     TCRTETB  44     7e-07    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 44.1 bits (104), Expect = 7e-07
Identities = 32/165 (19%), Positives = 70/165 (42%), Gaps = 2/165 (1%)

Query: 34 LDTIARNFSLSASSAGFIVTAAQLGYAAGLLFLVPLGDMFERRRLIVSMTLLAAGGMLIT 93
L IA +F+ +S ++ TA L ++ G L D +RL++ ++ G +I
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 94 ASSQSLA-MMILGTALTGLFSVVAQILVPLA-ATLASPDKRGKVVGTIMSGLLLGILLAR 151
S ++I+ + G + LV + A + RGK G I S + +G +
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 152 TVAGLLANLGGWRTVFWVASMLMALMALALWRGLPQMKSETHLNY 196
+ G++A+ W + + + + + + +++ + H +
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c3237     PF05272  28     0.018    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.018
Identities = 23/94 (24%), Positives = 36/94 (38%), Gaps = 12/94 (12%)

Query: 23 PYQEILLTRLCMHMQSKLLENRNKMLKAQGINETLFMALITLESQENHSIQPSELSCALG 82
P QE+ L + + L R A+G + + T + ++L ALG
Sbjct: 756 PEQELRLVETGVQGRLWALLTREGAPAAEGAAQKGYSVNTTFVTI-------ADLVQALG 808

Query: 83 -----SSRTNATRIADELEKRGWIERRESDNDRR 111
SS ++ D L + GW RE+ RR
Sbjct: 809 ADPGKSSPMLEGQVRDWLNENGWEYLRETSGQRR 842


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c3238     RTXTOXIND  79     3e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 79.1 bits (195), Expect = 3e-18
Identities = 64/412 (15%), Positives = 120/412 (29%), Gaps = 97/412 (23%)

Query: 29 LLLTLLFIIIAVAIGIYWFLVLRHFEETDDA----YVAGNQIQIMSQVSGSVTKVWADNT 84
L FI+ + I VL E A +G +I + V ++
Sbjct: 57 PRLVAYFIMGFLVIAFILS-VLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEG 115

Query: 85 DFVKEGDVLVTLDPTDARQAFEKA------------------------------------ 108
+ V++GDVL+ L A K
Sbjct: 116 ESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPY 175

Query: 109 ----------------KTALASSVRQTHQLMINSKQLQANIEVQKIALAKA-------QS 145
K ++ Q +Q +N + +A + + +S
Sbjct: 176 FQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKS 235

Query: 146 DYNRRVPLGNANLIGREELQHARDAVTSAQAQLDVAIQQYNANQAMILGTKLEDQPAVQQ 205
+ L + I + + + A +L V Q ++ IL K E Q Q
Sbjct: 236 RLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQL 295

Query: 206 AATEVRN------------------AWLALERTRIVSPMTGYVSRRAVQ-PGAQISPTTP 246
E+ + + + I +P++ V + V G ++
Sbjct: 296 FKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAET 355

Query: 247 LMAVVPA-TNMWVDANFKETQIANMRIGQPVTITTDIYGDDVKY---TGKVVGLDMGTGS 302
LM +VP + V A + I + +GQ I + + +Y GKV + +
Sbjct: 356 LMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAF-PYTRYGYLVGKVKNI-----N 409

Query: 303 AFSLLPAQNATGNWIKVVQRLPVRIELDQKQLEQYPLRIGLSTLVSVNTTNR 354
++ G V+ + + PL G++ + T R
Sbjct: 410 LDAIE--DQRLGLVFNVIISIEENCLST--GNKNIPLSSGMAVTAEIKTGMR 457


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c3239     TCRTETB  132    8e-36    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 132 bits (334), Expect = 8e-36
Identities = 97/405 (23%), Positives = 169/405 (41%), Gaps = 23/405 (5%)

Query: 20 IALSLATFMQVLDSTIANVAIPTIAGNLGSSLSQGTWVITSFGVANAISIPLTGWLAKRV 79
I L + +F VL+ + NV++P IA + + WV T+F + +I + G L+ ++
Sbjct: 17 IWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQL 76

Query: 80 GEVKLFLWSTIAFAIASWACGVS-SSLNMLIFFRVIQGIVAGPLIPLSQSLLLNNYPPAK 138
G +L L+ I S V S ++LI R IQG A L ++ P
Sbjct: 77 GIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKEN 136

Query: 139 RSIALALWSMTVIVAPICGPILGGYISDNYHWGWIFFINVPIGVAVVLMTLQTLRGRETR 198
R A L V + GP +GG I+ HW + + +P+ + + L L +E R
Sbjct: 137 RGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKEVR 194

Query: 199 TERRRIDAVGLALLVIGIGSLQIMLDRGKELDWFSSQEIIILTVVAVVAICFLIVWELTD 258
+ D G+ L+ +GI + ML F++ I +V+V++ +
Sbjct: 195 I-KGHFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIRKV 243

Query: 259 DNPIVDLSLFKSRNFTIGCLCISLAYMLYFGAIVLLPQLLQEVYGYTATWAGLASAPVGI 318
+P VD L K+ F IG LC + + G + ++P ++++V+ + G G
Sbjct: 244 TDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGT 303

Query: 319 IPVILS-PIIGRFAHKLDMRRLVTFSFIMYAVCFYWRAYTFEPGMDFGASAWPQFIQGF- 376
+ VI+ I G + ++ +V F ++ S + I F
Sbjct: 304 MSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFL-----LETTSWFMTIIIVFV 358

Query: 377 --AVACFFMPLTTITLSGLPPERLAAASSLSNFTRTLAGSIGTSI 419
++ ++TI S L + A SL NFT L+ G +I
Sbjct: 359 LGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
c3244     LUXSPROTEIN  293    e-105    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS

signature.
Length = 171

Score = 293 bits (751), Expect = e-105
Identities = 132/170 (77%), Positives = 148/170 (87%)

Query: 2 PLLDSFTVDHTRMEAPAVRVAKTMNTPHGDAITVFDLRFCVPNKEVMPERGIHTLEHLFA 61
PLLDSFTVDHTRM APAVRVAKTM TP GD ITVFDLRF PNK+++ E+GIHTLEHL+A
Sbjct: 1 PLLDSFTVDHTRMNAPAVRVAKTMQTPKGDTITVFDLRFTAPNKDILSEKGIHTLEHLYA 60

Query: 62 GFMRNHLNGNGVEIIDISPMGCRTGFYMSLIGTPDEQRVADAWKAAMEDVLKVQDQNQIP 121
GFMRNHLNG+ VEIIDISPMGCRTGFYMSLIGTP EQ+VADAW AAMEDVLKV++QN+IP
Sbjct: 61 GFMRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKVENQNKIP 120

Query: 122 ELNVYQCGTYQMHSLQEAQDIARNILERDVRINSNEELALPKEKLQELHI 171
ELN YQCGT MHSL EA+ IA+NILE V +N N+ELALP+ L+EL I
Sbjct: 121 ELNEYQCGTAAMHSLDEAKQIAKNILEVGVAVNKNDELALPESMLRELRI 170


100  c3330  c3337  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c3330     -1      11       -1.367687  Hypothetical metabolite transport protein ygcS
c3331     -1      11       -2.504304  Putative conserved protein
c3332     1       15       -3.988771  Hypothetical oxidoreductase ygcW
c3333     1       16       -3.651540  Hypothetical protein yqcE
c3334     -1      19       -4.038224  Hypothetical sugar kinase ygcE
c3335     -1      26       -5.220526  Hypothetical protein ygcF
c3336     -1      29       -6.025933  Hypothetical protein
c3337     0       28       -6.008621  Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c3330     TCRTETB  36     3e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 36.0 bits (83), Expect = 3e-04
Identities = 53/338 (15%), Positives = 123/338 (36%), Gaps = 34/338 (10%)

Query: 93 LGSLVLGWISDHIGRQKIFTFSFMLITLASFLQFFATTP-EHLIGLRILIGIGLGGDYSV 151
+G+ V G +SD +G +++ F ++ S + F + LI R + G G ++
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPAL 123

Query: 152 GHTLLAEFSPRRHRGILLGAFSVVWT----VGYVLASIAGHHFISESPEAWRWLLASAAL 207
++A + P+ +RG G + VG + + H+ W +LL +
Sbjct: 124 VMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI------HWSYLLLIPMI 177

Query: 208 PALLITLLRWGTPESPRWLLRQGRFAEAHAIVHRYFGPHVLLGDEVATATHKHIKTLF-- 265
+ + L + R +G F I+ +L + + + L
Sbjct: 178 TIITVPFLMKLLKKEVR---IKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFL 234

Query: 266 -SSRYWRRTA--------FNSVFFVCLVIPWFVIYT----WLPTIAQTIGLEDALTASLM 312
++ R+ ++ F+ V+ +I+ ++ + + L+ + +
Sbjct: 235 IFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEI 294

Query: 313 LNALLIVGALLGLVLTHLLAHRRFLLGSFLLLTATLVVMACLPSGSSLTLLLFVLFSTTI 372
+ ++ G + ++ ++ G +L + ++ S F+L +T+
Sbjct: 295 GSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLS-----VSFLTASFLLETTSW 349

Query: 373 SAVSNLVGILPAESFPTDIRSLGVGFATAMSRLGAAVS 410
+V +L SF + S V + GA +S
Sbjct: 350 FMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMS 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3332     DHBDHDRGNASE  107    1e-29    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 107 bits (267), Expect = 1e-29
Identities = 73/257 (28%), Positives = 117/257 (45%), Gaps = 11/257 (4%)

Query: 36 MDFFSLKGKTAIVTGGNSGLGQAFAMALAKAGANVFIPSFVKDNGETKEMIEK-QGVEVD 94
M+ ++GK A +TG G+G+A A LA GA++ + + E K + +
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 95 FMQVDITAEGAPQKIIAACCERFGTVDILVNNAGICKLNKVLDFGRADWDPMIDVNLTAA 154
D+ A +I A G +DILVN AG+ + + +W+ VN T
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 155 FELSYEAAKIMIPQKSGKIINICSLFSYLGGQWSPAYSATKHALAGFTKAYCDELGQYNI 214
F S +K M+ ++SG I+ + S + + AY+++K A FTK EL +YNI
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 215 QVNGIAPGYYATDI--TLATRSNPETNQRVLDH-------IPANRWGDTQDLMGAAVFLA 265
+ N ++PG TD+ +L N Q + IP + D+ A +FL
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAE-QVIKGSLETFKTGIPLKKLAKPSDIADAVLFLV 239

Query: 266 SPASNYVNGHLLVVDGG 282
S + ++ H L VDGG
Sbjct: 240 SGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c3333     TCRTETA  29     0.036    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 29.0 bits (65), Expect = 0.036
Identities = 22/103 (21%), Positives = 45/103 (43%), Gaps = 8/103 (7%)

Query: 48 GLIMSTFGIAAIILYAPSGVIADKFSHRKMITSAMIITGLLGLIMATYPPLWVMLCIQVA 107
G++++ + + G ++D+F R ++ ++ + IMAT P LWV+ ++
Sbjct: 46 GILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIV 105

Query: 108 FAITTILMLWSVSIKAASLLGD---HSEQGKIMGWMEGLRGVG 147
IT + A + + D E+ + G+M G G
Sbjct: 106 AGITG-----ATGAVAGAYIADITDGDERARHFGFMSACFGFG 143


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3337     56KDTSANTIGN  28     0.041    Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein

signature.
Length = 533

Score = 27.6 bits (61), Expect = 0.041
Identities = 17/74 (22%), Positives = 29/74 (39%), Gaps = 12/74 (16%)

Query: 30 NASWSEVLNQYQRRTDLIPNLVASIKGYSSHEQEVLEAVTLARSQANRASSDLQKTPGDE 89
+AS ++ ++ Q D + L S GY + + N+ + P +
Sbjct: 294 SASIEQIQSKIQELGDTLEELRDSFDGY------------INNAFVNQIHLNFVMPPQAQ 341

Query: 90 QKLQAWQQAQAQVT 103
Q+ QQ QAQ T
Sbjct: 342 QQQGQGQQQQAQAT 355


101  c3417  c3424  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c3417      0      17       1.084467   Prepilin peptidase dependent protein C
c3418     -1      15       0.915359   Hypothetical protein ygdB precursor
c3419     -3      11       1.477591   Prepilin peptidase dependent protein B
c3420     -3      10       1.148429   Prepilin peptidase dependent protein A
c3421     -3       8       1.128661   Hypothetical protein
c3422     -3       9       1.063083   Thymidylate synthase
c3423     -2      10       0.875893   Prolipoprotein diacylglyceryl transferase
c3424     -2      10       1.257435   Phosphoenolpyruvate-protein phosphotransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3417     BCTERIALGSPH  29     0.002    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 29.1 bits (65), Expect = 0.002
Identities = 27/114 (23%), Positives = 43/114 (37%), Gaps = 29/114 (25%)

Query: 8 QQGFSLPEVMLAMVLMVMIVTA----------------LSGFQRTLMNSLASRNQYQQLW 51
Q+GF+L E+ML ++LM + L+ F+ L Q Q +
Sbjct: 3 QRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQFF 62

Query: 52 -----RHGWQ--QTQLRAISPPA----NWQVNRMQTSQAGCVSISVTLVSPGGR 94
WQ + R + PA W R +AG V+ S ++ GG+
Sbjct: 63 GVSVHPDRWQFLVLEARDGADPAPADDGWSGYRWLPLRAGRVATSGSI--AGGK 114


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3419     PilS_PF08805  27     0.030    PilS N terminal
		>PilS_PF08805#PilS N terminal

Length = 185

Score = 27.2 bits (60), Expect = 0.030
Identities = 12/50 (24%), Positives = 25/50 (50%)

Query: 1 MSARRNRRMPVKEQGFSLLEVLIAMAISSVLLLGAARFLPALQRESLTNT 50
S+ RR +++G +L+EVL+ + + VL A + +Q ++
Sbjct: 13 FSSLSARRKKEQDKGATLMEVLLVVGVIVVLAASAYKLYSMVQSNIQSSN 62


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3420     BCTERIALGSPG  29     0.003    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 29.5 bits (66), Expect = 0.003
Identities = 9/24 (37%), Positives = 18/24 (75%)

Query: 1 MKTQRGYTLIETLVAMLILVMLSA 24
QRG+TL+E +V ++I+ +L++
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLAS 27


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3424     PHPHTRNFRASE  611    0.0      Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 611 bits (1577), Expect = 0.0
Identities = 189/571 (33%), Positives = 314/571 (54%), Gaps = 7/571 (1%)

Query: 168 QTRIRALPAAPGVAIAEGWQDATLPLMEQVYQASTLDPALERERLTGALEEAANEFRRYS 227
+I + A+ GVAIA+ + + + + S D + E E+LT ALE++ E R
Sbjct: 2 HHKITGIAASSGVAIAKAFIHLEPNV--DIEKTSITDVSTEIEKLTAALEKSKEELRAIK 59

Query: 228 KRFAAGAQKETAAIFDLYSHLLSDTRLRRELFAEVDKGSV-AEWAVKTVIEKFAEQFAAL 286
+ A + A IF + +L D L + +++ + AE+A+K V + F F ++
Sbjct: 60 DQTEASMGADKAEIFAAHLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESM 119

Query: 287 SDNYLKERAGDLRALGQRLLFHLDDANQGPNAW-PERFILVADELSATTLAELPQDRLVG 345
+ Y+KERA D+R + +R+L HL G A E +++A++L+ + A+L + + G
Sbjct: 120 DNEYMKERAADIRDVSKRVLGHLIGVETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKG 179

Query: 346 VVVRDGAANSHAAIMVRALGIPTVMGA-DIQPSVLHRRTLIVDGYRGELLVDPEPVLLQE 404
G SH+AIM R+L IP V+G ++ + H +IVDG G ++V+P ++
Sbjct: 180 FATDIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKA 239

Query: 405 YQRLISEEIELSRLAEDDVNLPAQLKSGERIKVMLNAGLSPEHEEKLGSRIDGIGLYRTE 464
Y+ + + + V P+ K G +++ N G + + L + +GIGLYRTE
Sbjct: 240 YEEKRAAFEKQKQEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTE 299

Query: 465 IPFMLQSGFPSEEEQVAQYQGMLQMFNDKPVTLRTLDVGADKQLPYMPISEE-NPCLGWR 523
+M + P+EEEQ Y+ ++Q + KPV +RTLD+G DK+L Y+ + +E NP LG+R
Sbjct: 300 FLYMDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYLQLPKELNPFLGFR 359

Query: 524 GIRITLDQPEIFLIQVRAMLRANAATGNLNILLPMVTSLDEVDEARRLIERAGREVEEMI 583
IR+ L++ +IF Q+RA+LRA + GNL ++ PM+ +L+E+ +A+ +++ ++
Sbjct: 360 AIRLCLEKQDIFRTQLRALLRA-STYGNLKVMFPMIATLEELRQAKAIMQEEKDKLLSEG 418

Query: 584 GYEIPKPRIGIMLEVPSMVFMLPHLAKRVDFISVGTNDLTQYILAVDRNNTRVANIYDSL 643
+GIM+E+PS AK VDF S+GTNDL QY +A DR N RV+ +Y
Sbjct: 419 VDVSDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPY 478

Query: 644 HPAMLRALAMIAREAEIHGIDLRLCGEMAGDPMCVAILIGLGYRHLSMNGRSVARVKYLL 703
HPA+LR + M+ + A G + +CGEMAGD + + +L+GLG SM+ S+ + L
Sbjct: 479 HPAILRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQL 538

Query: 704 RRIDFAEAENLAQRSLEAQLATEVRHQVAAF 734
++ E + AQ++L A EV V
Sbjct: 539 LKLSKEELKPFAQKALMLDTAEEVEQLVKKT 569


102  c3564  c3570  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
c3564     7       33       -8.025412   Hypothetical protein
c3565     6       36       -10.154830  Putative response regulator
c3566     6       36       -10.842081  Hypothetical protein
c3567     7       39       -11.434840  Hypothetical protein
c3568     6       40       -11.653960  Hypothetical protein
c3569     6       41       -11.633538  Hemolysin C
c3570     5       38       -10.051226  Hemolysin A
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c3564     PF06580  42     3e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 42.2 bits (99), Expect = 3e-06
Identities = 24/137 (17%), Positives = 54/137 (39%), Gaps = 24/137 (17%)

Query: 360 LSIETRRLQLRIMMSHSLPLIRADISMIERVITNLLDNAVRH----TPPEGSIRLKVWQE 415
L + + + + R+ + + D+ + ++ L++N ++H P G I LK ++
Sbjct: 229 LQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKD 288

Query: 416 DNRLHVEVADSGPGLTEDMRTHLFRRASVLCHEPSEEPRGGLGLLIVRRMLVLHGGD--- 472
+ + +EV ++G + + + G GL VR L + G
Sbjct: 289 NGTVTLEVENTGSLALK-----------------NTKESTGTGLQNVRERLQMLYGTEAQ 331

Query: 473 IRLTDSTTGACFRFFLP 489
I+L++ +P
Sbjct: 332 IKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c3565     HTHFIS  90     1e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.5 bits (222), Expect = 1e-22
Identities = 35/129 (27%), Positives = 60/129 (46%)

Query: 7 KILLMEDDYDIAALLRLNLQDEGYQIVHEADGARARLLLDKQTWDAVILDLMLPNVNGLE 66
IL+ +DD I +L L GY + ++ A + D V+ D+++P+ N +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 67 ICRYIRQMTRYLPVIIISARTSETHRVLGLEMGADDYLPKPFSIPELIARIKALFRRQEA 126
+ I++ LPV+++SA+ + + E GA DYLPKPF + ELI I +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 127 MGQNILLAG 135
+
Sbjct: 125 RPSKLEDDS 133


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c3569     RTXTOXINC  317    e-114    Gram-negative bacterial RTX toxin-activating protein C...
		>RTXTOXINC#Gram-negative bacterial RTX toxin-activating protein C signature.

Length = 170

Score = 317 bits (815), Expect = e-114
Identities = 161/170 (94%), Positives = 167/170 (98%)

Query: 55 MNMNNPLEVLGHVSWLWASSPLHRNWPVSLFAINVLPAIRANQYALLTRDNYPVAYCSWA 114
MN+N PLE+LGHVSWLWASSPLHRNWPVSLFAINVLPAI+ANQY LLTRD+YPVAYCSWA
Sbjct: 1 MNINKPLEILGHVSWLWASSPLHRNWPVSLFAINVLPAIQANQYVLLTRDDYPVAYCSWA 60

Query: 115 NLSLENEIKYLNDVTSLVAEDWTSGDRKWFIDWIAPFGDNGALYKYMRKKFPDELFRAIR 174
NLSLENEIKYLNDVTSLVAEDWTSGDRKWFIDWIAPFGDNGALYKYMRKKFPDELFRAIR
Sbjct: 61 NLSLENEIKYLNDVTSLVAEDWTSGDRKWFIDWIAPFGDNGALYKYMRKKFPDELFRAIR 120

Query: 175 VDPKTHVGKVSEFHGGKIDKQLANKIFKQYYHELITEVKNKTDFNFSLTG 224
VDPKTHVGKVSEFHGGKIDKQLANKIFKQY+HELITEVK K+DFNFSLTG
Sbjct: 121 VDPKTHVGKVSEFHGGKIDKQLANKIFKQYHHELITEVKRKSDFNFSLTG 170


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c3570     RTXTOXINA  1484   0.0      Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 1484 bits (3843), Expect = 0.0
Identities = 988/1024 (96%), Positives = 1001/1024 (97%)

Query: 1 MPTITTAQIKSTLQSAKQSSANKLHSAGQSTKDALKKAAEQTRNAGNRLILLIPKDYKGQ 60
M TITTAQIKSTLQSAKQS+ANKLHSAGQSTKDALKKAAEQTRNAGNRLILLIPKDYKGQ
Sbjct: 1 MTTITTAQIKSTLQSAKQSAANKLHSAGQSTKDALKKAAEQTRNAGNRLILLIPKDYKGQ 60

Query: 61 GSSLNDLVRTADELGIEVQYDEKNGTAITKQVFGTAEKLIGLTERGVTIFAPQLDKLLQK 120
GSSLNDLVRTADELGIEVQYDEKNGTAITKQVFGTAEKLIGLTERGVTIFAPQLDKLLQK
Sbjct: 61 GSSLNDLVRTADELGIEVQYDEKNGTAITKQVFGTAEKLIGLTERGVTIFAPQLDKLLQK 120

Query: 121 YQKAGNKLGGSAENIGDNLGKAGSVLSTFQNFLGTALSSMKIDELIKRQKSGSNVSSSEL 180
YQKAGN LGG AENIGDNLGKAG +LSTFQNFLGTALSSMKIDELIK+QKSG NVSSSEL
Sbjct: 121 YQKAGNILGGGAENIGDNLGKAGGILSTFQNFLGTALSSMKIDELIKKQKSGGNVSSSEL 180

Query: 181 AKASIELINQLVDTAASINNNVNSFSQQLNKLGSVLSNTKHLTGVGNKLQNLPNLDNIGA 240
AKASIELINQLVDT AS+NNNVNSFSQQLN LGSVLSNTKHL GVGNKLQNLPNLDNIGA
Sbjct: 181 AKASIELINQLVDTVASLNNNVNSFSQQLNTLGSVLSNTKHLNGVGNKLQNLPNLDNIGA 240

Query: 241 GLDTVSGILSAISASFILSNADADTGTKAAAGVELTTKVLGNVGKGISQYIIAQRAAQGL 300
GLDTVSGILSAISASFILSNADADT TKAAAGVELTTKVLGNVGKGISQYIIAQRAAQGL
Sbjct: 241 GLDTVSGILSAISASFILSNADADTRTKAAAGVELTTKVLGNVGKGISQYIIAQRAAQGL 300

Query: 301 STSAAAAGLIASVVTLAISPLSFLSIADKFKRANKIEEYSQRFKKLGYDGDSLLAAFHKE 360
STSAAAAGLIAS VTLAISPLSFLSIADKFKRANKIEEYSQRFKKLGYDGDSLLAAFHKE
Sbjct: 301 STSAAAAGLIASAVTLAISPLSFLSIADKFKRANKIEEYSQRFKKLGYDGDSLLAAFHKE 360

Query: 361 TGAIDASLTTISTVLASVSSGISAAATTSLVGAPVSALVGAVTGIISGILEASKQAMFEH 420
TGAIDASLTTISTVLASVSSGISAAATTSLVGAPVSALVGAVTGIISGILEASKQAMFEH
Sbjct: 361 TGAIDASLTTISTVLASVSSGISAAATTSLVGAPVSALVGAVTGIISGILEASKQAMFEH 420

Query: 421 VASKMADVIAEWEKKHGKNYFENGYDARHAAFLEDNFKILSQYNKEYSVERSVLITQQHW 480
VASKMADVIAEWEKKHGKNYFENGYDARHAAFLEDNFKILSQYNKEYSVERSVLITQQHW
Sbjct: 421 VASKMADVIAEWEKKHGKNYFENGYDARHAAFLEDNFKILSQYNKEYSVERSVLITQQHW 480

Query: 481 DTLIGELAGVTRNGDKTLSGKSYIDYYEEGKRLEKKPDEFQKQVFDPLKGNIDLSDSKSS 540
DTLIGELAGVTRNGDKTLSGKSYIDYYEEGKRLEKK DEFQKQVFDPLKGNIDLSDSKSS
Sbjct: 481 DTLIGELAGVTRNGDKTLSGKSYIDYYEEGKRLEKKXDEFQKQVFDPLKGNIDLSDSKSS 540

Query: 541 TLLKFVTPLLTPGEEIRERRQSGKYEYITELLVKGVDKWTVKGVQDKGSVYDYSNLIQHA 600
TLLKFVTPLLTPGEEIRERRQSGKYEYITELLVKGVDKWTVKGVQDKG+VYDYSNLIQHA
Sbjct: 541 TLLKFVTPLLTPGEEIRERRQSGKYEYITELLVKGVDKWTVKGVQDKGAVYDYSNLIQHA 600

Query: 601 SVGNNQYREIRIESHLGDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATE 660
SVGNNQYREIRIESHLGDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATE
Sbjct: 601 SVGNNQYREIRIESHLGDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATE 660

Query: 661 AGNYTVTRVLGGDVKILQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVE 720
AGNYTVTRVLGGDVK+LQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVE
Sbjct: 661 AGNYTVTRVLGGDVKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVE 720

Query: 721 ELIGTTRADKFFGSKFTDIFHGADGDDHIEGNDGNDRLYGDKGNDTLRGGNGDDQLYGGD 780
ELIGTTRADKFFGSKFTDIFHGADGDD IEGNDGNDRLYGDKGNDTL GGNGDDQLYGGD
Sbjct: 721 ELIGTTRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGD 780

Query: 781 GNDKLIGGTGNNYLNGGDGDDELQVQGNSLAKNVLSGGKGNDKLYGSEGADLLDGGEGND 840
GNDKLIG GNNYLNGGDGDDE QVQGNSLAKNVL GGKGNDKLYGSEGADLLDGGEG+D
Sbjct: 781 GNDKLIGVAGNNYLNGGDGDDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDD 840

Query: 841 LLKGGYGNDIYRYLSGYGHHIIDDDGGKDDKLSLADIDFRDVAFRREGNDLIMYKAEGNV 900
LLKGGYGNDIYRYLSGYGHHIIDDDGGK+DKLSLADIDFRDVAF+REGNDLIMYK EGNV
Sbjct: 841 LLKGGYGNDIYRYLSGYGHHIIDDDGGKEDKLSLADIDFRDVAFKREGNDLIMYKGEGNV 900

Query: 901 LSIGHKNGITFRNWFEKESGDISNHQIEQIFDKDGRVITPDSLKKALEYQQSNNKASYVY 960
LSIGHKNGITFRNWFEKESGDISNH+IEQIFDK GR+ITPDSLKKALEYQQ NNKASYVY
Sbjct: 901 LSIGHKNGITFRNWFEKESGDISNHEIEQIFDKSGRIITPDSLKKALEYQQRNNKASYVY 960

Query: 961 GNDALAYGSQDNLNPLINEISKIISAAGNFDVKEERAAASLLQLSGNASDFSYGRNSITL 1020
GNDALAYGSQ +LNPLINEISKIISAAG+FDVKEER AASLLQLSGNASDFSYGRNSITL
Sbjct: 961 GNDALAYGSQGDLNPLINEISKIISAAGSFDVKEERTAASLLQLSGNASDFSYGRNSITL 1020

Query: 1021 TASA 1024
T SA
Sbjct: 1021 TTSA 1024


103  c3583  c3591  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c3583     7       39       -3.470348  PapG protein
c3584     7       39       -2.560297  PapF protein
c3585     5       33       -0.463753  PapE protein
c3586     5       31        0.518206  PapK protein
c3587     5       30       -0.808310  Hypothetical protein
c3588     5       30       -1.398392  PapJ protein
c3589     4       26       -1.723957  PapD protein
c3590     4       25       -0.756002  PapC protein
c3591     3       29       -2.245794  PapH protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c3583     PF03627  603    0.0      PapG
		>PF03627#PapG

Length = 336

Score = 603 bits (1555), Expect = 0.0
Identities = 333/336 (99%), Positives = 335/336 (99%)

Query: 1 MKKWFPALLFSLCVSGESSAWNNIVFYSLGNVNSYQGGNVVITQRPQFITSWRPGIATVT 60
MKKWFPALLFSLCVSGESSAWNNIVFYSLGNVNSYQGGNVVITQRPQFITSWRPGIATVT
Sbjct: 1 MKKWFPALLFSLCVSGESSAWNNIVFYSLGNVNSYQGGNVVITQRPQFITSWRPGIATVT 60

Query: 61 WNQCNGPEFADGSWAYYREYIAWVVFPKKVMTKNGYPLFIEVHNKGSWSEENTGDNDSYF 120
WNQCNGP FADGSWAYYREYIAWVVFPKKVMTKNGYPLFIEVHNKGSWSEENTGDNDSYF
Sbjct: 61 WNQCNGPGFADGSWAYYREYIAWVVFPKKVMTKNGYPLFIEVHNKGSWSEENTGDNDSYF 120

Query: 121 FLKGYKWDERAFDAGNLCQKPGETTRLTEKFNDIIFKVALPADLPLGDYSVTIPYTSGIQ 180
FLKGYKWDERAFDAGNLCQKPGETTRLTEKF+DIIFKVALPADLPLGDYSVTIPYTSG+Q
Sbjct: 121 FLKGYKWDERAFDAGNLCQKPGETTRLTEKFDDIIFKVALPADLPLGDYSVTIPYTSGMQ 180

Query: 181 RHFASYLGARFKIPYNVAKTLPRENEMLFLFKNIGGCRPSAQSLEIKHGDLSINSANNHY 240
RHFASYLGARFKIPYNVAKTLPRENEMLFLFKNIGGCRPSAQSLEIKHGDLSINSANNHY
Sbjct: 181 RHFASYLGARFKIPYNVAKTLPRENEMLFLFKNIGGCRPSAQSLEIKHGDLSINSANNHY 240

Query: 241 AAQTLSVSCDVPANIRFMLLRNTTPTYSHGKKFSVGLGHGWDSIVSVNGVDTGETTMRWY 300
AAQTLSVSCDVPANIRFMLLRNTTPTYSHGKKFSVGLGHGWDSIVSVNGVDTGETTMRWY
Sbjct: 241 AAQTLSVSCDVPANIRFMLLRNTTPTYSHGKKFSVGLGHGWDSIVSVNGVDTGETTMRWY 300

Query: 301 KAGTQNLTIGSRLYGESSKIQPGVLSGSATLLMILP 336
KAGTQNLTIGSRLYGESSKIQPGVLSGSATLLMILP
Sbjct: 301 KAGTQNLTIGSRLYGESSKIQPGVLSGSATLLMILP 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3584     FIMBRIALPAPF  267    5e-95    Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.

Length = 167

Score = 267 bits (683), Expect = 5e-95
Identities = 155/167 (92%), Positives = 156/167 (93%), Gaps = 1/167 (0%)

Query: 11 MARLSLFISLLLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGE 70
M RLSLFISLLLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGE
Sbjct: 1 MIRLSLFISLLLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGE 60

Query: 71 VTKTISISCTYKSGSPWIKVTGNAMA-GQTNVLATNIANFGIALYQGKGMSTPLTLGNGS 129
VTK ISISC YKSGS WIKVTGN M GQ NVLATNI +FGIALYQGKGMSTPLTLGNGS
Sbjct: 61 VTKNISISCPYKSGSLWIKVTGNTMGVGQNNVLATNITHFGIALYQGKGMSTPLTLGNGS 120

Query: 130 GNGYRVTAGLDTARSTFTFTSVPFRNGSRTLNGGDFRTTASMSMIYN 176
GNGYRVTAGLDTARSTFTFTSVPFRNGS LNGGDFRTTASMSMIYN
Sbjct: 121 GNGYRVTAGLDTARSTFTFTSVPFRNGSGILNGGDFRTTASMSMIYN 167


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3585     FIMBRIALPAPE  306    e-110    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 306 bits (784), Expect = e-110
Identities = 128/173 (73%), Positives = 145/173 (83%)

Query: 7 MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVTKAEVDWGNVEIQTLSQNG 66
MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTV AEV+WG++EIQ L Q+G
Sbjct: 1 MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVQNAEVNWGDIEIQNLVQSG 60

Query: 67 NHEKEFTVNMQCPYHLGTMKVTITATNTYNNAILVQNTSNTSSDGVLVYLYNSNAGNIGT 126
++K+FTV+M CPY LGTMKVTIT+ N+ILV NTS S DG+L+YLYNSN IG
Sbjct: 61 GNQKDFTVDMNCPYSLGTMKVTITSNGQTGNSILVPNTSTASGDGLLIYLYNSNNSGIGN 120

Query: 127 AITLGTPFTPGKITGNNADRTISLHAKLGYKGNMQSLKAGDFSATATLVASYS 179
A+TLG+ TPGKITG R I+L+AKLGYKGNMQSL+AG FSATATLVASYS
Sbjct: 121 AVTLGSQVTPGKITGTAPARKITLYAKLGYKGNMQSLQAGTFSATATLVASYS 173


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c3590     PF00577  742    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 742 bits (1917), Expect = 0.0
Identities = 243/882 (27%), Positives = 362/882 (41%), Gaps = 67/882 (7%)

Query: 2 MRGMKDRI-PFAVNNITCVILLSLFCNAASAVEFNTDVLDAADKKNIDFTRFSEAGYVLP 60
+ K R+ F V + +++ + FN L + D +RF + P
Sbjct: 16 LHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPP 75

Query: 61 GQYLLDVIVNGQSISPASLQISFVEPQSSGDKAEKKLPQACLTSDMVRLMGLTAESLDKV 120
G Y +D+ +N + A+ ++F S CLT + MGL S+ +
Sbjct: 76 GTYRVDIYLNNGYM--ATRDVTFNTGDSEQGI------VPCLTRAQLASMGLNTASVSGM 127

Query: 121 VYWHDGQCADF-HGLPGVDIRPDTGAGVLRINMPQAWLEYSDATWLPPSRWDDGIPGLML 179
D C + + D G L + +PQA++ ++PP WD GI +L
Sbjct: 128 NLLADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLL 187

Query: 180 DYNLNGTVSRNYQGGDSHQFSYNGTVGGNLGPWRLRADYQGSQEQSRYNGEKTTNRNFTW 239
+YN +G +N GG+SH N G N+G WRLR + S S + + +
Sbjct: 188 NYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSS--DSSSGSKNKWQH 245

Query: 240 SRFYLFRAIPRWRANLTLGENNINSDIFRSWSYTGASLESDDRMLPPRLRGYAPQITGIA 299
+L R I R+ LTLG+ DIF ++ GA L SDD MLP RG+AP I GIA
Sbjct: 246 INTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIA 305

Query: 300 ETNARVVVSQQGRVLYDSMVPAGPFSIQDLD-SSVRGRLDVEVIEQNGRKKTFQVDTASV 358
A+V + Q G +Y+S VP GPF+I D+ + G L V + E +G + F V +SV
Sbjct: 306 RGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSV 365

Query: 359 PYLTRPGQVRYKLVSGRSRGYGHETEGPVFATGEASWGLSNQWSLYGGAVLAGDYNALAA 418
P L R G RY + +G R + E P F GL W++YGG LA Y A
Sbjct: 366 PLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNF 425

Query: 419 GAGWDLGVPGTLSADITQSVARIEGERTFQGKSWRLSYSKRFDNADADITFAGYRFSERN 478
G G ++G G LS D+TQ+ + + + G+S R Y+K + + +I GYR+S
Sbjct: 426 GIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSG 485

Query: 479 YMTMEQYLNARYR--------------------NDYSSREKEMYTVTLNKNVADWNTSFN 518
Y +R + + ++ +T+ + + + +
Sbjct: 486 YFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTS-TLY 544

Query: 519 LQYSRQTYWDIRKTD-YYTVSVNRYFNVFGLQGVAVGLSASRSKYLGRD--NDSAYLRIS 575
L S QTYW D + +N F + LS S +K + + L ++
Sbjct: 545 LSGSHQTYWGTSNVDEQFQAGLNTAFE-----DINWTLSYSLTKNAWQKGRDQMLALNVN 599

Query: 576 VPLGT------------GTASYSGSMSND-RYVNMAGYTDM-FNDGLDSYSLNAGLNSGG 621
+P +ASYS S + R N+AG D SYS+ G GG
Sbjct: 600 IPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGG 659

Query: 622 GLTSQRQINAYYSHRSPLANLSANIASLQKGYTSFGVSASGGATITGKGAALHAGGMSGG 681
S A ++R N + S SGG G L G
Sbjct: 660 DGNSGSTGYATLNYRGGYGNANIG-YSHSDDIKQLYYGVSGGVLAHANGVTL--GQPLND 716

Query: 682 TRLLVDTDGVGGVPVDGGQVV-TNRWGTGVVTDISSYYRNTTSVDLKRLPDDVEATRSVV 740
T +LV G V+ V T+ G V+ + Y N ++D L D+V+ +V
Sbjct: 717 TVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVA 776

Query: 741 ESALTEGAIGYRKFSVLKGKRLFAILRLADGSQPPFGASVTSEKGRELGMVADEGLAWLS 800
T GAI +F G +L L + PFGA VTSE + G+VAD G +LS
Sbjct: 777 NVVPTRGAIVRAEFKARVGIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQVYLS 835

Query: 801 GVTPGETLSVNW--DGKIQCQVNVPETAISDQQLL----LPC 836
G+ + V W + C N S QQLL C
Sbjct: 836 GMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3591     FIMBRIALPAPE  32     0.001    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 31.5 bits (71), Expect = 0.001
Identities = 41/173 (23%), Positives = 75/173 (43%), Gaps = 29/173 (16%)

Query: 29 GMSLPEYWG----EEHVWWDGRAAFHGEVVRPACTLAMEDAWQIIDMGETPVRDL-QNGF 83
G+ LP G +HV F G+++ PACT+ + ++ G+ +++L Q+G
Sbjct: 6 GLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVQNAE----VNWGDIEIQNLVQSG- 60

Query: 84 SGPERKFSLRLRNCEFNSQGGNLFSDSRIRVTFDGVRGET---PDKFNLSGQAKGINLQI 140
G ++ F++ + NC ++ ++ +T +G G + P+ SG I L
Sbjct: 61 -GNQKDFTVDM-NCPYS------LGTMKVTITSNGQTGNSILVPNTSTASGDGLLIYLYN 112

Query: 141 ADARGNIARAGKV-MPAIP--LTGNEEALDYTLRIVR----NGKKLEAGNYFA 186
++ I A + P +TG A TL N + L+AG + A
Sbjct: 113 SNN-SGIGNAVTLGSQVTPGKITGTAPARKITLYAKLGYKGNMQSLQAGTFSA 164


104  c3989  c3996  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c3989     -2      17       -0.772182  Protease degQ precursor
c3990     -3      14       -0.682769  Protease degS precursor
c3991     -1      14       -0.713928  Malate dehydrogenase
c3992     -2      12       -0.998857  Arginine repressor
c3993     -2      13       -0.429572  Hypothetical protein yhcN precursor
c3994     -2      13        0.577689  Hypothetical protein yhcO
c3995     -3      11        1.461312  Hypothetical protein yhcP
c3996     -2      10        1.615520  Hypothetical protein yhcQ
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
c3989     V8PROTEASE  73     3e-16    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 73.1 bits (179), Expect = 3e-16
Identities = 30/184 (16%), Positives = 63/184 (34%), Gaps = 38/184 (20%)

Query: 104 GLGSGVIINANKGYVLTNNHVINQAQKISIQL------------NDGREFDAKLIGSDDQ 151
+ SGV++ + +LTN HV++ L +G ++ +
Sbjct: 102 FIASGVVVGKDT--LLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGE 159

Query: 152 SDIALLQIQN-------PSKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISALG 204
D+A+++ + ++++ + +V G P ++ +
Sbjct: 160 GDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKP-------VATMW 212

Query: 205 RSGLNLEGLEN-FIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSN 263
S + L+ +Q D S GNSG + N E+IGI+ G+
Sbjct: 213 ESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHWG---------GVPNEFNGA 263

Query: 264 MART 267
+
Sbjct: 264 VFIN 267


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
c3990     V8PROTEASE  53     6e-10    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 53.1 bits (127), Expect = 6e-10
Identities = 33/160 (20%), Positives = 61/160 (38%), Gaps = 26/160 (16%)

Query: 77 RTLGSGVIMDQRGYIITNKHVINDADQIIVALQ------------DGRVFEALLVGSDSL 124
+ SGV++ + ++TNKHV++ AL+ +G +
Sbjct: 101 TFIASGVVVG-KDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGE 159

Query: 125 TDLAVLKINATGGLPTIP--INPRRVPH-----IGDVVLAIGNPYNLGQTITQGIISATG 177
DLA++K + I + P + + + + G P + T + G
Sbjct: 160 GDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGD-KPVATMW--ESKG 216

Query: 178 RIGLNPTGRQNFLQTDASINHGNSGGALVNSLGELMGINT 217
+I + +Q D S GNSG + N E++GI+
Sbjct: 217 KI---TYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHW 253


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3991     DHBDHDRGNASE  28     0.036    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 28.5 bits (63), Expect = 0.036
Identities = 37/167 (22%), Positives = 61/167 (36%), Gaps = 27/167 (16%)

Query: 25 VAVLGAAGGIGQALALLLKTQLPSGSELSLYDIAPVTPGVAVDLSHIPTAVKIKGFSGED 84
+ GAA GIG+A+A L G+ ++ D P V S A + F +
Sbjct: 11 AFITGAAQGIGEAVARTL---ASQGAHIAAVDYNP-EKLEKVVSSLKAEARHAEAFPADV 66

Query: 85 ATPA------------LEGADVVLISAGVARK------PGMDRSDLFNVNAGIVKNLVQQ 126
A + D+++ AGV R + F+VN+ V N +
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 127 VAKTCPK----ACIGIITNPVNTT-VAIAAEVLKKAGVYDKNKLFGV 168
V+K + + + +NP ++AA KA K G+
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGL 173


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c3992     ARGREPRESSOR  168    9e-57    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 168 bits (428), Expect = 9e-57
Identities = 44/141 (31%), Positives = 71/141 (50%), Gaps = 5/141 (3%)

Query: 15 KALLKEEKFSSQGEIVAALQEQGFDNINQSKVSRMLTKFGAVRTRNAKMEMVYCLPAELG 74
+ ++ + +Q E+V L++ G+ N+ Q+ VSR + + V+ Y LPA+
Sbjct: 11 REIITANEIETQDELVDILKKDGY-NVTQATVSRDIKELHLVKVPTNNGSYKYSLPADQR 69

Query: 75 VPTTSSPLKNLV---LDIDYNDAVVVIHTSPGAAQLIARLLDSLGKAEGILGTIAGDDTI 131
S ++L+ + ID ++V+ T PG AQ I L+D+L E I+GTI GDDTI
Sbjct: 70 FNPLSKLKRSLMDAFVKIDSASHLIVLKTMPGNAQAIGALMDNLDWEE-IMGTICGDDTI 128

Query: 132 FTTPANGFTVKELYEAILELF 152
K + + ILEL
Sbjct: 129 LIICRTHDDTKVVQKKILELL 149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c3996     RTXTOXIND  54     2e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 54.4 bits (131), Expect = 2e-10
Identities = 29/163 (17%), Positives = 59/163 (36%), Gaps = 16/163 (9%)

Query: 6 RKFSRTAITVVLVILAFIAIFNAWVYYTE----SPWTRDARFSADVVAIAPDVSGLITQV 61
SR V I+ F+ I + + S I P + ++ ++
Sbjct: 51 TPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEI 110

Query: 62 NVHDNQLVKKGQVLFTIDQPR-------YQKALEEAQADVAYYQVLAQEKRQEAGRRNRL 114
V + + V+KG VL + Q +L +A+ + YQ+L++ E + L
Sbjct: 111 IVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRS--IELNKLPEL 168

Query: 115 GVQAMSREEIDQANNVL---QTVLHQLAKAQATRDLAKLDLER 154
+ + VL + Q + Q + +L+L++
Sbjct: 169 KLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDK 211



Score = 51.4 bits (123), Expect = 2e-09
Identities = 28/147 (19%), Positives = 54/147 (36%), Gaps = 15/147 (10%)

Query: 100 LAQEKRQEAGRRNRLGVQ-AMSREEIDQANNVLQT-VLHQLAKAQAT-------RDLAKL 150
E R + ++ + ++EE + + +L +L + +
Sbjct: 264 AVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEE 323

Query: 151 DLERTVIRAPADGWVTNLNVYT-GEFITRGSTAVALVKQNSFY-VLAYMEETKLEGVRPG 208
+ +VIRAP V L V+T G +T T + +V ++ V A ++ + + G
Sbjct: 324 RQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVG 383

Query: 209 YRAEIT----PLGSNKVLKGTVDSVAA 231
A I P L G V ++
Sbjct: 384 QNAIIKVEAFPYTRYGYLVGKVKNINL 410


105  c4005  c4011  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c4005     -3      12        1.278380  Rod shape-determining protein mreC
c4006     -3      13        0.463201  Rod shape-determining protein mreB
c4007     -2      14        0.365735  Hypothetical protein
c4008     -2      10        0.197855  Hypothetical protein yhdA
c4009     -2      14        0.048475  Protein yhdH
c4010     -1      14       -2.612532  Hypothetical protein
c4011      0      13       -3.228325  Biotin carboxyl carrier protein of acetyl-CoA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c4005     PF03544  28     0.043    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 28.4 bits (63), Expect = 0.043
Identities = 12/72 (16%), Positives = 20/72 (27%), Gaps = 3/72 (4%)

Query: 296 MMPQVLPSPDAMGPKLPEPATGITQPTPQQPATGNAVTAPAAPTQPAANRSPQRATPPQS 355
P+ +P P P + E +P P+ V P +P +R
Sbjct: 78 PEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVK---KVEQPKRDVKPVESRPASPFENTAP 134

Query: 356 GAQPPARAPGGQ 367
+ A
Sbjct: 135 ARPTSSTATAAT 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4006     SHAPEPROTEIN  580    0.0      Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 580 bits (1498), Expect = 0.0
Identities = 347/347 (100%), Positives = 347/347 (100%)

Query: 26 MLKKFRGMFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKSVAAVGHDAK 85
MLKKFRGMFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKSVAAVGHDAK
Sbjct: 1 MLKKFRGMFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKSVAAVGHDAK 60

Query: 86 QMLGRTPGNIAAIRPMKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQ 145
QMLGRTPGNIAAIRPMKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQ
Sbjct: 61 QMLGRTPGNIAAIRPMKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQ 120

Query: 146 VERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNG 205
VERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNG
Sbjct: 121 VERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNG 180

Query: 206 VVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRN 265
VVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRN
Sbjct: 181 VVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRN 240

Query: 266 LAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALL 325
LAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALL
Sbjct: 241 LAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALL 300

Query: 326 RNLDRLLMEETGIPVVVAEDPLTCVARGGGKALEMIDMHGGDLFSEE 372
RNLDRLLMEETGIPVVVAEDPLTCVARGGGKALEMIDMHGGDLFSEE
Sbjct: 301 RNLDRLLMEETGIPVVVAEDPLTCVARGGGKALEMIDMHGGDLFSEE 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4009     NUCEPIMERASE  29     0.026    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 29.0 bits (65), Expect = 0.026
Identities = 11/28 (39%), Positives = 17/28 (60%)

Query: 150 VVVTGASGGVGSTAVALLHKLGYQVVAV 177
+VTGA+G +G L + G+QVV +
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGI 30


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c4011     RTXTOXIND  27     0.026    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 27.5 bits (61), Expect = 0.026
Identities = 8/27 (29%), Positives = 16/27 (59%)

Query: 127 IEADKSGTVKAILVESGQPVEFDEPLV 153
I+ ++ VK I+V+ G+ V + L+
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLL 125


106  c4027  c4032  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c4027     -2      10       -1.792068  DNA-binding protein fis
c4028     -3      10       -1.500723  Hypothetical adenine-specific methylase yhdJ
c4029     -4      12       -1.255443  Hypothetical protein yhdU
c4030     -3      13       -1.022500  Potential acrEF/envCD operon repressor
c4031     -3      12       -0.199088  Acriflavine resistance protein E precursor
c4032     -3      14       -0.815798  Acriflavine resistance protein F
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4027     DNABINDNGFIS  157    3e-54    DNA-binding protein FIS signature.
		>DNABINDNGFIS#DNA-binding protein FIS signature.

Length = 98

Score = 157 bits (399), Expect = 3e-54
Identities = 98/98 (100%), Positives = 98/98 (100%)

Query: 1 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ 60
MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ
Sbjct: 1 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ 60

Query: 61 PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98
PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN
Sbjct: 61 PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c4030     HTHTETR  128    3e-39    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 128 bits (323), Expect = 3e-39
Identities = 76/209 (36%), Positives = 122/209 (58%), Gaps = 3/209 (1%)

Query: 1 MAKRTKAKALKTRQELIETAIAQFAQHGVSKTTLNDIADAANVTRGAIYWHFENKTQLFN 60
MA++TK +A +TRQ +++ A+ F+Q GVS T+L +IA AA VTRGAIYWHF++K+ LF+
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EMW-LQQPSLRELIQDHLTAGLEHDPFQQLREKLIVGLQYIAKIPRQQALLKILYHKCEF 119
E+W L + ++ EL + A DP LRE LI L+ R++ L++I++HKCEF
Sbjct: 61 EIWELSESNIGELELE-YQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 120 NDEM-LAEGVIREKMGFNPQTLREVLQACQQQGCVANNLDLDVVMIIIDGAFSGIVQNWL 178
EM + + R + + + L+ C + + +L II+ G SG+++NWL
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWL 179

Query: 179 MNMAGYDLYKQAPALVDNVLRMFMPDENI 207
+DL K+A V +L M++ +
Sbjct: 180 FAPQSFDLKKEARDYVAILLEMYLLCPTL 208


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c4031     RTXTOXIND  43     1e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 43.3 bits (102), Expect = 1e-06
Identities = 38/217 (17%), Positives = 70/217 (32%), Gaps = 38/217 (17%)

Query: 98 ATYQASYDSAKGELAKSEAAAAIAHLTVKRYVPLVGTKYISQQEYDQAIADA-RQADAAV 156
K +L + E+ A + Q + I D RQ +
Sbjct: 262 VEAVNELRVYKSQLEQIESEILSAKEEYQLV----------TQLFKNEILDKLRQTTDNI 311

Query: 157 IAAKATVESARINLAYTKVTAPISGRIGK-STVTEGALVTNGQTTELATVQQLDPIYVDV 215
+ + + AP+S ++ + TEG +VT +T + V + D + V
Sbjct: 312 GLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL-MVIVPEDDTLEVTA 370

Query: 216 TQSSND--FMRLKQSVEQGNLHKENATSNVELVMENGQTYP-LKGTLQ--FSDVTVDEST 270
+ D F+ + Q+ +++ Y L G ++ D D+
Sbjct: 371 LVQNKDIGFINVGQNAI------------IKVEAFPYTRYGYLVGKVKNINLDAIEDQRL 418

Query: 271 GSIT--LRAV------FPNPQHTLLPGMFVRARIDEG 299
G + + ++ N L GM V A I G
Sbjct: 419 GLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTG 455



Score = 34.4 bits (79), Expect = 7e-04
Identities = 22/127 (17%), Positives = 43/127 (33%), Gaps = 13/127 (10%)

Query: 46 TAPLEVKTELPGR-TNAYRIAEVRPQVSGIVLNRNFTEGSDVQAGQSLYQIDPATYQASY 104
+E+ G+ T++ R E++P + IV EG V+ G L ++ +A
Sbjct: 77 LGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA-- 134

Query: 105 DSAKGELAKSEAAAAIAHLTVKRYVPLVGTKYISQQEYDQAIADARQADAAVIAAKATVE 164
+ K++++ A L RY L E ++ +
Sbjct: 135 -----DTLKTQSSLLQARLEQTRYQIL-----SRSIELNKLPELKLPDEPYFQNVSEEEV 184

Query: 165 SARINLA 171
+L
Sbjct: 185 LRLTSLI 191



Score = 29.0 bits (65), Expect = 0.031
Identities = 11/34 (32%), Positives = 15/34 (44%), Gaps = 1/34 (2%)

Query: 65 AEVRPQVSGIVLNRN-FTEGSDVQAGQSLYQIDP 97
+ +R VS V TEG V ++L I P
Sbjct: 328 SVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVP 361


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4032     ACRIFLAVINRP  1404   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1404 bits (3636), Expect = 0.0
Identities = 1028/1034 (99%), Positives = 1032/1034 (99%)

Query: 1 MANFFIRRPIFAWVLAIILMIAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60
MANFFIRRPIFAWVLAIILM+AGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120
VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180
EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRL 240
QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300
KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIQEVVKTLFEAIMLVFLVMYLFLQ 360
DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSI EVVKTLFEAIMLVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 MEDKLPPREATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480
MEDKLPP+EATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540
SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600
LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERSGDENSAEAVIHRAKMELGKIRDG 660
EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEER+GDENSAEAVIHRAKMELGKIRDG
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720
FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKVYVQADAKFRM 780
EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKK+YVQADAKFRM
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840
LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900
ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960
MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV
Sbjct: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960

Query: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020
EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 1021 VPVFFVVIRRCFKG 1034
VPVFFVVIRRCFKG
Sbjct: 1021 VPVFFVVIRRCFKG 1034


107  c4095  c4112  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c4095     -1      19       -1.466678  Probable general secretion pathway protein C
c4096     -1      17       -0.398470  Probable general secretion pathway protein D
c4097     0       22       -0.071725  Probable general secretion pathway protein E
c4098     0       23       -0.599696  Putative general secretion pathway protein F
c4099     1       22       -1.413490  Putative general secretion pathway protein G
c4100     2       21       -1.781385  Putative general secretion pathway protein H
c4101     1       22       -2.355176  Probable general secretion pathway protein I
c4102     2       22       -3.062291  Probable general secretion pathway protein J
c4103     0       23       -3.396837  Probable general secretion pathway protein K
c4104     -2      19       -2.720582  Probable general secretion pathway protein L
c4105     -2      21       -3.432074  Putative general secretion pathway protein M
c4106     -2      24       -2.105299  Type 4 prepilin-like proteins leader peptide
c4107     1       35       -1.744807  Bacterioferritin
c4108     2       37       -1.322833  Bacterioferritin-associated ferredoxin
c4109     3       39       -1.255093  Probable bifunctional chitinase/lysozyme
c4110     7       48       -1.265498  Hypothetical protein
c4111     8       51       -0.397975  Elongation factor Tu
c4112     6       44       -1.079569  Elongation factor G
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4095     BCTERIALGSPC  84     4e-21    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 83.9 bits (207), Expect = 4e-21
Identities = 53/200 (26%), Positives = 94/200 (47%), Gaps = 15/200 (7%)

Query: 59 EFSLAALWRNENHAGVKDANPVAVNQETPKLSIALNGIVLTSNDETSFVLINEGNEQKRY 118
+F+L + +N AG DA N L+++L G++ +D S +I++ NEQ
Sbjct: 64 DFTLFGVSPEKNKAGALDA-SQMSNLPPSTLNLSLTGVMAGDDDSRSIAIISKDNEQFSR 122

Query: 119 SLNEALESAPGT--FIRKINKTSVVFETHGHYEKVTLH-------PGLP--DIIKQPDSE 167
+NE + PG I I VV + G YE + L+ G+P + +Q
Sbjct: 123 GVNEEV---PGYNAKIVSIRPDRVVLQYQGRYEVLGLYSQEDSGSDGVPGAQVNEQLQQR 179

Query: 168 NQNVLADYIIATPIRDGEQIYGLRLNPRKGLNAFTTSLLQPGDIALRINNLSLTHPDEVS 227
++DY+ +PI + ++ G RLNP ++F LQ D+A+ +N L L ++
Sbjct: 180 ASTTMSDYVSFSPIMNDNKLQGYRLNPGPKSDSFYRVGLQDNDMAVALNGLDLRDAEQAK 239

Query: 228 QALSLLLTQQSAQFTIRRNG 247
+A+ + + T+ R+G
Sbjct: 240 KAMERMADVHNFTLTVERDG 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4096     BCTERIALGSPD  716    0.0      Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 716 bits (1850), Expect = 0.0
Identities = 344/629 (54%), Positives = 466/629 (74%), Gaps = 11/629 (1%)

Query: 11 ITCCLLAALLMPCAGHAENEQYGANFNNADIRQFVEIVGQHLGKTILIDPSVQGTISVRS 70
+T + AALL A E++ A+F DI++F+ V ++L KT++IDPSV+GTI+VRS
Sbjct: 12 LTLLIFAALLF---RPAAAEEFSASFKGTDIQEFINTVSKNLNKTVIIDPSVRGTITVRS 68

Query: 71 NDTFSQQEYYQFFLSILDLYGYSVITLDNGFLKVVRSANVKTSPGMIADSSRPGVGDELV 130
D ++++YYQFFLS+LD+YG++VI ++NG LKVVRS + KT+ +A + PG+GDE+V
Sbjct: 69 YDMLNEEQYYQFFLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAPGIGDEVV 128

Query: 131 TRIVPLENVPARDLAPLLRQMMDAGSVGNVVHYEPSNVLILTGRASTINKLIEVIKRVDV 190
TR+VPL NV ARDLAPLLRQ+ D VG+VVHYEPSNVL++TGRA+ I +L+ +++RVD
Sbjct: 129 TRVVPLTNVAARDLAPLLRQLNDNAGVGSVVHYEPSNVLLMTGRAAVIKRLLTIVERVDN 188

Query: 191 IGTEKQQIIHLEYASAEDLAEILNQLISESHGKSQMPALLSAKIVADKRTNSLIISGPEK 250
G + L +ASA D+ +++ +L ++ KS +P + A +VAD+RTN++++SG
Sbjct: 189 AGDRSVVTVPLSWASAADVVKLVTEL-NKDTSKSALPGSMVANVVADERTNAVLVSGEPN 247

Query: 251 ARQRITSLLKSLDVEESEEGNTRVYYLKYAKATNLVEVLTGVSEKLKDEKGNSRKPSSTS 310
+RQRI +++K LD +++ +GNT+V YLKYAKA++LVEVLTG+S ++ EK ++ +
Sbjct: 248 SRQRIIAMIKQLDRQQATQGNTKVIYLKYAKASDLVEVLTGISSTMQSEKQAAK--PVAA 305

Query: 311 AMDNVAITADEQTNSLVITADQSVQEKLATVIARLDIRRAQVLVEAIIVEVQDGNGLNLG 370
N+ I A QTN+L++TA V L VIA+LDIRR QVLVEAII EVQD +GLNLG
Sbjct: 306 LDKNIIIKAHGQTNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLNLG 365

Query: 371 VQWANKNVGAQQFTNTGLPVFNAAQGVADYKKNGGITSANPAWDMFSAYNGMAAGFFNGD 430
+QWANKN G QFTN+GLP+ A G Y K+G ++S+ S++NG+AAGF+ G+
Sbjct: 366 IQWANKNAGMTQFTNSGLPISTAIAGANQYNKDGTVSSSLA--SALSSFNGIAAGFYQGN 423

Query: 431 WGVLLTALASNNKNDILATPSIVTLDNKLASFNVGQDVPVLSGSQTTSGDNVFNTVERKT 490
W +LLTAL+S+ KNDILATPSIVTLDN A+FNVGQ+VPVL+GSQTTSGDN+FNTVERKT
Sbjct: 424 WAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTGSQTTSGDNIFNTVERKT 483

Query: 491 VGTKLKVTPQVNEGDAVLLEIEQEVSSVD---SSSNSTLGPTFNTRTIQNAVLVKTGETV 547
VG KLKV PQ+NEGD+VLLEIEQEVSSV SS++S LG TFNTRT+ NAVLV +GETV
Sbjct: 484 VGIKLKVKPQINEGDSVLLEIEQEVSSVADAASSTSSDLGATFNTRTVNNAVLVGSGETV 543

Query: 548 VLGGLLDDFSKEQVSKVPLLGDIPLVGQLFRYTSTERAKRNLMVFIRPTIIRDDDVYRSL 607
V+GGLLD + KVPLLGDIP++G LFR TS + +KRNLM+FIRPT+IRD D YR
Sbjct: 544 VVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVIRDRDEYRQA 603

Query: 608 SKEKYTRYRQEQQLRIDGKSKALIGSEDL 636
S +YT + Q + ++ + ++DL
Sbjct: 604 SSGQYTAFNDAQSKQRGKENNDAMLNQDL 632


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4098     BCTERIALGSPF  512    0.0      Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 512 bits (1321), Expect = 0.0
Identities = 195/405 (48%), Positives = 283/405 (69%), Gaps = 8/405 (1%)

Query: 2 NYRYRAMTQDGQKLQGIIDANDERQARLRLREEGLFLLDIRPQK-------SSGVKTRRP 54
Y Y+A+ G+K +G +A+ RQAR LRE GL L + + S+G+ RR
Sbjct: 3 QYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLRRK 62

Query: 55 -RISHSELTLFTRQLATLSAAALPLEESLAVIGQQSSNNRLADVLNQVRSAILEGHPLSD 113
R+S S+L L TRQLATL AA++PLEE+L + +QS L+ ++ VRS ++EGH L+D
Sbjct: 63 IRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD 122

Query: 114 ALQHFPTLFDSLYRTLVKAGEKSGLLAPVLEKLADYNENRQKIRSKLIQSLIYPCMLTTV 173
A++ FP F+ LY +V AGE SG L VL +LADY E RQ++RS++ Q++IYPC+LT V
Sbjct: 123 AMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLTVV 182

Query: 174 AIVVVIILLTAVVPKITEQFVHMKQQLPLSTRILLGLSDTLQRTGPTLLATVFIVAVGFW 233
AI VV ILL+ VVPK+ EQF+HMKQ LPLSTR+L+G+SD ++ GP +L + + F
Sbjct: 183 AIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMAFR 242

Query: 234 LWLKRGNNRHRFHAMLLRVALIGPLICAINSARYLRTLSILQSSGVPLLDGMNLSTESLN 293
+ L++ R FH LL + LIG + +N+ARY RTLSIL +S VPLL M +S + ++
Sbjct: 243 VMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDVMS 302

Query: 294 NLEIRQRLANAAENVRQGNSIHLSLEQTAIFPPMMLYMVASGEKSGQLGTLMVRAADNQE 353
N R RL+ A + VR+G S+H +LEQTA+FPPMM +M+ASGE+SG+L +++ RAADNQ+
Sbjct: 303 NDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADNQD 362

Query: 354 TLQQNRIALTLSIFEPALIITMALIVLFIVVSVLQPLLQLNSMIN 398
+++ L L +FEP L+++MA +VLFIV+++LQP+LQLN++++
Sbjct: 363 REFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLMS 407


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4099     BCTERIALGSPG  249    1e-88    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 249 bits (636), Expect = 1e-88
Identities = 144/145 (99%), Positives = 144/145 (99%)

Query: 1 MRATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYK 60
MRATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYK
Sbjct: 1 MRATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYK 60

Query: 61 LDNHRYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDL 120
LDNH YPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDL
Sbjct: 61 LDNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDL 120

Query: 121 LSAGPDGEMGTEDDITNWGLSKKKK 145
LSAGPDGEMGTEDDITNWGLSKKKK
Sbjct: 121 LSAGPDGEMGTEDDITNWGLSKKKK 145


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4100     BCTERIALGSPH  141    2e-45    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 141 bits (356), Expect = 2e-45
Identities = 49/153 (32%), Positives = 76/153 (49%), Gaps = 18/153 (11%)

Query: 3 QQRGFTLLEMMLVLALVAITASVVLFTYGREDAANTRARETAARFTAALELAIDRATLSG 62
+QRGFTLLEMML+L L+ ++A +VL + + + A +T ARF A L R +G
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFP--ASRDDSAAQTLARFEAQLRFVQQRGLQTG 59

Query: 63 QPVGIHFSDSAWRIMV----PGKTP-------SAWRWVPLQEDAADESKNDWGEELSIQL 111
Q G+ W+ +V G P S +RW+PL+ S + G +L++
Sbjct: 60 QFFGVSVHPDRWQFLVLEARDGADPAPADDGWSGYRWLPLRAGRVATSGSIAGGKLNLAF 119

Query: 112 ---QPFKPDDSNQPQVVILADGQITPFSLLMAN 141
+ + P D P V+I G++TPF L +
Sbjct: 120 AQGEAWTPGD--NPDVLIFPGGEMTPFRLTLGE 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4101     BCTERIALGSPG  30     0.002    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 29.9 bits (67), Expect = 0.002
Identities = 17/90 (18%), Positives = 41/90 (45%), Gaps = 8/90 (8%)

Query: 14 MNKQSGMTLLEVLLAMSIFTAVALTLMSSMQGQ--RNAIERMRNETLALWIADNQLQSQD 71
+KQ G TLLE+++ + I +A ++ ++ G + ++ ++ +AL A + + D
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYK-LD 62

Query: 72 SFGEENTSSSGKELING-----EEWNWRSD 96
+ T+ + L+ N+ +
Sbjct: 63 NHHYPTTNQGLESLVEAPTLPPLAANYNKE 92


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4102     BCTERIALGSPG  34     1e-04    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 34.1 bits (78), Expect = 1e-04
Identities = 14/45 (31%), Positives = 27/45 (60%), Gaps = 5/45 (11%)

Query: 3 NRQQGFTLLEVMAALAIFSMLSVLAFMIFSQVSELHQRSQKEIQK 47
++Q+GFTLLE+M + I + VLA ++ + + + + + QK
Sbjct: 5 DKQRGFTLLEIMVVIVI---IGVLASLVVPNL--MGNKEKADKQK 44


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4106     PREPILNPTASE  151    4e-47    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 151 bits (383), Expect = 4e-47
Identities = 88/262 (33%), Positives = 118/262 (45%), Gaps = 47/262 (17%)

Query: 5 LPLFILVGFIAGYFVNVMAYHL---------------SPLEDKTALTFRQVLVH------ 43
L L + G F+NV+ + L +D+ L+
Sbjct: 16 FSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDEPPYNLMVPRSCCP 75

Query: 44 FWQKKYAWHDTVPLI-------------------------LCVAAAIACALAPFTPIVTG 78
+ +PL+ L ++A A+ T
Sbjct: 76 HCNHPITALENIPLLSWLWLRGRCRGCQAPISARYPLVELLTALLSVAVAMTLAPGWGTL 135

Query: 79 ALFLYFCFALTLSVIDFRTQLLPDKLTLPLLWLGLVFNAQSGLIDLHDAVYGAVAGYGVL 138
A L + L+ ID LLPD+LTLPLLW GL+FN G + L DAV GA+AGY VL
Sbjct: 136 AALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDAVIGAMAGYLVL 195

Query: 139 WCVYWGVWLVCHKEGLGYGDFKLLAAAGAWCGWQTLPMILLIASLGGIGYAIVSQLLQRR 198
W +YW L+ KEG+GYGDFKLLAA GAW GWQ LP++LL++SL G I LL+
Sbjct: 196 WSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLILLRNH 255

Query: 199 TIPT-IAFGPWLALGSMINLGY 219
I FGP+LA+ I L +
Sbjct: 256 HQSKPIPFGPYLAIAGWIALLW 277


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
c4107     HELNAPAPROT  35     3e-05    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 35.2 bits (81), Expect = 3e-05
Identities = 28/150 (18%), Positives = 59/150 (39%), Gaps = 24/150 (16%)

Query: 5 TKVINYLNKLLGNE---LVAINQYFLHARMFKNWGLKRLNDVEYHESIDEM-----KHAD 56
T V N LN L N ++++ +W +K + HE +E+ + D
Sbjct: 11 TLVENSLNTQLSNWFLLYSKLHRF--------HWYVKGPHFFTLHEKFEELYDHAAETVD 62

Query: 57 RYIERILFLEGLPN--LQDLGKL------NIGEDVEEMLRSDLALELDGAKNLREAIGYA 108
ER+L + G P +++ + EM+++ + + + IG A
Sbjct: 63 TIAERLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVNDYKQISSESKFVIGLA 122

Query: 109 DSVHDYVSRDMMIEILRDEEGHIDWLETEL 138
+ D + D+ + ++ + E + L + L
Sbjct: 123 EENQDNATADLFVGLIEEVEKQVWMLSSYL 152


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c4111     TCRTETOQM  80     4e-18    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 79.5 bits (196), Expect = 4e-18
Identities = 57/198 (28%), Positives = 87/198 (43%), Gaps = 13/198 (6%)

Query: 28 VNVGTIGHVDHGKTTLTAAI------TTVLAKTYGGAARAFDQIDNAPEEKARGITINTS 81
+N+G + HVD GKTTLT ++ T L G R DN E+ RGITI T
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRT----DNTLLERQRGITIQTG 59

Query: 82 HVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGPMPQTREHILLGRQV 141
+ +D PGH D++ + + +DGAIL+++A DG QTR R++
Sbjct: 60 ITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKM 119

Query: 142 GVPYIIVFLNKCDMVDDEELLELVEMEVRELLSQYDFPGDDTPIVRGSALKALEGDAEWE 201
G+P I F+NK D + L V +++E LS + + +W+
Sbjct: 120 GIP-TIFFINKIDQNGID--LSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWD 176

Query: 202 AKILELAGFLDSYIPEPE 219
I L+ Y+
Sbjct: 177 TVIEGNDDLLEKYMSGKS 194


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c4112     TCRTETOQM  613    0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 613 bits (1583), Expect = 0.0
Identities = 178/698 (25%), Positives = 304/698 (43%), Gaps = 81/698 (11%)

Query: 9 RYRNIGISAHIDAGKTTTTERILFYTGVNHKIGEVHDGAATMDWMEQEQERGITITSAAT 68
+ NIG+ AH+DAGKTT TE +L+ +G ++G V G D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 TAFWSGMAKQYEPHRINIIDTPGHVDFTIEVERSMRVLDGAVMVYCAVGGVQPQSETVWR 128
+ W ++NIIDTPGH+DF EV RS+ VLDGA+++ A GVQ Q+ ++
Sbjct: 62 SFQWEN-------TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFH 114

Query: 129 QANKYKVPRIAFVNKMDRMGANFLKVVNQIKTRLGANPVPLQLAIGAEEHFTGVVDLVKM 188
K +P I F+NK+D+ G + V IK +L A V Q V M
Sbjct: 115 ALRKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQ----------KVELYPNM 164

Query: 189 KAINWNDADQGVTFEYEDIPADMVELANEWHQNLIESAAEASEELMEKYLGGEELTEAEI 248
N+ +++Q ++ E +++L+EKY+ G+ L E+
Sbjct: 165 CVTNFTESEQ------------------------WDTVIEGNDDLLEKYMSGKSLEALEL 200

Query: 249 KGALRQRVLNNEIILVTCGSAFKNKGVQAMLDAVIDYLPSPVDVPAINGILDDGKDTPAE 308
+ R N + V GSA N G+ +++ + + S
Sbjct: 201 EQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH----------------- 243

Query: 309 RHASDDEPFSALAFKIATDPFVGNLTFFRVYSGVVNSGDTVLNSVKAARERFGRIVQMHA 368
FKI L + R+YSGV++ D+V S K + + +
Sbjct: 244 ---RGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEK-EKIKITEMYTSIN 299

Query: 369 NKREEIKEVRAGDIAAAIG----LKDVTTGDTLCDPDAPIILERMEFPEPVISIAVEPKT 424
+ +I + +G+I L V GDT P ER+E P P++ VEP
Sbjct: 300 GELCKIDKAYSGEIVILQNEFLKLNSV-LGDTKLLPQR----ERIENPLPLLQTTVEPSK 354

Query: 425 KADQEKMGLALGRLAKEDPSFRVWTDEESNQTIIAGMGELHLDIIVDRMKREFNVEANVG 484
+E + AL ++ DP R + D +++ I++ +G++ +++ ++ +++VE +
Sbjct: 355 PQQREMLLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIK 414

Query: 485 KPQVAYRETIRQKVTDVEGKHAKQSGGRGQYGHVVIDMYPLEPGSNPKGYEFINDIKGGV 544
+P V Y E +K E + + + + + PL GS G ++ + + G
Sbjct: 415 EPTVIYMERPLKK---AEYTIHIEVPPNPFWASIGLSVSPLPLGS---GMQYESSVSLGY 468

Query: 545 IPGEYIPAVDKGIQEQLKAGPLAGYPVVDMGIRLHFGSYHDVDSSELAFKLAASIAFKEG 604
+ + AV +GI+ + G L G+ V D I +G Y+ S+ F++ A I ++
Sbjct: 469 LNQSFQNAVMEGIRYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQV 527

Query: 605 FKKAKPVLLEPIMKVEVETPEENTGDVIGDLSRRRGMLKGQESEVTGVKIHAEVPLSEMF 664
KKA LLEP + ++ P+E D + + + + V + E+P +
Sbjct: 528 LKKAGTELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQ 587

Query: 665 GYATQLRSLTKGRASYTMEFLKYDEAPSNVAQAVIEAR 702
Y + L T GR+ E Y + V + R
Sbjct: 588 EYRSDLTFFTNGRSVCLTELKGYHVT---TGEPVCQPR 622


108  c4120  c4130  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
c4120     0       17       0.226469  Hypothetical protein yheO
c4121     0       17       1.777418  FKBP-type peptidyl-prolyl cis-trans isomerase
c4122     -1      15       3.321905  SlyX protein
c4123     -2      14       3.143369  FKBP-type peptidyl-prolyl cis-trans isomerase
c4124     -1      15       2.913553  Conserved hypothetical protein
c4125     -2      15       2.924798  Glutathione-regulated potassium-efflux system
c4126     -1      18       2.722861  Putative NAD(P)H oxidoreductase yheR
c4127     -1      19       1.909137  Hypothetical ABC transporter ATP-binding protein
c4128     -3      13       0.981423  Hypothetical protein yheT
c4129     -2      14       1.262239  Hypothetical protein yheU
c4130     -2      14       1.320114  Probable phosphoribulokinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4120     ACRIFLAVINRP  29     0.023    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.7 bits (64), Expect = 0.023
Identities = 14/62 (22%), Positives = 29/62 (46%), Gaps = 1/62 (1%)

Query: 164 ASSVEDLVTQTLEFTIEEVNADRNV-SNNAKNRQIVLNLYEKGIFDIKDAINQVADRLNI 222
A +V+D VTQ +E + ++ + S + + + L + D A QV ++L +
Sbjct: 54 AQTVQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQL 113

Query: 223 SK 224
+
Sbjct: 114 AT 115


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4121     INFPOTNTIATR  132    5e-40    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 132 bits (334), Expect = 5e-40
Identities = 79/226 (34%), Positives = 124/226 (54%), Gaps = 9/226 (3%)

Query: 28 AAKPATTADSKAAFKNDDQKSAYALGASLGRYMENSLKEQEKLGIKLDKDQLIAGVQDAF 87
A A A + D K +Y++GA LG K + GI ++ D L G+QD
Sbjct: 14 AMSTAMAATDATSLTTDKDKLSYSIGADLG-------KNFKNQGIDINPDVLAKGMQDGM 66

Query: 88 A-DKSKLSDQEIEQTLQAFEARVKSSAQAKMEKDAADNEAKGKEYREKFAKEKGVKTSST 146
+ + L++++++ L F+ + + A+ K A +N+AKG + + G+ +
Sbjct: 67 SGAQLILTEEQMKDVLSKFQKDLMAKRSAEFNKKAEENKAKGDAFLSANKSKPGIVVLPS 126

Query: 147 GLVYQVVEAGKGEAPKDSDTVVVNYKGTLIDGKEFDNSYTRGEPLSFRLDGVIPGWTEGL 206
GL Y++++AG G P SDTV V Y GTLIDG FD++ G+P +F++ VIPGWTE L
Sbjct: 127 GLQYKIIDAGTGAKPGKSDTVTVEYTGTLIDGTVFDSTEKAGKPATFQVSQVIPGWTEAL 186

Query: 207 KNIKKGGKIKLVIPPELAYGKAGVPG-IPPNSTLVFDVELLDVKPA 251
+ + G ++ +P +LAYG V G I PN TL+F + L+ VK A
Sbjct: 187 QLMPAGSTWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISVKKA 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
c4125     60KDINNERMP  31     0.021    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 30.7 bits (69), Expect = 0.021
Identities = 13/69 (18%), Positives = 29/69 (42%), Gaps = 6/69 (8%)

Query: 261 TAIDPFKGLLLG---LFFISVGMSLNLGVLYTHL-LWVVISVVVLVAVKILVLYLLARLY 316
A+ P L + L+FIS + L +++ + W +++ V+ ++ L
Sbjct: 318 AAVAPHLDLTVDYGWLWFISQPLFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKA-- 375

Query: 317 GVRSSERMQ 325
S +M+
Sbjct: 376 QYTSMAKMR 384


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4126     ISCHRISMTASE  32     0.001    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 31.9 bits (72), Expect = 0.001
Identities = 32/135 (23%), Positives = 51/135 (37%), Gaps = 16/135 (11%)

Query: 12 YAHPESQDSVANRVLLKPATQLSNVTVHDLYAHYPDFFIDIPREQALLREHEVIVFQH-- 69
Y P + D N+V P + + +HD+ ++ D F L + +
Sbjct: 9 YQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCV 68

Query: 70 ----PLYTYSCPALLKEWLDRVLSRGFASGPGGNQLAGKYWRSVITTGEPESA------Y 119
P+ + P DR L F GPG N +G Y +IT PE +
Sbjct: 69 QLGIPVVYTAQPGSQNP-DDRALLTDFW-GPGLN--SGPYEEKIITELAPEDDDLVLTKW 124

Query: 120 RYDALNRYPMSDVLR 134
RY A R + +++R
Sbjct: 125 RYSAFKRTNLLEMMR 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
c4127     GPOSANCHOR  33     0.005    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.7 bits (74), Expect = 0.005
Identities = 27/152 (17%), Positives = 54/152 (35%), Gaps = 22/152 (14%)

Query: 504 KVEPFDGDLEDYQQWLSDVQKQENQTDEAPKENANSAQARKDQKRREAELRAQTQPLRKE 563
+ D + ++ E + + ++ R+ +R R + L E
Sbjct: 272 AMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAE 331

Query: 564 IARLEKEME---------------------KLNAQLAQAEEKLGDSELYDQNRKAELTAC 602
+LE++ + +L A+ + EE+ SE Q+ + +L A
Sbjct: 332 HQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDAS 391

Query: 603 LQQQASAKSGLEECEMAWLEAQEQLEQMLLEG 634
+ + + LEE L A E+L + L E
Sbjct: 392 REAKKQVEKALEEANSK-LAALEKLNKELEES 422



Score = 32.7 bits (74), Expect = 0.005
Identities = 13/125 (10%), Positives = 39/125 (31%), Gaps = 7/125 (5%)

Query: 513 EDYQQWLSDVQKQENQTDEAPKENANSAQARKDQKRREAELRAQTQPLRKEIARLEKEME 572
+ + ++ + E A A + D ++ + +++
Sbjct: 127 KALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFST-------ADSAKIK 179

Query: 573 KLNAQLAQAEEKLGDSELYDQNRKAELTACLQQQASAKSGLEECEMAWLEAQEQLEQMLL 632
L A+ A E + + E + TA + + ++ + ++ LE +
Sbjct: 180 TLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMN 239

Query: 633 EGQSN 637
++
Sbjct: 240 FSTAD 244


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c4130     PF07299  32     0.002    Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 31.8 bits (72), Expect = 0.002
Identities = 10/46 (21%), Positives = 21/46 (45%), Gaps = 2/46 (4%)

Query: 71 PEANDFGLLEQTFIEYGQSGKGKSRKYLHTYDEAVPWNQVPGTFTP 116
P+ + + E ++ KG SRK++ ++ + + GTF
Sbjct: 112 PDMEELDMKELSY--LSWIDKGSSRKFIIAKNDKNKFVGLQGTFQS 155


109  c4236  c4242  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
c4236     -2      19       3.305085  Gamma-glutamyltranspeptidase precursor
c4237     -2      22       3.156804  Hypothetical protein yhhA precursor
c4238     -2      22       3.068367  Glycerophosphoryl diester phosphodiesterase
c4239     -2      25       3.124200  SN-glycerol-3-phosphate transport ATP-binding
c4240     -1      24       2.363754  SN-glycerol-3-phosphate transport system
c4241     -1      25       3.338301  SN-glycerol-3-phosphate transport system
c4242     -2      26       3.564729  Glycerol-3-phosphate-binding periplasmic protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c4236     NAFLGMOTY  32     0.005    Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 32.0 bits (72), Expect = 0.005
Identities = 27/80 (33%), Positives = 36/80 (45%), Gaps = 13/80 (16%)

Query: 272 RTPISGDYRGYQVYSMPPPSSGGIHIVQILNILENFDMQKYGF-GSADAMQIMAEAEKYA 330
R P+ G+ R + SMPPP G H +I N+ F Q G+ G A I++E EK
Sbjct: 77 RRPM-GETRNVSLISMPPPWRPGEHADRITNL--KFFKQFDGYVGGQTAWGILSELEKGR 133

Query: 331 YADRSEYLGDPDFVKVPWQA 350
Y P F WQ+
Sbjct: 134 Y---------PTFSYQDWQS 144


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c4238     PF04619  29     0.014    Dr-family adhesin
		>PF04619#Dr-family adhesin

Length = 160

Score = 28.7 bits (64), Expect = 0.014
Identities = 12/60 (20%), Positives = 23/60 (38%), Gaps = 4/60 (6%)

Query: 29 VGAKYGHKMIEFDAKLSKDGEIFLLHDDNLERTSNGWGVAGDLNWQD----LLRVDAGSW 84
+G ++ D + G+ FL+ D+N ++ + W + D GSW
Sbjct: 70 LGCDARQVALKADTDNFEQGKFFLISDNNRDKLYVNIRPTDNSAWTTDNGVFYKNDVGSW 129


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c4239     PF05272  28     0.046    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.046
Identities = 10/29 (34%), Positives = 16/29 (55%)

Query: 46 IVMVGPSGCGKSTLLRMVAGLERVTTGDI 74
+V+ G G GKSTL+ + GL+ +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHF 627


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c4242     MALTOSEBP  39     2e-05    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 39.3 bits (91), Expect = 2e-05
Identities = 39/160 (24%), Positives = 66/160 (41%), Gaps = 14/160 (8%)

Query: 134 GHLLSQPFNSSTPVLYYNKDAFKKAGLDPEQPPKTWQDLADYAAKLKASGMKCGYASGWQ 193
G L++ P L YNKD PPKTW+++ +LKA G + +
Sbjct: 127 GKLIAYPIAVEALSLIYNKDLLP-------NPPKTWEEIPALDKELKAKGKSALMFNLQE 179

Query: 194 GWIQLENFSAWNGLPFASKNNGFDGTDAVLEF--NKPEQVKHIAMLEEMNKKGDFSYVGR 251
+ +A G F +N +D D ++ K + +++ + D Y
Sbjct: 180 PYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDY--- 236

Query: 252 KDESTEKFYNGDCAMTTASSGSLANIREYAKFNYGVGMMP 291
+ F G+ AMT + +NI + +K NYGV ++P
Sbjct: 237 -SIAEAAFNKGETAMTINGPWAWSNI-DTSKVNYGVTVLP 274


110  c4285  c4290  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c4285     -2      12       2.703731   Hypothetical protein yhhJ
c4286     -3      9        2.607450   Hypothetical ABC transporter ATP-binding protein
c4287     -1      11       0.924852   Hypothetical protein yhiI precursor
c4288     0       11       -0.637900  Hypothetical protein
c4289     0       11       -1.091411  Hypothetical protein yhiM
c4290     -1      13       0.732561   Hypothetical protein yhiN
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4285     ABC2TRNSPORT  50     3e-09    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 50.3 bits (120), Expect = 3e-09
Identities = 41/171 (23%), Positives = 73/171 (42%), Gaps = 7/171 (4%)

Query: 228 REREHGTVEHLLVMPITPFEIMMAKV-WSMGLVVLVVSGLSLVLMVKGVLGVPIEGSIPL 286
R T E +L + +I++ ++ W+ L +G+ +V G + L
Sbjct: 93 RMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGY----TQWLSLL 148

Query: 287 FMLGV-ALSLFATTSIGIFMGTIARSMPQLGLLVILVLLPLQMLSGGSTPRESMPQMVQD 345
+ L V AL+ A S+G+ + +A S LV+ P+ LSG P + +P + Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 346 IMLTMPTTHFVSLAQAILYRGAGFEIVWPQFLTLMAIGGAFF-TIALLRFR 395
+P +H + L + I+ ++ + I FF + ALLR R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c4286     PF05272  30     0.047    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.047
Identities = 9/26 (34%), Positives = 14/26 (53%)

Query: 37 ARCMVGLIGPDGVGKSSLLSLISGAR 62
V L G G+GKS+L++ + G
Sbjct: 595 FDYSVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
c4287     RTXTOXIND  84     4e-20    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 84.5 bits (209), Expect = 4e-20
Identities = 71/408 (17%), Positives = 139/408 (34%), Gaps = 81/408 (19%)

Query: 6 RHLAWWVVGALAVAAVVAWWLLRPAGVP-EGFAVSNGRIEATEVDIASKIAGRIDTILVK 64
R +A++++G L +A +++ G +GR + I + I+VK
Sbjct: 58 RLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKE----IKPIENSIVKEIIVK 113

Query: 65 EGQFVREGEVLAKMDTRV----------------LQEQRLEAI----------------- 91
EG+ VR+G+VL K+ L++ R + +
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 92 -------------------AQIKEAQSAVAAAQALLEQRQSETRAAQSLVNQRQAELDSV 132
Q Q+ + L+++++E + +N+ +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 133 AKRHTRSRSLAQRGAISAQQLDDDRAAAESARAALESAKAQVSASKAAIEAARTNIIQ-- 190
R SL + AI+ + + A L K+Q+ ++ I +A+
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 191 -----------AQTRVEAAQATERRIAADID--DSELKAPRDGRV-QYRVAEPGEVLAAG 236
QT T + S ++AP +V Q +V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 237 GRVLNMVDLSDVY-MTFFLPTEQAGTLKLGGEARLILDAAPDLRIPATISFVASVAQFTP 295
++ +V D +T + + G + +G A + ++A P R V V
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYG---YLVGKVKNINL 410

Query: 296 KTVETSDERLKLMFRVKARIPPELLQQHLEYV--KTGLPGVAWVRVNE 341
+E D+RL L+F V I L + + +G+ A ++
Sbjct: 411 DAIE--DQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
c4290     ALARACEMASE  29     0.033    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 29.0 bits (65), Expect = 0.033
Identities = 23/98 (23%), Positives = 38/98 (38%), Gaps = 18/98 (18%)

Query: 226 ENLLFTHRGLSGPAVLQISSYWQPGEFVSINLLPDVDLETFL--NEQRNAHPNQSLKNTL 283
E + RG GP +L + ++ + + + L T + N Q A N LK L
Sbjct: 63 EAITLRERGWKGP-ILMLEGFFHAQD---LEIYDQHRLTTCVHSNWQLKALQNARLKAPL 118

Query: 284 AVHL------------PKRLVERLQQLGQIPDVSLKQL 309
++L P R++ QQL + +V L
Sbjct: 119 DIYLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTL 156


111  c4545  c4552  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c4545     -1      16       -5.690715  Hypothetical protein
c4546     -1      17       -4.721984  Hypothetical protein
c4547     0       22       -3.212550  S-adenosylmethionine synthetase
c4548     2       30       -4.854654  Hypothetical protein
c4549     4       32       -5.379782  Hypothetical protein
c4550     5       30       -3.829536  Hypothetical protein
c4551     7       27       -1.651118  Hypothetical protein
c4552     9       26       1.380735   Transposase insC for insertion element
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c4545     PF06580  203    4e-64    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 203 bits (519), Expect = 4e-64
Identities = 56/214 (26%), Positives = 101/214 (47%), Gaps = 14/214 (6%)

Query: 203 RKRVEIERSLHEAEFKALSYQINPHFLFNVLNTIGRLAFLEDAQRTETMVHDFSDMMRYL 262
+ ++ EA+ AL QINPHF+FN LN I L LED + M+ S++MRY
Sbjct: 149 IDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALI-LEDPTKAREMLTSLSELMRYS 207

Query: 263 LRKNSHGLITLRNEINYVNNYMSIQKVRMRDRFDYLCDIPEKYLDVVCPFLILQPLVENF 322
LR ++ ++L +E+ V++Y+ + ++ DR + I +DV P +++Q LVEN
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENG 267

Query: 323 FNYVVEPRDSNSHLLIRATDDGLNVIIEVTDNGDGIAPDTINRILSGDQKLQKGSIGINN 382
+ + +L++ T D V +EV + G +T + G+ N
Sbjct: 268 IKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTK----------ESTGTGLQN 317

Query: 383 IKNRLKLLFGESYGLEIMSPNKPRMGTTIKLRFP 416
++ RL++L+G +++ + P
Sbjct: 318 VRERLQMLYGTEAQIKLSEKQG---KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c4546     HTHFIS  59     6e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 59.5 bits (144), Expect = 6e-12
Identities = 29/139 (20%), Positives = 53/139 (38%), Gaps = 3/139 (2%)

Query: 3 TIVIVEDEPIELESLRQIISQCVENAAIHEASTGKKAIHLIDQLSQIDMILVDINIPLPN 62
TI++ +D+ L Q +S+ + S I D+++ D+ +P N
Sbjct: 5 TILVADDDAAIRTVLNQALSR--AGYDVRITSNAATLWRWIAA-GDGDLVVTDVVMPDEN 61

Query: 63 GKQVIEYLKKKNSDTKIIVITANDDFDIVRSMYNLKVDDYLLKPVKKCILTDTIKKTLAF 122
++ +KK D ++V++A + F DYL KP L I + LA
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 123 DEGENEKSRALKQKVFAMI 141
+ K Q ++
Sbjct: 122 PKRRPSKLEDDSQDGMPLV 140


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c4550     HTHTETR  72     4e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 72.4 bits (177), Expect = 4e-18
Identities = 33/63 (52%), Positives = 41/63 (65%)

Query: 10 RHTKFAAEETRKQILDVAEFCFCETGFSKTTLEMIAARAGCTRGAIYWYFNEKKDLLRQV 69
R TK A+ETR+ ILDVA F + G S T+L IA AG TRGAIYW+F +K DL ++
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 70 IER 72
E
Sbjct: 63 WEL 65


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4551     ACRIFLAVINRP  411    e-137    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 411 bits (1057), Expect = e-137
Identities = 202/339 (59%), Positives = 270/339 (79%), Gaps = 1/339 (0%)

Query: 1 MIQARNQLLAEAAKSPA-LNMVRPNGMNDEPQFQILIDDEKVQAFKLSMSDVDNIMSAAW 59
+ QARNQLL AA+ PA L VRPNG+ D QF++ +D EK QA +S+SD++ +S A
Sbjct: 694 LTQARNQLLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTAL 753

Query: 60 GSMYVNDFNDRGRVKKVYIQGEPGSRISPQDFDKWYVRNSDGDMVSFASFATGKWIYGSP 119
G YVNDF DRGRVKK+Y+Q + R+ P+D DK YVR+++G+MV F++F T W+YGSP
Sbjct: 754 GGTYVNDFIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSP 813

Query: 120 KLEQYNGISAVEILGEPAPGYSSGDAMKAIEDIAARLPEGFHISWTGLSFEERLSGSQAP 179
+LE+YNG+ ++EI GE APG SSGDAM +E++A++LP G WTG+S++ERLSG+QAP
Sbjct: 814 RLERYNGLPSMEIQGEAAPGTSSGDAMALMENLASKLPAGIGYDWTGMSYQERLSGNQAP 873

Query: 180 ALYALSLLIVFLCLAALYESWSIPFSVMLVVPLGVLGAVCATLLRGLGNDVFFQVGLLTT 239
AL A+S ++VFLCLAALYESWSIP SVMLVVPLG++G + A L NDV+F VGLLTT
Sbjct: 874 ALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTT 933

Query: 240 IGLSAKNAILIVEFARELHEKEGLSIKEAAVEAARVRLRPIIMTSLAFVMGVIPLAVSTG 299
IGLSAKNAILIVEFA++L EKEG + EA + A R+RLRPI+MTSLAF++GV+PLA+S G
Sbjct: 934 IGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNG 993

Query: 300 ASSGSKHAIGTGVVGGMITATILAIFYIPLFYMLIAGFF 338
A SG+++A+G GV+GGM++AT+LAIF++P+F+++I F
Sbjct: 994 AGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRCF 1032



Score = 77.2 bits (190), Expect = 1e-17
Identities = 64/322 (19%), Positives = 125/322 (38%), Gaps = 21/322 (6%)

Query: 29 EPQFQILIDDEKVQAFKLSMSDVDNIMSAA----WGSMYVNDFNDRGRVKKVYIQGEPGS 84
+ +I +D + + +KL+ DV N + G+ I +
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQ-TR 239

Query: 85 RISPQDFDKWYVR-NSDGDMVSFASFAT---GKWIYGSPKLEQYNGISAVEILGEPAPGY 140
+P++F K +R NSDG +V A G Y + + NG A + + A G
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNV--IARINGKPAAGLGIKLATGA 297

Query: 141 SSGDAMKAI----EDIAARLPEGFHISW---TGLSFEERLSGSQAPALYALSLLIVFLCL 193
++ D KAI ++ P+G + + T + + A+ ++VFL +
Sbjct: 298 NALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAI--MLVFLVM 355

Query: 194 AALYESWSIPFSVMLVVPLGVLGAVCATLLRGLGNDVFFQVGLLTTIGLSAKNAILIVEF 253
++ + VP+ +LG G + G++ IGL +AI++VE
Sbjct: 356 YLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVEN 415

Query: 254 ARELHEKEGLSIKEAAVEAARVRLRPIIMTSLAFVMGVIPLAVSTGASSGSKHAIGTGVV 313
+ ++ L KEA ++ ++ ++ IP+A G++ +V
Sbjct: 416 VERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIV 475

Query: 314 GGMITATILAIFYIP-LFYMLI 334
M + ++A+ P L L+
Sbjct: 476 SAMALSVLVALILTPALCATLL 497


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c4552     PF06704  25     0.047    DspF/AvrF protein
		>PF06704#DspF/AvrF protein

Length = 129

Score = 25.2 bits (55), Expect = 0.047
Identities = 20/85 (23%), Positives = 39/85 (45%), Gaps = 6/85 (7%)

Query: 28 RTTQEKIAIVQQSFEPGMTVSLVARQHGVAASQLFLWRKQYQEGSLTAVAA-GEQVVPAS 86
+ + + +S + SL A Q+GV A L+ Q E ++ + E V+
Sbjct: 2 NNSPTDFSRLIKSLGAQLGTSLTA-QNGVCA----LYDSQDNEAAVIEMPDHSEMVIFHC 56

Query: 87 ELAAAMKQIKELQRLLNKTPDVSRL 111
+ + + +LQ+LL+ DV+R+
Sbjct: 57 RVGRSPDRAADLQKLLSLNFDVARM 81


112  c4586  c4597  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c4586     -1      11       0.946105   Hypothetical protein yicM
c4587     -2      13       1.909167   Hypothetical protein yicN
c4588     -2      13       2.344932   Hypothetical protein yicO
c4589     -2      11       1.211107   Probable adenine deaminase
c4590     -2      13       0.463201   Hexose phosphate transport protein
c4591     0       13       1.468948   Regulatory protein uhpC
c4592     0       13       1.051825   Sensor protein uhpB
c4593     0       16       0.517321   Transcriptional Regulatory protein uhpA
c4594     0       14       -0.232894  Hypothetical protein
c4595     0       17       3.416481   Acetolactate synthase isozyme I small subunit
c4596     0       15       3.507771   Acetolactate synthase isozyme I large subunit
c5498     -1      17       1.781840   IlvBN operon leader peptide
c4597     -2      17       1.733042   Multidrug resistance protein D
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c4586     TCRTETA  40     1e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 40.2 bits (94), Expect = 1e-05
Identities = 36/208 (17%), Positives = 72/208 (34%), Gaps = 13/208 (6%)

Query: 88 IIVEFLPVSLLTP----MAQDLGISEGVAGQSVTVTAFVAMFASLFITQTIQATDR--RY 141
+ ++ + + L+ P + +DL S V + A A+ +DR R
Sbjct: 14 VALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRR 73

Query: 142 VVILFAVLL-TLSCLLVSFANSFSLLLIGRACLGLALGGFWAMSASLTMRLVPPRTVPKA 200
V+L ++ + +++ A +L IGR G+ G A++ + + +
Sbjct: 74 PVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDGDERARH 132

Query: 201 LSVIFGAVSIALVIAAPLGSFLGELIGWRNVFNAAAAMG----VLCIFWIIKSLPSLPGE 256
+ +V LG +G F AAAA+ + F + +S
Sbjct: 133 FGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRP 191

Query: 257 PSHQKQNTFRLLQRPGVMAGMIAIFMSF 284
+ N + M + A+ F
Sbjct: 192 LRREALNPLASFRWARGMTVVAALMAVF 219


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c4589     UREASE  38     1e-04    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 37.8 bits (88), Expect = 1e-04
Identities = 30/105 (28%), Positives = 43/105 (40%), Gaps = 17/105 (16%)

Query: 22 AVSRGDAVADYIIDNVSILDLINGGEISGPIVIKGRYIAGVG-AEYAD---------APA 71
V+R D +I N ILD + G + I +K IA +G A D P
Sbjct: 60 QVTREGGAVDTVITNALILD--HWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPG 117

Query: 72 LQRIDARGATAVPGFIDAHLHIESSMMTPVTFETATLPRGLTTVI 116
+ I G G +D+H+H + P E A L GLT ++
Sbjct: 118 TEVIAGEGKIVTAGGMDSHIH----FICPQQIEEA-LMSGLTCML 157


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c4590     TCRTETB  34     0.001    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.1 bits (78), Expect = 0.001
Identities = 28/168 (16%), Positives = 61/168 (36%), Gaps = 17/168 (10%)

Query: 49 FNIAQNDMISTYGLSMTQLGMIGLGFSITYGVGKTLVSYYADGKNTKQFLPFMLILSAIC 108
N++ D+ + + + F +T+ +G + +D K+ L F +I++ C
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIIN--C 90

Query: 109 MLGFSASMGSGSVSLFLMIAFYALSGFFQSTGGSCSYSTI----TKWTPRRKRGTFLGFW 164
+G SL +M + F Q G + + + ++ P+ RG G
Sbjct: 91 FGSVIGFVGHSFFSLLIM------ARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLI 144

Query: 165 NISHNLGGAGAAGVALFGANYLFDGHVIGMFIFPSIIALIVGFIGLRY 212
+G + A+Y+ + + + P + I+ L
Sbjct: 145 GSIVAMGEGVGPAIGGMIAHYIHWSY---LLLIP--MITIITVPFLMK 187


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c4591     TCRTETB  40     1e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 40.2 bits (94), Expect = 1e-05
Identities = 65/408 (15%), Positives = 137/408 (33%), Gaps = 60/408 (14%)

Query: 30 RHILLTIWLGYALFY--FTRKSFNAAVPEILANGVLSRSDIGLLATLFYITYGVSKFVSG 87
RH + IWL F+ N ++P+I + + + T F +T+ + V G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 88 IVSDRSNARYFMGIGLIATGIINILFGFSTSLWAFAVLWVLNAFFQGWGS---PVCARLL 144
+SD+ + + G+I +++ S F L ++ F QG G+ P ++
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHS---FFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 145 TAWY-SRTERGGWWALWNTAHNVGGALIPIVMAASALHYGWRAGMMIAGCMAIVVGIFLC 203
A Y + RG + L + +G + P + A + W ++ M ++ +
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIPMITIITVPF- 184

Query: 204 WRLRDRPQALGLPAVGEWRHDALEIAQQQEGAGLTRKEILTKYVLLNPYIWLLSFCYVLV 263
+ L +I G L I+ + Y VL
Sbjct: 185 --------LMKLLKKEVRIKGHFDIK----GIILMSVGIVFFMLFTTSYSISFLIVSVLS 232

Query: 264 YVV-----RAAINDWGNLYMSETLGVDLVTANTAVTMFELGGFI-----------GALVA 307
+++ R + + + + + + + + + GF+ A
Sbjct: 233 FLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTA 292

Query: 308 GWGSDKLFNGNRGPMNLIFAAGILL-SVGSLWLMPFASYVMQATCFFTIGFFVFGPQMLI 366
GS +F G + + GIL+ G L+++ + + F T F + +
Sbjct: 293 EIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL-SVSFLTASFLLETTSWFM 351

Query: 367 ---------GMAAAECS---------HKEAAGAATGFVGLFAYLGASL 396
G++ + ++ AGA + ++L
Sbjct: 352 TIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGT 399


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c4592     PF06580  39     2e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.5 bits (92), Expect = 2e-05
Identities = 28/142 (19%), Positives = 57/142 (40%), Gaps = 11/142 (7%)

Query: 367 LRPRQLDDLTLEQAIRSLMREMELEGRGIVSHLEWRIDESALSENQRVTLFRVCQEGLNN 426
LR ++L + + ++L L++ + + +V + Q + N
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPM-LVQTLVEN 266

Query: 427 IVKHA-----DASAVTLQGWQHDERLMLVIEDDGSGLPPDSGQ-HGFGLTGMRERVTALG 480
+KH + L+G + + + L +E+ GS ++ + G GL +RER+ L
Sbjct: 267 GIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLY 326

Query: 481 G---TLTISCLHG-TRVSVSLP 498
G + +S G V +P
Sbjct: 327 GTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c4593     HTHFIS  61     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.4 bits (149), Expect = 2e-13
Identities = 29/174 (16%), Positives = 59/174 (33%), Gaps = 20/174 (11%)

Query: 2 ITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGRGVQVCICDISMPDIS 61
T+ + DD +R+ Q L V + + + + D+ MPD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLELLSQLPK---GMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHTVATG 118
+LL ++ K + +++S ++ +A GA +L K ELI +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR---- 117

Query: 119 GCYLTPDIAIKLASGRQDPLTKRERQVAEKLAQG---MAVKEIAAELGLSPKTV 169
A+ R L + + + + + A L + T+
Sbjct: 118 --------ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTL 163


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c4597     TCRTETB  60     7e-12    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 59.5 bits (144), Expect = 7e-12
Identities = 41/184 (22%), Positives = 81/184 (44%), Gaps = 1/184 (0%)

Query: 7 RNVNLLLMLVLLVAVGQMAQTIYIPAIADMARDLNVREGAVQSVMGAYLLTYGVSQLFYG 66
R+ +L+ L +L + + + ++ D+A D N + V A++LT+ + YG
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 67 PISDRVGRRPVILVGMSIFMLATLVA-VTTSSLTVLIAASAMQGMGTGVGGVMARTLPRD 125
+SD++G + ++L G+ I +++ V S ++LI A +QG G + +
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVAR 130

Query: 126 LYERTQLRHANSLLNMGILVSPLLAPLIGGLLDTMWNWRACYLFLLVLCAGVTFSMARWM 185
+ A L+ + + + P IGG++ +W L ++ V F M
Sbjct: 131 YIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLK 190

Query: 186 PETR 189
E R
Sbjct: 191 KEVR 194


113  c4961  c4969  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c4961     0       18       3.818662   Sensor protein zraS
c4962     0       20       3.858797   Transcriptional Regulatory protein zraR
c4963     0       19       2.436656   Phosphoribosylamine--glycine ligase
c4964     0       15       0.454532   Purine biosynthesis protein PurH
c4965     0       15       -0.143026  Hypothetical protein
c4968     0       14       0.485965   *Hypothetical protein yjaA
c4969     -1      13       1.310993   Hypothetical acetyltransferase yjaB
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c4961     PF06580  36     3e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 35.6 bits (82), Expect = 3e-04
Identities = 19/96 (19%), Positives = 37/96 (38%), Gaps = 16/96 (16%)

Query: 356 NLYLNAIQAIGQHGVISVTASESGAGVKISVTDSGKGIAADQLEAIFTPYFTTKAEGTGL 415
N + I + Q G I + ++ V + V ++G + E TG
Sbjct: 266 NGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNT------------KESTGT 313

Query: 416 GLAVVHNIVEQHGG---TIQVASQEGKGATFTLWLP 448
GL V ++ G I+++ ++GK + +P
Sbjct: 314 GLQNVRERLQMLYGTEAQIKLSEKQGKV-NAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c4962     HTHFIS  527    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 527 bits (1359), Expect = 0.0
Identities = 184/468 (39%), Positives = 253/468 (54%), Gaps = 35/468 (7%)

Query: 8 ILVVDDDISHCTILQALLRGWGYNVALANSGRQALEQVREQVFDLVLCDVRMAEMDGIAT 67
ILV DDD + T+L L GY+V + ++ + DLV+ DV M + +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 68 LKEIKTLNPAIPVLIMTAYSSVETAVEALKTGALDYLIKPLDFDNLQATLEKALAHTHSV 127
L IK P +PVL+M+A ++ TA++A + GA DYL KP D L + +ALA
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRR 125

Query: 128 DAETPAVSASQFGMVGKSPAMQHLLSEIALVAPSEATVLIHGDSGTGKELVARAIHASSA 187
++ S +VG+S AMQ + +A + ++ T++I G+SGTGKELVARA+H
Sbjct: 126 PSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGK 185

Query: 188 RSEKPLVTLNCAALNESLLESELFGHEKGAFTGADKRREGRFVEADGGTLFLDEIGDISP 247
R P V +N AA+ L+ESELFGHEKGAFTGA R GRF +A+GGTLFLDEIGD+
Sbjct: 186 RRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPM 245

Query: 248 MMQVRLLRAIQEREVQRVGSNQTISVDVRLIAATHRDLAAEVNAGRFRQDLYYRLNVVAI 307
Q RLLR +Q+ E VG I DVR++AAT++DL +N G FR+DLYYRLNVV +
Sbjct: 246 DAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPL 305

Query: 308 EVPSLRQRREDIPLLANHFLQRFAERNRKAVKGFTPQAMDLLIHYDWPGNIRELENAVER 367
+P LR R EDIP L HF+Q+ + VK F +A++L+ + WPGN+RELEN V R
Sbjct: 306 RLPPLRDRAEDIPDLVRHFVQQAEKEGLD-VKRFDQEALELMKAHPWPGNVRELENLVRR 364

Query: 368 AVVLLTGEYISERELPLAIASTPIPLVQSQDIQP-------------------------- 401
L + I+ + + S +
Sbjct: 365 LTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALP 424

Query: 402 --------LVEVEKEVILAALEKTGGNKTEAARQLGITRKTLLAKLSR 441
L E+E +ILAAL T GN+ +AA LG+ R TL K+
Sbjct: 425 PSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRE 472


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4968     SHAPEPROTEIN  32     6e-04    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 31.7 bits (72), Expect = 6e-04
Identities = 23/62 (37%), Positives = 32/62 (51%), Gaps = 9/62 (14%)

Query: 49 IANFFVAEKVLQDLVLQLHPRSTWHSFLPAKRMDIVVSALEMNEGGLSQVEERILHEVVA 108
IA+FFV EK+LQ + Q+H S P+ R+ + V G +QVE R + E
Sbjct: 81 IADFFVTEKMLQHFIKQVHSNSF---MRPSPRVLVCVPV------GATQVERRAIRESAQ 131

Query: 109 GA 110
GA
Sbjct: 132 GA 133


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c4969     SACTRNSFRASE  32     4e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 4e-04
Identities = 16/52 (30%), Positives = 21/52 (40%), Gaps = 5/52 (9%)

Query: 78 IDPDVRGCGVGRMLVKHALSMAPE-----LTTNVNEQNEQAVGFYKKVGFKV 124
+ D R GVG L+ A+ A E L + N A FY K F +
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFII 148


114  c5111  c5118  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c5111     -2      15       0.859777   Phosphonates transport ATP-binding protein phnC
c5112     -2      15       0.424898   PhnB protein
c5113     -3      16       0.662966   PhnA protein
c5114     -3      15       0.444004   Hypothetical protein yjdA
c5115     -2      14       0.996263   Hypothetical protein yjcZ
c5116     -2      13       -0.733260  Proline/betaine transporter
c5117     -2      18       -0.085969  Sensor protein basS/pmrB
c5118     -1      17       -0.550735  Transcriptional Regulatory protein basR/pmrA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c5111     PF05272  29     0.016    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.016
Identities = 12/22 (54%), Positives = 13/22 (59%)

Query: 32 MVALLGPSGSGKSTLLRHLSGL 53
V L G G GKSTL+ L GL
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c5116     TCRTETA  44     9e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 44.0 bits (104), Expect = 9e-07
Identities = 57/290 (19%), Positives = 105/290 (36%), Gaps = 55/290 (18%)

Query: 96 FFGMLGDKYGRQKILAITIVIMSISTFCIGLIPSYDTIGIWAPILLLICKMAQGFSVGGE 155
G L D++GR+ +L +++ ++ + P +W +L I ++ G + G
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPF-----LW---VLYIGRIVAGIT-GAT 112

Query: 156 YTGASIFVAEYSPDRKR----GFMGSWLDFGSIAGFVLGAGVVVLISTIVGEENFLDWGW 211
A ++A+ + +R GFM + FG +AG VLG G++ S
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLG-GLMGGFSP------------ 159

Query: 212 RIPFFIALPLGIIGLYLRHALEETPAFQQHVDKLEQGDREGLQDGPKVSFKEIATKHWRS 271
PFF A L + L K E+ P SF+ W
Sbjct: 160 HAPFFAAAALNGLNFLTGCFLLPESH------KGERRPLRREALNPLASFR------WAR 207

Query: 272 LLTCIGLVIATNVTYYML----LTYMPSYLSHNLHYS-EDHGVLIIIAIMIGMLFVQPVM 326
+T + ++A ++ + H+ G+ + ++ L +
Sbjct: 208 GMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMIT 267

Query: 327 GLLSDRFGRRPFVLLG----SVALFVLA--------IPAFILINSNVIGL 364
G ++ R G R ++LG +LA P +L+ S IG+
Sbjct: 268 GPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGM 317



Score = 39.4 bits (92), Expect = 2e-05
Identities = 39/164 (23%), Positives = 73/164 (44%), Gaps = 16/164 (9%)

Query: 297 LSHNLHYSEDHGVLI-IIAIMIGMLFVQPVMGLLSDRFGRRPFVLLGSVALFVLAIPAFI 355
L H+ + +G+L+ + A+M PV+G LSDRFGRRP +L+ L A+ I
Sbjct: 35 LVHSNDVTAHYGILLALYALM--QFACAPVLGALSDRFGRRPVLLVS---LAGAAVDYAI 89

Query: 356 LINSNVIGLIFAGLLMLAVILNCFTGVMASTLPAMFPTHIR---YSALAAAFNISVLVAG 412
+ + + +++ G ++A I V + + + R + ++A F +VAG
Sbjct: 90 MATAPFLWVLYIG-RIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFG-MVAG 147

Query: 413 LTPTLAAWLVESSQNLMMPAYYLMVVAVVGLITG-VTMKETANR 455
P L + S + P + + + +TG + E+
Sbjct: 148 --PVLGGLMGGFSPH--APFFAAAALNGLNFLTGCFLLPESHKG 187


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c5117     PF06580  37     7e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 37.2 bits (86), Expect = 7e-05
Identities = 39/182 (21%), Positives = 80/182 (43%), Gaps = 34/182 (18%)

Query: 184 ARLDQMMESVSQLLQLARAGQSFSSGNYQHVKLLEDV-ILPSYDELSTML--DQRQQTLL 240
+ +M+ S+S+L++ S N + V L +++ ++ SY +L+++ D+ Q
Sbjct: 191 TKAREMLTSLSELMR-----YSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 241 LPESAADITVQGDATLLRMLLRNLVENAHRY----SPQGSNIMIKLQEDGGAV-MAVEDE 295
+ + D+ V ML++ LVEN ++ PQG I++K +D G V + VE+
Sbjct: 246 INPAIMDVQV------PPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENT 299

Query: 296 GPGIDESKCGELSKAFVRMDSRYGGIGLGLSIV-SRITQLHHGQFFLQNRQETSGTRAWI 354
G + + G GL V R+ L+ + ++ ++ A +
Sbjct: 300 GSLA--------------LKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 355 RL 356
+
Sbjct: 346 LI 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c5118     HTHFIS  92     1e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.8 bits (228), Expect = 1e-23
Identities = 42/121 (34%), Positives = 60/121 (49%)

Query: 2 KILIVEDDTLLLQGLILAAQTEGYACDGVSTARMAEQSLEAGHYSLVVLDLGLPDEDGLH 61
IL+ +DD + L A GY S A + + AG LVV D+ +PDE+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 FLARIRQKKYTLPVLILTARDTLTDKIAGLDVGADDYLVKPFALEELHARIRALLRRHNN 121
L RI++ + LPVL+++A++T I + GA DYL KPF L EL I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 Q 122
+
Sbjct: 125 R 125


115  c5179  c5187  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c5179     7       38       -3.417978  PapG protein
c5180     6       39       -2.531750  PapF protein
c5181     5       33       -0.450700  PapE protein
c5182     4       31       0.489031   PapK protein
c5183     5       31       -0.866435  Hypothetical protein
c5184     4       31       -1.407677  PapJ protein
c5185     4       30       -2.180318  PapD protein
c5186     4       28       -1.344325  PapC protein
c5187     6       32       -4.008344  PapH protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c5179     PF03627  601    0.0      PapG
		>PF03627#PapG

Length = 336

Score = 601 bits (1550), Expect = 0.0
Identities = 333/336 (99%), Positives = 334/336 (99%)

Query: 1 MKKWFPALLFSLCVSGESSAWNNIVFYSLGNVNSYQGGNVVITQRPQFITSWRPGIATVT 60
MKKWFPALLFSLCVSGESSAWNNIVFYSLGNVNSYQGGNVVITQRPQFITSWRPGIATVT
Sbjct: 1 MKKWFPALLFSLCVSGESSAWNNIVFYSLGNVNSYQGGNVVITQRPQFITSWRPGIATVT 60

Query: 61 WNQCNGPEFADGSWAYYREYIAWVVFPKKVMTQNGYPLFIEVHNKGSWSEENTGDNDSYF 120
WNQCNGP FADGSWAYYREYIAWVVFPKKVMT+NGYPLFIEVHNKGSWSEENTGDNDSYF
Sbjct: 61 WNQCNGPGFADGSWAYYREYIAWVVFPKKVMTKNGYPLFIEVHNKGSWSEENTGDNDSYF 120

Query: 121 FLKGYKWDERAFDAGNLCQKPGETTRLTEKFDDIIFKVALPADLPLGDYSVKIPYTSGMQ 180
FLKGYKWDERAFDAGNLCQKPGETTRLTEKFDDIIFKVALPADLPLGDYSV IPYTSGMQ
Sbjct: 121 FLKGYKWDERAFDAGNLCQKPGETTRLTEKFDDIIFKVALPADLPLGDYSVTIPYTSGMQ 180

Query: 181 RHFASYLGARFKIPYNVAKTLPRENEMLFLFKNIGGCRPSAQSLEIKHGDLSINSANNHY 240
RHFASYLGARFKIPYNVAKTLPRENEMLFLFKNIGGCRPSAQSLEIKHGDLSINSANNHY
Sbjct: 181 RHFASYLGARFKIPYNVAKTLPRENEMLFLFKNIGGCRPSAQSLEIKHGDLSINSANNHY 240

Query: 241 AAQTLSVSCDVPANIRFMLLRNTTPTYSHGKKFSVGLGHGWDSIVSVNGVDTGETTMRWY 300
AAQTLSVSCDVPANIRFMLLRNTTPTYSHGKKFSVGLGHGWDSIVSVNGVDTGETTMRWY
Sbjct: 241 AAQTLSVSCDVPANIRFMLLRNTTPTYSHGKKFSVGLGHGWDSIVSVNGVDTGETTMRWY 300

Query: 301 KAGTQNLTIGSRLYGESSKIQPGVLSGSATLLMILP 336
KAGTQNLTIGSRLYGESSKIQPGVLSGSATLLMILP
Sbjct: 301 KAGTQNLTIGSRLYGESSKIQPGVLSGSATLLMILP 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c5180     FIMBRIALPAPF  267    5e-95    Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.

Length = 167

Score = 267 bits (683), Expect = 5e-95
Identities = 155/167 (92%), Positives = 156/167 (93%), Gaps = 1/167 (0%)

Query: 11 MARLSLFISLLLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGE 70
M RLSLFISLLLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGE
Sbjct: 1 MIRLSLFISLLLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNINPEHVDNSRGE 60

Query: 71 VTKTISISCTYKSGSPWIKVTGNAMA-GQTNVLATNIANFGIALYQGKGMSTPLTLGNGS 129
VTK ISISC YKSGS WIKVTGN M GQ NVLATNI +FGIALYQGKGMSTPLTLGNGS
Sbjct: 61 VTKNISISCPYKSGSLWIKVTGNTMGVGQNNVLATNITHFGIALYQGKGMSTPLTLGNGS 120

Query: 130 GNGYRVTAGLDTARSTFTFTSVPFRNGSRTLNGGDFRTTASMSMIYN 176
GNGYRVTAGLDTARSTFTFTSVPFRNGS LNGGDFRTTASMSMIYN
Sbjct: 121 GNGYRVTAGLDTARSTFTFTSVPFRNGSGILNGGDFRTTASMSMIYN 167


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c5181     FIMBRIALPAPE  306    e-110    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 306 bits (784), Expect = e-110
Identities = 128/173 (73%), Positives = 145/173 (83%)

Query: 7 MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVTKAEVDWGNVEIQTLSQNG 66
MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTV AEV+WG++EIQ L Q+G
Sbjct: 1 MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVQNAEVNWGDIEIQNLVQSG 60

Query: 67 NHEKEFTVNMQCPYHLGTMKVTITATNTYNNAILVQNTSNTSSDGVLVYLYNSNAGNIGT 126
++K+FTV+M CPY LGTMKVTIT+ N+ILV NTS S DG+L+YLYNSN IG
Sbjct: 61 GNQKDFTVDMNCPYSLGTMKVTITSNGQTGNSILVPNTSTASGDGLLIYLYNSNNSGIGN 120

Query: 127 AITLGTPFTPGKITGNNADRTISLHAKLGYKGNMQSLKAGDFSATATLVASYS 179
A+TLG+ TPGKITG R I+L+AKLGYKGNMQSL+AG FSATATLVASYS
Sbjct: 121 AVTLGSQVTPGKITGTAPARKITLYAKLGYKGNMQSLQAGTFSATATLVASYS 173


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c5186     PF00577  742    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 742 bits (1918), Expect = 0.0
Identities = 243/882 (27%), Positives = 362/882 (41%), Gaps = 67/882 (7%)

Query: 5 MRGMKDRI-PFAVNNITCVILLSLFCNAASAVEFNTDVLDAADKKNIDFTRFSEAGYVLP 63
+ K R+ F V + +++ + FN L + D +RF + P
Sbjct: 16 LHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPP 75

Query: 64 GQYLLDVIVNGQSISPASLQISFVEPQSSGDKAEKKLPQACLTSDMVRLMGLTAESLDKV 123
G Y +D+ +N + A+ ++F S CLT + MGL S+ +
Sbjct: 76 GTYRVDIYLNNGYM--ATRDVTFNTGDSEQGI------VPCLTRAQLASMGLNTASVSGM 127

Query: 124 VYWHDGQCADF-HGLPGVDIRPDTGAGVLRINMPQAWLEYSDATWLPPSRWDDGIPGLML 182
D C + + D G L + +PQA++ ++PP WD GI +L
Sbjct: 128 NLLADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLL 187

Query: 183 DYNLNGTVSRNYQGGDSHQFSYNGTVGGNLGPWRLRADYQGSQEQSRYNGEKTTNRNFTW 242
+YN +G +N GG+SH N G N+G WRLR + S S + + +
Sbjct: 188 NYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSS--DSSSGSKNKWQH 245

Query: 243 SRFYLFRAIPRWRANLTLGENNINSDIFRSWSYTGASLESDDRMLPPRLRGYAPQITGIA 302
+L R I R+ LTLG+ DIF ++ GA L SDD MLP RG+AP I GIA
Sbjct: 246 INTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIA 305

Query: 303 ETNARVVVSQQGRVLYDSMVPAGPFSIQDLD-SSVRGRLDVEVIEQNGRKKTFQVDTASV 361
A+V + Q G +Y+S VP GPF+I D+ + G L V + E +G + F V +SV
Sbjct: 306 RGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSV 365

Query: 362 PYLTRPGQVRYKLVSGRSRGYGHETEGPVFATGEASWGLSNQWSLYGGAVLAGDYNALAA 421
P L R G RY + +G R + E P F GL W++YGG LA Y A
Sbjct: 366 PLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNF 425

Query: 422 GAGWDLGVPGTLSADITQSVARIEGERTFQGKSWRLSYSKRFDNADADITFAGYRFSERN 481
G G ++G G LS D+TQ+ + + + G+S R Y+K + + +I GYR+S
Sbjct: 426 GIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSG 485

Query: 482 YMTMEQYLNARYR--------------------NDYSSREKEMYTVTLNKNVADWNTSFN 521
Y +R + + ++ +T+ + + + +
Sbjct: 486 YFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTS-TLY 544

Query: 522 LQYSRQTYWDIRKTD-YYTVSVNRYFNVFGLQGVAVGLSASRSKYLGRD--NDSAYLRIS 578
L S QTYW D + +N F + LS S +K + + L ++
Sbjct: 545 LSGSHQTYWGTSNVDEQFQAGLNTAFE-----DINWTLSYSLTKNAWQKGRDQMLALNVN 599

Query: 579 VPLGT------------GTASYSGSMSND-RYVNMAGYTDM-FNDGLDSYSLNAGLNSGG 624
+P +ASYS S + R N+AG D SYS+ G GG
Sbjct: 600 IPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGG 659

Query: 625 GLTSQRQINAYYSHRSPLANLSANIASLQKGYTSFGVSASGGATITGKGAALHAGGMSGG 684
S A ++R N + S SGG G L G
Sbjct: 660 DGNSGSTGYATLNYRGGYGNANIG-YSHSDDIKQLYYGVSGGVLAHANGVTL--GQPLND 716

Query: 685 TRLLVDTDGVGGVPVDGGQVV-TNRWGTGVVTDISSYYRNTTSVDLKRLPDDVEATRSVV 743
T +LV G V+ V T+ G V+ + Y N ++D L D+V+ +V
Sbjct: 717 TVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVA 776

Query: 744 ESALTEGAIGYRKFSVLKGKRLFAILRLADGSQPPFGASVTSEKGRELGMVADEGLAWLS 803
T GAI +F G +L L + PFGA VTSE + G+VAD G +LS
Sbjct: 777 NVVPTRGAIVRAEFKARVGIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQVYLS 835

Query: 804 GVTPGETLSVNW--DGKIQCQVNVPETAISDQQLL----LPC 839
G+ + V W + C N S QQLL C
Sbjct: 836 GMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c5187     FIMBRIALPAPE  32     0.001    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 31.5 bits (71), Expect = 0.001
Identities = 41/173 (23%), Positives = 75/173 (43%), Gaps = 29/173 (16%)

Query: 29 GMSLPEYWG----EEHVWWDGRAAFHGEVVRPACTLAMEDAWQIIDMGETPVRDL-QNGF 83
G+ LP G +HV F G+++ PACT+ + ++ G+ +++L Q+G
Sbjct: 6 GLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTVQNAE----VNWGDIEIQNLVQSG- 60

Query: 84 SGPERKFSLRLRNCEFNSQGGNLFSDSRIRVTFDGVRGET---PDKFNLSGQAKGINLQI 140
G ++ F++ + NC ++ ++ +T +G G + P+ SG I L
Sbjct: 61 -GNQKDFTVDM-NCPYS------LGTMKVTITSNGQTGNSILVPNTSTASGDGLLIYLYN 112

Query: 141 ADARGNIARAGKV-MPAIP--LTGNEEALDYTLRIVR----NGKKLEAGNYFA 186
++ I A + P +TG A TL N + L+AG + A
Sbjct: 113 SNN-SGIGNAVTLGSQVTPGKITGTAPARKITLYAKLGYKGNMQSLQAGTFSA 164


116  c5396  c5415  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c5396     3       15       -1.141126  Outer membrane usher protein fimD precursor
c5397     0       15       1.041905   FimF protein precursor
c5398     -1      21       2.667356   Hypothetical protein
c5399     -1      20       2.737563   FimG protein precursor
c5400     -2      20       2.590588   FimH protein precursor
c5401     0       23       2.966686   High-affinity gluconate transporter
c5402     -1      20       2.411155   Mannonate dehydratase
c5403     -1      18       2.809674   D-mannonate oxidoreductase
c5404     -1      17       1.610865   Uxu operon transcriptional regulator
c5405     -2      19       1.348035   Hypothetical protein yjiD
c5406     -1      20       1.206869   Hypothetical protein
c5407     1       19       1.633802   Hypothetical transcriptional regulator yjiE
c5408     2       19       -0.088261  Isoaspartyl dipeptidase
c5409     3       21       -1.483271  Hypothetical protein yjiG
c5410     1       20       -0.703036  Hypothetical protein yjiH
c5411     1       19       -0.319132  Hypothetical protein
c5414     2       18       0.312389   Hypothetical protein yjiJ
c5415     2       16       -0.696130  Hypothetical protein yjiK
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c5396     PF00577  1098   0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 1098 bits (2842), Expect = 0.0
Identities = 877/878 (99%), Positives = 877/878 (99%)

Query: 1 MSYLNLRLYQRNTQCLHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQA 60
MSYLNLRLYQRNTQCLHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQA
Sbjct: 1 MSYLNLRLYQRNTQCLHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQA 60

Query: 61 VADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLN 120
VADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLN
Sbjct: 61 VADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLN 120

Query: 121 TASVSGMNLLADDACVPLTSMIHDATAHLDVGQQRLNLTIPQAFMSNRARGYIPPELWDP 180
TASVSGMNLLADDACVPLTSMIHDATA LDVGQQRLNLTIPQAFMSNRARGYIPPELWDP
Sbjct: 121 TASVSGMNLLADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDP 180

Query: 181 GINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSK 240
GINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSK
Sbjct: 181 GINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSK 240

Query: 241 NKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPV 300
NKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPV
Sbjct: 241 NKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPV 300

Query: 301 IHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTV 360
IHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTV
Sbjct: 301 IHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTV 360

Query: 361 PYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRY 420
PYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRY
Sbjct: 361 PYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRY 420

Query: 421 RAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYR 480
RAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYR
Sbjct: 421 RAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYR 480

Query: 481 YSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRT 540
YSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRT
Sbjct: 481 YSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRT 540

Query: 541 STLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNI 600
STLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNI
Sbjct: 541 STLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNI 600

Query: 601 PFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGD 660
PFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGD
Sbjct: 601 PFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGD 660

Query: 661 GNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVL 720
GNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVL
Sbjct: 661 GNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVL 720

Query: 721 VKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP 780
VKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP
Sbjct: 721 VKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP 780

Query: 781 TRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLA 840
TRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLA
Sbjct: 781 TRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLA 840

Query: 841 GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878
GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR
Sbjct: 841 GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c5399     VACCYTOTOXIN  30     0.004    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 30.4 bits (68), Expect = 0.004
Identities = 30/158 (18%), Positives = 49/158 (31%), Gaps = 9/158 (5%)

Query: 3 WRKRGYLLAAMLAFASATIQAADVTITVNGKVVAKPCTVSTTNATVDLGDLYSFSLMSAG 62
W R + A LA + +TI + VT VN + + + + G
Sbjct: 258 WMGRLQYVGAYLAPSYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTH------IG 311

Query: 63 AASAWHDVALELTNCPVG--TSRVTASFSGAADSTGYYKNQGTAQNIQLELQDDSGNTLN 120
W L + P G + S + Q ++QN + N+
Sbjct: 312 TLDLWQSAGLNIIAPPEGGYKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQVINPPNSAQ 371

Query: 121 TGATKTVQVDDSSQSAHFPLQVRALTVNGGATQGTIQA 158
+ QV D + V +N A GTI+
Sbjct: 372 KTEIQPTQVIDGPFAGGKNTVVNINRINTNA-DGTIRV 408


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c5400     SURFACELAYER  28     0.045    Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 28.1 bits (62), Expect = 0.045
Identities = 19/79 (24%), Positives = 32/79 (40%), Gaps = 1/79 (1%)

Query: 214 SQNLGYYLSGTTADAGNSIFTNTASFSPAQGVGVQLTRNGTIIPANNTVSLGAVGTSAVS 273
S+N G ++ +A+ N FT PA V V L ++G ++ + + +
Sbjct: 133 SENAGKEITIGSAN-PNVTFTEKTGDQPASTVKVTLDQDGVAKLSSVQIKNVYAIDTTYN 191

Query: 274 LGLTANYARTGGQVTAGNV 292
+ TG VT G V
Sbjct: 192 SNVNFYDVTTGATVTTGAV 210


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c5401     PF06580  31     0.008    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.4 bits (71), Expect = 0.008
Identities = 10/49 (20%), Positives = 25/49 (51%)

Query: 230 LVPLIPAIIMISTTIANIWLVKDTPAWEVVNFIGSSPIAMFIAMVVAFV 278
+ +I ++ I +W V +T W ++ FI + P+A + + ++ +
Sbjct: 73 MGQIILRVLPACVVIGMVWFVANTSIWRLLAFINTKPVAFTLPLALSII 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c5406     PHPHTRNFRASE  153    1e-47    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 153 bits (389), Expect = 1e-47
Identities = 46/110 (41%), Positives = 66/110 (60%)

Query: 1 MIEVPAAVMIADKLASEVDFFSIGTNDLTQYIMAADRGNSTVAKLVDYCNDAVINAIAMV 60
M+E+P+ + A+ A EVDFFSIGTNDL QY MAADR N V+ L + A++ + MV
Sbjct: 430 MVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPYHPAILRLVDMV 489

Query: 61 CQAGRNNEIPVSMCGEMAGDIQQTARLLTMGIDKLSASPSRLPALKAAIR 110
+A + V MCGEMAGD LL +G+D+ S S + + ++ +
Sbjct: 490 IKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQLL 539


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c5408     UREASE  35     4e-04    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 35.5 bits (82), Expect = 4e-04
Identities = 30/129 (23%), Positives = 48/129 (37%), Gaps = 33/129 (25%)

Query: 26 CDVLIANGKIIAVASNIPSDIVPDCT--------VVDLSGQILCPGFIDQHVHLIGG--- 74
D+ + +G+I A+ D+ P T V+ G+I+ G +D H+H I
Sbjct: 86 ADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIVTAGGMDSHIHFICPQQI 145

Query: 75 ------------GGEAGP------TTRTP-EVALSRLTEA--GITSVVGLLGTDSISRHP 113
GG GP TT TP ++R+ EA + G + S P
Sbjct: 146 EEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIARMIEAADAFPMNLAFAGKGNAS-LP 204

Query: 114 ESLLAKTRA 122
+L+
Sbjct: 205 GALVEMVLG 213


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c5414     TCRTETA  29     0.027    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 29.4 bits (66), Expect = 0.027
Identities = 58/312 (18%), Positives = 106/312 (33%), Gaps = 16/312 (5%)

Query: 82 RPFLLASALASGLLILAMAWLPPFILVLLIRVLAGVASAGMLIFGSTLIMQHTRHPFVLA 141
RP LL S + + MA P ++ + R++AG+ A + G+ +
Sbjct: 73 RPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARH 132

Query: 142 ALFSGVGIGIALSNEYVLAGLHFDLSSQTLWQGAGALSGMMLIALTLLMP-SKKHAIAPM 200
F G + VL GL S + A AL+G+ + L+P S K P+
Sbjct: 133 FGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPL 192

Query: 201 PLAKTEQQIMSWW---------LLAILYGLAGFGYIIVATYLPLMAKDAGSPLLTAHLWT 251
W L+A+ + + G + A ++ T + +
Sbjct: 193 RREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGI-S 251

Query: 252 LVGLSIVPGCFGWLWA---AKRWGALPCLTANLLVQAIS-VLLTLASDSPLLLIISSLGF 307
L I+ + A R G L ++ +LL A+ + I L
Sbjct: 252 LAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVL-L 310

Query: 308 GGTFMGTTSLVMTIARQLSVPGNLNLLGFVTLIYGIGQILGPALTSMLGNGTSALASATL 367
+G +L ++RQ+ L G + + + I+GP L + + + +
Sbjct: 311 ASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTWNGWA 370

Query: 368 CGAAALFIAALI 379
A A +
Sbjct: 371 WIAGAALYLLCL 382


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c5415     ADHESNFAMILY  29     0.026    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 29.1 bits (65), Expect = 0.026
Identities = 10/45 (22%), Positives = 17/45 (37%)

Query: 54 LFVIVAVCTFFVQSCARKSNHAASFQNYHATIDGKEIAGITNNIS 98
+++ + + +CA S Q IA IT NI+
Sbjct: 6 TLLVLFLSAIILVACASGKKDTTSGQKLKVVATNSIIADITKNIA 50


117  c5482  c5488  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
c5482     -2      15       1.028931   Probable phosphoglycerate mutase 2
c5483     -1      13       0.429038   Right origin-binding protein
c5484     0       14       0.048849   CreA protein
c5485                                 Transcriptional Regulatory protein creB
c5486                                 Sensor protein creC
c5487                                 Inner membrane protein creD
c5488                                 Aerobic respiration control protein arcA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
c5482     VACCYTOTOXIN  29     0.017    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 28.8 bits (64), Expect = 0.017
Identities = 14/45 (31%), Positives = 20/45 (44%), Gaps = 4/45 (8%)

Query: 145 PLLVSHGIALGCLVSTILGLPAWAERRLRLRNCSISRVDYQESLW 189
P +V GIA G V T+ GL W ++ N D + +W
Sbjct: 42 PAIVG-GIATGAAVGTVSGLLGWGLKQAEEAN---KTPDKPDKVW 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c5485     HTHFIS  90     9e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.5 bits (222), Expect = 9e-23
Identities = 34/139 (24%), Positives = 61/139 (43%)

Query: 1 MQRETVWLVEDEQGIADTLVYMLQQEGFDVEVFERGLPVLDKARQQVPDVMILDVGLPDI 60
M T+ + +D+ I L L + G+DV + + D+++ DV +PD
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 SGFELCRQLLALHPALPVLFLTARSEEVDRLLGLEIGADDYVAKPFSPREVCARVRTLLR 120
+ F+L ++ P LPVL ++A++ + + E GA DY+ KPF E+ + L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 RVKKFSTPSPVIRIGHFEL 139
K+ + L
Sbjct: 121 EPKRRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
c5486     PF06580  32     0.006    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.8 bits (72), Expect = 0.006
Identities = 41/182 (22%), Positives = 72/182 (39%), Gaps = 40/182 (21%)

Query: 312 LRQARLENRQEVVLTAVDVAALFR---RVSEARTVQLAE--KNITLHVM--------PTE 358
+R LE+ + ++ L R R S AR V LA+ + ++ +
Sbjct: 182 IRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQ 241

Query: 359 VNVASEPALLEQALGNLL-----DNA----IDFTPESGCITLSAEVDQEYVTLKVLDTGS 409
PA+++ + +L +N I P+ G I L D VTL+V +TGS
Sbjct: 242 FENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGS 301

Query: 410 GIPDYALSRIFERFYSLPRANGQKSSGLGLAFVSE-VARLFNGEVTLR-NVQEGGVLASL 467
N ++S+G GL V E + L+ E ++ + ++G V A +
Sbjct: 302 LALK----------------NTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 468 RL 469
+
Sbjct: 346 LI 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
c5488     HTHFIS  82     4e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 81.8 bits (202), Expect = 4e-20
Identities = 30/122 (24%), Positives = 60/122 (49%), Gaps = 1/122 (0%)

Query: 1 MQTPHILIVEDELVTRNTLKSIFEAEGYDVFEATDGAEMHQILSEYDINLVIMDINLPGK 60
M IL+ +D+ R L GYDV ++ A + + ++ D +LV+ D+ +P +
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 NGLLLARELRE-QANVALMFLTGRDNEVDKILGLEIGADDYITKPFNPRELTIRARNLLS 119
N L +++ + ++ ++ ++ ++ + I E GA DY+ KPF+ EL L+
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 RT 121

Sbjct: 121 EP 122



 
Contact Sachin Pundhir for Bugs/Comments.