PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
A) Input parameters
Genome: ATCC33237.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic): (none listed)
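
The E-value cutoff listed in the input parameters can be read as a filter on the virulence-factor hits shown later in this report. The following is a minimal sketch of that reading, not PredictBias's own code: the `Hit` type and `passes_cutoff` function are hypothetical names, and the rule "report a hit when its E-value is at or below the cutoff" is an assumption. The four hits are the ones listed for island 1 in section C.

```python
from typing import NamedTuple

E_VALUE_CUTOFF = 0.05  # "E-value (RPSBlast)" from the input parameters


class Hit(NamedTuple):
    locus_tag: str
    signature: str
    e_value: float


def passes_cutoff(hit: Hit, cutoff: float = E_VALUE_CUTOFF) -> bool:
    # Assumption: a hit is reported when its E-value is at or below the cutoff.
    return hit.e_value <= cutoff


# Best-scoring hits reported for island 1 in section C of this report.
island_1_hits = [
    Hit("CCON33237_RS00005", "ACRIFLAVINRP", 0.013),
    Hit("CCON33237_RS00035", "FLGHOOKAP1", 8e-06),
    Hit("CCON33237_RS00050", "TCRTETOQM", 7e-57),
    Hit("CCON33237_RS00060", "PF05272", 0.043),
]

reported = [h for h in island_1_hits if passes_cutoff(h)]
weakest = max(island_1_hits, key=lambda h: h.e_value)
```

Under this reading all four hits pass, and the weakest one (PF05272 at 0.043) sits just under the cutoff, which is worth keeping in mind when interpreting short, low-identity alignments.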
 
C) Potential GIs and PAIs in NZ_CP012541
S.No | Start | End | Bias | Virulence | Insertion elements | Prediction
1 | CCON33237_RS00005 | CCON33237_RS00060 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
CCON33237_RS00005113-3.014791chromosomal replication initiation protein DnaA
CCON33237_RS00010113-2.074438DNA polymerase III subunit beta
CCON33237_RS00015112-0.692964DNA gyrase subunit B
CCON33237_RS00020112-2.042100preQ(1) synthase
CCON33237_RS00025-111-0.276458hydrolase
CCON33237_RS000300111.271060hypothetical protein
CCON33237_RS000351144.079757flagellar biosynthesis protein FlgC
CCON33237_RS000403226.489637flagellar basal body rod modification protein
CCON33237_RS000454247.054780flagellar hook-length control protein FliK
CCON33237_RS000502217.307857GTP-binding protein
CCON33237_RS00060-1165.198581pyrroline-5-carboxylate reductase
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS00005 | ACRIFLAVINRP | 31 | 0.013 | Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 31.0 bits (70), Expect = 0.013
Identities = 18/101 (17%), Positives = 40/101 (39%), Gaps = 11/101 (10%)

Query: 287 EVINYIATNMGDNIREIEG-AIINLNVFKTLMKEEITLDLAKSI-----LKDLIKE-KRE 339
++ +Y+A+N+ D + + G + L + M I LD D+I + K +
Sbjct: 153 DISDYVASNVKDTLSRLNGVGDVQLFGAQYAM--RIWLDADLLNKYKLTPVDVINQLKVQ 210

Query: 340 NINFD--TIVEIVSKELNIKQSDIKSKSRVTNIVEARRIII 378
N + + + I +++R N E ++ +
Sbjct: 211 NDQIAAGQLGGTPALPGQQLNASIIAQTRFKNPEEFGKVTL 251
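
The `Score`/`Expect` and `Identities`/`Positives`/`Gaps` lines that precede each alignment follow a fixed textual pattern, so they can be pulled out with two regular expressions. This is a minimal sketch under that assumption; `parse_expect` and `parse_hsp` are hypothetical helper names, not part of PredictBias or BLAST. It also handles the abbreviated form `e-123` that appears further down in this report, which Python's `float()` does not accept without a leading digit.

```python
import re

SCORE_RE = re.compile(r"Score = ([\d.]+) bits \((\d+)\), Expect = (\S+)")
STATS_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_expect(text: str) -> float:
    # Very small values are printed as "e-123"; float() needs "1e-123".
    return float("1" + text if text.startswith("e") else text)


def parse_hsp(score_line: str, stats_line: str) -> dict:
    match = SCORE_RE.search(score_line)
    if match is None:
        raise ValueError(f"not a score line: {score_line!r}")
    result = {
        "bits": float(match.group(1)),
        "raw_score": int(match.group(2)),
        "expect": parse_expect(match.group(3)),
    }
    # Each statistic is stored as (count, alignment length, printed percent).
    # "Gaps" is simply absent from the result when the line omits it.
    for name, count, length, percent in STATS_RE.findall(stats_line):
        result[name.lower()] = (int(count), int(length), int(percent))
    return result


# The two summary lines of the alignment directly above.
hsp = parse_hsp(
    "Score = 31.0 bits (70), Expect = 0.013",
    "Identities = 18/101 (17%), Positives = 40/101 (39%), Gaps = 11/101 (10%)",
)
```

For this alignment the printed percentages agree with integer (floor) division of the counts, for example `18 * 100 // 101` gives 17, so the raw counts are the more precise values to keep.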


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS00035 | FLGHOOKAP1 | 41 | 8e-06 | Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.1 bits (96), Expect = 8e-06
Identities = 43/235 (18%), Positives = 81/235 (34%), Gaps = 32/235 (13%)

Query: 302 NKEKLSSEIYNSDGTKSLVTINFTKQIPQGGNQTTWNATATITDANGVVQNTAMGTLTFD 361
NK ++ +D + + T + W T ++ V A G + FD
Sbjct: 340 NKGDVAIGATVTDAS----AVLATDY-KISFDNNQWQVTRLASNTTFTVTPDANGKVAFD 394

Query: 362 GSGRLVTNTLTSVGNVALNFLGDGDANVYNGITSSAN-SKKDFVIKADGYAEGNLSKYSV 420
G T T + L + D N+ IT A + D
Sbjct: 395 GLELTFTGTPAVNDSFTLKPVSDAIVNMDVLITDEAKIAMASEEDAGDS----------- 443

Query: 421 DDRGNIMANFD---NSRSFIVAKIALYHFQNEQGVSKVGDNLYEATPNSGEAFFYKNKAG 477
D N A D NS++ AK + + S +G+ +S G
Sbjct: 444 -DNRNGQALLDLQSNSKTVGGAKSFNDAYASLV--SDIGNKTATLKTSS-------ATQG 493

Query: 478 ETIYGSQILSNKLEMSNVDLGQALSEVIVTQKAYEASAKSITTSDEMIQTAIQMK 532
+ +Q+ + + +S V+L + + Q+ Y A+A+ + T++ + I ++
Sbjct: 494 NVV--TQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS00050 | TCRTETOQM | 196 | 7e-57 | Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 196 bits (499), Expect = 7e-57
Identities = 108/447 (24%), Positives = 189/447 (42%), Gaps = 86/447 (19%)

Query: 3 KIRNIAVIAHVDHGKTTMVDELLKQSGTFNE--HQNLGERVMDSNDIERERGITILSKNT 60
KI NI V+AHVD GKTT+ + LL SG E + G D+ +ER+RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 61 AIRYKDTKINIIDTPGHADFGGEVERVLKMVDGVLLLVDAQEGVMPQTKFVVKKALSLGL 120
+ ++++TK+NIIDTPGH DF EV R L ++DG +LL+ A++GV QT+ + +G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 121 RPIVVVNKIDKPAGDPDRVINEIFDLFVA----------------------------LDA 152
I +NKID+ D V +I + A ++
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 153 NDEQLE--------------------------FPVVYAAAKNGYAKLKLSDENKDMQPLF 186
ND+ LE FPV + +AKN N + L
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKN----------NIGIDNLI 231

Query: 187 ETILSHVPAPSGSDENSLQLQVFTLDYDNYVGKIGIARIFNGKISKNQNVMLAKADGTKT 246
E I + + + ++ L +VF ++Y ++ R+++G + +V ++ K
Sbjct: 232 EVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRIS----EKE 287

Query: 247 TGRISKLIGFMGLERTDINEAGTGDIVAIAGFDA---LDVGDSVVDPNNPHPLDPLHIEE 303
+I+++ + E I++A +G+IV + +GD+ + P +PL
Sbjct: 288 KIKITEMYTSINGELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQRERIENPL---- 343

Query: 304 PTLSVVFSVNDGPLAGTEGKHVTSNKIDERLANEMKTNIAMKYENIGEGKFKVSGRGELQ 363
P L + + +++ Y + + +S G++Q
Sbjct: 344 PLLQTTVEPSKPQQREMLLDALLE------ISDSDPL--LRYYVDSATHEIILSFLGKVQ 395

Query: 364 ITILAENMRRE-GYEFLLGRPEVIVKE 389
+ + ++ + E + P VI E
Sbjct: 396 MEVTCALLQEKYHVEIEIKEPTVIYME 422



Score = 41.8 bits (98), Expect = 7e-06
Identities = 20/80 (25%), Positives = 29/80 (36%), Gaps = 1/80 (1%)

Query: 396 EPYELLVIDAPDDTTGTVIEKLGKRKAEMVSMNPTGDGQTRIEFEIPARGLIGFRSQFLT 455
EPY I AP + K A +V + + + EIPAR + +RS
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCIQEYRSDLTF 595

Query: 456 DTKGEGVMNHSFLEFRPLSG 475
T G V + +G
Sbjct: 596 FTNGRSVCLTELKGYHVTTG 615


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS00060 | PF05272 | 28 | 0.043 | Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.043
Identities = 9/70 (12%), Positives = 19/70 (27%)

Query: 20 EALWRAQSDEVDAGRSFAEEERKFDGGSEAQRAENLAQTGKTSSQRECEKKTKFEIIACT 79
E +R + + A AE AQ G + + + +
Sbjct: 751 EIYFRPEQELRLVETGVQGRLWALLTREGAPAAEGAAQKGYSVNTTFVTIADLVQALGAD 810

Query: 80 RSKNEALRQR 89
K+ + +
Sbjct: 811 PGKSSPMLEG 820


2 | CCON33237_RS01185 | CCON33237_RS01215 | Y | N | N | Genomic Island
LocusTag | DNBias | CDNBias | %GCBias | Product
CCON33237_RS01185424-6.314014hypothetical protein
CCON33237_RS01190827-6.905958hypothetical protein
CCON33237_RS011951028-6.890494hypothetical protein
CCON33237_RS093501128-6.490913hypothetical protein
CCON33237_RS01205926-5.600648hypothetical protein
CCON33237_RS01210825-4.602756hypothetical protein
CCON33237_RS01215722-2.297945hypothetical protein
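
The Prediction column in this report appears to follow from the Bias and Virulence flags: islands flagged for both are labelled "Pathogenicity Island (biased-composition)", while islands flagged for bias alone are labelled "Genomic Island". The sketch below encodes only that observed pattern. It is an inference from the rows of this report, not PredictBias's documented decision procedure, and `label_island` is a hypothetical name. The Insertion elements flag does not change the label in any row seen here, so it is ignored.

```python
def label_island(bias: bool, virulence: bool) -> str:
    """Reproduce the Prediction label seen in this report from two flags."""
    if bias and virulence:
        return "Pathogenicity Island (biased-composition)"
    if bias:
        return "Genomic Island"
    # No row without a bias flag occurs in this report, so no label is assumed.
    return "unlabelled"


# (S.No, Bias, Virulence, Prediction) for four of the islands listed in this report.
observed = [
    (1, True, True, "Pathogenicity Island (biased-composition)"),
    (2, True, False, "Genomic Island"),
    (3, True, True, "Pathogenicity Island (biased-composition)"),
    (6, True, False, "Genomic Island"),
]

mismatches = [n for n, b, v, label in observed if label_island(b, v) != label]
```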
3 | CCON33237_RS01875 | CCON33237_RS02010 | Y | Y | Y | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
CCON33237_RS018750183.618223hypothetical protein
CCON33237_RS018801214.853719tryptophan--tRNA ligase
CCON33237_RS018853245.176233shikimate kinase
CCON33237_RS018904276.009241ribosome biogenesis GTPase Der
CCON33237_RS018957327.668569hypothetical protein
CCON33237_RS019007368.9605413-deoxy-8-phosphooctulonate synthase
CCON33237_RS019054317.682181chaperonin
CCON33237_RS019103317.403972multidrug transporter
CCON33237_RS019153223.0795526,7-dimethyl-8-ribityllumazine synthase
CCON33237_RS019203230.174134N utilization substance protein B
CCON33237_RS01925-127-4.282111orotidine 5'-phosphate decarboxylase
CCON33237_RS01930131-5.685820Pathogenicity locus
CCON33237_RS01935234-8.119703*site-specific integrase
CCON33237_RS01945337-8.996077hypothetical protein
CCON33237_RS01950439-7.016617radical SAM protein
CCON33237_RS01955438-7.040200ABC transporter ATP-binding protein
CCON33237_RS01960539-6.630363zonular occludens toxin
CCON33237_RS01965542-6.263313hypothetical protein
CCON33237_RS01970641-5.259597hypothetical protein
CCON33237_RS01975637-6.062336hypothetical protein
CCON33237_RS01980727-4.586168hypothetical protein
CCON33237_RS01990523-3.649347hypothetical protein
CCON33237_RS01995422-3.746492hypothetical protein
CCON33237_RS02000322-3.899610hypothetical protein
CCON33237_RS02005222-3.420258hypothetical protein
CCON33237_RS02010217-2.296039LemA family protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS01880 | BORPETOXINA | 31 | 0.004 | Bordetella pertussis toxin A subunit signature.
		>BORPETOXINA#Bordetella pertussis toxin A subunit signature.

Length = 269

Score = 31.3 bits (70), Expect = 0.004
Identities = 19/60 (31%), Positives = 33/60 (55%), Gaps = 7/60 (11%)

Query: 237 KLFLDESGQKELQARYERGGEGHGHFKAYLNELIWD--YFKDAREKFEH---YQNNPGEV 291
+++L+ Q+ ++A ER G G GHF Y+ E+ D ++ A FE+ Y +N G +
Sbjct: 95 EVYLEHRMQEAVEA--ERAGRGTGHFIGYIYEVRADNNFYGAASSYFEYVDTYGDNAGRI 152


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS01890 | TCRTETOQM | 35 | 7e-04 | Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 34.8 bits (80), Expect = 7e-04
Identities = 30/146 (20%), Positives = 64/146 (43%), Gaps = 7/146 (4%)

Query: 196 KNIRVGIIGRVNVGKSSLLNALVKESRAV--VSDV-AGTTIDPVNEIYEHDGRVFEFVDT 252
K I +G++ V+ GK++L +L+ S A+ + V GTT + G + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 253 AGIRKRGKIEGIERYA----LNRTEKILEETDVALLVLDSSEPLTELDERIAGIASKFEL 308
+ + K+ I+ L + L D A+L++ + + + + K +
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 309 GVIIVLNKWDKSSEEFDELCKEIKDR 334
I +NK D++ + + ++IK++
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEK 147


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS01960 | BCTERIALGSPD | 78 | 8e-18 | Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D

signature.
Length = 660

Score = 78.4 bits (193), Expect = 8e-18
Identities = 49/325 (15%), Positives = 111/325 (34%), Gaps = 52/325 (16%)

Query: 73 NDYIIKFKHVSKEDVVSALSLFSENIK---------------YTVYSD----RILLITTE 113
N +I K+ D+V L+ S ++ + + +++
Sbjct: 268 NTKVIYLKYAKASDLVEVLTGISSTMQSEKQAAKPVAALDKNIIIKAHGQTNALIVTAAP 327

Query: 114 SQYEIINNLINGLDTSYQLRQLSFTIISTDNTKLKEIGPRIESLLNPLDHFY---FKIIT 170
+ +I LD + I + +G + + + F I T
Sbjct: 328 DVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLNLGIQWANKNAGMTQFTNSGLPIST 387

Query: 171 NVLTVDSTKVNKDSVTS----------------------LINLLKEKGVSDLLYNPRVTL 208
+ + + +S L+ L +D+L P +
Sbjct: 388 AIAGANQYNKDGTVSSSLASALSSFNGIAAGFYQGNWAMLLTALSSSTKNDILATPSIVT 447

Query: 209 IDNKDSVIESVIKTPIQKSSIEIQNNQRMTTNQVDYQDVGLKLYITNVLITNDSVSFTLD 268
+DN ++ + P+ S + N V+ + VG+KL + + DSV ++
Sbjct: 448 LDNMEATFNVGQEVPVLTGSQTTSGDN--IFNTVERKTVGIKLKVKPQINEGDSVLLEIE 505

Query: 269 LYIENLLDDT------LTPRISSRHLKTNVYLTDSNSFLIGGINSKETIKSTKTIPFVEN 322
+ ++ D L ++R + V + + ++GG+ K + +P + +
Sbjct: 506 QEVSSVADAASSTSSDLGATFNTRTVNNAVLVGSGETVVVGGLLDKSVSDTADKVPLLGD 565

Query: 323 IPILGDITTYKSEKTTDYSFSIFIT 347
IP++G + S+K + + +FI
Sbjct: 566 IPVIGALFRSTSKKVSKRNLMLFIR 590



Score = 34.1 bits (78), Expect = 0.001
Identities = 12/66 (18%), Positives = 29/66 (43%), Gaps = 2/66 (3%)

Query: 4 ISSITGKNIVISGNVDTNFDVFLPTLDLSNTDTFSKLLKDILNVNGLDYLIQDSVLLIYN 63
+S K ++I +V V + D+ N + + + +L+V G + ++ +L
Sbjct: 46 VSKNLNKTVIIDPSVRGTITVR--SYDMLNEEQYYQFFLSVLDVYGFAVINMNNGVLKVV 103

Query: 64 PTVEDK 69
+ + K
Sbjct: 104 RSKDAK 109


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS01975 | FIMBRIALPAPE | 33 | 0.002 | Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein

signature.
Length = 173

Score = 33.1 bits (75), Expect = 0.002
Identities = 26/116 (22%), Positives = 50/116 (43%), Gaps = 17/116 (14%)

Query: 460 IKQTSNTGNKATYKGNIVTPDQSIIDVDV------VETTTQTGSRVQDVTYSYTYKTPKG 513
+ Q + + T+KG ++ P ++ + +V ++ Q+G +D T G
Sbjct: 18 MSQHVHAADNLTFKGKLIIPACTVQNAEVNWGDIEIQNLVQSGGNQKDFTVDMNCPYSLG 77

Query: 514 SSKFSTGYVNTIGSDNRVNNSIPKDGTSSSGGNS-----SSPGNGGSGSSTTTPTQ 564
+ K TI S+ + NSI TS++ G+ + N G G++ T +Q
Sbjct: 78 TMKV------TITSNGQTGNSILVPNTSTASGDGLLIYLYNSNNSGIGNAVTLGSQ 127


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS02000 | RTXTOXINC | 27 | 0.019 | Gram-negative bacterial RTX toxin-activating protein C...
		>RTXTOXINC#Gram-negative bacterial RTX toxin-activating protein C

signature.
Length = 170

Score = 27.2 bits (60), Expect = 0.019
Identities = 12/35 (34%), Positives = 20/35 (57%), Gaps = 2/35 (5%)

Query: 28 NQEKYIKDLNLVG--DWNIQTERYFIDFILTAGNN 60
N+ KY+ D+ + DW ++FID+I G+N
Sbjct: 66 NEIKYLNDVTSLVAEDWTSGDRKWFIDWIAPFGDN 100


4 | CCON33237_RS02055 | CCON33237_RS02150 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
CCON33237_RS02055117-3.334365hypothetical protein
CCON33237_RS02060215-2.205477beta-ketoacyl synthase
CCON33237_RS02065114-1.7823361-acyl-sn-glycerol-3-phosphate acyltransferase
CCON33237_RS02070-114-1.048106acyl carrier protein
CCON33237_RS02075115-1.154506acyl carrier protein
CCON33237_RS02080314-0.727201DNA gyrase subunit B
CCON33237_RS02085314-0.0441203-phosphoshikimate 1-carboxyvinyltransferase
CCON33237_RS020902170.605495glycosyl transferase family 2
CCON33237_RS020954211.6521894-hydroxybenzoyl-CoA thioesterase
CCON33237_RS021000203.083651ATP synthase
CCON33237_RS021050173.470129hypothetical protein
CCON33237_RS02110-2185.216920hypothetical protein
CCON33237_RS02115-1185.276209hypothetical protein
CCON33237_RS02120-1175.032876competence protein
CCON33237_RS02125-2184.932242isoleucine--tRNA ligase
CCON33237_RS02130-2163.920472aspartyl/glutamyl-tRNA amidotransferase subunit
CCON33237_RS02135-2194.679961IMP dehydrogenase
CCON33237_RS021400213.160082homoserine O-acetyltransferase
CCON33237_RS021452222.311958exodeoxyribonuclease VII small subunit
CCON33237_RS021502233.185623carbon-nitrogen hydrolase
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS02105 | ACRIFLAVINRP | 39 | 7e-05 | Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 39.1 bits (91), Expect = 7e-05
Identities = 34/152 (22%), Positives = 60/152 (39%), Gaps = 21/152 (13%)

Query: 605 ELALKLKIAALVVAFLLLWFYFSALVSALVMGIIV-FGVLLTLFIFAIFGINLSIFGVFG 663
E+ L A ++V ++ F + + + L+ I V +L T I A FG +++ +FG
Sbjct: 339 EVVKTLFEAIMLVFLVMYLFLQN-MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFG 397

Query: 664 LILASAVGIDYMI--------FALNDSLSEKERIY-----------GILCAFVTSFISFF 704
++LA + +D I + D L KE GI FI
Sbjct: 398 MVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMA 457

Query: 705 TLSFSQTAALSVFGLSVSLCVLIYGLCASVLS 736
S A F +++ + + L A +L+
Sbjct: 458 FFGGSTGAIYRQFSITIVSAMALSVLVALILT 489



Score = 33.3 bits (76), Expect = 0.004
Identities = 28/160 (17%), Positives = 62/160 (38%), Gaps = 21/160 (13%)

Query: 594 SSLNESLTQAKELALKLKIAALVVAFLLLWFYF-SALVSALVMGIIVFGVLLTLFIFAIF 652
+ ++ + A L + VV FL L + S + VM ++ G++ L +F
Sbjct: 859 TGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLF 918

Query: 653 GINLSIFGVFGLI----LASAVGIDYMIFALNDSLSE------------KERIYGILCAF 696
++ + GL+ L++ I + FA + E + R+ IL
Sbjct: 919 NQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTS 978

Query: 697 VTSFISFFTLSFSQTA-ALSVFGLSVSLCVLIYGLCASVL 735
+ + L+ S A + + + + ++ G+ ++ L
Sbjct: 979 LAFILGVLPLAISNGAGSGAQNAVGI---GVMGGMVSATL 1015



Score = 32.1 bits (73), Expect = 0.010
Identities = 33/173 (19%), Positives = 71/173 (41%), Gaps = 17/173 (9%)

Query: 218 YQAFSKQKNESESLYMSALSLSLTAMFLMLA--FRNLRI-FYVIFIAAFGFSVAFAGTLL 274
+ S Q+ S + + +++S +FL LA + + I V+ + G L
Sbjct: 858 WTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATL 917

Query: 275 CLGELNILTILISTSLIG-------LMFDYVLHWLSKNEGEAIR---ADSIKNMLKIFLL 324
+ ++ ++ + IG L+ ++ L + EG+ + +++ L+ L+
Sbjct: 918 FNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKD-LMEKEGKGVVEATLMAVRMRLRPILM 976

Query: 325 GLLITLSGYLAFTFSN---LKLLKEVALFSAFALVTAFLASYFFMPLVFEGVK 374
L + G L SN V + +V+A L + FF+P+ F ++
Sbjct: 977 TSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIR 1029


5 | CCON33237_RS02235 | CCON33237_RS02345 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
CCON33237_RS02235-1183.125276flagellar biosynthesis protein FlhB
CCON33237_RS02240-2173.896478hypothetical protein
CCON33237_RS02245-2184.48119850S ribosomal protein L21
CCON33237_RS02250-1174.39783850S ribosomal protein L27
CCON33237_RS02255-2183.462978Obg family GTPase CgtA
CCON33237_RS02260-4160.904096GDP-mannose dehydrogenase
CCON33237_RS02265-4150.287776methionyl-tRNA formyltransferase
CCON33237_RS02270-216-1.294200hydroxymethylpyrimidine/phosphomethylpyrimidine
CCON33237_RS02275018-2.829570biotin--[acetyl-CoA-carboxylase] ligase
CCON33237_RS02280120-1.356921chromosome partitioning protein ParA
CCON33237_RS02285319-1.225702chromosome partitioning protein ParB
CCON33237_RS022903230.404389ATP synthase F0F1 subunit B'
CCON33237_RS022953230.705491ATP synthase F0F1 subunit B
CCON33237_RS023003230.779268ATP synthase F0F1 subunit delta
CCON33237_RS023054240.887069ATP synthase subunit alpha
CCON33237_RS02310424-1.210271ATP synthase subunit gamma
CCON33237_RS02315524-1.870378ATP synthase subunit beta
CCON33237_RS02320427-3.594078F0F1 ATP synthase subunit epsilon
CCON33237_RS02325427-4.061329flagellar motor protein MotA
CCON33237_RS02330427-3.226438biopolymer transporter ExbD
CCON33237_RS02335327-3.152161hypothetical protein
CCON33237_RS02340226-2.491627translocation protein TolB
CCON33237_RS02345223-1.626642hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS02235 | TYPE3IMSPROT | 352 | e-123 | Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein

family signature.
Length = 354

Score = 352 bits (905), Expect = e-123
Identities = 111/351 (31%), Positives = 183/351 (52%), Gaps = 1/351 (0%)

Query: 7 EKTEEATPKKIEDAKKDGNVPKSQDLAGFVTLVIAIGVLLAMLNFMKEQIISLYIYYSKF 66
EKTE+ TPKKI DA+K G V KS+++ +V +L+ + ++ E L + ++
Sbjct: 4 EKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAEQ 63

Query: 67 IGQPLTLPTVKMIVVNTFARSLLMILPVCICVAIAGVIANVMQFGFIFTAKPIMPNFGKI 126
P + + +V N + P+ A+ + ++V+Q+GF+ + + I P+ KI
Sbjct: 64 SYLPFS-QALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIKKI 122

Query: 127 NPLKGLKNLFSMKKVIDSIKIVLKVSIVFGVGFYFFLQFIKELPHTLFFSMFDQLAWLKE 186
NP++G K +FS+K +++ +K +LKV ++ + + + L + L +
Sbjct: 123 NPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLGQ 182

Query: 187 KLIILVSVMLFILFVIGLIDLLIVRFQYFKDLRMSKQEIKDEYKQMEGDPQVKGRIRQAQ 246
L L+ + VI + D +QY K+L+MSK EIK EYK+MEG P++K + RQ
Sbjct: 183 ILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQFH 242

Query: 247 MRAAKRRMMQNIPQADVVITNPTHYAVAIRYDKSRDEAPIILAKGVDFLALQIKKIAVEN 306
R M +N+ ++ VV+ NPTH A+ I Y + P++ K D ++KIA E
Sbjct: 243 QEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEEE 302

Query: 307 GVQIYENPPLARELYKICEVDDTIPAHLFRAVAEVLSFVYMSNKQKFKDKL 357
GV I + PLAR LY VD IPA A AEVL ++ N +K ++
Sbjct: 303 GVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQNIEKQHSEM 353


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS02335 | IGASERPTASE | 43 | 2e-06 | IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 42.7 bits (100), Expect = 2e-06
Identities = 41/224 (18%), Positives = 79/224 (35%), Gaps = 17/224 (7%)

Query: 60 QTIKAPKQADEVV----KETKPKPKKESEEDKQETTNKPVVPDEPLPTPSLPTPPKEEPK 115
QT + + E ETK E EE + T K + P T + P +E+ +
Sbjct: 1081 QTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEK--TQEVPKVTSQVS-PKQEQSE 1137

Query: 116 PEPKKSEPKPEISKPSEESKEDVKPESKPEPKPTPKPVEKPKPKEPNIKDLFSDIDPTKL 175
++EP E + P+ KE + P ++P + T
Sbjct: 1138 TVQPQAEPARE-NDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTE------STTVN 1190

Query: 176 KKDDGIKKTENKVQSRKKSEASSSKAAKEASDIIKSLKIDQNPTAPKSQSTGTYDPLMGA 235
+ ++ EN + + +S + K + +S++ P + +T + D A
Sbjct: 1191 TGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVR--SVPHNVEPATTSSNDRSTVA 1248

Query: 236 ITKQIQRRWQSYKADSANLAKVKVMIDQSGNFSYEILELSYNEE 279
+ + +D+ A+ V ++ S I +L N E
Sbjct: 1249 LCDLTSTNTNAVLSDARAKAQF-VALNVGKAVSQHISQLEMNNE 1291



Score = 40.8 bits (95), Expect = 6e-06
Identities = 45/258 (17%), Positives = 83/258 (32%), Gaps = 37/258 (14%)

Query: 39 KKYTDDKDAIMDVVMVDREIDQTIKAPKQADEV------VKETKPKPKKES------EED 86
K D + V +E +KA Q +EV KET+ KE+ E+
Sbjct: 1053 KNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKA 1112

Query: 87 KQETTNKPVVPD-----EPLPTPSLPTPPKEEPKPEP----------KKSEPKPEISKPS 131
K ET VP P S P+ EP E ++ + +P+
Sbjct: 1113 KVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPA 1172

Query: 132 EESKEDVKP--ESKPEPKPTPKPVEKPKPKEPNIKDLFSDIDPTKLKKDDGIKKTENK-- 187
+E+ +V+ VE P+ P + PT + K ++
Sbjct: 1173 KETSSNVEQPVTESTTVNTGNSVVENPENTTP------ATTQPTVNSESSNKPKNRHRRS 1226

Query: 188 VQSRKKSEASSSKAAKEASDIIKSLKIDQNPTAPKSQSTGTYDPLMGAITKQIQRRWQSY 247
V+S + ++ ++ + S + N A S + + + K + +
Sbjct: 1227 VRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQL 1286

Query: 248 KADSANLAKVKVMIDQSG 265
+ ++ V V
Sbjct: 1287 EMNNEGQYNVWVSNTSMN 1304


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS02345 | OMPADOMAIN | 112 | 2e-32 | OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 112 bits (281), Expect = 2e-32
Identities = 37/115 (32%), Positives = 56/115 (48%), Gaps = 9/115 (7%)

Query: 61 SVYFDFDKFNIKADQQGVVSSNASVFNQADAQALSIKVEGNCDEWGTDEYNYALGLKRAK 120
V F+F+K +K + Q + S + D + S+ V G D G+D YN L +RA+
Sbjct: 220 DVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQ 279

Query: 121 SAKDALVRNGVSADRIAVVSFGESNPVCTDKT---------KACDAQNRRADFKV 166
S D L+ G+ AD+I+ GESNPV + C A +RR + +V
Sbjct: 280 SVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


6 | CCON33237_RS02700 | CCON33237_RS02800 | Y | N | Y | Genomic Island
LocusTag | DNBias | CDNBias | %GCBias | Product
CCON33237_RS02700323-3.049801*hypothetical protein
CCON33237_RS02705521-1.285436hypothetical protein
CCON33237_RS027107275.702055hypothetical protein
CCON33237_RS027157349.257979hypothetical protein
CCON33237_RS0272093710.047018NAD-dependent formate dehydrogenase subunit
CCON33237_RS0272583810.730531formate dehydrogenase
CCON33237_RS0273084011.512465formate dehydrogenase
CCON33237_RS0273583911.202465formate dehydrogenase
CCON33237_RS027409379.999444formate dehydrogenase
CCON33237_RS027459369.612866Tat pathway signal protein
CCON33237_RS0275010388.282274formate dehydrogenase
CCON33237_RS027557273.762422tungsten ABC transporter ATP-binding protein
CCON33237_RS027607336.903943ABC transporter permease
CCON33237_RS027657326.532302tungsten ABC transporter substrate-binding
CCON33237_RS027706326.253145hypothetical protein
CCON33237_RS027805306.068252aminotransferase
CCON33237_RS027903284.395465hypothetical protein
CCON33237_RS027953274.101132hypothetical protein
CCON33237_RS028002223.017972formate dehydrogenase accessory protein FdhD
7 | CCON33237_RS03050 | CCON33237_RS03200 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
CCON33237_RS03050314-1.542978arginine biosynthesis protein ArgJ
CCON33237_RS03055215-2.127310hypothetical protein
CCON33237_RS03060214-2.268533MFS transporter
CCON33237_RS03065114-4.400238flagellar hook-length control protein FliK
CCON33237_RS03070114-4.604980flagellar biosynthesis protein FlhB
CCON33237_RS03075-111-3.004980chemotaxis protein CheB
CCON33237_RS03080-110-2.914267SAM-dependent methyltransferase
CCON33237_RS03085112-2.068060MFS transporter
CCON33237_RS030900160.511579copper homeostasis protein
CCON33237_RS030950204.190818copper homeostasis protein CutF
CCON33237_RS031000235.167155protease
CCON33237_RS031051245.080213activity regulator of membrane protease YbbK
CCON33237_RS031100255.585587hypothetical protein
CCON33237_RS03115-1265.982134dehypoxanthine futalosine cyclase
CCON33237_RS03120-1235.038679peptidase M16
CCON33237_RS03125-2133.411095ATP-dependent DNA helicase RecG
CCON33237_RS03130-1151.788188nuclease
CCON33237_RS03135-1151.695154iron ABC transporter ATP-binding protein
CCON33237_RS03140-1170.855208iron ABC transporter permease
CCON33237_RS03145-120-0.916273peptide ABC transporter substrate-binding
CCON33237_RS03150-220-1.790971nitric-oxide reductase large subunit
CCON33237_RS03155-220-2.971061hypothetical protein
CCON33237_RS03160220-3.095590hypothetical protein
CCON33237_RS03165219-3.27430930S ribosomal protein S30
CCON33237_RS03170319-3.077554hypothetical protein
CCON33237_RS03175110-1.149748haloacid dehalogenase
CCON33237_RS03180110-0.874129prepilin-type N-terminal cleavage/methylation
CCON33237_RS03185-180.715680hypothetical protein
CCON33237_RS03190-4112.340436flagellar biosynthesis protein FliW
CCON33237_RS03195-4132.978387outer membrane protein assembly factor BamD
CCON33237_RS03200-4113.163062endopeptidase La
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS03070 | TYPE3IMSPROT | 80 | 1e-21 | Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein

family signature.
Length = 354

Score = 80.2 bits (198), Expect = 1e-21
Identities = 18/76 (23%), Positives = 31/76 (40%), Gaps = 1/76 (1%)

Query: 8 AVALGYNRSKDNAPRVLASGAGEIANRIIDLAKEHDIPIKEDPDLI-EILSKVEVDQEIP 66
A+ + Y R + P V + +A+E +PI + L + VD IP
Sbjct: 268 AIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEEEGVPILQRIPLARALYWDALVDHYIP 327

Query: 67 PNLYKAVAEIFSFLYK 82
+A AE+ +L +
Sbjct: 328 AEQIEATAEVLRWLER 343


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS03085 | TCRTETA | 43 | 2e-06 | Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 42.5 bits (100), Expect = 2e-06
Identities = 61/342 (17%), Positives = 126/342 (36%), Gaps = 26/342 (7%)

Query: 61 VVIAPFSGILVDKFSPKPMLVIMMAVETISVFMLLFIDSLDFLWLLLLIIFVRNGTGGMY 120
AP G L D+F +P+L++ +A + ++ L L++ ++ G G
Sbjct: 57 FACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIV----AGITGAT 112

Query: 121 FQVEMSVLPKILSKENLKLANEIHSIIWAVSYTAGMGLAGVYIHFFGIKSAFLLDGILYI 180
V + + I + S + AG L G + F + F L
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGG-LMGGFSPHAPFFAAAALNG 171

Query: 181 LSFGFLYFLNLQSLKPEFIEKPLKML-KNGLVYLKENRLIVHLIFLHA-FVGITAYD--- 235
L+F FL +S K E +PL+ N L + R + + L A F +
Sbjct: 172 LNFLTGCFLLPESHKGE--RRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVP 229

Query: 236 -ALIALLADYKYA-NLLSTSLVIGLLNTSRSISLMFAPAILSKFINKNTLI---FVYIGQ 290
AL + + ++ + + + + S++ ++ + + + + G
Sbjct: 230 AALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGT 289

Query: 291 GLGIIIWALSLWNFYLSLIGIIFAGFCTSSLWSYTYTMLQQNCKKEFYGRVIAYNDMIFL 350
G ++ +A W + ++ + G +L ML + +E G++
Sbjct: 290 GYILLAFATRGWMAFPIMVLLASGGIGMPAL----QAMLSRQVDEERQGQLQGSLAA--- 342

Query: 351 GFSALISFIIGLLYDIGFSVEMIASFMGSLFFVGAFYYHIVL 392
++L S + LL+ ++ I ++ G + GA Y + L
Sbjct: 343 -LTSLTSIVGPLLFTAIYAAS-ITTWNGWAWIAGAALYLLCL 382


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS03170 | BCTERIALGSPH | 28 | 0.009 | Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H

signature.
Length = 170

Score = 28.0 bits (62), Expect = 0.009
Identities = 15/57 (26%), Positives = 26/57 (45%), Gaps = 4/57 (7%)

Query: 1 MR-KGFTLIAAIFFLVVAASISTLALSIASTSARQSSEIYL-REQAQLVAQAAAEYA 55
MR +GFTL+ + L++ + + L S S+ L R +AQL + +
Sbjct: 1 MRQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQL--RFVQQRG 55


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS03175 | BCTERIALGSPG | 31 | 0.004 | Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G

signature.
Length = 145

Score = 30.6 bits (69), Expect = 0.004
Identities = 11/25 (44%), Positives = 19/25 (76%), Gaps = 2/25 (8%)

Query: 3 MKKTKK--AFTLIELIIVITVLGVI 25
M+ T K FTL+E+++VI ++GV+
Sbjct: 1 MRATDKQRGFTLLEIMVVIVIIGVL 25


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS03200 | PF02370 | 31 | 0.008 | M protein repeat
		>PF02370#M protein repeat

Length = 168

Score = 31.2 bits (70), Expect = 0.008
Identities = 24/116 (20%), Positives = 53/116 (45%), Gaps = 7/116 (6%)

Query: 177 KKQIAYSFFVEENLEQR-LLKLIDYVIEEIEANKLQKEIKNKVHSKIDKTNKEYFLKEQL 235
+ Y + EN + R IEE+E + +K+ + + K ++ +++ +EQ
Sbjct: 45 ENDPQYRALMGENQDLRKREGQYQDKIEELEKERKEKQERPERREKFERQHQDKHYQEQQ 104

Query: 236 KQIQAELGADTSREEELEEYRKKLDTKKKFMAED------AYKEIKKQIDKLSRMH 285
K+ Q E + +++L + ++ D ++ + D A KE++ + KL H
Sbjct: 105 KKHQQEQQQLEAEKQKLAKEKQISDASRQGLNRDLEASRAAKKELEPKHQKLGTEH 160


8 | CCON33237_RS03710 | CCON33237_RS03910 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
CCON33237_RS03710021-5.989153hypothetical protein
CCON33237_RS03715-119-5.314018potassium channel protein
CCON33237_RS03720-217-3.90186950S ribosomal protein L28
CCON33237_RS03725-118-4.131873hypothetical protein
CCON33237_RS03730116-3.219022prephenate dehydrogenase
CCON33237_RS03735315-0.102431ribulose-phosphate 3-epimerase
CCON33237_RS037403161.417964DNA polymerase III subunit epsilon
CCON33237_RS037454170.199708hypothetical protein
CCON33237_RS037505201.336920sodium-dependent transporter
CCON33237_RS037555232.934191peptidase M48
CCON33237_RS037606262.670510protein-(glutamine-N5) methyltransferase,
CCON33237_RS037652220.751046hypothetical protein
CCON33237_RS037702200.250512hypothetical protein
CCON33237_RS03775319-0.567627hypothetical protein
CCON33237_RS03780219-2.396782hypothetical protein
CCON33237_RS03785121-4.887152acyl-CoA thioester hydrolase
CCON33237_RS03790123-6.706886hypothetical protein
CCON33237_RS03795-118-5.915634uracil-DNA glycosylase
CCON33237_RS03805-113-3.21583850S ribosomal protein L34
CCON33237_RS03810-112-2.611442hypothetical protein
CCON33237_RS03815-3100.756147membrane protein insertion efficiency factor
CCON33237_RS03820-491.524723membrane protein insertase YidC
CCON33237_RS03825-2112.603965RNA-binding protein
CCON33237_RS03830-2123.175642tRNA modification GTPase
CCON33237_RS03835-2133.626917NADPH quinone reductase MdaB
CCON33237_RS03840-2133.347696phosphoribosylformylglycinamidine synthase II
CCON33237_RS03845-1141.471126hypothetical protein
CCON33237_RS03850-114-0.326770molecular chaperone DnaJ
CCON33237_RS03855-114-0.277426hypothetical protein
CCON33237_RS03860-114-0.107262bifunctional
CCON33237_RS03865021-2.093270peptide-methionine (R)-S-oxide reductase
CCON33237_RS03870226-4.417603hypothetical protein
CCON33237_RS03875126-4.784254hypothetical protein
CCON33237_RS03880127-4.141566PaaD-like protein (DUF59) involved in Fe-S
CCON33237_RS03885026-4.535942cell division protein FtsZ
CCON33237_RS03890-221-4.904389cell division protein FtsA
CCON33237_RS03895-218-5.279242peptidylprolyl isomerase
CCON33237_RS03900-117-3.502623hypothetical protein
CCON33237_RS03905018-3.04037816S rRNA
CCON33237_RS03910017-3.977770hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS03785 | TYPE4SSCAGX | 29 | 0.036 | Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein

signature.
Length = 522

Score = 29.0 bits (64), Expect = 0.036
Identities = 17/49 (34%), Positives = 27/49 (55%)

Query: 331 EDMKEAQKRLEKLIEQEKRTQKMLDKNRKYDKELKQNTRKNLEELRAMM 379
++++E +K LEK E +++ QK R+ KE + R NLE L M
Sbjct: 139 KELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANLENLTNAM 187


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CCON33237_RS03795 | ARGDEIMINASE | 32 | 0.004 | Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 32.1 bits (73), Expect = 0.004
Identities = 23/89 (25%), Positives = 36/89 (40%), Gaps = 10/89 (11%)

Query: 124 LSYVLKQDADIEIKFINSNDESPQSLTNAMQTARAQGFNYFIAALTSNGANIMNSLVLSN 183
+S VL +E KFI+ + T+ YF + N + M S V++
Sbjct: 76 ISEVLVSSVALENKFISQFILEAEIKTDFTINLLKD---YFSSLTIDNMISKMISGVVTE 132

Query: 184 ELIYIPSV-------HSSFVITPKPNLIF 205
EL S + F+I P PN++F
Sbjct: 133 ELKNYTSSLDDLVNGANLFIIDPMPNVLF 161


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS03820  60KDINNERMP   358    e-120    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 358 bits (919), Expect = e-120
Identities = 144/551 (26%), Positives = 260/551 (47%), Gaps = 51/551 (9%)

Query: 8 KRLLLAALLSIVFFIVYDFFMPKRVLLEQNQTTMSQTIDQNKAPNTNQNTPKSNENL--A 65
+R LL L V F+++ + + Q T +++ + +
Sbjct: 4 QRNLLVIALLFVSFMIWQAW--------EQDKNPQPQAQQTTQTTTTAAGSAADQGVPAS 55

Query: 66 SNETIATIKGLSYEAKIDKLG-RISKFYLTEEKYKTEDGDKIELVSQNPLPLELRFN--- 121
+ ++K + I+ G + + L + +L+ +P + +
Sbjct: 56 GQGKLISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQSGLT 115

Query: 122 -----DNTLNADAFKVAYSSDASEIDASSEPKTIKLTQIL-DGVTITKNIKFYPNGRYEV 175
DN N DA + + +T G T TK G Y V
Sbjct: 116 GRDGPDNPANGPRPLYNVEKDAYVLAEGQNELQVPMTYTDAAGNTFTKTFVLKR-GDYAV 174

Query: 176 EVNL------SKSVDYFI------TPGFRPNIAVDS-----YTVHGVMLRNTDDSLNIIE 218
VN K ++ + P++ S +T G D+ +
Sbjct: 175 NVNYNVQNAGEKPLEISSFGQLKQSITLPPHLDTGSSNFALHTFRGAAYSTPDEKYEKYK 234

Query: 219 ---DGDAKEVKNYANTTIAAASDRYYTTLFYSFEKPFEVAVDKDASNN--------PILF 267
D + + + A +Y+ T + + N +
Sbjct: 235 FDTIADNENLNISSKGGWVAMLQQYFATAWIPHNDGTNNFYTANLGNGIAAIGYKSQPVL 294

Query: 268 VKASEN--LKLGGYIGPKEHKILSSMDERLNDVIEYGWFTFIAKPMFAFLNFLHNYIGNW 325
V+ + + ++GP+ ++++ L+ ++YGW FI++P+F L ++H+++GNW
Sbjct: 295 VQPGQTGAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLWFISQPLFKLLKWIHSFVGNW 354

Query: 326 GWAIVVLTLVIRIVLFPLTYKGMLSMNKLKELAPKVKELQTKYKDDKQKMQVHMMELYKK 385
G++I+++T ++R +++PLT SM K++ L PK++ ++ + DDKQ++ MM LYK
Sbjct: 355 GFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRERLGDDKQRISQEMMALYKA 414

Query: 386 HGANPMGGCLPILLQIPVFFAIYRVLLNAIELKGAPWIVWIHDLSVMDPYFVLPILMGLT 445
NP+GGC P+L+Q+P+F A+Y +L+ ++EL+ AP+ +WIHDLS DPY++LPILMG+T
Sbjct: 415 EKVNPLGGCFPLLIQMPIFLALYYMLMGSVELRQAPFALWIHDLSAQDPYYILPILMGVT 474

Query: 446 MFLQQKLTPTTFADPMQEKVMKFLPLIFTFFFVTFPAGLTLYWFVNNVCSVVQQVFVNKL 505
MF QK++PTT DPMQ+K+M F+P+IFT FF+ FP+GL LY+ V+N+ +++QQ + +
Sbjct: 475 MFFIQKMSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYYIVSNLVTIIQQQLIYRG 534

Query: 506 FEKHKKSAEVK 516
EK + K
Sbjct: 535 LEKRGLHSREK 545


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS03890  SHAPEPROTEIN  40     1e-05    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 40.1 bits (94), Expect = 1e-05
Identities = 41/209 (19%), Positives = 74/209 (35%), Gaps = 18/209 (8%)

Query: 155 IVTVQKSSISNLRKAVNLAGVQLD----NIVLSGYASAIATLTKDEKELGAALVDMGGAT 210
+V V + R+A+ + ++ A+AI + G+ +VD+GG T
Sbjct: 111 LVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGT 170

Query: 211 CNLVVHSGNSIRYNEFLPVGSANITNDL------SMALHTPLPKAEEIKLGYG-ALINKS 263
+ V S N + Y+ + +G + + AE IK G A
Sbjct: 171 TEVAVISLNGVVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDE 230

Query: 264 VDLIELP---ILGDETKSHEVSLDIISNVIYARAEETLMVLAKMLEDSG---YKDSIGAG 317
V IE+ + + ++ + I + + + LE D G
Sbjct: 231 VREIEVRGRNLAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERG 290

Query: 318 IILTGGMTKLEGIRDLASAIFDKMPVRIA 346
++LTGG L + L +PV +A
Sbjct: 291 MVLTGGGALLRNLDRLLME-ETGIPVVVA 318


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS03895  FbpA_PF05833  36     3e-04    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 36.0 bits (83), Expect = 3e-04
Identities = 52/211 (24%), Positives = 84/211 (39%), Gaps = 23/211 (10%)

Query: 193 QIINANQSDIKIDEKELKDLWETNKNNYMTKTIYG-LETYFIESNKNDVNQTTLSDYYNE 251
IN KI LK + +YG L T I + K ++ L++YY+E
Sbjct: 310 NNINRCTKKDKILNNTLKKC-----EDKDIFKLYGELLTANIYALKKGLSHIELANYYSE 364

Query: 252 NKERYKGSDDKIKSFDEVKTEVIKDYNIEKSKTDALKKYTSIKKAELATNEFVSVNEDNA 311
N + K + D+ K+ + K YN K +A + + EL V N +NA
Sbjct: 365 NYDTVKITLDENKTPSQNVQSYYKKYNKLKKSEEAANEQLLQNEEELNYLYSVLTNINNA 424

Query: 312 TFSLDEIKGAKVGEVIKPFTYKDGYLIVRV----KSITPPQPMSFEQARAMVLEIYKDKK 367
+ DEI E IK + GY+ + K +PM F + + + K+
Sbjct: 425 D-NYDEI------EEIKKELIETGYIKFKKIYKSKKSKTSKPMHFISKDGIDIYVGKNNI 477

Query: 368 KKENLTTMAKESLQNFKGTDIGFISRDINSS 398
+ + LT L+ DI F +++I S
Sbjct: 478 QNDYLT------LKFANKHDIWFHTKNIPGS 502


9  CCON33237_RS04115  CCON33237_RS04275  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag           DNBias  CDNBias  %GCBias     Product
CCON33237_RS04115  -2      12       -3.306826   DNA-binding response regulator
CCON33237_RS04120  -1      14       -4.986145   pyruvate kinase
CCON33237_RS04125  -2      14       -4.297627   hypothetical protein
CCON33237_RS04130  -2      12       -3.169456   hypothetical protein
CCON33237_RS04135  -2      14       -4.409056   *DNA-binding protein
CCON33237_RS04145  -2      16       -3.730608   hypothetical protein
CCON33237_RS04150  -2      17       -4.293736   flagellin biosynthesis protein FlgL
CCON33237_RS04155  -1      19       -4.164412   DNA translocase FtsK
CCON33237_RS04160  0       23       -6.078654   **ferredoxin
CCON33237_RS04175  0       33       -9.063291   hypothetical protein
CCON33237_RS04180  0       35       -7.409642   replication protein
CCON33237_RS04185  0       34       -6.797857   type II and III secretion system protein
CCON33237_RS04190  0       36       -6.769283   zonula occludens toxin
CCON33237_RS04195  1       38       -7.051247   hypothetical protein
CCON33237_RS04200  1       37       -7.240025   hypothetical protein
CCON33237_RS04205  3       37       -7.182726   hypothetical protein
CCON33237_RS04210  5       42       -10.551466  hypothetical protein
CCON33237_RS04215  7       39       -9.960070   hypothetical protein
CCON33237_RS04220  3       39       -9.383982   hypothetical protein
CCON33237_RS04230  -1      15       -1.152952   putative addiction module antidote protein
CCON33237_RS04235  -3      16       2.268093    hypothetical protein
CCON33237_RS04240  -4      21       4.517304    hypothetical protein
CCON33237_RS04245  -1      21       4.310543    cytochrome c
CCON33237_RS04250  -1      22       4.938659    D,D-heptose 1,7-bisphosphate phosphatase
CCON33237_RS04255  0       22       5.176970    ADP-glyceromanno-heptose 6-epimerase
CCON33237_RS04260  1       22       4.830338    bifunctional heptose 7-phosphate kinase/heptose
CCON33237_RS04265  0       22       4.579472    phosphoheptose isomerase
CCON33237_RS04270  0       20       4.127789    transporter
CCON33237_RS04275  1       21       3.965111    DNA gyrase subunit A
ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS04120  HTHFIS        91     2e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 90.7 bits (225), Expect = 2e-23
Identities = 36/117 (30%), Positives = 58/117 (49%)

Query: 3 RILLVEDDEILLDLISEYLSENGYDVTTSDNAKEALDLAYEQNFDLLILDVKLPQGDGFS 62
IL+ +DD + ++++ LS GYDV + NA + DL++ DV +P + F
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 63 LLSSLRELGVSAPSIFTTSLNTIDDLEKGYKSGCDDYLKKPFELKELLIRIQALLKR 119
LL +++ P + ++ NT K + G DYL KPF+L EL+ I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS04145  DNABINDINGHU  87     3e-26    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 87.1 bits (216), Expect = 3e-26
Identities = 39/87 (44%), Positives = 51/87 (58%)

Query: 3 KAEFIQAVADKAGLSKKDTLKVVDATLETIQAVLEKGDTISFIGFGTFGTADRAARKARV 62
K + I VA+ L+KKD+ VDA + + L KG+ + IGFG F +RAARK R
Sbjct: 4 KQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGRN 63

Query: 63 PGTKKVIDVPASKAVKFKVGKKLKEAV 89
P T + I + ASK FK GK LK+AV
Sbjct: 64 PQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS04155  FLAGELLIN     56     3e-10    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 56.2 bits (135), Expect = 3e-10
Identities = 60/344 (17%), Positives = 112/344 (32%), Gaps = 23/344 (6%)

Query: 11 QTLNDYQKNMTGVNKSYKQLSNGLKIQDPYDGAATYNDAMRLDYEATTLTQVVDATGKSV 70
LN Q ++ + + ++LS+GL+I D AA A R LTQ +
Sbjct: 15 NNLNKSQSSL---SSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNANDGI 71

Query: 71 NFSKNTDNALQEFEKQLENFKTKVVQAASSVHSKTSLEALANDLQGIKNHLVNIAN-TSV 129
+ ++ T+ AL E L+ + VQA + +S + L+++ +++Q + ++N T
Sbjct: 72 SIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSNQTQF 131

Query: 130 NGQFLFSGSAVDTKPIDGAGKYQGNRDYMKTSAGAQVELPYNIPGYDLFLGKDGDYSKIL 189
NG + S + GA + ++ + L
Sbjct: 132 NGVKVLSQDNQMKIQV-GANDGETITIDLQKIDVKSLGL-----------------DGFN 173

Query: 190 TTNVRLADQTRTDISYAPKFLNDNSKIKNMIGLNYASDSVVRSDGSYNGTINPDYDFLDN 249
+ A S+ D + + V +D + + Y N
Sbjct: 174 VNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAAN 233

Query: 250 SNVNFPDTYFFMQGKKPDGTTFTSKFKMSANTTMAGLMEKIGMEFGNTKTTKVVDVSINN 309
+ D T T+ + A K G F T +D N
Sbjct: 234 GQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGN 293

Query: 310 DGQFNIKDLTKGNQTIDFHMVAATSVAPNRGAIAQNNALDTVNS 353
DG + T + + + T+ A N A ++ + S
Sbjct: 294 DGNGKVST-TINGEKVTLTVADITAGAANVDAATLQSSKNVYTS 336



Score = 37.3 bits (86), Expect = 2e-04
Identities = 25/201 (12%), Positives = 52/201 (25%), Gaps = 6/201 (2%)

Query: 565 NAYKEALSKTKGTVETTLDDRGRMVLTDKTKSVTNIEVTMHDAKNSDKFDGDSTGRDTAG 624
+D + SV N + T D + + + +
Sbjct: 305 GEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTF------DDKTKNESAKLSDL 358

Query: 625 NAGHPQGKGSVFSFNENNALTIDEPSTSVFQDLDNMIEAVRKGYYRADANSNDPRNTGMQ 684
A + S + N I+ G
Sbjct: 359 EANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKSTA 418

Query: 685 GALQRLDHLIDHANKELTKIGSQSRLLTATKERAEVMKVNVQTVKNDVIDADYAESYLKF 744
L +D + + + +G+ + N+ + ++ + DADYA
Sbjct: 419 NPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSNM 478

Query: 745 TQLSLSYQATLQASAKINQLS 765
++ + QA A+ NQ+
Sbjct: 479 SKAQILQQAGTSVLAQANQVP 499


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS04190  BCTERIALGSPD  106    4e-27    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 106 bits (265), Expect = 4e-27
Identities = 55/284 (19%), Positives = 116/284 (40%), Gaps = 36/284 (12%)

Query: 127 SSNSVFFRADDYIFDQIKEAISKIDKSLEQVTFKLTITETNLKDIKDLGTN----LKGLL 182
+N++ A + + ++ I+++D QV + I E D +LG G+
Sbjct: 317 QTNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLNLGIQWANKNAGMT 376

Query: 183 KPLNHGDLAYYINL-----------ITSPYITNSNIIKNDDRAFFG-----ILNFLDTNG 226
+ N G L + ++S + + F+ +L L ++
Sbjct: 377 QFTNSG-LPISTAIAGANQYNKDGTVSSSLASALSSFNGIAAGFYQGNWAMLLTALSSST 435

Query: 227 ITKIISSPVLTAKNHTEVYFSSVQNIPYLVSKTDISNLNYQKTDSYEYKDIGLKINLKPI 286
I+++P + ++ E F+ Q +P L S N ++ E K +G+K+ +KP
Sbjct: 436 KNDILATPSIVTLDNMEATFNVGQEVPVLTGSQTTSGDN--IFNTVERKTVGIKLKVKPQ 493

Query: 287 ILSDHIDFDLHLILEDILSQ--------STSLTPIVSKKELKSSYSLKRGDVLVLSGINK 338
I + L +E +S S+ L + + + ++ + G+ +V+ G+
Sbjct: 494 INEGD---SVLLEIEQEVSSVADAASSTSSDLGATFNTRTVNNAVLVGSGETVVVGGLLD 550

Query: 339 TTTSKQRNGVPILKDIWFLKYLFS--VEQDSEINSVLTLTIQII 380
+ S + VP+L DI + LF ++ S+ N +L + +I
Sbjct: 551 KSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVI 594


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS04205  cloacin       54     6e-10    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 54.3 bits (130), Expect = 6e-10
Identities = 32/68 (47%), Positives = 33/68 (48%), Gaps = 5/68 (7%)

Query: 288 NTTPGGGGSSGGGSSGGTGGGSSGGNNGGSSG---GTGGGSSGGNNQGNGGQGGGQGNNG 344
N P G G GG S G+G S GG SG GGGS GN GNG GGG G G
Sbjct: 21 NGGPTGLGVGGGASD-GSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGG 79

Query: 345 N-SENATP 351
N S A P
Sbjct: 80 NLSAVAAP 87



Score = 46.6 bits (110), Expect = 1e-07
Identities = 25/67 (37%), Positives = 31/67 (46%), Gaps = 6/67 (8%)

Query: 292 GGGGSSGGGSSGG-TGGGSSGGNNGGSSGGT-----GGGSSGGNNQGNGGQGGGQGNNGN 345
G S+ G +GG TG G GG + GS + GGGS G + G G G G NGN
Sbjct: 11 TGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGN 70

Query: 346 SENATPG 352
S +
Sbjct: 71 SGGGSGT 77



Score = 41.2 bits (96), Expect = 7e-06
Identities = 29/84 (34%), Positives = 40/84 (47%), Gaps = 8/84 (9%)

Query: 292 GGGGSSGGGSSGGT--GGGSSGGNNGGSSGGTGGGSS-----GGNNQGNG-GQGGGQGNN 343
G G ++G S+ G GG + G GG+S G+G S GG+ G G G G GN
Sbjct: 6 GRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNG 65

Query: 344 GNSENATPGNIDYGELEERTADLA 367
G + N+ G+ G L A +A
Sbjct: 66 GGNGNSGGGSGTGGNLSAVAAPVA 89



Score = 40.1 bits (93), Expect = 2e-05
Identities = 19/42 (45%), Positives = 23/42 (54%), Gaps = 2/42 (4%)

Query: 287 DNTTPGGGGSSGGGSSGGTGGGSSGGNN--GGSSGGTGGGSS 326
+N GGG SG GG+G G+ GGN GG SG G S+
Sbjct: 42 ENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSA 83



Score = 33.5 bits (76), Expect = 0.002
Identities = 12/43 (27%), Positives = 14/43 (32%)

Query: 279 DPSKDKDKDNTTPGGGGSSGGGSSGGTGGGSSGGNNGGSSGGT 321
+P GGG G G G GG SG S+
Sbjct: 44 NPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAA 86



Score = 30.8 bits (69), Expect = 0.014
Identities = 16/47 (34%), Positives = 21/47 (44%), Gaps = 1/47 (2%)

Query: 292 GGGGSSGGGSSGGTGGG-SSGGNNGGSSGGTGGGSSGGNNQGNGGQG 337
GG G GG +G +GGG +GGN + G + G GG
Sbjct: 58 GGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA 104



Score = 29.7 bits (66), Expect = 0.029
Identities = 15/44 (34%), Positives = 20/44 (45%), Gaps = 1/44 (2%)

Query: 302 SGGTGGGSSGGNNGGSSGGTGGGSSGGNNQGNGGQGGGQGNNGN 345
SGG G G + G + +SG GG +G G G G + N
Sbjct: 2 SGGDGRGHNTGAHS-TSGNINGGPTGLGVGGGASDGSGWSSENN 44


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS04225  PHPHTRNFRASE  28     0.033    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 28.2 bits (63), Expect = 0.033
Identities = 12/50 (24%), Positives = 24/50 (48%)

Query: 98 EILKDDIFINEIKKEEKEEKEQKEEKQENFRDKFKAEYRTNDGHYVRSRA 147
+L D ++ IK + + E+ E + D F + + + D Y++ RA
Sbjct: 79 LVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDNEYMKERA 128


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS04255  NUCEPIMERASE  138    1e-40    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 138 bits (350), Expect = 1e-40
Identities = 76/347 (21%), Positives = 142/347 (40%), Gaps = 44/347 (12%)

Query: 7 KIVITGGAGFIGSALAHYFDENYKDAHVLVVDKFRN--DETFSNGNLKSFGHFKNLLGFK 64
K ++TG AGFIG ++ E V+ +D + D + L+ GF+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQ--VVGIDNLNDYYDVSLKQARLELLAQ----PGFQ 55

Query: 65 GEIYAGDINNPSTLEKI-KSFCPDVIYHEAAISDT--TVKEQDELIKTNVNAFVNLLDIC 121
+ D+ + + + S + ++ +++ +N+ F+N+L+ C
Sbjct: 56 --FHKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGC 113

Query: 122 ESLGAK-MIYASSGATYG-NAKSP-QTVGECEAPNNVYGFSKLSMDNINKIYAK-RGVSV 177
+ ++YASS + YG N K P T + P ++Y +K + + + Y+ G+
Sbjct: 114 RHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPA 173

Query: 178 VGLRYFNVFGKGEFFKNKTASMVLQFGLQILAGKTPRLFEGSDQIKRDFVYIKDIIDANI 237
GLR+F V+G + + +F +L GK+ ++ ++KRDF YI DI +A I
Sbjct: 174 TGLRFFTVYGP----WGRPDMALFKFTKAMLEGKSIDVY-NYGKMKRDFTYIDDIAEAII 228

Query: 238 KALD--------------------APSGVYNAATGKARSFQDIADILQREIGVNLGNEYI 277
+ D AP VYN D L+ +G+ +
Sbjct: 229 RLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNML 288

Query: 278 KNPFIGSYQFHTEADVAPAREAFGFSATWSLEEAIKDYLPEIKRIYK 324
G T AD E GF+ ++++ +K+++ + YK
Sbjct: 289 P-LQPGDVL-ETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYK 333


10  CCON33237_RS04595  CCON33237_RS04635  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag           DNBias  CDNBias  %GCBias     Product
CCON33237_RS04595  -1      19       3.652168    AAA family ATPase
CCON33237_RS04600  -1      22       5.344896    molybdopterin adenylyltransferase
CCON33237_RS04605  0       22       4.614209    hypothetical protein
CCON33237_RS04610  -1      24       5.593743    thiamine phosphate synthase
CCON33237_RS04615  -1      24       5.234020    thiamine biosynthesis protein ThiH
CCON33237_RS04620  -1      22       4.816025    thiazole synthase
CCON33237_RS04625  0       19       3.462164    thiamine biosynthesis protein ThiF
CCON33237_RS04630  -1      19       3.227941    thiamine biosynthesis protein ThiS
CCON33237_RS04635  -1      18       3.560931    aspartate aminotransferase
ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS04595  HTHFIS        35     6e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 35.2 bits (81), Expect = 6e-04
Identities = 43/207 (20%), Positives = 74/207 (35%), Gaps = 54/207 (26%)

Query: 188 VLMIGPPGVGKTLVAKAV---AGEANVPFFYQNGASF-----------VQIYVGMGAKRV 233
+++ G G GK LVA+A+ N PF N A+ + GA+
Sbjct: 163 LMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTR 222

Query: 234 RE-LFSKAKSYAPSIIFIDEIDAVGKSRGGTRNDEREATLNQLLTEMDGFEDNSG----- 287
F +A+ +F+DEI G + + L ++L + + + G
Sbjct: 223 STGRFEQAEG---GTLFLDEI--------GDMPMDAQTRLLRVLQQGE-YTTVGGRTPIR 270

Query: 288 --VIVIAATNRIEMID-EALLRSGRFDRRIF-------LSMPDFNDR---VAILNTYLKD 334
V ++AATN+ D + + G F ++ L +P DR + L +
Sbjct: 271 SDVRIVAATNK----DLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQ 326

Query: 335 KKCDVSAEDIARMSVGFSGAALSTLVN 361
+ AE F AL +
Sbjct: 327 Q-----AEKEGLDVKRFDQEALELMKA 348


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS04610  PF04183       28     0.034    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 27.9 bits (62), Expect = 0.034
Identities = 9/30 (30%), Positives = 17/30 (56%), Gaps = 3/30 (10%)

Query: 34 LLRAKGLDEAKFYDFARVVAQICENYRKKF 63
L+ G+ E +FY +++A + +Y KK
Sbjct: 495 LMVRLGVPERRFY---QLLAAVLSDYMKKH 521


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS04630  TYPE4SSCAGA   24     0.047    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 24.3 bits (52), Expect = 0.047
Identities = 15/47 (31%), Positives = 28/47 (59%), Gaps = 3/47 (6%)

Query: 12 ELENDINVYDFLAQNGYELKFIALERDGEILPKKLWRERFMSEGKAY 58
E++N I+ +FLAQN +L ++ E++ E ++ + F + KAY
Sbjct: 392 EIQNKIDFMEFLAQNNAKLDNLS-EKEKEKFRTEI--KDFQKDSKAY 435


11  CCON33237_RS04685  CCON33237_RS04820  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag           DNBias  CDNBias  %GCBias     Product
CCON33237_RS04685  -2      18       -4.282702   biopolymer transporter ExbB
CCON33237_RS04690  -2      15       -3.396178   ferritin
CCON33237_RS04695  -2      17       -3.772284   histidine phosphatase
CCON33237_RS04700  -2      18       -4.300214   transcriptional regulator
CCON33237_RS04705  -1      17       -5.295680   PAS sensor protein
CCON33237_RS04710  -1      18       -5.719161   type II secretion system protein F
CCON33237_RS04715  -1      19       -6.499699   general secretion pathway protein GspE
CCON33237_RS04720  1       25       -8.467928   hypothetical protein
CCON33237_RS04725  0       24       -8.582183   hypothetical protein
CCON33237_RS04730  -1      21       -7.552603   AAA family ATPase
CCON33237_RS04735  -2      15       -5.266630   pilus (MSHA type) biogenesis protein MshL
CCON33237_RS04740  -1      13       -3.984220   hypothetical protein
CCON33237_RS04745  0       11       -2.943275   hypothetical protein
CCON33237_RS04750  -1      12       -0.318557   hypothetical protein
CCON33237_RS04755  -1      16       2.882964    GTPase Era
CCON33237_RS04760  -3      17       3.606295    ATP-dependent protease ATP-binding subunit HslU
CCON33237_RS04765  -2      17       3.233098    HslU--HslV peptidase proteolytic subunit
CCON33237_RS04770  -2      18       3.246094    50S ribosomal protein L9
CCON33237_RS04775  -1      18       3.374108    argininosuccinate synthase
CCON33237_RS04780  -2      15       2.753671    colicin I receptor
CCON33237_RS04785  -1      15       1.955727    IroE protein
CCON33237_RS04790  -2      14       1.325596    hypothetical protein
CCON33237_RS04795  -1      13       -0.178897   tRNA
CCON33237_RS04800  0       12       -0.930795   LPS export ABC transporter ATP-binding protein
CCON33237_RS04805  -1      11       -2.596426   RNA polymerase factor sigma-54
CCON33237_RS04810  0       12       -3.685786   hypothetical protein
CCON33237_RS04815  -1      13       -3.597090   adenylosuccinate lyase
CCON33237_RS04820  -1      13       -3.927575   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS04695  PF06580       31     0.002    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.0 bits (70), Expect = 0.002
Identities = 11/43 (25%), Positives = 26/43 (60%), Gaps = 3/43 (6%)

Query: 68 AKTLKFKEEIDLID---ELYNISFEDLLKFVKNIDESLDEIFI 107
A+ + +E+ ++D +L +I FED L+F I+ ++ ++ +
Sbjct: 213 ARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQV 255


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS04700  TYPE4SSCAGA   29     0.041    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 29.3 bits (65), Expect = 0.041
Identities = 25/132 (18%), Positives = 69/132 (52%), Gaps = 7/132 (5%)

Query: 248 NIKTLQQEVNDIDENSKKIDEISKLTTQNVENFKKVLSEFNANAQNTAQTSKYVENKTFA 307
N +++ D++++ +K + + K + +E+ ++ A AQ +Q +++ FA
Sbjct: 603 NYDEVKKAQKDLEKSLRKREHLEKEVEKKLESKSGNKNKMEAKAQANSQ-----KDEIFA 657

Query: 308 IIAKISQIAYKTKVYSDLIN--EQGYSEEIRDLAQSLLDWYENESISEHNQNKNFDKSKE 365
+I K + + Y+ + ++ S+++ ++ ++L D+ ++ ++ +NK+F K++E
Sbjct: 658 LINKEANRDARAIAYAQNLKGIKRELSDKLENVNKNLKDFDKSFDEFKNGKNKDFSKAEE 717

Query: 366 LTASLNNEIKPL 377
+L +K L
Sbjct: 718 TLKALKGSVKDL 729


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS04710  BCTERIALGSPF  223    3e-71    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 223 bits (571), Expect = 3e-71
Identities = 96/408 (23%), Positives = 193/408 (47%), Gaps = 8/408 (1%)

Query: 1 MKFYEIEYIK-DGKRQKMSLKANSKNDVKNRANIQGMI-VKIKETQVSSINNYFLDLQEK 58
M Y + + GK+ + + +A+S + +G++ + + E + + L +
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 59 FNKIFSSSKVKIPALVATIRQLSVMTNAGISIHDGIKETANATEDKRLKTIFQTLDEDLN 118
S+S L RQL+ + A + + + + A +E L + + +
Sbjct: 61 RKIRLSTS-----DLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVM 115

Query: 119 QGASLTQSIENFQEELGDVTVAMVRLGESTGNMADALSKLASILQEVWDNQQKFKKAIRY 178
+G SL +++ F + AMV GE++G++ L++LA ++ + + ++A+ Y
Sbjct: 116 EGHSLADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIY 175

Query: 179 PITVICSIILAFIVLMTSVVPQFREIFSQLNADLPLPTKILLNIEYIMSNYGIYIIVVLV 238
P + I +L++ VVP+ E F + LPL T++L+ + + +G ++++ L+
Sbjct: 176 PCVLTVVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALL 235

Query: 239 TFVFLLKRQYSNDENFRDKVDKYLLKVYLVGKIIFFANMSRFNLIFTELVRAGLPIADAL 298
R E R + LL + L+G+I N +R+ + L + +P+ A+
Sbjct: 236 AGFMAF-RVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAM 294

Query: 299 DTAVVTVSNQDIRNKLTAVKVLVGRGISLTEAFRQTGLYEGMLIQMIGAGEQSGSLDDMT 358
+ +SN R++L+ V G+SL +A QT L+ M+ MI +GE+SG LD M
Sbjct: 295 RISGDVMSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSML 354

Query: 359 QKVTDYYRVKFNDIIDNISNYIEPILLIFIAAMVLLLALGIFMPMWDM 406
++ D +F+ + EP+L++ +AA+VL + L I P+ +
Sbjct: 355 ERAADNQDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQL 402


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS04735  BCTERIALGSPD  157    1e-43    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 157 bits (398), Expect = 1e-43
Identities = 64/289 (22%), Positives = 133/289 (46%), Gaps = 18/289 (6%)

Query: 202 NAGLITVTATPSQLKRVEKYIDEMQKRLKKQVIIDVSIISVELNNEYKQGVDWSKFELGF 261
+ VTA P + +E+ I ++ R + QV+++ I V+ + G+ W+ G
Sbjct: 317 QTNALIVTAAPDVMNDLERVIAQLDIR-RPQVLVEAIIAEVQDADGLNLGIQWANKNAGM 375

Query: 262 NTYIGNSRNNPSSSA---TWTNKGNSLSDGFGRTLN----IAANLNFSLDGMINFLETNG 314
+ + ++ A + G S + A + ++ L ++
Sbjct: 376 TQFTNSGLPISTAIAGANQYNKDGTVSSSLASALSSFNGIAAGFYQGNWAMLLTALSSST 435

Query: 315 KTKVISSPKVTTLNNQQALISVGDNINYRVQQKTDNGNSNSDRLTTTYKQYSVFIGILLN 374
K ++++P + TL+N +A +VG + +T +G D + T ++ +V GI L
Sbjct: 436 KNDILATPSIVTLDNMEATFNVGQEVPVLTGSQTTSG----DNIFNTVERKTV--GIKLK 489

Query: 375 LLPEVSDNNKIMLRINPSLSNFKYAEDDTRQNALREIAPDTVQKKLSTVVQVDSGDTIIL 434
+ P++++ + ++L I +S+ D + ++ + ++ V V SG+T+++
Sbjct: 490 VKPQINEGDSVLLEIEQEVSSV----ADAASSTSSDLGATFNTRTVNNAVLVGSGETVVV 545

Query: 435 GGLIGQTKGKNNTSVPLLSDIPLIGGVFKSTRDNIKTTELIFVITPHVV 483
GGL+ ++ VPLL DIP+IG +F+ST + L+ I P V+
Sbjct: 546 GGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVI 594


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS04760  HTHFIS        29     0.029    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.4 bits (66), Expect = 0.029
Identities = 9/33 (27%), Positives = 16/33 (48%), Gaps = 3/33 (9%)

Query: 51 NILMIGSTGVGKTEIAR---RLSKMMGLPFIKV 80
+++ G +G GK +AR K PF+ +
Sbjct: 162 TLMITGESGTGKELVARALHDYGKRRNGPFVAI 194


12  CCON33237_RS04955  CCON33237_RS05070  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag           DNBias  CDNBias  %GCBias     Product
CCON33237_RS04955  2       10       -1.322554   hypothetical protein
CCON33237_RS04960  2       9        -0.829836   1-acyl-sn-glycerol-3-phosphate acyltransferase
CCON33237_RS04965  3       10       -0.099159   protease
CCON33237_RS04970  3       9        1.219437    molecular chaperone HtpG
CCON33237_RS04975  2       10       1.986499    hypothetical protein
CCON33237_RS04980  2       10       2.695172    heterodisulfide reductase subunit B
CCON33237_RS04985  -2      15       4.691698    (2Fe-2S)-binding protein
CCON33237_RS04990  -3      14       3.897064    succinate dehydrogenase
CCON33237_RS04995  -1      14       3.167391    iron ABC transporter ATP-binding protein
CCON33237_RS05000  -2      16       2.336472    2-oxoglutarate:acceptor oxidoreductase
CCON33237_RS05010  -2      19       4.390045    2-oxoglutarate ferredoxin oxidoreductase subunit
CCON33237_RS05015  -1      16       4.286363    2-oxoglutarate synthase subunit alpha
CCON33237_RS05020  0       15       4.661482    2-oxoglutarate oxidoreductase delta subunit
CCON33237_RS05025  -2      16       5.001547    malate dehydrogenase
CCON33237_RS05030  -2      15       4.062836    isocitrate dehydrogenase
CCON33237_RS05035  1       12       0.910419    aminodeoxychorismate lyase
CCON33237_RS05040  1       12       0.018135    DUF3971 domain-containing protein
CCON33237_RS05045  1       12       0.473188    hydrogenase maturation nickel metallochaperone
CCON33237_RS05050  0       14       0.995245    hydrogenase expression/formation protein HypE
CCON33237_RS05055  0       15       0.871146    hydrogenase formation protein HypD
CCON33237_RS05060  -1      16       1.741698    hydrogenase formation protein
CCON33237_RS05065  -3      18       4.562697    hydrogenase accessory protein HypB
CCON33237_RS05070  -3      16       3.273741    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS04980  PYOCINKILLER  36     0.002    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 35.5 bits (81), Expect = 0.002
Identities = 28/122 (22%), Positives = 49/122 (40%), Gaps = 9/122 (7%)

Query: 660 GVPISVNNGTIETTTRLYGLTTNFTNTFTDKAGNKAAPVTDKITLDNSIKVQFARDNDGD 719
G+P SVN + + L TN + TD +++ ++ V+ A N
Sbjct: 336 GLPPSVNLNAVAKASGTVDLPMRLTNEARGNTTTLSVVSTDGVSVPKAVPVRMAAYNATT 395

Query: 720 GLIDSITTPSGSPSTQSDLDIYLPNNTKPGSNINVTISTPTIPSVTTTVRYTISDDGLTA 779
GL + +T PS + + + P + P N N + +TP +P + +G T
Sbjct: 396 GLYE-VTVPSTTAEAPPLILTWTPAS--PPGNQNPSSTTPVVP------KPVPVYEGATL 446

Query: 780 VP 781
P
Sbjct: 447 TP 448


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS05005  FLGMOTORFLIG  29     0.021    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 28.6 bits (64), Expect = 0.021
Identities = 18/111 (16%), Positives = 43/111 (38%), Gaps = 12/111 (10%)

Query: 91 FSVFDVVMMSANARLGIFERPSKKYEMIALDVLKTLNLESFKDKIYTDLSGGERQMVLIA 150
F D+V++ + + + AL ++KI+ ++S +R ++
Sbjct: 245 FVFEDIVLLDDRSIQRVLREIDGQELAKALKS----VDIPVQEKIFKNMS--KRAASMLK 298

Query: 151 RALAQRSKVMLLDEPTANLDFGNQMRVLKEIKKLAKQGYIIILTSHQPEQV 201
+ D + Q +++ I+KL +QG I+I + + +
Sbjct: 299 EDMEFLGPTRRKDVEES------QQKIVSLIRKLEEQGEIVISRGGEEDVL 343


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS05080  PF05211       27     0.008    Neuraminyllactose-binding hemagglutinin
		>PF05211#Neuraminyllactose-binding hemagglutinin

Length = 260

Score = 26.9 bits (59), Expect = 0.008
Identities = 10/23 (43%), Positives = 16/23 (69%)

Query: 52 MQKIDTQFALESLEVYQKIADDM 74
MQ+ID + ++LE YQK A ++
Sbjct: 232 MQEIDKKLTQKNLESYQKDAKEL 254


13  CCON33237_RS05460  CCON33237_RS05680  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag           DNBias  CDNBias  %GCBias     Product
CCON33237_RS05460  4       18       0.881079    phage tail protein
CCON33237_RS05465  5       18       1.182747    tail protein
CCON33237_RS05470  4       16       1.804226    hypothetical protein
CCON33237_RS05475  3       16       1.682619    phage tail tape measure protein
CCON33237_RS05480  2       14       1.805291    hypothetical protein
CCON33237_RS05485  2       14       0.794719    phage tail protein
CCON33237_RS05490  2       13       1.006562    phage tail protein
CCON33237_RS05495  1       14       0.661644    hypothetical protein
CCON33237_RS05500  1       16       -0.121501   hypothetical protein
CCON33237_RS05505  2       17       -0.167381   hypothetical protein
CCON33237_RS05510  2       20       0.230194    hypothetical protein
CCON33237_RS05515  3       21       0.132639    hypothetical protein
CCON33237_RS05520  3       24       -0.109336   baseplate assembly protein
CCON33237_RS05525  1       23       -0.163795   baseplate assembly protein
CCON33237_RS05530  1       25       -0.208940   baseplate assembly protein
CCON33237_RS05535  0       23       -0.365456   hypothetical protein
CCON33237_RS05540  0       19       -0.949130   hypothetical protein
CCON33237_RS05545  2       23       0.178811    hypothetical protein
CCON33237_RS05550  2       18       0.306361    lysozyme
CCON33237_RS05555  2       18       1.008413    hypothetical protein
CCON33237_RS05560  3       18       0.412595    hypothetical protein
CCON33237_RS05565  2       17       0.739272    hypothetical protein
CCON33237_RS05570  1       15       -0.956522   phage capsid protein
CCON33237_RS05575  1       16       -1.441576   hypothetical protein
CCON33237_RS05580  0       18       -1.883897   portal protein
CCON33237_RS05585  1       17       -2.390391   hypothetical protein
CCON33237_RS05590  2       19       -2.745378   hypothetical protein
CCON33237_RS05595  1       18       -4.005379   hypothetical protein
CCON33237_RS05600  3       24       -2.116857   hypothetical protein
CCON33237_RS05610  2       23       -1.339878   hypothetical protein
CCON33237_RS05615  4       23       -0.374106   hypothetical protein
CCON33237_RS05620  5       24       -0.157924   hypothetical protein
CCON33237_RS05625  4       27       0.163593    hypothetical protein
CCON33237_RS05630  3       26       0.805038    hypothetical protein
CCON33237_RS05635  3       24       1.024306    hypothetical protein
CCON33237_RS05640  3       21       1.281364    hypothetical protein
CCON33237_RS05645  1       19       1.751949    hypothetical protein
CCON33237_RS05650  0       18       1.939620    bacteriocin
CCON33237_RS05655  0       16       1.950937    hypothetical protein
CCON33237_RS05660  -1      18       2.443990    hypothetical protein
CCON33237_RS05665  -2      16       1.097942    hypothetical protein
CCON33237_RS05670  -2      14       -2.837822   cupin
CCON33237_RS05675  -1      18       -4.469343   transcriptional regulator
CCON33237_RS05680  -2      19       -4.371611   type II secretion system protein K
ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS05520  ENTEROVIROMP  28     0.030    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 27.6 bits (61), Expect = 0.030
Identities = 25/133 (18%), Positives = 47/133 (35%), Gaps = 6/133 (4%)

Query: 74 YYSGTFYSLNKALSALYADAKVKEWFDYA-GLPYHFKLELDASKNGV-SPQTLKRSDEII 131
+ +GT + ++ YA + + + G ++ E D S GV T
Sbjct: 16 FTAGTSVAATSTVTGGYAQSDAQGQMNKMGGFNLKYRYEEDNSPLGVIGSFTYTEKSRTA 75

Query: 132 NTYKNVRSVYDGASIKATASINLKAYSYTFSGES---ISVDPYVISNINEHAH-FKFGAT 187
++ ++ Y G + IN A Y G Y + + F +GA
Sbjct: 76 SSGDYNKNQYYGITAGPAYRINDWASIYGVVGVGYGKFQTTEYPTYKHDTSDYGFSYGAG 135

Query: 188 TQINEIISIPIDA 200
Q N + ++ +D
Sbjct: 136 LQFNPMENVALDF 148


14  CCON33237_RS05770  CCON33237_RS05855  Y  N  N  Genomic Island
LocusTag           DNBias  CDNBias  %GCBias     Product
CCON33237_RS05770  2       14       -0.991102   aminoacyl-tRNA deacylase
CCON33237_RS05775  1       15       -1.209954   dihydrodipicolinate reductase
CCON33237_RS05780  1       17       -2.434456   hypothetical protein
CCON33237_RS05785  1       18       -3.328809   hypothetical protein
CCON33237_RS05790  1       19       -4.172601   hypothetical protein
CCON33237_RS05795  2       25       -5.720470   transcriptional repressor
CCON33237_RS05800  3       25       -5.660869   hypothetical protein
CCON33237_RS05805  6       26       -5.379205   aminotransferase
CCON33237_RS05810  1       21       -1.104181   hypothetical protein
CCON33237_RS05815  0       17       0.015122    hypothetical protein
CCON33237_RS05820  0       17       0.677695    hypothetical protein
CCON33237_RS05825  0       15       0.729675    ATPase
CCON33237_RS05830  -1      13       2.544069    amino acid carrier protein
CCON33237_RS05835  -2      11       2.814826    hypothetical protein
CCON33237_RS05845  -1      18       3.231853    ModD protein
CCON33237_RS05850  0       19       3.480495    choline transporter
CCON33237_RS05855  -1      21       3.925058    hypothetical protein
15  CCON33237_RS05920  CCON33237_RS06040  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag           DNBias  CDNBias  %GCBias     Product
CCON33237_RS05920  -2      22       3.130294    phosphate ABC transporter ATP-binding protein
CCON33237_RS05925  -3      22       3.031669    carbon-nitrogen hydrolase
CCON33237_RS05930  -1      22       2.210814    cation transporter
CCON33237_RS05935  -3      19       2.612246    peptidyl-arginine deiminase
CCON33237_RS05940  -2      18       2.033181    hypothetical protein
CCON33237_RS05945  -1      17       3.073213    polar amino acid ABC transporter permease
CCON33237_RS05950  -1      16       3.101946    amino acid ABC transporter permease
CCON33237_RS05955  -1      17       2.950029    amino acid ABC transporter ATP-binding protein
CCON33237_RS05960  -1      14       2.003603    ABC transporter substrate-binding protein
CCON33237_RS05965  0       14       2.885882    phospho-sugar mutase
CCON33237_RS05970  0       19       4.399980    molecular chaperone GroEL
CCON33237_RS05975  1       23       3.872235    co-chaperone GroES
CCON33237_RS05980  1       23       3.686070    hypothetical protein
CCON33237_RS05985  2       24       4.226738    hypothetical protein
CCON33237_RS05990  -1      22       4.274117    serine/threonine protein phosphatase
CCON33237_RS05995  -1      21       4.000428    excinuclease ABC subunit B
CCON33237_RS06000  -1      14       1.690548    hypothetical protein
CCON33237_RS06005  -3      12       0.247219    hypothetical protein
CCON33237_RS06010  -3      11       0.382671    N-terminal cleavage protein
CCON33237_RS06015  -2      11       0.385343    primosomal protein N'
CCON33237_RS06020  -1      12       0.748708    4-hydroxy-3-methylbut-2-en-1-yl diphosphate
CCON33237_RS06025  0       11       -0.478106   replicative DNA helicase
CCON33237_RS06030  2       12       -0.625572   competence protein
CCON33237_RS06035  1       13       0.232999    hypothetical protein
CCON33237_RS06040  2       12       0.456069    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS05960  adhesinb  29  0.026  Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 28.7 bits (64), Expect = 0.026
Identities = 11/32 (34%), Positives = 18/32 (56%)

Query: 1 MRKLKFFLLALVATVFLTGCGDDKGADKAAAS 32
M+K +F +L L+A V L C K + + +S
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGSS 32


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS06010  BCTERIALGSPG  48  5e-10  Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 48.0 bits (114), Expect = 5e-10
Identities = 17/58 (29%), Positives = 37/58 (63%)

Query: 2 KRRAYTLLELIFIVVILGILSTVAIPRLFFSRSDATISNAKTQLAAIRSGISLKYNDN 59
K+R +TLLE++ ++VI+G+L+++ +P L ++ A A + + A+ + + + DN
Sbjct: 6 KQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDN 63


16  CCON33237_RS06085  CCON33237_RS06225  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CCON33237_RS06085-313-4.183541hypothetical protein
CCON33237_RS06090-214-3.865711di-trans,poly-cis-decaprenylcistransferase
CCON33237_RS06095-116-5.107062peptidase A24
CCON33237_RS06100-216-5.315028hypothetical protein
CCON33237_RS06105-116-3.536040tRNA pseudouridine synthase A
CCON33237_RS06110-214-2.251433type II secretion system protein K
CCON33237_RS06140-211-2.510369**carbamoyl phosphate synthase large subunit
CCON33237_RS06145-110-0.927121hypothetical protein
CCON33237_RS06150-2110.246600nicotinate phosphoribosyltransferase
CCON33237_RS06155-2110.802318guanine permease
CCON33237_RS06160-2110.695274radical SAM protein
CCON33237_RS06165-2120.815244ABC transporter ATP-binding protein
CCON33237_RS06170-2153.601922glutamine--fructose-6-phosphate
CCON33237_RS061750175.518753hypothetical protein
CCON33237_RS061800185.8461992,3,4,5-tetrahydropyridine-2,6-carboxylate
CCON33237_RS061851226.827841hypothetical protein
CCON33237_RS061901236.865225hypothetical protein
CCON33237_RS061950226.146499hypothetical protein
CCON33237_RS06200-1184.677949PFL family protein
CCON33237_RS06205-2141.964898hypothetical protein
CCON33237_RS06210-3121.801065ATP-binding protein
CCON33237_RS062150120.200487phosphomethylpyrimidine synthase
CCON33237_RS06220012-2.502043bifunctional enzyme IspD/IspF
CCON33237_RS06225012-4.073789response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS06095  PREPILNPTASE  128  5e-38  Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 128 bits (324), Expect = 5e-38
Identities = 65/276 (23%), Positives = 117/276 (42%), Gaps = 37/276 (13%)

Query: 7 FFAVFTFVLGICVGSFSNVLIYRLPK------------------------NESINFPASH 42
+ F+ + +GSF NV+I+RLP ++ P S
Sbjct: 14 LYFSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDEPPYNLMVPRSC 73

Query: 43 CPNCDHKLNFYHNVPIFSWIFLGGKCAFCKQKISLIYPAIELVSGILFLICFFKECGEIL 102
CP+C+H + N+P+ SW++L G+C C+ IS YP +EL++ +L +
Sbjct: 74 CPHCNHPITALENIPLLSWLWLRGRCRGCQAPISARYPLVELLTALLSVAVAMT------ 127

Query: 103 SIETLLYALFLGLCFIMLLALSVIDIRYKAVPDPLLFAALFFAFIYALMLFIVKGNFAQI 162
+ L L +L+AL+ ID+ +PD L L+ ++ L+ V A +
Sbjct: 128 -LAPGWGTLAALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDA-V 185

Query: 163 LNLFLFALIFWVLRFVVSFAIKKEAMGSADIFIAAIIGAILPAKLALVAIYLAALFTLPV 222
+ L+ W L + KE MG D + A +GA L + + + L++L +
Sbjct: 186 IGAMAGYLVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFM 245

Query: 223 YALVQK-----KGYELAFVPFLSLGLLIAYAFKEQI 253
+ + + F P+L++ IA + + I
Sbjct: 246 GIGLILLRNHHQSKPIPFGPYLAIAGWIALLWGDSI 281


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS06140  BCTERIALGSPG  39  3e-06  Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 39.1 bits (91), Expect = 3e-06
Identities = 12/30 (40%), Positives = 24/30 (80%)

Query: 1 MHKNKGFTVIELIFVIIAVGILAAMIIPRL 30
K +GFT++E++ VI+ +G+LA++++P L
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNL 33


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS06225  HTHFIS  63  3e-13  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.9 bits (153), Expect = 3e-13
Identities = 24/113 (21%), Positives = 54/113 (47%), Gaps = 7/113 (6%)

Query: 2 KILIVENEIYLAGSMASKLADFGYDCEIAKSVKEALKF---ENFDVVLLSTTLPGQDFYP 58
IL+ +++ + + L+ GYD I + ++ + D+V+ +P ++ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 59 VIEKFKSS----IIILLIAYINSDTVLKPIQAGAVDYIQKPFMIEELVRKIKH 107
++ + K + ++++ A T +K + GA DY+ KPF + EL+ I
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117



Score = 31.7 bits (72), Expect = 0.003
Identities = 7/27 (25%), Positives = 16/27 (59%)

Query: 270 TELSKKLGISRKSLWEKRKKYDVSKKK 296
+ + LG++R +L +K ++ VS +
Sbjct: 453 IKAADLLGLNRNTLRKKIRELGVSVYR 479


17  CCON33237_RS06300  CCON33237_RS06350  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
CCON33237_RS063003111.067384RNA methyltransferase
CCON33237_RS063055163.427596poly(A) polymerase
CCON33237_RS063108214.405918hypothetical protein
CCON33237_RS0631511264.682845hypothetical protein
CCON33237_RS0632012295.4789593-isopropylmalate dehydrogenase
CCON33237_RS0632513335.3215453-isopropylmalate dehydratase
CCON33237_RS0633012345.546077pyridine nucleotide-disulfide oxidoreductase
CCON33237_RS063358314.423427hypothetical protein
CCON33237_RS063406264.831826C4-dicarboxylate ABC transporter
CCON33237_RS063454224.737246hypothetical protein
CCON33237_RS063501173.413559hypothetical protein
18  CCON33237_RS06475  CCON33237_RS06655  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CCON33237_RS064753191.676013DNA-binding protein
CCON33237_RS064801150.728510hypothetical protein
CCON33237_RS064850130.2786213-deoxy-manno-octulosonate cytidylyltransferase
CCON33237_RS06490-111-1.310319threonine synthase
CCON33237_RS06495020-3.560681divalent-cation tolerance protein CutA
CCON33237_RS06500022-3.427628tetraacyldisaccharide 4'-kinase
CCON33237_RS06505022-3.587041aminotransferase DegT
CCON33237_RS06510120-2.495476NAD+ synthase
CCON33237_RS06515120-2.085533MBL fold metallo-hydrolase
CCON33237_RS06520118-1.051168hypothetical protein
CCON33237_RS06525-3110.866426thioesterase
CCON33237_RS06530-3153.054996tRNA 5-methoxyuridine(34)/uridine 5-oxyacetic
CCON33237_RS06535-3174.202055transcriptional regulator
CCON33237_RS06540-2184.298359hypothetical protein
CCON33237_RS06545-1194.705468mercury transporter
CCON33237_RS06550-2183.596303aminoacyl-histidine dipeptidase
CCON33237_RS065550224.443858dihydroorotase
CCON33237_RS065600183.060792aspartate carbamoyltransferase
CCON33237_RS065650162.151493serine protease
CCON33237_RS065701162.330488hypothetical protein
CCON33237_RS065752173.2412763'-to-5' oligoribonuclease B
CCON33237_RS065802162.235482flagellar biosynthesis protein FlhA
CCON33237_RS065851150.081333Rrf2 family transcriptional regulator
CCON33237_RS065900131.28050930S ribosomal protein S15
CCON33237_RS065950131.756551spermidine/putrescine ABC transporter
CCON33237_RS06600-2153.209789formate dehydrogenase
CCON33237_RS06605-1172.544477tRNA (cmo5U34)-methyltransferase
CCON33237_RS06610-1194.323314FAD synthetase
CCON33237_RS06615-1173.953364TlyA family rRNA
CCON33237_RS066200163.178166DNA ligase (NAD(+)) LigA
CCON33237_RS06625-1143.181264dihydropteroate synthase
CCON33237_RS06630-2130.685878DNA polymerase III subunit delta'
CCON33237_RS06635-112-0.960667hypothetical protein
CCON33237_RS06640-214-2.054964aspartate kinase
CCON33237_RS06645-214-2.652285RNA pyrophosphohydrolase
CCON33237_RS06650-113-2.376199peptidase
CCON33237_RS06655012-3.543622ligand-gated channel protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS06555  UREASE  48  4e-08  Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 48.2 bits (115), Expect = 4e-08
Identities = 40/168 (23%), Positives = 63/168 (37%), Gaps = 39/168 (23%)

Query: 1 MRIAIINGTIVNSDEKFKANILIENGKIAKIGSEKF------------EADKVIDATNKL 48
+ I N I++ KA+I +++G+IA IG +VI K+
Sbjct: 68 VDTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKI 127

Query: 49 VMPGLIDMHVHFRDPGQEYKDDIISGSEAAVAGGVTTCLCMANTNPVNDNASIT------ 102
V G +D H+HF P Q E A+ G+T + T P + + T
Sbjct: 128 VTAGGMDSHIHFICPQQ---------IEEALMSGLTC-MLGGGTGPAHGTLATTCTPGPW 177

Query: 103 --RAMIEKAKNCGLIDLLPI--AAISKGLGGNEIVEMGDLIEAGAVAF 146
MIE A D P+ A KG + +++ GA +
Sbjct: 178 HIARMIEAA------DAFPMNLAFAGKGNASLP-GALVEMVLGGATSL 218


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS06650  CARBMTKINASE  43  1e-06  Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 42.5 bits (100), Expect = 1e-06
Identities = 25/96 (26%), Positives = 45/96 (46%), Gaps = 6/96 (6%)

Query: 115 IETARLKAELKAGKIVVVAGFQGI---DEKGNITTL-GRGGSDLSAVALAGALDADLCEI 170
+E +K ++ G IV+ +G G+ E G I + DL+ LA ++AD+ I
Sbjct: 174 VEAETIKKLVERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMI 233

Query: 171 FTDVDGVYTTDPRIEKKAKKLEQISYDEMLELASAG 206
TDV+G +K + L ++ +E+ + G
Sbjct: 234 LTDVNGAALYYGT--EKEQWLREVKVEELRKYYEEG 267


19  CCON33237_RS06740  CCON33237_RS06885  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CCON33237_RS06740-120-3.772323****bifunctional 3,4-dihydroxy-2-butanone
CCON33237_RS06745-120-3.9795213-methyl-2-oxobutanoate
CCON33237_RS06770016-3.333218histidine phosphotransferase
CCON33237_RS06775-213-1.831452HIT family protein
CCON33237_RS06780-3140.226133AI-2E family transporter
CCON33237_RS06790-3161.058116hypothetical protein
CCON33237_RS06795-3181.623342Holliday junction DNA helicase RuvB
CCON33237_RS06800-3172.656033two-component sensor histidine kinase
CCON33237_RS06805-3163.143908chemotaxis protein CheY
CCON33237_RS068100111.500293polysulfide reductase
CCON33237_RS068200122.438757hypothetical protein
CCON33237_RS06825-1122.449861thiosulfate reductase
CCON33237_RS06830-2122.453229C4-dicarboxylate ABC transporter
CCON33237_RS06835-2102.038784aspartate ammonia-lyase
CCON33237_RS06840-4142.749950ABC transporter ATP-binding protein
CCON33237_RS06845011-0.067005DUF1287 domain-containing protein
CCON33237_RS06850214-1.680818glucose-6-phosphate isomerase
CCON33237_RS06855314-2.060726UTP--glucose-1-phosphate uridylyltransferase
CCON33237_RS06860516-2.630838phosphoenolpyruvate carboxykinase
CCON33237_RS06865415-0.320417hypothetical protein
CCON33237_RS06870416-0.093197hypothetical protein
CCON33237_RS068753162.928563prolipoprotein diacylglyceryl transferase
CCON33237_RS068802142.981877fumarate reductase
CCON33237_RS068851143.595724fumarate reductase flavoprotein subunit
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS06810  PF06580  31  0.006  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.0 bits (70), Expect = 0.006
Identities = 20/85 (23%), Positives = 36/85 (42%), Gaps = 15/85 (17%)

Query: 269 QRIIDNTIINAIKYSPKESEILINLSFENERIKLSVQDFGKGIKDVKKVWKRYVREDEIQ 328
Q +++N I + I P+ +IL+ + +N + L V++ G +
Sbjct: 261 QTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLAL------------KNTK 308

Query: 329 GGFGLGLNIVSEICQKHGILYGVDS 353
G GL V E Q +LYG ++
Sbjct: 309 ESTGTGLQNVRERLQ---MLYGTEA 330


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS06815  HTHFIS  86  8e-22  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 86.4 bits (214), Expect = 8e-22
Identities = 27/107 (25%), Positives = 49/107 (45%)

Query: 2 KILLLEDDLGFQESVCEFLQTLGYEVTAVSDGQEACDLIEKNFYHLFILDIKVPGVNGHE 61
IL+ +DD + + + L GY+V S+ I L + D+ +P N +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VIKYIRSLNPNAPIMITTSLVDIGDMAIGYELGCNEYLKKPFELAEL 108
++ I+ P+ P+++ ++ E G +YL KPF+L EL
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTEL 111


20  CCON33237_RS07455  CCON33237_RS07520  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CCON33237_RS07455-3183.271553Vi polysaccharide biosynthesis protein
CCON33237_RS07460-1183.201386ecotin
CCON33237_RS07465-1173.743767UDP-glucose 6-dehydrogenase
CCON33237_RS074700162.958886UDP-glucose 4-epimerase GalE
CCON33237_RS07475-2142.767590DNA ligase
CCON33237_RS07480-2153.314644DNA-3-methyladenine glycosylase
CCON33237_RS07485-1184.456583NAD-dependent epimerase
CCON33237_RS074900195.401208hypothetical protein
CCON33237_RS074952246.876154malate:quinone oxidoreductase
CCON33237_RS075004308.798829TonB-dependent receptor
CCON33237_RS0750563710.084830enoyl-[acyl-carrier-protein] reductase
CCON33237_RS075105359.554705triose-phosphate isomerase
CCON33237_RS075154317.174218phosphoglycerate kinase
CCON33237_RS075201234.865227type I glyceraldehyde-3-phosphate dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS07470  NUCEPIMERASE  145  4e-43  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 145 bits (368), Expect = 4e-43
Identities = 70/334 (20%), Positives = 130/334 (38%), Gaps = 41/334 (12%)

Query: 1 MKILITGGAGYIGSHVVKALLKQGKDEITIIDNLCKGSQKALE----ALQKIGNFKFINA 56
MK L+TG AG+IG HV K LL+ G ++ IDNL +L+ L F+F
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKI 59

Query: 57 NLED--DLSEIFANGKFDAIIHFAAFIEVFESMSEPLKYYLNNTANVARVLRYAKTYNVN 114
+L D ++++FA+G F+ + + V S+ P Y +N +L + +
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ 119

Query: 115 KFIFSSTAAVYGEPDVAEVSETTPT-IPINPYGRSKLMSEQIIKDYAASNKNFKFAILRY 173
+++S+++VYG S P++ Y +K +E + Y+ LR+
Sbjct: 120 HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSH-LYGLPATGLRF 178

Query: 174 FNVAGADEEGLIGQNYPNATHLIKVAVQTILGKRESMGIFGDDYATKDGTCVRDYIHVSD 233
F V G G T + + +S+ ++ G RD+ ++ D
Sbjct: 179 FTVYG--PWGRPDMALFKFTKAML--------EGKSIDVYN------YGKMKRDFTYIDD 222

Query: 234 LADAHISALEYIGQN----------------GSETFNVGYGRGFSVKEVIETAKKVSEVN 277
+A+A I + I +N+G + + I+ + +
Sbjct: 223 IAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIE 282

Query: 278 FKVQNAPRRDGDPAILISNASKLRSLTSWKPKRD 311
K P + GD ++ L + + P+
Sbjct: 283 AKKNMLPLQPGDVLETSADTKALYEVIGFTPETT 316


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS07485  NUCEPIMERASE  515  0.0  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 515 bits (1329), Expect = 0.0
Identities = 201/350 (57%), Positives = 244/350 (69%), Gaps = 16/350 (4%)

Query: 1 MKILVTGTAGFIGFHLANALVKRGDEVVGYDVINDYYDVNLKLERLKTAGFDTSEIDYGK 60
MK LVTG AGFIGFH++ L++ G +VVG D +NDYYDV+LK RL+
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLA---------- 50

Query: 61 LITSKTHPNLKFIKADLADEKTMKELFAKEKFDVVVNLAAQAGVRYSLINPKAYIDSNIT 120
P +F K DLAD + M +LFA F+ V + VRYSL NP AY DSN+T
Sbjct: 51 ------QPGFQFHKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLT 104

Query: 121 GFVNILECCRHNEIKNLVYASSSSVYGLNENMPFSTHEAVNHPISLYAATKKSNEMMAHT 180
GF+NILE CRHN+I++L+YASSSSVYGLN MPFST ++V+HP+SLYAATKK+NE+MAHT
Sbjct: 105 GFLNILEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHT 164

Query: 181 YSHLFNVPTTGLRFFTVYGPWGRPDMALFLFVDAALKDKTIDVFNYGKMKRDFTYVDDIV 240
YSHL+ +P TGLRFFTVYGPWGRPDMALF F A L+ K+IDV+NYGKMKRDFTY+DDI
Sbjct: 165 YSHLYGLPATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIA 224

Query: 241 KGIIKCIDNPAKPNPAWDAKHPDPATSKAPFKVYNIGNNSPVELMDYIKAVEIKIGREIK 300
+ II+ D + W + PA S AP++VYNIGN+SPVELMDYI+A+E +G E K
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAK 284

Query: 301 KNFLPLQAGDVPATFADVNDLVADFDYKPNTKVNDGVAKFVEWYCEFYGV 350
KN LPLQ GDV T AD L + P T V DGV FV WY +FY V
Sbjct: 285 KNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYKV 334


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS07510  DHBDHDRGNASE  70  3e-16  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 69.7 bits (170), Expect = 3e-16
Identities = 63/259 (24%), Positives = 112/259 (43%), Gaps = 21/259 (8%)

Query: 3 LKGKKGLIVGVANAKSIAYGIAEACHAQGAQ-MAFTYLNDALKKRVEPIAEEFGSKFVYE 61
++GK I G A + I +A +QGA A Y + L+K V + E +
Sbjct: 6 IEGKIAFITGAA--QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 62 LDVNNPAHLDGLADRIKKDLGEIDFVVHAVAYAPKEALEGEFVNTTKEAFDIAMGTSVYS 121
DV + A +D + RI++++G ID +V+ + + + E ++ +
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIH----SLSDEEWEATFSVNSTG 119

Query: 122 LLSLTRAVLPVLK--EGGSVLTLTYLGGPKFVPHYNV--MGVAKAALESSVRYLAHDLGA 177
+ + +R+V + GS++T+ P VP ++ +KAA + L +L
Sbjct: 120 VFNASRSVSKYMMDRRSGSIVTVG--SNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAE 177

Query: 178 KNIRVNAISAGPIKT-----LAASGIGDFRMILRYNE---VNSPLKRNVTTEDVGNSAMY 229
NIR N +S G +T L A G ++I E PLK+ D+ ++ ++
Sbjct: 178 YNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLF 237

Query: 230 LLSDLASGVTGEVHYVDCG 248
L+S A +T VD G
Sbjct: 238 LVSGQAGHITMHNLCVDGG 256


21  CCON33237_RS07580  CCON33237_RS07760  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CCON33237_RS07580-211-3.045252hypothetical protein
CCON33237_RS07585-213-3.981669hypothetical protein
CCON33237_RS07590-212-1.213272hypothetical protein
CCON33237_RS07595-115-0.7580733'-5' exonuclease
CCON33237_RS07600-213-0.754452lipopolysaccharide heptosyltransferase I
CCON33237_RS07605-112-2.005280lipid A biosynthesis acyltransferase
CCON33237_RS07610-211-2.762231glycosyl transferase
CCON33237_RS07615-312-3.727791putative lipopolysaccharide heptosyltransferase
CCON33237_RS07620-314-4.162365glycosyl transferase family 1
CCON33237_RS07625-314-4.080012hypothetical protein
CCON33237_RS07630-216-5.066052hypothetical protein
CCON33237_RS07635-118-5.850735glycosyl transferase
CCON33237_RS07640-411-1.961727xylanase
CCON33237_RS07645-410-0.922506glycosyl transferase family 1
CCON33237_RS07650-410-0.525084hypothetical protein
CCON33237_RS07655-39-1.089406O-antigen ligase
CCON33237_RS07660-4100.027708excinuclease ABC subunit A
CCON33237_RS07665-3131.383647ferredoxin
CCON33237_RS07670228-2.311678nucleoside-diphosphate kinase
CCON33237_RS07675026-2.264981hypothetical protein
CCON33237_RS07680-126-3.57970250S ribosomal protein L32
CCON33237_RS07685-218-1.547904phosphate acyltransferase
CCON33237_RS07695-216-1.7715053-oxoacyl-ACP synthase
CCON33237_RS07700-216-1.525600alkyl hydroperoxide reductase
CCON33237_RS07705-314-2.868176pilus assembly protein PilZ
CCON33237_RS07710-412-2.132268hypothetical protein
CCON33237_RS07715-315-4.305314glycerol-3-phosphate acyltransferase
CCON33237_RS07720-214-4.594470dihydroneopterin aldolase
CCON33237_RS07725-114-4.278575DNA-binding response regulator
CCON33237_RS07730-115-3.812446two-component sensor histidine kinase
CCON33237_RS07735-1100.077405guanosine polyphosphate pyrophosphohydrolase
CCON33237_RS07740-2141.255369hypothetical protein
CCON33237_RS07745-2204.340092flagellar motor switch protein FliN
CCON33237_RS07750-2214.870179restriction endonuclease
CCON33237_RS07755-2214.697709tryptophan synthase subunit alpha
CCON33237_RS07760-3194.113979tryptophan synthase subunit beta
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS07725  HTHFIS  79  4e-19  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 78.7 bits (194), Expect = 4e-19
Identities = 35/112 (31%), Positives = 57/112 (50%)

Query: 2 RILIVEDEVTLNKTIAEGLQEFGYQTDSSENFKDAEYYIGIRNYDLVLTDWMLQDGDGVD 61
IL+ +D+ + + + L GY + N +I + DLV+TD ++ D + D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 LINIIKHKSPRTSVVVLSAKDDKESEIKALRAGADDYIKKPFDFDILVARLE 113
L+ IK P V+V+SA++ + IKA GA DY+ KPFD L+ +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS07730  PF06580  38  5e-05  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 37.9 bits (88), Expect = 5e-05
Identities = 19/98 (19%), Positives = 39/98 (39%), Gaps = 11/98 (11%)

Query: 286 IKLDLKPEILNLKIQTSLLTHIVQNFVQNAIKFSPKNSTITISSKLIKNKFIIEVIDEGI 345
+ + P I+++++ L+ +V+N +++ I P+ I + +EV + G
Sbjct: 242 FENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGS 301

Query: 346 GIDESKDLFAPFKRYGDKGGAGLGLFLVKGAAQALGGE 383
++ K G GL V+ Q L G
Sbjct: 302 LALKN-----------TKESTGTGLQNVRERLQMLYGT 328


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS07735  SHAPEPROTEIN  36  3e-04  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 35.5 bits (82), Expect = 3e-04
Identities = 20/55 (36%), Positives = 29/55 (52%)

Query: 114 EEATFGAIAAKNLLHNIAECVTIDIGGGSTELARISKGKIIDTLSLDIGTVRLKE 168
EE AI A + + +DIGGG+TE+A IS ++ + S+ IG R E
Sbjct: 142 EEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDE 196


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS07745  FLGMOTORFLIN  95  1e-28  Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 94.6 bits (235), Expect = 1e-28
Identities = 30/85 (35%), Positives = 48/85 (56%)

Query: 14 GLFKSYDELMDISVDFIAELGTTTVSINELLKFEAGSVIDLEKPAGESVELYINNRIFGK 73
G + D +MDI V ELG T ++I ELL+ GSV+ L+ AGE +++ IN + +
Sbjct: 49 GAMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQ 108

Query: 74 GEVMVYEKNLAIRINEILDSKSVIQ 98
GEV+V +RI +I+ ++
Sbjct: 109 GEVVVVADKYGVRITDIITPSERMR 133


22  CCON33237_RS07815  CCON33237_RS07855  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
CCON33237_RS07815214-1.468235ABC transporter
CCON33237_RS07820114-1.488828serine hydroxymethyltransferase
CCON33237_RS07825117-3.306671lysine--tRNA ligase
CCON33237_RS07830021-4.695004transcriptional repressor
CCON33237_RS07835-220-5.015309colicin V synthesis protein
CCON33237_RS07840-119-4.989845twitching motility protein PilT
CCON33237_RS07845-122-5.984425asparaginyl/glutamyl-tRNA amidotransferase
CCON33237_RS07850021-6.278829hypothetical protein
CCON33237_RS07855-116-4.935449L-seryl-tRNA selenium transferase
23  CCON33237_RS07950  CCON33237_RS08055  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CCON33237_RS07950-215-5.091676hypothetical protein
CCON33237_RS07955016-5.822981ATP/GTP-binding protein
CCON33237_RS07960019-6.651431hypothetical protein
CCON33237_RS07965-115-5.182560dUTPase
CCON33237_RS07970-116-5.084324hypothetical protein
CCON33237_RS07975-116-5.240716GGDEF-domain containing protein
CCON33237_RS07980-210-3.049076hypothetical protein
CCON33237_RS07985-39-1.966294GGDEF-domain containing protein
CCON33237_RS07990-3110.678779hypothetical protein
CCON33237_RS079951141.029168acyl-CoA thioesterase
CCON33237_RS08000-1161.368995Cro/Cl family transcriptional regulator
CCON33237_RS080052203.436985formate transporter
CCON33237_RS080101192.773484hydrogenase 3 maturation endopeptidase HyCI
CCON33237_RS080150212.157742hypothetical protein
CCON33237_RS080201212.254901formate hydrogenlyase subunit 7
CCON33237_RS080253241.850142formate hydrogenlyase
CCON33237_RS080304231.958780hydrogenase-4 component G
CCON33237_RS080353191.600137hydrogenase
CCON33237_RS080402171.278543hydantoin racemase
CCON33237_RS080453191.811010hydrogenase
CCON33237_RS080504191.230935NADH dehydrogenase
CCON33237_RS080552141.626009electron transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS07985  SSPAMPROTEIN  30  0.022  Salmonella surface presentation of antigen gene typ...
		>SSPAMPROTEIN#Salmonella surface presentation of antigen gene type M signature.

Length = 147

Score = 29.7 bits (66), Expect = 0.022
Identities = 20/60 (33%), Positives = 32/60 (53%), Gaps = 2/60 (3%)

Query: 309 FSNVRKMDILIKREIHIKKVLNNQIESKVKELEETNRHLQRISKYDYLTNALNRQYFIAR 368
++ +RK I ++R+I ++ QI+ K ELE+ Q SKY +L N Q +I R
Sbjct: 69 YALLRKQSI-VRRQIKDLELQIIQIQEKRSELEKKREEFQEKSKY-WLRKEGNYQRWIIR 126


24  CCON33237_RS08115  CCON33237_RS08190  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CCON33237_RS081153277.010328arginine ABC transporter ATP-binding protein
CCON33237_RS081205317.918192flavodoxin
CCON33237_RS081258389.580110hypothetical protein
CCON33237_RS0813084110.382906fumarate reductase
CCON33237_RS0813594311.201163sugar transporter
CCON33237_RS08140115212.409126pyruvate:ferredoxin (flavodoxin) oxidoreductase
CCON33237_RS081459378.000058DNA-deoxyinosine glycosylase
CCON33237_RS081506276.311516hypothetical protein
CCON33237_RS081551161.695742hypothetical protein
CCON33237_RS081601151.886387potassium transporter KefF
CCON33237_RS08165116-0.312760hypothetical protein
CCON33237_RS081702140.824624hypothetical protein
CCON33237_RS081750122.519913hypothetical protein
CCON33237_RS08180-1132.939800hypothetical protein
CCON33237_RS08185-2122.267161DUF4810 domain-containing protein
CCON33237_RS08190-1123.085992hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS08135  TCRTETB  49  2e-08  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 48.7 bits (116), Expect = 2e-08
Identities = 39/167 (23%), Positives = 68/167 (40%), Gaps = 1/167 (0%)

Query: 12 VIALAFCAFIFNTTEFVPVPLLSDIAKDFDMSTADTGLIITIYAWSVTILSLPLMLLTAN 71
+I L +F E V L DIA DF+ A T + T + + +I + L+
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 72 LERRSLLLKVFIVFVVAHTLCTFAWN-FKILIIARLMIAIAHAIFWAITASLAVRLAPIN 130
L + LLL I+ + + F +LI+AR + A F A+ + R P
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 131 KSSQALGLLALGTSLAMILGLPLGRILGDALGWRVTFGLIGIFAVGV 177
+A GL+ ++ +G +G ++ + W + I + V
Sbjct: 136 NRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITV 182


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS08185  56KDTSANTIGN  27  0.024  Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.

Length = 533

Score = 27.2 bits (60), Expect = 0.024
Identities = 11/19 (57%), Positives = 13/19 (68%)

Query: 68 GYKIAPGVYAHLGLLYLNN 86
G IAPG A LG++YL N
Sbjct: 80 GMTIAPGFRAELGVMYLRN 98


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS08190  FRAGILYSIN  27  0.049  Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.

Length = 405

Score = 27.3 bits (60), Expect = 0.049
Identities = 13/30 (43%), Positives = 19/30 (63%), Gaps = 1/30 (3%)

Query: 1 MKNVFKFGTVLLTAALFAGCASESSRVVET 30
MKNV K +L TAAL A C++E+ + +
Sbjct: 9 MKNV-KLLLMLGTAALLAACSNEADSLTTS 37


25  CCON33237_RS08305  CCON33237_RS08330  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CCON33237_RS08305014-3.2958593'-5'-bisphosphate nucleotidase
CCON33237_RS08310015-3.509836SAM-dependent methyltransferase
CCON33237_RS08315017-3.478091thiol peroxidase
CCON33237_RS08320017-3.171211molybdenum ABC transporter permease subunit
CCON33237_RS08325018-3.649783GTPase
CCON33237_RS08330-115-3.172308TOBE domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CCON33237_RS08310  ANTHRAXTOXNA  37  2e-04  Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 36.6 bits (84), Expect = 2e-04
Identities = 42/234 (17%), Positives = 87/234 (37%), Gaps = 50/234 (21%)

Query: 271 AVVDEYKNNKFKDRIDLEQFMDMISNKVFRQSLIVHSKTYESIANKQIGPSDINKIHVVA 330
A+ + Y + K E+ NK ++ +SI N + K
Sbjct: 33 AMNEHYTESDIKRNHKTEK------NKTEKEKF------KDSINN-------LVKTEFTN 73

Query: 331 DFIKKDNQWQDSYGAMPQDI-SWLCEVFYKMYPASINLSQ--ILEILPEDK--------- 378
+ + K Q QD +P+D+ E+ ++Y I+L + L+ L E++
Sbjct: 74 ETLDKIQQTQDLLKKIPKDVLEIYSELGGEIYFTDIDLVEHKELQDLSEEEKNSMNSRGE 133

Query: 379 -LIVYSAFV--------RILTNSSDAMILKDEQKNIEYRPGH----------SRLSQKLI 419
+ S FV +++ N D I ++ K + Y G L + +
Sbjct: 134 KVPFASRFVFEKKRETPKLIINIKDYAINSEQSKEVYYEIGKGISLDIISKDKSLDPEFL 193

Query: 420 NYVRYFLNHKNNADVVFANKFSISRKLNNIDYYILLLLDGKNSLEDVAAKTLKF 473
N ++ + +++D++F+ KF +LNN I + + + + +
Sbjct: 194 NLIKSLSDDSDSSDLLFSQKFKEKLELNNKSIDINFIKENLTEFQHAFSLAFSY 247


26  CCON33237_RS08380  CCON33237_RS08620  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CCON33237_RS083801213.809972hypothetical protein
CCON33237_RS083851225.418944S-ribosylhomocysteine lyase
CCON33237_RS083900204.835102flagellar export apparatus protein FliQ
CCON33237_RS083950163.138981UDP-N-acetylenolpyruvoylglucosamine reductase
CCON33237_RS084001152.260250hypothetical protein
CCON33237_RS08405-2194.500572DNA recombination/repair protein RecA
CCON33237_RS08410-1184.001193phosphopyruvate hydratase
CCON33237_RS08415-1153.146170hypothetical protein
CCON33237_RS08420-1133.506235AMIN domain-containing protein
CCON33237_RS08425-1133.893704integrase
CCON33237_RS08430-1123.843081orotate phosphoribosyltransferase
CCON33237_RS08435-1121.954695chemotaxis protein
CCON33237_RS084400111.307677glycerate kinase
CCON33237_RS08445011-0.042584gluconate transporter
CCON33237_RS08450-1110.469351phospholipase
CCON33237_RS084554224.534121molybdenum cofactor guanylyltransferase
CCON33237_RS084603213.598548hypothetical protein
CCON33237_RS084651213.840420hypothetical protein
CCON33237_RS084701192.912043hypothetical protein
CCON33237_RS084750183.3985293-isopropylmalate dehydratase large subunit
CCON33237_RS084800171.780718ligand-gated channel protein
CCON33237_RS08485014-2.385247fibronectin-binding protein
CCON33237_RS08490316-0.015415hypothetical protein
CCON33237_RS084952130.051450type II secretion system protein K
CCON33237_RS085250141.238358**reactive intermediate/imine deaminase
CCON33237_RS085300131.469543hypothetical protein
CCON33237_RS085351132.045246fumarate hydratase
CCON33237_RS08540-1142.359283fumarate hydratase
CCON33237_RS08545-2131.616537hypothetical protein
CCON33237_RS08550-3153.693860sodium-dependent transporter
CCON33237_RS08555-3204.511698CopG family transcriptional regulator
CCON33237_RS08560-4193.946228transporter
CCON33237_RS08565-3162.792964peptidase M48
CCON33237_RS08570-2162.056608*Na+/H+ antiporter
CCON33237_RS08580-2151.723498hypothetical protein
CCON33237_RS08585-212-0.780113ABC transporter permease
CCON33237_RS08590-111-1.433892ABC transporter ATP-binding protein
CCON33237_RS08595115-3.276071ferredoxin
CCON33237_RS08600116-2.685110cytochrome c
CCON33237_RS08605116-2.090625hypothetical protein
CCON33237_RS08610115-1.742389iron-sulfur protein
CCON33237_RS08615214-0.292730nitrous oxidase accessory protein
CCON33237_RS086202151.250616hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS08390	TYPE3IMQPROT	56	2e-14	Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 56.3 bits (136), Expect = 2e-14
Identities = 21/81 (25%), Positives = 39/81 (48%)

Query: 3 STLVSLGVETFKIALYISLPMLLSGLIAGLIISIFQATTQINETTLSFVPKILLVVVVII 62
LV G + + L +S + I GL++ +FQ TQ+ E TL F K+L V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 63 FLMPWMISMMVEFTTRMLDFI 83
L W +++ + +++
Sbjct: 62 LLSGWYGEVLLSYGRQVIFLA 82


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS08445	ACRIFLAVINRP	29	0.041	Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 29.4 bits (66), Expect = 0.041
Identities = 12/44 (27%), Positives = 23/44 (52%), Gaps = 3/44 (6%)

Query: 37 LMLAIVAGIDLSKIPAMIGVGF-SGTFKSIGIVIIFGTIIGTVL 79
LM ++ + + +P I G SG ++GI ++ G + T+L
Sbjct: 975 LMTSLAFILGV--LPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS08450	PHPHLIPASEA1	217	2e-71	Bacterial phospholipase A1 protein signature.
		>PHPHLIPASEA1#Bacterial phospholipase A1 protein signature.

Length = 289

Score = 217 bits (554), Expect = 2e-71
Identities = 87/242 (35%), Positives = 132/242 (54%), Gaps = 14/242 (5%)

Query: 88 NALGIELYKFNYLLPVTYAKNVP--------NDERKSVETKFQISLAKPLFYDLFGLRES 139
N + Y NYL+ + + + E KFQ+SLA PL+ + G
Sbjct: 48 NPFTLYPYDTNYLIYTQTSDLNKEAIASYDWAENARKDEVKFQLSLAFPLWRGILGPNSV 107

Query: 140 LVAAYTQTSWWQITRT--SAPFRETNYQPEIFLNFASPKYLEQIGVKSLKFGLLHESNGR 197
L A+YTQ SWWQ++ + S+PFRETNY+P++FL FA+ ++ ++ G H+SNGR
Sbjct: 108 LGASYTQKSWWQLSNSEESSPFRETNYEPQLFLGFATDYRFAGWTLRDVEMGYNHDSNGR 167

Query: 198 DGSNSRSWNRAYVQSDFVFGKLSISPRAWMVVGNKGDNKDILKYIGHGDVRLSYNLNDHI 257
SRSWNR Y + G + + W VVGN DN DI KY+G+ +++ Y+L D +
Sbjct: 168 SDPTSRSWNRLYTRLMAENGNWLVEVKPWYVVGNTDDNPDITKYMGYYQLKIGYHLGDAV 227

Query: 258 FSLMLRNNLHFDKTNKGAAEISYMFPIFSTGVYGYLQYFTGYGESLIDYNRHTDKFGLGF 317
+ +++ T G AE+ +PI V Y Q ++GYGESLIDYN + + G+G
Sbjct: 228 L--SAKGQYNWN-TGYGGAELGLSYPITKH-VRLYTQVYSGYGESLIDYNFNQTRVGVGV 283

Query: 318 VI 319
++
Sbjct: 284 ML 285


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS08485	FbpA_PF05833	126	9e-34	Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 126 bits (319), Expect = 9e-34
Identities = 67/291 (23%), Positives = 120/291 (41%), Gaps = 27/291 (9%)

Query: 168 DHFFKAEAARVNESRIASLKEAKLASVQKKIDSMSEILNSLEDKDELMKKSEEASNLGSL 227
++F+ A+ R+ S V I+ ++ L + + + + G L
Sbjct: 285 ENFYYAKDKS---DRLKSKSSDLQKIVMNNINRCTKKDKILNNTLKKCEDKDIFKLYGEL 341

Query: 228 LLANLGNFKGYEREICLKDF---DGNEIKLTLSD--TPKNSANEFYTRSKKFRAKALGVE 282
L AN+ K I L ++ + + +K+TL + TP + +Y + K +
Sbjct: 342 LTANIYALKKGLSHIELANYYSENYDTVKITLDENKTPSQNVQSYYKKYNKLKKSEEAAN 401

Query: 283 IEKRNLKEKIEFFEGLKSLLKEASSLYELE----------ILSPKNKAKQRERHIKDVSE 332
+ +E++ + + + + A + E+E + K K ++ S+
Sbjct: 402 EQLLQNEEELNYLYSVLTNINNADNYDEIEEIKKELIETGYIKFKKIYKSKKS---KTSK 458

Query: 333 NAEIFYVREFKILVGRNEKGNINL-LDLAKKDDIWLHLKDSPSAHVIIKTNKSRVPEDVL 391
I VG+N N L L A K DIW H K+ P +HVI+K +PE L
Sbjct: 459 PMHFISKDGIDIYVGKNNIQNDYLTLKFANKHDIWFHTKNIPGSHVIVKNIMD-IPESTL 517

Query: 392 EMAAKFCVEFS-VKGAGRYEVDYTKRENLKRENGAN---VTYTNYKTIIIN 438
AA +S + + VDYT+ +N+K+ NGA V Y+ +TI +
Sbjct: 518 LEAANLAAYYSKSQNSSNVPVDYTEVKNVKKPNGAKPGMVIYSTNQTIYVT 568



Score = 52.9 bits (127), Expect = 1e-09
Identities = 31/147 (21%), Positives = 64/147 (43%), Gaps = 8/147 (5%)

Query: 14 SNFTKINQAKRINDMAILIEFNGE--KIIFDLNKSNSAIYKDDEQKEAKIYQAPFDNVLK 71
K+NQ ++ +++ + I K++ + + I+ D K I F VL+
Sbjct: 22 GKIDKVNQPEK-DEIILNIRKGRLSFKLLISSSSNYPRIHLTDLTKPNPIKAPMFCMVLR 80

Query: 72 KRFNASHIKSVECLKDNRILKF-ICTQSGSYKSENFVLYLEFTGRFTNAVITD-ENNVII 129
K + + I + + +RI+ + + + L +E GR +N + +N+I+
Sbjct: 81 KYISNAKIVDIHQINQDRIVVIDFESTDELGFNSIYSLIIEIMGRHSNMTLIRKRDNIIM 140

Query: 130 EALRHID---NSYRKIETGEILKELPA 153
++++HI N+YR I G P
Sbjct: 141 DSIKHITPDINTYRSIYPGIEYVYPPK 167


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS08490	VACCYTOTOXIN	26	0.038	Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 26.1 bits (57), Expect = 0.038
Identities = 21/98 (21%), Positives = 35/98 (35%)

Query: 9 SNFINQNAPVVSQVHANQQARFDMQSLMAAELASQQSEEIKEVRPMEESYKIDPEKEHER 68
+ +N N V +A Q L QS + + P E YK P +
Sbjct: 282 TGEVNFNHLTVGDHNAAQAGIIASNKTHIGTLDLWQSAGLNIIAPPEGGYKDKPNDKPSN 341

Query: 69 QKNEEEASEFESEKQNSRDGDEENLDNTADNEEIVPSE 106
++ + QN+ + N N+A EI P++
Sbjct: 342 TTQNNAKNDKQESSQNNSNTQVINPPNSAQKTEIQPTQ 379


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS08550	60KDINNERMP	30	0.017	60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 30.3 bits (68), Expect = 0.017
Identities = 20/96 (20%), Positives = 36/96 (37%), Gaps = 14/96 (14%)

Query: 95 PLAGSLCIAIGYAVIIA--YVLKALTQALTGSFMSVDTNVWFNSFALQD-YSVLPYHFII 151
PL G + I + +A Y+L + F +W + + QD Y +LP ++
Sbjct: 419 PLGGCFPLLIQMPIFLALYYMLMGSVELRQAPFA-----LWIHDLSAQDPYYILP--ILM 471

Query: 152 VVGTLLTLFFGAKSIEKTNQ----IMMPLFFVLFSI 183
V ++ Q MP+ F +F +
Sbjct: 472 GVTMFFIQKMSPTTVTDPMQQKIMTFMPVIFTVFFL 507


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS08625	BINARYTOXINB	28	0.017	Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 28.5 bits (63), Expect = 0.017
Identities = 18/63 (28%), Positives = 28/63 (44%)

Query: 109 ITKCSACHDDYANGIIGPSLLTKSENEIYTMINAYKNKEKVNVLMRDLVKKMDDSEIRNL 168
IT+ D + I L + IYT+++ K K+N+L+RD D + I
Sbjct: 576 ITEFDFNFDQQTSQNIKNQLAELNATNIYTVLDKIKLNAKMNILIRDKRFHYDRNNIAVG 635

Query: 169 AKE 171
A E
Sbjct: 636 ADE 638


27	CCON33237_RS08935	CCON33237_RS09250	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
CCON33237_RS08935-216-3.183211TRAP ABC transporter permease
CCON33237_RS08940-218-4.807544GGDEF domain-containing protein
CCON33237_RS08945-216-4.245618GGDEF domain-containing protein
CCON33237_RS08955-215-3.752864mechanosensitive ion channel protein
CCON33237_RS08960-212-1.977454carbonic anhydrase
CCON33237_RS08965-311-0.902573membrane protein
CCON33237_RS08970-1191.442137hypothetical protein
CCON33237_RS08975-1191.719880preprotein translocase subunit SecG
CCON33237_RS08980-2202.627755polysaccharide deacetylase
CCON33237_RS08985-3233.776419ribosome-recycling factor
CCON33237_RS09015-3213.444381*****sulfite reductase
CCON33237_RS090201266.211146ribonucleotide reductase of class Ia (aerobic),
CCON33237_RS090250174.646536hypothetical protein
CCON33237_RS090301215.730196nitrilase
CCON33237_RS090350185.347276protein-L-isoaspartate O-methyltransferase
CCON33237_RS090400195.057739hypothetical protein
CCON33237_RS090450205.160061single-stranded-DNA-specific exonuclease RecJ
CCON33237_RS09050-4163.123956NADPH dehydrogenase
CCON33237_RS09055-3193.946714CTP synthase
CCON33237_RS09060-3162.407618DUF4492 domain-containing protein
CCON33237_RS09065-2183.878314cytochrome ubiquinol oxidase subunit I
CCON33237_RS09070-3132.860934cytochrome c oxidase assembly protein
CCON33237_RS09075-2132.660757aryl sulfotransferase
CCON33237_RS09080-2143.003642hypothetical protein
CCON33237_RS09085-2131.912360dihydroxy-acid dehydratase
CCON33237_RS09090-2132.870232hypothetical protein
CCON33237_RS090950161.644584hypothetical protein
CCON33237_RS091001213.726257rubrerythrin
CCON33237_RS091050225.219412hypothetical protein
CCON33237_RS091101245.080213NAD(P)H dehydrogenase
CCON33237_RS091151245.514262antibiotic biosynthesis monooxygenase
CCON33237_RS091200235.424559hypothetical protein
CCON33237_RS091251173.611622oxidoreductase
CCON33237_RS09130-3173.627062oxidoreductase
CCON33237_RS09135-310-1.034787NADH-dependent alcohol dehydrogenase
CCON33237_RS09140-3110.499306anaerobic ribonucleoside-triphosphate reductase
CCON33237_RS09145-2131.415431anaerobic ribonucleoside triphosphate reductase
CCON33237_RS09150-1131.140514hypothetical protein
CCON33237_RS09155-1131.650295ribonucleotide-diphosphate reductase subunit
CCON33237_RS09160-1120.813490adenylosuccinate lyase
CCON33237_RS091651224.749515pseudouridine synthase
CCON33237_RS09170-2193.64310823S rRNA (adenine(2503)-C(2))-methyltransferase
CCON33237_RS09175-1172.161373purine-nucleoside phosphorylase
CCON33237_RS09180-2193.092823imidazole glycerol phosphate synthase cyclase
CCON33237_RS09185-1213.499262hypothetical protein
CCON33237_RS09190-1234.14922116S rRNA
CCON33237_RS09195-3192.769445ribonuclease J
CCON33237_RS09200-1202.146499hypothetical protein
CCON33237_RS092050182.018899pseudouridine synthase
CCON33237_RS092100130.022725hypothetical protein
CCON33237_RS09215216-1.788208hypothetical protein
CCON33237_RS09360216-2.731044hypothetical protein
CCON33237_RS09225-113-1.646660ABC transporter ATP-binding protein
CCON33237_RS09230012-1.244561acyl-ACP--UDP-N-acetylglucosamine
CCON33237_RS09235113-1.195141hypothetical protein
CCON33237_RS09240213-1.361813NosL family protein
CCON33237_RS09245312-0.457525Sel1 repeat protein
CCON33237_RS09250212-0.031745NAD(FAD)-utilizing dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS08975	SECGEXPORT	41	4e-08	Protein-export SecG membrane protein signature.
		>SECGEXPORT#Protein-export SecG membrane protein signature.

Length = 110

Score = 41.5 bits (97), Expect = 4e-08
Identities = 24/101 (23%), Positives = 50/101 (49%), Gaps = 1/101 (0%)

Query: 3 LIFLILQFALAVIITIAVLLQKSSSIGLGAYSGSNESLFGAKGPAGFLAKFTFIVGILFI 62
L+ + L A+ ++ I + K + +G +G++ +LFG+ G F+ + T ++ LF
Sbjct: 5 LLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLFGSSGSGNFMTRMTALLATLFF 64

Query: 63 LNTLALGYFY-NKDLKRSIIDSVDSKSLVIPKSNDVPSAPS 102
+ +L LG NK K S +++ + + P+ P+
Sbjct: 65 IISLVLGNINSNKTNKGSEWENLSAPAKTEQTQPAAPAKPT 105


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS09055	ANTHRAXTOXNA	34	4e-04	Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 33.6 bits (76), Expect = 4e-04
Identities = 13/43 (30%), Positives = 18/43 (41%)

Query: 124 SLREVLTPFELAFKYYFHADYRDFFAFYGAEEAPGMDYVSSQD 166
+L E F LAF YYF D+R Y + M+ +
Sbjct: 233 NLTEFQHAFSLAFSYYFAPDHRTVLELYAPDMFEYMNKLEKGG 275


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS09075	TYPE3IMSPROT	29	0.039	Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 28.6 bits (64), Expect = 0.039
Identities = 25/191 (13%), Positives = 62/191 (32%), Gaps = 31/191 (16%)

Query: 209 LLINTVLFLPFFLGFLA--WVLTKDGFAYDANGVVSLMPYKYAIN--LIEMPIVGILLLV 264
L + ++ + ++ + + +S + + + P++ + L+
Sbjct: 38 LSAMLMGLSDYYFEHFSKLMLIPAEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALM 97

Query: 265 GVVLVLVGIFQGAF---TKSIR-------------GIFAYGVGV-----TLAVTALFLIT 303
+ + Q F ++I+ IF+ V L V L ++
Sbjct: 98 AI---ASHVVQYGFLISGEAIKPDIKKINPIEGAKRIFSIKSLVEFLKSILKVVLLSILI 154

Query: 304 GLNGTAFYPSFSDLS-SSLT--IKNASSSHYTLGVMAYVSLLVPVVLAYIIVVWRAIDSK 360
+ + L + L V+ V +V + Y ++ I
Sbjct: 155 WIIIKGNLVTLLQLPTCGIECITPLLGQILRQLMVICTVGFVVISIADYAFEYYQYIKEL 214

Query: 361 KITQDEIKNDH 371
K+++DEIK ++
Sbjct: 215 KMSKDEIKREY 225


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS09095	RTXTOXIND	35	7e-04	Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 35.2 bits (81), Expect = 7e-04
Identities = 31/176 (17%), Positives = 64/176 (36%), Gaps = 8/176 (4%)

Query: 69 QKARLMSNHERIDELISEIDTLKAKIAQKDEAEDAMERVINELKESIGTANERAKNNEAN 128
++ R I+ L + ++ +E+ + R+ + +KE T + E N
Sbjct: 149 EQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELN 208

Query: 129 F---TATLAELNQNKNALVEASERENRLKRDMAVLRNEIEAKENSLKEQEANLLKVKNEL 185
A + N S E D + L ++ ++++ EQE ++ NEL
Sbjct: 209 LDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNEL 268

Query: 186 NLEFSNLA---NKIFE--EKSANFTQNSQNSLDLLLKPLKEQISTFQERVNAVHDE 236
+ S L ++I E+ TQ +N + L+ + I + +
Sbjct: 269 RVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEER 324


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS09135	DHBDHDRGNASE	38	2e-05	2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 38.1 bits (88), Expect = 2e-05
Identities = 39/185 (21%), Positives = 81/185 (43%), Gaps = 7/185 (3%)

Query: 7 ITGASSGIGAAVAKAFARRGENLILIARRGELLKELKSEIAKFANVDVVIELCDLSKQEN 66
ITGA+ GIG AVA+ A +G ++ + E L+++ S + A D+
Sbjct: 13 ITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPA-DVRDSAA 71

Query: 67 ALSLWQKLEK--FELKALINNAGFGDYNKVGEQNLEKITQMINLNIISLVTLSTLFTKKY 124
+ ++E+ + L+N AG + + E+ ++N + S +K
Sbjct: 72 IDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKYM 131

Query: 125 KDKDT-QLINISSIGGYKIVPNAVTYCASKFFVSAFSEGLYHELAQDKQAKMQAKVLAPA 183
D+ + ++ + S + Y +SK F++ L ELA + ++ +++P
Sbjct: 132 MDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELA---EYNIRCNIVSPG 188

Query: 184 ATKTE 188
+T+T+
Sbjct: 189 STETD 193


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS09155	IGASERPTASE	35	0.001	IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.0 bits (80), Expect = 0.001
Identities = 29/149 (19%), Positives = 46/149 (30%), Gaps = 20/149 (13%)

Query: 493 GLYPYTARYLKHFNNHFSTIGINGMNELLRNFTNDKENISTKFGRDFAIEMVEFLRDKIR 552
R + N + ++ N L T E+ + K D+ + + DK
Sbjct: 119 NGNAKAHRDVSSEENRYFSVEKNEYPTKLNGKTVTTEDQTQKRREDYYMPRL----DKFV 174

Query: 553 TFQEETGNLYNLEATPAEGTT---YRFAKEDKKRYPDIIQAGGGENIYYTNST--QLPAN 607
T E P E +T D+ +YP ++ G G Y L N
Sbjct: 175 T-----------EVAPIEASTASSDAGTYNDQNKYPAFVRLGSGSQFIYKKGDNYSLILN 223

Query: 608 FTDDAYEALDLQDDLQTSYTGGTVFHLYM 636
+ L L D T GT + +
Sbjct: 224 NHEVGGNNLKLVGDAYTYGIAGTPYKVNH 252


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS09195	GPOSANCHOR	34	0.003	Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 34.3 bits (78), Expect = 0.003
Identities = 38/175 (21%), Positives = 67/175 (38%), Gaps = 21/175 (12%)

Query: 518 EAKEKFNLALHEVNLLFSEIRIKEEELKSLKTELADVDRRLESYNSAKQI-EELLSPLES 576
A ++ L +E E E L+ + ++ +S E LE+
Sbjct: 271 GAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEA 330

Query: 577 EFASLSDELDDKKAEFSKLTRVVSQNEALQEYLSTPEKPPFFIFQQIFKTEAFEKYNDEA 636
E L ++ +A L R + + ++ + EA + +E
Sbjct: 331 EHQKLEEQNKISEASRQSLRRDLDASREAKK-----------------QLEAEHQKLEEQ 373

Query: 637 LKVGEINRQIAEQNLKASKQNGESKEKNEARLNELNSEIAGLESRAAELGTKINL 691
K+ E +RQ ++L AS+ E+K++ E L E NS++A LE EL L
Sbjct: 374 NKISEASRQSLRRDLDASR---EAKKQVEKALEEANSKLAALEKLNKELEESKKL 425


28	CCON33237_RS00490	CCON33237_RS00525	N	Y	N	Pathogenicity Island (unbiased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
CCON33237_RS004907211.123597hypothetical protein
CCON33237_RS004954151.660311hypothetical protein
CCON33237_RS00500-3100.799107aryl-sulfate sulfotransferase
CCON33237_RS00505-4110.687843carboxymuconolactone decarboxylase
CCON33237_RS00510-212-2.244352thiol:disulfide interchange protein
CCON33237_RS00515-213-1.667908disulfide oxidoreductase
CCON33237_RS00520-214-2.357774histidine kinase
CCON33237_RS00525-115-3.435772transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS00490	RTXTOXINA	45	1e-06	Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 45.3 bits (107), Expect = 1e-06
Identities = 73/347 (21%), Positives = 126/347 (36%), Gaps = 81/347 (23%)

Query: 295 AKIESVYGGVGNDNVNLQNGAETVEIFGGSGNDGITVNNSTVSGRAIKNDEGRIIRYEAI 354
+IES + G G+D V L A + I+ G G+D + +
Sbjct: 610 IRIES-HLGDGDDKVFL--SAGSANIYAGKGHDVVYYD---------------------- 644

Query: 355 KGGDGVDTISIQNESKVRGNIYGNAGQDNIS--LEGGAKVYGAIYGDTKTMNDIYEKQV- 411
K G TI ++ AG ++ L G KV + + + ++
Sbjct: 645 KTDTGYLTIDGTKATE--------AGNYTVTRVLGGDVKVLQEVVKEQEVSVGKRTEKTQ 696

Query: 412 ----------NPALDIVDSGADIDTITLSGAGTSAKHIFAGDGDDKIVVKDGAKVTGEVR 461
L D+ ++ + GT+ F G I GA +
Sbjct: 697 YRSYEFTHINGKNLTETDNLYSVEELI----GTTRADKFFGSKFTDIF--HGADGDDLIE 750

Query: 462 GDEGDDKITVSDNQTYVSAINGRGGDDTILVENGANVDGIVGRWGNDSITVTGKGTTVKE 521
G++G+D++ G G+DT+ G D + G GND + + G
Sbjct: 751 GNDGNDRLY------------GDKGNDTL--SGGNGDDQLYGGDGNDKL-IGVAG---NN 792

Query: 522 YIHGDDGNDTIKVLDGAVVKGDIEGNAGNDIIEIDGGAN--KEGFVGGKIISGDGDDTVT 579
Y++G DG+D +V ++ K + G GND + GA+ G + G G+D
Sbjct: 793 YLNGGDGDDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDDLLKGGYGNDIYR 852

Query: 580 IRNMNLNKVDNDYNKNVKVDTGTGNDTVNLYNVTVNDTNIWTKEGND 626
+ Y ++ D G D ++L ++ D + +EGND
Sbjct: 853 YL--------SGYGHHIIDDDGGKEDKLSLADIDFRDV-AFKREGND 890



Score = 37.6 bits (87), Expect = 3e-04
Identities = 56/249 (22%), Positives = 92/249 (36%), Gaps = 48/249 (19%)

Query: 417 IVDSGADIDTITLSGAGTSAKHIFAGDGDDKI---------VVKDGAK--------VTGE 459
I D D AG++ +I+AG G D + + DG K VT
Sbjct: 612 IESHLGDGDDKVFLSAGSA--NIYAGKGHDVVYYDKTDTGYLTIDGTKATEAGNYTVTRV 669

Query: 460 VRGDEGDDKITVSDNQTYV-----------SAINGRGGDDTILVENGANVDGIVGRWGND 508
+ GD + V + + V G + +N +V+ ++G D
Sbjct: 670 LGGDVKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVEELIGTTRAD 729

Query: 509 SITVTGKGTTVKEYIHGDDGNDTIKVLDGAVV----KGD--IEGNAGNDIIEIDGGANKE 562
G+ + HG DG+D I+ DG KG+ + G G+D ++ GG +
Sbjct: 730 KF----FGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDD--QLYGGDGND 783

Query: 563 GFVGGK----IISGDGDDTVTIRNMNLNKVDNDYNKNVKVDTGTGNDTVNLYNVTVNDTN 618
+G + GDGDD ++ +L K N D G++ +L + D
Sbjct: 784 KLIGVAGNNYLNGGDGDDEFQVQGNSLAK--NVLFGGKGNDKLYGSEGADLLDGGEGDDL 841

Query: 619 IWTKEGNDT 627
+ GND
Sbjct: 842 LKGGYGNDI 850


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS00495	RTXTOXINA	38	4e-04	Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 37.6 bits (87), Expect = 4e-04
Identities = 56/274 (20%), Positives = 92/274 (33%), Gaps = 29/274 (10%)

Query: 904 INTGADKDHVVINNSALTHAVVDTDKGDDDVDIEEGTTFNYNIINTSE---GNDTIT--I 958
+ D D V ++ + + KG D V ++ T I T GN T+T +
Sbjct: 613 ESHLGDGDDKVFLSAGSAN--IYAGKGHDVVYYDKTDTGYLTIDGTKATEAGNYTVTRVL 670

Query: 959 NRDPGDARITFQSSALNTGDGNDTVKVTNTTFLKLADGNPSAIDTGN------GDDTITI 1012
D + + ++ G + + + F + N + D G
Sbjct: 671 GGDVKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVEELIGTTRADK 730

Query: 1013 KDGTIFQDFSYIAAGNGVDTITLESGAKFNQANVYADAGDDIIN--------VNGAEFKG 1064
G+ F D + +G D I G +Y D G+D ++ G
Sbjct: 731 FFGSKFTDIFH--GADGDDLIEGNDG----NDRLYGDKGNDTLSGGNGDDQLYGGDGNDK 784

Query: 1065 ENAYNHNAGVHGGAGDDKIFVNSGKFDNAKVEGDAGNDTIHIKSGARFENASIYGDSIDG 1124
N ++GG GDD+ V + G GND ++ GA + D + G
Sbjct: 785 LIGVAGNNYLNGGDGDDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDDLLKG 844

Query: 1125 LTTGNDTIIVEKGATLINTTINGGAGYDTLKIAD 1158
GND G + + G D L +AD
Sbjct: 845 -GYGNDIYRYLSGYGH-HIIDDDGGKEDKLSLAD 876



Score = 30.7 bits (69), Expect = 0.045
Identities = 26/97 (26%), Positives = 45/97 (46%), Gaps = 13/97 (13%)

Query: 528 GGGNDEITAGDNLTLSNHSGIHGDWGNNGSAGTGSDDGDDIIKIGK-NLTMSGG---SVI 583
G+D I D + ++GD GN+ +G +GDD + G N + G + +
Sbjct: 743 ADGDDLIEGNDG-----NDRLYGDKGNDTLSG---GNGDDQLYGGDGNDKLIGVAGNNYL 794

Query: 584 SGGGGNDEITIGDKANISNSTIDSGAGKDEIYMTNGT 620
+GG G+DE + + N + G G D++Y + G
Sbjct: 795 NGGDGDDEFQVQGNSLAKN-VLFGGKGNDKLYGSEGA 830


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS00515	YERSSTKINASE	28	0.030	Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 28.2 bits (62), Expect = 0.030
Identities = 28/130 (21%), Positives = 57/130 (43%), Gaps = 15/130 (11%)

Query: 61 KSVTPKVVEKIHGLKYEPFHLKTKG----DYGEVASKVFAVLIVMDEAKGVGLFDENSL- 115
+ +TPK + ++ L HL + D G V S + +L+ +D+A+ G D++ L
Sbjct: 438 RRITPKKLRELSDLLRT--HLSSAATKQLDMGGVLSDLDTMLVALDKAEREGGVDKDQLK 495

Query: 116 -FKKAKFAYYKAYHDK-KERWSDGKDAEGFLKTGLEAAGVSKADYEKELANPKVTELLKK 173
F Y+ D K R D K+ + E + ++++ + P + + K
Sbjct: 496 SFNSLILKTYRVIEDYVKGREGDTKN------SSTEVSPYHRSNFMLSIVEPSLQRIQKH 549

Query: 174 WDESYDVAKI 183
D+++ + I
Sbjct: 550 LDQTHSFSDI 559


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS00530	HTHFIS	80	1e-19	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.9 bits (197), Expect = 1e-19
Identities = 28/112 (25%), Positives = 54/112 (48%)

Query: 10 KTSVLVVEDDDMARELIISGLKPYCEQVIGACNGQDGVEKFKKQGFDIVMSDIHMPVLNG 69
++LV +DD R ++ L V N D+V++D+ MP N
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 70 FEMMNEMKRTKPHQKFIVFTSYDSDENLIKSMEEGAMLFLKKPIDMKDLRAM 121
F+++ +K+ +P +V ++ ++ IK+ E+GA +L KP D+ +L +
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGI 114


29	CCON33237_RS00600	CCON33237_RS00630	N	Y	N	Pathogenicity Island (unbiased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
CCON33237_RS00600829-4.662737GDP-mannose 4,6-dehydratase
CCON33237_RS006051034-6.067076GDP-4-keto-6-deoxy-D-mannose-3,
CCON33237_RS006101137-7.051075glucose-1-phosphate cytidylyltransferase
CCON33237_RS006151239-8.755708CDP-glucose 4,6-dehydratase
CCON33237_RS006201238-9.959233hypothetical protein
CCON33237_RS006251238-9.947281lipopolysaccharide biosynthesis protein RfbH
CCON33237_RS006301140-11.297477CDP-paratose 2-epimerase
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS00600	NUCEPIMERASE	87	3e-21	Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 86.8 bits (215), Expect = 3e-21
Identities = 45/181 (24%), Positives = 74/181 (40%), Gaps = 22/181 (12%)

Query: 7 LITGITGQDGSYLAEFLLKKGYIVHGIKRRTSLFNTDRIDHLYQDPH--------VDNRN 58
L+TG G G ++++ LL+ G+ V GI N Y D +
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDN----LND------YYDVSLKQARLELLAQPG 53

Query: 59 FFLHYGDMTDSMNLTRIIQEVQPDEIYNLAAMSHVQVSFETPEYVANADGTGTLRLLEAI 118
F H D+ D +T + + ++ V+ S E P A+++ TG L +LE
Sbjct: 54 FQFHKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGC 113

Query: 119 RILGLEKKTKIYQASTSELYGKVQETPQSETTPF-YPRSPYAVAKMYAYWITVNYREAYG 177
R ++ + AS+S +YG ++ P S +P S YA K + Y YG
Sbjct: 114 RHNKIQ---HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYG 170

Query: 178 I 178
+
Sbjct: 171 L 171


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS00605	NUCEPIMERASE	81	3e-19	Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 80.6 bits (199), Expect = 3e-19
Identities = 60/371 (16%), Positives = 129/371 (34%), Gaps = 73/371 (19%)

Query: 6 KIYLAGHKGLVGSAIVRNLKSKGYENI----ITRTHS-------------------ELDL 42
K + G G +G + + L G++ + + + ++DL
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 43 MDQKAVCEFFEKEKPEYVVLAAAKVGGIVANSTYRADFIYENLQIQNNVIHQSYLHKVKK 102
D++ + + F E V ++ ++ + A + NL N++ +K++
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHA-YADSNLTGFLNILEGCRHNKIQH 120

Query: 103 LLFLGSTCIYPKNAPQPMSEDVLLTSPLEYTNEPYAIAKIAGMKMCESYNLQYGTNFISV 162
LL+ S+ +Y N P S D + P+ YA K A M +Y+ YG +
Sbjct: 121 LLYASSSSVYGLNRKMPFSTDDSVDHPVS----LYAATKKANELMAHTYSHLYGLPATGL 176

Query: 163 MPTNLYGPNDNFDLETSHVLPALIRKIHLAKLLSEEKFGEVVKDLKVRDINEAMVYLDKF 222
+YGP D+ L + + K
Sbjct: 177 RFFTVYGPWGRPDM----ALFKFTKAMLEGK----------------------------- 203

Query: 223 GISKDRVEIWGTGEPRREFLHSEDMADACVFLLENRDFKDIYDKNSKEIRNTH------I 276
++++ G+ +R+F + +D+A+A + L + D
Sbjct: 204 -----SIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVY 258

Query: 277 NIGTGKDISINELANFVKNIIGFKGELYFNDNKPDGTMLKLTDPSKLHS-LGWKHKVELE 335
NIG + + + +++ +G + + +P + D L+ +G+ + ++
Sbjct: 259 NIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVK 318

Query: 336 YGIKTLYEWYL 346
G+K WY
Sbjct: 319 DGVKNFVNWYR 329


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS00615	NUCEPIMERASE	106	4e-28	Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 106 bits (265), Expect = 4e-28
Identities = 66/357 (18%), Positives = 119/357 (33%), Gaps = 56/357 (15%)

Query: 13 TVLVTGHTGFKGSWLVYWLSQMGAKVIGY-ALEAPTSPN-----HIGLLNPDIISVTGDI 66
LVTG GF G + L + G +V+G L + L P D+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 67 RDLDGLNQIFSKYKPDIVFHLAAQPIVRISYEKPIETYETNVIGTLKVFEACRKNNAKAI 126
D +G+ +F+ + VF + VR S E P ++N+ G L + E CR N + +
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 127 VNITSDKAYENKEWVWGYRENDPMGGYD-------PYSSSKGCADLLTSSYRNSYFNTKD 179
+ Y + V+G P D Y+++K +L+ +Y + Y
Sbjct: 122 L-------YASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLY----- 169

Query: 180 YKKTHNTLLVTCRAGNVIGGG---DWAKDRLITDVMVSASQGKKISIRNP-NATRPWQHV 235
R V G D A + ++ +GK I + N R + ++
Sbjct: 170 -----GLPATGLRFFTVYGPWGRPDMALFKFTKAML----EGKSIDVYNYGKMKRDFTYI 220

Query: 236 LECISGYLHIGQKLLEERVEYSGAWNFGPSNDRSICVEEVIKNIKKH----WNKIDYEIS 291
+ + +L + W + + NI +
Sbjct: 221 DDIAEAII----RLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALE 276

Query: 292 QDLGQPHEANLLKL----------DCSKAYALLKWKDVWGSETTFEKTAKWYKAYYE 338
LG + N+L L D Y ++ + + + WY+ +Y+
Sbjct: 277 DALGIEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYK 333


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS00620	NUCEPIMERASE	75	1e-17	Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 75.2 bits (185), Expect = 1e-17
Identities = 61/338 (18%), Positives = 128/338 (37%), Gaps = 70/338 (20%)

Query: 1 MKILLAGSTGFLGRCLLESFIKNGHEVIALKRSTS--NTNIIDKNL-----NDIKFYNVD 53
MK L+ G+ GF+G + + ++ GH+V+ + + ++ L +F+ +D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 54 ---EVALRDIFKNNKVDIVVNTVTNYG-----KNNNSVSEIVMTNLMFGLELLE------ 99
+ D+F + + V + +N ++ ++ +NL L +LE
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYAD---SNLTGFLNILEGCRHNK 117

Query: 100 ---------NSI--NNTKAFINTDTLLYRNINAYAFSK--AQLVDWMIYLSNKNTKI--I 144
+S+ N K +TD + ++ YA +K +L M + + +
Sbjct: 118 IQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANEL---MAHTYSHLYGLPAT 174

Query: 145 NVRLEHMYGPFGGHNNFICWLVGQLRQNVQKVKLTSGLQKRDFIYIDDVVSAYEVIIKNI 204
+R +YGP+G + + + + G KRDF YIDD+ A + I
Sbjct: 175 GLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVI 234

Query: 205 SKFQ---------------DYEEFELGTGNSIEVKVFIEKVYKEILKQQNINTKLLFGAI 249
Y + +G + +E+ +I+ + + + N
Sbjct: 235 PHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKN-------- 286

Query: 250 AYRENENMDMK---ANIQKLVS-FGWKPEVSIESGIKK 283
+ D+ A+ + L G+ PE +++ G+K
Sbjct: 287 -MLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKN 323


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS00630	NUCEPIMERASE	178	1e-55	Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 178 bits (453), Expect = 1e-55
Identities = 81/359 (22%), Positives = 145/359 (40%), Gaps = 52/359 (14%)

Query: 1 MRYLITGGCGFLGTNIASRILEQGDELIIFDSLYRYGSYQNK----EWLGTKGKFVFVYG 56
M+YL+TG GF+G +++ R+LE G +++ D+L Y K E L G F F
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPG-FQFHKI 59

Query: 57 DIRNINDIEQTIKIYKPDVIFHLAGQVAMTTSIENPRMDFEINAVGSFNLINAVRLYSPD 116
D+ + + + +F ++A+ S+ENP + N G N++ R
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ 119

Query: 117 STVIYSSTNKVYGNLNQYKYKETDTRYECIDRPNGFNEEVALDFHSPYGVSKGSADQYML 176
++Y+S++ VYG + + D+ +D P S Y +K + +
Sbjct: 120 -HLLYASSSSVYGLNRKMPFSTDDS----VDHP-----------VSLYAATKKANELMAH 163

Query: 177 DFTRIYGVKTAVFRHSSMFG--GRQFATYDQGWIGWFVQKAVEIKINTLKEPFTISGNGK 234
++ +YG+ R +++G GR + F + +E + + GK
Sbjct: 164 TYSHLYGLPATGLRFFTVYGPWGRPDMALFK-----FTKAMLE------GKSIDVYNYGK 212

Query: 235 QVRDLLYASDCVDLYLMASHKIDEIKGQ----------------VFNIGGGIQNSYSLLE 278
RD Y D + + I Q V+NIG + L++
Sbjct: 213 MKRDFTYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNS--SPVELMD 270

Query: 279 LFSFLEQELNIKMKYKQLPPRESDQKIFVADINKVKKLIGWEPKISKEEGVKKMIEWVL 337
LE L I+ K LP + D AD + ++IG+ P+ + ++GVK + W
Sbjct: 271 YIQALEDALGIEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYR 329


30	CCON33237_RS01435	CCON33237_RS01465	N	Y	N	Pathogenicity Island (unbiased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
CCON33237_RS01435-2162.3534425-methyltetrahydropteroyltriglutamate--
CCON33237_RS01440-1122.669978ABC transporter ATP-binding protein
CCON33237_RS01445-1122.567710diacylglucosamine hydrolase like protein
CCON33237_RS014500144.059843beta-hydroxyacyl-ACP dehydratase
CCON33237_RS01455-1123.292269acyl-[acyl-carrier-protein]--UDP-N-
CCON33237_RS01460-1112.476355ATP-dependent protease ATP-binding subunit ClpX
CCON33237_RS01465-1112.483087rod shape-determining protein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS01435	HTHFIS	28	0.039	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 27.9 bits (62), Expect = 0.039
Identities = 10/16 (62%), Positives = 13/16 (81%)

Query: 32 ITGASGSGKSLFAKSL 47
ITG SG+GK L A++L
Sbjct: 165 ITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS01440	HTHFIS	28	0.028	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 28.3 bits (63), Expect = 0.028
Identities = 10/31 (32%), Positives = 16/31 (51%), Gaps = 1/31 (3%)

Query: 43 ILGQSGSGKSTLANLISFSEPKSGGK-IYIN 72
I G+SG+GK +A + + G + IN
Sbjct: 165 ITGESGTGKELVARALHDYGKRRNGPFVAIN 195


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
CCON33237_RS01460	HTHFIS	34	0.001	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 33.7 bits (77), Expect = 0.001
Identities = 25/105 (23%), Positives = 47/105 (44%), Gaps = 16/105 (15%)

Query: 107 SKSNILLVGPTGSGKTLMAQTL---ARFLDVP-IAI-CDA--TSLTEAGYVGEDVENILT 159
+ +++ G +G+GK L+A+ L + + P +AI A L E+ G + T
Sbjct: 159 TDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGH-EKGAFT 217

Query: 160 RLLQAANGDVKKAEQGIVFVDEID--------KIARMSENRSITR 196
+ G ++AE G +F+DEI ++ R+ + T
Sbjct: 218 GAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTT 262


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS01465  SHAPEPROTEIN  459    e-165    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 459 bits (1183), Expect = e-165
Identities = 184/338 (54%), Positives = 245/338 (72%), Gaps = 2/338 (0%)

Query: 3 LDQVIGFFSSDMGIDLGTANTLVLVKDKGIIINEPSVVAVRREKYGKQK-ILAVGHAAKE 61
L + G FS+D+ IDLGTANTL+ VK +GI++NEPSVVA+R+++ G K + AVGH AK+
Sbjct: 2 LKKFRGMFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKSVAAVGHDAKQ 61

Query: 62 MVGKTPGDIEAIRPMRDGVIADFDMTERMIRYFIEKTHKRKSF-LRPRIIISVPYGLTQV 120
M+G+TPG+I AIRPM+DGVIADF +TE+M+++FI++ H PR+++ VP G TQV
Sbjct: 62 MLGRTPGNIAAIRPMKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQV 121

Query: 121 ERKAVRESALSAGAREVFLIEEPMAAAIGANLPVREPQGNLVVDIGGGTTEIGVVSLGGL 180
ER+A+RESA AGAREVFLIEEPMAAAIGA LPV E G++VVDIGGGTTE+ V+SL G+
Sbjct: 122 ERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGV 181

Query: 181 VISKSIRTAGDKIDISIVNYIKEKYNLLIGERTGEEIKIAVGSAVQLEKELSVVVKGRDQ 240
V S S+R GD+ D +I+NY++ Y LIGE T E IK +GSA ++ + V+GR+
Sbjct: 182 VYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNL 241

Query: 241 VSGLLSRVELTSEDVREAMREPLKEIADALKTVLEMMPPDLAGDIVETGIVLTGGGALIR 300
G+ L S ++ EA++EPL I A+ LE PP+LA DI E G+VLTGGGAL+R
Sbjct: 242 AEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLR 301

Query: 301 GLDKFLSDIVKLPVFVADEPLLAVARGTGKALEEIGLL 338
LD+ L + +PV VA++PL VARG GKALE I +
Sbjct: 302 NLDRLLMEETGIPVVVAEDPLTCVARGGGKALEMIDMH 339


31  CCON33237_RS03400  CCON33237_RS03455  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag           DNBias  CDNBias  %GCBias    Product
CCON33237_RS03400  1       23       -5.382720  ABC transporter
CCON33237_RS03405  4       26       -4.043176  HlyD family type I secretion periplasmic adaptor
CCON33237_RS03410  6       30       -5.202367  hypothetical protein
CCON33237_RS03415  8       36       -6.197911  helix-turn-helix transcriptional regulator
CCON33237_RS03420  8       37       -4.491319  hypothetical protein
CCON33237_RS03425  10      42       -4.126725  ****elongation factor Tu
CCON33237_RS03450  9       43       -3.276056  50S ribosomal protein L33
CCON33237_RS03455  5       42       -5.890538  *preprotein translocase subunit SecE
ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS03405  SECFTRNLCASE  31     0.018    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 31.0 bits (70), Expect = 0.018
Identities = 12/62 (19%), Positives = 27/62 (43%), Gaps = 3/62 (4%)

Query: 375 EIANRSIKSKIITTSITTVTSFLVQLNTIAIIVLGVYMIQDTHLTMGGLIAAVMLSSRAI 434
++ N S+ + T +T +T+ L + +++ G +I+ M + SS +
Sbjct: 244 DVMNLSVNETLSRTVMTGMTTLLA---LVPMLIWGGDVIRGFVFAMVWGVFTGTYSSVYV 300

Query: 435 AP 436
A
Sbjct: 301 AK 302


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits       Score  E-value  Comments
CCON33237_RS03410  RTXTOXIND  300    5e-99    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 300 bits (769), Expect = 5e-99
Identities = 96/462 (20%), Positives = 208/462 (45%), Gaps = 9/462 (1%)

Query: 35 ILNSVDDIKSNLQTKNYDAYDLKFMSSLSEAVLAKAPSTSKKILYTVAITMFWLLVWASW 94
+ + I+ L T + + +F+ + E + + + Y + + + +
Sbjct: 18 VWSETWKIRKQLDTPVREKDENEFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVL 77

Query: 95 AQIDEITRGSGKIIPSGKNQAIQNLEGGIVDQIFVKEGDEVKKDQILIRLDNKNFTSSYG 154
Q++ + +GK+ SG+++ I+ +E IV +I VKEG+ V+K +L++L +
Sbjct: 78 GQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTL 137

Query: 155 ESKLRLDELQAKFMRLDAEANDKDFDYDEARDANNSKAIR--------YELSLHSSNIDH 206
+++ L + + + R + + + + + SL
Sbjct: 138 KTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFST 197

Query: 207 LNEQIGILTEQIHQRQSELVELKNKISQTQNSYNLVLKEKAIMEPIFKKGLVSEVEYIQL 266
Q + ++++E + + +I++ +N + + K +++ ++
Sbjct: 198 WQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQ 257

Query: 267 QRRVNDLRGELDAAVLAVPRVESTIKEAKNKIEEAKLAFKNNAKKELNEVSAEIARINES 326
+ + + EL + ++ES I AK + + FKN +L + + I +
Sbjct: 258 ENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLE 317

Query: 327 QISLSDRVERTYVRSPVNGIVSKMMVHTVSGVIKPGENIAEIVPLEDKLVAEVKVKPADV 386
+R + + +R+PV+ V ++ VHT GV+ E + IVP +D L V+ D+
Sbjct: 318 LAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDI 377

Query: 387 AFLRPGLDTMVKFTAYDFSIYGGLKGKVTQISADTETNEKTGESYYLVRIETEKNYLGSE 446
F+ G + ++K A+ ++ YG L GKV I+ D +++ G + V I E+N L +
Sbjct: 378 GFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFN-VIISIEENCLSTG 436

Query: 447 EKPLRIKVGMIVSADIITGKKTILDYLLKPILKAKQNALTER 488
K + + GM V+A+I TG ++++ YLL P+ ++ +L ER
Sbjct: 437 NKNIPLSSGMAVTAEIKTGMRSVISYLLSPLEESVTESLRER 478


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits       Score  E-value  Comments
CCON33237_RS03450  TCRTETOQM  93     2e-22    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 93.0 bits (231), Expect = 2e-22
Identities = 79/313 (25%), Positives = 135/313 (43%), Gaps = 51/313 (16%)

Query: 13 VNIGTIGHVDHGKTTLTAAI---SAVLSRKGLAELKDYDNIDNAPEEKERGITIATSHIE 69
+NIG + HVD GKTTLT ++ S ++ G + DN E++RGITI T
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGT-TRTDNTLLERQRGITIQTGITS 62

Query: 70 YETEKRHYAHVDCPGHADYVKNMITGAAQMDGAILVVSAADGPMPQTREHILLSRQVGVP 129
++ E +D PGH D++ + + +DGAIL++SA DG QTR R++G+P
Sbjct: 63 FQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIP 122

Query: 130 YIVVFMNKAD----------------MVDDAELLELVEMEIRELLNEYNFP--------G 165
I F+NK D + + + + VE+ + + G
Sbjct: 123 TI-FFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 166 DD---TPIVSGSALKALE----EAKAGQDGEW------SAK----IMELMDAVDSYIPTP 208
+D +SG +L+ALE E+ + SAK I L++ + + +
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSS 241

Query: 209 VRATDKDLLMPIEDVFSISGRGTVVTGRIEKGVIKVGDTIEIVG---IKPTQTTTVTGVE 265
+L + + R + R+ GV+ + D++ I IK T+ T E
Sbjct: 242 THRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKIKITEMYTSINGE 301

Query: 266 MFRKEMDQGEAGD 278
++D+ +G+
Sbjct: 302 --LCKIDKAYSGE 312


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS03465  SECETRNLCASE  29     3e-04    Bacterial translocase SecE signature.
		>SECETRNLCASE#Bacterial translocase SecE signature.

Length = 127

Score = 29.5 bits (66), Expect = 3e-04
Identities = 15/49 (30%), Positives = 29/49 (59%)

Query: 3 KIINYIKLSKLEIMKVIYPTKEQIRNAFFAVFIVVAVVSLFLALVDVIM 51
+ + + ++ E+ KVI+PT+++ + V V AV+SL L +D I+
Sbjct: 67 ATVAFAREARTEVRKVIWPTRQETLHTTLIVAAVTAVMSLILWGLDGIL 115


32  CCON33237_RS03955  CCON33237_RS03985  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag           DNBias  CDNBias  %GCBias    Product
CCON33237_RS03955  -2      9        0.313859   preprotein translocase subunit SecA
CCON33237_RS03960  -2      9        0.042171   hypothetical protein
CCON33237_RS03965  -1      12       -1.566553  potassium transporter
CCON33237_RS03970  -1      12       -2.377019  MerR family transcriptional regulator
CCON33237_RS03975  -1      15       -3.037715  DnaJ family protein
CCON33237_RS03980  -3      13       -2.135402  serine protease
CCON33237_RS03985  -3      10       -0.601125  DNA-binding response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag           Hits  Score  E-value  Comments
CCON33237_RS03960  SECA  1117   0.0      SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 1117 bits (2891), Expect = 0.0
Identities = 456/908 (50%), Positives = 598/908 (65%), Gaps = 58/908 (6%)

Query: 1 MISSVFRKIFGTKNDREVKKYIRRVAQINALEPTYEKMSDDELKIKFNELKAQVVEEKVT 60
M+ + K+FG++NDR +++ + V INA+EP EK+SD+ELK K E +A++ E+
Sbjct: 1 MLIKLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARL-EKGEV 59

Query: 61 LDEILNDVFAIVREASKRVLKMRHFDVQLIGGMVLNEGRIAEMKTGEGKTLVATLPVILN 120
L+ ++ + FA+VREASKRV MRHFDVQL+GGMVLNE IAEM+TGEGKTL ATLP LN
Sbjct: 60 LENLIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLN 119

Query: 121 AMSGKGVHVVTVNDYLAKRDATQMGELYNFLGLSVDVILSGGYDDNVRQAAYNADITYGT 180
A++GKGVHVVTVNDYLA+RDA L+ FLGL+V + L G ++ AY ADITYGT
Sbjct: 120 ALTGKGVHVVTVNDYLAQRDAENNRPLFEFLGLTVGINLPGMPAPA-KREAYAADITYGT 178

Query: 181 NSEFGFDYLRDNMKFEAGQKVQRGHNFVIVDEVDSILIDEARTPLIISGPTNRTLDGYIR 240
N+E+GFDYLRDNM F ++VQR ++ +VDEVDSILIDEARTPLIISGP + + Y R
Sbjct: 179 NNEYGFDYLRDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYKR 238

Query: 241 ADQVAKQLTRGTPADPNVPGSKPTGDFIVDEKNRTIMITEAGISKAEKLF-------GVE 293
+++ L R D + G F VDEK+R + +TE G+ E+L E
Sbjct: 239 VNKIIPHLIRQEKEDSE--TFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGE 296

Query: 294 NLYNLENAILSHHLDQALKAHNLFEKDVHYVVKDGEVVIVDEFTGRLSEGRRFSEGLHQA 353
+LY+ N +L HH+ AL+AH LF +DV Y+VKDGEV+IVDE TGR +GRR+S+GLHQA
Sbjct: 297 SLYSPANIMLMHHVTAALRAHALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQA 356

Query: 354 LEAKEGVKIQEESQTLADTTYQNYFRMYKKLAGMTGTAQTEATEFSQIYNLDVISIPTNV 413
+EAKEGV+IQ E+QTLA T+QNYFR+Y+KLAGMTGTA TEA EFS IY LD + +PTN
Sbjct: 357 VEAKEGVQIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNR 416

Query: 414 PVIRIDSNDLIYKTQNEKFKAVIDEIKKAHEKGQPVLVGTASIERSEVLHEMLKKVGIPH 473
P+IR D DL+Y T+ EK +A+I++IK+ KGQPVLVGT SIE+SE++ L K GI H
Sbjct: 417 PMIRKDLPDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKH 476

Query: 474 SVLNAKNHEKEAEIIAQAGVKGAVTIATNMAGRGVDIRI--------------------- 512
+VLNAK H EA I+AQAG AVTIATNMAGRG DI +
Sbjct: 477 NVLNAKFHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEK 536

Query: 513 --------DDEVRNLGGLYIIGTERHESRRIDNQLRGRAGRQGDPGMSRFYLSLEDNLLR 564
D V GGL+IIGTERHESRRIDNQLRGR+GRQGD G SRFYLS+ED L+R
Sbjct: 537 IKADWQVRHDAVLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMR 596

Query: 565 IFGSDRIKAIMDRLGIDEGESIESRMVTRAVENAQKKVESLHFEARKHLLEYDDVANEQR 624
IF SDR+ +M +LG+ GE+IE VT+A+ NAQ+KVES +F+ RK LLEYDDVAN+QR
Sbjct: 597 IFASDRVSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVANDQR 656

Query: 625 KTIYKYRDELLDKNYDMSEKIAQNRKEYAANLLDTAEIFHGGLKDDFDIKNLCSIILADC 684
+ IY R+ELLD + D+SE I R++ +D A I L++ +DI L + D
Sbjct: 657 RAIYSQRNELLDVS-DVSETINSIREDVFKATID-AYIPPQSLEEMWDIPGLQERLKNDF 714

Query: 685 GEEIDESELKGLEYD----ELIEKLAQIFEARYNEKMSVLNEDQRKEIEKILYLQVLDNA 740
++ +E E + L E++ Y K V+ + + EK + LQ LD+
Sbjct: 715 DLDLPIAEWLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSL 774

Query: 741 WREHLYQMDILKTGIGLRGYNQKDPLVEYKKESYNLFMELVSRLKSESVKTLQVVRFKSR 800
W+EHL MD L+ GI LRGY QKDP EYK+ES+++F ++ LK E + TL V+ +
Sbjct: 775 WKEHLAAMDYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSKVQVRMP 834

Query: 801 EEQEEQARMM---------LEASQNAENENLSYNNQGEDENFTPEKKIPRNAPCPCGSGK 851
EE EE + ++ + ++++ + T E+K+ RN PCPCGSGK
Sbjct: 835 EEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQ---TGERKVGRNDPCPCGSGK 891

Query: 852 KYKDCHGK 859
KYK CHG+
Sbjct: 892 KYKQCHGR 899


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS03970  ABC2TRNSPORT  29     0.038    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 29.1 bits (65), Expect = 0.038
Identities = 20/74 (27%), Positives = 31/74 (41%), Gaps = 6/74 (8%)

Query: 95 GFIMGVMLYYVLRLKDETALIAGLALALSSTAIVLKTLNDSGDVSKIYGRKALGILLFQD 154
G+ + L Y L +IA LA +S +V+ L S D Y + +LF
Sbjct: 140 GYTQWLSLLYAL------PVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLS 193

Query: 155 IAVIPILLMIDMFS 168
AV P+ + +F
Sbjct: 194 GAVFPVDQLPIVFQ 207


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits        Score  E-value  Comments
CCON33237_RS03985  V8PROTEASE  62     8e-13    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 62.3 bits (151), Expect = 8e-13
Identities = 30/163 (18%), Positives = 57/163 (34%), Gaps = 26/163 (15%)

Query: 94 KTTSLGSGVIISNDGYIVTNNHVIEDSDQIVVTLA-----------NGGKEYKAKLIGSD 142
T + SGV++ ++TN HV++ + L G ++
Sbjct: 99 TGTFIASGVVV-GKDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYS 157

Query: 143 PKTDLAVVKIEA--------NGLNAITFADSSKLLDADVVFAIGNPFGVGESITQGIVSG 194
+ DLA+VK + T +++++ + G P + T G
Sbjct: 158 GEGDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKPVA-TMWESKG 216

Query: 195 LNKDNIGLNQYENFIQTDASINPGNSGGALVDSRGYLVGINSA 237
G +Q D S GNSG + + + ++GI+
Sbjct: 217 KITYLKG-----EAMQYDLSTTGGNSGSPVFNEKNEVIGIHWG 254


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits    Score  E-value  Comments
CCON33237_RS03990  HTHFIS  96     5e-25    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 95.7 bits (238), Expect = 5e-25
Identities = 34/137 (24%), Positives = 63/137 (45%), Gaps = 3/137 (2%)

Query: 2 TRILMIEDDMELAEILTEYLENYDIEVITAEEPYIGLSTLNTSKFDLVILDLTLPGMDGL 61
IL+ +DD + +L + L +V + DLV+ D+ +P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 EVCKEIRK-NHNIPIIISSARHDITDKVNALDNGADDYLPKPYDPQELLARIKSHL--RR 118
++ I+K ++P+++ SA++ + A + GA DYLPKP+D EL+ I L +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 119 QNVTPLNESKNLNKDLV 135
+ + L + LV
Sbjct: 124 RRPSKLEDDSQDGMPLV 140


33  CCON33237_RS04105  CCON33237_RS04150  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag           DNBias  CDNBias  %GCBias    Product
CCON33237_RS04105  -1      16       1.589913   two-component sensor histidine kinase
CCON33237_RS04115  -2      12       -3.306826  DNA-binding response regulator
CCON33237_RS04120  -1      14       -4.986145  pyruvate kinase
CCON33237_RS04125  -2      14       -4.297627  hypothetical protein
CCON33237_RS04130  -2      12       -3.169456  hypothetical protein
CCON33237_RS04135  -2      14       -4.409056  *DNA-binding protein
CCON33237_RS04145  -2      16       -3.730608  hypothetical protein
CCON33237_RS04150  -2      17       -4.293736  flagellin biosynthesis protein FlgL
ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS04115  BINARYTOXINB  31     0.010    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 30.8 bits (69), Expect = 0.010
Identities = 19/113 (16%), Positives = 45/113 (39%), Gaps = 10/113 (8%)

Query: 31 YKNKKETLILNEVKSLKEIKMGIYMKARMNGLDSISSLTKEKGVHACIVLKNGEKIYKDF 90
+ + I +E + + ++K + + + ++ H + + + E I K
Sbjct: 74 IPSSELENIPSENQYFQSAIWSGFIKVKKSDEYTFAT---SADNHVTMWVDDQEVINKAS 130

Query: 91 DCQKIDKSKNVNLISGKVAIFEKIQYMDDNTTDELSHADIFLVGKDIKAEILS 143
+ K + L G++ KIQY +N T++ ++ K E++S
Sbjct: 131 NSNK------IRLEKGRLYQI-KIQYQRENPTEKGLDFKLYWTDSQNKKEVIS 176


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits    Score  E-value  Comments
CCON33237_RS04120  HTHFIS  91     2e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 90.7 bits (225), Expect = 2e-23
Identities = 36/117 (30%), Positives = 58/117 (49%)

Query: 3 RILLVEDDEILLDLISEYLSENGYDVTTSDNAKEALDLAYEQNFDLLILDVKLPQGDGFS 62
IL+ +DD + ++++ LS GYDV + NA + DL++ DV +P + F
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 63 LLSSLRELGVSAPSIFTTSLNTIDDLEKGYKSGCDDYLKKPFELKELLIRIQALLKR 119
LL +++ P + ++ NT K + G DYL KPF+L EL+ I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS04145  DNABINDINGHU  87     3e-26    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 87.1 bits (216), Expect = 3e-26
Identities = 39/87 (44%), Positives = 51/87 (58%)

Query: 3 KAEFIQAVADKAGLSKKDTLKVVDATLETIQAVLEKGDTISFIGFGTFGTADRAARKARV 62
K + I VA+ L+KKD+ VDA + + L KG+ + IGFG F +RAARK R
Sbjct: 4 KQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGRN 63

Query: 63 PGTKKVIDVPASKAVKFKVGKKLKEAV 89
P T + I + ASK FK GK LK+AV
Sbjct: 64 PQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits       Score  E-value  Comments
CCON33237_RS04155  FLAGELLIN  56     3e-10    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 56.2 bits (135), Expect = 3e-10
Identities = 60/344 (17%), Positives = 112/344 (32%), Gaps = 23/344 (6%)

Query: 11 QTLNDYQKNMTGVNKSYKQLSNGLKIQDPYDGAATYNDAMRLDYEATTLTQVVDATGKSV 70
LN Q ++ + + ++LS+GL+I D AA A R LTQ +
Sbjct: 15 NNLNKSQSSL---SSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNANDGI 71

Query: 71 NFSKNTDNALQEFEKQLENFKTKVVQAASSVHSKTSLEALANDLQGIKNHLVNIAN-TSV 129
+ ++ T+ AL E L+ + VQA + +S + L+++ +++Q + ++N T
Sbjct: 72 SIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSNQTQF 131

Query: 130 NGQFLFSGSAVDTKPIDGAGKYQGNRDYMKTSAGAQVELPYNIPGYDLFLGKDGDYSKIL 189
NG + S + GA + ++ + L
Sbjct: 132 NGVKVLSQDNQMKIQV-GANDGETITIDLQKIDVKSLGL-----------------DGFN 173

Query: 190 TTNVRLADQTRTDISYAPKFLNDNSKIKNMIGLNYASDSVVRSDGSYNGTINPDYDFLDN 249
+ A S+ D + + V +D + + Y N
Sbjct: 174 VNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAAN 233

Query: 250 SNVNFPDTYFFMQGKKPDGTTFTSKFKMSANTTMAGLMEKIGMEFGNTKTTKVVDVSINN 309
+ D T T+ + A K G F T +D N
Sbjct: 234 GQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGN 293

Query: 310 DGQFNIKDLTKGNQTIDFHMVAATSVAPNRGAIAQNNALDTVNS 353
DG + T + + + T+ A N A ++ + S
Sbjct: 294 DGNGKVST-TINGEKVTLTVADITAGAANVDAATLQSSKNVYTS 336



Score = 37.3 bits (86), Expect = 2e-04
Identities = 25/201 (12%), Positives = 52/201 (25%), Gaps = 6/201 (2%)

Query: 565 NAYKEALSKTKGTVETTLDDRGRMVLTDKTKSVTNIEVTMHDAKNSDKFDGDSTGRDTAG 624
+D + SV N + T D + + + +
Sbjct: 305 GEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTF------DDKTKNESAKLSDL 358

Query: 625 NAGHPQGKGSVFSFNENNALTIDEPSTSVFQDLDNMIEAVRKGYYRADANSNDPRNTGMQ 684
A + S + N I+ G
Sbjct: 359 EANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKSTA 418

Query: 685 GALQRLDHLIDHANKELTKIGSQSRLLTATKERAEVMKVNVQTVKNDVIDADYAESYLKF 744
L +D + + + +G+ + N+ + ++ + DADYA
Sbjct: 419 NPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSNM 478

Query: 745 TQLSLSYQATLQASAKINQLS 765
++ + QA A+ NQ+
Sbjct: 479 SKAQILQQAGTSVLAQANQVP 499


34  CCON33237_RS06045  CCON33237_RS06065  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag           DNBias  CDNBias  %GCBias   Product
CCON33237_RS06045  -1      10       0.912114  multidrug transporter
CCON33237_RS06050  -1      9        1.037532  hemolysin D
CCON33237_RS06055  -1      9        1.456493  L-seryl-tRNA(Sec) selenium transferase
CCON33237_RS06060  -2      9        0.982525  flagellar biosynthetic protein FliP
CCON33237_RS06065  -2      8        0.881714  flagellar motor protein MotB
ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS06045  ACRIFLAVINRP  630    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 630 bits (1626), Expect = 0.0
Identities = 262/1036 (25%), Positives = 467/1036 (45%), Gaps = 42/1036 (4%)

Query: 1 MIKTAINRPITTLMIFLSLVVFGIYSLKTMNVNLYPQVNIPIVKI-TTYANGDMNYIKTK 59
M I RPI ++ + L++ G ++ + V YP + P V + Y D ++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 60 ITQKIEDEISSIEGIKKIYSTSF-DNLSVVSIEFELNKDLESATNDVRDKMQKAR----- 113
+TQ IE ++ I+ + + STS +++ F+ D + A V++K+Q A
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 114 VGANYEIEKLNGLSSSVFSLFITRLDGNETK--LMQEIDDVAKPFLERISGVSKVKTNGF 171
I SS + + T+ + + K L R++GV V+ G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 172 LEPAVKILLDRFKLDKNALSANEVANLIKVENLKAPLGKIENE------QIQMAIKSNFS 225
+ A++I LD L+K L+ +V N +KV+N + G++ Q+ +I +
Sbjct: 181 -QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 226 AKSIDEIRNLTIK-----QGVFLKDIASVDLSYKDANEAAIMDKKSGVLLGLELAPDANA 280
K+ +E +T++ V LKD+A V+L ++ N A ++ K LG++LA ANA
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 281 LTVIALAKSKLDQFKSLLGSEYDVKIAYDKSEVIQKHIDQTAFDMILGILLTIVIVYLFL 340
L K+KL + + V YD + +Q I + + I+L +++YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 341 RNFSITIISVVAIPTSIVATFFIINALGYDINRLSLIALTLGIGIFIDDAIVVTENIASK 400
+N T+I +A+P ++ TF I+ A GY IN L++ + L IG+ +DDAIVV EN+
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 401 LKDEP-NALKASFAGIKEIAFSVFAISLVLLCVFVPIAFMSGIVGKYFNSFAMSVAAGIV 459
+ ++ +A+ + +I ++ I++VL VF+P+AF G G + F++++ + +
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 460 ISFFVSIFLVPTLSARFVNA-------KQSGFFLKSEPFFEALENFYEKILALALKFKLI 512
+S V++ L P L A + + GFF F+ N Y + L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 513 FLAITLVVVVCSFTLAKFVGGDFMPSEDNSEFNIYFKLDPSLSLQASKDKLKD--KISLI 570
+L I ++V L + F+P ED F +L + + ++ L L
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 571 NADPQVAYAYFILGYTDAKQ-PYLVKAYVRLKELKDRVNHE-RQNAIMQSFRDRLKS--D 626
N V + + G++ + Q A+V LK ++R E A++ + L D
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 627 DMSVIVADLPVVEGGDVQPVKLTITSENGKELEKFVPK----ISKMLKEINDATDVNSPE 682
+ +VE G + + G + + + V
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 683 EDLLKRVQISIDEDKAKRLILDKASVASAVYSAFSQNEVSVFENENGKEYELYMRLDDKF 742
+ + ++ +D++KA+ L + + + + +A V+ F + G+ +LY++ D KF
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDF-IDRGRVKKLYVQADAKF 778

Query: 743 RSDTDDILKTKIRSKEGFFVTLGDVATISFEQKPASISRFNRADEIKFLANTKNNAPLNS 802
R +D+ K +RS G V T + + R+N ++
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSS-G 837

Query: 803 VANEISKKLDEILPANFKYKFLGFVELMDDTNASFIFTVSASAVLIYMVLAALYESFLLP 862
A + + L LPA Y + G + V+ S V++++ LAALYES+ +P
Sbjct: 838 DAMALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIP 897

Query: 863 FLIMLAMPLAFCGVVIGLFISGNPFSLFVMVGVILLFGMVGKNAILVVDFANHF-ANNGI 921
+ML +PL GV++ + ++ MVG++ G+ KNAIL+V+FA G
Sbjct: 898 VSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGK 957

Query: 922 EANEAVKMAAKKRLRAVLMTTFAMIFAMLPLALGRGAGFEANSPMAISIIFGLISSTLLS 981
EA MA + RLR +LMT+ A I +LPLA+ GAG A + + I ++ G++S+TLL+
Sbjct: 958 GVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLA 1017

Query: 982 LLVVPVLFAWVYNLDK 997
+ VPV F + K
Sbjct: 1018 IFFVPVFFVVIRRCFK 1033


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits       Score  E-value  Comments
CCON33237_RS06050  RTXTOXIND  38     2e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 38.3 bits (89), Expect = 2e-05
Identities = 22/87 (25%), Positives = 34/87 (39%), Gaps = 10/87 (11%)

Query: 5 IILMIFGIFSFASEEIFADFEVYAKQSSKLAFEG---------SGKVDKIFVDVSSHVKK 55
+M F + +F + E+ A + KL G + V +I V V+K
Sbjct: 62 YFIMGFLVIAFILS-VLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRK 120

Query: 56 GDVLAILDQSSLEIALKKAKNDLELAK 82
GDVL L E K ++ L A+
Sbjct: 121 GDVLLKLTALGAEADTLKTQSSLLQAR 147



Score = 32.1 bits (73), Expect = 0.002
Identities = 33/217 (15%), Positives = 75/217 (34%), Gaps = 28/217 (12%)

Query: 53 VKKGDV--LAILDQSSLEIALKKAKNDLELAKNASEFAKNTLSKFTQVRNVTSKQEFD-E 109
+ K + A+L+Q E +A N+L + K+ E ++ + + Q F E
Sbjct: 244 LHKQAIAKHAVLEQ---ENKYVEAVNELRVYKSQLEQIESEILS-AKEEYQLVTQLFKNE 299

Query: 110 VKYKFDEAILRVQSAQIAILNAQDRLKKAVLKAPFDGVIASKNV-ELGESASPLQPAFVL 168
+ K + + + + ++R + +V++AP + V G + + V+
Sbjct: 300 ILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVI 359

Query: 169 NSEEAKILIA--IDEKYANLVKVGDTFKFKLDATSEEK----EVKIALIYPE-------- 214
E+ + + + K + VG K++A + K+ I +
Sbjct: 360 VPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLG 419

Query: 215 ----IKRETRKFYAQAYDTG--LKPGMFGQGKVIIGE 245
+ + + L GM ++ G
Sbjct: 420 LVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits     Score  E-value  Comments
CCON33237_RS06055  PF06580  29     0.036    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.1 bits (65), Expect = 0.036
Identities = 9/53 (16%), Positives = 22/53 (41%), Gaps = 5/53 (9%)

Query: 289 NKFGIAFKWKIFDFFATSKMSQAQKIALDEARLNLEYKRRENETKLKNLQSEI 341
N + F W + +F ++ +D+ ++ E +L L+++I
Sbjct: 123 NVVVVTFMWSLL-YFGWHFFKNYKQAEIDQWKM----ASMAQEAQLMALKAQI 170


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS06060  FLGBIOSNFLIP  253    2e-87    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 253 bits (648), Expect = 2e-87
Identities = 109/239 (45%), Positives = 161/239 (67%), Gaps = 1/239 (0%)

Query: 5 LSLAVLFCVVFGADPALPTINLSLNSPANAEQLVNSLNVLLILTALALAPSLIFMMTSFL 64
++ +L+ + A LP I S P + + L+ +T+L P+++ MMTSF
Sbjct: 7 VAPVLLWLITPLAFAQLPGIT-SQPLPGGGQSWSLPVQTLVFITSLTFIPAILLMMTSFT 65

Query: 65 RLVIVFSFLRQAMGTQQVPPSTVLISLAMVLTFFIMEPVGQKSYNDGIKPYIAEQIGYEE 124
R++IVF LR A+GT PP+ VL+ LA+ LTFFIM PV K Y D +P+ E+I +E
Sbjct: 66 RIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEKISMQE 125

Query: 125 MLDKSLKPFKEFMVKNTREKDLALFFRIRNLQNPANIEEIPLSIAMSAFMISELKTSFEI 184
L+K +P +EFM++ TRE DL LF R+ N E +P+ I + A++ SELKT+F+I
Sbjct: 126 ALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELKTAFQI 185

Query: 185 AFLLYLPFLVIDMVVSSVLMAMGMMMLPPVMISLPFKLLIFVLVDGWNLLIGNLVKSFH 243
F +++PFL+ID+V++SVLMA+GMMM+PP I+LPFKL++FVLVDGW LL+G+L +SF+
Sbjct: 186 GFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLAQSFY 244


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits        Score  E-value  Comments
CCON33237_RS06065  OMPADOMAIN  69     1e-15    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 69.2 bits (169), Expect = 1e-15
Identities = 34/123 (27%), Positives = 55/123 (44%), Gaps = 14/123 (11%)

Query: 124 VRLPAAMLFDKDSAEISGEDAKLFLKRIGMIVAKMPNEVKVDIIGHTDNIEPNKDSAYKN 183
L + +LF+ + A + E + + P + V ++G+TD I AY
Sbjct: 215 FTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGS---DAY-- 269

Query: 184 NWQLSTARALSVVEQLSSDGVPQNRLIASGKASFDPIATNNTEEGR---------AKNNR 234
N LS RA SVV+ L S G+P +++ A G +P+ N + + A + R
Sbjct: 270 NQGLSERRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRR 329

Query: 235 VEI 237
VEI
Sbjct: 330 VEI 332


35  CCON33237_RS06390  CCON33237_RS06420  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag           DNBias  CDNBias  %GCBias   Product
CCON33237_RS06390  0       13       2.019815  flagellar basal body rod protein FlgG
CCON33237_RS06395  0       11       1.101454  flagellar biosynthesis protein FlgG
CCON33237_RS06400  2       12       0.995746  RNA polymerase sigma factor RpoD
CCON33237_RS06405  -1      12       1.705194  flagellin C
CCON33237_RS06410  -1      13       1.935155  histidinol dehydrogenase
CCON33237_RS06415  1       14       1.147942  1-aminocyclopropane-1-carboxylate deaminase
CCON33237_RS06420  2       14       0.944274  chemotaxis protein MotB
ORFs having significant similarity with Known Virulence factors
LocusTag           Hits        Score  E-value  Comments
CCON33237_RS06390  FLGHOOKAP1  49     6e-09    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 49.2 bits (117), Expect = 6e-09
Identities = 11/42 (26%), Positives = 25/42 (59%)

Query: 220 EMSNVQLVEEMTDLITGQRAYEANSKAITTSDSMLEIVNGLK 261
+S V L EE +L Q+ Y AN++ + T++++ + + ++
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 45.3 bits (107), Expect = 1e-07
Identities = 10/35 (28%), Positives = 19/35 (54%)

Query: 4 SLYTAATGMIAQQTQIDVTSHNIANVNTYGYKKNR 38
+ A +G+ A Q ++ S+NI++ N GY +
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQT 37


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits        Score  E-value  Comments
CCON33237_RS06395  FLGHOOKAP1  35     2e-04    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 35.3 bits (81), Expect = 2e-04
Identities = 11/40 (27%), Positives = 19/40 (47%)

Query: 3 NGYYQATAGMVTQFNRLNVISNNLANVNTIGYKRNDVVIG 42
+ A +G+ LN SNN+++ N GY R ++
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMA 41


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits       Score  E-value  Comments
CCON33237_RS06405  FLAGELLIN  54     1e-10    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 54.3 bits (130), Expect = 1e-10
Identities = 39/155 (25%), Positives = 65/155 (41%), Gaps = 10/155 (6%)

Query: 16 YLDQAKNSEKKALNAISANSEI---KASGANLQIAESLLSQTNVLNEGLANANDMIGMLQ 72
L+++++S A+ +S+ I K A IA S L + NAND I + Q
Sbjct: 16 NLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNANDGISIAQ 75

Query: 73 IADSTLLNLSKSTDRIGELSSKLTNPTLSANEQKGIKGEINALRSAMSDSVKEAKFNGKN 132
+ L ++ + R+ ELS + TN T S ++ K I+ EI + + +FNG
Sbjct: 76 TTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSNQTQFNGVK 135

Query: 133 VFDAELGFFAGESTKSINLGTNALLNVKDDGSNAD 167
V ++ I +G N + D D
Sbjct: 136 VLS-------QDNQMKIQVGANDGETITIDLQKID 163


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits        Score  E-value  Comments
CCON33237_RS06420  OMPADOMAIN  60     4e-12    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 59.9 bits (145), Expect = 4e-12
Identities = 24/75 (32%), Positives = 39/75 (52%), Gaps = 7/75 (9%)

Query: 226 LSSSVLFDKGSAVLKEEVKEELKATLSKYFDVLLNDKDIASNIDQIIIEGFTDSDGSYIY 285
L S VLF+ A LK E + L S+ ++ D +++ G+TD GS Y
Sbjct: 217 LKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDG-------SVVVLGYTDRIGSDAY 269

Query: 286 NLELSQKRAYAVMEF 300
N LS++RA +V+++
Sbjct: 270 NQGLSERRAQSVVDY 284


36  CCON33237_RS07725  CCON33237_RS07745  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag           DNBias  CDNBias  %GCBias    Product
CCON33237_RS07725  -1      14       -4.278575  DNA-binding response regulator
CCON33237_RS07730  -1      15       -3.812446  two-component sensor histidine kinase
CCON33237_RS07735  -1      10       0.077405   guanosine polyphosphate pyrophosphohydrolase
CCON33237_RS07740  -2      14       1.255369   hypothetical protein
CCON33237_RS07745  -2      20       4.340092   flagellar motor switch protein FliN
ORFs having significant similarity with Known Virulence factors
LocusTag           Hits    Score  E-value  Comments
CCON33237_RS07725  HTHFIS  79     4e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 78.7 bits (194), Expect = 4e-19
Identities = 35/112 (31%), Positives = 57/112 (50%)

Query: 2 RILIVEDEVTLNKTIAEGLQEFGYQTDSSENFKDAEYYIGIRNYDLVLTDWMLQDGDGVD 61
IL+ +D+ + + + L GY + N +I + DLV+TD ++ D + D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 LINIIKHKSPRTSVVVLSAKDDKESEIKALRAGADDYIKKPFDFDILVARLE 113
L+ IK P V+V+SA++ + IKA GA DY+ KPFD L+ +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits     Score  E-value  Comments
CCON33237_RS07730  PF06580  38     5e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 37.9 bits (88), Expect = 5e-05
Identities = 19/98 (19%), Positives = 39/98 (39%), Gaps = 11/98 (11%)

Query: 286 IKLDLKPEILNLKIQTSLLTHIVQNFVQNAIKFSPKNSTITISSKLIKNKFIIEVIDEGI 345
+ + P I+++++ L+ +V+N +++ I P+ I + +EV + G
Sbjct: 242 FENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGS 301

Query: 346 GIDESKDLFAPFKRYGDKGGAGLGLFLVKGAAQALGGE 383
++ K G GL V+ Q L G
Sbjct: 302 LALKN-----------TKESTGTGLQNVRERLQMLYGT 328


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS07735  SHAPEPROTEIN  36     3e-04    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 35.5 bits (82), Expect = 3e-04
Identities = 20/55 (36%), Positives = 29/55 (52%)

Query: 114 EEATFGAIAAKNLLHNIAECVTIDIGGGSTELARISKGKIIDTLSLDIGTVRLKE 168
EE AI A + + +DIGGG+TE+A IS ++ + S+ IG R E
Sbjct: 142 EEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDE 196


ORFs having significant similarity with Known Virulence factors
LocusTag           Hits          Score  E-value  Comments
CCON33237_RS07745  FLGMOTORFLIN  95     1e-28    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 94.6 bits (235), Expect = 1e-28
Identities = 30/85 (35%), Positives = 48/85 (56%)

Query: 14 GLFKSYDELMDISVDFIAELGTTTVSINELLKFEAGSVIDLEKPAGESVELYINNRIFGK 73
G + D +MDI V ELG T ++I ELL+ GSV+ L+ AGE +++ IN + +
Sbjct: 49 GAMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQ 108

Query: 74 GEVMVYEKNLAIRINEILDSKSVIQ 98
GEV+V +RI +I+ ++
Sbjct: 109 GEVVVVADKYGVRITDIITPSERMR 133



 
Contact Sachin Pundhir for Bugs/Comments.